2010-09-25 101 views
0

PHP array filter with regular expressions

Hi all, I have the array shown below:

Array 
(
    [0] => http://api.tweetmeme.com/imagebutton.gif?url=http://mashable.com/2010/09/25/trailmeme/ 
    [1] => http://cdn.mashable.com/wp-content/plugins/wp-digg-this/i/gbuzz-feed.png 
    [2] => http://mashable.com/wp-content/plugins/wp-digg-this/i/fb.jpg 
    [3] => http://mashable.com/wp-content/plugins/wp-digg-this/i/diggme.png 
    [4] => http://ec.mashable.com/wp-content/uploads/2009/01/bizspark2.gif 
    [5] => http://cdn.mashable.com/wp-content/uploads/2010/09/web.png 
    [6] => http://mashable.com/wp-content/uploads/2010/09/Screen-shot-2010-09-24-at-10.51.26-PM.png 
    [7] => http://cdn.mashable.com/wp-content/uploads/2009/02/bizspark.jpg 
    [8] => http://feedads.g.doubleclick.net/~at/lxx00QTjYBaYojpnpnTa6MXUmh4/0/di 
    [9] => 
    [10] => http://feedads.g.doubleclick.net/~at/lxx00QTjYBaYojpnpnTa6MXUmh4/1/di 
    [11] => 
    [12] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:D7DqB2pKExk 
    [13] => 
    [14] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:V_sGLiPBpWU 
    [15] => 
    [16] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:F7zBnMyn0Lo 
    [17] => 
    [18] => http://feeds.feedburner.com/~ff/Mashable?d=qj6IDK7rITs 
    [19] => 
    [20] => http://feeds.feedburner.com/~ff/Mashable?d=_e0tkf89iUM 
    [21] => 
    [22] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:gIN9vFwOqvQ 
    [23] => 
    [24] => http://feeds.feedburner.com/~ff/Mashable?d=yIl2AUoC8zA 
    [25] => 
    [26] => http://feeds.feedburner.com/~ff/Mashable?d=P0ZAIrC63Ok 
    [27] => 
    [28] => http://feeds.feedburner.com/~ff/Mashable?d=I9og5sOYxJI 
    [29] => 
    [30] => http://feeds.feedburner.com/~ff/Mashable?d=CC-BsrAYo0A 
    [31] => 
    [32] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:_cyp7NeR2Rw 
    [33] => 
    [34] => http://feeds.feedburner.com/~r/Mashable/~4/0N_mvMwPHYk 
) 

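Basically, from this array I want to:

  1. Remove all empty array elements.
  2. Remove every array item that does not have the extension .jpg, .png or .gif in its name.
  3. Finally, remove array items that contain keywords such as "digg", "fb", "tweet" or "bizspark".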

Hi, I've tried the code above... it returns an array containing the things I want taken out, e.g.:

Array
(
    [5] => http://feedads.g.doubleclick.net/~at/W-z_kHMi30EtE1mpxK8NvMmNmeg/0/di
    [7] => http://feedads.g.doubleclick.net/~at/W-z_kHMi30EtE1mpxK8NvMmNmeg/1/di
    [9] => http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:D7DqB2pKExk
    [11] => http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:V_sGLiPBpWU
    [13] => http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:F7zBnMyn0Lo
    [15] => http://feeds.feedburner.com/~ff/Mashable?d=qj6IDK7rITs
    [17] => http://feeds.feedburner.com/~ff/Mashable?d=_e0tkf89iUM
    [19] => http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:gIN9vFwOqvQ
    [21] => http://feeds.feedburner.com/~ff/Mashable?d=yIl2AUoC8zA
    [23] => http://feeds.feedburner.com/~ff/Mashable?d=P0ZAIrC63Ok
    [25] => http://feeds.feedburner.com/~ff/Mashable?d=I9og5sOYxJI
    [27] => http://feeds.feedburner.com/~ff/Mashable?d=CC-BsrAYo0A
    [29] => http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:_cyp7NeR2Rw
    [31] => http://feeds.feedburner.com/~r/Mashable/~4/mEedXAp78pg
)

What I want it to return, e.g. from the first example, is:

    [5] => http://cdn.mashable.com/wp-content/uploads/2010/09/web.png
    [6] => http://mashable.com/wp-content/uploads/2010/09/Screen-shot-2010-09-24-at-10.51.26-PM.png

Any ideas?


Hi GZipp, I've modified the code and I'm getting better results:

function url_array_filter($url) 
{ 
    static $words = array('digg', 'fb', 'tweet', 'bizspark','feedburner','feedads','CountImage'); 
    static $extens = array('.jpg', '.png', '.gif'); 
    $ret = true; 
    if (!$url) { 
     $ret = false; 
    } elseif (str_replace($words, '', $url) != $url) { 
     $ret = false; 
    } else { 
     $path = parse_url($url, PHP_URL_PATH); 
     if (in_array(substr($path, -4), $extens)) { 
      $ret = false; 
     } 
    } 
    return $ret; 
} 

Now my problem is with the output. For example:

Array ([0] => http://cdn.dzone.com/images/thumbs/120x90/491551.jpg' style='width:120;height:90;float:left;vertical-align:top;border:1px solid) 

Array ([0] => http://cdn.dzone.com/images/thumbs/120x90/490913.jpg' style='width:120;height:90;float:left;vertical-align:top;border:1px solid) 

I only want the URL. I think I have a problem extracting the URLs from the original content. Let me post a link to the original question as well:

RSS Feeds and image extraction indepth

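I just want the URL. I think getImagesUrl() from that link may be what is messing things up. I'm going to try using parse_url to get back the correct URL. Let me know if I'm on the right track. I'm very close to managing to pull image URLs from RSS feeds parsed with Magpie.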
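One way to drop the trailing `' style='...` text from values like the ones shown above is to cut each string at the first quote or whitespace character. This is only a minimal sketch: the helper name `strip_trailing_markup` is made up for illustration, and it assumes the unwanted text always starts at a quote or a space.

```php
<?php
// Hypothetical helper (not part of the original code): keep only the part of
// the string that comes before the first quote or whitespace character.
function strip_trailing_markup($raw)
{
    // strcspn() returns the length of the leading segment of $raw that
    // contains none of the characters in the mask.
    return substr($raw, 0, strcspn($raw, "'\" \t\r\n"));
}

$raw = "http://cdn.dzone.com/images/thumbs/120x90/491551.jpg' style='width:120;height:90;float:left;vertical-align:top;border:1px solid";
echo strip_trailing_markup($raw), "\n";
// prints http://cdn.dzone.com/images/thumbs/120x90/491551.jpg
```

Cleaning the strings this way before they reach array_filter() means the extension check sees the real end of the URL rather than the tail of the style attribute.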


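OK GZipp, here are the modifications and additions I've made to the code... it works 95% of the time! Great. I do get some funny results, though, as shown below.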

function url_array_filter($url) 
{ 
    static $words = array('digg', 'fb', 'tweet', 'bizspark','feedburner','feedads','CountImage','fuelbrand'); 
    static $extens = array('.jpg', '.png', '.gif'); 
    $ret = true; 
    if (!$url) { 
     $ret = false; 
    } elseif (str_replace($words, '', $url) != $url) { 
     $ret = false; 
    } else { 
     $path = parse_url($url, PHP_URL_PATH); 
     if (in_array(substr($path, -4), $extens)) { 
      $ret = false; 
     } 
    } 
    return $ret; 
} 

function cleanURL($a_url) 
    { 
    $ret=array(); 
    foreach ($a_url as $c) 
     { 
     $a=parse_url($c, PHP_URL_SCHEME).'://'.parse_url($c, PHP_URL_HOST).parse_url($c, PHP_URL_PATH);  
     $a=explode("'",$a); 
     $ret[]=$a[0]; 
     } 
    return $ret;   
    } 

Example usage is posted below. $this->getImagesUrl($c); returns the results shown in the first question.

    foreach ($content as $c) {
        // get the images in content
        $arr = $this->getImagesUrl($c);
        $arr = array_filter($arr, 'url_array_filter');
    }
    $ret = cleanURL($arr);
    if (count($ret) > 0) {
        print_r($ret);
        echo "<br/><br/>";
    }

Up to this point almost everything works fine, but I keep getting some bad results like:

Array ([0] => http://cdn.mashable.com/wp-content/uploads/2010/02/ipad-side-) 
Array ([0] => http://mrg.bz/FZtr2k [1] => http://mrg.bz/IDkx4w) 

Man, we're almost there... any ideas?

+2

What does your code look like so far? – 2010-09-25 19:11:47

+0

I don't know how to use regular expressions... so I can't get any further. All I can do right now is remove the empty array elements. – 2010-09-25 19:16:38

+1

You don't need a regular expression. Get creative with [stristr](http://nl2.php.net/manual/en/function.stristr.php) and comparison functions. – Lekensteyn 2010-09-25 19:18:11
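A minimal sketch of that suggestion, assuming the keyword list from the question. The helper name `contains_keyword` is made up for illustration, and the sample URLs are taken from the array in the question.

```php
<?php
// Hypothetical helper: true when $url contains any keyword, case-insensitively.
function contains_keyword($url, array $words)
{
    foreach ($words as $word) {
        // stristr() returns false when $word does not occur in $url.
        if (stristr($url, $word) !== false) {
            return true;
        }
    }
    return false;
}

$words = array('digg', 'fb', 'tweet', 'bizspark');
$urls = array(
    'http://mashable.com/wp-content/plugins/wp-digg-this/i/fb.jpg',
    'http://cdn.mashable.com/wp-content/uploads/2010/09/web.png',
    '',
);

$kept = array();
foreach ($urls as $url) {
    // Skip empty entries and entries that contain a keyword.
    if ($url !== '' && !contains_keyword($url, $words)) {
        $kept[] = $url;
    }
}
print_r($kept);
```

Because this is a plain substring test, a short keyword such as 'fb' will also match any URL that happens to contain those two letters in a row.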

Answers

6

Using, for example, array_filter() will give you flexibility and ease of maintenance (changing requirements, debugging, etc.):

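function url_array_filter($url)
{
    // Substrings that disqualify a URL, and the extensions to test for.
    static $words = array('digg', 'fb', 'tweet', 'bizspark');
    static $extens = array('.jpg', '.png', '.gif');
    $ret = true;
    if (!$url) {
        // Empty element.
        $ret = false;
    } elseif (str_replace($words, '', $url) != $url) {
        // Removing the keywords changed the string, so it contained one.
        $ret = false;
    } else {
        // As written, this rejects a URL whose path ends in a listed extension.
        $path = parse_url($url, PHP_URL_PATH);
        if (in_array(substr($path, -4), $extens)) {
            $ret = false;
        }
    }
    return $ret;
}

// array_filter() keeps the elements for which the callback returns true.
$arr = array_filter($arr, 'url_array_filter');
print_r($arr);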

(Works for the array as given; may need changes; it's demo code.)

+1

Changing substr($path, -4) to strrchr($path, '.') would get rid of the integer constant. – GZipp 2010-09-25 20:49:44
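A sketch of that strrchr() change, applied to a filter that keeps only image URLs, which is what the question's list asks for (the demo code in the answer rejects URLs with those extensions instead). The function name `image_url_filter` is hypothetical, and the sample entries are a subset of the array in the question.

```php
<?php
// Hypothetical variant of url_array_filter(): keeps a URL only when it is
// non-empty, contains no keyword, and its path ends in an image extension.
function image_url_filter($url)
{
    static $words = array('digg', 'fb', 'tweet', 'bizspark');
    static $extens = array('.jpg', '.png', '.gif');

    if (!$url) {
        return false;
    }
    if (str_replace($words, '', $url) != $url) {
        return false;
    }
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return false;
    }
    // strrchr() returns the path from its last '.' onward, or false if none.
    $ext = strrchr($path, '.');
    return $ext !== false && in_array(strtolower($ext), $extens, true);
}

$arr = array(
    0 => 'http://api.tweetmeme.com/imagebutton.gif?url=http://mashable.com/2010/09/25/trailmeme/',
    5 => 'http://cdn.mashable.com/wp-content/uploads/2010/09/web.png',
    6 => 'http://mashable.com/wp-content/uploads/2010/09/Screen-shot-2010-09-24-at-10.51.26-PM.png',
    8 => 'http://feedads.g.doubleclick.net/~at/lxx00QTjYBaYojpnpnTa6MXUmh4/0/di',
    9 => '',
);
$arr = array_filter($arr, 'image_url_filter');
print_r($arr);
```

With these inputs only keys 5 and 6 survive, since array_filter() preserves the original keys.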

3
foreach ($array as $key => $value) { 
    if (
     empty($value)|| 
     (preg_match('#^http:\/\/(.*)\.(gif|png|jpg)$#i', $value) == 0)|| 
     (preg_match('#(tweet|bizspark)#i', $value) > 0) 
    ) { 
     unset($array[$key]); 
    } 
}
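
The same conditions can also be written without an explicit loop. The following is only a sketch: it assumes the full keyword list from the question (the loop above checks only tweet and bizspark) and uses a few entries from the question's array as sample data.

```php
<?php
$array = array(
    2  => 'http://mashable.com/wp-content/plugins/wp-digg-this/i/fb.jpg',
    4  => 'http://ec.mashable.com/wp-content/uploads/2009/01/bizspark2.gif',
    5  => 'http://cdn.mashable.com/wp-content/uploads/2010/09/web.png',
    9  => '',
    34 => 'http://feeds.feedburner.com/~r/Mashable/~4/0N_mvMwPHYk',
);

// Keep entries that look like http URLs ending in an image extension;
// empty strings fail this pattern, so they are dropped as well.
$images = preg_grep('#^http://.*\.(gif|png|jpg)$#i', $array);

// PREG_GREP_INVERT returns the entries that do NOT match the keyword pattern.
$images = preg_grep('#digg|fb|tweet|bizspark#i', $images, PREG_GREP_INVERT);

print_r($images);
```

preg_grep() preserves keys, so the surviving entry keeps its original index 5.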