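I get the impression that you don't fully understand how regular expressions work. To begin with, you need an input string to match the regular expression against. For example: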
$input = "hello, 123"; //I just need the digits part
$regex = '/\d+/'; // PCRE patterns in PHP need delimiters
preg_match($regex, $input, $matched);
\d matches a single digit in the string. It is equivalent to [0-9], or to:
for ($c = 0, $len = strlen($input); $c < $len; $c++)
{
    $tmp = $input[$c];
    if ($tmp == '0' || $tmp == '1' || $tmp == '2' ||
        $tmp == '3' || $tmp == '4' || $tmp == '5' ||
        $tmp == '6' || $tmp == '7' || $tmp == '8' ||
        $tmp == '9')
    {
        echo $tmp;
    }
}
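As a quick check of what preg_match leaves in $matched, here is a minimal sketch using the same sample input. The second sample string ("a1 b22 c333") is made up for illustration.

```php
<?php
$input = "hello, 123";

// preg_match stops at the first match and returns 1 (match) or 0 (no match).
$found = preg_match('/\d+/', $input, $matched);
echo $found, " ", $matched[0], "\n";            // prints "1 123"

// preg_match_all collects every match; this sample string is made up.
$count = preg_match_all('/\d+/', "a1 b22 c333", $all);
echo $count, " ", implode(",", $all[0]), "\n";  // prints "3 1,22,333"
```

Index 0 of the result array always holds the text matched by the whole pattern; capture groups, if any, follow from index 1.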
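Now suppose you want to download all of Google's logo images. Take a look at:
http://www.google.com/logos/ Here is a web crawler that extracts the links to all images on that page: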
<?php
Header('Content-Type:text/plain');
$domain = "http://www.google.com/logos/";
$ch = curl_init($domain);
curl_setopt($ch ,CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
preg_match_all("/<img\s+alt=\"(?<title>[^\"]+)\"\s+src=\"(?<url>[^\"]+)\"/", $response, $matched);
print_r($matched);
?>
Output:
[title] => Array
(
[0] => Latest Google Logos
[1] => Les Paul's 96th Birthday
[2] => Dragon Boat Festival
[3] => Richard Scarry's 92nd Birthday
[4] => Republic Day
[5] => Birthday of Ibn Khaldun
[6] => Africa Day
[7] => Jordan Independence Day
[8] => Day of Slavonic Alphabet, Bulgarian Enlightenment and Culture
[9] => Emile Berliner's 160th Birthday
[10] => Doodle4Google US Winner
[11] => 100th Birthday of Annie M.G. Schmidt
[12] => Dame Nellie Melba's 150th Birthday
[13] => 120th Birthday of Mikhail Bulgakov
[14] => Paraguay's Independence Day
[15] => Martha Graham's 117th Birthday. Animated by Ryan Woodward, choreographed by Janet Eilber, and danced by Blakeley White-McGuire.
//...
[url] => Array
(
[0] => /images/feed-icon.gif
[1] => /logos/2011/lespaul11-hp.png
[2] => /logos/2011/dragonboat11-hp.jpg
[3] => /logos/2011/scarry11-hp.png
[4] => /logos/2011/republicday11-hp.jpg
[5] => /logos/2011/ibn11-hp.jpg
[6] => /logos/2011/africaday11-hp.jpg
[7] => /logos/2011/jordan11-hp.png
[8] => /logos/2011/slavonic_alaphabet11-hp.jpg
[9] => /logos/2011/berliner11-hp.png
[10] => /logos/2011/d4g11-matteolopez-HP.png
[11] => /logos/2011/annieschmidt11-hp.jpg
[12] => /logos/2011/nelliemelba11-hp.jpg
[13] => /logos/2011/bulgakov11-hp.png
[14] => /logos/2011/paraguay11-hp.jpg
[15] => /logos/2011/graham11-hp.png
//....
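To see how the named groups in the pattern end up as the title and url keys, here is an offline sketch. It assumes a hard-coded HTML fragment, assembled for illustration from two entries of the output above, so no network access is needed.

```php
<?php
// Hard-coded fragment for illustration only; the real page markup may differ.
$html = '<img alt="Africa Day" src="/logos/2011/africaday11-hp.jpg">'
      . '<img alt="Republic Day" src="/logos/2011/republicday11-hp.jpg">';

$pattern = '/<img\s+alt="(?<title>[^"]+)"\s+src="(?<url>[^"]+)"/';
preg_match_all($pattern, $html, $m);

// Each named group is available under its name and also under its numeric index.
foreach ($m['title'] as $i => $title) {
    echo $title, " => ", $m['url'][$i], "\n";
}
```

Note that the pattern only matches when alt comes directly before src. For markup that is less regular, an HTML parser such as DOMDocument is likely a more robust choice than a regex.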
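Next, build the full URLs:
// The matched paths are root-relative (/logos/2011/...), so prepend the host rather than $domain.
$host = "http://www.google.com";
$urlToDownlaod = array_map(function ($path) use ($host) { return $host . $path; }, $matched["url"]);
print_r($urlToDownlaod);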
Now that you have the URLs of all the images hosted at google.com/logos, you can write a download function.
A simple example:
// Fetch a URL and return its body as a string.
function GetSrc($link) {
    $ch = curl_init($link);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
for ($x = 0, $len = count($urlToDownlaod); $x < $len; $x++) {
    // The images/ directory must already exist and be writable.
    $fp = fopen("images/" . $matched["title"][$x], "w");
    fputs($fp, GetSrc($urlToDownlaod[$x]));
    fclose($fp);
    flush();
}
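Titles like the ones in the output above are not always safe as file names, and they carry no file extension. Here is a minimal sketch of one way to derive a file name. It assumes a hypothetical helper, safeFileName(), which is not part of the code above, and takes the extension from the URL.

```php
<?php
// Hypothetical helper: builds a file name from a title and the image URL.
function safeFileName($title, $url) {
    // Replace every run of characters other than letters, digits, '-' and '_' with '_'.
    $base = trim(preg_replace('/[^A-Za-z0-9_-]+/', '_', $title), '_');
    // Take the extension from the URL path, e.g. "png" or "jpg".
    $ext = pathinfo((string) parse_url($url, PHP_URL_PATH), PATHINFO_EXTENSION);
    return $ext === '' ? $base : $base . '.' . $ext;
}

echo safeFileName("Les Paul's 96th Birthday",
                  "http://www.google.com/logos/2011/lespaul11-hp.png"), "\n";
// prints "Les_Paul_s_96th_Birthday.png"
```

In the loop, the fopen() call could then use "images/" . safeFileName($matched["title"][$x], $urlToDownlaod[$x]) in place of the raw title.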
Leron: Regular expressions work the other way around: they match against a string, they do not create strings. – hakre 2011-06-16 16:13:36
Do you mean that you want to download all the Google images used in the Doodles? – 2011-06-16 16:14:32
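No, I used this URL because I found it in another question about cURL. I really just want to know whether there is some way to fetch files with unknown names like this, and, if so, eventually write a loop that tries to fetch all of them. That's all. – Leron 2011-06-16 16:24:16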