2012-10-12 158 views
1

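Extracting URLs from data using PHP

I have a script that pulls some text from a table and dumps it into a text file. I'd like help stripping out everything except the URLs, so that the text file contains only the URLs.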

// Fetch the filtered content of the 200 most recent posts.
$result = @mysql_query("SELECT post_content_filtered FROM wp_posts ORDER BY post_date desc limit 200"); 
$tsv = array(); 
$html = array(); 
while($row = mysql_fetch_array($result, MYSQL_NUM)){ 
    // Build one tab-separated line and one HTML table row per record.
    $tsv[] = implode("\t", $row); 
    $html[] = "<tr><td>" .implode("</td><td>", $row) ."</td></tr>"; 
} 

$tsv = implode("\r\n", $tsv); 
$html = "<table>" . implode("\r\n", $html) . "</table>"; 
$fileName = 'finishedflippa.txt'; 
// Send the tab-separated text to the browser as a file download.
header("Content-type: application/vnd.ms-excel"); 
header("Content-Disposition: attachment; filename=$fileName"); 
echo $tsv; 

The post_content_filtered text looks like this:

blahblahblah http://www.example.com blahblahblah
blahblahblah http://www.12345.com blahblahblah
blahblahblah http://www.gfds.com blahblahblah
blahblahblah http://www.45tyhju.com blahblahblah

The blahblahblah part is the same on every line. Thanks very much.

+1

Next time, please post readable code yourself. Thanks. – Peon

+0

Sorry! I had a lot of trouble getting even that much posted. – baselinej70

Answers

6

URLs have a very complex definition.

You can try this simple match with preg_match_all:

$str = 'text looks something like this: 
blahblahblah http://www.example.com blahblahblah 
blahblahblah http://www.12345.com blahblahblah 
blahblahblah http://www.gfds.com blahblahblah 
blahblahblah http://www.45tyhju.com blahblahblah ' ; 

preg_match_all('!https?://[\S]+!', $str, $matches); 
var_dump($matches[0]); 
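
To connect this back to the export script in the question, here is a minimal sketch of applying the same pattern to each row and writing one URL per line. The helper name `extract_urls` and the `$rows` array are hypothetical stand-ins (the array replaces the database result), so treat this as an illustration under those assumptions rather than a drop-in fix.

```php
<?php
// Hypothetical helper (not from the original post): returns every http(s) URL in $text.
function extract_urls(string $text): array
{
    // Same idea as the pattern above: "http://" or "https://" followed by non-whitespace.
    preg_match_all('!https?://\S+!', $text, $matches);
    return $matches[0];
}

// Stand-in for the post_content_filtered values fetched from wp_posts.
$rows = [
    'blahblahblah http://www.example.com blahblahblah',
    'blahblahblah http://www.12345.com blahblahblah',
    'no link on this row',
];

// Collect the URLs from every row; rows without a URL contribute nothing.
$urls = [];
foreach ($rows as $content) {
    foreach (extract_urls($content) as $url) {
        $urls[] = $url;
    }
}

// One URL per line; this string could be echoed in place of $tsv.
$output = implode("\r\n", $urls);
echo $output;
```

In the original script, the body of the `while` loop would call the helper on `$row[0]` instead of iterating over a fixed array.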
+0

Thanks very much! – baselinej70

+1

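This solution is limited to http(s) URLs, but there are many other valid forms of URL. Since the OP states that the 'blahblahblah' prefix is known and fixed, a better approach would be to filter on the prefix (e.g. '!(?:blahblahblah)([\S]+)!'), where '([\S]+)' should probably be replaced with an expression for valid URLs (which, as already mentioned, can be quite complex). – Drag0nR3b0rn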
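
One possible reading of this suggestion, sketched under a few assumptions: the prefix sits at the start of each line and is followed by whitespace, and PHP's built-in `filter_var` with `FILTER_VALIDATE_URL` is used as a stand-in for "an expression for valid URLs". The sample text, including the `ftp://` line, is made up for illustration.

```php
<?php
// Assumption: every line starts with this fixed prefix, followed by whitespace.
$prefix = 'blahblahblah';

// Made-up sample data: one http URL, one ftp URL, and one token that is not a URL.
$text = "blahblahblah http://www.example.com blahblahblah\n"
      . "blahblahblah ftp://files.example.org/a.zip blahblahblah\n"
      . "blahblahblah not-a-url blahblahblah";

// preg_quote() escapes the prefix; the m flag lets ^ match at the start of each line.
// The capture group grabs the first run of non-whitespace after the prefix.
$pattern = '!^' . preg_quote($prefix, '!') . '\s+(\S+)!m';
preg_match_all($pattern, $text, $matches);

// Keep only the candidates that PHP's built-in validator accepts as URLs.
$urls = array_values(array_filter(
    $matches[1],
    function ($candidate) {
        return filter_var($candidate, FILTER_VALIDATE_URL) !== false;
    }
));

print_r($urls);
```

Compared with matching `https?://` directly, this accepts other schemes, at the cost of depending on the prefix staying exactly as described.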