2013-09-29 78 views
-3

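Duplicate data in the database after crawling

I am using Simple HTML DOM to scrape data from a site into my database and display it on my web page. But every time I run the file, the same data is inserted into the database again. How can I check whether the data already exists in the database? Here is my crawler file: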

<?php 

$con=mysqli_connect("localhost","root","","crawling"); 

mysql_connect("localhost", "root", "")or die("cannot connect"); 
mysql_select_db("crawling")or die("cannot select DB"); 


include "domcrawl.php"; 
$url="http://www.bgr.in/category/reviews/"; 
$html=file_get_html($url); 
//$arr=$html->find('table[class=findList] tbody tr td[class=result_text]'); 
$m=$html->find('img'); 

$b=$html->find('a'); 

$c=$html->find('p'); 

$imghead = $b[21]->innertext; 

$img = $m[3]; 

$imgtext = $c[0]; 


$sql = sprintf("INSERT INTO image1 
(head, image, text, name) 
VALUES 
('%s', '%s', '%s', '%s')", 

mysql_real_escape_string($imghead), 
mysql_real_escape_string($img), 
mysql_real_escape_string($imgtext), 
mysql_real_escape_string("gm") 
); 
mysql_query($sql); 





$sql = "SELECT head FROM image1 WHERE name='gm'"; 
$sql1 = "SELECT image FROM image1 WHERE name='gm'"; 
$sql2 = "SELECT text FROM image1 WHERE name='gm'"; 
$result = mysql_query("$sql"); 
$result1 = mysql_query("$sql1"); 
$result2 = mysql_query("$sql2"); 

$head_get= mysql_result($result, 0); 
$img_get= mysql_result($result1, 0); 
$text_get= mysql_result($result2, 0); 
echo "<br><br>"; 

echo $head_get; 
echo "<br><br>"; 
echo $img_get; 
echo $text_get; 


?> 
+0

What does your code do? What have you tried? – octern

+2

Because the XML document you are trying to parse has no tag named 'pubDate'. –

+1

[This answer](http://stackoverflow.com/a/12769983/2209007) covers a very similar case and explains what that error means. – Sumurai8

Answers

0

Assuming 'date' => $node->getElementsByTagName('pubDate')->item(0)->nodeValue is line 11, it seems there is no element with the tag pubDate, which is why $node->getElementsByTagName('pubDate')->item(0) returns null or false instead of an element.
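The behaviour described above can be reproduced with a small sketch. The XML string here is a made-up feed item (an assumption, not the asker's real data) that has a link element but no pubDate element:

```php
<?php
// Hypothetical feed item: it contains <title> and <link> but no <pubDate>.
$xml = '<item><title>Example</title><link>http://example.com/</link></item>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$node = $doc->documentElement;

// item(0) returns null when the node list is empty.
$pubDate = $node->getElementsByTagName('pubDate')->item(0);
$link    = $node->getElementsByTagName('link')->item(0);

var_dump($pubDate === null); // bool(true): there is no <pubDate> element
var_dump($link->nodeValue);  // the text of the existing <link> element
```

Reading ->nodeValue on $pubDate in this state is what triggers the "property of non-object" error.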

1

You must check the object before reading its properties; in your case the lookup returned an empty (null) object.

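$link = $node->getElementsByTagName('link')->item(0); 
$nodeValue = null; // default used when the <link> element is missing 
if(!empty($link)){ 
    $nodeValue = $link->nodeValue; 
} 

// then, inside the array you are building: 
'link' => $nodeValue, 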

Do the same for the other tags.
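To avoid repeating that check for every tag, it could be wrapped in a small helper. This is only a sketch: the function name firstNodeValue and the sample XML are hypothetical and not part of the original post, and the nullable type hints assume PHP 7.1 or later.

```php
<?php
/**
 * Return the text of the first <$tag> descendant of $node,
 * or $default when no such element exists.
 * (Hypothetical helper; the name is not from the original post.)
 */
function firstNodeValue(DOMElement $node, string $tag, ?string $default = null): ?string
{
    $element = $node->getElementsByTagName($tag)->item(0);
    return $element !== null ? $element->nodeValue : $default;
}

// Made-up feed item without a <pubDate> element.
$doc = new DOMDocument();
$doc->loadXML('<item><title>Example</title><link>http://example.com/</link></item>');
$node = $doc->documentElement;

$row = [
    'title' => firstNodeValue($node, 'title'),
    'link'  => firstNodeValue($node, 'link'),
    'date'  => firstNodeValue($node, 'pubDate'), // null: the tag is missing
];

var_dump($row);
```

With this shape, a missing tag yields null (or whatever default is passed in) instead of an error, and the caller decides how to handle the gap.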