Strip tags and place a delimiter, or store the results in an array, using PHP

$url='http://abcd.com'; 
$d=stripslashes(file_get_contents($url)); 
echo strip_tags($d);

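But unfortunately all the values run together, like user14036100 9.00user23034003 11.33user32028000 14.00, which holds the attributes of user1, user2, and user3. Because strip_tags() joins everything together, it is hard to parse the attribute values.

So, friends, could you help me strip each tag and store the results in an array, or place a delimiter at the end of each stripped tag's data?

Thanks in advance :)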

2012-02-22 krishna

Could you provide a copy of the raw data you retrieve from the URL? That would help determine how to process the data. – 2012-02-22 10:17:05

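You can't achieve this with strip_tags(), because it simply removes the tags. What you want is to replace each tag with something, for example a whitespace character (newline, space, ...). You can do that with a regular expression that replaces all tags.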
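As a minimal sketch of the regular-expression idea, the snippet below replaces each run of tags with a delimiter and then splits the result into an array. The sample markup, the `|` delimiter, and the values are assumptions made for illustration (the real page was never posted), and a simple pattern like this will not cope with `>` inside attribute values or with data that already contains the delimiter.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = '<table><tr><td>user1</td><td>4036100</td><td>9.00</td></tr>'
      . '<tr><td>user2</td><td>3034003</td><td>11.33</td></tr></table>';

// Replace each run of adjacent tags (plus trailing whitespace) with one "|"
// instead of deleting it, then trim the delimiters left at both ends.
$delimited = trim(preg_replace('/(?:<[^>]*>\s*)+/', '|', $html), '|');

// Split the delimited string into an array of trimmed values.
$values = array_map('trim', explode('|', $delimited));

print_r($values);
```

Here `$delimited` is the delimiter-separated string and `$values` is the same data as an array, which covers both variants asked for in the question.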

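A better approach would be to parse the fetched page with DOMDocument, so that you can derive the structure of your data directly from the HTML structure.

Example usage of DOMDocument

Suppose you have the following HTML page: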

<!DOCTYPE html> 
<html> 
    <head> 
     <title>This is my title</title> 
    </head> 
    <body> 
     <table id="someDataHere"> 
      <tr> 
       <th>Country</th> 
       <th>Population</th> 
      </tr> 

      <tr> 
       <td>Germany</td> 
       <td>81,779,600</td> 
      </tr> 

      <tr> 
       <td>Belgium</td> 
       <td>11,007,020</td> 
      </tr> 

      <tr> 
       <td>Netherlands</td> 
       <td>16,847,007</td> 
      </tr> 

     </table> 
    </body> 
</html>

You can use DOMDocument to get the entries from the table:

$url = "..."; 
$dom = new DOMDocument("1.0", "UTF-8"); 
$dom->loadHTML(file_get_contents($url)); 

$preparedData = array(); 
$table = $dom->getElementById("someDataHere"); 
$tableRows = $table->getElementsByTagName('tr'); 

foreach ($tableRows as $tableRow) 
{ 
    $columns = $tableRow->getElementsByTagName('td'); 

    // skip the header row of the table - it has no <td>, just <th> 
    if (0 == $columns->length) 
    { 
     continue; 
    } 

    $preparedData[ $columns->item(0)->nodeValue ] = $columns->item(1)->nodeValue; 
}

$preparedData will now hold the following data:

Array 
(
    [Germany] => 81,779,600 
    [Belgium] => 11,007,020 
    [Netherlands] => 16,847,007 
)
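If a row has more than two cells, as in the question's user data, one possible variation is to collect every cell of every row into a nested array. This is only a sketch: the inline markup is a shortened, hypothetical copy of the example page, and DOMXPath is used here as one way to select the cells, not as the only one.

```php
<?php
// Hypothetical inline markup; in practice this would be the fetched page.
$html = '<table id="someDataHere">'
      . '<tr><th>Country</th><th>Population</th></tr>'
      . '<tr><td>Germany</td><td>81,779,600</td></tr>'
      . '<tr><td>Belgium</td><td>11,007,020</td></tr>'
      . '</table>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$rows = array();

// Visit every row of the table with the given id.
foreach ($xpath->query('//table[@id="someDataHere"]//tr') as $tr) {
    $cells = array();

    // Collect header and data cells alike, in document order.
    foreach ($xpath->query('th|td', $tr) as $cell) {
        $cells[] = trim($cell->nodeValue);
    }

    $rows[] = $cells;
}

print_r($rows);
```

Each element of `$rows` is then one table row, with the header row first, so the values no longer run together.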

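Some notes

Since you are developing a crawler (spider), you depend heavily on the HTML structure of the target page. Whenever its owners change something in the template, you may need to adjust your crawler.
This is just a simple example, but it should make clear how to use it to produce more advanced results.
Since DOMDocument implements the DOM methods, you have to work your way through the HTML structure using the possibilities those methods provide.
For very large HTML pages, DOMDocument can become quite expensive in terms of memory.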
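Because the crawler depends on the target page's template, it may help to fail gracefully when the expected table is missing. The following is a sketch under that assumption; `extractTableRows` is a hypothetical helper name, not part of the original answer, and it returns null instead of raising an error when the table id cannot be found.

```php
<?php
// Hypothetical helper: returns an array of "first cell => second cell"
// pairs, or null if the markup cannot be parsed or the table is missing.
function extractTableRows($html, $tableId)
{
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    // If the template changed and the id is gone, report that to the caller.
    $table = $dom->getElementById($tableId);
    if ($table === null) {
        return null;
    }

    $data = array();
    foreach ($table->getElementsByTagName('tr') as $row) {
        $columns = $row->getElementsByTagName('td');

        // Skip header rows and rows with fewer than two data cells.
        if ($columns->length < 2) {
            continue;
        }

        $key = trim($columns->item(0)->nodeValue);
        $data[$key] = trim($columns->item(1)->nodeValue);
    }

    return $data;
}
```

A caller can then check the result for null and log that the page layout has probably changed, rather than hitting a fatal error on a missing element.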

2012-02-22 11:02:21 apfelbox

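Thank you for the thought-provoking information. I would be even happier if you could give an example, since I am new to programming. ;) – krishna 2012-02-22 11:16:02

I adjusted my answer to include a small example script. – apfelbox 2012-02-22 12:09:34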
