Getting a text fragment with simplehtmldom

I'm trying to get some text using the simplehtmldom script. The HTML structure is as follows:

<div id="posts"> 
    <div align="center"> 
    <SEVERAL LEVELS OF HTML> 
     <strong>XXX</strong> 
    </SEVERAL LEVELS OF HTML> 
    </div> 
    <div align="center"> 
    <SEVERAL LEVELS OF HTML> 
     <strong>IGNORE</strong> 
    </SEVERAL LEVELS OF HTML> 
    </div> 
    <div align="center"> 
    <SEVERAL LEVELS OF HTML> 
     <strong>IGNORE</strong> 
    </SEVERAL LEVELS OF HTML> 
    </div> 
</div>

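What I want is the string XXX. It is the text of the first <strong> tag inside the first <div> with the attribute align="center", which is itself inside the <div> with id="posts". I'm not interested in the text of the other <div align="center"> tags.

The "several levels of HTML" include messy nested tables and so on.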

My code: I'm using a descendant selector, so I'm evidently "jumping" over several levels of HTML. Is that why my print_r shows "Trying to get property of non-object"?

$html = file_get_html($page_1); 
$es = $html->find('div#posts div[align=center] strong'); 
print_r($es->plaintext); die;

Strangely, the following statement also produces the same "Trying to get property of non-object" result. What exactly am I doing wrong?

$es = $html->find('div#posts');


2011-02-09 stef

Two possible causes:

In $html = file_get_html($page_1);, $page_1 may not be a URL. If it is a string containing HTML, use str_get_html instead: $html = str_get_html('<div id="hello">Hello</div><div id="world">World</div>');
The HTML contains more than one div#posts (it shouldn't).
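
As a rough sketch of the first cause, the snippet below parses a string with str_get_html and then reads the text of the first matching <strong>. The sample markup, the variable names ($markup, $strong, $text) and the location of simple_html_dom.php are assumptions made for illustration; they are not taken from the question. It also passes an index to find(): as far as the library's documented behaviour goes, find() without an index returns an array of elements, so that may be worth checking as well when reading ->plaintext fails.

```php
<?php
// Assumption: simple_html_dom.php sits next to this script.
require_once 'simple_html_dom.php';

// Hypothetical sample markup standing in for the real page.
$markup = '<div id="posts">'
        . '<div align="center"><table><tr><td><strong>XXX</strong></td></tr></table></div>'
        . '<div align="center"><table><tr><td><strong>IGNORE</strong></td></tr></table></div>'
        . '</div>';

// The input is a string of HTML, so str_get_html() is used rather than file_get_html().
$html = str_get_html($markup);
if (!$html) {
    die('Could not parse the HTML');
}

// Passing an index (0) asks find() for a single element instead of an array.
$strong = $html->find('div#posts div[align=center] strong', 0);

// find() returns null when nothing matches, so check before reading ->plaintext.
$text = $strong ? $strong->plaintext : null;
echo $text, "\n";
```

If $text comes back as null here, the selector matched nothing, which points at the markup or the selector rather than at how the document was loaded.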


2011-02-09 10:40:42 Shikiryu
