2017-03-05 155 views

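Unable to get child elements of a div element in XPath

I have the code below, from which I am trying to extract the 'Washington square - USA' part. It sits inside div/p/strong, but the div has a class, as you can see.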

Below is the relevant code, or you can see the entire code in pastebin.

<div class="content clearfix"> 
<p><strong>Washington square - USA<br> 
</strong></p> 
<p><strong>2 studios for rent – env. 54m2</strong></p> 
<p><strong>near public transport</strong></p> 
<p>Studios comprise</p> 
<ul> 
<li>A kitchen</li> 
<li>A bedroom</li> 
<li>Tolilet with bathtab</li> 
</ul> 
<p>Visitation date (not yet known)</p> 
<p>To rent from 1st april</p> 
<p>(Current owner : Ben)</p> 
<p><strong>For more details visit: http://example.com<br> 
</strong></p> 
<p><strong>&nbsp;</strong></p> 
    </div> 

I have tried the following XPath expressions to get the desired output:

//div[contains(@class, "content")]/p/strong 
//div[contains(@class, "content") and contains(@class, 'clearfix')]/p/strong 
//div[contains(@class, "content") and contains(@class, 'clearfix')]/p[1]/strong 
//string(div[contains(@class, "content") and contains(@class, 'clearfix')]/p/strong) 
//div[contains(@class, "content") and contains(@class, 'clearfix')]/p/strong/text() 

But none of them work. I am parsing the page with this code:

$document = new \DOMDocument(); 
$document->loadHTMLFile($htmlUrl); 
$xpath = new \DOMXPath($document); 

foreach ($xpath->evaluate('//div[contains(@class, "content")]//p[1]') as $div) { 
    # Also tried with these 
    //div[contains(@class, "content")]/p/strong 
    //div[contains(@class, "content") and contains(@class, 'clearfix')]/p/strong 
    //div[contains(@class, "content") and contains(@class, 'clearfix')]/p[1]/strong 
    //string(div[contains(@class, "content") and contains(@class, 'clearfix')]/p/strong) 
    //div[contains(@class, "content") and contains(@class, 'clearfix')]/p/strong/text() 
    var_dump($div); 
} 

I don't see any PHP code or the XPath in your pastebin reference. – trincot


The XPath works. Please show the code where you apply it, what result you get, and what you expected. – trincot


Use `$div->textContent`, [as in this example](https://eval.in/748195). – trincot

Answer

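To select the element: //div[contains(@class, 'content')]/p[1]/strong

Then take its textContent.

Or, to select the text directly: //div[contains(@class, 'content')]/p[1]/strong/text()

Also, your XML is not well formed, because of the <br>.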
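A minimal sketch of both approaches, for illustration only. It assumes a trimmed inline copy of the question's markup (the `$html` variable is hypothetical and stands in for the real page) and loads it with `DOMDocument::loadHTML`, which uses libxml's HTML parser, so the unclosed `<br>` should not be an obstacle here. The variable names `$title` and `$sameTitle` are made up for this example.

```php
<?php
// Trimmed inline copy of the markup from the question (an assumption made so
// the sketch runs without fetching a remote page).
$html = <<<'HTML'
<div class="content clearfix">
<p><strong>Washington square - USA<br>
</strong></p>
<p><strong>near public transport</strong></p>
<p>Studios comprise</p>
</div>
HTML;

// Collect libxml parser warnings instead of printing them.
libxml_use_internal_errors(true);

$document = new DOMDocument();
$document->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($document);

// Approach 1: select the <strong> element, then read its textContent.
// trim() drops the newline that follows the <br> inside the element.
$nodes = $xpath->query('//div[contains(@class, "content")]/p[1]/strong');
$title = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;

// Approach 2: let XPath return a string directly. evaluate() is needed here
// because the expression yields a string rather than a node list.
$sameTitle = $xpath->evaluate(
    'normalize-space(//div[contains(@class, "content")]/p[1]/strong)'
);

var_dump($title, $sameTitle);
```

Under these assumptions, both variables should hold 'Washington square - USA'. If the real page returns nothing, it may be worth checking that `$nodes->length` is non-zero and that the div with class "content" is actually present in the HTML that was fetched.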


None of these work; they all return nothing. – user7342807