How do I retrieve specific data from HTML using XPath?

Hey, I'm having a hard time trying to get a stock price from a website using XPath.

The HTML looks like this:

<span class=" price"> 
<meta content="14.400" itemprop="price"> 
14.400 
<span itemprop="priceCurrency"> BRL</span> 
</span>

The paths I have tried in order to retrieve the 14.400 value (all of them return nothing) are:

@"//span[@class=' price']"; 
@"/span[@class=' price']"; 
@"span[@class=' price']"; 
@"//meta[@itemprop='price']"; 
@"/html/body/div[2]/div/div/div/div[2]/span/meta"; 
@"//html/body/div[2]/div/div/div/div[2]/span/meta";

After many attempts, the closest I got to what I need was with this XPath:

@"//span[@class=' price']/meta";

which produces this log:

2014-02-07 13:50:39.616 manejoderisco[2838:60b] { 
nodeAttributeArray =  (
      { 
     attributeName = itemprop; 
     nodeContent = price; 
    }, 
      { 
     attributeName = content; 
     nodeContent = "14.280"; 
    } 
); 
nodeName = meta; 
}

But I still get an empty value back...

2014-02-07 Marco Almeida

Your HTML is not well-formed... (the 'meta' tag is never closed). Is this really the code you are dealing with? That may not help. – Robin

Yes, I know the meta tag isn't closed, but the original code is like that, with no closing tag. –

I finally managed to build the right XPath, which is this one:

@"//span/meta/@content";

2014-02-07 18:22:14

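The HTML you are trying to parse is not well-formed, because meta has no closing tag.
However, if you are indeed able to catch the meta tag, you probably want to select its content: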

//span[@class=' price']/meta/@content

Or, if you need the first text node,

//span[@class=' price']//text()[1]

might work as well.

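Don't forget that when you use //span/meta you are selecting the meta node, so <meta content="14.400" itemprop="price">14.400 (depending on where whatever evaluates the XPath decides the element ends, since the HTML is not well-formed). If you want the value, you need to select the @content attribute, or the text node with text().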
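To make the node-versus-value distinction concrete, here is a minimal sketch in Python using only the standard library. It is not the asker's Objective-C setup, and it rests on several assumptions: the meta tag is self-closed so that the XML parser accepts the snippet, the fragment is wrapped in a body element, and extract_price is a hypothetical helper name. ElementTree only supports a subset of XPath and cannot return an attribute through a trailing /@content step, so the attribute is read from the matched element instead.

```python
import xml.etree.ElementTree as ET

# Assumption: <meta> is self-closed here so the strict XML parser accepts it;
# the page in the question leaves the tag unclosed.
SNIPPET = """<body>
<span class=" price">
<meta content="14.400" itemprop="price"/>
14.400
<span itemprop="priceCurrency"> BRL</span>
</span>
</body>"""


def extract_price(markup):
    """Hypothetical helper: return (attribute value, trailing text) for the price."""
    root = ET.fromstring(markup)
    # This path matches the <meta> element itself, not its value.
    # Note the leading space in ' price': the predicate is an exact string match.
    meta = root.find(".//span[@class=' price']/meta")
    if meta is None:
        return None, None
    # Equivalent of selecting @content: read the attribute off the node.
    from_attribute = meta.get("content")
    # The text that follows the <meta> element is stored as its "tail".
    from_text = (meta.tail or "").strip()
    return from_attribute, from_text


attribute_value, text_value = extract_price(SNIPPET)
print(attribute_value, text_value)
```

With a full XPath engine, the same two results would come from the @content and text() expressions shown above; the point is only that matching the meta node is not the same as reading its value.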

2014-02-07 18:22:24 Robin
