XPath fails to return the head of an HTML document

I am trying to read the page at http://www.aspemail.com with HtmlAgilityPack, but it fails to read the head element and returns null.

    HtmlAgilityPack.HtmlDocument htmlDocument = new HtmlDocument();
    System.Net.WebClient webClient = new System.Net.WebClient();
    string download = webClient.DownloadString(linkDetails.Url);

    htmlDocument.LoadHtml(download);
    HtmlNode htmlNode = htmlDocument.DocumentNode.SelectSingleNode("html/head");

But when I inspect it at a breakpoint, htmlNode is null. Am I using this correctly?


2013-07-06 Spirals Whirls

SelectSingleNode("html/head");

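Have you looked at the source of this site? There is no opening <html> node in it. There is only a closing </html> at the very end, and the source starts directly with <head>. OMG, it is unbelievable what kind of people write websites these days.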

You could adapt your selector like this:

HtmlNode htmlNode = htmlDocument.DocumentNode.SelectSingleNode("head");


2013-07-06 10:29:49

I checked with Opera, and there is an html node in it.

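I browsed it with Google Chrome and there is no `<html>`. Since you are using WebClient and have not specified a `User-Agent` request header, my guess is that the site does not return `<html>` to you. One possibility is to send Opera's User-Agent to trick the site into thinking it is being visited by Opera, so that it ends up rendering `<html>`. Or simply adjust your selector: `HtmlNode htmlNode = htmlDocument.DocumentNode.SelectSingleNode("head");`.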
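A minimal sketch of the User-Agent idea from the comment above, assuming the same `System.Net.WebClient` as in the question. The `Downloader` class, the `CreateClient` name, and any User-Agent string passed to it are hypothetical, and whether the site actually changes its markup based on this header is only a guess.

```csharp
using System.Net;

public static class Downloader
{
    // Hypothetical helper: returns a WebClient that sends the given
    // User-Agent string with its next request.
    public static WebClient CreateClient(string userAgent)
    {
        var client = new WebClient();
        // HttpRequestHeader.UserAgent maps to the "User-Agent" request header.
        client.Headers[HttpRequestHeader.UserAgent] = userAgent;
        return client;
    }
}
```

It could be used as `string download = Downloader.CreateClient(someUserAgent).DownloadString(linkDetails.Url);`, where `someUserAgent` is whatever browser string you choose to imitate. Setting the header right before each download is the safer habit, since the header may not persist across requests on every .NET version.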

OK, so how do I resolve this inconsistency? I have to read arbitrary pages, and I cannot guess the document structure of each one.
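One way to avoid depending on the page structure, sketched here as an assumption rather than a confirmed fix for this site, is the XPath descendant form `//head`, which matches a head element at any depth, whether or not an `<html>` wrapper is present. The `HeadFinder` class and `FindHead` method are hypothetical names.

```csharp
using HtmlAgilityPack;

public static class HeadFinder
{
    // Hypothetical helper: returns the first <head> element found anywhere
    // in the parsed document, or null if the page has none.
    public static HtmlNode FindHead(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        // "//head" searches all descendants of the document root, so it
        // does not require <head> to sit directly under <html>.
        return doc.DocumentNode.SelectSingleNode("//head");
    }
}
```

The result can still be null for pages that have no head element at all, so the caller should check for that before using it.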
