2014-10-05

HtmlAgilityPack: getting text from paragraph tags

I am trying to get the text of the paragraph tags inside a div using HtmlAgilityPack 2.28 in a Windows Phone 8.1 app.

The structure of the div is:

<div id="55"> 

<p>&nbsp;</p> 

<p><span class="dropcap">W 

</span><span class="zw-portion"><strong>ith the start of festive season in India</strong>, we 
will also witness the f<strong>irst London Derby</strong> of the season  
between the newly London rivals <strong>Chelsea and Arsenal</strong>. It will be a great chance 
for Arsene Wenger to get rid of his <strong>1000</strong></span> 

<strong><span class="zw-portion">th</span><span class="zw-portion"> managed </span> 

<span class="zw-portion">6-0 </span> 

<span class="zw-portion">massacre</span></strong> 

<span class="zw-portion"> in March,</span> 

<span class="zw-portion">&nbsp;</span> 

<span class="zw-portion">while the Special One will be eager to continue his winning rampage 
</span> 

<span class="zw-portion">&nbsp;</span> 

<span class="zw-portion">over his “<strong>Specialist in Failure</strong>” counterpart. Although 
both clubs can boast of being unbeaten this season and both clubs can take this opportunity 
</span> 

<span class="zw-portion"> to bring down their rival</span><span class="zw-portion">.</span></p> 

<p>&nbsp;</p> 

<p><iframe width="640" height="360" src="https://www.youtube.com/embed/zFBN8M1pCxo? 
feature=oembed" frameborder="0" allowfullscreen=""></iframe></p> 

<p class="zw-paragraph" data-textformat=" 
{&quot;type&quot;:&quot;text&quot;,&quot;td&quot;:&quot;none&quot;}"></p> 

<p class="zw-paragraph" data-textformat=" 
{&quot;type&quot;:&quot;text&quot;,&quot;td&quot;:&quot;none&quot;}"> 

<span class="zw-portion">The rivalry between Chelsea and Arsenal was not as a primary London 
Derby, until Chelsea rose to top of Premier League in 2000’s, when they consistently competed 
against each other. The rivalry between the two clubs rose higher as compared to their 
traditional rivals. Both the clubs rivalry are now not only limited to their pitch but has also 
been to the fans. In 2009 survey by Football Fans Census, Arsenal fans named Chelsea as the 

<strong>most disliked club</strong> </span> 

<span class="zw-portion"> ahead of their traditional rivals <strong>Manchest</strong></span> 
<strong> <span class="zw-portion">er United and Tottenham Hotspur</span></strong> 

<span class="zw-portion">. However the report of the other camp doesn’t differ much as Chelsea 
fans ranks Arsenal as their <strong>second most-disliked club</strong></span> 

<strong><span class="zw-portion">. 
</span></strong></p> 
</div> 

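I want to extract only the text contained in the paragraph elements inside the div. So far I have written the following code, where feedurl holds the address of the page to extract data from (the correct address is retrieved). After that, I try to get a reference to the div using its id (which is always 55).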

var feedurl = GetValue("feedurl"); 
string htmlPage = "asdsad"; 
HtmlDocument htmldoc = new HtmlDocument(); 
htmldoc.LoadHtml(feedurl); 
htmldoc.OptionUseIdAttribute=true; 
HtmlNode div = htmldoc.GetElementbyId("55"); 
if (div != null) 
{ 
    htmlPage += "done"; 
} 

_content = htmlPage; 
return _content; 

htmldoc.GetElementbyId("55"); is returning a null reference. I have read about using htmldoc.DocumentNode.SelectNodes([arguments]), but there is no SelectNodes method available to me. I am lost as to how to proceed. Please help.

Answer


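The HtmlAgilityPack build for WP 8.1 does not support SelectNodes(), because that method needs an XPath implementation, which is unfortunately missing from the .NET version for WP 8.1.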

The solution is to use HtmlAgilityPack's LINQ API instead of XPath. For example, to get the <div> element whose id attribute equals 55:

HtmlNode div55 = htmldoc.DocumentNode 
         .Descendants("div") 
         .FirstOrDefault(o => o.GetAttributeValue("id", "") 
                == "55"); 
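
Building on the same LINQ approach, the paragraph text the question asks for can be collected with Descendants("p") and InnerText. The following is a minimal sketch and not part of the original answer: the class ParagraphText, the method Extract, and the helper CollapseWhitespace are hypothetical names chosen for illustration, and it assumes the page markup (not a URL) is already available as a string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ParagraphText
{
    // Returns the decoded, whitespace-normalized text of every non-empty <p>
    // inside the <div> whose id equals divId. Returns an empty list if the
    // div is not found.
    public static List<string> Extract(string html, string divId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // expects HTML markup, not an address

        HtmlNode div = doc.DocumentNode
            .Descendants("div")
            .FirstOrDefault(d => d.GetAttributeValue("id", "") == divId);
        if (div == null)
        {
            return new List<string>();
        }

        return div.Descendants("p")
            // InnerText concatenates the text of all nested spans and strongs;
            // DeEntitize turns entities such as &nbsp; into characters.
            .Select(p => HtmlEntity.DeEntitize(p.InnerText))
            .Select(t => CollapseWhitespace(t))
            .Where(t => t.Length > 0) // drops paragraphs that held only &nbsp;
            .ToList();
    }

    // Hypothetical helper: collapses runs of whitespace (including the
    // non-breaking space) into single spaces and trims both ends.
    private static string CollapseWhitespace(string text)
    {
        string[] parts = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", parts);
    }
}
```

Calling ParagraphText.Extract(html, "55") would then give one string per non-empty paragraph. Because the markup in the question splits words across spans and line breaks (for example the drop-cap "W"), some words may come out with an extra space after normalization.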

I will have to use System.Linq, right? If after this I use 'if (div55 != null) { do something }', it doesn't do anything. If I use 'div55.InnerText', I get a NullReferenceException. – user3263192 2014-10-06 09:14:28


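Make sure you have correctly loaded the HTML into the 'HtmlDocument' (you can check this via the 'DocumentNode.OuterHtml' property, to see whether it contains the expected HTML markup) – har07 2014-10-06 10:28:58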


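'DocumentNode.OuterHtml' returns the page URL stored in the feedurl variable. Is that right? Forgive me for asking such a silly question; I am new to this and could not find an answer online. – user3263192 2014-10-06 10:46:43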
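
The observation in the last comment points at a likely cause: LoadHtml parses the string it receives as markup, so passing the URL produces a document whose only content is the URL text, and no div can be found in it. Below is a minimal sketch of one way to download the page first and parse the result. It assumes System.Net.Http.HttpClient is available to the app; PageLoader, LoadFromUrlAsync, and Parse are hypothetical names used only for illustration.

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

public static class PageLoader
{
    // Downloads the page at feedurl and parses the returned markup.
    public static async Task<HtmlDocument> LoadFromUrlAsync(string feedurl)
    {
        using (var client = new HttpClient())
        {
            // GetStringAsync returns the response body, i.e. the HTML itself.
            string html = await client.GetStringAsync(feedurl);
            return Parse(html);
        }
    }

    // Parses a string of HTML markup into an HtmlDocument.
    public static HtmlDocument Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.OptionUseIdAttribute = true; // set options before loading
        doc.LoadHtml(html);
        return doc;
    }
}
```

With the document loaded this way, checking DocumentNode.OuterHtml as suggested above should show the page markup rather than the address, and the LINQ lookup for the div with id 55 has real elements to search.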