2012-09-16

Unable to get values from HTML code using html-agility-pack

I want to extract values from the HTML below, and I am using this C# code with HtmlAgilityPack to do it.

I only want the address and the phone number.

<div class="company-info"> 
    <div id="o-company" class="edit-overlay-section" style="padding-top:5px; width: 400px;"> 
     <a href="http://www.manta.com/c/mm23df2/us-cellular" class="company-name"> 
      <h1 class="profile-company_name" itemprop="name">US Cellular</h1> 
     </a> 
    </div>  
    <div class="addr addr-co-header-gamma" itemprop="address" itemscope="" itemtype="http://schema.org/PostalAddress"> 
     <em>United States Cellular Corporation</em> 
     <div class="company-address"> 
     <div itemprop="streetAddress">2401 12th Avenue NW # 104B</div> 
      <span class="addressLocality" itemprop="addressLocality">Ardmore</span>, 
      <span class="addressRegion" itemprop="addressRegion">OK</span>  
      <span class="addresspostalCode" itemprop="postalCode">73401-1471</span> 
     </div> 
     <dl class="phone_info"><dt>Phone:</dt> 
     <dd class="tel" itemprop="telephone">(580) 490-3333</dd> 
... 

C# code:

private HtmlDocument ParseLink(string URL) 
{ 
    HtmlDocument hDoc = new HtmlDocument(); 
    try 
    { 
     WebClient wClient = new WebClient(); 

     byte[] bData = wClient.DownloadData(pageurl); 

     hDoc.LoadHtml(ASCIIEncoding.ASCII.GetString(bData)); 
     Response.Write("<table><tr><td>"); 

     foreach (HtmlNode hNode in hDoc.DocumentNode.SelectNodes("//div[@itemprop='company-address']")) 
     { 
      Response.Write(hNode.InnerText.ToString()); 
     } 
     Response.Write("</tr></td><td>"); 

     foreach (HtmlNode hNode in hDoc.DocumentNode.SelectNodes("//span[@itemprop='addressLocality']")) 
     { 

      Response.Write(hNode.InnerText.ToString()); 
     } 
     Response.Write("</tr></td><td>"); 

     foreach (HtmlNode hNode in hDoc.DocumentNode.SelectNodes("//span[@itemprop='addressRegion']")) 
     { 
      Response.Write(hNode.InnerText.ToString()); 
     } 

     Response.Write("</tr></td><td>"); 

     foreach (HtmlNode hNode in hDoc.DocumentNode.SelectNodes("//span[@itemprop='postalCode']")) 
     { 
      Response.Write(hNode.InnerText.ToString()); 
     } 

     Response.Write("</tr></td><td>"); 

     foreach (HtmlNode hNode in hDoc.DocumentNode.SelectNodes("//dd[@itemprop='telephone']")) 
     { 
      Response.Write(hNode.InnerText.ToString()); 
     } 
     Response.Write("</td>"); 
     Response.Write("</tr></table>"); 

    } 
    catch (Exception ex) 
    { 
     Response.Write(ex.Message); 
     hDoc.LoadHtml(""); 
    } 

    return hDoc; 
} 

But when I run this code, I get this error:

"Object reference not set to an instance of an object" 

Can anyone help me? Thanks.

Can you provide the full stack trace? – mslot

That is the exception message. –

Guys, any help? –

Answer

You need to provide more information about the exception you are getting (such as the line on which it is thrown), but...

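The SelectNodes method returns null if no nodes match the XPath expression, which means you have to check whether the return value is null before iterating over the nodes. For example: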

var companyAddressNodes = hDoc.DocumentNode.SelectNodes("//div[@itemprop='company-address']"); 

if (companyAddressNodes == null) { 
    // Throw a proper exception here, log the error, or handle it however you want...
    throw new Exception("No company address node found. Perhaps the page layout changed?"); 
} 

// Safe to iterate: the collection is known to be non-null at this point.
foreach (HtmlNode hNode in companyAddressNodes) 
{ 
    Response.Write(hNode.InnerText.ToString()); 
}
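
As a rough sketch of the same idea applied to every field, the null check can be wrapped in a small helper. Note that in the HTML posted in the question, company-address is a class value, not an itemprop value, so //div[@itemprop='company-address'] matches nothing there, while //div[@itemprop='streetAddress'] does. The names AddressScraper and TextOrEmpty are hypothetical, and the inline HTML is a trimmed copy of the question's markup; this assumes the real page looks the same.

```csharp
using System;
using HtmlAgilityPack;

static class AddressScraper
{
    // Hypothetical helper: returns the trimmed text of the first node that
    // matches the XPath, or an empty string when nothing matches.
    static string TextOrEmpty(HtmlNode root, string xpath)
    {
        HtmlNode node = root.SelectSingleNode(xpath); // null when there is no match
        return node == null ? string.Empty : node.InnerText.Trim();
    }

    static void Main()
    {
        // Trimmed copy of the markup from the question (assumed representative).
        const string html = @"
<div class=""company-address"">
  <div itemprop=""streetAddress"">2401 12th Avenue NW # 104B</div>
  <span itemprop=""addressLocality"">Ardmore</span>,
  <span itemprop=""addressRegion"">OK</span>
  <span itemprop=""postalCode"">73401-1471</span>
</div>
<dl class=""phone_info""><dt>Phone:</dt>
  <dd class=""tel"" itemprop=""telephone"">(580) 490-3333</dd>
</dl>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNode root = doc.DocumentNode;

        // Each lookup is null-safe, so a missing field yields "" instead of
        // a NullReferenceException.
        Console.WriteLine(TextOrEmpty(root, "//div[@itemprop='streetAddress']"));
        Console.WriteLine(TextOrEmpty(root, "//span[@itemprop='addressLocality']"));
        Console.WriteLine(TextOrEmpty(root, "//span[@itemprop='addressRegion']"));
        Console.WriteLine(TextOrEmpty(root, "//span[@itemprop='postalCode']"));
        Console.WriteLine(TextOrEmpty(root, "//dd[@itemprop='telephone']"));

        // The XPath from the question finds no node, because company-address
        // is a class here; the helper returns an empty string for it.
        bool found = TextOrEmpty(root, "//div[@itemprop='company-address']").Length > 0;
        Console.WriteLine(found);
    }
}
```

Whether to return an empty string or throw, as in the snippet above, depends on how you want to treat a page whose layout has changed; throwing makes the problem visible sooner.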