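Reading specific content from a web page?

I want to build an application (in C#) that has to fetch word meanings from sites such as wiktionary.com or dictionary.com. However, I have never worked with XML, or with web pages at all.

I managed to get the response from the web page (for example, for a specific word from dictionary.com), which I hope is in XML format.

This is what I got: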

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Strict//EN"> 
<!--attributes for answers reference--> 
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:fb="http://www.facebook.com/2008/fbml" xmlns:og="http://opengraphprotocol.org/schema/"> 
<head> 
<title> 
Hello | Define Hello at Dictionary.com 
</title> 
<meta name="description" content="Hello definition, (used to express a greeting, answer a telephone, or attract attention.) See more."/> 
<meta name="keywords" content="hello, online dictionary, English dictionary, hello definition, define hello, definition of hello, hello pronunciation, hello meaning, hello origin, hello examples"/> 
<link rel="canonical" href="http://dictionary.reference.com/browse/hello"/> 
<meta property="og:title" content="the definition of hello"/> 
<meta property="og:site_name" content="Dictionary.com"/> 
<meta property="og:image" content="http://sp2.dictionary.com/en/i/dictionary/facebook/dictionary_logo.png"/>

Now I want to parse the following string out of the response for the word "hello":

used to express a greeting, answer a telephone, or attract attention.

I tried using XmlReader but got stuck. Can someone help me read this content?

Source

2011-04-11 Ankit

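Be careful with screen scraping, if that is what you are doing. It often violates the site's terms and conditions, and your implementation will be tightly coupled to their HTML format. If they change their site, your code will often stop working. – BrandonZeider 2011-04-11 13:22:29

You can parse HTML easily with the HTML Agility Pack.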

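using System;
using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
// replace with your own content
doc.Load("file.htm");
// SelectNodes returns null when nothing matches, so check before iterating
HtmlNodeCollection metas = doc.DocumentNode.SelectNodes("//meta[@name='description']");
if (metas != null)
{
    foreach (HtmlNode meta in metas)
    {
        HtmlAttribute att = meta.Attributes["content"];
        Console.WriteLine(att.Value);
    }
}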
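
The content attribute in the question holds more than the definition itself ("Hello definition, (used to express ...) See more."), while the asker only wants the part inside the parentheses. A minimal sketch of one way to cut that out, using only the standard library. The class and method names (DefinitionExtractor, ExtractDefinition) are made up for illustration, and the approach assumes the definition always sits between the first "(" and the last ")", which may not hold for every word on the site.

```csharp
using System;

public static class DefinitionExtractor
{
    // Hypothetical helper: returns the text between the first '(' and the
    // last ')' of the meta description. If no such pair exists, the trimmed
    // input is returned unchanged. Returns null for null input.
    public static string ExtractDefinition(string content)
    {
        if (content == null)
        {
            return null;
        }

        int open = content.IndexOf('(');
        int close = content.LastIndexOf(')');
        if (open < 0 || close <= open)
        {
            return content.Trim();
        }

        return content.Substring(open + 1, close - open - 1).Trim();
    }

    public static void Main()
    {
        // The description string is copied from the response shown in the question.
        string content = "Hello definition, (used to express a greeting, "
            + "answer a telephone, or attract attention.) See more.";
        Console.WriteLine(ExtractDefinition(content));
    }
}
```

In the loop above, att.Value could be passed to such a helper before printing.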

Source

2011-04-11 13:18:12 mathieu

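His response is XHTML (see the header), so an XML parser will work fine. – 2011-04-11 13:20:36

According to the doctype, it is HTML. But you are right: an XML parser can work well, provided the HTML is perfectly well formed, with tags properly closed and no funky characters (for example). – mathieu 2011-04-11 13:24:04

Thanks for the reply. Could you also explain how to use an XML parser here? I am trying to learn how to work with XML. – Ankit 2011-04-11 14:19:23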
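
On the XML parser question in the comments: a minimal sketch of reading the description with XmlReader, under the assumption stated above that the markup is well-formed XML. The sample string is a trimmed, hand-written stand-in for the real response, and the names MetaDescriptionReader and ReadDescription are made up for illustration. The actual page starts with an HTML 4.0 doctype that has no system identifier, which an XML parser is likely to reject, so the real response would probably need cleaning up (or an HTML parser) first.

```csharp
using System;
using System.IO;
using System.Xml;

public static class MetaDescriptionReader
{
    // Hypothetical helper: walks the document node by node and returns the
    // content attribute of the first <meta name="description"> element,
    // or null if there is none. Throws XmlException on malformed input.
    public static string ReadDescription(string xhtml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xhtml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element
                    && reader.LocalName == "meta"
                    && reader.GetAttribute("name") == "description")
                {
                    return reader.GetAttribute("content");
                }
            }
        }

        return null;
    }

    public static void Main()
    {
        // Assumed sample: well-formed, no doctype, only the relevant tags.
        string sample =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head>"
            + "<title>Hello | Define Hello at Dictionary.com</title>"
            + "<meta name=\"description\" content=\"Hello definition, "
            + "(used to express a greeting, answer a telephone, or attract attention.) See more.\"/>"
            + "</head></html>";

        Console.WriteLine(ReadDescription(sample));
    }
}
```

XmlReader is forward-only, so the loop stops at the first match instead of loading the whole document into memory.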

You could use a web service such as http://services.aonaware.com/, which would suit you better than an ad-driven website :-).

http://words.bighugelabs.com/api.php is another one, with a simpler API.

Source

2011-04-11 13:20:38
