How to store part of a very large HTML stream?

I have to fetch the HTML of a web page and then find this class in it:

<span class='uccResultAmount'>0,896903</span>

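I tried regular expressions, and I also tried a stream, by which I mean storing the whole HTML in a string. However, the HTML seems to be too large for a string, which makes this impossible: the amount I am searching for, 0,896903, is not present in the string.

Is there any way to read only a small chunk of the stream?

Part of the method: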

// Requires: using System.IO; using System.Net; using System.Text;
public static string getValue()
{
    string data = "not found";
    string urlAddress = "http://www.xe.com/es/currencyconverter/convert/?Amount=1&From=USD&To=EUR";

    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);

    // Dispose the response and the reader even if reading throws.
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    {
        if (response.StatusCode == HttpStatusCode.OK)
        {
            // Fall back to UTF-8 (the StreamReader default) when no charset is reported.
            Encoding encoding = string.IsNullOrEmpty(response.CharacterSet)
                ? Encoding.UTF8
                : Encoding.GetEncoding(response.CharacterSet);

            using (Stream receiveStream = response.GetResponseStream())
            using (StreamReader readStream = new StreamReader(receiveStream, encoding))
            {
                data = readStream.ReadToEnd(); // the string in which I should search for the amount
            }
        }
    }

    return data;
}

If you know a simpler way to solve my problem, please let me know.

2016-10-06 Oscar Martinez

I would use HtmlAgilityPack and XPath:

var web = new HtmlAgilityPack.HtmlWeb(); 
var doc = web.Load("http://www.xe.com/es/currencyconverter/convert/?Amount=1&From=USD&To=EUR"); 
var value = doc.DocumentNode.SelectSingleNode("//span[@class='uccResultAmount']") 
       .InnerText;

A LINQ version is also possible:

var value = doc.DocumentNode.Descendants("span") 
      .Where(s => s.Attributes["class"] != null && s.Attributes["class"].Value == "uccResultAmount") 
      .First() 
      .InnerText;

Don't use this. It is only here to show that the claim "but the problem is that this site's code does not fit in a string" is not correct:

string html = new WebClient().DownloadString("http://www.xe.com/es/currencyconverter/convert/?Amount=1&From=USD&To=EUR"); 
var val = Regex.Match(html, @"<span[^>]+?class='uccResultAmount'>(.+?)</span>") 
       .Groups[1] 
       .Value;

2016-10-06 20:48:27

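Would there be a way to do this without using HtmlAgilityPack?

@OscarM You need a tool to parse HTML. You can't use regular expressions: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

But the problem is that this HTML does not fit in a single string, so I end up parsing something that does not contain the substring I need.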
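
On the original question of reading only a small part of the stream: if keeping the whole page in memory really were a concern, the response could be scanned chunk by chunk instead of calling ReadToEnd. The following is only a minimal sketch under stated assumptions, not a replacement for a real HTML parser. `StreamScanner`, `FindBetween` and `chunkSize` are hypothetical names introduced for illustration, and the plain string match assumes the opening tag appears exactly as written in the question.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamScanner
{
    // Hypothetical helper: returns the text between startMarker and endMarker,
    // or null if the pair never appears. The input is read in small chunks.
    public static string FindBetween(TextReader reader, string startMarker,
                                     string endMarker, int chunkSize = 4096)
    {
        var window = new StringBuilder();
        var buffer = new char[chunkSize];
        bool capturing = false;
        int read;

        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            window.Append(buffer, 0, read);

            if (!capturing)
            {
                int start = window.ToString().IndexOf(startMarker, StringComparison.Ordinal);
                if (start >= 0)
                {
                    // Drop everything up to and including the start marker.
                    window.Remove(0, start + startMarker.Length);
                    capturing = true;
                }
                else if (window.Length >= startMarker.Length)
                {
                    // Keep only a tail long enough to hold a marker split across two chunks.
                    window.Remove(0, window.Length - (startMarker.Length - 1));
                }
            }

            if (capturing)
            {
                // While capturing, the window grows until the end marker shows up.
                int end = window.ToString().IndexOf(endMarker, StringComparison.Ordinal);
                if (end >= 0)
                {
                    return window.ToString(0, end);
                }
            }
        }

        return null; // markers not found
    }
}
```

With the method from the question, this could be called as `StreamScanner.FindBetween(readStream, "<span class='uccResultAmount'>", "</span>")` in place of `readStream.ReadToEnd()`, since a StreamReader is a TextReader. The sketch stops reading as soon as the closing marker is found, so the rest of the page is never buffered.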
