Getting the HTML source in a Windows Forms application

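I am creating an Internet Explorer add-on using BandObjects and a C# Windows Forms application, and I am testing the parsing of HTML source code. So far I have been parsing information based on the URL of the site.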

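I would like to get the HTML source of the current page on a site where I am logged in. If I use the URL of the page I am on, it always grabs the source of the login page instead of the actual page, because the site does not recognize my application as logged in. Do I need to store my login credentials and use some kind of API for the site? Or is there a way to grab the HTML of the current page directly? I would prefer the latter, since it seems like less hassle. Thanks!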

2011-10-20 Drew

I use this method in one of my applications:

    // Requires: using System; using System.IO; using System.Net;
    //           using System.Text; using System.Windows.Forms;
    private static string RetrieveData(string url)
    {
        try
        {
            // prepare the request for the web page we will be asking for
            var request = (HttpWebRequest)WebRequest.Create(url);

            /* Uncomment to access the site through a proxy:
            var proxyUri = new Uri("http://proxy.com:80");
            request.Proxy = new WebProxy(proxyUri);
            request.Proxy.Credentials = new NetworkCredential("proxyuser", "proxypassword"); */

            // execute the request; the using blocks release the connection and streams
            using (var response = (HttpWebResponse)request.GetResponse())
            using (Stream resStream = response.GetResponseStream())
            // decode as UTF-8 instead of ASCII so non-ASCII characters are not lost;
            // check response.CharacterSet if the site uses a different encoding
            using (var reader = new StreamReader(resStream, Encoding.UTF8))
            {
                // read the whole response body as text
                return reader.ReadToEnd();
            }
        }
        catch (Exception exception)
        {
            MessageBox.Show(@"Failed to retrieve data from the network. Please check your internet connection: " +
                            exception);
            return string.Empty;
        }
    }

You only need to pass in the URL of the web page whose source you want to retrieve.

For example:

string htmlSourceGoogle = RetrieveData("http://www.google.com");

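Note: you can uncomment the proxy configuration if you access the internet through a proxy. Replace the proxy address, username, and password with the ones you use.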

For logging in through code, see: Login to website, via C#
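As a rough illustration of what logging in through code usually involves (this is a generic sketch, not the code from the linked answer): post the login form once, keep the cookies the server returns in a CookieContainer, and reuse that container for the page request. The class name FormLogin, the method names, the URLs, and the form field names are all hypothetical; the real field names have to be read from the site's login form, and sites that use anti-forgery tokens or script-driven logins will need more than this.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

public static class FormLogin
{
    // Encodes form fields as an application/x-www-form-urlencoded body.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var parts = new List<string>();
        foreach (var field in fields)
        {
            parts.Add(Uri.EscapeDataString(field.Key) + "=" + Uri.EscapeDataString(field.Value));
        }
        return string.Join("&", parts);
    }

    // Posts the login form, then fetches pageUrl with the same cookies.
    // loginUrl, pageUrl and the field names are placeholders (assumptions).
    public static string LoginAndFetch(string loginUrl, string pageUrl,
                                       IEnumerable<KeyValuePair<string, string>> fields)
    {
        var cookies = new CookieContainer();
        byte[] body = Encoding.UTF8.GetBytes(BuildFormBody(fields));

        var login = (HttpWebRequest)WebRequest.Create(loginUrl);
        login.Method = "POST";
        login.ContentType = "application/x-www-form-urlencoded";
        login.CookieContainer = cookies;      // session cookies are collected here
        login.ContentLength = body.Length;
        using (Stream requestStream = login.GetRequestStream())
        {
            requestStream.Write(body, 0, body.Length);
        }
        using (login.GetResponse()) { }       // the login page body itself is not needed

        var page = (HttpWebRequest)WebRequest.Create(pageUrl);
        page.CookieContainer = cookies;       // send the session cookies back
        using (var response = (HttpWebResponse)page.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The drawback is the one raised in the question: the application has to know the credentials, and it opens a second session instead of reusing the one already open in the browser.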

2011-10-20 16:15:48 reggie

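Thanks so much, this also works for getting the source based on a URL (which I had working originally as well). But again, because my site requires a login to view certain pages (say, a page where the query string specifies which page to show), it always retrieves the source of the login page, since the site will not let you go to that page by URL alone without being logged in. I am not sure how to handle that, or whether there is anything I can do at all. – Drew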

@Drew I have updated the answer with a link. – reggie

http://stackoverflow.com/questions/930807/c-sharp-login-to-website-via-program/931030#931030 – reggie
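On the other option raised in the question, reading the HTML of the page that is already displayed: an add-on running inside Internet Explorer can, in principle, ask the hosting browser for its loaded document instead of requesting the URL again, which avoids the login problem because the page has already been fetched within the logged-in session. The following is a minimal sketch under several assumptions: that the band object holds a reference to the hosting browser (an IWebBrowser2 COM object; how that reference is exposed depends on the BandObjects version), that the loaded document is an HTML document, and that late binding through dynamic (.NET 4 or later) is acceptable. The class and method names are hypothetical.

```csharp
using System;

public static class CurrentPageSource
{
    // 'browser' is assumed to be the hosting Internet Explorer instance.
    // Members are resolved at run time, so no interop assembly is referenced here.
    public static string GetHtml(object browser)
    {
        if (browser == null) throw new ArgumentNullException(nameof(browser));

        dynamic host = browser;
        dynamic document = host.Document;          // null until a page has loaded
        if (document == null) return string.Empty;

        dynamic root = document.documentElement;   // the <html> element
        if (root == null) return string.Empty;

        // outerHTML is the browser's serialization of the current DOM,
        // which can differ from the bytes the server originally sent.
        return (string)root.outerHTML ?? string.Empty;
    }
}
```

If the browser is showing something other than an HTML page (a PDF, for example), the member lookups can throw, so a real add-on would wrap the call in a try/catch and call it only after the document has finished loading.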
