Does WCF (.NET 4.0) provide a way to retrieve the HTML of a web page? If not, what is the best way to do this in .NET 4.0?
Answers
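You can use WebClient to retrieve the web page:
// Requires: using System.Net;
using (WebClient web = new WebClient())
{
    // DownloadData returns the raw bytes (byte[]) of the response;
    // use DownloadString instead if you want the HTML as text.
    var data = web.DownloadData(myURL);
}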
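Since the question asks for the HTML itself rather than raw bytes, here is a minimal sketch of both routes: decoding the byte[] from DownloadData, or letting DownloadString do it. The names PageFetcher, Decode and FetchHtml are made up for this illustration, and UTF-8 is an assumption; a real page may declare a different charset.

```csharp
using System;
using System.Net;
using System.Text;

public static class PageFetcher
{
    // Turns the bytes returned by WebClient.DownloadData into text.
    // Assumption: the page is UTF-8; pass another Encoding if it is not.
    public static string Decode(byte[] data, Encoding encoding = null)
    {
        return (encoding ?? Encoding.UTF8).GetString(data);
    }

    // Downloads the page body directly as a string.
    public static string FetchHtml(string url)
    {
        using (var web = new WebClient())
        {
            web.Encoding = Encoding.UTF8; // assumption: UTF-8 response
            return web.DownloadString(url);
        }
    }
}
```

Calling PageFetcher.FetchHtml(myURL) should give you the markup as a string; Decode is only needed if you already hold the bytes from DownloadData.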
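WebClient suffers from some limitations, such as the inability to set a download timeout and inconsistent progress events. I wrote my own subclass that improves on it. Here is the code, in case it is useful. Note that there is a bug in this code that I am still working through (see "WebClient Subclass Disposed During Event Handler, Result is ObjectDisposedException"). A simple way around it would be to put a try/catch around the one line mentioned in my question, but I am trying to understand the core issue behind it.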
using System;
using System.Net;
using System.Threading;

public class MyWebClient : WebClient, IDisposable
{
    // All timeouts are in milliseconds.
    public int Timeout { get; set; }
    public int TimeUntilFirstByte { get; set; }
    public int TimeBetweenProgressChanges { get; set; }
    public long PreviousBytesReceived { get; private set; }
    public long BytesNotNotified { get; private set; }
    public string Error { get; private set; }
    public bool HasError { get { return Error != null; } }

    private bool firstByteReceived = false;
    private bool success = true;
    private bool cancelDueToError = false;
    // Signalled when the async download finishes or is aborted.
    private EventWaitHandle asyncWait = new ManualResetEvent(false);
    // Fires when no data (or no further progress) arrives in time.
    private Timer abortTimer = null;
    private bool isDisposed = false;

    const long ONE_MB = 1024 * 1024;

    public delegate void PerMbHandler(long totalMb);
    public event PerMbHandler NotifyMegabyteIncrement;

    public MyWebClient(int timeout = 60000, int timeUntilFirstByte = 30000, int timeBetweenProgressChanges = 15000)
    {
        this.Timeout = timeout;
        this.TimeUntilFirstByte = timeUntilFirstByte;
        this.TimeBetweenProgressChanges = timeBetweenProgressChanges;

        this.DownloadFileCompleted += new System.ComponentModel.AsyncCompletedEventHandler(MyWebClient_DownloadFileCompleted);
        this.DownloadProgressChanged += new DownloadProgressChangedEventHandler(MyWebClient_DownloadProgressChanged);

        // One-shot timer; it is re-armed on every progress event.
        abortTimer = new Timer(AbortDownload, null, TimeUntilFirstByte, System.Threading.Timeout.Infinite);
    }

    protected void OnNotifyMegabyteIncrement(long totalMb)
    {
        if (NotifyMegabyteIncrement != null) NotifyMegabyteIncrement(totalMb);
    }

    // Timer callback: cancel the download and release the waiting caller.
    void AbortDownload(object state)
    {
        cancelDueToError = true;
        this.CancelAsync();
        success = false;
        Error = firstByteReceived
            ? "Download aborted due to >" + TimeBetweenProgressChanges + "ms between progress change updates."
            : "No data was received in " + TimeUntilFirstByte + "ms";
        asyncWait.Set();
    }

    void MyWebClient_DownloadProgressChanged(object sender, DownloadProgressChangedEventArgs e)
    {
        if (cancelDueToError || isDisposed) return;

        long additionalBytesReceived = e.BytesReceived - PreviousBytesReceived;
        PreviousBytesReceived = e.BytesReceived;
        BytesNotNotified += additionalBytesReceived;

        // Raise the notification roughly once per megabyte received.
        if (BytesNotNotified > ONE_MB)
        {
            OnNotifyMegabyteIncrement(e.BytesReceived);
            BytesNotNotified = 0;
        }
        firstByteReceived = true;

        // Re-arm the abort timer. The check and the call are not atomic, so this
        // can still throw ObjectDisposedException if Dispose runs in between.
        if (!isDisposed) abortTimer.Change(TimeBetweenProgressChanges, System.Threading.Timeout.Infinite);
    }

    // Starts an async download and blocks until it completes or is aborted.
    public bool DownloadFileWithEvents(string url, string outputPath)
    {
        asyncWait.Reset();
        Uri uri = new Uri(url);
        this.DownloadFileAsync(uri, outputPath);
        asyncWait.WaitOne();
        return success;
    }

    void MyWebClient_DownloadFileCompleted(object sender, System.ComponentModel.AsyncCompletedEventArgs e)
    {
        if (cancelDueToError || isDisposed) return;
        asyncWait.Set();
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        var result = base.GetWebRequest(address);
        result.Timeout = this.Timeout;
        return result;
    }

    void IDisposable.Dispose()
    {
        isDisposed = true;
        if (asyncWait != null) asyncWait.Dispose();
        if (abortTimer != null) abortTimer.Dispose();
        base.Dispose();
    }
}
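If the timeout is the only thing you care about and you download synchronously, a much smaller subclass may be enough. The sketch below keeps just the GetWebRequest override from the class above. TimeoutWebClient, TimeoutMs and CreateRequestForInspection are names invented for this example. As far as I know, WebRequest.Timeout only applies to synchronous requests, which is presumably why the full class adds its own timer for DownloadFileAsync.

```csharp
using System;
using System.Net;

public class TimeoutWebClient : WebClient
{
    // Timeout in milliseconds applied to every request this client creates.
    public int TimeoutMs { get; set; }

    public TimeoutWebClient(int timeoutMs)
    {
        TimeoutMs = timeoutMs;
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request != null) request.Timeout = TimeoutMs;
        return request;
    }

    // Hypothetical helper: exposes the configured request so the timeout
    // can be inspected without making a network call.
    public WebRequest CreateRequestForInspection(Uri address)
    {
        return GetWebRequest(address);
    }
}
```

Usage would look like new TimeoutWebClient(5000).DownloadString(myURL), which should give up with a WebException if the server takes longer than five seconds to respond.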
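If you are just trying to grab the HTML content (for caching, etc.), then Eric J is right. If you intend to extract parts of the web page as data, you may want to look at the HTML Agility Pack: http://htmlagilitypack.codeplex.com/wikipage?title=Examples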
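To give an idea of what extracting data with the HTML Agility Pack looks like, here is a minimal sketch that collects the href of every link in an HTML string. LinkExtractor and ExtractLinks are names made up for this example, and it assumes the HtmlAgilityPack assembly is referenced in your project.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkExtractor
{
    // Returns the href value of every <a> element that has one.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null) return links;

        foreach (HtmlNode anchor in anchors)
        {
            links.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return links;
    }
}
```

You would feed it the string obtained from WebClient, for example LinkExtractor.ExtractLinks(web.DownloadString(myURL)).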