I am using Microsoft's example, which uses async tasks to download data from multiple URLs: "Download web content asynchronously in parallel".

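My requirement is to finish downloading 200 links within 1 minute, so that in the second minute the same set of 200 URLs starts downloading again. I know a lot depends on network speed and CPU power, because this is not an I/O-bound process.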

Assuming the network and CPU can support this and will not become a bottleneck, I still see timeout and cancellation exceptions after some time.

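So the question is: in the same example, can I change the tasks to long-running tasks so that they do not time out? I know about the TaskCreationOptions enumeration and its LongRunning value. My questions are:
1) In the sample below, how do I supply this parameter when the tasks are created and given their URLs?
2) What is the definition of LongRunning? Does it mean each task will not time out?
3) Can I explicitly set an infinite timeout by some other means?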

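Basically, my requirement is that when the download of a particular URL completes, it triggers a download of the same URL again. The same URL will be downloaded over and over, so the task should never complete. (The URLs in the MSDN example are not the ones I will be fetching; there will be other URLs whose content changes every minute, so I need to keep downloading each of them at least once per minute.)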

The code from the example linked above is pasted here as well:

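' Imports needed at the top of the code-behind file.
Imports System.Net.Http
Imports System.Threading
Imports System.Threading.Tasks

' Shared by the start and cancel button handlers.
Dim cts As CancellationTokenSource
' Number of downloads completed so far; updated with Interlocked.
Dim countProcessed As Integer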

Private Async Sub startButton_Click(sender As Object, e As RoutedEventArgs) 

    ' Instantiate the CancellationTokenSource. 
    cts = New CancellationTokenSource() 

    resultsTextBox.Clear() 

    Try
        Await AccessTheWebAsync(cts.Token)
        resultsTextBox.Text &= vbCrLf & "Downloads complete."

    Catch ex As OperationCanceledException
        resultsTextBox.Text &= vbCrLf & "Downloads canceled." & vbCrLf

    Catch ex As Exception
        resultsTextBox.Text &= vbCrLf & "Downloads failed." & vbCrLf
    End Try

    ' Set the CancellationTokenSource to Nothing when the download is complete. 
    cts = Nothing 
End Sub 

Private Sub cancelButton_Click(sender As Object, e As RoutedEventArgs) 
    If cts IsNot Nothing Then
        cts.Cancel()
    End If
End Sub 

Async Function AccessTheWebAsync(ct As CancellationToken) As Task 

    Dim client As HttpClient = New HttpClient() 

    ' Call SetUpURLList to make a list of web addresses. 
    Dim urlList As List(Of String) = SetUpURLList() 

    ' ***Create a query that, when executed, returns a collection of tasks. 
    Dim downloadTasksQuery As IEnumerable(Of Task(Of Integer)) =
        From url In urlList Select ProcessURLAsync(url, client, ct)

    ' ***Use ToList to execute the query and start the download tasks. 
    Dim downloadTasks As List(Of Task(Of Integer)) = downloadTasksQuery.ToList() 

    Await Task.WhenAll(downloadTasks) 
    'Ideally, this line should never be reached 
    Console.WriteLine("Done") 

End Function 

Async Function ProcessURLAsync(url As String, client As HttpClient, ct As CancellationToken) As Task(Of Integer) 
    Console.WriteLine("URL=" & url) 
    ' GetAsync returns a Task(Of HttpResponseMessage). 
    Dim response As HttpResponseMessage = Await client.GetAsync(url, ct) 

    ' Retrieve the web site contents from the HttpResponseMessage. 
    Dim urlContents As Byte() = Await response.Content.ReadAsByteArrayAsync() 
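    ' Print the value returned by Interlocked.Increment so the displayed count
    ' stays correct even if other downloads finish at the same time.
    Dim processed As Integer = Interlocked.Increment(countProcessed)
    Console.WriteLine(processed)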
    Return urlContents.Length 
End Function 

Private Function SetUpURLList() As List(Of String) 

    Dim urls = New List(Of String) From
        {
            "http://msdn.microsoft.com",
            "http://msdn.microsoft.com/en-us/library/hh290138.aspx",
            "http://msdn.microsoft.com/en-us/library/hh290140.aspx",
            "http://msdn.microsoft.com/en-us/library/dd470362.aspx",
            "http://msdn.microsoft.com/en-us/library/aa578028.aspx",
            "http://msdn.microsoft.com/en-us/library/ms404677.aspx",
            "http://msdn.microsoft.com/en-us/library/ff730837.aspx",
            "http://msdn.microsoft.com/en-us/library/hh290138.aspx",
            "http://msdn.microsoft.com/en-us/library/hh290140.aspx"
            'For space constraint I am not including the 200 URLs, but pls assume the above list contains 200 URLs
        }

    Return urls 
End Function 

Answer

"So the question is: in the same example, can I change the tasks to long-running tasks so that they do not time out?"

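Tasks themselves do not time out. What you are probably seeing is the HTTP requests timing out. Long-running tasks do not have any different timeout semantics.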
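One way to check this is to tell an HTTP timeout apart from a cancellation the user requested. The sketch below is only an illustration: `TryDownloadAsync` is a hypothetical helper, not part of the MSDN sample. It relies on `HttpClient` reporting an elapsed `HttpClient.Timeout` as a `TaskCanceledException` (a kind of `OperationCanceledException`) even though the caller's token was never cancelled.

```vb
Imports System
Imports System.Net.Http
Imports System.Threading
Imports System.Threading.Tasks

Module TimeoutVersusCancel

    ' Hypothetical helper: returns the number of bytes downloaded,
    ' or -1 if the request appears to have timed out.
    Async Function TryDownloadAsync(url As String,
                                    client As HttpClient,
                                    ct As CancellationToken) As Task(Of Integer)
        Try
            Using response As HttpResponseMessage = Await client.GetAsync(url, ct)
                Dim bytes As Byte() = Await response.Content.ReadAsByteArrayAsync()
                Return bytes.Length
            End Using
        Catch ex As OperationCanceledException When Not ct.IsCancellationRequested
            ' The caller's token was not cancelled, so the most likely cause
            ' is HttpClient.Timeout elapsing for this request.
            Console.WriteLine("Timed out: " & url)
            Return -1
        End Try
    End Function

End Module
```

If the user does press Cancel, the filter does not match and the `OperationCanceledException` propagates to the caller as it does in the original sample.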

"I know about the TaskCreationOptions enumeration and its LongRunning value."

You should also know that they should almost never be used.


You are probably getting timeouts because all of your requests hit the same website. Try setting ServicePointManager.DefaultConnectionLimit to int.MaxValue, and possibly also increasing HttpClient.Timeout.
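A minimal sketch of those two settings in VB follows. The helper name `CreateClient` and the two-minute timeout are assumptions chosen for illustration; pick a value that suits your URLs. Note that `ServicePointManager` governs `HttpClient` on .NET Framework; on .NET Core and later, `HttpClient` does not use it.

```vb
Imports System
Imports System.Net
Imports System.Net.Http

Module ClientSetup

    ' Hypothetical helper: call once at start-up, before the first request.
    Function CreateClient() As HttpClient
        ' Lift the per-host connection limit (.NET Framework client apps
        ' default to 2 concurrent connections per host).
        ServicePointManager.DefaultConnectionLimit = Integer.MaxValue

        Dim client As New HttpClient()
        ' The default timeout is 100 seconds; 2 minutes is an assumed value.
        client.Timeout = TimeSpan.FromMinutes(2)
        Return client
    End Function

End Module
```

In the sample above, `AccessTheWebAsync` could then obtain its client from `CreateClient()` instead of calling `New HttpClient()` directly.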

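Thx Stephen. I think 'ServicePointManager.DefaultConnectionLimit' did the trick. Of course, I also set 'HttpClient.Timeout', but I did not notice whether it made any difference. Now, however, I am getting random "error reading stream" errors. My guess is that the connection to the stream is being closed between the request and the read. In that case my requirement is to wait, say, 20 seconds and then retry, but by then the 'HttpClient' object would time out. Any suggestions on how I should structure the code so that I can retry without timing out? – Kallol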

You can 'Await Task.Delay' and then call 'Get*' again. I recommend using a library like Polly for production-quality retries. –
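The delay-then-retry idea might look roughly like the sketch below. It is not production code: `DownloadWithRetryAsync`, `maxAttempts` and `retryDelay` are hypothetical names introduced here, and the choice of which exceptions to retry on is an assumption. Because `HttpClient.Timeout` applies to each request separately, the wait between attempts does not count against it.

```vb
Imports System
Imports System.IO
Imports System.Net.Http
Imports System.Threading
Imports System.Threading.Tasks

Module RetrySketch

    ' Hypothetical helper: tries one URL up to maxAttempts times,
    ' waiting retryDelay between attempts.
    Async Function DownloadWithRetryAsync(url As String,
                                          client As HttpClient,
                                          ct As CancellationToken,
                                          maxAttempts As Integer,
                                          retryDelay As TimeSpan) As Task(Of Byte())
        Dim attempt As Integer = 1
        Do
            Try
                Using response As HttpResponseMessage = Await client.GetAsync(url, ct)
                    response.EnsureSuccessStatusCode()
                    Return Await response.Content.ReadAsByteArrayAsync()
                End Using
            Catch ex As HttpRequestException When attempt < maxAttempts
                ' Swallow and retry; the last attempt lets the exception escape.
            Catch ex As IOException When attempt < maxAttempts
                ' Stream read failures are retried in the same way.
            End Try

            ' Only reached after a failed attempt. The wait is kept outside
            ' the Catch blocks and honours the caller's cancellation token.
            attempt += 1
            Await Task.Delay(retryDelay, ct)
        Loop
    End Function

End Module
```

A call such as `Await DownloadWithRetryAsync(url, client, ct, 3, TimeSpan.FromSeconds(20))` would match the 20-second wait mentioned in the comment above. Timeouts and cancellations are deliberately not retried in this sketch; a library such as Polly offers more complete retry policies.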