2017-08-04 79 views
2

Run async every x iterations in a for loop

I am downloading 100K+ files and want to do it in batches, for example 100 files at a time.

static void Main(string[] args) {
    Task.WaitAll(
        new Task[] {
            RunAsync()
        });
}

// Each group has 100 attachments.
static async Task RunAsync() {
    foreach (var group in groups) {
        var tasks = new List<Task>();
        foreach (var attachment in group.attachments) {
            tasks.Add(DownloadFileAsync(attachment, downloadPath));
        }
        await Task.WhenAll(tasks);
    }
}

static async Task DownloadFileAsync(Attachment attachment, string path) {
    using (var client = new HttpClient()) {
        using (var fileStream = File.Create(path + attachment.FileName)) {
            var downloadedFileStream = await client.GetStreamAsync(attachment.url);
            await downloadedFileStream.CopyToAsync(fileStream);
        }
    }
}

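Expected: it downloads 100 files at a time, then moves on to the next 100.

Actual: it downloads far more than that at the same time and quickly fails with the error Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host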

+2

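It's a shame this got marked as a duplicate, because the other question uses a significantly different approach, and I would love to learn why the one Quentin used fails. – Bartosz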

+2

I agree; not a duplicate. My guess is that the HttpClient methods return earlier than you expect. – BradleyDotNET

+2

I strongly recommend reading [Is async HttpClient from .NET 4.5 a bad choice for intensive load applications?](https://stackoverflow.com/questions/16194054). –

Answers

4

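Running tasks in "batches" is not a good idea in terms of performance. A single long-running task holds up the whole batch, because the next batch cannot start until it finishes. A better approach is to start a new task as soon as any running task completes.

This can be implemented with a queue, as @MertAkcakaya suggested. But I will post an alternative based on my other answer, Have a set of Tasks with only X running at a time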

using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

int maxThreads = 3;
// Set this once, at application start-up, to the maximum you will need.
System.Net.ServicePointManager.DefaultConnectionLimit = 50;

var urls = new Tuple<string, string>[] {
    Tuple.Create("http://cnn.com", "temp/cnn1.htm"),
    Tuple.Create("http://cnn.com", "temp/cnn2.htm"),
    Tuple.Create("http://bbc.com", "temp/bbc1.htm"),
    Tuple.Create("http://bbc.com", "temp/bbc2.htm"),
    Tuple.Create("http://stackoverflow.com", "temp/stackoverflow.htm"),
    Tuple.Create("http://google.com", "temp/google1.htm"),
    Tuple.Create("http://google.com", "temp/google2.htm"),
};
// Not awaited here: the method returns once the last download has started.
DownloadParallel(urls, maxThreads);

async Task DownloadParallel(IEnumerable<Tuple<string, string>> urls, int maxThreads)
{
    // Allows at most maxThreads downloads to be in flight at once.
    SemaphoreSlim maxThread = new SemaphoreSlim(maxThreads);
    // One HttpClient shared by all requests.
    var client = new HttpClient();

    foreach (var url in urls)
    {
        // Wait for a free slot, then start the next download without awaiting it.
        await maxThread.WaitAsync();
        DownloadFile(client, url.Item1, url.Item2)
            .ContinueWith((task) => maxThread.Release());
    }
}

async Task DownloadFile(HttpClient client, string url, string fileName)
{
    // Dispose the response stream as well as the file stream.
    using (var stream = await client.GetStreamAsync(url))
    using (var fileStream = File.Create(fileName))
    {
        await stream.CopyToAsync(fileStream);
    }
}

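PS: DownloadParallel returns as soon as it has started the last download, so don't await it. If you really want to await it, you should add for (int i = 0; i < maxThreads; i++) await maxThread.WaitAsync(); at the end of the method.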
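
If the caller does need to wait for every download, one possible variant is to collect the started tasks and await them all at the end, instead of draining the semaphore. The following is a minimal sketch under that assumption, not the answer's own code: the names Throttled, ForEachAsync and RunOneAsync are hypothetical, and the work to run is passed in as a delegate so the helper is not tied to HttpClient.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: names are illustrative and not part of the original answer.
public static class Throttled
{
    // Runs `work` once per item with at most `maxConcurrency` items in flight,
    // and completes only after every item has finished.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var running = new List<Task>();
            foreach (var item in items)
            {
                // Wait for a free slot before starting the next item.
                await gate.WaitAsync();
                running.Add(RunOneAsync(item, work, gate));
            }
            // Wait for the items that are still in flight; rethrows the first failure.
            await Task.WhenAll(running);
        }
    }

    private static async Task RunOneAsync<T>(T item, Func<T, Task> work, SemaphoreSlim gate)
    {
        try
        {
            await work(item);
        }
        finally
        {
            // Free the slot even if the work failed.
            gate.Release();
        }
    }
}
```

With the DownloadFile method above, a call could look like await Throttled.ForEachAsync(urls, 3, u => DownloadFile(client, u.Item1, u.Item2)); where client is a shared HttpClient. Releasing the slot in a finally block means a failed download cannot permanently reduce the available concurrency.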

PS2: Don't forget to add exception handling to DownloadFile.
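
As a rough illustration of what that exception handling might look like, the sketch below wraps the download in a try/catch and reports failure through a return value. The names SafeDownload and TryDownloadFileAsync are hypothetical, and the set of exceptions caught is an assumption: HttpRequestException for failed requests, IOException for connection and file errors, and TaskCanceledException for timeouts. Adjust it to whatever your application should treat as recoverable.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper: names and the choice of caught exceptions are assumptions.
public static class SafeDownload
{
    // Returns true if the file was downloaded, false if the request or the write failed.
    public static async Task<bool> TryDownloadFileAsync(
        HttpClient client, string url, string fileName)
    {
        try
        {
            using (var stream = await client.GetStreamAsync(url))
            using (var fileStream = File.Create(fileName))
            {
                await stream.CopyToAsync(fileStream);
            }
            return true;
        }
        catch (Exception ex) when (ex is HttpRequestException
                                   || ex is IOException
                                   || ex is TaskCanceledException)
        {
            // Log and continue so that one bad URL does not stop the other downloads.
            Console.Error.WriteLine($"Download of {url} failed: {ex.Message}");
            return false;
        }
    }
}
```

If the copy fails partway through, a partial file can be left on disk; deleting it in the catch block or recording the URL for a retry are both reasonable follow-ups, depending on how the results are consumed.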