2013-10-28 103 views
0

How can I download one file at a time from each group of links, with all groups downloading in parallel?

This is the code:

for (int x = 0; x < imagesSatelliteUrls.Count; x++) 
{ 
    if (!imagesSatelliteUrls[x].StartsWith("http://")) 
    { 
     imagesSatelliteUrls[x] = stringForSatelliteMapUrls + imagesSatelliteUrls[x]; 
    } 

    using (WebClient client = new WebClient()) 
    { 
     if (!imagesSatelliteUrls[x].Contains("href")) 
     { 
      client.DownloadFile(imagesSatelliteUrls[x], 
           UrlsDir + "SatelliteImage" + counter.ToString("D6")); 
     } 
    } 

    counter++; 
} 

This downloads the files one after another. The list imagesSatelliteUrls contains 260 file links arranged in groups.

For example:

index[0] "Group 1" 
index[1] some link .... 
index[2] some link .... 
. 
. 
. 
index[34] "Group 2" 
index[35] some link .... 
index[36] some link .... 
. 
. 
. 
. 
index[71] "Group 3" 

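And so on; there are 7 groups. I want it to download the first file from each group together, which means downloading 7 files in parallel: the first file of groups 1, 2, 3, 4, 5, 6 and 7. Then, when one of the files in any group finishes, it should start downloading the next file from that group.

So at any moment I would see 7 files downloading, each from a different group. When a file in some group finishes downloading, it should move on to the next file in the same group and start downloading it.

How can I do this? The client.DownloadFile call I'm using now just downloads file after file.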

I tried downloading in parallel. This is the code:

Parallel.For(0, imagesSatelliteUrls.Count, /*new ParallelOptions { MaxDegreeOfParallelism = 20 },*/ x => 
      { 
       if (!imagesSatelliteUrls[x].StartsWith("http://")) 
       { 
        imagesSatelliteUrls[x] = stringForSatelliteMapUrls + imagesSatelliteUrls[x]; 
       } 

       using (WebClient client = new WebClient()) 
       { 
        if (!imagesSatelliteUrls[x].Contains("href")) 
        { 
         client.DownloadFile(imagesSatelliteUrls[x], 
              UrlsDir + "SatelliteImage" + counter.ToString("D6")); 
        } 
       } 

       counter++; 
      }); // end of Parallel.For

The exception is:

System.Net.WebException was unhandled by user code 
    HResult=-2146233079 
    Message=An exception occurred during a WebClient request. 
    Source=System 
    StackTrace: 
     at System.Net.WebClient.DownloadFile(Uri address, String fileName) 
     at System.Net.WebClient.DownloadFile(String address, String fileName) 
     at WeatherMaps.ExtractImages.<>c__DisplayClass2.<.ctor>b__0(Int32 x) in d:\C-Sharp\WeatherMaps\WeatherMaps\WeatherMaps\ExtractImages.cs:line 145 
     at System.Threading.Tasks.Parallel.<>c__DisplayClassf`1.<ForWorker>b__c() 
    InnerException: System.IO.IOException 
     HResult=-2147024864 
     Message=The process cannot access the file 'd:\localpath\Urls\SatelliteImage000000' because it is being used by another process. 
     Source=mscorlib 
     StackTrace: 
      at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath) 
      at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost) 
      at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access) 
      at System.Net.WebClient.DownloadFile(Uri address, String fileName) 
     InnerException: 

I also tried this code:

for (int i = 0; i < 7; i++) 
      { 
       Task.Factory.StartNew(() => 
       { 
        // Here you can easily implement your checking algo as you see fit 
        while (counter < imagesSatelliteUrls.Count) 
        { 
         if (!imagesSatelliteUrls[count].StartsWith("http://")) 
         { 
          imagesSatelliteUrls[count] = stringForSatelliteMapUrls + imagesSatelliteUrls[count]; 
         } 
         using (WebClient client = new WebClient()) 
         { 
          if (!imagesSatelliteUrls[count].Contains("href")) 
          { 

           client.DownloadFile(imagesSatelliteUrls[count], UrlsDir + "SatelliteImage" + counter.ToString("D6")); 
          } 
         } 

         lock (this) 
         { 
          count++; 
          counter++; 
         } 
        } 
       }); 
      } 


System.Net.WebException was unhandled by user code 
    HResult=-2146233079 
    Message=An exception occurred during a WebClient request. 
    Source=System 
    StackTrace: 
     at System.Net.WebClient.DownloadFile(Uri address, String fileName) 
     at System.Net.WebClient.DownloadFile(String address, String fileName) 
     at WeatherMaps.ExtractImages.<>c__DisplayClass4.<.ctor>b__2() in d:\C-Sharp\WeatherMaps\WeatherMaps\WeatherMaps\ExtractImages.cs:line 122 
     at System.Threading.Tasks.Task.InnerInvoke() 
     at System.Threading.Tasks.Task.Execute() 
    InnerException: System.IO.IOException 
     HResult=-2147024864 
     Message=The process cannot access the file 'd:\localpath\Urls\SatelliteImage000000' because it is being used by another process. 
     Source=mscorlib 
     StackTrace: 
      at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath) 
      at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost) 
      at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access) 
      at System.Net.WebClient.DownloadFile(Uri address, String fileName) 
     InnerException: 
+1

If you're targeting .NET 4.0 or higher, take a look at http://msdn.microsoft.com/en-us/library/dd460717(v=vs.110).aspx –

+1

Otherwise, you might want to look at threading in C#. – Rafael

Answers

1

Use Parallel.For:

//for (int x = 0; x < imagesSatelliteUrls.Count; x++) 
Parallel.For(0, imagesSatelliteUrls.Count, /*new ParallelOptions { MaxDegreeOfParallelism = 20 },*/ x => 
{ 
    if (!imagesSatelliteUrls[x].StartsWith("http://")) 
    { 
     imagesSatelliteUrls[x] = stringForSatelliteMapUrls + imagesSatelliteUrls[x]; 
    } 

    using (WebClient client = new WebClient()) 
    { 
     if (!imagesSatelliteUrls[x].Contains("href")) 
     { 
      client.DownloadFile(imagesSatelliteUrls[x], 
           UrlsDir + "SatelliteImage" + x.ToString("D6")); 
     } 
    } 

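    // counter++ is not atomic when several iterations run at once
    System.Threading.Interlocked.Increment(ref counter);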
}); // end of Parallel.For 
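
The IOException in the question names 'SatelliteImage000000', which suggests that several parallel iterations read the same value of counter and tried to write the same file; building the file name from x avoids that. Parallel.For does not by itself keep exactly one download running per group, though. A minimal sketch of that behaviour follows, assuming that every group header entry starts with "Group". The names GroupedDownloader, SplitIntoGroups and DownloadAll, and the file-name pattern, are hypothetical and not from the original code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

static class GroupedDownloader
{
    // Splits the flat list into one list of links per group.
    // Assumption: an entry starting with "Group" is a header, not a link.
    public static List<List<string>> SplitIntoGroups(IEnumerable<string> entries)
    {
        var groups = new List<List<string>>();
        foreach (var entry in entries)
        {
            if (entry.StartsWith("Group", StringComparison.Ordinal))
                groups.Add(new List<string>());
            else if (groups.Count > 0)
                groups[groups.Count - 1].Add(entry);
        }
        return groups;
    }

    // Starts one task per group. Each task downloads its own links one
    // after another, so at most one download per group runs at a time.
    public static void DownloadAll(IEnumerable<string> entries, string baseUrl, string targetDir)
    {
        var groups = SplitIntoGroups(entries);
        var tasks = groups.Select((links, g) => Task.Factory.StartNew(() =>
        {
            using (var client = new WebClient())
            {
                for (int i = 0; i < links.Count; i++)
                {
                    var url = links[i].StartsWith("http://") ? links[i] : baseUrl + links[i];
                    if (url.Contains("href")) continue;

                    // Group number plus index gives every file a unique name,
                    // so no two tasks ever open the same file.
                    var fileName = string.Format("SatelliteImage_G{0}_{1:D6}", g + 1, i);
                    client.DownloadFile(url, Path.Combine(targetDir, fileName));
                }
            }
        }, TaskCreationOptions.LongRunning)).ToArray();

        Task.WaitAll(tasks);
    }
}
```

It could be called as GroupedDownloader.DownloadAll(imagesSatelliteUrls, stringForSatelliteMapUrls, UrlsDir). On .NET Framework the default limit is 2 connections per host, so if all links point at the same server it may be necessary to raise ServicePointManager.DefaultConnectionLimit to actually see 7 downloads at once.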
+0

Tony, I tried your code with a breakpoint on the client.DownloadFile line, and after 6-7 downloads I get this exception. I added the exception message to my question! –

+0

I updated the code; use it now and try again. – Tony

0

I created a standalone example of how you could do this if you add a reference to System.Net.Http.dll and use the HttpClient class.

// Create a mock list of data 
string someImageUrl = "..."; // some test url of an image file 
string urlsDirectory = @"C:\Temp"; // some working directory 

var urls = new string[7 * 20]; 

for (int i = 0; i < urls.Length; i += 7) 
{ 
    urls[i] = String.Format("Group {0}", (i/7) + 1); 

    for (int j = 1; j < 7; j++) 
    { 
     urls[i + j] = someImageUrl; 
    } 
} 


// Download 6 files at a time. 
var client = new HttpClient(); 

for (int i = 0; i < urls.Length; i += 7) 
{ 
    var directoryPath = Directory.CreateDirectory(Path.Combine(urlsDirectory, urls[i])).FullName; 

    var tasks = urls.Skip(i + 1).Take(6).Select(url => 
    { 
     return client.GetAsync(url); 
    }).ToArray(); 

    Task.WaitAll(tasks); 

    for (int j = 0; j < tasks.Length; j++) 
    { 
     var response = tasks[j].Result; 

     // FileMode.Create truncates an existing file; OpenOrCreate would leave stale trailing bytes
     using (var fs = new FileStream(Path.Combine(directoryPath, String.Format("Image {0}.jpg", j + 1)), FileMode.Create)) 
     { 
      using (var responseStream = response.Content.ReadAsStreamAsync().Result) 
      { 
       responseStream.CopyTo(fs); 
      } 
     } 
    } 
} 

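One important thing to note: I think you lose some of the automatic file-name negotiation that WebClient gives you. It is worth doing, but as you can see, in my example I simply label the images "Image 1.jpg", "Image 2.jpg", and so on.

Technically, when you request a file over HTTP, you can request an image with a URL like this: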

http://somehost.com/getImage?id=5 

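In that case it is hard to say what the file name should even be. The standard HTTP way of handling this is for the server to add a header called Content-Disposition, which tells the HTTP client what the file name should be.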
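
As a rough sketch of reading that header with HttpClient: the response's content headers expose a parsed ContentDisposition value. The class and method names below (FileNameFromHeaders, TryGetFileName) are hypothetical helpers, not part of any library.

```csharp
using System;
using System.IO;
using System.Net.Http;

static class FileNameFromHeaders
{
    // Returns the file name suggested by the Content-Disposition header,
    // or null when the server did not send one.
    public static string TryGetFileName(HttpResponseMessage response)
    {
        if (response.Content == null) return null;

        var disposition = response.Content.Headers.ContentDisposition;
        if (disposition == null) return null;

        // Prefer the extended (filename*) form when present.
        var name = disposition.FileNameStar ?? disposition.FileName;
        if (string.IsNullOrWhiteSpace(name)) return null;

        // The value may still carry surrounding quotes; Path.GetFileName
        // drops any directory part a server might have included.
        return Path.GetFileName(name.Trim('"'));
    }
}
```

A null result means the caller has to fall back to some other naming scheme.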

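But not every web server will give you a Content-Disposition header, so you need to fall back to trying to turn the URL above into a Windows-compatible file name. You could try to find a simple function that strips all non-NTFS-compatible characters from the URL. Keep in mind, though, that in this case you will not get an extension (jpg, gif, etc.). The server may give you a Content-Type header that tells you the MIME type, such as "image/jpeg", but it is up to you to figure out which extension to give it.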
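
One possible shape for that fallback, as a sketch only: replace every character that Path.GetInvalidFileNameChars reports as invalid, then append an extension looked up from the MIME type. The FallbackFileName class, the FromUrl method and the three-entry mapping table are illustrative assumptions; a real mapping would need more entries.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class FallbackFileName
{
    // Deliberately tiny, hypothetical mapping from MIME type to extension.
    static readonly Dictionary<string, string> ExtensionsByMimeType =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "image/jpeg", ".jpg" },
            { "image/png", ".png" },
            { "image/gif", ".gif" },
        };

    // Builds a file name from the URL by replacing characters the current
    // platform does not allow in file names, then appends an extension
    // when the MIME type is one of the known ones.
    public static string FromUrl(string url, string mimeType)
    {
        var invalid = Path.GetInvalidFileNameChars();
        var safe = new string(url.Select(c => invalid.Contains(c) ? '_' : c).ToArray());

        string extension;
        if (mimeType != null && ExtensionsByMimeType.TryGetValue(mimeType, out extension))
            safe += extension;

        return safe;
    }
}
```

With HttpClient, the MIME type would typically come from response.Content.Headers.ContentType.MediaType when that header is present. The set of invalid characters depends on the platform, so the exact result for a given URL differs between Windows and other systems.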
