2011-10-28

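Expensive asynchronous reading of response stream

I have been trying to learn F# for the past few days, and one thing keeps puzzling me. My "learning project" is a screen scraper for some data I am interested in manipulating.

The F# PowerPack has a call named Stream.AsyncReadToEnd. I did not want to pull in the PowerPack just for that single call, so I looked at how they implemented it.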

module Downloader = 
    open System 
    open System.IO 
    open System.Net 
    open System.Collections 

    type public BulkDownload(uriList : IEnumerable) = 
     member this.UriList with get() = uriList 

     member this.ParalellDownload() = 
      let Download (uri : Uri) = async { 
       let UnblockViaNewThread f = async { 
        do! Async.SwitchToNewThread() 
        let res = f() 
        do! Async.SwitchToThreadPool() 
        return res } 

       let request = HttpWebRequest.Create(uri) 
       let! response = request.AsyncGetResponse() 
       use responseStream = response.GetResponseStream() 
       use reader = new StreamReader(responseStream) 
       let! contents = UnblockViaNewThread (fun() -> reader.ReadToEnd()) 
       return uri, contents.ToString().Length } 

      this.UriList 
      |> Seq.cast 
      |> Seq.map Download 
      |> Async.Parallel 
      |> Async.RunSynchronously 

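They have that function, UnblockViaNewThread. Is that really the only way to read the response stream asynchronously? Isn't creating a new thread really expensive (I have seen the "1 MB of memory" figure thrown around all over the place)? Is there a better way to do this? Is this what happens in every Async* call (the ones I can let!)?

EDIT: I followed Tomas's suggestions and came up with something that is independent of F# PowerTools. Here it is. It really does need error handling, but it asynchronously requests a URL and downloads it into a byte array.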

namespace Downloader 
open System 
open System.IO 
open System.Net 
open System.Collections 

type public BulkDownload(uriList : IEnumerable) = 
    member this.UriList with get() = uriList 

    member this.ParalellDownload() =     
     let Download (uri : Uri) = async { 
      let processStreamAsync (stream : Stream) = async { 
       let outputStream = new MemoryStream() 
       let buffer = Array.zeroCreate<byte> 0x1000 
       let completed = ref false 
       while not (!completed) do 
        let! bytesRead = stream.AsyncRead(buffer, 0, 0x1000) 
        if bytesRead = 0 then 
         completed := true 
        else 
         outputStream.Write(buffer, 0, bytesRead) 
       stream.Close() 
       return outputStream.ToArray() } 

      let request = HttpWebRequest.Create(uri) 
      let! response = request.AsyncGetResponse() 
      use responseStream = response.GetResponseStream() 
      let! contents = processStreamAsync responseStream 
      return uri, contents.Length } 

     this.UriList 
     |> Seq.cast 
     |> Seq.map Download 
     |> Async.Parallel 
     |> Async.RunSynchronously 

    override this.ToString() = String.Join(", ", this.UriList) 
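
The error handling that is still missing could be approached in several ways. One possible sketch, assuming F# 4.1 or later for the Result type, wraps each download in Async.Catch so that a single failing URL is reported as a value instead of aborting the whole Async.Parallel batch. The name tryDownload is hypothetical and not part of the code above.

```fsharp
/// Hypothetical wrapper (not part of the original code): runs one download
/// and turns an exception into an Error value instead of letting it escape.
let tryDownload (download : 'a -> Async<'b>) (input : 'a) : Async<Result<'b, exn>> = async {
    // Async.Catch yields Choice1Of2 on success and Choice2Of2 on an exception
    let! outcome = download input |> Async.Catch
    match outcome with
    | Choice1Of2 value -> return Ok value
    | Choice2Of2 error -> return Error error }
```

With this in place, the pipeline could use Seq.map (tryDownload Download) in place of Seq.map Download, and the caller would then inspect each Ok or Error result individually.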

Answer

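I think an AsyncReadToEnd that just calls ReadToEnd synchronously on a separate thread is wrong.

The F# PowerPack also contains a type AsyncStreamReader that provides a proper asynchronous implementation of stream reading. It has a ReadLine method that (asynchronously) returns the next line and only downloads a few chunks from the source stream (using the asynchronous ReadAsync instead of running on a background thread).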

let processStreamAsync (stream : Stream) = async { 
    use asyncReader = new AsyncStreamReader(stream) 
    let completed = ref false 
    while not (!completed) do 
        // Asynchronously get the next line 
        let! nextLine = asyncReader.ReadLine() 
        if nextLine = null then completed := true 
        else 
            (* process the next line *) 
            () } 

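If you want to download the whole content as a string (instead of processing it line by line), you can use the ReadToEnd method of AsyncStreamReader. This is a proper asynchronous implementation: it starts downloading a block of data (asynchronously) and repeats this without blocking.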

async { 
    use asyncReader = new AsyncStreamReader(stream) 
    return! asyncReader.ReadToEnd() } 
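
To illustrate the idea of reading in chunks without blocking, here is a minimal sketch that does not depend on the PowerPack. It is not the PowerPack source; it assumes the content is UTF-8 and relies on the Stream.AsyncRead extension from FSharp.Core. The name readToEndAsync and the 4 KB buffer size are hypothetical choices.

```fsharp
open System.IO
open System.Text

/// Hypothetical helper (not the PowerPack implementation): reads a stream
/// to the end in 4 KB chunks without blocking a thread while waiting for data.
/// Assumes the content is UTF-8 encoded.
let readToEndAsync (stream : Stream) : Async<string> = async {
    use output = new MemoryStream()
    let buffer : byte[] = Array.zeroCreate 0x1000
    // A ref cell is used because 'let mutable' cannot be captured inside async
    let finished = ref false
    while not finished.Value do
        // AsyncRead returns the number of bytes read; 0 means end of stream
        let! bytesRead = stream.AsyncRead(buffer, 0, buffer.Length)
        if bytesRead = 0 then finished.Value <- true
        else output.Write(buffer, 0, bytesRead)
    // Decode only after all bytes have arrived, so a multi-byte character
    // split across two chunks is still decoded correctly
    return Encoding.UTF8.GetString(output.ToArray()) }
```

Decoding once at the end keeps the sketch simple at the cost of holding the whole response in memory; a real reader would more likely decode incrementally.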

Also, the F# PowerPack is open source and has a permissive license, so the best way to use it is often to just copy the few files you need into your project.

+1

This completely answers my question. Thanks, Tomas.