2011-06-23

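Reading a large text file for parsing

I am working with text files that range from 1 to 2 GB. I can't use the conventional stream reader approach, so I decided to read the file in chunks and do my work on each chunk. The problem is that I'm not sure when I reach the end of the file: the program has been working on one file for a very long time, and I don't know how large a file I can read through the buffer. Here is the code: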

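Dim Buffer_Size As Integer = 30000
Dim bufferread = New [Char](Buffer_Size - 1) {}
Dim bytesread As Integer = 0
Dim totalbytesread As Long = 0 ' Long, so the running total cannot overflow on a file around 2 GB
Dim sb As New StringBuilder
Do
    bytesread = inputfile.Read(bufferread, 0, Buffer_Size)
    ' Append only the characters actually read; the last chunk is usually shorter than the buffer
    sb.Append(bufferread, 0, bytesread)
    totalbytesread = bytesread + totalbytesread
    If sb.Length > 9999999 Then
        Dim data As String = sb.ToString()
        If Not data Is Nothing Then
            parsingtools.load(data)
        End If
        ' Assumption: each batch is parsed once, so start a fresh batch here
        sb.Clear()
    End If
    If totalbytesread > 1000000000 Then
        logs.constructlog("File almost done")
    End If
Loop Until inputfile.EndOfStream
' Parse whatever is left in the final, partial batch
If sb.Length > 0 Then
    parsingtools.load(sb.ToString())
End If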

Is there any control or code I can use to check how much of the file is left?

Answer

Have you looked at BufferedStream?

http://msdn.microsoft.com/en-us/library/system.io.bufferedstream%28v=VS.100%29.aspx

You can use it with your stream. Also, I would set the buffer size in megabytes rather than something as small as 30,000.
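As a rough illustration of that suggestion, here is a minimal sketch that wraps a FileStream in a BufferedStream and reads the text in character chunks. The class name `ChunkedTextReader`, the method `ReadInChunks`, the 1 MB buffer size, and the UTF-8 encoding are all assumptions made for this example, not something taken from the question.

```csharp
using System;
using System.IO;
using System.Text;

public static class ChunkedTextReader
{
    // 1 MB buffer: an assumed value, tune it for your own workload.
    public const int BufferSize = 1 << 20;

    // Reads the file in character chunks and hands each chunk to handleChunk.
    // Returns the total number of characters read.
    public static long ReadInChunks(string path, Action<string> handleChunk)
    {
        long totalChars = 0;
        using (var file = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var buffered = new BufferedStream(file, BufferSize))
        using (var reader = new StreamReader(buffered, Encoding.UTF8)) // encoding is an assumption
        {
            var chunk = new char[BufferSize];
            int charsRead;
            // Read returns 0 once the end of the stream is reached.
            while ((charsRead = reader.Read(chunk, 0, chunk.Length)) > 0)
            {
                // Pass on only the characters actually read in this call.
                handleChunk(new string(chunk, 0, charsRead));
                totalChars += charsRead;
            }
        }
        return totalChars;
    }
}
```

Note that a chunk can end in the middle of a line or record, so whatever parses the chunks needs to carry the incomplete tail over to the next one.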

As for how much is left: can you ask the stream for its length first?
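One way to do that, sketched here under the assumption that the stream is seekable (a FileStream is), is to compare the stream's `Position` with its `Length`. The helper class `ReadProgress` and its method names are made up for this example.

```csharp
using System;
using System.IO;

public static class ReadProgress
{
    // Number of bytes between the current position and the end of the stream.
    public static long BytesRemaining(Stream stream)
    {
        return stream.Length - stream.Position;
    }

    // Percentage of the stream consumed so far, from 0 to 100.
    public static double PercentRead(Stream stream)
    {
        if (stream.Length == 0) return 100.0; // an empty stream is treated as fully read
        return 100.0 * stream.Position / stream.Length;
    }
}
```

If the file is read through a StreamReader, call these on the underlying FileStream. The reader fills its own buffer ahead of what your code has consumed, so the figure is approximate, but it is close enough for progress logging.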

Here is a snippet I use to wrap a buffered stream around a stream. (Sorry, it's C#.)

private static void CopyTo(AzureBlobStore azureBlobStore, Stream src, Stream dest, string description)
{
    if (src == null)
        throw new ArgumentNullException("src");
    if (dest == null)
        throw new ArgumentNullException("dest");

    const int bufferSize = AzureBlobStore.BufferSizeForStreamTransfers;
    // Buffering happens internally; this loop is just to avoid the 4 GB boundary and have something to show.
    int readCount;
    //long bytesTransfered = 0;
    var buffer = new byte[bufferSize];
    //string totalBytes = FormatBytes(src.Length);
    while ((readCount = src.Read(buffer, 0, buffer.Length)) != 0)
    {
        if (azureBlobStore.CancelProcessing)
        {
            break;
        }
        // Write only the bytes actually read in this pass.
        dest.Write(buffer, 0, readCount);
        //bytesTransfered += readCount;
        //Console.WriteLine("AzureBlobStore:CopyTo:{0}:{1} {2}", FormatBytes(bytesTransfered), totalBytes, description);
    }
}

Hope this helps.