Reading large text files for parsing

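I am working with some text files in the 1-2 GB range. I can't use a conventional StreamReader approach that reads everything at once, so I decided to read the file in chunks and do my work on each chunk. The problem is that I can't tell when I'm reaching the end of the file: it has been working on a single file for a long time, and I'm not sure how large a file I can read through the buffer. Here is the code: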

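' inputfile is an open StreamReader; parsingtools and logs are the poster's own helpers.
Const Buffer_Size As Integer = 30000
Dim bufferread(Buffer_Size - 1) As Char
Dim bytesread As Integer = 0
Dim totalbytesread As Long = 0          ' Long: an Integer overflows just past 2 GB
Dim almostDoneLogged As Boolean = False
Dim sb As New StringBuilder()
Do
    ' Read returns the number of characters actually read (0 at end of stream).
    bytesread = inputfile.Read(bufferread, 0, Buffer_Size)
    If bytesread = 0 Then Exit Do
    ' Append only what was read, not the whole buffer.
    sb.Append(bufferread, 0, bytesread)
    totalbytesread += bytesread
    If sb.Length > 9999999 Then
        parsingtools.load(sb.ToString())
        sb.Clear()                      ' otherwise the same data is re-parsed on every pass
    End If
    If totalbytesread > 1000000000 AndAlso Not almostDoneLogged Then
        logs.constructlog("File almost done")
        almostDoneLogged = True         ' log once instead of on every iteration
    End If
Loop Until inputfile.EndOfStream
' Parse whatever is left over after the last full chunk.
If sb.Length > 0 Then parsingtools.load(sb.ToString())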

Is there any control or code I can use to check how much of the file is still left?


2011-06-23 vbNewbie

Have you looked at BufferedStream?

http://msdn.microsoft.com/en-us/library/system.io.bufferedstream%28v=VS.100%29.aspx

You can wrap your stream in one. Also, I would set the buffer size in megabytes rather than something as small as 30,000.
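A minimal VB.NET sketch of that suggestion, assuming a UTF-8 file: it wraps a FileStream in a BufferedStream with a 1 MB buffer and reads through a StreamReader. The helper name CountChars, the buffer sizes, and the temporary test file are illustrative assumptions, not part of the original post.

```vb
Imports System
Imports System.IO
Imports System.Text

Module BufferedReadSketch
    ' Hypothetical helper: counts the characters in a file while reading
    ' through a BufferedStream with a 1 MB buffer (an assumed size).
    Function CountChars(filePath As String) As Long
        Const StreamBufferSize As Integer = 1024 * 1024
        Dim total As Long = 0
        Using fs As New FileStream(filePath, FileMode.Open, FileAccess.Read)
            Using bs As New BufferedStream(fs, StreamBufferSize)
                Using reader As New StreamReader(bs, Encoding.UTF8)
                    Dim chunk(65535) As Char
                    ' Read returns 0 once the end of the stream is reached.
                    Dim n As Integer = reader.Read(chunk, 0, chunk.Length)
                    While n > 0
                        total += n
                        n = reader.Read(chunk, 0, chunk.Length)
                    End While
                End Using
            End Using
        End Using
        Return total
    End Function

    Sub Main()
        ' Small throwaway file so the sketch can be run as-is.
        Dim tempFile As String = Path.GetTempFileName()
        File.WriteAllText(tempFile, New String("x"c, 200000))
        Console.WriteLine(CountChars(tempFile))
        File.Delete(tempFile)
    End Sub
End Module
```

FileStream keeps an internal buffer of its own and has constructor overloads that take a buffer size, so passing a larger size there may work just as well as the extra wrapper.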

As for how much is left: can you ask the stream for its length up front?
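One way to do that, sketched in VB.NET: compare the underlying stream's Position with its Length. PercentRead is a hypothetical helper, and the temporary file and chunk size are assumptions for the demo. The figure is approximate, because StreamReader reads ahead into its own buffer, so Position can sit slightly past what the caller has consumed.

```vb
Imports System
Imports System.IO
Imports System.Text

Module ProgressSketch
    ' Hypothetical helper: percentage of the underlying stream read so far.
    Function PercentRead(reader As StreamReader) As Double
        Dim s As Stream = reader.BaseStream
        If s.Length = 0 Then Return 100.0
        Return 100.0 * s.Position / s.Length
    End Function

    Sub Main()
        ' Small throwaway file so the sketch can be run as-is.
        Dim tempFile As String = Path.GetTempFileName()
        File.WriteAllText(tempFile, New String("x"c, 500000))
        Using reader As New StreamReader(tempFile, Encoding.UTF8)
            Dim chunk(29999) As Char
            Dim n As Integer = reader.Read(chunk, 0, chunk.Length)
            While n > 0
                Console.WriteLine("{0:F1}% read", PercentRead(reader))
                n = reader.Read(chunk, 0, chunk.Length)
            End While
        End Using
        File.Delete(tempFile)
    End Sub
End Module
```

This only works when the underlying stream is seekable, which is the case for an ordinary file on disk.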

Below is a code snippet I use to wrap a buffered stream around a stream. (Sorry, it's C#.)

private static void CopyTo(AzureBlobStore azureBlobStore, Stream src, Stream dest, string description)
{
    if (src == null)
        throw new ArgumentNullException("src");
    if (dest == null)
        throw new ArgumentNullException("dest");

    const int bufferSize = AzureBlobStore.BufferSizeForStreamTransfers;
    // Buffering happens internally; this loop is just to avoid the 4 GB boundary and have something to show.
    int readCount;
    //long bytesTransfered = 0;
    var buffer = new byte[bufferSize];
    //string totalBytes = FormatBytes(src.Length);
    while ((readCount = src.Read(buffer, 0, buffer.Length)) != 0)
    {
        if (azureBlobStore.CancelProcessing)
        {
            break;
        }
        dest.Write(buffer, 0, readCount);
        //bytesTransfered += readCount;
        //Console.WriteLine("AzureBlobStore:CopyTo:{0}:{1} {2}", FormatBytes(bytesTransfered), totalBytes, description);
    }
}

Hope this helps.


2011-07-16 02:33:31
