How do I read/stream a file without loading the entire file into memory?

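How can I read an arbitrary file and process it "piece by piece" (meaning byte by byte, or in whatever chunk size gives the best read performance) without loading the entire file into memory? One example of processing would be generating an MD5 hash of the file, although the answer could apply to any operation.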

I'm willing to write this myself, but if I can get existing code, that would be great too.

(C#)

2011-07-28 Howiecamp

Look, the real answer is that `System.IO.FileStream` does not load the file into memory. – Vercas

Here's an example of how to read a file in chunks of 1KB without loading its entire contents into memory:

const int chunkSize = 1024; // read the file by chunks of 1KB 
using (var file = File.OpenRead("foo.dat")) 
{ 
    int bytesRead; 
    var buffer = new byte[chunkSize]; 
    while ((bytesRead = file.Read(buffer, 0, buffer.Length)) > 0) 
    { 
     // TODO: Process bytesRead number of bytes from the buffer 
     // not the entire buffer as the size of the buffer is 1KB 
     // whereas the actual number of bytes that are read are 
     // stored in the bytesRead integer. 
    } 
}

2011-07-28 21:29:10

Please explain why this code does not read the file completely into memory. Please also explain your TODO section. – Matt

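This loads only 1KB (or `chunkSize` bytes) into memory at a time. Edit: he also means that you should not process the entire `buffer`! Only the bytes from index 0 to index `bytesRead`. – Vercas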

Damn it, I meant from index 0 to index `bytesRead - 1`. Pay attention, folks! – Vercas
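To make the TODO concrete for the MD5 case from the question, here is a minimal sketch that feeds each chunk into the hash as it is read, passing only the first `bytesRead` bytes of the buffer. The class and method names (`ChunkedMd5`, `HashFile`) are hypothetical, and the use of `HashAlgorithm.TransformBlock` / `TransformFinalBlock` is one way to do this, not something the answer above prescribes.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class ChunkedMd5
{
    // Hypothetical helper: hashes a file chunk by chunk and returns lowercase hex.
    public static string HashFile(string path, int chunkSize = 1024)
    {
        using (var md5 = MD5.Create())
        using (var file = File.OpenRead(path))
        {
            var buffer = new byte[chunkSize];
            int bytesRead;
            while ((bytesRead = file.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Hash only the bytes actually read (indices 0 .. bytesRead - 1).
                md5.TransformBlock(buffer, 0, bytesRead, null, 0);
            }
            // Finish the hash with an empty final block.
            md5.TransformFinalBlock(buffer, 0, 0);
            return BitConverter.ToString(md5.Hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

Only one buffer of `chunkSize` bytes is alive at any time, regardless of the file's size.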

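System.IO.FileStream does not load the file into memory.
The stream is seekable, and the MD5 hashing algorithm does not have to load the stream (file) into memory either.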

Please replace file_path with the path to your file.

byte[] hash = null; 

using (var file = new FileStream(file_path, FileMode.Open)) 
{ 
    using (var md5 = new System.Security.Cryptography.MD5CryptoServiceProvider()) 
    { 
     hash = md5.ComputeHash(file); // ComputeHash reads the stream in blocks internally
    } 
}

After this runs, your MD5 hash is stored in the hash variable.

2011-07-28 21:28:50 Vercas

I didn't realize ComputeHash could take a stream, thanks. I also made a few edits. – Howiecamp

@Howiecamp You're welcome! – Vercas

For future Stack Overflow readers: you only need one using block here. If I remember correctly, using statements can be stacked together. –
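As a sketch of what that comment seems to describe: consecutive `using` statements can share a single block, which removes one level of nesting from the answer's code. The names `StackedUsingExample` and `ComputeMd5` are hypothetical, and `MD5.Create()` is used here in place of `MD5CryptoServiceProvider`.

```csharp
using System.IO;
using System.Security.Cryptography;

public static class StackedUsingExample
{
    // Hypothetical helper: both disposables share one block.
    public static byte[] ComputeMd5(string filePath)
    {
        using (var file = new FileStream(filePath, FileMode.Open, FileAccess.Read))
        using (var md5 = MD5.Create())
        {
            // The stream is consumed incrementally; the file is never fully in memory.
            return md5.ComputeHash(file);
        }
    }
}
```

Both objects are still disposed when the block exits, in reverse order of declaration.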

const int MAX_BUFFER = 1024; 
byte[] Buffer = new byte[MAX_BUFFER]; 
int BytesRead; 
using (System.IO.FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read)) 
    while ((BytesRead = fileStream.Read(Buffer, 0, MAX_BUFFER)) != 0) 
    { 
        // Process this chunk starting from offset 0 
        // and continuing for BytesRead bytes! 
    }

2011-07-28 21:34:11 CSharper

    long fullfilesize = 0;                  // full size of the file in bytes
    const int DefaultReadValue = 10485760;  // read 10 MB at a time
    int toRead = DefaultReadValue;
    long position = 0;                      // where the next chunk starts

    private void Button_Click(object sender, RoutedEventArgs e)
    {
        using (var fs = new FileStream(@"filepath", FileMode.Open, FileAccess.Read))
        using (MemoryStream requestStream = new MemoryStream()) // never written to in this example
        {
            fullfilesize = fs.Length; // without this, the check below always reports "all done"
            fs.Position = position;   // resume where the previous click stopped

            if (fs.Position >= fullfilesize)
            {
                MessageBox.Show(" all done");
                return;
            }
            System.Diagnostics.Debug.WriteLine("file position" + fs.Position);

            if (fullfilesize - position < toRead)
            {
                // Fewer than 10 MB remain, so shrink the final chunk.
                toRead = (int)(fullfilesize - position);
                MessageBox.Show("last time");
            }
            System.Diagnostics.Debug.WriteLine("toread" + toRead);
            int bytesRead;
            byte[] buffer = new byte[toRead];
            int offset = 0;
            position += toRead;
            // Read may return fewer bytes than requested, so loop until the chunk is full.
            while (toRead > 0 && (bytesRead = fs.Read(buffer, offset, toRead)) > 0)
            {
                toRead -= bytesRead;
                offset += bytesRead;
            }

            toRead = DefaultReadValue;
        }
    }

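Adapted from Darin's answer: this approach reads the file in 10MB chunks, one chunk per click, until the end of the file is reached.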

2014-01-16 12:52:06

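Although the MemoryStream in your example is not necessary, you are the only one who posted an example that sets the FileStream Position. That solved my problem of needing to split a file and transfer it in 10MB chunks. Upvoted! – DragonZero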
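For the split-and-transfer case this comment mentions, the Position-based idea can be reduced to a small helper that returns one chunk by index. This is only a sketch: `ChunkReader`, `ReadChunk`, `chunkIndex` and `chunkSize` are hypothetical names, not part of the answer above.

```csharp
using System;
using System.IO;

public static class ChunkReader
{
    // Hypothetical helper: returns chunk number chunkIndex (0-based) of the file,
    // or an empty array if that chunk starts at or past the end of the file.
    public static byte[] ReadChunk(string path, long chunkIndex, int chunkSize)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            long start = chunkIndex * chunkSize;
            if (start >= fs.Length)
            {
                return new byte[0];
            }

            fs.Position = start; // seek directly to the chunk; nothing before it is read
            int toRead = (int)Math.Min(chunkSize, fs.Length - start);
            var buffer = new byte[toRead];

            int offset = 0;
            int bytesRead;
            // Read may return fewer bytes than requested, so loop until the chunk is full.
            while (offset < toRead && (bytesRead = fs.Read(buffer, offset, toRead - offset)) > 0)
            {
                offset += bytesRead;
            }

            if (offset < toRead)
            {
                Array.Resize(ref buffer, offset); // the file shrank while we were reading
            }
            return buffer;
        }
    }
}
```

Because each call opens the file and seeks, the caller only needs to remember the chunk index between calls, much like the `position` field in the answer.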

const int numberOfBytesToReadPerChunk = 1000; // roughly 1KB; ReadBytes takes an int 
using (BinaryReader fileData = new BinaryReader(File.OpenRead(aFullFilePath))) 
    // Keep going while the current position is before the end of the stream 
    while (fileData.BaseStream.Position < fileData.BaseStream.Length) 
        DoSomethingWithAChunkOfBytes(fileData.ReadBytes(numberOfBytesToReadPerChunk));

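As I understand the functions used here (specifically BinaryReader.ReadBytes), there is no need to keep track of how many bytes you have read. For the while loop you only need the stream's length and current position, both of which the stream itself reports. At the end of the file, ReadBytes simply returns an array shorter than the requested size.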

2016-06-30 17:18:41
