Fast parsing of tab-separated strings and integers in C++

I have a file that is several gigabytes in size and contains millions of lines. Each line holds tab-separated data like this:

string TAB int TAB int TAB int NEWLINE

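My previous attempts to read this file line by line were bottlenecked by the CPU, not by the speed of my SSD.

How can I quickly parse a huge file line by line?

Note: the file is too large to parse in its entirety into a vector at once.

In my original code, I parsed the data into a vector of structs, like this: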

struct datastruct { 
    std::string name; 
    int year; 
    int occurences; 
    int volcount; 
}; 
std::vector<datastruct> data;

2016-08-27 FelisPhasma

What do you want to parse it into? – wally

@flatmouse In my tests, I am using a vector of structs – FelisPhasma

@flatmouse See my edit – FelisPhasma

You can do:

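#include <fstream>
#include <string>

std::ifstream file("data.tsv"); // placeholder path; the stream must be opened before reading
datastruct data;                // the struct defined in the question
// operator>> splits on any whitespace, so this assumes names contain no spaces
while (file >> data.name >> data.year >> data.occurences >> data.volcount)
{
    // do what you want with data; its contents are replaced on the next iteration
}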

Is that slow?
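If stream extraction does turn out to be the CPU bottleneck, one alternative that may be worth measuring is to split each line on tabs by hand and convert the numbers with `std::from_chars`, which does no locale handling. The following is only a sketch under assumptions, not something taken from the thread: it requires C++17, the helpers `take_int` and `parse_line` are hypothetical names, lines are assumed to end with a plain `\n` (a trailing `\r` would be rejected), and any extra trailing fields are ignored.

```cpp
#include <charconv>     // std::from_chars (C++17)
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <system_error> // std::errc

struct datastruct {
    std::string name;
    int year;
    int occurences;     // spelling kept from the question
    int volcount;
};

// Hypothetical helper: consumes one integer field from the front of `rest`,
// up to the next tab or the end of the line. Returns false on malformed input.
static bool take_int(std::string_view& rest, int& out) {
    const std::size_t tab = rest.find('\t');
    const std::string_view field = rest.substr(0, tab);
    const char* first = field.data();
    const char* last = first + field.size();
    const auto [ptr, ec] = std::from_chars(first, last, out);
    if (ec != std::errc{} || ptr != last) return false; // empty or non-numeric field
    rest = (tab == std::string_view::npos) ? std::string_view{}
                                           : rest.substr(tab + 1);
    return true;
}

// Hypothetical helper: parses "name TAB int TAB int TAB int" (newline already removed).
std::optional<datastruct> parse_line(std::string_view line) {
    const std::size_t tab = line.find('\t');
    if (tab == std::string_view::npos) return std::nullopt;
    datastruct d;
    d.name.assign(line.substr(0, tab));   // the name may contain spaces
    std::string_view rest = line.substr(tab + 1);
    if (!take_int(rest, d.year) ||
        !take_int(rest, d.occurences) ||
        !take_int(rest, d.volcount)) {
        return std::nullopt;
    }
    return d;
}
```

A caller could pair this with `std::getline(file, line)` in a loop and pass each `line` to `parse_line`, skipping or logging lines for which it returns `std::nullopt`. Whether this is actually faster than `operator>>` depends on the standard library and the data, so it should be benchmarked on the real file.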

2016-08-27 19:38:34 Zereges

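I am testing this now – FelisPhasma

@FelisPhasma I take it the file is too big for a `vector`? – wally

@flatmouse Correct. The line-by-line approach is better because it does not copy the whole file into RAM. – FelisPhasma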
