Tokenize a string of data into a vector of structs?

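I have the following string data, received over a TCP winsock connection, and would like to do some advanced tokenizing to turn it into a vector of structs, where each struct represents one record.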

std::string buf = "44:william:adama:commander:stuff\n33:luara:roslin:president:data\n";

struct table_t 
{ 
    std::string key; 
    std::string first; 
    std::string last; 
    std::string rank; 
    std::string additional;
};

Each record in the string is terminated by a newline (\n).

// Splits str into newline-delimited records and appends them to records.
void tokenize(const std::string& str, std::vector<std::string>& records)
{
    // Skip delimiters at the beginning.
    std::string::size_type lastPos = str.find_first_not_of("\n", 0);
    // Find the first delimiter after the token start.
    std::string::size_type pos = str.find_first_of("\n", lastPos);
    while (std::string::npos != pos || std::string::npos != lastPos)
    {
        // Found a token, add it to the vector.
        records.push_back(str.substr(lastPos, pos - lastPos));
        // Skip delimiters. Note the "not_of".
        lastPos = str.find_first_not_of("\n", pos);
        // Find the delimiter that ends the next token.
        pos = str.find_first_of("\n", lastPos);
    }
}

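In this attempt I have split the buffer into records but have not yet split the records into fields. It seems completely unnecessary to repeat all of that code to tokenize each record further on the colon (the internal field separator) into a struct and push each struct into a vector. I'm sure there is a better way of doing this, or perhaps the design itself is wrong.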

Thank you for any help.

Source

2011-03-28 rem45acp

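If you can use Boost, this can be done quite neatly with its tokenizer library, its string algorithms library, or, for the most powerful solution, with boost.spirit, as shown here: http://www.boost.org/doc/libs/1_46_1/libs/spirit/doc/html/spirit/qi/tutorials/employee___parsing_into_structs.html – Cubbi 2011-03-28 16:32:03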

Missed this comment. It is too heavy for the data format used in this case, though. – user237419 2011-03-28 16:40:10

Use [boost::tokenizer](http://www.boost.org/doc/libs/1_46_1/libs/tokenizer/index.html) – user237419 2011-03-28 16:38:34

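For breaking the string up into records, I'd use istringstream, if only because that will simplify the changes later when I want to read from a file. For tokenizing, the most obvious solution is boost::regex, so: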

#include <boost/regex.hpp>
#include <istream>
#include <string>
#include <vector>

std::vector<table_t> parse(std::istream& input)
{
    std::vector<table_t> retval;
    std::string line;
    while (std::getline(input, line)) {
        // Five capture groups, one per colon-separated field.
        static boost::regex const pattern(
            "([^:]*):([^:]*):([^:]*):([^:]*):([^:]*)");
        boost::smatch matched;
        if (!boost::regex_match(line, matched, pattern)) {
            // Error handling...
        } else {
            // Assumes table_t has a constructor taking the five fields.
            retval.push_back(
                table_t(matched[1], matched[2], matched[3],
                        matched[4], matched[5]));
        }
    }
    return retval;
}

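(I'm assuming a logical constructor for table_t. Also: there is a very long tradition in C that names ending in _t are typedefs, so you would probably be better off finding some other convention.)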
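For readers who cannot use Boost, roughly the same approach can be sketched with std::regex from C++11. This is only an illustration, not part of the answer above: the names `record` and `parse_records` are hypothetical, and lines that do not match are simply skipped here, whereas real code would probably report them.

```cpp
#include <istream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; the name avoids the _t suffix discussed above.
struct record {
    std::string key, first, last, rank, additional;
};

// Reads one record per line and requires exactly five colon-separated
// fields. Lines that do not match the pattern are skipped (an assumption).
std::vector<record> parse_records(std::istream& input)
{
    // Five capture groups, each a (possibly empty) run of non-colon characters.
    static const std::regex pattern(
        "([^:]*):([^:]*):([^:]*):([^:]*):([^:]*)");
    std::vector<record> result;
    std::string line;
    while (std::getline(input, line)) {
        std::smatch m;
        if (std::regex_match(line, m, pattern)) {
            result.push_back(record{m[1].str(), m[2].str(), m[3].str(),
                                    m[4].str(), m[5].str()});
        }
    }
    return result;
}
```

Wrapping the buffer from the question in a std::istringstream and passing it to parse_records should give one record per well-formed line.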

Source

2011-03-28 16:41:16

You should ping Siek and tell him ::tokenizer is useless because you can do anything with regex. Obviously. – user237419 2011-03-28 16:46:42

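@adirau He asked how to avoid duplication. Using existing tools is the obvious solution. In this case, it is also the simplest solution (at least if you want to check for errors). – 2011-03-28 17:31:07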

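He asked about avoiding duplicated code; you can't say that using getline as a first tokenizer and a regex as a second tokenizer avoids duplication ;) It is not the simplest and not the obvious solution, not even if you want error checking; that regex will accept errors at the token level. If he needs error checking, maybe ::spirit is a better solution, as Cubbi mentioned in the first comment. – user237419 2011-03-28 17:43:50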
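The two concerns raised in this thread, duplicated splitting code and errors inside individual tokens, can both be illustrated with the standard library alone. The sketch below is an assumption-laden example rather than anyone's posted solution: `split`, `is_number`, `person_record` and `parse_buffer` are hypothetical names, and the rule that the key must be numeric is inferred only from the sample data (44, 33).

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Splits text on delim. Empty fields in the middle are kept; no empty field
// is produced after a trailing delimiter. The same helper serves both the
// record level ('\n') and the field level (':').
std::vector<std::string> split(const std::string& text, char delim)
{
    std::vector<std::string> parts;
    std::istringstream stream(text);
    std::string part;
    while (std::getline(stream, part, delim)) {
        parts.push_back(part);
    }
    return parts;
}

// Hypothetical record type mirroring the struct in the question.
struct person_record {
    std::string key, first, last, rank, additional;
};

// Returns true when s is non-empty and contains only decimal digits.
bool is_number(const std::string& s)
{
    if (s.empty()) return false;
    for (unsigned char c : s) {
        if (!std::isdigit(c)) return false;
    }
    return true;
}

// Parses the whole buffer. Records with the wrong field count or a
// non-numeric key (an assumed rule) are appended to rejected.
std::vector<person_record> parse_buffer(const std::string& buf,
                                        std::vector<std::string>& rejected)
{
    std::vector<person_record> records;
    for (const std::string& line : split(buf, '\n')) {
        const std::vector<std::string> f = split(line, ':');
        if (f.size() != 5 || !is_number(f[0])) {
            rejected.push_back(line);
            continue;
        }
        records.push_back(person_record{f[0], f[1], f[2], f[3], f[4]});
    }
    return records;
}
```

Because one helper handles both delimiters, the splitting loop is written only once, and any per-field validation lives in a single place.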

My solution:

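#include <cstring>
#include <iostream>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// A ctype facet that additionally classifies ':' as whitespace.
struct colon_separated_only: std::ctype<char>
{
    colon_separated_only(): std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table()
    {
        typedef std::ctype<char> cctype;
        static const cctype::mask *const_rc = cctype::classic_table();

        // Start from the classic table, then mark ':' as a space character.
        static cctype::mask rc[cctype::table_size];
        std::memcpy(rc, const_rc, cctype::table_size * sizeof(cctype::mask));

        rc[static_cast<unsigned char>(':')] = std::ctype_base::space;
        return &rc[0];
    }
};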

struct table_t 
{ 
    std::string key; 
    std::string first; 
    std::string last; 
    std::string rank; 
    std::string additional; 
}; 

int main() {
    std::string buf = "44:william:adama:commander:stuff\n33:luara:roslin:president:data\n";
    std::stringstream s(buf);
    // The locale takes ownership of the facet.
    s.imbue(std::locale(std::locale(), new colon_separated_only()));
    table_t t;
    std::vector<table_t> data;
    // operator>> now treats both ':' and '\n' as field separators.
    while (s >> t.key >> t.first >> t.last >> t.rank >> t.additional)
    {
        data.push_back(t);
    }
    for (size_t i = 0; i < data.size(); ++i)
    {
        std::cout << data[i].key << " ";
        std::cout << data[i].first << " " << data[i].last << " ";
        std::cout << data[i].rank << " " << data[i].additional << std::endl;
    }
    return 0;
}

Output:

44 william adama commander stuff 
33 luara roslin president data

Online demo: http://ideone.com/JwZuk

The technique I've used here is described in my answer to a different question:

Elegant ways to count the frequency of words in a file
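One property of the facet approach is that ':' and '\n' are both treated as whitespace, so a record with a missing field would take its remaining fields from the next line. A possible variation, sketched here under assumptions and not taken from the answer, reads one line at a time and applies the facet to each line. The names `colon_is_space`, `crew_record` and `parse_by_line` are hypothetical, and lines without exactly five fields are skipped. Empty fields and fields containing spaces are still not handled, since whitespace-based extraction collapses them.

```cpp
#include <cstring>
#include <istream>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Same idea as the facet above: ':' is classified as whitespace.
struct colon_is_space : std::ctype<char> {
    colon_is_space() : std::ctype<char>(make_table()) {}

    static const mask* make_table()
    {
        // Copy the classic table, then mark ':' as a space character.
        static mask table[table_size];
        std::memcpy(table, classic_table(), table_size * sizeof(mask));
        table[static_cast<unsigned char>(':')] = space;
        return table;
    }
};

// Hypothetical record type mirroring the struct in the question.
struct crew_record {
    std::string key, first, last, rank, additional;
};

// Reads the input line by line so that a short record cannot pull fields
// from the following line. Lines without exactly five fields are skipped.
std::vector<crew_record> parse_by_line(std::istream& input)
{
    // The locale takes ownership of the facet.
    const std::locale colon_locale(std::locale::classic(), new colon_is_space);
    std::vector<crew_record> records;
    std::string line;
    while (std::getline(input, line)) {
        std::istringstream fields(line);
        fields.imbue(colon_locale);
        crew_record r;
        std::string extra;
        // Accept the line only if five fields are read and nothing is left over.
        if ((fields >> r.key >> r.first >> r.last >> r.rank >> r.additional)
            && !(fields >> extra)) {
            records.push_back(r);
        }
    }
    return records;
}
```

The trade-off is one extra std::istringstream per line in exchange for keeping record boundaries intact.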

Source

2011-03-28 17:45:27 Nawaz

I haven't looked into ctype yet; I'll have to read up on it. Thanks for your help. – rem45acp 2011-03-29 12:05:29
