C++: splitting a string on tokens found with a (Boost) regular expression

I need to split a string into "string chunks" based on tokens found by a regular expression. I also need the tokens themselves saved as part of the resulting chunks.

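Here is part of what I am ultimately after, showing the more complex regex and input string: https://regex101.com/r/bR9gW9/1

I tried to make a simple example, but it fails to compile: http://cpp.sh/9qifd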

#include <iostream> 
#include <string> 
#include <boost/regex.hpp> 
#include <vector> 
using namespace std; 

int main() 
{ 
    string data = "TAKE some stuff\nTAKE other stuff\nTAKE more stuff\n"; 
    boost::regex separate_take_chunks("TAKE"); 
    vector<string> take_chunks; 
    //boost::sregex_token_iterator i(data.begin(), data.end(), separate_take_chunks, -1); 
    boost::sregex_token_iterator j; 
    //while (i != j) cout << *i++; 
}

Here is a version using the std regex library that works, but it does not give me the tokens: http://cpp.sh/2jlv

#include <iostream> 
#include <string> 
#include <regex> 

using namespace std; 

int main() 
{ 
    string data = "TAKE some stuff\nTAKE other stuff\nTAKE more stuff\n"; 
    std::regex separate_take_chunks("TAKE"); 
    std::sregex_token_iterator iter(data.begin(), data.end(), separate_take_chunks, -1); 
    std::sregex_token_iterator end; 
    for (; iter != end; ++iter) 
        std::cout << *iter << "---\n"; 
}

Here is a version that does not use a regex; it would work nicely if I could replace the find call with a regex search:

size_t p1 = 4; 
size_t p2 = 0; 
while (p2 != string::npos) { 
    p2 = data.find("TAKE\n", p1); 
    take_chunks.push_back(data.substr(p1-4, p2)); 
    p1 = p2+4; 
}

Source

2015-12-08 Elan Hickler

Run it: http://cpp.sh/5ndl

#include <iostream> 
#include <string> 
#include <regex> 
#include <vector> 

using namespace std; 

int main() 
{ 
    string data = "NAME some name stuff\nTAKE some take stuff\nTAKE SEL some take sel stuff\n"; 
    regex separate_take_chunks("TAKE SEL|TAKE|NAME"); 

    vector<string> take_chunks; 

    std::sregex_token_iterator i(data.begin(), data.end(), separate_take_chunks, { -1, 0 }); 
    std::sregex_token_iterator j; 
    ++i; // there is no unmatched content (-1) initially, so skip it 
    while (i != j) { 
     take_chunks.push_back(*i++); // put matched content (0) in new index 
     if (i != j) take_chunks.back() += *i++; // add unmatched content (-1) 
    } 
    for (const auto& c : take_chunks) cout << c << "--" <<endl; 
}

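The { -1, 0 } means: output the unmatched content, followed by the matched content. Passing 1 or 2 instead would output regex group 1 or 2, and { 3, 4 } would output groups 3 and 4 one after the other. We are not using any groups here, so -1 and 0 are the only possible outputs.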

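The initial ++i skips the first -1 (unmatched content) and moves on to the 0 (matched content), because there is no unmatched content before NAME at the start of the string.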

Essentially, this creates the pattern:

-1 (skipped, because it is empty)

0 + -1 (matched and unmatched content concatenated)

0 + -1

...and so on.

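The way I think it works is that the regex search stops as soon as it finds a match, so once it finds NAME it is done capturing content for that iteration. At that point -1 is empty and 0 is "NAME". The initial ++i skips the empty -1. On the next iteration -1 does have unmatched content, because the regex had to scan past it to find "TAKE". So we append that unmatched -1 content to "NAME" and put "TAKE" into a new element of the vector.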

For match positions, in case you want to take the position/substr approach, also see: Get the index of all matches using regex_search?

Also helpful: http://www.cplusplus.com/reference/regex/match_results/

Source

2015-12-08 07:45:40

In the first example, you have not set the Boost header path. I am not sure whether you can do that in that online shell.

Source

2015-12-08 04:25:00 Nit
