2014-09-20 39 views

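`re.sub(pattern, functor, string)` for C++

Python regular expressions have a useful feature: a function can be used to determine the replacement. That is, re.sub(pattern, functor, string) passes each match result to the functor to obtain the replacement string. This is more flexible than the format-string syntax that uses '\1', '\2' to refer to submatches.

Now I want to achieve the same thing in C++, and I don't know how. My first idea was to use std::regex_replace, but it has no overload that accepts a function. Another idea was to use an iterator to split the text into tokens of type MATCH or NOT_MATCH, but the standard regex iterators seem to return only one kind: they either skip all the non-matches or skip all the matches.

Is there a way to do this? I would prefer to stay within the standard library.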


For any future readers: `boost::regex` seems to have such an overload. – 2014-10-13 04:26:58

Answers

1

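You can use the match result's .prefix() to get the unmatched part of the string that precedes a match, and .suffix() to get the unmatched remainder of the string that follows it.

Demo (adapted from here).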
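
The linked demo is not reproduced here. As a rough illustration of the prefix/suffix idea, the sketch below walks the matches with std::sregex_iterator and stitches the unmatched pieces back together around a callback's return value. The names `regex_replace_cb` and `bracket_digit_lengths` are hypothetical helpers made up for this example, not standard library functions.

```cpp
#include <functional>
#include <regex>
#include <string>

// Hypothetical helper (not part of the standard library): replaces every
// match of `re` in `text` with whatever `fn` returns for that match.
std::string regex_replace_cb(
    const std::string& text, const std::regex& re,
    const std::function<std::string(const std::smatch&)>& fn)
{
    std::sregex_iterator it(text.begin(), text.end(), re), end;
    if (it == end)
        return text;                   // no match: nothing to replace

    std::string out;
    std::ssub_match tail;
    for (; it != end; ++it) {
        out += it->prefix().str();     // unmatched text before this match
        out += fn(*it);                // replacement chosen by the callback
        tail = it->suffix();           // unmatched text after this match
    }
    out += tail.str();                 // remainder after the last match
    return out;
}

// Example use: replace each run of digits with its length in brackets.
std::string bracket_digit_lengths(const std::string& s)
{
    const std::regex digits("[0-9]+");
    return regex_replace_cb(s, digits, [](const std::smatch& m) {
        return "[" + std::to_string(m.str().size()) + "]";
    });
}
```

With this sketch, `bracket_digit_lengths("a1b22c333")` yields `a[1]b[2]c[3]`. The callback receives the full std::smatch, so it can also inspect individual capture groups.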

0

I wrote a blog post on this topic here: http://blog.brainstembreakfast.com/update/c++/2014/09/20/regex-replace-ext/

The function you are looking for is

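    #include <algorithm>
    #include <functional>
    #include <iterator>
    #include <numeric>
    #include <regex>
    #include <sstream>
    #include <string>
    #include <type_traits>
    #include <vector>

    // Replaces each non-empty submatch of `re` in `s` with fmt(index, text).
    // `index` is the 1-based capture group number, or 1 for the whole match
    // when the regex has no capture groups.
    template <class Traits, class CharT, class STraits, class SAlloc>
    inline std::basic_string<CharT, STraits, SAlloc>
    regex_replace_ext(const std::basic_string<CharT, STraits, SAlloc>& s,
                      const std::basic_regex<CharT, Traits>& re,
                      // common_type makes this a non-deduced context, so a lambda converts.
                      const typename std::common_type<std::function<
                          std::basic_string<CharT, STraits, SAlloc>(
                              const unsigned,
                              const std::basic_string<CharT, STraits, SAlloc>&)>>::type& fmt,
                      std::regex_constants::match_flag_type flags =
                          std::regex_constants::match_default)
    {
        // Token fields: -1 is the unmatched text, followed by either the
        // whole match (0) or every capture group (1, 2, ...).
        std::vector<int> smatches{-1};
        if (re.mark_count() == 0) {
            smatches.push_back(0);
        } else {
            smatches.resize(1 + re.mark_count());
            std::iota(std::next(smatches.begin()), smatches.end(), 1); // -1, 1, 2, ...
        }

        const unsigned smatch_count = static_cast<unsigned>(smatches.size());
        unsigned count = 0; // position of the current token within smatches

        std::regex_token_iterator<
            typename std::basic_string<CharT, STraits, SAlloc>::const_iterator>
            tbegin(s.begin(), s.end(), re, smatches, flags), tend;

        std::basic_stringstream<CharT, STraits, SAlloc> ret_val;
        std::for_each(tbegin, tend,
            [&count, &smatch_count, &ret_val, &fmt]
            (const std::basic_string<CharT, STraits, SAlloc>& token)
            {
                if (token.size() != 0) {
                    if (!count)
                        ret_val << token;             // unmatched text is copied as-is
                    else
                        ret_val << fmt(count, token); // submatch goes through the callback
                }
                count = (count + 1) % smatch_count;   // advance to the next field
            });
        return ret_val.str();
    }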
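
To see why cycling a counter over the field list works, it can help to look at the raw token stream. The sketch below uses a hypothetical helper, `list_tokens`, written only for this illustration; it hard-codes the fields {-1, 1, 2}, which assumes a regex with exactly two capture groups.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper for illustration: collects the tokens produced when a
// regex_token_iterator is asked for the fields {-1, 1, 2}.
std::vector<std::string> list_tokens(const std::string& s, const std::regex& re)
{
    const std::vector<int> fields{-1, 1, 2}; // unmatched text, group 1, group 2
    std::vector<std::string> tokens;
    std::sregex_token_iterator it(s.begin(), s.end(), re, fields), end;
    for (; it != end; ++it)
        tokens.push_back(it->str());         // a group that did not participate gives ""
    return tokens;
}

// For s = "x{a}y[b]z" and re = (\{.*?\})|(\[.*?\]) the tokens are
//   "x", "{a}", "", "y", "", "[b]", "z"
// so token i belongs to field i % 3, and the trailing unmatched text "z"
// lands on field 0 again.
```

Each match contributes exactly one token per field, which is why the position modulo the number of fields identifies whether a token is unmatched text or a particular capture group.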

Usage:

    // Needs <map> in addition to the headers used by regex_replace_ext.
    const std::string bss("{Id_1} [Fill_0] {Id_2} [Fill_1] {Id_3} {Id_4} {Id_5}.");
    const std::regex re("(\\{.*?\\})|(\\[.*?\\])");
    using dictionary = std::map<std::string, std::string>;
    // One dictionary per capture group. The element type must not be const:
    // std::vector<const T> is not valid with the default allocator.
    const std::vector<dictionary> dict
    {
        {
            {"{Id_1}", "This"},
            {"{Id_2}", "test"},
            {"{Id_3}", "my"},
            {"{Id_4}", "favorite"},
            {"{Id_5}", "hotdog"}
        },
        {
            {"[Fill_0]", "is a"},
            {"[Fill_1]", "of"}
        }
    };

    auto fmt1 = [&dict](const unsigned smatch, const std::string& s) -> std::string
    {
        const auto dict_smatch = smatch - 1; // capture groups are 1-based
        if (dict_smatch >= dict.size())
            return s; // more submatches than expected

        const auto it = dict[dict_smatch].find(s);
        return it != dict[dict_smatch].cend() ? it->second : s;
    };

    std::string modified_string = regex_replace_ext(bss, re, fmt1);
    // modified_string == "This is a test of my favorite hotdog."