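How to extract trimmed text using Boost Spirit?

Using Boost Spirit, I'd like to extract a string that is followed by some data in parentheses. The relevant string is separated from the opening parenthesis by a space. Unfortunately, the string itself may contain spaces. I'm looking for a concise solution that returns the string without the trailing space.

The following code illustrates the problem: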

#include <boost/spirit/include/qi.hpp> 
#include <boost/spirit/include/phoenix_operator.hpp> 
#include <string> 
#include <iostream> 

namespace qi = boost::spirit::qi; 
using std::string; 
using std::cout; 
using std::endl; 

void 
test_input(const string &input) 
{ 
    string::const_iterator b = input.begin(); 
    string::const_iterator e = input.end(); 
    string parsed; 
    bool const r = qi::parse(b, e, 
     *(qi::char_ - qi::char_("(")) >> qi::lit("(Spirit)"), 
      parsed 
    ); 
    if(r) { 
     cout << "PASSED:" << endl; 
    } else { 
     cout << "FAILED:" << endl; 
    } 
    cout << " Parsed: \"" << parsed << "\"" << endl; 
    cout << " Rest: \"" << string(b, e) << "\"" << endl; 
} 

int main() 
{ 
    test_input("Fine (Spirit)"); 
    test_input("Hello, World (Spirit)"); 

    return 0; 
}

Its output is:

PASSED: 
    Parsed: "Fine " 
    Rest: "" 
PASSED: 
    Parsed: "Hello, World " 
    Rest: ""

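With this simple grammar, the extracted string is always followed by a space (which I'd like to eliminate).

The solution should work within Spirit, since this is only part of a larger grammar. (Trimming the extracted strings after parsing would therefore probably be clumsy.)

Thank you in advance.

Source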

2013-10-25 Carsten Scholtes

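Always a space, and only a space? If that's the case, I think `*(qi::char_ - qi::lit(" ("))` should work, although there are probably better answers. – llonesmiz

Thanks a lot! It works with my test cases (although I don't claim to understand why (yet): it matches a single character that doesn't contain the literal!?) –

Unlike `~`, the difference parser is not specific to `qi::char_`. The binary operator `-` succeeds if its second operand fails and its first succeeds. In your example, as long as `qi::lit(" (")` does not match, your expression keeps adding characters to its synthesized attribute. – llonesmiz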

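As the comment says, in the case of a single space you can hard-code it. If you need something more flexible or tolerant:

I'd use a skipper together with `raw` to "cheat" the skipper for your purposes: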

bool const r = qi::phrase_parse(b, e, 
    qi::raw [ *(qi::char_ - qi::char_("(")) ] >> qi::lit("(Spirit)"), 
    qi::space, 
    parsed 
);

This works and prints

PASSED: 
    Parsed: "Fine" 
    Rest: "" 
PASSED: 
    Parsed: "Hello, World" 
    Rest: ""

See it live on Coliru.

Full program for reference:

#include <boost/spirit/include/qi.hpp> 
#include <boost/spirit/include/phoenix_operator.hpp> 
#include <string> 
#include <iostream> 

namespace qi = boost::spirit::qi; 
using std::string; 
using std::cout; 
using std::endl; 

void 
test_input(const string &input) 
{ 
    string::const_iterator b = input.begin(); 
    string::const_iterator e = input.end(); 
    string parsed; 
    bool const r = qi::phrase_parse(b, e, 
     qi::raw [ *(qi::char_ - qi::char_("(")) ] >> qi::lit("(Spirit)"), 
     qi::space, 
     parsed 
    ); 
    if(r) { 
     cout << "PASSED:" << endl; 
    } else { 
     cout << "FAILED:" << endl; 
    } 
    cout << " Parsed: \"" << parsed << "\"" << endl; 
    cout << " Rest: \"" << string(b, e) << "\"" << endl; 
} 

int main() 
{ 
    test_input("Fine (Spirit)"); 
    test_input("Hello, World (Spirit)"); 

    return 0; 
}

Source

2013-10-25 15:47:39 sehe

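@cv_and_he Thanks. There's a time and a place for everything. `raw` assumes that the exactly matched input sequence is your attribute data. If that is not the case, you need to add post-processing (_semantic actions_?) or, better, write a more fine-grained grammar. – sehe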

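Thank you for suggesting this interesting option. If I understand correctly, `raw` provides a flat string rather than an attribute that reflects the hierarchy of the expression inside `raw`. Since I would have to introduce a skip parser into existing code, I'm still hesitant to mark it as the solution. cv_and_he's comment seems more relevant here. –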

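@CarstenScholtes `qi::raw` exposes an iterator range: ["raw[] disregards the attribute of its subject parser, instead exposing the half-open range [first, last) pointing to the matched characters from the input stream."](http://www.boost.org/doc/libs/1_54_0/libs/spirit/doc/html/spirit/qi/reference/directive/raw.html) – sehe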
