2017-04-12

Boost.Spirit: parsing an optional prefix

I am trying to parse a string of whitespace-separated, optionally tagged keywords. For example:

descr:expense type:receivable customer 27.3 

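where the expression before the colon is the tag, and the tag is optional (i.e. a default tag is assumed).

I can't quite get the parser to do what I want. I made some small modifications to the canonical example, whose purpose is to parse key/value pairs (much like an HTTP query string).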

typedef std::pair<boost::optional<std::string>, std::string> pair_type; 
typedef std::vector<pair_type> pairs_type; 

template <typename Iterator> 
struct field_value_sequence_default_field 
    : qi::grammar<Iterator, pairs_type()> 
{ 
    field_value_sequence_default_field()
        : field_value_sequence_default_field::base_type(query)
    {
        query = pair >> *(qi::lit(' ') >> pair);
        pair  = -(field >> ':') >> value;
        field = +qi::char_("a-zA-Z0-9");
        value = +qi::char_("a-zA-Z0-9+-\\.");
    }

    qi::rule<Iterator, pairs_type()> query; 
    qi::rule<Iterator, pair_type()> pair; 
    qi::rule<Iterator, std::string()> field, value; 
}; 

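However, when I parse with it and the tag is omitted, the optional<string> is not empty/false. Instead, it holds a copy of the value. The second part of the pair holds the value as well.

If the untagged keyword cannot be a tag (per the grammar rules, e.g. it contains a decimal point), then things work as I would expect.

What am I doing wrong? Is this a conceptual mistake about PEGs?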

Answer

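"Instead, it holds a copy of the value. The second part of the pair holds the value as well."

This is the common pitfall with container attributes and backtracking: when `field >> ':'` fails after `field` has already matched, the input position is rewound but the attribute is not. Use qi::hold; see, for example, "Understanding Boost.spirit's string parser".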

pair = -qi::hold[field >> ':'] >> value; 

Complete example (Live On Coliru):

#include <boost/spirit/include/qi.hpp> 
#include <boost/fusion/adapted/std_pair.hpp> 
#include <boost/optional/optional_io.hpp> 
#include <iostream> 

namespace qi = boost::spirit::qi; 

typedef std::pair<boost::optional<std::string>, std::string> pair_type; 
typedef std::vector<pair_type> pairs_type; 

template <typename Iterator> 
struct Grammar : qi::grammar<Iterator, pairs_type()> 
{ 
    Grammar() : Grammar::base_type(query) { 
        query = pair % ' ';
        pair  = -qi::hold[field >> ':'] >> value;
        field = +qi::char_("a-zA-Z0-9");
        value = +qi::char_("a-zA-Z0-9+-\\.");
    }
  private:
    qi::rule<Iterator, pairs_type()> query; 
    qi::rule<Iterator, pair_type()> pair; 
    qi::rule<Iterator, std::string()> field, value; 
}; 

int main() 
{ 
    using It = std::string::const_iterator; 

    for (std::string const input : {
            "descr:expense type:receivable customer 27.3",
            "expense type:receivable customer 27.3",
            "descr:expense receivable customer 27.3",
            "expense receivable customer 27.3",
        })
    {
        It f = input.begin(), l = input.end();

        std::cout << "==== '" << input << "' =============\n";
        pairs_type data;
        if (qi::parse(f, l, Grammar<It>(), data)) {
            std::cout << "Parsed: \n";
            for (auto& p : data) {
                std::cout << p.first << "\t->'" << p.second << "'\n";
            }
        } else {
            std::cout << "Parse failed\n";
        }

        if (f != l)
            std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
    }
} 

Prints:

==== 'descr:expense type:receivable customer 27.3' ============= 
Parsed: 
descr ->'expense' 
type ->'receivable' 
-- ->'customer' 
-- ->'27.3' 
==== 'expense type:receivable customer 27.3' ============= 
Parsed: 
-- ->'expense' 
type ->'receivable' 
-- ->'customer' 
-- ->'27.3' 
==== 'descr:expense receivable customer 27.3' ============= 
Parsed: 
descr ->'expense' 
-- ->'receivable' 
-- ->'customer' 
-- ->'27.3' 
==== 'expense receivable customer 27.3' ============= 
Parsed: 
-- ->'expense' 
-- ->'receivable' 
-- ->'customer' 
-- ->'27.3'