2015-12-29 89 views
6

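I want to parse (at first, only recognize, keeping the symbols) LaTeX math. Right now I'm having trouble with super- and subscripts combined with curly braces (e.g. a^{bc} and combinations thereof; I already have the basic a^b working fine). Here is a minimal example (as short as humanly possible while staying readable). How do I get this recursive rule to work?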

#include <iostream> 
    using std::cout; 
#include <string> 
    using std::string; 

#include <boost/spirit/home/x3.hpp> 
    namespace x3 = boost::spirit::x3; 
    using x3::space; 
    using x3::char_; 
    using x3::lit; 
    using x3::repeat; 

x3::rule<struct scripts, string> scripts = "super- and subscripts"; 
x3::rule<struct braced_thing, string> braced_thing = "thing optionaly surrounded by curly braces"; 
x3::rule<struct superscript, string> superscript = "superscript"; 
x3::rule<struct subscript, string> subscript = "subscript"; 

// main rule: any number of items with or without braces 
auto const scripts_def = *braced_thing; 
// second level main rule: optional braces, and any number of characters or sub/superscripts 
auto const braced_thing_def = -lit('{') >> *(subscript | superscript | repeat(1)[(char_ - "_^{}")]) >> -lit('}'); 
// superscript: things of the form a^b where a and b can be surrounded by curly braces 
auto const superscript_def = braced_thing >> '^' >> braced_thing; 
// subscript: things of the form a_b where a and b can be surrounded by curly braces 
auto const subscript_def = braced_thing >> '_' >> braced_thing; 

BOOST_SPIRIT_DEFINE(scripts) 
BOOST_SPIRIT_DEFINE(braced_thing) 
BOOST_SPIRIT_DEFINE(superscript) 
BOOST_SPIRIT_DEFINE(subscript) 

int main() 
{ 
    const string input = "a^{b_x y}_z {v_x}^{{x^z}_y}"; 
    string output; // will only contain the characters as the grammar is defined above 
    auto first = input.begin(); 
    auto last = input.end(); 
    const bool result = x3::phrase_parse(first, last,
                                         scripts,
                                         space,
                                         output);
    if(first != last)
        std::cout << "partial match only:\n" << output << '\n';
    else if(!result)
        std::cout << "parse failed!\n";
    else
        std::cout << "parsing succeeded:\n" << output << '\n';
} 

This is also available on Coliru.

The problem is that this segfaults (for obvious reasons, I'm sure), and I have no other way of, well... expressing this in an expression grammar.

+0

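Your problem is similar to (but a lot more complex than) [this one](http://stackoverflow.com/questions/18611990/flipping-the-order-of-subrules-inside-a-rule-in-a-boostspirit-grammar-results). I'm really not sure whether [this](http://coliru.stacked-crooked.com/a/79e2edf0a6ff86d1) is correct, but see if it helps. If you need to create an AST in the future... it won't be pretty (semantic action hell). Hopefully you'll get a better answer. PS: your `char_ - "_^{}"` is incorrect. It is equivalent to `char_ - lit("_^{}")`, but `lit("abc")` matches exactly "abc", not "a" or "b" or "c". – llonesmiz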

+0

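@cv_and_he Indeed, your sample removes the left recursion and fixes the sloppy handling of "{}". Here is [an update showing](http://coliru.stacked-crooked.com/a/30b2ee7981c52bab) that it at least matches the same test cases (I'm pretty sure an AST would reveal some differences, but I guess we can't tell which is a better fit for the OP's needs). – sehe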

Answers

4

I haven't looked at @cv_and_he's suggestion; instead I debugged your grammar myself. I came up with this:

auto token  = lexeme [ +~char_("_^{} \t\r\n") ]; 
auto simple  = '{' >> sequence >> '}' | token; 
auto expr   = lexeme [ simple % char_("_^") ]; 
auto sequence_def = expr % +space; 

What got me there was basically a step-by-step rethinking of what the actual grammar looks like.

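It took me two days to think of the right way to get "a b" to parse (at first I "hacked" it by treating the space as just another sub/superscript operator, char_(" _^"), but I got the impression that this would not lead to the AST you expect. The clue was that you used the space skipper).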

For now there is no AST; we simply "harvest" the matched raw string using `x3::raw[...]`.

Live Coliru

//#define BOOST_SPIRIT_X3_DEBUG 
#include <iostream> 
#include <string> 

#include <boost/spirit/home/x3.hpp> 
namespace x3 = boost::spirit::x3; 

namespace grammar { 
    using namespace x3; 
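    rule<struct _s> sequence { "sequence" };

    // simple: a braced sequence, or a run of characters that are neither operators, braces nor whitespace
    // (it gets its own tag type so that it does not share a rule ID with sequence)
    auto simple = rule<struct _simple> {"simple"} = '{' >> sequence >> '}' | lexeme [ +~char_("_^{} \t\r\n") ];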
    auto expr = rule<struct _e> {"expr"} = lexeme [ simple % char_("_^") ]; 
    auto sequence_def = expr % +space; 
    BOOST_SPIRIT_DEFINE(sequence) 
} 

int main() { 
    for (const std::string input : {
            "a",
            "a^b",     "a_b",     "a b",
            "{a}^{b}", "{a}_{b}", "{a} {b}",
            "a^{b_x y}",
            "a^{b_x y}_z {v_x}^{{x^z}_y}"
        })
    {
        std::string output; // will only contain the characters as the grammar is defined above
        auto first = input.begin(), last = input.end();
        bool result = x3::parse(first, last, x3::raw[grammar::sequence], output);

        if (result)
            std::cout << "Parse success: '" << output << "'\n";
        else
            std::cout << "parse failed!\n";

        if (last != first)
            std::cout << "remaining unparsed: '" << std::string(first, last) << "'\n";
    }
} 

Output:

Parse success: 'a' 
Parse success: 'a^b' 
Parse success: 'a_b' 
Parse success: 'a b' 
Parse success: '{a}^{b}' 
Parse success: '{a}_{b}' 
Parse success: '{a} {b}' 
Parse success: 'a^{b_x y}' 
Parse success: 'a^{b_x y}_z {v_x}^{{x^z}_y}' 

Output with debug information enabled:

<sequence> 
<try>a</try> 
<expr> 
    <try>a</try> 
    <simple> 
    <try>a</try> 
    <success></success> 
    </simple> 
    <success></success> 
</expr> 
<success></success> 
</sequence> 
Parse success: 'a' 
<sequence> 
<try>a^b</try> 
<expr> 
    <try>a^b</try> 
    <simple> 
    <try>a^b</try> 
    <success>^b</success> 
    </simple> 
    <simple> 
    <try>b</try> 
    <success></success> 
    </simple> 
    <success></success> 
</expr> 
<success></success> 
</sequence> 
Parse success: 'a^b' 
<sequence> 
<try>a_b</try> 
<expr> 
    <try>a_b</try> 
    <simple> 
    <try>a_b</try> 
    <success>_b</success> 
    </simple> 
    <simple> 
    <try>b</try> 
    <success></success> 
    </simple> 
    <success></success> 
</expr> 
<success></success> 
</sequence> 
Parse success: 'a_b' 
<sequence> 
<try>a b</try> 
<expr> 
    <try>a b</try> 
    <simple> 
    <try>a b</try> 
    <success> b</success> 
    </simple> 
    <success> b</success> 
</expr> 
<expr> 
    <try>b</try> 
    <simple> 
    <try>b</try> 
    <success></success> 
    </simple> 
    <success></success> 
</expr> 
<success></success> 
</sequence> 
Parse success: 'a b' 
<sequence> 
<try>{a}^{b}</try> 
<expr> 
    <try>{a}^{b}</try> 
    <simple> 
    <try>{a}^{b}</try> 
    <sequence> 
     <try>a}^{b}</try> 
     <expr> 
     <try>a}^{b}</try> 
     <simple> 
      <try>a}^{b}</try> 
      <success>}^{b}</success> 
     </simple> 
     <success>}^{b}</success> 
     </expr> 
     <success>}^{b}</success> 
    </sequence> 
    <success>^{b}</success> 
    </simple> 
    <simple> 
    <try>{b}</try> 
    <sequence> 
     <try>b}</try> 
     <expr> 
     <try>b}</try> 
     <simple> 
      <try>b}</try> 
      <success>}</success> 
     </simple> 
     <success>}</success> 
     </expr> 
     <success>}</success> 
    </sequence> 
    <success></success> 
    </simple> 
    <success></success> 
</expr> 
<success></success> 
</sequence> 
Parse success: '{a}^{b}' 
<sequence> 
<try>{a}_{b}</try> 
<expr> 
    <try>{a}_{b}</try> 
    <simple> 
    <try>{a}_{b}</try> 
    <sequence> 
     <try>a}_{b}</try> 
     <expr> 
     <try>a}_{b}</try> 
     <simple> 
      <try>a}_{b}</try> 
      <success>}_{b}</success> 
     </simple> 
     <success>}_{b}</success> 
     </expr> 
     <success>}_{b}</success> 
    </sequence> 
    <success>_{b}</success> 
    </simple> 
    <simple> 
    <try>{b}</try> 
    <sequence> 
     <try>b}</try> 
     <expr> 
     <try>b}</try> 
     <simple> 
      <try>b}</try> 
      <success>}</success> 
     </simple> 
     <success>}</success> 
     </expr> 
     <success>}</success> 
    </sequence> 
    <success></success> 
    </simple> 
    <success></success> 
</expr> 
<success></success> 
</sequence> 
Parse success: '{a}_{b}' 
<sequence> 
<try>{a} {b}</try> 
<expr> 
    <try>{a} {b}</try> 
    <simple> 
    <try>{a} {b}</try> 
    <sequence> 
     <try>a} {b}</try> 
     <expr> 
     <try>a} {b}</try> 
     <simple> 
      <try>a} {b}</try> 
      <success>} {b}</success> 
     </simple> 
     <success>} {b}</success> 
     </expr> 
     <success>} {b}</success> 
    </sequence> 
    <success> {b}</success> 
    </simple> 
    <success> {b}</success> 
</expr> 
<expr> 
    <try>{b}</try> 
    <simple> 
    <try>{b}</try> 
    <sequence> 
     <try>b}</try> 
     <expr> 
     <try>b}</try> 
     <simple> 
      <try>b}</try> 
      <success>}</success> 
     </simple> 
     <success>}</success> 
     </expr> 
     <success>}</success> 
    </sequence> 
    <success></success> 
    </simple> 
    <success></success> 
</expr> 
<success></success> 
</sequence> 
Parse success: '{a} {b}' 
<sequence> 
<try>a^{b_x y}</try> 
<expr> 
    <try>a^{b_x y}</try> 
    <simple> 
    <try>a^{b_x y}</try> 
    <success>^{b_x y}</success> 
    </simple> 
    <simple> 
    <try>{b_x y}</try> 
    <sequence> 
     <try>b_x y}</try> 
     <expr> 
     <try>b_x y}</try> 
     <simple> 
      <try>b_x y}</try> 
      <success>_x y}</success> 
     </simple> 
     <simple> 
      <try>x y}</try> 
      <success> y}</success> 
     </simple> 
     <success> y}</success> 
     </expr> 
     <expr> 
     <try>y}</try> 
     <simple> 
      <try>y}</try> 
      <success>}</success> 
     </simple> 
     <success>}</success> 
     </expr> 
     <success>}</success> 
    </sequence> 
    <success></success> 
    </simple> 
    <success></success> 
</expr> 
<success></success> 
</sequence> 
Parse success: 'a^{b_x y}' 
<sequence> 
<try>a^{b_x y}_z {v_x}^{{</try> 
<expr> 
    <try>a^{b_x y}_z {v_x}^{{</try> 
    <simple> 
    <try>a^{b_x y}_z {v_x}^{{</try> 
    <success>^{b_x y}_z {v_x}^{{x</success> 
    </simple> 
    <simple> 
    <try>{b_x y}_z {v_x}^{{x^</try> 
    <sequence> 
     <try>b_x y}_z {v_x}^{{x^z</try> 
     <expr> 
     <try>b_x y}_z {v_x}^{{x^z</try> 
     <simple> 
      <try>b_x y}_z {v_x}^{{x^z</try> 
      <success>_x y}_z {v_x}^{{x^z}</success> 
     </simple> 
     <simple> 
      <try>x y}_z {v_x}^{{x^z}_</try> 
      <success> y}_z {v_x}^{{x^z}_y</success> 
     </simple> 
     <success> y}_z {v_x}^{{x^z}_y</success> 
     </expr> 
     <expr> 
     <try>y}_z {v_x}^{{x^z}_y}</try> 
     <simple> 
      <try>y}_z {v_x}^{{x^z}_y}</try> 
      <success>}_z {v_x}^{{x^z}_y}</success> 
     </simple> 
     <success>}_z {v_x}^{{x^z}_y}</success> 
     </expr> 
     <success>}_z {v_x}^{{x^z}_y}</success> 
    </sequence> 
    <success>_z {v_x}^{{x^z}_y}</success> 
    </simple> 
    <simple> 
    <try>z {v_x}^{{x^z}_y}</try> 
    <success> {v_x}^{{x^z}_y}</success> 
    </simple> 
    <success> {v_x}^{{x^z}_y}</success> 
</expr> 
<expr> 
    <try>{v_x}^{{x^z}_y}</try> 
    <simple> 
    <try>{v_x}^{{x^z}_y}</try> 
    <sequence> 
     <try>v_x}^{{x^z}_y}</try> 
     <expr> 
     <try>v_x}^{{x^z}_y}</try> 
     <simple> 
      <try>v_x}^{{x^z}_y}</try> 
      <success>_x}^{{x^z}_y}</success> 
     </simple> 
     <simple> 
      <try>x}^{{x^z}_y}</try> 
      <success>}^{{x^z}_y}</success> 
     </simple> 
     <success>}^{{x^z}_y}</success> 
     </expr> 
     <success>}^{{x^z}_y}</success> 
    </sequence> 
    <success>^{{x^z}_y}</success> 
    </simple> 
    <simple> 
    <try>{{x^z}_y}</try> 
    <sequence> 
     <try>{x^z}_y}</try> 
     <expr> 
     <try>{x^z}_y}</try> 
     <simple> 
      <try>{x^z}_y}</try> 
      <sequence> 
      <try>x^z}_y}</try> 
      <expr> 
       <try>x^z}_y}</try> 
       <simple> 
       <try>x^z}_y}</try> 
       <success>^z}_y}</success> 
       </simple> 
       <simple> 
       <try>z}_y}</try> 
       <success>}_y}</success> 
       </simple> 
       <success>}_y}</success> 
      </expr> 
      <success>}_y}</success> 
      </sequence> 
      <success>_y}</success> 
     </simple> 
     <simple> 
      <try>y}</try> 
      <success>}</success> 
     </simple> 
     <success>}</success> 
     </expr> 
     <success>}</success> 
    </sequence> 
    <success></success> 
    </simple> 
    <success></success> 
</expr> 
<success></success> 
</sequence> 
Parse success: 'a^{b_x y}_z {v_x}^{{x^z}_y}' 
+0

You can still enjoy watching me stumble and flail in the [recorded live coding session](https://www.livecoding.tv/video/rethinking-x3-latex-maths-expression-grammar/). – sehe