LaTeX fractions to MathML in Python with regular expressions

I have some text:

\frac{A}{B}

I need to convert this text into the following form:

<mfrac> 
<mrow> 
    A 
</mrow> 
<mrow> 
    B 
</mrow> 
</mfrac>

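I have to use Python and regular expressions. A and B can themselves contain further fractions, so the function must be recursive. For example, the text: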

\frac{1+x}{1+\frac{1}{x}}

must be changed to

<mfrac> 
<mrow> 
    1+x 
</mrow> 
<mrow> 
    1+ 
    <mfrac> 
    <mrow> 
    1 
    </mrow> 
    <mrow> 
    x 
    </mrow> 
    </mfrac> 
</mrow> 
</mfrac>

Please use regular expressions :)

Source

2014-01-20 It' sMe

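You can't do this with a regex alone in the re module, because the re module has no recursion feature. However, you can install the alternative regular expression module (named: regex), which does implement it. More information here: https://pypi.python.org/pypi/regex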

@CasimiretHippolyte What would a regex using the recursion feature look like? – Stephan

Presumably your desired output is only meta-syntax, and you don't really generate '1 + x'? (Proper MathML would be '<mn>1</mn><mo>+</mo><mi>x</mi>'.)
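To illustrate the point about token elements: presentation MathML normally wraps numbers in <mn>, operators in <mo> and identifiers in <mi>. The following is only a minimal sketch of how a flat expression such as '1 + x' could be marked up with the standard re module. The name tokens_to_mathml and the small set of recognised operators are assumptions made for this example, not something from the question.

```python
import re

# One named group per MathML token element; the group name doubles as the tag.
TOKEN_RE = re.compile(r"""
    (?P<mn>\d+(?:\.\d+)?) |   # numbers, e.g. 1 or 3.14
    (?P<mi>[A-Za-z])      |   # single-letter identifiers
    (?P<mo>[-+*/=()])         # a few operators and fences (assumed set)
""", re.X)

def tokens_to_mathml(expr):
    """Mark up a flat expression (no \\frac) with <mn>, <mo> and <mi>."""
    parts = []
    pos = 0
    while pos < len(expr):
        if expr[pos].isspace():
            pos += 1  # whitespace carries no meaning here
            continue
        m = TOKEN_RE.match(expr, pos)
        if m is None:
            raise ValueError(f"unexpected character {expr[pos]!r} at position {pos}")
        tag = m.lastgroup  # 'mn', 'mi' or 'mo'
        parts.append(f"<{tag}>{m.group()}</{tag}>")
        pos = m.end()
    return "".join(parts)

print(tokens_to_mathml("1 + x"))
```

Anything outside the assumed token set raises ValueError rather than being passed through silently.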

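To help: if you need to match recursive patterns with the default Python re module, you can do what I did recently for the nested comments in a CSS preprocessor I built.

In general, use re only to split the text into tokens, then use a loop with a nesting-level variable to find all the syntax. Here is my code: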

import re

# The regex only finds the comment delimiters; nesting is tracked in code.
COMMENTsRe = re.compile(r"""
    //  |
    \n  |
    /\* |
    \*/
    """, re.X)

def rm_comments(cut):
    nocomment = 0  # not inside a comment
    c = 1          # C-style comments, but nestable
    cpp = 2        # C++-style line comments

    mode = nocomment
    clevel = 0  # nesting level of C-style comments
    matchesidx = []

    # A pure regex cannot find nested structures, so we only
    # find all the boundaries with it and parse them here.
    matches = COMMENTsRe.finditer(str(cut))
    start = 0
    for i in matches:
        m = i.group()
        if mode == cpp:
            if m == "\n":
                matchesidx.append((start, i.end() - 1))  # -1 to leave the \n in place
                mode = nocomment
        elif mode == c:
            if m == "/*":
                clevel += 1
            if m == "*/":
                clevel -= 1
                if clevel == 0:
                    matchesidx.append((start, i.end()))
                    mode = nocomment
        else:
            if m == "//":
                start = i.start()
                mode = cpp
            elif m == "/*":
                start = i.start()
                mode = c
                clevel += 1

    # rm_and_save is defined elsewhere (not shown here)
    cut.rm_and_save(matchesidx)
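The same idea can be applied to the \frac problem from the question. What follows is a rough sketch under a few assumptions, not a complete LaTeX parser: re is used only to locate \frac and the braces, a depth counter finds each matching closing brace, and plain Python recursion handles nested fractions. The names read_group and frac_to_mathml are made up for this example, the output is emitted without the whitespace shown in the question, and each '{' is assumed to follow \frac directly.

```python
import re

FRAC_RE = re.compile(r"\\frac")
BRACE_RE = re.compile(r"[{}]")

def read_group(text, pos):
    """Return (content, end) for the brace group that opens at text[pos]."""
    if pos >= len(text) or text[pos] != "{":
        raise ValueError(f"expected '{{' at position {pos}")
    depth = 0
    for m in BRACE_RE.finditer(text, pos):
        if m.group() == "{":
            depth += 1
        else:
            depth -= 1
            if depth == 0:  # found the brace matching text[pos]
                return text[pos + 1:m.start()], m.end()
    raise ValueError("unbalanced braces")

def frac_to_mathml(text):
    """Replace every \\frac{A}{B} in text, including nested ones."""
    out = []
    pos = 0
    while True:
        m = FRAC_RE.search(text, pos)
        if m is None:
            out.append(text[pos:])  # nothing left to convert
            break
        out.append(text[pos:m.start()])
        num, after_num = read_group(text, m.end())
        den, after_den = read_group(text, after_num)
        # Recurse so fractions inside A or B are converted as well.
        out.append("<mfrac><mrow>" + frac_to_mathml(num) + "</mrow>"
                   "<mrow>" + frac_to_mathml(den) + "</mrow></mfrac>")
        pos = after_den
    return "".join(out)

print(frac_to_mathml(r"\frac{1+x}{1+\frac{1}{x}}"))
```

For the nested example from the question this prints <mfrac><mrow>1+x</mrow><mrow>1+<mfrac><mrow>1</mrow><mrow>x</mrow></mfrac></mrow></mfrac> on one line. Unbalanced braces raise ValueError.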

Source

2014-03-20 07:22:16
