Unicode support in regular expression named capture groups in Python

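I currently use re2, re and pcre for regular expression matching in Python. A pattern such as re.compile("(?P(\S*))") compiles without errors, but when I use Unicode characters in the group name, as in re.compile("(?P<årsag>(\S*))"), compilation fails with an error. Is there any Python library that fully supports Unicode in group names?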

Edit: please see my output:

>>> import regex 
>>> m = regex.compile(r"(?P<årsag>(\S*))") 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/usr/local/lib/python2.7/site-packages/regex.py", line 331, in compile 
    return _compile(pattern, flags, kwargs) 
    File "/usr/local/lib/python2.7/site-packages/regex.py", line 499, in _compile 
    caught_exception.pos) 
_regex_core.error: bad character in group name at position 10

2015-04-02 Jiwan

You need to use an external regular expression module. The regex module supports Unicode characters in the names of named capture groups.

>>> import regex 
>>> m = regex.compile(r"(?P<årsag>(\S*))") 
>>> m.search('foo').group('årsag') 
'foo' 
>>> m.search('foo bar').group('årsag') 
'foo'

2015-04-02 05:51:43

Thanks, but it is still not supported... I still get the error "_regex_core.error: bad character in group name at position 10". Do I need to change anything? – Jiwan 2015-04-02 07:23:17

It works fine in 3.4 – 2015-04-02 07:27:32
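
The version difference is easy to check with the standard library alone. The sketch below assumes Python 3, where the built-in re module accepts any valid Python identifier as a group name in a str pattern, so non-ASCII letters such as "å" are allowed. It uses re instead of the third-party regex module only so that it runs without extra installs. The variable names are illustrative.

```python
import re
import sys

# Assumption: Python 3. In str patterns, group names must be valid
# identifiers, and identifiers may contain non-ASCII letters.
pattern = re.compile(r"(?P<årsag>(\S*))")

# Search the same input used in the answer and read the group by name.
match = pattern.search("foo bar")
first_word = match.group("årsag")

print(sys.version_info[:2], first_word)
```

If this compiles and prints the first word, the interpreter handles Unicode group names; the traceback in the question came from a Python 2.7 installation.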

I am using Python 2.7.9 on a Mac – Jiwan 2015-04-02 07:33:09
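
If upgrading the interpreter is not an option, one possible workaround is to keep the group name ASCII inside the pattern and translate from the Unicode name in your own code. This is only a sketch and was not proposed in the thread: the alias "aarsag", the GROUP_ALIASES mapping and the named_group helper are all hypothetical names chosen for illustration.

```python
# -*- coding: utf-8 -*-
import re

# Hypothetical mapping from the Unicode name we want to use in our code
# to the ASCII-only group name that the pattern actually contains.
GROUP_ALIASES = {u"årsag": "aarsag"}

# The pattern itself uses only the ASCII alias as the group name.
pattern = re.compile(r"(?P<aarsag>(\S*))")


def named_group(match, display_name):
    """Return the text captured by the group registered under display_name."""
    return match.group(GROUP_ALIASES[display_name])


result = named_group(pattern.search("foo bar"), u"årsag")
print(result)
```

The cost is one extra lookup table to maintain, but the pattern then compiles on engines that reject non-ASCII group names.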
