Python positive lookbehind split with variable width

I thought I had set up the expression properly, but the split is not working as intended.

c = re.compile(r'(?<=^\d\.\d{1,2})\s+'); 
for header in ['1.1 Introduction', '1.42 Appendix']: 
    print re.split(c, header)

Expected result:

['1.1', 'Introduction'] 
['1.42', 'Appendix']

I am getting the following stack trace:

Traceback (most recent call last):
     File "foo.py", line 1, in
          c = re.compile(r'(?<=^\d.\d{1,2})\s+');
     File "C:\Python27\lib\re.py", line 190, in compile
          return _compile(pattern, flags)
     File "C:\Python27\lib\re.py", line 242, in _compile
          raise error, v # invalid expression
sre_constants.error: look-behind requires fixed-width pattern
<<< Process finished. (Exit code 1)

2014-03-30 Mr. Polywhirl

The error message says it all: you can't have a variable-length lookbehind in Python's regex engine. – roippi

Take a look at the [regex](https://pypi.python.org/pypi/regex) module, which allows variable-length lookbehind. – BrenBarn

Lookbehinds in Python cannot be variable width, so your lookbehind is invalid.

You can use a capture group as a workaround:

c = re.compile(r'(^\d\.\d{1,2})\s+'); 
for header in ['1.1 Introduction', '1.42 Appendix']: 
    print re.split(c, header)[1:] # Remove the first element because it's empty

Output:

['1.1', 'Introduction'] 
['1.42', 'Appendix']
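
To see why the `[1:]` slice is needed, here is a minimal sketch of what `re.split` returns before slicing. The helper name `split_header` is made up for this illustration, and `print()` is used so the snippet runs on Python 3 as well as the Python 2.7 from the question.

```python
import re

# Same pattern as above: the numbering is captured, the whitespace is not.
pattern = re.compile(r'(^\d\.\d{1,2})\s+')

def split_header(header):
    # re.split includes the text of capturing groups in its result.
    # Because the match starts at position 0, the piece before it is an
    # empty string, which is dropped here.
    # Note: a header that does not match the pattern yields an empty list.
    return pattern.split(header)[1:]

raw = pattern.split('1.42 Appendix')
print(raw)                             # ['', '1.42', 'Appendix']
print(split_header('1.1 Introduction'))  # ['1.1', 'Introduction']
```

The whitespace itself is outside the group, so it is consumed by the split and does not show up in the result.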

2014-03-30 18:43:00 Jerry

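Do as long as the expression is not global, your method splits on the first space. Sounds like it will work in my case. –

@Mr.Polywhirl It will split on the first '(^\d\.\d{1,2})\s+', but the capture group means the captured characters are kept in the result. So it splits exactly where you want it to split. – Jerry

'Do' should be 'So' above... Anyway, nice trick. Thanks. –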

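The error in your regex is the {1,2} part: lookbehinds must be fixed width, so quantifiers that can match a varying number of characters are not allowed.

Try this website to test your regex before putting it into your code.

But in your case, you don't need a regex at all.

Just try this: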

for header in ['1.1 Introduction', '1.42 Appendix']: 
    print header.split(' ')

Result:

['1.1', 'Introduction'] 
['1.42', 'Appendix']

Hope this helps.
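
One caveat worth noting: a plain `split(' ')` also breaks up titles that contain more than one word. A possible variant, sketched here under the assumption that only the first run of whitespace should separate the number from the title, is to pass a `maxsplit` of 1. The helper name `split_first` and the sample header '1.1 Getting Started' are made up for this illustration.

```python
def split_first(header):
    # sep=None splits on any run of whitespace; maxsplit=1 stops after the
    # first split, so the rest of the title stays in one piece.
    return header.split(None, 1)

print('1.1 Getting Started'.split(' '))    # ['1.1', 'Getting', 'Started']
print(split_first('1.1 Getting Started'))  # ['1.1', 'Getting Started']
print(split_first('1.42   Appendix'))      # ['1.42', 'Appendix']
```

Unlike the regex versions, this does not check that the header actually starts with a section number.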

2014-03-30 18:41:57 najjarammar

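So '*', '+', '{n}', etc. are all not allowed? –

'{n}' is allowed if 'n' is a constant such as '1', '2' or '5'. The others are not allowed because the length they match is not constant. In short, the text matched by the lookbehind must have a constant length. – najjarammar

More info here: https://docs.python.org/2/library/re.html – najjarammar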
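
The fixed-width rule from the comments above can be checked directly with the standard `re` module. This is only a small sketch; the helper name `compiles` is made up for the illustration and simply reports whether a pattern compiles.

```python
import re

def compiles(pattern):
    # Return True if the pattern compiles, False if re rejects it.
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(r'(?<=\d{2})x'))    # True:  {2} always matches two characters
print(compiles(r'(?<=\d{1,2})x'))  # False: one or two characters
print(compiles(r'(?<=\d+)x'))      # False: one or more characters
print(compiles(r'(?<=\d*)x'))      # False: zero or more characters
```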

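My solution may look lame, but you are only checking for one or two digits after the dot. So you can use two lookbehinds, one for each width.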

c = re.compile(r'(?:(?<=^\d\.\d\d)|(?<=^\d\.\d))\s+');
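
For completeness, a short sketch of that pattern applied to the headers from the question. The third header, '1.1 Getting Started', is an extra sample added here to show that the '^' anchor inside each lookbehind limits the split to the whitespace right after the number.

```python
import re

# Each alternative is a fixed-width lookbehind (4 and 3 characters),
# so the pattern compiles with the standard re module.
c = re.compile(r'(?:(?<=^\d\.\d\d)|(?<=^\d\.\d))\s+')

headers = ['1.1 Introduction', '1.42 Appendix', '1.1 Getting Started']
results = [c.split(header) for header in headers]

for parts in results:
    print(parts)
# ['1.1', 'Introduction']
# ['1.42', 'Appendix']
# ['1.1', 'Getting Started']
```

No slicing is needed with this version, because nothing is captured and the match never starts at position 0.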

2014-03-30 19:01:42
