2014-04-02
s = "LEV606 (P), LEV230 (P)" 
#Expected result: ['LEV606', 'LEV230'] 

# First attempt 
In [3]: re.findall(r"[A-Z]{3}[0-9]{3}[ \(P\)]?", s) 
Out[3]: ['LEV606 ', 'LEV230 '] 

# Second attempt. The 'P' is not mandatory; it can be another letter. 
# Why doesn't this work? 
In [4]: re.findall(r"[A-Z]{3}[0-9]{3}[ \([A-Z]{1}\)]?", s) 
Out[4]: [] 

# Third attempt 
# The whitespace is still there. Why? I want to remove it from the result. 
In [5]: re.findall(r"[A-Z]{3}[0-9]{3}[\s\(\w\)]?", s) 
Out[5]: ['LEV606 ', 'LEV230 '] 

Answer

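You are using the [...] syntax incorrectly; that is a character class, a set of characters that can match. Any single character listed in the class matches, so a space, a ( character, a P, or a ) will each do. Because the class matches only one character, the space alone satisfies it, and that is why the space ends up in your results.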
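To make this concrete, here is a small sketch that reruns the first two attempts from the question. The variable names `first` and `second` are only for illustration. The comments describe how Python's `re` module appears to parse each pattern: in the second attempt the character class ends at the first unescaped `]`, so the rest of the pattern is treated as ordinary regex syntax.

```python
import re

s = "LEV606 (P), LEV230 (P)"

# [ \(P\)] is a set of four characters: space, '(', 'P', ')'.
# With '?', it consumes at most ONE of them, which here is the space.
first = re.findall(r"[A-Z]{3}[0-9]{3}[ \(P\)]?", s)
print(first)  # ['LEV606 ', 'LEV230 ']

# Here the class is [ \([A-Z] (space, '(', '[', or A-Z) and it ends at the
# first ']'. What follows is {1}, a mandatory literal ')', and an optional
# literal ']'. The text has ' (' after the digits, never a single class
# character followed directly by ')', so nothing matches.
second = re.findall(r"[A-Z]{3}[0-9]{3}[ \([A-Z]{1}\)]?", s)
print(second)  # []
```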

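Use a non-capturing group instead of a character class to make the extra text optional, and put a capturing group around the part you actually want: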

re.findall(r"([A-Z]{3}[0-9]{3})(?: \(P\))?", s) 

Demo:

>>> import re 
>>> s = "LEV606 (P), LEV230 (P)" 
>>> re.findall(r"([A-Z]{3}[0-9]{3})(?: \(P\))?", s) 
['LEV606', 'LEV230']
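
The question also mentions that the letter in parentheses is not always P. A minimal sketch of one way to handle that, assuming the optional suffix is always a space followed by a single uppercase letter in parentheses (the sample string and the names `pattern` and `codes` are made up for illustration):

```python
import re

# Hypothetical input: one code with (P), one with another letter, one with no suffix.
s = "LEV606 (P), LEV230 (Q), LEV111"

# Assumption: the suffix, when present, is " (X)" with X a single uppercase letter.
# The capturing group keeps the code; the non-capturing group swallows the suffix.
pattern = re.compile(r"([A-Z]{3}[0-9]{3})(?: \([A-Z]\))?")
codes = pattern.findall(s)
print(codes)  # ['LEV606', 'LEV230', 'LEV111']
```

Since the suffix is optional and not captured, the shorter pattern r"[A-Z]{3}[0-9]{3}" returns the same list for this input; the explicit group mainly matters if you later want to consume or validate the suffix.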