Python regex to capture two kinds of comments

Example:

a = "bzzzzzz <!-- blabla --> blibli * bloblo * blublu"

I want to catch the first comment. A comment may be either

(<!-- .* -->) or (\* .* \*)

This works:

re.search("<!--(?P<comment> .*)-->",a).group(1)

So does this:

re.search("\*(?P<comment> .*)\*",a).group(1)

But if I want to match one or the other kind of comment, I have tried something like:

re.search("(<!--(?P<comment> .*)-->|\*(?P<comment> .*)\*)",a).group(1)

But it doesn't work.

Thanks

Source

2011-09-23 pablo07

By the way, your regexes are greedy and will fail on input like '<!-- first comment --> real stuff <!-- second comment -->'.
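To illustrate the point about greediness, here is a minimal sketch (assuming Python 3; the sample string is only an illustration). The greedy `.*` runs to the last `-->` on the line, while the non-greedy `.*?` stops at the first one.

```python
import re

text = "<!-- first comment --> real stuff <!-- second comment -->"

# Greedy: .* grabs as much as possible, so the match ends at the LAST "-->".
greedy = re.search(r"<!--(.*)-->", text).group(1)

# Non-greedy: .*? grabs as little as possible, so the match ends at the FIRST "-->".
lazy = re.search(r"<!--(.*?)-->", text).group(1)

print(repr(greedy))  # ' first comment --> real stuff <!-- second comment '
print(repr(lazy))    # ' first comment '
```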

Try a conditional expression:

>>> for m in re.finditer(r"(?:(<!--)|(\*))(?P<comment> .*?)(?(1)-->)(?(2)\*)", a): 
...     print(m.group('comment'))
...
 blabla 
 bloblo 

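Source

2011-09-23 15:35:30 eph

As Gurney pointed out, you have two captures with the same name. Since you aren't actually using the name, just leave it out.

Also, the r"" raw string notation is a good habit to get into.

Oh, and a third thing: you're grabbing the wrong index. 0 is the whole match, 1 is the whole "either-or" block, and 2 will be the inner capture that succeeded.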

re.search(r"(<!--(.*)-->|\*(.*)\*)",a).group(2)
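As a quick check of the group numbering, here is a minimal sketch (assuming Python 3 and the sample string from the question). Python numbers capturing groups by their opening parenthesis, so the pattern above defines three groups: `group(2)` is filled only when the `<!-- -->` branch matches, and a `* ... *` comment lands in `group(3)`. With `re.search` on the sample string the first match happens to be the `<!-- -->` comment, which is why `group(2)` returns it there.

```python
import re

a = "bzzzzzz <!-- blabla --> blibli * bloblo * blublu"
pattern = re.compile(r"(<!--(.*)-->|\*(.*)\*)")

# Number of capturing groups defined by the pattern.
print(pattern.groups)  # 3

comments = []
for m in pattern.finditer(a):
    # Only one of the two inner groups takes part in a given match;
    # the other one is None.
    inner = m.group(2) if m.group(2) is not None else m.group(3)
    comments.append(inner)

print(comments)  # [' blabla ', ' bloblo ']
```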

Source

2011-09-23 15:22:59 Chriszuma

What about index 3? – sln

This regex will never have an index 3. – Chriszuma

The exception you get in the "doesn't work" part is quite clear about what is wrong:

sre_constants.error: redefinition of group name 'comment' as group 3; was group 2

Two groups have the same name: just rename the second one.

>>> re.search(r"(<!--(?P<comment> .*)-->|\*(?P<comment2> .*)\*)",a).group(1)
'<!-- blabla -->'
>>> re.search(r"(<!--(?P<comment> .*)-->|\*(?P<comment2> .*)\*)",a).groups()
('<!-- blabla -->', ' blabla ', None)
>>> re.findall(r"(<!--(?P<comment> .*)-->|\*(?P<comment2> .*)\*)",a)
[('<!-- blabla -->', ' blabla ', ''), ('* bloblo *', '', ' bloblo ')]
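With two differently named groups, the caller still has to pick whichever one took part in the match. A minimal sketch of that step, assuming Python 3; the helper name `first_comment` is hypothetical, and the pattern here uses non-greedy `.*?` and drops the outer group, which differs slightly from the session above.

```python
import re

# Same two alternatives, each with its own group name.
pattern = re.compile(r"<!--(?P<comment> .*?)-->|\*(?P<comment2> .*?)\*")

def first_comment(text):
    """Return the text of the first comment in `text`, or None if there is none."""
    m = pattern.search(text)
    if m is None:
        return None
    # Exactly one of the two named groups participated in the match.
    if m.group("comment") is not None:
        return m.group("comment")
    return m.group("comment2")

a = "bzzzzzz <!-- blabla --> blibli * bloblo * blublu"
print(repr(first_comment(a)))  # ' blabla '
```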

Source

2011-09-23 15:23:18

re.findall may be a better fit for this:

import re 

# Keep your regex simple. You'll thank yourself a year from now. Note that 
# this doesn't include the surround spaces. It also uses non-greedy matching 
# so that you can embed multiple comments on the same line, and it doesn't 
# break on strings like '<!-- first comment --> fragment -->'. 
pattern = re.compile(r"(?:<!-- (.*?) -->|\* (.*?) \*)") 

inputstring = ('bzzzzzz <!-- blabla --> blibli * bloblo * blublu foo '
               '<!-- another comment --> goes here')

# Now use re.findall to search the string. Each match will return a tuple 
# with two elements: one for each of the groups in the regex above. Pick the 
# non-blank one. This works even when both groups are empty; you just get an 
# empty string. 
results = [first or second for first, second in pattern.findall(inputstring)]
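The same idea can be wrapped in a small function to make the edge cases easier to see. This is only a sketch under the same pattern; the name `extract_comments` is hypothetical. Note that this pattern requires a space on each side of the comment text, so a comment written without those spaces is not matched at all.

```python
import re

pattern = re.compile(r"(?:<!-- (.*?) -->|\* (.*?) \*)")

def extract_comments(text):
    """Return the text of every comment in `text`, in order of appearance."""
    # findall yields one (first, second) tuple per match; the group that
    # did not take part in the match is an empty string.
    return [first or second for first, second in pattern.findall(text)]

print(extract_comments("bzzzzzz <!-- blabla --> blibli * bloblo * blublu"))
# ['blabla', 'bloblo']
print(extract_comments("<!--  -->"))      # empty comment body: ['']
print(extract_comments("<!--blabla-->"))  # no surrounding spaces: []
```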

Source

2011-09-23 16:06:34

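You can go one of 2 ways (if Python supports them):

1: Branch reset (?|pattern|pattern|...)
(?|<!--(.*?)-->|\*(.*?)\*)/ capture group 1 always contains the comment text

2: Conditional expression (?(condition)yes-pattern|no-pattern)
(?:(<!--)|\*)(.*?)(?(1)-->|\*) here the condition is whether we captured group 1

Modifiers: sg (single-line and global)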
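On the "if Python supports it" question: Python's standard `re` module accepts the conditional form `(?(1)yes|no)` but rejects the branch-reset syntax `(?|...)`. A minimal sketch of the conditional approach, assuming Python 3, with `re.DOTALL` standing in for the `s` modifier and `finditer` for `g`:

```python
import re

a = "bzzzzzz <!-- blabla --> blibli * bloblo * blublu"

# Group 1 is set only when the opener is "<!--"; the conditional then
# requires "-->" as the closer, otherwise a closing "*".
pattern = re.compile(r"(?:(<!--)|\*)(.*?)(?(1)-->|\*)", re.DOTALL)
comments = [m.group(2) for m in pattern.finditer(a)]
print(comments)  # [' blabla ', ' bloblo ']

# Branch reset is not part of the standard re syntax, so compiling it fails.
try:
    re.compile(r"(?|<!--(.*?)-->|\*(.*?)\*)")
    branch_reset_supported = True
except re.error:
    branch_reset_supported = False
print(branch_reset_supported)  # False
```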

來源

2011-09-23 16:19:38 sln
