Python extract pattern matches

Python 2.7.1. I am trying to use Python regular expressions to extract a word from inside a pattern.

I have a string that looks like this

someline abc 
someother line 
name my_user_name is valid 
some more lines

I want to extract the word "my_user_name". I do something like

import re 
s = #that big string 
p = re.compile("name .* is valid", re.flags) 
p.match(s) #this gives me <_sre.SRE_Match object at 0x026B6838>

How do I extract my_user_name now?

Source

2013-03-11 Kannan Ekanath

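You need to capture from the regex. Use search for the pattern and, if it is found, retrieve the string with group(index). Assuming valid checks are performed: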

>>> p = re.compile("name (.*) is valid") 
>>> p.search(s) # The result of this is referenced by variable name '_' 
<_sre.SRE_Match object at 0x10555e738> 
>>> _.group(1)  # group(1) will return the 1st capture. 
'my_user_name'

Source

2013-03-11 14:09:16 SuperSaiyan

Are you sure that's not 'group(0)' for the first match? – sharshofski 2015-04-16 14:04:34

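A bit late, but both yes and no. 'group(0)' returns the matched text, not the first capture group. The code comment is correct, and you seem to be confusing capture groups with matches. 'group(1)' returns the first capture group. – andrewgu 2015-08-07 01:31:48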
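The distinction drawn in this comment can be checked directly. This is a minimal sketch that assumes the sample text from the question; the variable names are only for illustration.

```python
import re

# Sample text from the question, written as one string.
s = "someline abc\nsomeother line\nname my_user_name is valid\nsome more lines"
m = re.search(r"name (.*) is valid", s)

whole_match = m.group(0)    # text matched by the entire pattern
first_capture = m.group(1)  # text matched by the first pair of parentheses

print(whole_match)    # name my_user_name is valid
print(first_capture)  # my_user_name
```

Calling group() with no argument is the same as group(0).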

You can use a matching group:

p = re.compile('name (.*) is valid')

e.g.

>>> import re 
>>> p = re.compile('name (.*) is valid') 
>>> s = """ 
... someline abc 
... someother line 
... name my_user_name is valid 
... some more lines""" 
>>> p.findall(s) 
['my_user_name']

Here I use re.findall rather than re.search to get all instances of my_user_name. With re.search, you need to get the data from the group on the match object:

>>> p.search(s) #gives a match object or None if no match is found 
<_sre.SRE_Match object at 0xf5c60> 
>>> p.search(s).group() #entire string that matched 
'name my_user_name is valid' 
>>> p.search(s).group(1) #first group that match in the string that matched 
'my_user_name'
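To make the difference between findall and search more visible, here is a small sketch with an input that contains two matching lines. The input string and the names alice and bob are made up for illustration and are not from the question.

```python
import re

# Hypothetical input with two matching lines (not from the original question).
s = "name alice is valid\nsomeother line\nname bob is valid"
p = re.compile(r"name (.*) is valid")

all_names = p.findall(s)           # group(1) of every match, as a list
first_name = p.search(s).group(1)  # group(1) of the first match only
via_finditer = [m.group(1) for m in p.finditer(s)]  # same result via match objects

print(all_names)   # ['alice', 'bob']
print(first_name)  # alice
```

finditer is worth considering when the match positions or several groups are needed, since it yields full match objects rather than plain strings.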

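As mentioned in the comments, you might want to make your regex non-greedy:

p = re.compile('name (.*?) is valid')

so that it only picks up the text between 'name ' and the next ' is valid', rather than letting the group swallow later occurrences of ' is valid' as well.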
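The effect is easiest to see on a line where ' is valid' occurs twice. The line below is a hypothetical example, not taken from the question.

```python
import re

# Hypothetical single line containing ' is valid' twice.
line = "name my_user_name is valid and the token is valid"

greedy = re.search(r"name (.*) is valid", line).group(1)
lazy = re.search(r"name (.*?) is valid", line).group(1)

print(greedy)  # my_user_name is valid and the token
print(lazy)    # my_user_name
```

The greedy group extends to the last ' is valid' on the line, while the non-greedy one stops at the first.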

Source

2013-03-11 14:08:05 mgilson

It's possible that a non-greedy match is required... (unless a user name can be multiple words...) – 2013-03-11 14:10:19

@JonClements - You mean '(.*?)'? Yes, that's possible, although not necessary unless we're using 're.DOTALL' – mgilson 2013-03-11 14:11:51

Yeah - 're.findall('name (.*) is valid', 'name jon clements is valid is valid is valid')' probably won't yield the desired results... – 2013-03-11 14:13:22
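The re.DOTALL remark in this thread can be illustrated with a short sketch. By default '.' does not match a newline, so a greedy group cannot run past the end of its line; with re.DOTALL it can. The two-line input is hypothetical.

```python
import re

# Hypothetical input: two matching lines separated by a newline.
s = "name alice is valid\nname bob is valid"

default_greedy = re.findall(r"name (.*) is valid", s)
dotall_greedy = re.findall(r"name (.*) is valid", s, re.DOTALL)
dotall_lazy = re.findall(r"name (.*?) is valid", s, re.DOTALL)

print(default_greedy)  # ['alice', 'bob']
print(dotall_greedy)   # ['alice is valid\nname bob']
print(dotall_lazy)     # ['alice', 'bob']
```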

You want a capture group.

p = re.compile("name (.*) is valid")  # parentheses create a capture group
print(p.search(s).groups())  # search() scans the whole string; groups() returns a tuple of the captures
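As a small sketch of what groups() returns: it is always a tuple with one entry per capture group, even when there is only one group. The second pattern, with two groups, is an assumption added for illustration and does not appear in the question.

```python
import re

s = "someline abc\nname my_user_name is valid\nsome more lines"

# One capture group gives a one-element tuple.
one_group = re.search(r"name (.*) is valid", s).groups()

# Hypothetical variant with two capture groups.
two_groups = re.search(r"name (\w+) is (\w+)", s).groups()

print(one_group)   # ('my_user_name',)
print(two_groups)  # ('my_user_name', 'valid')
```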

Source

2013-03-11 14:10:40

You can use something like this:

import re
# stands in for "that big string" from the question
s = "someline abc\nsomeother line\nname my_user_name is valid\nsome more lines"
# the parentheses create a group with what was matched,
# and '\w' matches only alphanumeric characters and underscores
p = re.compile(r"name +(\w+) +is valid")
# use search(), so the match doesn't have to happen
# at the beginning of "big string"
m = p.search(s)
# search() returns a Match object with information about what was matched
if m:
    name = m.group(1)
else:
    raise Exception('name not found')
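The point about search() versus match() can be seen with the sample text from the question. This is a minimal sketch; the variable names at_start and anywhere are only for illustration.

```python
import re

s = "someline abc\nsomeother line\nname my_user_name is valid\nsome more lines"
p = re.compile(r"name +(\w+) +is valid")

at_start = p.match(s)   # match() only tries at the beginning of the string
anywhere = p.search(s)  # search() scans forward until the pattern fits

print(at_start)           # None
print(anywhere.group(1))  # my_user_name
```

Because match() returns None here, calling .group(1) on its result would raise an AttributeError, which is why the None check matters.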

Source

2013-03-11 14:11:48 Apalala

Maybe this is a bit shorter and easier to understand:

>>> import re
>>> text = '... someline abc... someother line... name my_user_name is valid.. some more lines'
>>> re.search('name (.*) is valid', text).group(1)
'my_user_name'

Source

2017-04-19 14:59:56 John
