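Basic regular expression questions in Python to help me learn

I am using this tutorial to learn regular expressions in Python - it looks like a great tutorial!

Following the tutorial, the code I am supposed to use is: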

>>> import re
>>> p = re.compile(r'^(?P<Given>\w+) (?P<Middle>\w\.) (?P<Family>\w+)$', re.MULTILINE)
>>> str = "Jack A. Smith\nMary B. Miller"
>>> m = p.match(str)
>>> print(m.group(0))
Jack A. Smith
>>> print(m.group(1))
Jack
>>> print(m.group(2))
A.
>>> print(m.group(3))
Smith
>>> print(m.group(4))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: no such group

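To my surprise, I lost poor Mary B. Miller - there is no m.group(4).

So I have a few follow-up questions:

(1) I am using MULTILINE, so why does it match only the first line, i.e. Jack A. Smith in the example?

(2) I am using Given, Middle, and Family as label names for each match. How do I access the data using these labels, rather than just m.group(i)?

(3) Say I want to do a match and replace? That is, I want to match Mary B. Miller and replace it with Jane M. Goldstein, so that the string after the replacement is: str = "Jack A. Smith\nJane M. Goldstein". How would I go about that? (Kind of unrelated, so let's call it a bonus Q.)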

2014-03-29 Dnaiel

Don't use 'str' as a variable name. You will shadow the built-in. – dawg

From the documentation for re.match():

Note that even in MULTILINE mode, re.match() will only match 
at the beginning of the string and not at the beginning of each line

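That is why you get only the first match. If you need all of the matches, use re.findall().

Here is an example that wraps your whole regex inside () so that the full name is captured as well: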

p = re.compile(r'^((?P<Given>\w+) (?P<Middle>\w\.) (?P<Family>\w+))$', re.MULTILINE) 
str = "Jack A. Smith\nMary B. Miller" 
print(re.findall(p, str))

Output:

[('Jack A. Smith', 'Jack', 'A.', 'Smith'), ('Mary B. Miller', 'Mary', 'B.', 'Miller')]
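For comparison, here is a minimal sketch that puts match() and findall() side by side on the question's original pattern, without the extra outer group. It is written in Python 3 syntax, and the names `pattern` and `text` are my own (chosen to avoid shadowing `str`), not from the tutorial.

```python
import re

# Same pattern as in the question; `text` replaces `str` so the
# built-in type is not shadowed.
pattern = re.compile(
    r'^(?P<Given>\w+) (?P<Middle>\w\.) (?P<Family>\w+)$', re.MULTILINE
)
text = "Jack A. Smith\nMary B. Miller"

# match() is anchored at the start of the string, so only line 1 is found,
# even with re.MULTILINE.
first = pattern.match(text)
print(first.group(0))

# findall() scans the whole string; because the pattern contains groups,
# it returns one tuple of group values per match.
all_names = pattern.findall(text)
print(all_names)
```

Without the outer parentheses each tuple holds only the three name parts, so wrapping the whole regex in a group is only needed if you also want the full line in the result.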

Update:

Regarding your question 2: use re.finditer() for this. For example:

p = re.compile(r'^(?P<FullName>(?P<Given>\w+) (?P<Middle>\w\.) (?P<Family>\w+))$', re.MULTILINE) 
str = "Jack A. Smith\nMary B. Miller" 
matches = re.finditer(p, str) 
for match in matches: 
    info = match.groupdict()  # pull out the match as a dictionary
    print(info)
    print(info['Family'])

Question 3:

re.sub() is enough for this kind of replacement.

print(re.sub(r"Mary B\. Miller", "Jane M. Goldstein", str))
## notice I have escaped the . with \.
## in a regex, an unescaped . matches any character except a newline.
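If the name to replace comes from a variable rather than being typed into the pattern by hand, re.escape() can do the escaping for you. A rough sketch in Python 3 syntax; the variable names `old_name`, `new_name`, and `text` are just illustrative.

```python
import re

text = "Jack A. Smith\nMary B. Miller"
old_name = "Mary B. Miller"
new_name = "Jane M. Goldstein"

# re.escape() backslash-escapes regex metacharacters such as "." so the
# name is matched literally.
replaced = re.sub(re.escape(old_name), new_name, text)
print(replaced)

# For a fixed literal like this one, str.replace() gives the same result
# without involving a regex at all.
same_result = text.replace(old_name, new_name)
print(same_result == replaced)
```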

2014-03-29 16:24:29

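Thanks a lot, very helpful. Would you mind commenting on the other questions, (2) and (3): accessing the labels Given, Middle, and Family, and how to match and replace? If not, no worries, that is a very good answer to (1)! – Dnaiel

@Dnaiel I have updated my answer. Please check. –

From the documentation of the re module:

Note that even in MULTILINE mode, re.match() will only match at the beginning of the string and not at the beginning of each line.

You can use re.findall or re.finditer to find all of the matches: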

>>> for match in p.finditer(str):
...     print(match.groups())
...
('Jack', 'A.', 'Smith')
('Mary', 'B.', 'Miller')

To refer to the groups by name instead of by index, pass the group name to group():

>>> for match in p.finditer(str):
...     print(match.group('Given'))
...
Jack
Mary

2014-03-29 16:22:02 sateesh

I am using Given, Middle, and Family as label names for each match. How do I access the data using these labels, rather than just m.group(i)?

You can use m.group('Given'), m.group('Middle'), m.group('Family').
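As a small sketch of what that looks like in practice (Python 3 syntax; `pattern` and `text` are my own names, the pattern itself is the one from the question):

```python
import re

pattern = re.compile(
    r'^(?P<Given>\w+) (?P<Middle>\w\.) (?P<Family>\w+)$', re.MULTILINE
)
text = "Jack A. Smith\nMary B. Miller"

m = pattern.match(text)

# A single name returns a single string.
given = m.group('Given')

# Several names in one call return a tuple, in the order requested.
middle, family = m.group('Middle', 'Family')

# groupdict() returns every named group as a dictionary.
as_dict = m.groupdict()

print(given, middle, family)
print(as_dict)
```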

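Say I want to do a match and replace? That is, I want to match Mary B. Miller and replace it with Jane M. Goldstein, so that the string after the replacement is: "Jack A. Smith\nJane M. Goldstein". How would I go about that?

re.sub() can be used for search and replace, as far as I know.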
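One possible way to do it, sketched in Python 3 syntax under the assumption that the name should be replaced only when it occupies a whole line. The names `line_pattern`, `updated`, and `n_replaced` are mine; subn() is used so the number of replacements can be checked.

```python
import re

text = "Jack A. Smith\nMary B. Miller"

# Anchoring with ^ and $ under re.MULTILINE replaces the name only when it
# fills a whole line; count=1 stops after the first replacement.
line_pattern = re.compile(r'^Mary B\. Miller$', re.MULTILINE)

# subn() works like sub() but also returns how many replacements were made.
updated, n_replaced = line_pattern.subn("Jane M. Goldstein", text, count=1)

print(updated)
print(n_replaced)
```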

2014-03-29 16:36:49 haraprasadj

I think I would do something like this:

import re 

txt='''\ 
Jack A. Smith 
Mary B. Miller 
Jordan Brewster 
Kathy Beth Turner''' 

>>> [m.groups() for m in re.finditer(r'^(\w+)\s+(\w\.|\w*)\s*(\b\w+\b)$', txt, re.M)] 
[('Jack', 'A.', 'Smith'), ('Mary', 'B.', 'Miller'), ('Jordan', '', 'Brewster'), ('Kathy', 'Beth', 'Turner')]

It works like this:

^(\w+)\s+(\w\.|\w*)\s*(\b\w+\b)$

Regular expression visualization

Debuggex Demo

This lets you capture names with an optional middle name or middle initial.
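The same idea can be combined with the named groups from the question. This is only a sketch in Python 3 syntax: the group names Given, Middle, and Family are borrowed from the question, and `name_pattern` and `rows` are my own names.

```python
import re

# The four sample names from above, joined without trailing whitespace.
txt = "\n".join([
    "Jack A. Smith",
    "Mary B. Miller",
    "Jordan Brewster",
    "Kathy Beth Turner",
])

# Same regex as above, with each capture group given a name.
name_pattern = re.compile(
    r'^(?P<Given>\w+)\s+(?P<Middle>\w\.|\w*)\s*(?P<Family>\b\w+\b)$',
    re.MULTILINE,
)

# One dictionary per matched line; Middle is '' when there is no middle part.
rows = [m.groupdict() for m in name_pattern.finditer(txt)]
for row in rows:
    print(row)
```

Note that the middle group matches an empty string rather than being absent, so rows without a middle name still have a 'Middle' key.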

2014-03-29 17:08:20 dawg

This is probably the best regex demo I have seen. I love their graphical representation of the regex! – Dnaiel
