Getting the full match with findall

I have a regular expression that finds a URL in some text, like this:

import re

my_urlfinder = re.compile(r'\shttp:\/\/(\S+.|)blah.com/users/(\d+)(\/|)')
text = "blah blah http://blah.com/users/123 blah blah http://blah.com/users/353"

for match in my_urlfinder.findall(text):
    print(match)  # prints a tuple with the individual captured groups of the regex

How do I get the full URL? Right now each match only contains the captured parts (which I need for other things), but I also want the complete URL.


2013-03-06 9-bits

The simplest option is to add one extra pair of parentheses around the whole regex. Then you get the full match along with the parts! – alexis 2013-03-06 14:51:58

An alternative to dropping the capturing groups is to add one more group around everything:

my_urlfinder = re.compile(r'\s(http:\/\/(\S+.|)blah.com/users/(\d+)(\/|))')

This lets you keep the inner capturing groups while still getting the whole result.

For the demo text, it yields the following results:

('http://blah.com/users/123', '', '123', '') 
('http://blah.com/users/353', '', '353', '')
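Because the outer group comes first in each tuple, the results can be unpacked directly in the loop. The following is only a minimal sketch, assuming Python 3; the variable names (full_url, subdomain, user_id, trailing_slash, results) are chosen here for illustration and are not part of the original answer:

```python
import re

# Same pattern as above: the outer parentheses capture the whole URL.
my_urlfinder = re.compile(r'\s(http:\/\/(\S+.|)blah.com/users/(\d+)(\/|))')
text = "blah blah http://blah.com/users/123 blah blah http://blah.com/users/353"

results = []
# findall returns one tuple per match, ordered by opening parenthesis.
for full_url, subdomain, user_id, trailing_slash in my_urlfinder.findall(text):
    results.append((full_url, user_id))
    print(full_url, user_id)
```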

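As a side note, be aware that the current expression requires a whitespace character in front of the URL, so a URL at the very start of the text will not be matched.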
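To illustrate the side note, here is a small sketch in which the text begins with a URL. The relaxed pattern is an assumption for illustration, not something the answer proposed: it replaces the leading \s with (?:^|\s) so that the start of the text is accepted as well, and leaves the rest of the expression unchanged.

```python
import re

# Pattern from the answer: requires whitespace before the URL.
original = re.compile(r'\s(http:\/\/(\S+.|)blah.com/users/(\d+)(\/|))')
# Hypothetical variant: also accepts the start of the text before the URL.
relaxed = re.compile(r'(?:^|\s)(http:\/\/(\S+.|)blah.com/users/(\d+)(\/|))')

# Here the text starts directly with a URL.
text = "http://blah.com/users/123 blah blah http://blah.com/users/353"

# Keep only the outer group (the full URL) from each result tuple.
original_urls = [groups[0] for groups in original.findall(text)]
relaxed_urls = [groups[0] for groups in relaxed.findall(text)]

print(original_urls)  # the URL at the start of the text is missed
print(relaxed_urls)   # both URLs are found
```

The non-capturing (?:...) prefix keeps the tuple layout the same as before, so the outer group is still the first element.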


2013-03-06 14:39:26 poke

This is exactly what I needed - thanks! – 2013-03-06 15:41:19

You should make your groups non-capturing:

my_urlfinder = re.compile(r'\shttp:\/\/(?:\S+.|)blah.com/users/(?:\d+)(?:\/|)')

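findall() changes its behaviour when the pattern contains capturing groups. With capturing groups it returns only the groups; without any, it returns the whole matched text instead.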

Demo (each result is the whole match, so it includes the leading whitespace character matched by \s):

>>> import re
>>> text = "blah blah http://blah.com/users/123 blah blah http://blah.com/users/353"
>>> my_urlfinder = re.compile(r'\shttp:\/\/(?:\S+.|)blah.com/users/(?:\d+)(?:\/|)')
>>> for match in my_urlfinder.findall(text):
...     print(match)
...
 http://blah.com/users/123
 http://blah.com/users/353
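If both the full match and the individual groups are needed, another option is re.finditer(), which yields match objects instead of strings or tuples. This is only a sketch, assuming Python 3 and the question's original pattern with its capturing groups; the names full_url, user_id and found are illustrative:

```python
import re

# Original pattern from the question, capturing groups left in place.
my_urlfinder = re.compile(r'\shttp:\/\/(\S+.|)blah.com/users/(\d+)(\/|)')
text = "blah blah http://blah.com/users/123 blah blah http://blah.com/users/353"

found = []
for match in my_urlfinder.finditer(text):
    # group(0) is the whole match; strip() drops the leading whitespace matched by \s.
    full_url = match.group(0).strip()
    # group(2) is the second capturing group, (\d+).
    user_id = match.group(2)
    found.append((full_url, user_id))
    print(full_url, user_id)
```

This leaves the pattern itself untouched, at the cost of working with match objects rather than plain tuples.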


2013-03-06 14:36:57
