Different behavior when using re.finditer and re.match

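I am working on a regular expression to collect some values from a page with a script. When I use re.match in a condition it returns false, but when I use finditer it returns true and the body of the condition is executed. I tested the regex in a tester I built myself and it works there, but not in the script. Here is a sample script.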

import re

result = []
RE_Add0 = re.compile(r"\d{5}(?:(?:-| |)\d{4})?", re.IGNORECASE)
each = 'Expiration Date:\n05/31/1996\nBusiness Address: 23901 CALABASAS ROAD #2000 CALABASAS, CA 91302\n'
if RE_Add0.match(each):
    result0 = RE_Add0.match(each).group(0)
    print(result0)
    if len(result0) < 100:
        result.append(result0)
    else:
        print('Address ignore')
else:
    pass

2011-01-10 Shahzad

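re.match tries to match only once, at the beginning of the string. re.finditer behaves like re.search in this respect: it scans the whole string, and it keeps going to yield every match it finds. Compare: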

>>> re.match('a', 'abc') 
<_sre.SRE_Match object at 0x01057AA0> 
>>> re.match('b', 'abc') 
>>> re.finditer('a', 'abc') 
<callable_iterator object at 0x0106AD30> 
>>> re.finditer('b', 'abc') 
<callable_iterator object at 0x0106EA10>

ETA: Since you mention a page, I can only guess that you are talking about HTML parsing. If so, use BeautifulSoup or a similar HTML parser. Don't use regular expressions.

2011-01-10 12:48:39 SilentGhost

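Then can you help me get this script to run? I have been stuck for the last 6 hours and have not found a solution :( Unfortunately I am not a good programmer :-( – Shahzad 2011-01-10 12:50:14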

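First, re.finditer() returns an iterator object even when there is no match (so `if RE_Add0.finditer(each)` is always true). You have to actually iterate over that object to see whether there is a real match.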
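A minimal sketch of that point, using a deliberately non-matching input. The names `pattern`, `text`, `always_true`, `first` and `has_match` are made up for this illustration and do not come from the question.

```python
import re

pattern = re.compile(r"\d{5}")
text = "no digits here"

# An iterator object is truthy whether or not it will yield anything.
always_true = bool(pattern.finditer(text))

# next() with a default pulls the first match, or None when there is none.
first = next(pattern.finditer(text), None)
has_match = first is not None

print(always_true)  # True
print(has_match)    # False
```

Testing `first` (or simply calling `pattern.search(text)`) gives a condition that actually reflects whether anything matched.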

Second, re.match() only matches at the beginning of the string, not anywhere in the string as re.search() or re.finditer() do.
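Applied to the string from the question, the difference can be sketched as follows. This is an illustration rather than the asker's full script; the names `pattern`, `text`, `at_start`, `first` and `all_codes` are assumptions made for the example.

```python
import re

# Pattern and input taken from the question; variable names are illustrative.
pattern = re.compile(r"\d{5}(?:(?:-| |)\d{4})?")
text = ("Expiration Date:\n05/31/1996\n"
        "Business Address: 23901 CALABASAS ROAD #2000 CALABASAS, CA 91302\n")

# match() only tries position 0, where the text starts with "E".
at_start = pattern.match(text)

# search() scans forward and stops at the first match anywhere in the text.
first = pattern.search(text)

# finditer() yields every non-overlapping match, left to right.
all_codes = [m.group(0) for m in pattern.finditer(text)]

print(at_start)        # None
print(first.group(0))  # 23901
print(all_codes)       # ['23901', '91302']
```

Note that the first hit is the street number, not the ZIP code, so switching to re.search() alone may not be enough for this input.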

Third, the regex could be written as r"\d{5}(?:[ -]?\d{4})?".
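A quick way to check such a rewrite is to run both patterns over a few inputs. The sketch below assumes the simplified pattern keeps the trailing `?` that makes the four-digit suffix optional, as in the question's regex; the sample strings and the helper `found` are invented for the comparison.

```python
import re

original = re.compile(r"\d{5}(?:(?:-| |)\d{4})?")
simplified = re.compile(r"\d{5}(?:[ -]?\d{4})?")

# Hypothetical sample inputs covering each separator and a non-match.
samples = ["91302", "91302-1234", "91302 1234", "913021234", "9130", "CA 91302"]

def found(pattern, s):
    """Return the matched text, or None if the pattern is not found in s."""
    m = pattern.search(s)
    return m.group(0) if m else None

results = [(found(original, s), found(simplified, s)) for s in samples]
same = all(a == b for a, b in results)
print(same)  # True for these samples
```

Agreement on a handful of samples is only a sanity check, not a proof that the two patterns are equivalent.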

Fourth, always use raw strings for regular expressions.
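One case where the raw-string prefix changes the result is `\b`. The example text and variable names below are assumptions chosen for the illustration.

```python
import re

text = "zip 91302 here"

# In a normal string literal "\b" is the backspace character (one character),
# so the regex engine looks for literal backspaces and finds none.
plain = re.search("\b91302\b", text)

# In a raw string the backslash survives, and the regex engine reads \b
# as a word boundary.
raw = re.search(r"\b91302\b", text)

print(len("\b"), len(r"\b"))  # 1 2
print(plain)                  # None
print(raw.group(0))           # 91302
```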

2011-01-10 12:51:56

Try this:

import re

# With no re.MULTILINE flag, $ anchors at the end of the string
# (or just before a final newline).
postalCode = re.compile(r'((\d{5})([ -])?(\d{4})?(\s*))$')
primaryGroup = lambda x: x[1]  # second group: the five-digit code

sampleStr = """
    Expiration Date:
    05/31/1996
    Business Address: 23901 CALABASAS ROAD #2000 CALABASAS, CA 91302
"""
result = []

# findall() returns one tuple of groups per match
matches = postalCode.findall(sampleStr)
if matches:
    for match in matches:
        pc = primaryGroup(match)
        print(pc)
        result.append(pc)
else:
    print("No postal code found in this string")

This returns "12345" for any of the following:

12345\n 
12345 \n 
12345 6789\n 
12345 6789 \n 
12345 \n 
12345  \n 
12345-6789\n 
12345-6789 \n 
12345-\n 
12345- \n 
123456789\n 
123456789 \n 
12345\n 
12345 \n

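I have it match only at the end of a line (as written, without re.MULTILINE, that means the end of the string); otherwise it would also match the "23901" (from the street address) in your example.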
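How the `$` anchor behaves with and without re.MULTILINE can be sketched as below. The pattern `zip_at_end` is a simplified variant written for this illustration, not the pattern from the answer, and the two-line `text` is invented.

```python
import re

# Hypothetical input with a ZIP code at the end of each of two lines.
text = "Zip on file: 12345\nBusiness Address: 23901 CALABASAS ROAD, CA 91302\n"

# Five digits, an optional four-digit suffix, optional trailing blanks, then $.
zip_at_end = r"(\d{5})(?:[ -]?\d{4})?[ \t]*$"

# Default: $ matches only at the end of the string (or before a final newline).
end_of_string = [m.group(1) for m in re.finditer(zip_at_end, text)]

# re.MULTILINE: $ also matches just before every newline.
end_of_line = [m.group(1) for m in re.finditer(zip_at_end, text, re.MULTILINE)]

print(end_of_string)  # ['91302']
print(end_of_line)    # ['12345', '91302']
```

In both modes the street number "23901" is skipped, because it is not followed by the end of a line.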

2011-01-10 14:55:16
