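Regex help (overlapping)
After some research on Stack Overflow, I found that re.finditer() can be used to get overlapping matches, but it doesn't seem to work in my particular case. I want my regex to extract dates of the form YYYY-MM-DD that are delimited by non-alphanumeric (\W) characters. This works fine if two matches have at least two \W characters between them. My question is: how can I extend my expression so that it extracts every date from a string such as "2016-02-29 4354-09-21 1900-03-15 1576-05-16"? As written, it only extracts 2016-02-29 and 1900-03-15, even though the others are valid. Here is the code: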
# Find Dates
# dmoj
# author: Aniekan Umoren
# date: 2016-02-15
import re
# input: int N (number of lines)
# input: lines containing dates (YYYY-MM-DD) in them
# output: a list of all VALID dates

# compile the patterns into regex objects, which provide methods like findall
exp1 = re.compile(r"\W([0-9]{4}-[0-9]{2}-[0-9]{2})\W")
exp2 = re.compile(r"([0-9]+)-*")
thirty = (4,6,9,11)
N = int(input())
found = []
for i in range(N):
    line = input()
    matches = exp1.finditer(line)
    # finditer yields match objects; group(1) is the captured date
    # (without the surrounding \W characters)
    found.extend([str(x.group(1)) for x in matches])
for it in found:
    # split the date into [year, month, day]
    date = [int(x) for x in exp2.findall(it)]
    isthirty = False
    if (date[1] > 12 or date[2] > 31):
        continue
    if (date[1] in thirty): isthirty = True
    if (isthirty and date[2] <= 30):
        print(it)
    elif (date[1] == 2):
        if (date[0] % 4 == 0 and date[2] <= 29):
            print(it)
        elif (date[0] % 4 != 0 and date[2] <= 28):
            print(it)
    elif (not isthirty and date[2] <= 31):
        print(it)
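The behaviour described in the question can be reproduced in isolation. The sketch below is not taken from the original thread; it assumes the sample line has a space on both ends, and the names `consuming` and `lookaround` are made up for illustration. The point is that each `\W` in the original pattern is consumed by the match, so the space that closes one date is no longer available to open the next one. Asserting the delimiters with a lookbehind and a lookahead leaves them unconsumed.

```python
import re

# Pattern from the question: both \W delimiters become part of the match.
consuming = re.compile(r"\W([0-9]{4}-[0-9]{2}-[0-9]{2})\W")

# Possible fix (an assumption, not the thread's confirmed answer):
# lookarounds check for a \W neighbour without consuming it.
lookaround = re.compile(r"(?<=\W)([0-9]{4}-[0-9]{2}-[0-9]{2})(?=\W)")

# Leading and trailing spaces are assumed so every date has a \W on each side.
line = " 2016-02-29 4354-09-21 1900-03-15 1576-05-16 "

print(consuming.findall(line))   # ['2016-02-29', '1900-03-15']
print(lookaround.findall(line))  # ['2016-02-29', '4354-09-21', '1900-03-15', '1576-05-16']
```

Note that `(?<=\W)` and `(?=\W)` still require an actual character on each side, so a date at the very start or end of a line would be skipped unless the line is padded.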
THANK YOU SOO much. You have no idea how long I spent on this problem.
Since you're using the '\W' class, you could replace this lookaround with word boundaries.
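The word-boundary suggestion can be compared with a `\W` lookaround in a short sketch. The names `DATE`, `with_lookaround`, `with_boundary`, and `sample` are hypothetical, and the lookaround pattern is one plausible form of what the comment refers to. Because a date starts and ends with a digit, `\b` succeeds wherever the neighbouring character is a non-word character, and also at the start or end of the string, where a `\W` lookaround fails.

```python
import re

DATE = r"[0-9]{4}-[0-9]{2}-[0-9]{2}"

# Requires a real non-word character on both sides of the date.
with_lookaround = re.compile(r"(?<=\W)(" + DATE + r")(?=\W)")

# Word boundaries: also satisfied at the start and end of the string.
with_boundary = re.compile(r"\b(" + DATE + r")\b")

# No padding; the third date is glued to a letter and should be rejected.
sample = "2016-02-29 4354-09-21 x1900-03-15 1576-05-16"

print(with_lookaround.findall(sample))  # ['4354-09-21']
print(with_boundary.findall(sample))    # ['2016-02-29', '4354-09-21', '1576-05-16']
```

Neither pattern consumes the delimiter, so adjacent dates separated by a single space are all found. The `\b` version is shorter and handles dates at the edges of a line.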