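re.findall() does not find lines from one file in another file
I have two text files: one contains the text of an article, and the other contains a list of phrasal verbs. I am trying to find every instance of each phrasal verb in the article. I know the article contains the phrasal verb "log on", and so does the phrasal verb list. When I loop over the phrasal verbs and search for each one with re.findall(), it finds nothing. When I manually start the loop at line 1199 of the phrasal verb list, which happens to be "log on", it does find it. When I start just one line earlier, at line 1198, it does not. Here is my code: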
import re
PV_HI = []
file = open('article.txt')
for line in open('phrasalVerbs.txt'):
    pv = line.strip()
    pvFound = re.findall(pv, file.read(), flags=re.I)
    PV_HI.extend(pvFound)
print(PV_HI)
Here is a sample of the phrasal verb list file:
Lock onto
Lock out
Lock up
Lock away
Log in
Log into
Log off
Log on
Log out
Look after
Look back
Look down on
Look for
Look forward to
Look in
Look in on
Look into
And a sample of the article file:
<p> If you have a business account, a higher Pay Anyone limit up to $500,000 and also have a Security Device to authorise third party payments and/or can add Operators, you are an ANZ Internet Banking for Business customer.
<p> How do I manage my accounts once I am registered for ANZ Internet Banking?
<p> If you have registered for ANZ Internet Banking, use your CRN and password to log on to ANZ Internet Banking.
<p> If you need help while logged on to ANZ Internet Banking, click the " Help " icon in the top right hand corner of all pages.
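Ultimately, what I am trying to do is get a count of all phrasal verbs across a set of 1600 files. If there is a better way to do this, I am definitely open to suggestions.
Thanks!
Matt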
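For the counting goal described in the question, one possible approach is a collections.Counter keyed by phrasal verb. The following is only a sketch under assumptions: the function names count_phrasal_verbs and count_in_files are hypothetical, the files are assumed to be UTF-8, and the match is literal and whole-word, so inflected forms such as "logged on" are not counted as "log on".

```python
import re
from collections import Counter


def count_phrasal_verbs(text, phrasal_verbs):
    """Count case-insensitive, whole-word occurrences of each phrasal verb in text."""
    counts = Counter()
    for pv in phrasal_verbs:
        pv = pv.strip()
        if not pv:
            continue  # skip blank lines in the verb list
        # re.escape treats the verb as literal text; the \b anchors stop
        # "Log in" from also matching the start of "log into".
        pattern = r"\b" + re.escape(pv) + r"\b"
        counts[pv] += len(re.findall(pattern, text, flags=re.I))
    return counts


def count_in_files(paths, phrasal_verbs):
    """Sum the per-verb counts over several article files (hypothetical helper)."""
    total = Counter()
    for path in paths:
        with open(path, encoding="utf-8") as f:
            article_content = f.read()  # read each file exactly once
        total.update(count_phrasal_verbs(article_content, phrasal_verbs))
    return total
```

The list of verbs can be passed as the lines of phrasalVerbs.txt, and paths as any iterable of article file names. Reading each article into article_content once, before looping over the verbs, avoids searching an already exhausted file handle.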
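Wow! That's great, thank you so much! I thought I'd note that when I comment out 'article_content = f.read()' and use 'f.read()' as the string argument to 're.findall()', it doesn't work, so storing the result of 'f.read()' in that variable is essential here. Thanks again! – MattR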
Glad to help! :D
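The behaviour noted in the comment above follows from how file objects work: read() consumes the file, so every later read() on the same handle returns an empty string. A minimal sketch, using io.StringIO as a stand-in for the opened article file (the sample text and the two-verb list are illustrative assumptions, not the real files):

```python
import io
import re

# io.StringIO stands in for open('article.txt'); it behaves like a text file.
f = io.StringIO("use your CRN and password to log on to ANZ Internet Banking.")
first = f.read()   # the whole text
second = f.read()  # '' because the read position is already at end of file

verbs = ["Log off", "Log on"]

# Buggy pattern: read() inside the loop returns real text only on the
# first iteration, so only the first verb is ever searched for.
f.seek(0)
buggy = []
for pv in verbs:
    buggy.extend(re.findall(re.escape(pv), f.read(), flags=re.I))

# Fix: read once into a variable and reuse it for every verb.
f.seek(0)
article_content = f.read()
fixed = []
for pv in verbs:
    fixed.extend(re.findall(re.escape(pv), article_content, flags=re.I))

print(buggy)
print(fixed)
```

This also accounts for the line 1199 observation in the question: when the loop starts at "log on", that verb is the first and only one searched against the full text.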