2016-03-02
1

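Error caused by reading a file

I am facing a rather elusive bug that seems to be caused by reading a file. I have simplified my program to demonstrate the problem.

Consider this program, which works correctly: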

import re

sourceString = "Yesterday I had a pizza for lunch it was tasty\n"
sourceString += "today I am thinking about grabbing a burger and tomorrow it\n"
sourceString += "will probably be some fish if I am lucky\n\n\n"
sourceString += "see you later!"

jj = ["pizza", "steak", "fish"]

for keyword in jj:
    # Match the keyword and the rest of its line ('.' does not match a newline).
    regexPattern = keyword + ".*"
    patternObject = re.compile(regexPattern, re.MULTILINE)
    match = patternObject.search(sourceString)
    if match:
        print("Match found for " + keyword)
        print(match.group() + "\n")
    else:
        print("warning: no match found for :" + keyword + "\n")

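The regular expression here is very simple; the point is that I build each pattern from the entries of my array jj.

The script works as expected: it finds matches for the "pizza" and "fish" patterns, but not for "steak".

Now, in my actual program, I want to read these keywords from a file (I do not want to hard-code them in the source).

So far I have this: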

import re

sourceString = "Yesterday I had a pizza for lunch it was tasty\n"
sourceString += "today I am thinking about grabbing a burger and tomorrow it\n"
sourceString += "will probably be some fish if I am lucky\n\n\n"
sourceString += "see you later!"

with open('keyWords.txt', 'r') as f:
    # Iterate over the file line by line; each line is used as a keyword.
    for keyword in f:
        regexPattern = keyword + ".*"
        patternObject = re.compile(regexPattern, re.MULTILINE)
        match = patternObject.search(sourceString)
        if match:
            print("Match found for " + keyword)
            print(match.group())
        else:
            print("warning: no match found for :" + keyword)

where keyWords.txt contains the following:

pizza 
steak 
fish 

But this breaks the code: somehow, only the LAST keyword in the file matches successfully (if a match exists at all).

What gives?

+0

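Don't just assume it's a bug. It happens simply because every line read from the file ends with a newline character, which you haven't accounted for. – zondo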

+1

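...which means that, because I didn't account for the newline, my program has a bug, right? I never said the language specification was flawed. – ForeverStudent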

+0

I'm sorry; I misunderstood. – zondo

Answers

3
with open('keyWords.txt', 'r') as f:
    for keyword in f:
        # strip() drops the trailing newline that each line read from the file carries
        regexPattern = keyword.strip() + ".*"

Call strip() on keyword to remove any newline characters. If you know for certain that there will be no leading whitespace, rstrip() is sufficient.
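
To see why only the last keyword matched, here is a minimal, self-contained sketch. It makes some assumptions that are not in the question: io.StringIO stands in for keyWords.txt, the last line of that file is assumed to have no trailing newline (which would explain why "fish" still matched), and find_matches is a hypothetical helper introduced only for this illustration.

```python
import io
import re

source_string = (
    "Yesterday I had a pizza for lunch it was tasty\n"
    "today I am thinking about grabbing a burger and tomorrow it\n"
    "will probably be some fish if I am lucky\n\n\n"
    "see you later!"
)

# Assumed file contents: no newline after the last keyword.
FILE_CONTENTS = "pizza\nsteak\nfish"


def find_matches(lines, text, clean):
    """Map each keyword to the text it matched (or None).

    When clean is True, surrounding whitespace (including the
    trailing newline) is stripped from each line first.
    """
    results = {}
    for line in lines:
        keyword = line.strip() if clean else line
        match = re.search(keyword + ".*", text)
        results[keyword] = match.group() if match else None
    return results


# Iterating a file-like object yields lines with their "\n" still attached.
raw = find_matches(io.StringIO(FILE_CONTENTS), source_string, clean=False)
stripped = find_matches(io.StringIO(FILE_CONTENTS), source_string, clean=True)

print(raw)
# {'pizza\n': None, 'steak\n': None, 'fish': 'fish if I am lucky'}
print(stripped)
# {'pizza': 'pizza for lunch it was tasty', 'steak': None, 'fish': 'fish if I am lucky'}
```

Without stripping, the first pattern is effectively "pizza\n.*", which requires a newline immediately after "pizza"; in the source string "pizza" is followed by " for lunch", so the search fails. Printing repr(keyword) inside the loop is a quick way to spot this kind of hidden character.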

+0

That solved the problem. Much appreciated. – ForeverStudent