Python: searching a text file for an exact word/phrase - beginner

Currently, I am trying to search for an exact word/phrase in a text file. I am using Python 3.4.

Here is my code so far.

import re

def main():
    fileName = input("Please input the file name").lower()
    term = input("Please enter the search term").lower()

    fileName = fileName + ".txt"

    regex_search(fileName, term)

def regex_search(file, term):
    source = open(file, 'r')
    destination = open("new.txt", 'w')
    lines = []
    for line in source:
        if re.search(term, line):
            lines.append(line)

    for line in lines:
        destination.write(line)
    source.close()
    destination.close()
'''
def search(file, term): #This function doesn't work
    source = open(file, 'r')
    destination = open("new.txt", 'w')
    lines = [line for line in source if term in line.split()]

    for line in lines:
        destination.write(line)
    source.close()
    destination.close()'''
main()

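In my function regex_search I use a regular expression to search for a particular string. However, I don't know how to search for a particular phrase.

In the second function, search, I split each line into a list and look for the word there. However, this cannot find a phrase, because I would be looking for "dog walked" in ['the', 'dog', 'walked'], which does not return the correct line.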
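To illustrate why the membership test on line.split() fails for phrases, here is a minimal sketch that splits the phrase as well and compares it against consecutive words of the line. The name contains_phrase is hypothetical (it is not part of the code above), and the lowercasing of both sides is an assumption. Punctuation attached to a word (for example "walked.") is not handled.

```python
def contains_phrase(line, phrase):
    """Return True if the words of `phrase` occur consecutively in `line`.

    Hypothetical helper for illustration; comparison is case-insensitive
    (an assumption) and splits on whitespace only.
    """
    words = line.lower().split()
    target = phrase.lower().split()
    n = len(target)
    if n == 0:
        return False
    # Slide a window of n words across the line and compare whole words,
    # so 'dog' never matches 'dogs' and 'no' never matches 'not'.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))
```

With this, contains_phrase("the dog walked home", "dog walked") is true, while the original test "dog walked" in ['the', 'dog', 'walked', 'home'] is false because no single list element equals the whole phrase.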

Source

2014-12-03 Kai Mou

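If you search for "foo" and the text has "foobar", is that considered a match? If you search for "foo bar" and one line ends with "foo" and the next line begins with "bar", is that considered a match? – 2014-12-03 23:01:46

Can you provide an example input file (or its contents) and the phrase of interest? – Marcin 2014-12-03 23:26:45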

@Brian Oakley no – 2014-12-03 23:47:26

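Edit: Given that you don't want to match partial words ('foo' should not match 'foobar'), you need to look ahead in the data stream. The code for that is a bit awkward, so I think a regular expression (a revision of your current regex_search) is the way to go: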

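import re

def regex_search(filename, term):
    # The term must be followed by a character that is neither a word
    # character nor a hyphen, or by the end of the line.
    searcher = re.compile(term + r'([^\w-]|$)').search
    # Open the file named by the filename parameter; both files are
    # closed automatically when the with block exits.
    with open(filename, 'r') as source, open("new.txt", 'w') as destination:
        for line in source:
            if searcher(line):
                destination.write(line)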

Source

2014-12-03 23:30:28 tdelaney

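So in this case, what happens when I search for 'no' and the line has 'not'? Wouldn't it return the line with 'not' rather than 'no'? – 2014-12-03 23:46:51

'no' would match 'not' - the same as your 'regex_search' example. If that's not what you want, please let us know. – tdelaney 2014-12-03 23:50:20

I'm looking for 'no' to match only 'no'. Same with phrases. – 2014-12-04 00:40:42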
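One way to get the behaviour requested in the last comment ('no' matching only 'no', and the same for phrases) might be to require a boundary on both sides of the term. The sketch below is not from the thread: compile_exact and matching_lines are hypothetical names, the hyphen is treated as part of a word to mirror the `[^\w-]` class used in the answer, and case-insensitive matching is an assumption. re.escape keeps any regex metacharacters in the search term literal.

```python
import re

def compile_exact(term):
    """Build a pattern that matches `term` only as a whole word or phrase.

    Hypothetical helper: the lookarounds reject a match that is directly
    preceded or followed by a word character or a hyphen.
    """
    pattern = r'(?<![\w-])' + re.escape(term) + r'(?![\w-])'
    # IGNORECASE is an assumption; drop it for case-sensitive matching.
    return re.compile(pattern, re.IGNORECASE)

def matching_lines(lines, term):
    """Return the lines (any iterable of strings) that contain `term`."""
    searcher = compile_exact(term).search
    return [line for line in lines if searcher(line)]
```

Because matching_lines accepts any iterable of strings, an open file object can be passed directly and the result written to new.txt as in the code above. A phrase that is split across two lines is still not matched, since each line is tested on its own.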
