Does a specific string match any string in a text file?

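I have a text file containing many words (one word per line). I have to read each word, modify it, and then check whether the modified word matches any word in the file. I am having trouble with the last part (the hasMatch method in my code). It sounds simple and I know what I should do, but nothing I try works.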

Here is my code:

#read in textfile 
myFile = open('good_words.txt') 


#function to remove first and last character in string, and reverse string 
def modifyString(str): 
    rmFirstLast = str[1:len(str)-2] #slicing first and last char 
    reverseStr = rmFirstLast[::-1] #reverse string 
    return reverseStr 

#go through list of words to determine if any string match modified string 
def hasMatch(modifiedStr): 
    for line in myFile: 
     if line == modifiedStr: 
      print(modifiedStr + " found") 
     else: 
      print(modifiedStr + "not found") 

for line in myFile: 
    word = str(line) #save string in line to a variable 

    #only modify strings that are greater than length 3 
    if len(word) >= 4: 
     #global modifiedStr #make variable global 
     modifiedStr = modifyString(word) #do string modification 
     hasMatch(modifiedStr) 

myFile.close()

Source

2016-09-03 Stackimus Prime

The file object is consumed by the outer 'for' loop. The inner loop in 'hasMatch' won't do what you think it does.

Also, 'word = str(line)' is not necessary. 'line' is already a string.
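The exhaustion described in the first comment can be reproduced without a file on disk. The sketch below uses io.StringIO as a stand-in for the opened file (an assumption made only to keep the example self-contained, with made-up contents); like a real file object, it can be iterated only once unless it is rewound with seek(0).

```python
import io

# io.StringIO stands in for open('good_words.txt'); the contents are made up.
fake_file = io.StringIO("most\nso\nfrom\n")

first_pass = [line for line in fake_file]   # consumes the whole "file"
second_pass = [line for line in fake_file]  # nothing left to read

fake_file.seek(0)                           # rewind to the beginning
third_pass = [line for line in fake_file]   # the lines are available again
```

In the question's code, this means the first call to hasMatch reads everything that remains in the file, so the outer loop has nothing left to iterate over afterwards.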

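Several issues:

You have to strip the lines, or the match fails because of the newline/CR characters.
Read the file once and for all, or the file iterator is exhausted after the first pass.
Speed is poor: use a set instead of a list to speed up the search.
The slicing is overcomplicated and wrong: str[1:-1] does the job (thanks to those who commented on my answer).
The whole code is really too long and complicated. I condensed it into a few lines.

Code: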

# read in the text file, strip line endings, and build a set (faster search)
with open('good_words.txt') as myFile:
    lines = set(x.strip() for x in myFile)

# iterate over the words
for word in lines:
    # only consider strings of length 4 or more
    if len(word) >= 4:
        modifiedStr = word[1:-1][::-1]  # drop first and last char, then reverse
        if modifiedStr in lines:
            print(modifiedStr + " found (was " + word + ")")
        else:
            print(modifiedStr + " not found")

I tested the program on a list of common English words, and I got these matches:

so found (was most) 
or found (was from) 
no found (was long) 
on found (was know) 
to found (was both)
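To try the same logic without a word file, the set-based check can be wrapped in a function. This is only a sketch: the name find_matches and the sample list are hypothetical and not part of the original answer.

```python
def find_matches(words):
    """Return (modified, original) pairs whose modified form is itself in words."""
    known = set(words)  # set membership is O(1) on average
    matches = []
    for word in words:
        if len(word) >= 4:
            modified = word[1:-1][::-1]  # drop first and last char, then reverse
            if modified in known:
                matches.append((modified, word))
    return matches


# Made-up sample: "most" -> "so" and "from" -> "or" match, "word" -> "ro" does not.
sample = ["most", "so", "from", "or", "word"]
pairs = find_matches(sample)
```

Returning the pairs instead of printing them makes the result easy to check or reuse.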

Edit: here is another version that drops the set and uses bisect on a sorted list, to avoid hashing and hash collisions.

import bisect

# read in the text file and build a sorted list, stripping line endings
with open("good_words.txt") as myFile:
    lines = sorted(x.strip() for x in myFile)

for word in lines:
    # only modify strings of length 4 or more
    if len(word) >= 4:
        modifiedStr = word[1:-1][::-1]  # drop first and last char, then reverse
        # find the position where the modified word would be inserted
        i = bisect.bisect_left(lines, modifiedStr)
        # found only if that position is inside the list and already holds the word
        if i < len(lines) and lines[i] == modifiedStr:
            print(modifiedStr + " found (was " + word + ")")
        else:
            print(modifiedStr + " not found")
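The membership test used in the bisect version can be isolated into a small helper, which makes the role of the bounds check easier to see. The helper name contains_sorted and the sample words are hypothetical; this is a sketch that assumes the list is already sorted.

```python
import bisect


def contains_sorted(sorted_words, target):
    """Binary-search membership test on an already sorted list."""
    i = bisect.bisect_left(sorted_words, target)
    # bisect_left returns len(sorted_words) when target sorts after every
    # element, so the bounds check must come before the indexing.
    return i < len(sorted_words) and sorted_words[i] == target


words = sorted(["most", "so", "or"])    # ['most', 'or', 'so']
found = contains_sorted(words, "so")    # present
past_end = contains_sorted(words, "zz") # would be inserted past the end
absent = contains_sorted(words, "aa")   # would be inserted at the front
```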

Source

2016-09-03 18:19:46

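Good answer. Also, 'rmFirstLast = str[1:len(str)-2]' should be 'str[1:len(str)-1]'.

There is no such thing as 'strip(x)'. You mean 'x.strip()'.

@TrevorMerrifield the 'len' isn't even needed there...

In your code, you are not slicing off just the first and last characters, but the first character and the last two characters.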

rmFirstLast = str[1:len(str)-2]

Change it to:

rmFirstLast = str[1:len(str)-1]

Source

2016-09-03 18:23:57 Kandhan

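Right, but there are a lot of other problems too...

Yes, I hadn't noticed them. – Kandhan

Yes, thank you for pointing that out. I had set it to slice off the last two characters because when I tested it with just -1 (instead of -2) it didn't work... maybe an empty character has something to do with it.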
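The "empty character" suspected in the last comment is most likely the trailing newline that every line read from a file carries. A small sketch with a made-up line shows why -2 appeared to work and why stripping first is the cleaner fix:

```python
line = "most\n"  # what iterating over a file yields for the word "most"

wrong = line[1:len(line) - 2]  # drops 'm', then 't' and the newline
naive = line[1:-1]             # drops 'm' and only the newline
clean = line.strip()[1:-1]     # strip first, then drop 'm' and 't'
```

Here wrong and clean happen to agree, but only because the newline is present; on a last line without a trailing newline, the -2 slice would cut one real character too many.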
