Binary search Python spell check

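I want to import a text file (all lowercase letters, no punctuation) and compare its words against a dictionary word list. If a word does not appear in the dictionary list, it should be printed as possibly incorrect. If a word does appear in the dictionary list, nothing should happen. We are supposed to use a binary search here. I think my binary search is correct; I just don't know where or how to return the words that do not appear in the dictionary list and flag them as possibly incorrect.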

Thanks!

My input file contains one sentence: the quick red fox jumps over the lzy brn dog

def spellcheck(inputfile):

    filebeingchecked=open(inputfile,'r')
    spellcheckfile=open("words.txt",'r')

    dictionary=spellcheckfile.read().split()
    checkedwords=filebeingchecked.read().split()

    for word in checkedwords:

        low = 0
        high=len(dictionary)-1

        while low <= high:

            mid=(low+high)//2
            item=dictionary[mid]

            if word == item:
                return word

            elif word < item:
                high=mid-1

            else:
                low=mid+1

        return word

def main(): 

    print("This program accepts a file as an input file and uses a spell check function to \nidentify any problematic words that are not found in a common dictionary.") 
    inputfile=input("Enter the name of the desired .txt file you wish to spellcheck: ") 

main()

Source

2014-03-30 user3366928

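You don't want to return the word when you exit the while loop. There are two conditions under which the while loop exits:

the word IS in the dictionary and you break out of the while loop, or
the word is NOT in the dictionary and the while loop eventually runs to completion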

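So how do we tell the two apart? It turns out that in Python, while and for loops can have an else clause: the code in the else clause runs only when the loop ends naturally, not when it is left through a break statement. That is what I use here.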
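To see the else clause in isolation, here is a minimal sketch. The function describe_exit and its parameters are hypothetical names made up for this illustration; they are not part of the spell checker.

```python
def describe_exit(limit, stop_at):
    """Report how the while loop below was left (hypothetical demo helper)."""
    n = 0
    while n < limit:
        if n == stop_at:
            break  # leaving via break skips the else clause
        n += 1
    else:
        # Runs only when the loop condition became false on its own.
        return "finished without break"
    return "left via break"


print(describe_exit(3, 1))   # left via break
print(describe_exit(3, 10))  # finished without break
```

In the spell checker, "left via break" corresponds to a word that was found and "finished without break" to a word that is missing from the dictionary.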

Also, to return multiple words, I could collect them in a list and return that list at the end, or I could use the yield keyword. See how it works:

def spellcheck(inputfile):
    # Read both files; 'with' closes them even if an error occurs.
    with open("words.txt", 'r') as spellcheckfile:
        # Binary search assumes words.txt is already sorted.
        dictionary = spellcheckfile.read().split()
    with open(inputfile, 'r') as filebeingchecked:
        checkedwords = filebeingchecked.read().split()

    for word in checkedwords:
        low = 0
        high = len(dictionary) - 1

        while low <= high:
            mid = (low + high) // 2
            item = dictionary[mid]

            if word == item:
                break  # found: the else clause below is skipped
            elif word < item:
                high = mid - 1
            else:
                low = mid + 1
        else:
            # The loop ended without a break, so the word was not found.
            yield word


def main():
    print("This program accepts a file as an input file and uses a spell check function to \nidentify any problematic words that are not found in a common dictionary.")
    inputfile = input("Enter the name of the desired .txt file you wish to spellcheck: ")
    for word in spellcheck(inputfile):
        print(word)


main()

Source

2014-03-30 03:38:38

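I understand your use of the 'else: yield word' statement at the end; however, the output comes back as: the quick red fox jumps over the lzy brn dog (all of the words). Any idea why? – user3366928

The output is supposed to contain only the *misspelled* words. Check your dictionary. Are those words in there? Is the dictionary actually sorted? –

I tried downloading a different dictionary, and I think the problem was that my previous one contained some words (names, cities, and so on) that may have interfered with the sorting. A new word list in all lowercase seems to have done the trick. Thanks for your help! – user3366928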
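One plausible way capitalized entries can break the search, shown as a small sketch: Python compares strings by code point, so every uppercase letter sorts before every lowercase one. The word list below is invented for illustration, and the assumption that the old dictionary was stored in case-insensitive order is a guess, not something confirmed in this thread.

```python
# Invented sample; stands in for a dictionary that mixes names with common words.
words = ["apple", "Boston", "cat", "Zurich", "dog"]

# Case-insensitive order, as a word list might be stored on disk (assumption).
as_stored = sorted(words, key=str.lower)

# Under Python's default comparison this order is NOT sorted,
# because "apple" > "Boston" ('a' has a higher code point than 'B').
is_sorted = all(a <= b for a, b in zip(as_stored, as_stored[1:]))
print(as_stored, is_sorted)

# Lowercasing, de-duplicating and re-sorting restores the precondition
# that binary search relies on.
normalized = sorted({w.lower() for w in words})
print(normalized)
```

Re-sorting the list after loading it costs a little time up front but removes the dependency on how the file happens to be ordered.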

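A binary search "ends" when the array left to search is empty. In your case, you track the start and end indices of that array with low and high, and you should keep searching while low <= high.

But what does it mean when your program reaches (low <= high) == False and you have not found a matching word?

It means the word is not in your dictionary, and you should take the appropriate action (add it to a list of "incorrect" words).

Of course, you can only output the list of incorrect words once you have finished checking every word you want to check.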
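A minimal sketch of that approach, assuming the dictionary is already a sorted Python list. The function names is_in_dictionary and find_unknown_words and the tiny word list are made up for this example.

```python
def is_in_dictionary(dictionary, word):
    """Binary search over a sorted list; True if word is present."""
    low, high = 0, len(dictionary) - 1
    while low <= high:
        mid = (low + high) // 2
        item = dictionary[mid]
        if word == item:
            return True
        elif word < item:
            high = mid - 1
        else:
            low = mid + 1
    # (low <= high) is now False and no match was found.
    return False


def find_unknown_words(dictionary, words):
    """Collect every word that is missing from the dictionary."""
    unknown = []
    for word in words:
        if not is_in_dictionary(dictionary, word):
            unknown.append(word)
    return unknown


# Invented sample data standing in for words.txt and the input file.
dictionary = sorted(["brown", "dog", "fox", "jumps", "lazy",
                     "over", "quick", "red", "the"])
sentence = "the quick red fox jumps over the lzy brn dog".split()
print(find_unknown_words(dictionary, sentence))  # ['lzy', 'brn']
```

The list is returned only after the loop over all words has finished, so the caller can print it in one go.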

Source

2014-03-30 03:22:12 jaynp
