2016-02-27 83 views
0

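Write a function named spelling_corrector. The function should check each word of the input string against all the words in the correct_spells list and return a string such that:

  • If a word in the original sentence exactly matches a word in correct_spells, the word is not modified and should be copied directly to the output string.

  • If a word in the sentence can match a word in the correct_spells list by replacing, inserting, or deleting a single character, then that word should be replaced by the matching word from the correct_spelled list.

  • If neither of the previous two conditions holds, the word from the original string should not be modified and should be copied directly to the output string.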
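The second rule comes down to asking whether two words are exactly one single-character edit apart. A minimal sketch of that check is below; the name `within_one_edit` is hypothetical and does not come from the question, and it assumes both words have already been lowercased.

```python
def within_one_edit(word: str, target: str) -> bool:
    """Return True if `word` becomes `target` through exactly one
    replacement, insertion, or deletion of a single character.
    (Hypothetical helper; assumes both arguments are lowercase.)"""
    if word == target:
        return False  # exact matches belong to the first rule
    # Order the pair by length so one code path covers insert and delete.
    short, long_ = sorted((word, target), key=len)
    if len(long_) - len(short) > 1:
        return False
    # Skip the prefix the two words have in common.
    i = 0
    while i < len(short) and short[i] == long_[i]:
        i += 1
    if len(short) == len(long_):
        # Same length: everything after the mismatch must agree (replacement).
        return short[i + 1:] == long_[i + 1:]
    # Lengths differ by one: dropping long_[i] must leave exactly `short`.
    return short[i:] == long_[i + 1:]


print(within_one_edit("cas", "case"))   # True  (insert 'e')
print(within_one_edit("fan", "fun"))    # True  (replace 'a' with 'u')
print(within_one_edit("thes", "that"))  # False (two characters differ)
```

Sorting the pair by length means insertion and deletion are handled as the same case seen from opposite sides.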

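Notes:

  • Do not spell check one- or two-letter words (copy them directly to the output string).

  • In case of a tie, use the first matching word from the correct_spelled list.

  • Ignore capitalization, i.e. treat capital letters as the same as lowercase letters.

  • All characters in the output string should be lowercase.

  • Assume that the input string contains only alphabetic characters and spaces (a-z and A-Z).

  • Remove extra spaces between words.

  • Remove spaces at the start and end of the output string.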
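The case and whitespace notes can all be handled before any spell checking happens. A small sketch, assuming the input really contains only letters and spaces as stated; `normalize_words` is a hypothetical name used for illustration.

```python
def normalize_words(sentence: str) -> list:
    """Lowercase the sentence and split it into words.

    str.split() with no arguments splits on runs of whitespace and
    ignores leading and trailing whitespace, so extra spaces vanish.
    (Hypothetical helper, not part of the original question.)
    """
    return sentence.lower().split()


words = normalize_words("  Thes   is  the Firs cas ")
print(words)            # ['thes', 'is', 'the', 'firs', 'cas']
print(" ".join(words))  # thes is the firs cas
```

Joining the words back with a single space then gives an output string with no extra, leading, or trailing spaces.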

Examples:

[image: example inputs and expected outputs]

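Explanation:

  • In the first example, 'thes' is not replaced with anything.

  • In the first example, both 'case' and 'car' could replace the 'cas' in the original sentence, but 'case' is selected because it is encountered first.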
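The tie-break in the explanation above amounts to scanning the list in its given order and stopping at the first candidate. A tiny sketch, where `first_candidate` is a hypothetical helper and the candidate set is written out by hand rather than computed:

```python
def first_candidate(correct_spells: list, candidates: set, default: str) -> str:
    """Return the first word of correct_spells that is also a candidate,
    or `default` when none is. (Hypothetical helper for illustration.)"""
    return next((w for w in correct_spells if w in candidates), default)


correct_spells = ["that", "first", "case", "car"]
# Both of these are one edit away from "cas" (insert 'e', replace 's').
candidates = {"car", "case"}
print(first_candidate(correct_spells, candidates, "cas"))  # case
```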

This is the code I have tried so far, but it has not been very useful:

def spelling_corrector(input_string,input_list):
    new_string = input_string.lower().split()
    count = 0
    for x in new_string:
        for y in input_list:
            for i in y:
                if i not in x:
                    count += 1
        if count == 1:
            print(y)
        if len(x) == len(y) or x not in input_list:
            print(x)

spelling_corrector("Thes is the Firs cas", ['that','first','case','car'])
+0

For the second rule, see [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance) – RedLaser
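For reference, here is a rough sketch of the Levenshtein distance that the comment points to, using the standard row-by-row dynamic programming formulation. The function name `levenshtein` is my own choice. Under this approach, the second rule would correspond to a distance of exactly 1.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and replacements needed to turn `a` into `b`."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # replace ca with cb, or keep a match
            ))
        prev = curr
    return prev[-1]


print(levenshtein("cas", "case"))   # 1
print(levenshtein("thes", "that"))  # 2
```

The full distance is more than this problem needs, since only "exactly one edit" matters, but it is a convenient way to cross-check a hand-written one-edit test.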

Answers

1
def replace_1(bad:str, good:str) -> bool:
    """Return True if bad can be converted to good by replacing 1 letter.
    """
    if len(bad) != len(good):
        return False

    for i,ch in enumerate(bad):
        if ch != good[i]:
            # First mismatch: everything after it must already agree.
            return bad[i+1:] == good[i+1:]

    # No mismatch at all: the words are identical, so nothing was replaced.
    return False

def insert_1(bad:str, good:str) -> bool:
    """Return True if bad can be converted to good by inserting 1 letter.
    """
    if len(bad) != len(good) - 1:
        return False

    for i,ch in enumerate(bad):
        if ch != good[i]:
            # good[i] is the inserted letter; the rest must line up.
            return bad[i:] == good[i+1:]

    # At this point, all of bad matches first part of good. So it's an
    # append of the last character.
    return True

def delete_1(bad:str, good:str) -> bool:
    """Return True if bad can be converted to good by deleting 1 letter.
    """
    if len(bad) != len(good) + 1:
        return False
    # Deleting a letter from bad is the same as inserting one into good.
    return insert_1(good, bad)


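def correction(word:str, correct_spells:list) -> str:
    """Return the corrected form of word, or word itself if no rule applies.
    """
    # One- and two-letter words are never spell checked.
    if len(word) < 3:
        return word
    if word in correct_spells:
        return word
    # Scan in list order so that ties go to the first word in the list.
    for good in correct_spells:
        if replace_1(word, good):
            return good
        if insert_1(word, good):
            return good
        if delete_1(word, good):
            return good

    return word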

def spelling_corrector(sentence:str, correct_spells:list) -> str: 
    words = sentence.strip().lower().split() 
    correct_lower = [cs.lower() for cs in correct_spells] 
    result = [correction(w, correct_lower) for w in words] 
    return ' '.join(result) 

tests = (
    ('Thes is the Firs cas', "that first case car", 'thes is the first case'), 
    ('programming is fan and easy', "programming this fun easy hook", 'programming is fun and easy'), 
    ('Thes is vary essy', "this is very very easy", 'this is very easy'), 
    ('Wee lpve Python', "we Live In Python", 'we live python'), 
) 

if __name__ == "__main__":
    for sentence, spells, expected in tests:
        correct = spells.split()
        result = spelling_corrector(sentence, correct)
        print(sentence, "|", spells, "|", expected)
        print("Result:", result)
        assert result == expected