2017-03-18 37 views
2

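How to conjugate English words to the progressive form in Python?

I have already looked at pattern.en's conjugate, but it only conjugates into a few forms, and I would rather not sit down and program all the exceptions to the rules myself in order to do conjugations such as: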

  • free - freeing
  • eat - eating
  • bathe - bathing
  • be - being
  • ban - banning

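nltk has stemming, but it does not seem to offer the reverse operation, at least judging from a search of StackOverflow. This seems like a very basic NLP task, yet I cannot find anything modern that does it in Python. Any general conjugation tool would be nice, although as far as I know the progressive form in English has no irregularities.

I would also like to know whether there are exceptions to the rules below, which could work as a fallback function: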

def present_to_progressive(x):
    vowels = set(['a', 'e', 'i', 'o', 'u'])
    size = len(x)
    if size == 2:
        # Two-letter verbs: be -> being
        return x + 'ing'
    elif x[size - 2:] == 'ie':
        # lie -> lying
        return x[:(size - 2)] + 'ying'
    elif x[size - 1] not in vowels and x[size - 2] not in vowels:
        # Ends in two consonants: just append
        return x + 'ing'
    elif x[size - 1] == 'e' and x[size - 2] not in vowels:
        # Consonant + 'e': drop the 'e' (bathe -> bathing)
        return x[0:(size - 1)] + 'ing'
    elif x[size - 1] not in vowels and x[size - 2] in vowels:
        if x[size - 3] not in vowels:
            # Consonant-vowel-consonant: double the last letter (ban -> banning)
            return x + x[size - 1] + 'ing'
        else:
            return x + 'ing'
    else:
        return x + 'ing'

Edit: added a case for "ie" verbs

Answers

2

There is an entire library for this kind of transformation that does what you want: pattern.en.

It is a great resource. Here is an excerpt from the conjugation tutorial on its website:

conjugate(verb, 
    tense = PRESENT,  # INFINITIVE, PRESENT, PAST, FUTURE 
    person = 3,    # 1, 2, 3 or None 
    number = SINGULAR,  # SG, PL 
    mood = INDICATIVE,  # INDICATIVE, IMPERATIVE, CONDITIONAL, SUBJUNCTIVE 
    aspect = IMPERFECTIVE, # IMPERFECTIVE, PERFECTIVE, PROGRESSIVE 
    negated = False,   # True or False 
    parse = True) 

It is very useful and very extensive!

+0

OK, I think I initially overlooked the aspect parameter. That should do the trick for my purposes.

1

I think your code covers most cases. I checked it against a list of 620 irregular verbs taken from this site, and it misses about 84 cases.

with open('/tmp/Verblist.vrb', 'rt') as f:
    err = 0
    for line in f:
        # Verb entries start with '>'; field 0 is the base form, field 4 the -ing form
        if line.startswith('>'):
            forms = line[1:].split(' ')
            guess = present_to_progressive(forms[0])
            if forms[4].lower() != guess.lower():
                print('CHECK: {} {} {}'.format(forms[0], forms[4], guess))
                err += 1
    print(err)
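
The same kind of check can be run without the file, against any list of (base, expected) pairs. The following is only a sketch under assumptions: find_mismatches, append_ing and sample_pairs are hypothetical names introduced here for illustration, and the pairs are hand-picked rather than taken from the verb list. It lowercases the base form before guessing, because present_to_progressive compares letters against a lowercase vowel set while the headwords in the list appear to be capitalized.

```python
def find_mismatches(conjugate_fn, pairs):
    """Return (base, expected, guess) for every pair the function gets wrong."""
    mismatches = []
    for base, expected in pairs:
        # Lowercase first so a capitalized headword is matched against
        # the lowercase vowel set used by the conjugation rules.
        guess = conjugate_fn(base.lower())
        if guess != expected.lower():
            mismatches.append((base, expected, guess))
    return mismatches


def append_ing(verb):
    """Deliberately naive stand-in; pass present_to_progressive instead."""
    return verb + 'ing'


# Hand-picked illustration data, not taken from the verb list above.
sample_pairs = [('Eat', 'Eating'), ('Swim', 'Swimming'), ('Lie', 'Lying')]

for base, expected, guess in find_mismatches(append_ing, sample_pairs):
    print('CHECK: {} {} {}'.format(base, expected, guess))
```

Returning the mismatches instead of only counting them makes it easier to sort out which failures are real and which are typos in the reference list.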

Just by adding 'w' and 'y' to your list of vowels, the list of possible errors drops to 18 cases:

CHECK: Aby/Abey Abying/Abeying Aby/Abeying -- Correct 
CHECK: Eat Eating Eatting 
CHECK: Fordo/Foredo Fordoing Fordo/Foredoing -- Correct in one of the 2 variants 
CHECK: Forget Foregetting Forgetting   -- Correct, the list has a typo 
CHECK: Lie Lying Lieing      -- Fixed in your second version 
CHECK: Mischoose Mischoosins Mischoosing  -- Correct, the list has a typo 
CHECK: Miswed Miswedding Misweding 
CHECK: Outswim Outswimming Outswiming 
CHECK: Overlie Overlying Overlieing   -- Fixed in your second version 
CHECK: Quit Quitting Quiting 
CHECK: Relearn Relearn Relearning 
CHECK: Rewed Rewedding Reweding 
CHECK: Rewet Rewetting Reweting 
CHECK: Rewin Rewinning Rewining 
CHECK: Swim Swimming Swiming 
CHECK: Underlie Underlying Underlieing  -- Fixed in your second version 
CHECK: Vex Vexing Vexxing 
CHECK: Zinc Zincking Zincing 

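Most of these can be fixed by adding a special case for "lie" and by improving the rule for doubling the final consonant. I think you may decide to give up on some very uncommon verbs.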
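
As a rough illustration of those two fixes, here is a minimal sketch rather than a complete solution. The names to_progressive, VOWELS and NEVER_DOUBLED are made up for this example, and treating "qu" as a consonant (so that "quit" doubles its "t") is an extra assumption that goes beyond what is suggested above.

```python
VOWELS = set('aeiou')
# Final letters this sketch never doubles (vex -> vexing, play -> playing).
NEVER_DOUBLED = set('wxy')


def to_progressive(verb):
    """Guess the -ing form of an English verb from its spelling alone."""
    verb = verb.lower()
    if len(verb) <= 2:
        return verb + 'ing'                 # be -> being
    if verb.endswith('ie'):
        return verb[:-2] + 'ying'           # lie -> lying
    if verb.endswith('e') and verb[-2] not in VOWELS:
        return verb[:-1] + 'ing'            # bathe -> bathing
    last, middle, before = verb[-1], verb[-2], verb[-3]
    # Assumption: the 'u' in 'qu' acts as a consonant (quit -> quitting).
    before_is_consonant = before not in VOWELS or verb[-4:-2] == 'qu'
    if (last not in VOWELS and last not in NEVER_DOUBLED
            and middle in VOWELS and before_is_consonant):
        return verb + last + 'ing'          # swim -> swimming
    return verb + 'ing'                     # eat -> eating, free -> freeing


for word in ['free', 'eat', 'bathe', 'be', 'ban', 'lie', 'quit', 'vex']:
    print(word, '->', to_progressive(word))
```

Spelling alone cannot decide every case: doubling also depends on stress ("begin" gives "beginning", but "open" gives "opening"), so this sketch still doubles wrongly for verbs such as "open" or "visit". A dictionary-backed tool such as pattern.en remains the safer primary choice, with rules like these as a fallback.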
