Python - Generating the plural form of a singular noun

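How can I use the NLTK module to produce the singular and plural forms of a noun, or to ignore the singular/plural distinction when searching for words in a txt file? Can I also use NLTK to make the program case-insensitive?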

2015-09-04 user5301912

You can do this with pattern.en; I'm not too sure about NLTK.

>>> from pattern.en import pluralize, singularize
>>> print(pluralize('child'))  # children
>>> print(singularize('wolves'))  # wolf

See more.


2015-09-04 18:53:21 taesu

Awesome :) ... it might be worth mentioning that you also need to `pip install pattern` –

Thanks :) I'll definitely try this, but I still need NLTK for other purposes. – user5301912

You can import both. I couldn't find this in NLTK. – taesu

Here is one possible way to do this with NLTK. Imagine you are looking for the word "feature":

from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

# Requires the NLTK tokenizer and WordNet data (see nltk.download()).
wnl = WordNetLemmatizer()
text = "This is a small text, a very small text with no interesting features."
tokens = [token.lower() for token in word_tokenize(text)]  # lower-case every token
lemmatized_words = [wnl.lemmatize(token) for token in tokens]  # 'features' -> 'feature'
print('feature' in lemmatized_words)  # True

Case sensitivity is handled by applying str.lower() to every word, and of course you also have to lemmatize the search term where necessary.
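Putting the two ideas together for the word-list search in the question, one could normalise both the words from the file and the search term before comparing them. The following is only a sketch: `search_words` and `naive_singular` are hypothetical helpers, and `naive_singular` is a crude stand-in that merely strips one trailing "s" so that the example runs without any NLTK data. In practice you would probably pass `WordNetLemmatizer().lemmatize` as the `normalize` argument instead.

```python
import re


def naive_singular(word):
    """Crude stand-in for a lemmatizer: drop one trailing 's' (an assumption, not NLTK)."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def search_words(text, query, normalize=naive_singular):
    """Return True if `query` occurs in `text`, ignoring case and singular/plural."""
    # Lower-case first, then split the text into alphabetic tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    normalized = {normalize(token) for token in tokens}
    # The search term has to go through the same normalisation as the text.
    return normalize(query.lower()) in normalized


text = "This is a small text, a very small text with no interesting features."
print(search_words(text, "Feature"))   # True
print(search_words(text, "FEATURES"))  # True
print(search_words(text, "cats"))      # False
```

For a txt file, `text` could simply be the result of reading the file, e.g. `open(path).read()` with your own `path`.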


2015-09-04 20:12:03

Can I add .lower() directly to raw_input('>')? – user5301912

Yes, you can do `raw_input('>').lower()`. –

Great. So if I add .lower() it will accept the word however I type it? Like admin, Admin, aDmin, admiN and so on? – user5301912
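On that last comment: lower-casing both sides of the comparison should make every casing of the word match. A minimal sketch in plain Python, where `is_known_user` and the `known` set are hypothetical names made up for illustration (on Python 2 the typed value would come from `raw_input('>')`, on Python 3 from `input('>')`):

```python
def is_known_user(typed, known):
    """Return True if `typed` matches any entry in `known`, ignoring case."""
    # Lower-case both sides so 'Admin', 'aDmin' and 'admiN' all compare equal.
    normalized = {name.lower() for name in known}
    return typed.strip().lower() in normalized


known = {"admin", "Guest"}  # hypothetical word list
for typed in ["admin", "Admin", "aDmin", "admiN", "root"]:
    print(typed, is_known_user(typed, known))
```

Only "root" should come back as False here; note that the stored words are lower-cased too, otherwise an entry such as "Guest" would never match.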

At the time of writing, pattern does not support Python 3 (although there is an ongoing discussion about this at https://github.com/clips/pattern/issues/62).

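TextBlob (https://textblob.readthedocs.io) is built on top of pattern and NLTK and also includes pluralization functionality. It seems to do a pretty good job, though it isn't perfect; see the example code below: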

from textblob import TextBlob 
words = "cat dog child goose pants" 
blob = TextBlob(words) 
plurals = [word.pluralize() for word in blob.words] 
print(plurals) 
# >>> ['cats', 'dogs', 'children', 'geese', 'pantss']


2017-01-24 13:58:48 Sixhobbits

This may be a little late as an answer, but just in case someone is still looking for something similar:

There is inflect (also available on GitHub), which supports Python 2.x and 3.x. You can find the singular or plural form of a given word:

import inflect 
p = inflect.engine() 

words = "cat dog child goose pants" 
print([p.plural(word) for word in words.split(' ')]) 
# ['cats', 'dogs', 'children', 'geese', 'pant']

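It's worth noting that calling p.plural on a word that is already plural will give you the singular form. In addition, you can provide a POS (part of speech) tag, or provide a number and let the library decide whether the word needs to be singular or plural: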

p.plural('cat', 4) # cats 
p.plural('cat', 1) # cat 
# but also... 
p.plural('cat', 0) # cats


2018-03-01 10:35:55

Strange. `inflect.engine().plural('children')` outputs 'childrens' ... why? –

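Yes, this library behaves oddly in some cases. Another one: `inflect.engine().plural('houses')` outputs 'housess'. I don't know the internals completely; these days I'm actually working my way through them. There are some cases that work really nicely, but also some that look like obvious bugs. –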
