OK, I finally came up with a solution:
from nltk.corpus import wordnet
# Collect the lemma part of every synset name, e.g. "dog.n.01" -> "dog".
# In current NLTK versions name() is a method, not an attribute.
# A set removes the duplicates produced by words with several senses.
names = {syn.name().rsplit(".", 2)[0] for syn in wordnet.all_synsets()}
# Write one unique word per line to the final word list.
with open("wordnet_wordlist_final.txt", "w") as f:
    f.writelines(name + "\n" for name in sorted(names))
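To see what the name-stripping and de-duplication step does without downloading the WordNet corpus, here is a minimal sketch on a few hand-written names. It assumes synset names follow the `lemma.pos.nn` pattern; `sample_names` and `lemma_of` are hypothetical names introduced only for this illustration, and the sample entries are made up rather than taken from the corpus.

```python
# Hypothetical sample of synset names in the "lemma.pos.nn" format.
sample_names = ["dog.n.01", "dog.n.02", "run.v.01", "a.m..r.01"]

def lemma_of(synset_name):
    """Strip the trailing ".pos.nn" suffix from a synset name."""
    # Splitting from the right keeps any dots that belong to the lemma itself.
    return synset_name.rsplit(".", 2)[0]

# A set drops the repeated "dog"; sorting just makes the output stable.
unique_lemmas = sorted({lemma_of(name) for name in sample_names})
print(unique_lemmas)
```

For names with a two-digit sense number this gives the same result as slicing off the last five characters.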
Now the word list can be read back from the final wordnet_wordlist_final.txt file and searched with nltk.edit_distance:
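import nltk
# Load the word list once, dropping the trailing newline from each line.
with open("wordnet_wordlist_final.txt") as f:
    words = [line.rstrip("\n") for line in f]
def edit(word, distance):
    """Return every word whose edit distance from `word` is exactly `distance`."""
    validlist = []
    for valid in words:
        # The length difference is a lower bound on the edit distance,
        # so words that differ in length by more than `distance` are skipped.
        if abs(len(valid) - len(word)) <= distance:
            if nltk.edit_distance(word, valid) == distance:
                validlist.append(valid)
    return validlist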
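For readers who want to try the filtering idea without NLTK or the generated file, here is a minimal sketch in plain Python. `levenshtein` is a textbook dynamic-programming edit distance written for this example, not NLTK's own code, though it should agree with `nltk.edit_distance` under its default settings. `words_at_distance` and `sample_words` are likewise hypothetical names used only for illustration.

```python
def levenshtein(a, b):
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def words_at_distance(word, distance, candidates):
    """Keep candidates whose edit distance from `word` is exactly `distance`."""
    return [c for c in candidates
            # Cheap length check first; it can never reject a true match.
            if abs(len(c) - len(word)) <= distance
            and levenshtein(word, c) == distance]

# Hypothetical in-memory word list standing in for the WordNet file.
sample_words = ["cat", "cart", "coat", "dog", "cats", "at"]
print(words_at_distance("cat", 1, sample_words))
```

The length check is only an optimisation: two strings whose lengths differ by more than `distance` cannot be within that edit distance, so the expensive comparison is skipped for them.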
Are you sure WordNet is what you want? It seems like overkill. Enchant might be a better fit: http://packages.python.org/pyenchant/