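Parsing text to get proper nouns (names and organizations) - python, nltk

I am trying to extract proper nouns, specifically personal names and organization names, from very small chunks of text such as SMS messages. The basic parsers available with nltk (see "Finding Proper Nouns using NLTK WordNet") can pick out the nouns, but the problem arises when a proper noun does not start with a capital letter. In text like the following, a name such as sumit is not recognized as a proper noun: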
>>> from nltk import pos_tag
>>> sentence = "i spoke with sumit and rajesh and Samit about the gridlock situation last night @ around 8 pm last nite"
>>> tagged_sent = pos_tag(sentence.split())
>>> print(tagged_sent)
[('i', 'PRP'), ('spoke', 'VBP'), ('with', 'IN'), ('sumit', 'NN'), ('and', 'CC'), ('rajesh', 'JJ'), ('and', 'CC'), ('Samit', 'NNP'), ('about', 'IN'), ('the', 'DT'), ('gridlock', 'NN'), ('situation', 'NN'), ('last', 'JJ'), ('night', 'NN'), ('@', 'IN'), ('around', 'IN'), ('8', 'CD'), ('pm', 'NN'), ('last', 'JJ'), ('nite', 'NN')]
You could try truecasing the text before applying the named entity recognizer.
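To make the truecasing suggestion concrete, here is a minimal sketch of one simple approach: restore each token to the casing it most often has in some well-edited reference text. This is an illustration only, not an nltk feature; the helper names `build_casing_model` and `truecase` and the tiny `reference` sample are assumptions made up for this example, and a real setup would need a much larger reference corpus.

```python
from collections import Counter, defaultdict

def build_casing_model(reference_tokens):
    """Count, for each lowercased word, how often each surface form occurs."""
    forms = defaultdict(Counter)
    for tok in reference_tokens:
        forms[tok.lower()][tok] += 1
    return forms

def truecase(tokens, forms):
    """Replace each token with its most frequent casing in the reference data.

    Tokens never seen in the reference data are left unchanged.
    """
    restored = []
    for tok in tokens:
        seen = forms.get(tok.lower())
        restored.append(seen.most_common(1)[0][0] if seen else tok)
    return restored

# Hypothetical reference text (assumption); in practice use a large, properly cased corpus.
reference = "Sumit met Rajesh and Samit . Rajesh and Sumit spoke about the gridlock".split()
forms = build_casing_model(reference)

sentence = "i spoke with sumit and rajesh and Samit about the gridlock situation"
truecased = truecase(sentence.split(), forms)
print(" ".join(truecased))
```

The truecased token list can then be passed to pos_tag in place of sentence.split(). Whether the tagger then labels the restored names as NNP still depends on the tagger model, so this may help rather than being guaranteed to.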