How to access a text file of Afrikaans words as an NLTK corpus

I have a text file containing plain-text sentences in Afrikaans. I would like to run NLTK corpus functions on this text file, but I cannot find any example of how to do this.

I would like to do things such as:

mytext.concordance("woord") 
mytext.similar("woord")

Can anyone help me?

Source

2013-01-07 Superdooperhero

I managed to figure it out:

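# How to load a text file as a corpus.
import nltk
from nltk.corpus import PlaintextCorpusReader
from nltk.corpus.util import LazyCorpusLoader

# Looks for a folder named 'afrikaans' under nltk_data/corpora and reads every .txt file in it.
afrikaans = LazyCorpusLoader('afrikaans', PlaintextCorpusReader, r'(?!\.).*\.txt')
print(afrikaans.sents()[1])  # second sentence of the corpus
af = nltk.Text(afrikaans.words())  # wrap the tokens to get concordance(), similar(), etc.
af.concordance("mense")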

This assumes that your corpus text file is located at C:\nltk_data\corpora\afrikaans\afrikaans.txt
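
If the file does not live under nltk_data, one alternative is to point PlaintextCorpusReader directly at the folder that holds it. The following is only a minimal sketch: the temporary folder, the file name afrikaans.txt and the sample sentences are made up so that the snippet runs on its own, and the variable names (corpus_dir, reader, af) are illustrative.

```python
import os
import tempfile

import nltk
from nltk.corpus import PlaintextCorpusReader

# Hypothetical sample data so the sketch runs on its own; replace corpus_dir
# with the folder that holds your real Afrikaans .txt files.
corpus_dir = tempfile.mkdtemp()
with open(os.path.join(corpus_dir, "afrikaans.txt"), "w", encoding="utf-8") as f:
    f.write("Die mense praat. Die kinders praat. Baie mense lees die woord.\n")

# Read every .txt file under corpus_dir as one plain-text corpus.
reader = PlaintextCorpusReader(corpus_dir, r".*\.txt", encoding="utf-8")

# words() uses the reader's default regex-based word tokenizer.
words = list(reader.words())
af = nltk.Text(words)

af.concordance("mense")  # prints each occurrence with its surrounding context
af.similar("mense")      # prints words that occur in similar contexts
```

Only words() is used here, so the sketch avoids sentence splitting; calling sents() as in the snippet above additionally relies on a sentence tokenizer being available.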

Source

2013-01-10 21:10:25 Superdooperhero
