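How to extract n-grams in R when RWeka cannot be installed

I am trying to extract n-grams from the vector below in R. I could not install RWeka/rJava no matter what I tried, so I turned to the ngram package as an alternative. However, this script has a problem and does not work.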
library(tm)
library(ngram)
text=c("A vector of n-grams","listed in decreasing blocks","it is a vector","it works a little differently","there are many vectors","another vector")
myCorpus=VCorpus(VectorSource(text))
bigram_tokenizer <- function(x)
ngram_asweka(x, min = 2, max = 2)
bigram_tdm <- DocumentTermMatrix(myCorpus)
findFreqTerms(bigram_tdm, 3)
What is causing the character(0) result, and how do I deal with it? Thanks!
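"vector" also occurs only twice... try adding one more string: 'text <- c(text, "another vector")'
'character(0)' means that nothing was found
Thanks @EnriquePérezHerrero, I added that and the result now returns "vector", but since I specified min = 2, why is no bigram containing "vector" returned? – santoku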
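
One likely reason for the unigram result: bigram_tokenizer is defined in the script but never handed to DocumentTermMatrix, so tm falls back to its default word tokenizer. The sketch below passes a tokenizer through the control argument. It swaps ngram_asweka for NLP::ngrams, the approach shown in the tm FAQ, because the tokenizer receives a document object rather than a plain string; that swap is an assumption for illustration, not the only possible fix. The threshold is also lowered to 2, since no bigram in this small sample occurs three times.

```r
library(tm)  # tm depends on NLP, which provides ngrams() and words()

text <- c("A vector of n-grams", "listed in decreasing blocks",
          "it is a vector", "it works a little differently",
          "there are many vectors", "another vector")
myCorpus <- VCorpus(VectorSource(text))

# Bigram tokenizer built on NLP::ngrams (swapped in for ngram_asweka):
# split the document into words, form all 2-word windows, and join each
# window back into a single space-separated term.
bigram_tokenizer <- function(x) {
  unlist(lapply(NLP::ngrams(NLP::words(x), 2L), paste, collapse = " "),
         use.names = FALSE)
}

# The tokenizer only takes effect when it is passed via `control`.
bigram_tdm <- DocumentTermMatrix(myCorpus,
                                 control = list(tokenize = bigram_tokenizer))

Terms(bigram_tdm)             # inspect the bigram terms that were produced
findFreqTerms(bigram_tdm, 2)  # bigrams occurring at least twice
```

With a real corpus the original threshold of 3 may be perfectly reasonable; it only comes back empty here because the sample is so small.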