Why can't I use TermDocumentMatrix?
I am using the commands below to normalize plural words to their singular form, but I get an error.
crudeCorp <- tm_map(crudeCorp, gsub, pattern = "smells", replacement = "smell")
crudeCorp <- tm_map(crudeCorp, gsub, pattern = "feels", replacement = "feel")
crudeDtm <- TermDocumentMatrix(crudeCorp, control=list(removePunctuation=T))
Error in UseMethod("meta", x) :
no applicable method for 'meta' applied to an object of class "character"
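How can I fix this? 1. Is there a command that converts plurals to singular? 2. Am I using this command incorrectly?
Below is the full code I use to process the text and build the matrix.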
library(tm)
library(XML)
crudeCorp<-VCorpus(VectorSource(readLines(file.choose())))
#(Eliminating Extra Whitespace)
crudeCorp <- tm_map(crudeCorp, stripWhitespace)
#(Convert to Lower Case)
crudeCorp<-tm_map(crudeCorp, content_transformer(tolower))
# remove stopwords from corpus
crudeCorp<-tm_map(crudeCorp, removeWords, stopwords("english"))
myStopwords <- c(stopwords("english"), "can", "will","got","also","goes","get","much","since","way","even")
myStopwords <- setdiff(myStopwords, c("will","can"))
crudeCorp <- tm_map(crudeCorp, removeWords, myStopwords)
crudeCorp<-tm_map(crudeCorp,removeNumbers)
crudeCorp <- tm_map(crudeCorp, gsub, pattern = "smells", replacement = "smell")
crudeCorp <- tm_map(crudeCorp, gsub, pattern = "feels", replacement = "feel")
#-(Creating Term-Document Matrices)
crudeDtm <- TermDocumentMatrix(crudeCorp, control=list(removePunctuation=T))
For example, my data:
1. I'M HAPPY
2. how are you?
3. This apple is good
(skip)
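
One likely cause of the error above: when `gsub` is passed to `tm_map` directly, each document in the corpus is replaced by a plain character vector, so `TermDocumentMatrix` later fails when it looks for document metadata. A minimal sketch of one possible fix, wrapping `gsub` in `content_transformer`, is shown here. The sample sentences and the helper name `replaceWord` are assumptions made for illustration and are not part of the original code.

```r
library(tm)

# Illustrative input (assumed): a few short lines standing in for the real file
docs <- c("I'M HAPPY", "how are you?", "This apple smells good", "It feels fine")
crudeCorp <- VCorpus(VectorSource(docs))

crudeCorp <- tm_map(crudeCorp, content_transformer(tolower))

# Hypothetical helper: content_transformer() applies the function to the text
# of each document and keeps the document object (and its metadata) intact.
replaceWord <- content_transformer(function(x, pattern, replacement) {
  gsub(pattern, replacement, x)
})

# \\b restricts the match to whole words
crudeCorp <- tm_map(crudeCorp, replaceWord, pattern = "\\bsmells\\b", replacement = "smell")
crudeCorp <- tm_map(crudeCorp, replaceWord, pattern = "\\bfeels\\b", replacement = "feel")

# The corpus elements are still text documents, so this call should now work
crudeDtm <- TermDocumentMatrix(crudeCorp, control = list(removePunctuation = TRUE))
inspect(crudeDtm)
```

For the more general plural-to-singular question, `stemDocument` from `tm` (which relies on the SnowballC package) may be worth trying instead of one `gsub` call per word, with the caveat that stemming also shortens many other words.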