tm_map error in R

This is my first time doing Twitter analysis, and I am getting an error from tm_map in R.

#Search data from Twitter 
library("twitteR") 
SearchData = searchTwitter("Bruno Mars", n=1000,lang = 'en') 
SearchData 

#Scraping Data 
userTimeline("BrunoMars", n=100, maxID =NULL, excludeReplies = FALSE, includeRts = FALSE) 

class(SearchData) 
head(SearchData) 

#Cleaning Data 
library(NLP) 
library(tm) 



TweetList <- sapply(SearchData, function(x) x$getText()) 

TweetList <- (TweetList[!is.na(TweetList)]) 
TweetCorpus <- Corpus(VectorSource(TweetList)) 
TweetCorpus <- iconv(TweetCorpus, to ="utf-8") 

#remove punctuation and numbers, then change data to lower case 

TweetCorpus <- tm_map(TweetCorpus,removePunctuation) 
TweetCorpus <- tm_map(TweetCorpus, removeNumbers) 
TweetCorpus <- tm_map(TweetCorpus, tolower)

I get this error on my last 3 lines: Error in UseMethod("tm_map", x) : no applicable method for 'tm_map' applied to an object of class "character"

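I tried to solve this by adding content_transformer before removePunctuation, removeNumbers and tolower, but I still get the same error. I really don't know what else to try and would appreciate your advice. I have been stuck on this for several days and still haven't solved it.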

Thank you very much, Ros

Source

2017-05-25 Siroros Roongdonsai

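tm_map must be applied to a corpus object, not a character vector. But iconv turns your TweetCorpus object from a corpus into a character vector, which is why the tm_map calls that follow fail.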

To fix this, switch the order of your preprocessing so that you use iconv before you turn the tweets into a corpus object:

TweetList <- c("hello", "world", "Hooray", "yep") 
TweetList <- iconv(TweetList, to ="utf-8") 
TweetCorpus <- Corpus(VectorSource(TweetList))
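For illustration, here is a minimal sketch of the whole cleaning step with the reordered calls. The strings in TweetList are made-up stand-ins for the tweet texts (an assumption, not real search results), and the tolower step is wrapped in content_transformer so it is accepted by tm_map.

```r
library(tm)

# Made-up stand-in for the texts returned by searchTwitter (assumption)
TweetList <- c("Hello, World!", "Bruno Mars 2017", NA, "Hooray!!!")

# Drop missing values and convert the encoding while this is still a character vector
TweetList <- TweetList[!is.na(TweetList)]
TweetList <- iconv(TweetList, to = "UTF-8")

# Only now build the corpus, so tm_map receives a corpus object
TweetCorpus <- Corpus(VectorSource(TweetList))

TweetCorpus <- tm_map(TweetCorpus, removePunctuation)
TweetCorpus <- tm_map(TweetCorpus, removeNumbers)
TweetCorpus <- tm_map(TweetCorpus, content_transformer(tolower))

# The object is still a corpus after every step
print(class(TweetCorpus))
```

Checking class(TweetCorpus) after each step is a quick way to spot where an object stops being a corpus.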

Source

2017-05-25 12:47:35

Thank you very much, Patronus.

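The latest version of tm made it so you can no longer use functions with tm_map that operate on simple character values. So the problem is your tolower step, since that isn't a "canonical" transformation (see getTransformations()). Just replace it with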

TweetCorpus <- tm_map(TweetCorpus, content_transformer(tolower))

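The content_transformer function wrapper will convert everything to the correct data type within the corpus. You can use content_transformer with any function that is intended to manipulate character vectors so that it will work in a tm_map pipeline.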
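As a rough sketch of that last point, the example below wraps a custom character-vector function the same way as tolower. The removeURLs helper and the two sample documents are hypothetical, and VCorpus is used here instead of Corpus only to keep the example explicit about the corpus type.

```r
library(tm)

# Hypothetical helper that works on plain character vectors
removeURLs <- function(x) gsub("http[^[:space:]]*", "", x)

# Sample documents invented for this example
docs <- VCorpus(VectorSource(c("See http://example.com NOW", "No Link Here")))

# The transformations that tm ships with; tolower is not one of them
print(getTransformations())

# Wrap character-vector functions so tm_map can apply them to each document
docs <- tm_map(docs, content_transformer(removeURLs))
docs <- tm_map(docs, content_transformer(tolower))

# Pull the cleaned text back out as a plain character vector
texts <- unname(vapply(docs, function(d) paste(content(d), collapse = " "), character(1)))
print(texts)
```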

Source

2017-05-25 12:49:02

Thank you very much, Lorenzo. You were very helpful. The tutorial I followed may be quite old.
