2013-06-25 9 views
0

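*Still stumped* Sentiment analysis of tweets using R with the twitteR, plyr, stringr, and RMySQL packages

I am following a beginner tutorial for the R twitteR package and have hit a roadblock. (Tutorial URL: https://sites.google.com/site/miningtwitter/questions/sentiment/analysis)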

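Instead of importing a list of tweets via the searchTwitter function as detailed in section 4, I am importing a data frame of tweets from a MySQL database. I can import the tweets from MySQL fine, but when I try to execute: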

wine_txt = sapply(wine_tweets, function(x) x$getText())

I get an error:

Error in x$getText : $ operator is invalid for atomic vectors

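The data is already in data.frame form. I coerced it into a data.frame again afterwards just to be sure, and I still get the same error. I have pasted my full code below; any help would be greatly appreciated.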

library(twitteR) 
library(plyr) 
library(stringr) 
library(RMySQL) 

tweets.con<-dbConnect(MySQL(),user="XXXXXXXX",password="XXXXXXXX",dbname="XXXXXXX",host="XXXXXXX") 
wine_tweets<-dbGetQuery(tweets.con,"select `tweet_text` from `tweets` where `created_at` BETWEEN timestamp(DATE_SUB(NOW(), INTERVAL 11 MINUTE)) AND timestamp(NOW())") 

# function score.sentiment 
score.sentiment = function(sentences, pos.words, neg.words, .progress='none') 
{ 
# Parameters 
# sentences: vector of text to score 
# pos.words: vector of words of positive sentiment
# neg.words: vector of words of negative sentiment 
# .progress: passed to laply() to control the progress bar

# create simple array of scores with laply 
scores = laply(sentences, 
function(sentence, pos.words, neg.words) 
{ 
    # remove punctuation 
    sentence = gsub("[[:punct:]]", "", sentence) 
    # remove control characters 
    sentence = gsub("[[:cntrl:]]", "", sentence) 
    # remove digits? 
    sentence = gsub('\\d+', '', sentence) 

    # define error handling function when trying tolower 
    tryTolower = function(x) 
    { 
    # create missing value 
    y = NA 
    # tryCatch error 
    try_error = tryCatch(tolower(x), error=function(e) e) 
    # if not an error 
    if (!inherits(try_error, "error")) 
    y = tolower(x) 
    # result 
    return(y) 
    } 
    # use tryTolower with sapply 
    sentence = sapply(sentence, tryTolower) 

    # split sentence into words with str_split (stringr package) 
    word.list = str_split(sentence, "\\s+") 
    words = unlist(word.list) 

    # compare words to the dictionaries of positive & negative terms 
    pos.matches = match(words, pos.words) 
    neg.matches = match(words, neg.words) 

    # get the position of the matched term or NA 
    # we just want a TRUE/FALSE 
    pos.matches = !is.na(pos.matches) 
    neg.matches = !is.na(neg.matches) 

    # final score 
    score = sum(pos.matches) - sum(neg.matches) 
    return(score) 
    }, pos.words, neg.words, .progress=.progress) 

# data frame with scores for each sentence 
scores.df = data.frame(text=sentences, score=scores) 
return(scores.df) 
} 

# import positive and negative words 
pos = readLines("/home/jgraab/R/scripts/positive_words.txt") 
neg = readLines("/home/jgraab/R/scripts/negative_words.txt") 
wine_txt = sapply(wine_tweets, function(x) x$getText()) 

Answers

0

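$ is for grabbing a column of a data frame (or an element of a list, etc.); you can't use it to call a function on the object you are applying over. You want something like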

getText(x) 

there.
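
One likely reason the call fails, offered as a sketch rather than a confirmed diagnosis: dbGetQuery returns a plain data frame, so sapply walks over its columns, and each x is a character vector rather than a twitteR status object with a getText method. The example below uses a made-up two-row data frame in place of the MySQL result; the tweet texts are invented for illustration.

```r
# Stand-in for the result of dbGetQuery(); the rows are invented for illustration.
wine_tweets <- data.frame(
  tweet_text = c("I love this wine", "Terrible cork, bad bottle"),
  stringsAsFactors = FALSE
)

# sapply() over a data frame iterates over its columns, so x is a character
# vector and x$getText() raises the "atomic vectors" error.
err <- tryCatch(
  sapply(wine_tweets, function(x) x$getText()),
  error = function(e) conditionMessage(e)
)
print(err)

# The text is already a plain character vector; pull the column out directly.
wine_txt <- wine_tweets$tweet_text
print(class(wine_txt))
print(length(wine_txt))
```

If this assumption holds, the getText step from the tutorial can be skipped entirely, because the tweet_text column already holds the strings that getText would have extracted.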

+0

I swapped out x$getText() for getText(x), but still no luck :(.

+0

Any other guidance on how to adapt the tutorial to data frame input?
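
A minimal sketch of one way to adapt the tutorial to data frame input, assuming the query returns a single tweet_text column of plain strings. To stay self-contained it uses base R only: score_one is a hypothetical, simplified stand-in for the question's score.sentiment, and the sample tweets and word lists are invented in place of the MySQL result and the positive/negative word files.

```r
# Hypothetical stand-ins: in the question these come from dbGetQuery() and readLines().
wine_tweets <- data.frame(
  tweet_text = c("I love this wine", "Terrible cork, bad bottle", "Just wine"),
  stringsAsFactors = FALSE
)
pos <- c("love", "great")
neg <- c("terrible", "bad")

# Simplified scorer: (number of positive words) - (number of negative words).
score_one <- function(sentence, pos.words, neg.words) {
  sentence <- gsub("[[:punct:]]", "", sentence)  # remove punctuation
  sentence <- gsub("[[:cntrl:]]", "", sentence)  # remove control characters
  sentence <- gsub("\\d+", "", sentence)         # remove digits
  words <- unlist(strsplit(tolower(sentence), "\\s+"))
  sum(words %in% pos.words) - sum(words %in% neg.words)
}

# Take the text column as a character vector; no getText() step is needed.
wine_txt <- wine_tweets$tweet_text
scores <- vapply(wine_txt, score_one, numeric(1),
                 pos.words = pos, neg.words = neg, USE.NAMES = FALSE)

scores.df <- data.frame(text = wine_txt, score = scores, stringsAsFactors = FALSE)
print(scores.df)
```

With the full function from the question, the equivalent call would presumably be score.sentiment(wine_tweets$tweet_text, pos, neg), since its sentences argument expects a vector of text.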

+0

What is the error message when you make the swap?