NLTK sentiment VADER: ordering

I just ran VADER sentiment analysis on my dataset:

from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk import tokenize
sid = SentimentIntensityAnalyzer()
for sentence in filtered_lines2:
    print(sentence)
    ss = sid.polarity_scores(sentence)
    for k in sorted(ss):
        print('{0}: {1}, '.format(k, ss[k]),)
        print()

Here is a sample of my results:

Are these guests on Samsung and Google event mostly Chinese Wow Theyre boring
Google Samsung
('compound: 0.3612, ',)
()
('neg: 0.12, ',)
()
('neu: 0.681, ',)
()
('pos: 0.199, ',)
()

Adobe lose 135bn to piracy Report
('compound: -0.4019, ',)
()
('neg: 0.31, ',)
()
('neu: 0.69, ',)
()
('pos: 0.0, ',)
()

Samsung Galaxy Nexus announced
('compound: 0.0, ',)
()
('neg: 0.0, ',)
()
('neu: 1.0, ',)
()
('pos: 0.0, ',)
()

I want to know how many times "compound" is equal to, greater than, or less than zero.

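I know this is probably very simple, but I am new to Python and to coding in general. I have tried many different ways to build what I need, but I could not find a solution.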

(Please edit my question if the "sample of my results" is formatted incorrectly, as I don't know the right way to write it.)

2016-09-29 Luca Perinati

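It looks like you are writing Python 3 code but running it with Python 2 (this is unrelated to your question, but it may get you into trouble eventually). – lenz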

Thanks for the advice! –
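To illustrate the comment above: the tuples and the `()` lines in the sample output are what Python 2 prints for `print(x,)` and `print()` when `print` is a statement rather than a function. The following is a minimal sketch of one way to print the scores on a single line under Python 3. The helper name `format_scores` is made up for this example, and the scores are copied from the first sample result instead of being computed by NLTK.

```python
# Scores copied from the first sample result in the question
# (hard-coded here so the sketch runs without NLTK).
ss = {"compound": 0.3612, "neg": 0.12, "neu": 0.681, "pos": 0.199}


def format_scores(scores):
    """Return the scores as one 'key: value' line, keys in sorted order.

    `format_scores` is a hypothetical helper, not part of NLTK.
    """
    return ", ".join("{0}: {1}".format(k, scores[k]) for k in sorted(scores))


line = format_scores(ss)
print(line)  # compound: 0.3612, neg: 0.12, neu: 0.681, pos: 0.199
```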

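This is far from the most pythonic way to do it, but I think it will be the easiest to understand if you don't have much Python experience. Essentially, you create a dictionary with all values set to 0 and increment the matching value in each case.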

from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk import tokenize
sid = SentimentIntensityAnalyzer()
# One counter per case, all starting at 0
res = {"greater": 0, "less": 0, "equal": 0}
for sentence in filtered_lines2:
    ss = sid.polarity_scores(sentence)
    # Compare the compound score with zero and bump the matching counter
    if ss["compound"] == 0.0:
        res["equal"] += 1
    elif ss["compound"] > 0.0:
        res["greater"] += 1
    else:
        res["less"] += 1
print(res)

2016-09-29 12:00:44

I think this is very pythonic. After all, Python is all about being simple and readable! A simple problem doesn't need a complicated solution. – lenz

@lenz I completely agree. But this being Python, the for loop could be written in 3 lines of code (at least at first glance). –

Thanks, I think this is the simplest approach, and it works perfectly! –
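For what it's worth, here is one guess at what the shorter loop mentioned in the comments could look like. It is only a sketch: the commenter did not post code, so the chained conditional expression is an assumption, and the compound values are hard-coded from the sample results in the question so that it runs without NLTK.

```python
# Compound scores copied from the three sample results in the question.
compound_scores = [0.3612, -0.4019, 0.0]

res = {"greater": 0, "less": 0, "equal": 0}
for compound in compound_scores:
    # Pick the dictionary key with a chained conditional expression
    key = "greater" if compound > 0 else "less" if compound < 0 else "equal"
    res[key] += 1

print(res)  # {'greater': 1, 'less': 1, 'equal': 1}
```

It is shorter, but arguably harder to read than the explicit if/elif/else version.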

You can use a simple counter for each class:

positive, negative, neutral = 0, 0, 0

Then, inside the sentence loop, test the compound value and increment the corresponding counter:

...
    if ss['compound'] > 0:
        positive += 1
    elif ss['compound'] == 0:
        neutral += 1
    elif ...

etc.
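Put together, the three-counter approach might look like the sketch below. The last branch is my assumption about how the elided part continues (anything that is neither positive nor zero counts as negative), and the compound values are hard-coded from the question's sample results rather than taken from `sid.polarity_scores`.

```python
# Compound scores copied from the three sample results in the question.
compound_scores = [0.3612, -0.4019, 0.0]

positive, negative, neutral = 0, 0, 0
for compound in compound_scores:
    if compound > 0:
        positive += 1
    elif compound == 0:
        neutral += 1
    else:
        # Assumed final branch: the score is below zero
        negative += 1

print(positive, negative, neutral)  # 1 1 1
```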

2016-09-29 11:58:38 lenz

I would probably write a function that returns the type of inequality represented by a document:

def inequality_type(val):
    # Classify a score relative to zero
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"

Then use it on the compound score of every sentence to increment the count for the corresponding inequality type.

from collections import defaultdict

def count_sentiments(sentences):
    # Assumes `sid = SentimentIntensityAnalyzer()` was created as in the question
    # Create a dictionary with values defaulted to 0
    counts = defaultdict(int)

    # Create a polarity score for each sentence
    for score in map(sid.polarity_scores, sentences):
        # Increment the dictionary entry for that inequality type
        counts[inequality_type(score["compound"])] += 1

    return counts

You can then call it on your filtered lines.
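As a rough usage sketch, the call could look like this. To keep the example runnable without NLTK, `count_sentiments` takes the scoring function as an extra `score_fn` parameter here (my change, not part of the answer above), and `fake_polarity_scores` is a hypothetical stand-in that looks up compound values copied from the question's sample results.

```python
from collections import defaultdict


def inequality_type(val):
    # Classify a score relative to zero
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"


def count_sentiments(sentences, score_fn):
    # `score_fn` is an assumption made for this sketch; with NLTK you
    # would pass sid.polarity_scores here.
    counts = defaultdict(int)
    for score in map(score_fn, sentences):
        counts[inequality_type(score["compound"])] += 1
    return counts


# Hypothetical stand-in for sid.polarity_scores: compound values copied
# from two of the sample results in the question.
SAMPLE_SCORES = {
    "Adobe lose 135bn to piracy Report": {"compound": -0.4019},
    "Samsung Galaxy Nexus announced": {"compound": 0.0},
}


def fake_polarity_scores(sentence):
    return SAMPLE_SCORES[sentence]


filtered_lines2 = list(SAMPLE_SCORES)
counts = count_sentiments(filtered_lines2, fake_polarity_scores)
print(dict(counts))
```

With the real analyzer, the call would be `count_sentiments(filtered_lines2, sid.polarity_scores)` in this variant.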

However, the explicit counting can be left out entirely by just using collections.Counter:

from collections import Counter

def count_sentiments(sentences):
    # Count the inequality type for each score in the sentences' polarity scores
    # (assumes `sid` and `inequality_type` are defined as above)
    return Counter(inequality_type(score["compound"])
                   for score in map(sid.polarity_scores, sentences))
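One thing worth knowing about the `Counter` version: a category that never occurs is simply absent from the result, but looking it up still gives 0 instead of raising `KeyError`. A small sketch, using compound values hard-coded from the question's sample results instead of real NLTK output:

```python
from collections import Counter


def inequality_type(val):
    # Classify a score relative to zero
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"


# Compound scores copied from the three sample results in the question.
compound_scores = [0.3612, -0.4019, 0.0]
counts = Counter(inequality_type(c) for c in compound_scores)
print(counts["greater"], counts["less"], counts["equal"])  # 1 1 1

# Made-up scores with no negative or zero entries: missing keys read as 0
only_positive = Counter(inequality_type(c) for c in [0.5, 0.2])
print(only_positive["less"])  # 0
```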

2016-09-29 12:13:55 erip

`collections.Counter` makes the second step trivial. – alexis

@alexis Yes, very good point! Will add that. – erip

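@erip Thank you very much. It works great! But I think Alex's solution is easier to understand and use for someone like me who is taking their first steps in coding. –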
