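I want to perform the following calculation with n-grams in R: compute word frequencies and the sum of the associated values.
Input: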
Column_A Column_B
Word_A 10
Word_A Word_B 20
Word_B Word_A 30
Word_A Word_B Word_C 40
Desired output:
Column_A1 Column_B1
Word_A 100 = 10+20+30+40
Word_B 90 = 20+30+40
Word_C 40 = 40
Word_A Word_B 90 = 20+30+40
Word_A Word_C 40 = 40
Word_B Word_C 40 = 40
Word_A Word_B Word_C 40 = 40
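The order of the words in the output does not matter, so Word_A Word_B = 90 = Word_B Word_A. Using the RWeka and tm libraries I can extract unigrams (single words), but I need n-grams for n = 1, 2, 3, and I need to compute Column_B1 for each of them.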
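To make the intended result concrete, here is a minimal base R sketch of the calculation as I understand it. It assumes the "n-grams" are unordered combinations of the distinct words in each row (which is what the Word_A Word_C = 40 line implies), not contiguous sequences. The helper name `word_combos` and the variable names are made up for illustration, and it does not use RWeka or tm.

```r
# Sample input from the question
df <- data.frame(
  Column_A = c("Word_A", "Word_A Word_B", "Word_B Word_A", "Word_A Word_B Word_C"),
  Column_B = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Hypothetical helper: all order-independent combinations of 1..max_n
# distinct words in one string. Sorting the words first makes
# "Word_B Word_A" and "Word_A Word_B" produce the same key.
word_combos <- function(text, max_n = 3) {
  words <- sort(unique(strsplit(trimws(text), " +")[[1]]))
  sizes <- seq_len(min(max_n, length(words)))
  unlist(lapply(sizes, function(k) combn(words, k, FUN = paste, collapse = " ")))
}

# One row per (combination, value) pair, then sum the values per combination
combos <- lapply(df$Column_A, word_combos)
long <- data.frame(
  Column_A1 = unlist(combos),
  Column_B1 = rep(df$Column_B, lengths(combos)),
  stringsAsFactors = FALSE
)
result <- aggregate(Column_B1 ~ Column_A1, data = long, FUN = sum)

# Order by number of words, then alphabetically, to mirror the desired output
result <- result[order(lengths(strsplit(result$Column_A1, " ")), result$Column_A1), ]
print(result)
```

On the sample input this reproduces the sums listed in the desired output, for example 100 for Word_A and 90 for Word_A Word_B. The number of combinations grows quickly with the number of distinct words per row, so this is only meant for short strings like the ones above.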