Counting word frequency

I am using the code below to count how often a word occurs in a given sentence:

wordCount = function(sentence,word){ 
splitedVectorString <- c() 
splitedVectorString <- strsplit(sentence," ") 
count <- 0 

for (j in splitedVectorString) { 
    print(length(splitedVectorString)) 
    print(splitedVectorString) 
    print(word) 
    if (identical(word,j)) { 
    count <- count + 1 
    print(count) 
    } 
} 
}

The program runs without errors, but the count I get is 0. I call the function from the console like this:

wordCount("This is XYZ and is running late","is")

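When I print the length of the split vector splitedVectorString, it gives me 1. Am I doing something wrong when splitting the sentence? I really don't know what is going wrong. I have only just started learning R programming.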

Source

2016-12-21 unflagged.destination

You may want to read [this question](http://stackoverflow.com/questions/7782113/counting-word-occurrences-in-r), which is almost the same (it works on a vector, but with 'strsplit' you can convert your sentence into a vector). – etienne

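I tried length(grep(word, sentence)), but I still get 1 as output. I checked the length of the split vector and it gave me 1. Why is the length of the split vector "splitedVectorString" 1? Because of that, the loop does not iterate over the whole vector and executes only once. –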

Use 'length(grep("\\<is\\>", strsplit("This is XYZ and is running late", " ")[[1]]))' – etienne

What you can do is:

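wordCount <- function(sentence, word) {
    # strsplit() returns a list; unlist() flattens it into a character vector of words
    splitedVectorString <- unlist(strsplit(sentence, " "))
    # == yields a logical vector; sum() counts the TRUE values
    count <- sum(word == splitedVectorString)
    count
}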

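You unlist the result of strsplit so that you have a vector containing all the words of the sentence (strsplit returns a list, which is why you get a length of 1!), and then you sum all the values that are equal to your word.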
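To see why the original loop ran only once, it may help to inspect what strsplit returns. The following is a minimal base R sketch; the variable names `sentence`, `pieces`, and `words` are only illustrative and do not come from the question.

```r
sentence <- "This is XYZ and is running late"

# strsplit() returns a list with one element per input string
pieces <- strsplit(sentence, " ")
class(pieces)    # "list"
length(pieces)   # 1, because only one sentence was passed in

# unlist() (or pieces[[1]]) gives the character vector of words
words <- unlist(pieces)
length(words)    # 7
```

A `for` loop over `pieces` therefore visits a single element, the whole vector of seven words, and `identical(word, j)` compares the word against that entire vector, which is never TRUE here.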

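The expression word == splitedVectorString returns a logical vector of the same length as splitedVectorString, with TRUE or FALSE for each element depending on whether that element is equal to the word.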
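As a rough illustration of that comparison, the sketch below applies it to the words of the example sentence. The names `words` and `matches` are made up for this example.

```r
words <- c("This", "is", "XYZ", "and", "is", "running", "late")

# Element-wise comparison: the single string "is" is recycled across words
matches <- "is" == words
matches
# [1] FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE

# TRUE counts as 1 and FALSE as 0, so sum() gives the number of matches
sum(matches)
# [1] 2
```

Note that the comparison is exact and case-sensitive, so "This" does not count as a match for "is".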

> wordCount("This is XYZ and is running late","is") 
[1] 2

Source

2016-12-21 10:26:17 User2321

It worked for me. Thank you!! –
