Calculating proportions by group from a data frame

I have word frequencies in a data frame like this:

df <- data.frame(
    Predictor = c("for","of","as","for","for","as","of","of","as","for"), 
    ToPredict = c("sure","course","much","him","keeps","far","them","this","an","petes"), 
    Freq = c(53,32,21,17,13,5,3,2,2,1))

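I want to compute a new column giving the proportion of its Predictor's total frequency that each ToPredict accounts for.

So, in the example above, the values of this new column would be: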

df$Props = c(0.631,0.865,0.75,0.202,0.155,0.179,0.081,0.054,0.071,0.012)

So far, I have a data frame of the per-group sums:

sums <- aggregate(df$Freq, by=list(Category=df$Predictor), FUN=sum)

and I have tried:

df$Props <- with(df, Freq/sums$x[which(sums$Category == Predictor)])

Obviously, this does not work, but I don't understand what is going wrong. Any help is much appreciated.
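A note on why the attempt above fails, with a minimal sketch of one possible fix. The expression sums$Category == Predictor compares a length-3 vector with a length-10 vector element by element, recycling the shorter one, so which() returns positions where the two happen to line up rather than each row's group. Assuming the goal is a per-row lookup of the group total, match() is one way to get it; the name idx is introduced here for illustration only.

```r
df <- data.frame(
    Predictor = c("for","of","as","for","for","as","of","of","as","for"),
    ToPredict = c("sure","course","much","him","keeps","far","them","this","an","petes"),
    Freq = c(53,32,21,17,13,5,3,2,2,1))
sums <- aggregate(df$Freq, by = list(Category = df$Predictor), FUN = sum)

# For each row of df, find the position of its Predictor in sums$Category
idx <- match(df$Predictor, sums$Category)

# Divide each frequency by the total of its own group
df$Props <- df$Freq / sums$x[idx]
round(df$Props, 3)
```

Because idx has one entry per row of df, sums$x[idx] is already aligned with df$Freq and no recycling is involved.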

Source

2017-02-16 davo1979

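I have a sneaking suspicion this is a duplicate question, but 'with(df, ave(Freq, Predictor, FUN=prop.table))' should do it. – thelatemail

Possible duplicate candidates, though the answers there aren't great - http://stackoverflow.com/questions/15009011/calculate-proportions-within-subsets-of-a-data-frame and http://stackoverflow.com/questions/26885819 – thelatemail

That is quite possible. However, I couldn't find an answer when searching. Your solution works, thanks! – davo1979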

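# Sum Freq within each Predictor group
a <- aggregate(df$Freq, by = list(df$Predictor), FUN = sum)
# Turn the sums into a vector named by group
a1 <- a[, 2]
names(a1) <- as.character(a[, 1])
# Look up each row's group total by name (not by factor code) and divide
df$Props <- df$Freq / a1[as.character(df$Predictor)]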

Source

2017-02-16 01:57:26 user31264

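This works too, and it is more intuitive to me (though I'd imagine it is slower, since it creates an extra vector). I can't accept my own answer (the one taken from thelatemail's comment), at least not right away, so I'll accept this one. – davo1979

Per thelatemail: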

with(df, ave(Freq, Predictor, FUN=prop.table))
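As a minimal sketch of how this one-liner can be used on the question's data: ave() applies FUN to Freq within each Predictor group and returns the results in the original row order, and prop.table() on a plain numeric vector divides it by its sum. The explicit function(x) x / sum(x) version and the name props_explicit are added here only for comparison.

```r
df <- data.frame(
    Predictor = c("for","of","as","for","for","as","of","of","as","for"),
    ToPredict = c("sure","course","much","him","keeps","far","them","this","an","petes"),
    Freq = c(53,32,21,17,13,5,3,2,2,1))

# Proportion of each row's Freq within its Predictor group
df$Props <- with(df, ave(Freq, Predictor, FUN = prop.table))

# Same calculation with the per-group function written out
props_explicit <- with(df, ave(Freq, Predictor, FUN = function(x) x / sum(x)))

# Sanity check: proportions within each group should sum to 1
tapply(df$Props, df$Predictor, sum)
```

Rounded to three decimals, df$Props should agree with the values the question lists as the desired output.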

Source

2017-02-16 01:40:01 davo1979
