Sum of the products within a variable

I have a dataset like this:

test <- 
    data.frame(
     variable = c("A","A","B","B","C","D","E","E","E","F","F","G"), 
     confidence = c(1,0.6,0.1,0.15,1,0.3,0.4,0.5,0.2,1,0.4,0.9),   
     freq  = c(2,2,2,2,1,1,3,3,3,2,2,1), 
     weight  = c(2,2,0,0,1,3,5,5,5,0,0,4) 
    ) 

> test 
    variable confidence freq weight 
1   A  1.00 2  2 
2   A  0.60 2  2 
3   B  0.10 2  0 
4   B  0.15 2  0 
5   C  1.00 1  1 
6   D  0.30 1  3 
7   E  0.40 3  5 
8   E  0.50 3  5 
9   E  0.20 3  5 
10  F  1.00 2  0 
11  F  0.40 2  0 
12  G  0.90 1  4

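For each variable, I want to compute the sum of weight times confidence, like this: SWC_i = Σ_j w_i · c_ij, where i is the variable (A, B, C, ...) and j runs over that variable's observations.

Expanding the formula above: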

w[1]c[1]+w[1]c[2]=2*1+2*0.6=3.2 
w[2]c[1]+w[2]c[2] 
w[3]c[3]+w[3]c[4] 
w[4]c[3]+w[4]c[4] 
w[5]c[5] 
w[6]c[6] 
w[7]c[7]+w[7]c[8]+w[7]c[9] 
w[8]c[7]+w[8]c[8]+w[8]c[9] 
w[9]c[7]+w[9]c[8]+w[9]c[9] 
…

The result should look like this:

> test 
    variable confidence freq weight SWC 
1   A  1.00 2  2 3.2 
2   A  0.60 2  2 3.2 
3   B  0.10 2  0 0.0 
4   B  0.15 2  0 0.0 
5   C  1.00 1  1 1.0 
6   D  0.30 1  3 0.9 
7   E  0.40 3  5 5.5 
8   E  0.50 3  5 5.5 
9   E  0.20 3  5 5.5 
10  F  1.00 2  0 0.0 
11  F  0.40 2  0 0.0 
12  G  0.90 1  4 3.6

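Note that each observation has a different confidence value, but all observations of a variable share the same weight, so the sum I need is identical for every observation of the same variable.

First, I tried to write a loop that iterates over each variable using its number of occurrences: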

> table(test$variable) 

A B C D E F G 
2 2 1 1 3 2 1

However, I couldn't get that to work. So I computed the position where each variable starts, to make the for loop iterate only over those values:

> tpos = cumsum(table(test$variable)) 
> tpos = tpos+1 
> tpos 
A B C D E F G 
3 5 6 7 10 12 13 
> tpos = shift(tpos, 1) 
> tpos 
[1] NA 3 5 6 7 10 12 
> tpos[1]=1 
> tpos 
[1] 1 3 5 6 7 10 12 

# tpos is a vector with the positions where each variable (A, B, C, ...) starts 

> tposn = c(1:nrow(test))[-tpos] 
> tposn 
[1] 2 4 8 9 11 
> c(1:nrow(test))[-tposn] 
[1] 1 3 5 6 7 10 12 

# then I came up with this loop, but it doesn't give the correct result 

for(i in 1:nrow(test)[-tposn]){ 
    a = test$freq[i]-1 
    test$SWC[i:i+a] = sum(test$weight[i]*test$confidence[i:i+a]) 
    }

Maybe there is a simpler way to do this? tapply?

2017-09-04 Hoju
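
For reference, the loop in the question seems to fail mainly because of operator precedence: `[` binds tighter than `:`, so `1:nrow(test)[-tposn]` subsets `nrow(test)` instead of `1:nrow(test)`, and `i:i+a` parses as `(i:i)+a`, a single index rather than a range. Below is a minimal sketch of a corrected loop. It assumes the rows are already sorted by variable and that freq equals the number of rows of each variable, as in the sample data; the names `starts` and `rows` are introduced here for illustration only.

```r
# Same data as in the question
test <- data.frame(
  variable   = c("A","A","B","B","C","D","E","E","E","F","F","G"),
  confidence = c(1,0.6,0.1,0.15,1,0.3,0.4,0.5,0.2,1,0.4,0.9),
  freq       = c(2,2,2,2,1,1,3,3,3,2,2,1),
  weight     = c(2,2,0,0,1,3,5,5,5,0,0,4)
)

# Index of the first row of each variable (assumes rows are sorted by variable)
starts <- which(!duplicated(test$variable))

test$SWC <- NA_real_
for (i in starts) {
  # Parentheses are required: i:i+a would be evaluated as (i:i) + a
  rows <- i:(i + test$freq[i] - 1)
  test$SWC[rows] <- sum(test$weight[rows] * test$confidence[rows])
}
```

This only illustrates where the loop goes wrong; a grouped computation is shorter and does not depend on row order.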

Using dplyr:

library(dplyr) 

test %>% 
    group_by(variable) %>% 
    mutate(SWC=sum(confidence*weight)) 

# A tibble: 12 x 5 
# Groups: variable [7] 
variable confidence freq weight SWC 
<fctr>  <dbl> <dbl> <dbl> <dbl> 
1  A  1.00  2  2 3.2 
2  A  0.60  2  2 3.2 
3  B  0.10  2  0 0.0 
4  B  0.15  2  0 0.0 
5  C  1.00  1  1 1.0 
6  D  0.30  1  3 0.9 
7  E  0.40  3  5 5.5 
8  E  0.50  3  5 5.5 
9  E  0.20  3  5 5.5 
10  F  1.00  2  0 0.0 
11  F  0.40  2  0 0.0 
12  G  0.90  1  4 3.6

2017-09-04 03:27:54 Wen

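With base R, 'ave(test, test$variable, FUN = function(x) sum(x['confidence'] * x['weight']))' –

Works great, thank you very much! But after running your code, the SWC output is not "saved" in the data frame (if I run 'test', it isn't there) – Hoju

^ I think I've solved it: I just added 'test <-' in front of your code. – Hoju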
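
As a sketch of the base R route raised in the comments and the tapply idea from the question, the same column can be built without dplyr by summing the row-wise products per variable. This is one possible formulation, not necessarily what the commenter intended; the names `sums` and `SWC_tapply` are made up for this example.

```r
# Same data as in the question
test <- data.frame(
  variable   = c("A","A","B","B","C","D","E","E","E","F","F","G"),
  confidence = c(1,0.6,0.1,0.15,1,0.3,0.4,0.5,0.2,1,0.4,0.9),
  freq       = c(2,2,2,2,1,1,3,3,3,2,2,1),
  weight     = c(2,2,0,0,1,3,5,5,5,0,0,4)
)

# Option 1: ave() returns a vector as long as its input,
# with each group's sum repeated on every row of that group
test$SWC <- ave(test$confidence * test$weight, test$variable, FUN = sum)

# Option 2: tapply() returns one sum per variable;
# indexing by name spreads those sums back over the rows
sums <- tapply(test$confidence * test$weight, test$variable, sum)
test$SWC_tapply <- as.numeric(sums[as.character(test$variable)])
```

Both options assign straight into test, so the new column is kept in the data frame without a separate 'test <-' step.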
