2016-06-29
0

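Can dplyr avoid using multiple merges?

I have a data frame "d" (below) with two columns, PCT1 and PCT2. I want to plot the weighted PCT1 and PCT2 for each group. This requires:

(1) computing the weighted PCT1 and weighted PCT2 for each group, which I currently do with two separate dplyr calls;
(2) combining the two results with rbind().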

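Is there a way to avoid calling dplyr twice and still produce the "result" data frame? In reality I have 10 columns, not 2, so I would have to call dplyr 10 times and then do something like: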

rbind(PCT1,PCT2,PCT3,PCT4, PCT5, ....,PCT10) 

Thanks.

d <- data.frame(group  = c("A", "A", "B", "B"),
                PCT1   = c(100, 50, 100, 50),
                PCT2   = c(50, 1, 10, 5),
                weight = c(99, 1, 100, 100))
d

  group PCT1 PCT2 weight
1     A  100   50     99
2     A   50    1      1
3     B  100   10    100
4     B   50    5    100

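library(dplyr)

# One grouped summary per PCT column, each tagged with its column number
PCT1 <- d %>% group_by(group) %>% summarise(vmean = weighted.mean(PCT1, weight))
PCT1$PCT <- 1
PCT2 <- d %>% group_by(group) %>% summarise(vmean = weighted.mean(PCT2, weight))
PCT2$PCT <- 2
# Stack the per-column summaries into one data frame
result <- rbind(PCT1, PCT2)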

result

  group vmean PCT
1     A 99.50   1
2     B 75.00   1
3     A 49.51   2
4     B  7.50   2
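
As a sanity check on the values above, the weighted means can be reproduced by hand from the definition sum(x * w) / sum(w). This is only a minimal sketch; the variable names (a_pct1 and so on) are made up for illustration and are not part of the original question.

```r
# Hand-computed weighted means: sum(x * w) / sum(w)
a_pct1 <- (100 * 99  + 50 * 1)   / (99 + 1)     # group A, PCT1
b_pct1 <- (100 * 100 + 50 * 100) / (100 + 100)  # group B, PCT1
a_pct2 <- (50 * 99   + 1 * 1)    / (99 + 1)     # group A, PCT2
b_pct2 <- (10 * 100  + 5 * 100)  / (100 + 100)  # group B, PCT2

# weighted.mean() from base R (stats) should agree with the hand computation
stopifnot(isTRUE(all.equal(a_pct1, weighted.mean(c(100, 50), c(99, 1)))))
stopifnot(isTRUE(all.equal(a_pct2, weighted.mean(c(50, 1), c(99, 1)))))

c(a_pct1, b_pct1, a_pct2, b_pct2)
```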

Answers

3

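You just need to melt your data frame further, so that all the PCT columns are stacked into one long column before grouping: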

library(dplyr)
library(tidyr)

d <- data.frame(group  = c("A", "A", "B", "B"),
                PCT1   = c(100, 50, 100, 50),
                PCT2   = c(50, 1, 10, 5),
                weight = c(99, 1, 100, 100))

d %>%
  # Stack PCT1 and PCT2 into a key column (PCT_GRP) and a value column (PCT)
  gather(key = PCT_GRP, value = PCT, PCT1:PCT2) %>%
  group_by(group, PCT_GRP) %>%
  summarise(vmean = weighted.mean(PCT, weight))
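
For the 10-column case mentioned in the question, the same reshape-then-summarise idea might look like the sketch below. It swaps gather() for pivot_longer(), which the tidyr documentation recommends in its place, and it assumes tidyr 1.0.0 or later and dplyr 1.0.0 or later (for the .groups argument). The starts_with("PCT") selection assumes every measure column name begins with "PCT" and that no other column does.

```r
library(dplyr)
library(tidyr)

d <- data.frame(group  = c("A", "A", "B", "B"),
                PCT1   = c(100, 50, 100, 50),
                PCT2   = c(50, 1, 10, 5),
                weight = c(99, 1, 100, 100))

result <- d %>%
  # Stack every column whose name starts with "PCT"; names_prefix strips
  # the "PCT" text so only the column number is left in the PCT column
  pivot_longer(cols = starts_with("PCT"),
               names_to = "PCT",
               names_prefix = "PCT",
               values_to = "value") %>%
  mutate(PCT = as.integer(PCT)) %>%
  group_by(PCT, group) %>%
  summarise(vmean = weighted.mean(value, weight), .groups = "drop") %>%
  # Same column order as the "result" data frame in the question
  select(group, vmean, PCT)

result
```

Because the columns are selected by name pattern, adding PCT3 through PCT10 to the data frame should not require any change to the pipeline.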

When I try to install tidyr, the console just stalls after printing this last message: trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.2/tidyr_0.5.1.zip' Content type 'application/zip' length 789503 bytes (770 KB) downloaded 770 KB – user3022875

@user3022875 Try a different CRAN mirror. Otherwise, it looks like there is a problem with your network connection. – joran

1

Another option is data.table:

library(data.table)
melt(setDT(d), measure.vars = c("PCT1", "PCT2"), variable.name = "PCT_GRP")[,
     .(vmean = weighted.mean(value, weight)), by = .(group, PCT_GRP)]
#    group PCT_GRP vmean
# 1:     A    PCT1 99.50
# 2:     B    PCT1 75.00
# 3:     A    PCT2 49.51
# 4:     B    PCT2  7.50
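
With 10 PCT columns, typing out the measure columns gets tedious. One possible variation, sketched here under the assumption that all measure columns start with "PCT", collects the names with grep() first. The names pct_cols, long and result_dt are illustrative only.

```r
library(data.table)

d <- data.table(group  = c("A", "A", "B", "B"),
                PCT1   = c(100, 50, 100, 50),
                PCT2   = c(50, 1, 10, 5),
                weight = c(99, 1, 100, 100))

# Collect the measure columns by name pattern instead of listing them
pct_cols <- grep("^PCT", names(d), value = TRUE)

# Reshape to long format: one row per (group, weight, PCT column) value
long <- melt(d,
             id.vars = c("group", "weight"),
             measure.vars = pct_cols,
             variable.name = "PCT_GRP")

# Weighted mean of each PCT column within each group
result_dt <- long[, .(vmean = weighted.mean(value, weight)),
                  by = .(group, PCT_GRP)]
result_dt
```

Note that setDT(d), used in the answer above, converts the data frame to a data.table by reference. This sketch builds a data.table directly instead, so no existing data frame is modified.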