2017-02-19 23 views
0

Pivot and count members

I have a data frame in R that looks like this:

df1 <- data.frame(id = letters[seq(from = 1, to = 20)], 
        var1 = sample(1:5,20,replace=T), 
        var2 = sample(1:5,20,replace=T)) 

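Here var1 and var2 are categorical variables taking values between 1 and 5.

I want to create a matrix whose column headers are the var1 values 1 to 5 and whose row headers are the var2 values 1 to 5, with each cell holding the number of entries that fall into that combination.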

I tried reshape, looked at the lazyeval package, and found similar questions such as this one, but I could not get any of them to fit this requirement.

+0

Please add your expected output. Also, use `set.seed` to make the example reproducible.

+3

Try `table(df1[-1])`.
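The `table()` suggestion in the comment above can be sketched in base R as follows. This is a minimal sketch, not the commenter's exact code: the data values are hard-coded (copied from the `set.seed(1)` example printed further down) so the result does not depend on the random number generator, and the names `counts` and `count_matrix` are illustrative. Passing `var2` first puts it on the rows, as the question asks; `table(df1[-1])` would put `var1` on the rows instead.

```r
# Example data with fixed values, so the result is reproducible
df1 <- data.frame(
  id   = letters[1:20],
  var1 = c(2, 2, 3, 5, 2, 5, 5, 4, 4, 1, 2, 1, 4, 2, 4, 3, 4, 5, 2, 4),
  var2 = c(5, 2, 4, 1, 2, 2, 1, 2, 5, 2, 3, 3, 3, 1, 5, 4, 4, 1, 4, 3)
)

# Fixing the factor levels keeps all five categories as rows/columns,
# even if one of them never occurs in the sample
counts <- table(
  var2 = factor(df1$var2, levels = 1:5),
  var1 = factor(df1$var1, levels = 1:5)
)

# Dropping the "table" class leaves a plain integer matrix
count_matrix <- unclass(counts)
count_matrix
```

Setting the factor levels explicitly is a design choice for random samples: without it, a category that happens not to be drawn would silently disappear from the matrix.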

Answers

1

Use dcast from reshape2:

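library(magrittr)  # provides %>%
df1 %>% reshape2::dcast(var2 ~ var1, value.var = "id", fun.aggregate = length)
# equivalent call without the pipe; length() counts the rows in each cell
reshape2::dcast(df1, var2 ~ var1, value.var = "id", fun.aggregate = length)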
1
library(dplyr) 
library(tidyr) 

set.seed(1) 

df1 <- data.frame(id = letters[seq(from = 1, to = 20)], 
        var1 = sample(1:5,20,replace=T), 
        var2 = sample(1:5,20,replace=T)) 

df1 

# id var1 var2 
# 1 a 2 5 
# 2 b 2 2 
# 3 c 3 4 
# 4 d 5 1 
# 5 e 2 2 
# 6 f 5 2 
# 7 g 5 1 
# 8 h 4 2 
# 9 i 4 5 
# 10 j 1 2 
# 11 k 2 3 
# 12 l 1 3 
# 13 m 4 3 
# 14 n 2 1 
# 15 o 4 5 
# 16 p 3 4 
# 17 q 4 4 
# 18 r 5 1 
# 19 s 2 4 
# 20 t 4 3 


df1 %>% 
    count(var1,var2) %>%      # count how many times you have each combination 
    ungroup %>% 
    mutate(var1 = paste0("var1_",var1)) %>% # update variable values 
    spread(var1,n, fill=0) %>%    # reshape dataset 
    mutate(var2 = paste0("var2_",var2)) %>% # update variable values 
    print() -> df2 

# # A tibble: 5 × 6 
#  var2 var1_1 var1_2 var1_3 var1_4 var1_5 
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> 
# 1 var2_1  0  1  0  0  3 
# 2 var2_2  1  2  0  1  1 
# 3 var2_3  1  1  0  2  0 
# 4 var2_4  0  1  2  1  0 
# 5 var2_5  0  1  0  2  0 
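In current tidyr releases `spread()` is superseded by `pivot_wider()`, so the same reshaping could be sketched as below. This assumes tidyr 1.0.0 or later and dplyr are installed; the data values are hard-coded from the `df1` printed above, and the name `df2_wide` is illustrative rather than part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same values as the df1 printed above, hard-coded for reproducibility
df1 <- data.frame(
  id   = letters[1:20],
  var1 = c(2, 2, 3, 5, 2, 5, 5, 4, 4, 1, 2, 1, 4, 2, 4, 3, 4, 5, 2, 4),
  var2 = c(5, 2, 4, 1, 2, 2, 1, 2, 5, 2, 3, 3, 3, 1, 5, 4, 4, 1, 4, 3)
)

df2_wide <- df1 %>%
  count(var1, var2) %>%                  # one row per observed combination
  pivot_wider(
    names_from   = var1,                 # var1 values become columns
    values_from  = n,                    # cells hold the counts
    names_prefix = "var1_",
    values_fill  = list(n = 0L)          # combinations never seen get 0
  ) %>%
  arrange(var2) %>%                      # pivot_wider keeps first-seen row order
  mutate(var2 = paste0("var2_", var2))

df2_wide
```

Unlike `spread()`, `pivot_wider()` does not sort the rows, which is why the explicit `arrange(var2)` step is included here.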

If you would really prefer to have the var2 values as row names rather than as a column, add this:

df2 = data.frame(df2) 
row.names(df2) = df2$var2 
df2$var2 = NULL 

df2 

#  var1_1 var1_2 var1_3 var1_4 var1_5 
# var2_1  0  1  0  0  3 
# var2_2  1  2  0  1  1 
# var2_3  1  1  0  2  0 
# var2_4  0  1  2  1  0 
# var2_5  0  1  0  2  0
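Since the question asks for a matrix rather than a data frame, one further step could be `as.matrix()`. The sketch below uses a hard-coded stand-in for `df2` (values copied from the output above) so it runs on its own; the name `count_matrix` is illustrative.

```r
# Stand-in for df2 after the row-name step, values copied from the printed output
df2 <- data.frame(
  var1_1 = c(0, 1, 1, 0, 0),
  var1_2 = c(1, 2, 1, 1, 1),
  var1_3 = c(0, 0, 0, 2, 0),
  var1_4 = c(0, 1, 2, 1, 2),
  var1_5 = c(3, 1, 0, 0, 0),
  row.names = paste0("var2_", 1:5)
)

# All columns are numeric, so the result is a numeric matrix
# that keeps the row and column names
count_matrix <- as.matrix(df2)

# Cells can then be looked up by name
count_matrix["var2_1", "var1_5"]
```

Because the row names were set before the conversion, they carry over as the matrix's row dimnames.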