2017-07-29

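R: dplyr gives a strange data structure only when grouping by multiple columns

When I group by several columns and summarize several columns in dplyr, I get a strange data structure. My real data frame is large and the oddity is much more pronounced there, but the small example below reproduces the issue.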

Grouping by a single column works as expected:

library(dplyr) 
df <- data.frame(A = c(1,1,2,2), B = c(1,1,2,2), C = c(10,20,30,40), D = c(1000,2000,3000,4000)) 
df %>% group_by(A) %>% summarize(C = sum(C),D = sum(D)) %>% str() 
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':  2 obs. of 3 variables: 
$ A: num 1 2 
$ C: num 30 70 
$ D: num 3000 7000 

But what is this? Grouping by two columns leaves a grouped_df with extra attributes:

df %>% group_by(A,B) %>% summarize(C = sum(C),D = sum(D)) %>% str() 
Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 2 obs. of 4 variables: 
$ A: num 1 2 
$ B: num 1 2 
$ C: num 30 70 
$ D: num 3000 7000 
- attr(*, "vars")=List of 1 
    ..$ : symbol A 
- attr(*, "drop")= logi TRUE 

Answer

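group_by creates some additional attributes. summarise drops only the last grouping level, so after group_by(A, B) the result is still grouped by A, which is why the grouped_df class and the "vars" attribute remain. If we don't need those attributes, calling ungroup after summarise is an option: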

df %>% 
    group_by(A, B) %>% 
    summarize(C = sum(C),D = sum(D)) %>% 
    ungroup() %>% 
    str() 
#Classes ‘tbl_df’, ‘tbl’ and 'data.frame':  2 obs. of 4 variables: 
# $ A: num 1 2 
# $ B: num 1 2 
# $ C: num 30 70 
# $ D: num 3000 7000 
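
To check which grouping is left at each step without reading the full str() output, something like the following sketch may help. It reuses the df from the question; the names still_grouped and plain are only illustrative, and it assumes a dplyr version that provides group_vars() and is_grouped_df() (0.7.0 or later, as far as I know).

```r
library(dplyr)

df <- data.frame(A = c(1, 1, 2, 2), B = c(1, 1, 2, 2),
                 C = c(10, 20, 30, 40), D = c(1000, 2000, 3000, 4000))

# summarize() peels off only the last grouping variable (B),
# so this result is still grouped by A.
still_grouped <- df %>%
    group_by(A, B) %>%
    summarize(C = sum(C), D = sum(D))
group_vars(still_grouped)
# [1] "A"

# After ungroup() no grouping variables remain.
plain <- still_grouped %>% ungroup()
group_vars(plain)
# character(0)
is_grouped_df(plain)
# [1] FALSE
```

In dplyr 1.0.0 and later, passing .groups = "drop" to summarize() should give the same ungrouped result in one step; that argument is not available in the dplyr version that produced the output above.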
Thanks! It also works on my very large data frame!
