2016-02-23
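library(dplyr)

# Example data: one row per country/sport pair with a random medal count (0 to 3)
data <- expand.grid(country = c('Sweden', 'Norway', 'Denmark', 'Finland'),
                    sport = c('curling', 'crosscountry', 'downhill')) %>%
    mutate(medals = sample(0:3, 12, replace = TRUE))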

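With reshape2, dcast can do this in one line. Giving the margins custom names takes an extra step. The goal is to compute the same subtotals with dplyr and tidyr.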

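library(reshape2)
library(dplyr)  # provides %>%, rename() and mutate()

data %>%
    dcast(country ~ sport, fun.aggregate = sum, value.var = "medals", margins = TRUE) %>%
    # optional: rename the `(all)` margins to `Total`
    rename(Total = `(all)`) %>%
    # as.character() avoids ifelse() returning integer codes if country is a factor
    mutate(country = ifelse(country == "(all)", "Total", as.character(country)))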

My dplyr + tidyr approach is verbose. What is the best (compact and readable) way to write this with tidyr and dplyr?

library(dplyr) 
library(tidyr) 

data %>% 
    group_by(sport) %>% 
    summarise(medals = sum(medals)) %>% 
    mutate(country = 'Total') -> 
    sport_totals 

data %>% 
    group_by(country) %>% 
    summarise(medals = sum(medals)) %>% 
    mutate(sport = 'Total') -> 
    country_totals 

data %>% 
    summarise(medals = sum(medals)) %>% 
    mutate(sport = 'Total', 
     country = 'Total') -> 
    totals 

data %>% 
    bind_rows(country_totals, sport_totals, totals) %>% 
    spread(sport, medals) 

This is one of those basic things that is laughably easy in Excel and WAY (!) too time-consuming in R. I suggest taking a look at [rpivotTable](https://cran.r-project.org/web/packages/rpivotTable/vignettes/rpivotTableIntroduction.html) – Nettle

Answer


I don't know whether this is the best (compact, readable) way, but it works ;)

data %>% 
    spread(sport, medals) %>% 
    mutate(Total = rowSums(.[2:4])) %>% 
    rbind(., data.frame(country="Total", t(colSums(.[2:5])))) 

  country curling crosscountry downhill Total
1  Sweden       0            2        0     2
2  Norway       1            1        0     2
3 Denmark       2            2        1     5
4 Finland       3            0        2     5
5   Total       6            5        3    14
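
For comparison, here is a rough sketch of the same idea written with newer tidyverse verbs. It is not from the original answer and it assumes dplyr >= 1.0 (for `across()` and `where()`) and tidyr >= 1.0 (for `pivot_wider()`). The data set `medal_data` and its values are made up so the totals can be checked by hand.

```r
library(dplyr)
library(tidyr)

# Hypothetical, fixed example data (not the random data from the question).
# expand.grid varies country fastest, so the medal counts map to:
# Sweden/curling = 0, Norway/curling = 1, Sweden/downhill = 2, Norway/downhill = 3
medal_data <- expand.grid(country = c("Sweden", "Norway"),
                          sport = c("curling", "downhill"),
                          stringsAsFactors = FALSE) %>%
    mutate(medals = c(0, 1, 2, 3))

medal_table <- medal_data %>%
    # One column per sport
    pivot_wider(names_from = sport, values_from = medals) %>%
    # Row totals: sum over every numeric (sport) column
    mutate(Total = rowSums(select(., where(is.numeric)))) %>%
    # Column totals: a single summary row labelled "Total", appended at the bottom
    bind_rows(summarise(., across(where(is.numeric), sum), country = "Total"))

print(medal_table)
```

Selecting columns with `where(is.numeric)` instead of positions such as `.[2:4]` means the chain should not need editing when sports are added or removed, at the cost of requiring the newer package versions noted above.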