2017-03-26 11 views
0

Conditionally add values across rows in a df like the one below

df <- data.frame(
    name = rep(c("A", "B", "C"),2), 
    type = c("10", "10", "10","20", "20", "20"), 
    val = c(1,2,3,4,5,6) 
) 
> df 
    name type val 
1 A 10 1 
2 B 10 2 
3 C 10 3 
4 A 20 4 
5 B 20 5 
6 C 20 6 
> 

The expected output is

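For each type, I need to add the val of the record with name A to the val of the record with name C and store the sum under a new name, AC. I need the output in two forms: one that keeps the original A and C rows, and one without them.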

Output 1

name type val 
1 A 10 1 
2 B 10 2 
3 C 10 3 
4 AC 10 4 
5 A 20 4 
6 B 20 5 
7 C 20 6 
8 AC 20 10 

Output 2

name type val 
1 AC 10 4 
2 B 10 2 
4 AC 20 10 
5 B 20 5 
> 

A dplyr-based solution is preferred.

Answers

3

Here is one way,

library(dplyr) 

df %>% 
    mutate(new = as.integer(name %in% c('A', 'C'))) %>% 
    group_by(type, new) %>% 
    summarise(name = paste0(name, collapse = ''), val = sum(val)) %>% 
    ungroup() %>% 
    select(-new) 

# A tibble: 4 × 3 
# type name val 
# <fctr> <chr> <dbl> 
#1  10  B  2 
#2  10 AC  4 
#3  20  B  5 
#4  20 AC 10 

To get the other output (keeping the original A and C rows),

df %>% 
    mutate(new = as.integer(name %in% c('A', 'C'))) %>% 
    group_by(type, new) %>% 
    summarise(name = paste0(name, collapse = ''), val = sum(val)) %>% 
    ungroup() %>% 
    select(-new) %>% 
    filter(nchar(name) > 1) %>% 
    bind_rows(df) %>% 
    arrange(type, val)   # sort within each type; sorting by val alone relies on tie order

# A tibble: 8 × 3 
# type name val 
# <fctr> <chr> <dbl> 
#1  10  A  1 
#2  10  B  2 
#3  10  C  3 
#4  10 AC  4 
#5  20  A  4 
#6  20  B  5 
#7  20  C  6 
#8  20 AC 10 
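
Both outputs can also be produced from one function. The following is only a minimal sketch, not part of the answer above: `add_combined()` and its parameters `parts`, `label` and `keep` are hypothetical names introduced here for illustration, and the data frame is rebuilt with character columns so that the block runs on its own.

```r
library(dplyr)

# Same data as in the question, with character (not factor) columns.
df <- data.frame(
  name = rep(c("A", "B", "C"), 2),
  type = c("10", "10", "10", "20", "20", "20"),
  val  = c(1, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: per `type`, sum `val` over the rows whose `name`
# is in `parts` and label that sum with `label`.
add_combined <- function(data, parts = c("A", "C"), label = "AC", keep = TRUE) {
  combined <- data %>%
    filter(name %in% parts) %>%
    group_by(type) %>%
    summarise(val = sum(val)) %>%
    mutate(name = label)

  # keep = TRUE retains the original rows (Output 1);
  # keep = FALSE drops the rows that were combined (Output 2).
  base <- if (keep) data else filter(data, !name %in% parts)

  bind_rows(base, combined) %>%
    select(name, type, val) %>%
    arrange(type, val)
}

out1 <- add_combined(df, keep = TRUE)   # 8 rows, A and C kept
out2 <- add_combined(df, keep = FALSE)  # 4 rows, only B and AC
```

Rows are sorted by `val` within each `type`, so in `out2` the B row comes before the AC row; the values match Output 2 in the question, only the row order differs.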

I should have asked this earlier, so I have updated the question: what if I need to keep the rows with name A and C? – user3206440

1

Here is another one (requires tidyr as well as dplyr):

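library(dplyr) 
library(tidyr) 

# One row per type with the A+C sum and the B value, then back to long form
df1 <- df %>% group_by(type) %>% 
       summarise(AC=sum(val[name %in% c("A","C")]),B=val[name=="B"]) %>% 
       gather(key=name,value=val,-type) %>% 
       arrange(type) 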
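
The same reshape can be written with `pivot_longer()`, which the tidyr documentation recommends over the superseded `gather()`. This is a sketch under a few assumptions rather than the answerer's code: the data frame is rebuilt with character columns so the block is self-contained, the intermediate name `wide` is introduced here, and B is aggregated with `sum()` so the summary still returns one row per type if a type has several B rows.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, with character (not factor) columns.
df <- data.frame(
  name = rep(c("A", "B", "C"), 2),
  type = c("10", "10", "10", "20", "20", "20"),
  val  = c(1, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

# One row per type, with one column for the A+C total and one for B.
wide <- df %>%
  group_by(type) %>%
  summarise(
    AC = sum(val[name %in% c("A", "C")]),
    B  = sum(val[name == "B"])  # sum() is an assumption; the answer uses val[name == "B"]
  )

# Back to long form: the column names become `name`, the cells become `val`.
df1 <- wide %>%
  pivot_longer(cols = -type, names_to = "name", values_to = "val") %>%
  arrange(type, name)
```

This gives the four rows of Output 2 (AC and B for each type) with the columns in the order `type`, `name`, `val`.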
1

Here is an option using data.table

library(data.table) 
rbindlist(list(df, setDT(df)[, .(name = "AC", val = sum(val[as.character(name) %chin% 
    c("A", "C")])) , .(type)][, names(df), with = FALSE]))[order(type, name)] 
# name type val 
#1: A 10 1 
#2: B 10 2 
#3: C 10 3 
#4: AC 10 4 
#5: A 20 4 
#6: B 20 5 
#7: C 20 6 
#8: AC 20 10 

Or with dplyr

library(dplyr) 
df %>% 
    filter(name %in% c("A", "C")) %>% 
    group_by(type) %>% 
    summarise(name = 'AC', val = sum(val)) %>% 
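    # dplyr joins take `by`, not `on`; join on all shared columns so the AC rows are appended
    full_join(df, ., by = c("name", "type", "val")) %>% 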
    arrange(type, val) 
# name type val 
#1 A 10 1 
#2 B 10 2 
#3 C 10 3 
#4 AC 10 4 
#5 A 20 4 
#6 B 20 5 
#7 C 20 6 
#8 AC 20 10