2013-01-21
3

How do I transpose/cast two variables into a single row? I need to convert this:

id | amount | day 
--------------------- 
A | 10 | 0 
A | 54 | 8 
A | 23 | 18 
A | 43 | 28 
A | 87 | 51 
B | 34 | 0 
B | 76 | 1 
B | 12 | 7 

into this:

id | a1 | a2 | a3 | a4 | a5 | d1 | d2 | d3 | d4 | d5 
-------------------------------------------------------- 
A | 10 | 54 | 23 | 43 | 87 | 0 | 8 | 18 | 28 | 51 
B | 34 | 76 | 12 | 0 | 0 | 0 | 1 | 7 | 0 | 0 

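That is, transpose/cast the rows of df by id into an unknown number of columns, putting zeros where values are missing because the groups have unequal lengths.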

I have tried:

df <- data.frame(id=c('A','A','A','A','A','B','B','B'), amount=c(10,54,23,43,87,34,76,12), day=c(0,8,18,28,51,0,1,7)) 
library(reshape2) 
x <- dcast(df, id ~ day, mean, value = 'amount') 

But it isn't quite right. How can I do this?

Answers

4

Using base R's reshape() means introducing a "time" variable:

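# Number the rows within each id (1, 2, 3, ...) to serve as the "time" index.
# seq_along(df$id) is the input vector, so the ids need no as.numeric() coercion.
df$time <- ave(seq_along(df$id), df$id, FUN = seq_along)
df
#   id amount day time
# 1  A     10   0    1
# 2  A     54   8    2
# 3  A     23  18    3
# 4  A     43  28    4
# 5  A     87  51    5
# 6  B     34   0    1
# 7  B     76   1    2
# 8  B     12   7    3
# Store the wide result as df2 so the optional steps can work on it.
df2 <- reshape(df, direction = "wide", idvar = "id", timevar = "time")
df2
#   id amount.1 day.1 amount.2 day.2 amount.3 day.3 amount.4 day.4 amount.5 day.5
# 1  A       10     0       54     8       23    18       43    28       87    51
# 6  B       34     0       76     1       12     7       NA    NA       NA    NA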

Optional steps:

  1. Reorder the columns:

    df2 <- df2[c("id", 
          grep("amount", names(df2), value=TRUE), 
          grep("day", names(df2), value = TRUE))] 
    
  2. Replace NA with 0:

    df2[is.na(df2)] <- 0 
    df2 
    #   id amount.1 amount.2 amount.3 amount.4 amount.5 day.1 day.2 day.3 day.4 day.5
    # 1  A       10       54       23       43       87     0     8    18    28    51
    # 6  B       34       76       12        0        0     0     1     7     0     0
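
For convenience, the base R steps above can be put together in one self-contained sketch. The names wide, amount_cols and day_cols are illustrative and not part of the original answer, and the counter is built with ave(seq_along(df$id), ...), which is one of several ways to number rows within a group.

```r
# Sample data from the question.
df <- data.frame(
  id     = c("A", "A", "A", "A", "A", "B", "B", "B"),
  amount = c(10, 54, 23, 43, 87, 34, 76, 12),
  day    = c(0, 8, 18, 28, 51, 0, 1, 7)
)

# Within-id row counter, used as the "time" variable for reshape().
df$time <- ave(seq_along(df$id), df$id, FUN = seq_along)

# One row per id, with one amount.k/day.k pair per counter value.
wide <- reshape(df, direction = "wide", idvar = "id", timevar = "time")

# Put all amount columns before all day columns.
amount_cols <- grep("^amount\\.", names(wide), value = TRUE)
day_cols    <- grep("^day\\.", names(wide), value = TRUE)
wide <- wide[c("id", amount_cols, day_cols)]

# Pad the shorter group with zeros instead of NA.
wide[is.na(wide)] <- 0
wide
```

Because the number of amount.k/day.k columns comes from the longest group, nothing here depends on knowing the group sizes in advance.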
    
+0

+1 Nice, this works well! – agstudy

+0

@agstudy, thanks. I think the 'as.numeric' part isn't actually necessary. – A5C1D2H2I1M1N2O1R2T1

3

I would create a new variable dd:

# Number the rows within each id; this assumes the rows are already grouped by id.
df$dd <- unlist(by(df$id, df$id, FUN = function(x) seq(1, length(x))))

df
  id amount day dd
1  A     10   0  1
2  A     54   8  2
3  A     23  18  3
4  A     43  28  4
5  A     87  51  5
6  B     34   0  1
7  B     76   1  2
8  B     12   7  3

mm <- melt(df, id.vars = c('id', 'dd'), measure.vars = c('amount', 'day'))
dcast(mm, id ~ variable + dd, fun.aggregate = mean)
  id amount_1 amount_2 amount_3 amount_4 amount_5 day_1 day_2 day_3 day_4 day_5
1  A       10       54       23       43       87     0     8    18    28    51
2  B       34       76       12      NaN      NaN     0     1     7   NaN   NaN

Edit: to get a nicer output with 0 in place of NaN, I replace mean with a custom function:

dcast(mm, id ~ variable + dd,
      fun.aggregate = function(x) ifelse(is.nan(mean(x)), 0, mean(x)))
  id amount_1 amount_2 amount_3 amount_4 amount_5 day_1 day_2 day_3 day_4 day_5
1  A       10       54       23       43       87     0     8    18    28    51
2  B       34       76       12        0        0     0     1     7     0     0
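
A possible simplification, offered as a sketch rather than as the answerer's method: since each id/variable/dd combination holds at most one value here, no aggregation function is needed, and dcast()'s fill argument can supply the zeros directly. The counter below is built with ave() instead of by(), and the name wide is illustrative.

```r
library(reshape2)

# Sample data from the question.
df <- data.frame(
  id     = c("A", "A", "A", "A", "A", "B", "B", "B"),
  amount = c(10, 54, 23, 43, 87, 34, 76, 12),
  day    = c(0, 8, 18, 28, 51, 0, 1, 7)
)

# Within-id row counter (ave() keeps the original row order, so the data
# does not have to be grouped by id beforehand).
df$dd <- ave(seq_along(df$id), df$id, FUN = seq_along)

# Long format: one row per id/dd/variable.
mm <- melt(df, id.vars = c("id", "dd"), measure.vars = c("amount", "day"))

# Each cell has at most one value, so no fun.aggregate is required;
# fill = 0 pads the combinations that do not exist for the shorter id.
wide <- dcast(mm, id ~ variable + dd, value.var = "value", fill = 0)
wide
```

If some id/variable/dd combination could contain several values, an aggregation function such as mean would still be needed alongside fill.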
+0

You beat me to it this time! – A5C1D2H2I1M1N2O1R2T1

+0

Maybe... but your way of creating the new variable is more elegant... – agstudy
