2012-04-30

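"Aggregating" non-numeric variables in reshape

I have a dataset in long format and want to convert it to wide format using reshape, or any preprocessing that has to happen before reshape. The difficulty is that the "value" variable is non-numeric. Note that the original data also contains legitimate duplicate records. The code below shows the layout of each dataset.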

id = c(1, 1, 1, 1, 1, 1, 1) 
month <- c("jan", "feb", "feb", "march", "april", "april", "april") 
stress <- c("mild", "mild", "high", "moderate", "mild", "high", "mild") 
Longdata <- data.frame(id, month, stress, stringsAsFactors = FALSE) 

This is the original long format:

> Longdata
  id month   stress
1  1   jan     mild
2  1   feb     mild
3  1   feb     high
4  1 march moderate
5  1 april     mild
6  1 april     high
7  1 april     mild

This is how I would like the data to be organized:

id <- c(1) 
jan <- c("mild") 
feb <- c("mild-high") 
march <- c("moderate") 
april <- c("mild-high-mild") 
widedata <- data.frame(id, jan, feb, march, april, stringsAsFactors = FALSE) 
> widedata
  id  jan       feb    march          april
1  1 mild mild-high moderate mild-high-mild

Answer

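You can do this in two steps: first collapse the duplicates with aggregate, then reshape with either base R's reshape or dcast from the "reshape2" package.

  1. The aggregation step: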

    Mediumdata <- aggregate(stress ~ id + month, Longdata, paste, collapse="-")
    Mediumdata
    #   id month         stress
    # 1  1 april mild-high-mild
    # 2  1   feb      mild-high
    # 3  1   jan           mild
    # 4  1 march       moderate
    
  2. The reshaping step:

    # Using base R reshape
    reshape(Mediumdata, direction="wide", idvar="id", timevar="month")
    #   id   stress.april stress.feb stress.jan stress.march
    # 1  1 mild-high-mild  mild-high       mild     moderate
    
    # Using `dcast` from "reshape2"
    library(reshape2)
    # R is case-sensitive: the object created above is `Mediumdata`
    dcast(Mediumdata, id ~ month, value.var="stress")
    #   id          april       feb  jan    march
    # 1  1 mild-high-mild mild-high mild moderate
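
Both variants return the month columns in alphabetical order, while the layout asked for in the question runs jan, feb, march, april. As a minimal base R sketch of one way to get there, the wide result can be renamed and reordered afterwards. The `month_order` vector and the `Wide` object are illustrative names that are not in the original post, and the order is assumed to be calendar order.

```r
# Rebuild the example data from the question
Longdata <- data.frame(
  id     = c(1, 1, 1, 1, 1, 1, 1),
  month  = c("jan", "feb", "feb", "march", "april", "april", "april"),
  stress = c("mild", "mild", "high", "moderate", "mild", "high", "mild"),
  stringsAsFactors = FALSE
)

# Assumption: the wanted column order is the calendar order of the months present
month_order <- c("jan", "feb", "march", "april")

# Step 1: collapse duplicate id/month rows into one hyphen-separated string
Mediumdata <- aggregate(stress ~ id + month, Longdata, paste, collapse = "-")

# Step 2: reshape to wide with base R
Wide <- reshape(Mediumdata, direction = "wide", idvar = "id", timevar = "month")

# Drop the "stress." prefix that reshape adds, then reorder the columns
names(Wide) <- sub("^stress\\.", "", names(Wide))
Wide <- Wide[, c("id", month_order)]
rownames(Wide) <- NULL
Wide
```

If the real data covers other months, `month_order` would need to list them all, since selecting a column name that does not exist raises an error.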