2017-06-03 135 views
0

R data.table: reshaping data. I am using data.table to reshape my data from wide to long.

library(data.table) 
market <- data.table(
    stkcd=c(1,2), 
    type =c(1,0), 
    roa2013=c(2,3), 
    roa2014=c(4,5), 
    lev2013=c(6,7), 
    lev2016=c(8,9)) 
market 
#    stkcd type roa2013 roa2014 lev2013 lev2016
# 1:     1    1       2       4       6       8
# 2:     2    0       3       5       7       9
melt(market, 
    measure.vars = patterns("^roa", "^lev"), 
    variable.name = "year", 
    value.name = c("roa","lev")) 
#    stkcd type year roa lev
# 1:     1    1    1   2   6
# 2:     2    0    1   3   7
# 3:     1    1    2   4   8
# 4:     2    0    2   5   9

However, this is what the final data should look like:

#   stkcd type year roa lev
# 1     1    1 2013   2   6
# 2     1    1 2014   4  NA
# 3     1    1 2016  NA   8
# 4     2    0 2013   3   7
# 5     2    0 2014   5  NA
# 6     2    0 2016  NA   9

Does anyone know a good way to do this? Thanks.
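
One way to see why the melt() call above returns year values 1 and 2 instead of 2013/2014/2016: patterns() collects the matching columns for each pattern, and melt() then pairs them by position. The following is only a minimal sketch built on the market table from the question; roa_cols, lev_cols and pairing are illustrative names, not part of data.table.

```r
library(data.table)

market <- data.table(
  stkcd = c(1, 2), type = c(1, 0),
  roa2013 = c(2, 3), roa2014 = c(4, 5),
  lev2013 = c(6, 7), lev2016 = c(8, 9))

# Columns matched by each pattern, in their original order
roa_cols <- grep("^roa", names(market), value = TRUE)
lev_cols <- grep("^lev", names(market), value = TRUE)

# melt() pairs the k-th "roa" column with the k-th "lev" column,
# so the "year" column is just this position index
pairing <- data.table(year_index = seq_along(roa_cols),
                      roa = roa_cols, lev = lev_cols)
pairing
```

Index 2 pairs roa2014 with lev2016, which is why the positional index cannot simply be relabelled as a year when the two variables do not cover the same years.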

+0

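For naming the values of 'year', see [Convert numeric representation of 'variable' column to original string following melt using patterns](https://stackoverflow.com/questions/41883573/convert-numeric-representation-of-variable-column-to-original-string-following). – Henrik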

+0

Thanks. I will try reshape {stats}. – Cheng

Answers

0

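We can do this easily with splitstackshape. Create a delimiter between the non-numeric and numeric parts of the columns of interest, then use merged.stack to reshape to 'long' format and rename the `.time_1` column to 'year'.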

library(splitstackshape) 
names(market) <- sub("(\\d+)", "_\\1", names(market)) 
res <- merged.stack(market, var.stubs = c("roa", "lev"), sep="_") 
setnames(res, ".time_1", "year") 
res 
#   stkcd type year roa lev
#1:     1    1 2013   2   6
#2:     1    1 2014   4  NA
#3:     1    1 2016  NA   8
#4:     2    0 2013   3   7
#5:     2    0 2014   5  NA
#6:     2    0 2016  NA   9
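
To illustrate the renaming step in isolation: sub() inserts an underscore before the first run of digits in each name and leaves names without digits untouched, which gives merged.stack a separator to split on. A small sketch in base R only (old_names and new_names are illustrative names):

```r
# Column names of the wide table from the question
old_names <- c("stkcd", "type", "roa2013", "roa2014", "lev2013", "lev2016")

# Put "_" in front of the first run of digits; names without digits are unchanged
new_names <- sub("(\\d+)", "_\\1", old_names)
new_names
# "stkcd" "type" "roa_2013" "roa_2014" "lev_2013" "lev_2016"
```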
+0

Thanks. This is a good approach. – Cheng

+0

@Cheng Thanks for the comment. You can also check [here](https://stackoverflow.com/help/someone-answers) – akrun

0

1. Using reshape {stats}:

library(data.table) 
market <- data.table(
    stkcd=c(1,2), 
    type =c(1,0), 
    roa2013=c(2,3), 
    roa2014=c(4,5), 
    lev2013=c(6,7), 
    lev2016=c(8,9)) 

# reshape() needs every variable for every year, so add the absent columns as NA
market[, `:=`(roa2016 = NA_real_, lev2014 = NA_real_)]
long <- reshape(market, 
     idvar = "stkcd", 
     varying = c("roa2013","lev2013", 
        "roa2014","lev2014", 
        "roa2016","lev2016"), 
     sep = "", 
     timevar = "year", 
     direction = "long") 
setorder(long,stkcd,year) 
long 
#    stkcd type year roa lev
# 1:     1    1 2013   2   6
# 2:     1    1 2014   4  NA
# 3:     1    1 2016  NA   8
# 4:     2    0 2013   3   7
# 5:     2    0 2014   5  NA
# 6:     2    0 2016  NA   9
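
With more variables or years, typing the missing columns by hand gets tedious. A possible sketch for deriving them instead, assuming every measure column is named as a lowercase stub followed by a four-digit year (stubs, wanted and missing_cols are illustrative names):

```r
library(data.table)

market <- data.table(
  stkcd = c(1, 2), type = c(1, 0),
  roa2013 = c(2, 3), roa2014 = c(4, 5),
  lev2013 = c(6, 7), lev2016 = c(8, 9))

stubs <- c("roa", "lev")

# Measure columns: a stub followed by a four-digit year (naming assumption)
measure_cols <- grep("^(roa|lev)[0-9]{4}$", names(market), value = TRUE)
years <- sort(unique(sub("^[a-z]+", "", measure_cols)))

# Every stub/year combination that the long format needs
wanted <- as.vector(outer(stubs, years, paste0))
missing_cols <- setdiff(wanted, names(market))

# Add the absent columns as numeric NA (by reference)
if (length(missing_cols) > 0) {
  market[, (missing_cols) := NA_real_]
}
names(market)
```

The wanted vector could then be passed as the varying argument of reshape() in place of the hand-written list.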

2. Using str_extract {stringr}:

library(data.table) 
library(stringr) 
market <- data.table(
    stkcd=c(1,2), 
    type =c(1,0), 
    roa2013=c(2,3), 
    roa2014=c(4,5), 
    lev2013=c(6,7), 
    lev2016=c(8,9)) 
market 
long <- melt(market, 
      id.vars = c("stkcd","type")) 
long[,`:=`(year=str_extract(variable,pattern = "[0-9]{4}"), 
      vars=str_extract(variable,pattern = "[a-zA-Z]{1,}"))][,variable:=NULL] 
long <- dcast(long, stkcd + type + year ~ vars, value.var = "value") 
long 
#    stkcd type year lev roa
# 1:     1    1 2013   6   2
# 2:     1    1 2014  NA   4
# 3:     1    1 2016   8  NA
# 4:     2    0 2013   7   3
# 5:     2    0 2014  NA   5
# 6:     2    0 2016   9  NA
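
The same melt / split / dcast idea can probably be written without stringr, using base sub() to split the column name. This is only a sketch under the same naming assumption (letters followed by a four-digit year); it additionally stores year as an integer and restores the roa, lev column order. The name res is illustrative.

```r
library(data.table)

market <- data.table(
  stkcd = c(1, 2), type = c(1, 0),
  roa2013 = c(2, 3), roa2014 = c(4, 5),
  lev2013 = c(6, 7), lev2016 = c(8, 9))

# Fully long: one row per id and original column name
long <- melt(market, id.vars = c("stkcd", "type"), variable.factor = FALSE)

# Split e.g. "roa2013" into the stub "roa" and the integer year 2013
long[, `:=`(
  vars = sub("[0-9]{4}$", "", variable),
  year = as.integer(sub("^[a-zA-Z]+", "", variable)))]
long[, variable := NULL]

# Back to one column per stub; absent stub/year combinations become NA
res <- dcast(long, stkcd + type + year ~ vars, value.var = "value")
setcolorder(res, c("stkcd", "type", "year", "roa", "lev"))
res
```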

...