2015-09-11 44 views
2

Reshaping data frames into a panel data model

I have the following data sets:

df1 <- data.frame(country = c("A", "B","A","B"), year = c(2011,2011,2012,2012), variable_1= c(1,3,5,7))  

df2 <- data.frame(country = c("A", "B","A","B"), year = c(2011,2012,2012,2013), variable_2= c(2,4,6,8)) 

df3 <- data.frame(country = c("A", "C","C"), year = c(2011,2011,2013), variable_3= c(9,9,9)) 

I would like to reshape them into a panel data model, so that I get the following result:

df4 <- data.frame(country = c("A","A","A","B","B","B","C","C","C"), year = c(2011,2012,2013,2011,2012,2013,2011,2012,2013), variable_1 = c(1,5,NA,3,7,NA,NA,NA,NA), variable_2 = c(2,6,NA,NA,4,8,NA,NA,NA), variable_3 = c(9,NA,NA,NA,NA,NA,9,NA,9)) 

I searched for this, but the topic I found (Reshaping panel data) did not help me.

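Any ideas on how to do this? My real data set has thousands of rows ("countries"), several variables, several years, and NAs, so please take that into account.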

Answer

5

Try

library(tidyr) 
library(dplyr) 

Reduce(full_join, list(df1, df2, df3)) %>% 
    complete(country, year) 

which gives:

#Source: local data frame [9 x 5] 
# 
# country year variable_1 variable_2 variable_3 
# (chr) (dbl)  (dbl)  (dbl)  (dbl) 
#1  A 2011   1   2   9 
#2  A 2012   5   6   NA 
#3  A 2013   NA   NA   NA 
#4  B 2011   3   NA   NA 
#5  B 2012   7   4   NA 
#6  B 2013   NA   8   NA 
#7  C 2011   NA   NA   9 
#8  C 2012   NA   NA   NA 
#9  C 2013   NA   NA   9 
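
For a larger data set with several variables, it may be safer to name the join keys explicitly instead of relying on the default of joining on all shared column names. The following is only a sketch of the same approach under that assumption: the key columns are taken to be `country` and `year` in every data frame, and the helper name `join_by_keys` is made up for illustration.

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(country = c("A", "B", "A", "B"), year = c(2011, 2011, 2012, 2012), variable_1 = c(1, 3, 5, 7))
df2 <- data.frame(country = c("A", "B", "A", "B"), year = c(2011, 2012, 2012, 2013), variable_2 = c(2, 4, 6, 8))
df3 <- data.frame(country = c("A", "C", "C"), year = c(2011, 2011, 2013), variable_3 = c(9, 9, 9))

# Hypothetical helper: full join on the two key columns only, so that any
# other column names shared between data frames are not used as join keys.
join_by_keys <- function(x, y) full_join(x, y, by = c("country", "year"))

panel <- Reduce(join_by_keys, list(df1, df2, df3)) %>%
  complete(country, year) %>%   # add the missing country/year combinations as NA rows
  arrange(country, year)        # order the rows as a country-by-year panel
```

Because `Reduce` folds over a list, more data frames can be added to the list without changing the rest of the pipeline.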
+1

I only just discovered Reduce myself, and I'm glad you reminded me of it here. I think you can simply do 'Reduce(full_join, list(df1, df2, df3)) %>% complete(country, year)'. – jowalski

+0

You're right. It can be simplified. I'll edit accordingly. –

+1

Nice solution. A useful new function added to 'tidyr'. –