2013-10-22 85 views

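Reading multiple CSV rows into a single row in R

I have a CSV file with multiple rows per unique ID, and I need to reformat it so that each ID becomes a single row of a data frame. After reading the file in, I end up with this initial data frame: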

id week v1 v2 
01 week1 3 2 
01 week2 5 2 
01 week3 2 3 
02 week1 1 2 
02 week2 5 5 
03 week1 4 1 
03 week2 4 3 
03 week3 4 2 
[etc...] 

I want to pull all instances of v1 for a given ID, so I grab all the unique IDs

uniqid<-unique(data$id) 

, then iterate over them with i in 1:length(uniqid)

temp <- subset(data,data$id==uniqid[i]) 

and pull each week's data into a temporary variable

week1 <- temp$v1[temp$week == "week1"]

, so that I can use rbind

output <- rbind(output,data.frame(ID=uniqid[i],week1,week2,week3)) 

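My problem is that when the data frame is rebuilt for, say, id = 02, there is no week3, so rbind breaks. It seems the week3 variable is never created; it does not show up as NA. How can I test whether a variable has been created and set it to NA (or 0) so that rbind does not fail? Or is there a completely different, more efficient way to do this?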
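To make the failure concrete: for an ID with no week3 row, `temp$v1[temp$week == "week3"]` evaluates to a zero-length vector rather than NA, and `data.frame()` refuses to combine a length-1 column with a length-0 one. The following is only a minimal sketch of one way to guard against that inside the loop-style approach. The helper `value_or_na` is a hypothetical name introduced here, and the data frame is rebuilt by hand from the example rows above.

```r
# Example data copied from the rows shown above.
data <- data.frame(
  id   = c("01", "01", "01", "02", "02", "03", "03", "03"),
  week = c("week1", "week2", "week3", "week1", "week2",
           "week1", "week2", "week3"),
  v1   = c(3, 5, 2, 1, 5, 4, 4, 4),
  v2   = c(2, 2, 3, 2, 5, 1, 3, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: first matching value, or NA when nothing matched.
value_or_na <- function(x) if (length(x) == 0) NA_real_ else x[1]

# Build one single-row data frame per ID, then bind them all at once.
rows <- lapply(unique(data$id), function(this_id) {
  temp <- data[data$id == this_id, ]
  data.frame(
    ID    = this_id,
    week1 = value_or_na(temp$v1[temp$week == "week1"]),
    week2 = value_or_na(temp$v1[temp$week == "week2"]),
    week3 = value_or_na(temp$v1[temp$week == "week3"]),
    stringsAsFactors = FALSE
  )
})
output <- do.call(rbind, rows)
output
```

Collecting the rows in a list and calling `rbind` once also avoids growing `output` on every iteration.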

Answers

1

You can use the recast function from the reshape2 package.

DF 
## id week v1 v2 
## 1 1 week1 3 2 
## 2 1 week2 5 2 
## 3 1 week3 2 3 
## 4 2 week1 1 2 
## 5 2 week2 5 5 
## 6 3 week1 4 1 
## 7 3 week2 4 3 
## 8 3 week3 4 2 


require(reshape2) 
temp <- recast(DF, id ~ week, measure.var = "v1") 
result <- temp$data 
row.names(result) <- temp$labels[[1]]$id 
colnames(result) <- temp$labels[[2]]$week 
result 
## week1 week2 week3 
## 1  3  5  2 
## 2  1  5 NA 
## 3  4  4  4 

Or, as @AnandaMahto suggested, simply use dcast

dcast(DF, id ~ week, value.var = "v1") 
## id week1 week2 week3 
## 1 1  3  5  2 
## 2 2  1  5 NA 
## 3 3  4  4  4 
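For anyone who wants to reproduce the dcast result, here is a self-contained sketch. It assumes the reshape2 package is installed, and it rebuilds `DF` by hand from the printed rows above, so the column types are a guess.

```r
library(reshape2)

# Example data rebuilt from the printed DF (v2 is kept but not used below).
DF <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3, 3, 3),
  week = c("week1", "week2", "week3", "week1", "week2",
           "week1", "week2", "week3"),
  v1   = c(3, 5, 2, 1, 5, 4, 4, 4),
  v2   = c(2, 2, 3, 2, 5, 1, 3, 2),
  stringsAsFactors = FALSE
)

# One row per id, one column per week; id/week pairs with no row become NA.
wide <- dcast(DF, id ~ week, value.var = "v1")
wide
```

Because the data is already in long form, `value.var` picks the column to spread and no `melt` step is needed.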

Why `recast` rather than the more commonly used `dcast`? – A5C1D2H2I1M1N2O1R2T1

@AnandaMahto, because I learned `recast` first ;), and it doesn't require melting the data first. –

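I still don't follow your logic. This dataset is already in long form (no need to `melt`), and using `dcast` instead of `recast` solves the problem without the extra renaming steps: `dcast(DF, id ~ week, value.var = "v1")` – A5C1D2H2I1M1N2O1R2T1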

1

In base R, you can use reshape

> reshape(mydf, direction = "wide", idvar="id", timevar="week")
  id v1.week1 v2.week1 v1.week2 v2.week2 v1.week3 v2.week3
1  1        3        2        5        2        2        3
4  2        1        2        5        5       NA       NA
6  3        4        1        4        3        4        2

If you want to remove the "v2" columns from the output, you can either drop them before reshaping the data or drop them from within the function.

> reshape(mydf, direction = "wide", idvar="id", timevar="week", drop="v2")
  id v1.week1 v1.week2 v1.week3
1  1        3        5        2
4  2        1        5       NA
6  3        4        4        4
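A self-contained sketch of the base R route, for readers who want to run it without extra packages. `mydf` is rebuilt by hand from the example rows, and the last two lines (stripping the `v1.` prefix and resetting the row names) are an optional tidy-up added here, not part of the answer above.

```r
# Example data rebuilt from the rows shown in the question.
mydf <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3, 3, 3),
  week = c("week1", "week2", "week3", "week1", "week2",
           "week1", "week2", "week3"),
  v1   = c(3, 5, 2, 1, 5, 4, 4, 4),
  v2   = c(2, 2, 3, 2, 5, 1, 3, 2),
  stringsAsFactors = FALSE
)

# Long to wide: one row per id, one v1 column per week, v2 dropped.
wide <- reshape(mydf, direction = "wide", idvar = "id",
                timevar = "week", drop = "v2")

# Optional tidy-up: "v1.week1" -> "week1", and renumber the rows 1..n.
names(wide) <- sub("^v1\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

The missing week3 entry for id 2 comes out as NA without any explicit check.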