2017-04-10

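I have a large dataset similar to the one in the example below. How can I make the values in one column unique based on the values in other columns in R?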

df <- structure(list(FCN = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 
3L), .Label = c("010.X91116.3D3.A8", "010.X91116.6B7.F9", "010.X91116.6C6.C12" 
), class = "factor"), DOM = structure(c(1L, 2L, 2L, 1L, 2L, 1L, 
2L, 2L), .Label = c("VH", "VK"), class = "factor"), FN = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "OM", class = "factor"), 
    RV = c(49257.4, 23571.2, 24115.6, 49351.4, 24102.6, 49641.8, 
    23226.2, 23408.2)), .Names = c("FCN", "DOM", "FN", "RV"), class = "data.frame", row.names = c(NA, 
-8L)) 

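I want to create a table in which the FN values are made unique by appending a suffix, based on the values in FCN, RV and DOM. I can do this with a for loop over the rows, but that takes too long when there are thousands of rows.

Finally, I want to pivot the data so that the values in FN become the columns and the values in RV become the cell values. If possible, please show me how to achieve this elegantly with library functions.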

library(reshape2) 
pivot_df <- dcast(df, FCN + DOM ~ FN) 

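The problem I am facing is how to add a sequential suffix to the FN column. In the end, I want to use the reshape2 function dcast(df, FCN + DOM ~ FN) to pivot the data so that the values in FN become the columns and the values in RV become the cell values. – RanonKahn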


I posted an updated solution using 'dcast'. Please check. – akrun


That's fine, no problem, but the 'dcast' in data.table is optimized for efficiency. – akrun

Answers

Following @akrun's suggestion:

library(reshape2) 
df <- structure(list(FCN = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L), .Label = c("010.X91116.3D3.A8", "010.X91116.6B7.F9", "010.X91116.6C6.C12"), class = "factor"), DOM = structure(c(1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L), .Label = c("VH", "VK"), class = "factor"), FN = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "OM", class = "factor"), RV = c(49257.4, 23571.2, 24115.6, 49351.4, 24102.6, 49641.8, 23226.2, 23408.2)), .Names = c("FCN", "DOM", "FN", "RV"), class = "data.frame", row.names = c(NA, -8L)) 
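# Append each row's position within its FCN/DOM group to FN (OM1, OM2, ...)
df$FN <- with(df, paste0(FN, ave(seq_along(FN), FCN, DOM, FUN = seq_along))) 
# Name the value column explicitly so dcast does not have to guess it
pivot_df <- dcast(df, FCN + DOM ~ FN, value.var = "RV") 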

We can do this with ave:

df$FN <- with(df, paste0(FN, ave(seq_along(FN), FCN, DOM, FUN = seq_along))) 
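
To see what that call produces, here is a minimal sketch on a cut-down copy of the sample data. The names toy and idx and the shortened FCN labels are made up for illustration. ave applies seq_along within each FCN/DOM group, so every row gets its position inside its own group, and paste0 glues that position onto FN.

```r
# Hypothetical, shortened copy of the question's sample data
toy <- data.frame(
  FCN = c("A8", "A8", "A8", "F9", "F9"),
  DOM = c("VH", "VK", "VK", "VH", "VK"),
  FN  = "OM",
  RV  = c(49257.4, 23571.2, 24115.6, 49351.4, 24102.6),
  stringsAsFactors = FALSE
)

# Position of each row within its FCN/DOM group
idx <- with(toy, ave(seq_along(FN), FCN, DOM, FUN = seq_along))

# Append the position to FN so that FN is unique within each group
toy$FN <- paste0(toy$FN, idx)
toy$FN
# "OM1" "OM1" "OM2" "OM1" "OM1"
```

Only the second A8/VK row gets the suffix 2, because it is the only row that repeats an FCN/DOM combination.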

If we need to reshape to 'wide' format, dcast from data.table can do it together with rowid:

library(data.table) 
# rowid(FCN, DOM) numbers the rows within each FCN/DOM group
dcast(setDT(df), FCN + DOM ~ FN + rowid(FCN, DOM), value.var = "RV") 
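
If neither reshape2 nor data.table is available, a comparable pivot can be sketched with base R's reshape. This is a swapped-in alternative, not the method from the answer, and it again uses a made-up, shortened data frame named toy in which FN already carries its suffix.

```r
# Hypothetical, shortened data with FN already made unique per FCN/DOM group
toy <- data.frame(
  FCN = c("A8", "A8", "A8", "F9", "F9"),
  DOM = c("VH", "VK", "VK", "VH", "VK"),
  FN  = c("OM1", "OM1", "OM2", "OM1", "OM1"),
  RV  = c(49257.4, 23571.2, 24115.6, 49351.4, 24102.6),
  stringsAsFactors = FALSE
)

# One row per FCN/DOM pair, one RV column per distinct FN value
wide <- reshape(toy, idvar = c("FCN", "DOM"), timevar = "FN",
                direction = "wide")
names(wide)
```

The value columns come out as RV.OM1 and RV.OM2, and groups without a second row get NA in RV.OM2.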

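I adjusted the column order, adopted the suggestion for adding a sequential suffix to the FN values, and got the result I wanted. Thank you very much. – RanonKahn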
