Reshaping a messy and unbalanced dataset from wide to long
I have a dataset (data) that looks like this:
ID,ABC.BC,ABC.PL,DEF.BC,DEF.M,GHI.PL
SB0005,C01,D20,C01a,C01b,D20
BC0013,C05,D5,C05a,NA,D5
I want to reshape it from wide to long format to get this:
ID,FC,Type,Var
SB0005,ABC,BC,C01
SB0005,ABC,PL,D20
SB0005,DEF,BC,C01a
SB0005,DEF,M,C01b
SB0005,GHI,PL,D20
BC0013,ABC,BC,C05
BC0013,ABC,PL,D5
BC0013,DEF,BC,C05a
# BC0013,DEF,M,NA (This row need not be in the dataset as I will remove it later)
BC0013,GHI,PL,D5
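For reference, a layout like the one above could be derived in base R by splitting each column name at the dot and stacking one block of rows per column. This is only a sketch under stated assumptions: the sample data is rebuilt with plain character columns and a real NA (the posted data uses factors), and the names value_cols and long are illustrative.

```r
# Sample data from the question, rebuilt as character columns.
# Assumption: the "NA" entry in DEF.M is a genuine missing value.
data <- data.frame(
  ID     = c("SB0005", "BC0013"),
  ABC.BC = c("C01", "C05"),
  ABC.PL = c("D20", "D5"),
  DEF.BC = c("C01a", "C05a"),
  DEF.M  = c("C01b", NA),
  GHI.PL = c("D20", "D5"),
  stringsAsFactors = FALSE
)

value_cols <- setdiff(names(data), "ID")

# One block of rows per value column: the part before the dot becomes FC,
# the part after it becomes Type, and the cell values become Var.
long <- do.call(rbind, lapply(value_cols, function(col) {
  parts <- strsplit(col, ".", fixed = TRUE)[[1]]
  data.frame(ID = data$ID, FC = parts[1], Type = parts[2],
             Var = data[[col]], stringsAsFactors = FALSE)
}))

# Group rows by ID in the original ID order, keeping column order within each ID.
long <- long[order(match(long$ID, data$ID), seq_len(nrow(long))), ]

# Drop rows whose value is missing (the BC0013 / DEF / M row).
long <- long[!is.na(long$Var), ]
rownames(long) <- NULL
long
```

Because each column is handled on its own, it does not matter that BC, PL and M are not present for every FC.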
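Because the dataset is unbalanced, the usual reshaping packages do not work. I also tried Reshape from splitstackshape, but it does not give me what I want.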
library(splitstackshape)
vary <- grep("\\.BC$|\\.PL$|\\.M$", names(data))
stubs <- unique(sub("\\..*$", "", names(data[vary])))
Reshape(data, id.vars=c("ID"), var.stubs=stubs, sep=".")
ID,time,ABC,DEF,GHI
SB0005,1,C01,C01a,D20
BC0013,1,C05,C05a,D5
SB0005,2,D20,C01b,NA
BC0013,2,D5,NA,NA
SB0005,3,NA,NA,NA
BC0013,3,NA,NA,NA
Any suggestions are appreciated, thanks!
Output of dput(data), as requested:
structure(list(ID = structure(c(2L, 1L), .Label = c("BC0013",
"SB0005"), class = "factor"), ABC.BC = structure(1:2, .Label = c("C01",
"C05"), class = "factor"), ABC.PL = structure(1:2, .Label = c("D20",
"D5"), class = "factor"), DEF.BC = structure(1:2, .Label = c("C01a",
"C05a"), class = "factor"), DEF.M = structure(1:2, .Label = c("C01b",
"NA"), class = "factor"), GHI.PL = structure(1:2, .Label = c("D20",
"D5"), class = "factor")), .Names = c("ID", "ABC.BC", "ABC.PL",
"DEF.BC", "DEF.M", "GHI.PL"), row.names = c(NA, -2L), class = "data.frame")
Please include the output of 'dput(data)' in your question so we can reproduce your attempt. – Chrisss
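A note on the posted dput() output: every column is stored as a factor, and the missing DEF.M entry is the literal level "NA" rather than a true missing value, so is.na() will not detect it. A minimal sketch of normalising that before reshaping, using a two-column stand-in for the posted data (the stand-in and the anonymous helper function are illustrative assumptions):

```r
# Stand-in for the posted data: factor columns, with "NA" stored as a level.
data <- data.frame(
  ID    = factor(c("SB0005", "BC0013")),
  DEF.M = factor(c("C01b", "NA"))
)

# Convert every column to character and turn the literal string "NA"
# into a real missing value.
data[] <- lapply(data, function(x) {
  x <- as.character(x)
  x[x %in% "NA"] <- NA
  x
})

is.na(data$DEF.M)
```

After this step, rows with a missing value can be removed with is.na() or na.omit() once the data is in long format.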
How is it unbalanced? Do you mean the NAs you want to drop? Also, should the last row of the expected output have 'D5' instead of 'D20'? –
You are right, I have corrected the error, thanks. It is unbalanced because BC, PL and M do not appear in every FC. BC appears in ABC and DEF, but not in GHI. – phusion
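The imbalance described in this comment can be made visible by cross-tabulating the two halves of each column name. A small sketch, with the column names copied from the question and the variable names cols, parts and combos chosen only for illustration:

```r
# Value columns from the question's dataset.
cols <- c("ABC.BC", "ABC.PL", "DEF.BC", "DEF.M", "GHI.PL")

# Split each name at the dot into an FC part and a Type part.
parts <- do.call(rbind, strsplit(cols, ".", fixed = TRUE))

# Count how often each FC/Type combination occurs; a zero marks a
# combination that has no column in the wide data.
combos <- table(FC = parts[, 1], Type = parts[, 2])
combos
```

Any zero cell in the table, such as GHI with BC, is a combination that a stub-based reshape has to pad with NA.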