2012-11-20 151 views
0

Reshape a data frame: I am trying to use melt and cast to transform this data frame

knowngene           Meth 
uc003fia.3 cg00000108;0.864484486796394;0.928944704280193 
uc003cha.4 cg00000108;0.864484486796394;0.928944704280193 
uc003fhz.4 cg00000109;0.881060551674426;0.910939682196076 
uc003fhz.4 cg00000132;0.881060551674426;0.910939682196076 
uc003fia.3 cg00000109;0.881060551674426;0.910939682196076 
uc003fia.3 cg00000236;0.799251070221749;0.898656886868738 

into this:

knowngene           Meth 
uc003fia.3 cg00000108;0.864484486796394;0.928944704280193;cg00000109;0.881060551674426;0.910939682196076;cg00000236;0.799251070221749;0.898656886868738 
uc003cha.4 cg00000108;0.864484486796394;0.928944704280193 
uc003fhz.4 cg00000109;0.881060551674426;0.910939682196076;cg00000132;0.881060551674426;0.910939682196076 

But for some reason I cannot reshape the data frame. Should I perhaps convert it to a list first?

Answers

1

It sounds like you just need aggregate().

First, your data:

myDF <- read.table(header = TRUE, text = "knowngene Meth 
uc003fia.3 cg00000108;0.864484486796394;0.928944704280193 
uc003cha.4 cg00000108;0.864484486796394;0.928944704280193 
uc003fhz.4 cg00000109;0.881060551674426;0.910939682196076 
uc003fhz.4 cg00000132;0.881060551674426;0.910939682196076 
uc003fia.3 cg00000109;0.881060551674426;0.910939682196076 
uc003fia.3 cg00000236;0.799251070221749;0.898656886868738") 

Second, aggregate:

aggregate(Meth ~ knowngene, myDF, paste, collapse=";") 
# knowngene                                   Meth 
# 1 uc003cha.4                        cg00000108;0.864484486796394;0.928944704280193 
# 2 uc003fhz.4            cg00000109;0.881060551674426;0.910939682196076;cg00000132;0.881060551674426;0.910939682196076 
# 3 uc003fia.3 cg00000108;0.864484486796394;0.928944704280193;cg00000109;0.881060551674426;0.910939682196076;cg00000236;0.799251070221749;0.898656886868738 
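
To see why this works: aggregate() forwards extra arguments such as collapse to the function it applies, so paste() runs once per knowngene group. Below is a minimal sketch that repeats the call and counts the fields in each collapsed string. The names res and n_fields are illustrative, and stringsAsFactors = FALSE is an added assumption so that Meth is a character column in every R version.

```r
# Same data as above; stringsAsFactors = FALSE is an added assumption
myDF <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "knowngene Meth
uc003fia.3 cg00000108;0.864484486796394;0.928944704280193
uc003cha.4 cg00000108;0.864484486796394;0.928944704280193
uc003fhz.4 cg00000109;0.881060551674426;0.910939682196076
uc003fhz.4 cg00000132;0.881060551674426;0.910939682196076
uc003fia.3 cg00000109;0.881060551674426;0.910939682196076
uc003fia.3 cg00000236;0.799251070221749;0.898656886868738")

# One row per knowngene; collapse = ";" is passed through to paste()
res <- aggregate(Meth ~ knowngene, data = myDF, FUN = paste, collapse = ";")

# Each original row contributes 3 ';'-separated fields to its group
n_fields <- lengths(strsplit(res$Meth, ";", fixed = TRUE))
print(setNames(n_fields, res$knowngene))
```

A group built from k input rows should show 3 * k fields, which is a quick way to confirm that no row was dropped.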
2

Split and apply will get you close:

lapply(split(x$Meth, x$knowngene), paste, collapse="; ") 

$uc003cha.4 
[1] "cg00000108;0.864484486796394;0.928944704280193" 

$uc003fhz.4 
[1] "cg00000109;0.881060551674426;0.910939682196076; cg00000132;0.881060551674426;0.910939682196076" 

$uc003fia.3 
[1] "cg00000108;0.864484486796394;0.928944704280193; cg00000109;0.881060551674426;0.910939682196076; cg00000236;0.799251070221749;0.898656886868738" 

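The result is a named list with the text concatenated the way you want. If you store that list (here it is assigned back to x), you can use names() and unname() to convert it into a data frame:

x <- lapply(split(x$Meth, x$knowngene), paste, collapse="; ")
data.frame(knowngene=names(x), Meth=unlist(unname(x)))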

    knowngene 
1 uc003cha.4 
2 uc003fhz.4 
3 uc003fia.3 
                                      Meth 
1                         cg00000108;0.864484486796394;0.928944704280193 
2             cg00000109;0.881060551674426;0.910939682196076; cg00000132;0.881060551674426;0.910939682196076 
3 cg00000108;0.864484486796394;0.928944704280193; cg00000109;0.881060551674426;0.910939682196076; cg00000236;0.799251070221749;0.898656886868738 
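
If you want the exact ";" separator and a data frame without the intermediate list, tapply() is one base-R option. This is only a sketch: the data frame df and its shortened Meth values are made up for illustration and are not the question's real data.

```r
# Hypothetical stand-in for the question's data, with shortened Meth values
df <- data.frame(
  knowngene = c("uc003fia.3", "uc003cha.4", "uc003fhz.4", "uc003fhz.4", "uc003fia.3"),
  Meth = c("cg108;0.86;0.93", "cg108;0.86;0.93", "cg109;0.88;0.91",
           "cg132;0.88;0.91", "cg109;0.88;0.91"),
  stringsAsFactors = FALSE
)

# tapply() runs paste() within each knowngene group and returns a named array
collapsed <- tapply(df$Meth, df$knowngene, paste, collapse = ";")

# The names become the key column, the values the collapsed strings
out <- data.frame(knowngene = names(collapsed),
                  Meth = as.vector(collapsed),
                  stringsAsFactors = FALSE)
print(out)
```

The groups come back in sorted order of knowngene rather than in order of first appearance, the same as with split().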
1

Try

cast(knowngene ~ ., data = your.data.frame, value = "Meth", 
    fun.aggregate = paste, collapse = ";")