2012-04-03 37 views
2

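R: Reorganize a list into a matrix

I have a list structured like this: my.list[[file.id]][[value.id]] <- a value (1 or 0). The same value.id can exist in different file.ids.

I need a matrix whose rownames are all the value.ids, whose colnames are the file.ids, and where each cell is my.list[[file.id]][[value.id]].

Is there a quick way to do this without iterating like crazy?

Example data:

List: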

$`Zhou_et_al_2004` 
    CDC42:P60953 CDK2D:NONAME MAPK12:P53778 E2F3:NONAME GRB2:P62424 GRB2:P62993  RFA:NONAME 
      "up"   "up"   "down"   "down"   "down"   "down"   "down" 
    CDK9:P50750 JUP/DP3:NONAME MEK1:NONAME RFC38:NONAME  DP2:NONAME RFC37:NONAME GADD45:NONAME 
     "down"   "down"   "down"   "down"   "down"   "down"   "down" 

$`Zhou_et_al_2006` 
    CTTN:Q14247 GTSE1:Q9NYZ3  CHST11:Q9N  CHST11:PF2 TNRC6A:Q8NDV7 MMP9:P14780  NRIP3:Q9N 
      "up"   "up"   "up"   "up"   "up"   "up"   "up" 
    NRIP3:Q35 EGFR:P00533 GFPT2:NONAME TPCN2:Q8NHX9  BBP:NONAME SQLE:Q14534 DISP2:NONAME 
      "up"   "up"   "up"   "up"   "up"   "up"   "up" 
    PAPPA:Q13219 BMP2:P12643 PCM1:Q15154 SUCLG2:Q96I99 ASAH1:Q13510 UQCRC2:P22695 MTUS1:NONAME 
      "up"   "up"   "down"   "down"   "down"   "down"   "down" 
    MUC20:NONAME FRAT2:NONAME PLA2G4A:P47712 
     "down"   "down"   "down" 

$`Zhou_et_al_2007` 
    CTTN:Q14247 GTSE1:Q9NYZ3  CHST11:Q9N  CHST11:PF2 TNRC6A:Q8NDV7  NRIP3:Q9N 
      "up"   "up"   "up"   "up"   "up"   "up" 
     NRIP3:Q35 USP32:Q8NFA0 PPFIBP1:Q86W92 MALAT1:NONAME TRA2A:NONAME MGC17624:NONAME 
      "up"   "up"   "up"   "up"   "up"   "up" 
    SLC6A2:P23975 USP42:Q9H9J4 RASEF:NONAME SEMA3C:Q99985  NDE1:Q9NXR1  TRA1:NONAME 
      "up"   "up"   "up"   "up"   "up"   "up" 
    PPFIA1:Q13136 PPFIA1:Q16787 ITGA9:Q13797 ITGA9:Q14469  LMO2:P25791 NR2F2:P24468 
      "up"   "up"   "down"   "down"   "down"   "down" 
KIAA0882:NONAME  PCM1:Q15154  CYB5:NONAME  IDH1:NONAME MYLIP:Q8WY64 ASAH1:Q13510 
     "down"   "down"   "down"   "down"   "down"   "down" 
    HADHSC:NONAME FAM84B:Q96KN1  ADH5:P11766  NTN4:Q9HB63  AK3:Q9UIJ7 MTUS1:NONAME 
     "down"   "down"   "down"   "down"   "down"   "down" 
KIAA1815:NONAME 
     "down" 

MATRIX:

   Zhou2004 Zhou2006 Zhou2007 
CDC42:P60953 "up"  NA  NA  
CDK2D:NONAME "up"  NA  NA  
MAPK12:P53778 "down" NA  NA  
E2F3:NONAME  "down" NA  NA  
GRB2:P62424  "down" NA  NA  
GRB2:P62993  "down" NA  NA  
RFA:NONAME  "down" NA  NA  
CDK9:P50750  "down" NA  NA  
JUP/DP3:NONAME "down" NA  NA  
MEK1:NONAME  "down" NA  NA  
RFC38:NONAME "down" NA  NA  
DP2:NONAME  "down" NA  NA  
RFC37:NONAME "down" NA  NA  
GADD45:NONAME "down" NA  NA  
CTTN:Q14247  NA  "up"  "up"  
GTSE1:Q9NYZ3 NA  "up"  "up"  
CHST11:Q9N  NA  "up"  "up"  
CHST11:PF2  NA  "up"  "up"  

etc. (there will be more rows)

+0

Please add some sample data as input, along with the expected output. – Chase 2012-04-03 15:15:23

+1

Can you 'dput' the sample data to make it easier to paste? – James 2012-04-03 16:05:28
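For readers unfamiliar with the suggestion above, here is a minimal sketch of what sharing data with dput() could look like. The object name sample.list is hypothetical, and the entries are a trimmed subset of the example data in the question.

```r
# Hypothetical, trimmed-down version of the list from the question
sample.list <- list(
  Zhou_et_al_2004 = c("CDC42:P60953" = "up", "MAPK12:P53778" = "down"),
  Zhou_et_al_2006 = c("CTTN:Q14247" = "up", "PCM1:Q15154" = "down")
)

# dput() prints an R expression that recreates the object when pasted
dput(sample.list)

# Round trip: evaluating the deparsed text gives the object back
restored <- eval(parse(text = deparse(sample.list)))
```

Pasting the printed expression into a fresh session rebuilds the list, names included, which is why it is preferred over a printed listing.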

Answers

2

Starting with @flodel's sample data,

my.list <- list() 
my.list[["Zhou_et_al_2004"]]["CDC42:P60953"] <- 1 
my.list[["Zhou_et_al_2004"]]["CDK2D:NONAME"] <- 2 
my.list[["Zhou_et_al_2006"]]["CTTN:Q14247"] <- 3 
my.list[["Zhou_et_al_2006"]]["GTSE1:Q9NYZ3"] <- 4 
my.list[["Zhou_et_al_2006"]]["CHST11:Q9N"] <- 5 
my.list[["Zhou_et_al_2009"]]["CTTN:Q14247"] <- 6 

turn each element of the list into a data frame,

a <- lapply(seq_along(my.list), function(i) { 
    x <- my.list[[i]] 
    out <- data.frame(name=names(x), out=x) 
    names(out)[2] <- names(my.list)[[i]] 
    out 
}) 

merge all the data frames together,

out <- Reduce(function(x,y) { merge(x, y, all=TRUE) }, a) 

and fix the rownames.

rownames(out) <- out[,1] 
out <- out[,-1] 

Here is the result!

> out 
             Zhou_et_al_2004 Zhou_et_al_2006 Zhou_et_al_2009
CDC42:P60953               1              NA              NA
CDK2D:NONAME               2              NA              NA
CHST11:Q9N                NA               5              NA
CTTN:Q14247               NA               3               6
GTSE1:Q9NYZ3              NA               4              NA
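The result above is a data frame, while the question asks for a matrix. As an alternative sketch (not the method used in this answer), base R can build the matrix directly by indexing every named vector with the full set of value ids, since indexing by a name that is absent returns NA. The names value.ids and mat are assumptions for illustration, and the list is rebuilt with c() for brevity.

```r
# Same sample data as above, written compactly
my.list <- list(
  Zhou_et_al_2004 = c("CDC42:P60953" = 1, "CDK2D:NONAME" = 2),
  Zhou_et_al_2006 = c("CTTN:Q14247" = 3, "GTSE1:Q9NYZ3" = 4, "CHST11:Q9N" = 5),
  Zhou_et_al_2009 = c("CTTN:Q14247" = 6)
)

# Every value id that occurs in any file, in order of first appearance
value.ids <- unique(unlist(lapply(my.list, names)))

# Index each vector by all ids; ids missing from a file come back as NA.
# sapply() binds the equal-length results into one column per file id.
mat <- sapply(my.list, function(x) x[value.ids])
rownames(mat) <- value.ids
```

Unlike the merge() route, the rows keep their order of first appearance instead of being sorted by name, and the same code should work unchanged for character values such as "up" and "down".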
+0

Thanks! This works perfectly. – JoshDG 2012-04-03 19:35:23

4

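ldply from the plyr package is particularly useful for this kind of task. From the doc:

"The most unambiguous behaviour is achieved when .fun returns a data frame - in that case pieces will be combined with rbind.fill."

where rbind.fill is a convenient function that combines data.frames and fills missing data with NA.

So the key here is to apply a function that turns each of your list elements into a data.frame: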

my.list <- list() 
my.list[["Zhou_et_al_2004"]]["CDC42:P60953"] <- 1 
my.list[["Zhou_et_al_2004"]]["CDK2D:NONAME"] <- 2 
my.list[["Zhou_et_al_2006"]]["CTTN:Q14247"] <- 3 
my.list[["Zhou_et_al_2006"]]["GTSE1:Q9NYZ3"] <- 4 
my.list[["Zhou_et_al_2006"]]["CHST11:Q9N"] <- 5 

library(plyr) 
ldply(my.list, .fun = function(x)as.data.frame(as.list(x))) 
#               .id CDC42.P60953 CDK2D.NONAME CTTN.Q14247 GTSE1.Q9NYZ3 CHST11.Q9N
# 1 Zhou_et_al_2004            1            2          NA           NA         NA
# 2 Zhou_et_al_2006           NA           NA            3            4          5

I'm sure you'll know how to convert it to the final format.
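One possible way to do that final conversion is sketched below. To keep it runnable without plyr, the wide data frame is rebuilt by hand from the printed output above; the names wide and mat are assumptions for illustration.

```r
# Wide result as printed above, rebuilt by hand so the sketch runs without plyr
wide <- data.frame(
  .id          = c("Zhou_et_al_2004", "Zhou_et_al_2006"),
  CDC42.P60953 = c(1, NA),
  CDK2D.NONAME = c(2, NA),
  CTTN.Q14247  = c(NA, 3),
  GTSE1.Q9NYZ3 = c(NA, 4),
  CHST11.Q9N   = c(NA, 5),
  stringsAsFactors = FALSE
)

# Drop the id column and transpose so that value ids become the rows
mat <- t(as.matrix(wide[, -1]))

# Label the columns with the file ids
colnames(mat) <- wide$.id
```

Note that in the printed output the ":" in each value id has become ".", so the rownames of the matrix differ from the original ids and may need to be restored if they have to match.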