2017-04-08 84 views

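Creating homogeneous matrices in R by padding with NaN

I have several matrices that are all different sizes, with slightly different row and column orders. I am trying to organize these matrices so that I can average them. The most straightforward way to do this (I think) would be to make the matrices the same size and then use one of the previously suggested solutions, e.g. Reduce("+", my.list)/length(my.list).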

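I was thinking it might be possible to create a 10×10 template matrix and apply each matrix to the template, so that if the matrix being applied is not 10×10 (e.g. it is 4×4), the rest of the matrix is filled with NaN. I have provided three sample matrices, followed by three matrices that show what I would like the output to look like.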

Three sample matrices:

  0   1 2   3 4 5 6   7 8 9 
0 0.7134503 0.0000000 0 0.0000000 0.00 0 0 0.0000000 0.0 0 
1 0.6800000 0.0000000 0 0.0000000 0.00 0 0 0.0000000 0.0 0 
2 0.2352941 0.2941176 0 0 0.0000000 0.00 0 0.4117647 0.0 0 
3 0.3333333 0.0000000 0 0.0000000 0.00 0 0 0.0000000 0.2 0 
4 0.0000000 0.0000000 0 0.0000000 0.00 0 0 0.0000000 0.0 0 
5 0.5000000 0.0000000 0 0.0000000 0.25 0 0 0.0000000 0.0 0 
6 0.6000000 0.4000000 0 0.0000000 0.00 0 0 0.0000000 0.0 0 
7 0.5250000 0.0000000 0 0.0000000 0.00 0 0 0.0000000 0.0 0 
8 0.6060606 0.0000000 0 0.2121212 0.00 0 0 0.0000000 0.0 0 
9 0   0   0 0   0 0 0 0   0 0 

      0 1   2   3   4 5 7 8 9 
0 0.5550000 0.0 0.0000000 0.2200000 0.0000000 0 0 0.0 0 
1 0.6363636 0.0 0.2727273 0.0000000 0.0000000 0 0 0.0 0 
2 0.4516129 0.0 0.0000000 0.2580645 0.0000000 0 0 0.0 0 
3 0.4150943 0.0 0.0000000 0.3679245 0.0000000 0 0 0.0 0 
4 0.7647059 0.0 0.0000000 0.2352941 0.0000000 0 0 0.0 0 
5 0.4285714 0.0 0.0000000 0.0000000 0.0000000 0 0 0.0 0 
7 0.2000000 0.2 0.2000000 0.2000000 0.0000000 0 0 0.2 0 
8 0.3000000 0.0 0.0000000 0.7000000 0.0000000 0 0 0.0 0 
9 0.5555556 0.0 0.0000000 0.0000000 0.2222222 0 0 0.0 0 

      0 2   3 4 7 8 
0 0.4020101 0 0.5075377 0 0 0 
2 0.0000000 0 0.0000000 0 0 0 
3 0.6322581 0 0.2322581 0 0 0 
4 0.0000000 0 0.0000000 0 0 0 
7 0.0000000 0 0.0000000 0 0 0 
8 0.4883721 0 0.3488372 0 0 0 

Desired output:

  0   1 2 3   4 5 6 7   8 9 
0 0.7134503 0.0000000 0 0 0.0000000 0.00 0 0 0.0000000 0.0 
1 0.6800000 0.0000000 0 0 0.0000000 0.00 0 0 0.0000000 0.0 
2 0.2352941 0.2941176 0 0 0.0000000 0.00 0 0 0.4117647 0.0 
3 0.3333333 0.0000000 0 0 0.0000000 0.00 0 0 0.0000000 0.2 
4 0.0000000 0.0000000 0 0 0.0000000 0.00 0 0 0.0000000 0.0 
5 0.5000000 0.0000000 0 0 0.0000000 0.25 0 0 0.0000000 0.0 
6 0.6000000 0.4000000 0 0 0.0000000 0.00 0 0 0.0000000 0.0 
7 0.5250000 0.0000000 0 0 0.0000000 0.00 0 0 0.0000000 0.0 
8 0.6060606 0.0000000 0 0 0.2121212 0.00 0 0 0.0000000 0.0 
9 0.7272727 0.0000000 0 0 0.0000000 0.00 0 0 0.0000000 0.0 

      0 1   2   3   4 5 6 7 8 9 
0 0.5550000 0.0 0.0000000 0.2200000 0.0000000 0 NA 0.0 0 
1 0.6363636 0.0 0.2727273 0.0000000 0.0000000 0 NA 0.0 0 
2 0.4516129 0.0 0.0000000 0.2580645 0.0000000 0 NA 0.0 0 
3 0.4150943 0.0 0.0000000 0.3679245 0.0000000 0 NA 0.0 0 
4 0.7647059 0.0 0.0000000 0.2352941 0.0000000 0 NA 0.0 0 
5 0.4285714 0.0 0.0000000 0.0000000 0.0000000 0 NA 0.0 0 
6 NA NA NA NA NA NA NA NA NA
7 0.2000000 0.2 0.2000000 0.2000000 0.0000000 0 NA 0.2 0 
8 0.3000000 0.0 0.0000000 0.7000000 0.0000000 0 NA 0.0 0 
9 0   0 0   0   0   0 NA 0 0 

      0 1 2   3 4 5 6 7 8 9 
0 0.4020101 NA 0 0.5075377 0 NA NA 0 0 NA
1 NA NA NA NA NA NA NA NA NA NA
2 0.0000000 NA 0 0.0000000 0 0 0 NA NA NA
3 0.6322581 NA 0 0.2322581 0 0 0 NA NA NA
4 0.0000000 NA 0 0.0000000 0 0 0 NA NA NA
5 NA NA NA NA NA NA NA NA NA
6 NA NA NA NA NA NA NA NA NA
7 0.0000000 NA 0 0.0000000 0 0 0 NA NA NA
8 0.4883721 NA 0 0.3488372 0 0 0 NA NA NA
9 NA NA NA NA NA NA NA NA NA

Answer


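A quick way: get the set of unique column names and row names across the list. Create a new matrix with those dimensions, then use subsetting (by row and column names) to assign the values. You can then use your Reduce("+", my.list)/length(my.list).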

# some dummy data 
m1 <- matrix(1:4, 2, dimnames=list(0:1, c(0,3))) 
m2 <- matrix(1:9, 3, dimnames=list(0:2, 0:2)) 
lst <- list(m1, m2) 
#> lst 
#[[1]] 
# 0 3 
#0 1 3 
#1 2 4 

#[[2]] 
# 0 1 2 
#0 1 4 7 
#1 2 5 8 
#2 3 6 9 

# Get unique col and row names 
nc <- sort(unique(unlist(lapply(lst, colnames)))) 
nr <- sort(unique(unlist(lapply(lst, rownames)))) 

# loop through matrices 
lst2 <- lapply(lst , function(x) { 
    out = matrix(NA, ncol=length(nc), nrow=length(nr), dimnames=list(nr, nc)) 
    idx = as.matrix(expand.grid(rownames(x), colnames(x))) 
    out[idx] <- x 
    out 
    }) 
# lst2 
#[[1]] 
# 0 1 2 3 
#0 1 NA NA 3 
#1 2 NA NA 4 
#2 NA NA NA NA 

#[[2]] 
# 0 1 2 3 
#0 1 4 7 NA 
#1 2 5 8 NA 
#2 3 6 9 NA 
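
The same padding step can also be written with direct name-based indexing, which avoids building the index grid. The sketch below is a variation, not part of the original answer: `all_names` and `pad_to` are hypothetical helper names, and the numeric sort assumes the dimnames are integer-like strings such as "0" to "9" (a plain character sort would place "10" before "2").

```r
# Dummy data, same as above
m1 <- matrix(1:4, 2, dimnames = list(0:1, c(0, 3)))
m2 <- matrix(1:9, 3, dimnames = list(0:2, 0:2))
lst <- list(m1, m2)

# Union of names across the list, sorted numerically
# (assumption: every name can be converted with as.numeric)
all_names <- function(lst, f) {
  nm <- unique(unlist(lapply(lst, f)))
  nm[order(as.numeric(nm))]
}
nr <- all_names(lst, rownames)
nc <- all_names(lst, colnames)

# pad_to (hypothetical helper): build an NA-filled template,
# then drop the matrix into it by row and column name
pad_to <- function(x, nr, nc) {
  out <- matrix(NA_real_, nrow = length(nr), ncol = length(nc),
                dimnames = list(nr, nc))
  out[rownames(x), colnames(x)] <- x
  out
}
lst2 <- lapply(lst, pad_to, nr = nr, nc = nc)
```

For the question's data, `nr` and `nc` would both come out as "0" to "9", so every matrix is padded to the 10×10 template described there.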

One caveat: the sum will not work the way (I think) you want if there are NAs. But you can get the means with:

s <- simplify2array(lst2) 
rowMeans(s, dims = 2, na.rm = TRUE) # average over the list dimension, skipping NA
# 0 1 2 3 
#0 1 4 7 3 
#1 2 5 8 4 
#2 3 6 9 NaN 
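
To see why the plain `Reduce` average falls short once the matrices are padded, it may help to compare the two on a small case. This is an illustrative sketch with made-up 2×2 matrices `a` and `b`; it is not taken from the original post.

```r
# Two padded matrices with NA in different cells (illustrative data)
a <- matrix(c(1, NA, 3, NA), 2)
b <- matrix(c(3, 4, NA, NA), 2)
padded <- list(a, b)

# `+` propagates NA, so a cell that is missing in any one matrix
# ends up NA in the average
r <- Reduce("+", padded) / length(padded)

# Stack into a 2 x 2 x 2 array and average over the third dimension,
# using only the non-missing values in each cell; a cell that is NA
# in every matrix comes out as NaN
s <- simplify2array(padded)
m <- rowMeans(s, dims = 2, na.rm = TRUE)
```

Here `r` keeps only the top-left cell, while `m` also recovers the cells that appear in just one of the two matrices.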

Another way to get the means:

d <- Reduce(function(...) merge(..., by=c("Var1", "Var2"), all=TRUE), lapply(lst, reshape2::melt)) 
v <- rowMeans(d[-(1:2)], na.rm = TRUE) 
xtabs(v ~ Var1 + Var2, data=d) 
# Var2 
#Var1 0 1 2 3 
# 0 1 4 7 3 
# 1 2 5 8 4 
# 2 3 6 9 0 
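
Note that `xtabs` reports 0 for the (2, 3) cell, which is present in neither matrix. If that cell should stay missing, one possible base-R variant (a sketch, not part of the original answer) reshapes with `as.data.frame(as.table(...))` instead of `reshape2::melt` and cross-tabulates with `tapply`, which leaves absent combinations as NA. The value column names `m1`, `m2`, ... are an assumption made here so that repeated merges do not produce clashing names.

```r
# Dummy data, same as above
m1 <- matrix(1:4, 2, dimnames = list(0:1, c(0, 3)))
m2 <- matrix(1:9, 3, dimnames = list(0:2, 0:2))
lst <- list(m1, m2)

# Long format: one row per cell, keys Var1/Var2 kept as character,
# value column named after the matrix position (m1, m2, ...)
long <- lapply(seq_along(lst), function(i) {
  as.data.frame(as.table(lst[[i]]),
                responseName = paste0("m", i),
                stringsAsFactors = FALSE)
})

# Full outer join on the cell coordinates
d <- Reduce(function(x, y) merge(x, y, by = c("Var1", "Var2"), all = TRUE),
            long)

# Per-cell mean over the matrices that contain the cell
v <- rowMeans(d[-(1:2)], na.rm = TRUE)

# Back to wide form; combinations absent from every matrix stay NA
avg <- tapply(v, d[c("Var1", "Var2")], mean)
```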
+1

Works great, thanks! – Mik