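Find the best combination of columns in a matrix

Suppose I have a large matrix (matrix_1) with 2000 columns. Each cell has a value of 0 or 1. I want to find the best combination of 10 columns. The best combination is the one that gives the maximum number of rows containing at least one non-zero value. So it basically maximizes: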

sum (apply (matrix_2, 1, function(x) any(x == 1)))

I cannot go through all possible combinations because that is too computationally intensive (there are 2.758988e+26 of them). Any suggestions?
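As a quick check of that count, the number of ways to pick 10 columns out of 2000 can be computed with base R's `choose()`. The variable name `n_combos` is only illustrative.

```r
# Number of distinct 10-column subsets of a 2000-column matrix
n_combos <- choose(2000, 10)
print(n_combos)
```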

As an example, take this matrix. It has 4 rows and I am only picking 2 columns at a time:

mat <- matrix (c(1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0), nrow = 4, byrow = FALSE) 
mat 
# combination of columns 2 and 3 is best: 3 rows with at least a single 1 value 
sum (apply (mat[, c(2, 3)], 1, function(x) any (x == 1))) 
# combination of columns 1 and 2 is worse: 2 rows with at least a single 1 value 
sum (apply (mat[, c(1, 2)], 1, function(x) any (x == 1)))
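For a matrix this small, an exhaustive search is still feasible and can serve as a reference. The following is a minimal sketch, assuming a helper named `coverage` (a hypothetical name, not part of the question) that counts the rows with at least one 1 in the selected columns. It scores every pair of columns with `combn()` and keeps the best one.

```r
mat <- matrix(c(1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0),
              nrow = 4, byrow = FALSE)

# Hypothetical helper: number of rows with at least one 1 in the chosen columns
coverage <- function(m, cols) sum(rowSums(m[, cols, drop = FALSE]) > 0)

pairs  <- combn(ncol(mat), 2)                             # one candidate pair per column
scores <- apply(pairs, 2, function(p) coverage(mat, p))   # score every pair
best   <- pairs[, which.max(scores)]                      # first pair with the top score

print(best)
print(max(scores))
```

This only works because there are 6 pairs to check; the same approach does not scale to 10 columns out of 2000.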

Source

2017-07-24 Pavel Shliaha

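How many rows are in your matrix? – CPak

100-200 rows. It depends on the application. –

Can't you order your columns by `colSums(col)` and select the top 10? I ask because I'm not 100% sure what you want, and this helps me better understand what you are looking for. – CPak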

You could use a function like this to select the n columns...

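find10 <- function(mat, n = 10) {
    cols <- rep(FALSE, ncol(mat))  # columns already selected
    rows <- rep(TRUE, nrow(mat))   # rows not yet covered by a selected column
    for (i in seq_len(n)) {
        # drop = FALSE keeps a matrix even when only one row is left
        colsums <- colSums(mat[rows, , drop = FALSE])
        colsums[cols] <- -1           # exclude columns already selected
        maxcol <- which.max(colsums)  # column covering the most remaining rows
        cols[maxcol] <- TRUE
        rows <- rows & !as.logical(mat[, maxcol])  # drop the rows it covers
    }
    which(cols)
}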

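It looks for the column with the most non-zero values, removes the rows that column covers from the comparison, and then repeats. It returns the column numbers of the n best columns.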
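Picking one column at a time like this is a greedy strategy, so it is not guaranteed to find the optimal set, although for this kind of maximum-coverage problem greedy selection is known to reach at least a fraction 1 - 1/e of the best possible coverage. The sketch below compares a greedy pick against an exhaustive search on a small random matrix. The names `coverage` and `greedy_cols` and the matrix size are illustrative assumptions, not part of the answer above.

```r
# Hypothetical helper: number of rows with at least one 1 in the chosen columns
coverage <- function(m, cols) sum(rowSums(m[, cols, drop = FALSE]) > 0)

# Hypothetical greedy selection: repeatedly take the column that covers
# the most rows not covered so far
greedy_cols <- function(m, n) {
  chosen <- integer(0)
  uncovered <- rep(TRUE, nrow(m))
  for (i in seq_len(n)) {
    gain <- colSums(m[uncovered, , drop = FALSE])
    gain[chosen] <- -1                 # never pick the same column twice
    pick <- which.max(gain)
    chosen <- c(chosen, pick)
    uncovered <- uncovered & m[, pick] == 0
  }
  sort(chosen)
}

set.seed(1)
m <- matrix(rbinom(120, 1, 0.2), nrow = 10)   # assumed toy size: 10 rows, 12 columns

greedy_score <- coverage(m, greedy_cols(m, 3))
# Exhaustive reference: score all choose(12, 3) = 220 triples
best_score <- max(combn(ncol(m), 3, FUN = function(p) coverage(m, p)))

print(c(greedy = greedy_score, exhaustive = best_score))
```

On small inputs the two scores can be compared directly; on the full 2000-column problem only the greedy pass is practical.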

An example...

m <- matrix(sample(0:1,100,prob = c(0.8,0.2),replace=TRUE),nrow=10) 

m 
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    1    0    0    0    0    0    1    1     0
 [2,]    1    0    0    0    0    0    0    0    1     1
 [3,]    0    0    0    0    1    0    0    0    0     0
 [4,]    0    0    0    1    0    1    0    1    0     1
 [5,]    0    0    0    0    1    0    0    1    0     0
 [6,]    0    0    0    1    0    1    1    0    0     0
 [7,]    0    0    1    0    0    0    0    0    0     0
 [8,]    0    0    0    0    0    0    0    0    1     0
 [9,]    0    0    0    0    0    0    0    1    0     0
[10,]    0    0    0    0    0    0    0    0    0     0

find10(m,5) 
[1] 3 4 5 8 9

It also comes up with columns 2 and 3 for the example you gave.

Source

2017-07-24 15:51:16

Interesting solution. I have to think about it! –

Yes, you are right! Great answer! Thank you very much! –
