2016-10-03

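Problem with parallel processing in R

I want to modify a large matrix containing the values 0, 1 and 2, replacing every 2 with 1. The matrix has 500,000 columns and 7,000 rows. The data has already been read in, 50 rows at a time, and now I want to split it into chunks and process them in parallel using foreach() %dopar%.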

> SNPchunk 
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] 
[1,] 0 0 0 0 1 0 0 2  
[2,] 1 0 1 0 1 1 1 0  
[3,] 1 0 1 0 1 1 0 1 
[4,] 0 0 0 0 1 0 0 2  
[5,] 0 0 0 0 2 0 2 1  
[6,] 0 0 0 0 0 0 0 1 
[7,] 0 0 0 0 1 0 0 2 
[8,] 0 0 0 0 2 0 1 1 
[9,] 1 1 1 0 1 1 0 1 
[10,] 0 0 0 0 1 0 1 1  

chunk = foreach (part = 1:snpsplit) %do% 
{ 
    snpchunk = SNPcomponents[,snp.start[part]:snp.stop[part]] 

    #print(part) 

    res = foreach(SNP=1:ncol(snpchunk), .combine='cbind') %dopar% 
    { 
     a = snpchunk[,SNP] 
     a[a==2] <- 1 
     print(a) 
    }   
} 

With the print(a) statement, the returned variable res is an n-by-x matrix in which all 2s have been replaced by 1s:

 result.1 result.2 result.3 result.4 result.5 result.6 result.7 result.8 
[1,]  0  1  1  1  0  1  1  1 
[2,]  0  0  0  0  0  0  0  0 
[3,]  1  0  0  0  0  0  0  0 
[4,]  0  0  0  0  0  0  1  1 
[5,]  0  1  1  1  0  0  1  1 
[6,]  1  0  1  1  0  1  1  1 
[7,]  0  1  1  1  0  0  1  1 
[8,]  0  1  0  0  1  1  1  1 
[9,]  0  0  0  0  0  0  0  0 
[10,]  1  1  0  0  0  0  0  1 

But without the print(a) statement, the returned variable res is a 1-by-x matrix containing only the value 1:

>res 
result.1 result.2 result.3 result.4 result.5 result.6 result.7 result.8 
    1  1  1  1  1  1  1  1 

How can I get the first result matrix without using the print statement?

Thanks for your help! J.

Answer

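If you remove print(a) entirely, the last expression in the loop body is the assignment a[a==2] <- 1, and the value of that assignment is its right-hand side, 1. So each iteration returns 1 instead of the modified column. It only worked with print(a) because print() returns its argument. That is why you should end the body with a instead of print(a):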

res = foreach(SNP=1:ncol(snpchunk), .combine='cbind') %dopar% 
{ 
    a = snpchunk[,SNP] 
    a[a==2] <- 1 
    a 
}
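
To see the effect without setting up a parallel backend, here is a minimal base-R sketch. It uses sapply() and two small helper functions as a stand-in for the foreach body; the names recode_no_return and recode and the tiny 3-by-2 matrix are made up for illustration and are not from the question. A function body, like a foreach body, evaluates to its last expression.

```r
# Small stand-in for one chunk of the SNP matrix (values 0, 1, 2).
# matrix() fills column by column: column 1 is 0,1,2 and column 2 is 2,0,1.
snpchunk <- matrix(c(0, 1, 2,
                     2, 0, 1),
                   nrow = 3)

# Hypothetical helper: the body ends with the assignment,
# so its value is the right-hand side (1), not the vector.
recode_no_return <- function(a) {
  a[a == 2] <- 1
}

# Hypothetical helper: the body ends with `a`,
# so the modified vector is returned.
recode <- function(a) {
  a[a == 2] <- 1
  a
}

cols <- seq_len(ncol(snpchunk))
bad  <- sapply(cols, function(j) recode_no_return(snpchunk[, j]))
good <- sapply(cols, function(j) recode(snpchunk[, j]))

print(bad)   # a single 1 per column
print(good)  # the recoded matrix, one column per input column

# For a matrix that fits in memory, the same recoding can be done
# in one vectorised step, with no loop at all.
vectorised <- snpchunk
vectorised[vectorised == 2] <- 1
```

If the chunks fit in memory, the vectorised form at the end may well be faster than looping over single columns in parallel, since each foreach task here does very little work; that is worth measuring on the real data rather than assuming.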