2017-05-03 48 views
3

我正在處理一些大矩陣,其值與對角線類似,如下所示。R - 沿矩陣對角線填充長度爲'n'或更小的數據間隙

ontrack <- matrix(c(
     runif(1),NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA, 
     runif(1),NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA, 
     NA,NA,runif(1),NA,NA,NA,NA,NA,NA,NA,NA,NA,NA, 
     NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA, 
     NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA, 
     NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA, 
     NA,NA,NA,NA,runif(1),runif(1),NA,NA,NA,NA,NA,NA,NA, 
     NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA, 
     NA,NA,NA,NA,NA,NA,NA,runif(1),NA,NA,NA,NA,NA, 
     NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA, 
     NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA, 
     NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA, 
     NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA, 
     NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,runif(1)), 
     nrow=14, byrow=T 
     ) 

我想填充長度'n'或更小的數據間隙來連接對角線段。使用例如上述矩陣和在3個或更少的數據填補空白,我想從這個去:

diag_indx <- which(!is.na(ontrack), arr.ind=T) 

其給出

 row col 
[1,] 1 1 
[2,] 2 1 
[3,] 3 3 
[4,] 7 5 
[5,] 7 6 
[6,] 9 8 
[7,] 14 13 

 row col 
     1 1 
     2 1 
newV 3 2 
     3 3 
    new 4 4 
    new 5 4 
    new 6 4 
     7 5 
     7 6 
    new 8 7 
     9 8 
     14 13 

對於例如newV,結果可以是(2,2)或(3,2)。我的後續代碼使用diag_indx矩陣,但如果效率更高,則數據間隙可以直接填充到ontrack矩陣中(使用任何值都可以)。

在試圖找出一個解決方案,我發現預想在diag_indx矩陣中的數據間隙使用該sequence length equation

seqle <- function(x, incr=1) { 
    if(!is.integer(x)) x <- as.integer(x) 
    n <- length(x) 
    y <- x[-1L] != x[-n] + incr 
    i <- c(which(y|is.na(y)),n) 
    list(lengths = diff(c(0L,i)), 
     values = x[head(c(0L,i)+1L,-1L)]) 
} 

,然後在數據缺口使用seq()填充。我只是不確定如何有效地把它放在一起。感謝您的幫助。

回答

1

經過一些試驗和錯誤,我想出了一個(不是很漂亮)的解決方案,只需要基本的R函數。

diagFillSeq <- function(diag_indx, fillgap=1){ 
    repeat{ 
    for(cols in 1:2){ 
     diag_indx <- diag_indx[order(diag_indx[, cols]), ] #Sort by selected column 
     repeat{ 
     diffs <- diff(diag_indx[, cols]) 
     #Find breaks in sequence with differences >1 (diffs==1 are in sequence) and less than or equal to fillgap 
     gap_indx <- which(diffs > 1 & diffs <= (fillgap +1)) #need +1 because fencepost error: 3rd & 7th post diffs=4 but fillgap=3) 
     if(length(gap_indx) == 0){break} 
     insert_indx <- gap_indx[1] 
     seq_length <- diffs[gap_indx[1]] - 1 #need -1 because fencepost error 
     #Subset diag_indx and insert filling sequence 
     diag_indx <- rbind(diag_indx[1:insert_indx, ], 
         cbind(
         as.integer(seq(from=diag_indx[insert_indx, 1] +1, to=diag_indx[insert_indx+1, 1] -1, length.out=seq_length)), 
         as.integer(seq(from=diag_indx[insert_indx, 2] +1, to=diag_indx[insert_indx+1, 2] -1, length.out=seq_length)) 
        ), 
         diag_indx[(insert_indx+1):nrow(diag_indx), ]) 
     } 
    } 
    #Recheck first column to see if any new sequence gaps were created 
    diffs <- diff(diag_indx[, 1]) 
    gap_indx <- which(diffs > 1 & diffs <= (fillgap +1)) 
    if(length(gap_indx) == 0){return(unname(diag_indx))} 
    } 
} 

和上面

whatIwant <- matrix(as.integer(c(1,2,3,3,4,5,6,7,7,8,9,14, 1,1,2,3,4,4,4,5,6,7,8,13)), ncol=2) 
whatIwant 
#  [,1] [,2] 
# [1,] 1 1 
# [2,] 2 1 
# [3,] 3 2 
# [4,] 3 3 
# [5,] 4 4 
# [6,] 5 4 
# [7,] 6 4 
# [8,] 7 5 
# [9,] 7 6 
#[10,] 8 7 
#[11,] 9 8 
#[12,] 14 13 

identical(diagFillSeq(diag_indx, fillgap=3), whatIwant) 
#TRUE 
diag_indx測試