2012-07-17 22 views
2

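Remove columns when the rows contain the same character

I have a matrix of letter/number combinations, and I need to remove the columns in which both rows contain the same letter. A simplified example: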

> chars <- c("A1","A2","B1","B2") 
> charsmat <- combn(chars, 2) 
> charsmat 
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] "A1" "A1" "A1" "A2" "A2" "B1" 
[2,] "A2" "B1" "B2" "B1" "B2" "B2" 

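When both rows of a single column contain the same letter (here, columns 1 and 6), I need to remove that column. I think I have the pieces: use gsub() or str_extract() to isolate the letter, then test whether the rows match, but I'm stuck on how to put it together. Thanks in advance for any help you can offer.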

Answers

3

First, create a new matrix that holds only the letter part of each entry:

> (charsmat.alpha <- substr(charsmat, 1, 1))
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] "A"  "A"  "A"  "A"  "A"  "B"
[2,] "A"  "B"  "B"  "B"  "B"  "B"

Then take the subset of columns of charsmat for which the two rows of charsmat.alpha differ:

> charsmat[,(charsmat.alpha[1,] != charsmat.alpha[2,])] 
     [,1] [,2] [,3] [,4]
[1,] "A1" "A1" "A2" "A2" 
[2,] "B1" "B2" "B1" "B2" 
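
The approach above compares only the first character of each entry, which fits the example data. If the labels could carry prefixes of more than one letter (an assumption, since the question only shows single-letter prefixes), a minimal sketch along the gsub() lines mentioned in the question might look like this. The names `dropSamePrefix` and `labels` are hypothetical and used only for illustration.

```r
## Sketch: compare the whole letter prefix rather than the first character.
## `dropSamePrefix` and `labels` are hypothetical names, not from the thread.
dropSamePrefix <- function(mat) {
  ## Strip every non-letter character; gsub() keeps the matrix dimensions
  prefix <- gsub("[^[:alpha:]]", "", mat)
  ## Keep only the columns whose two rows have different prefixes
  mat[, prefix[1, ] != prefix[2, ], drop = FALSE]
}

## "A" and "AB" count as different prefixes here
labels <- c("A1", "A2", "AB1", "B1")
dropSamePrefix(combn(labels, 2))
```

With these labels only the A1/A2 column is dropped. A first-character comparison would also drop the A1/AB1 and A2/AB1 columns, so which behaviour is wanted depends on how the labels are meant to be read.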

Thanks a ton; I appreciate it. – ahj 2012-07-17 19:23:18

1

Here is a more general solution, which removes every column in which any letter in row 1 matches any letter in row 2. Give it a try:

## Your data 
chars <- c("A1","A2","B1","B2") 
charsmat <- combn(chars, 2) 

vetMatrix <- function(mat) {
  ## Remove non-alpha characters from matrix entries
  mm <- gsub("[^[:alpha:]]", "", mat)
  ## Construct character class regex patterns from first row
  patterns <- paste0("[", mm[1, ], "]")
  xs <- mm[2, ]
  ## Extract columns in which no character in first row is found in second
  mat[, !mapply("grepl", patterns, xs), drop = FALSE]
}

## Try it with your matrix ... 
vetMatrix(charsmat) 
#      [,1] [,2] [,3] [,4]
# [1,] "A1" "A1" "A2" "A2"
# [2,] "B1" "B2" "B1" "B2"

## ... and with a different matrix
mat <- matrix(c("AB1", "B1", "AA11", "BB22", "this", "that"), ncol = 3)
mat
#      [,1]  [,2]   [,3]
# [1,] "AB1" "AA11" "this"
# [2,] "B1"  "BB22" "that"
vetMatrix(mat)
#      [,1]
# [1,] "AA11"
# [2,] "BB22"
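
The same "any shared letter" test can also be written without building regex patterns from the data. This is a sketch of an alternative, not the answerer's method: it splits each entry into letters and checks the set intersection. The names `sharesLetter` and `vetMatrix2` are hypothetical. One possible motivation: for an entry with no letters at all, the pattern built by paste0() would be "[]", which I would expect grepl() to reject, whereas the intersection simply comes out empty.

```r
## Sketch: set-intersection variant of the shared-letter test.
## `sharesLetter` and `vetMatrix2` are hypothetical names for illustration.
sharesLetter <- function(a, b) {
  ## Reduce each string to its letters, then split into single characters
  la <- strsplit(gsub("[^[:alpha:]]", "", a), "")[[1]]
  lb <- strsplit(gsub("[^[:alpha:]]", "", b), "")[[1]]
  ## TRUE when at least one letter occurs in both strings
  length(intersect(la, lb)) > 0
}

vetMatrix2 <- function(mat) {
  ## Test row 1 against row 2, column by column
  shared <- mapply(sharesLetter, mat[1, ], mat[2, ])
  mat[, !shared, drop = FALSE]
}

## Third column has a row-1 entry without any letters
mat2 <- matrix(c("AB1", "B1", "AA11", "BB22", "12", "C3"), ncol = 3)
vetMatrix2(mat2)
```

Here the first column is dropped because "AB1" and "B1" share the letter B, and the other two columns are kept.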

Did you build this response with something that automatically adds each statement's result as a comment after the statement? – 2012-07-17 19:42:25

@Michael Hoffman - Afraid I don't follow. – 2012-07-17 19:43:48

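Do you have a program that will take 'mat' and 'vetMatrix(mat)' and spit out the bottom eight lines of your answer, with the results of the two statements interleaved as comments? – 2012-07-17 19:54:13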