R - Remove rows that contain a specific combination of values

I have a list of 100 objects and create all possible pairs of them using

pairs <- t(combn(my_objects, 2))

However, no object from group A may be combined with any object from group B. That means, given

group_A <- c(5:10) 
group_B <- c(50:55)

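a pair of 6 and 55 should be removed, no matter which row it is in. The combination of 5 and 6 is fine. How can I check every row for these "forbidden" pairs and remove it? I tried %in%, but I don't know how to use it for multiple objects.

Edit

My real problem is this: I have a list of 75 character strings, A1 ... A75. They should be combined into pairs. But a member of group_A (5 ... 10) must never be combined with a member of group_B (50 ... 55).

Second step: an entry in a row of pairs can also be a combination itself, for example: A1.A8 - A2.A.12.A51. This pair should be removed as well.

My data is: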

pairs <- cbind (c("A1", "A9.A3", "A5", "A52.A12", "A8"), 
       c("A76.A14", "A50", "A2.A7", "A70", "A50.A51")) 

group_A <- c("A5", "A6", "A7", "A8", "A9", "A10")
group_B <- c("A50", "A51", "A52", "A53", "A54", "A55")

My goal: remove every row of pairs that combines an item of group_A with an item of group_B, so that pairs =

     [,1]      [,2]
[1,] "A1"      "A76.A14"
[2,] "A5"      "A2.A7"
[3,] "A52.A12" "A70"

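Source

2016-11-15 Miguel123

Please show a reproducible example and the expected output – akrun

We can use expand.grid with paste to create a vector of the forbidden combinations, check whether the elements of the vector created with combn are %in% that vector, negate (!), and subset 'pairs' with this logical vector.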

v1 <- combn(length(my_objects), 2, FUN = paste, collapse=" ") 
pairs[!v1 %in% do.call(paste, expand.grid(group_A, group_B)),]
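
A pasted key such as "6 55" only matches when the two values appear in the same order as in expand.grid(group_A, group_B). As a minimal sketch, assuming the objects are simply the integers 1 to 100 (the question does not say so), the same filter can be written with %in% on each column, checking both orders:

```r
# Assumption for illustration only: the objects are the integers 1..100
my_objects <- 1:100
group_A <- 5:10
group_B <- 50:55

pairs <- t(combn(my_objects, 2))

# a row is forbidden if one entry is in group_A and the other in group_B
forbidden <- (pairs[, 1] %in% group_A & pairs[, 2] %in% group_B) |
             (pairs[, 1] %in% group_B & pairs[, 2] %in% group_A)

pairs_ok <- pairs[!forbidden, , drop = FALSE]
dim(pairs_ok)  # 4914 rows, 2 columns (4950 pairs minus 36 forbidden ones)
```

Because each column is tested separately, this does not depend on how the pairs were pasted together or on which side of the pair the group_A member sits.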

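Source

2016-11-15 11:07:56 akrun

Hmm, I don't know, it doesn't work. 'pairs' still has the same length and the forbidden combinations still exist – Miguel123

@Miguel123 I created 'set.seed(24); my_objects <- sample(1:200, 100, replace = FALSE)' and it is working for me: 'dim(pairs) #[1] 4950 2' and 'dim(pairs[!v1 %in% do.call(paste, expand.grid(group_A, group_B)),]) #[1] 4914 2' – akrun

Hm, okay, my objects always have a letter as a prefix. Could that be the problem? – Miguel123

This is not very elegant, but (I think) it does what you want: for each row, split the entries in each column at the '.' character, then check whether the first entry overlaps with group_A and the second entry overlaps with group_B. If both do, add the row index to a vector of rows to be removed: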

rm_idx <- integer(0)
for (r in seq_len(nrow(pairs))) { # for each row of pairs...
    pair_A <- unlist(strsplit(pairs[r,1],'[.]')) # split first element by '.'
    pair_B <- unlist(strsplit(pairs[r,2],'[.]')) # split second element by '.'
    l_A <- length(intersect(group_A, pair_A)) # elements of group_A in pair_A
    l_B <- length(intersect(group_B, pair_B)) # elements of group_B in pair_B
    if (l_A > 0 && l_B > 0) { # if there is overlap in both entries of that row --> remove
        rm_idx <- c(rm_idx, r)
    }
}
# guard: with an empty rm_idx, pairs[-rm_idx,] would return zero rows
new_pairs <- if (length(rm_idx) > 0) pairs[-rm_idx, , drop = FALSE] else pairs
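
The loop only looks for group_A in the first column and group_B in the second. If the reverse order can also occur (an assumption; the example data in the question does not contain such a row), a sketch along the following lines checks both directions without an explicit loop. The helper name has_any is made up for this illustration:

```r
pairs <- cbind(c("A1", "A9.A3", "A5", "A52.A12", "A8"),
               c("A76.A14", "A50", "A2.A7", "A70", "A50.A51"))
group_A <- c("A5", "A6", "A7", "A8", "A9", "A10")
group_B <- c("A50", "A51", "A52", "A53", "A54", "A55")

# hypothetical helper: TRUE for each entry of x that contains a member of grp
has_any <- function(x, grp) {
    vapply(strsplit(x, ".", fixed = TRUE),
           function(parts) any(parts %in% grp),
           logical(1))
}

a1 <- has_any(pairs[, 1], group_A); b1 <- has_any(pairs[, 1], group_B)
a2 <- has_any(pairs[, 2], group_A); b2 <- has_any(pairs[, 2], group_B)

# forbidden when group_A is on one side and group_B on the other
forbidden <- (a1 & b2) | (b1 & a2)
new_pairs <- pairs[!forbidden, , drop = FALSE]
```

On the example data this drops rows 2 and 5 and keeps the three rows shown in the question. Splitting at the dot and comparing whole names with %in% also avoids any confusion between A5 and A50.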

Alternatively, you can build a regular expression that can be used with grepl to find any occurrence of the elements of, e.g., group_A in a string:

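# concatenate a vector of strings with '|', creating a regular expression for
# searching for any of them (using grep/grepl); '[.]' appends a literal dot
grp2reg <- function (g) {
    paste(paste(g, '[.]', sep=''), collapse ='|')
}

# append dot to string (or each of a vector of strings)
add_dot <- function(g) {
    paste(g, '.', sep='')
}

# find strings from group_A/group_B in first/second column of pairs
idx_A <- grepl(grp2reg(group_A), add_dot(pairs[,1]))
idx_B <- grepl(grp2reg(group_B), add_dot(pairs[,2]))

# remove rows with a match in both columns
pairs_new <- pairs[!(idx_A & idx_B), , drop = FALSE]

Note that appending '.' to all strings is necessary here to avoid "finding", e.g., A5 in A50. The dot in the pattern has to be a literal one ('[.]'), because a bare '.' in a regular expression matches any character, so A5. would still match A50. The regular expression for group_A is therefore actually "A5[.]|A6[.]|A7[.]|A8[.]|A9[.]|A10[.]", and a dot is also appended to the elements of pairs (in order to find, e.g., A5. in A50.A5.).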
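
To see why the trailing dot matters, and why it has to be matched literally, here is a quick check. The strings are illustrative only and the variable names are made up for this sketch:

```r
plain   <- grepl("A5", "A50")         # TRUE: plain substring match, a false positive
any_chr <- grepl("A5.", "A50.")       # TRUE: an unescaped '.' matches the "0"
literal <- grepl("A5[.]", "A50.")     # FALSE: '[.]' only matches a literal dot
found   <- grepl("A5[.]", "A50.A5.")  # TRUE: the real A5 entry is still found
```

Only the last two patterns tell A5 and A50 apart, which is the behaviour the filter relies on.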

Source

2016-11-20 21:13:16 julius

Hi @Miguel123, does this answer your question? Either way, I look forward to your feedback. – julius
