Find all matching lists in a list

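I have a list in which I want to find all the common vectors, that is, the elements that contain exactly the same values, while keeping the position number of each of them in the list, in R. If possible, with a one-liner.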

Here is mylist:

mylist <- list(c("yes", "no"), c("no", "other", "up", "down"), c("no", "yes"),
               c("no", "yes"), c("no", "yes", "maybe"), c("no", "yes", "maybe"),
               c("no", "yes", "maybe"))

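Desired output:

The matching groups are: match 1: 1, 3, 4; match 2: 5, 6, 7

Source

2017-07-16 Elias EstatisticsEU

I hope it is fixed now, @lmo –

A straightforward approach is 'ml2 = lapply(mylist, sort); match(ml2, unique(ml2))' –

@alexis_laz Your solution does not give the positions of the matching vectors for each match! Check akrun's answer. Thanks for your time anyway! –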
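The 'match' suggestion in the comment above returns one group id per list element rather than the positions per group. A minimal sketch of how those ids could be turned into positions, using the question's data (the names 'grp' and 'pos' are illustrative and not from the thread):

```r
mylist <- list(c("yes", "no"), c("no", "other", "up", "down"), c("no", "yes"),
               c("no", "yes"), c("no", "yes", "maybe"), c("no", "yes", "maybe"),
               c("no", "yes", "maybe"))

# sort each vector so that the order of its values does not matter
ml2 <- lapply(mylist, sort)

# one integer group id per list element; ids follow first appearance
grp <- match(ml2, unique(ml2))

# positions of the list elements that fall into each group
pos <- split(seq_along(mylist), grp)
```

Here 'grp' holds the group membership and 'pos' holds the positions; groups with a single member (such as element 2) are still included and would need to be filtered out if only real matches are wanted.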

Here is an option using split:

Filter(function(x) length(x) >1, split(seq_along(mylist), 
        sapply(mylist, function(x) toString(sort(x))))) 
#$`maybe, no, yes` 
#[1] 5 6 7 

#$`no, yes` 
#[1] 1 3 4

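Source

2017-07-16 14:29:00 akrun

It works like a charm! Thank you all! –

Could you add some comments on how this works? Thanks, akrun! –

@EliasEstatisticsEU The idea is to 'split' the sequence of 'mylist' by a grouping 'vector' created by pasting together the sorted values of each element, and then 'Filter' the resulting 'list' for the elements whose length is greater than 1. – akrun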
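The explanation above can be unpacked into its three steps. This is only a sketch of the same one-liner written out in stages; the intermediate names 'keys', 'groups' and 'matches' are introduced here for illustration:

```r
mylist <- list(c("yes", "no"), c("no", "other", "up", "down"), c("no", "yes"),
               c("no", "yes"), c("no", "yes", "maybe"), c("no", "yes", "maybe"),
               c("no", "yes", "maybe"))

# step 1: build one key per list element from its sorted values
keys <- sapply(mylist, function(x) toString(sort(x)))

# step 2: split the positions 1, 2, ..., length(mylist) by key
groups <- split(seq_along(mylist), keys)

# step 3: keep only the keys that are shared by more than one element
matches <- Filter(function(x) length(x) > 1, groups)
```

Because 'split' orders the groups by the sorted keys, 'maybe, no, yes' comes before 'no, yes' in the result, as in the output shown in the answer.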

duplicated accepts a list as its main argument, so you can use

which(duplicated(mylist1) | duplicated(mylist1, fromLast=TRUE)) 
[1] 3 4 5 6 7

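for your first example. Note that this does not tell the groups of matching list elements apart; duplicated only returns TRUE for elements that are identical to another element.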
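One thing to keep in mind is that duplicated compares the vectors as they are, so the order of the values matters. A small sketch on the question's updated data (the same data as 'mylist2' in this answer), assuming that order should be ignored; the names 'as_given', 'sorted' and 'after_sort' are illustrative:

```r
mylist <- list(c("yes", "no"), c("no", "other", "up", "down"), c("no", "yes"),
               c("no", "yes"), c("no", "yes", "maybe"), c("no", "yes", "maybe"),
               c("no", "yes", "maybe"))

# compared as given: element 1, c("yes", "no"), is not flagged
as_given <- which(duplicated(mylist) | duplicated(mylist, fromLast = TRUE))

# compared after sorting: element 1 now matches elements 3 and 4
sorted <- lapply(mylist, sort)
after_sort <- which(duplicated(sorted) | duplicated(sorted, fromLast = TRUE))
```

Either way the result is a single vector of positions, so the group each position belongs to still has to be recovered separately.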

For the second example dataset, you can use the following to find the positions of the groups:

# get group values as integers 
groups <- as.integer(factor(sapply(mylist2, 
            function(x) paste(sort(x), collapse="")))) 
# return list of groups 
lapply(seq_len(max(groups)), function(x) which(x == groups)) 
[[1]] 
[1] 2 

[[2]] 
[1] 5 6 7 

[[3]] 
[1] 1 3 4
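A possible caveat with collapse="" is that different vectors can paste to the same key. The following sketch uses two hypothetical vectors, 'a' and 'b', that are not part of the question's data, and assumes the character "|" never occurs in the values:

```r
# two different vectors (hypothetical, not from the question)
a <- c("ab", "c")
b <- c("a", "bc")

# with an empty separator both collapse to the same key "abc"
key_empty <- sapply(list(a, b), function(x) paste(sort(x), collapse = ""))

# with a separator that does not occur in the data the keys stay distinct
key_sep <- sapply(list(a, b), function(x) paste(sort(x), collapse = "|"))
```

For the data in this question the empty separator causes no such collision, so the groups shown above are unaffected.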

Data

mylist1 <- 
list(c("yes", "no"), c("no", "other", "up", "down"), c("no", 
"yes", "maybe"), c("no", "yes", "maybe"), c("no", "yes", "maybe" 
), c("no", "yes", "maybe"), c("no", "yes", "maybe")) 

mylist2 <- 
list(c("yes", "no"), c("no", "other", "up", "down"), c("no", 
"yes"), c("no", "yes"), c("no", "yes", "maybe"), c("no", "yes", 
"maybe"), c("no", "yes", "maybe"))

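Source

2017-07-16 13:54:42 lmo

I have updated my question –

I want to distinguish between the matches; please see the updated question. Thanks –

@EliasEstatisticsEU Please avoid posting moving-target questions. It can be quite frustrating to spend time on an answer that suddenly becomes invalid after an edit (and that may even attract unjustified downvotes, as here). Please take the time to consider your question carefully before posting. Cheers. – Henrik

This works for me: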

mylist <- list(c("yes", "no"), c("no", "other", "up", "down"), c("no", "yes"),
               c("no", "yes"), c("no", "yes", "maybe"), c("no", "yes", "maybe"),
               c("no", "yes", "maybe"))

library(dplyr) 

# function to create a dataframe from your list. Might not be the most efficient way to do this. 
f <- function(data) { 
    nCol <- max(vapply(data, length, 0)) 
    data <- lapply(data, function(row) c(row, rep(NA, nCol-length(row)))) 
    data <- matrix(unlist(data), nrow=length(data), ncol=nCol, byrow=TRUE) 
    data.frame(data) 
} 

# create a dataframe from the list, and add a 'key' column 
df = f(mylist) 
df$key = apply(df , 1 , paste , collapse = "-") 

# find the total times the key occurs 
df_total = df %>% group_by(key) %>% summarise(n =n()) 

# find the indices that belong to the groups 
result = lapply(df_total$key, function(x) which(df$key==x))

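Result (the vectors are compared as given, without sorting, so element 1, c("yes", "no"), ends up in its own group rather than with elements 3 and 4):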

> result 
[[1]] 
[1] 2 

[[2]] 
[1] 5 6 7 

[[3]] 
[1] 3 4 

[[4]] 
[1] 1

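Hope this helps!

Source

2017-07-16 14:15:22 Florian

Although it works, I cannot mark it as the accepted answer because it is not a one-liner. Thanks for your answer, F Maas –

Why does it need to be a one-liner? – Florian

Because I want to keep my code clean! –

Data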

mylist <- list(c("yes", "no"), c("no", "other", "up", "down"), c("no", "yes"), 
      c("no", "yes"), c("no", "yes", "maybe"), c("no", "yes", "maybe"), 
      c("no", "yes", "maybe"))

A (long) one-liner

sapply(unique(unlist(lapply(mylist, function(x) paste(sort(x), collapse = " ")))), function(y) which(y == unlist(lapply(mylist, function(x) paste(sort(x), collapse = " ")))))

Output:

$`no yes` 
[1] 1 3 4 

$`down no other up` 
[1] 2 

$`maybe no yes` 
[1] 5 6 7
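The one-liner above builds the pasted keys twice. As a sketch of an alternative that computes them once while keeping the groups in order of first appearance, as in the output above, one could use split with an explicitly ordered factor (the names 'keys' and 'res' are illustrative):

```r
mylist <- list(c("yes", "no"), c("no", "other", "up", "down"), c("no", "yes"),
               c("no", "yes"), c("no", "yes", "maybe"), c("no", "yes", "maybe"),
               c("no", "yes", "maybe"))

# one key per list element, computed a single time
keys <- vapply(mylist, function(x) paste(sort(x), collapse = " "), character(1))

# factor levels in order of first appearance, so the groups are not re-sorted
res <- split(seq_along(keys), factor(keys, levels = unique(keys)))
```

This takes two statements instead of one, which trades the one-liner requirement for not repeating the key computation.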

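Source

2017-07-16 14:49:42 lampros

Bravo, you did it! (Greek geek?) –

Yes Elias, I am Greek (y), I hope the thread helps. – lampros

This is an interesting one. You can use mtabulate from the qdapTools package to get the following data frame,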

d1 <- qdapTools::mtabulate(mylist) 
d1 
# down maybe no other up yes 
#1 0  0 1  0 0 1 
#2 1  0 1  1 1 0 
#3 0  0 1  0 0 1 
#4 0  0 1  0 0 1 
#5 0  1 1  0 0 1 
#6 0  1 1  0 0 1 
#7 0  1 1  0 0 1

Then you can split it by pasting the rows together,

l1 <- split(d1, do.call(paste, d1)) 

l1 
#$`0 0 1 0 0 1` 
# down maybe no other up yes 
#1 0  0 1  0 0 1 
#3 0  0 1  0 0 1 
#4 0  0 1  0 0 1 

#$`0 1 1 0 0 1` 
# down maybe no other up yes 
#5 0  1 1  0 0 1 
#6 0  1 1  0 0 1 
#7 0  1 1  0 0 1 

#$`1 0 1 1 1 0` 
# down maybe no other up yes 
#2 1  0 1  1 1 0

However, you can extract what you want from that list, i.e.

or even,

setNames(lapply(l1, rownames), lapply(l1, function(i)toString(names(i)[i[1,] == 1]))) 
#$`no, yes` 
#[1] "1" "3" "4" 

#$`maybe, no, yes` 
#[1] "5" "6" "7" 

#$`down, no, other, up` 
#[1] "2"

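Source

2017-07-16 15:23:09 Sotos

After creating 'd1' (which, by the way, can be created simply as 'd1 = table(rep(1:length(mylist), lengths(mylist)), unlist(mylist))'), to avoid the coercion and the 'paste', you can use 'd1 %*% (2 ^ (0:(ncol(d1) - 1)))' to create the groups to split by –

@alexis_laz Thanks for the suggestion. I don't understand how this grouping code works (or actually doesn't, since it throws an error)... but I mean the logic behind it – Sotos

It basically converts each row to an integer, following the binary -> decimal approach. It is a better alternative to 'apply(d1, 1, function(x) sum(x * (2 ^ (0:(length(x) - 1)))))'. It has now been posted on SO. –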
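The binary -> decimal idea from the comments can be sketched in base R as follows. This assumes that every value occurs at most once within a vector, so that each row of the table contains only 0s and 1s; the names 'm', 'w', 'codes' and 'pos' are illustrative:

```r
mylist <- list(c("yes", "no"), c("no", "other", "up", "down"), c("no", "yes"),
               c("no", "yes"), c("no", "yes", "maybe"), c("no", "yes", "maybe"),
               c("no", "yes", "maybe"))

# incidence matrix: one row per list element, one column per distinct value
m <- unclass(table(rep(seq_along(mylist), lengths(mylist)), unlist(mylist)))

# one power of two per column: 1, 2, 4, ...
w <- 2^(seq_len(ncol(m)) - 1)

# read each 0/1 row as a binary number, giving one code per list element
codes <- drop(m %*% w)

# positions of the list elements that share a code
pos <- split(seq_along(mylist), codes)
```

With the columns in the order down, maybe, no, other, up, yes, the vector c("no", "yes") gets the code 4 + 32 = 36, so elements 1, 3 and 4 land in the same group without any pasting of strings.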
