EDIT: Here is a solution that assumes every first-column string has the form "var /// var2 /// ...". We first recover all unique variables like this:
resultsList <- list(c("x","0","1","1","1","5"),
c("x /// y","0","1","1","2","3"),
c("x","0","1","3","2","4"),
c("x /// z","0","1","2","2","2"),
c("x","0","1","3","3","4"),
c("x","0","0","0","1","2"),
c("x","0","2","2","1","4"),
c("x /// y","0","2","2","1","2"))
firstColumn <- sapply(resultsList,"[[",1)
listsOfVariables <- strsplit(firstColumn, " /// ", fixed = TRUE)
# Flatten the per-row variable lists and keep each variable once
uniqueVariables <- unique(unlist(listsOfVariables))
uniqueVariables
[1] "x" "y" "z"
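If the spacing around the separator is not perfectly regular, splitting on the literal " /// " can leave stray tokens. Below is a minimal sketch that assumes the separator is always three slashes with optional whitespace around it. The helper name splitVariables and the messy example vector are hypothetical and not part of the question's data.

```r
# Hypothetical helper: split first-column strings on "///",
# tolerating any amount of whitespace around the separator.
splitVariables <- function(strings) {
  tokens <- strsplit(strings, "[[:space:]]*///[[:space:]]*")
  # Trim whitespace left over at the very start or end of each string
  lapply(tokens, trimws)
}

# Made-up input with irregular spacing
messy <- c("x", "x ///y", " x  ///  z ")
unique(unlist(splitVariables(messy)))
# returns c("x", "y", "z")
```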
Next, we find which of these variables occur in every row:
# lapply always returns a list, even when every variable matches the same number of rows
matches <- lapply(uniqueVariables, function(v) grep(v, firstColumn, fixed = TRUE))
variablesMatchingAllRows <- uniqueVariables[lengths(matches) == length(resultsList)]
variablesMatchingAllRows
[1] "x"
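One caveat: grep does substring matching, so a variable named y would also "match" a row that only contains xy. A possible alternative, sketched here under the assumption that variables should be compared as whole tokens, is to intersect the per-row variable sets. The example data and the names rowVariables and inAllRows are illustrative only.

```r
# Made-up data: "y" is a substring of "xy" but is not a variable of row 2
resultsList <- list(c("x /// y", "0", "1"),
                    c("xy /// x", "0", "2"),
                    c("x /// y", "0", "3"))
firstColumn <- sapply(resultsList, "[[", 1)

# One character vector of variables per row
rowVariables <- strsplit(firstColumn, " /// ", fixed = TRUE)

# Variables present in every row = intersection of all per-row sets
inAllRows <- Reduce(intersect, rowVariables)
inAllRows
# returns "x"; a grep-based check would also report "y" here
```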
We then paste the variables together (in case more than one variable matches all of your rows):
variablesMatchingAllRowsTest <- c("x","y","z")
paste(variablesMatchingAllRowsTest,collapse=" /// ")
[1] "x /// y /// z"
We build the final first-column string and append the column sums:
> finalString <- paste(variablesMatchingAllRows,collapse=" /// ")
> c(finalString,colSums("mode<-"(do.call(rbind, resultsList)[ , -1], "numeric")))
[1] "x" "0" "9" "14" "13" "26"
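For reuse, the steps above could be wrapped in a single function. This is only a sketch: the name collapseRows and its sep argument are hypothetical, and it finds the common variables by whole-token intersection rather than with grep as in the answer above.

```r
# Hypothetical wrapper; not part of the original answer.
# rows: list of character vectors; element 1 is the "a /// b" label,
# the remaining elements are numbers stored as strings.
collapseRows <- function(rows, sep = " /// ") {
  labels <- vapply(rows, `[[`, character(1), 1)
  # Variables that occur (as whole tokens) in every label
  common <- Reduce(intersect, strsplit(labels, sep, fixed = TRUE))
  # Stack the rows into a character matrix and drop the label column
  values <- do.call(rbind, rows)[, -1, drop = FALSE]
  mode(values) <- "numeric"
  # c() coerces the numeric sums back to character
  c(paste(common, collapse = sep), colSums(values))
}

# First three rows of the example data
resultsList <- list(c("x", "0", "1", "1", "1", "5"),
                    c("x /// y", "0", "1", "1", "2", "3"),
                    c("x", "0", "1", "3", "2", "4"))
collapseRows(resultsList)
# returns c("x", "0", "3", "5", "5", "12")
```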
OLD ANSWER
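In the example below, we first find the unique string in column 1 with the smallest number of characters, and then check which of the other strings contain this minimal string. We then compute the column sums over the matching rows. We use this data as an example: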
> resultsList <- list(c("x","0","1","1","1","5"),
+ c("a b x /// y","0","1","1","2","3"),
+ c("x","0","1","3","2","4"),
+ c("a /// z","0","1","3","3","4"),
+ c("bd x","0","1","5","3","6"))
> resultsList
[[1]]
[1] "x" "0" "1" "1" "1" "5"
[[2]]
[1] "a b x /// y" "0" "1" "1" "2" "3"
[[3]]
[1] "x" "0" "1" "3" "2" "4"
[[4]]
[1] "a /// z" "0" "1" "3" "3" "4"
[[5]]
[1] "bd x" "0" "1" "5" "3" "6"
First, we find the minimalString and the indices of the rows that match it:
firstColumn <- sapply(resultsList,"[[",1)
minimalString <- unique(firstColumn[nchar(firstColumn)==min(nchar(firstColumn))])
indices <- grep(minimalString,firstColumn) # Grep on the first element in minimalString
We get:
> minimalString
[1] "x"
> indices
[1] 1 2 3 5
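Because grep treats its pattern as a regular expression by default, a minimalString containing characters such as "." or "+" could match more rows than intended. A small sketch with made-up labels (not from the question) shows the difference that fixed = TRUE makes:

```r
# Made-up labels; "." is a regex metacharacter
firstColumn <- c("a.b", "a.b /// c", "axb /// c")
minimalString <- "a.b"

# Regex match: "." matches any character, so "axb" is matched too
regexHits <- grep(minimalString, firstColumn)
# Literal match: only rows that really contain "a.b"
literalHits <- grep(minimalString, firstColumn, fixed = TRUE)

regexHits    # 1 2 3
literalHits  # 1 2
```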
In other words, every row except row 4 matches your minimalString. Next, we compute the column sums over the matching rows like this:
> c(minimalString, as.character(apply(sapply(2:6,function(x,y,z) as.numeric(sapply(y,"[[",x)),y=resultsList)[indices,],2,sum)))
[1] "x" "0" "4" "10" "8" "18"
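Let's break this down for clarity:
The inner sapply(y,"[[",x) takes element x of every entry in the list y and returns them as a vector. We do this for y = resultsList and x = 2:6. Note that we also have to convert the characters to numeric first so that we can do arithmetic on them: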
> intermediateResult <- sapply(2:6,function(x,y,z) as.numeric(sapply(y,"[[",x)),y=resultsList)
> intermediateResult
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    1    1    1    5
[2,]    0    1    1    2    3
[3,]    0    1    3    2    4
[4,]    0    1    3    3    4
[5,]    0    1    5    3    6
Next, we compute the column sums over the rows selected by indices:
> sums <- apply(intermediateResult[indices,],2,sum)
> sums
[1] 0 4 10 8 18
Finally, we convert the sums back to character and prepend the unique column-1 identifier. We get:
> finalResult <- c(minimalString,as.character(sums))
> finalResult
[1] "x" "0" "4" "10" "8" "18"
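The same sums can also be obtained without the nested sapply/apply calls. The sketch below assumes the example data and the indices found above. It uses drop = FALSE so that the subset stays a matrix even if only a single row matches, a case in which apply on intermediateResult[indices,] would fail because the subset collapses to a plain vector.

```r
resultsList <- list(c("x", "0", "1", "1", "1", "5"),
                    c("a b x /// y", "0", "1", "1", "2", "3"),
                    c("x", "0", "1", "3", "2", "4"),
                    c("a /// z", "0", "1", "3", "3", "4"),
                    c("bd x", "0", "1", "5", "3", "6"))
indices <- c(1, 2, 3, 5)  # rows matching minimalString "x", as found above

# Bind the rows into a character matrix, drop the label column, convert to numeric
values <- do.call(rbind, resultsList)[, -1, drop = FALSE]
mode(values) <- "numeric"

# drop = FALSE keeps a matrix even when indices has length 1
sums <- colSums(values[indices, , drop = FALSE])
finalResult <- c("x", as.character(sums))
finalResult
# returns c("x", "0", "4", "10", "8", "18")
```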
For your example, we get the following result:
> resultsList <- list(c("x","0","1","1","1","5"),
+ c("x /// y","0","1","1","2","3"),
+ c("x","0","1","3","2","4"),
+ c("x /// z","0","1","2","2","2"),
+ c("x","0","1","3","3","4"),
+ c("x","0","0","0","1","2"),
+ c("x","0","2","2","1","4"),
+ c("x /// y","0","2","2","1","2"))
> firstColumn <- sapply(resultsList,"[[",1)
> minimalString <- unique(firstColumn[nchar(firstColumn)==min(nchar(firstColumn))])
> indices <- grep(minimalString,firstColumn) # Grep on the first element in minimalString
> minimalString
[1] "x"
> indices
[1] 1 2 3 4 5 6 7 8
> c(minimalString, as.character(apply(sapply(2:6,function(x,y,z) as.numeric(sapply(y,"[[",x)),y=resultsList)[indices,],2,sum)))
[1] "x" "0" "9" "14" "13" "26"
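Your edit's assumption is correct: column 1 always contains strings separated by '///'. I like both answers, but yours lets me determine the variables that always appear in column 1. Thanks!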