2017-05-06 32 views
1

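I have 5 vectors, and each item in these vectors is either "yes" or "no". I want to compare these 5 vectors row by row (element by element), compute the majority vote for each row, and store the result in a new vector. How can I do this efficiently in R?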

v1=c("yes","no","no","yes") 
v2=c("no","no","yes","yes") 
v3=c("yes","yes","no","yes") 
v4=c("yes","no","yes","yes") 
v5=c("yes","yes","yes","no") 
#The expected output is "yes", "no", "yes", "yes" 

Answers

3

First, put the data into a data frame with character (not factor) columns:

dat <- data.frame(v1=c("yes","no","no","yes"), 
        v2=c("no","no","yes","yes"), 
        v3=c("yes","yes","no","yes"), 
        v4=c("yes","no","yes","yes"), 
        v5=c("yes","yes","yes","no"), stringsAsFactors=FALSE) 

Then, for each row, tabulate the values and pull out the name of the largest count in the table object:

apply(dat, 1, function(x) names(which.max(table(x)))) 
[1] "yes" "no" "yes" "yes" 
+0

It also works if the data are stored as factors, as would be the case for a "standard" data.frame – Gilles
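To illustrate the comment above, here is a minimal sketch that rebuilds the same data with factor columns and runs the same call. The name dat_f is made up for this example. The reason it should work is that apply() converts a data frame to a matrix with as.matrix() before calling the function, and a data frame of factors becomes a character matrix.

```r
# dat_f is a hypothetical name: the same data as above, stored as factors
dat_f <- data.frame(v1 = c("yes", "no", "no", "yes"),
                    v2 = c("no", "no", "yes", "yes"),
                    v3 = c("yes", "yes", "no", "yes"),
                    v4 = c("yes", "no", "yes", "yes"),
                    v5 = c("yes", "yes", "yes", "no"),
                    stringsAsFactors = TRUE)

# apply() coerces dat_f to a character matrix, so table() counts the
# labels "yes"/"no" rather than the underlying integer codes
res <- apply(dat_f, 1, function(x) names(which.max(table(x))))
print(res)
```

Note that which.max() returns the first maximum, so with an even number of vectors a tie would go to whichever label table() lists first ("no" here, since the labels are sorted). With 5 vectors and two possible values, no tie can occur.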

0

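Another approach is to use mapply with == to return a logical matrix that is TRUE wherever an element of the vectors equals a given value (here, "yes"). rowMeans then computes the proportion of TRUE values across each row, and > 0.5 checks for a majority. Adding 1 converts the logical result to a numeric position (1 or 2), which is used as an index to select from the elements of c("no", "yes").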

c("no", "yes")[(rowMeans(mapply("==", MoreArgs=list("yes"), myList)) > 0.5) + 1L] 
[1] "yes" "no" "yes" "yes" 
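The one-liner above can be unpacked into its individual steps. This is only a sketch for reading along: the intermediate names is_yes, prop_yes, idx and result are made up here, and MoreArgs is the documented argument name of mapply.

```r
myList <- list(v1 = c("yes", "no", "no", "yes"),
               v2 = c("no", "no", "yes", "yes"),
               v3 = c("yes", "yes", "no", "yes"),
               v4 = c("yes", "no", "yes", "yes"),
               v5 = c("yes", "yes", "yes", "no"))

# Step 1: 4 x 5 logical matrix, one column per vector, TRUE where the vote is "yes"
is_yes <- mapply("==", myList, MoreArgs = list("yes"))

# Step 2: proportion of "yes" votes in each row
prop_yes <- rowMeans(is_yes)

# Step 3: majority check; FALSE/TRUE plus 1L gives index 1 ("no") or 2 ("yes")
idx <- (prop_yes > 0.5) + 1L

# Step 4: look up the label for each row
result <- c("no", "yes")[idx]
print(result)
```

Because the comparison is against a single value, this approach assumes there are exactly two possible labels; anything that is not "yes" is counted as "no".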

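Another alternative uses matrix multiplication. Note that you start by putting the vectors in a list, as shown under Data below: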

c("no", "yes")[((do.call(cbind, myList) == "yes") %*% 
       rep(1, length(myList)) > (length(myList)/2)) + 1L] 
[1] "yes" "no" "yes" "yes" 
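Multiplying the logical matrix by a vector of ones simply adds up each row, so the matrix product is a count of "yes" votes per row. The sketch below checks this against rowSums, which may be easier to read; the names yes_mat, counts_mm, counts_rs and result are illustrative only.

```r
myList <- list(v1 = c("yes", "no", "no", "yes"),
               v2 = c("no", "no", "yes", "yes"),
               v3 = c("yes", "yes", "no", "yes"),
               v4 = c("yes", "no", "yes", "yes"),
               v5 = c("yes", "yes", "yes", "no"))

# Bind the vectors as columns and mark the "yes" votes
yes_mat <- do.call(cbind, myList) == "yes"

# Matrix product with a vector of ones: a 4 x 1 matrix of "yes" counts
counts_mm <- yes_mat %*% rep(1, length(myList))

# The same counts via rowSums
counts_rs <- rowSums(yes_mat)
stopifnot(all(counts_mm[, 1] == counts_rs))

# A row is "yes" when more than half of the vectors voted "yes"
result <- c("no", "yes")[(counts_rs > length(myList) / 2) + 1L]
print(result)
```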

Data

myList <- list(v1=c("yes","no","no","yes"), 
       v2=c("no","no","yes","yes"), 
       v3=c("yes","yes","no","yes"), 
       v4=c("yes","no","yes","yes"), 
       v5=c("yes","yes","yes","no"))