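Combine data frames by common values

I have two datasets of lobster egg size measurements, collected by different samplers, which will be used to assess measurement variability. Each sampler measured about 50 eggs per lobster from multiple lobsters. Occasionally, however, a lobster was processed by sampler 1 but not by sampler 2, or vice versa. I want to combine the data from both samplers into one new dataset, but drop all data for lobsters that were processed by only one sampler. I have played around with semi_join and intersect in dplyr, but I need the matching to work in both directions (dataset 1 -> 2 and dataset 2 -> 1). I was able to create a new dataset that binds the rows from both samplers, but it is not clear to me how to remove from it every lobster ID that appears in only one of the two datasets.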
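Here is a simplified version of my data, in which several egg area measurements are taken from each of several lobsters, but the sampling does not always overlap (i.e., some lobsters had eggs measured by only one sampler and not the other):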
install.packages("dplyr")  # the package name must be quoted
library(dplyr)

# Measurements from sampler 1 (Lobster3 was not processed by sampler 2)
sampler1 <- data.frame(LobsterID=c("Lobster1","Lobster1","Lobster2",
                                   "Lobster2","Lobster2","Lobster2",
                                   "Lobster2","Lobster3","Lobster3","Lobster3"),
                       Area=c(.4,.35,1.1,1.04,1.14,1.1,1.05,1.7,1.63,1.8),
                       Sampler=c(rep("Sampler1", 10)))

# Measurements from sampler 2 (Lobster4 was not processed by sampler 1)
sampler2 <- data.frame(LobsterID=c("Lobster1","Lobster1","Lobster1",
                                   "Lobster1","Lobster1","Lobster2",
                                   "Lobster2","Lobster2","Lobster4","Lobster4"),
                       Area=c(.41,.44,.47,.43,.38,1.14,1.11,1.09,1.41,1.4),
                       Sampler=c(rep("Sampler2", 10)))

# Stack both samplers, then drop the Lobster3 and Lobster4 rows by hand
combined <- bind_rows(sampler1, sampler2)
desiredresult <- combined[-c(8, 9, 10, 19, 20), ]
The last line of the script produces the desired result for the mock data. I would like to stick to base R or dplyr.
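For reference, here is a minimal sketch of one way the two-directional match described above might be written, so the rows do not have to be dropped by index. It rebuilds the mock data compactly so that it runs on its own. The names shared_ids, result and result_sj are made up for illustration, and this is only one possible approach, not necessarily the best one for the real data.

```r
library(dplyr)

# Same mock data as in the question, rebuilt compactly
sampler1 <- data.frame(
  LobsterID = rep(c("Lobster1", "Lobster2", "Lobster3"), times = c(2, 5, 3)),
  Area      = c(.4, .35, 1.1, 1.04, 1.14, 1.1, 1.05, 1.7, 1.63, 1.8),
  Sampler   = "Sampler1"
)
sampler2 <- data.frame(
  LobsterID = rep(c("Lobster1", "Lobster2", "Lobster4"), times = c(5, 3, 2)),
  Area      = c(.41, .44, .47, .43, .38, 1.14, 1.11, 1.09, 1.41, 1.4),
  Sampler   = "Sampler2"
)

# Option 1: find the IDs present in both datasets, then filter the stacked rows
shared_ids <- intersect(sampler1$LobsterID, sampler2$LobsterID)
result <- bind_rows(sampler1, sampler2) %>%
  filter(LobsterID %in% shared_ids)

# Option 2: a semi_join in each direction, then stack the two pieces
result_sj <- bind_rows(
  semi_join(sampler1, sampler2, by = "LobsterID"),
  semi_join(sampler2, sampler1, by = "LobsterID")
)
```

Both options keep every measurement row for a lobster that appears in both datasets, and neither depends on row positions, so they should carry over to data where the unmatched lobsters sit elsewhere.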
That subsets the rows nicely! Thanks! – user24537