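dplyr: lookup from the same table
I have data on how team members from multiple teams rated each other. Each person has their own ID number (StudyID), but also a rater number within their team (CATMERater), like so: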
  StudyID TeamID CATMERater Rated   Rating
    (int)  (int)      (int) (dbl)    (dbl)
1    2930    551          1     1 5.000000  # How rater 1 rated 1 (themselves)
2    2938    551          2     1 3.800000  # How rater 2 rated 1
3    2939    551          3     1 5.000000  # How rater 3 rated 1
4    2930    551          1     2 3.666667  # How rater 1 rated 2
5    2938    551          2     2 4.000000  # ...
6    2939    551          3     2 3.866667
...
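And so on. I used tidyr to get the data into this format, and I am trying to add a new column holding the StudyID of the person being rated, i.e. the StudyID from the row with the same TeamID whose CATMERater equals Rated. This is what I tried, but it didn't work because I don't know how to reference the same table: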
edges %>% mutate(RatedStudyID = filter(edges, TeamID == TeamID & Rated == CATMERater))
I hope this makes sense; I would appreciate any suggestions that point me in the right direction. If this is a left_join thing, how do I say TeamID == TeamID?
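One way to express "TeamID == TeamID" is a self-join: build a small lookup table from the same data and join it back with a named `by` vector. The following is a minimal sketch, not a confirmed solution from the thread. It uses only the first six rows shown above, and the name `lookup` is a hypothetical helper introduced for illustration. It assumes each (TeamID, CATMERater) pair maps to exactly one StudyID.

```r
library(dplyr)

# First six rows of the example data from the question
edges <- data.frame(
  StudyID    = c(2930L, 2938L, 2939L, 2930L, 2938L, 2939L),
  TeamID     = 551L,
  CATMERater = c(1L, 2L, 3L, 1L, 2L, 3L),
  Rated      = c(1, 1, 1, 2, 2, 2),
  Rating     = c(5, 3.8, 5, 3.666667, 4, 3.866667)
)

# Hypothetical lookup table: one row per (TeamID, CATMERater),
# with StudyID renamed to the name wanted in the output
lookup <- edges %>%
  distinct(TeamID, CATMERater, StudyID) %>%
  rename(RatedStudyID = StudyID)

# Join on the same TeamID, matching Rated (left) to CATMERater (right);
# rows whose Rated value has no matching rater get NA
result <- edges %>%
  left_join(lookup, by = c("TeamID", "Rated" = "CATMERater"))

result
```

Because this is a left join, a Rated value with no corresponding rater in the team produces NA in RatedStudyID instead of dropping the row.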
This is what I would like to see in the end (mainly the last column, though):
  StudyID TeamID CATMERater Rated   Rating RatedStudyID
    (int)  (int)      (int) (dbl)    (dbl)
1    2930    551          1     1 5.000000         2930
2    2938    551          2     1 3.800000         2930
3    2939    551          3     1 5.000000         2930
4    2930    551          1     2 3.666667         2938
5    2938    551          2     2 4.000000         2938
6    2939    551          3     2 3.866667         2938
...
Per @akron, here is the dput output of the data that gives an error:
structure(list(StudyID = c(2930L, 2938L, 2939L, 2930L, 2938L,
2939L, 2930L, 2938L, 2939L, 2930L, 2938L, 2939L, 2930L, 2938L,
2939L, 2930L, 2938L, 2939L, 2920L, 2941L, 2989L, 2920L, 2941L,
2989L, 2920L, 2941L, 2989L, 2920L, 2941L, 2989L, 2920L, 2941L,
2989L, 2920L, 2941L, 2989L, 2922L, 2924L, 2943L, 2922L, 2924L,
2943L, 2922L, 2924L, 2943L, 2922L, 2924L, 2943L, 2922L, 2924L
), TeamID = c(551L, 551L, 551L, 551L, 551L, 551L, 551L, 551L,
551L, 551L, 551L, 551L, 551L, 551L, 551L, 551L, 551L, 551L, 552L,
552L, 552L, 552L, 552L, 552L, 552L, 552L, 552L, 552L, 552L, 552L,
552L, 552L, 552L, 552L, 552L, 552L, 553L, 553L, 553L, 553L, 553L,
553L, 553L, 553L, 553L, 553L, 553L, 553L, 553L, 553L), CATMERater = c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L,
2L, 1L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L), Rated = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6,
6, 6, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 1,
1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5), Rating = c(5, 3.8, 5,
3.66666666666667, 4, 3.86666666666667, 4.53333333333333, 4, 4.8,
NaN, NaN, NaN, NaN, NaN, NaN, NA, NA, NA, 3.93333333333333, 5,
5, 5, 5, 5, 5, 5, 5, NaN, NaN, NaN, NaN, NaN, NaN, NA, NA, NA,
4, 4, 4, 4, 4, 4, 4, 3.86666666666667, 4, NaN, NaN, NaN, NaN,
NaN)), .Names = c("StudyID", "TeamID", "CATMERater", "Rated",
"Rating"), class = c("tbl_df", "data.frame"), row.names = c(NA,
-50L))
See [how to make a great R reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for better ways to share sample data so that it is easier to help you. – MrFlick
Can you dput the data frame? – user2600629
`edges %>% group_by(Rated, TeamID) %>% mutate(new = StudyID[CATMERater == Rated])`? – jeremycg
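A variant of the grouped approach in the comment above, sketched under assumptions: in the dput data, Rated runs from 1 to 6 while each team only has raters 1 to 3, so a logical subset such as `StudyID[CATMERater == Rated]` can come back empty for some groups, which `mutate` may not be able to recycle. Grouping by TeamID only and using base R's `match()` returns NA for those rows instead. The data below is a reduced subset of the dput output chosen for illustration, and it assumes rater numbers are unique per person within a team.

```r
library(dplyr)

# Reduced illustration: team 551 (including a Rated value with no rater)
# and team 552, whose raters are listed in a different order
edges <- data.frame(
  StudyID    = c(2930L, 2938L, 2939L, 2930L, 2938L, 2939L,
                 2920L, 2941L, 2989L, 2920L, 2941L, 2989L),
  TeamID     = rep(c(551L, 552L), each = 6),
  CATMERater = c(1L, 2L, 3L, 1L, 2L, 3L,
                 2L, 1L, 3L, 2L, 1L, 3L),
  Rated      = c(1, 1, 1, 4, 4, 4,
                 1, 1, 1, 2, 2, 2)
)

result <- edges %>%
  group_by(TeamID) %>%
  # match() gives the position of the first rater in the team whose
  # number equals Rated, or NA when no such rater exists
  mutate(RatedStudyID = StudyID[match(Rated, CATMERater)]) %>%
  ungroup()

result
```

Indexing with NA yields NA, so rows where nobody in the team carries the rated number (Rated = 4 in team 551 here) end up with a missing RatedStudyID and no error is raised.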