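I have a sample data frame like the one below, and I am trying to find the intersection between two of its columns, coauthors and nacoauthors, using the following code. Using intersect inside dplyr's mutate does not work: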
interscout =
sample_test %>%
mutate(commonauth = intersect(coauthors, nacoauthors))
The output I get is not what I expect, and I don't know why I can't get the common elements by using intersect inside mutate.
Ideally, the last row should be empty and the second row should contain only JAMES M ANDERSON as the intersection.
Here is the dput output for the data structure.
> dput(sample_test)
structure(list(fname = c("JACK", "JACK", "JACK"), lname = c("SMITH",
"SMITH", "SMITH"), cname = c("JACK SMITH", "JACK A SMITH", "JACK B SMITH"
), coauthors = list(c("AMEY S BAILEY", "JAMES M ANDERSON"), "JAMES M ANDERSON",
"JOHN MURRAY"), nacoauthors = list(c("AMEY S BAILEY", "JAMES M ANDERSON"
), c("AMEY S BAILEY", "JAMES M ANDERSON"), c("AMEY S BAILEY",
"JAMES M ANDERSON"))), row.names = c(NA, -3L), vars = list(fname,
lname), drop = TRUE, indices = list(0:2), group_sizes = 3L, biggest_group_size = 3L, labels = structure(list(
fname = "JACK", lname = "SMITH"), class = "data.frame", row.names = c(NA,
-1L), vars = list(fname, lname), drop = TRUE, .Names = c("fname",
"lname")), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), .Names = c("fname", "lname", "cname", "coauthors", "nacoauthors"
))
It throws an error because mutate expects a result with the same length as the full dataset. You can call intersect outside of dplyr, intersect(sample_test$coauthors, sample_test$nacoauthors), and it should work.
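For the row-by-row result the question asks for, one option is to apply intersect pairwise to the elements of the two list columns. Calling intersect on the two whole columns treats each list element as a single item rather than comparing row by row. The following is a minimal sketch in base R, assuming a simplified, ungrouped copy of sample_test that keeps only the columns needed here; it is one possible approach, not necessarily the one the commenter had in mind.

```r
# Simplified, ungrouped copy of the data from the dput() output above
# (assumption: grouping and the fname/lname columns are not needed here).
sample_test <- data.frame(
  cname = c("JACK SMITH", "JACK A SMITH", "JACK B SMITH"),
  stringsAsFactors = FALSE
)
sample_test$coauthors <- list(
  c("AMEY S BAILEY", "JAMES M ANDERSON"),
  "JAMES M ANDERSON",
  "JOHN MURRAY"
)
sample_test$nacoauthors <- rep(
  list(c("AMEY S BAILEY", "JAMES M ANDERSON")),
  3
)

# Map() calls intersect() once per row, pairing the i-th element of
# coauthors with the i-th element of nacoauthors, and returns a list
# with one character vector per row.
sample_test$commonauth <- Map(
  intersect,
  sample_test$coauthors,
  sample_test$nacoauthors
)

# Inspect the per-row intersections.
str(sample_test$commonauth)
```

Inside a dplyr pipeline the same idea should carry over as mutate(commonauth = Map(intersect, coauthors, nacoauthors)), since Map returns one list element per row, which matches the length mutate expects.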