dplyr/dt: summarise and paste only if a column is not empty/NA?
My data is:
Name          House  Street      Apt  City    Postal  Phone
DUMAS PAUL    2030   GREEN ROAD       DESERT  Z0K2K1  999-577-3789
DUNN S               GREEN ROAD       DESERT  Z0K2K1  999-577-3256
FERGUSON BOB         GREEN ROAD       DESERT  Z0K2K1  999-577-3772
FITSCHEN A    3989   GREEN ROAD       DESERT  Z0K2K1  999-577-3556
BLACK CARY    2079   GREEN ROAD       DESERT  Z0K2K1  999-577-3779
BLACK RUTH    2079   GREEN ROAD       DESERT  Z0K2K1  999-577-3779
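I would like to compare the rows (the data is dynamic and sorted by House). If the address fields are equal and the House number is equal, I want to concatenate the respective phone numbers with " OR ", concatenate the names with ", AND ", and drop the rows that were merged into another row.
I am using: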
data <- data %>%
  group_by(House, Street, Apt, City, Postal) %>%
  summarise(Name = first(paste(Name, collapse = ", AND ")),
            Phone = paste(unique(Phone), collapse = " OR ")) %>%
  ungroup() %>%
  arrange(Street, desc(House)) %>%
  select(colnames(dataset)) %>%
  filter(!Phone %in% dnc$`Home Phone`)
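Problem: with the dplyr code above, rows are also concatenated when House is NA (or blank; I write my NAs out as blank) and Apt is NA (or ""), which I do not want. So with the code above I get: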
Name                        House  Street      Apt  City    Postal  Phone
DUNN S, AND FERGUSON BOB           GREEN ROAD       DESERT  Z0K2K1  9995773256 OR 9995773772
DUMAS PAUL                  2030   GREEN ROAD       DESERT  Z0K2K1  9995773789
BLACK CARY, AND BLACK RUTH  2079   GREEN ROAD       DESERT  Z0K2K1  9995773779
FITSCHEN A                  3989   GREEN ROAD       DESERT  Z0K2K1  9995773556
In the output above, notice that DUNN S and FERGUSON BOB are now merged into one row. I do not want that.
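The likely cause, as far as I can tell, is that group_by() treats NA as an ordinary grouping value, so every row with a missing House lands in the same group. A minimal sketch of that behaviour, using a made-up `houses` tibble rather than my real data:

```r
library(dplyr)

# Hypothetical stand-in: two rows have no House number.
houses <- tibble(House = c("2030", NA, NA, "2079", "2079"))

# count() groups by House and counts the rows in each group;
# the two NA rows are reported as a single group of size 2.
group_sizes <- houses %>% count(House)
print(group_sizes)
```

Any summarise() that follows such a grouping will therefore paste the NA rows together, which matches what I see for DUNN S and FERGUSON BOB.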
dput output (sorry if it is not helpful):
list(structure(list(X__1 = c(NA, NA, NA, NA, NA, NA), Name = c("DUMAS PAUL",
"DUNN S", "FERGUSON BOB", "FITSCHEN A", "BLACK CARY", "BLACK RUTH"
), House = c("2030", NA, NA, "3989", "2079", "2079"), Street = c("GREEN ROAD",
"GREEN ROAD", "GREEN ROAD", "GREEN ROAD", "GREEN ROAD", "GREEN ROAD"
), Apt = c(NA, NA, NA, NA, NA, NA), City = c("DESERT", "DESERT",
"DESERT", "DESERT", "DESERT", "DESERT"), Prov = c("ZK", "ZK",
"ZK", "ZK", "ZK", "ZK"), Postal = c("Z0K2K1", "Z0K2K1", "Z0K2K1",
"Z0K2K1", "Z0K2K1", "Z0K2K1"), Phone = c("999-577-3789", "999-577-3256",
"999-577-3772", "999-577-3556", "999-577-3779", "999-577-3779"
), `Last Appear Date` = c(NA, NA, NA, NA, NA, NA)), .Names = c("X__1",
"Name", "House", "Street", "Apt", "City", "Prov", "Postal", "Phone",
"Last Appear Date"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-6L)))
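One direction that might work, offered as a sketch and not a confirmed fix: add a helper key that is NA for rows with a House number and unique for rows without one, then include it in the grouping. The `addresses` tibble and the `row_key` column below are hypothetical names, only the relevant columns are kept, and `.groups = "drop"` assumes dplyr 1.0.0 or later (with an older version, ungroup() would be used instead).

```r
library(dplyr)

# Reduced stand-in for the data above (only the columns that matter here).
addresses <- tibble(
  Name   = c("DUMAS PAUL", "DUNN S", "FERGUSON BOB",
             "FITSCHEN A", "BLACK CARY", "BLACK RUTH"),
  House  = c("2030", NA, NA, "3989", "2079", "2079"),
  Street = "GREEN ROAD",
  Phone  = c("999-577-3789", "999-577-3256", "999-577-3772",
             "999-577-3556", "999-577-3779", "999-577-3779")
)

merged <- addresses %>%
  # row_key (hypothetical helper): NA when House is present, so those rows
  # still group by address; the row number when House is NA or "", so each
  # such row forms a group of its own and is never pasted with another.
  mutate(row_key = if_else(is.na(House) | House == "",
                           row_number(), NA_integer_)) %>%
  group_by(House, Street, row_key) %>%
  summarise(
    Name  = paste(Name, collapse = ", AND "),
    Phone = paste(unique(Phone), collapse = " OR "),
    .groups = "drop"
  ) %>%
  select(Name, House, Street, Phone)

print(merged)
```

With this key, BLACK CARY and BLACK RUTH should still be merged because they share House 2079, while DUNN S and FERGUSON BOB stay on separate rows. The same idea would extend to Apt, City and Postal by adding them back to group_by().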
Thanks