2017-06-22 40 views

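How can I merge rows in one column to match the non-empty rows in another column?

I have a .csv file with two columns. The first is an ID and the second is a text field. However, the text in the text field is split into sentences that run onto separate rows, so the file looks like this: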

ID TEXT 
TXT_1 This is the first sentence 
NA This is the second sentence 
NA This is the third sentence 
TXT_2 This is the first sentence of the second text 
NA This is the second sentence of the second text 

What I would like to do is merge the text field so that it looks like this:

ID TEXT 
TXT_1 This is the first sentence This is the second sentence This is the third sentence 
TXT_2 This is the first sentence of the second text This is the second sentence of the second text 

Is there a simple solution for this in R?

Answers


We create a grouping variable based on the non-NA elements in 'ID' and paste the 'TEXT' values together:

library(dplyr)
df1 %>%
    # each non-NA ID starts a new group; the NA rows below it share that group
    group_by(Grp = cumsum(!is.na(ID))) %>%
    # keep the group's single non-NA ID and join its sentences with a space
    summarise(ID = ID[!is.na(ID)], TEXT = paste(TEXT, collapse = ' ')) %>%
    ungroup() %>%
    select(-Grp)
# A tibble: 2 x 2
#   ID    TEXT
#   <chr> <chr>
# 1 TXT_1 This is the first sentence This is the second sentence This is the third sentence
# 2 TXT_2 This is the first sentence of the second text This is the second sentence of the second text
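
To see why the grouping works, it may help to look at the intermediate key on its own. The following is a minimal base R sketch, assuming `df1` is rebuilt by hand from the sample data in the question (the question does not show how the file was read, so the construction of `df1` and the name `merged` are my assumptions). It also assumes the first row has a non-NA ID.

```r
# Sample data rebuilt from the question (an assumption; the original CSV is not shown)
df1 <- data.frame(
  ID = c("TXT_1", NA, NA, "TXT_2", NA),
  TEXT = c("This is the first sentence",
           "This is the second sentence",
           "This is the third sentence",
           "This is the first sentence of the second text",
           "This is the second sentence of the second text"),
  stringsAsFactors = FALSE
)

# TRUE where a new text starts; the running sum gives each text its own group number
grp <- cumsum(!is.na(df1$ID))
grp
# [1] 1 1 1 2 2

# Paste the sentences within each group, then attach the non-NA IDs in order
merged <- data.frame(
  ID = df1$ID[!is.na(df1$ID)],
  TEXT = as.vector(tapply(df1$TEXT, grp, paste, collapse = " ")),
  stringsAsFactors = FALSE
)
```

The dplyr pipeline above does the same thing, with `group_by()` and `summarise()` taking the place of `tapply()`.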

Or, as @Jaap suggested:

df1 %>%
    # fill each NA ID with the last non-NA value, then group by the filled ID
    group_by(ID = zoo::na.locf(ID)) %>%
    summarise(TEXT = paste(TEXT, collapse = ' '))
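
`zoo::na.locf()` carries the last non-NA observation forward, so every row inherits the ID of the text it belongs to. If the extra dependency is not wanted, a base R equivalent could be sketched roughly as follows. This is only an illustration: the vector `id` is a hypothetical stand-in for `df1$ID`, and the trick assumes the first element is not NA.

```r
# Hypothetical stand-in for df1$ID from the question
id <- c("TXT_1", NA, NA, "TXT_2", NA)

# Index the non-NA values by the running count of non-NA entries seen so far;
# this repeats each ID until the next one appears (assumes id[1] is not NA)
filled <- id[!is.na(id)][cumsum(!is.na(id))]
filled
# [1] "TXT_1" "TXT_1" "TXT_1" "TXT_2" "TXT_2"
```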

Or with the grouping variable: df1 %>% group_by(ID = zoo::na.locf(ID)) %>% summarise(TEXT = paste(TEXT, collapse = ' ')) – Jaap
