2017-03-17

Match a string and paste it into the row above

I have the following data.frame:

d <- data.frame(id = 1:20,
                name = c("Paraffinole (CAS 8042-47-5)", "Pirimicarb", "Rapsol", "Thiacloprid",
                         "Chlorantraniliprole", "Flonicamid", "Tebufenozid", "Fenoxycarb",
                         "Bacillus thuringiensis subspecies", "aizawai Stamm AB", "Methoxyfenozide",
                         "Acequinocyl", "lndoxacarb", "Acetamiprid", "Spirotet_r:amat",
                         "Cydia pomonella Granulovirus", "mexikanischer Stamm", "lmidacloprid",
                         "Spirodiclofen", "Pyrethrine"),
                desc = LETTERS[1:20])

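The name column contains two entries with the string "Stamm". I'd like to select these entries, paste each one onto the entry in the row before it, and then delete the matched row. So d$name[9] should end up looking like this: Bacillus thuringiensis subspecies__aizawai Stamm AB, and d$name[16] like this: Cydia pomonella Granulovirus__mexikanischer Stamm. After that, rows 10 and 17, i.e. d$name[c(10, 17)], should be removed.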

How can I match a string and paste it into the row above?

I guess d$name[9] should be "Bacillus thuringiensis subspecies__aizawai Stamm AB", right? – lbusett

You're right, that was a copy-paste error. – andrasz

Answers

1

How about this?

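library(stringr)
# name may be a factor (the data.frame default before R 4.0), so convert it first
d$name <- as.character(d$name)
# row indices of the entries that contain "Stamm"
where_stamm <- which(str_detect(d$name, "Stamm"))
for (i in where_stamm) {
    # append the matching entry to the entry in the row above it
    d$name[i-1] <- paste(d$name[i-1], d$name[i], sep = '__')
}
# drop the matched rows; the guard matters because d[-integer(0), ] returns zero rows
if (length(where_stamm) > 0) d <- d[-where_stamm, ]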

> d$name[9] 
[1] "Bacillus thuringiensis subspecies__aizawai Stamm AB" 
> d$name[15] 
[1] "Cydia pomonella Granulovirus__mexikanischer Stamm" 

(Note that "Cydia pomonella..." is now at position 15, because we deleted row 10.)
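The same idea can also be written without the loop and without stringr. The following is only a sketch in base R, run on a five-row subset of the question's data so that it is self-contained; it assumes that no two "Stamm" rows are adjacent, and the `hits` name and the filter for a match in row 1 are additions for illustration, not part of the answer above.

```r
# Five-row subset of the question's data, kept small for illustration
d <- data.frame(id = 1:5,
                name = c("Pirimicarb", "Bacillus thuringiensis subspecies",
                         "aizawai Stamm AB", "Cydia pomonella Granulovirus",
                         "mexikanischer Stamm"),
                desc = LETTERS[1:5],
                stringsAsFactors = FALSE)

# Row indices whose name contains the literal string "Stamm"
hits <- grep("Stamm", d$name, fixed = TRUE)
# Assumption: a match in row 1 has no row above it, so it is left untouched
hits <- hits[hits > 1]

# Vectorised paste: each matched entry is appended to the entry above it
d$name[hits - 1] <- paste(d$name[hits - 1], d$name[hits], sep = "__")

# Drop the matched rows (guarded, since d[-integer(0), ] returns zero rows)
if (length(hits) > 0) d <- d[-hits, ]
rownames(d) <- NULL

print(d$name)
```

Because the assignment happens in one step, a match that directly follows another match would be pasted onto a row that is then deleted, which is why the sketch assumes matches are never adjacent.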

Thanks, I was thinking way too complicated. Just grep the indices and loop through them! – andrasz

1

Here is a solution using dplyr:

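library(dplyr)
d %>%
    mutate(
        # flag the continuation rows (case-insensitive match on "stamm")
        to_delete = grepl("stamm", name, ignore.case = TRUE),
        # if the next row is flagged, append its name to this row's name
        name = if_else(lead(to_delete, default = FALSE),
                       paste(name, lead(name), sep = "__"),
                       as.character(name))
    ) %>%
    filter(!to_delete) %>%
    select(-to_delete)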
#    id                                                name desc
# 1   1                         Paraffinole (CAS 8042-47-5)    A
# 2   2                                          Pirimicarb    B
# 3   3                                              Rapsol    C
# 4   4                                         Thiacloprid    D
# 5   5                                 Chlorantraniliprole    E
# 6   6                                          Flonicamid    F
# 7   7                                         Tebufenozid    G
# 8   8                                          Fenoxycarb    H
# 9   9 Bacillus thuringiensis subspecies__aizawai Stamm AB    I
# 10 11                                     Methoxyfenozide    K
# 11 12                                         Acequinocyl    L
# 12 13                                          lndoxacarb    M
# 13 14                                         Acetamiprid    N
# 14 15                                     Spirotet_r:amat    O
# 15 16   Cydia pomonella Granulovirus__mexikanischer Stamm    P
# 16 18                                        lmidacloprid    R
# 17 19                                       Spirodiclofen    S
# 18 20                                          Pyrethrine    T
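If several continuation rows could follow one another, looking only one row ahead would not be enough. One possible way to cover that case is to group each row with the nearest non-matching row above it and collapse every group. The sketch below uses base R on a five-row subset of the question's data; the names `is_cont`, `grp`, `merged` and `out` are hypothetical, and it assumes the first row is never a continuation row.

```r
# Five-row subset of the question's data, kept small for illustration
d <- data.frame(id = 1:5,
                name = c("Pirimicarb", "Bacillus thuringiensis subspecies",
                         "aizawai Stamm AB", "Cydia pomonella Granulovirus",
                         "mexikanischer Stamm"),
                desc = LETTERS[1:5],
                stringsAsFactors = FALSE)

# TRUE for rows that should be folded into the row above
is_cont <- grepl("stamm", d$name, ignore.case = TRUE)

# Every non-continuation row starts a new group; continuation rows
# inherit the group number of the row above them
grp <- cumsum(!is_cont)

# Collapse the names within each group, joined by "__"
merged <- tapply(d$name, grp, paste, collapse = "__")

# Keep only the rows that start a group and give them the merged names
out <- d[!is_cont, ]
out$name <- unname(as.vector(merged))

print(out)
```

With this grouping, a run of two or more matching rows ends up in the same group as the row that precedes the run, so all of them are joined in order.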