2016-03-31 50 views

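Altering a pattern that appears multiple times in a string, in R

I have a data frame with a column in which each row represents part of a SQL SELECT statement, for example: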

library(dplyr)

test <- 
    bind_rows(
    data.frame(text = "spend_1 + spend_2", stringsAsFactors = FALSE), 
    data.frame(text = "spend_1 + spend_2 + spend_3", stringsAsFactors = FALSE), 
    data.frame(text = "spend_2 - spend_3", stringsAsFactors = FALSE) 
) 

print(test) 

Source: local data frame [3 x 1]

                         text
                        (chr)
1           spend_1 + spend_2
2 spend_1 + spend_2 + spend_3
3           spend_2 - spend_3

For every match of \w+, I would like to prepend a table alias to the variable. For example:

                         text                          text_adj
1           spend_1 + spend_2             a.spend_1 + a.spend_2
2 spend_1 + spend_2 + spend_3 a.spend_1 + a.spend_2 + a.spend_3
3           spend_2 - spend_3             a.spend_2 - a.spend_3

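Using str_replace_all I can replace each variable with "some text", but I can't figure out how to replace each match with the alias plus the original variable text.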

library(stringr) 

# replaces every variable with a fixed string, discarding the original name
str_replace_all(test$text, "\\w+", "some text") 

Answer


You just need to capture the pattern in a group and refer back to it with \\1 in the replacement. For example,

test %>% 
    # (\\w+) captures each variable name; \\1 re-inserts it after the alias
    mutate(text2 = str_replace_all(text, "(\\w+)", "alias.\\1")) 
# Source: local data frame [3 x 2]
#
#                          text                                         text2
#                         (chr)                                         (chr)
# 1           spend_1 + spend_2                 alias.spend_1 + alias.spend_2
# 2 spend_1 + spend_2 + spend_3 alias.spend_1 + alias.spend_2 + alias.spend_3
# 3           spend_2 - spend_3                 alias.spend_2 - alias.spend_3
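
For comparison, the same capture-and-backreference idea can be sketched in base R with gsub(), without stringr or dplyr. The helper names add_alias and add_alias_strict are hypothetical, made up for this illustration, and the stricter pattern assumes that identifiers start with a letter or underscore, which may not hold for every SQL dialect.

```r
# Sample input matching the question's data
text <- c("spend_1 + spend_2",
          "spend_1 + spend_2 + spend_3",
          "spend_2 - spend_3")

# add_alias() is a hypothetical helper, not part of any package.
# "(\\w+)" captures each run of word characters; "\\1" re-inserts the capture.
add_alias <- function(x, alias = "a") {
  gsub("(\\w+)", paste0(alias, ".\\1"), x)
}

# Stricter variant (an assumption, not from the original answer): only prefix
# tokens that begin with a letter or underscore, so numeric literals are left
# alone. perl = TRUE selects PCRE so that \\b is a word boundary.
add_alias_strict <- function(x, alias = "a") {
  gsub("\\b([A-Za-z_]\\w*)", paste0(alias, ".\\1"), x, perl = TRUE)
}

add_alias(text)
add_alias_strict("spend_1 * 100")
```

Note that a plain \w+ also matches bare numbers, so "spend_1 * 100" would become "a.spend_1 * a.100" with the first helper. Neither version knows about SQL keywords or function names, so both are only suitable for simple arithmetic expressions like the ones in the question.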