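R: How to replace a substring within a longer string with a new substring
I have a long vector. Each element is a string, and each string can be split into substrings separated by ', '.
I want to check whether each string in my vector contains at least one 'bad' string. If it does, the whole SUB-string that contains that 'bad' string should be replaced with a new string. I wrote a long function with loops, but I could swear there must be a simpler way to do this, maybe with stringr? Thank you very much for your advice!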
# Create an example data frame:
test <- data.frame(a = c("str1_element_1_aaa, str1_element_2",
                         "str2_element_1",
                         "str3_element_1, str3_element_2_aaa, str3_element_3"),
                   stringsAsFactors = FALSE)
test
str(test)
# Define a function that checks whether each string in a
# vector contains a substring with a "bad" string in it.
# If it does, that whole substring is replaced with a new string:
library(stringr)
mystring_replace <- function(strings_vector, badstring, newstring) {
  with_string <- grepl(badstring, strings_vector)  # which elements contain badstring?
  if (!any(with_string)) return(strings_vector)    # nothing to replace
  # split the elements that contain badstring on ', '
  # (uses the strings_vector argument rather than the global test$a)
  mysplits <- str_split(string = strings_vector[with_string], pattern = ', ')
  for (i in seq_along(mysplits)) {  # loop through the list of splits
    allstrings <- mysplits[[i]]
    for (ii in seq_along(allstrings)) {  # loop through substrings
      if (grepl(badstring, allstrings[ii])) mysplits[[i]][ii] <- newstring
    }
  }
  for (i in seq_along(mysplits)) {  # merge the split elements back together
    mysplits[[i]] <- paste(mysplits[[i]], collapse = ", ")
  }
  strings_vector[with_string] <- unlist(mysplits)
  return(strings_vector)
}
# Test
mystring_replace(test$a, badstring = '_aaa', newstring = "NEW")
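One possible simplification is a single regular-expression substitution that matches the whole comma-delimited piece around the bad string. This is only a sketch: the helper name `replace_bad_substrings` is made up for illustration, and it assumes the separator is always exactly ', ', that `badstring` is a valid regex fragment, and that `newstring` contains no backslashes.

```r
# Hypothetical helper (not from the original post): replace every
# ', '-separated piece that contains `badstring` with `newstring`.
replace_bad_substrings <- function(strings_vector, badstring, newstring) {
  # (^|, ) : start of the string or the ", " separator; captured so that
  #          it can be put back via \\1 in the replacement
  # [^,]*  : any run of non-comma characters before/after the bad string
  pattern <- paste0("(^|, )[^,]*", badstring, "[^,]*")
  gsub(pattern, paste0("\\1", newstring), strings_vector, perl = TRUE)
}

x <- c("str1_element_1_aaa, str1_element_2",
       "str2_element_1",
       "str3_element_1, str3_element_2_aaa, str3_element_3")

result <- replace_bad_substrings(x, badstring = "_aaa", newstring = "NEW")
print(result)
```

With stringr loaded, `str_replace_all(strings_vector, pattern, paste0("\\1", newstring))` should accept the same pattern and backreference, so the body could be swapped for that call if stringr is preferred.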
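Instead of using 3 for loops, you could split on the bad string and join with a good string. – numbtongue
Good idea, but that won't help me. I don't want to join with a good string. I want to replace the WHOLE substring that contains the bad string with the new substring. – user3245256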
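For what it's worth, the split-and-rejoin idea can still replace the WHOLE piece if the split is done on the ', ' separator rather than on the bad string. The sketch below uses only base R; the function name `replace_by_split` and its `sep` argument are hypothetical, and it assumes the input has no `NA` values.

```r
# Hypothetical helper (not from the original post): split each string on
# the separator, overwrite the pieces that contain `badstring`, and rejoin.
replace_by_split <- function(strings_vector, badstring, newstring, sep = ", ") {
  pieces <- strsplit(strings_vector, sep, fixed = TRUE)
  vapply(pieces, function(p) {
    p[grepl(badstring, p)] <- newstring  # vectorised over the pieces, no inner loop
    paste(p, collapse = sep)
  }, character(1), USE.NAMES = FALSE)
}

y <- c("str1_element_1_aaa, str1_element_2",
       "str2_element_1",
       "str3_element_1, str3_element_2_aaa, str3_element_3")

out <- replace_by_split(y, badstring = "_aaa", newstring = "NEW")
print(out)
```

Because `grepl()` is vectorised, the logical index handles every piece of a string at once, which removes the two inner loops from the original function.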