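Although in most cases I would recommend the stringr package, as already suggested in CPak's answer (combining stringr::str_detect with a character vector has an advantage over grepl), there is also a grep solution: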
# create the sample string
c <- ("She sold seashells by the seashore, and she had a great time while doing so.")
# match any sold and great string within the text
# ignore case so that Sold and Great are also matched
grep("(sold.*great|great.*sold)", c, value = TRUE, ignore.case = TRUE)
Hmm, not bad, right? But what if a word merely contains the string sold or great?
# set up alternative string
d <- ("She saw soldier eating seashells by the seashore, and she had a great time while doing so.")
# even soldier is matched here:
grep("(sold.*great|great.*sold)", d, value = TRUE, ignore.case = TRUE)
So you may want to use word boundaries, i.e. match whole words only:
# \\b is a metacharacter that matches word boundaries (the start or end of a word)
grep("(\\bsold\\b.*\\bgreat\\b|\\bgreat\\b.*\\bsold\\b)", d, value = TRUE, ignore.case = TRUE)
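\\b matches before the first character of the string or after its last character (if that character is a word character), and between two characters where one is a word character and the other is not.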
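To make the whole-word behaviour concrete, here is a minimal base R sketch. The sample vector `lines` and the helper names `has_sold`, `has_great` and `both` are made up for illustration and are not part of the original question. It tests each word separately with grepl and combines the results with `&`, which is one alternative to spelling out both orderings in a single pattern:

```r
# hypothetical sample lines, one element per line of text
lines <- c(
  "She sold seashells and had a great time.",  # both words
  "She saw a soldier and had a great time.",   # "soldier" is not the word "sold"
  "She sold seashells by the seashore.",       # only "sold"
  "A GREAT day: everything was Sold."          # both, reversed order, mixed case
)

# one whole-word test per word, case-insensitive
has_sold  <- grepl("\\bsold\\b",  lines, ignore.case = TRUE)
has_great <- grepl("\\bgreat\\b", lines, ignore.case = TRUE)

# keep only the lines that contain both words
both <- lines[has_sold & has_great]
print(both)
```

Only the first and last lines should be kept. Testing the words separately also scales more easily to three or more words than writing out every ordering in one alternation.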
More about the \b metacharacter here:
Source
2017-09-17 07:58:27
ira
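Thanks, but I'm actually looking for lines that contain both words, not either one. If a line contains sold but not great, I don't want that line returned. – intern14
@intern14, apologies, I misunderstood. See my edit above. –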