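Regex, R and commas
I'm having some trouble with regular expressions on strings in R. I'm trying to use a regex to extract the tags from a string (scraped from the web), as follows: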
str <- "\n\n\n \n\n\n 「Don't cry because it's over, smile because it happened.」\n ―\n Dr. Seuss\n\n\n\n\n \n tags:\n attributed-no-source,\n cry,\n crying,\n experience,\n happiness,\n joy,\n life,\n misattributed-dr-seuss,\n optimism,\n sadness,\n smile,\n smiling\n \n \n 176513 likes\n \n\n\n\n\nLike\n\n"
# Why doesn't this work at all?
stringr::str_match(str, "tags:(.+)\\d")
[,1] [,2]
[1,] NA NA
# Why just the first tag? What happens at the comma?
stringr::str_match(str, "tags:\n(.+)")
[,1] [,2]
[1,] "tags:\n attributed-no-source," " attributed-no-source,"
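So two questions: why doesn't my first idea work at all, and why does the second one capture only up to the first comma instead of running to the end of the string?
Thanks!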
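Both results are consistent with how stringr's ICU regex engine treats `.`: by default it does not match a line break. In the first pattern `.+` has to start right after `tags:`, where the next character is `\n`, so nothing matches. In the second, `.+` starts after that `\n` and runs only to the end of that line, which happens to end with the first comma. Below is a minimal sketch of one possible fix, assuming the tags themselves never contain digits and that the block is always followed by a "<number> likes" line; the shortened `str` and the ` likes` anchor are illustrative assumptions, not part of the original attempt.

```r
library(stringr)

# Shortened stand-in for the scraped string (same layout: "\n" right after "tags:")
str <- "tags:\n attributed-no-source,\n cry,\n smiling\n \n \n 176513 likes\n"

# By default `.` does not match "\n", so `.+` cannot get past "tags:"
first_try <- str_match(str, "tags:(.+)\\d")

# dotall = TRUE lets `.` cross line breaks; the lazy `.+?` stops at the like count
m <- str_match(str, regex("tags:(.+?)\\d+ likes", dotall = TRUE))

# Split the captured block on commas and trim the surrounding whitespace
tags <- str_trim(str_split(m[, 2], ",")[[1]])
tags
```

The inline flag `(?s)` at the start of the pattern should behave the same as `regex(..., dotall = TRUE)`, if you would rather keep everything in the pattern string.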
It would help if you explained what result you expect. – Dason
Did you mean `str_match(str, "tags:[^0-9]*[0-9]*")` for the first case? – akrun
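A quick sketch of why the negated-class idea from the comment gets past the line breaks: `[^0-9]` matches any character that is not a digit, and that includes `\n`, so no dotall flag is needed. The capture group and the shortened `str` here are additions for illustration only, and the approach assumes no tag contains a digit (a tag with a digit in it would cut the match short).

```r
library(stringr)

# Shortened stand-in for the scraped string
str <- "tags:\n cry,\n smiling\n \n 176513 likes\n"

# [^0-9]* consumes everything, newlines included, up to the first digit
m <- str_match(str, "tags:([^0-9]*)[0-9]")

# Column 2 holds the raw tag block, still with its commas and whitespace
m[, 2]
```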