sub can do this with the right regular expression.
Text = c("Positron emission tomography using flutemetamol (18F) with computed tomography of brain (procedure)",
"Urinary tract infection prophylaxis (procedure)",
"Xanthoma of eyelid (disorder)",
"Ventricular tachyarrhythmia (disorder)",
"Abnormal urine odor (finding)",
"Coloboma of iris (disorder)",
"Macroencephaly (disorder)",
"Right main coronary artery thrombosis (disorder)")
sub(".*\\((.*)\\).*", "\\1", Text)
[1] "procedure" "procedure" "disorder" "disorder" "finding" "disorder"
[7] "disorder" "disorder"
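Addendum: detailed explanation of the regular expression
The question asks for the contents of the final set of parentheses in each string. The expression is a bit confusing because it uses parentheses in two different ways: one to represent literal parentheses in the string being processed, and the other to set up a "capture group", which is how we specify the part that the expression should return. The expression is made up of five basic units: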
1. Initial .* - matches everything up to the final open parenthesis.
Note that this relies on "greedy matching": .* consumes as many
characters as it can while still letting the rest of the pattern match.
2. \\( ... \\) - matches the final set of parentheses.
Because ( by itself means something else in a regular expression, we
need to "escape" the parentheses by preceding them with \. That is, we
want the regular expression to say \( ... \). However, \ is also the
escape character in R strings, so if we just typed \( and \), R would
try to read \( as a string escape sequence and reject it as
unrecognized. So we escape the backslash.
R will interpret \\( ... \\) as \( ... \), meaning the literal
characters ( and ).
3. ( ... ) - the capture group, inside the pair in part 2.
This makes use of the special meaning of parentheses. When we
enclose an expression in parentheses, whatever text it matches
is stored for later use. That stored value is referred to as
\1, which is what was used in the substitution pattern. Again, if
we just wrote \1, R would interpret it as a string escape applied to
the 1. Writing \\1 is interpreted as the character \ followed by 1,
i.e. \1.
4. Central .* - inside the capture group in part 3.
This is what we are looking for, all characters inside the parentheses.
5. Final .*
This is in the expression to match any characters that may follow the
final set of parentheses.
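The five parts above can be checked one at a time. The following is only a sketch: the variable names (x, pattern, parts, last_group, first_group) are introduced here for illustration, and the lazy pattern is a contrasting variant that is not part of the answer.

```r
# One of the strings from the example; it contains two parenthesised parts.
x <- "Positron emission tomography using flutemetamol (18F) with computed tomography of brain (procedure)"
pattern <- ".*\\((.*)\\).*"

# Part 2: the R string "\\(" holds two characters, a backslash and "(".
nchar("\\(")                                  # 2

# Parts 1 and 5: the full match covers the whole string.
# Parts 3 and 4: the capture group holds the text inside the last parentheses.
parts <- regmatches(x, regexec(pattern, x))[[1]]
parts[1] == x                                 # TRUE
parts[2]                                      # "procedure"

# Greedy leading .* (as in the answer) picks the final parentheses.
last_group <- sub(pattern, "\\1", x)          # "procedure"

# For contrast only: lazy quantifiers (PCRE via perl = TRUE) pick the first pair.
first_group <- sub(".*?\\((.*?)\\).*", "\\1", x, perl = TRUE)   # "18F"
```

The contrast between last_group and first_group shows why the greedy leading .* matters for strings that contain more than one set of parentheses.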
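The sub function then replaces the matched pattern (in this case, all of the characters in the string) with the substitution pattern \1, i.e. the contents of the variable holding the first (and, in our example, only) capture group: the text inside the final parentheses.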
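One consequence worth knowing: when the pattern does not match at all, sub has nothing to replace and returns the input unchanged. The sketch below illustrates this and one possible guard; the input without parentheses and the names inputs, replaced and guarded are hypothetical, added only for illustration.

```r
pattern <- ".*\\((.*)\\).*"

# The second element is a hypothetical input with no parentheses.
inputs <- c("Macroencephaly (disorder)", "Macroencephaly")

# Where the pattern matches, the whole string is replaced by the capture
# group; where it does not match, the string comes back unchanged.
replaced <- sub(pattern, "\\1", inputs)
replaced                                      # "disorder" "Macroencephaly"

# One possible guard: return NA when there is no parenthesised part.
guarded <- ifelse(grepl(pattern, inputs),
                  sub(pattern, "\\1", inputs),
                  NA_character_)
guarded                                       # "disorder" NA
```

Whether NA is the right fallback depends on the data; it is used here only to make non-matching rows easy to spot.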
Source
2017-02-09 21:35:24
G5W
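Could you comment on the solution? I think \\1 refers to some defined element in the regular expression. It works, but it would be better to understand how it works – userJT
@userJT - Added to the answer – G5W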