R regex: extract numbers from strings based on context

s <- c('abc_1_efg', 'efg_2', 'hi2jk_lmn', 'opq')

How can I use a regular expression to get the numbers that have at least one underscore ("_") next to them? The output I actually want looks like this:

> output # The result 
[1] 1 2 
> output_l # Alternatively 
[1] TRUE TRUE FALSE FALSE

Do you want the actual numbers or the indices? –

I would like both solutions, if that can count as one question – user3375672

We can use regex lookarounds:

grep("(?<=_)\\d+", s, perl = TRUE) 
#[1] 1 2
grepl("(?<=_)\\d+", s, perl = TRUE) 
#[1] TRUE TRUE FALSE FALSE

2016-12-01 12:58:29 akrun

Use this expression:

[_]([0-9]){1}

and select group 1 to get your number. If you want to match more than one digit, use

[_]([0-9]+)

It will not match the last two strings.
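The answer above does not show how to "select group 1" in R. A minimal base R sketch, assuming regexec() with regmatches() is an acceptable way to read the capture group (the variable names m and group1 are made up for illustration):

```r
s <- c('abc_1_efg', 'efg_2', 'hi2jk_lmn', 'opq')

# regexec() reports the whole match and every capture group for each string
m <- regmatches(s, regexec("_([0-9]+)", s))

# Element 1 is the whole match ("_1"), element 2 is group 1 ("1");
# strings without a match yield character(0), mapped to NA here
group1 <- vapply(m, function(x) if (length(x) >= 2) x[2] else NA_character_,
                 character(1))

group1
# [1] "1" "2" NA  NA

as.integer(group1[!is.na(group1)])
# [1] 1 2
```

Keeping the NA entries preserves the alignment with s, so the same vector also tells you which strings matched.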

You can use this tool: https://regex101.com/

2016-12-01 13:01:00 lordyoum

If you only need the indices, use grep with a simple TRE regex (no lookarounds are necessary):

> grep("_\\d+", s) 
[1] 1 2

To get the numbers themselves, use a PCRE regex with a positive lookbehind, together with regmatches/gregexpr:

> unlist(regmatches(s, gregexpr("(?<=_)[0-9]+", s, perl=TRUE))) 
[1] "1" "2"

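Details: (?<=_) is a positive lookbehind that requires a _ immediately to the left of the current position, and [0-9]+ matches one or more digits.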
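To see why the lookbehind matters for extraction, here is a small sketch comparing it with a plain consuming pattern (the names with_underscore and digits_only are illustrative only):

```r
s <- c('abc_1_efg', 'efg_2', 'hi2jk_lmn', 'opq')

# Consuming pattern: the underscore becomes part of the extracted match
with_underscore <- unlist(regmatches(s, gregexpr("_[0-9]+", s)))
with_underscore
# [1] "_1" "_2"

# Lookbehind: the underscore is required but not included (needs perl = TRUE)
digits_only <- unlist(regmatches(s, gregexpr("(?<=_)[0-9]+", s, perl = TRUE)))
as.integer(digits_only)
# [1] 1 2
```

For grep/grepl the two patterns select the same strings, which is why the index-only solution can do without lookarounds.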

Edit: If digits to the left of a _ should also be considered, use 1) "(^|_)\\d|\\d(_|$)" with the grep solution and 2) "(?<![^_])\\d+|\\d+(?![^_])" with the number-extraction solution.
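A quick sketch of the two patterns from the edit, run on the question's vector plus one made-up element ('3_xyz') where the digit sits to the left of the underscore. Note that, as written, these patterns also accept a digit at the very start or end of a string, so check that this suits your data:

```r
# '3_xyz' is a hypothetical extra case, not part of the original question
s2 <- c('abc_1_efg', 'efg_2', 'hi2jk_lmn', 'opq', '3_xyz')

# 1) Indices of strings with a digit next to an underscore (or a string boundary)
idx <- grep("(^|_)\\d|\\d(_|$)", s2)
idx
# [1] 1 2 5

# 2) The digits themselves; the lookarounds require perl = TRUE
nums <- unlist(regmatches(s2, gregexpr("(?<![^_])\\d+|\\d+(?![^_])", s2, perl = TRUE)))
nums
# [1] "1" "2" "3"
```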

2016-12-01 13:04:11

I have added solutions for all of these scenarios. –

With stringr:

s <- c('abc_1_efg', 'efg_2', 'hi2jk_lmn', 'opq', 'a_1_b') 
library(stringr) 
which(!is.na(str_match(s, '_\\d|\\d_'))) 
# [1] 1 2 5

2016-12-01 13:11:39
