2017-03-07

Regex to match up to the first occurrence of a closing bracket

I have a character vector of strings named cars, as follows:

cars 
[1] "Only one car(52;model-14557) had a good engine(workable condition), others engine were damaged beyond repair" 
[2] "Other car(21, model-155) looked in good condition but car (36, model-8878) looked to be in terrible condition." 

I need to extract the following parts from the strings:

car(52;model-14557) 
car(21, model-155) 
car (36, model-8878) 

I tried the following snippet to extract them:

stringr::str_extract_all(cars, "(.car\\s{0,5}\\(([^]]+)\\))") 

This gave me the following output:

[[1]] 
[1] " car(52;model-14557) had a good engine(workable condition)" 

[[2]] 
[1] " car(21, model-155) looked in good condition but car (36, model-8878)" 

Is there a way to extract the word car together with its associated number and model?

Answer


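Your regex does not work because [^]]+ matches one or more characters other than ], not characters other than ( and ). Since the strings contain no ], the match runs from the first ( all the way to the last ).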
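
To make the difference concrete, here is a minimal sketch in base R that compares the two character classes on a shortened copy of the first string. The variable names x, greedy and bounded are only illustrative, and the pattern is simplified from the one in the question.

```r
# Shortened copy of the first string from the question
x <- "Only one car(52;model-14557) had a good engine(workable condition), others"

# [^]]+ excludes only "]", so the match extends to the last ")" in the string
greedy <- regmatches(x, regexpr("car\\s*\\([^]]+\\)", x))

# [^()]+ cannot cross a parenthesis, so the match ends at the first ")"
bounded <- regmatches(x, regexpr("car\\s*\\([^()]+\\)", x))

print(greedy)   # "car(52;model-14557) had a good engine(workable condition)"
print(bounded)  # "car(52;model-14557)"
```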

Use

> cars <- c("Only one car(52;model-14557) had a good engine(workable condition), others engine were damaged beyond repair","Other car(21, model-155) looked in good condition but car (36, model-8878) looked to be in terrible condition.") 
> library(stringr) 
> str_extract_all(cars, "\\bcar\\s*\\([^()]+\\)") 
[[1]] 
[1] "car(52;model-14557)" 

[[2]] 
[1] "car(21, model-155)" "car (36, model-8878)" 

The regex is \bcar\s*\([^()]+\); see the online regex demo.

It matches:

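  • \b - a word boundary
  • car - the literal character sequence car
  • \s* - zero or more whitespace characters
  • \( - a literal (
  • [^()]+ - one or more characters other than ( and )
  • \) - a literal )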
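
As a rough illustration of how these pieces interact, the sketch below runs the same pattern over a few made-up inputs. The tests vector is hypothetical and not from the question; it is only meant to show what the word boundary, the optional whitespace and the negated class each allow or reject.

```r
# Hypothetical inputs chosen to exercise each part of the pattern
tests <- c(
  "car(1, a)",         # plain match
  "car   (2, b)",      # \s* allows whitespace before the parenthesis
  "scar(3, c)",        # \b rejects "car" inside a longer word
  "car(4, (nested))"   # [^()]+ cannot cross the inner "("
)

pattern <- "\\bcar\\s*\\([^()]+\\)"
res <- regmatches(tests, gregexpr(pattern, tests))
print(res)
```

The first two inputs should each yield a single match, while the last two should yield character(0), since the pattern does not handle nested parentheses.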

Note that the same regex produces the same results with the following base R code:

> regmatches(cars, gregexpr("\\bcar\\s*\\([^()]+\\)", cars)) 
[[1]] 
[1] "car(52;model-14557)" 

[[2]] 
[1] "car(21, model-155)" "car (36, model-8878)" 

Exactly what I wanted. Thanks. – SBista
