2016-12-04 34 views
0

Multiple conditions with str_extract [R]

I have a data frame as follows:

1  Tertiary seen.
2  No tertiary seen.
3  No anything seen.
4  Tertiary everywhere.

I want to add a column that is filled only where Tertiary is seen, but not where the regular expression No.*\. matches:

1  Tertiary seen.        Tertiary
2  No tertiary seen.     NA
3  No anything seen.     NA
4  Tertiary everywhere.  Tertiary

I know I can use | with str_extract, but & does not seem to be accepted, as in the following attempt:

Mydata$newcol<-str_extract(Mydata$Text,"[Tt]ertiary&!No.*[Tt]ertiary\\.") 
+0

I would use 'Mydata$newcol[grepl("(?<!No )[Tt]ertiary", Mydata$Text, perl = TRUE)] <- "Tertiary"' (negative lookbehind)

+0

Aha. Negative lookbehind. Thanks. Please post it as an answer.

Answers

2

You could try a negative lookbehind, like

Mydata$newcol <- NA_character_
Mydata$newcol[grepl("(?<!No )[Tt]ertiary", Mydata$Text, perl = TRUE)] <- "Tertiary"
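A lookbehind only inspects the text immediately before the match, so it would not catch a string where other words sit between "No" and "tertiary". As a rough sketch of an alternative, assuming the exclusion rule is the question's own pattern No.*\., a negative lookahead anchored at the start of the string can reject the whole string first and then look for the keyword. The variable name hit is made up for illustration.

```r
# Sample data from the question
Mydata <- data.frame(
  Text = c("Tertiary seen.",
           "No tertiary seen.",
           "No anything seen.",
           "Tertiary everywhere."),
  stringsAsFactors = FALSE
)

# ^(?!.*No.*\\.)  : fail if "No" followed later by a period occurs anywhere
# .*[Tt]ertiary   : otherwise require "Tertiary" or "tertiary" somewhere
# perl = TRUE is needed because lookarounds are a PCRE feature
hit <- grepl("^(?!.*No.*\\.).*[Tt]ertiary", Mydata$Text, perl = TRUE)

# Fill the new column; rows that do not qualify stay NA
Mydata$newcol <- ifelse(hit, "Tertiary", NA_character_)
print(Mydata)
```

Building the column with ifelse() on the full logical vector avoids relying on indexed assignment into a column that does not exist yet.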
0

An "AND" pattern can be expressed as a "NOT (NOT A OR NOT B)" pattern. See also regex - Regular Expressions: Is there an AND operator? - Stack Overflow

library(dplyr) 
library(stringr) 

Mydata <- tibble(   # data_frame() is deprecated in favour of tibble()
    Text = c("Tertiary seen.", 
      "No tertiary seen.", 
      "No anything seen.", 
      "Tertiary everywhere.") 
) 

Mydata %>% 
    mutate(
    newcol = str_extract(Text, "^(^[Tt]ertiary|^No.*[Tt]ertiary\\.)") 
) 
# A tibble: 4 × 2
#   Text                 newcol
#   <chr>                <chr>
# 1 Tertiary seen.       Tertiary
# 2 No tertiary seen.    <NA>
# 3 No anything seen.    <NA>
# 4 Tertiary everywhere. Tertiary
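The "AND" idea can also be kept out of the regex entirely by testing each condition separately and combining the logical vectors. The following is only a sketch in base R, assuming condition A is "contains tertiary" and the condition to exclude is the question's pattern No.*\.; the names has_tertiary, has_negation, keep and keep_dm are hypothetical.

```r
# Sample strings from the question
Text <- c("Tertiary seen.",
          "No tertiary seen.",
          "No anything seen.",
          "Tertiary everywhere.")

has_tertiary <- grepl("[Tt]ertiary", Text)  # condition A
has_negation <- grepl("No.*\\.", Text)      # condition to exclude

# A AND (NOT excluded), written directly with vectorised operators
keep <- has_tertiary & !has_negation

# The same test via De Morgan: NOT (NOT A OR excluded)
keep_dm <- !(!has_tertiary | has_negation)

# Both forms select the same rows
newcol <- ifelse(keep, "Tertiary", NA_character_)
print(data.frame(Text, newcol))
```

Splitting the test into two simple patterns tends to be easier to read and adjust than one combined expression, at the cost of scanning each string twice.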