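Using regular expressions to match patterns in strings

I have a data frame containing a large number of strings made up of 0, 1, and N. Here are a few examples: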

a = "10000000011111111" 
b = "11111111111111111" 
c = "11111110000000NNN" 
d = "00000000000000000" 
e = "00000001111111111" 
f = "11111000000000000"

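I am looking for a way to identify strings that contain only '0' and '1' and no 'N'. My ultimate goal is to replace such strings with 'REC' in the original data frame wherever this occurs, similar to what was done in this question.

From my data above, the result would be: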

a = "REC" 
b = "11111111111111111" 
c = "11111110000000NNN" 
d = "00000000000000000" 
e = "REC" 
f = "REC"

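The main strategy I've used to achieve this (guided by the answers to a previous question) is gsub, but I can't come up with a regular expression that produces my desired output. I've made too many attempts to list here, but here is my latest function: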

markREC <- function(X) { 
gsub(X, pattern = "^(0)+.*(1)+$", 
     replacement = "REC?")}

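This function is run on the data frame with lapply.

Other strategies I've tried relied on strsplit, but I had trouble getting that to work. I can give examples if anyone would like to see them. I imagine this is simple for some of the regex experts out there, but after many hours of trying, I would love some help!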


2011-10-27 Sam Globus

Ehm, I'm not sure what you are trying to achieve with your regex.

^(0)+.*(1)+$

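actually means:

Start of the string, match at least one 0, followed by anything, followed by at least one 1 and the end of the string. So this: 032985472395871 matches :)

^(?=.*0)(?=.*1)[01]+$ will match only if the complete string consists of 0s and 1s, with at least one 0 and at least one 1.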
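To make the point above concrete, here is a small check in R. It is only an illustration under stated assumptions: the variable names are made up, and the test strings are the ones from the question plus the digit string mentioned above.

```r
# The asker's pattern: one or more 0s at the start, anything in between,
# one or more 1s at the end.
pattern <- "^(0)+.*(1)+$"

# Starts with 0 and ends with 1, so it matches despite the other digits.
matches_digits <- grepl(pattern, "032985472395871")

# String a from the question starts with 1, so it is missed.
matches_a <- grepl(pattern, "10000000011111111")

# String e from the question starts with 0 and ends with 1, so it matches.
matches_e <- grepl(pattern, "00000001111111111")

print(c(digits = matches_digits, a = matches_a, e = matches_e))
```

In other words, the pattern is both too loose (it accepts characters other than 0 and 1 in the middle) and too strict (it requires a leading 0 and a trailing 1).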

// ^(?=.*0)(?=.*1)[01]+$
//
// Assert position at the beginning of the string «^»
// Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=.*0)»
//    Match any single character that is not a line break character «.*»
//       Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
//    Match the character "0" literally «0»
// Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=.*1)»
//    Match any single character that is not a line break character «.*»
//       Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
//    Match the character "1" literally «1»
// Match a single character present in the list "01" «[01]+»
//    Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
// Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
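A minimal sketch of how this pattern might be used from R, assuming the strings are held in character columns. The helper name mark_rec and the small data frame are hypothetical and not part of the original post. Lookaheads are Perl-style syntax, so the sketch passes perl = TRUE; as far as I know, R's default regex engine does not accept them.

```r
# Hypothetical helper (not from the original post): replace every string that
# consists only of 0s and 1s, and contains at least one of each, with "REC".
mark_rec <- function(x) {
  x <- as.character(x)
  # perl = TRUE because the pattern uses lookaheads.
  hit <- grepl("^(?=.*0)(?=.*1)[01]+$", x, perl = TRUE)
  x[hit] <- "REC"
  x
}

# The six example strings from the question.
strings <- c("10000000011111111", "11111111111111111", "11111110000000NNN",
             "00000000000000000", "00000001111111111", "11111000000000000")
marked <- mark_rec(strings)
print(marked)

# Applied column by column to a made-up data frame with strings of varying length.
df <- data.frame(s1 = c("0101", "1111"),
                 s2 = c("00N1", "10"),
                 stringsAsFactors = FALSE)
df[] <- lapply(df, mark_rec)
print(df)
```

Assigning into df[] keeps the result a data frame rather than a plain list, which is one common way to combine lapply with column-wise replacement.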


2011-10-27 21:36:35 FailedDev

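This doesn't quite work, because it also picks up strings that are only 0s or only 1s, and I'd like to exclude that group. I've edited the question with my desired output. –

@SamGlobus Actually this works perfectly. I'm not sure what you are talking about. – FailedDev

The double-quoted strings are just for illustration. The strings will be in a data frame and will have different lengths. –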

The correct regex is:

"[^N]*"

I believe. This will match a string of any length unless it contains N.


2011-10-27 21:41:22 BicMacinaPimpHat

This also matches "" = the empty string. – FailedDev

Well, an empty string is still a string, isn't it? – BicMacinaPimpHat

Try this:
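The empty-match point can be checked with a short R sketch. It assumes the surrounding double quotes in the answer are just string delimiters, and the variable names are made up for illustration. Without anchors, [^N]* can match zero characters, so it succeeds on any input; anchoring it is what makes the "no N" reading hold.

```r
loose  <- "[^N]*"    # unanchored: an empty match anywhere is enough
strict <- "^[^N]*$"  # anchored: the whole string must be free of N

loose_on_n   <- grepl(loose,  "NNN")   # succeeds via an empty match
strict_on_n  <- grepl(strict, "NNN")   # fails, the string contains N
strict_on_1s <- grepl(strict, "1111")  # succeeds, although the asker wants this excluded
strict_on_empty <- grepl(strict, "")   # the empty string also matches

print(c(loose_on_n, strict_on_n, strict_on_1s, strict_on_empty))
```

Even the anchored form still accepts strings made only of 1s or only of 0s, so on its own it does not produce the output requested in the question.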

^([01]*)[^01]+([01]*)$

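This matches the start of the string, followed by zero or more 0/1 characters, followed by at least one character that isn't 0/1, followed by zero or more 0/1 characters (followed by the end of the string).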


2011-10-27 21:44:23 carpii

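This doesn't match 1111111111111 = not what you want. – FailedDev

You're right, my bad :) I was too busy focusing on the 'N' aspect and didn't notice that it shouldn't match strings containing only 1s – carpii

To match strings containing only 0s and 1s (but not strings containing only 0s or only 1s), you can do this: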

grepl("^((0)+(1)+(0|1)+)|((1)+(0)+(0|1)+)$", <string>)

For some of your examples:

> grepl("^((0)+(1)+(0|1)+)|((1)+(0)+(0|1)+)$", a) 
[1] TRUE 

> grepl("^((0)+(1)+(0|1)+)|((1)+(0)+(0|1)+)$", b) 
[1] FALSE 

> grepl("^((0)+(1)+(0|1)+)|((1)+(0)+(0|1)+)$", c) 
[1] FALSE

Now plug this into gsub:

> gsub(a, pattern="^((0)+(1)+(0|1)+)|((1)+(0)+(0|1)+)$", replacement="REC") 
[1] "REC" 

> gsub(b, pattern="^((0)+(1)+(0|1)+)|((1)+(0)+(0|1)+)$", replacement="REC") 
[1] "11111111111111111" 

> gsub(c, pattern="^((0)+(1)+(0|1)+)|((1)+(0)+(0|1)+)$", replacement="REC") 
[1] "11111110000000NNN" 

> gsub(d, pattern="^((0)+(1)+(0|1)+)|((1)+(0)+(0|1)+)$", replacement="REC") 
[1] "00000000000000000" 

> gsub(e, pattern="^((0)+(1)+(0|1)+)|((1)+(0)+(0|1)+)$", replacement="REC") 
[1] "REC" 

> gsub(f, pattern="^((0)+(1)+(0|1)+)|((1)+(0)+(0|1)+)$", replacement="REC") 
[1] "REC"
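One caveat with the pattern above, sketched here under my own assumptions rather than taken from the answer: the ^ belongs only to the first alternative and the $ only to the second, and each branch needs at least three characters. The anchored variant below is a hypothetical alternative that relies on the fact that a string of 0s and 1s containing both digits must contain "01" or "10" somewhere. It uses no lookaheads, so it should work without perl = TRUE.

```r
original <- "^((0)+(1)+(0|1)+)|((1)+(0)+(0|1)+)$"
anchored <- "^[01]*(01|10)[01]*$"  # hypothetical alternative, not from the answer

# A prefix of "0011N" satisfies the first branch, which has no end anchor.
orig_on_prefix <- grepl(original, "0011N")
anch_on_prefix <- grepl(anchored, "0011N")

# A two-character string cannot satisfy either branch of the original.
orig_on_short <- grepl(original, "01")
anch_on_short <- grepl(anchored, "01")

# The six example strings from the question.
strings <- c("10000000011111111", "11111111111111111", "11111110000000NNN",
             "00000000000000000", "00000001111111111", "11111000000000000")
marked <- sub(anchored, "REC", strings)
print(marked)
```

For the six strings in the question both patterns give the same result; the difference only shows up on inputs such as a mixed 0/1 run followed by N, or very short strings.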


2011-10-27 21:45:14
