2014-01-15 17 views
0

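Regex to split a string into contiguous alphanumeric parts

I want to split a string into substrings of consecutive characters that share some property: specifically, being alphanumeric (though I would be interested in a general solution).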

E.g. "string#example[is-like="html"].selectors"

would be matched as [string, #, example, [, is, -, like, =", html, "]., selectors]

Any ideas for a regex that does this? Thanks!

Edit: I will be using PHP's regex engine via preg_match_all.

+2

Which regex engine are you using?

Answers

2
\w+|\W+ 

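This matches one or more consecutive word characters, or one or more consecutive non-word characters. Note that \w also matches the underscore, not only letters and digits.

Output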

Array 
    (
     [0] => string 
     [1] => # 
     [2] => example 
     [3] => [ 
     [4] => is 
     [5] => - 
     [6] => like 
     [7] => =" 
     [8] => html 
     [9] => "]. 
     [10] => selectors 
    ) 
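For readers who want to try the pattern outside PHP, here is a minimal sketch in Python (chosen only for illustration; the question itself targets preg_match_all). The variable names are assumptions and not part of the original answer. re.findall returns every non-overlapping match, which plays the same role here as collecting the matches in PHP.

```python
import re

# Sample input taken from the question.
subject = 'string#example[is-like="html"].selectors'

# \w+ matches a run of word characters (letters, digits, underscore);
# \W+ matches a run of everything else.
parts = re.findall(r'\w+|\W+', subject)

print(parts)
```

Because the two alternatives cover every possible character, joining the pieces back together reproduces the original string.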
1

Use a word boundary anchor, for example in C#:

splitArray = Regex.Split(subjectString, @"\b"); 

If you want to avoid empty matches at the start/end of the string, combine it with lookaround assertions:

splitArray = Regex.Split(subjectString, @"(?<!^)\b(?!$)"); 

Explanation:

(?<!^) # Assert we're not at the start of the string 
\b  # Match a position between an alnum and a non-alnum character 
(?!$) # Assert we're not at the end of the string, either 
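The difference between the two patterns can be seen in a small Python sketch. It assumes Python 3.7 or later, where re.split splits on zero-length matches; the variable names are illustrative and not from the original answer.

```python
import re

# Sample input taken from the question.
subject = 'string#example[is-like="html"].selectors'

# Plain \b also matches at the very start and end of this string,
# so the result begins and ends with an empty string.
with_empties = re.split(r'\b', subject)

# The lookarounds exclude the boundaries at the start and end.
trimmed = re.split(r'(?<!^)\b(?!$)', subject)

print(with_empties)
print(trimmed)
```

The empty pieces only appear when the string starts or ends with a word character, as it does in this example.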

A general solution looks like this:

Suppose you want to split between digits (\d) and non-digits (\D). Then you can use

splitArray = Regex.Split(subjectString, @"(?<=\d)(?=\D)|(?<=\D)(?=\d)"); 

Explanation:

(?<=\d) # Assert that the previous character is a digit 
(?=\D) # and the next character is a non-digit. 
|  # Or: 
(?<=\D) # Assert that the previous character is a non-digit 
(?=\d) # and the next character is a digit. 
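One way to reuse this idea for other properties is to build the pattern from a character class and its complement. The sketch below is in Python (3.7 or later, for splitting on zero-length matches); the helper name split_between and its parameters are hypothetical, introduced only for illustration. Both classes must match exactly one character, since they are placed inside lookbehind assertions.

```python
import re

def split_between(text, cls, neg):
    """Split text wherever a cls character meets a neg character.

    cls and neg are single-character regex classes (hypothetical
    parameters), for example r'\d' and r'\D'.
    """
    pattern = f'(?<={cls})(?={neg})|(?<={neg})(?={cls})'
    return re.split(pattern, text)

# Digits versus non-digits, as in the answer above.
digits = split_between('abc123def45', r'\d', r'\D')

# ASCII alphanumerics versus everything else, on the question's sample.
alnum = split_between('string#example[is-like="html"].selectors',
                      r'[A-Za-z0-9]', r'[^A-Za-z0-9]')

print(digits)
print(alnum)
```

Since the pattern only matches between two characters, it never produces empty pieces at the start or end of a non-empty string.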