2014-01-11 89 views
2

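Regex: the ' symbol must not be repeated next to itself

I'm trying to write a regular expression that selects all words made of a-z, with or without the ' symbol.

  1. A word must be at least 2 characters long
  2. It cannot start with the ' symbol
  3. Two ' symbols cannot be next to each other
  4. A two-character word cannot end with the ' symbol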
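
The four rules above can be sketched as a plain function, which is handy as a reference when checking candidate patterns. This is only an illustration: the name `is_valid_word` is hypothetical, and rule 4 is read here as "a two-character word may not end with an apostrophe".

```python
def is_valid_word(word: str) -> bool:
    """Check one word against the four rules, without using a regex."""
    # Only lowercase a-z and the apostrophe are allowed at all.
    if not word or any(not ("a" <= ch <= "z" or ch == "'") for ch in word):
        return False
    if len(word) < 2:                       # rule 1: at least 2 characters
        return False
    if word[0] == "'":                      # rule 2: no leading apostrophe
        return False
    if "''" in word:                        # rule 3: no doubled apostrophe
        return False
    if len(word) == 2 and word[1] == "'":   # rule 4 (assumed reading)
        return False
    return True


for candidate in ["ab", "don't", "a'", "a''b", "'ab"]:
    print(candidate, is_valid_word(candidate))
```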

I've been working on this regex for hours and I can't get it to work:

/\b[a-z]([a-z(\')](?!\1))+\b/ 

It doesn't work, and I don't know why! (The part that fails is the rule about two ' symbols next to each other.)

Any ideas?

Answers

0
([a-z](?:[a-z]|'(?!'))+[a-z']|[a-z]{2}) 

Live @ RegExPal

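You probably don't need \b, because the regex is greedy and will consume each word as a whole.
This version can't be tested on RegexPal (which doesn't support lookbehind), but it has custom word boundaries: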

(?<![a-z'])([a-z](?:[a-z]|'(?!'))+[a-z']|[a-z]{2})(?![a-z']) 
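
As a rough check of the lookbehind version above, here is a minimal sketch using Python's `re` module, which accepts the same lookahead and lookbehind syntax. The helper name `find_words` and the sample sentence are assumptions made for illustration.

```python
import re

# Pattern copied from the answer above; (?<![a-z']) and (?![a-z'])
# act as custom word boundaries that treat the apostrophe as a word char.
WORD_RE = re.compile(
    r"(?<![a-z'])([a-z](?:[a-z]|'(?!'))+[a-z']|[a-z]{2})(?![a-z'])"
)


def find_words(text):
    """Return every substring of text that the pattern accepts as a word."""
    return WORD_RE.findall(text)


# "a'", "a''b" and the single letter "x" are skipped.
print(find_words("ab a' don't a''b abc' x"))
```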
0

This should work (disclaimer: untested):

/\b(?![a-z]{2}'\b)[a-z]((?!'')['a-z])+\b/ 

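Yours doesn't work because you are trying to nest a parenthesized group inside a character class. Inside [...] that only adds the literal characters ( and ) to the class; it does not capture anything, so the \1 that follows has no value to refer to.

(Edit) Added a constraint for aa'.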
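
The point about parentheses inside a character class can be seen in a small sketch, written here in Python's `re` for convenience. The helper names `class_accepts` and `has_doubled_apostrophe` are hypothetical.

```python
import re

# The character class from the question: inside [...] the parentheses
# are ordinary characters, so the class also accepts "(" and ")".
asker_class = re.compile(r"[a-z(\')]")


def class_accepts(ch):
    """True if the single character ch is a member of the asker's class."""
    return asker_class.fullmatch(ch) is not None


# A capture only exists when the parentheses sit outside the class;
# only then can \1 refer back to the captured apostrophe.
doubled = re.compile(r"(')\1")


def has_doubled_apostrophe(word):
    """True if word contains two apostrophes in a row."""
    return doubled.search(word) is not None


print(class_accepts("("), class_accepts("-"))
print(has_doubled_apostrophe("a''b"), has_doubled_apostrophe("don't"))
```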

0

Assuming words are separated by whitespace:

(?:^|\s)((?:[a-z]{2})|(?:[a-z](?!.*'')[a-z']{2,}))(?:$|\s) 

A Perl script showing it in action:

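use strict;
use warnings;
use feature 'say';   # say() is not enabled by default

# A word is delimited by start/end of line or by whitespace.
my $re = qr/(?:^|\s)((?:[a-z]{2})|(?:[a-z](?!.*'')[a-z']{2,}))(?:$|\s)/;
while (<DATA>) {
    chomp;
    say( /$re/ ? "OK: $_" : "KO: $_" );
}
__DATA__
ab
abc
a'
ab''
abc'
a''b
:!ù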

Output:

OK: ab 
OK: abc 
KO: a' 
KO: ab''
OK: abc' 
KO: a''b 
KO: :!ù 

Explanation (for the variant that uses \b word boundaries in place of the whitespace delimiters):

The regular expression: 

(?-imsx:\b((?:[a-z]{2})|(?:[a-z](?!.*'')[a-z']{2,}))\b) 

matches as follows: 

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    (?:                      group, but do not capture:
----------------------------------------------------------------------
      [a-z]{2}                 any character of: 'a' to 'z' (2 times)
----------------------------------------------------------------------
    )                        end of grouping
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    (?:                      group, but do not capture:
----------------------------------------------------------------------
      [a-z]                    any character of: 'a' to 'z'
----------------------------------------------------------------------
      (?!                      look ahead to see if there is not:
----------------------------------------------------------------------
        .*                       any character except \n (0 or more
                                 times (matching the most amount
                                 possible))
----------------------------------------------------------------------
        ''                       '\'\''
----------------------------------------------------------------------
      )                        end of look-ahead
----------------------------------------------------------------------
      [a-z']{2,}               any character of: 'a' to 'z', ''' (at
                               least 2 times (matching the most
                               amount possible))
----------------------------------------------------------------------
    )                        end of grouping
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
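
One consequence of the look-ahead `(?!.*'')` is worth sketching: `.*` runs to the end of the line, so a doubled apostrophe anywhere later on the line also rejects an earlier, otherwise valid word. The sketch below uses the whitespace-delimited pattern from this answer in Python's `re`; the helper name `first_word` is hypothetical, and the answer does not say whether this line-wide behaviour is intended.

```python
import re

# Whitespace-delimited pattern from the answer, unchanged.
LINE_RE = re.compile(
    r"(?:^|\s)((?:[a-z]{2})|(?:[a-z](?!.*'')[a-z']{2,}))(?:$|\s)"
)


def first_word(line):
    """Return the first word the pattern accepts on the line, or None."""
    m = LINE_RE.search(line)
    return m.group(1) if m else None


print(first_word("abc de"))     # "abc" is accepted
print(first_word("abc d''e"))   # the later '' makes "abc" fail as well
```

If each word should be judged on its own, a per-character check such as `'(?!')` keeps the test local to the word being matched.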