2014-12-23
2

I have the following text to match against. The regular expression should match the entries in this text:

RENAME_JAVASCRIPT(18), RENAME_IMAGE(7), MINIFY_JAVASCRIPT(26), (1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1), RENAME_CSS(3), (1), IMAGE_COMPRESSION(7), RESPONSIVE_IMAGES(6), ASYNC_JAVASCRIPT(2);TextTransApplied:RENAME_JAVASCRIPT(18), RENAME_IMAGE(7), MINIFY_JAVASCRIPT(26), (1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1), RENAME_CSS(3), (1), IMAGE_COMPRESSION(7), RESPONSIVE_IMAGES(6), ASYNC_JAVASCRIPT(2);TagTransAttempted:(8), ASYNC_JAVASCRIPT(61);TagTransFailed:ASYNC_JAVASCRIPT(42);TagTransApplied:(8), ASYNC_JAVASCRIPT(19); 

I have the following regular expression in Python.

import re
for ele in re.findall(r"[A-Z]+_[A-Z]+\(\d+\)", str(feed)):
    print(ele)

But this does not match JAVASCRIPT_HTML5_CACHE.

How do I specify multiple words separated by '_' that may also contain digits?
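To see the failure in isolation, here is a minimal check on single entries. The variable names are illustrative and not part of the original post. The pattern allows exactly one underscore and only uppercase letters, so the digit in HTML5 and the second underscore both prevent a match.

```python
import re

# Pattern from the question: exactly two words, uppercase letters only.
ORIGINAL = r"[A-Z]+_[A-Z]+\(\d+\)"

two_words = re.findall(ORIGINAL, "RENAME_IMAGE(7)")
three_words = re.findall(ORIGINAL, "JAVASCRIPT_HTML5_CACHE(19)")

print(two_words)    # ['RENAME_IMAGE(7)']
print(three_words)  # []
```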

+0

Replace '[A-Z]' with '[A-Z\d]' – hjpotter92

Answers

4

You can use the following regular expression.

[A-Z]+(?:_[A-Z\d]+)+\(\d+\) 

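The + repeats the preceding token one or more times, so (?:_[A-Z\d]+)+ matches one or more groups, each consisting of an underscore followed by a word. [A-Z\d]+ matches one or more uppercase letters or digits.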
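As a sketch of how the pieces fit together, the same pattern can be written with re.VERBOSE so that each part carries its own comment. The names PATTERN and sample are made up for this illustration, and the sample string is a shortened excerpt of the input.

```python
import re

# Same pattern as above, split across lines with re.VERBOSE for commenting.
PATTERN = re.compile(r"""
    [A-Z]+            # first word: one or more uppercase letters
    (?:_[A-Z\d]+)+    # one or more further words, each preceded by an underscore
    \(\d+\)           # the count, wrapped in literal parentheses
""", re.VERBOSE)

sample = "(1), JAVASCRIPT_HTML5_CACHE(19), RENAME_CSS(3), (8)"
matches = PATTERN.findall(sample)
print(matches)  # ['JAVASCRIPT_HTML5_CACHE(19)', 'RENAME_CSS(3)']
```

Because the group is non-capturing, written as (?:...), re.findall returns the whole match. With a plain capturing group it would return only the text of the group.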

DEMO

>>> import re 
>>> s = "RENAME_JAVASCRIPT(18), RENAME_IMAGE(7), MINIFY_JAVASCRIPT(26), (1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1), RENAME_CSS(3), (1), IMAGE_COMPRESSION(7), RESPONSIVE_IMAGES(6), ASYNC_JAVASCRIPT(2);TextTransApplied:RENAME_JAVASCRIPT(18), RENAME_IMAGE(7), MINIFY_JAVASCRIPT(26), (1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1), RENAME_CSS(3), (1), IMAGE_COMPRESSION(7), RESPONSIVE_IMAGES(6), ASYNC_JAVASCRIPT(2);TagTransAttempted:(8), ASYNC_JAVASCRIPT(61);TagTransFailed:ASYNC_JAVASCRIPT(42);TagTransApplied:(8), ASYNC_JAVASCRIPT(19);" 
>>> for i in re.findall(r'[A-Z]+(?:_[A-Z\d]+)+\(\d+\)', s): 
...  print(i) 
RENAME_JAVASCRIPT(18) 
RENAME_IMAGE(7) 
MINIFY_JAVASCRIPT(26) 
JAVASCRIPT_HTML5_CACHE(19) 
EMBED_JAVASCRIPT(1) 
RENAME_CSS(3) 
IMAGE_COMPRESSION(7) 
RESPONSIVE_IMAGES(6) 
ASYNC_JAVASCRIPT(2) 
RENAME_JAVASCRIPT(18) 
RENAME_IMAGE(7) 
MINIFY_JAVASCRIPT(26) 
JAVASCRIPT_HTML5_CACHE(19) 
EMBED_JAVASCRIPT(1) 
RENAME_CSS(3) 
IMAGE_COMPRESSION(7) 
RESPONSIVE_IMAGES(6) 
ASYNC_JAVASCRIPT(2) 
ASYNC_JAVASCRIPT(61) 
ASYNC_JAVASCRIPT(42) 
ASYNC_JAVASCRIPT(19) 
>>> 
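If the name and the number are needed separately, one possible extension is to add named groups. This is a sketch that goes beyond the answer above: the group names name and count, the helper parse_entries, and the shortened sample string are all assumptions made for illustration.

```python
import re

# Hypothetical variant of the pattern with named groups (not in the original answer).
ENTRY = re.compile(r"(?P<name>[A-Z]+(?:_[A-Z\d]+)+)\((?P<count>\d+)\)")

def parse_entries(text):
    """Return (name, count) pairs for every NAME(count) entry found in text."""
    return [(m.group("name"), int(m.group("count"))) for m in ENTRY.finditer(text)]

sample = "TagTransAttempted:(8), ASYNC_JAVASCRIPT(61);TagTransFailed:ASYNC_JAVASCRIPT(42);"
pairs = parse_entries(sample)
print(pairs)  # [('ASYNC_JAVASCRIPT', 61), ('ASYNC_JAVASCRIPT', 42)]
```

Section labels such as TagTransAttempted are skipped because they contain lowercase letters and no underscore.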
0

Try this one:

[A-Z]+_[A-Z]+\(\d+\)|[^,]+(?<=\s)J+[^)]+\)
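As a rough check, assuming Python's re module as in the question, this pattern can be run against a short excerpt of the input. The names ALTERNATION, sample, found, and cleaned are illustrative. The second alternative requires a whitespace character directly before a J, so the three-word entry is returned together with its leading space, and the match depends on that entry starting with J.

```python
import re

# Pattern from this answer, unchanged.
ALTERNATION = r"[A-Z]+_[A-Z]+\(\d+\)|[^,]+(?<=\s)J+[^)]+\)"

sample = "(1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1)"
found = re.findall(ALTERNATION, sample)
print(found)    # [' JAVASCRIPT_HTML5_CACHE(19)', 'EMBED_JAVASCRIPT(1)']

# Strip the leading space picked up by the second alternative.
cleaned = [item.strip() for item in found]
print(cleaned)  # ['JAVASCRIPT_HTML5_CACHE(19)', 'EMBED_JAVASCRIPT(1)']
```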