2013-03-07 77 views
1

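Python re.compile: unbalanced parenthesis error

I want to compile a regular expression that collects a sequence of hashtags (r'#\w+') from a tweet. I would like to write two regular expressions, one that does this from the start of the tweet and one that does it from the end. I am using Python 2.7.2, and my code looks like this.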

HASHTAG_SEQ_REGEX_PATTERN   = r""" 
(          #Outermost grouping to match overall regex 
#\w+         #The hashtag matching. It's a valid combination of \w+ 
([:\s,]*#\w+)*       #This is an optional (0 or more) sequence of hashtags separated by [\s,:]* 
)          #Closing parenthesis of outermost grouping to match overall regex 
""" 

LEFT_HASHTAG_REGEX_SEQ  = re.compile('^' + HASHTAG_SEQ_REGEX_PATTERN , re.VERBOSE | re.IGNORECASE) 

When the line that compiles the regex is executed, I get the following error:

sre_constants.error: unbalanced parenthesis 

I don't know why I am getting this, because I cannot see any unbalanced parentheses in my regex pattern.

Answers

5

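With re.VERBOSE, everything on this line after the first # is treated as a comment, including the closing parenthesis: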

        v---- comment starts here
([:\s,]*#\w+)* ...

Escape it:

([:\s,]*\#\w+)* 

The same goes for this line, although it does not cause the unbalanced parenthesis :)

v----escape me 
#\w+         #The hashtag matching ... 

 

HASHTAG_SEQ_REGEX_PATTERN   = r""" 
(    # Outermost grouping to match overall regex 
\#\w+    # The hashtag matching. It's a valid combination of \w+ 
([:\s,]*\#\w+)* # This is an optional (0 or more) sequence of hashtags separated by [\s,:]* 
)     # Closing parenthesis of outermost grouping to match overall regex 
""" 
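
To see the difference, here is a minimal sketch that compiles both versions of the pattern with the flags from the question. The helper `compiles`, the names `BROKEN_PATTERN`, `FIXED_PATTERN` and `left_hashtags`, and the sample tweet are made up for illustration. The exact error message differs between Python versions, so the sketch only checks whether `re.error` is raised.

```python
import re

# The question's pattern: each unescaped '#' starts a comment in VERBOSE mode.
BROKEN_PATTERN = r"""
(               # outermost group
#\w+            # meant as a hashtag, but the whole line is read as a comment
([:\s,]*#\w+)*  # everything from the first '#' on is dropped, including ')*'
)               # closes only one of the two '(' that survive
"""

# The same pattern with the literal '#' characters escaped.
FIXED_PATTERN = r"""
(                 # outermost group
\#\w+             # first hashtag
([:\s,]*\#\w+)*   # zero or more further hashtags
)                 # end of outermost group
"""


def compiles(pattern):
    """Return True if the pattern compiles under re.VERBOSE (hypothetical helper)."""
    try:
        re.compile(pattern, re.VERBOSE | re.IGNORECASE)
        return True
    except re.error:
        return False


left_hashtags = re.compile('^' + FIXED_PATTERN, re.VERBOSE | re.IGNORECASE)
match = left_hashtags.match('#python, #regex rocks')  # made-up sample tweet
leading = match.group(1) if match else None
```

Here `compiles(BROKEN_PATTERN)` is False because the engine effectively sees `(([:\s,]*)`, while the escaped version compiles and `leading` holds the run of hashtags at the start of the sample text.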

How could I be so very, very stupid! Thanks Pavel and thanks Explosion for your answers. – VaidAbhishek 2013-03-07 22:20:57

3

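You have some unescaped hashes in there that you mean to use literally, but VERBOSE is tripping you up. Escaped, the lines look like this: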

\#\w+ 
([:\s,]*\#\w+)* #reported issue caused by this hash 

Alternatively, use [#] to put a literal # into the regex where it is not meant to start a comment:

HASHTAG_SEQ_REGEX_PATTERN   = r""" 
(     #Outermost grouping to match overall regex 
[#]\w+    #The hashtag matching. It's a valid combination of \w+ 
([:\s,]*[#]\w+)*  #This is an optional (0 or more) sequence of hashtags separated by [\s,:]* 
)     #Closing parenthesis of outermost grouping to match overall regex 
""" 

I find this more readable.
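
As a rough sketch of how the character-class form could cover both cases from the question (hashtags at the start and at the end of a tweet), the pattern can be anchored on either side. `RIGHT_HASHTAGS`, `FLAGS` and the sample tweet are assumptions for illustration; only the left-anchored variant appears in the question.

```python
import re

HASHTAG_SEQ = r"""
(                    # outermost group
[#]\w+               # '#' inside a character class is a literal, not a comment
([:\s,]*[#]\w+)*     # zero or more further hashtags
)                    # end of outermost group
"""

FLAGS = re.VERBOSE | re.IGNORECASE

# HASHTAG_SEQ ends with a newline, so the appended '$' is not swallowed
# by the comment on the pattern's last line.
LEFT_HASHTAGS = re.compile('^' + HASHTAG_SEQ, FLAGS)
RIGHT_HASHTAGS = re.compile(HASHTAG_SEQ + '$', FLAGS)

tweet = '#news: #python 3 is out #release #finally'  # made-up sample

left = LEFT_HASHTAGS.match(tweet)
right = RIGHT_HASHTAGS.search(tweet)
leading = left.group(1) if left else None
trailing = right.group(1) if right else None
```

With this sample, `leading` is the run of hashtags before the first ordinary word and `trailing` is the run that ends the tweet.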

2

If you write the pattern as follows, you won't have this problem:

HASHTAG_SEQ_REGEX_PATTERN = (
    r'('               # Outermost grouping to match overall regex
    r'#\w+'            # The hashtag matching. It's a valid combination of \w+
    r'([:\s,]*#\w+)*'  # This is an optional (0 or more) sequence of hashtags separated by [\s,:]*
    r')'               # Closing parenthesis of outermost grouping to match overall regex
)

Personally, I never use re.VERBOSE, because I can never remember its rules about whitespace and everything else.
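
A short sketch of why this works: Python joins adjacent string literals before the `re` module sees them, so the comments are ordinary Python comments and `#` needs no escaping as long as re.VERBOSE is not passed. The names `HASHTAG_SEQ` and `left_hashtags` and the sample text are made up for illustration, and raw strings are used here so that `\w` and `\s` reach the regex engine unchanged.

```python
import re

HASHTAG_SEQ = (
    r'('               # outermost group
    r'#\w+'            # first hashtag; '#' is literal because re.VERBOSE is not used
    r'([:\s,]*#\w+)*'  # zero or more further hashtags
    r')'               # end of outermost group
)

# No re.VERBOSE here: whitespace and '#' in the pattern are significant.
left_hashtags = re.compile('^' + HASHTAG_SEQ, re.IGNORECASE)
match = left_hashtags.match('#a,#b: #c and more')  # made-up sample
leading = match.group(1) if match else None
```

The joined string is exactly `(#\w+([:\s,]*#\w+)*)`, which is the one-line form of the pattern.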