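How to concisely cascade through multiple regex statements in Python

My dilemma: I'm passing my function a string on which I then need to perform numerous regex manipulations. The logic is that if the first regex matches, do one thing. If it doesn't match, check for a match with the second and do something else; if that fails, check the third, and so forth. I could do something like this: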

if re.match('regex1', string): 
    match = re.match('regex1', string) 
    # Manipulate match.group(n) and return 
elif re.match('regex2', string): 
    match = re.match('regex2', string) 
    # Do second manipulation 
[etc.]

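However, this feels unnecessarily verbose, and usually when that's the case it means there's a better way that I'm either overlooking or don't yet know about.

Does anyone have a suggestion for a better way to do this (better from a code-appearance standpoint, a memory-usage standpoint, or both)?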

Source

2009-02-28 One Crayon

Duplicate: http://stackoverflow.com/questions/122277/how-do-you-translate-this-regular-expression-idiom-from-perl-into-python – 2009-02-28 12:11:47

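A similar question from back in September: How do you translate this regular-expression idiom from Perl into Python?

Using global variables in a module may not be the best way to do it, but converting the idea into a class gives: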

import re

class Re(object):
    def __init__(self):
        self.last_match = None

    def match(self, pattern, text):
        # remember the result so the caller can test it and use it afterwards
        self.last_match = re.match(pattern, text)
        return self.last_match

    def search(self, pattern, text):
        self.last_match = re.search(pattern, text)
        return self.last_match

gre = Re()
if gre.match(r'foo', text):
    pass  # do something with gre.last_match
elif gre.match(r'bar', text):
    pass  # do something with gre.last_match
else:
    pass  # do something else
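As a rough sketch of how the wrapper might be used end to end, the snippet below repeats a trimmed copy of the class so that it runs on its own, then reads captured groups from last_match. The describe function, the patterns, and the sample strings are hypothetical and only serve as illustration.

```python
import re

class Re(object):
    """Remembers the most recent match so it can be tested and used in one step."""
    def __init__(self):
        self.last_match = None

    def match(self, pattern, text):
        self.last_match = re.match(pattern, text)
        return self.last_match

def describe(text):
    # Hypothetical patterns, chosen only for illustration.
    gre = Re()
    if gre.match(r'(\d+)-(\d+)$', text):
        low, high = gre.last_match.groups()
        return 'range from %s to %s' % (low, high)
    elif gre.match(r'(\w+)@(\w+)', text):
        return 'user %s at %s' % gre.last_match.group(1, 2)
    else:
        return 'unrecognized'

print(describe('10-20'))
print(describe('crayon@example'))
```

Each pattern appears once, and each branch still gets its own match object through gre.last_match.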

Source

2009-02-28 05:32:56

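Thanks for the link! I didn't find that thread in my searches, but it is exactly what I'm trying to do. I like the idea of using a class rather than a module. – 2009-02-28 16:22:36

Hmm... you could use something with the with construct... um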

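import re

class rewrapper(object):
    # One possible way to fill in the placeholders; the original left them open.
    def __init__(self, pattern, target):
        self.pattern = pattern
        self.target = target

    def __enter__(self):
        # the value bound by "as"; it is None when the pattern does not match
        return re.match(self.pattern, self.target)

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # do not suppress exceptions


with rewrapper("regex1", string) as match:
    if match:  # the with body always runs, so the match still has to be tested
        pass  # etc

with rewrapper("regex2", string) as match:
    if match:
        pass  # and so forth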

Source

2009-02-28 04:24:22 SingleNegationElimination

Are the manipulations for each regex similar? If so, try this:

for regex in ('regex1', 'regex2', 'regex3', 'regex4'):
    match = re.match(regex, string)
    if match:
        # Manipulate match.group(n)
        return result
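The loop above needs an enclosing function for its return to be valid. A minimal runnable sketch follows, assuming every pattern captures the value of interest in group 1; the function name first_group and the patterns are hypothetical.

```python
import re

def first_group(string, patterns=(r'id=(\w+)', r'name=(\w+)', r'(\w+)')):
    """Return group(1) of the first pattern that matches, or None."""
    for regex in patterns:
        match = re.match(regex, string)
        if match:
            # The same manipulation is applied whichever pattern matched.
            return match.group(1)
    return None

print(first_group('name=bob'))
```

This only helps when the handling is uniform across patterns.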

Source

2009-02-28 04:25:47

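Unfortunately, the manipulations vary depending on the regex; in retrospect, I should have specified that in the question. – 2009-02-28 05:05:13

Here your regexes and matches are not repeated twice: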

match = re.match('regex1', string) 
if match: 
    # do stuff 
    return 

match = re.match('regex2', string) 
if match: 
    # do stuff 
    return
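On Python 3.8 and later, an assignment expression (the ":=" operator) lets the match be bound and tested in the same condition, which keeps the original if/elif shape without repeating any pattern. This is a different technique from the one in this answer, shown only as a sketch; the classify function and its patterns are hypothetical.

```python
import re

def classify(string):
    # Requires Python 3.8+ for ":="; the patterns are made up for illustration.
    if match := re.match(r'(\d+)$', string):
        return ('number', int(match.group(1)))
    elif match := re.match(r'([A-Za-z]+)$', string):
        return ('word', match.group(1).lower())
    return ('unknown', string)

print(classify('42'))
print(classify('Foo'))
```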

Source

2009-02-28 04:30:40

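Generally, in these sorts of situations, you want to make the code "data driven". That is, put the important information in a container and loop through it.

In your case, the important information is (string, function) pairs.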

import re

def fun1():
    print('fun1')

def fun2():
    print('fun2')

def fun3():
    print('fun3')

regex_handlers = [
    (r'regex1', fun1),
    (r'regex2', fun2),
    (r'regex3', fun3)
]

def example(string):
    for regex, fun in regex_handlers:
        if re.match(regex, string):
            fun()  # call the function
            break

example('regex2')
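Since the question needs a different manipulation of match.group(n) for each regex, one possible variation on the data-driven idea is to pass the match object to the handler. The sketch below assumes hypothetical handlers and patterns; none of these names come from the thread.

```python
import re

def handle_add(match):
    # Uses two captured numbers.
    return int(match.group(1)) + int(match.group(2))

def handle_shout(match):
    # Uses one captured word.
    return match.group(1).upper()

# Order matters: the first pattern that matches wins.
regex_handlers = [
    (re.compile(r'add (\d+) (\d+)$'), handle_add),
    (re.compile(r'shout (\w+)$'), handle_shout),
]

def dispatch(string):
    for pattern, handler in regex_handlers:
        match = pattern.match(string)
        if match:
            return handler(match)  # each handler does its own manipulation
    return None

print(dispatch('add 2 3'))
print(dispatch('shout hi'))
```

Adding a new case then means adding one (pattern, handler) pair rather than another elif branch.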

Source

2009-02-28 04:31:57

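Thanks for the suggestion! This is probably what I'll end up doing eventually, but the overridden version of the re module suits this project better. – 2009-02-28 16:23:40

I had the same problem as you. Here's my solution: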

import re

regexp = {
    'key1': re.compile(r'regexp1'),
    'key2': re.compile(r'regexp2'),
    'key3': re.compile(r'regexp3'),
    # ...
}

def test_all_regexp(string):
    for key, pattern in regexp.items():
        m = pattern.match(string)
        if m:
            # do what you want
            break

It's slightly modified from Extracting info from large structured text files

Source

2009-03-02 14:27:29

Dictionaries do not guarantee ordering. You should probably use a sequence instead of a dictionary to get predictable behaviour. – 2018-01-27 21:41:23
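A minimal sketch of the sequence-based variant follows. Dictionaries do preserve insertion order from Python 3.7 onward, but a list of pairs makes the try-order explicit on any version. The keys and patterns here are hypothetical and chosen so that the order visibly matters.

```python
import re

# The more specific pattern is listed first; swapping the two entries
# would make '3.14' be reported as 'int'.
PATTERNS = [
    ('float', re.compile(r'\d+\.\d+')),
    ('int', re.compile(r'\d+')),
]

def first_key(string):
    """Return the key of the first pattern that matches, or None."""
    for key, pattern in PATTERNS:
        if pattern.match(string):
            return key
    return None

print(first_key('3.14'))
print(first_key('3'))
```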

import re

class RegexStore(object):
    def __init__(self, pat_list):
        # build RegEx searches
        self._searches = [(name, re.compile(pat, re.VERBOSE))
                          for name, pat in pat_list]

    def match(self, text):
        # lazily try each pattern in the order given
        match_all = ((name, regex.match(text)) for name, regex in self._searches)
        # return the first pair whose match succeeded; if none did,
        # return the bad 'text' line in place of a name
        return next((pair for pair in match_all if pair[1]), (text, None))

You can use this class like so:

rs = RegexStore((('pat1', r'.*STRING1.*'),
                 ('pat2', r'.*STRING2.*')))
name, match = rs.match("MY SAMPLE STRING1")

if name == 'pat1':
    print('found pat1')
elif name == 'pat2':
    print('found pat2')
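A related approach, not the one used in this answer, is to join the patterns into a single alternation of named groups and read match.lastgroup to learn which alternative matched. The sketch below reuses the two sample patterns; the names COMBINED and which are hypothetical, and it assumes the individual patterns contain no capturing groups of their own.

```python
import re

# Each alternative is wrapped in a named group; alternatives are tried left to right.
COMBINED = re.compile(r'(?P<pat1>.*STRING1.*)|(?P<pat2>.*STRING2.*)')

def which(text):
    match = COMBINED.match(text)
    if match:
        # lastgroup holds the name of the group that matched
        return (match.lastgroup, match)
    return (text, None)

name, match = which("MY SAMPLE STRING1")
print(name)
```

This compiles everything once and scans the string with a single regex, at the cost of one larger pattern that is harder to read.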

Source

2011-05-28 22:46:01 cmcginty
