2014-05-12 81 views

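Match only strings that have a string after the last underscore

I want to match strings that contain underscores. All of my strings have underscores throughout, but I only want to match the ones that have a further string after the last underscore. Let me give an example: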

s = "hello_world" 
s1 = "hello_world_foo" 
s2 = "hello_world_foo_boo" 

In my case, I only want to capture s1 and s2.

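I started with the following, but I can't really figure out how to write the match so that it captures only the strings that have another string after the underscore following hello_world.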

rgx = re.compile(ur'(?P<firstpart>\w+)[_]+(?P<secondpart>\w+)$', re.I | re.U) 
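For reference, here is a minimal sketch that runs the pattern above against the three samples. It assumes Python 3, so the `ur''` prefix (Python 2 only) is written as a plain raw string, and `captured` is just an illustrative name. Because `\w` also matches `_`, the pattern accepts all three strings, including `hello_world`, which is why it does not separate s from s1 and s2.

```python
import re

# Python 3 spelling of the question's pattern; the ur'' prefix only exists in Python 2.
rgx = re.compile(r'(?P<firstpart>\w+)[_]+(?P<secondpart>\w+)$', re.I | re.U)

samples = ["hello_world", "hello_world_foo", "hello_world_foo_boo"]

# Map each sample to its named groups, or to None when the pattern does not match.
captured = {}
for text in samples:
    m = rgx.match(text)
    captured[text] = m.groupdict() if m else None

for text, groups in captured.items():
    print(text, groups)
```

The greedy `firstpart` swallows everything up to the last underscore, so `secondpart` always ends up as the final segment.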

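Some clarification would be nice. The last underscore in 's' is between o and w; in 's1' it is between d and f; and in 's2' it is between o and b (but you later say that you want the 'string' after the first underscore, the one between o and w). It seems that your meaning of 'string' and 'capture' is different from mine. Could you elaborate on what exactly you want? – Jerry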

Answers

If I understand what you are asking (you want to match strings that have more than one underscore, with text after the last one):

rgx = re.compile(ur'(?P<firstpart>\w+)[_]+(?P<secondpart>\w+)_[^_]+$', re.I | re.U) 
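A quick way to check this pattern against the question's samples, as a sketch assuming Python 3 (the `ur''` prefix becomes a plain raw string; `samples` and `matched` are illustrative names):

```python
import re

# Same pattern as above, written as a Python 3 raw string.
rgx = re.compile(r'(?P<firstpart>\w+)[_]+(?P<secondpart>\w+)_[^_]+$', re.I | re.U)

samples = ["hello_world", "hello_world_foo", "hello_world_foo_boo"]

# Keep only the samples the pattern accepts.
matched = [text for text in samples if rgx.match(text)]
print(matched)
```

The pattern needs at least two literal underscores, so `hello_world` is rejected. Note that the final `[^_]+` segment is not inside a named group, so it is only available through the full match.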

Try this:

reobj = re.compile("^(?P<firstpart>[a-z]+)_(?P<secondpart>[a-z]+)_(?P<lastpart>.*?)$", re.IGNORECASE) 
result = reobj.findall(subject) 
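To see what `findall` returns here, the snippet can be run over the question's three strings. This is only a sketch: `samples` and `found` are illustrative names, and `subject` is assumed to be a single string to test.

```python
import re

reobj = re.compile(
    r"^(?P<firstpart>[a-z]+)_(?P<secondpart>[a-z]+)_(?P<lastpart>.*?)$",
    re.IGNORECASE,
)

samples = ["hello_world", "hello_world_foo", "hello_world_foo_boo"]

# With several groups in the pattern, findall returns a list of tuples,
# one tuple per match, and an empty list when nothing matches.
found = {subject: reobj.findall(subject) for subject in samples}

for subject, result in found.items():
    print(subject, result)
```

Since `[a-z]+` cannot cross an underscore, the first two groups are always the first two segments, and everything after the second underscore lands in `lastpart`.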

Explanation of the regular expression

^(?P<firstpart>[a-z]+)_(?P<secondpart>[a-z]+)_(?P<lastpart>.*?)$ 

Options: case insensitive 

Assert position at the beginning of the string «^» 
Match the regular expression below and capture its match into backreference with name "firstpart" «(?P<firstpart>[a-z]+)»
    Match a single character in the range between "a" and "z" «[a-z]+»
     Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character "_" literally «_»
Match the regular expression below and capture its match into backreference with name "secondpart" «(?P<secondpart>[a-z]+)»
    Match a single character in the range between "a" and "z" «[a-z]+»
     Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character "_" literally «_»
Match the regular expression below and capture its match into backreference with name "lastpart" «(?P<lastpart>.*?)»
    Match any single character that is not a line break character «.*?» 
     Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?» 
Assert position at the end of the string (or before the line break at the end of the string, if any) «$» 
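
One consequence of the "between zero and unlimited times" quantifier on `lastpart`: a string that ends in an underscore with nothing after it still matches, with an empty `lastpart`. The sketch below illustrates this. The `strict` pattern is a hypothetical variant that is not part of the answer; it swaps `.*?` for `.+` so that at least one character is required after the second underscore.

```python
import re

# Pattern from the answer: lastpart may be empty because of *?.
original = re.compile(
    r"^(?P<firstpart>[a-z]+)_(?P<secondpart>[a-z]+)_(?P<lastpart>.*?)$",
    re.IGNORECASE,
)

# Hypothetical variant (an assumption, not from the answer):
# .+ demands at least one character in lastpart.
strict = re.compile(
    r"^(?P<firstpart>[a-z]+)_(?P<secondpart>[a-z]+)_(?P<lastpart>.+)$",
    re.IGNORECASE,
)

trailing = "hello_world_"
original_match = original.match(trailing)
strict_match = strict.match(trailing)

print(original_match is not None, strict_match is not None)
```

Whether the empty case should count as a match depends on the data, so this may or may not matter for the question's strings.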