2013-06-11 58 views
1

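Regex: last occurrence of a pattern that occurs before another pattern

I want to get the last occurrence of one text pattern that appears before another text pattern.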

For example, I have text like this:

code 4ab6-7b5 
Another lorem ipsum 
Random commentary. 

code f6ee-304 
Lorem ipsum text 
Dummy text 

code: ebf6-649 
Other random text 
id-x: 7662dd41-29b5-9646-a4bc-1f6e16e8095e 

code: abcd-ebf 
Random text 
id-x: 7662dd41-29b5-9646-a4bc-1f6e16e8095e 

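I want to get the last code that occurs before the first occurrence of id-x (which means I want to get the code ebf6-649).

How can I do this with a regular expression?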

+0

I'm using Git Bash, so I assume it's a UNIX engine –

+0

By "with a regular expression", I assume you mean "with git-bash"? (I mean, why do you care whether the answer happens to use a regex?) – ruakh

+0

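@JasonSwartz I actually think the previous version of this question was perfectly fine and would have gotten you more useful answers. A solution for this limited form may produce false positives on your actual input. –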

Answers

7

If your regex flavor supports lookaheads, you can use a solution like this:

^code:[ ]([0-9a-f-]+)(?:(?!^code:[ ])[\s\S])*id-x 

You can find your result in capturing group 1.
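As a minimal sketch of how the pattern might be applied, here it is in Python's `re` module, run against the sample text from the question. Python is an assumption (the asker mentioned Git Bash), and the helper name `first_code_before_idx` is made up for illustration.

```python
import re

# Sample input copied from the question (trailing spaces dropped).
TEXT = """\
code 4ab6-7b5
Another lorem ipsum
Random commentary.

code f6ee-304
Lorem ipsum text
Dummy text

code: ebf6-649
Other random text
id-x: 7662dd41-29b5-9646-a4bc-1f6e16e8095e

code: abcd-ebf
Random text
id-x: 7662dd41-29b5-9646-a4bc-1f6e16e8095e
"""

# re.MULTILINE makes ^ match at the start of every line,
# not only at the start of the whole string.
PATTERN = re.compile(
    r"^code:[ ]([0-9a-f-]+)(?:(?!^code:[ ])[\s\S])*id-x",
    re.MULTILINE,
)


def first_code_before_idx(text):
    """Return the code of the first block that contains id-x, or None."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(first_code_before_idx(TEXT))  # ebf6-649
```

`search` returns the leftmost match, so the first block that contains id-x wins, and its code is by construction the last code before that id-x.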

How does it work?

^code:[ ]         # match "code: " at the beginning of a line; the square
                  # brackets are just to aid readability. I recommend always
                  # using them for literal spaces.

(                 # capturing group 1, your key
    [0-9a-f-]+    # match one or more hex digits or hyphens
)                 # end of group 1

(?:               # start a non-capturing group; each "instance" of this group
                  # will match a single arbitrary character that does not start
                  # a new "code: " (hence this cannot go beyond the current
                  # block)

    (?!           # negative lookahead; this does not consume any characters,
                  # but causes the pattern to fail if its subpattern could
                  # match here

        ^code:[ ] # match the beginning of a new block (i.e. "code: " at the
                  # beginning of another line)

    )             # end of negative lookahead; if we've reached the beginning
                  # of a new block, this will cause the non-capturing group to
                  # fail. Otherwise just ignore this.

    [\s\S]        # match one arbitrary character
)*                # end of non-capturing group, repeat 0 or more times
id-x              # match "id-x" literally

The (?:(?!stopword)[\s\S])* pattern lets you match as much as possible without going beyond another occurrence of stopword.
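To see what this construct buys you, the following small sketch compares it with a plain lazy match. It uses Python's `re` module, and the `START`/`END` markers are made-up stand-ins for the stopword and the terminating pattern.

```python
import re

text = "START one START two END"

# Lazy match: begins at the first START and happily runs across the second one.
lazy = re.search(r"START([\s\S]*?)END", text)

# Guarded match: a character is consumed only if no START begins at that
# position, so the match cannot cross into another START.
tempered = re.search(r"START((?:(?!START)[\s\S])*)END", text)

print(repr(lazy.group(1)))      # ' one START two '
print(repr(tempered.group(1)))  # ' two '
```

The guarded version fails at the first START, because it reaches the second START before finding END, so the engine moves on and matches from the last START before END.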

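Note that for ^ you may have to enable some form of multiline mode so that it matches at the beginning of each line. The ^ is important for avoiding false negatives if your random text contains code: somewhere in the middle of a line.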
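Both points can be checked with a short experiment. This is a sketch in Python's `re` module; the input text and the `find` helper are invented for illustration, with a block body that happens to contain "code: " in the middle of a line.

```python
import re

text = (
    "code: aaaa-111\n"
    "no marker in this block\n"
    "\n"
    "code: bbbb-222\n"
    "the log says code: unknown\n"
    "id-x: 1234\n"
)

anchored = r"^code:[ ]([0-9a-f-]+)(?:(?!^code:[ ])[\s\S])*id-x"
unanchored = r"code:[ ]([0-9a-f-]+)(?:(?!code:[ ])[\s\S])*id-x"


def find(pattern, flags=0):
    """Return capture group 1 of the first match in text, or None."""
    m = re.search(pattern, text, flags)
    return m.group(1) if m else None


# With multiline mode, ^ matches at every line start: the correct block is found.
print(find(anchored, re.MULTILINE))  # bbbb-222

# Without multiline mode, ^ matches only at the start of the string, so the
# lookahead never fires and the match runs across blocks: wrong code.
print(find(anchored))                # aaaa-111

# Without ^, the mid-line "code: " stops the match before id-x: no match.
print(find(unanchored))              # None
```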

Working demo (using Ruby's regex flavor, since I don't know which one you will end up using)
