Regex problem in Ruby

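I'm having trouble getting data out of a text file using Ruby. I open and read the file and replace all the newlines with '%' (because the newlines seemed to be causing problems), but when I call string.scan, it doesn't parse the text the way I want it to. I'm sure this regex is uglier than it needs to be, but here is what it's doing: http://rubular.com/r/JNgleGA5bd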

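The file contains a numbered list, and since the format is consistent, I want a regex that grabs each item of the list. In the snippet I've included, it should grab everything before "2.(tab)In the 'Other' boat make,"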

Here is a sample of the string:

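"1.	What make is your boat?%%[ - Select One - ]%%Var. 1: Code = A2_asdfw, Name = A2_WhatMakeIsYourBoat%%Type = Category%%Template = Standard Category%%Cat. 1: Code = 339, Name = NONE%%Cat. 2: Code = 3, Name = asdfg%%2.	In the 'Other' boat make, please describe here:%_ __ _ __ _ ___%%Var. 1: Code = A154_asdf, Name = A3 6_asdfg%%Type = Text%%Template = Standard Text%%Max Length = 20 characters%%"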

Here is my regex:

([0-9]+\.\t[\/0-9a-zA-Z\s,"()'-]+[%\t?:].*?)[0-9]+\.\t[\/0-9a-zA-Z\s,"()'-]+[%\t?:]

2013-04-14 user1079401

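What is span? Don't expect us to click a link to see what you have. Don't expect us to be generous about your laziness. Write the relevant code here. – sawa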

Oops! scan!!!!! – user1079401

Still not sure what you're trying to do. Get the part of the text before '2. In the ...'? If so, '.*?(?=2\.)' would do. – Loamhoof
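
To make the suggestion in the comment above concrete, here is a minimal sketch of the lazy match plus lookahead. The sample string and the method name `text_before_item_two` are made up for illustration; they are not from the question.

```ruby
# Hypothetical, shortened sample; "\t" stands in for the tab after each item number.
LOOKAHEAD_SAMPLE = "1.\tWhat make is your boat?%%Type = Category%%" \
                   "2.\tIn the 'Other' boat make, please describe here:%%"

# .*?      consume as few characters as possible
# (?=2\.)  stop right before a literal "2." without consuming it
# String#[] with a regex returns the matched text, or nil if nothing matches.
def text_before_item_two(text)
  text[/.*?(?=2\.)/]
end

puts text_before_item_two(LOOKAHEAD_SAMPLE)
```

Note that this stops at the first "2." anywhere in the text (for example inside "12." or "2.5"), so it only fits input where that sequence cannot appear earlier.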

Assuming that every item starts with the pattern "digit, period, tab", you can use this expression:

[0-9][.]\t(?:(?![0-9][.]\t).)*

Working demo.
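
As a rough sketch of how this expression might be used with `String#scan`, assuming a shortened sample in the spirit of the one in the question (the constant and method names are hypothetical):

```ruby
# The expression from the answer: an item start, then any characters
# up to (but not including) the next item start.
ITEM_RE = /[0-9][.]\t(?:(?![0-9][.]\t).)*/

# Shortened, made-up sample; newlines were already replaced by '%'.
SCAN_SAMPLE = "1.\tWhat make is your boat?%%Cat. 1: Code = 339, Name = NONE%%" \
              "2.\tIn the 'Other' boat make, please describe here:%%Type = Text%%"

# With no capture groups, scan returns an array of the matched strings.
def split_items(text)
  text.scan(ITEM_RE)
end

split_items(SCAN_SAMPLE).each { |item| puts item }
```

"Cat. 1:" does not end the first item, because the lookahead only fires on a digit followed by a period and a tab.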

Here is some explanation:

[0-9]          # match a digit
[.]            # match a period - same as "\.", but more readable IMHO
\t             # match a tab
(?:            # open non-capturing group. this group will match/consume a single
               # character that is not the beginning of the next item
  (?!          # negative lookahead - this does not consume anything, but ensures
               # its contents canNOT be matched at the current position
    [0-9][.]\t # check that there is no new item starting
  )            # end of negative lookahead ... if we get here, the next character
               # still belongs to the current item; note that the engine's
               # "cursor" has not moved
  .            # consume an arbitrary character
)              # end of group
*              # repeat 0 or more times (as often as possible)
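
The breakdown above can also be kept in the code itself. This is a sketch using Ruby's extended (`/x`) mode, which ignores whitespace and allows `#` comments inside the pattern; the constant names are illustrative only.

```ruby
# Compact form, as given in the answer.
COMPACT_ITEM_RE = /[0-9][.]\t(?:(?![0-9][.]\t).)*/

# Same pattern in extended mode with inline comments.
VERBOSE_ITEM_RE = /
  [0-9] [.] \t          # item start: digit, period, tab
  (?:                   # group that matches one character of the item body
    (?! [0-9] [.] \t )  # do not continue if a new item starts here
    .                   # otherwise consume one character
  )*                    # repeat as often as possible
/x

sample = "1.\tfirst%%2.\tsecond%%"
puts sample.scan(VERBOSE_ITEM_RE) == sample.scan(COMPACT_ITEM_RE)
```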

More information on lookarounds.

If your item numbers can go beyond 9 (i.e. have more than one digit), just add a + after both occurrences of [0-9].
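
A small sketch of that change, with a made-up sample that crosses from 9 to 10 (names are hypothetical):

```ruby
# Both [0-9] occurrences get a "+", so "10.", "11.", ... count as item starts.
MULTI_DIGIT_ITEM_RE = /[0-9]+[.]\t(?:(?![0-9]+[.]\t).)*/

# Made-up sample with one- and two-digit item numbers.
MULTI_DIGIT_SAMPLE = "9.\tNinth question%%10.\tTenth question%%11.\tEleventh question%%"

def split_numbered_items(text)
  text.scan(MULTI_DIGIT_ITEM_RE)
end

split_numbered_items(MULTI_DIGIT_SAMPLE).each { |item| puts item }
```

Without the `+`, the single-digit pattern would treat "0." in "10." as the start of an item, so the leading "1" would end up attached to the previous item.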

2013-04-14 15:15:23

Thanks so much for the explanation. That was very helpful. – user1079401

Also, I took user2103316's suggestion and kept the newlines. The regex I ended up with is /([0-9]+[.]\t(?:(?![0-9]+[.]\t).)*)/m, because there are item numbers higher than 9, and it works great! – user1079401
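
For reference, a minimal sketch of how the final regex from the comment above might be applied to text that still contains its newlines. The sample text and the method name are assumptions, not the asker's actual code.

```ruby
# The regex from the comment. In Ruby, /m makes "." match newlines too,
# so an item can span several lines.
FINAL_ITEM_RE = /([0-9]+[.]\t(?:(?![0-9]+[.]\t).)*)/m

# Made-up multi-line sample with the newlines left in place.
MULTILINE_SAMPLE = "1.\tWhat make is your boat?\n[ - Select One - ]\n" \
                   "2.\tIn the 'Other' boat make, please describe here:\n"

def items_from_text(text)
  # Because the pattern has a capture group, scan returns one array per
  # match; flatten turns that into a plain array of strings.
  text.scan(FINAL_ITEM_RE).flatten
end

items_from_text(MULTILINE_SAMPLE).each { |item| puts item, "---" }
```

With a real file, the text could be read in one go, for example `items_from_text(File.read("survey.txt"))`, where the file name is a placeholder.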
