2014-09-30

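Regex that correctly handles nested patterns

I'm trying to find the best way to handle a problem I've run into. I need to extract comments from strings, where a comment is whatever sits inside parentheses at the end of the string. Comments can be single, multiple, nested, or a combination of these.

Some examples: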

this is a string (with comment) 
this is another string (with comment)(and more comment) 
this is yet another string (with comment (and some nested comment) 

This is the simplest form, and it is easy to split with the following regex (in VBA):

regex.Pattern = "^([^(]*)(\(.*\))+$" 

I get the following correct output, where group 1 is the value and group 2 is the comment:

group1: this is a string/group2: (with comment) 
group1: this is another string/group2: (with comment)(and more comment) 
group1: this is yet another string/group2: (with comment (and some nested comment) 
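
For readers who want to reproduce this outside VBA, here is a minimal sketch in Python's `re` module using the same pattern. The helper name `split_simple` is made up for illustration, and the input is stripped first because `$` will not match if trailing spaces follow the last parenthesis.

```python
import re

# Same pattern as the VBA snippet above, written as a Python raw string.
SIMPLE = re.compile(r"^([^(]*)(\(.*\))+$")


def split_simple(text: str):
    """Return (value, comment), or None when the pattern does not match.

    `split_simple` is a hypothetical helper name, not part of the question.
    """
    m = SIMPLE.match(text.strip())
    return (m.group(1), m.group(2)) if m else None


for s in [
    "this is a string (with comment)",
    "this is another string (with comment)(and more comment)",
    "this is yet another string (with comment (and some nested comment)",
]:
    print(split_simple(s))
```

Note that group 1 keeps the space before the first parenthesis, so a caller would probably want to trim it.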

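The problem is that in some cases I have arrays, and those all fail. An array can be delimited by a comma or a slash. That sounds simple, but these tokens can also serve other purposes. So if a comma or slash is found in the string, the string is treated as an array unless: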

- the token is within the comment 
- the slash is part of a fractional number 

Some examples:

this is string1 with a fractional 1/4 number (with comment) 
this is string1 (with a fractional 1/4 in comment) 
this is string1 (with comment1)/this is string2 (with comment2) 
this is string1 (with some data, seperated by a comma) , this is string2 (with comment3/comment4) 
this is string1 (with a fractional 1/4)/this is string2 (with comment2,comment3) 
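
The two exceptions can also be checked without a regex. The sketch below is one possible reading of the rule, not the asker's code: it assumes a "fraction" means a slash sitting directly between two digits, and it tracks parenthesis depth to decide whether a token is inside a comment. The function name `has_array_separator` is hypothetical.

```python
def has_array_separator(text: str) -> bool:
    """Return True if a comma or slash acts as an array separator.

    Assumptions (not stated in this exact form in the question):
    - anything inside parentheses counts as comment text and is ignored;
    - a slash directly between two digits is a fraction, not a separator.
    """
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)  # tolerate unbalanced input
        elif depth == 0 and ch in ",/":
            is_fraction = (
                ch == "/"
                and 0 < i < len(text) - 1
                and text[i - 1].isdigit()
                and text[i + 1].isdigit()
            )
            if not is_fraction:
                return True
    return False


examples = [
    "this is string1 with a fractional 1/4 number (with comment)",
    "this is string1 (with a fractional 1/4 in comment)",
    "this is string1 (with comment1)/this is string2 (with comment2)",
    "this is string1 (with some data, seperated by a comma) , this is string2 (with comment3/comment4)",
    "this is string1 (with a fractional 1/4)/this is string2 (with comment2,comment3)",
]
for s in examples:
    print(has_array_separator(s), "|", s)
```

Under these assumptions the first two examples are reported as plain strings and the last three as arrays.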

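Additional examples: the first one should fail because it contains an array token (the slash) that is not part of a fraction. The second one selects too much: it should take only the last comment, not everything from the first comment through the second.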

this is string1 without comment/this is string2 (with comment2) 
This is a string (with subcomment) where only the last should be selected (so this one) 
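
To make the two problems concrete, this sketch runs the original pattern against both strings in Python. The helper name `split_simple` is made up for illustration.

```python
import re

# The original pattern from the question.
SIMPLE = re.compile(r"^([^(]*)(\(.*\))+$")


def split_simple(text: str):
    """Hypothetical helper: return (value, comment) or None on no match."""
    m = SIMPLE.match(text.strip())
    return (m.group(1), m.group(2)) if m else None


# Matches even though the slash should mark it as an array.
print(split_simple("this is string1 without comment/this is string2 (with comment2)"))

# The greedy `.*` runs from the first "(" to the last ")",
# so group 2 swallows the text between the two comments.
print(split_simple(
    "This is a string (with subcomment) where only the last should be selected (so this one)"
))
```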

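How would I best adjust the logic so that the match fails on arrays unless the comma or slash falls under one of the exceptions? I ended up with monster code, so I wanted to see whether there is an easier option. The exception examples above should end up as follows: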

ex1/group1 : this is string1 with a fractional 1/4 number group2: (with comment) 
ex2/group1 : this is string1 group2 : (with a fractional 1/4 in comment) 
ex3 to 5 should fail as they are considered arrays and need some additional logic 

I hope this is reasonably clear.

Answer

I think you want something like this:

^((?:(?!\)\s*[,\/]).)*?)(\([^()]*\))$ 
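
A quick way to see what this pattern does is to run it over the question's examples. The sketch below uses Python's `re` rather than VBA, and `match_first` is a hypothetical helper name. The tempered dot `(?:(?!\)\s*[,\/]).)*?` refuses to step over a `)` that is followed by a comma or slash, which is what rejects the array cases.

```python
import re

# First pattern from the answer, unchanged.
FIRST = re.compile(r"^((?:(?!\)\s*[,\/]).)*?)(\([^()]*\))$")


def match_first(text: str):
    """Hypothetical helper: return (group1, group2) or None on no match."""
    m = FIRST.match(text.strip())
    return m.groups() if m else None


examples = [
    "this is string1 with a fractional 1/4 number (with comment)",
    "this is string1 (with a fractional 1/4 in comment)",
    "this is string1 (with comment1)/this is string2 (with comment2)",
    "this is string1 (with some data, seperated by a comma) , this is string2 (with comment3/comment4)",
    "this is string1 (with a fractional 1/4)/this is string2 (with comment2,comment3)",
    "this is another string (with comment)(and more comment)",
]
for s in examples:
    print(match_first(s))
```

Because group 2 is a single `\([^()]*\)`, a string with two trailing comments still matches, but the first comment ends up in group 1.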

Update:

^(?=(?:(?!\)\s*[,\/]|\s\/\s).)*$)(.*?)((?:\([^()\n]*\))+)$ 
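
The updated pattern can be checked the same way. This is a sketch in Python's `re`, with `match_updated` as a hypothetical helper name. The leading lookahead rejects the whole string if a `)` is followed by a comma or slash, or if a slash is surrounded by whitespace; group 2 then collects one or more adjacent, non-nested parenthesized groups at the end.

```python
import re

# Updated pattern from the answer, unchanged.
UPDATED = re.compile(r"^(?=(?:(?!\)\s*[,\/]|\s\/\s).)*$)(.*?)((?:\([^()\n]*\))+)$")


def match_updated(text: str):
    """Hypothetical helper: return (group1, group2) or None on no match."""
    m = UPDATED.match(text.strip())
    return m.groups() if m else None


examples = [
    "this is another string (with comment)(and more comment)",
    "this is string1 with a fractional 1/4 number (with comment)",
    "this is string1 (with comment1)/this is string2 (with comment2)",
    "This is a string (with subcomment) where only the last should be selected (so this one)",
    # Slash without surrounding spaces and not after ")": still matches.
    "this is string1 without comment/this is string2 (with comment2)",
]
for s in examples:
    print(match_updated(s))
```

Two limits are worth keeping in mind: an unspaced slash between two uncommented values is not rejected, and because the character class `[^()\n]` excludes parentheses, nested comments are not captured as a single group.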

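Very close, great! It no longer fully matches the "simpler" cases, though, such as [this is another string (with comment)(and more comment)], but I'll try from here. – Wokoman 2014-09-30 12:46:11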

What is the expected output for that input? – 2014-09-30 12:48:22

Group 1 should be [this is another string] and group 2 should be the comments [(with comment)(and more comment)], so any repeated parenthesized data. – Wokoman 2014-09-30 12:55:21