2011-03-22 91 views
1

How to remove non-HTML tags (CID)

Given text such as:

This is my reply. This is paragraph one. 

This is paragraph two. Capture everything before me as this is the last sentence. 

[cid:0BE7856F-9507-4AEA-854D-C01A6CFAF15F] 
[cid:1DA3C231-846D-4490-9458-04A2484F4294] 
[cid:33225087-994A-4FAF-B74D-5D56F334F29D] 

What is the best way to strip the CID tags, so that the result is just:

This is my reply. This is paragraph one. 

This is paragraph two. Capture everything before me as this is the last sentence. 
+0

Something like this? body.sub(\[cid:(.*)\],'')? – AnApprentice 2011-03-22 01:01:55

+0

A regex like '^\[cid:(.*)\]$' – Zabba 2011-03-22 01:24:26

+0

It is usually better to edit the question than to leave a comment. – 2011-03-22 02:43:54

Answers

3

If you want to match that very specific format, you would do:

regex = /\[cid:[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}\]/
# Keep everything before the first marker; nil.to_i is 0, so no match keeps the whole body.
body[0..(body =~ regex).to_i - 1]

If you want to loosen it up a little, you would do:

body[0..(body =~ /\[cid:/).to_i-1] 
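
To see how the looser pattern behaves on the question's text, here is a minimal sketch. It swaps the index arithmetic for `String#partition`, which is my substitution rather than part of the answer, and the sample `body` is only modeled on the question. One reason for the swap: when the marker sits at index 0, `0 - 1` gives `-1`, so the slice above returns the whole string, whereas `partition` returns an empty prefix.

```ruby
# Sample input modeled on the question's text (an assumption for illustration).
body = <<~TEXT
  This is my reply. This is paragraph one.

  This is paragraph two. Capture everything before me as this is the last sentence.

  [cid:0BE7856F-9507-4AEA-854D-C01A6CFAF15F]
  [cid:1DA3C231-846D-4490-9458-04A2484F4294]
TEXT

# String#partition splits at the first match and returns [before, match, after].
# With no match, `before` is the whole string; with a match at index 0 it is "".
before, _marker, _rest = body.partition(/\[cid:/)

# rstrip drops the blank lines that preceded the first marker.
reply = before.rstrip
puts reply
```
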

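If you can't be sure what will come before the [cid: marker, or whether it appears at all, pull the match out into a variable and do this: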

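regex = /\[cid:/ # choose your expression
test = body =~ regex
# nil means no marker (keep everything); the exclusive range also handles a marker at index 0.
test.nil? ? body : body[0...test]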
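
An alternative, closer to the regex suggested in the comments on the question, is to delete the marker lines themselves instead of slicing at the first one. The sketch below is only an illustration: `strip_cid_tags` and `CID_LINE` are hypothetical names, and the pattern assumes each marker sits on a line of its own, as in the question's sample.

```ruby
# Hypothetical pattern: a line holding nothing but a [cid:...] marker,
# with optional surrounding spaces, up to the line break or end of string.
CID_LINE = /^[ \t]*\[cid:[^\]\r\n]+\][ \t]*(?:\r?\n|\z)/

# Hypothetical helper, not part of the original answer.
def strip_cid_tags(body)
  # Remove every marker-only line, then trim the trailing blank lines.
  body.gsub(CID_LINE, "").rstrip
end

sample = "Paragraph one.\n\nParagraph two.\n\n" \
         "[cid:0BE7856F-9507-4AEA-854D-C01A6CFAF15F] \n" \
         "[cid:1DA3C231-846D-4490-9458-04A2484F4294] \n"

puts strip_cid_tags(sample)
```

Unlike the slicing approach, this keeps any text that follows the markers, and it leaves a `[cid:...]` that appears in the middle of a sentence untouched.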
+1

show-ooooffffff:D – Zabba 2011-03-22 01:32:14

+0

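Whoa, slow down... Speaking as a senior developer, I'd look at the first pattern and say it overdoes the regex a bit. Unless proven otherwise, the second pattern is a much better fit for what we see in practice. Also, +1 for using '..', because using it flip-flop-operator style like this is super neat. I'd buy you a latte for that. :-) – 2011-03-22 02:50:01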