Ruby/Rails: scan/match with a regex from one tag to the other, including the text in between

In the content examples below I have wrapped the lines to make them easier to read on Stack Overflow (so you do not have to scroll to the right to see the examples).

Content A:

"Lorem Ipsum\r\n 
[img]http://example.org/first.jpg[/img]\r\n 
[img]http://example.org/second.jpg[/img]\r\n 
more lorem ipsum ..."

Content B:

"Lorem Ipsum\r\n 
[img caption="Sample caption"]http://example.org/third.jpg[/img] 
[img]http://example.org/fourth.jpg[/img]"

Content C:

"Lorem Ipsum [img]http://example.org/fifth.jpg[/img]\r\n 
more lorem ipsum\r\n\r\n 
[img caption="Some other caption"]http://example.org[/img]"

I have tried:

content.match(/\[img\]([^<>]*)\[\/img\]/imu) 
return example: "[img]...[/img]\r\n[img]...[/img]"
content.scan(/\[img\]([^<>]*)\[\/img\]/imu) 
return example: "...[/img]\r\n[img]..."

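What I want to accomplish, when running a scan/match/regex solution over the three content examples above, is to put every occurrence of [img]...[/img] and [img caption="?"]...[/img] into an array for later use.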

Array 
    1 : A : [img]http://example.org/first.jpg[/img] 
    2 : A : [img]http://example.org/second.jpg[/img] 
    3 : B : [img caption="Sample caption"]http://example.org/third.jpg[/img] 
    4 : B : [img]http://example.org/fourth.jpg[/img] 
    5 : C : [img]http://example.org/fifth.jpg[/img] 
    6 : C : [img caption="Some other caption"]http://example.org[/img]

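It would also be helpful to restrict the extraction to cases where both an opening and a closing tag are present: when there is an [img] or [img caption="?"] but the [/img] is missing further on, that tag should be ignored.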

I have read http://www.ruby-doc.org/core-1.9.3/String.html up and down, but could not find anything that seems to apply here.

Update:

So I figured out that this:

\[img([^<>]*)\]([^<>]*)\[\/img\]

will find both:

[img]something[/img]

and:

[img caption="something"]something[/img]

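Now I just need to know how to grab each occurrence separately, with its own content. At the moment I always get everything from the first [img] to the last [/img] tag, so any other Lorem Ipsum text in between is grabbed as well.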
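The behaviour described above can be reproduced in a few lines. This is a minimal sketch, assuming the cause is the greedy `[^<>]*`: a negated character class also matches line breaks, so it runs on to the last [/img]. The sample string and the variable names are made up for illustration.

```ruby
# Hypothetical sample resembling Content A.
content = "Lorem Ipsum\r\n[img]http://example.org/first.jpg[/img]\r\n" \
          "[img]http://example.org/second.jpg[/img]\r\nmore lorem ipsum ..."

# Greedy: [^<>]* also consumes "\r\n" and backtracks only to the LAST [/img].
greedy = content.scan(/\[img\]([^<>]*)\[\/img\]/i)

# Lazy: *? stops at the FIRST [/img] that follows each opening tag.
lazy = content.scan(/\[img\]([^<>]*?)\[\/img\]/i)

p greedy.size
p lazy
```

With the lazy quantifier each tag pair becomes its own array element, which is what the desired result list asks for.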

2013-01-22 tomthorgal

You can scan the text with /\[img(?:\s+caption=".+")?\].+?\[\/img\]/:

regex = /\[img(?:\s+caption=".+")?\].+?\[\/img\]/ 

text = <<EOT 
Lorem Ipsum 
[img]http://example.org/first.jpg[/img] 
[img]http://example.org/second.jpg[/img] 
more lorem ipsum ... 

Content B: 

Lorem Ipsum 
[img caption="Sample caption"]http://example.org/third.jpg[/img] 
[img]http://example.org/fourth.jpg[/img] 

Content C: 

Lorem Ipsum [img]http://example.org/fifth.jpg[/img] 
more lorem ipsum 

[img caption="Some other caption"]http://example.org[/img] 
EOT 

array = text.scan(regex) 
puts array

Which produces:

 
[img]http://example.org/first.jpg[/img] 
[img]http://example.org/second.jpg[/img] 
[img caption="Sample caption"]http://example.org/third.jpg[/img] 
[img]http://example.org/fourth.jpg[/img] 
[img]http://example.org/fifth.jpg[/img] 
[img caption="Some other caption"]http://example.org[/img]

If you want to ignore the tags and grab only the content, change the regex to:

regex = /\[img(?:\s+caption=".+")?\](.+?)\[\/img\]/

Running it again with that change returns:

http://example.org/first.jpg 
http://example.org/second.jpg 
http://example.org/third.jpg 
http://example.org/fourth.jpg 
http://example.org/fifth.jpg 
http://example.org

(Rubular proof)
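One caveat, offered as a sketch rather than a correction of the answer: `caption=".+"` is greedy, so two captioned tags on the same line are matched as a single occurrence, and `.+?` can start at an unclosed [img] and run on to a later [/img] on the same line. Assuming captions never contain a double quote and URLs never contain square brackets, a stricter pattern avoids both. The constant names and sample strings below are hypothetical.

```ruby
# Pattern from the answer above.
LOOSE = /\[img(?:\s+caption=".+")?\].+?\[\/img\]/

# Stricter variant (assumption: no '"' inside captions, no '[' or ']' in URLs).
# Group 1 is the caption (nil when absent), group 2 is the URL.
STRICT = /\[img(?:\s+caption="([^"]*)")?\]([^\[\]]+)\[\/img\]/

same_line = '[img caption="a"]x.jpg[/img] [img caption="b"]y.jpg[/img]'
unclosed  = '[img]broken.jpg and [img]ok.jpg[/img]'

p same_line.scan(LOOSE).size   # both tags are swallowed as one match
p same_line.scan(STRICT)       # one [caption, url] pair per tag
p unclosed.scan(STRICT)        # the unclosed [img] is skipped
```

Excluding brackets from the URL part is what makes the unclosed tag drop out, because the match can no longer extend across a second opening tag.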

If you need to look for different tags, you can easily generate an "OR" pattern:

Regexp.union(%w[foo img bar]) 
=> /foo|img|bar/

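Regexp.union already escapes any "magic" regex characters in the strings it is given, so no extra call to Regexp.escape is needed there. Escaping by hand only matters if you join the alternatives yourself:

Regexp.new(%w[foo img bar].map{ |s| Regexp.escape(s) }.join("|"))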
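To show how such an alternation might be plugged into the tag pattern, here is a rough sketch. The tag list, the sample text, and the variable names are hypothetical; only `Regexp.union`, interpolation into a regex literal, and `String#scan` are standard Ruby.

```ruby
tags = %w[foo img bar]                 # hypothetical tag names
tag_alternation = Regexp.union(tags)   # => /foo|img|bar/

# Interpolating a Regexp embeds it as a group; \1 requires the closing tag
# to carry the same name as the opening tag.
tag_regex = /\[(#{tag_alternation})(?:\s+[^\]]*)?\](.+?)\[\/\1\]/

sample = "[foo]1[/foo] [img caption=\"x\"]2[/img] [baz]3[/baz] [bar]4[/img]"
pairs = sample.scan(tag_regex)
p pairs
```

In this sketch the unknown tag [baz] and the mismatched pair [bar]...[/img] are both left out, since neither satisfies the backreference.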

2013-01-22 03:49:49

Luckily, I have already solved this problem in my own application!

Given that @tags is an array of tag names (e.g. ["img"]):

regex = /\[(#{@tags.join("|")})\s*(.*?)?\/?\](?:(.*?)\[\/\1\])?/ 
matches = content.scan(regex)

Full example:

require 'pp' 

@tags = %w(img) 
regex = /\[(#{@tags.join("|")})\s*(.*?)?\/?\](?:(.*?)\[\/\1\])?/ 

content = <<-EOF 
    Lorem Ipsum\r\n 
    [img]http://example.org/first.jpg[/img]\r\n 
    [img]http://example.org/second.jpg[/img]\r\n 
    more lorem ipsum ..." 
    Content B: 

    "Lorem Ipsum\r\n 
    [img caption="Sample caption"]http://example.org/third.jpg[/img] 
    [img]http://example.org/fourth.jpg[/img]" 
    Content C: 

    "Lorem Ipsum [img]http://example.org/fifth.jpg[/img]\r\n 
    more lorem ipsum\r\n\r\n 
    [img caption="Some other caption"]http://example.org[/img]" 
EOF 

matches = content.scan(regex) 
pp matches

Output:

[["img", "", "http://example.org/first.jpg"], 
["img", "", "http://example.org/second.jpg"], 
["img", "caption=\"Sample caption\"", "http://example.org/third.jpg"], 
["img", "", "http://example.org/fourth.jpg"], 
["img", "", "http://example.org/fifth.jpg"], 
["img", "caption=\"Some other caption\"", "http://example.org"]]

2013-01-22 03:23:44
