Is it feasible to add nowiki tags to this parser?

Update: for the record, here's the implementation I ended up using.

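Below is a trimmed-down version of the parser I'm working on. There is more code than this, but it should be easy to grasp the basic idea of the parser.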

class Markup
  def initialize(markup)
    @markup = markup
  end

  def to_html
    # Non-capturing group: String#split would otherwise return the captured
    # "\r\n" separator as an extra paragraph.
    @html ||= @markup.split(/(?:\r\n){2,}|\n{2,}/).map { |p| Paragraph.new(p).to_html }.join("\n")
  end

  class Paragraph
    def initialize(paragraph)
      @p = paragraph
    end

    def to_html
      # Inline markup: '''bold''', ''emphasis'' and `code`.
      @p.gsub!(/'{3}([^']+)'{3}/, "<strong>\\1</strong>")
      @p.gsub!(/'{2}([^']+)'{2}/, "<em>\\1</em>")
      @p.gsub!(/`([^`]+)`/, "<code>\\1</code>")

      case @p
      when /^=/
        level = (@p.count("=") / 2) + 1 # Starting on h2
        @p.gsub!(/^[= ]+|[= ]+$/, "")
        "<h#{level}>" + @p + "</h#{level}>"
      when /^(\*|\#)/
        # I'm parsing lists here. Quite a lot of code, and not relevant, so
        # I'm leaving it out.
      else
        @p.gsub!("\n", "\n<br/>")
        "<p>" + @p + "</p>"
      end
    end
  end
end

p Markup.new("Here is `code` and ''emphasis'' and '''bold'''!

Baz").to_html

# => "<p>Here is <code>code</code> and <em>emphasis</em> and <strong>bold</strong>!</p>\n<p>Baz</p>"

So, as you can see, I split the text into paragraphs, and each paragraph is either a heading, a list, or a regular paragraph.

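Is it feasible to add support for nowiki tags (where whatever sits between <nowiki> and </nowiki> is not parsed) to a parser like this one? Feel free to answer "no" and suggest other ways of building a parser :)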

As a side note, you can see the actual parser code on GitHub: markup.rb and paragraph.rb.


2009-09-16 August Lilleaas

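This kind of thing is much easier to manage if you use a simple tokenizer. One approach is to create a single regular expression that captures the entire syntax, but that can get problematic. The other approach is to split the document into the sections that need to be rewritten and the sections that should be skipped, which is probably the easier way.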

Here's a simple framework you can extend as required:

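def wiki_subst(string)
  buffer = string.dup
  result = ''

  # /m lets "." match newlines, so a nowiki section may span several lines.
  while (m = buffer.match(/<\s*nowiki\s*>.*?<\s*\/\s*nowiki\s*>/im))
    result << yield(m.pre_match) # rewrite the text before the nowiki section
    result << m.to_s             # copy the nowiki section through untouched
    buffer = m.post_match
  end

  # Rewrite whatever is left after the last nowiki section.
  result << yield(buffer)

  result
end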

example = "replace me<nowiki>but not me</nowiki>replace me too<NOWIKI>but not me either</nowiki>and me" 

puts wiki_subst(example) { |s| s.upcase } 
# => REPLACE ME<nowiki>but not me</nowiki>REPLACE ME TOO<NOWIKI>but not me either</nowiki>AND ME
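
To tie this back to the parser in the question, the same split-and-skip idea could wrap the inline substitutions from Paragraph#to_html. The following is only a rough sketch under a few assumptions: `inline_markup` and `render_inline` are hypothetical helper names (neither appears in the original code), and the nowiki tags themselves are dropped from the output rather than copied through.

```ruby
NOWIKI = /<\s*nowiki\s*>(.*?)<\s*\/\s*nowiki\s*>/im

# Hypothetical helper: the three inline rules from the question's
# Paragraph#to_html, applied without mutating the input string.
def inline_markup(text)
  text.gsub(/'{3}([^']+)'{3}/, '<strong>\1</strong>')
      .gsub(/'{2}([^']+)'{2}/, '<em>\1</em>')
      .gsub(/`([^`]+)`/, '<code>\1</code>')
end

# Hypothetical helper: run inline_markup on the text outside nowiki
# sections and emit the contents of each nowiki section verbatim.
def render_inline(text)
  result = +''
  rest = text
  while (m = rest.match(NOWIKI))
    result << inline_markup(m.pre_match)
    result << m[1] # captured nowiki contents, tags dropped
    rest = m.post_match
  end
  result << inline_markup(rest)
end

puts render_inline("''a'' <nowiki>''b''</nowiki> '''c'''")
# => <em>a</em> ''b'' <strong>c</strong>
```

One caveat with this approach: because Markup#to_html splits on blank lines first, a nowiki section that contains a blank line would be cut in two before a helper like this ever sees it, so the nowiki pass would probably have to run before the paragraph split in that case.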


2009-09-16 21:55:23 tadman

Is splitting the text into paragraphs, like my parser does, a form of tokenizing? – 2009-09-17 05:33:26

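Perhaps, using a very loose definition. Generally, a tokenizer splits the input stream into distinct components that can be operated on individually, at the finest level of granularity that is useful. Splitting into paragraphs and then splitting those into other parts is a kind of two-pass tokenizer. Usually, when writing this sort of thing, you can only get so far with hand-rolled parsing methods. At some point it becomes more efficient to use a proper parser framework, but that's another subject. – tadman 2009-09-17 15:12:44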
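
As a rough illustration of the second pass described in that comment, here is a small tokenizer sketch built on Ruby's standard StringScanner. The rule table `TOKEN_RULES` and the method name `tokenize` are made up for this example and are not taken from the original parser; it only covers the inline rules shown in the question plus nowiki.

```ruby
require 'strscan'

# Hypothetical token rules, tried in order at the current scan position.
TOKEN_RULES = [
  [:nowiki, /<\s*nowiki\s*>(.*?)<\s*\/\s*nowiki\s*>/im],
  [:strong, /'''([^']+)'''/],
  [:em,     /''([^']+)''/],
  [:code,   /`([^`]+)`/]
].freeze

# Split one paragraph into [type, content] tokens. Anything that matches
# no rule is collected character by character into :text tokens.
def tokenize(text)
  scanner = StringScanner.new(text)
  tokens = []
  until scanner.eos?
    rule = TOKEN_RULES.find { |_type, pattern| scanner.scan(pattern) }
    if rule
      tokens << [rule[0], scanner[1]]
    else
      char = scanner.getch
      if tokens.last && tokens.last[0] == :text
        tokens.last[1] << char
      else
        tokens << [:text, char]
      end
    end
  end
  tokens
end

p tokenize("a ''b'' <nowiki>''c''</nowiki>")
# => [[:text, "a "], [:em, "b"], [:text, " "], [:nowiki, "''c''"]]
```

Once the text is in token form, rendering becomes a simple mapping from token type to HTML, and a :nowiki token is just one more type whose content is emitted untouched.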

Marked as the answer. Thanks! – 2009-09-21 08:23:53
