Selecting children from a node using Nokogiri SAX?

Is there a method in Nokogiri::XML::SAX::Document similar to (accessions = doc.at_xpath('//Node/Childtag').content)?

I have XML like this:

<accession>Police-1234</accession> 
<accession>Police-6574</accession>  
<police> 
    <privateCar> 
     <fullName>BMW 750Li</fullName> 
    </privateCar> 
    <officeCar> 
     <fullName>Ford Mustang GT</fullName> 
    </officeCar> 
    <optional> 
     <fullName>Porsche carrera 511</fullName> 
    </optional> 
    </police>

My code looks something like this:

require 'rubygems' 
require 'nokogiri' 

include Nokogiri 

class PostCallbacks < XML::SAX::Document 


    def initialize 
    @in_title = false 
    @in_title2 = false 
    end 

    def start_element(element, attributes) 
    @attrs = attributes 
    @content = '' 
    @in_title = element.eql?("accession") 
    # Collecting all the other nodes/tags 
    @in_title2 = element.eql?("fullName") 
    end 



    def end_document 
     # puts "Here is where the attributes could be played with" 
    end 


    def characters string 

    string.strip! 
    if @in_title and !string.empty? 
      puts "Accession: #{string}" 

    elsif @in_title2 and !string.empty? 
      puts "Full Name: #{string}" 
    end 

    @content << string if @content 

    end 

end 


parser = XML::SAX::Parser.new(PostCallbacks.new) 
parser.parse(File.open(ARGV[0]))

My result is:

Accession: Police-1234 
Accession: Police-6574 

Full Name: BMW 750Li 
Full Name: Ford Mustang GT 
Full Name: Porsche carrera 511

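Now I have two questions.

How can I restrict the collection to only the "accession" element whose value is "Police-1234"?
I want to retrieve only the fullName of the child of privateCar, i.e. I only want BMW 750Li as my result.

For the first point, I would normally use doc.xpath('//accession').first to pull out the first entry in the XML.

For the second point, I know I can select with XPath using doc.at_xpath('//police/privateCar/fullName'), but is there anything similar for the SAX parser?

I am using SAX because I have a very large XML file to parse.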

Source

2014-06-17 A1aks

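The short answer is no, there is nothing similar in SAX.

You're missing the difference between SAX parsing and DOM parsing. Normally, when we use Nokogiri, we are dealing with documents small enough to fit in memory, and we parse them into a DOM ("document object model"). That has huge advantages for iterating over the document and searching it, because we can rewind and search as often as we want without penalty. And because it is all in memory, it is easy to tell the parser to find a specific node based on a chain of nodes.

SAX ("Simple API for XML") processes the document as a stream, from the top all the way to the end. As each tag is opened or closed, we get a chance to do something with its parameters. Instead of searching with an XPath or CSS selector, we have to look at the tag's name when we receive the event that it opened, set a flag to remember that we have seen it, and then keep watching the names of subsequently opened tags until we reach the content we want.

SAX is an entirely different way of processing a document, but its advantage is that it is far more memory-efficient.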

Source

2014-06-22 03:13:25
