2015-09-17 67 views
1

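I'm having trouble getting REXML::XPath.first to return the correct node text from a child context. How do I get node text from a child context using Ruby, XPath, and REXML?

See the test script and XML below.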

test.rb

require 'rexml/document' 
require 'rexml/xpath' 

file = File.new('test.xml') 
doc = REXML::Document.new(file) 

employers = REXML::XPath.match(doc, '//EmployerOrg') 
employers.each do |employer| 
    # this looks fine, position_history is being set for each employer 
    position_history = REXML::XPath.first(employer, 'PositionHistory') 

    # always returns the title from the first employer, in spite of the position_history context 
    p title = REXML::XPath.first(position_history, '//Title').text 
end 

Output:

"Director of Web Applications Development" 
"Director of Web Applications Development" 
"Director of Web Applications Development" 

Sample XML:

<?xml version="1.0" encoding="UTF-8"?> 
<Resume xml:lang="en" xmlns="http://ns.hr-xml.org/2006-02-28" xmlns:sov="http://sovren.com/hr-xml/2006-02-28"> 
    <StructuredXMLResume> 
    <EmploymentHistory> 
     <EmployerOrg> 
     <EmployerOrgName>Technical Difference</EmployerOrgName> 
     <PositionHistory positionType="directHire" currentEmployer="true"> 
      <Title>Director of Web Applications Development</Title> 
      <OrgName> 
      <OrganizationName>Technical Difference</OrganizationName> 
      </OrgName> 
      <StartDate> 
      <AnyDate>2004-10-01</AnyDate> 
      </StartDate> 
      <EndDate> 
      <AnyDate>2015-09-15</AnyDate> 
      </EndDate> 
     </PositionHistory> 
     </EmployerOrg> 
     <EmployerOrg> 
     <EmployerOrgName>Convergence Inc. LLC</EmployerOrgName> 
     <PositionHistory positionType="directHire"> 
      <Title>Senior Web Developer/DBA</Title> 
      <OrgName> 
      <OrganizationName>Convergence Inc. LLC</OrganizationName> 
      </OrgName> 
      <StartDate> 
      <AnyDate>2003-03-01</AnyDate> 
      </StartDate> 
      <EndDate> 
      <AnyDate>2004-12-01</AnyDate> 
      </EndDate> 
      <UserArea> 
      <sov:PositionHistoryUserArea> 
       <sov:Id>POS-2</sov:Id> 
       <sov:CompanyNameProbability>23</sov:CompanyNameProbability> 
       <sov:PositionTitleProbability>30</sov:PositionTitleProbability> 
      </sov:PositionHistoryUserArea> 
      </UserArea> 
     </PositionHistory> 
     </EmployerOrg> 
     <EmployerOrg> 
     <EmployerOrgName>Avalon Digital Marketing Systems, Inc</EmployerOrgName> 
     <PositionHistory positionType="contract"> 
      <Title>Contractor - Web Development</Title> 
      <OrgName> 
      <OrganizationName>Avalon Digital Marketing Systems, Inc</OrganizationName> 
      </OrgName> 
      <StartDate> 
      <AnyDate>2002-05-01</AnyDate> 
      </StartDate> 
      <EndDate> 
      <AnyDate>2003-03-01</AnyDate> 
      </EndDate> 
     </PositionHistory> 
     <PositionHistory positionType="directHire"> 
      <Title>Web Developer/Junior DBA</Title> 
      <OrgName> 
      <OrganizationName>European Division</OrganizationName> 
      </OrgName> 
      <StartDate> 
      <AnyDate>2000-05-01</AnyDate> 
      </StartDate> 
      <EndDate> 
      <AnyDate>2002-04-30</AnyDate> 
      </EndDate> 
     </PositionHistory> 
     </EmployerOrg> 
    </EmploymentHistory> 
    </StructuredXMLResume> 
</Resume> 
+1

Rather than using REXML, I'd recommend [Nokogiri](http://nokogiri.org), which is the de facto standard for XML/HTML parsing in Ruby, along with its [Builder](http://www.rubydoc.info/github/sparklemotion/nokogiri/Nokogiri/XML/Builder) API.

+0

De facto standard? REXML is in the stdlib, and I'd like to avoid adding a new dependency. – doremi

+1

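@theTinMan, I think you're right about popularity, if SO can serve as a rough guide for judging it (120 q's and 2 followers for REXML, 2557 q's and 135 followers for nokogiri). That said, suggesting a different tool for the job isn't always an appropriate answer... – Abel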

Answers

1

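Probably because your XPath '//Title' is an absolute path: it starts from the top of the document and effectively ignores the context node position_history. Try replacing it with './Title' or 'Title'.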
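To make the suggestion concrete, here is a minimal sketch of the loop from the question with the relative path swapped in. The inline XML is a trimmed-down stand-in for test.xml (an assumption made so the snippet runs on its own; namespaces and most fields are omitted), and the `titles` variable is introduced only for illustration.

```ruby
require 'rexml/document'

# Trimmed-down stand-in for test.xml (assumption: namespaces and most
# fields from the question's sample are left out for brevity).
xml = <<~XML
  <Resume>
    <EmploymentHistory>
      <EmployerOrg>
        <PositionHistory>
          <Title>Director of Web Applications Development</Title>
        </PositionHistory>
      </EmployerOrg>
      <EmployerOrg>
        <PositionHistory>
          <Title>Senior Web Developer/DBA</Title>
        </PositionHistory>
      </EmployerOrg>
    </EmploymentHistory>
  </Resume>
XML

doc = REXML::Document.new(xml)

titles = REXML::XPath.match(doc, '//EmployerOrg').map do |employer|
  position_history = REXML::XPath.first(employer, 'PositionHistory')

  # 'Title' (equivalently './Title') is resolved relative to
  # position_history, so each employer yields its own title.
  REXML::XPath.first(position_history, 'Title').text
end

p titles
# prints ["Director of Web Applications Development", "Senior Web Developer/DBA"]
```

Note that 'Title' only looks at direct children of position_history, which is enough for the structure shown in the question.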

+0

Seems like a correct observation to me, spot on! :) – Abel

+0

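Both 'Title' and './Title' work. Thanks! It would be interesting to know why '/' starts at the top of the document rather than at the context I passed in. That's what was frustrating me. – doremi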

+1

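Since the document is only parsed once, what the functions return should be thought of as pointers, or sets of pointers, to nodes within that parsed document. Think of the context node like the current working directory at a command prompt. If I say (in Unix terms) 'ls /', it lists the root directory, no matter where I am in the filesystem tree. – bjimba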
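The working-directory analogy can be checked with a small sketch. It assumes a tiny made-up document (the titles "First" and "Second" are placeholders, not from the question) and uses the second PositionHistory as the context node; the variable names are illustrative only.

```ruby
require 'rexml/document'

# Tiny made-up document; "First" and "Second" are placeholder titles.
doc = REXML::Document.new(<<~XML)
  <Resume>
    <EmployerOrg><PositionHistory><Title>First</Title></PositionHistory></EmployerOrg>
    <EmployerOrg><PositionHistory><Title>Second</Title></PositionHistory></EmployerOrg>
  </Resume>
XML

# Use the second PositionHistory as the context node ("current directory").
context = REXML::XPath.match(doc, '//PositionHistory')[1]

# The returned element is still attached to the one parsed document.
same_doc = context.document.equal?(doc)

# A leading '/' resolves from the document root, like `ls /`.
absolute = REXML::XPath.first(context, '//Title').text

# './Title' resolves from the context node, like `ls ./`.
relative = REXML::XPath.first(context, './Title').text

p [same_doc, absolute, relative]
# prints [true, "First", "Second"]
```

The context node only matters for relative paths; an absolute path reaches the root through the node's owning document.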
