Select an HTML block based on its text using Nokogiri?

I have the following block of HTML:

<tr> 
    <th>Consignment Service Code</th> 
    <td>ND16</td> 
</tr>

What I ultimately want to pull out is the ND16 string, but to do that I need to select the <tr> based on the text Consignment Service Code.

I'm already using Nokogiri to parse the HTML, so it would be great to just keep using it.

So, how can I select the HTML block based on the text "Consignment Service Code"?

2013-09-25 Shpigford

Possible duplicate of "Nokogiri: How to select nodes by matching text" (http://stackoverflow.com/questions/1474688/nokogiri-how-to-select-nodes-by-matching-text) – Phrogz

You can do it like this:

require 'nokogiri' 

doc=Nokogiri::HTML::parse <<-eot 
<tr> 
    <th>Consignment Service Code</th> 
    <td>ND16</td> 
</tr> 
eot 

node = doc.at_xpath("//*[text()='Consignment Service Code']/following-sibling::*[1]") 
puts node.text 
# >> ND16

Here is an additional attempt that might help you going forward:

## parent node 
parent_node = doc.at_xpath("//*[text()='Consignment Service Code']/..") 
puts parent_node.name # => tr 

## to get the child td 
puts parent_node.at_xpath("./td").text # => ND16 (relative path; "//td" would search the whole document)

puts parent_node.to_html 

#<tr> 
#<th>Consignment Service Code</th> 
# <td>ND16</td> 
#</tr>

2013-09-25 11:09:41

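Another way.

Use Nokogiri's css method to find the tr nodes, then select those whose th tag contains the desired text. Finally, work with the selected nodes and extract the td values: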

require 'nokogiri' 

str = '<tr> 
    <th>Consignment</th> 
    <td>ND15</td> 
</tr> 
<tr> 
    <th>Consignment Service Code</th> 
    <td>ND16</td> 
</tr> 
<tr> 
    <th>Consignment Service Code</th> 
    <td>ND17</td> 
</tr>' 

doc = Nokogiri::HTML.parse(str) 
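# keep only the rows whose <th> text is exactly the label
nodes = doc.css('tr').select { |el| el.css('th').text =~ /^Consignment Service Code$/ }

# print the <td> text of each matching row
nodes.each do |el|
  p el.css('td').text
end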

The output is:

"ND16" 
"ND17"

2013-09-25 12:22:33

You could use this http://api.jquery.com/text-selector/ method in the same way with '#css'... –

You might want to explain how that link applies to Nokogiri's 'css' and to this answer. –
