2016-09-23 191 views

Accessing child elements of a Nokogiri element

After parsing an HTML table, I am able to get the first row of the table as a Nokogiri element.

2.2.1 :041 > pp content[1]; nil 
#(Element:0x3feee917d1e0 { 
    name = "tr", 
    children = [ 
    #(Element:0x3feee917cfd8 { 
     name = "td", 
     attributes = [ 
     #(Attr:0x3feee917cf74 { name = "valign", value = "top" })], 
     children = [ 
     #(Element:0x3feee917ca60 { 
      name = "a", 
      attributes = [ 
      #(Attr:0x3feee917c9fc { 
       name = "href", 
       value = "/cgi-bin/own-disp?action=getowner&CIK=0001513362" 
       })], 
      children = [ #(Text "Maestri Luca")] 
      })] 
     }), 
    #(Text "\n"), 
    #(Element:0x3feee917c150 { 
     name = "td", 
     children = [ 
     #(Element:0x3feee917d794 { 
      name = "a", 
      attributes = [ 
      #(Attr:0x3feee9179fb8 { 
       name = "href", 
       value = "/cgi-bin/browse-edgar?action=getcompany&CIK=0001513362" 
       })], 
      children = [ #(Text "0001513362")] 
      })] 
     }), 
    #(Text "\n"), 
    #(Element:0x3feee91796a8 { 
     name = "td", 
     children = [ #(Text "2016-09-04")] 
     }), 
    #(Text "\n"), 
    #(Element:0x3feee9179194 { 
     name = "td", 
     children = [ #(Text "officer: Senior Vice President, CFO")] 
     }), 
    #(Text "\n")] 
    }) 
=> nil 

This is the content of that row:

Maestri Luca 0001513362 2016-09-04 officer: Senior Vice President, CFO

I need to access the name, number, date, and title from the Nokogiri element. One way to do this is as follows:

2.2.1 :042 > pp content[1].text; nil 
"Maestri Luca\n0001513362\n2016-09-04\nofficer: Senior Vice President, CFO\n" 

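However, I am looking for a way to access the elements individually, rather than as one long string separated by newlines. How can I do that?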

Answer

name, number, date, title = *content[1].css('td').map(&:text) 

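If content[1] is the tr, then content[1].css('td') finds all td elements beneath it, and .map(&:text) calls td.text on each td and collects the results into an array, which we then splat with * so that we can do multiple assignment.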
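The multiple-assignment step can be tried in isolation without Nokogiri. The sketch below uses a hand-written array (the variable `cells` is a hypothetical stand-in for the result of `content[1].css('td').map(&:text)`) and also shows that, when the right-hand side is a single array, Ruby destructures it with or without the explicit splat.

```ruby
# Hypothetical stand-in for content[1].css('td').map(&:text):
# a plain array holding the text of each cell.
cells = ["Maestri Luca", "0001513362", "2016-09-04",
         "officer: Senior Vice President, CFO"]

# Multiple assignment with an explicit splat, as in the answer.
name, number, date, title = *cells

# The splat is optional for a single array on the right-hand side;
# this binds the same four values.
name2, number2, date2, title2 = cells

puts name
puts number
puts date
puts title
```

Either form works; the explicit splat mainly makes the intent visible to the reader.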

(Note: next time, please include the raw HTML fragment rather than the Nokogiri node inspection output.)