2012-10-30 96 views
-1

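Ruby: writing to a nested array

I want to scrape information about new album releases from a website, and I'm using Nokogiri to do it. The idea is to build a nice array that contains entries like this: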

[ 
    0 => ['The Wall', 'Pink Floyd', '1979'], 
    1 => ['Led Zeppelin I', 'Led Zeppelin', '1969'] 
] 

This is my current code. I'm a complete Ruby newbie, so any advice would be greatly appreciated.

@events = Array.new() 
# for every date we encounter 
doc.css("#main .head_type_1").each do |item| 

    date = item.text 

    # get every albumtitle 
    doc.css(".albumTitle").each_with_index do |album, index| 
    album = album.text 
    @events[index]['album'] = album 
    @events[index]['release_date'] = date 
    end 

    #get every artistname 
    doc.css(".artistName").each do |artist| 
    artist = artist.text 
    @events[index]['artist'] = artist 
    end 

end 

puts @events 

P.S. The page I want to scrape has a slightly odd format:

<tr><th class="head_type_1">20 October 1989</th></tr> 
<tr><td class="artistName">Jean Luc-Ponty</td><td class="albumTitle">Some example album</td></tr> 
<tr><td class="artistName">Some Other Artist</td><td class="albumTitle">Some example album</td></tr> 
<tr><td class="artistName">Some Other Artist</td><td class="albumTitle">Some example album</td></tr> 
<tr><th class="head_type_1">29 October 1989</th></tr> 
<tr><td class="artistName">Some Other Artist</td><td class="albumTitle">Some example album</td></tr> 

When I run this with the Ruby interpreter, I get the following error:

get_events.rb:25:in `block (2 levels) in <main>': undefined method `[]=' for nil:NilClass (NoMethodError) 
from /Users/adrian/.rvm/gems/ruby-1.9.3-p286/gems/nokogiri-1.5.5/lib/nokogiri/xml/node_set.rb:239:in `block in each' 
from /Users/adrian/.rvm/gems/ruby-1.9.3-p286/gems/nokogiri-1.5.5/lib/nokogiri/xml/node_set.rb:238:in `upto' 
from /Users/adrian/.rvm/gems/ruby-1.9.3-p286/gems/nokogiri-1.5.5/lib/nokogiri/xml/node_set.rb:238:in `each' 
from get_events.rb:23:in `each_with_index' 
from get_events.rb:23:in `block in <main>' 
from /Users/adrian/.rvm/gems/ruby-1.9.3-p286/gems/nokogiri-1.5.5/lib/nokogiri/xml/node_set.rb:239:in `block in each' 
from /Users/adrian/.rvm/gems/ruby-1.9.3-p286/gems/nokogiri-1.5.5/lib/nokogiri/xml/node_set.rb:238:in `upto' 
from /Users/adrian/.rvm/gems/ruby-1.9.3-p286/gems/nokogiri-1.5.5/lib/nokogiri/xml/node_set.rb:238:in `each' 
from get_events.rb:18:in `<main>' 

How can I fix this?

+3

What is your question? – waldrumpus

+0

Added the error output and the question :) – Carvefx

+1

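When you add an error message for code of this complexity, you should add line numbers to the code. Do you expect someone to go through the code and do all the work for you? – sawa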

Answers

1

I couldn't wrap my head around your solution, but after playing around a bit I came up with this.

require 'pp' 
require 'nokogiri' 

str = %Q{ 
<tr><th class="head_type_1">20 October 1989</th></tr> 
<tr><td class="artistName">Jean Luc-Ponty</td><td class="albumTitle">Some album</td></tr> 
<tr><td class="artistName">Some Other Artist</td><td class="albumTitle">Some album</td></tr> 
<tr><td class="artistName">Some Other Artist</td><td class="albumTitle">Some album</td></tr> 
<tr><th class="head_type_1">29 October 1989</th></tr> 
<tr><td class="artistName">Some Other Artist</td><td class="albumTitle">Some album</td></tr> 
} 

doc = Nokogiri::HTML(str) 
date = "" 
result = [] 

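doc.xpath("//tr").each do |tr|
    children = tr.children
    if children.first["class"] == "head_type_1"
        # A date header row: remember the date for the album rows that follow
        date = children.first.content
    else
        # An album row: the cells are the artist, then the album title
        artist, album = children.map {|c| c.content}
        result << {album: album, artist: artist, date: date}
    end
end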

pp result 

Output:

[{:album=>"Some album", :artist=>"Jean Luc-Ponty", :date=>"20 October 1989"},
{:album=>"Some album", :artist=>"Some Other Artist", :date=>"20 October 1989"},
{:album=>"Some album", :artist=>"Some Other Artist", :date=>"20 October 1989"},
{:album=>"Some album", :artist=>"Some Other Artist", :date=>"29 October 1989"}]

Not exactly what you asked for, but perhaps a bit more idiomatic Ruby, and I'm sure you can modify it if needed.
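If the nested-array layout from the question is really what is needed, one possible way to get there is to convert each hash afterwards. This is only a sketch in plain Ruby: the sample data below is copied by hand from the output above rather than parsed, and the name `nested` is made up for illustration.

```ruby
# Sample data shaped like the `result` array above (two rows copied by hand).
result = [
  { album: "Some album", artist: "Jean Luc-Ponty", date: "20 October 1989" },
  { album: "Some album", artist: "Some Other Artist", date: "29 October 1989" }
]

# values_at picks the fields in a fixed order, giving one inner array per album.
nested = result.map { |row| row.values_at(:album, :artist, :date) }

p nested
```

Each inner array then holds the album, artist and date in that order, so `nested[0][1]` would be the artist of the first album.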

+0

This is exactly what I was trying to achieve; I was almost there but for a small mistake. Thanks a lot for your attention! – Carvefx

+0

And the self-explanatory message of this code is that, apart from Jean Luc-Ponty, there are no other artists. :-) –

-1

The index variable is undefined in your second each.
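That said, the trace in the question points at an assignment inside the each_with_index block, where index is defined. A likely cause, assuming @events is still empty at that point as in the posted code, is that @events[index] returns nil, and nil has no []= method. A minimal sketch in plain Ruby, where the local name `events` stands in for @events and the values are made up:

```ruby
events = []

# Indexing an empty array returns nil, so assigning a key on that nil fails.
begin
  events[0]['album'] = 'The Wall'
rescue NoMethodError => e
  error_message = e.message
end

# One possible fix: create the hash for that slot before writing to it.
events[0] ||= {}
events[0]['album'] = 'The Wall'
events[0]['release_date'] = '1979'

p events
```

The `||=` line only creates a new hash when the slot is still nil, so a second loop that fills in the artist for the same index would not wipe out the album stored earlier.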

+0

That's not it. I tried doc.css(".artistName").each_with_index do |artist, index| and got the same output – Carvefx