I'm following "The Bastards Book of Ruby" and trying to build a web scraper with Nokogiri. About a quarter of the way in, running the script throws this error (Errno::ENOENT, No such file or directory @ dir_s_mkdir):
Crawler.rb:6:in `mkdir': No such file or directory @ dir_s_mkdir - data-hold/nobel (Errno::ENOENT)
from Crawler.rb:6:in `<main>'
My code is as follows:
require 'rubygems'
require 'nokogiri'
require 'open-uri'

DATA_DIR = "data-hold/nobel"
Dir.mkdir(DATA_DIR) unless File.exists?(DATA_DIR)

BASE_WIKIPEDIA_URL = "http://en.wikipedia.org"
LIST_URL = "#{BASE_WIKIPEDIA_URL}/wiki/List_of_Nobel_laureates"
HEADERS_HASH = {"User-Agent" => "Ruby/#{RUBY_VERSION}"}

page = Nokogiri::HTML(open(LIST_URL))
rows = page.css('div.mw-content-ltr table.wikitable tr')

rows[1..-2].each do |row|
  hrefs = row.css("td a").map{ |a|
    a['href'] if a['href'] =~ /^\/wiki\//
  }.compact.uniq

  hrefs.each do |href|
    remote_url = BASE_WIKIPEDIA_URL + href
    local_fname = "#{DATA_DIR}/#{File.basename(href)}.html"
    unless File.exists?(local_fname)
      puts "Fetching #{remote_url}..."
      begin
        wiki_content = open(remote_url, HEADERS_HASH).read
      rescue Exception=>e
        puts "Error: #{e}"
        sleep 5
      else
        File.open(local_fname, 'w'){|file| file.write(wiki_content)}
        puts "\t...Success, saved to #{local_fname}"
      ensure
        sleep 1.0 + rand
      end # done: begin/rescue
    end # done: unless File.exists?
  end # done: hrefs.each
end # done: rows.each
I just can't figure out why it won't create a new directory to store the data in. I know I must be missing something very simple...
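
One likely cause, given the path in the error message: `Dir.mkdir` creates only the last component of a path, so `Dir.mkdir("data-hold/nobel")` raises `Errno::ENOENT` when the parent `data-hold` directory does not exist yet in the current working directory. The sketch below reproduces that under a temporary directory and shows `FileUtils.mkdir_p`, which also creates missing parents. The helper name `ensure_data_dir` is hypothetical and only for illustration; it is not part of the original script.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not in the original script): creates `path` together
# with any missing parent directories. Does nothing if it already exists.
def ensure_data_dir(path)
  FileUtils.mkdir_p(path)
  path
end

# Demonstration in a throwaway temp directory, so nothing is left behind.
Dir.mktmpdir do |tmp|
  nested = File.join(tmp, "data-hold", "nobel")

  begin
    # Fails here because the parent "data-hold" does not exist yet.
    Dir.mkdir(nested)
  rescue Errno::ENOENT => e
    puts "Dir.mkdir failed: #{e.message}"
  end

  # mkdir_p creates "data-hold" first, then "nobel" inside it.
  ensure_data_dir(nested)
  puts "Directory exists now: #{File.directory?(nested)}"
end
```

In the script above, this would amount to adding `require 'fileutils'` and replacing the `Dir.mkdir(DATA_DIR) unless ...` line with `FileUtils.mkdir_p(DATA_DIR)`. Creating the `data-hold` folder by hand before running the script should also work, assuming the script is run from the directory that contains it.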