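Efficient bulk update of a Rails database
I'm trying to build a rake utility that will update my database every so often.
Here is the code I have so far: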
namespace :utils do
  # utils:update_ip
  # Downloads the file from <url> to the temp folder then unzips it in <file_path>
  # Then updates the database.
  desc "Update ip-to-country database"
  task :update_ip => :environment do
    require 'open-uri'
    require 'zip/zipfilesystem'
    require 'csv'
    file_name = "ip-to-country.csv"
    file_path = "#{RAILS_ROOT}/db/" + file_name
    url = 'http://ip-to-country.webhosting.info/downloads/ip-to-country.csv.zip'
    # Check the last time we updated the database.
    mod_time = ''
    mod_time = File.new(file_path).mtime.httpdate if File.exists? file_path
    begin
      puts 'Downloading update...'
      # Send a conditional GET to the server.
      zipped_file = open(url, {'If-Modified-Since' => mod_time})
    rescue OpenURI::HTTPError => the_error
      if the_error.io.status[0] == '304'
        puts 'Nothing to update.'
      else
        puts 'HTTPError: ' + the_error.message
      end
    else # file was downloaded without error.
      Rails.logger.info 'ip-to-country: Remote database was last updated: ' + zipped_file.meta['last-modified']
      delay = Time.now - zipped_file.last_modified
      Rails.logger.info "ip-to-country: Database was outdated for: #{delay} seconds (#{delay/60/60/24} days)"
      puts 'Unzipping...'
      File.delete(file_path) if File.exists? file_path
      Zip::ZipFile.open(zipped_file.path) do |zipfile|
        zipfile.extract(file_name, file_path)
      end
      Iptocs.delete_all
      puts "Importing new database..."
      # TODO: way, way too heavy; find a better solution.
      CSV.open(file_path, 'r') do |row|
        ip = Iptocs.new( :ip_from => row.shift,
                         :ip_to => row.shift,
                         :country_code2 => row.shift,
                         :country_code3 => row.shift,
                         :country_name => row.shift)
        ip.save
      end # CSV
      puts "Complete."
    end # begin-rescue
  end # task
end # namespace
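The problem I'm running into is that this takes several minutes to insert the 100,000-plus entries. I'd like to find a more efficient way to update my database. Ideally it would stay independent of the database type, but if that isn't possible, my production server will be running MySQL.
Thanks for any insight.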
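One way to cut the per-row cost described above is to stop issuing one INSERT per record and send the rows in batches instead. Below is a minimal sketch of that idea, not a definitive implementation: `bulk_insert_statements`, `quote_value`, and `COLUMNS` are hypothetical names, the table name `iptocs` is an assumption based on the `Iptocs` model, and the sample rows are made up for illustration. It assumes a reasonably recent Ruby and a database that accepts multi-row `VALUES` lists (MySQL does).

```ruby
require 'csv'

# Column order assumed to match the order of fields in the CSV file.
COLUMNS = %w[ip_from ip_to country_code2 country_code3 country_name].freeze

# Simplified quoting for illustration only: wraps the value in single
# quotes and doubles any embedded single quote. In a real Rails task,
# the connection's own quoting method is the safer choice.
def quote_value(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# Hypothetical helper: turns parsed CSV rows into multi-row INSERT
# statements, one statement per batch of `batch_size` rows.
def bulk_insert_statements(rows, table: 'iptocs', columns: COLUMNS, batch_size: 1000)
  rows.each_slice(batch_size).map do |batch|
    tuples = batch.map do |row|
      "(" + row.first(columns.size).map { |v| quote_value(v) }.join(", ") + ")"
    end
    "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{tuples.join(', ')}"
  end
end

# Made-up sample rows in the assumed five-column layout.
sample = <<~CSV
  "0","16777215","ZZ","ZZZ","RESERVED"
  "16777216","16777471","AU","AUS","AUSTRALIA"
  "16777472","16778239","CN","CHN","CHINA"
CSV

statements = bulk_insert_statements(CSV.parse(sample), batch_size: 2)
statements.each { |sql| puts sql }
```

Inside the rake task, each generated string could then be run with something like `Iptocs.connection.execute(sql)`, ideally with the connection's own `quote` method in place of the simplified `quote_value`. This skips model instantiation and validations, so it only suits data you already trust.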
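This is exactly what I was looking for, thanks. – codr 2010-02-19 01:54:05
The gem supports importing from CSV. This eliminates the `ActiveRecord` instantiation and validation cost. See this article for more details: http://www.rubyinside.com/advent2006/17-extendingarhtml – 2010-02-19 02:24:52
Helped me too, thanks! – ambertch 2010-05-17 19:57:58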