2016-08-22

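Rake task does not save or create new records in the database

I created a Ruby script that runs fine when I run it from the console.

The script fetches some information from various websites and saves it to my database table.

However, when I converted the code into a Rake task, the code still runs but it does not save any new records. I also don't get any errors from Rake.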

# Add your own tasks in files placed in lib/tasks ending in .rake, 
# for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.

require File.expand_path('../config/application', __FILE__) 

Rails.application.load_tasks 

require './crawler2.rb' 
task :default => [:crawler] 

task :crawler do 

### ### 

require 'rubygems' 
require 'nokogiri' 
require 'open-uri' 

start = Time.now 

$a = 0 

sites = ["http://www.nytimes.com","http://www.news.com"] 

for $a in 0..sites.size-1 

url = sites[$a] 

$i = 75 

$error = 0 

avoid_these_links = ["/tv", "//www.facebook.com/"] 

doc = Nokogiri::HTML(open(url)) 

    links = doc.css("a") 
    hrefs = links.map {|link| link.attribute('href').to_s}.uniq.sort.delete_if {|href| href.empty?}.delete_if {|href| avoid_these_links.any? { |w| href =~ /#{w}/ }}.delete_if {|href| href.size < 10 } 

#puts hrefs.length 

#puts hrefs 

for $i in 0..hrefs.length 
    begin 

     #puts hrefs[60] #for debugging) 

    #file = open(url) 
    #doc = Nokogiri::HTML(file) do 

     if hrefs[$i].downcase().include? "http://" 

      doc = Nokogiri::HTML(open(hrefs[$i])) 

     else 

      doc = Nokogiri::HTML(open(url+hrefs[$i])) 

     end 

     image = doc.at('meta[property="og:image"]')['content'] 
     title = doc.at('meta[property="og:title"]')['content'] 
     article_url = doc.at('meta[property="og:url"]')['content'] 
     description = doc.at('meta[property="og:description"]')['content'] 
     category = doc.at('meta[name="keywords"]')['content'] 

     newspaper_id = 1 


     puts "\n" 
     puts $i 
     #puts "Image: " + image 
     #puts "Title: " + title 
     #puts "Url: " + article_url 
     #puts "Description: " + description 
     puts "Category: " + category 

      Article.create({ 
      :headline => title, 
      :caption => description, 
      :thumbnail_url => image, 
      :category_id => 3, 
      :status => true, 
      :journalist_id => 2, 
      :newspaper_id => newspaper_id, 
      :from_crawler => true, 
      :description => description, 
      :original_url => article_url}) unless Article.exists?(original_url: article_url) 

     $i +=1 

     #puts $i #for debugging 

     rescue 
     #puts "Error here: " + url+hrefs[$i] if $i < hrefs.length 
     $i +=1 # do_something_* again, with the next i 
     $error +=1 

    end 

end 

puts "Page: " + url 
puts "Articles: " + hrefs.length.to_s 
puts "Errors: " + $error.to_s 

$a +=1 

end 

finish = Time.now 

diff = ((finish - start)/60).to_s 

puts diff + " Minutes" 


### ### 


end 

The code runs fine if I save the file as crawler.rb and load it in the console with "load './crawler2.rb'". When I use exactly the same code in a Rake task, I get no new records.

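It feels like something is missing here. 'task :crawler do' is never closed with an 'end'. Is the article creation inside the task? The indentation suggests it might not be. – jaydel

Thanks for your comment, but I'm afraid that's not it. I tested it with some print/puts statements, and those work perfectly too. It's as if the code just skips the .create part. I don't know whether I'm using Rake wrong, or whether it's a syntax error? –

Syntax error. The 'do' needs an 'end' somewhere. – jaydel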

Answer

I figured out what was wrong.

I needed to remove:

require './crawler2.rb' 
task :default => [:crawler] 

and change the task definition to:

task :crawler => :environment do 
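
For anyone wondering why this helps: in a Rails app, `environment` is a task that boots the application, and listing it as a prerequisite makes Rake run it before the task body, so models such as Article are loaded by the time the crawler code runs. The sketch below uses plain Rake with a stand-in `environment` task (the real one is defined by Rails; `run_order` is just an illustrative name) to show the ordering only:

```ruby
require 'rake'

# Records the order in which the task bodies run (illustrative name).
run_order = []

# Stand-in for the `environment` task that Rails defines. The real task boots
# the application so that models such as Article become available.
Rake::Task.define_task(:environment) do
  run_order << :environment
end

# `:crawler => :environment` declares :environment as a prerequisite of :crawler.
Rake::Task.define_task(:crawler => :environment) do
  run_order << :crawler
end

# Invoking :crawler runs its prerequisites first, then its own body.
Rake::Task[:crawler].invoke
puts run_order.inspect
```

In a real Rakefile the same dependency is written with the `task` DSL, exactly as in the line above.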

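Now the crawler runs every ten minutes with a little help from Heroku Scheduler :-)

Thanks for your help, and sorry about the poor formatting. I hope this answer helps others.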
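
A plausible explanation for the missing error messages, assuming the task body was as posted: the bare `rescue` inside the loop catches StandardError, which includes the NameError that Ruby raises for a constant that has not been loaded. The sketch below is plain Ruby, not the original crawler; `MissingArticle` is a deliberately undefined, hypothetical constant standing in for `Article`, and both method names are made up for illustration.

```ruby
# Mirrors the original loop: a bare rescue counts the failure and moves on,
# so nothing is ever printed about what went wrong.
def crawl_silently(urls)
  errors = 0
  urls.each do |url|
    begin
      MissingArticle.create(original_url: url) # raises NameError: constant is undefined
    rescue # bare rescue catches StandardError, which includes NameError
      errors += 1
    end
  end
  errors
end

# Same loop, but the exception is captured so the cause becomes visible.
def crawl_loudly(urls)
  messages = []
  urls.each do |url|
    begin
      MissingArticle.create(original_url: url)
    rescue StandardError => e
      messages << "#{e.class}: #{e.message}"
    end
  end
  messages
end

urls = ["http://example.com/a", "http://example.com/b"]
silent_errors = crawl_silently(urls)
loud_messages = crawl_loudly(urls)

puts "Errors: #{silent_errors}"
puts loud_messages
```

Printing `e.class` and `e.message` in the rescue branch is a cheap way to make this kind of failure show up in the task output.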
