2016-03-08
Too many connection resets exception error - Mechanize in Ruby

I am using Mechanize with Ruby and keep getting this exception:

C:/Ruby200/lib/ruby/2.0.0/net/protocol.rb:158:in `rescue in rbuf_fill': too many connection resets (due to Net::ReadTimeout - Net::ReadTimeout) after 0 requests on 37920120, last used 1457465950.371121 seconds ago (Net::HTTP::Persistent::Error) 
    from C:/Ruby200/lib/ruby/2.0.0/net/protocol.rb:152:in `rbuf_fill' 
    from C:/Ruby200/lib/ruby/2.0.0/net/protocol.rb:134:in `readuntil' 
    from C:/Ruby200/lib/ruby/2.0.0/net/protocol.rb:144:in `readline' 
    from C:/Ruby200/lib/ruby/2.0.0/net/http/response.rb:39:in `read_status_line' 
    from C:/Ruby200/lib/ruby/2.0.0/net/http/response.rb:28:in `read_new' 
    from C:/Ruby200/lib/ruby/2.0.0/net/http.rb:1406:in `block in transport_request' 
    from C:/Ruby200/lib/ruby/2.0.0/net/http.rb:1403:in `catch' 
    from C:/Ruby200/lib/ruby/2.0.0/net/http.rb:1403:in `transport_request' 
    from C:/Ruby200/lib/ruby/2.0.0/net/http.rb:1376:in `request' 
    from C:/Ruby200/lib/ruby/gems/2.0.0/gems/rest-client-1.6.7/lib/restclient/net_http_ext.rb:51:in `request' 
    from C:/Ruby200/lib/ruby/gems/2.0.0/gems/net-http-persistent-2.9/lib/net/http/persistent.rb:986:in `request' 
    from C:/Ruby200/lib/ruby/gems/2.0.0/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:259:in `fetch' 
    from C:/Ruby200/lib/ruby/gems/2.0.0/gems/mechanize-2.7.3/lib/mechanize.rb:1281:in `post_form' 
    from C:/Ruby200/lib/ruby/gems/2.0.0/gems/mechanize-2.7.3/lib/mechanize.rb:548:in `submit' 
    from C:/Users/Feshaq/workspace/ERISScrap/eca_sample/eca_on_scraper.rb:152:in `<main>' 

This is line 152:

#Click the form button 
agent.page.forms[0].click_button 

I also tried the snippet below and keep getting the same exception:

#get the form 
form = agent.page.form_with(:name => "AdvancedSearchForm") 
# get the button you want from the form 
button = form.button_with(:value => "Search") 
# submit the form using that button 
agent.submit(form, button) 

Any help is appreciated.


These usually mean there is a problem with the internet connection. – pguardiario


There are also various timeout (and keep-alive) settings you can try (http://mechanize.rubyforge.org/Mechanize.html#method-i-keep_alive-3D). – Felix
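
As a rough sketch of the settings that comment points to, the snippet below adjusts the timeouts and turns off keep-alive on a Mechanize agent. The numeric values are illustrative assumptions, not recommendations from this thread, and whether they help depends on why the server is resetting the connection.

```ruby
require 'mechanize'

agent = Mechanize.new do |a|
  a.open_timeout = 30    # seconds to wait for the connection to open (assumed value)
  a.read_timeout = 120   # seconds to wait for data from the server (assumed value)
  a.idle_timeout = 5     # reset a persistent connection that sat idle longer than this
  a.keep_alive   = false # use a new connection for each request
end
```

Disabling keep-alive trades some speed for not reusing a connection that the server may already have dropped.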

Answers

1

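I have run into this problem many times. The way I handle it is to wrap the block of code that runs the scraper in a rescue clause; on an error I just shut down the connection and reset the agent and its headers. This has worked 100% of the time and has not given me any problems. Then I continue where the code left off. The example below iterates over my list of buildings, looks up the web page for each one, and scrapes it: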

def begin_scraping_list
  Building.all.each do |building_info|
    begin
      next if convert_boroughs_for_form(building_info) == :no_good
      fill_in_first_page_address_form_and_submit(building_info)
      get_to_proper_second_page
      go_to_page_we_want_for_scraping
      scrape_the_table(building_info)
    rescue
      # Any StandardError, including Net::HTTP::Persistent::Error, lands here.
      puts "error happened"
      # Close the open connections and start over with a fresh agent.
      @agent.shutdown
      @agent = Mechanize.new { |agent| agent.user_agent_alias = 'Windows Chrome' }
      @agent.request_headers
      sleep(5)
      # Run the block again for the same building.
      redo
    end
  end
end

So in your case, you would wrap the problem area from your question in a rescue block:

begin
  # get the form
  form = agent.page.form_with(:name => "AdvancedSearchForm")
  # get the button you want from the form
  button = form.button_with(:value => "Search")
  # submit the form using that button
  agent.submit(form, button)
rescue
  # close the failed connection and build a fresh agent
  agent.shutdown
  agent = Mechanize.new { |a| a.user_agent_alias = 'Windows Chrome' }
  agent.request_headers
  sleep(2)
  # a new agent has no current page, so load the form page again first;
  # search_url is a hypothetical name for that page's URL
  agent.get(search_url)
  # get the form
  form = agent.page.form_with(:name => "AdvancedSearchForm")
  # get the button you want from the form
  button = form.button_with(:value => "Search")
  # submit the form using that button
  agent.submit(form, button)
end
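
The rescue-and-reset pattern above either retries forever (the `redo` version) or exactly once. A bounded variant might look like the sketch below. `with_retries` and its `max_attempts`, `wait`, and `on_error` parameters are hypothetical names introduced here for illustration; they are not part of Mechanize.

```ruby
# Hypothetical helper (not part of Mechanize): runs the block and, on a
# StandardError, calls on_error, waits, and tries again, up to max_attempts.
def with_retries(max_attempts: 3, wait: 2, on_error: nil)
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError => e
    raise if attempt >= max_attempts # give up and surface the last error
    on_error.call(e) if on_error     # e.g. shut down and rebuild the agent
    sleep(wait)
    retry
  end
end
```

The fetch-and-submit steps would go inside the block, and the agent reset (`agent.shutdown` followed by `Mechanize.new`) would go in a lambda passed as `on_error`. Capping the attempts means a site that is truly down raises an error instead of looping indefinitely.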