2011-06-28 86 views
3

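EOFError when making an HTTP request with Ruby

I am currently writing a script that iterates over a list of URLs and does some processing on each of them. However, one URL in my list is giving me a problem. The code is as follows: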

require 'net/http'
require 'uri'

url = "https://secure.www.alumniconnections.com/olc/pub/CDB/events/attendance.cgi?tmpl=attendance&event=2309515&sort=4"
uri = URI.parse(url)
response = Net::HTTP.get_response(uri)

The last line raises the following error:

EOFError: end of file reached 
    from /usr/lib/ruby/1.8/net/protocol.rb:135:in `sysread' 
    from /usr/lib/ruby/1.8/net/protocol.rb:135:in `rbuf_fill' 
    from /usr/lib/ruby/1.8/timeout.rb:67:in `timeout' 
    from /usr/lib/ruby/1.8/timeout.rb:101:in `timeout' 
    from /usr/lib/ruby/1.8/net/protocol.rb:134:in `rbuf_fill' 
    from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil' 
    from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline' 
    from /usr/lib/ruby/1.8/net/http.rb:2028:in `read_status_line' 
    from /usr/lib/ruby/1.8/net/http.rb:2017:in `read_new' 
    from /usr/lib/ruby/1.8/net/http.rb:1051:in `request' 
    from /usr/lib/ruby/1.8/net/http.rb:948:in `request_get' 
    from /usr/lib/ruby/1.8/net/http.rb:380:in `get_response' 
    from /usr/lib/ruby/1.8/net/http.rb:543:in `start' 
    from /usr/lib/ruby/1.8/net/http.rb:379:in `get_response' 
    from (irb):5 
    from /usr/lib/ruby/1.8/uri/ftp.rb:190 

None of the other URLs in my list seem to give me any grief. Can anyone explain why I am getting this error?

Answers

6

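Entering https://secure.www.alumniconnections.com/ seems to redirect me to http://www.harrisconnect.com/. My guess is that your code cannot handle the redirect. Try using Mechanize (http://mechanize.rubyforge.org/) to deal with this. I would also suggest wrapping your code in some error handling, such as: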

# Prevent infinite loops
counter = 0

begin
  # Your code here

rescue EOFError
  puts "encountered EOFError"

  # Give up after 3 retries
  if counter < 3
    counter += 1
    puts "retry: #{counter}"
    # retry re-runs the begin block; redo is only valid inside a loop or block
    retry
  else
    puts "FAILED CONNECTION #{counter} TIMES"
    counter = 0
  end
end

This makes the code try the connection again. It has helped me many times in the past when connecting to URLs.
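The same idea can be packaged as a small helper so the attempt counter cannot leak between URLs. This is only a minimal sketch: the method name with_retries and its max_attempts parameter are made up for illustration and do not come from any library.

```ruby
# Run the given block, retrying when it raises EOFError.
# with_retries and max_attempts are hypothetical names for this sketch.
def with_retries(max_attempts = 3)
  attempts = 0
  begin
    attempts += 1
    # The block receives the current attempt number (1, 2, 3, ...).
    yield attempts
  rescue EOFError
    # Re-run the begin block while attempts remain.
    retry if attempts < max_attempts
    # Out of attempts: re-raise the original EOFError to the caller.
    raise
  end
end
```

A call such as response = with_retries { Net::HTTP.get_response(uri) } would then try up to three times and let the EOFError propagate if every attempt fails.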

Edit:

require 'rubygems' 
require 'mechanize' 

agent = Mechanize.new 
html_text = agent.get("https://secure.www.alumniconnections.com/olc/pub/CDB/events/attendance.cgi?tmpl=attendance&event=2309515&sort=4").body 

html_file = File.open("html_file.html", "w") 
html_file.write(html_text) 
html_file.close 

This wrote your web page to a file just fine for me, so give it a try.
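If redirects really are the problem and adding Mechanize is not an option, Net::HTTP can follow them by hand. The sketch below assumes a Ruby recent enough to accept the :use_ssl option in Net::HTTP.start (1.9.2 or later, as far as I know); the helper names http_get and fetch_following_redirects and the limit of 5 hops are my own choices, not part of the library.

```ruby
require 'net/http'
require 'uri'

# Perform a single GET, enabling TLS whenever the URL scheme is https.
# http_get is a hypothetical helper name.
def http_get(uri)
  Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
    http.get(uri.request_uri)
  end
end

# Follow Location headers until a non-redirect response arrives.
# limit (an assumed default of 5) bounds the length of the redirect chain.
def fetch_following_redirects(url, limit = 5)
  uri = URI.parse(url)
  limit.times do
    response = http_get(uri)
    location = response['location']
    # Stop on anything that is not a redirect carrying a Location header.
    return response unless response.is_a?(Net::HTTPRedirection) && location
    # Location may be relative, so resolve it against the current URI.
    uri = uri.merge(location)
  end
  raise "too many redirects (limit #{limit})"
end
```

Calling fetch_following_redirects with the URL from the question would return the final response object, whose body can then be written to a file as above.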

+1

Could your first snippet cause an infinite loop? –

+1

Yes, I believe it could. Somehow I never ran into it. I will make a quick edit to fix that, though I can't guarantee it will be the best solution. – scradge

0

If this is HTTPS rather than plain HTTP, you can try this (works on Ruby 1.8.6):

require 'rubygems' 
require "net/https" 
require "uri" 


address = "https://www.your-secure-domain-here.com" 
uri = URI.parse(address) 
http = Net::HTTP.new(uri.host, uri.port) 
http.use_ssl = true 
http.verify_mode = OpenSSL::SSL::VERIFY_NONE 
request = Net::HTTP::Get.new(uri.request_uri) 
request.basic_auth("username", "password") 
response = http.request(request) 

In my case, instead of "username" and "password" I had to pass "SECRET-API-KEY" and "api_token".

Give it a try and see if it helps.
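This probably also explains the EOFError in the question: on Ruby 1.8, Net::HTTP.get_response(uri) does not switch on SSL for an https URI, so it appears to speak plain HTTP to port 443 and the server closes the connection before sending a status line. A minimal sketch of a scheme-aware setup follows. The helper name build_http is hypothetical, and the sketch verifies the server certificate with VERIFY_PEER instead of skipping the check with VERIFY_NONE, which is my substitution rather than what the snippet above does.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Build (but do not open) a connection object configured from the URI scheme.
# build_http is a hypothetical helper name for this sketch.
def build_http(uri)
  http = Net::HTTP.new(uri.host, uri.port)
  if uri.scheme == 'https'
    http.use_ssl = true
    # Check the server certificate rather than disabling verification.
    http.verify_mode = OpenSSL::SSL::VERIFY_PEER
  end
  http
end
```

The request itself is then sent as before, for example with build_http(uri).request(Net::HTTP::Get.new(uri.request_uri)). If certificate verification fails on an old installation, that usually points to a missing CA bundle rather than a reason to turn verification off.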
