2012-04-23 45 views
2

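How do I convert a Mechanize::File object to a Mechanize::Page object?

I have a page that logs in through a form. After logging in there are several redirects. The first one looks like this: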

#<Mechanize::File:0x1f4ff23 @filename="MYL.html", @code="200", @response={"cache-control"=>"no-cache=\"set-cookie\"", "content-length"=>"114", "set-cookie"=>"JSESSIONID=GdJnPVnhtN91KZfQPc3QzM1NLCyWDsnyvpGg8LL0Knnz3RgqxLFs!1803804592!-2134626567; path=/; secure, COOKIE_TEST=Aslyn; secure", "x-powered-by"=>"Servlet/2.4 JSP/2.0"}, @body="\r\n<html>\r\n <head>\r\n <meta http-equiv=\"refresh\" content=\"0;URL=MYL?Select=OK&StateName=38\">\r\n </head>\r\n</html>", @uri=#<URI::HTTPS:0x16e1eff URL:https://www.manageyourloans.com/MYL?StateName=global_CALMLandingPage&GUID=D1704621-1994-E076-460A-10B2B682B960>> 

So when I call page.class here, I get

Mechanize::File 

How do I convert it to a Mechanize::Page?


@pguardiario

To explain better: the object shown in my original message is stored in page.

When I call page.class I get Mechanize::File

So I execute the code suggested above:

agent = Mechanize.new 
agent.post_connect_hooks << lambda {|http| http[:response].content_type = 'text/html'} 

Then I call agent.get(page.uri.to_s), or even try any URL, such as agent.get("https://www.manageyourloans.com/MYL"), and I get an error: ArgumentError: wrong number of arguments (4 for 1)

I even tried this:

agent = Mechanize.new { |a| 
    a.post_connect_hooks << lambda { |_,_,response,_| 
    if response.content_type.nil? || response.content_type.empty? 
     response.content_type = 'text/html' 
    end 
    } 
} 

My question is: once I have done this, how do I convert the previous page to a Mechanize::Page?

Answers

3

You can convert a Mechanize::File to a Mechanize::Page by taking the body contained in the File object and passing it in as the body of a new Page:

irb(main):001:0> require 'mechanize' 
true 
irb(main):002:0> file = Mechanize::File.new(URI.parse('http://foo.com'),nil,File.read('foo.html')) 
#<Mechanize::File:0x100ef0190 
    @full_path = false, 
    attr_accessor :body = "<html><body>foo</body></html>\n", 
    attr_accessor :code = nil, 
    attr_accessor :filename = "index.html", 
    attr_accessor :response = {}, 
    attr_accessor :uri = #<URI::HTTP:0x100ef02d0 
     attr_accessor :fragment = nil, 
     attr_accessor :host = "foo.com", 
     attr_accessor :opaque = nil, 
     attr_accessor :password = nil, 
     attr_accessor :path = "", 
     attr_accessor :port = 80, 
     attr_accessor :query = nil, 
     attr_accessor :registry = nil, 
     attr_accessor :scheme = "http", 
     attr_accessor :user = nil, 
     attr_reader :parser = nil 
    > 
> 

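First, I created a fake Mechanize::File object just so the example code has one to work with. You can see in :body the contents of the file it read.

Mechanize creates a Mechanize::File object when it can't determine what the real content type is.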

irb(main):003:0> page = Mechanize::Page.new(URI.parse('http://foo.com'),nil,file.body) 
#<Mechanize::Page:0x100ed5e30 
    @full_path = false, 
    @meta_content_type = nil, 
    attr_accessor :body = "<html><body>foo</body></html>\n", 
    attr_accessor :code = nil, 
    attr_accessor :encoding = nil, 
    attr_accessor :filename = "index.html", 
    attr_accessor :mech = nil, 
    attr_accessor :response = { 
     "content-type" => "text/html" 
    }, 
    attr_accessor :uri = #<URI::HTTP:0x100ed5ed0 
     attr_accessor :fragment = nil, 
     attr_accessor :host = "foo.com", 
     attr_accessor :opaque = nil, 
     attr_accessor :password = nil, 
     attr_accessor :path = "", 
     attr_accessor :port = 80, 
     attr_accessor :query = nil, 
     attr_accessor :registry = nil, 
     attr_accessor :scheme = "http", 
     attr_accessor :user = nil, 
     attr_reader :parser = nil 
    >, 
    attr_reader :bases = nil, 
    attr_reader :encodings = [ 
     [0] nil, 
     [1] "US-ASCII" 
    ], 
    attr_reader :forms = nil, 
    attr_reader :frames = nil, 
    attr_reader :iframes = nil, 
    attr_reader :labels = nil, 
    attr_reader :labels_hash = nil, 
    attr_reader :links = nil, 
    attr_reader :meta_refresh = nil, 
    attr_reader :parser = nil, 
    attr_reader :title = nil 
> 
irb(main):004:0> page.class 
Mechanize::Page < Mechanize::File 

Just pass in the body of the File object and let Mechanize convert it into what you know it should be.

+0

I'm working through this answer, and I'm using this: page = Mechanize::Page.new(URI.parse(page.uri.to_s), nil, page.body). I get an error: undefined method '[]' for nil:NilClass – user1198316 2012-04-24 12:11:17

+0

Great answer, works for me! – 2012-09-11 14:59:14

0

I like @the Tin Man's answer, but it might be simpler to force the content type of the response:

agent.post_connect_hooks << lambda {|http| http[:response].content_type = 'text/html'} 
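
A note on the hook's arity: the ArgumentError quoted in the question ("4 for 1") suggests that the installed Mechanize calls each post-connect hook with four arguments, while the lambda above accepts only one. The sketch below reproduces that mismatch in plain Ruby, without Mechanize. The argument names (_agent, _uri, response, _body) and the symbol placeholders are assumptions for illustration, loosely following the four-parameter block shown in the question.

```ruby
# Plain-Ruby illustration of lambda arity; no Mechanize required.
# The symbols below are placeholders for whatever the library really passes.

# One-parameter hook, shaped like the lambda in this answer.
one_arg_hook = lambda { |http| http }

# Four-parameter hook, shaped like the block in the question.
# Parameter names are hypothetical.
four_arg_hook = lambda { |_agent, _uri, response, _body| response }

# Lambdas check their argument count strictly, so calling the
# one-parameter hook with four arguments raises ArgumentError.
error_message = nil
begin
  one_arg_hook.call(:agent, :uri, :response, :body)
rescue ArgumentError => e
  error_message = e.message
end

# The four-parameter hook accepts the same call and picks out the third argument.
result = four_arg_hook.call(:agent, :uri, :response, :body)

puts error_message
puts result.inspect
```

If that reading of the error is right, a hook written with four parameters, as in the question's second attempt, should avoid the ArgumentError on the asker's version.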
+0

When I do this in irb, I get: undefined method 'post_connect_hooks' for # – user1198316 2012-04-24 11:57:40

+0

In my answer, agent refers to a Mechanize object, which you can instantiate with 'Mechanize.new' – pguardiario 2012-04-24 12:01:19

+0

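agent = Mechanize.new; agent.post_connect_hooks << lambda {|http| http[:response].content_type = 'text/html'}. Reading about it, the docs say it is a list of hooks called after a response is retrieved. The agent calls the hooks and returns the response. So I would do this after I have my Mechanize::File, right? Then, if I call agent.get(urlofpagehere), it should return a Mechanize::Page? – user1198316 2012-04-24 13:10:14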