2012-04-05 92 views
15

Mechanize: how to get the current URL

I have code like this:

require 'mechanize' 
@agent = Mechanize.new 
page = @agent.get('http://something.com/?page=1') 
# Escape the literal "?" so it is not treated as a regex quantifier.
next_page = page.link_with(:href => /\?page=2/).click

As you can see, this code should move on to the next page.

next_page should have the URL http://something.com/?page=2

How do I get the current URL of next_page?

Answers

22
next_page.uri.to_s 

http://www.rubydoc.info/gems/mechanize/Mechanize/Page/Link#uri-instance_method
http://ruby-doc.org/stdlib-2.4.1/libdoc/uri/rdoc/URI.html

For testing purposes, I did the following in IRB:

require 'mechanize' 
@agent = Mechanize.new 

page = @agent.get('http://news.ycombinator.com/news') 
=> #<Mechanize::Page 
{url #<URI::HTTP:0x00000001ad3198 URL:http://news.ycombinator.com/news>} 
{meta_refresh} 
{title "Hacker News"} 
{iframes} 
{frames} 
{links 
    #<Mechanize::Page::Link "" "http://ycombinator.com"> 
    #<Mechanize::Page::Link "Hacker News" "news"> 
    #<Mechanize::Page::Link "new" "newest"> 
    #<Mechanize::Page::Link "comments" "newcomments"> 
    #<Mechanize::Page::Link "ask" "ask"> 
    #<Mechanize::Page::Link "jobs" "jobs"> 
    #<Mechanize::Page::Link "submit" "submit"> 
    #<Mechanize::Page::Link "login" "newslogin?whence=%6e%65%77%73"> 
    #<Mechanize::Page::Link "" "vote?for=3803568&dir=up&whence=%6e%65%77%73"> 
    #<Mechanize::Page::Link 
    "Don’t Be Evil: How Google Screwed a Startup" 
    "http://blog.hatchlings.com/post/20171171127/dont-be-evil-how-google-screwed-a-startup"> 
    #<Mechanize::Page::Link "mikeknoop" "user?id=mikeknoop"> 
    #<Mechanize::Page::Link "64 comments" "item?id=3803568"> 
    #<Mechanize::Page::Link "" "vote?for=3802515&dir=up&whence=%6e%65%77%73"> 
    # Omitted for brevity... 

next_page.uri 
=> #<URI::HTTP:0x00000001fa7818 URL:http://news.ycombinator.com/news2> 

next_page.uri.to_s 
=> "http://news.ycombinator.com/news2" 
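
The links in the session above carry relative hrefs (for example "newest"), yet next_page.uri comes back as an absolute URL. A minimal sketch of how a relative href resolves against the page URL, using only Ruby's standard URI library: the href "news2" is an assumption made for illustration (the session omits the link that was clicked), and this is not Mechanize's own code.

```ruby
require 'uri'

# URL of the page that was fetched (taken from the IRB session above).
base = URI.parse('http://news.ycombinator.com/news')

# Hypothetical relative href of the link being followed (an assumption).
href = 'news2'

# URI#+ (an alias of URI#merge) resolves a relative reference against the base.
resolved = base + href

puts resolved.to_s
```

Calling to_s on the resolved URI gives a plain String, just as next_page.uri.to_s does.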
+5

That is the URL of the link, but after the link has been followed (and any redirects have happened) the current URL will be: @agent.page.uri.to_s – pguardiario 2012-04-06 03:20:10