Crawling a website in Ruby and appending a URL parameter

I'm trying to crawl a website, appending a URL parameter to each address before requesting it. This is what I have so far:

require "spidr"

Spidr.site('http://www.example.com/') do |spider|
  spider.every_url { |url| puts url }
end

But I want the spider to hit every page with a fixed parameter appended, like this:

example.com/page1?var=param1
example.com/page2?var=param1
example.com/page3?var=param1
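
For reference, here is a minimal sketch of appending the parameter with Ruby's standard `uri` library, so that a page which already has a query string gets `&var=param1` rather than a second `?`. The helper name `with_param` is hypothetical and not part of Spidr; the URLs are the placeholders from above.

```ruby
require "uri"

# Hypothetical helper: returns a copy of `url` with key=value added to its
# query string, keeping any parameters that are already present.
def with_param(url, key, value)
  uri = URI.parse(url.to_s)
  params = URI.decode_www_form(uri.query || "")
  params << [key, value]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

puts with_param("http://example.com/page1", "var", "param1")
puts with_param("http://example.com/page2?x=1", "var", "param1")
```

Because the helper calls `to_s` on its argument, it should accept either a plain string or a `URI` object.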

Update 1 - Tried this, but it didn't work; it errors out ("405 Method Not Allowed") after a few iterations:

require "spidr"
require "open-uri"

Spidr.site('http://example.com') do |spider|
  spider.every_url do |url|
    link = url + "?foo=bar"
    response = open(link).read
  end
end
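
One thing worth noting about the attempt above: open-uri raises `OpenURI::HTTPError` when the server answers with an error status such as 405, and an unrescued exception stops the whole loop. The sketch below does not explain why the server returns 405; it only shows, under that assumption, how a single failing request could be logged and skipped. The helper name `fetch_or_nil` is hypothetical.

```ruby
require "open-uri"

# Hypothetical helper: fetches `link` and returns the response body, or nil
# when the server replies with an HTTP error status (open-uri raises
# OpenURI::HTTPError in that case).
def fetch_or_nil(link)
  URI.open(link).read
rescue OpenURI::HTTPError => e
  warn "#{link}: #{e.message}"
  nil
end
```

Inside the crawl block, `response = fetch_or_nil(link)` would then return `nil` for the pages that fail instead of aborting. `URI.open` is used here because the bare `open(link)` form no longer accepts URLs on Ruby 3.0 and later.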


2017-09-25 mustacheMcGee

You just need to add the parameter to the array of URLs... have you tried anything? –

That did occur to me, but then what? Once I've built an array of properly formatted URLs, do I run it through Spidr? – mustacheMcGee

Sounds good to me. Give it a try. –

Rather than relying on Spidr, I just grabbed a CSV of the URLs I needed from Google Analytics and ran through them. That got the job done.

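require 'csv'
require 'open-uri'

# Each row of the CSV holds a page path in its first column.
CSV.foreach(File.path("the-links.csv")) do |row|
  link = "http://www.example.com" + row[0] + "?foo=bar"
  # Note: URI.encode and the URL form of open were removed in Ruby 3.0.
  encoded_url = URI.encode(link)
  response = open(encoded_url).read
  puts encoded_url
  puts
end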
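
The script above was written in 2017. On Ruby 3.0 or later, `URI.encode` and the URL form of `open` are no longer available, so here is a rough sketch of how the link-building step might look today. It assumes the first CSV column holds paths that are already valid, escaped URL paths without a query string; `BASE_URL` and `links_from_csv` are hypothetical names, and an inline string stands in for the-links.csv.

```ruby
require "csv"
require "uri"

BASE_URL = "http://www.example.com" # placeholder host from the answer

# Hypothetical helper: turns CSV text (one path per row, first column) into
# full URLs with foo=bar as the query string.
def links_from_csv(csv_text)
  CSV.parse(csv_text).map do |row|
    uri = URI.join(BASE_URL, row[0])
    uri.query = URI.encode_www_form(foo: "bar")
    uri.to_s
  end
end

# Inline sample standing in for the contents of the-links.csv.
puts links_from_csv("/page1\n/page2\n")
```

Each resulting link can then be requested with `URI.open(link).read` after `require "open-uri"`, which replaces the bare `open(link)` call.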


2017-09-26 14:30:30 mustacheMcGee
