Getting HTML from a website with Ruby on Rails

How can I get page data from another website with Ruby on Rails?

With the standard library:

require "net/http"

# Plain HTTP request to port 80
http = Net::HTTP.new("google.com", 80)
request = Net::HTTP::Get.new("/")
response = http.request(request)

puts response.code   # status code as a string
puts response.body
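
The snippet above talks plain HTTP and does not follow redirects, so a site that redirects will hand back a 3xx response rather than the page. A minimal sketch of a helper that handles HTTPS and a bounded number of redirects with only the standard library might look like this. The name `fetch_html` and the default limit of 5 are assumptions made for illustration, not part of the original answer.

```ruby
require "net/http"
require "uri"

# Hypothetical helper (not from the original answer): return the body of a
# page, following at most `limit` redirects.
def fetch_html(url, limit = 5)
  raise ArgumentError, "too many redirects" if limit <= 0

  uri = URI.parse(url)
  # Net::HTTP.get_response enables TLS automatically for https:// URIs.
  response = Net::HTTP.get_response(uri)

  case response
  when Net::HTTPSuccess
    response.body
  when Net::HTTPRedirection
    # The Location header may be relative, so resolve it against the current URL.
    fetch_html(URI.join(url, response["location"]).to_s, limit - 1)
  else
    # Raises an exception describing the non-2xx status.
    response.value
  end
end
```

Calling `fetch_html("http://google.com")` would then return the body of whatever page the redirect chain ends on, or raise if the chain is longer than the limit.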


2010-09-04 kaibakker

You can use HTTParty if you just want to get the data.

Sample code (from the example):

require 'httparty'
require 'pp' 

class Google 
    include HTTParty 
    format :html 
end 

# google.com redirects to www.google.com so this is live test for redirection 
pp Google.get('http://google.com') 

puts '', '*'*70, '' 

# check that SSL requests work correctly
pp Google.get('https://www.google.com')

Nokogiri is really good at parsing this data. Here is some sample code from a Railscast:

require 'nokogiri'
require 'open-uri'

url = "http://www.walmart.com/search/search-ng.do?search_constraint=0&ic=48_0&search_query=batman&Find.x=0&Find.y=0&Find=Find"
doc = Nokogiri::HTML(URI.open(url))
puts doc.at_css("title").text 
doc.css(".item").each do |item| 
    title = item.at_css(".prodLink").text 
    price = item.at_css(".PriceCompare .BodyS, .PriceXLBold").text[/\$[0-9\.]+/] 
    puts "#{title} - #{price}" 
    puts item.at_css(".prodLink")[:href] 
end
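
The least obvious line in that snippet is the price extraction, which indexes a string with a regular expression to pull out the first dollar amount. A small standalone sketch of that one step, using made-up sample strings rather than real page data:

```ruby
# Same pattern as in the snippet above: a dollar sign followed by digits and dots.
price_pattern = /\$[0-9\.]+/

# Sample strings are invented for illustration.
samples = ["Our price: $19.99 each", "Out of stock", "Was $1,299.00"]

# String#[] with a regexp returns the first match, or nil when nothing matches.
prices = samples.map { |text| text[price_pattern] }

prices.each { |price| puts price.inspect }
# prints "$19.99", then nil, then "$1"
```

The last result shows a limit of the pattern: it stops at the thousands separator. Note also that `at_css` returns nil when a selector matches nothing, so calling `.text` on the result raises NoMethodError for items that lack the expected element.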


2010-09-04 12:39:18

Use Net::HTTP (for example, read this cheatsheet):

require 'rest-client'
RestClient.get('http://example.com/resource', params: {x: "1", y: "2"})
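
rest-client turns the `params:` hash into a URL query string. If adding a gem is not an option, roughly the same request can be sketched with the standard library alone. This is only an illustration of building the query string, and the actual request is left commented out because it needs network access.

```ruby
require "net/http"
require "uri"

# Build http://example.com/resource?x=1&y=2 from a hash of parameters.
uri = URI("http://example.com/resource")
uri.query = URI.encode_www_form({ x: "1", y: "2" })

puts uri   # the full URL including the encoded query string

# Performing the request:
# response = Net::HTTP.get_response(uri)
# puts response.code
# puts response.body
```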


2010-09-04 11:27:09

Net::HTTP ships with Ruby, which is a plus, but there are also nice higher-level libraries you can look at, such as rest-client.


2010-09-04 12:01:14 tokland

Thanks for the help. This might be just the ticket for my new project. – 2010-09-04 22:58:33

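I like OpenURI myself, if it is simply a matter of getting the content without any fuss.

Just add require 'open-uri' to your environment and then call URI.open('http://domain.tld/document.html').read. (Before Ruby 3.0 a plain open('http://domain.tld/document.html').read worked as well.)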
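
A slightly more defensive sketch of the OpenURI approach follows. The helper name `read_page` and the timeout values are assumptions chosen for illustration. The scheme check is there because `URI.open` falls back to opening local files when given a string that is not a URL.

```ruby
require "open-uri"

# Hypothetical helper (name and timeouts are illustrative assumptions):
# return the page body, or nil when the server answers with an HTTP error.
def read_page(url)
  uri = URI.parse(url)
  # Accept only http:// and https:// URLs (URI::HTTPS is a subclass of URI::HTTP).
  unless uri.is_a?(URI::HTTP)
    raise ArgumentError, "not an http(s) URL: #{url}"
  end

  uri.open(open_timeout: 5, read_timeout: 10, &:read)
rescue OpenURI::HTTPError => e
  # Raised for error responses such as 404; the message holds the status line.
  warn "request failed: #{e.message}"
  nil
end
```

With this in place, `read_page('http://domain.tld/document.html')` would return the document as a string, while a local path or an ftp:// URL is rejected before any I/O happens.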


2010-09-04 13:37:59 gaqzi
