Web scraping with Ruby

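I'm new to programming, and I have a project where I must write a Ruby script that retrieves information about a given repository from GitHub, parses the JSON data, and prints it to the command line in a usable format.

I have already looked through the Mechanize guide. Is there any other documentation I can check to get this done?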

Source

2011-07-11 kamenraider2

I only have hpricot and mechanize installed, and I don't have permission to install anything else. – kamenraider2

Use GitHub's Repositories API. Everything you want is available there, with no scraping or weird hacks. Responses are in JSON by default.

Source

2011-07-11 22:17:49

Thanks. I saw that earlier too, but the thing is that I don't have curl. Any suggestions? – kamenraider2

Use httparty as @jdc suggests, or RESTClient, or just plain old Net::HTTP.
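Since extra gems cannot be installed, the Net::HTTP route can be sketched with the standard library alone. This is a minimal sketch, not a definitive implementation: it assumes Ruby 1.9 or later (where `json` ships with the standard library), it targets the current `api.github.com/repos/OWNER/REPO` endpoint rather than the 2011-era URL, and the helper names `repo_uri` and `fetch_repo` are made up for illustration.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Build the URL for one repository.
# Assumption: the current GitHub REST API layout, /repos/OWNER/REPO.
def repo_uri(owner, repo)
  URI("https://api.github.com/repos/#{owner}/#{repo}")
end

# Fetch the repository metadata and return it as a Ruby Hash.
# Raises if GitHub answers with anything other than a 2xx status.
def fetch_repo(owner, repo)
  uri = repo_uri(owner, repo)

  request = Net::HTTP::Get.new(uri)
  request['User-Agent'] = 'ruby-net-http-example' # hypothetical client name
  request['Accept'] = 'application/json'

  response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
    http.request(request)
  end

  unless response.is_a?(Net::HTTPSuccess)
    raise "GitHub returned HTTP #{response.code}"
  end

  JSON.parse(response.body)
end
```

Calling `fetch_repo('joncooper', 'beanstalkd')` should then return a Hash that can be inspected key by key, for example `repo['description']`. No curl is needed, because Ruby makes the HTTP request itself.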

Following up on @Douglas's answer.

require 'httparty'
require 'awesome_print'

# Minimal HTTParty client; base_uri is prepended to every request path.
class Repository
  include HTTParty
  base_uri 'www.github.com'
end

# Request the repository metadata from the v2 JSON endpoint.
response = Repository.get('/api/v2/json/repos/show/joncooper/beanstalkd')

# parsed_response is the JSON body decoded into a Ruby Hash.
ap response.parsed_response

# Output:
{
    "repository" => {
                 "name" => "beanstalkd",
                 "size" => 128,
           "created_at" => "2011/04/29 09:43:43 -0700",
             "has_wiki" => true,
               "parent" => "kr/beanstalkd",
              "private" => false,
             "watchers" => 1,
                 "fork" => true,
             "language" => "C",
                  "url" => "https://github.com/joncooper/beanstalkd",
            "pushed_at" => "2011/07/05 22:10:53 -0700",
          "open_issues" => 0,
        "has_downloads" => true,
           "has_issues" => false,
             "homepage" => "http://kr.github.com/beanstalkd/",
                "forks" => 0,
          "description" => "Beanstalk is a simple, fast work queue.",
               "source" => "kr/beanstalkd",
                "owner" => "joncooper"
    }
}

See http://httparty.rubyforge.org/ for more: what you want to do is pretty easy with the GitHub API and the HTTParty gem.
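The question also asks for the data to be printed to the command line in a usable format. One possible way to do that, once the JSON is in hand, is sketched below. It assumes the response keeps the `"repository"` wrapper shown above; `SAMPLE_JSON` is a trimmed copy of that response used in place of a live request, and the `FIELDS` list and the `format_repository` helper are hypothetical names chosen for this example.

```ruby
require 'json'

# Trimmed copy of the response shown above, standing in for a live request.
SAMPLE_JSON = <<~JSON
  {
    "repository": {
      "name": "beanstalkd",
      "owner": "joncooper",
      "language": "C",
      "watchers": 1,
      "forks": 0,
      "url": "https://github.com/joncooper/beanstalkd"
    }
  }
JSON

# Fields to print, in display order (an arbitrary selection).
FIELDS = %w[name owner language watchers forks url].freeze

# Parse the JSON text and return one aligned "key : value" line per field.
def format_repository(json_text, fields = FIELDS)
  repo = JSON.parse(json_text).fetch('repository')
  width = fields.map(&:length).max
  fields.map { |key| "#{key.ljust(width)} : #{repo[key]}" }
end

puts format_repository(SAMPLE_JSON)
```

Returning the lines as an Array instead of printing inside the helper keeps the formatting step separate from the output, so the same helper can feed `puts`, a log file, or a test.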

Source

2011-07-11 23:05:49 jdc
