GitHub Archive: problem retrieving data with a range
I want to retrieve data from [GitHub Archive](https://www.githubarchive.org/), and I am running into a problem when I add a range to the request. Using http://data.githubarchive.org/2015-01-01-15.json.gz works, but with http://data.githubarchive.org/2015-01-01-{0..23}.json.gz I get an `'open_http': 404 Not Found (OpenURI::HTTPError)` message.
Running curl against http://data.githubarchive.org/2015-01-01-{0..23}.json.gz seems to work.
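Basically, my goal is to write a program that retrieves the top 42 most active repositories within a given time range.
Here is my code. Please let me know whether I am using the API incorrectly or whether the problem is in the code.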
require 'open-uri'
require 'zlib'
require 'yajl'
require 'pry'
require 'date'

events = Hash.new(0)
type = 'PushEvent'
after = '2015-01-01T13:00:00Z'
before = '2015-01-02T03:12:14-03:00'
f_after_time = DateTime.parse(after).strftime('%Y-%m-%d-%H')
f_before_time = DateTime.parse(before).strftime('%Y-%m-%d-%H')

base = 'http://data.githubarchive.org/'
# query = '2015-01-01-15.json.gz'     # a single hour: works
query = '2015-01-01-{0..23}.json.gz'  # the range: 404 Not Found
url = base + query
uri = URI.encode(url)
gz = open(uri)
js = Zlib::GzipReader.new(gz).read

# Count events of the requested type per repository
Yajl::Parser.parse(js) do |event|
  if event['type'] == type
    if event['repository']
      repo_name = event['repository']['url'].gsub('https://github.com/', '')
      events[repo_name] += 1
    elsif event['repo'] # to account for the older API
      repo_name = event['repo']['url'].gsub('https://github.com/', '')
      events[repo_name] += 1
    end
  end
end

# Sort repos by number of events and print the top 42
sorted_events = events.sort_by { |_key, value| value }.reverse.first(42)
sorted_events.each { |e| puts "#{e[0]} - #{e[1]} events" }
This question has already been answered on another page: [link to the answer](http://stackoverflow.com/questions/30789924/url-encoding-issues-with-curly-braces)
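For context, `{0..23}` is shell brace expansion: when the command is run from a shell such as bash, the shell turns it into 24 separate URLs before curl ever sees them. open-uri does no such expansion, so `URI.encode` percent-encodes the braces and the server is asked for one literal file that does not exist, hence the 404. A minimal sketch of one way around this is to build one URL per hour in Ruby and fetch them in a loop. The helper names (`hourly_archive_urls`, `tally_events`, `top_repos`) are made up for illustration. The sketch also assumes that the archive files are named by UTC hour without zero padding (as in `2015-01-01-15` and the `{0..23}` range above), that each file holds one JSON object per line, and that newer events carry an `owner/repo` string in `event['repo']['name']`. It uses the standard `json` library instead of yajl to stay self-contained.

```ruby
require 'json'
require 'open-uri'
require 'time'
require 'zlib'

BASE_URL = 'http://data.githubarchive.org/'

# Build one archive URL per hour between two timestamps (inclusive).
# Assumption: files are named by UTC hour with no zero padding.
def hourly_archive_urls(after, before, base: BASE_URL)
  start  = Time.parse(after).utc
  finish = Time.parse(before).utc
  # Truncate the start time to the beginning of its hour
  current = Time.utc(start.year, start.month, start.day, start.hour)
  urls = []
  while current <= finish
    # %-H prints the hour without a leading zero (6, not 06)
    urls << "#{base}#{current.strftime('%Y-%m-%d-%-H')}.json.gz"
    current += 3600
  end
  urls
end

# Tally events of one type per repository.
# `lines` is anything whose #each yields one JSON document per line.
def tally_events(lines, counts, type: 'PushEvent')
  lines.each do |line|
    next if line.strip.empty?
    event = JSON.parse(line)
    next unless event['type'] == type
    if event['repo']
      # Assumed newer schema: name is already "owner/repo"
      counts[event['repo']['name']] += 1
    elsif event['repository']
      # Older schema, handled as in the question's code
      counts[event['repository']['url'].sub('https://github.com/', '')] += 1
    end
  end
  counts
end

# Download every hourly file in the range and return the busiest repos
# as [name, count] pairs, highest count first.
def top_repos(after, before, limit: 42)
  counts = Hash.new(0)
  hourly_archive_urls(after, before).each do |url|
    URI.open(url) do |gz|
      tally_events(Zlib::GzipReader.new(gz), counts)
    end
  end
  counts.max_by(limit) { |_name, count| count }
end
```

Calling `top_repos('2015-01-01T13:00:00Z', '2015-01-02T03:12:14-03:00')` would then fetch the hourly files one at a time, and each returned pair can be printed the same way as in the question. Processing each file as it is downloaded also avoids holding every hour's decompressed JSON in memory at once.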