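Rather than resorting to rvest and scraping, you can use their API directly. As I said, their SQL example errors out, but it doesn't without the WHERE… part (example below). Here are building blocks for a repeatable process using either the plain search or the SQL search: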
library(jsonlite)
library(httr)
# for passing in a SQL statement
query_nj_sql <- function(sql=NULL) {
  if (is.null(sql)) return(NULL)
  res <- GET("http://data.ci.newark.nj.us/api/action/datastore_search_sql",
             query=list(sql=sql))
  stop_for_status(res) # catches errors
  fromJSON(content(res, as="text"))
}
# for their plain search syntax
query_nj_search <- function(resource_id=NULL, query=NULL, offset=NULL) {
  if (is.null(resource_id)) return(NULL)
  res <- GET("http://data.ci.newark.nj.us/api/action/datastore_search",
             query=list(resource_id=resource_id,
                        offset=offset, # pass the caller's offset through (was hard-coded to NULL)
                        q=query))
  stop_for_status(res) # catches errors
  fromJSON(content(res, as="text"))
}
# this SQL does not error out
sql_dat <- query_nj_sql('SELECT * from "d7b23f97-cba5-4c15-997c-37a696395d66"')
search_dat <- query_nj_search(resource_id="d7b23f97-cba5-4c15-997c-37a696395d66")
As I said, that SQL query does not error out.
Both calls return a slightly complex list structure, which you can inspect with:
str(sql_dat)
str(search_dat)
But the records are all there:
dplyr::glimpse(sql_dat$result$records)
## Observations: 545
## Variables: 40
## $ Total population 25 years and over (chr) "6389.0", "68.0", "4197.0", "389.0", "1211.0", "4...
## $ Male - Associate's degree (chr) "286.0", "0.0", "63.0", "6.0", "69.0", "31.0", "7...
## $ Male - Master's degree (chr) "148.0", "29.0", "379.0", "17.0", "79.0", "24.0",...
## $ Male - 7th and 8th grade (chr) "49.0", "0.0", "16.0", "2.0", "14.0", "0.0", "0.0...
## $ Female - High school graduate, GED, or alternative (chr) "915.0", "0.0", "426.0", "46.0", "174.0", "30.0",...
## $ Male - 11th grade (chr) "88.0", "0.0", "12.0", "0.0", "3.0", "0.0", "0.0"...
## $ Male - Bachelor's degree (chr) "561.0", "0.0", "878.0", "93.0", "137.0", "58.0",...
## $ Male - Some college, 1 or more years, no degree (chr) "403.0", "0.0", "179.0", "23.0", "39.0", "0.0", "...
… (this goes on a while)
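The glimpse() output above shows every column typed as chr even though the values look numeric. If you need numbers, a minimal sketch along these lines might help. The helper name chr_to_numeric is hypothetical, and it assumes a column should only be converted when every non-missing value parses cleanly as a number:

```r
# Hypothetical helper: convert character columns to numeric, but only when
# every non-missing value in the column parses as a number.
chr_to_numeric <- function(df) {
  df[] <- lapply(df, function(col) {
    if (!is.character(col)) return(col)          # leave non-character columns alone
    converted <- suppressWarnings(as.numeric(col))
    # If any non-missing value failed to parse, keep the original column.
    if (any(is.na(converted) & !is.na(col))) col else converted
  })
  df
}

# Possible usage with the response from above (not run here):
# records <- chr_to_numeric(sql_dat$result$records)
```

Columns containing any non-numeric text (including empty strings) are left as character, so nothing is silently turned into NA.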
The API looks like it may paginate, so you may have to deal with that (hence the offset parameter).
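One way to handle the pagination is to keep requesting pages with a growing offset until an empty page comes back. This is only a sketch: collect_pages, fetch_page, max_pages, and nj_search_page are illustrative names, and it assumes the plain search response keeps its rows in $result$records the same way the SQL response does, and that every page has the same columns.

```r
# Generic pager: call fetch_page(offset) until it returns no rows.
# max_pages is a safety cap so a server that ignores offset cannot loop forever.
collect_pages <- function(fetch_page, max_pages = 100) {
  pages <- list()
  offset <- 0L                           # integer, so it is never sent as "1e+05"
  for (i in seq_len(max_pages)) {
    recs <- fetch_page(offset)
    n <- NROW(recs)
    if (is.null(recs) || n == 0) break   # empty page: nothing left to fetch
    pages[[length(pages) + 1]] <- recs
    offset <- offset + n                 # next page starts after the rows seen so far
  }
  if (length(pages) == 0) return(NULL)
  do.call(rbind, pages)                  # assumes identical columns on every page
}

# Builds a page fetcher for one resource; nothing is requested until it is called.
# ASSUMPTION: records are found under $result$records in the search response.
nj_search_page <- function(resource_id) {
  function(offset) {
    res <- httr::GET("http://data.ci.newark.nj.us/api/action/datastore_search",
                     query = list(resource_id = resource_id, offset = offset))
    httr::stop_for_status(res)
    jsonlite::fromJSON(httr::content(res, as = "text"))$result$records
  }
}

# Possible usage (not run here):
# all_recs <- collect_pages(nj_search_page("d7b23f97-cba5-4c15-997c-37a696395d66"))
```

Stopping on an empty page costs one extra request but avoids guessing the server's page size.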
Since the NJ Edu API supports OData queries, you could also use the RSocrata package.
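Do you absolutely want to use R? I would recommend python for this kind of dynamic scraping... –
@ColonelBeauvel so you haven't seen 'httr', 'rvest', 'xml2' in action, eh? SO is not the place to start religious wars. – hrbrmstr
Unfortunately, the example SQL call (the one on their site, which is what you pasted, so it's not your fault) generates an error in their API. Do you need to use SQL? The other, parameter-based calls work (I think) – hrbrmstr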