2014-10-17

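Getting SPARQL results as CSV into R

I have the SPARQL query listed below (apologies for the length) and would like to convert its results into an R data frame, similar to the one shown in the preview available here when the query is pasted into the query input window. In short, I am only interested in downloading the figures, the column headings, and the first column identifying the geography.

When I run the current query, try to coerce the results into a data frame, and use it in ggplot, I keep getting the error `ggplot2 doesn't know how to deal with data of class list`. This happens because the returned data does not resemble the CSV file shown in the preview window when the query is tested there.

My question: what should I change in the code below so that it produces an R data frame whose values and structure correspond to the preview table (screenshot: query results)?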

Code to import the data:

    # Libs
    library(SPARQL)

    # Source the data 
    ## Define endpoint URL. 
    endpoint <- "http://data.opendatascotland.org/sparql?query" 

    ### Create Query and download table for the SIMD rank 
    query.simd <- "PREFIX stats: <http://statistics.data.gov.uk/id/statistical-geography/> 
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
    PREFIX simd: <http://data.opendatascotland.org/def/simd/> 
    PREFIX cube: <http://purl.org/linked-data/cube#> 
    PREFIX stats_dim: <http://data.opendatascotland.org/def/statistical-dimensions/> 
    PREFIX year: <http://reference.data.gov.uk/id/year/> 

    SELECT DISTINCT 
    ?dz_label 
    ?overall_rank 
    ?income_deprivation_rank 
    ?employment_deprivation_rank 
    ?health_deprivation_rank 
    ?education_deprivation_rank 
    ?access_deprivation_rank 
    ?housing_deprivation_rank 
    ?crime_deprivation_rank 

    WHERE { 

    GRAPH <http://data.opendatascotland.org/graph/simd/rank> { 
    ?overall_rank_observation stats_dim:refArea ?dz . 
    ?overall_rank_observation stats_dim:refPeriod year:2012 . 
    ?overall_rank_observation simd:rank ?overall_rank . 
    } 

    GRAPH <http://data.opendatascotland.org/graph/simd/income-rank> { 
    ?income_rank_observation stats_dim:refArea ?dz . 
    ?income_rank_observation stats_dim:refPeriod year:2012 . 
    ?income_rank_observation simd:incomeRank ?income_deprivation_rank . 
    } 

    GRAPH <http://data.opendatascotland.org/graph/simd/employment-rank> { 
    ?employment_rank_observation stats_dim:refArea ?dz . 
    ?employment_rank_observation stats_dim:refPeriod year:2012 . 
    ?employment_rank_observation simd:employmentRank ?employment_deprivation_rank . 
    } 

    GRAPH <http://data.opendatascotland.org/graph/simd/health-rank> { 
    ?health_rank_observation stats_dim:refArea ?dz . 
    ?health_rank_observation stats_dim:refPeriod year:2012 . 
    ?health_rank_observation simd:healthRank ?health_deprivation_rank . 
    } 

    GRAPH <http://data.opendatascotland.org/graph/simd/education-rank> { 
    ?education_rank_observation stats_dim:refArea ?dz . 
    ?education_rank_observation stats_dim:refPeriod year:2012 . 
    ?education_rank_observation simd:educationRank ?education_deprivation_rank . 
    } 

    GRAPH <http://data.opendatascotland.org/graph/simd/geographic-access-rank> { 
    ?access_rank_observation stats_dim:refArea ?dz . 
    ?access_rank_observation stats_dim:refPeriod year:2012 . 
    ?access_rank_observation simd:geographicAccessRank ?access_deprivation_rank . 
    } 

    GRAPH <http://data.opendatascotland.org/graph/simd/housing-rank> { 
    ?housing_rank_observation stats_dim:refArea ?dz . 
    ?housing_rank_observation stats_dim:refPeriod year:2012 . 
    ?housing_rank_observation simd:housingRank ?housing_deprivation_rank . 
    } 

    GRAPH <http://data.opendatascotland.org/graph/simd/crime-rank> { 
    ?crime_rank_observation stats_dim:refArea ?dz . 
    ?crime_rank_observation stats_dim:refPeriod year:2012 . 
    ?crime_rank_observation simd:crimeRank ?crime_deprivation_rank . 
    } 

    { 
    SELECT ?dz ?dz_label WHERE 
    { 
    ?dz a <http://data.opendatascotland.org/def/geography/DataZone> . 
    ?dz rdfs:label ?dz_label . 
    } 
    } 
    }" 

    # Make the data 
    dta.main <- SPARQL(endpoint, query.simd, format="csv") 

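What server is hosting the data? `format=csv` is common but not standard. It may be spelled `output` instead. Ideally, use the `Accept` header of the HTTP request to ask the server for CSV. – AndyS 2014-10-19 15:12:47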

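Thanks for your interest. The server is http://www.opendatascotland.org/. Supposedly I should be able to get a CSV table by adding the .csv extension to the endpoint address, but I keep getting a message that the returned XML content is unreadable. This is odd, because the query works when tested through the website. – Konrad 2014-10-19 20:45:26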

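It is running Apache Jena Fuseki 1.0.0 (they seem to treat Fuseki as free and open source software without contributing to it). They have layered some kind of processor on top of the raw SPARQL protocol. There is a contact email address; you will need to ask them. – AndyS 2014-10-20 09:35:08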
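
The `Accept` header suggestion from the first comment could be sketched in R roughly as follows. This is only a sketch under assumptions: it relies on the httr package, `fetch_sparql_csv` is a hypothetical helper name, and nothing in this thread confirms that this particular server honours `text/csv` content negotiation.

```r
# Hypothetical helper (not part of the SPARQL package); requires httr.
# Asks the endpoint for CSV through the HTTP Accept header instead of a
# non-standard "format" or "output" query parameter.
fetch_sparql_csv <- function(endpoint, query) {
  resp <- httr::GET(
    endpoint,
    query = list(query = query),           # sent as ?query=<url-encoded query>
    httr::add_headers(Accept = "text/csv")
  )
  httr::stop_for_status(resp)              # raise an R error on HTTP 4xx/5xx
  body <- httr::content(resp, as = "text", encoding = "UTF-8")
  read.csv(text = body, stringsAsFactors = FALSE)
}
```

It would be called with an endpoint URL that has no `?query` suffix, since httr appends the `query` parameter itself.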

Answer

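For anyone who may be interested in this problem: after some trial and error, and after checking with the site's developers, a working solution emerged. Run the query with the following command:

    dta.simd <- SPARQL(url = endpoint, query = query.simd, format = "csv")$results

with the endpoint and query defined as follows: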

    ## Define endpoint URL.
    endpoint <- "http://data.opendatascotland.org/sparql.csv" 

    ### Create Query and download table for the SIMD rank 
    query.simd <- "PREFIX stats: <http://statistics.data.gov.uk/id/statistical-geography/> 
     PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
     PREFIX simd: <http://data.opendatascotland.org/def/simd/> 
     PREFIX cube: <http://purl.org/linked-data/cube#> 
     PREFIX stats_dim: <http://data.opendatascotland.org/def/statistical-dimensions/> 
     PREFIX year: <http://reference.data.gov.uk/id/year/> 

     SELECT DISTINCT 
     ?dz_label 
     ?overall_rank 
     ?income_deprivation_rank 
     ?employment_deprivation_rank 
     ?health_deprivation_rank 
     ?education_deprivation_rank 
     ?access_deprivation_rank 
     ?housing_deprivation_rank 
     ?crime_deprivation_rank 

     WHERE { 

     GRAPH <http://data.opendatascotland.org/graph/simd/rank> { 
     ?overall_rank_observation stats_dim:refArea ?dz . 
     ?overall_rank_observation stats_dim:refPeriod year:2012 . 
     ?overall_rank_observation simd:rank ?overall_rank . 
     } 

     GRAPH <http://data.opendatascotland.org/graph/simd/income-rank> { 
     ?income_rank_observation stats_dim:refArea ?dz . 
     ?income_rank_observation stats_dim:refPeriod year:2012 . 
     ?income_rank_observation simd:incomeRank ?income_deprivation_rank . 
     } 

     GRAPH <http://data.opendatascotland.org/graph/simd/employment-rank> { 
     ?employment_rank_observation stats_dim:refArea ?dz . 
     ?employment_rank_observation stats_dim:refPeriod year:2012 . 
     ?employment_rank_observation simd:employmentRank ?employment_deprivation_rank . 
     } 

     GRAPH <http://data.opendatascotland.org/graph/simd/health-rank> { 
     ?health_rank_observation stats_dim:refArea ?dz . 
     ?health_rank_observation stats_dim:refPeriod year:2012 . 
     ?health_rank_observation simd:healthRank ?health_deprivation_rank . 
     } 

     GRAPH <http://data.opendatascotland.org/graph/simd/education-rank> { 
     ?education_rank_observation stats_dim:refArea ?dz . 
     ?education_rank_observation stats_dim:refPeriod year:2012 . 
     ?education_rank_observation simd:educationRank ?education_deprivation_rank . 
     } 

     GRAPH <http://data.opendatascotland.org/graph/simd/geographic-access-rank> { 
     ?access_rank_observation stats_dim:refArea ?dz . 
     ?access_rank_observation stats_dim:refPeriod year:2012 . 
     ?access_rank_observation simd:geographicAccessRank ?access_deprivation_rank . 
     } 

     GRAPH <http://data.opendatascotland.org/graph/simd/housing-rank> { 
     ?housing_rank_observation stats_dim:refArea ?dz . 
     ?housing_rank_observation stats_dim:refPeriod year:2012 . 
     ?housing_rank_observation simd:housingRank ?housing_deprivation_rank . 
     } 

     GRAPH <http://data.opendatascotland.org/graph/simd/crime-rank> { 
     ?crime_rank_observation stats_dim:refArea ?dz . 
     ?crime_rank_observation stats_dim:refPeriod year:2012 . 
     ?crime_rank_observation simd:crimeRank ?crime_deprivation_rank . 
     } 

    { 
     SELECT ?dz ?dz_label WHERE 
    { 
     ?dz a <http://data.opendatascotland.org/def/geography/DataZone> . 
     ?dz rdfs:label ?dz_label . 
    } 
    } 
     }" 

This gives the desired data frame for 6,505 geographies and all indicators, similar to the example below:

    datazone            overall_rank  income_deprivation_rank
    Data zone S000001   2             4
    Data zone S000002   5             3
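
The `$results` suffix in the command above matters because `SPARQL()` returns a list (with elements `results` and `namespaces`) rather than a bare data frame, and passing that list to ggplot2 is what triggers the "class list" error. A minimal sketch with a mocked return value follows; `mock_return` is a made-up stand-in that reuses the two example rows shown above, and no endpoint is contacted.

```r
# Mock of the list shape that SPARQL() returns; values are illustrative only.
mock_return <- list(
  results = data.frame(
    dz_label     = c("Data zone S000001", "Data zone S000002"),
    overall_rank = c(2L, 5L),
    stringsAsFactors = FALSE
  ),
  namespaces = NULL
)

# The whole return value is a list, which ggplot2 cannot plot.
whole_is_df <- is.data.frame(mock_return)

# Extracting the `results` element yields the actual data frame.
dta <- mock_return$results
results_is_df <- is.data.frame(dta)

print(c(whole = whole_is_df, results = results_is_df))
str(dta)
```

With the real package the same extraction is what the `$results` at the end of the `SPARQL()` call does.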