2011-04-29 38 views
4

read.csv fails to read a CSV file from Google Docs

I want to use read.csv to read a Google Docs spreadsheet.

I tried the following code:

data_url <- "http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv" 
read.csv(data_url) 

This results in the following error:

Error in file(file, "rt") : cannot open the connection 

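I am on Windows 7, and I tried the code on R 2.12 and 2.13.

I remember trying this a few months ago, and it worked fine. Any suggestions as to what might be causing this, or how to solve it?

Thanks.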

+3

I don't know whether this will make you happy or sad: your code works perfectly for me on Windows 7 with R 2.13 installed. – Farrel 2011-04-30 05:57:16

Answers

3

I ran into the same problem and eventually found a solution in a forum thread. Using my own public CSV file:

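library(RCurl)
# Request the published sheet as CSV; followlocation = TRUE follows the redirect.
# Note: ssl.verifypeer = FALSE turns off SSL certificate verification.
tt <- getForm("https://spreadsheets.google.com/spreadsheet/pub",
      hl = "en_US", key = "0Aonsf4v9iDjGdHRaWWRFbXdQN1ZvbGx0LWVCeVd0T1E",
      output = "csv",
      .opts = list(followlocation = TRUE, verbose = TRUE, ssl.verifypeer = FALSE))

# Parse the returned text as CSV.
holidays <- read.csv(textConnection(tt))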
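
For readers unfamiliar with the last step: for a text response, getForm typically returns the body as a character string, and textConnection lets read.csv treat that string as if it were a file. A minimal offline sketch, where csv_text is a made-up stand-in for the downloaded text (not the actual contents of the spreadsheet above):

```r
# csv_text is a hypothetical stand-in for the string returned by getForm();
# it is not the real content of the spreadsheet used in the answer.
csv_text <- "column1,column2\na,1\nb,2\nds,3"

# textConnection() exposes the string as a read-only connection,
# so read.csv() can parse it line by line like a file.
con <- textConnection(csv_text)
parsed <- read.csv(con, stringsAsFactors = FALSE)

# textConnection() returns an already-open connection, so close it explicitly.
close(con)

print(parsed)
```

Assigning the connection to a variable and closing it afterwards avoids leaving an open connection behind, which the one-line form read.csv(textConnection(tt)) leaves for R to clean up later.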
+0

This works for me. Nice solution. – 2011-08-31 13:21:40

9

It might have something to do with the fact that Google is reporting a 302 Moved Temporarily response.

> download.file(data_url, "~/foo.csv", method = "wget") 
--2011-04-29 18:01:01-- http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv 
Resolving spreadsheets0.google.com... 74.125.230.132, 74.125.230.128, 74.125.230.130, ... 
Connecting to spreadsheets0.google.com|74.125.230.132|:80... connected. 
HTTP request sent, awaiting response... 302 Moved Temporarily 
Location: https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv [following] 
--2011-04-29 18:01:01-- https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv 
Connecting to spreadsheets0.google.com|74.125.230.132|:443... connected. 
HTTP request sent, awaiting response... 200 OK 
Length: unspecified [text/plain] 
Saving to: `/home/gavin/foo.csv' 

    [ <=>                                     ] 41   --.-K/s in 0s  

2011-04-29 18:01:02 (1.29 MB/s) - `/home/gavin/foo.csv' saved [41] 

> read.csv("~/foo.csv") 
  column1 column2
1       a       1
2       b       2
3      ds       3
4       d       4
5       f       5
6      ga       5

I'm not sure that R's internal download code is able to cope with such redirects:

> download.file(data_url, "~/foo.csv") 
trying URL 'http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv' 
Error in download.file(data_url, "~/foo.csv") : 
    cannot open URL 'http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv' 
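
Putting the two observations together, a small wrapper can download with wget (which follows the redirect, as shown above) to a temporary file and then read it. This is only a sketch: read_csv_via is a hypothetical helper name, and method = "wget" assumes the external wget program is installed and on the PATH.

```r
# Hypothetical helper: fetch a CSV through download.file() and parse it.
# method = "wget" requires the external wget program to be installed.
read_csv_via <- function(url, method = "wget", ...) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))  # remove the temporary file when the function exits
  status <- download.file(url, destfile = tmp, method = method, quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  read.csv(tmp, ...)
}

# Usage with the URL from the question (needs network access and wget):
# dat <- read_csv_via(data_url)
```

Writing to tempfile() rather than "~/foo.csv" keeps the home directory clean; extra arguments such as stringsAsFactors are passed through to read.csv.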
+0

I agree with Gavin - I think it is the redirect. method = 'wget' also fixed it for me. Thanks! – 2011-06-19 04:07:19

1

Check the solution at http://blog.forret.com/2011/07/google-docs-infamous-moved-temporarily-error-fixed/

So what is the solution: just add "&ndplr=1" to your URL and you will skip the authentication redirect. I’m not sure what the NDPLR parameter name stands for, let’s just call it: "Never Do Published Link Redirection".
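
As a sketch of that suggestion in R, the parameter can be appended to the published URL before downloading. add_ndplr is a hypothetical helper name, and whether ndplr=1 has any effect depends on Google's servers; the blog post is the only source for it here.

```r
# Hypothetical helper: append ndplr=1 to a URL's query string.
add_ndplr <- function(url) {
  # Leave the URL alone if it already carries the parameter.
  if (grepl("[?&]ndplr=", url)) return(url)
  # Use "?" if there is no query string yet, otherwise "&".
  sep <- if (grepl("?", url, fixed = TRUE)) "&" else "?"
  paste0(url, sep, "ndplr=1")
}

# Usage with a published-spreadsheet URL stored in data_url (needs network access):
# read.csv(add_ndplr(data_url))
```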

+0

Hello. Thanks for the idea, but it doesn't seem to solve my case. – 2011-07-16 07:40:07

+0

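I see. In your case, as Gavin mentioned above, it redirects to the same URL but with https:// in front. From experience, Google now does all export operations over https, so it is always safe to replace http with https. And if Google asks for authentication, you can also add &ndplr=1 :-) – pforret 2011-07-16 11:43:28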
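
The http-to-https swap described in this comment is a one-line string substitution in R. A minimal sketch, where force_https is a hypothetical helper name:

```r
# Hypothetical helper: rewrite a leading "http://" to "https://".
force_https <- function(url) {
  # Only the scheme at the start of the string is replaced;
  # URLs that already start with "https://" are returned unchanged.
  sub("^http://", "https://", url)
}

# Usage with the URL from the question (needs network access):
# read.csv(force_https(data_url))
```

Whether read.csv can then open the https URL directly depends on the R version and platform, so this may still need to be combined with method = "wget" or RCurl as in the answers above.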

+0

Noted. Sorry, I couldn't make it work :) – 2011-07-17 17:34:23