Using download.file() to download an HTTPS file

I want to use download.file() to read online data into R, as shown below.

URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv" 
download.file(URL, destfile = "./data/data.csv", method="curl")

Someone suggested adding the line setInternet2(TRUE), which I did, but it still does not work.

The error I get is:

Warning messages: 
1: running command 'curl "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv" -o "./data/data.csv"' had status 127 
2: In download.file(URL, destfile = "./data/data.csv", method = "curl", : 
    download had nonzero exit status

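Thanks for your help.

Source

2014-04-12 useR

What problem are you seeing? Does it fail with some error, or does it simply never return to the console? Does it show a progress bar that never updates? Additional information would help diagnose the problem. –

Do you have curl installed? – A5C1D2H2I1M1N2O1R2T1

Which operating system are you using? – sgibb

It is probably easiest to try the RCurl package. Install the package and try the following: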

# install.packages("RCurl") 
library(RCurl) 
URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv" 
x <- getURL(URL) 
## Or 
## x <- getURL(URL, ssl.verifypeer = FALSE) 
out <- read.csv(textConnection(x)) 
head(out[1:6]) 
# RT SERIALNO DIVISION PUMA REGION ST 
# 1 H  186  8 700  4 16 
# 2 H  306  8 700  4 16 
# 3 H  395  8 100  4 16 
# 4 H  506  8 700  4 16 
# 5 H  835  8 800  4 16 
# 6 H  989  8 700  4 16 
dim(out) 
# [1] 6496 188 

download.file("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv",destfile="reviews.csv",method="libcurl")

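Source

2014-04-12 10:42:48 A5C1D2H2I1M1N2O1R2T1

Hi Ananda. When I use getURL(URL), I get the following error message: > x <- getURL(URL) Error in function (type, msg, asError = TRUE) : SSL certificate problem, verify that the CA cert is OK. Details: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed – useR

@Yin, you can try adding 'ssl.verifypeer = FALSE' to the 'getURL' statement. – A5C1D2H2I1M1N2O1R2T1

It works =] Thanks a lot. – useR

I succeeded with the following code: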

url = "http://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv" 
x = read.csv(file=url)

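Note that I changed the protocol from HTTPS to HTTP, because the former does not seem to be supported in R.

Source

2014-04-12 14:27:49 Baumann

The problem with this "solution" is that not every https URL can be replaced with http. The "RCurl" package generally handles many of these cases well. – A5C1D2H2I1M1N2O1R2T1

This is not a solution to the problem. A workaround should only be considered when you cannot solve the problem itself. – Muktadir

This solves the problem and does not require installing external dependencies or messing with SSL certificates. It may not work in every case, but it works here. – bonh

Exit status 127 means "command not found".

In your case, it is the curl command that was not found, which means R could not find curl on your system.

You need to install or reinstall curl. That's all. Get the latest version from http://curl.haxx.se/download.html

Close RStudio before installing.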
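One way to confirm this diagnosis from inside R, as a minimal sketch: base R's Sys.which() returns the full path of an executable, or an empty string when it is not on the PATH. The variable name curl_path is only illustrative.

```r
# Look up the curl executable on the PATH.
# Sys.which() returns "" when the command cannot be found.
curl_path <- Sys.which("curl")

if (nzchar(curl_path)) {
  message("curl found at: ", curl_path)
} else {
  message("curl not found; download.file(..., method = \"curl\") is expected to fail")
}
```

If the second message appears, installing curl (or picking a different download method) is the likely fix.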

Source

2014-06-18 23:10:09 Muktadir

If you use RCurl and the getURL() function fails with an SSL error, set these options before calling getURL(). This sets the CurlSSL settings globally.

Extended code:

install.packages("RCurl") 
library(RCurl) 
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))) 
URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv" 
x <- getURL(URL)

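This worked for me with R 3.1.0 on Windows 7 64-bit!

Source

2014-06-21 08:49:54 user3762466

You can reformat it by pressing Control + K rather than using backticks. –

This is a great answer! Is there a way to make these options the default so that they persist across R sessions? –

I had exactly the same problem as useR (the original question), and I also use Windows 7. I tried all the suggested solutions and none of them worked.

I solved the problem as follows:

Using RStudio instead of the R console.
Updating the R version (from 3.1.0 to 3.1.1) so that the RCurl library runs OK on it. (I am now using R 3.1.1 32-bit, although my system is 64-bit.)
Typing the URL address as https (secure connection), with "/" rather than backslashes "\".
Setting method = "auto".

It works for me now. You should see the following message:

Content type 'text/csv; charset=utf-8' length 9294 bytes
opened URL
downloaded 9294 bytes

Source

2014-08-10 16:24:54 JeromeROD

Here is an update as of November 2014. I found that setting method='curl' did the trick for me (while method='auto' did not).

For example: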

# does not work 
download.file(url='https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip', 
       destfile='localfile.zip') 

# does not work. this appears to be the default anyway 
download.file(url='https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip', 
       destfile='localfile.zip', method='auto') 

# works! 
download.file(url='https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip', 
       destfile='localfile.zip', method='curl')

Source

2014-11-12 03:17:11 arvi1000

For me, curl was not found. –

Maybe your system does not have curl. At least on Mac OS, you can run system('curl -V') in R (it must be a capital 'V') to check your curl version – arvi1000

https://lehd.ces.census.gov/data/lodes/LODES7/ut/wac/ut_wac_S000_JT00_2013.csv.gz # does not work for me install.packages("RCurl") library(RCurl) download.file(url='https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip', destfile='localfile.zip', method='curl') #------- resulting error -------- Warning messages: 1: running command 'curl "https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip" -o "localfile.zip"' had status 127 2: In download.file(url = "https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip", : download had nonzero exit status – Mox

You can set the global option and try again:

options('download.file.method'='curl') 
download.file(URL, destfile = "./data/data.csv", method="auto")

For the related issue, see https://stat.ethz.ch/pipermail/bioconductor/2011-February/037723.html
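Since method = "curl" only works when the curl command is installed, a slightly more defensive variant is to choose the global option based on what the current R installation offers. This is a rough sketch, not a definitive recipe: pick_download_method is a hypothetical helper name, and it assumes a version of R recent enough that capabilities("libcurl") reports built-in libcurl support (on older versions the check simply comes back empty and the code falls through).

```r
# Hypothetical helper: choose a download.file() method that should handle HTTPS.
pick_download_method <- function() {
  # Built-in libcurl support, if this R build has it.
  if (isTRUE(unname(capabilities("libcurl")))) {
    return("libcurl")
  }
  # Otherwise fall back to the external curl command, if it is on the PATH.
  if (nzchar(Sys.which("curl"))) {
    return("curl")
  }
  # Last resort: let R decide.
  "auto"
}

# Set the global default used by download.file() when no method is given.
options(download.file.method = pick_download_method())
```

After this, a plain download.file(URL, destfile = "./data/data.csv") call picks up the chosen method.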

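Source

2015-06-07 13:50:59

I offer the curl package as an alternative that I have found reliable when extracting large files from an online database. In a recent project I had to download 120 files from an online database, and found that it took only half the transfer time and was much more reliable than download.file.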

#install.packages("curl") 
library(curl) 
#install.packages("RCurl") 
library(RCurl) 

ptm <- proc.time() 
URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv" 
x <- getURL(URL) 
proc.time() - ptm 
ptm 

ptm1 <- proc.time() 
curl_download(url =URL ,destfile="TEST.CSV",quiet=FALSE, mode="wb") 
proc.time() - ptm1 
ptm1 

ptm2 <- proc.time() 
y = download.file(URL, destfile = "./data/data.csv", method="curl") 
proc.time() - ptm2 
ptm2

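In this case, rough timing on your URL shows no consistent difference in transfer time. In my application, using curl_download in a script to select and download 120 files from a website reduced the transfer time from 2000 seconds per file to 1000 seconds, and improved reliability from a 50% failure rate to 2 failures in 120 files. The script is posted in my answer to a question I asked earlier; please see it.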
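For illustration only (this is not the script mentioned above), a batch download of that kind could be sketched as follows. The function name download_all and its fetch argument are hypothetical; by default it calls curl::curl_download, and each URL is wrapped in tryCatch so that one failed transfer does not stop the rest.

```r
# Hypothetical helper: download every URL into dest_dir and report which succeeded.
# `fetch` is any function taking (url, destfile); curl::curl_download by default.
download_all <- function(urls, dest_dir, fetch = curl::curl_download) {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  vapply(urls, function(u) {
    dest <- file.path(dest_dir, basename(u))
    tryCatch({
      fetch(u, dest)
      TRUE
    }, error = function(e) {
      # Record the failure and carry on with the remaining files.
      message("failed: ", u, " (", conditionMessage(e), ")")
      FALSE
    })
  }, logical(1))
}
```

The result is a logical vector named by URL, so the failed entries can be picked out with names(result)[!result] and retried.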

Source

2016-09-21 23:44:33 user3838963

Try the following for heavy files:

library(data.table) 
URL <- "http://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv" 
x <- fread(URL)

Source

2017-04-19 19:28:19 zyduss
