2016-12-28 106 views
0

Trying to quickly read in MTA turnstile data from the following page: EOF error?

http://web.mta.info/developers/turnstile.html

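I was planning to loop over the files on that page with fread or download.file, store the data, and bind it together, but for some of the files I get an error. Here are two examples, one that works and one that doesn't. I noticed that the second file looks a bit different: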

test_mta_works = fread("http://web.mta.info/developers/data/nyct/turnstile/turnstile_161224.txt", sep = ',') 

test_mta_wont_work = fread("http://web.mta.info/developers/data/nyct/turnstile/turnstile_140419.txt", sep = ',') 

The error I get for the second one:

Error in fread("http://web.mta.info/developers/data/nyct/turnstile/turnstile_140419.txt", : 
    Expected sep (',') but new line, EOF (or other non printing character) ends field 12 when detecting types from point 0: A002,R051,02-00-00,04-18-14,16:00:00,REGULAR,004575433,001558298,04-18-14,20:00:00,REGULAR,004575838,001558374 

Any ideas on what the problem might be and/or how to fix it? I tried using fill = T, but it caused problems with the data.

Thanks!

Edit

With fill = T, I get the following output:

V1 V2  V3  V4  V5  V6  V7  V8  V9  V10  V11  V12  V13  V14  V15  V16  V17  V18  V19  V20 
1: A002 R051 02-00-00 04-12-14 00:00:00 REGULAR 4566812 1555499 04-12-14 04:00:00 REGULAR 4566850 1555508 04-12-14 08:00:00 REGULAR 4566875 1555536 04-12-14 12:00:00 
2: A002 R051 02-00-00 04-13-14 08:00:00 REGULAR 4567968 1555789 04-13-14 12:00:00 REGULAR 4568069 1555842 04-13-14 16:00:00 REGULAR 4568278 1555903 04-13-14 20:00:00 
3: A002 R051 02-00-00 04-14-14 16:00:00 REGULAR 4569148 1556362 04-14-14 20:00:00 REGULAR 4569786 1556420 04-15-14 00:00:00 REGULAR 4569949 1556447 04-15-14 04:00:00 
4: A002 R051 02-00-00 04-16-14 00:00:00 REGULAR 4571423 1556965 04-16-14 04:00:00 REGULAR 4571442 1556966 04-16-14 08:00:00 REGULAR 4571486 1557049 04-16-14 12:00:00 
5: A002 R051 02-00-00 04-17-14 08:00:00 REGULAR 4573294 1557587 04-17-14 12:00:00 REGULAR 4573469 1557848 04-17-14 16:00:00 REGULAR 4573800 1557901 04-17-14 20:00:00 
6: A002 R051 02-00-00 04-18-14 16:00:00 REGULAR 4575433 1558298 04-18-14 20:00:00 REGULAR 4575838 1558374        NA  NA  

Meanwhile, the first file, which does not need fill = T, gives me the following:

 C/A UNIT  SCP  STATION LINENAME DIVISION  DATE  TIME DESC ENTRIES EXITS 
1: A002 R051 02-00-00   59 ST NQR456W  BMT 12/17/2016 03:00:00 REGULAR 5967477 2022101 
2: A002 R051 02-00-00   59 ST NQR456W  BMT 12/17/2016 07:00:00 REGULAR 5967485 2022116 
3: A002 R051 02-00-00   59 ST NQR456W  BMT 12/17/2016 11:00:00 REGULAR 5967553 2022233 
4: A002 R051 02-00-00   59 ST NQR456W  BMT 12/17/2016 15:00:00 REGULAR 5967790 2022331 
5: A002 R051 02-00-00   59 ST NQR456W  BMT 12/17/2016 19:00:00 REGULAR 5968186 2022421   

What problem do you have with the data when using 'fill = T'? I was able to read the data with the 'fill' argument, and it looks fine to me. – krish

Added the output above as an edit. – LoF10

test_mta_wont_work = fread("http://web.mta.info/developers/data/nyct/turnstile/turnstile_140419.txt", sep = ',', fill = TRUE, na.strings = c("", "NA")) – krish

Answers

2

Use the na.strings argument of fread, so that the blank fields added by fill = TRUE are read as NA:

test_mta_wont_work = fread("http://web.mta.info/developers/data/nyct/turnstile/turnstile_140419.txt", sep = ',', fill = TRUE, na.strings = c("", "NA"))

head(test_mta_wont_work) 

    V1 V2  V3  V4  V5  V6  V7  V8  V9  V10  V11 
1: A002 R051 02-00-00 04-12-14 00:00:00 REGULAR 4566812 1555499 04-12-14 04:00:00 REGULAR 
2: A002 R051 02-00-00 04-13-14 08:00:00 REGULAR 4567968 1555789 04-13-14 12:00:00 REGULAR 
3: A002 R051 02-00-00 04-14-14 16:00:00 REGULAR 4569148 1556362 04-14-14 20:00:00 REGULAR 
4: A002 R051 02-00-00 04-16-14 00:00:00 REGULAR 4571423 1556965 04-16-14 04:00:00 REGULAR 
5: A002 R051 02-00-00 04-17-14 08:00:00 REGULAR 4573294 1557587 04-17-14 12:00:00 REGULAR 
6: A002 R051 02-00-00 04-18-14 16:00:00 REGULAR 4575433 1558298 04-18-14 20:00:00 REGULAR 
     V12  V13  V14  V15  V16  V17  V18  V19  V20  V21  V22 
1: 4566850 1555508 04-12-14 08:00:00 REGULAR 4566875 1555536 04-12-14 12:00:00 REGULAR 4567031 
2: 4568069 1555842 04-13-14 16:00:00 REGULAR 4568278 1555903 04-13-14 20:00:00 REGULAR 4568507 
3: 4569786 1556420 04-15-14 00:00:00 REGULAR 4569949 1556447 04-15-14 04:00:00 REGULAR 4569966 
4: 4571442 1556966 04-16-14 08:00:00 REGULAR 4571486 1557049 04-16-14 12:00:00 REGULAR 4571666 
5: 4573469 1557848 04-17-14 16:00:00 REGULAR 4573800 1557901 04-17-14 20:00:00 REGULAR 4574676 
6: 4575838 1558374  NA  NA  NA  NA  NA  NA  NA  NA  NA 
     V23  V24  V25  V26  V27  V28  V29  V30  V31  V32  V33 
1: 1555629 04-12-14 16:00:00 REGULAR 4567347 1555694 04-12-14 20:00:00 REGULAR 4567736 1555738 
2: 1555953 04-14-14 00:00:00 REGULAR 4568639 1555975 04-14-14 04:00:00 REGULAR 4568657 1555979 
3: 1556449 04-15-14 08:00:00 REGULAR 4569998 1556529 04-15-14 12:00:00 REGULAR 4570176 1556774 
4: 1557328 04-16-14 16:00:00 REGULAR 4572020 1557392 04-16-14 20:00:00 REGULAR 4572975 1557459 
5: 1557989 04-18-14 00:00:00 REGULAR 4574912 1558020 04-18-14 04:00:00 REGULAR 4574943 1558020 
6:  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 
     V34  V35  V36  V37  V38  V39  V40  V41  V42  V43 
1: 04-13-14 00:00:00 REGULAR 4567914 1555770 04-13-14 04:00:00 REGULAR 4567952 1555773 
2: 04-14-14 08:00:00 REGULAR 4568697 1556064 04-14-14 12:00:00 REGULAR 4568858 1556308 
3: 04-15-14 16:00:00 REGULAR 4570437 1556855 04-15-14 20:00:00 REGULAR 4571260 1556938 
4: 04-17-14 00:00:00 REGULAR 4573228 1557492 04-17-14 04:00:00 REGULAR 4573250 1557497 
5: 04-18-14 08:00:00 REGULAR 4574977 1558080 04-18-14 12:00:00 REGULAR 4575130 1558233 
6:  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
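
A possible follow-up, offered as a sketch rather than a definitive fix: the 43 columns above look like three identifier columns followed by repeating groups of five fields (date, time, description, entries, exits), with several readings per line. Assuming that layout holds, the wide table can be reshaped to one reading per row with melt() and patterns() from data.table. The column names (CA, UNIT, SCP, DATE_1, ...) are my own labels for illustration, since the file has no header, and the two sample lines are shortened versions of rows shown above so that the example runs without a download.

```r
library(data.table)

# Two shortened sample lines with values copied from the output above.
# The first has 3 readings, the second only 2, so the rows are ragged.
raw_lines <- c(
  "A002,R051,02-00-00,04-12-14,00:00:00,REGULAR,4566812,1555499,04-12-14,04:00:00,REGULAR,4566850,1555508,04-12-14,08:00:00,REGULAR,4566875,1555536",
  "A002,R051,02-00-00,04-18-14,16:00:00,REGULAR,004575433,001558298,04-18-14,20:00:00,REGULAR,004575838,001558374"
)

# Read everything as character so all columns within a group share one type.
wide <- fread(text = raw_lines, sep = ",", header = FALSE, fill = TRUE,
              colClasses = "character", na.strings = "")

id_cols <- c("CA", "UNIT", "SCP")                         # hypothetical names
fields  <- c("DATE", "TIME", "DESC", "ENTRIES", "EXITS")  # hypothetical names

# Assumption: after the id columns, the remaining columns come in groups of 5.
n_groups <- (ncol(wide) - length(id_cols)) / length(fields)
stopifnot(n_groups == round(n_groups))

# Name the columns DATE_1, TIME_1, ..., EXITS_n so patterns() can group them.
setnames(wide, c(id_cols,
                 paste(fields, rep(seq_len(n_groups), each = length(fields)), sep = "_")))

# One output row per (input line, reading group).
long <- melt(wide, id.vars = id_cols,
             measure.vars = patterns("^DATE_", "^TIME_", "^DESC_", "^ENTRIES_", "^EXITS_"),
             value.name = fields)

# Drop the padding created by fill = TRUE, then convert the counters to numbers.
long <- long[!is.na(DATE) & DATE != ""]
long[, variable := NULL]
long[, c("ENTRIES", "EXITS") := lapply(.SD, as.numeric), .SDcols = c("ENTRIES", "EXITS")]

print(long)
```

For the real file, wide would be the table returned by the fread(..., fill = TRUE) call above, preferably also read with colClasses = "character". I have not checked that every line of every older file follows this layout, so the stopifnot() on the group count is there to fail loudly if it does not.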