2016-03-11 77 views

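I'm trying to figure out how to split a CSV file into smaller chunks. I want to be able to split on any number of rows: maybe 20, 1000, or something else. How can I split a CSV file into chunks?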

setwd("C:/Users/my_path/test_folder/") 
mydata = read.csv("NHLData.csv") 


split(mydata, ceiling(seq_along(mydata)/20)) 

Error: Warning message: In split.default(x = seq_len(nrow(x)), f = f, drop = drop, ...) : data length is not a multiple of split variable

I also tried this.

split(mydata, ceiling(seq_along(mydata)/(length(mydata)/20))) 

Same error: Warning message: In split.default(x = seq_len(nrow(x)), f = f, drop = drop, ...) : data length is not a multiple of split variable

I Googled for ideas but didn't really find anything else useful. This must be very simple, right?

http://stackoverflow.com/questions/14164525/splitting-a-large-data-frame-into-smaller-segments has several solutions.

A combination of `skip` and `nrows` in `read.csv` will let you read whichever rows of the csv file you want... – cory
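
The `skip`/`nrows` suggestion above can be sketched roughly as follows. This is only an illustration under assumptions: the demo file, `csv_path`, and `chunk_size` are made up here so the example runs on its own, and the real file would be `NHLData.csv`.

```r
# Hypothetical demo file so the sketch is self-contained; use your own CSV path instead.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:45, value = (1:45) * 2), csv_path, row.names = FALSE)

chunk_size <- 20  # rows per chunk (assumed value)

# Read the header once so every chunk gets the same column names.
header <- names(read.csv(csv_path, nrows = 1))

# Number of data rows: total lines minus the header line.
n_rows <- length(readLines(csv_path)) - 1

# First data row of each chunk: 1, 21, 41, ...
starts <- seq(1, n_rows, by = chunk_size)

chunks <- lapply(starts, function(start) {
  read.csv(csv_path,
           skip = start,        # skips the header line plus the (start - 1) rows before this chunk
           nrows = chunk_size,  # the last chunk simply returns fewer rows
           header = FALSE,
           col.names = header)
})
```

Reading this way keeps only one chunk's worth of rows per call, which may matter when the file is too large to load at once. For a file that fits in memory, reading it once and splitting the data frame is likely simpler.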

Ryguy72(72), please don't use multiple accounts. [See here](http://meta.stackexchange.com/help/merging-accounts) for how to merge them.

Answers


Make use of the `sample` function; this should help.

setwd("C:/Users/my_path/test_folder/")
mydata = read.csv("NHLData.csv")

# If you want 5 different chunks with the same number of lines, let's say 30.
# This assumes mydata has exactly 150 rows; sample() assigns rows to chunks at random.
Chunks = split(mydata, sample(rep(1:5, 30))) ## 5 chunks of 30 lines each

# If you want the first 20 rows, index them directly.
# (Calling sample() on a data frame would shuffle its columns, not pick rows.)
First_chunk <- mydata[1:20, ] ## this contains the first 20 rows

# Or you can take any range of rows, in any order
Second_chunk <- mydata[100:71, ] ## the last 30 rows in reverse order, if your data had 100 rows

# If you want to write these chunks out to csv files
# (write.csv always writes the header and ignores col.names with a warning, so it is omitted):
write.csv(First_chunk, file = "First_chunk.csv", quote = FALSE, row.names = FALSE)
write.csv(Second_chunk, file = "Second_chunk.csv", quote = FALSE, row.names = FALSE)

Hope this helps.
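
For sequential chunks of a fixed size, the warning in the question most likely comes from `seq_along(mydata)`, which counts the columns of a data frame rather than its rows, so the grouping vector gets recycled across the rows. A minimal sketch using row indices instead is below. The stand-in data frame, `chunk_size`, and the output file names are assumptions for illustration only.

```r
# Stand-in for mydata; in the question it comes from read.csv("NHLData.csv").
mydata <- data.frame(id = 1:45, value = (1:45) * 2)

chunk_size <- 20  # rows per chunk (assumed value)

# seq_along(mydata) would give 1:2 here (one index per column);
# seq_len(nrow(mydata)) gives one index per row, which is what split() needs.
group <- ceiling(seq_len(nrow(mydata)) / chunk_size)
chunks <- split(mydata, group)

# Write each chunk to its own file (hypothetical naming, in a temporary directory).
out_files <- file.path(tempdir(), sprintf("chunk_%03d.csv", seq_along(chunks)))
for (i in seq_along(chunks)) {
  write.csv(chunks[[i]], out_files[i], row.names = FALSE)
}
```

The last chunk holds whatever rows are left over, so the row count does not need to be a multiple of `chunk_size`.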