2017-08-15 70 views

How do I resample OHLCV data to 5 minutes? I have a set of data like this:

2016-08-09 12:39:00,536.7841,536.7849,536.6141,536.7849,0.656 
2016-08-09 12:40:00,536.6749,536.6749,536.6749,536.6749,0.2642 
2016-08-09 12:41:00,535.84,535.84,535.615,535.615,0.348 
2016-08-09 12:42:00,535.5401,535.5401,534.1801,534.1801,0.507 
2016-08-09 12:43:00,534.5891,534.8753,534.5891,534.807,0.656 
2016-08-09 12:44:00,534.8014,534.878,534.8014,534.8416,0.502 
2016-08-09 12:45:00,534.8131,534.8131,534.2303,534.6736,0.552 
2016-08-09 12:47:00,534.756,538.5999,534.756,534.7836,0.62647241 
2016-08-09 12:48:00,536.0557,536.6864,536.0557,536.6864,1.2614 
2016-08-09 12:49:00,536.8966,537.7289,536.8966,537.7289,0.532 
2016-08-09 12:50:00,537.9829,539.2199,537.9829,539.2199,0.67752932 
2016-08-09 12:51:00,538.5,539.2199,538.5,539.2199,0.43768953 

I want to resample it to 5-minute OHLCV bars, so I wrote this code:

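import pandas as pd

# Raw string so the backslashes in the Windows path are not read as escapes
df = pd.read_csv(r"C:\Users\Araujo's PC\Desktop\python_scripts\CSV\cex_btc.csv",
                 names=['timestamps', 'open', 'high', 'low', 'close', 'volume'])

df.set_index('timestamps', inplace=True)

# How each column is aggregated within a 5-minute bin
ohlc_dict = {
    'open': 'first',
    'high': 'max',
    'low': 'min',
    'close': 'last',
    'volume': 'sum'
}

# This line raises the TypeError: the index still holds strings, not datetimes
df = df.resample('5min').agg(ohlc_dict)

print(df)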

But I get this error:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

Can someone help me?

Answer


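You just need to convert the values in the timestamps column to pandas timestamps before setting the index with them. I believe they are currently just text (string) fields, which is why the index ends up as a plain Index instead of a DatetimeIndex.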

df['timestamps'] = pd.to_datetime(df['timestamps']) 
df.set_index('timestamps', inplace=True) 

>>> df.resample('5min').agg(ohlc_dict)

                         open      high       low     close    volume
timestamps
2016-08-09 12:35:00  536.7841  536.7849  536.6141  536.7849  0.656000
2016-08-09 12:40:00  536.6749  536.6749  534.1801  534.8416  2.277200
2016-08-09 12:45:00  534.8131  538.5999  534.2303  537.7289  2.971872
2016-08-09 12:50:00  537.9829  539.2199  537.9829  539.2199  1.115219
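
To try the fix without the CSV file, here is a minimal sketch that inlines a few rows from the question. The `raw` string, the `cols` list, the `bars` name, and the use of `io.StringIO` are assumptions for illustration and are not part of the original script.

```python
import io
import pandas as pd

# A few rows from the question, inlined so the example is self-contained
# (hypothetical stand-in for cex_btc.csv).
raw = """2016-08-09 12:39:00,536.7841,536.7849,536.6141,536.7849,0.656
2016-08-09 12:40:00,536.6749,536.6749,536.6749,536.6749,0.2642
2016-08-09 12:41:00,535.84,535.84,535.615,535.615,0.348
2016-08-09 12:42:00,535.5401,535.5401,534.1801,534.1801,0.507
2016-08-09 12:43:00,534.5891,534.8753,534.5891,534.807,0.656
2016-08-09 12:44:00,534.8014,534.878,534.8014,534.8416,0.502
"""

cols = ['timestamps', 'open', 'high', 'low', 'close', 'volume']
df = pd.read_csv(io.StringIO(raw), names=cols)

# Before conversion the column holds strings, not datetimes.
print(df['timestamps'].dtype)

# Convert first, then use the column as the index.
df['timestamps'] = pd.to_datetime(df['timestamps'])
df = df.set_index('timestamps')
print(type(df.index).__name__)  # DatetimeIndex

ohlc_dict = {
    'open': 'first',
    'high': 'max',
    'low': 'min',
    'close': 'last',
    'volume': 'sum',
}

# '5min' is the same 5-minute frequency as the older '5T' alias.
bars = df.resample('5min').agg(ohlc_dict)
print(bars)
```

Printing the dtype before and after the conversion is a quick way to confirm whether the index is the cause when resample raises this TypeError.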

You can also try parsing them while reading the CSV:

pd.read_csv(filename, parse_dates=['timestamps'],
            names=['timestamps', 'open', 'high', 'low', 'close', 'volume'])
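
As a sketch of that second approach, the index can also be set during the read by adding `index_col`, so no separate `set_index` call is needed. The `csv_source` buffer below is a hypothetical stand-in for the real file path, and `five_min` is just an illustrative name.

```python
import io
import pandas as pd

# Stand-in for the CSV file (assumption); pass the real path in practice.
csv_source = io.StringIO(
    "2016-08-09 12:50:00,537.9829,539.2199,537.9829,539.2199,0.67752932\n"
    "2016-08-09 12:51:00,538.5,539.2199,538.5,539.2199,0.43768953\n"
)

# parse_dates converts the column to datetimes and index_col makes it the
# index, so the frame comes back with a DatetimeIndex.
df = pd.read_csv(
    csv_source,
    names=['timestamps', 'open', 'high', 'low', 'close', 'volume'],
    parse_dates=['timestamps'],
    index_col='timestamps',
)

five_min = df.resample('5min').agg({
    'open': 'first',
    'high': 'max',
    'low': 'min',
    'close': 'last',
    'volume': 'sum',
})
print(five_min)
```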