2014-12-25

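Using asfreq shows only one ticker at a time

I have a problem with the pandas DataFrame.asfreq method. I have stock data for several tickers, one data file per ticker. A data file looks like this: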

uri:/instrument/1.0/AAPL/chartdata;type=quote;range=1d/csv 
ticker:aapl 
Company-Name:Apple Inc. 
Exchange-Name:NMS 
unit:MIN 
timezone:EST 
currency:USD 
gmtoffset:-18000 
previous_close:114.6300 
Timestamp:1417617000,1417640400 
labels:1417618800,1417622400,1417626000,1417629600,1417633200,1417636800,1417640400 
values:Timestamp,close,high,low,open,volume 
close:115.1500,116.2500 
high:115.2200,116.3500 
low:115.1100,116.2000 
open:115.1425,116.2450 
volume:13400,3646700 
1417617011,115.7498,115.8100,115.5707,115.7150,1622500 
1417617060,115.6300,115.7500,115.5000,115.7300,284000 
1417617179,115.3990,115.6600,115.3600,115.6500,349600 
1417617180,115.6050,115.6400,115.3700,115.3990,300400 
1417617299,115.7099,115.7700,115.6000,115.6401,279200 
… 
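For readers who want to load such a file, here is a minimal sketch. It assumes the layout shown above: "key:value" header lines followed by comma-separated rows that start with a Unix timestamp. The function name parse_chartdata and the shortened sample text are hypothetical, and converting to a local time zone is left out.

```python
import io
import pandas as pd

# A shortened copy of the file shown above (header lines, then quote rows).
raw = """ticker:aapl
unit:MIN
gmtoffset:-18000
values:Timestamp,close,high,low,open,volume
close:115.1500,116.2500
volume:13400,3646700
1417617011,115.7498,115.8100,115.5707,115.7150,1622500
1417617060,115.6300,115.7500,115.5000,115.7300,284000
1417617179,115.3990,115.6600,115.3600,115.6500,349600
"""

def parse_chartdata(text):
    """Split 'key:value' header lines from the comma-separated quote rows."""
    header, rows = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line[0].isdigit():
            # Quote rows start with a Unix timestamp.
            rows.append(line)
        else:
            key, _, value = line.partition(":")
            header[key] = value
    columns = header["values"].split(",")
    quotes = pd.read_csv(io.StringIO("\n".join(rows)), names=columns)
    # Timestamps are Unix seconds; this yields naive UTC datetimes.
    quotes.index = pd.to_datetime(quotes.pop("Timestamp"), unit="s")
    return header, quotes

header, quotes = parse_chartdata(raw)
print(quotes)
```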

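I have a function that takes all the tickers (e.g. [AAPL, NGF15]), pulls one type of data (e.g. 'close') for a given time range (e.g. ['2014-12-03 15:29:00', '2014-12-03 16:31:00']), and stores it in a nested dictionary named data. After I call the function, the nested dictionary looks like this: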

{'AAPL': {'2014-12-03 16:03:00': '115.4200', '2014-12-03 15:31:00': '115.6300', '2014-12-03 15:51:00': '116.1100', '2014-12-03 16:08:00': '115.4100', ...}, 'NGF15': {'2014-12-03 16:02:52': '3.8170', '2014-12-03 16:14:58': '3.8000', '2014-12-03 15:53:58': '3.8010', '2014-12-03 15:33:59': '3.7930', '2014-12-03 15:59:58': '3.8110', '2014-12-03 16:15:00': '3.8040', ...}}

Then the code is this:

a=DataFrame(data=data) 
a.index.name = 'vrime' 
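Note that the dictionary values and keys are strings, so the resulting columns hold text rather than numbers. A minimal sketch of the same construction with explicit conversions, using a few entries copied from the dictionary above (the sort step is an addition, included because time-based operations expect an ordered index):

```python
import pandas as pd

# A few entries copied from the nested dictionary above; values are strings.
data = {
    "AAPL": {"2014-12-03 15:31:00": "115.6300", "2014-12-03 15:51:00": "116.1100"},
    "NGF15": {"2014-12-03 15:33:59": "3.7930", "2014-12-03 15:53:58": "3.8010"},
}

a = pd.DataFrame(data).astype(float)   # price strings -> floats
a.index = pd.to_datetime(a.index)      # timestamp strings -> DatetimeIndex
a = a.sort_index()                     # make sure rows are in time order
a.index.name = "vrime"
print(a)
```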

The DataFrame looks like this:

      AAPL NGF15 
vrime         
2014-12-03 15:29:59  NaN 3.7870 
2014-12-03 15:30:11 115.7498  NaN 
2014-12-03 15:30:54  NaN 3.7880 
2014-12-03 15:31:00 115.6300  NaN 
2014-12-03 15:31:57  NaN 3.7880 
2014-12-03 15:32:58  NaN 3.7920 
… 
2014-12-03 16:21:59 115.5900 3.8090 
… 

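So I want to change the frequency of the data to every 15 seconds, so that the price at a given time (e.g. 15:30:15) is the last known price for each ticker.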

a.index = pd.to_datetime(a.index) 
print a.asfreq('15s', method='pad', how={'2014-12-03 15:30:00', '2014-12-03 16:30:00'})

So my result looks like this:

      AAPL NGF15 
2014-12-03 15:29:59  NaN 3.7870 
2014-12-03 15:30:14 115.7498  NaN 
2014-12-03 15:30:29 115.7498  NaN 
2014-12-03 15:30:44 115.7498  NaN 
2014-12-03 15:30:59  NaN 3.7880 
2014-12-03 15:31:14 115.6300  NaN 
2014-12-03 15:31:29 115.6300  NaN 
2014-12-03 15:31:44 115.6300  NaN 
2014-12-03 15:31:59  NaN 3.7880 
2014-12-03 15:32:14  NaN 3.7880 

It doesn't start at 15:30:00, and each row shows a price for only one ticker. What seems to be the problem?

This is what I want:

     AAPL NGF15 
2014-12-03 15:30:15 115.7498 3.7870 
2014-12-03 15:30:30 115.7498 3.7870 
2014-12-03 15:30:45 115.7498 3.7870 
2014-12-03 15:31:00 115.6300 3.7880 
2014-12-03 15:31:15 115.6300 3.7880 
2014-12-03 15:31:30 115.6300 3.7880 
2014-12-03 15:31:45 115.6300 3.7880 
2014-12-03 15:32:00 115.6300 3.7880 
2014-12-03 15:32:15 115.6300 3.7880 

Thank you in advance! Sorry if my English is bad.

Answer


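The documentation for DataFrame.asfreq() says that the how keyword is "For PeriodIndex only", so it cannot be used to pass a start and end time for a DatetimeIndex.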
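To illustrate what how is actually for, here is a small sketch with a PeriodIndex. The monthly series is made-up example data, not taken from the question; how only decides whether each period maps to its start or its end when the frequency changes.

```python
import pandas as pd

# Made-up monthly data on a PeriodIndex (November and December 2014).
monthly = pd.Series(
    [1.0, 2.0],
    index=pd.period_range("2014-11", periods=2, freq="M"),
)

# Convert the monthly periods to daily periods.
starts = monthly.asfreq("D", how="start")  # first day of each month
ends = monthly.asfreq("D", how="end")      # last day of each month

print(starts.index[0], ends.index[0])
```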

asfreq() is closely related to resample(). Something like resample('15s', fill_method='pad') should work. Using a shortened version of the data above:

In [49]: data 
Out[49]: 
         0 1 
2014-12-03 15:29:59 NaN 3.7 
2014-12-03 15:30:11 115.7 NaN 
2014-12-03 15:30:54 NaN 3.8 
2014-12-03 15:31:00 115.6 NaN 

[4 rows x 2 columns] 

In [50]: data.resample('15s', fill_method='pad') 
Out[50]: 
         0 1 
2014-12-03 15:29:45 NaN 3.7 
2014-12-03 15:30:00 115.7 3.7 
2014-12-03 15:30:15 115.7 3.7 
2014-12-03 15:30:30 115.7 3.7 
2014-12-03 15:30:45 115.7 3.8 
2014-12-03 15:31:00 115.6 3.8 
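The fill_method argument is no longer accepted by resample() in current pandas releases. As a hedged sketch, assuming a recent pandas version, the session above can be approximated by aggregating each 15-second bin and then forward-filling. The choice of mean() as the aggregation is an assumption; last() may be preferable for prices.

```python
import numpy as np
import pandas as pd

idx = pd.to_datetime([
    "2014-12-03 15:29:59", "2014-12-03 15:30:11",
    "2014-12-03 15:30:54", "2014-12-03 15:31:00",
])
data = pd.DataFrame(
    {0: [np.nan, 115.7, np.nan, 115.6],
     1: [3.7, np.nan, 3.8, np.nan]},
    index=idx,
)

# Aggregate each 15-second bin, then carry the last seen value forward.
out = data.resample("15s").mean().ffill()
print(out)
```

Bins are labelled by their left edge, so the row labelled 15:30:00 already contains the 15:30:11 trade.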

If you want to start at 15:30:00, you can drop the first row from the DataFrame.
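Another option, shown here only as a sketch, is to build the 15-second grid explicitly and look up the last known price at or before each grid point. Padding during a reindex copies whole rows, including their NaN values, which would explain why the asfreq output in the question shows one ticker per row; forward-filling each column first avoids that. The grid endpoints below are chosen for illustration.

```python
import numpy as np
import pandas as pd

idx = pd.to_datetime([
    "2014-12-03 15:29:59", "2014-12-03 15:30:11",
    "2014-12-03 15:30:54", "2014-12-03 15:31:00",
])
a = pd.DataFrame(
    {"AAPL": [np.nan, 115.7498, np.nan, 115.6300],
     "NGF15": [3.7870, np.nan, 3.7880, np.nan]},
    index=idx,
)

# Illustrative grid: every 15 seconds from 15:30:15 to 15:31:15.
grid = pd.date_range("2014-12-03 15:30:15", "2014-12-03 15:31:15", freq="15s")

# Fill gaps per column first, then take the last row at or before each grid point.
wanted = a.ffill().reindex(grid, method="pad")
print(wanted)
```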


That's correct. This is my final code: a = DataFrame(data=data, dtype=float) and print a.resample('15Min', fill_method='pad') – matoliki