2014-09-29 79 views
1

How to add a time index from date and time columns in pandas

I have an OHLC DataFrame as follows:

 
trade_date trade_time open_price high_price low_price close_price volumn 
    19911223  15:00  27.70  27.9  27.60  27.80 1270 
    19911224  15:00  27.90  29.3  27.00  29.05 1050 
    19911225  15:00  29.15  30.0  29.10  29.30 2269 
    19911226  15:00  29.30  29.3  28.00  28.00 1918 
    19911227  15:00  28.00  28.5  28.00  28.45 2105 
    19911228  15:00  28.40  29.3  28.40  29.25 1116 
    19911230  15:00  29.30  29.4  28.80  28.80 1059 
    ........ 

How can I combine the trade_date and trade_time columns into a time series index? I looked through similar questions, but they are all based on read_csv ....

Are 'trade_date' and 'trade_time' strings? – filmor 2014-09-29 09:11:18

I think you should accept Jeff's answer, since it will be much faster than mine – EdChum 2014-09-30 08:09:04

Answers

0

Assuming trade_date has dtype int64 and trade_time is str, the following will work:

In [26]: 
# Use strptime to parse the concatenated date and time strings into a datetime
import datetime as dt 
def datetime(x): 
    return dt.datetime.strptime(str(x.trade_date) + '' + x.trade_time, '%Y%m%d%H:%M') 
# Create a datetime column; apply with axis=1 calls the function once per row
df['datetime'] = df.apply(lambda row: datetime(row), axis=1) 
# Set this column as the index; by default set_index also drops it from the columns
df.set_index('datetime',inplace=True) 
df 
Out[26]: 
        trade_date trade_time open_price high_price low_price \ 
datetime                   
1991-12-23 15:00:00 19911223  15:00  27.70  27.9  27.6 
1991-12-24 15:00:00 19911224  15:00  27.90  29.3  27.0 
1991-12-25 15:00:00 19911225  15:00  29.15  30.0  29.1 
1991-12-26 15:00:00 19911226  15:00  29.30  29.3  28.0 
1991-12-27 15:00:00 19911227  15:00  28.00  28.5  28.0 
1991-12-28 15:00:00 19911228  15:00  28.40  29.3  28.4 
1991-12-30 15:00:00 19911230  15:00  29.30  29.4  28.8 

        close_price volumn 
datetime         
1991-12-23 15:00:00  27.80 1270 
1991-12-24 15:00:00  29.05 1050 
1991-12-25 15:00:00  29.30 2269 
1991-12-26 15:00:00  28.00 1918 
1991-12-27 15:00:00  28.45 2105 
1991-12-28 15:00:00  29.25 1116 
1991-12-30 15:00:00  28.80 1059 

In [27]: 

df.index.dtype 
Out[27]: 
dtype('<M8[ns]') 

Thanks, it works.... – firefoxuser 2014-09-30 01:22:13

1

Here is a fully vectorized solution.

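Convert the trade_date column to a datetime64[ns] dtype (it can be int64 or object dtype beforehand). Convert trade_time to a timedelta64[ns] dtype. You need to add a seconds component to hint that the time is in hh:mm form.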

Adding a datetime and a timedelta yields a datetime.
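At the scalar level, the same rule can be sketched as follows. The names `date_part`, `time_part`, and `stamp` are introduced here for illustration only; the values are taken from the first row of the question's data.

```python
import pandas as pd

# Midnight on the trade date.
date_part = pd.Timestamp("1991-12-23")
# The time of day expressed as an offset from midnight (hh:mm:ss).
time_part = pd.Timedelta("15:00:00")

# Timestamp + Timedelta gives a Timestamp shifted by that offset.
stamp = date_part + time_part
print(stamp)
```

The column-wise version applies this same addition to every row at once.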

In [5]: pd.to_datetime(df['trade_date'],format='%Y%m%d') + pd.to_timedelta(df['trade_time'] + ':00') 
Out[5]: 
0 1991-12-23 15:00:00 
1 1991-12-24 15:00:00 
2 1991-12-25 15:00:00 
3 1991-12-26 15:00:00 
4 1991-12-27 15:00:00 
5 1991-12-28 15:00:00 
6 1991-12-30 15:00:00 
dtype: datetime64[ns] 

You can then set the index directly:

In [6]: df.index = pd.to_datetime(df['trade_date'],format='%Y%m%d') + pd.to_timedelta(df['trade_time'] + ':00') 

In [7]: df 
Out[7]: 
        trade_date trade_time open_price high_price low_price close_price volumn 
1991-12-23 15:00:00 19911223  15:00  27.70  27.9  27.6  27.80 1270 
1991-12-24 15:00:00 19911224  15:00  27.90  29.3  27.0  29.05 1050 
1991-12-25 15:00:00 19911225  15:00  29.15  30.0  29.1  29.30 2269 
1991-12-26 15:00:00 19911226  15:00  29.30  29.3  28.0  28.00 1918 
1991-12-27 15:00:00 19911227  15:00  28.00  28.5  28.0  28.45 2105 
1991-12-28 15:00:00 19911228  15:00  28.40  29.3  28.4  29.25 1116 
1991-12-30 15:00:00 19911230  15:00  29.30  29.4  28.8  28.80 1059
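
For reference, the vectorized approach can be sketched end to end on a small frame. This is a minimal sketch, not part of the original answer: the rows are the first three from the question (with only one price column kept for brevity), the names `dates`, `times`, and `combined` are hypothetical, casting trade_date to str before parsing is an added precaution, and dropping the two source columns afterwards is an optional extra step.

```python
import pandas as pd

# First three rows from the question, reduced to the columns needed here.
df = pd.DataFrame({
    "trade_date": [19911223, 19911224, 19911225],
    "trade_time": ["15:00", "15:00", "15:00"],
    "close_price": [27.80, 29.05, 29.30],
})

# Cast the integer dates to strings so the format is applied to text.
dates = pd.to_datetime(df["trade_date"].astype(str), format="%Y%m%d")
# Append a seconds component so "hh:mm" is read as hours and minutes.
times = pd.to_timedelta(df["trade_time"] + ":00")

# datetime + timedelta gives a datetime for every row.
combined = dates + times

# Use the combined values as the index and (optionally) drop the source columns.
df.index = pd.DatetimeIndex(combined, name="datetime")
df = df.drop(columns=["trade_date", "trade_time"])
print(df)
```

Once the index is a DatetimeIndex, time-based operations such as `df.loc["1991-12-24"]` or `df.resample(...)` become available.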