2016-04-14

Question

pandas timedelta resampling by week fails

As this small example shows, I am trying to resample a pandas DataFrame by week:

import datetime 
import pandas as pd 

df = pd.DataFrame([{ 
    'A' : datetime.datetime.now() - datetime.datetime.now(), 
    'B' : 2 
},{ 
    'A' : datetime.datetime.now() - datetime.datetime.now(), 
    'B' : 3 
}]) 

df = df.set_index('A') 

df.resample('W', how="mean") 

This raises an AttributeError:

AttributeError: 'Week' object has no attribute 'nanos' 

(Note: if I resample by "D" instead, the problem does not occur.)
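Since resampling by "D" works, one possible workaround (not part of the original post) is to use a fixed seven-day span in place of 'W'. The sketch below assumes a recent pandas version in which `resample` accepts a `pd.Timedelta` as its rule and the aggregation is called as a method rather than passed through `how=`. The sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data: a TimedeltaIndex spanning a bit more than one week.
df = pd.DataFrame(
    {"B": [2, 3, 4, 6]},
    index=pd.to_timedelta(["0 days", "3 days", "8 days", "10 days"]),
)

# A fixed 7-day span stands in for 'W'. A timedelta index has no calendar
# weeks, so the bins are plain 7-day windows over the data.
weekly = df.resample(pd.Timedelta(days=7)).mean()
print(weekly)
```

Whether seven-day windows are an acceptable substitute for 'W' depends on what "weekly" should mean for elapsed-time data.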

If I instead cast the index to datetime:

df.index = pd.to_datetime(df.index.values) 
df.resample('W', how="mean") 

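then the resampling works.
Question: is there a pandas timedelta type that does not depend on nanoseconds?
Or: is there a more elegant way than casting the timedelta to datetime?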


Full traceback:

Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/Library/Python/2.7/site-packages/pandas/core/generic.py", line 3266, in resample 
    return sampler.resample(self).__finalize__(self) 
    File "/Library/Python/2.7/site-packages/pandas/tseries/resample.py", line 98, in resample 
    rs = self._resample_timestamps(kind='timedelta') 
    File "/Library/Python/2.7/site-packages/pandas/tseries/resample.py", line 272, in _resample_timestamps 
    self._get_binner_for_resample(kind=kind) 
    File "/Library/Python/2.7/site-packages/pandas/tseries/resample.py", line 122, in _get_binner_for_resample 
    self.binner, bins, binlabels = self._get_time_delta_bins(ax) 
    File "/Library/Python/2.7/site-packages/pandas/tseries/resample.py", line 236, in _get_time_delta_bins 
    name=ax.name) 
    File "/Library/Python/2.7/site-packages/pandas/tseries/tdi.py", line 167, in __new__ 
    closed=closed) 
    File "/Library/Python/2.7/site-packages/pandas/tseries/tdi.py", line 235, in _generate 
    index = _generate_regular_range(start, end, periods, offset) 
    File "/Library/Python/2.7/site-packages/pandas/tseries/tdi.py", line 895, in _generate_regular_range 
    stride = offset.nanos 
AttributeError: 'Week' object has no attribute 'nanos' 
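The traceback passes through `_get_time_delta_bins` and `tdi.py`, which suggests that the index being resampled is a timedelta index rather than a datetime index. A minimal sketch to check the types involved, assuming a reasonably recent pandas (the variable name `delta` is introduced here for illustration):

```python
import datetime
import pandas as pd

# Subtracting two datetime.datetime values yields a datetime.timedelta.
delta = datetime.datetime.now() - datetime.datetime.now()
print(type(delta))

# Building the frame as in the question and indexing by 'A'
# gives an index of timedeltas, not of timestamps.
df = pd.DataFrame([{"A": delta, "B": 2}, {"A": delta, "B": 3}]).set_index("A")
print(type(df.index))
```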

Versions

>>> pd.__version__ 
'0.16.2' 
>>> np.__version__ 
'1.10.1' 

Answers

0

I believe the difference is that pandas uses numpy's datetime64, whereas Python's datetime class is a different thing. When you call

df.index = pd.to_datetime(df.index.values) 

you are casting the datetime.datetime objects you created to the numpy.datetime64 objects that resample needs as an argument.

+0

Sure, but how does this answer my question? –

+0

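Well, your question was: "Is there a pandas timedelta type that does not depend on nanoseconds?" To which I answered: yes, numpy.datetime64. You also asked: "Or: is there a more elegant way than casting the timedelta to datetime?" The answer is: no, because numpy.datetime64 has no equivalent of datetime.datetime.now(). See http://docs.scipy.org/doc/numpy/reference/arrays.datetime.html – kingledion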