2012-10-11 19 views
15

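No numeric types to aggregate - change in groupby() behaviour?

I have a problem with some groupby code that I'm quite sure used to run (on an older pandas version). On 0.9, I get "No numeric types to aggregate" errors. Any ideas?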

In [31]: data 
Out[31]: 
<class 'pandas.core.frame.DataFrame'> 
DatetimeIndex: 2557 entries, 2004-01-01 00:00:00 to 2010-12-31 00:00:00 
Freq: <1 DateOffset> 
Columns: 360 entries, -89.75 to 89.75 
dtypes: object(360) 

In [32]: latedges = linspace(-90., 90., 73) 

In [33]: lats_new = linspace(-87.5, 87.5, 72) 

In [34]: def _get_gridbox_label(x, bins, labels): 
    ....:    return labels[searchsorted(bins, x) - 1] 
    ....: 

In [35]: lat_bucket = lambda x: _get_gridbox_label(x, latedges, lats_new) 

In [36]: data.T.groupby(lat_bucket).mean() 
--------------------------------------------------------------------------- 
DataError         Traceback (most recent call last) 
<ipython-input-36-ed9c538ac526> in <module>() 
----> 1 data.T.groupby(lat_bucket).mean() 

/usr/lib/python2.7/site-packages/pandas/core/groupby.py in mean(self) 
    295   """ 
    296   try: 
--> 297    return self._cython_agg_general('mean') 
    298   except DataError: 
    299    raise 

/usr/lib/python2.7/site-packages/pandas/core/groupby.py in _cython_agg_general(self, how, numeric_only) 
    1415 
    1416  def _cython_agg_general(self, how, numeric_only=True): 
-> 1417   new_blocks = self._cython_agg_blocks(how, numeric_only=numeric_only) 
    1418   return self._wrap_agged_blocks(new_blocks) 
    1419 

/usr/lib/python2.7/site-packages/pandas/core/groupby.py in _cython_agg_blocks(self, how, numeric_only) 
    1455 
    1456   if len(new_blocks) == 0: 
-> 1457    raise DataError('No numeric types to aggregate') 
    1458 
    1459   return new_blocks 

DataError: No numeric types to aggregate 

Answers

20

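How did you generate your data?

See how the output shows that your data is of 'object' type? The groupby operations first check whether each column has a numeric dtype.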

In [31]: data 
Out[31]: 
<class 'pandas.core.frame.DataFrame'> 
DatetimeIndex: 2557 entries, 2004-01-01 00:00:00 to 2010-12-31 00:00:00 
Freq: <1 DateOffset> 
Columns: 360 entries, -89.75 to 89.75 
dtypes: object(360) 

Look ↑


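Did you initialize an empty DataFrame first and then fill it? If so, that is probably why the behaviour changed with the new version: before 0.9, empty DataFrames were initialized to float type, but now they are of object type. In that case you can change the initialization to DataFrame(dtype=float).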
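To make the dtype point concrete, here is a minimal sketch assuming a recent pandas. The names `idx`, `cols`, `as_object` and `as_float` are made up for illustration and are not from the question. It builds one empty frame without a dtype and one with dtype=float, fills both cell by cell, and compares the resulting dtypes.

```python
import pandas as pd

# Hypothetical small grid: three days by two latitude columns.
idx = pd.date_range("2004-01-01", periods=3, freq="D")
cols = [-89.75, -89.25]

# An empty frame created without a dtype holds object columns.
as_object = pd.DataFrame(index=idx, columns=cols)
# Passing dtype=float gives float64 columns (filled with NaN) instead.
as_float = pd.DataFrame(index=idx, columns=cols, dtype=float)

# Fill both frames one cell at a time, as a fill loop might do.
for frame in (as_object, as_float):
    for i, day in enumerate(idx):
        for j, lat in enumerate(cols):
            frame.at[day, lat] = float(i + j)

# The cell-by-cell fill does not change the column dtypes.
print(as_object.dtypes.unique())
print(as_float.dtypes.unique())
```

The first frame still reports object even though every cell now holds a float, which is the situation the `dtypes: object(360)` line above points to.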

You can also call frame.astype(float) to convert an existing frame.
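As a rough sketch of the astype route applied to the grouping in the question (assuming a recent pandas and NumPy): the 4x4 frame below is invented stand-in data, and `lat_bucket` is written as a plain function rather than the lambda plus helper used in the question. `latedges` and `lats_new` are copied unchanged from the question.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's frame: 4 days by 4 latitudes, stored as object.
dates = pd.date_range("2004-01-01", periods=4, freq="D")
lats = np.array([-89.75, -89.25, 89.25, 89.75])
data = pd.DataFrame(
    np.arange(16.0).reshape(4, 4), index=dates, columns=lats
).astype(object)

# Convert every column to float64 before aggregating.
data_float = data.astype(float)

# Bin edges and labels as defined in the question.
latedges = np.linspace(-90.0, 90.0, 73)
lats_new = np.linspace(-87.5, 87.5, 72)

def lat_bucket(x):
    # Map a latitude to the label of the bin it falls into.
    return lats_new[np.searchsorted(latedges, x) - 1]

# Transpose so latitudes form the index, then average within each bin.
zonal = data_float.T.groupby(lat_bucket).mean()
print(zonal)
```

With float columns the mean is computed per latitude bin; here the four latitudes collapse into two rows, one per bin.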

+0

Thanks! I guess I should really start reading the release notes of each release carefully... ;)

4

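I got this error when generating a data frame consisting of timestamps and data:

df = pd.DataFrame({'data':value}, index=pd.DatetimeIndex(timestamp))

Adding the suggested solution works for me:

df = pd.DataFrame({'data':value}, index=pd.DatetimeIndex(timestamp), dtype=float)

Thank you, Chang She!

Example: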

     data 
2005-01-01 00:10:00 7.53 
2005-01-01 00:20:00 7.54 
2005-01-01 00:30:00 7.62 
2005-01-01 00:40:00 7.68 
2005-01-01 00:50:00 7.81 
2005-01-01 01:00:00 7.95 
2005-01-01 01:10:00 7.96 
2005-01-01 01:20:00 7.95 
2005-01-01 01:30:00 7.98 
2005-01-01 01:40:00 8.06 
2005-01-01 01:50:00 8.04 
2005-01-01 02:00:00 8.06 
2005-01-01 02:10:00 8.12 
2005-01-01 02:20:00 8.12 
2005-01-01 02:30:00 8.25 
2005-01-01 02:40:00 8.27 
2005-01-01 02:50:00 8.17 
2005-01-01 03:00:00 8.21 
2005-01-01 03:10:00 8.29 
2005-01-01 03:20:00 8.31 
2005-01-01 03:30:00 8.25 
2005-01-01 03:40:00 8.19 
2005-01-01 03:50:00 8.17 
2005-01-01 04:00:00 8.18 
        data 
2005-01-01 00:00:00 7.636000 
2005-01-01 01:00:00 7.990000 
2005-01-01 02:00:00 8.165000 
2005-01-01 03:00:00 8.236667 
2005-01-01 04:00:00 8.180000
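
The second table above appears to hold hourly means of the first. The answer does not show the call that produced it, so the following is only a sketch under that assumption, written for a recent pandas. Rebuilding `timestamp` with `date_range` and grouping on `df.index.floor("h")` are my own choices here, not the original poster's code.

```python
import pandas as pd

# Rebuild the 10-minute series from the first table.
timestamp = pd.date_range("2005-01-01 00:10", periods=24, freq="10min")
value = [7.53, 7.54, 7.62, 7.68, 7.81, 7.95, 7.96, 7.95,
         7.98, 8.06, 8.04, 8.06, 8.12, 8.12, 8.25, 8.27,
         8.17, 8.21, 8.29, 8.31, 8.25, 8.19, 8.17, 8.18]

# dtype=float keeps the column numeric so it can be aggregated.
df = pd.DataFrame({"data": value}, index=pd.DatetimeIndex(timestamp), dtype=float)

# Group each row by the start of its hour and average (assumed method).
hourly = df.groupby(df.index.floor("h")).mean()
print(hourly)
```

Under that assumption the five hourly rows match the values in the second table.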