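Short question:
I am trying to get the mean of a column (a Series) after grouping a MultiIndex pandas DataFrame that was built in two different ways. The only difference is how the DataFrame is constructed. One gives the expected result; the other raises DataError: No numeric types to aggregate when taking the mean on what looks like the same MultiIndex DataFrame.
Description: common data used for construction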
import pandas as pd
import numpy as np
indexTuples = [('a', 1), ('b', 3), ('a', 2), ('c', 2), ('c', 3), ('b', 8)]
multiIndex = pd.MultiIndex.from_tuples(indexTuples, names = ['x', 'y'])
Constructing the DataFrame via method 1
columns = ['alpha', 'beta', 'gamma']
df = pd.DataFrame(index=multiIndex, columns=columns)
alpha = pd.Series(index=multiIndex)
beta = pd.Series(index=multiIndex)
gamma = pd.Series(index=multiIndex)
for tup in indexTuples:
    alpha[tup[0], tup[1]] = np.random.randint(400)
    beta[tup[0], tup[1]] = np.random.randint(400)
    gamma[tup[0], tup[1]] = np.random.randint(400)
df.alpha = alpha
df.beta = beta
df.gamma = gamma
df.alpha['a'] = np.nan
df
This gives a DataFrame that looks like the following:
     alpha   beta  gamma
x y
a 1    NaN  136.0  224.0
b 3  375.0  227.0  191.0
a 2    NaN  367.0  195.0
c 2  247.0   61.0   78.0
  3  238.0  187.0  366.0
b 8  302.0   14.0  272.0
If I do the following operation, I get the expected result:
df.groupby(level='x').alpha.mean()
Result:
x
a      NaN
b    148.0
c    244.5
Name: alpha, dtype: float64
Constructing the DataFrame via method 2
columns = ['alpha', 'beta', 'gamma']
_df = pd.DataFrame(index=multiIndex, columns=columns)
for tup in indexTuples:
    _df.alpha[tup[0], tup[1]] = np.random.randint(400)
    _df.beta[tup[0], tup[1]] = np.random.randint(400)
    _df.gamma[tup[0], tup[1]] = np.random.randint(400)
_df.alpha['a'] = np.nan
This gives a similar-looking DataFrame with NaN values, as shown for the previous method.
But now, when I try to find the mean after grouping by level:
_df.groupby(level='x').alpha.mean()
I get the following error:
---------------------------------------------------------------------------
DataError Traceback (most recent call last)
<ipython-input-192-ad2de6450fab> in <module>()
----> 1 _df.groupby(level='x').alpha.mean()
/film/tools/packages/pandas/0.18.0/CentOS-6.2_thru_7/python-2.7/lib/python2.7/site-packages/pandas-0.18.0-py2.7-linux-x86_64.egg/pandas/core/groupby.pyc in mean(self)
933 """
934 try:
--> 935 return self._cython_agg_general('mean')
936 except GroupByError:
937 raise
/film/tools/packages/pandas/0.18.0/CentOS-6.2_thru_7/python-2.7/lib/python2.7/site-packages/pandas-0.18.0-py2.7-linux-x86_64.egg/pandas/core/groupby.pyc in _cython_agg_general(self, how, numeric_only)
750
751 if len(output) == 0:
--> 752 raise DataError('No numeric types to aggregate')
753
754 return self._wrap_aggregated_output(output, names)
DataError: No numeric types to aggregate
Why does it work in the first case but not in the second?
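Somehow dtype does not work on my DataFrame, but your solution works! Correctly pointed out as a dtype problem. '_df.dtype' gives AttributeError: 'DataFrame' object has no attribute 'dtype' – narenandu
That was my typo. It should be dtypes (plural for a DataFrame) – piRSquared
Thanks... that works – narenandu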
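For anyone landing here later, the dtype point from the comments can be checked with a minimal sketch like the one below. It is not the accepted solution verbatim. It assumes the cause is that a DataFrame created from only an index and column names starts out with object-dtype columns, and that element-wise assignment leaves them as object. The names `empty`, `obj_frame`, and `fixed` are made up for illustration, and `dtype=object` is passed explicitly so the object-dtype state is reproduced the same way on any pandas version. The values are the ones from the table in the question.

```python
import numpy as np
import pandas as pd

index_tuples = [('a', 1), ('b', 3), ('a', 2), ('c', 2), ('c', 3), ('b', 8)]
multi_index = pd.MultiIndex.from_tuples(index_tuples, names=['x', 'y'])
columns = ['alpha', 'beta', 'gamma']

# A frame created with no data has object-dtype columns (all NaN).
empty = pd.DataFrame(index=multi_index, columns=columns)
print(empty.dtypes)  # note: dtypes (plural) on a DataFrame

# Values copied from the table in the question; dtype=object mimics
# the state method 2 ends up in (assumption for illustration).
data = [
    [np.nan, 136, 224],
    [375, 227, 191],
    [np.nan, 367, 195],
    [247, 61, 78],
    [238, 187, 366],
    [302, 14, 272],
]
obj_frame = pd.DataFrame(data, index=multi_index, columns=columns, dtype=object)

# Converting to a numeric dtype before grouping avoids aggregating object columns.
fixed = obj_frame.astype(float)
result = fixed.groupby(level='x')['alpha'].mean()
print(result)
```

With the table values above, group a stays NaN (both entries are missing), b comes out as 338.5 and c as 242.5. If some cells might hold non-numeric strings, `astype(float)` will raise; in that case a per-column conversion that coerces bad values may be more appropriate.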