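I have a pandas df with the columns T max and T min. I want to compute T mean in the next column. I tried df['T mean']= df[['T max','T min']].mean(axis=1) but it did not work: I get the T max values as T mean. Can anyone help me? How do I calculate the row-wise mean in a pandas DataFrame?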
I think the problem is that the values in column T min are of type string, not numeric. So you need to cast the column with astype.
Sample:
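import pandas as pd

df = pd.DataFrame({'T max': [1, 2, 3], 'T min': ['5', '6', '7']})
print(df)
   T max T min
0      1     5
1      2     6
2      3     7
# .loc replaces the removed .ix indexer
print(type(df.loc[0, 'T min']))
<class 'str'>
# the string column is silently ignored here, so T mean equals T max
# (recent pandas versions may raise a TypeError instead)
df['T mean'] = df[['T max', 'T min']].mean(axis=1)
print(df)
   T max T min  T mean
0      1     5     1.0
1      2     6     2.0
2      3     7     3.0
# cast column to int
df['T min'] = df['T min'].astype(int)
# may print numpy.int64, depending on the platform
print(type(df.loc[0, 'T min']))
<class 'numpy.int32'>
df['T mean new'] = df[['T max', 'T min']].mean(axis=1)
print(df)
   T max  T min  T mean  T mean new
0      1      5     1.0         3.0
1      2      6     2.0         4.0
2      3      7     3.0         5.0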
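Before casting, it can help to confirm which columns are not numeric. The following is only a minimal sketch, assuming the same sample frame as above; the name non_numeric is a hypothetical helper variable, not something from the question.

```python
import pandas as pd

# Same sample data as above: 'T min' holds strings, not numbers
df = pd.DataFrame({'T max': [1, 2, 3], 'T min': ['5', '6', '7']})

# A non-numeric dtype (typically "object") is a hint that the column holds strings
print(df.dtypes)

# Collect the names of the columns pandas does not treat as numeric
# (non_numeric is a hypothetical name used only for this illustration)
non_numeric = [col for col in df.columns
               if not pd.api.types.is_numeric_dtype(df[col])]
print(non_numeric)
```

Any column listed in non_numeric is a candidate for astype or pd.to_numeric before taking the mean.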
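If astype raises an error such as:
ValueError: invalid literal for int() with base 10: 'aaa'
it means that the T min column contains at least one invalid value. In that case, pd.to_numeric with errors='coerce' converts the invalid values to NaN.
Sample: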
import pandas as pd

df = pd.DataFrame({'T max': [1, 2, 3], 'T min': ['5', '6', 'aaa']})
print(df)
   T max T min
0      1     5
1      2     6
2      3   aaa
df['T mean'] = df[['T max', 'T min']].mean(axis=1)
print(df)
   T max T min  T mean
0      1     5     1.0
1      2     6     2.0
2      3   aaa     3.0
# show the rows where T min holds an invalid value
print(df[pd.to_numeric(df['T min'], errors='coerce').isnull()])
   T max T min  T mean
2      3   aaa     3.0
# replace invalid values with NaN
df['T min'] = pd.to_numeric(df['T min'], errors='coerce')
# NaN is skipped by default, so row 2 averages T max alone
df['T mean new'] = df[['T max', 'T min']].mean(axis=1)
print(df)
   T max  T min  T mean  T mean new
0      1    5.0     1.0         3.0
1      2    6.0     2.0         4.0
2      3    NaN     3.0         3.0
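Note that in the last row the mean is computed from T max alone, because NaN is skipped by default. If a row with a missing value should give NaN instead, one possible variant is sketched below. It assumes the same sample data; the names cols and numeric are hypothetical, and skipna=False is a choice made for this illustration rather than part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'T max': [1, 2, 3], 'T min': ['5', '6', 'aaa']})

# Columns to average (hypothetical helper name)
cols = ['T max', 'T min']

# Convert every selected column to numeric; invalid entries become NaN
numeric = df[cols].apply(pd.to_numeric, errors='coerce')

# skipna=False makes the row mean NaN whenever any input value is missing
df['T mean'] = numeric.mean(axis=1, skipna=False)
print(df)
```

With this variant the first two rows get 3.0 and 4.0, and the row containing 'aaa' gets NaN rather than a mean based on a single value.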
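I cast the column to int and it worked. Thanks! –
Please provide a sample DataFrame to work with. –
Post the raw data, your code, the desired output, and your erroneous output – EdChum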