2016-02-02 14 views

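Pandas correlation error - Decimal and float type mismatch

This question has been raised here, but it has not been answered yet. I am providing more detail in this thread in the hope of getting things moving.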

I have a pandas DataFrame, master_frame, containing time series data:

 SUBMIT_DATE CRUX_VOL  CRUX_RATE 
0  2016-02-01 76.38733173161 0.02832710529 
1  2016-01-31 76.68984699154 0.02720243998 
2  2016-01-30 75.59094829615 0.02720243998 
3  2016-01-29 75.91758975956 0.02720243998 
4  2016-01-28 76.31809997200 0.02671927211 
...   ... ...   ... 

I want the correlation between the CRUX_VOL and CRUX_RATE columns. Both are of Decimal type:

ln[3]: print type(master_frame["CRUX_VOL"][0]), type(master_frame["CRUX_RATE"][0]) 
out[3]: <class 'decimal.Decimal'> <class 'decimal.Decimal'> 

When I use the corr function, I get a nasty error related to the types of the inputs.

print master_frame['CRUX_VOL'].corr(master_frame['CRUX_RATE']) 

Traceback (most recent call last): 
    File "U:/Programming/VolPathReport/VolPath.py", line 52, in <module> 
    print master_frame['CRUX_VOL'].corr(master_frame['CRUX_RATE']) 
    File "C:\Anaconda2\lib\site-packages\pandas\core\series.py", line 1312, in corr 
    min_periods=min_periods) 
    File "C:\Anaconda2\lib\site-packages\pandas\core\nanops.py", line 47, in _f 
    return f(*args, **kwargs) 
    File "C:\Anaconda2\lib\site-packages\pandas\core\nanops.py", line 644, in nancorr 
    return f(a, b) 
    File "C:\Anaconda2\lib\site-packages\pandas\core\nanops.py", line 652, in _pearson 
    return np.corrcoef(a, b)[0, 1] 
    File "C:\Anaconda2\lib\site-packages\numpy\lib\function_base.py", line 2145, in corrcoef 
    c = cov(x, y, rowvar) 
    File "C:\Anaconda2\lib\site-packages\numpy\lib\function_base.py", line 2065, in cov 
    avg, w_sum = average(X, axis=1, weights=w, returned=True) 
    File "C:\Anaconda2\lib\site-packages\numpy\lib\function_base.py", line 599, in average 
    scl = np.multiply(avg, 0) + scl 
TypeError: unsupported operand type(s) for +: 'Decimal' and 'float' 

I have messed around with the types and cannot get this to work. Help me, internet wizards!

Answer

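The last line of the error message points to

np.multiply(avg, 0) + scl 

as the cause of

TypeError: unsupported operand type(s) for +: 'Decimal' and 'float' 

I don't think numpy supports the Decimal type, so the values are handled as generic Python objects, and the + operator then fails when a Decimal is added to a float. Since pandas relies on numpy, it is probably best to convert the master_frame DataFrame to float dtype using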

master_frame.loc[:, ['CRUX_VOL', 'CRUX_RATE']].astype(float)

or

master_frame.convert_objects(convert_numeric=True)
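
As a minimal sketch of the conversion suggested above: the frame below is rebuilt from the five sample rows shown in the question, which is an assumption made for illustration, since the real master_frame is presumably loaded from elsewhere. It first shows that plain Python refuses to add a Decimal to a float, then converts each Decimal column to float and assigns the result back (astype returns a new object rather than modifying the frame in place) before calling corr.

```python
from decimal import Decimal

import pandas as pd

# Plain Python raises the same kind of TypeError when Decimal meets float.
try:
    Decimal("1.5") + 0.5
except TypeError as exc:
    print("TypeError:", exc)

# Illustrative frame built from the sample rows in the question.
master_frame = pd.DataFrame({
    "SUBMIT_DATE": ["2016-02-01", "2016-01-31", "2016-01-30",
                    "2016-01-29", "2016-01-28"],
    "CRUX_VOL": [Decimal("76.38733173161"), Decimal("76.68984699154"),
                 Decimal("75.59094829615"), Decimal("75.91758975956"),
                 Decimal("76.31809997200")],
    "CRUX_RATE": [Decimal("0.02832710529"), Decimal("0.02720243998"),
                  Decimal("0.02720243998"), Decimal("0.02720243998"),
                  Decimal("0.02671927211")],
})

# Convert each Decimal column to float and assign it back to the frame.
for col in ["CRUX_VOL", "CRUX_RATE"]:
    master_frame[col] = master_frame[col].astype(float)

# With float64 columns, corr no longer mixes Decimal and float.
correlation = master_frame["CRUX_VOL"].corr(master_frame["CRUX_RATE"])
print(master_frame.dtypes)
print(correlation)
```

Note that converting to float gives up the exact decimal representation, which is usually acceptable for a correlation coefficient.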

'master_frame["CRUX_VOL"] = master_frame["CRUX_VOL"].astype(float)' and 'master_frame["CRUX_RATE"] = master_frame["CRUX_RATE"].astype(float)' did the trick. Thanks – rvictordelta

Why '.loc' rather than what I wrote? – rvictordelta

'master_frame.convert_objects(convert_numeric=True)' is deprecated; the alternative to use is 'pd.to_numeric': 'master_frame["CRUX_VOL"] = pd.to_numeric(master_frame["CRUX_VOL"])'. Sadly, there is no version for the 'DataFrame' itself – Robin