2012-09-30

Linear regression - reducing degrees of freedom

I have a pandas DataFrame with columns like

Order  Balance  Profit cum (%) 

I ran a linear regression:

model_profit_tr = pd.ols(y=df_closed['Profit cum (%)'], x=df_closed['Order']) 

The problem is that the standard model is the equation of a line that does not have to pass through the origin:

y = a * x + b 

It has 2 degrees of freedom (a and b).

Slope (a):

a=model_profit_tr.beta['x'] 

and intercept (b):

b=model_profit_tr.beta['intercept'] 

I would like to reduce the degrees of freedom of my model (from 2 to 1) and fit a model like

y = a * x 

Answers

8

Use the intercept keyword argument:

model_profit_tr = pd.ols(y=df_closed['Profit cum (%)'], 
         x=df_closed['Order'], 
         intercept=False) 

From the docs:

In [65]: help(pandas.ols) 
Help on function ols in module pandas.stats.interface: 

ols(**kwargs) 

    [snip] 

    Parameters 
    ---------- 
    y: Series or DataFrame 
     See above for types 
    x: Series, DataFrame, dict of Series, dict of DataFrame, Panel 
    weights : Series or ndarray 
     The weights are presumed to be (proportional to) the inverse of the 
     variance of the observations. That is, if the variables are to be 
     transformed by 1/sqrt(W) you must supply weights = 1/W 
    intercept: bool 
     True if you want an intercept. Defaults to True. 
    nw_lags: None or int 
     Number of Newey-West lags. Defaults to None. 

    [snip] 
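
For readers on a newer pandas: as far as I know, pd.ols was deprecated and later removed (around pandas 0.20), so the call above may not be available. What intercept=False computes for a single regressor can be sketched without any library. This is a minimal illustration, not the pandas implementation; the function names and the sample values are made up for the example.

```python
# Hypothetical sample data, used only to illustrate the two models.
xs = [0.2, 1.3, 2.1, 2.9, 3.3]
ys = [1.3, 3.9, 4.8, 5.5, 6.9]


def fit_through_origin(xs, ys):
    """Least-squares slope a for the one-parameter model y = a * x."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx


def fit_with_intercept(xs, ys):
    """Least-squares (a, b) for the two-parameter model y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    return a, mean_y - a * mean_x


slope_only = fit_through_origin(xs, ys)        # 1 degree of freedom
slope, intercept = fit_with_intercept(xs, ys)  # 2 degrees of freedom

print("y = a*x     : a =", slope_only)
print("y = a*x + b : a =", slope, ", b =", intercept)
```

With the intercept dropped, the slope is simply sum(x*y) / sum(x*x); the line is forced through the origin, so the fitted slope generally differs from the slope of the two-parameter fit.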

Thanks a lot (both for the solution and for the help tip)!


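I have one more question, but I don't know whether I can ask it here... What should I do if I want to fix the intercept at a given value (other than 0)? (That would also reduce the number of degrees of freedom from 2 to 1.)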


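@FemtoTrader: I don't think 'ols' has that functionality. However, given how the least-squares method works, you can subtract the intercept from 'y' and then use 'ols' with 'intercept=False'. It should be equivalent. – Avaris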
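
The reasoning in this comment is that, with the intercept b0 held fixed, minimizing the sum of (y - a*x - b0)^2 over a is the same problem as a regression through the origin of (y - b0) on x. A small library-free sketch of that idea follows; the helper name fit_slope_fixed_intercept and the sample values are hypothetical, chosen only for illustration.

```python
def fit_slope_fixed_intercept(xs, ys, intercept):
    """Least-squares slope a for y = a * x + intercept, intercept held fixed.

    Shifts y by the fixed intercept, then fits a line through the origin.
    """
    shifted = [y - intercept for y in ys]
    sxy = sum(x * y for x, y in zip(xs, shifted))
    sxx = sum(x * x for x in xs)
    return sxy / sxx


# Hypothetical sample data and a hypothetical fixed intercept.
xs = [0.2, 1.3, 2.1, 2.9, 3.3]
ys = [1.3, 3.9, 4.8, 5.5, 6.9]
b0 = -1.0

a = fit_slope_fixed_intercept(xs, ys, b0)

# Add the fixed intercept back to get fitted values on the original scale.
fitted = [a * x + b0 for x in xs]

print("slope with intercept fixed at", b0, ":", a)
print("fitted values:", fitted)
```

Only the slope is estimated, so the model still has a single degree of freedom; passing intercept=0.0 reduces to the plain through-origin fit.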

0

Here is an example showing the solution:

#!/usr/bin/env python 

import pandas as pd 
import matplotlib.pyplot as plt 

data = [ 
    (0.2, 1.3), 
    (1.3, 3.9), 
    (2.1, 4.8), 
    (2.9, 5.5), 
    (3.3, 6.9), 
] 

df = pd.DataFrame(data, columns=['X', 'Y']) 

print(df) 

# 2 degrees of freedom : slope/intercept 
model_with_intercept = pd.ols(y=df['Y'], x=df['X'], intercept=True) 
df['Y_fit_with_intercept'] = model_with_intercept.y_fitted 

# 1 degree of freedom : slope ; intercept=0 
model_no_intercept = pd.ols(y=df['Y'], x=df['X'], intercept=False) 
df['Y_fit_no_intercept'] = model_no_intercept.y_fitted 

# 1 degree of freedom : slope ; intercept=offset (shift Y, fit through origin, shift back) 
offset = -1 
df['Yoffset'] = df['Y'] - offset 
model_with_offset = pd.ols(y=df['Yoffset'], x=df['X'], intercept=False) 
df['Y_fit_offset'] = model_with_offset.y_fitted + offset 

print(model_with_intercept) 
print(model_no_intercept) 
print(model_with_offset) 

df.plot(x='X', y=['Y', 'Y_fit_with_intercept', 'Y_fit_no_intercept', 'Y_fit_offset']) 
plt.show()