2013-08-30 67 views
1

Regression in Python

I'm trying to run a logistic regression with pandas and statsmodels. I don't know why I'm getting an error or how to fix it.

import pandas as pd 
import statsmodels.api as sm 
x = [1, 3, 5, 6, 8] 
y = [0, 1, 0, 1, 1] 
d = { "x": pd.Series(x), "y": pd.Series(y)} 
df = pd.DataFrame(d) 

model = "y ~ x" 
glm = sm.Logit(model, df=df).fit() 

Error:

Traceback (most recent call last): 
    File "regress.py", line 45, in <module> 
    glm = sm.Logit(model, df=df).fit() 
TypeError: __init__() takes exactly 3 arguments (2 given) 

Answer

8

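You can't pass a formula to Logit. Its constructor expects the response (endog) and the design matrix (exog) as separate arguments, which is why __init__ reports a missing argument. Build them with patsy first: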

In [82]: import patsy 

In [83]: f = 'y ~ x' 

In [84]: y, X = patsy.dmatrices(f, df, return_type='dataframe') 

In [85]: sm.Logit(y, X).fit().summary() 
Optimization terminated successfully. 
     Current function value: 0.511631 
     Iterations 6 
Out[85]: 
<class 'statsmodels.iolib.summary.Summary'> 
""" 
                           Logit Regression Results
==============================================================================
Dep. Variable:                        y   No. Observations:                  5
Model:                            Logit   Df Residuals:                      3
Method:                             MLE   Df Model:                          1
Date:                  Fri, 30 Aug 2013   Pseudo R-squ.:                0.2398
Time:                          16:56:38   Log-Likelihood:              -2.5582
converged:                         True   LL-Null:                     -3.3651
                                          LLR p-value:                  0.2040
==============================================================================
                coef    std err          z      P>|z|       [95.0% Conf. Int.]
------------------------------------------------------------------------------
Intercept    -2.0544      2.452     -0.838      0.402        -6.861      2.752
x             0.5672      0.528      1.073      0.283        -0.468      1.603
==============================================================================
""" 

This is pretty straightforward from the docs on how to do exactly what you're asking.

Edit: You can also use the formula API, as @user333700 suggested:

In [22]: print sm.formula.logit(model, data=df).fit().summary() 
Optimization terminated successfully. 
     Current function value: 0.511631 
     Iterations 6 
                           Logit Regression Results
==============================================================================
Dep. Variable:                        y   No. Observations:                  5
Model:                            Logit   Df Residuals:                      3
Method:                             MLE   Df Model:                          1
Date:                  Fri, 30 Aug 2013   Pseudo R-squ.:                0.2398
Time:                          18:14:26   Log-Likelihood:              -2.5582
converged:                         True   LL-Null:                     -3.3651
                                          LLR p-value:                  0.2040
==============================================================================
                coef    std err          z      P>|z|       [95.0% Conf. Int.]
------------------------------------------------------------------------------
Intercept    -2.0544      2.452     -0.838      0.402        -6.861      2.752
x             0.5672      0.528      1.073      0.283        -0.468      1.603
==============================================================================

Or use the formula functions: 'import statsmodels.api as smf', then smf.logit(formula .. ) – user333700

Edited. I didn't know that, thanks! –

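Thanks Phillip for providing a corrected answer. My comment was too hasty. I meant to write 'import statsmodels.formula.api as smf', which also gives access to the shortcuts for the formula interface, the lowercase functions. These are just convenience wrappers around the 'from_formula' method of the models, e.g. 'sm.Logit.from_formula' – user333700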