2015-05-23

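sklearn error: "X and y have incompatible shapes."

I'm new to scikit-learn. I'm trying to run a logistic regression on some made-up data, but I get the error "X and y have incompatible shapes. X has 1 samples, but y has 6."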

import pandas as pd
from sklearn.linear_model import LogisticRegression

# Create a sample dataframe
data = [['Age', 'ZepplinFan'], [13, 0], [40, 1], [25, 0], [55, 0], [51, 1], [58, 1]]
columns = data.pop(0)
df = pd.DataFrame(data=data, columns=columns)

# Fit logistic regression (this call raises the shape error quoted above)
lr = LogisticRegression()
lr.fit(X=df.Age.values, y=df.ZepplinFan)

This post suggests that I need to somehow reshape df.Age.values to (n_samples, 1). How do I do that?

Answer

Yes, the shape matters. One way to do this is to pass the columns like

In [24]: lr.fit(df[['Age']], df['ZepplinFan']) 
Out[24]: 
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True, 
      intercept_scaling=1, penalty='l2', random_state=None, tol=0.0001) 
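
Why the double brackets help can be checked by comparing shapes. The following is a small sketch that reuses the sample data from the question; the names `one_d` and `two_d` are only for illustration and do not appear in the original post.

```python
import pandas as pd

# Same sample data as in the question
data = [[13, 0], [40, 1], [25, 0], [55, 0], [51, 1], [58, 1]]
df = pd.DataFrame(data, columns=['Age', 'ZepplinFan'])

one_d = df['Age']     # single brackets: a Series, one-dimensional
two_d = df[['Age']]   # double brackets: a one-column DataFrame, two-dimensional

print(one_d.shape)  # (6,)
print(two_d.shape)  # (6, 1)
```

The second shape, (n_samples, 1), is the layout the question asks about.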

If you want to pass the values explicitly, then you can

In [25]: lr.fit(df[['Age']].values, df['ZepplinFan'].values) 
Out[25]: 
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True, 
      intercept_scaling=1, penalty='l2', random_state=None, tol=0.0001) 

Or you can add np.newaxis to your existing syntax (this assumes numpy has been imported as np), like

In [26]: lr.fit(df.Age.values[:,np.newaxis], df.ZepplinFan.values) 
Out[26]: 
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True, 
      intercept_scaling=1, penalty='l2', random_state=None, tol=0.0001)
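
Since the question asks specifically about reshaping, one more option that is not in the answer above is NumPy's `reshape(-1, 1)`. The sketch below is a self-contained version under that assumption; the variable names `X` and `y` and the test age of 30 are chosen here for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Same sample data as in the question
data = [[13, 0], [40, 1], [25, 0], [55, 0], [51, 1], [58, 1]]
df = pd.DataFrame(data, columns=['Age', 'ZepplinFan'])

# reshape(-1, 1) fixes one column and lets NumPy infer the number of rows
X = df.Age.values.reshape(-1, 1)
y = df.ZepplinFan.values

# This gives the same array as indexing with np.newaxis
assert np.array_equal(X, df.Age.values[:, np.newaxis])

lr = LogisticRegression()
lr.fit(X, y)

print(X.shape)                    # (6, 1)
print(lr.predict([[30]]).shape)   # (1,)
```

Note that new samples passed to `predict` need the same two-dimensional layout, which is why the single age is written as `[[30]]`.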