Does scikit-learn's LogisticRegression() automatically standardize input data to z-scores?

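Is there a way to have an instance of LogisticRegression() automatically standardize the data supplied for fitting/training to z-scores when building the model? LinearRegression() has a normalize=True parameter, but maybe that doesn't make sense for LogisticRegression()?

If so, would I have to standardize unlabeled input vectors by hand (i.e., recompute the mean and standard deviation of each column) before calling predict_proba()? That would be odd if the model has already performed this potentially expensive computation.

Thanks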

2015-06-24 ministry

By z-score, do you mean something like (x - x.mean()) / x.std()? –

Yes, that is a common way to refer to the "standard score". – ministry
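For reference, here is a minimal sketch of the z-score discussed in the comments, checked against StandardScaler. The random data and the variable names are made up for illustration; the only point is that StandardScaler computes the same per-column quantity, using the population standard deviation (ddof=0, which is also NumPy's default).

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 200 samples, 4 features, deliberately not centered or unit-scaled.
rng = np.random.RandomState(0)
X = rng.normal(loc=5.0, scale=3.0, size=(200, 4))

# z-score by hand: subtract each column's mean, divide by each column's std.
z_manual = (X - X.mean(axis=0)) / X.std(axis=0)

# The same transformation done by StandardScaler.
z_scaler = StandardScaler().fit_transform(X)

print(np.allclose(z_manual, z_scaler))
```

Note the parentheses around the subtraction: without them, only the mean would be divided by the standard deviation.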

Is this what you are looking for?

from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression


X, y = make_classification(n_samples=1000, n_features=100, weights=[0.1, 0.9], random_state=0)
X.shape  # (1000, 100)

# build the pipeline: first standardize by subtracting the mean and dividing by the std,
# then do classification
# class_weight='balanced' weights classes inversely to their frequency
pipe = make_pipeline(StandardScaler(), LogisticRegression(class_weight='balanced'))

# fit: the scaler learns the mean/std from X, then the classifier is trained on the scaled data
pipe.fit(X, y)
# predict: the stored mean/std are applied to the input automatically
pipe.predict_proba(X)

# to get back the mean/std learned during fit
scaler = pipe.steps[0][1]
scaler.mean_
Out[12]: array([ 0.0313, -0.0334, 0.0145, ..., -0.0247, 0.0191, 0.0439])

# per-feature standard deviation used for scaling
scaler.scale_
Out[13]: array([ 1. , 1.0553, 0.9805, ..., 1.0033, 1.0097, 0.9884])
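On the second part of the question: LogisticRegression has no built-in standardization option, but with a pipeline like the one above nothing has to be recomputed by hand at prediction time. The sketch below is only an illustration (the train/new split and the smaller feature count are assumptions, not part of the original answer); it checks that the pipeline's predict_proba() on unseen data matches scaling that data manually with the mean and standard deviation learned during fit.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical setup: hold out part of the data to play the role of unlabeled input.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_new, y_train, _ = train_test_split(X, y, random_state=0)

pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X_train, y_train)

# The pipeline rescales X_new with the *training* mean/std before predicting.
proba_pipe = pipe.predict_proba(X_new)

# The same computation done by hand, reusing the fitted statistics.
scaler = pipe.named_steps["standardscaler"]
clf = pipe.named_steps["logisticregression"]
X_new_scaled = (X_new - scaler.mean_) / scaler.scale_
proba_manual = clf.predict_proba(X_new_scaled)

print(np.allclose(proba_pipe, proba_manual))
```

The statistics are computed once, during fit, and reused for every later call, so new inputs are never standardized with their own mean and standard deviation.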

2015-06-25 00:03:13
