2017-04-04

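We know that logistic regression predicts 1 when the sigmoid of theta times X is greater than 0.5. I want to raise precision, so I would like to change the prediction function so that it predicts 1 only when that value exceeds 0.7, or some other threshold above 0.5. How can I change this prediction threshold in sklearn?

If I had written the algorithm myself, this would be easy. But with the sklearn package, I don't know how to do it.

Can anyone help?

To make the question clear, here is the prediction function written in Octave: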

p = sigmoid(X*theta);

for i=1:size(p)(1)
    if p(i) >= 0.6
        p(i) = 1;
    else
        p(i) = 0;
    endif;
endfor

Answers

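The LogisticRegression estimator in sklearn has a predict_proba method, which outputs the probability that an input example belongs to each class. You can compare that probability against a threshold of your own choosing to get the behavior you want.

An example: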

from sklearn import linear_model
import numpy as np

np.random.seed(1337)  # Seed the RNG for reproducibility
X = np.random.random((10, 5))  # Sample features: 10 examples, 5 features
Y = np.random.randint(2, size=10)  # Random binary labels

lr = linear_model.LogisticRegression().fit(X, Y)

# Column 1 holds the probability of class 1 (columns follow lr.classes_)
prob_example_is_one = lr.predict_proba(X)[:, 1]

threshold = 0.7  # Our custom decision threshold
predict_greater_than_threshold = prob_example_is_one > threshold  # Boolean array
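Since the goal in the question is higher precision, it may help to see how the threshold changes the metrics. The following is a minimal sketch, not part of the original answer: the synthetic dataset from make_classification, the helper name predict_with_threshold, and the thresholds tried are all assumptions for illustration. Raising the threshold usually trades recall for precision, but that is not guaranteed on every dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic data, purely illustrative (not from the original question)
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
prob_one = clf.predict_proba(X_test)[:, 1]  # probability of class 1


def predict_with_threshold(probabilities, threshold):
    """Hypothetical helper: label 1 only when the probability exceeds threshold."""
    return (probabilities > threshold).astype(int)


positives = {}
for threshold in (0.5, 0.7, 0.9):
    y_pred = predict_with_threshold(prob_one, threshold)
    positives[threshold] = int(y_pred.sum())
    precision = precision_score(y_test, y_pred, zero_division=0)
    recall = recall_score(y_test, y_pred, zero_division=0)
    print(f"threshold={threshold}: precision={precision:.3f}, "
          f"recall={recall:.3f}, predicted positives={positives[threshold]}")
```

A stricter threshold can only shrink the set of predicted positives, so it is worth checking recall alongside precision before settling on a value.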

Here is the predict_proba docstring:

Probability estimates. 

The returned estimates for all classes are ordered by the 
label of classes. 

For a multi_class problem, if multi_class is set to be "multinomial" 
the softmax function is used to find the predicted probability of 
each class. 
Else use a one-vs-rest approach, i.e calculate the probability 
of each class assuming it to be positive using the logistic function. 
and normalize these values across all the classes. 

Parameters 
---------- 
X : array-like, shape = [n_samples, n_features] 

Returns 
------- 
T : array-like, shape = [n_samples, n_classes] 
    Returns the probability of the sample for each class in the model, 
    where classes are ordered as they are in ``self.classes_``. 
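To connect the docstring back to the sigmoid(X*theta) form used in the question, here is a small sketch under stated assumptions: the random data and the "ham"/"spam" labels are made up for illustration, and the default LogisticRegression settings are used on a two-class problem. It looks up the probability column through classes_ rather than hard-coding an index, and checks that this column matches the logistic function applied to the fitted coefficients and intercept.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 3))
# Hypothetical string labels, to show that column order follows classes_
y = np.where(X[:, 0] + 0.5 * rng.normal(size=200) > 0, "spam", "ham")

clf = LogisticRegression().fit(X, y)
proba = clf.predict_proba(X)  # shape: (n_samples, n_classes)

# Find the column for the class of interest instead of assuming its position
spam_col = list(clf.classes_).index("spam")

# Theta in the question corresponds to coef_ plus intercept_ here
z = X @ clf.coef_.ravel() + clf.intercept_[0]
sigmoid = 1.0 / (1.0 + np.exp(-z))

print("classes:", clf.classes_)
print("proba shape:", proba.shape)
print("matches sigmoid:", np.allclose(proba[:, spam_col], sigmoid))
```

Looking the column up through classes_ keeps the thresholding code correct even when the labels are not 0 and 1.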

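This works for both binary and multiclass classification. Note that when no class probability exceeds the threshold, np.argmax returns 0, so the sample is assigned the first class in clf.classes_: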

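from sklearn.linear_model import LogisticRegression
import numpy as np

# Supply your own data before running:
# X = training features, y = training labels, X_test = test features

threshold = 0.7  # Minimum probability a class must exceed

clf = LogisticRegression()
clf.fit(X, y)

# One row per sample, one column per class, ordered as in clf.classes_
probabilities = clf.predict_proba(X_test)

# Pick the first class whose probability exceeds the threshold.
# If no class does, argmax returns 0, i.e. the first class in clf.classes_.
predictions = clf.classes_[np.argmax(probabilities > threshold, axis=1)]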
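If silently falling back to the first class is not what you want in the multiclass case, one alternative is to make the "no class is confident enough" case explicit. The sketch below is an assumption-laden variant, not the answer's method: the helper predict_or_abstain and the sentinel label -1 are hypothetical, and the iris dataset is used only to have something runnable.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)


def predict_or_abstain(model, X, threshold, abstain_label=-1):
    """Hypothetical helper: return the most probable class, or abstain_label
    when that class's probability is below the threshold."""
    proba = model.predict_proba(X)
    best = np.argmax(proba, axis=1)          # index of the most probable class
    confident = proba.max(axis=1) >= threshold
    labels = model.classes_[best]
    return np.where(confident, labels, abstain_label)


preds = predict_or_abstain(clf, X, threshold=0.7)
print("abstained on", int(np.sum(preds == -1)), "of", len(preds), "samples")
```

The sentinel must not collide with a real class label; with string labels, something like "unsure" would play the same role.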