
Training a logistic regression model with features of different dimensions in scikit-learn

I am using Python 2.7 on Windows. For a classification problem, I want to fit a logistic regression model using features T1 and T2, with T3 as the target.

Below I show the values of T1 and T2, along with my code. The problem is: since T1 has 4 feature columns per sample while T2 has only 1, how should they be preprocessed so that scikit-learn's logistic regression can train on them correctly?

To be clear: training sample 1 has T1 features [0, -1, -2, -3] and T2 feature [0]; training sample 2 has T1 features [1, 0, -1, -2] and T2 feature [1]; and so on.

import numpy as np 
from sklearn import linear_model, datasets 

arc = lambda r,c: r-c 
T1 = np.array([[arc(r,c) for c in xrange(4)] for r in xrange(5)]) 
print T1 
print type(T1) 
T2 = np.array([[arc(r,c) for c in xrange(1)] for r in xrange(5)]) 
print T2 
print type(T2) 
T3 = np.array([0,0,1,1,1]) 

logreg = linear_model.LogisticRegression(C=1e5) 

# fit a logistic regression classifier,
# using T1 and T2 as features and T3 as target
logreg.fit(T1+T2, T3) 

T1:

[[ 0 -1 -2 -3] 
[ 1 0 -1 -2] 
[ 2 1 0 -1] 
[ 3 2 1 0] 
[ 4 3 2 1]] 

T2:

[[0] 
[1] 
[2] 
[3] 
[4]] 

Answer

You need to concatenate the feature matrices with numpy.concatenate.
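
Note that T1 + T2 in the original code does not combine the two feature sets: NumPy broadcasts the (5, 1) array T2 across the (5, 4) array T1 and adds them element-wise, so the result is still a (5, 4) array and the T2 column never becomes a separate feature. A minimal shape check, reusing the arrays defined in the question:

import numpy as np

T1 = np.array([[r - c for c in xrange(4)] for r in xrange(5)])  # shape (5, 4)
T2 = np.array([[r - c for c in xrange(1)] for r in xrange(5)])  # shape (5, 1)

print (T1 + T2).shape                         # (5, 4): broadcast addition, not concatenation
print np.concatenate((T1, T2), axis=1).shape  # (5, 5): 5 samples, 5 features each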

import numpy as np 
from sklearn import linear_model, datasets 

arc = lambda r,c: r-c 
T1 = np.array([[arc(r,c) for c in xrange(4)] for r in xrange(5)]) 
T2 = np.array([[arc(r,c) for c in xrange(1)] for r in xrange(5)]) 
T3 = np.array([0,0,1,1,1]) 

X = np.concatenate((T1,T2), axis=1) 
Y = T3 
logreg = linear_model.LogisticRegression(C=1e5) 

# fit a logistic regression classifier on the concatenated features,
# using the 4 T1 columns plus the 1 T2 column as features and T3 as target
logreg.fit(X, Y) 

X_test = np.array([[1, 0, -1, -1, 1],
                   [0, 1, 2, 3, 4]])

print logreg.predict(X_test) 
Thanks 大元, upvoted. After concatenation, will logistic regression treat each training sample as having 5 features? –

By the way, what is the purpose of 'X_test'? –

@LinMa Yes, you are right. After concatenation each sample has 5 features (the 4 features from T1 plus the 1 feature from T2). X_test is just for testing :) –
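
For reference, a quick way to confirm the feature count the fitted model actually sees (using the X and logreg objects from the answer above):

print X.shape             # (5, 5): 5 training samples, 5 features each
print logreg.coef_.shape  # (1, 5): one coefficient per feature for this binary problem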
