Sklearn digits dataset

import matplotlib.pyplot as plt 

from sklearn import datasets 
from sklearn import svm 

digits = datasets.load_digits() 

print(digits.data) 

classifier = svm.SVC(gamma=0.4, C=100) 
x, y = digits.data[:-1], digits.target[:-1] 

x = x.reshape(1,-1) 
y = y.reshape(-1,1) 
print((x)) 

classifier.fit(x, y) 
### 
print('Prediction:', classifier.predict(digits.data[-3])) 
### 
plt.imshow(digits.images[-1], cmap=plt.cm.gray_r, interpolation='nearest') 
plt.show()

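I also reshaped x and y. I still get an error saying:

Found input variables with inconsistent numbers of samples: [1, 1796]

y has 1796 elements, while x has many 1-D arrays. How does it show 1 for x?

Source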

2016-10-25 linthum

Actually, scrap what I suggested further below:

This link describes the general dataset API. The attribute data is a 2D array in which each row is one image, already flattened:

import sklearn.datasets 
digits = sklearn.datasets.load_digits() 
digits.data.shape 
#: (1797, 64)

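That is all that needs to be provided; no reshaping is required. Similarly, the attribute target is a 1D array with one label per image:

digits.target.shape 
#: (1797,)

No reshaping is necessary there either. Just split into training and test sets and run.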
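A minimal sketch of that advice, assuming scikit-learn's train_test_split helper for the split. The 25% test size, the random_state and gamma=0.001 are illustrative choices of mine, not values from the question:

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()

# data is already (n_samples, 64) and target is (n_samples,): no reshaping
X, y = digits.data, digits.target

# Assumption: hold out 25% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Assumption: gamma=0.001 is an illustrative value, not the question's setting
classifier = svm.SVC(gamma=0.001, C=100)
classifier.fit(X_train, y_train)

# predict() takes a 2D array, so slice rows instead of indexing a single one
predictions = classifier.predict(X_test[:5])
print('Predicted:', predictions)
print('Actual:   ', y_test[:5])
```

Slicing with X_test[:5] keeps the input two-dimensional, which is the shape predict() expects.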

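Try printing x.shape and y.shape. I think you will find something like (1, 1796, ...) and (1796, ...) respectively. When you call fit on a classifier in scikit, it needs two iterables with the same number of samples, i.e. the same first dimension.

Hint: why are the arguments to reshape the opposite way around in these two lines?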

x = x.reshape(1, -1) 
y = y.reshape(-1, 1)

Maybe try:

x = x.reshape(-1, 1)
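To see what these reshape calls actually do to the shapes, here is a small sketch. It uses zero-filled stand-in arrays (my assumption, to avoid loading the dataset) with the same shapes as digits.data[:-1] and digits.target[:-1]:

```python
import numpy as np

# Stand-in arrays with the same shapes as digits.data[:-1], digits.target[:-1]
x = np.zeros((1796, 64))
y = np.zeros(1796, dtype=int)

print(x.shape, y.shape)          # (1796, 64) (1796,)

# reshape(1, -1) collapses everything into ONE sample with 1796 * 64 features
print(x.reshape(1, -1).shape)    # (1, 114944)

# reshape(-1, 1) turns every pixel into its own sample with a single feature
print(x.reshape(-1, 1).shape)    # (114944, 1)

# fit() compares the first dimension of x with the length of y;
# without any reshape they already agree
print(x.shape[0] == y.shape[0])  # True
```

Neither reshape leaves 1796 in the first dimension, which is why fit complains about inconsistent numbers of samples in both cases.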

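Completely unrelated to your question, but you are predicting on digits.data[-3] when the only element left out of the training set is digits.data[-1]. Not sure whether that was intentional.

Either way, it is better to check your classifier with the scikit metrics package to get more complete results. This page has an example of using it over the digits dataset.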
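A rough sketch of what such a check could look like, using sklearn.metrics on a held-out half of the data. The half/half split without shuffling and gamma=0.001 are assumptions of mine for illustration, not taken from the question:

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()

# Assumption: train on the first half, evaluate on the second half
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.5, shuffle=False)

# Assumption: gamma=0.001 is an illustrative value
classifier = svm.SVC(gamma=0.001)
classifier.fit(X_train, y_train)
predicted = classifier.predict(X_test)

# Per-class precision, recall and F1, plus the overall accuracy
print(metrics.classification_report(y_test, predicted))

# Rows are true digits, columns are predicted digits
cm = metrics.confusion_matrix(y_test, predicted)
print(cm)

accuracy = metrics.accuracy_score(y_test, predicted)
print('Accuracy:', accuracy)
```

Evaluating on many held-out images says far more about the classifier than eyeballing one or two predictions.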

Source

2016-10-25 13:18:58 SCB

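It gives an error: 'Found input variables with inconsistent numbers of samples: [114944, 1796]' – linthum

@linthum What is the result of printing 'x.shape' and 'y.shape' like I suggested? – SCB

@linthum Actually, I made some changes. We were both wrong. – SCB

The reshape turns each 8x8 image matrix into a 1-dimensional vector that can be used as features. You need to reshape the entire X array, not only the training part, because the samples you later use for prediction must have the same format.

The following code demonstrates how: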

import matplotlib.pyplot as plt 

from sklearn import datasets 
from sklearn import svm 

digits = datasets.load_digits() 


classifier = svm.SVC(gamma=0.4, C=100) 
x, y = digits.images, digits.target 

# Only reshape x, since each image is an 8x8 matrix and needs to be flattened 
n_samples = len(digits.images) 
x = x.reshape((n_samples, -1)) 
print("before reshape:" + str(digits.images[0])) 
print("After reshape" + str(x[0])) 


classifier.fit(x[:-2], y[:-2]) 
### 
# predict() expects a 2D array of shape (n_samples, n_features) 
print('Prediction:', classifier.predict(x[-2].reshape(1, -1))) 
### 
plt.imshow(digits.images[-2], cmap=plt.cm.gray_r, interpolation='nearest') 
plt.show() 

### 
# predict() expects a 2D array of shape (n_samples, n_features) 
print('Prediction:', classifier.predict(x[-1].reshape(1, -1))) 
### 
plt.imshow(digits.images[-1], cmap=plt.cm.gray_r, interpolation='nearest') 
plt.show()

It will output:

before reshape:[[ 0. 0. 5. 13. 9. 1. 0. 0.] 
[ 0. 0. 13. 15. 10. 15. 5. 0.] 
[ 0. 3. 15. 2. 0. 11. 8. 0.] 
[ 0. 4. 12. 0. 0. 8. 8. 0.] 
[ 0. 5. 8. 0. 0. 9. 8. 0.] 
[ 0. 4. 11. 0. 1. 12. 7. 0.] 
[ 0. 2. 14. 5. 10. 12. 0. 0.] 
[ 0. 0. 6. 13. 10. 0. 0. 0.]] 
After reshape[ 0. 0. 5. 13. 9. 1. 0. 0. 0. 0. 13. 15. 10. 15. 5. 
    0. 0. 3. 15. 2. 0. 11. 8. 0. 0. 4. 12. 0. 0. 8. 
    8. 0. 0. 5. 8. 0. 0. 9. 8. 0. 0. 4. 11. 0. 1. 
    12. 7. 0. 0. 2. 14. 5. 10. 12. 0. 0. 0. 0. 6. 13. 
    10. 0. 0. 0.]

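followed by the correct predictions for the last 2 images, which were not used for training. You may, however, want to make a larger split between the test and training sets.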
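As a side check on the reshape step used in the code above, the flattened images can be compared against digits.data. This is only a sketch; the names flat and single are mine:

```python
import numpy as np
from sklearn import datasets

digits = datasets.load_digits()

# Flatten every 8x8 image into a 64-element row
flat = digits.images.reshape(len(digits.images), -1)
print(digits.images.shape)  # (1797, 8, 8)
print(flat.shape)           # (1797, 64)

# The flattened images hold the same values as digits.data
print(np.array_equal(flat, digits.data))  # True

# A single sample must be given a leading axis before it goes to predict()
single = flat[-1].reshape(1, -1)
print(single.shape)         # (1, 64)
```

So flattening digits.images by hand and using digits.data directly give the classifier the same input.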

Source

2016-10-25 13:54:06 DJanssens
