2016-08-14 85 views
from sklearn.neighbors import KNeighborsClassifier 
import pandas as pd 
from sklearn import metrics 
from sklearn.cross_validation import train_test_split 
import matplotlib.pyplot as plt 

r = pd.read_csv("vitalsign_test.csv") 
clm_list = [] 
for column in r.columns: 
    clm_list.append(column) 
X = r[clm_list[1:len(clm_list)-1]].values 
y = r[clm_list[len(clm_list)-1]].values 

X_train, X_test, y_train, y_test = train_test_split (X,y, test_size = 0.3, random_state=4) 


k_range = range(1,25) 
scores = [] 
for k in k_range: 
    clf = KNeighborsClassifier(n_neighbors = k) 
    clf.fit(X_train,y_train) 
y_pred = clf.predict(X_test) 
scores.append(metrics.accuracy_score(y_test,y_pred)) 

plt.plot(k_range,scores) 
plt.xlabel('value of k for clf') 
plt.ylabel('testing accuracy') 

I get the following error: x and y must have same first dimension. How do I solve it?

ValueError: x and y must have same first dimension

The shapes of my features and response:

y.shape 
Out[60]: (500,) 

X.shape 
Out[61]: (500, 6) 

Answer


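It has nothing to do with your X and y. The error is about the x and y arguments to plt.plot: your scores list has one element while k_range has 24. The cause is incorrect indentation: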

for k in k_range: 
    clf = KNeighborsClassifier(n_neighbors = k) 
    clf.fit(X_train,y_train) 
y_pred = clf.predict(X_test) 
scores.append(metrics.accuracy_score(y_test,y_pred)) 

should be

for k in k_range: 
    clf = KNeighborsClassifier(n_neighbors = k) 
    clf.fit(X_train,y_train) 
    y_pred = clf.predict(X_test) 
    scores.append(metrics.accuracy_score(y_test,y_pred)) 
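
To see the length mismatch without scikit-learn or matplotlib, here is a minimal sketch. The evaluate function is a hypothetical stand-in for the fit, predict and accuracy_score steps and returns a placeholder number, not a real accuracy; only the list lengths matter.

```python
k_range = range(1, 25)  # 24 values of k: 1 through 24


def evaluate(k):
    """Hypothetical stand-in for fit + predict + accuracy_score."""
    return 1.0 / k  # placeholder value, not a real accuracy


# Mis-indented version: append sits outside the loop body,
# so it runs once, after the loop has finished.
bad_scores = []
for k in k_range:
    score = evaluate(k)
bad_scores.append(score)

# Corrected version: append is part of the loop body,
# so one score is stored per value of k.
good_scores = []
for k in k_range:
    score = evaluate(k)
    good_scores.append(score)

print(len(k_range), len(bad_scores), len(good_scores))
```

Passing k_range together with a one-element list to plt.plot is what raises the ValueError; comparing len(k_range) with len(scores) before plotting is a quick way to catch this.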

This works. I am new to Python and still learning. Thanks a lot!
