Confused by sklearn's distance metrics: I want to use the standardized Euclidean metric with KNeighborsClassifier. I tried
knn = KNeighborsRegressor(n_neighbors=k, metric='seuclidean')
knn.fit(newx, y)
and it raises the following TypeError:
C:\Anaconda3\lib\site-packages\sklearn\neighbors\base.py in fit(self, X, y)
741 X, y = check_X_y(X, y, "csr", multi_output=True)
742 self._y = y
--> 743 return self._fit(X)
744
745
C:\Anaconda3\lib\site-packages\sklearn\neighbors\base.py in _fit(self, X)
238 self._tree = BallTree(X, self.leaf_size,
239 metric=self.effective_metric_,
--> 240 **self.effective_metric_params_)
241 elif self._fit_method == 'kd_tree':
242 self._tree = KDTree(X, self.leaf_size,
sklearn\neighbors\binary_tree.pxi in sklearn.neighbors.ball_tree.BinaryTree.__init__ (sklearn\neighbors\ball_tree.c:9220)()
sklearn\neighbors\dist_metrics.pyx in sklearn.neighbors.dist_metrics.DistanceMetric.get_metric (sklearn\neighbors\dist_metrics.c:4821)()
sklearn\neighbors\dist_metrics.pyx in sklearn.neighbors.dist_metrics.SEuclideanDistance.__init__ (sklearn\neighbors\dist_metrics.c:6399)()
TypeError: __init__() takes exactly 1 positional argument (0 given)
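Judging from the SEuclideanDistance.__init__ frame in the traceback, the 'seuclidean' metric seems to expect a variance vector V that was never supplied. A minimal sketch of how that parameter can be passed through metric_params (the data here is just a placeholder, and treating V as the per-feature variance is my assumption):

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# placeholder data, only to illustrate the call
X = np.random.rand(20, 2)
y = np.random.rand(20)

# 'seuclidean' needs a per-feature variance vector V via metric_params;
# without it, SEuclideanDistance.__init__ raises the TypeError shown above
knn = KNeighborsRegressor(n_neighbors=3, metric='seuclidean',
                          metric_params={'V': X.var(axis=0)})
knn.fit(X, y)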
So I just wrote my own function to implement KNN, like this:
import numpy as np
from sklearn.preprocessing import StandardScaler

x = np.random.randint(0, 10, (10, 2))
y = np.random.randint(0, 10, (10, 1))
testx = np.random.randint(0, 10, (1, 2))

# standardize the training and test points
sds = StandardScaler()
sds.fit(x)
newx = sds.transform(x)
newtestx = sds.transform(testx)

# plain Euclidean distance in the standardized space
distance = np.sqrt(((newtestx - newx) ** 2).sum(axis=1))
for k in range(1, 8):
    kn = distance.argsort()[:k]
    print(y[kn].mean(), '%' * 10, k)
and with sklearn:
from sklearn.neighbors import KNeighborsRegressor

for k in range(1, 8):
    knn = KNeighborsRegressor(n_neighbors=k, metric='seuclidean',
                              metric_params={'V': x.std(axis=0)})
    knn.fit(x, y)
    print(knn.predict(testx)[0], '%' * 10, k)
The two results above are not the same. Why?
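One way to narrow this down is to compare the raw distances directly instead of the kNN predictions. A sketch of such a check, under the assumption that seuclidean's V parameter is the per-feature variance (the squared standard deviation):

import numpy as np
from scipy.spatial.distance import cdist
from sklearn.preprocessing import StandardScaler

x = np.random.randint(0, 10, (10, 2)).astype(float)
testx = np.random.randint(0, 10, (1, 2)).astype(float)

# Euclidean distance after StandardScaler (which divides each feature by its std)
sds = StandardScaler().fit(x)
d_manual = np.sqrt(((sds.transform(testx) - sds.transform(x)) ** 2).sum(axis=1))

# seuclidean divides each squared difference by V, so V is taken as the variance here
d_seuclidean = cdist(testx, x, metric='seuclidean', V=x.var(axis=0)).ravel()

print(np.allclose(d_manual, d_seuclidean))  # expected True under this assumption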
Do you get the same error if you replace 'newx' and 'y' with the 'X' and 'y' from the documentation example at the bottom of this page? http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsRegressor.html –