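sklearn's decision boundary changes every time I run my code

In Udacity's Intro to Machine Learning course, I've found that my code's results can change from one run to the next. The correct values are acc_min_samples_split_2 = .908 and acc_min_samples_split_50 = .912, but when I run my script, acc_min_samples_split_2 sometimes comes out as .912 as well. This happens both on my local machine and in the web interface within Udacity. Why does this happen?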
The program uses the scikit-learn library for Python, which uses a PRNG internally to generate random numbers. Below is the part of the code I wrote:
def classify(features, labels, samples):
    # Creates a new Decision Tree Classifier and fits it to the sample data
    # with the specified min_samples_split value
    from sklearn import tree
    clf = tree.DecisionTreeClassifier(min_samples_split=samples)
    clf = clf.fit(features, labels)
    return clf
#Create a classifier with a min sample split of 2, and test its accuracy
clf2 = classify(features_train, labels_train, 2)
acc_min_samples_split_2 = clf2.score(features_test,labels_test)
#Create a classifier with a min sample split of 50, and test its accuracy
clf50 = classify(features_train, labels_train, 50)
acc_min_samples_split_50 = clf50.score(features_test,labels_test)
def submitAccuracies():
    return {"acc_min_samples_split_2": round(acc_min_samples_split_2, 3),
            "acc_min_samples_split_50": round(acc_min_samples_split_50, 3)}

print(submitAccuracies())
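
A likely explanation for the run-to-run variation is the internal PRNG mentioned above: DecisionTreeClassifier shuffles the candidate features at each split, so ties between equally good splits can be broken differently on each run unless random_state is fixed. The sketch below shows one way to check this. The seed argument, the seed value 42, and the synthetic data from make_classification are assumptions for illustration; they are not part of the course code, and the accuracies will not match the .908 and .912 values from the course data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def classify(features, labels, samples, seed=None):
    # random_state seeds the estimator's internal PRNG; with a fixed seed,
    # repeated fits on the same data build the same tree.
    # (the seed parameter is a hypothetical addition to the original function)
    clf = DecisionTreeClassifier(min_samples_split=samples, random_state=seed)
    return clf.fit(features, labels)


# Synthetic stand-in for the course data (assumption, illustration only)
X, y = make_classification(n_samples=1000, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
features_train, features_test, labels_train, labels_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit twice with the same seed and compare the test accuracies
acc_first = classify(features_train, labels_train, 2, seed=42).score(
    features_test, labels_test)
acc_second = classify(features_train, labels_train, 2, seed=42).score(
    features_test, labels_test)

print(round(acc_first, 3), round(acc_second, 3))
```

Leaving seed as None keeps the original behavior, where each run may draw a different tie-breaking order and so report a slightly different accuracy.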