Number of features of the model must match the input?
I want to use RandomForestClassifier on some data I have. The code is as follows:
print train_data[0,0:20]
print train_data[0,21::]
print test_data[0]
print 'Training...'
forest = RandomForestClassifier(n_estimators=100)
forest = forest.fit(train_data[0::,0::20], train_data[0::,21::])
print 'Predicting...'
output = forest.predict(test_data)
But this produces the following error:
ValueError: Number of features of the model must match the input. Model n_features is 3 and input n_features is 21
The output of the first three print statements is:
[ 0. 0. 0. 0. 1. 0.
0. 0. 0. 0. 1. 0.
0. 0. 0. 37.7745986 -122.42589168
0. 0. 0. ]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
1. 0.]
[ 0. 0. 0. 0. 0. 0.
0. 1. 0. 0. 1. 0.
0. 0. 0. 0. 37.73505101
-122.3995877 0. 0. 0. ]
I assumed the data was in the correct format for my fit/predict calls, but it fails on predict. Can anyone see what I am doing wrong here?
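I support this answer and would like to add that you do not need the extra assignment when fitting the classifier. Your fit line should look like 'forest.fit(train_data[:, :21], train_data[:, 21:])' (assuming the first 21 columns, indices 0 through 20, are the features and the remaining columns, from index 21 to the last, are the labels) – lanenok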
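To see why the model reports 3 features, it may help to compare the two slices directly. The sketch below uses a plain Python list of column indices rather than the real train_data. NumPy column slicing follows the same start:stop:step rules. The total of 59 columns is an assumption, inferred from the printed output in the question (21 values in a test row plus 38 label values).

```python
# Hypothetical stand-in for the column indices of one training row.
# 59 columns is an assumption: 21 feature columns + 38 label columns,
# inferred from the printed output in the question.
n_columns = 59
columns = list(range(n_columns))

# 0::20 means "start at 0, no stop, step 20", so it picks every 20th column.
step_slice = columns[0::20]

# 0:21 means "start at 0, stop before 21", so it picks the first 21 columns.
feature_slice = columns[0:21]

# 21: means "start at 21, run to the end", so it picks the label columns.
label_slice = columns[21:]

print(step_slice)          # [0, 20, 40]
print(len(step_slice))     # 3
print(len(feature_slice))  # 21
print(len(label_slice))    # 38
```

Under that assumption, train_data[0::,0::20] hands fit only columns 0, 20 and 40, which matches "Model n_features is 3" in the error, while each test_data row has 21 values. Note also that the first print statement uses 0:20, which shows only 20 columns and leaves out column 20.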