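I'm not sure exactly what is going wrong. My goal is to write cross-validation code in Python. I know there are various metrics, but I believe I am using the right one. Instead of the 10-fold CV result I expected, I get an error when using scikit-learn's accuracy_score: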

"invalid index to scalar variable"

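I found this explanation on StackOverflow: "IndexError: invalid index to scalar variable" is raised when you try to index a NumPy scalar such as numpy.int64 or numpy.float64. It is very similar to "TypeError: 'int' object has no attribute '__getitem__'", which you get when you try to index an int.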
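The behaviour described above can be reproduced without scikit-learn. The following is a minimal sketch, assuming only NumPy is installed; the helper try_index and the sample array are made up for illustration and are not part of the code in question. On Python 3 the message for a plain int is worded differently ("not subscriptable"), but it is still a TypeError.

```python
import numpy as np

# Hypothetical stand-in for the 1-D array that predict() returns.
pred = np.array([0.0, 1.0, 1.0])


def try_index(value):
    """Return a description of the exception raised by value[1], or None."""
    try:
        value[1]
    except (IndexError, TypeError) as exc:
        return type(exc).__name__ + ": " + str(exc)
    return None


scalar_error = try_index(pred[0])  # a NumPy scalar cannot be indexed
int_error = try_index(3)           # neither can a plain Python int
array_ok = try_index(pred)         # the array itself can, so this is None

print(scalar_error)
print(int_error)
print(array_ok)
```

Indexing the array works, while indexing one of its elements does not, which is the situation the error message describes.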

Any help would be appreciated.

I am trying to follow: http://scikit-learn.org/stable/modules/model_evaluation.html

from sklearn.ensemble import RandomForestClassifier 
from sklearn import cross_validation 
from numpy import genfromtxt 
import numpy as np 
from sklearn.metrics import accuracy_score 

def main(): 
    #read in data, parse into training and target sets 
    dataset = genfromtxt(open('D:\\CA_DataPrediction_TrainData\\CA_DataPrediction_TrainDataGenetic.csv','r'), delimiter=',', dtype='f8')[1:] 
    target = np.array([x[0] for x in dataset]) 
    train = np.array([x[1:] for x in dataset]) 

    #In this case we'll use a random forest, but this could be any classifier 
    cfr = RandomForestClassifier(n_estimators=10) 

    #Simple K-Fold cross validation. 10 folds. 
    cv = cross_validation.KFold(len(train), k=10, indices=False) 

    #iterate through the training and test cross validation segments and 
    #run the classifier on each one, aggregating the results into a list 
    results = [] 
    for traincv, testcv in cv: 
        pred = cfr.fit(train[traincv], target[traincv]).predict(train[testcv])
        results.append(accuracy_score(target[testcv], [x[1] for x in pred]))

    #print out the mean of the cross-validated results 
    print "Results: " + str(np.array(results).mean()) 

if __name__=="__main__": 
    main() 

Answer

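The pred variable is just a one-dimensional array of predictions, so each of its elements is a scalar and cannot be indexed (this is the cause of the error):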

results.append(accuracy_score(target[testcv], [x[1] for x in pred])) 

should be

results.append(accuracy_score(target[testcv], pred)) 

or, if you really want a copy:

results.append(accuracy_score(target[testcv], [x for x in pred]))
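For reference, here is a sketch of the whole loop with the fix applied, written against the newer sklearn.model_selection API (the sklearn.cross_validation module used in the question was removed in later scikit-learn releases). It assumes Python 3 and a recent scikit-learn. The synthetic data from make_classification is only a stand-in for the CSV file in the question, and the random_state values are arbitrary choices for repeatability.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold

# Synthetic stand-in for the question's CSV: a feature matrix and a label vector.
train, target = make_classification(n_samples=200, n_features=8, random_state=0)

# Same classifier as in the question; any classifier could be used here.
cfr = RandomForestClassifier(n_estimators=10, random_state=0)

# Simple 10-fold cross-validation; split() yields integer index arrays.
cv = KFold(n_splits=10, shuffle=True, random_state=0)

results = []
for train_idx, test_idx in cv.split(train):
    # predict() returns a 1-D array of labels, so it is passed to
    # accuracy_score as-is instead of indexing each element.
    pred = cfr.fit(train[train_idx], target[train_idx]).predict(train[test_idx])
    results.append(accuracy_score(target[test_idx], pred))

# Mean accuracy over the 10 folds.
mean_accuracy = float(np.mean(results))
print("Results: " + str(mean_accuracy))
```

If only the per-fold scores are needed, cross_val_score from sklearn.model_selection does the same loop in one call.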