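I have a set of documents and a set of labels. Right now I am using train_test_split to split my dataset in a 90:10 ratio. However, I would like to use KFold cross-validation instead. How do I do a K-fold cross-validation split into train and test sets?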
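# Older scikit-learn; in newer releases this lives in sklearn.model_selection.
from sklearn.cross_validation import train_test_split

# Read one whitespace-tokenised document per line.
train = []
with open("/Users/rte/Documents/Documents.txt") as f:
    for line in f:
        train.append(line.strip().split())

# Read the matching labels, one line per document.
labels = []
with open("/Users/rte/Documents/Labels.txt") as t:
    for line in t:
        labels.append(line.strip().split())

X_train, X_test, Y_train, Y_test = train_test_split(train, labels, test_size=0.1, random_state=42)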
When I try the approach given in the scikit-learn documentation, I get an error:
from sklearn.cross_validation import KFold

kf = KFold(len(train), n_folds=3)
for train_index, test_index in kf:
    X_train, X_test = train[train_index], train[test_index]
    y_train, y_test = labels[train_index], labels[test_index]
Error:
X_train, X_test = train[train_index],train[test_index]
TypeError: only integer arrays with one element can be converted to an index
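The traceback points at indexing a plain Python list with a whole array of indices: KFold yields NumPy integer arrays, and a list accepts only a single integer or a slice. One possible workaround, sketched here with a hypothetical helper `take` and toy data standing in for the real documents and labels, is to pick the elements one index at a time:

```python
def take(items, indices):
    """Return the elements of `items` at the given positions, one index at a time."""
    return [items[int(i)] for i in indices]

# Toy stand-ins for the tokenised documents and labels (hypothetical data).
train = [["a", "b"], ["c"], ["d", "e"], ["f"]]
labels = [["pos"], ["neg"], ["pos"], ["neg"]]

# In the real loop these index sequences would come from KFold.
train_index = [0, 2, 3]
test_index = [1]

X_train, X_test = take(train, train_index), take(train, test_index)
y_train, y_test = take(labels, train_index), take(labels, test_index)
print(X_test, y_test)
```

Converting the lists to NumPy arrays before the loop should also allow `train[train_index]` directly, although documents of different lengths would need an object-dtype array.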
How can I perform 10-fold cross-validation on my documents and labels?
What have you tried to get KFold cross-validation working? Have you seen the example on the [documentation page](http://scikit-learn.org/stable/modules/generated/sklearn.cross_validation.KFold.html#sklearn.cross_validation.KFold)?
Yes, I tried the example given there on my documents and labels, but I get an error: *X_train, X_test = train[train_index], train[test_index] TypeError: only integer arrays with one element can be converted to an index* – minks
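For reference, a minimal sketch of a 10-fold split over list data. It assumes a recent scikit-learn, where KFold lives in `sklearn.model_selection` and takes `n_splits` (the question's code uses the older `sklearn.cross_validation` API), and it uses toy lists in place of the real documents and labels:

```python
from sklearn.model_selection import KFold

# Toy stand-ins for the tokenised documents and labels (hypothetical data).
train = [["doc", str(i)] for i in range(20)]
labels = [[str(i % 2)] for i in range(20)]

kf = KFold(n_splits=10, shuffle=True, random_state=42)
fold_sizes = []
for train_index, test_index in kf.split(train):
    # train/labels are plain lists, so select elements one index at a time.
    X_train = [train[i] for i in train_index]
    X_test = [train[i] for i in test_index]
    y_train = [labels[i] for i in train_index]
    y_test = [labels[i] for i in test_index]
    fold_sizes.append((len(X_train), len(X_test)))

print(fold_sizes)
```

Each pass through the loop holds out a different tenth of the data, so a model would be fitted and scored inside the loop body, once per fold.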