Length of a vector and a DataFrame

I have a DataFrame made up of 12 columns. I cut one vector out of it and keep it separately, then I run train_test_split from the sklearn library as shown below:

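import pandas as pd
from sklearn.model_selection import train_test_split

# annual_inc, delinq_2yrs, etc. are assumed to be defined earlier (not shown)
X=pd.DataFrame()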

X['annua_inc']=annual_inc 
X['delinq_2yrs']=delinq_2yrs 
X['dti']=dti 
X['emp_length']=emp_length 
X['loan_amnt']=loan_amnt 
X['installment']=installment 
X['int_rate']=int_rate 
X['total_acc']=total_acc 
X['open_acc']=open_acc 
X['pub_rec']=pub_rec 
X['acc_now_delinq']=acc_now_delinq 
X['loan_stat']=loan_stat 

X=X.fillna(0) 
y=X['loan_stat'] 
X=X.drop(['loan_stat'], axis=1) 

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

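When I check the lengths of X_test and y_test (which should be the same), they return the same value. But when I try to call X_test[len(X_test)], it tells me the index is out of bounds for axis 0, whereas y_test[len(y_test)] gives me a valid value. Does anyone know why? The last row of X_test and the last row of y_test were previously combined in the same row of X, so why does the last row of X_test no longer exist while the one in y_test does?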

Source

2017-06-03 Blazej Kowalski

Python, pandas, numpy, scipy and other array types all use zero-based indexing. So the length of [0, 1, 2, 3] is 4, but [0, 1, 2, 3][4] is out of bounds. Use either [0, 1, 2, 3][4 - 1] or [0, 1, 2, 3][-1] instead.
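To make the off-by-one concrete, here is a minimal sketch using a plain Python list. The variable names are only for illustration and are not from the question.

```python
values = [0, 1, 2, 3]
n = len(values)                 # 4 elements, so valid positions are 0..3

last_explicit = values[n - 1]   # last element via length minus one
last_negative = values[-1]      # same element via negative indexing

try:
    values[n]                   # position 4 does not exist
    out_of_range = False
except IndexError:
    out_of_range = True
```

Both `last_explicit` and `last_negative` refer to the same final element, while indexing at `n` raises `IndexError`.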

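In your case the last element is (X_test is a DataFrame, where plain brackets select columns, so positional row access goes through .iloc)

X_test.iloc[len(X_test) - 1]

or

X_test.iloc[-1]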
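As a hedged illustration for the DataFrame case: plain square brackets on a pandas DataFrame look up columns, so positional access to a row is normally done through `.iloc`. The small frame below is a made-up stand-in for X_test, with shuffled row labels like those a random split would leave behind.

```python
import pandas as pd

# Hypothetical stand-in for X_test: three rows with shuffled index labels.
X_test = pd.DataFrame(
    {"dti": [1.5, 2.5, 3.5], "loan_amnt": [100, 200, 300]},
    index=[8, 1, 5],
)

last_row = X_test.iloc[-1]                 # last row by position
same_row = X_test.iloc[len(X_test) - 1]    # equivalent, via the length

try:
    X_test.iloc[len(X_test)]               # one past the end
    past_end = False
except IndexError:
    past_end = True
```

Here `last_row` and `same_row` are the same row (the one labelled 5), and asking for position `len(X_test)` raises `IndexError`.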

Source

2017-06-04 02:57:50 piRSquared

OK, but y_test has the same length as X_test, and it does return a value. y_test is a vector rather than an array, though, so maybe that is the reason?
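A likely explanation, sketched with made-up data: y_test is a pandas Series that keeps the original (shuffled) row labels of X, and on an integer index `y_test[k]` looks up the label `k`, not the position `k`. So `y_test[len(y_test)]` can succeed whenever some row happens to carry that label, even though it is not the last element. The values and labels below are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for y_test: labels are leftovers from the original frame.
y_test = pd.Series([10, 20, 30, 40], index=[7, 4, 0, 2])

n = len(y_test)                      # 4
by_label = y_test[n]                 # label lookup: the row whose label is 4
last_by_position = y_test.iloc[-1]   # positional lookup: the actual last element

try:
    y_test.iloc[n]                   # positional access one past the end
    past_end = False
except IndexError:
    past_end = True
```

In this sketch `by_label` is the second element, not the last one, which is why the call looks like it "works". Positional access via `.iloc` behaves the same way for the Series as for the DataFrame.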
