2016-03-14
1

I am trying to write a function that removes features that are highly correlated with each other. However, I get the error AttributeError: 'numpy.ndarray' object has no attribute 'columns'.

I just want pandas to read the number of columns. What can I do next?

import pandas as pd 
import numpy as np 

def remove_features_identical(DataFrame,data_source): 
    n=len(DataFrame.columns) 
    print('dealing with %d features of %s data......... \n' % (n,data_source))
    remove_ind = [] 
    R = np.corrcoef(DataFrame.T) 
    for i in range(n-1):
        for j in range(i+1,n):
            if R[i,j]==1:
                remove_ind.append(j)

    DataFrame.drop(remove_ind, axis=1, inplace=True) 
    print('deleting %d columns with correlation factor >0.99' % (len(remove_ind)))
    return DataFrame 

if __name__ == "__main__": 
    # load data and initialize y and x from train set and test set 
    df_train = pd.read_csv('train.csv') 
    df_test = pd.read_csv('test.csv') 
    y_train=df_train['TARGET'].values 
    X_train =df_train.drop(['ID','TARGET'], axis=1).values 
    y_test=[] 
    X_test = df_test.drop(['ID'], axis=1).values 

    # delete identical features in raw data
    X_train = remove_features_identical(X_train,'train set') 
    X_test = remove_features_identical(X_test,'test set') 

Answer

3

Check the pandas documentation, but I think that in

X_train =df_train.drop(['ID','TARGET'], axis=1).values

.values returns a NumPy array, not a pandas DataFrame. An array has no columns attribute.
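
The difference is easy to check on a small frame. This is a minimal sketch, assuming a made-up three-column DataFrame; the column name 'a' and all values are illustrative only and do not come from the question's CSV files.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for train.csv
df = pd.DataFrame({'ID': [1, 2, 3],
                   'a': [0.1, 0.2, 0.3],
                   'TARGET': [0, 1, 0]})

as_frame = df.drop(['ID', 'TARGET'], axis=1)          # still a pandas DataFrame
as_array = df.drop(['ID', 'TARGET'], axis=1).values   # plain numpy.ndarray

print(type(as_frame).__name__)       # DataFrame
print(type(as_array).__name__)       # ndarray
print(hasattr(as_frame, 'columns'))  # True
print(hasattr(as_array, 'columns'))  # False

# On an array, the number of columns comes from the shape instead
print(as_array.shape[1])             # 1
```

So either drop the trailing .values before calling the function, or read the column count from shape[1] inside it.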

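remove_features_identical: if you pass it this array, make sure you only use array functionality inside it, not DataFrame functionality. Otherwise, make sure you pass it a DataFrame. And don't use variable names like DataFrame.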
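
Both options might look roughly like the sketch below. It is not the asker's exact function: the names correlated_positions, remove_identical_features and remove_identical_columns are hypothetical, the toy data is invented, and np.isclose is swapped in for the original R[i,j]==1 test because exact floating-point equality can miss columns that are perfectly correlated. Columns are selected by position, since the indices collected from the correlation matrix are positions rather than column labels.

```python
import numpy as np
import pandas as pd

def correlated_positions(values):
    """Positions of columns whose correlation with an earlier column is 1."""
    n = values.shape[1]
    # rowvar=False tells corrcoef that each column is one variable
    R = np.corrcoef(values, rowvar=False)
    remove_ind = set()
    for i in range(n - 1):
        for j in range(i + 1, n):
            if np.isclose(R[i, j], 1.0):
                remove_ind.add(j)
    return remove_ind

def remove_identical_features(frame, data_source):
    """DataFrame in, DataFrame out; the input frame is not modified."""
    n = len(frame.columns)
    print('dealing with %d features of %s data' % (n, data_source))
    remove_ind = correlated_positions(frame.values)
    keep = [k for k in range(n) if k not in remove_ind]
    print('deleting %d columns' % len(remove_ind))
    return frame.iloc[:, keep]

def remove_identical_columns(array):
    """Array-only variant: uses shape and integer indexing, no .columns."""
    remove_ind = correlated_positions(array)
    keep = [k for k in range(array.shape[1]) if k not in remove_ind]
    return array[:, keep]

# Hypothetical toy data: 'b' is perfectly correlated with 'a'
toy = pd.DataFrame({'a': [1.0, 2.0, 3.0, 4.0],
                    'b': [2.0, 4.0, 6.0, 8.0],
                    'c': [1.0, 0.0, 1.0, 5.0]})

cleaned = remove_identical_features(toy, 'toy set')
print(list(cleaned.columns))                       # ['a', 'c']
print(remove_identical_columns(toy.values).shape)  # (4, 2)
```

With the DataFrame variant, the calling code would pass df_train.drop(['ID','TARGET'], axis=1) without .values and convert to an array only afterwards, if an array is needed at all.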