2017-01-06 23 views

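Looping over two arrays with qcut in Python

I ran a string operation on the columns of my dataframe to generate the new column names in a second array. So far so good.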

columns = dfnum.columns.values 
print(columns) 
qcolumns = [x + 'q' for x in columns] 
print(qcolumns) 

When I try to run both arrays through a for loop to generate quintile cuts of the original values, though, I get this:

for column in columns, qcolumn in qcolumns: 
    dfnumqcut = pd.qcut(dfnum[[column]],5)
    dfnum[qcolumn] = dfnumqcut.codes

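I get a bunch of errors, shown below. I am trying to compute the qcuts and attach them to the dataframe. I can do it column by column as follows, but there should be some way to do this with a for loop: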

dfnumqcut = pd.qcut(dfnum[['Market Cap']],5) 
dfnum['Market Capq'] = dfnumqcut.codes 

1 for column in columns, qcolumn in qcolumns:
----> 2 dfnumqcut = pd.qcut(dfnum[[column]],5)
3 dfnum[qcolumn] = dfnumqcut.codes

TypeError: unhashable type: 'numpy.ndarray'

Answer

I think you can use enumerate and select the values from qcolumns by position:

for i, column in enumerate(columns): 
    dfnumqcut = pd.qcut(dfnum[[column]],5) 
    dfnum[qcolumns[i]] = dfnumqcut.codes 
print (dfnum) 
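
A side note on why the original loop fails: `for column in columns, qcolumn in qcolumns:` is parsed as a loop over the two-element tuple `(columns, qcolumn in qcolumns)`. On the first pass `column` is therefore the whole `columns` array, and `dfnum[[column]]` raises the unhashable `numpy.ndarray` error (this presumes `qcolumn` was already defined by an earlier cell, otherwise a `NameError` would come first). To walk two sequences in step, `zip` is the usual tool. The sketch below is only an illustration under assumptions: the small `dfnum` frame is made up, and it passes a single column (a Series) with `labels=False` so that `pd.qcut` returns the integer bin codes directly, which I believe is what recent pandas versions expect rather than a one-column DataFrame plus `.codes`.

```python
import pandas as pd

# Made-up sample frame standing in for the asker's dfnum
dfnum = pd.DataFrame({'A': [1, 2, 3],
                      'E': [5, 3, 6]})

columns = dfnum.columns.values
qcolumns = [x + 'q' for x in columns]

# zip pairs each original column name with its new name
for column, qcolumn in zip(columns, qcolumns):
    # labels=False makes qcut return integer bin indices (0..4)
    dfnum[qcolumn] = pd.qcut(dfnum[column], 5, labels=False)

print(dfnum)
```

Compared with `enumerate`, this avoids indexing `qcolumns` by position, at the cost of keeping the two sequences the same length.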

A solution that does not need qcolumns at all, with the new column names created inside the loop:

for column in columns:
    dfnumqcut = pd.qcut(dfnum[[column]],5) 
    dfnum[column + 'q'] = dfnumqcut.codes 
print (dfnum) 

A simpler solution: loop over the DataFrame itself instead of over the columns values:

for column in dfnum:
    dfnum[column + 'q'] = pd.qcut(dfnum[[column]],5).codes 
print (dfnum) 

Sample:

dfnum = pd.DataFrame({'A':[1,2,3], 
        'B':[4,5,6], 
        'C':[7,8,9], 
        'D':[1,3,5], 
        'E':[5,3,6], 
        'F':[7,4,3]}) 

print (dfnum) 
   A  B  C  D  E  F
0  1  4  7  1  5  7
1  2  5  8  3  3  4
2  3  6  9  5  6  3

for column in dfnum:
    dfnum[column + 'q'] = pd.qcut(dfnum[[column]],5).codes 
print (dfnum) 
   A  B  C  D  E  F  Aq  Bq  Cq  Dq  Eq  Fq
0  1  4  7  1  5  7   0   0   0   0   2   4
1  2  5  8  3  3  4   2   2   2   2   0   2
2  3  6  9  5  6  3   4   4   4   4   4   0
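
For completeness, the same result can probably be had without an explicit loop. The following is a sketch under assumptions rather than part of the original answer: it reuses the sample data, applies `pd.qcut` to each column as a Series with `labels=False` (so integer bin indices come back directly), and the names `codes` and `result` are my own.

```python
import pandas as pd

# Same sample data as in the answer
dfnum = pd.DataFrame({'A': [1, 2, 3],
                      'B': [4, 5, 6],
                      'C': [7, 8, 9],
                      'D': [1, 3, 5],
                      'E': [5, 3, 6],
                      'F': [7, 4, 3]})

# Bin every column into quintiles; labels=False yields integer bin indices
codes = dfnum.apply(lambda col: pd.qcut(col, 5, labels=False))

# Suffix the new column names with 'q' and attach them to the original frame
result = dfnum.join(codes.add_suffix('q'))
print(result)
```

Building the codes in a separate frame leaves `dfnum` unchanged, which makes it easy to rerun the cell without piling up `...qq` columns.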

If my answer was helpful, don't forget to [accept](http://meta.stackexchange.com/a/5235/295067) it. Thanks. – jezrael

Super helpful and a nicely concise solution, thanks! –