2017-05-26 70 views
Python pandas: combine multiple columns into a single column

I have a pandas DataFrame like the one below:

movie       unknown  action  adventure  animation  fantasy  horror  romance  sci-fi

Toy Story         0       1          1          0        1       0        0       1
Golden Eye        0       1          0          0        0       0        1       0
Four Rooms        1       0          0          0        0       0        0       0
Get Shorty        0       0          0          1        1       0        1       0
Copy Cat          0       0          1          0        0       1        0       0

I want to combine the movie genre columns into a single column. The output would look like this:

movie       genre

Toy Story   action, adventure, fantasy, sci-fi
Golden Eye  action, romance
Four Rooms  unknown
Get Shorty  animation, fantasy, romance
Copy Cat    adventure, horror

Answers

You can do it this way:

In [171]: df['genre'] = df.iloc[:, 1:].apply(lambda x: df.iloc[:, 1:].columns[x.astype(bool)].tolist(), axis=1) 

In [172]: df 
Out[172]: 
        movie  unknown  action  adventure  animation  fantasy  horror  romance  sci-fi                                 genre
0   Toy Story        0       1          1          0        1       0        0       1  [action, adventure, fantasy, sci-fi]
1  Golden Eye        0       1          0          0        0       0        1       0                     [action, romance]
2  Four Rooms        1       0          0          0        0       0        0       0                             [unknown]
3  Get Shorty        0       0          0          1        1       0        1       0         [animation, fantasy, romance]
4    Copy Cat        0       0          1          0        0       1        0       0                   [adventure, horror]
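
Note that the result above stores each row's genres as a Python list, whereas the question asks for a comma-separated string. A minimal sketch of that variant follows. It assumes the genre flags are every column after `movie`; the sample frame is rebuilt from the question only so the snippet runs on its own, and `genre_cols` is an illustrative name, not something from the original post.

```python
import pandas as pd

# Sample data copied from the question (one 0/1 flag column per genre).
df = pd.DataFrame({
    "movie": ["Toy Story", "Golden Eye", "Four Rooms", "Get Shorty", "Copy Cat"],
    "unknown":   [0, 0, 1, 0, 0],
    "action":    [1, 1, 0, 0, 0],
    "adventure": [1, 0, 0, 0, 1],
    "animation": [0, 0, 0, 1, 0],
    "fantasy":   [1, 0, 0, 1, 0],
    "horror":    [0, 0, 0, 0, 1],
    "romance":   [0, 1, 0, 1, 0],
    "sci-fi":    [1, 0, 0, 0, 0],
})

# Assumption: every column after "movie" is a genre flag.
genre_cols = df.columns[1:]

# For each row, keep the column names whose flag is non-zero and join them.
df["genre"] = df[genre_cols].apply(
    lambda row: ", ".join(genre_cols[row.astype(bool).to_numpy()]),
    axis=1,
)

print(df[["movie", "genre"]])
```

The only change from the list-based version is the `", ".join(...)` call in place of `.tolist()`.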

PS: I don't understand how this would help you, though. I don't see any benefit compared to the one-hot encoded matrix.
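
One way to see why the combined column adds little: it can be expanded back into indicator columns, so both forms carry the same information. The sketch below uses `Series.str.get_dummies` on a few comma-separated genre strings; the `genres` Series is illustrative data based on the question, not code from the original post.

```python
import pandas as pd

# Illustrative comma-separated genre strings, indexed by movie title.
genres = pd.Series(
    ["action, adventure, fantasy, sci-fi", "action, romance", "unknown"],
    index=["Toy Story", "Golden Eye", "Four Rooms"],
)

# Split each string on ", " and build one 0/1 indicator column per genre.
one_hot = genres.str.get_dummies(sep=", ")

print(one_hot)
```

The recovered columns come back in alphabetical order rather than the original column order, so reorder them if the layout matters.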

`df['genre'] = df.apply(lambda x: df.columns[x.astype(bool)].tolist()[1:], axis=1)` +1, and I agree that it doesn't provide any additional benefit – bernie

@bernie, thanks :) – MaxU