2015-04-21

I am working with the following pandas DataFrame and want to get the index values as row values:

like   max_interest min_interest 
basketball  4    2 
football   2    0 
soccer   4    2 
softball   4    2 
volleyball  4    2 
swimming   2    0 
cheerleading  4    2 
baseball   4    2 

I want to group by max_interest and min_interest, like this:

group   max_interest                                                  min_interest
4       basketball,soccer,softball,volleyball,cheerleading,baseball   N/A
2       football,swimming                                             basketball,soccer,softball,volleyball,cheerleading,baseball
0       N/A                                                           football,swimming

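I tried to make it work by grouping with groupby(max_interest), but could not find out how to concatenate the values of the like column.

Essentially, this lists the likes in series under each max_interest value, and similarly for min_interest.

It could be done by hand-writing iteration logic that keeps appending the likes, but I would like to know whether it can be written with the pandas/NumPy libraries.

Any help is appreciated.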

You should look at pivot_tables, but I am not sure you can get a list of values into a single "cell" in one go – Skorpeo

OK, will check pivot tables – Yantraguru

Answers

First, split the DataFrame by interest level and join the corresponding likes:

# each generator yields one {level: 'like1,like2,...'} dict per group
u = ({k: ','.join(n['like'])} for k, n in df.groupby('max_interest'))
v = ({k: ','.join(n['like'])} for k, n in df.groupby('min_interest'))

Then create a new DataFrame:

df1 = pd.DataFrame(list(u)+list(v), index=['max_interest', 'max_interest', 'min_interest', 'min_interest'])

To put the frame into the form you want, use groupby().last():

adjustframe = df1.groupby(level=0).last().transpose()

Output:

      max_interest       min_interest                  
0         NaN        foot,swim                  
2        foot,swim basket,soccer,soft,volley,cheer,base                  
4 basket,soccer,soft,volley,cheer,base         NaN                  

To set the index name:

adjustframe.index.name = 'group'          
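
Putting the steps above together, a minimal end-to-end sketch might look like the following. The sample DataFrame is rebuilt from the question. Collecting the row labels in a loop (instead of hard-coding four of them) and the trailing sort_index() are my own additions, made on the assumption that the number of groups per column is not known in advance and that recent pandas versions keep the columns in insertion order.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'like': ['basketball', 'football', 'soccer', 'softball',
             'volleyball', 'swimming', 'cheerleading', 'baseball'],
    'max_interest': [4, 2, 4, 4, 4, 2, 4, 4],
    'min_interest': [2, 0, 2, 2, 2, 0, 2, 2],
})

rows, labels = [], []
for col in ['max_interest', 'min_interest']:
    for level, group in df.groupby(col):
        # One {level: 'like1,like2,...'} dict per group, labelled by its source column.
        rows.append({level: ','.join(group['like'])})
        labels.append(col)

df1 = pd.DataFrame(rows, index=labels)

# last() keeps the last non-null entry per label; transpose puts the levels on the index.
adjustframe = df1.groupby(level=0).last().transpose().sort_index()
adjustframe.index.name = 'group'
print(adjustframe)
```

With this variant the index list no longer has to match the number of groups by hand.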

Here is one option:

In [39]: def groupby(key): 
    ....:   result = data.groupby(key).agg({'like': lambda v: ','.join(v)}) 
    ....:   result.index.name = 'group' 
    ....:   result.columns = [key] 
    ....:   return result 
    ....: 

In [40]: pd.concat((groupby(key) for key in ['max_interest', 'min_interest']), axis=1) 
Out[40]: 
              max_interest          min_interest 
group 
0             NaN         football,swimming 
2          football,swimming basketball,soccer,softball,volleyball,cheerlea... 
4  basketball,soccer,softball,volleyball,cheerlea...            NaN
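
For anyone who wants to run this outside an IPython session, here is a rough script version of the same idea. It is only a sketch: the helper name joined_likes is hypothetical, the sample data is rebuilt from the question, and it selects the like column before aggregating rather than passing a dict to agg. The display.max_colwidth option (set to None, which I am assuming a pandas 1.0+ install accepts) is there so that the long strings are printed in full instead of being cut off with "...".

```python
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    'like': ['basketball', 'football', 'soccer', 'softball',
             'volleyball', 'swimming', 'cheerleading', 'baseball'],
    'max_interest': [4, 2, 4, 4, 4, 2, 4, 4],
    'min_interest': [2, 0, 2, 2, 2, 0, 2, 2],
})


def joined_likes(key):
    """Comma-join the 'like' values for each distinct value of `key` (hypothetical helper)."""
    return data.groupby(key)['like'].agg(','.join).rename(key)


# Align the two Series on their group values; levels missing on one side become NaN.
result = pd.concat(
    [joined_likes(key) for key in ['max_interest', 'min_interest']],
    axis=1,
).sort_index().rename_axis('group')

# Print without truncating the long joined strings.
with pd.option_context('display.max_colwidth', None):
    print(result)
```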