2015-10-20

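Transforming a DataFrame (pivot)

I'm having trouble converting a DataFrame into a new structure. After turning a pivot table into a DataFrame, my data looks like this: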

model             model1   model2
time     color
2001-01  blue     200,000  120,000
         red      100,000  100,000
         yellow   250,000  80,000
         white    100,000  100,000
2002-01  blue     140,000  150,000
         red      200,000  100,000
         yellow   400,000  200,000
         white    200,000  100,000
...

This is what I want to turn it into: time as the index, and one separate column for each model/color combination.

         model1_blue  model1_red  model1_yellow  model1_white  model2_blue  ...
time
2001-01  200,000      100,000     250,000        100,000       120,000
2002-01  140,000      200,000     400,000        200,000       150,000
...

So: how do I do this? :) Thanks!

Is 'model' an index or not?

Answers


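Assuming the level you want to spread across the columns is part of the index. In the output below that level is named 'model'; in the question's frame it is displayed as 'color', so use that name there. If it is still an ordinary column, add it to the index first (append=True keeps the existing index instead of replacing it):

df.set_index('model', append=True, inplace=True)

Then unstack that index level so that its values become columns:

df_unstacked = df.unstack('model')
df_unstacked

Out[28]:
         model1                              model2
model    blue     red      white    yellow   blue     red      white    yellow
time
2001-01  200,000  100,000  100,000  250,000  120,000  100,000  100,000  80,000
2002-01  140,000  200,000  200,000  400,000  150,000  100,000  100,000  200,000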
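To make the unstack step reproducible, here is a minimal sketch that rebuilds a frame shaped like the one in the question and moves the color level into the columns. The level names ('time', 'color'), the column-axis name 'model', and the use of plain integers are assumptions; the question shows the values with thousands separators, so the real data may hold strings.

```python
import pandas as pd

# Assumed reconstruction of the question's frame: a (time, color)
# MultiIndex on the rows and one column per model.
index = pd.MultiIndex.from_product(
    [["2001-01", "2002-01"], ["blue", "red", "yellow", "white"]],
    names=["time", "color"],
)
df = pd.DataFrame(
    {
        "model1": [200000, 100000, 250000, 100000, 140000, 200000, 400000, 200000],
        "model2": [120000, 100000, 80000, 100000, 150000, 100000, 200000, 100000],
    },
    index=index,
)
df.columns.name = "model"  # assumed name of the column axis

# Move the "color" level from the row index into the columns.
wide = df.unstack("color")
print(wide)
```

The result keeps one row per time value and has two column levels: the model on top and the color underneath.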

Retrieve the labels of the two column index levels:

first_level_names = df_unstacked.columns.levels[0]
second_level_names = df_unstacked.columns.levels[1]

Create the new column names:

new_columns = [first + '_' + second for first in first_level_names for second in second_level_names]

Assign the new column names to your DataFrame:

df_unstacked.columns = new_columns
df_unstacked

Out[33]:
         model1_blue  model1_red  model1_white  model1_yellow  model2_blue  model2_red  model2_white  model2_yellow
time
2001-01  200,000      100,000     100,000       250,000        120,000      100,000     100,000       80,000
2002-01  140,000      200,000     200,000       400,000        150,000      100,000     100,000       200,000
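Building the names from the product of the two levels works when every model/color combination is present and the columns are in sorted order. A slightly more defensive variant, sketched below, derives each name from the column's own tuple instead. The small frame and its numbers are made-up placeholders, with one combination deliberately missing.

```python
import pandas as pd

# Hypothetical two-level columns with one combination missing
# (there is no ("model2", "red") column).
columns = pd.MultiIndex.from_tuples(
    [("model1", "blue"), ("model1", "red"), ("model2", "blue")]
)
wide = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],  # placeholder values
    index=["2001-01", "2002-01"],
    columns=columns,
)

# Join each column's own (first, second) tuple, so every name is
# derived from the column it labels.
flat = wide.copy()
flat.columns = ["_".join(pair) for pair in wide.columns]
print(flat.columns.tolist())
```

Here the product of the levels would yield four names for three columns, while joining the tuples always yields exactly one name per column.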

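Assuming time and color form a hierarchical index with the models as columns (if they don't, you can easily create such an index with pd.MultiIndex.from_arrays), the simplest solution is to "unstack" that index: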

import pandas as pd

# Eight rows: two time periods x four colors, one column per model
df = pd.DataFrame([
    [200, 120],
    [201, 123],
    [202, 124],
    [203, 125],
    [204, 126],
    [205, 126],
    [205, 127],
    [205, 127],
], columns=["model1", "model2"])

# Hierarchical (time, color) row index
df.index = pd.MultiIndex.from_product([["2001-01", "2001-02"], ["blue", "red", "yellow", "white"]])
df


df.unstack() 
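For reference, here is a self-contained sketch of that call on the same sample values. It assumes the default behaviour of unstack(), which pivots the innermost (last) index level, here the colors.

```python
import pandas as pd

# Same sample values as in the answer above.
df = pd.DataFrame(
    [[200, 120], [201, 123], [202, 124], [203, 125],
     [204, 126], [205, 126], [205, 127], [205, 127]],
    columns=["model1", "model2"],
)
df.index = pd.MultiIndex.from_product(
    [["2001-01", "2001-02"], ["blue", "red", "yellow", "white"]]
)

# With no argument, unstack() pivots the innermost (last) index level.
wide = df.unstack()
same = df.unstack(level=-1)  # explicit form of the same call
print(wide)
```

The result has one row per time value and a two-level column index with the model on top and the color underneath.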

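If time and color are still ordinary columns, the hierarchical index can be built from them first. The following is only a sketch: the frame name and its layout are assumptions, and the values are a subset of the sample above.

```python
import pandas as pd

# Assumed starting point: time and color stored as ordinary columns.
flat = pd.DataFrame({
    "time": ["2001-01", "2001-01", "2001-02", "2001-02"],
    "color": ["blue", "red", "blue", "red"],
    "model1": [200, 201, 204, 205],
    "model2": [120, 123, 126, 126],
})

# Build the hierarchical index from the two parallel arrays.
index = pd.MultiIndex.from_arrays(
    [flat["time"], flat["color"]], names=["time", "color"]
)
indexed = flat[["model1", "model2"]].copy()
indexed.index = index
print(indexed)
```

When the levels already live in the same frame, flat.set_index(["time", "color"]) should give the same row index in one step.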