2017-08-01 77 views

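Pandas unstack should not sort the remaining index

I am wondering whether it is possible to unstack one level of a multi-index DataFrame so that the remaining index levels of the returned DataFrame are not sorted. Example code: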

import pandas as pd

# One list per index level: room, bed, item
arrays = [["room1", "room1", "room1", "room1", "room1", "room1",
           "room2", "room2", "room2", "room2", "room2", "room2"],
          ["bed1", "bed1", "bed1", "bed2", "bed2", "bed2",
           "bed1", "bed1", "bed1", "bed2", "bed2", "bed2"],
          ["blankets", "pillows", "all", "blankets", "pillows", "all",
           "blankets", "pillows", "all", "blankets", "pillows", "all"]]

tuples = list(zip(*arrays))

index = pd.MultiIndex.from_tuples(
    tuples, names=['first index', 'second index', 'third index'])

series = pd.Series([1, 2, 3, 1, 1, 2, 2, 2, 4, 2, 1, 3], index=index)

series 

first index  second index  third index
room1        bed1          blankets       1
                           pillows        2
                           all            3
             bed2          blankets       1
                           pillows        1
                           all            2
room2        bed1          blankets       2
                           pillows        2
                           all            4
             bed2          blankets       2
                           pillows        1
                           all            3

Unstacking the second index:

series.unstack(1) 

second index             bed1  bed2
first index third index
room1       all             3     2
            blankets        1     1
            pillows         2     1
room2       all             4     3
            blankets        2     2
            pillows         2     1

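The problem is that the order of the third index has changed, because the index is automatically sorted alphabetically. The 'all' row, which is the sum of the 'blankets' and 'pillows' rows, is now the first row instead of the last. How can this be fixed? There does not seem to be an option to prevent the automatic sorting. It also does not seem possible to sort the index of a DataFrame with a key, as in myDataFrame.sort_index(..., key=['some_key']).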

Answer


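One possible solution is reindex or reindex_axis with the parameter level=1 (reindex_axis was deprecated in pandas 0.21 and removed in 1.0, so on current versions only reindex applies):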

s = series.unstack(1).reindex(['blankets','pillows','all'], level=1) 
print (s) 
second index             bed1  bed2
first index third index
room1       blankets        1     1
            pillows         2     1
            all             3     2
room2       blankets        2     2
            pillows         2     1
            all             4     3

s = series.unstack(1).reindex_axis(['blankets','pillows','all'], level=1) 
print (s) 
second index             bed1  bed2
first index third index
room1       blankets        1     1
            pillows         2     1
            all             3     2
room2       blankets        2     2
            pillows         2     1
            all             4     3

A more dynamic solution, taking the order from the original index instead of hard-coding it:

a = series.index.get_level_values('third index').unique() 
print (a) 
Index(['blankets', 'pillows', 'all'], dtype='object', name='third index') 

s = series.unstack(1).reindex_axis(a, level=1) 
print (s) 
second index             bed1  bed2
first index third index
room1       blankets        1     1
            pillows         2     1
            all             3     2
room2       blankets        2     2
            pillows         2     1
            all             4     3
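
For anyone on a recent pandas where reindex_axis is no longer available, here is a minimal self-contained sketch of the same dynamic approach using reindex only. It rebuilds the Series from the question with MultiIndex.from_product (a shortcut I am assuming is equivalent to the tuples above, since they enumerate every combination in the same order); the names order and result are just illustrative.

```python
import pandas as pd

# Rebuild the example Series from the question. from_product is used here
# as a shorthand; it yields the same tuples in the same order.
index = pd.MultiIndex.from_product(
    [["room1", "room2"], ["bed1", "bed2"], ["blankets", "pillows", "all"]],
    names=["first index", "second index", "third index"],
)
series = pd.Series([1, 2, 3, 1, 1, 2, 2, 2, 4, 2, 1, 3], index=index)

# Labels of the third level in order of first appearance
# (unique() keeps that order rather than sorting).
order = series.index.get_level_values("third index").unique()

# unstack sorts the remaining index levels; reindexing on level 1
# (the former third level) restores the original order within each room.
result = series.unstack("second index").reindex(order, level=1)
print(result)
```

This still lets unstack sort and then undoes it afterwards, so it is a workaround rather than a way of switching the sorting off.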