2017-04-23 92 views

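Pandas: concatenate and reindex DataFrames

I want to combine two pandas DataFrames into a new, third DataFrame that has two new indices. Suppose I start with the following: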

df = pd.DataFrame(np.ones(25).reshape((5,5)),index = ['A','B','C','D','E']) 
df1 = pd.DataFrame(np.ones(25).reshape((5,5))*2,index = ['A','B','C','D','E']) 
df[2] = np.nan 
df1[3] = np.nan 
df[4] = np.nan 
df1[4] = np.nan 

I would like to achieve the following result in the least convoluted way:

NewIndex OldIndex df df1 
1 A 1 2 
2 B 1 2 
3 C 1 2 
4 D 1 2 
5 E 1 2 
6 A 1 2 
7 B 1 2 
8 C 1 2 
9 D 1 2 
10 E 1 2 
11 A NaN 2 
12 B NaN 2 
13 C NaN 2 
14 D NaN 2 
15 E NaN 2 
16 A 1 NaN 
17 B 1 NaN 
18 C 1 NaN 
19 D 1 NaN 
20 E 1 NaN 

What is the best way to do this?

Answers

You have to unstack both DataFrames, concatenate the results, and then reindex the concatenated DataFrame.
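For readers unfamiliar with `unstack`, here is a minimal sketch on a hypothetical 2x2 frame (`small` is an illustrative name, not part of the question's data). Calling `unstack()` on a DataFrame whose index is not a MultiIndex returns a Series keyed by (column label, original index label), and NaN values are kept:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame, used only to illustrate the reshaping step
small = pd.DataFrame([[1.0, np.nan], [3.0, 4.0]], index=['A', 'B'])

# Columns are walked one after another, so the result is one long Series
# with a two-level index: (column label, original row label)
stacked = small.unstack()
print(stacked)
```

The full solution applies the same step to both frames: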

import numpy as np 
import pandas as pd 
# test data 
df = pd.DataFrame(np.ones(25).reshape((5,5)),index = ['A','B','C','D','E']) 
df1 = pd.DataFrame(np.ones(25).reshape((5,5))*2,index = ['A','B','C','D','E']) 
df[2] = np.nan 
df1[3] = np.nan 
df[4] = np.nan 
df1[4] = np.nan 

# unstack tables and concat 
newdf = pd.concat([df.unstack(),df1.unstack()], axis=1) 
# reset multiindex for level 1 
newdf.reset_index(1, inplace=True) 
# rename columns 
newdf.columns = ['OldIndex','df','df1'] 
# drop the remaining old index level (drop('index', 1) fails on pandas 2.0+, where axis is keyword-only)
newdf = newdf.reset_index(drop=True)
# set index from 1 
newdf.index = np.arange(1, len(newdf) + 1) 
# rename new index 
newdf.index.name='NewIndex' 
print(newdf) 

Output:

  OldIndex df df1 
NewIndex     
1    A 1.0 2.0 
2    B 1.0 2.0 
3    C 1.0 2.0 
4    D 1.0 2.0 
5    E 1.0 2.0 
6    A 1.0 2.0 
7    B 1.0 2.0 
8    C 1.0 2.0 
9    D 1.0 2.0 
10    E 1.0 2.0 
11    A NaN 2.0 
12    B NaN 2.0 
13    C NaN 2.0 
14    D NaN 2.0 
15    E NaN 2.0 
16    A 1.0 NaN 
17    B 1.0 NaN 
18    C 1.0 NaN 
19    D 1.0 NaN 
20    E 1.0 NaN 
21    A NaN NaN 
22    B NaN NaN 
23    C NaN NaN 
24    D NaN NaN 
25    E NaN NaN 
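
Rows 21 to 25 are NaN in both columns because column 4 is NaN in both inputs, while the table in the question stops at row 20. If those rows are unwanted, one possible variation (a sketch, not part of the original answer; `trimmed` is an illustrative name) drops them with `dropna(how='all')` before renumbering:

```python
import numpy as np
import pandas as pd

# same test data as in the question
df = pd.DataFrame(np.ones(25).reshape((5, 5)), index=['A', 'B', 'C', 'D', 'E'])
df1 = pd.DataFrame(np.ones(25).reshape((5, 5)) * 2, index=['A', 'B', 'C', 'D', 'E'])
df[2] = np.nan
df1[3] = np.nan
df[4] = np.nan
df1[4] = np.nan

# unstack both frames; keys= names the two resulting columns directly
trimmed = pd.concat([df.unstack(), df1.unstack()], axis=1, keys=['df', 'df1'])
# drop rows where both 'df' and 'df1' are NaN
trimmed = trimmed.dropna(how='all')
# move the old row labels (index level 1) into a column named 'OldIndex'
trimmed = trimmed.reset_index(level=1).rename(columns={'level_1': 'OldIndex'})
# discard the old column labels and number the rows from 1
trimmed = trimmed.reset_index(drop=True)
trimmed.index = pd.RangeIndex(1, len(trimmed) + 1, name='NewIndex')
print(trimmed)
```

Passing `keys=` to `pd.concat` avoids the separate column-renaming step, at the cost of renaming `level_1` explicitly afterwards.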

Yes, this answer is __much__ better! – MaxU

Thanks for your comment. – Serenity