pandas - Concatenating two multi-index DataFrames

I have a dataframe as follows:

df.head() 
       Student Name   Q1 Q2 Q3 
Month Roll No    
2016-08-01 0 Save Mithil Vinay  0.0 0.0 0.0 
      1 Abraham Ancy Chandy  6.0 5.0 5.0 
      2 Barabde Pranjal Sanjiv 7.0 5.0 5.0 
      3 Bari Siddhesh Kishor 8.0 5.0 3.0 
      4 Barretto Cleon Domnic 1.0 5.0 4.0

Now I want to create a hierarchical column index, so I did the following:

big_df = pd.concat([df['Student Name'], df[['Q1', 'Q2', 'Q3']]], axis=1, keys=['Name', 'IS'])

and was able to get the following:

>>> big_df 
       Name     IS 
       Student Name   Q1 Q2 Q3 
Month Roll No    
2016-08-01 0 Save Mithil Vinay  0.0 0.0 0.0 
      1 Abraham Ancy Chandy  6.0 5.0 5.0 
      2 Barabde Pranjal Sanjiv 7.0 5.0 5.0 
      3 Bari Siddhesh Kishor 8.0 5.0 3.0 
      4 Barretto Cleon Domnic 1.0 5.0 4.0

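Now, in the second iteration, I only want to concatenate the Q1, Q2, Q3 values from the new dataframe onto big_df (the previously concatenated dataframe). The second-iteration dataframe is as follows: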

   Student Name   Q1 Q2 Q3 
Month Roll No    
2016-08-01 0 Save Mithil Vinay  0.0 0.0 0.0 
      1 Abraham Ancy Chandy  8.0 5.0 5.0 
      2 Barabde Pranjal Sanjiv 7.0 5.0 4.0 
      3 Bari Siddhesh Kishor 8.0 4.0 3.0 
      4 Barretto Cleon Domnic 2.0 3.0 4.0

I want big_df to look like this:

   Name     IS   CC 
       Student Name   Q1 Q2 Q3 Q1 Q2 Q3 
Month Roll No        
2016-08-01 0 Save Mithil Vinay  0.0 0.0 0.0 0.0 0.0 0.0 
      1 Abraham Ancy Chandy  6.0 5.0 5.0 8.0 5.0 5.0 
      2 Barabde Pranjal Sanjiv 7.0 5.0 5.0 7.0 5.0 4.0 
      3 Bari Siddhesh Kishor 8.0 5.0 3.0 8.0 4.0 3.0 
      4 Barretto Cleon Domnic 1.0 5.0 4.0 2.0 3.0 4.0

I tried the following code, but both attempts give errors:

big_df.concat([df[['Q1', 'Q2', 'Q3']]], axis=1, keys=['CC']) 

pd.concat([big_df, df[['Q1', 'Q2', 'Q3']]], axis=1, keys=['Name', 'CC'])

Where am I going wrong? Please help. I am new to pandas.

Source

2016-11-07 Jeril

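It would be awesome if, when asking a question, you formatted it so that the data can simply be copied and loaded with pd.read_clipboard() to get the initial data. You should test that it works, and also point out which parameters to read_clipboard() or which few post-processing lines are needed to get your dataframe exactly. That would make it much easier for anyone to help. –

@JulienMarrec Sorry about that, I will improve it next time. Thanks for the support – Jeril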

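First, you would be better off setting your index to ['Month', 'Roll No', 'Student Name']. This simplifies your concat syntax and ensures that rows are also matched on the student's name.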

df.set_index('Student Name', append=True, inplace=True)
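As a minimal sketch of what that call does, assuming a two-row frame shaped like the one in the question (the sample rows are copied from the question, but the construction code itself is only illustrative):

```python
import pandas as pd

# Illustrative data: two rows shaped like the frame in the question.
index = pd.MultiIndex.from_tuples(
    [(pd.Timestamp("2016-08-01"), 0), (pd.Timestamp("2016-08-01"), 1)],
    names=["Month", "Roll No"],
)
df = pd.DataFrame(
    {
        "Student Name": ["Save Mithil Vinay", "Abraham Ancy Chandy"],
        "Q1": [0.0, 6.0],
        "Q2": [0.0, 5.0],
        "Q3": [0.0, 5.0],
    },
    index=index,
)

# append=True keeps the existing index levels and adds 'Student Name' as a third.
df = df.set_index("Student Name", append=True)

print(list(df.index.names))  # ['Month', 'Roll No', 'Student Name']
print(list(df.columns))      # ['Q1', 'Q2', 'Q3']
```

With the name moved into the index, only the Q1/Q2/Q3 columns remain to be concatenated, so the name is no longer duplicated under every top-level key.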

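Second, I would suggest approaching this differently: during your iteration, store each df (with its Q1/Q2/Q3 values) together with the name you want as the top column level (e.g. 'IS', 'CC'). A dictionary is perfect for this, and pandas does accept a dictionary as the argument to pd.concat.

# Creating a dictionary with the first df from your question 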
df_dict = {'IS': df} 

# Iterate.... 
    # Append the new df to the df_dict 
    df_dict['CC'] = df
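The loop itself might look roughly like the sketch below. The helper make_scores and the list sources are hypothetical stand-ins for whatever produces one frame per iteration; they are not part of the original post.

```python
import pandas as pd


def make_scores(q1, q2, q3):
    """Hypothetical helper: build a small frame indexed by (Month, Roll No, Student Name)."""
    index = pd.MultiIndex.from_tuples(
        [
            (pd.Timestamp("2016-08-01"), 0, "Save Mithil Vinay"),
            (pd.Timestamp("2016-08-01"), 1, "Abraham Ancy Chandy"),
        ],
        names=["Month", "Roll No", "Student Name"],
    )
    return pd.DataFrame({"Q1": q1, "Q2": q2, "Q3": q3}, index=index)


# Hypothetical stand-in for the real iteration: (top-level name, frame) pairs.
sources = [
    ("IS", make_scores([0.0, 6.0], [0.0, 5.0], [0.0, 5.0])),
    ("CC", make_scores([0.0, 8.0], [0.0, 5.0], [0.0, 5.0])),
]

df_dict = {}
for key, frame in sources:
    # Store each frame under the name that should become its top column level.
    df_dict[key] = frame

print(list(df_dict))  # ['IS', 'CC']
```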

Now, after the loop has finished, here is your dictionary:

In [10]: df_dict 

Out[10]: 
{'CC':            Q1 Q2 Q3 
Month  Roll No Student Name       
2016-08-01 0  Save Mithil Vinay  0.0 0.0 0.0 
      1  Abraham Ancy Chandy  6.0 5.0 5.0 
      2  Barabde Pranjal Sanjiv 7.0 5.0 5.0 
      3  Bari Siddhesh Kisho  8.0 5.0 3.0 
      4  Barretto Cleon Domnic 1.0 5.0 4.0, 
'IS':            Q1 Q2 Q3 
Month  Roll No Student Name       
2016-08-01 0  Save Mithil Vinay  0.0 0.0 0.0 
      1  Abraham Ancy Chandy  8.0 5.0 5.0 
      2  Barabde Pranjal Sanjiv 7.0 5.0 4.0 
      3  Bari Siddhesh Kisho  8.0 4.0 3.0 
      4  Barretto Cleon Domnic 2.0 3.0 4.0}

So now, if you concat, pandas does the right thing and builds the top column level for you automatically:

In [11]: big_df = pd.concat(df_dict, axis=1) 
     big_df 

Out[11]:
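As a rough sketch of what this yields, assuming two small hand-built frames (the helper make_scores and the two sample rows are illustrative, not from the original post):

```python
import pandas as pd


def make_scores(q1, q2, q3):
    """Hypothetical helper: build a small frame indexed by (Month, Roll No, Student Name)."""
    index = pd.MultiIndex.from_tuples(
        [
            (pd.Timestamp("2016-08-01"), 0, "Save Mithil Vinay"),
            (pd.Timestamp("2016-08-01"), 1, "Abraham Ancy Chandy"),
        ],
        names=["Month", "Roll No", "Student Name"],
    )
    return pd.DataFrame({"Q1": q1, "Q2": q2, "Q3": q3}, index=index)


df_dict = {
    "IS": make_scores([0.0, 6.0], [0.0, 5.0], [0.0, 5.0]),
    "CC": make_scores([0.0, 8.0], [0.0, 5.0], [0.0, 5.0]),
}

# The dictionary keys become the top level of the column MultiIndex.
big_df = pd.concat(df_dict, axis=1)

print(big_df.columns.nlevels)  # 2
print(big_df["CC"])            # selects the Q1/Q2/Q3 block stored under 'CC'
```

Selecting big_df['CC'] returns the sub-frame for that key, which is a convenient way to check that each block ended up under the intended name.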

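If you really want to do it iteratively, you should prepend the new top level ('CC') to the new df's columns before you concat it with big_df: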

df.columns = pd.MultiIndex.from_tuples([('CC', x) for x in df.columns]) 

# Then you can concat; this gives the same result as the picture above. 
pd.concat([big_df, df], axis=1)
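A self-contained sketch of the iterative variant, assuming the student name is already part of the index. The names first and second and the sample values are illustrative only.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [
        (pd.Timestamp("2016-08-01"), 0, "Save Mithil Vinay"),
        (pd.Timestamp("2016-08-01"), 1, "Abraham Ancy Chandy"),
    ],
    names=["Month", "Roll No", "Student Name"],
)
first = pd.DataFrame({"Q1": [0.0, 6.0], "Q2": [0.0, 5.0], "Q3": [0.0, 5.0]}, index=index)
second = pd.DataFrame({"Q1": [0.0, 8.0], "Q2": [0.0, 5.0], "Q3": [0.0, 5.0]}, index=index)

# First iteration: wrap the frame under the top-level key 'IS'.
big_df = pd.concat([first], axis=1, keys=["IS"])

# Later iteration: give the new frame a matching two-level header first...
second.columns = pd.MultiIndex.from_tuples([("CC", col) for col in second.columns])
# ...then concatenate without keys, since both sides already have two levels.
big_df = pd.concat([big_df, second], axis=1)

print(big_df.columns.tolist())
```

Passing keys again at this point would stack yet another level on top of big_df's existing header, so the second concat deliberately omits it.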

Source

2016-11-07 13:36:07

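Thanks for your help, but 'Student Name' has been concatenated as well. [Image1](https://i.imgsafe.org/08d67d6aff.png) and [Image2](https://i.imgsafe.org/08d93130b9.png). How can I remove 'Student Name'? – Jeril

I think you missed the part where I said you should set your index to ['Month', 'Roll No', 'Student Name']. In your case, you need to run df.set_index('Student Name', append=True, inplace=True) –

Works perfectly.. thank you very much... sorry for not posting the question properly. – Jeril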

Drop the top level of big_df's columns:

big_df.columns = big_df.columns.droplevel(level=0)

Then concatenate, supplying three separate frames as input so that they match the number of keys to be used:

Q_cols = ['Q1', 'Q2', 'Q3'] 
key_names = ['Name', 'IS', 'CC'] 
pd.concat([big_df[['Student Name']], big_df[Q_cols], df[Q_cols]], axis=1, keys=key_names)
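The two steps above can be put together in one runnable sketch. The frames first and second and their two sample rows are assumptions made for illustration; the concat and droplevel calls follow the snippets above.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(pd.Timestamp("2016-08-01"), 0), (pd.Timestamp("2016-08-01"), 1)],
    names=["Month", "Roll No"],
)
names = ["Save Mithil Vinay", "Abraham Ancy Chandy"]
first = pd.DataFrame(
    {"Student Name": names, "Q1": [0.0, 6.0], "Q2": [0.0, 5.0], "Q3": [0.0, 5.0]},
    index=index,
)
second = pd.DataFrame(
    {"Student Name": names, "Q1": [0.0, 8.0], "Q2": [0.0, 5.0], "Q3": [0.0, 5.0]},
    index=index,
)

q_cols = ["Q1", "Q2", "Q3"]

# State after the first iteration: a two-level header with 'Name' and 'IS'.
big_df = pd.concat([first[["Student Name"]], first[q_cols]], axis=1, keys=["Name", "IS"])

# Flatten the header back to a single level before rebuilding it.
big_df.columns = big_df.columns.droplevel(level=0)

# One input frame per key, so the three keys line up with the three frames.
big_df = pd.concat(
    [big_df[["Student Name"]], big_df[q_cols], second[q_cols]],
    axis=1,
    keys=["Name", "IS", "CC"],
)

print(big_df.columns.tolist())
```

Note that this relies on the flattened column names being unique at the moment of the droplevel call, which holds here because only the 'IS' block carries Q1/Q2/Q3 at that point.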

Source

2016-11-07 12:47:24

Thank you very much. I needed to drop 'level 0'. That was what caused the problem. – Jeril
