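Python: merge/join two DataFrames

I want to merge/join two DataFrames, each with three keys (Age, Gender and Signed_In). Both DataFrames have the same parent and were created by groupby, but each has its own value columns.

Given that the unique key combinations are shared between the two DataFrames, it seems the merge/join should be painless. With that in mind, I tried both 'merge' and 'join', but can't for the life of me figure it out.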

import pandas as pd

times = pd.read_csv('nytimes.csv')

# Produces times_mean table consisting of two value columns, avg_impressions and avg_clicks 
times_mean = times.groupby(['Age','Gender','Signed_In']).mean() 
times_mean.columns = ['avg_impressions', 'avg_clicks'] 

# Produces times_max table consisting of two value columns, max_impressions and max_clicks 
times_max = times.groupby(['Age','Gender','Signed_In']).max() 
times_max.columns = ['max_impressions', 'max_clicks'] 

# Following intended to produce combined table with four value columns 
times_join = times_mean.join(times_max, on = ['Age', 'Gender', 'Signed_In']) 
times_join2 = pd.merge(times_mean, times_max, on=['Age', 'Gender', 'Signed_In'])


2014-02-22 jamesbev

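We can't test this without 'nytimes.csv'. My guess is that since ['Age', 'Gender', 'Signed_In'] are the index, you don't need the on argument in the call to join.

You should also post the error you are getting.

Appreciate the notes, this is my first time posting. I definitely should have included the original file. – jamesbev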
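The point made in the comments above, that the groupby keys end up in the index rather than as ordinary columns, can be checked directly. The following is a minimal sketch; since 'nytimes.csv' is not available, the small `times` frame and its `Impressions` and `Clicks` columns are made-up stand-ins, not the asker's data.

```python
import pandas as pd

# Made-up stand-in for nytimes.csv (column names and values are assumptions)
times = pd.DataFrame({
    'Age':         [25, 25, 40, 40],
    'Gender':      [0, 0, 1, 1],
    'Signed_In':   [1, 1, 1, 1],
    'Impressions': [3, 5, 4, 6],
    'Clicks':      [0, 1, 1, 2],
})

times_mean = times.groupby(['Age', 'Gender', 'Signed_In']).mean()

# After groupby, the grouping keys are levels of a MultiIndex, not columns
index_names = list(times_mean.index.names)
value_columns = list(times_mean.columns)

print(index_names)    # the three grouping keys
print(value_columns)  # only the aggregated value columns
```

If the keys are needed as regular columns instead, calling `reset_index()` on the grouped result moves them back out of the index.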

Joining on an equivalently structured MultiIndex

Here is an example demonstrating that you don't need the on kwarg in this case:

import numpy as np 
import pandas 

a = np.random.normal(size=10) 
b = a + 10 
index = pandas.MultiIndex.from_product([['A', 'B'], list('abcde')]) 

df_a = pandas.DataFrame(a, index=index, columns=['colA']) 
df_b = pandas.DataFrame(b, index=index, columns=['colB']) 

df_a.join(df_b)

This gives me:

          colA       colB
A a  -1.525376   8.474624
  b   0.778333  10.778333
  c   1.153172  11.153172
  d   0.966560  10.966560
  e   0.089765  10.089765
B a   0.717717  10.717717
  b   0.305545  10.305545
  c   0.123548  10.123548
  d  -1.018660   8.981340
  e  -0.635103   9.364897
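The same idea can be applied to two frames built by groupby, as in the question. This is only a sketch under assumptions: the `times` frame below is made-up data standing in for 'nytimes.csv', and its `Impressions` and `Clicks` column names are guesses. Because both grouped frames carry the same (Age, Gender, Signed_In) MultiIndex, `join` aligns on it without `on`, and `pd.merge` can do the same with `left_index=True, right_index=True`.

```python
import pandas as pd

# Made-up stand-in for nytimes.csv (column names and values are assumptions)
times = pd.DataFrame({
    'Age':         [25, 25, 40, 40],
    'Gender':      [0, 0, 1, 1],
    'Signed_In':   [1, 1, 1, 1],
    'Impressions': [3, 5, 4, 6],
    'Clicks':      [0, 1, 1, 2],
})
keys = ['Age', 'Gender', 'Signed_In']

# Two aggregations of the same parent frame, each with its own value columns
times_mean = times.groupby(keys).mean()
times_mean.columns = ['avg_impressions', 'avg_clicks']

times_max = times.groupby(keys).max()
times_max.columns = ['max_impressions', 'max_clicks']

# Both frames share the same MultiIndex, so join aligns on it without `on`
times_join = times_mean.join(times_max)

# The merge equivalent: align on the index of both frames
times_join2 = pd.merge(times_mean, times_max, left_index=True, right_index=True)

print(times_join)
```

Both calls produce one row per key combination with the four value columns side by side.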


2014-02-22 01:24:58

Thanks, that solved it. Also, I hadn't seen MultiIndex before. Cheers. – jamesbev
