Merging two pandas DataFrames with a complex condition
I want to merge two DataFrames. Consider the following two DataFrames:
DF1:
id_A, ts_A, course, weight
id1, 2017-04-27 01:35:30, cotton, 3.5
id1, 2017-04-27 01:36:05, cotton, 3.5
id1, 2017-04-27 01:36:55, cotton, 3.5
id1, 2017-04-27 01:37:20, cotton, 3.5
id2, 2017-04-27 02:35:35, cotton blue, 5.0
id2, 2017-04-27 02:36:00, cotton blue, 5.0
id2, 2017-04-27 02:36:35, cotton blue, 5.0
id2, 2017-04-27 02:37:20, cotton blue, 5.0
DF2:
id_B, ts_B, value
id1, 2017-03-27 01:25:40, 100
id1, 2017-03-27 01:25:50, 200
id1, 2017-03-27 01:25:50, 230
id1, 2017-04-27 01:35:40, 240
id1, 2017-04-27 01:35:50, 200
id1, 2017-04-27 01:36:00, 350
id1, 2017-04-27 01:36:10, 400
id1, 2017-04-27 01:36:20, 500
id1, 2017-04-27 01:36:30, 600
id1, 2017-04-27 01:36:40, 700
id1, 2017-04-27 01:36:50, 800
id1, 2017-04-27 01:37:00, 900
id1, 2017-04-27 01:37:10, 1000
id2, 2017-04-27 02:35:40, 1000
id2, 2017-04-27 02:35:50, 2000
id2, 2017-04-27 02:36:00, 4500
id2, 2017-04-27 02:36:10, 3000
id2, 2017-04-27 02:36:20, 6000
id2, 2017-04-27 02:36:30, 5000
id2, 2017-04-27 02:36:40, 5022
id2, 2017-04-27 02:36:50, 5040
id2, 2017-04-27 02:37:00, 3200
id2, 2017-04-27 02:37:10, 9000
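df1 should be merged with df2 so that the following condition holds: for the time interval between two consecutive rows of df1, I want to merge the earlier df1 row with the average of all df2 rows (with the same id) whose timestamps fall within that interval. For example,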
id_A, ts_A, course, weight
id1, 2017-04-27 01:35:30, cotton, 3.5
should be merged with
id_B, ts_B, value
id1, 2017-04-27 01:35:40, 240
id1, 2017-04-27 01:35:50, 200
id1, 2017-04-27 01:36:00, 350
to obtain
id_A, ts_A, course, weight, avgValue
id1, 2017-04-27 01:35:30, cotton, 3.5, 263.3
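I thought about approaching the problem from another angle, which would consist of including the missing rows of df2 in df1, by using merge_asof,
but I do not get the correct result: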
pd.merge_asof(df2_sorted, df1, left_on='ts_B', right_on='ts_A', left_by='id_B', right_by='id_A', direction='backward')
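One possible way to get the averaged column, sketched under a few assumptions: the merge_asof call above only tags each df2 row with the most recent df1 row, so a groupby/mean step and a merge back onto df1 are still needed. The data below is a subset of the id1 rows from the question, the names tagged, avg and result are made up for illustration, and each interval is assumed to include its start timestamp and exclude the next df1 timestamp.

```python
import io
import pandas as pd

# Subset of the question's data (id1 only), parsed with real datetimes.
df1 = pd.read_csv(io.StringIO("""id_A,ts_A,course,weight
id1,2017-04-27 01:35:30,cotton,3.5
id1,2017-04-27 01:36:05,cotton,3.5
id1,2017-04-27 01:36:55,cotton,3.5
id1,2017-04-27 01:37:20,cotton,3.5
"""), parse_dates=["ts_A"])

df2 = pd.read_csv(io.StringIO("""id_B,ts_B,value
id1,2017-03-27 01:25:40,100
id1,2017-04-27 01:35:40,240
id1,2017-04-27 01:35:50,200
id1,2017-04-27 01:36:00,350
id1,2017-04-27 01:36:10,400
id1,2017-04-27 01:36:20,500
id1,2017-04-27 01:37:00,900
"""), parse_dates=["ts_B"])

# Step 1: tag every df2 row with the latest df1 row at or before it.
# Both frames must be sorted by their time key for merge_asof.
tagged = pd.merge_asof(
    df2.sort_values("ts_B"),
    df1.sort_values("ts_A"),
    left_on="ts_B", right_on="ts_A",
    left_by="id_B", right_by="id_A",
    direction="backward",
)

# Step 2: average the df2 values per (id, interval start).
# df2 rows that precede every df1 row have ts_A = NaT and are dropped.
avg = (
    tagged.dropna(subset=["ts_A"])
    .groupby(["id_B", "ts_A"], as_index=False)["value"].mean()
    .rename(columns={"id_B": "id_A", "value": "avgValue"})
)

# Step 3: attach the averages to df1; intervals without df2 rows get NaN.
result = df1.merge(avg, on=["id_A", "ts_A"], how="left")
print(result)
```

With this subset the first df1 row gets avgValue 263.33 (the mean of 240, 200 and 350), and the last row gets NaN because no df2 row follows it.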
Thanks a lot. I am applying it to my case. I'll be back in a few minutes.
No problem, check it carefully ;) – jezrael
When I run df = df.groupby(schema2, as_index=False)['value'].mean().drop('index', axis=1) I get the following error: raise DataError('No numeric types to aggregate') pandas.core.base.DataError: No numeric types to aggregate
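A common cause of this error is that the value column has dtype object (for example, numbers read in as strings), so pandas finds nothing numeric to average. A minimal sketch of that situation and one possible fix with pd.to_numeric follows; the sample frame and the contents of schema2 are hypothetical, since the real grouping keys are not shown in the thread.

```python
import pandas as pd

# Hypothetical frame where 'value' was read in as strings (dtype object).
df = pd.DataFrame({
    "id_A": ["id1", "id1", "id1"],
    "ts_A": pd.to_datetime(["2017-04-27 01:35:30"] * 3),
    "value": ["240", "200", "350"],
})
print(df["value"].dtype)  # object

# Convert to a numeric dtype; entries that cannot be parsed become NaN.
df["value"] = pd.to_numeric(df["value"], errors="coerce")

# Assumed grouping keys; replace with the real contents of schema2.
schema2 = ["id_A", "ts_A"]
out = df.groupby(schema2, as_index=False)["value"].mean()
print(out)
```

Checking df.dtypes before the groupby shows whether the conversion is needed in the first place.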