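How can I join two pandas DataFrames with datetime indexes so that the timestamps match as closely as possible? Is there a fill method that can be used?
I am trying to join two time-series DataFrames and match their timestamps as closely as possible.
An example could be: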
# required packages
import pandas as pd
import numpy as np
# number of rows in each sample frame
num_periods_1 = 11
num_periods_2 = 4
# create sample time series
dates1 = pd.date_range('1/1/2000', periods=num_periods_1, freq='10min')
dates2 = pd.date_range('1/1/2000 00:40:00', periods=num_periods_2, freq='10min')
column_names_1 = ['B', 'C', 'A']
column_names_2 = ['B', 'C', 'D']
df1 = pd.DataFrame(np.random.randn(num_periods_1, len(column_names_1)), index=dates1, columns=column_names_1)
df2 = pd.DataFrame(np.random.randn(num_periods_2, len(column_names_2)), index=dates2, columns=column_names_2)
print("\nData Frame One:\n", df1)
print("\nData Frame Two:\n", df2)
df3 = pd.concat([df1.reset_index().add_suffix('_x'), df2.reset_index().add_suffix('_y')], axis=1).set_index(['index_x', 'index_y']).sort_index(axis=1)
print("\nData Frame Three:\n", df3)
The output looks like this:
A_x B_x B_y \
index_x index_y
2000-01-01 00:00:00 2000-01-01 00:40:00 0.878508 -0.608439 -0.468326
2000-01-01 00:10:00 2000-01-01 00:50:00 -1.056812 0.070073 0.802728
2000-01-01 00:20:00 2000-01-01 01:00:00 -0.085436 0.577973 1.278077
2000-01-01 00:30:00 2000-01-01 01:10:00 -0.061046 -0.410809 -1.913346
2000-01-01 00:40:00 NaT -0.522415 -1.128558 NaN
2000-01-01 00:50:00 NaT 0.1.266240 NaN
2000-01-01 01:00:00 NaT -2.411029 -0.303869 NaN
2000-01-01 01:10:00 NaT 0.050969 -0.807989 NaN
2000-01-01 01:20:00 NaT -0.466958 0.311464 NaN
2000-01-01 01:30:00 NaT -0.137329 -0.234095 NaN
2000-01-01 01:40:00 NaT -1.089133 -0.173481 NaN
C_x C_y D_y
index_x index_y
2000-01-01 00:00:00 2000-01-01 00:40:00 2.298649 0.673585 -1.586648
2000-01-01 00:10:00 2000-01-01 00:50:00 -1.791427 0.907333 0.950786
2000-01-01 00:20:00 2000-01-01 01:00:00 -0.980498 -0.625798 0.284694
2000-01-01 00:30:00 2000-01-01 01:10:00 1.337427 -0.859036 -0.237332
2000-01-01 00:40:00 NaT -1.493857 NaN NaN
2000-01-01 00:50:00 NaT 0.455737 NaN NaN
2000-01-01 01:00:00 NaT 0.393388 NaN NaN
2000-01-01 01:10:00 NaT -1.612417 NaN NaN
2000-01-01 01:20:00 NaT 2.471329 NaN NaN
2000-01-01 01:30:00 NaT -0.541828 NaN NaN
2000-01-01 01:40:00 NaT -0.162694 NaN NaN
What I want to do is shift the second index so that its timestamps line up with the first index. Can this be done with concat, join, or merge?
Maybe do `df2 = df2.reindex_axis(df1.index, 0, method='nearest')` before the concat?
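A minimal sketch of the reindex suggestion above, under a few assumptions: `reindex_axis` was removed in pandas 1.0, so this uses `DataFrame.reindex` instead; the second index is offset by 3 minutes (not in the original data) so that no timestamps match exactly; and the 5-minute `tolerance` is an illustrative choice, not something the question specifies.

```python
import numpy as np
import pandas as pd

dates1 = pd.date_range('1/1/2000', periods=11, freq='10min')
# Assumption for illustration: offset df2 by 3 minutes so no timestamp matches exactly
dates2 = pd.date_range('1/1/2000 00:43:00', periods=4, freq='10min')
df1 = pd.DataFrame(np.random.randn(11, 3), index=dates1, columns=['B', 'C', 'A'])
df2 = pd.DataFrame(np.random.randn(4, 3), index=dates2, columns=['B', 'C', 'D'])

# Snap df2 onto df1's index: each df1 timestamp takes the nearest df2 row,
# but only if that row lies within 5 minutes; otherwise the row is NaN
df2_aligned = df2.reindex(df1.index, method='nearest',
                          tolerance=pd.Timedelta('5min'))

# Both frames now share df1's index, so an ordinary join works;
# the overlapping columns B and C receive the _x / _y suffixes
df3 = df1.join(df2_aligned, lsuffix='_x', rsuffix='_y')
print(df3)
```

Without `tolerance`, every row of `df1` would be filled from whichever `df2` row is nearest, however far away it is, which is probably not wanted for the rows before 00:40 and after 01:10.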
It depends: when the timestamps do not match exactly, which direction do you want? `pd.merge_asof(df1, df2, left_on='index_x', right_on='index_y', direction='backward')` – Wen
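The `merge_asof` idea can be sketched as follows. This is one possible variant, not necessarily what the commenter had in mind: it merges on the indexes directly (`left_index=True, right_index=True`) rather than on `index_x` / `index_y` columns, uses `direction='nearest'`, and again assumes a 3-minute offset and a 5-minute `tolerance` purely for illustration. The `index_y` column is added here only to keep the matched timestamp visible.

```python
import numpy as np
import pandas as pd

dates1 = pd.date_range('1/1/2000', periods=11, freq='10min')
# Assumption for illustration: offset df2 by 3 minutes so no timestamp matches exactly
dates2 = pd.date_range('1/1/2000 00:43:00', periods=4, freq='10min')
df1 = pd.DataFrame(np.random.randn(11, 3), index=dates1, columns=['B', 'C', 'A'])
df2 = pd.DataFrame(np.random.randn(4, 3), index=dates2, columns=['B', 'C', 'D'])

# Keep df2's own timestamp as a column so the matched time survives the merge
right = df2.assign(index_y=df2.index)

# Both indexes must be sorted. Each df1 row is paired with the nearest
# df2 row, provided it lies within the 5-minute tolerance
df3 = pd.merge_asof(df1, right,
                    left_index=True, right_index=True,
                    direction='nearest',
                    tolerance=pd.Timedelta('5min'))
print(df3)
```

The `direction` argument answers the commenter's question: `'backward'` takes the last `df2` row at or before each `df1` timestamp, `'forward'` the first one at or after it, and `'nearest'` whichever is closer. With the offset data above, `'backward'` would leave 00:40 unmatched, because the earliest `df2` row is at 00:43.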