Here you go. This will be slow.
Note that this counts each row as overlapping with itself, so the results column will never be 0.
import pandas as pd

df = pd.DataFrame({'start_time': [4,3,1,2],'end_time': [7,5,3,8]})
df = df[['start_time','end_time']]  # just reordering the columns for aesthetics

def overlaps_with_row(row, frame):
    # rows that started at or before this row's start time...
    starts_before_mask = frame.start_time <= row.start_time
    # ...and had not yet ended at that moment
    ends_after_mask = frame.end_time > row.start_time
    return (starts_before_mask & ends_after_mask).sum()

df['number_which_overlap'] = df.apply(overlaps_with_row, frame=df, axis=1)
Yields (subtract 1 from the result if you want it the other way, i.e. not counting the row itself):
In [8]: df
Out[8]:
   start_time  end_time  number_which_overlap
0           4         7                     3
1           3         5                     2
2           1         3                     1
3           2         8                     2
[4 rows x 3 columns]
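If the row-by-row `apply` gets too slow on a large frame, here is a rough sketch of a different technique (sorting plus `numpy.searchsorted`, not the method above). It relies on the assumption that every row has `end_time > start_time`, so the number of events active at time `t` equals the number that have started by `t` minus the number that have ended by `t`. The helper name `count_active_at_start` and its `include_self` flag are made up for illustration.

```python
import numpy as np
import pandas as pd

def count_active_at_start(frame, include_self=True):
    """Hypothetical helper: events active at each row's start_time.

    Assumes end_time > start_time for every row.
    """
    starts = np.sort(frame['start_time'].to_numpy())
    ends = np.sort(frame['end_time'].to_numpy())
    query = frame['start_time'].to_numpy()
    # number of events with start_time <= t, for each query time t
    started = np.searchsorted(starts, query, side='right')
    # number of events with end_time <= t (already over at time t)
    finished = np.searchsorted(ends, query, side='right')
    counts = started - finished
    # every row counts itself, so subtract 1 to exclude it
    return counts if include_self else counts - 1

df = pd.DataFrame({'start_time': [4, 3, 1, 2], 'end_time': [7, 5, 3, 8]})
df['number_which_overlap'] = count_active_at_start(df)
```

On the four example rows this should give the same column as the `apply` version. It does two sorts and two binary searches instead of comparing every row against every other row, which is the design reason to prefer it on bigger data.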
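I'm confused about what you want. Could you be more specific? Suppose row_1 has start_time = 4 and end_time = 7, row_2 has start_time = 3 and end_time = 5, row_3 has start_time = 1 and end_time = 3, and row_4 has start_time = 2 and end_time = 8. What output do you want? – exp1orer
Just realized I misspoke above. It should count the events that are still active at the *start* time of the given record. So for your example you would get this: Row_1: start: 4 end: 7 concurrent: 3 | Row_2: start: 3 end: 5 concurrent: 2 | Row_3: start: 1 end: 3 concurrent: 1 | Row_4: start: 2 end: 8 concurrent: 2 – user3838505
So the question is "how many rows are still active when this entry starts"? – exp1orer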