2017-07-05

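How to extract hours worked from two datetimes

I have a pandas DataFrame with each employee's start time and end time. I want to know how many hours the employee worked in each shift (shift 1: 8:00 AM to 2:00 PM; shift 2: 2:00 PM to 10:00 PM; shift 3: 10:00 PM to 8:00 AM). Thanks for your help.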

                Start                 End
0 2015-01-01 18:44:00 2015-01-02 07:31:00
1 2015-01-01 06:38:00 2015-01-01 19:57:00
2 2015-01-01 06:34:00 2015-01-01 19:13:00
3 2015-01-01 18:48:00 2015-01-02 07:15:00
4 2015-01-01 06:50:00 2015-01-01 20:02:00
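What is your expected output? – Zero

`df.diff(axis=1)` –

While you show us your expected output, check [this link](https://stackoverflow.com/questions/39370879/extract-hour-from-timestamp-with-python) for extracting the hour. Do you want minutes/seconds to come into play? – MattR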

Answer

Note that my answer is not fully polished yet. First, I create the sample dataset.

import pandas as pd 

df = pd.DataFrame([ 
    ['2015-01-01 18:44:00', '2015-01-02 07:31:00'], 
    ['2015-01-01 06:38:00', '2015-01-01 19:57:00'], 
    ['2015-01-01 06:34:00', '2015-01-01 19:13:00'], 
    ['2015-01-01 18:48:00', '2015-01-02 07:15:00'], 
    ['2015-01-01 06:50:00', '2015-01-01 20:02:00'] 
], columns=['start', 'stop']) 

# Parse the string columns into datetime64 values.
df['start'] = pd.to_datetime(df['start'])
df['stop'] = pd.to_datetime(df['stop'])

Then find the time worked within each shift interval:

import pandas as pd

# Shift boundaries as hour offsets from midnight; shift 3 (22:00-08:00)
# runs past midnight, so its end is written as hour 32.
SHIFTS = [(8, 14), (14, 22), (22, 32)]


def find_interval(r):
    """
    r: row of dataframe, with 'start' and 'stop' column
    Returns the time worked in shifts 1, 2 and 3 as a list of Timedeltas.
    """
    t_start = r['start']
    t_stop = r['stop']
    shift_hours = [pd.Timedelta(0)] * len(SHIFTS)
    # Begin one day early: the night shift that started the previous
    # evening covers work done before 08:00.
    day = t_start.normalize() - pd.Timedelta(days=1)
    while day <= t_stop:
        for i, (h_start, h_stop) in enumerate(SHIFTS):
            s_start = day + pd.Timedelta(hours=h_start)
            s_stop = day + pd.Timedelta(hours=h_stop)
            # Overlap of the worked interval with this shift window,
            # clipped at zero when the two do not intersect.
            overlap = min(t_stop, s_stop) - max(t_start, s_start)
            shift_hours[i] += max(overlap, pd.Timedelta(0))
        day += pd.Timedelta(days=1)
    return shift_hours
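
The core of this step is measuring how much two time intervals intersect. Below is a minimal sketch of that idea on its own, applied to row 0 of the sample data. The helper name `overlap` is made up for illustration and is not part of the code above; it uses only the standard library.

```python
from datetime import datetime, timedelta


def overlap(a_start, a_stop, b_start, b_stop):
    """Length of the intersection of [a_start, a_stop] and [b_start, b_stop]."""
    delta = min(a_stop, b_stop) - max(a_start, b_start)
    # A negative difference means the intervals do not intersect.
    return max(delta, timedelta(0))


# Worked interval taken from row 0 of the sample data.
work_start = datetime(2015, 1, 1, 18, 44)
work_stop = datetime(2015, 1, 2, 7, 31)

# Shift 2 (14:00-22:00) and shift 3 (22:00-08:00 next day) on 2015-01-01.
in_shift2 = overlap(work_start, work_stop,
                    datetime(2015, 1, 1, 14), datetime(2015, 1, 1, 22))
in_shift3 = overlap(work_start, work_stop,
                    datetime(2015, 1, 1, 22), datetime(2015, 1, 2, 8))

print(in_shift2)  # 3:16:00
print(in_shift3)  # 9:31:00
```

The two pieces add up to the full 12 hours 47 minutes between start and stop, which is a quick sanity check for any per-shift split.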

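Finally, concatenate the result back onto the original DataFrame:

df_shift = pd.DataFrame(
    [find_interval(r) for _, r in df.iterrows()],
    columns=['shift1', 'shift2', 'shift3'],
)
df_out = pd.concat((df, df_shift), axis=1)  # output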
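
Since the question asks for hours, the timedelta columns can be turned into decimal hours. Below is a sketch of one way to do that. The column names `shift1` to `shift3` are assumed, and the values are illustrative rather than the output of the code above.

```python
import pandas as pd

# Hypothetical per-shift durations for two employees (illustrative values).
shifts = pd.DataFrame({
    'shift1': pd.to_timedelta(['0 days 00:00:00', '0 days 06:00:00']),
    'shift2': pd.to_timedelta(['0 days 03:16:00', '0 days 05:57:00']),
    'shift3': pd.to_timedelta(['0 days 09:31:00', '0 days 01:22:00']),
})

# Series.dt.total_seconds() returns seconds as floats; divide by 3600 for hours.
hours = shifts.apply(lambda col: col.dt.total_seconds() / 3600).round(2)
print(hours)
```

Rounding to two decimals is only for display; keep the unrounded values if the hours feed into further calculations.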