You can use:
import pandas as pd
import io
temp=u"""#,Job_ID,Date/Time,value1,value2,
0,ID1,05/01 24:00:00,5,6
1,ID2,05/02 24:00:00,6,15
2,ID3,05/03 24:00:00,20,21"""
# %H only accepts hours 00-23, so map hour 24 to 00 before parsing
dateparse = lambda x: pd.to_datetime(x.replace('24:','00:'), format='%m/%d %H:%M:%S')
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp),
                 skipinitialspace=True,
                 date_parser=dateparse,
                 parse_dates=['Date/Time'],
                 index_col=['Date/Time'],
                 usecols=['Job_ID', 'Date/Time', 'value1', 'value2'],
                 header=0)
print(df)
           Job_ID  value1  value2
Date/Time
1900-05-01    ID1       5       6
1900-05-02    ID2       6      15
1900-05-03    ID3      20      21
Another solution uses a double replace, which also lets you add the year:
# rewrite '05/01 24:00:00' as '05/01/1900 00:00:00'; the returned string is then parsed as a date
dateparse = lambda x: x.replace('24:','00:').replace(' ','/1900 ')
df = pd.read_csv(io.StringIO(temp),
                 skipinitialspace=True,
                 date_parser=dateparse,
                 parse_dates=['Date/Time'],
                 index_col=['Date/Time'],
                 usecols=['Job_ID', 'Date/Time', 'value1', 'value2'],
                 header=0)
print(df)
           Job_ID  value1  value2
Date/Time
1900-05-01    ID1       5       6
1900-05-02    ID2       6      15
1900-05-03    ID3      20      21
# same parser, inserting 2016 as the year instead of 1900
dateparse = lambda x: x.replace('24:','00:').replace(' ','/2016 ')
df = pd.read_csv(io.StringIO(temp),
                 skipinitialspace=True,
                 date_parser=dateparse,
                 parse_dates=['Date/Time'],
                 index_col=['Date/Time'],
                 usecols=['Job_ID', 'Date/Time', 'value1', 'value2'],
                 header=0)
print(df)
           Job_ID  value1  value2
Date/Time
2016-05-01    ID1       5       6
2016-05-02    ID2       6      15
2016-05-03    ID3      20      21
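A sketch of the same idea without date_parser, which pandas 2.0 deprecated: read the column as text and convert it column-wise afterwards. The year 2016, the names is_24, cleaned, stamps, same_day and next_day, and the optional one-day roll-over are assumptions for illustration, not part of the answer above.

```python
import io
import pandas as pd

temp = """#,Job_ID,Date/Time,value1,value2
0,ID1,05/01 24:00:00,5,6
1,ID2,05/02 24:00:00,6,15
2,ID3,05/03 24:00:00,20,21"""

# Read Date/Time as plain text; no date parsing happens inside read_csv
df = pd.read_csv(io.StringIO(temp),
                 skipinitialspace=True,
                 usecols=['Job_ID', 'Date/Time', 'value1', 'value2'],
                 header=0)

raw = df['Date/Time']
# Remember which rows used hour 24 before rewriting them
is_24 = raw.str.contains(' 24:', regex=False)

# ' 24:' (with the leading space) only matches the hour field, not minutes
cleaned = (raw.str.replace(' 24:', ' 00:', regex=False)
              .str.replace(' ', '/2016 ', regex=False))  # 2016 is an assumed year

stamps = pd.to_datetime(cleaned, format='%m/%d/%Y %H:%M:%S')
values = df.drop(columns='Date/Time')

# Same result as the date_parser variants: 24:00:00 becomes 00:00:00 on the same date
same_day = values.set_index(pd.DatetimeIndex(stamps, name='Date/Time'))

# Optional alternative: treat 24:00:00 as midnight of the following day
rolled = stamps + pd.to_timedelta(is_24.astype(int), unit='D')
next_day = values.set_index(pd.DatetimeIndex(rolled, name='Date/Time'))

print(same_day)
print(next_day)
```

Which of the two frames is appropriate depends on what 24:00:00 means in the source file; the original answer keeps the same calendar date.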
You are always spot on! – Andreuccio
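I am facing the task of importing a similar dataset, with hourly values instead of daily ones. So rather than replacing '24:' with '00:', I need to shift every hour back by one unit, i.e. '24:' -> '23:', ..., '01:' -> '00:'. How would the code change? – Andreuccio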
I think the same approach works, just subtract one hour, e.g. 'df.index = df.index - pd.Timedelta(1, unit='h')' – jezrael
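A minimal sketch of that suggestion for hourly data, assuming hours are labelled 01 to 24 and the year is 2016 (the sample rows and the names parts, dates and offsets are hypothetical, not from the thread). It parses the time part with pd.to_timedelta, which accepts 24:00:00, so hour 24 does not have to be rewritten before the one-hour shift:

```python
import io
import pandas as pd

# Hypothetical hourly sample: hours run 01..24 instead of 00..23
temp = """#,Job_ID,Date/Time,value1,value2
0,ID1,05/01 01:00:00,5,6
1,ID2,05/01 02:00:00,6,15
2,ID3,05/01 24:00:00,20,21"""

df = pd.read_csv(io.StringIO(temp),
                 skipinitialspace=True,
                 usecols=['Job_ID', 'Date/Time', 'value1', 'value2'],
                 header=0)

# Split 'MM/DD HH:MM:SS' into the date part (column 0) and the time part (column 1)
parts = df['Date/Time'].str.split(' ', n=1, expand=True)

# Parse the date alone; the year is an assumption because the file has none
dates = pd.to_datetime('2016/' + parts[0], format='%Y/%m/%d')

# Treat the time part as an offset from midnight; '24:00:00' becomes one full day
offsets = pd.to_timedelta(parts[1])

# Combine both, then move every stamp back one hour: 24 -> 23, ..., 01 -> 00
shifted = dates + offsets - pd.Timedelta(1, unit='h')
df.index = pd.DatetimeIndex(shifted, name='Date/Time')
df = df.drop(columns='Date/Time')

print(df)
```

With this layout the row stamped 05/01 24:00:00 ends up at 23:00 on 05/01, whereas replacing '24:' with '00:' first and then subtracting an hour would move it to 23:00 on the previous day.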