You can simply use to_datetime:
print (df)
DateTime
0 3/1/2016 12:15:00 AM
1 3/1/2016 12:30:00 AM
2 3/1/2016 12:45:00 AM
3 3/1/2016 1:00:00 AM
4 3/1/2016 1:15:00 AM
5 3/1/2016 1:30:00 AM
6 3/1/2016 1:45:00 AM
7 3/1/2016 2:00:00 AM
8 3/1/2016 2:15:00 PM <- time changed to PM for better testing
df.DateTime = pd.to_datetime(df.DateTime)
print (df)
DateTime
0 2016-03-01 00:15:00
1 2016-03-01 00:30:00
2 2016-03-01 00:45:00
3 2016-03-01 01:00:00
4 2016-03-01 01:15:00
5 2016-03-01 01:30:00
6 2016-03-01 01:45:00
7 2016-03-01 02:00:00
8 2016-03-01 14:15:00
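For anyone who wants to reproduce this, here is a minimal, self-contained sketch. The three-row sample frame and the explicit format string are my own assumptions, chosen to match the month/day/year, 12-hour strings shown above; they are not part of the original data. Passing format is optional, but it removes any day/month ambiguity.

```python
import pandas as pd

# Hypothetical sample data mirroring the strings shown above
# (month/day/year with a 12-hour clock and AM/PM marker).
df = pd.DataFrame({"DateTime": ["3/1/2016 12:15:00 AM",
                                "3/1/2016 1:00:00 AM",
                                "3/1/2016 2:15:00 PM"]})

# Assumed format string: %I is the 12-hour clock hour, %p is AM/PM.
df["DateTime"] = pd.to_datetime(df["DateTime"], format="%m/%d/%Y %I:%M:%S %p")
print(df)
```

Without format, pandas tries to infer the layout, which usually works for data like this but can be slower or ambiguous for day-first dates.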
EDIT:
In that case you need the parameter errors='coerce', which replaces the problematic values with NaT:
print (df)
DateTime
0 3/1/2016 28:15:00 AM <- wrong date
1 3/1/2016 12:30:00 AM
2 3/1/2016 12:45:00 AM
3 3/1/2016 1:00:00 AM
4 3/1/2016 1:15:00 AM
5 3/1/2016 1:30:00 AM
6 3/1/2016 1:45:00 AM
7 3/1/2016 2:00:00 AM
8 3/1/2016 2:15:00 PM
df.DateTime = pd.to_datetime(df.DateTime, errors='coerce')
print (df)
DateTime
0 NaT
1 2016-03-01 00:30:00
2 2016-03-01 00:45:00
3 2016-03-01 01:00:00
4 2016-03-01 01:15:00
5 2016-03-01 01:30:00
6 2016-03-01 01:45:00
7 2016-03-01 02:00:00
8 2016-03-01 14:15:00
To check which values are problematic, use boolean indexing:
print (df[pd.to_datetime(df.DateTime, errors='coerce').isnull()])
DateTime
0 3/1/2016 28:15:00 AM
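A small sketch of the same check that keeps the original strings and stores the parsed values in a separate column, so the bad rows can still be inspected afterwards. The sample frame, the explicit format string, and the names parsed, bad_rows, and Parsed are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical sample data; the first row has an invalid hour (28).
df = pd.DataFrame({"DateTime": ["3/1/2016 28:15:00 AM",
                                "3/1/2016 12:30:00 AM",
                                "3/1/2016 2:15:00 PM"]})

# errors='coerce' turns anything that cannot be parsed into NaT
# instead of raising ValueError. The format string is an assumption.
parsed = pd.to_datetime(df["DateTime"],
                        format="%m/%d/%Y %I:%M:%S %p",
                        errors="coerce")

# Boolean mask: True where parsing failed. Select those rows from the
# untouched original column so the offending strings stay visible.
bad_rows = df[parsed.isna()]
print(bad_rows)

# Keep the raw strings and add the parsed values alongside them.
df["Parsed"] = parsed
```

Overwriting df.DateTime directly, as in the answer, is fine too; a separate column just makes it easier to fix the source strings later.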
Thanks, I tried that, but I get this error: ValueError: Unknown string format. – nish
Please check the edited answer. – jezrael
Checked. It works, thank you :) – nish