You can use read_csv:
import pandas as pd
import io
temp=u'''01/Jul/2016 00:05:09 8438.2
01/Jul/2016 00:05:19 8422.4 g'''
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp),
                 sep=r'\s+',  # raw string avoids an invalid-escape warning
                 names=['date','time','float','string'],
                 parse_dates=[['date','time']])
print (df)
            date_time   float string
0 2016-07-01 00:05:09  8438.2    NaN
1 2016-07-01 00:05:19  8422.4      g
Or:
import pandas as pd
import io
temp=u'''01/Jul/2016 00:05:09 8438.2
01/Jul/2016 00:05:19 8422.4 g'''
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp),
                 delim_whitespace=True,  # deprecated since pandas 2.2; sep=r'\s+' is equivalent
                 names=['date','time','float','string'],
                 parse_dates=[['date','time']])
print (df)
            date_time   float string
0 2016-07-01 00:05:09  8438.2    NaN
1 2016-07-01 00:05:19  8422.4      g
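As far as I know, newer pandas releases (2.2 and later) deprecate both delim_whitespace and the nested-list form of parse_dates. A minimal sketch of the same result without either of them, assuming the sample data shown above: read the date and time as plain strings, then combine them with pd.to_datetime. The format string is my assumption based on the sample rows, and %b expects English month abbreviations.

```python
import io
import pandas as pd

temp = u'''01/Jul/2016 00:05:09 8438.2
01/Jul/2016 00:05:19 8422.4 g'''

# read the four whitespace-separated columns; date and time stay as strings
df = pd.read_csv(io.StringIO(temp),
                 sep=r'\s+',
                 names=['date', 'time', 'float', 'string'])

# join date and time, then parse with an explicit format (assumed from the sample)
df['date_time'] = pd.to_datetime(df['date'] + ' ' + df['time'],
                                 format='%d/%b/%Y %H:%M:%S')

# drop the helper columns and put date_time first
df = df[['date_time', 'float', 'string']]
print(df)
```

Passing an explicit format also avoids pandas guessing the day/month order.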
A solution with read_fwf:
import pandas as pd
import io
temp=u'''01/Jul/2016 00:05:09 8438.2
01/Jul/2016 00:05:19 8422.4 g'''
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_fwf(io.StringIO(temp),
                 names=['date','time','float','string'],
                 parse_dates=[['date','time']])
print (df)
            date_time   float string
0 2016-07-01 00:05:09  8438.2    NaN
1 2016-07-01 00:05:19  8422.4      g
You can also specify the column widths:
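# the parameter is widths (not fwidths), one width per name;
# these values fit the single-space sample above, adjust them to your file
df = pd.read_fwf(io.StringIO(temp),
                 widths=[12, 9, 7, 1],
                 names=['date','time','float','string'],
                 parse_dates=[['date','time']])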
print (df)
            date_time   float string
0 2016-07-01 00:05:09  8438.2    NaN
1 2016-07-01 00:05:19  8422.4      g
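If the date and time always occupy the same positions, explicit colspecs may be easier to reason about than widths. The following is only a sketch: the offsets are assumptions derived from the single-space sample above, so they need adjusting for a real file. It reads date and time as one field and converts it afterwards, which also sidesteps the nested parse_dates form.

```python
import io
import pandas as pd

temp = u'''01/Jul/2016 00:05:09 8438.2
01/Jul/2016 00:05:19 8422.4 g'''

# half-open [start, end) character intervals, assumed from the sample:
# 0-19 date and time, 21-26 float, 28 optional one-character string
df = pd.read_fwf(io.StringIO(temp),
                 colspecs=[(0, 20), (21, 27), (28, 29)],
                 names=['date_time', 'float', 'string'])

# convert the combined text field to datetime with an explicit format
df['date_time'] = pd.to_datetime(df['date_time'],
                                 format='%d/%b/%Y %H:%M:%S')
print(df)
```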