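I ran into a problem with Python + NumPy + pandas: datetime64 values are corrupted when added to a pandas DataFrame
I have a list of timestamps with millisecond precision, encoded as strings. I round them to a resolution of 10 ms, which works fine. The bug appears when I add the rounded timestamps to a DataFrame as a new column: the values of the datetime64 objects come out completely wrong.
What am I doing wrong? Or is this a pandas/NumPy bug?
By the way, I suspect the bug only occurs on Windows: I did not notice it when I tried the same code on a Mac yesterday (I have not verified this).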
import numpy
import pandas as pd
# We create a list of strings.
time_str_arr = ['2017-06-30T13:51:15.854', '2017-06-30T13:51:16.250',
                '2017-06-30T13:51:16.452', '2017-06-30T13:51:16.659']
# Then we create a time array, rounded to 10ms (actually floored,
# not rounded), everything seems to be fine here.
rounded_time = numpy.array(time_str_arr, dtype="datetime64[10ms]")
rounded_time
# Then we create a Pandas DataFrame and assign the time array as a
# column to it. The datetime64 is destroyed.
d = {'one': pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     'two': pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
df = df.assign(wrong_time=rounded_time)
df
The output I get:
one two wrong_time
a 1.0 1.0 1974-10-01 18:11:07.585
b 2.0 2.0 1974-10-01 18:11:07.625
c 3.0 3.0 1974-10-01 18:11:07.645
d NaN 4.0 1974-10-01 18:11:07.665
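The 1974 dates are what you get if the integer stored in a datetime64[10ms] value (a count of 10 ms ticks since the epoch) is read as a count of plain milliseconds. The sketch below reproduces that arithmetic with NumPy alone. It only shows that the numbers are consistent with the unit multiplier being dropped somewhere; it does not prove where that happens. The names ms, ticks and misread are illustrative and not part of the original code.

```python
import numpy as np

# Parse the first timestamp at millisecond precision.
ms = np.array(['2017-06-30T13:51:15.854'], dtype='datetime64[ms]')

# Number of whole 10 ms ticks since the epoch, which is the integer
# a datetime64[10ms] value would hold for this timestamp.
ticks = ms.astype('int64') // 10

# Read the same integers as if they were plain milliseconds.
misread = ticks.astype('datetime64[ms]')

print(misread[0])  # 1974-10-01T18:11:07.585
```

The result matches row a of the table above, so the factor of 10 in the unit appears to be lost when the array is stored in the DataFrame.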
Output of pd.show_versions():
INSTALLED VERSIONS
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None
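You can use 'pd.to_datetime(time_str_arr)'
I tried pd.to_datetime(time_str_arr). It did not change anything. The bug is not in converting the strings to datetimes; that step works fine. The bug is that the datetime64 array gets corrupted (or is not imported correctly) when I try to add it to the DataFrame.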
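A possible workaround, not confirmed anywhere in this thread: avoid handing pandas an array with a multiplied unit such as datetime64[10ms], and do the flooring on the pandas side instead. The following is a minimal sketch, assuming DatetimeIndex.floor is available in your pandas version; the column name time and the variable floored are made up for illustration.

```python
import pandas as pd

time_str_arr = ['2017-06-30T13:51:15.854', '2017-06-30T13:51:16.250',
                '2017-06-30T13:51:16.452', '2017-06-30T13:51:16.659']

# Parse at full precision, then floor to 10 ms inside pandas,
# so no datetime64[10ms] array is ever created.
floored = pd.to_datetime(time_str_arr).floor('10ms')

df = pd.DataFrame({'two': [1., 2., 3., 4.]}, index=['a', 'b', 'c', 'd'])

# .values passes a plain datetime64 array, assigned by position.
df = df.assign(time=floored.values)
print(df)
```

If the datetime64[10ms] array already exists, casting it to a unit without a multiplier before assigning, for example rounded_time.astype('datetime64[ns]'), may have the same effect, since NumPy performs that conversion itself.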