2015-12-22

Forcing data types for an HDF5 file with h5py

I have a CSV file with "Date", "Time", and other columns (about 10 in total):

Date,Time,C 
20020515,123000000,10293 
20020515,160000000,10287 
20020516,111800000,10270 
20020516,160000000,10260 
20020517,130500000,10349 
20020517,160000000,10276 
20020520,123700000,10313 
20020520,160000000,10258 
20020521,114500000,10223 

I want to load this CSV file into an HDF5 file with the Date and Time columns stored as type "string" rather than int32. This is what I tried:

import h5py,numpy as np 
my_data = np.genfromtxt("/tmp/data.txt",delimiter=",",dtype=None,names=True) 
myFile="/tmp/data.h5" 
with h5py.File(myFile,"a") as f: 
    dset = f.create_dataset('foo',data=my_data) 

I want "Date" and "Time" to be stored in the HDF5 file as type "string", not Int32.
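To see why the columns end up as integers, it can help to inspect the dtype that `np.genfromtxt` infers when `dtype=None`. The sketch below is only an illustration: it reads the first rows of the sample from an in-memory `io.StringIO` buffer (an assumption, standing in for `/tmp/data.txt`), and the exact integer width it reports depends on the platform.

```python
import io

import numpy as np

# First rows of the sample, held in memory instead of /tmp/data.txt (assumption).
sample = io.StringIO(
    "Date,Time,C\n"
    "20020515,123000000,10293\n"
    "20020515,160000000,10287\n"
)

# dtype=None lets genfromtxt guess a type per column;
# columns that contain only digits are inferred as integers.
my_data = np.genfromtxt(sample, delimiter=",", dtype=None, names=True)

# Show the inferred type of every field.
for name in my_data.dtype.names:
    print(name, my_data.dtype[name])
```

Since h5py takes the dataset type from the array it is given, the integer fields are carried over to the file unchanged.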


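I don't think this is possible. According to the [docs](http://docs.h5py.org/en/latest/high/dataset.html): "Datasets are very similar to NumPy arrays. They are homogeneous collections of data elements, with an immutable datatype and (hyper)rectangular shape." This means all columns must have the same `dtype`. – iled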


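Do you want to change the way the data is stored in the HDF5 file, or do you want to be able to convert these columns to strings after reading the data back from the file? –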


I want to change the way I store the data. I want to store them as strings rather than integers. – NinjaGaiden

Answer


A simple solution is to change the dtype of my_data before writing it to the file:

# 'Time' values in the sample have nine digits, so use S9; S8 would truncate them
newtype = np.dtype([('Date', 'S8'), ('Time', 'S9'), ('C', '<i8')])
dset2 = f.create_dataset('foo2', data=my_data.astype(newtype))
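The cast itself can be tried in NumPy alone, before any file is involved. The following is a minimal sketch with a hand-built, integer-typed stand-in for `my_data` (the names `int_type`, `str_type`, `converted` and `truncated` are illustrative, not from the original post). The Time values in the sample have nine digits, so the sketch uses a nine-byte field for them and also shows what a field that is too narrow does.

```python
import numpy as np

# Integer-typed stand-in for my_data (field names taken from the sample header).
int_type = np.dtype([('Date', '<i8'), ('Time', '<i8'), ('C', '<i8')])
my_data = np.array(
    [(20020515, 123000000, 10293), (20020516, 111800000, 10270)],
    dtype=int_type,
)

# Fixed-width byte strings: 8 bytes for Date, 9 bytes for Time.
str_type = np.dtype([('Date', 'S8'), ('Time', 'S9'), ('C', '<i8')])
converted = my_data.astype(str_type)

# A field that is too narrow silently drops the trailing digits.
truncated = my_data['Time'].astype('S8')

print(converted['Time'])
print(truncated)
```

Checking the widest value in each column before choosing the `S` width avoids this kind of silent truncation.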

You can also create an empty dataset by passing the appropriate dtype= and shape= parameters to f.create_dataset, and then fill it with the values from my_data:

dset3 = f.create_dataset('foo3', shape=my_data.shape, dtype=newtype) 
dset3[:] = my_data.astype(newtype) 
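A self-contained round trip of this create-then-fill approach might look like the sketch below. It is an illustration under stated assumptions: the data is a two-row stand-in for `my_data`, the file is written to a temporary directory rather than `/tmp/data.h5`, and the Time field is given nine bytes because the sample values have nine digits.

```python
import os
import tempfile

import h5py
import numpy as np

# Target layout: fixed-width byte strings for Date and Time, integer for C.
newtype = np.dtype([('Date', 'S8'), ('Time', 'S9'), ('C', '<i8')])

# Two-row, integer-typed stand-in for my_data.
my_data = np.array(
    [(20020515, 123000000, 10293), (20020515, 160000000, 10287)],
    dtype=[('Date', '<i8'), ('Time', '<i8'), ('C', '<i8')],
)

# Temporary location so the sketch does not touch /tmp/data.h5 (assumption).
path = os.path.join(tempfile.mkdtemp(), "data.h5")

with h5py.File(path, "w") as f:
    # Create the empty dataset first, then fill it with the cast values.
    dset3 = f.create_dataset('foo3', shape=my_data.shape, dtype=newtype)
    dset3[:] = my_data.astype(newtype)

# Read the dataset back to confirm how the fields were stored.
with h5py.File(path, "r") as f:
    stored = f['foo3'][:]

print(stored.dtype)
print(stored['Date'])
```

On Python 3 the string fields come back as `bytes`, so they may need decoding before being compared with ordinary `str` values.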

Note that I still have to cast my_data to newtype before writing; h5py does not seem to be able to handle the type conversion itself:

In [15]: dset3[:] = my_data 
--------------------------------------------------------------------------- 
OSError         Traceback (most recent call last) 
<ipython-input-15-6e62dae3d59a> in <module>() 
----> 1 dset3[:] = my_data 

h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/tmp/pip-build-aayglkf0/h5py/h5py/_objects.c:2579)() 

h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/tmp/pip-build-aayglkf0/h5py/h5py/_objects.c:2538)() 

/home/alistair/.venvs/core3/lib/python3.4/site-packages/h5py/_hl/dataset.py in __setitem__(self, args, val) 
    584   mspace = h5s.create_simple(mshape_pad, (h5s.UNLIMITED,)*len(mshape_pad)) 
    585   for fspace in selection.broadcast(mshape): 
--> 586    self.id.write(mspace, fspace, val, mtype) 
    587 
    588  def read_direct(self, dest, source_sel=None, dest_sel=None): 

h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/tmp/pip-build-aayglkf0/h5py/h5py/_objects.c:2579)() 

h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/tmp/pip-build-aayglkf0/h5py/h5py/_objects.c:2538)() 

h5py/h5d.pyx in h5py.h5d.DatasetID.write (/tmp/pip-build-aayglkf0/h5py/h5py/h5d.c:3421)() 

h5py/_proxy.pyx in h5py._proxy.dset_rw (/tmp/pip-build-aayglkf0/h5py/h5py/_proxy.c:1794)() 

h5py/_proxy.pyx in h5py._proxy.H5PY_H5Dwrite (/tmp/pip-build-aayglkf0/h5py/h5py/_proxy.c:1501)() 

OSError: Can't prepare for writing data (No appropriate function for conversion path)
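Given that error, one way to avoid repeating the cast by hand is a small wrapper that converts the array to the dataset's own dtype in NumPy before assigning. This is only a sketch: `write_cast` is a hypothetical helper, not part of h5py, and the file path and sample rows are assumptions made for the example.

```python
import os
import tempfile

import h5py
import numpy as np


def write_cast(dset, values):
    """Hypothetical helper: cast values to dset's dtype in NumPy, then write."""
    dset[:] = np.asarray(values).astype(dset.dtype)


newtype = np.dtype([('Date', 'S8'), ('Time', 'S9'), ('C', '<i8')])

# Integer-typed stand-in for my_data.
my_data = np.array(
    [(20020517, 130500000, 10349), (20020517, 160000000, 10276)],
    dtype=[('Date', '<i8'), ('Time', '<i8'), ('C', '<i8')],
)

# Temporary file used only for this sketch (assumption).
path = os.path.join(tempfile.mkdtemp(), "cast.h5")

with h5py.File(path, "w") as f:
    dset = f.create_dataset('foo', shape=my_data.shape, dtype=newtype)
    # NumPy performs the integer-to-string conversion; h5py only stores the result.
    write_cast(dset, my_data)
    result = dset[:]

print(result)
```

The design choice here is simply to keep the conversion on the NumPy side, where integer-to-string casts are supported, and to hand h5py data that already matches the dataset type.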