2017-01-26 51 views

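h5py datasets not visible

After transferring my HDF5 file to an Amazon EC2 Linux instance, I can't seem to see the datasets in the file (5 GB, md5sum verified after the transfer).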

When I run this code:

import h5py 
h5_fname = 'DATA\DATA.h5' 
print (h5py.version.info) 
f = h5py.File(h5_fname, 'r') 
print(f) 
for name in f: 
    print(name) 
    print(f[name].shape) 
f.close() 

On my local machine I get the following, which is correct:

h5py 2.6.0 
HDF5 1.8.15 
Python 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)] 
sys.platform win32 
sys.maxsize  9223372036854775807 
numpy 1.12.0 

<HDF5 file "DATA.h5" (mode r)> 
X_train 
(1397, 1, 128, 128, 128) 
y_train 
(1397, 1) 
i_train 
(1397, 1) 
X_test 
(198, 1, 128, 128, 128) 
y_test 
(198, 1) 
i_test 
(198, 1) 

When run on the Amazon instance:

h5py 2.6.0 
HDF5 1.8.17 
Python 3.5.1 (default, Sep 13 2016, 18:48:37) 
[GCC 4.8.3 20140911 (Red Hat 4.8.3-9)] 
sys.platform linux 
sys.maxsize  9223372036854775807 
numpy 1.11.3 

<HDF5 file "DATA\DATA.h5" (mode r)> 

There are version differences, but I don't think they are the problem here. Any suggestions?

Edit: the code I used to create the HDF5 file, in case it is useful:

import h5py
import numpy as np

def create_h5(fname_):
    f = h5py.File(fname_, 'w', libver='latest') 
    dtype_ = h5py.special_dtype(vlen=bytes) 
    num_samples_train = 1397 
    num_samples_test = 1595 - 1397  
    chunks_ = (1, 1, 128, 128, 128)  # one sample per chunk: 128**3 float32 values = 8 MiB uncompressed
    chunks_2 = (1, 1) 

    f.create_dataset('X_train', (num_samples_train, 1, 128, 128, 128), dtype=np.float32, maxshape=(None, None, None, 128, 128), chunks=chunks_, compression="gzip") 
    f.create_dataset('y_train', (num_samples_train, 1), dtype=np.int32, maxshape=(None, 1), chunks=chunks_2, compression="gzip") 
    f.create_dataset('i_train', (num_samples_train, 1), dtype=dtype_, maxshape=(None, 1), chunks=chunks_2, compression="gzip") 


    f.create_dataset('X_test', (num_samples_test, 1, 128, 128, 128), dtype=np.float32, maxshape=(None, None, None, 128, 128), chunks=chunks_, compression="gzip") 
    f.create_dataset('y_test', (num_samples_test, 1), dtype=np.int32, maxshape=(None, 1), chunks=chunks_2, compression="gzip") 
    f.create_dataset('i_test', (num_samples_test, 1), dtype=dtype_, maxshape=(None, 1), chunks=chunks_2, compression="gzip") 


    f.flush() 
    f.close() 
    print('HDF5 file created') 

Answer


Changing h5_fname = 'DATA\DATA.h5' to h5_fname = 'DATA//DATA.h5' solved the problem.

However, this is very strange, because even with the first option I was able to open the file.
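
A possible explanation, offered as an assumption rather than something confirmed in this thread: on Linux the backslash is an ordinary filename character, so 'DATA\DATA.h5' refers to a single file with that literal name in the current directory, not to DATA.h5 inside a DATA directory. Mode 'r' does not create files, so a file with that literal name presumably already existed on the instance (for example, left over from an earlier write-mode run), which would explain an open that succeeds but lists no datasets. The sketch below uses only the standard library modules posixpath and ntpath to show how each platform's rules split the two strings; the helper name split_under is made up for illustration.

```python
import ntpath      # Windows path rules, importable on any platform
import posixpath   # POSIX (Linux) path rules, importable on any platform

windows_style = r'DATA\DATA.h5'   # the string from the question
fixed = 'DATA//DATA.h5'           # the string from the answer


def split_under(rules, path):
    """Return (directory, filename) as seen by the given path module."""
    return rules.dirname(path), rules.basename(path)


# Under POSIX rules the backslash is part of the filename: no directory at all.
posix_view = split_under(posixpath, windows_style)

# Under Windows rules the backslash separates directory and filename.
windows_view = split_under(ntpath, windows_style)

# A doubled forward slash is collapsed to a single separator on POSIX.
normalized = posixpath.normpath(fixed)

print('POSIX view:  ', posix_view)
print('Windows view:', windows_view)
print('Normalized:  ', normalized)
```

If this is what happened, listing the working directory on the instance should show an entry literally named DATA\DATA.h5 next to the DATA directory.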


You should use 'os.path.join' from the 'os' package.
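
A minimal sketch of that suggestion, assuming the file sits in a DATA directory relative to the working directory as in the question. Both os.path.join and pathlib build the path with the separator of the platform the script runs on, so the same line works on Windows and on the Linux instance.

```python
import os
from pathlib import Path

# os.path.join inserts the correct separator for the current platform.
h5_fname = os.path.join('DATA', 'DATA.h5')

# pathlib offers the same behaviour through the / operator.
h5_path = Path('DATA') / 'DATA.h5'

print(h5_fname)
print(h5_path)
```

Either result can then be handed to h5py.File(h5_fname, 'r'). With the pathlib variant, passing str(h5_path) is the conservative choice, since whether a Path object is accepted directly may depend on the h5py version.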