2017-07-03 94 views


columns = ['col1', 'col2']
condition = "col2==1"
with pd.HDFStore(Hdf5File, mode='r', format='table') as store:
    if groupname in store:
        df = pd.read_hdf(store, key=groupname, columns=columns, where=[condition])


TypeError: cannot pass a column specification when reading a Fixed format store. this store must be selected in its entirety
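The message says the key was written in the fixed format, which cannot be read with `columns=` or `where=`. A minimal sketch for confirming that before querying is below. The helper name `is_table_format` is made up for illustration; it relies on `HDFStore.get_storer` and the storer's `is_table` attribute, and it assumes PyTables is installed.

```python
import pandas as pd


def is_table_format(path, key):
    """Return True if `key` in the HDF5 file at `path` was written with format='table'.

    Hypothetical helper, not part of pandas.
    """
    # Open read-only so the file is never modified.
    with pd.HDFStore(path, mode="r") as store:
        if key not in store:
            raise KeyError(f"{key!r} not found in {path!r}")
        # Table storers report is_table=True, fixed storers report False.
        return bool(store.get_storer(key).is_table)
```

If this returns `False` for your group, `columns=` and `where=` will keep raising the `TypeError` above, no matter which arguments are passed when the store is opened.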





Possible duplicate of [Python pandas Reading specific values from HDF5 files using read_hdf and HDFStore.select](https://stackoverflow.com/questions/26302480/python-pandas-reading-specific-values-from-hdf5-files-using-read-hdf-and-hdfstor)





import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.rand(100,5), columns=list('abcde'))
df.to_hdf('c:/temp/file.h5', key='df_key', format='t', data_columns=True)  # table format, all columns queryable

In [10]: pd.read_hdf('c:/temp/file.h5', 'df_key', where="a > 0.5 and a < 0.75") 
      a   b   c   d   e 
3 0.744123 0.515697 0.005335 0.017147 0.176254 
5 0.555202 0.074128 0.874943 0.660555 0.776340 
6 0.667145 0.278355 0.661728 0.705750 0.623682 
8 0.701163 0.429860 0.223079 0.735633 0.476182 
14 0.645130 0.302878 0.428298 0.969632 0.983690 
15 0.633334 0.898632 0.881866 0.228983 0.216519 
16 0.535633 0.906661 0.221823 0.608291 0.330101 
17 0.715708 0.478515 0.002676 0.231314 0.075967 
18 0.587762 0.262281 0.458854 0.811845 0.921100 
21 0.551251 0.537855 0.906546 0.169346 0.063612 
..  ...  ...  ...  ...  ... 
68 0.610958 0.874373 0.785681 0.147954 0.966443 
72 0.619666 0.818202 0.378740 0.416452 0.903129 
73 0.500782 0.536064 0.697678 0.654602 0.054445 
74 0.638659 0.518900 0.210444 0.308874 0.604929 
76 0.696883 0.601130 0.402640 0.150834 0.264218 
77 0.692149 0.963457 0.364050 0.152215 0.622544 
85 0.737854 0.055863 0.346940 0.003907 0.678405 
91 0.644924 0.840488 0.151190 0.566749 0.181861 
93 0.710590 0.900474 0.061603 0.144200 0.946062 
95 0.601144 0.288909 0.074561 0.615098 0.737097 

[33 rows x 5 columns] 



In [13]: df = pd.concat([x.query("0.5 < a < 0.75")
                         for x in pd.read_hdf('c:/temp/file.h5', 'df_key', chunksize=10)],
                        ignore_index=True)

In [14]: df 
      a   b   c   d   e 
0 0.744123 0.515697 0.005335 0.017147 0.176254 
1 0.555202 0.074128 0.874943 0.660555 0.776340 
2 0.667145 0.278355 0.661728 0.705750 0.623682 
3 0.701163 0.429860 0.223079 0.735633 0.476182 
4 0.645130 0.302878 0.428298 0.969632 0.983690 
5 0.633334 0.898632 0.881866 0.228983 0.216519 
6 0.535633 0.906661 0.221823 0.608291 0.330101 
7 0.715708 0.478515 0.002676 0.231314 0.075967 
8 0.587762 0.262281 0.458854 0.811845 0.921100 
9 0.551251 0.537855 0.906546 0.169346 0.063612 
..  ...  ...  ...  ...  ... 
23 0.610958 0.874373 0.785681 0.147954 0.966443 
24 0.619666 0.818202 0.378740 0.416452 0.903129 
25 0.500782 0.536064 0.697678 0.654602 0.054445 
26 0.638659 0.518900 0.210444 0.308874 0.604929 
27 0.696883 0.601130 0.402640 0.150834 0.264218 
28 0.692149 0.963457 0.364050 0.152215 0.622544 
29 0.737854 0.055863 0.346940 0.003907 0.678405 
30 0.644924 0.840488 0.151190 0.566749 0.181861 
31 0.710590 0.900474 0.061603 0.144200 0.946062 
32 0.601144 0.288909 0.074561 0.615098 0.737097 

[33 rows x 5 columns] 

I only have read access to the HDF5 files, and I don't want to save them again because they are large files. – Safariba


@Safariba, please check the update – MaxU
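For a read-only file that cannot be rewritten in table format, one option is to load the object as it is and filter in memory. The sketch below is only an illustration: `read_filtered` is a hypothetical helper, the same expression is assumed to be valid both as a `where=` string and as a `DataFrame.query` string (true for simple comparisons such as `col2 == 1`), and on the fixed-format path the whole object has to fit in RAM. The table-format branch further assumes the queried columns were written as data columns.

```python
import pandas as pd


def read_filtered(path, key, query, columns=None):
    """Read `key` from `path`, filtering on disk when possible, else in memory.

    Hypothetical helper, not part of pandas.
    """
    with pd.HDFStore(path, mode="r") as store:
        if store.get_storer(key).is_table:
            # Table format: PyTables applies the row filter and column subset.
            return store.select(key, where=query, columns=columns)
        # Fixed format: the object can only be read in its entirety.
        df = store.get(key)
    # Apply the row filter and the column subset after loading.
    df = df.query(query)
    return df[columns] if columns is not None else df
```

Called as `read_filtered(Hdf5File, groupname, "col2 == 1", columns=['col1', 'col2'])`, this returns the same rows whether the group was stored as fixed or as table, without writing to the file.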