2017-07-08 133 views
2

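Updating a pandas DataFrame stored in an HDFStore (HDF5, Python)

I have two DataFrames: df1, which is stored in a pd.HDFStore object, and df2, which I want to append to it.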

import numpy as np
import pandas as pd

store = pd.HDFStore('dataframe_store.h5')

df1 = pd.DataFrame(np.empty((100, 5)))
df2 = pd.DataFrame(np.empty((100, 5)))

# plain assignment writes df1 in the default 'fixed' format
store['df1'] = df1

In effect, I want the end result to be equivalent to:

store['df1'] = df1.append(df2) 

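I want to append df2 to the stored df1 without completely overwriting the object in the HDFStore with a new DataFrame. Is this possible?

Also, when I run the code below I get ValueError: can only append to Tables. Why is that?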

df = pd.DataFrame(np.empty((1000, 5))) 
df2 = pd.DataFrame(np.empty((1000, 5))) 

store = pd.HDFStore('store.h5') 

store['df'] = df 

store.append('df', df2) 

Answer

1

From the docs (emphasis mine):

HDFStore supports another PyTables format on disk, the table format. Conceptually a table is shaped very much like a DataFrame, with rows and columns. A table may be appended to in the same or other sessions. In addition, delete & query type operations are supported. This format is specified by format='table' or format='t' to append or put or to_hdf

New in version 0.13.

This format can be set as an option as well pd.set_option('io.hdf.default_format','table') to enable put/append/to_hdf to by default store in the table format.

In [361]: store = pd.HDFStore('store.h5') 

In [362]: df1 = df[0:4] 

In [363]: df2 = df[4:] 

# append data (creates a table automatically) 
In [364]: store.append('df', df1) 

In [365]: store.append('df', df2) 

In [366]: store 
Out[366]: 
<class 'pandas.io.pytables.HDFStore'> 
File path: store.h5 

# select the entire object 
In [367]: store.select('df') 
Out[367]: 
                   A         B         C
2000-01-01  0.887163  0.859588 -0.636524
2000-01-02  0.015696 -2.242685  1.150036
2000-01-03  0.991946  0.953324 -2.021255
2000-01-04 -0.334077  0.002118  0.405453
2000-01-05  0.289092  1.321158 -1.546906
2000-01-06 -0.202646 -0.655969  0.193421
2000-01-07  0.553439  1.318152 -0.469305
2000-01-08  0.675554 -1.817027 -0.183109

# the type of stored data 
In [368]: store.root.df._v_attrs.pandas_type 
Out[368]: 'frame_table' 

Note: You can also create a table by passing format='table' or format='t' to a put operation.
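Applied to the question, the note above suggests something like the following. This is only a sketch: the temporary file path, the column names and the zero/one fill values are assumptions chosen to make the result easy to check, and it needs PyTables installed. It stores df1 with put(..., format='table') and then appends df2 in place, which should give the same rows as pd.concat([df1, df2]) (the replacement for DataFrame.append, which was removed in pandas 2.0) without rewriting df1.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Assumed sample data: zeros and ones so the two halves are easy to tell apart.
df1 = pd.DataFrame(np.zeros((100, 5)), columns=list("abcde"))
df2 = pd.DataFrame(np.ones((100, 5)), columns=list("abcde"))

# Assumed location: a throwaway file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "dataframe_store.h5")

with pd.HDFStore(path) as store:
    # Write df1 in 'table' format instead of the default 'fixed' format.
    store.put("df1", df1, format="table")

    # Append df2 to the node already on disk; df1 is not rewritten.
    store.append("df1", df2)

    combined = store.select("df1")
    stored_type = store.root.df1._v_attrs.pandas_type

print(combined.shape)
print(stored_type)
```

The row index is stored as-is, so the appended frame keeps its own 0..99 labels; pass distinct indexes if duplicates would be a problem.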

+0

I've been looking through the docs and can work it out from there, but could you explain why 'store.append('df', df2)' raises 'ValueError: can only append to Tables'?

+1

Your 'hdf5' file must be created in ['table' format](http://pandas.pydata.org/pandas-docs/version/0.20/io.html#table-format) (as opposed to ['fixed' format](http://pandas.pydata.org/pandas-docs/version/0.20/io.html#fixed-format)). Use 'df.to_hdf(filename, 'df', mode='w', format='table')' to create it, or set 'pd.set_option('io.hdf.default_format', 'table')' so that 'format='table'' is the default format. – unutbu
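To make the cause of the error concrete, here is a rough sketch of both halves: the failing append against a node written in 'fixed' format, and the fix from the comment above. The temporary path, column names and fill values are assumptions for illustration, and the exact wording of the error message may differ between pandas versions.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Assumed sample data with named columns.
df = pd.DataFrame(np.zeros((1000, 5)), columns=list("abcde"))
df2 = pd.DataFrame(np.ones((1000, 5)), columns=list("abcde"))

# Assumed location: a throwaway file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "store.h5")

# 1. store['df'] = df writes the default 'fixed' format,
#    so the later append is rejected with a ValueError.
error_message = None
with pd.HDFStore(path) as store:
    store["df"] = df
    try:
        store.append("df", df2)
    except ValueError as exc:
        error_message = str(exc)

# 2. Recreate the file in 'table' format, as suggested in the comment.
#    mode='w' replaces the whole file, including the 'fixed' node.
df.to_hdf(path, key="df", mode="w", format="table")

with pd.HDFStore(path) as store:
    store.append("df", df2)  # succeeds now that 'df' is a table
    n_rows = len(store.select("df"))

print(error_message)
print(n_rows)
```

Setting pd.set_option('io.hdf.default_format', 'table') before the first write should have the same effect for store['df'] = df, at the cost of changing the default for every later write in the session.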