2016-12-03 62 views

When does an assignment make a deep copy in Python?

I have the following code:

import pandas as pd 
store = pd.HDFStore('cache.h5') 
data = store['data'] 

In this case, is data a deep, in-memory copy of the HDF5 data, or is it a pointer to the original data on disk?


What exactly do you mean by *a pointer to the original data on disk*? – Tobias

Answers


It is an in-memory object. Changes made to it are not automatically reflected (flushed) to disk.

Demo:

In [16]: fn = r'D:\temp\.data\test.h5' 

In [17]: store = pd.HDFStore(fn) 

In [18]: store 
Out[18]: 
<class 'pandas.io.pytables.HDFStore'> 
File path: D:\temp\.data\test.h5 
/df2    frame_table (typ->appendable,nrows->7,ncols->4,indexers->[index],dc->[Col1,Col2,Col3,Col4]) 
/test   frame_table (typ->appendable,nrows->7,ncols->4,indexers->[index],dc->[Col1,Col2,Col3,Col4]) 

Read from disk (the HDF store) into a DataFrame (an in-memory object):

In [19]: data = store['test'] 

In [20]: data 
Out[20]: 
        Col1      Col2  Col3  Col4
0       what       the     0     0
1        are    curves     1     8
2        men        of     2    16
3         to      your     3    24
4      rocks      lips     4    32
5        and   rewrite     5    40
6  mountains  history.     6    48

In [21]: data.Col4 = 1000 

In [22]: data 
Out[22]: 
        Col1      Col2  Col3  Col4
0       what       the     0  1000
1        are    curves     1  1000
2        men        of     2  1000
3         to      your     3  1000
4      rocks      lips     4  1000
5        and   rewrite     5  1000
6  mountains  history.     6  1000

In [23]: store.close() 

In [24]: store = pd.HDFStore(fn) 

In [25]: store['test'] 
Out[25]: 
        Col1      Col2  Col3  Col4
0       what       the     0     0
1        are    curves     1     8
2        men        of     2    16
3         to      your     3    24
4      rocks      lips     4    32
5        and   rewrite     5    40
6  mountains  history.     6    48
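
The session above shows that editing data leaves the file untouched. To make a change persistent you have to write the frame back explicitly. Here is a minimal sketch of that round trip, assuming the optional PyTables package (tables) is installed. The function name roundtrip_demo, the file path argument, and the small sample frame are made up for illustration and are not part of the original session:

```python
import os
import tempfile

import pandas as pd


def roundtrip_demo(path):
    """Return Col4 as stored on disk before and after an explicit write-back.

    Hypothetical helper for illustration; needs PyTables (`tables`).
    """
    original = pd.DataFrame({"Col3": [0, 1, 2], "Col4": [0, 8, 16]})

    # Create the file and store the frame under the key 'test'.
    with pd.HDFStore(path, mode="w") as store:
        store.put("test", original, format="table")

    # Reading builds a new DataFrame in memory.
    with pd.HDFStore(path, mode="r") as store:
        data = store["test"]

    # This changes only the in-memory object.
    data["Col4"] = 1000

    # Re-reading from disk still gives the old values.
    with pd.HDFStore(path, mode="r") as store:
        before = store["test"]["Col4"].tolist()

    # Overwrite the key to persist the change, then read it back.
    with pd.HDFStore(path, mode="a") as store:
        store.put("test", data, format="table")
        after = store["test"]["Col4"].tolist()

    return before, after
```

Called with a path in a temporary directory, for example roundtrip_demo(os.path.join(tempfile.mkdtemp(), "demo.h5")), it returns the on-disk values before and after the explicit put.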

UPDATE: the following small demo shows that the data DataFrame does not depend on store once it has been read from the HDF store:

In [26]: store.close() 

In [27]: store = pd.HDFStore(fn) 

In [28]: del data 

In [29]: data = store['test'] 

Let's delete the store object:

In [30]: del store 

data still exists:

In [31]: data 
Out[31]: 
        Col1      Col2  Col3  Col4
0       what       the     0     0
1        are    curves     1     8
2        men        of     2    16
3         to      your     3    24
4      rocks      lips     4    32
5        and   rewrite     5    40
6  mountains  history.     6    48
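
On the more general question in the title: a plain Python assignment between names never copies anything. The copy in the session above comes from the read (store['test'] builds a new DataFrame), not from the = sign. A small sketch of the difference, using a made-up sample frame and the illustrative names alias and snapshot:

```python
import pandas as pd

data = pd.DataFrame({"Col3": [0, 1, 2], "Col4": [0, 8, 16]})

# Plain assignment binds a second name to the same object; no data is copied.
alias = data

# An explicit deep copy owns its own data.
snapshot = data.copy(deep=True)

# Modify the original frame in place.
data.loc[0, "Col4"] = 1000

same_object = alias is data                # True: both names refer to one object
alias_value = alias.loc[0, "Col4"]         # 1000: the alias sees the change
snapshot_value = snapshot.loc[0, "Col4"]   # 0: the deep copy is unaffected
```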

Does this mean it is a deep copy? If I read from 'data', am I now reading from RAM or from disk? – cjm2671


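Yes, you can think of 'data' as a deep copy of one table in the HDF store. Basically it is a DataFrame (an in-memory object), whereas cache.h5 is an HDF (h5) file on disk that may contain multiple tables (DataFrames). – MaxU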


Thanks, that is very helpful! – cjm2671
