2017-03-23 37 views
0

Python - Retrieving data and labels from a pickle file

I have a pickle file that looks like this:

[array([[[148, 124, 115], 
     [150, 127, 116], 
     [154, 129, 121], 
     ..., 
     [159, 142, 133], 
     [159, 142, 133], 
     [161, 145, 142]], 

     [[165, 136, 145], 
     [176, 137, 141], 
     [178, 138, 144], 
     ..., 
     [199, 163, 171], 
     [202, 163, 167], 
     [200, 158, 163]]]), array([1, 1])] 

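In a previous question, we were able to retrieve the data and labels separately by doing this. However, that approach is not suitable when I have many images.

My script now looks as follows: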

data, labels = [], []
for i in range(0, 1):

    filename = 'data.pickle'
    batch_data = unpickle(filename)
    if len(data) > 0:
        data = np.vstack((data, batch_data[0][i]))
        labels = np.hstack((labels, batch_data[1][i]))
    else:
        data = batch_data[0][0]
        labels = batch_data[1][0]

    data = data.astype(np.float32)
    return data, labels
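
The `unpickle` helper is not shown in the snippet above. A minimal sketch of what it could look like, assuming it is just a thin wrapper around `pickle.load` (the function body, the temporary file, and the placeholder arrays are assumptions, not the original code):

```python
import os
import pickle
import tempfile

import numpy as np


def unpickle(filename):
    """Hypothetical helper: load and return whatever object was pickled."""
    with open(filename, "rb") as f:
        return pickle.load(f)


# Round-trip demo with a small stand-in for data.pickle:
# a list holding an image array and a label array.
sample = [np.zeros((2, 6, 3), dtype=np.uint8), np.array([1, 1])]
path = os.path.join(tempfile.mkdtemp(), "data.pickle")
with open(path, "wb") as f:
    pickle.dump(sample, f)

batch_data = unpickle(path)
print(batch_data[0].shape)  # shape of the image array
print(batch_data[1])        # the label array
```

With this layout, `batch_data[0]` is the image array and `batch_data[1]` is the label array.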

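When I run the code and print, for example, the labels, I always get 1, whereas I expect to get two labels, [1 1] (I'm not sure whether they should be displayed as an array?).

What am I doing wrong here?

Thanks.

Answers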

2

I was able to get the labels the way you expect. I used:

import pickle

import numpy as np

# Create batch data resembling what you describe: three images and three labels.
# dtype=object lets NumPy hold the ragged rows (recent versions reject them otherwise).
batch_data = np.array(
    [[np.random.random((5, 5)), np.random.random((5, 5)), np.random.random((5, 5))],
     np.array([1, 1, 1])],
    dtype=object)

# Pickle the data
with open("test.pickle", "wb") as f:
    pickle.dump(batch_data, f)

# Create data and labels separately
def test_func(batch_data):
    data, labels = [], []
    # Loop over every sample, not just the first one
    for i in range(0, batch_data.shape[1]):
        if len(data) > 0:
            data = np.vstack((data, batch_data[0][i]))
            labels = np.hstack((labels, batch_data[1][i]))
        else:
            data = batch_data[0][0]
            labels = batch_data[1][0]
        data = data.astype(np.float32)
    return data, labels

# Unpickle
with open("test.pickle", "rb") as f:
    unpickled_batch_data = pickle.load(f)

# Get stacked data and labels
data, labels = test_func(unpickled_batch_data)
print(labels)

This returns:

[1 1 1] 
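
The loop in the question differs from the one above mainly in its range: `range(0, 1)` yields only index 0, so only the first label is ever collected. A short sketch of that point, assuming the pickle holds a two-element list `[images, labels]` as shown in the question (the array contents here are placeholders):

```python
import numpy as np

# Placeholder with the same layout as the question's pickle: [images, labels].
batch_data = [np.zeros((2, 6, 3), dtype=np.uint8), np.array([1, 1])]

print(list(range(0, 1)))                # a single iteration
print(list(range(len(batch_data[1]))))  # one iteration per label

# Because both entries are already arrays, no loop is strictly needed.
data = batch_data[0].astype(np.float32)
labels = batch_data[1]
print(data.shape, labels)
```

Note that this keeps the images as one 3-D array rather than stacking their rows with `np.vstack`, which may or may not be what is wanted downstream.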

0

You can get away with just using zip twice:

In [24]: pickle_data = [array([[[148, 124, 115], 
    ...:   [150, 127, 116], 
    ...:   [154, 129, 121], 
    ...:   [159, 142, 133], 
    ...:   [159, 142, 133], 
    ...:   [161, 145, 142]], 
    ...: 
    ...:  [[165, 136, 145], 
    ...:   [176, 137, 141], 
    ...:   [178, 138, 144], 
    ...:   [199, 163, 171], 
    ...:   [202, 163, 167], 
    ...:   [200, 158, 163]]]), array([1, 1])] 

You also need argument unpacking with the * operator:

In [25]: data, labels = zip(*zip(*pickle_data)) 

In [26]: data 
Out[26]: 
(array([[148, 124, 115], 
     [150, 127, 116], 
     [154, 129, 121], 
     [159, 142, 133], 
     [159, 142, 133], 
     [161, 145, 142]]), array([[165, 136, 145], 
     [176, 137, 141], 
     [178, 138, 144], 
     [199, 163, 171], 
     [202, 163, 167], 
     [200, 158, 163]])) 

In [27]: labels 
Out[27]: (1, 1) 
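
To see why the double zip works, it can help to look at the intermediate step. In this sketch (with small placeholder arrays, not the original image data), the inner `zip(*pickle_data)` pairs each image with its label, and the outer `zip(*...)` regroups those pairs into one tuple of images and one tuple of labels:

```python
import numpy as np

# Placeholder data with the same layout: [images, labels].
pickle_data = [np.arange(12).reshape(2, 2, 3), np.array([1, 1])]

# Inner zip: one (image, label) pair per sample.
pairs = list(zip(*pickle_data))
print(len(pairs))
print(pairs[0][0].shape)

# Outer zip: regroup into a tuple of images and a tuple of labels.
data, labels = zip(*pairs)
print(len(data), len(labels))
```

If per-sample processing is the goal, the intermediate `pairs` list may already be the more convenient form.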

Now the labels and data correspond by index:

In [28]: data[0] 
Out[28]: 
array([[148, 124, 115], 
     [150, 127, 116], 
     [154, 129, 121], 
     [159, 142, 133], 
     [159, 142, 133], 
     [161, 145, 142]]) 

In [29]: data[1] 
Out[29]: 
array([[165, 136, 145], 
     [176, 137, 141], 
     [178, 138, 144], 
     [199, 163, 171], 
     [202, 163, 167], 
     [200, 158, 163]]) 

In [30]: labels[0] 
Out[30]: 1 

In [31]: labels[1] 
Out[31]: 1 

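Or better yet, I think, since your images are stored along the first axis, you can use a list comprehension to simply break the array up into a list of arrays: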

In [37]: images = pickle_data[0] 

In [38]: labels = pickle_data[1] 

Break up the array:

In [39]: images = [x for x in images] 

In [40]: images[0] 
Out[40]: 
array([[148, 124, 115], 
     [150, 127, 116], 
     [154, 129, 121], 
     [159, 142, 133], 
     [159, 142, 133], 
     [161, 145, 142]]) 

In [41]: images[1] 
Out[41]: 
array([[165, 136, 145], 
     [176, 137, 141], 
     [178, 138, 144], 
     [199, 163, 171], 
     [202, 163, 167], 
     [200, 158, 163]]) 

In [42]: labels[0] 
Out[42]: 1 

In [43]: labels[1] 
Out[43]: 1 

In [44]: labels 
Out[44]: array([1, 1]) 

In [45]: images 
Out[45]: 
[array([[148, 124, 115], 
     [150, 127, 116], 
     [154, 129, 121], 
     [159, 142, 133], 
     [159, 142, 133], 
     [161, 145, 142]]), array([[165, 136, 145], 
     [176, 137, 141], 
     [178, 138, 144], 
     [199, 163, 171], 
     [202, 163, 167], 
     [200, 158, 163]])] 
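
The list comprehension works because iterating over a NumPy array walks its first axis. As a sketch with placeholder arrays (assumed shapes, not the original data), `list(images)` gives the same result, and `np.stack` reverses the split if a single array is needed again:

```python
import numpy as np

images = np.arange(36).reshape(2, 6, 3)  # placeholder: 2 images of shape (6, 3)
labels = np.array([1, 1])

image_list = list(images)                # same as [x for x in images]
print(len(image_list), image_list[0].shape)

# Pair each image with its label by index.
for image, label in zip(image_list, labels):
    print(image.shape, label)

# Recombine the list into one array along a new first axis.
restored = np.stack(image_list)
print(np.array_equal(restored, images))
```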
