
Keras classification: input data shape

I want to modify the classification example shown on the Keras blog so that it classifies images into 3 different classes.

I have 3000 images (3 x 1000) for training and 1200 (3 x 400) for validation. The code has been modified to classify the 3 categories.

The code is as follows:

import numpy as np 
from keras.preprocessing.image import ImageDataGenerator 
from keras.models import Sequential 
from keras.layers import Dropout, Flatten, Dense 
from keras import applications 

# dimensions of our images. 
img_width, img_height = 150, 150 

top_model_weights_path = 'bottleneck_fc_model.h5' 
train_data_dir = 'data/train' 
validation_data_dir = 'data/validation' 
nb_train_samples = 3000 
nb_validation_samples = 1200 
epochs = 50 
batch_size = 16 

n_classes = 3 


def save_bottlebeck_features(): 
    datagen = ImageDataGenerator(rescale=1./255) 

    # build the VGG16 network 
    model = applications.VGG16(include_top=False, weights='imagenet') 

    generator = datagen.flow_from_directory(
     train_data_dir, 
     target_size=(img_width, img_height), 
     batch_size=batch_size, 
     class_mode='categorical', 
     shuffle=False) 
    bottleneck_features_train = model.predict_generator(
     generator, nb_train_samples // batch_size) 
    np.save(open('bottleneck_features_train.npy', 'wb'), 
      bottleneck_features_train) 

    generator = datagen.flow_from_directory(
     validation_data_dir, 
     target_size=(img_width, img_height), 
     batch_size=batch_size, 
     class_mode='categorical', 
     shuffle=False) 
    bottleneck_features_validation = model.predict_generator(
     generator, nb_validation_samples // batch_size) 
    np.save(open('bottleneck_features_validation.npy', 'wb'), 
      bottleneck_features_validation) 


def train_top_model(): 
    train_data = np.load(open('bottleneck_features_train.npy','rb')) 
    train_labels = np.array([0] * (nb_train_samples // n_classes) + [1] * (nb_train_samples // n_classes) + \ 
          [2] * (nb_train_samples // n_classes)) 

    validation_data = np.load(open('bottleneck_features_validation.npy','rb')) 
    validation_labels = np.array([0] * (nb_train_samples // n_classes) + [1] * (nb_train_samples // n_classes) + \ 
           [2] * (nb_train_samples // n_classes)) 

    model = Sequential() 
    model.add(Flatten(input_shape=train_data.shape[1:])) 
    model.add(Dense(256, activation='relu')) 
    model.add(Dropout(0.5)) 
    model.add(Dense(n_classes, activation='softmax')) 

    model.compile(optimizer='rmsprop', 
        loss='categorical_crossentropy', metrics=['accuracy']) 

    model.fit(train_data, train_labels, epochs=epochs, batch_size=batch_size, \ 
       validation_data=(validation_data, validation_labels)) 
    model.save_weights(top_model_weights_path) 

When I finally execute the two functions:

save_bottlebeck_features() 
train_top_model() 

the second function returns the following error:

--------------------------------------------------------------------------- 
ValueError        Traceback (most recent call last) 
<ipython-input-143-070a6188c611> in <module>() 
     4 print(validation_labels.shape) 
     5 
----> 6 train_top_model() 

<ipython-input-129-ea2b02024693> in train_top_model() 
    64     loss='categorical_crossentropy', metrics=['accuracy']) 
    65 
---> 66  model.fit(train_data, train_labels, epochs=epochs, batch_size=batch_size,    validation_data=(validation_data, validation_labels)) 
    67  model.save_weights(top_model_weights_path) 

~/anaconda/lib/python3.6/site-packages/keras/models.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs) 
    865        class_weight=class_weight, 
    866        sample_weight=sample_weight, 
--> 867        initial_epoch=initial_epoch) 
    868 
    869  def evaluate(self, x, y, batch_size=32, verbose=1, 

~/anaconda/lib/python3.6/site-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs) 
    1520    class_weight=class_weight, 
    1521    check_batch_axis=False, 
-> 1522    batch_size=batch_size) 
    1523   # Prepare validation data. 
    1524   do_validation = False 

~/anaconda/lib/python3.6/site-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_batch_axis, batch_size) 
    1380          output_shapes, 
    1381          check_batch_axis=False, 
-> 1382          exception_prefix='target') 
    1383   sample_weights = _standardize_sample_weights(sample_weight, 
    1384              self._feed_output_names) 

~/anaconda/lib/python3.6/site-packages/keras/engine/training.py in _standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix) 
    142        ' to have shape ' + str(shapes[i]) + 
    143        ' but got array with shape ' + 
--> 144        str(array.shape)) 
    145  return arrays 
    146 

ValueError: Error when checking target: expected dense_58 to have shape (None, 3) but got array with shape (3000, 1) 

If I print the shapes of the data and the labels, it returns:

print(train_labels.shape) 
(3000, 3) 
print(train_data.shape) 
(3000, 3) 
print(validation_data.shape) 
(1200, 4, 4, 512) 
print(validation_labels.shape) 
(1200,) 

EDIT

I am posting the full code, together with the image database I am working with.

The database can be downloaded here.

The code is as follows:

import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications
from keras.utils import to_categorical
from sklearn.preprocessing import OneHotEncoder  # assumed import; OneHotEncoder is instantiated below but not otherwise used

# dimensions of our images.
img_width, img_height = 150, 150 

top_model_weights_path = 'what.h5'#'bottleneck_fc_model.h5' 
train_data_dir = 'data_short/train' 
validation_data_dir = 'data_short/validation' 
nb_train_samples = 30 
nb_validation_samples = 6 
epochs = 50 
batch_size = 16 

n_classes = 3 


def save_bottlebeck_features(): 
    datagen = ImageDataGenerator(rescale=1./255) 

    # build the VGG16 network 
    model = applications.VGG16(include_top=False, weights='imagenet') 

    generator = datagen.flow_from_directory(train_data_dir, target_size=(img_width, img_height),\ 
              batch_size=batch_size, class_mode='categorical', shuffle=False) 

    bottleneck_features_train = model.predict_generator(generator, nb_train_samples // batch_size) 

    np.save(open('bottleneck_features_train.npy', 'wb'), bottleneck_features_train) 

    generator = datagen.flow_from_directory(validation_data_dir, target_size=(img_width, img_height),\ 
              batch_size=batch_size, class_mode='categorical', shuffle=False) 

    bottleneck_features_validation = model.predict_generator(generator, nb_validation_samples // batch_size) 

    np.save(open('bottleneck_features_validation.npy', 'wb'), bottleneck_features_validation) 


def train_top_model(): 
    encoder = OneHotEncoder() 
    #train_data = np.load(open('bottleneck_features_train.npy','rb')) 
    train_data = np.load('bottleneck_features_train.npy') 

    train_labels = np.array([0] * (nb_train_samples // n_classes) + [1] * (nb_train_samples // n_classes) + 
          [2] * (nb_train_samples // n_classes)) 

    train_labels = to_categorical(train_labels) 


    validation_data = np.load(open('bottleneck_features_validation.npy','rb')) 
    validation_labels = np.array([0] * (nb_validation_samples // n_classes) + \ 
           [1] * (nb_validation_samples // n_classes) + \ 
           [2] * (nb_validation_samples // n_classes)) 

    validation_labels = to_categorical(validation_labels) 

    model = Sequential() 
    model.add(Flatten(input_shape=train_data.shape[1:])) 
    model.add(Dense(256, activation='relu')) 
    model.add(Dropout(0.5)) 
    model.add(Dense(n_classes, activation='softmax')) 

    model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy']) 

    model.fit(train_data, train_labels, epochs=epochs, batch_size=batch_size,\ 
       validation_data=(validation_data, validation_labels)) 
    model.save_weights(top_model_weights_path) 

The error given is:

--------------------------------------------------------------------------- 
ValueError        Traceback (most recent call last) 
<ipython-input-8-6869607a6e44> in <module>() 
----> 1 train_top_model() 

<ipython-input-6-933b6592c6c1> in train_top_model() 
    56  model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy']) 
    57 
---> 58  model.fit(train_data, train_labels, epochs=epochs, batch_size=batch_size,    validation_data=(validation_data, validation_labels)) 
    59  model.save_weights(top_model_weights_path) 

~/anaconda/lib/python3.6/site-packages/keras/models.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs) 
    861        class_weight=class_weight, 
    862        sample_weight=sample_weight, 
--> 863        initial_epoch=initial_epoch) 
    864 
    865  def evaluate(self, x, y, batch_size=32, verbose=1, 

~/anaconda/lib/python3.6/site-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs) 
    1356    class_weight=class_weight, 
    1357    check_batch_axis=False, 
-> 1358    batch_size=batch_size) 
    1359   # Prepare validation data. 
    1360   if validation_data: 

~/anaconda/lib/python3.6/site-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_batch_axis, batch_size) 
    1244       for (ref, sw, cw, mode) 
    1245       in zip(y, sample_weights, class_weights, self._feed_sample_weight_modes)] 
-> 1246   _check_array_lengths(x, y, sample_weights) 
    1247   _check_loss_and_target_compatibility(y, 
    1248            self._feed_loss_fns, 

~/anaconda/lib/python3.6/site-packages/keras/engine/training.py in _check_array_lengths(inputs, targets, weights) 
    235       'the same number of samples as target arrays. ' 
    236       'Found ' + str(list(set_x)[0]) + ' input samples ' 
--> 237       'and ' + str(list(set_y)[0]) + ' target samples.') 
    238  if len(set_w) > 1: 
    239   raise ValueError('All sample_weight arrays should have ' 

ValueError: Input arrays should have the same number of samples as target arrays. Found 16 input samples and 30 target samples. 

EDIT 2 (solution):

I solved the problem by making a fundamental change to the code. It can be seen here.


'train_labels.shape[1]' should be equal to 'validation_labels.shape[1]', and 'train_data.shape[1], train_data.shape[2], train_data.shape[3]' should be equal to 'validation_data.shape[1], validation_data.shape[2], validation_data.shape[3]'. This should be clear, because the only difference between the training data and the validation data is the first dimension, which is the total number of samples you have. – DJK
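
For reference, those constraints can be written as quick checks (a minimal sketch, reusing the variable names from the question's code):

# Sketch: label width and feature dimensions must agree between training
# and validation; only the first (sample) dimension may differ.
assert train_labels.shape[1] == validation_labels.shape[1]
assert train_data.shape[1:] == validation_data.shape[1:]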


I think that is not the problem. I reduced the system to 10 training images and 2 validation images per class. The tensor sizes are correct and obey the rule you propose, but I still get the error. – NunodeSousa


ValueError: Input arrays should have the same number of samples as target arrays. Found 16 input samples and 30 target samples. – NunodeSousa

Answer

  • You have the "input data", which is your set of images. Shape: (BatchSize, w, h, channels)
  • And you have the "output data / ground truth / predictions", which are the classes. Shape: (BatchSize, 3)

The error message tells you that you are giving the model output data shaped like (BatchSize, 1), and that will not fit the model.

So you definitely have a problem creating train_labels.

You must make it have shape (3000, 3), with a 1 at the index corresponding to each category:

  • Class 1: [1, 0, 0]
  • Class 2: [0, 1, 0]
  • Class 3: [0, 0, 1]

You may have had the classes combined somehow (if that is possible).


Use keras.utils.to_categorical().

But make sure train_labels.shape[0] is exactly the same as train_data.shape[0].

from keras.utils import to_categorical 

train_labels = np.array([0] * (nb_train_samples // n_classes) + [1] * (nb_train_samples // n_classes) + 
         [2] * (nb_train_samples // n_classes)) 

train_labels = to_categorical(train_labels) 
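
A quick sanity check after the conversion might look like this (a sketch; variable names follow the question's code, and the shapes assume the original 3000-image setup):

print(train_labels.shape)                             # expected: (3000, 3)
print(train_labels[0])                                # e.g. [1., 0., 0.] for the first class
print(train_labels.shape[0] == train_data.shape[0])   # should print True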

Another very simple way to create the labels:

train_labels = np.zeros((30,3)) 
train_labels[:10,0] = 1. 
train_labels[10:20,1] = 1. 
train_labels[20:,2] = 1. 

But when I print the shape of the tensor, it says it is (3000, 3): print(train_labels.shape) gives (3000, 3). – NunodeSousa


Then do the print right before calling 'fit' (print from *inside* the 'train_top_model()' function). The model is receiving '(3000, 1)'. –


@Daniel, I think that initially this was the problem and it should be the accepted answer, but now the OP has a problem with the shape of the input data, which shows 16 samples. – DJK
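
Regarding that last error (16 input samples vs. 30 target samples): with nb_train_samples = 30 and batch_size = 16, predict_generator(generator, nb_train_samples // batch_size) runs only 30 // 16 = 1 step, so the saved bottleneck features hold 16 samples while the labels hold 30 (and 6 // 16 = 0 steps would leave the validation features empty). A minimal sketch of one possible fix, not necessarily the change made in the linked solution:

import math

# Run enough steps to cover every image; the directory iterator yields a
# smaller final batch, so ceil(30 / 16) = 2 steps produce exactly 30 samples.
steps = int(math.ceil(nb_train_samples / batch_size))
bottleneck_features_train = model.predict_generator(generator, steps)

# Alternatively, pick a batch_size that divides both sample counts evenly,
# e.g. batch_size = 6 for 30 training and 6 validation images.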