
Problem statement: why does MatConvNet say DATA and DEROUTPUT do not have compatible formats?

I am using MatConvNet's cnn_train function, which comes with its example library, to build a very simple 1D example with a small network. Based on those examples, I built the following small CNN:

clc; clear; 
%% prepare Data 
M = 32; %batch size 
X_train = zeros(1,1,1,M); % (1 1 1 32) = (1 1 1 M) 
for m=1:M, 
    X_train(:,:,:,m) = m; %training example value 
end 
Y_test = 10*X_train; 
split = ones(1,M); 
split(floor(M*0.75):end) = 2; 
% load image database (imdb) 
imdb.images.data = X_train; 
imdb.images.label = Y_test; 
imdb.images.set = split; 
%% prepare parameters 
L1=3; 
w1 = randn(1,1,1,L1); %1st layer weights 
w2 = randn(1,1,1,L1); %2nd layer weights 
b1 = randn(1,1,1,L1); %1st layer biases 
b2 = randn(1,1,1,L1); %2nd layer biases 
G1 = ones(1,1,1,L1); % (1 1 1 3) = (1 1 1 L1) BN scale, one per dimension 
B1 = zeros(1,1,1,L1); % (1 1 1 3) = (1 1 1 L1) BN shift, one per dimension 
EPS = 1e-4; 
%% make CNN layers: conv, BN, relu, conv, pdist, l2-loss 
net.layers = {} ; 
net.layers{end+1} = struct('type', 'conv', ... 
          'name', 'conv1', ... 
          'weights', {{w1, b1}}, ... 
          'pad', 0) ; 
net.layers{end+1} = struct('type', 'bnorm', ... 
          'weights', {{G1, B1}}, ... 
          'EPSILON', EPS, ... 
          'learningRate', [1 1 0.05], ... 
          'weightDecay', [0 0]) ;      
net.layers{end+1} = struct('type', 'relu', ... 
          'name', 'relu1') ; 
net.layers{end+1} = struct('type', 'conv', ... 
          'name', 'conv2', ... 
          'weights', {{w2, b2}}, ... 
          'pad', 0) ; 
net.layers{end+1} = struct('type', 'pdist', ... 
          'name', 'averageing1', ... 
          'class', 0, ... 
          'p', 1) ; 
%% add L2-loss     
fwfun = @l2LossForward; 
bwfun = @l2LossBackward; 
net = addCustomLossLayer(net, fwfun, bwfun) ; 
net.layers{end}.class = Y_test; % the target labels 
net = vl_simplenn_tidy(net) ; 
res = vl_simplenn(net, X_train); 
%% prepare train options 
trainOpts.expDir = 'results/' ; %save results/trained cnn 
trainOpts.gpus = [] ; 
trainOpts.batchSize = 2 ; 
trainOpts.learningRate = 0.02 ; 
trainOpts.plotDiagnostics = false ; 
%trainOpts.plotDiagnostics = true ; % Uncomment to plot diagnostics 
trainOpts.numEpochs = 20 ; % number of training epochs 
trainOpts.errorFunction = 'none' ; 
%% CNN TRAIN 
vl_simplenn_display(net) ; 
net = cnn_train(net, imdb, @getBatch, trainOpts) ; 
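
The getBatch callback is not included above; a minimal version consistent with the imdb layout built earlier would look like this (a sketch, not the original function):

function [im, labels] = getBatch(imdb, batch) 
% Slice one mini-batch of inputs and the matching labels out of imdb. 
im = imdb.images.data(:,:,:,batch) ; 
labels = imdb.images.label(:,:,:,batch) ; 
end 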

I created this based on the example they provided, and whenever I run it I get the error:

Error using vl_nnconv 
DATA and DEROUTPUT do not have compatible formats. 

Error in vl_simplenn (line 397) 
      [res(i).dzdx, dzdw{1}, dzdw{2}] = vl_nnconv(res(i).x, l.weights{1}, 
      l.weights{2}, res(i+1).dzdx) 

Error in cnn_train>process_epoch (line 323) 
    res = vl_simplenn(net, im, dzdy, res, ... 

Error in cnn_train (line 139) 
    [net,stats.train,prof] = process_epoch(opts, getBatch, epoch, train, learningRate, 
    imdb, net) ; 

Error in main_1D_1layer_hard_coded_example (line 64) 
net = cnn_train(net, imdb, @getBatch, trainOpts) ; 

Does anyone know what is going on? This example is supposed to be simple, so it baffles me what could be going wrong.


Addendum: what I tried to resolve this

For more detail, here is what I tried in order to track this down.

I went to the file, at the line that causes the error, and printed the inputs to that function to make sure the arguments I was giving made sense, and in that respect everything looked fine:

case 'conv' 
     size(res(i).x)       % debug print: input 
     size(res(i+1).dzdx)  % debug print: output derivative 
     size(l.weights{1})   % debug print: filters 
     size(l.weights{2})   % debug print: biases 
     [res(i).dzdx, dzdw{1}, dzdw{2}] = ... 
      vl_nnconv(res(i).x, l.weights{1}, l.weights{2}, res(i+1).dzdx, ... 
      'pad', l.pad, ... 
      'stride', l.stride, ... 
      l.opts{:}, ... 
      cudnn{:}) ; 

It prints:

ans = 

    1  1  3 16 


ans = 

    1  1  3 16 


ans = 

    1  1  1  3 


ans = 

    1  1  1  3 

Just as I expected.

I even went ahead and hand-coded the chain of derivatives the network is supposed to compute, and that file seems to work fine:

clc; clear; 
%% prepare Data 
M = 3; 
x = zeros(1,1,1,M); % (1 1 1 3) = (1 1 1 M) 
for m=1:M, 
    x(:,:,:,m) = m; 
end 
Y = 5; 
r=Y; 
%% parameters 
L1 = 3; 
w1 = randn(1,1,1,L1); % (1 1 1 L1) = (1 1 1 3) 
b1 = ones(1,L1); 
w2 = randn(1,1,1,L1); % (1 1 1 L1) = (1 1 1 3) 
b2 = ones(1,L1); 
G1 = ones(1,1,1,L1); % (1 1 1 3) = (1 1 1 L1) BN scale, one per dimension 
B1 = zeros(1,1,1,L1); % (1 1 1 3) = (1 1 1 L1) BN shift, one per dimension 
EPS = 1e-4; 
%% Forward Pass 
z1 = vl_nnconv(x,w1,b1); % (1 1 3 3) = (1 1 L1 M) 
%bn1 = z1; 
bn1 = vl_nnbnorm(z1,G1,B1,'EPSILON',EPS); % (1 1 3 3) = (1 1 L1 M) 
a1 = vl_nnrelu(bn1); % (1 1 3 3) = (1 1 L1 M) 
z2 = vl_nnconv(a1,w2,b2); 
y1 = vl_nnpdist(z2, 0, 1); 
loss_forward = l2LossForward(y1,Y); 
%% 
net.layers = {} ; 
net.layers{end+1} = struct('type', 'conv', ... 
          'name', 'conv1', ... 
          'weights', {{w1, b1}}, ... 
          'pad', 0) ; 
net.layers{end+1} = struct('type', 'bnorm', ... 
          'weights', {{G1, B1}}, ... 
          'EPSILON', EPS, ... 
          'learningRate', [1 1 0.05], ... 
          'weightDecay', [0 0]) ;      
net.layers{end+1} = struct('type', 'relu', ... 
          'name', 'relu1') ; 
net.layers{end+1} = struct('type', 'conv', ... 
          'name', 'conv2', ... 
          'weights', {{w2, b2}}, ... 
          'pad', 0) ; 
net.layers{end+1} = struct('type', 'pdist', ... 
          'name', 'averageing1', ... 
          'class', 0, ... 
          'p', 1) ; 
fwfun = @l2LossForward; 
bwfun = @l2LossBackward; 
net = addCustomLossLayer(net, fwfun, bwfun) ; 
net.layers{end}.class = Y; 
net = vl_simplenn_tidy(net) ; 
res = vl_simplenn(net, x); 
%% 
loss_forward = squeeze(loss_forward) % (1 1) 
loss_res = squeeze(res(end).x) % (1 1) 
%% Backward Pass 
p = 1; 
dldx = l2LossBackward(y1,r,p); 
dy1dx = vl_nnpdist(z2, 0, 1, dldx); 
[dz2dx, dz2dw2] = vl_nnconv(a1, w2, b2, dy1dx); 
da1dx = vl_nnrelu(bn1, dz2dx); 
[dbn1dx,dbn1dG1,dbn1dB1] = vl_nnbnorm(z1,G1,B1,da1dx); 
[dz1dx, dz1dw1] = vl_nnconv(x, w1, b1, dbn1dx); 
%% 
dzdy = 1; 
res = vl_simplenn(net, x, dzdy, res); 
%% 
% func = @(x) proj(p, forward(x, x0)) ; 
% err = checkDerivativeNumerically(f, x, dx) 
% %% 
dz1dx = squeeze(dz1dx) 
dz1dx_vl_simplenn = squeeze(res(1).dzdx) 
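
As an extra sanity check that needs nothing beyond vl_simplenn itself, the analytic gradient can be compared against a plain finite-difference estimate, e.g. (a sketch; the step size and the probed entry are arbitrary choices):

% Finite-difference check of d(loss)/dx for a single entry of x. 
delta = 1e-5 ; 
xp = x ; xp(1,1,1,1) = xp(1,1,1,1) + delta ; 
res0 = vl_simplenn(net, x) ; 
resp = vl_simplenn(net, xp) ; 
num_grad = (resp(end).x - res0(end).x) / delta ; 
fprintf('numeric: %g   analytic: %g\n', num_grad, res(1).dzdx(1,1,1,1)) ; 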

The derivatives seem to check out mathematically, so I would assume everything in this file works. It throws no errors, so the fact that the cnn_train version does not even run confuses me to no end. Does anyone know what is going on?


The way I load my CNN is based on the example file they provide in that tutorial. I will paste a summary of the important parts of that file (it runs fine with the cnn_train function, while mine does not).

setup() ; 
% setup('useGpu', true); % Uncomment to initialise with a GPU support 
%% Part 3.1: Prepare the data 
% Load a database of blurred images to train from 
imdb = load('data/text_imdb.mat') ; 

%% Part 3.2: Create a network architecture 

net = initializeSmallCNN() ; 
%net = initializeLargeCNN() ; 
% Display network 
vl_simplenn_display(net) ; 

%% Part 3.3: learn the model 
% Add a loss (using a custom layer) 
net = addCustomLossLayer(net, @l2LossForward, @l2LossBackward) ; 

% Train 
trainOpts.expDir = 'data/text-small' ; 
trainOpts.gpus = [] ; 
% Uncomment for GPU training: 
%trainOpts.expDir = 'data/text-small-gpu' ; 
%trainOpts.gpus = [1] ; 
trainOpts.batchSize = 16 ; 
trainOpts.learningRate = 0.02 ; 
trainOpts.plotDiagnostics = false ; 
%trainOpts.plotDiagnostics = true ; % Uncomment to plot diagnostics 
trainOpts.numEpochs = 20 ; 
trainOpts.errorFunction = 'none' ; 

net = cnn_train(net, imdb, @getBatch, trainOpts) ; 

Could you come up with a less .... *non*-minimal example? – Andras


@Andras sure, do you mean it is complicated? It is supposed to be a simple network; I just also provided some of the things I tried while debugging it. – Pinocchio


I tried commenting the example more, to make it more convincing that it is very simple. Hope that helps. – Pinocchio

Answers


The size of w2 should be 1x1x3x3.

Also, biases are usually given as 1x3, since they have only one dimension (or, for weights of size 1x1x3xN, corresponding biases of size 1xN, where N is the number of filters); the same holds for B1 and G1 (here 1xM, where M is the number of filters of the previous layer). But it may work either way.

In your example, x has size 1x1x3x16 after the first convolution. That means there are 16 elements in a batch, each with width and height 1 and depth 3. The depth is 3 because the first convolution was done with 3 filters (w1 has size 1x1x1x3).

The w2 in your example, however, has size 1x1x1x3, which stands for 3 filters of width, height, and depth 1. So the depth of the filters does not match the depth of their input.
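
For illustration, shapes along these lines would satisfy that depth rule (a sketch reusing the variable names from the question):

L1 = 3; 
w1 = randn(1,1,1,L1);  % 3 filters of depth 1 -> conv1 output has depth 3 
b1 = randn(1,L1);      % one bias per filter 
w2 = randn(1,1,L1,L1); % 1x1x3x3: filter depth 3 matches conv1's output depth 
b2 = randn(1,L1);      % one bias per filter 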


Two comments: 1) I made w2 1x1x3x3, but it still doesn't run, unfortunately. 2) Your reasoning makes sense logically, but what still confuses me is this: if your argument were right, why was I able to run the forward pass of the model successfully? If you notice, the code I gave has the line 'res = vl_simplenn(net, X_train);', which evaluates the model and, to my surprise, actually runs. If the dimensions didn't match, why would even the forward pass run? I don't know yet, but it seems the backward pass is where the error comes from. I will try to check. – Pinocchio


I must have overlooked that. Very strange. Maybe I will find time to try your code myself in the next few days. – Wiseful


I ran into the same problem while creating a custom layer, and finally found the solution by tracing through the matconvnet implementation. Hopefully this helps someone later. In short, you need to make sure both pieces of data are non-empty, non-null, and have the same device type (GPU or CPU) and the same data type (char, single, or double).

In my case, both had to be 'gpuArray' and 'single'.

====== Details ======

First, the errors

DATA and FILTERS do not have compatible formats

DATA and BIASES do not have compatible formats

DATA and DEROUTPUT do not have compatible formats

all mean, precisely, that the two variables involved do not have compatible formats. So what does Matconvnet mean by 'compatible formats'? It is implemented in vl_nnconv.cu, lines 269~278:

/* check for GPU/data class consistency */ 

if (hasFilters && ! vl::areCompatible(data, filters)) { 
  vlmxError(VLMXE_IllegalArgument, "DATA and FILTERS do not have compatible formats.") ; 
} 
if (hasBiases && ! vl::areCompatible(data, biases)) { 
  vlmxError(VLMXE_IllegalArgument, "DATA and BIASES do not have compatible formats.") ; 
} 
if (backMode && ! vl::areCompatible(data, derOutput)) { 
  vlmxError(VLMXE_IllegalArgument, "DATA and DEROUTPUT do not have compatible formats.") ; 
} 

which relies on the function vl::areCompatible, implemented as

inline bool areCompatible(Tensor const & a, Tensor const & b) 
{ 
  return 
    (a.isEmpty() || a.isNull()) || 
    (b.isEmpty() || b.isNull()) || 
    ((a.getDeviceType() == b.getDeviceType()) & (a.getDataType() == b.getDataType())) ; 
} 

So this is where the error comes from: basically, the check tests whether either input is empty or null, and otherwise requires both inputs to have the same data type (char, single, or double) and the same device type (GPU or CPU):

/// Type of device: CPU or GPU 
enum DeviceType { 
  VLDT_CPU = 0, 
  VLDT_GPU 
} ; 

/// Type of data (char, float, double, ...) 
enum DataType { 
  VLDT_Char, 
  VLDT_Float, 
  VLDT_Double 
} ; 
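
In practice, on the MATLAB side, this means casting the input and the network to matching classes before the call. A sketch of the fix in my case (vl_simplenn_move is MatConvNet's helper for moving a network between CPU and GPU):

% Make the data match the network: same data type, same device. 
im  = gpuArray(single(im)) ;           % single precision on the GPU 
net = vl_simplenn_move(net, 'gpu') ;   % move the network weights to the GPU too 
% For CPU-only runs, use single(im) and vl_simplenn_move(net, 'cpu'). 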