
I am transitioning from Theano to Torch, so please bear with me. In Theano, it is straightforward to compute the gradient of the loss function with respect to any particular weight. What is the equivalent in Torch? How do I compute the gradient of the loss with respect to an arbitrary layer/weight?

Suppose we have the following code, which generates some data/labels and defines a model:

t = require 'torch' 
require 'nn' 
require 'cunn' 
require 'cutorch' 



-- Generate random labels 
function randLabels(nExamples, nClasses) 
    -- nClasses: number of classes 
    -- nExamples: number of examples 
    label = {} 
    for i=1, nExamples do 
     label[i] = t.random(1, nClasses) 
    end 
    return t.FloatTensor(label) 
end 

inputs = t.rand(1000, 3, 32, 32) -- 1000 samples, 3 color channels 
inputs = inputs:cuda() 
labels = randLabels(inputs:size()[1], 10) 
labels = labels:cuda() 

net = nn.Sequential() 
net:add(nn.SpatialConvolution(3, 6, 5, 5)) 
net:add(nn.ReLU()) 
net:add(nn.SpatialMaxPooling(2, 2, 2, 2)) 
net:add(nn.View(6*14*14)) 
net:add(nn.Linear(6*14*14, 300)) 
net:add(nn.ReLU()) 
net:add(nn.Linear(300, 10)) 
net = net:cuda() 

-- Loss 
criterion = nn.CrossEntropyCriterion() 
criterion = criterion:cuda() 
forwardPass = net:forward(inputs) 
net:zeroGradParameters() 
dEd_WeightsOfLayer1 -- How to compute this? 



forwardPass = nil 
net = nil 
criterion = nil 
inputs = nil 
labels = nil 

collectgarbage() 

How can I compute the gradient with respect to the weights of the convolutional layer?

Answer


OK, I found the answer (thanks to alban desmaison on the Torch7 Google group). The code in the question has a bug and does not work, so I rewrote it. Here is how you can get the gradient with respect to each node/parameter:

t = require 'torch' 
require 'cunn' 
require 'nn' 
require 'cutorch' 



-- A function to generate some random labels 
function randLabels(nExamples, nClasses) 
    -- nClasses: number of classes 
    -- nExamples: number of examples 
    label = {} 
    for i=1, nExamples do 
     label[i] = t.random(1, nClasses) 
    end 
    return t.FloatTensor(label) 
end 

-- Declare some variables 
nClass = 10 
kernelSize = 5 
stride = 2 
poolKernelSize = 2 
nData = 100 
nChannel = 3 
imageSize = 32 

-- Generate some [random] data 
data = t.rand(nData, nChannel, imageSize, imageSize) -- 100 Random images with 3 channels 
data = data:cuda() -- Transfer to the GPU (remove this line if you're not using GPU) 
label = randLabels(data:size()[1], nClass) 
label = label:cuda() -- Transfer to the GPU (remove this line if you're not using GPU) 

-- Define model 
net = nn.Sequential() 
net:add(nn.SpatialConvolution(nChannel, 6, kernelSize, kernelSize)) 
net:add(nn.ReLU()) 
net:add(nn.SpatialMaxPooling(poolKernelSize, poolKernelSize, stride, stride)) 
net:add(nn.View(6*14*14)) 
net:add(nn.Linear(6*14*14, 350)) 
net:add(nn.ReLU()) 
net:add(nn.Linear(350, 10)) 
net = net:cuda() -- Transfer to the GPU (remove this line if you're not using GPU) 

criterion = nn.CrossEntropyCriterion() 
criterion = criterion:cuda() -- Transfer to the GPU (remove this line if you're not using GPU) 

-- Do forward pass and get the gradient for each node/parameter: 

net:forward(data) -- Do the forward propagation 
criterion:forward(net.output, label) -- Compute the overall negative log-likelihood error 
criterion:backward(net.output, label); -- the ';' just suppresses echoing the result in the interactive th REPL 
net:backward(data, criterion.gradInput); -- fills in gradWeight/gradBias on every parameterised module 

-- Now you can access the gradient values 

layer1InputGrad = net:get(1).gradInput 
layer1WeightGrads = net:get(1).gradWeight 
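-- Any other parameterised module can be read the same way; indices follow 
-- the net:add(...) order above, so net:get(5) is the first nn.Linear layer: 
linear1WeightGrads = net:get(5).gradWeight 
linear1BiasGrads = net:get(5).gradBias 
layer1BiasGrads = net:get(1).gradBias -- the convolution's bias gradient 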

net = nil 
data = nil 
label = nil 
criterion = nil 

複製和粘貼代碼和它的作品般的魅力:)
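
For completeness, the same calls chain together into one full training step. Below is a minimal sketch of my own (the learningRate value is an assumption, not from the original answer); run it while net, data, label and criterion from above are still defined. Note that Torch accumulates gradients into gradWeight/gradBias rather than overwriting them, so zeroGradParameters() must be called before every backward pass:

learningRate = 0.01 -- assumed value; tune it for your problem 
net:zeroGradParameters() -- clear gradients accumulated by earlier passes 
output = net:forward(data) 
loss = criterion:forward(output, label) 
gradOutput = criterion:backward(output, label) 
net:backward(data, gradOutput) 
net:updateParameters(learningRate) -- w = w - learningRate * gradWeight 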