I am using Theano stochastic gradient descent to solve a minimization problem. On the first iterations my code appears to work, but after a while the optimized parameter (eta) suddenly becomes NaN (and so does the gradient g_eta). This looks like a Theano technical issue rather than a bug in my code, since I have checked it in several different ways.
Does anyone have an idea what the cause might be? My code is below:
import numpy as np
import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams

n_exp = 4
features = theano.shared(value=X_comb_I, name='features', borrow=True)
x = T.dmatrix()
y = T.ivector()

# Random initialization of eta
srng = RandomStreams()
rv_u = srng.uniform((64, n_exp))
eta = theano.shared(value=rv_u.eval(), name='eta', borrow=True)

# Softmax-style weighting: exponentiate, then normalize each row
ndotx = T.exp(T.dot(features, eta))
g = ndotx / T.reshape(T.repeat(T.sum(ndotx, axis=1), n_exp, axis=0), [n_i, n_exp])

my_score_given_eta = T.sum(g * x, axis=1)
cost = T.mean(T.abs_(my_score_given_eta - y))
g_eta = T.grad(cost=cost, wrt=eta)

learning_rate = 0.5
updates = [(eta, eta - learning_rate * g_eta)]

train_set_x = theano.shared(value=score, name='train_set_x', borrow=True)
train_set_y = theano.shared(value=labels.astype(np.int32), name='train_set_y', borrow=True)

train = theano.function(inputs=[], outputs=cost,
                        updates=updates, givens={x: train_set_x, y: train_set_y})
validate = theano.function(inputs=[], outputs=cost,
                           givens={x: train_set_x, y: train_set_y})

train_monitor = []
val_monitor = []

n_epochs = 1000
for epoch in range(n_epochs):
    loss = train()
    train_monitor.append(validate())
    if epoch % 2 == 0:
        print "Iteration: ", epoch
        print "Training error, validation error: ", train_monitor[-1]  #, val_monitor[-1]
Thanks!
Have you tried a smaller learning rate? –
Hi, yes, I have tried that. But I still get NaNs after a few iterations :( –
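Beyond the learning rate, a likely cause of the NaNs is overflow in T.exp: once any entry of T.dot(features, eta) exceeds roughly 709 (for float64), exp returns inf, and the row normalization then produces inf/inf = nan, which propagates into the gradient. The standard remedy is to subtract the per-row maximum before exponentiating, which leaves the normalized result unchanged (Theano's built-in T.nnet.softmax applies this trick internally). A minimal NumPy sketch of the effect, with illustrative values only:

```python
import numpy as np

def softmax_rows(z):
    # Subtract the per-row max before exponentiating, so exp never
    # overflows to inf and the normalization never yields inf/inf = nan.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# One large logit, as can happen once eta drifts during training.
z = np.array([[1000.0, 0.0],
              [0.5, -0.5]])

with np.errstate(over='ignore', invalid='ignore'):
    # Naive version, as in the question: exp first, then normalize.
    naive = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)

stable = softmax_rows(z)

print(np.isnan(naive).any())   # True: exp(1000) overflowed to inf
print(np.isnan(stable).any())  # False: max-subtraction avoids overflow
```

In the Theano graph above, the equivalent change would be to build g with T.nnet.softmax(T.dot(features, eta)) instead of the explicit exp/reshape/repeat normalization.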