I am using Theano stochastic gradient descent to solve a minimization problem. On the first iterations my code appears to work, but after a while the optimized parameter (eta) suddenly becomes NaN (and so does the gradient g_eta). This looks like a Theano technical issue rather than a bug in my code, since I have checked it in several different ways.
Does anyone have an idea what the cause might be? My code is below:
import numpy as np
import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams

n_exp = 4
features = theano.shared(value=X_comb_I, name='features', borrow=True)
x = T.dmatrix()
y = T.ivector()

# Random initialization of eta
srng = RandomStreams()
rv_u = srng.uniform((64, n_exp))
eta = theano.shared(value=rv_u.eval(), name='eta', borrow=True)

# Softmax-style weighting: exponentiate, then normalize each row
ndotx = T.exp(T.dot(features, eta))
g = ndotx / T.reshape(T.repeat(T.sum(ndotx, axis=1), n_exp, axis=0), [n_i, n_exp])

my_score_given_eta = T.sum(g * x, axis=1)
cost = T.mean(T.abs_(my_score_given_eta - y))
g_eta = T.grad(cost=cost, wrt=eta)

learning_rate = 0.5
updates = [(eta, eta - learning_rate * g_eta)]

train_set_x = theano.shared(value=score, name='train_set_x', borrow=True)
train_set_y = theano.shared(value=labels.astype(np.int32), name='train_set_y', borrow=True)

train = theano.function(inputs=[], outputs=cost,
                        updates=updates, givens={x: train_set_x, y: train_set_y})
validate = theano.function(inputs=[], outputs=cost,
                           givens={x: train_set_x, y: train_set_y})

train_monitor = []
val_monitor = []

n_epochs = 1000
for epoch in range(n_epochs):
    loss = train()
    train_monitor.append(validate())
    if epoch % 2 == 0:
        print "Iteration: ", epoch
        print "Training error, validation error: ", train_monitor[-1]  #, val_monitor[-1]
Thanks!
Have you tried a smaller learning rate? –
Hi, yes, I have tried that. But I still get NaNs after a few iterations :( –
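Beyond the learning rate, a likely cause of the NaNs is overflow in T.exp: once any entry of T.dot(features, eta) exceeds roughly 709 (for float64), exp returns inf, and the row normalization then produces inf/inf = nan, which propagates into the gradient. The standard remedy is to subtract the per-row maximum before exponentiating, which leaves the normalized result unchanged (Theano's built-in T.nnet.softmax applies this trick internally). A minimal NumPy sketch of the effect, with illustrative values only:

```python
import numpy as np

def softmax_rows(z):
    # Subtract the per-row max before exponentiating, so exp never
    # overflows to inf and the normalization never yields inf/inf = nan.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# One large logit, as can happen once eta drifts during training.
z = np.array([[1000.0, 0.0],
              [0.5, -0.5]])

with np.errstate(over='ignore', invalid='ignore'):
    # Naive version, as in the question: exp first, then normalize.
    naive = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)

stable = softmax_rows(z)

print(np.isnan(naive).any())   # True: exp(1000) overflowed to inf
print(np.isnan(stable).any())  # False: max-subtraction avoids overflow
```

In the Theano graph above, the equivalent change would be to build g with T.nnet.softmax(T.dot(features, eta)) instead of the explicit exp/reshape/repeat normalization.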