
Answers

Answer 1 (5 votes)

From the tf.train.Optimizer documentation:

Processing gradients.

Calling minimize() takes care of both computing the gradients and applying them to the variables. If you want to process the gradients before applying them, you can instead use the optimizer in three steps:

1. Compute the gradients with compute_gradients().
2. Process the gradients as you wish.
3. Apply the processed gradients with apply_gradients().

Example:

# Create an optimizer. 
opt = GradientDescentOptimizer(learning_rate=0.1) 

# Compute the gradients for a list of variables. 
grads_and_vars = opt.compute_gradients(loss, <list of variables>) 

# grads_and_vars is a list of tuples (gradient, variable). Do whatever you 
# need to the 'gradient' part, for example cap them, etc. 
capped_grads_and_vars = [(MyCapper(gv[0]), gv[1]) for gv in grads_and_vars] 

# Ask the optimizer to apply the capped gradients. 
opt.apply_gradients(capped_grads_and_vars) 
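
A minimal runnable sketch of the same three steps, using tf.clip_by_value as the processing step; the toy variable and loss here are assumptions standing in for your own model:

import tensorflow as tf

x = tf.Variable(3.0)
loss = tf.square(x - 2.0)

opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# Step 1: compute (gradient, variable) pairs.
grads_and_vars = opt.compute_gradients(loss, [x])

# Step 2: process the gradients -- here, clip each one to [-1, 1].
capped = [(tf.clip_by_value(g, -1.0, 1.0), v) for g, v in grads_and_vars]

# Step 3: apply the processed gradients.
train_op = opt.apply_gradients(capped)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op)  # takes one clipped gradient step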
Answer 2 (2 votes)

You may be looking for tf.Graph.gradient_override_map. There is a good example of it in the tensorflow docs:

@tf.RegisterGradient("CustomSquare")
def _custom_square_grad(op, grad):
    # ...

with tf.Graph().as_default() as g:
    c = tf.constant(5.0)
    s_1 = tf.square(c)  # Uses the default gradient for tf.square.
    with g.gradient_override_map({"Square": "CustomSquare"}):
        s_2 = tf.square(c)  # Uses _custom_square_grad to compute the
                            # gradient of s_2.
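
The docs leave the body of the custom gradient elided. Filling it in with an illustrative rule (clipping the true gradient of Square, an assumption made purely for demonstration) gives a runnable version:

import tensorflow as tf

@tf.RegisterGradient("CustomSquare")
def _custom_square_grad(op, grad):
    # The true gradient of square is grad * 2x; clip it as an illustration.
    x = op.inputs[0]
    return tf.clip_by_value(grad * 2.0 * x, -0.1, 0.1)

with tf.Graph().as_default() as g:
    c = tf.constant(5.0)
    s_1 = tf.square(c)
    with g.gradient_override_map({"Square": "CustomSquare"}):
        s_2 = tf.square(c)

    g_1 = tf.gradients(s_1, c)[0]  # default gradient: 2c = 10.0
    g_2 = tf.gradients(s_2, c)[0]  # custom gradient, clipped to 0.1
    with tf.Session() as sess:
        print(sess.run([g_1, g_2]))  # [10.0, 0.1]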

There is a real-world use of this here: a recurrent network implementation that quantizes its weights in the forward pass while passing real-valued gradients through on the backward pass.
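
A minimal sketch of that trick (an assumption about the linked code, not a copy of it): quantize with tf.sign in the forward pass, and map the gradient of Sign to Identity so the real-valued gradient flows straight through:

import tensorflow as tf

def binarize(w):
    # Forward pass uses sign(w); backward pass treats it as identity,
    # so the real-valued gradient passes straight through.
    g = tf.get_default_graph()
    with g.gradient_override_map({"Sign": "Identity"}):
        return tf.sign(w)

w = tf.Variable([0.3, -0.7, 0.1])
w_bin = binarize(w)                                 # forward: [ 1. -1.  1.]
loss = tf.reduce_sum(w_bin * tf.constant([1.0, 2.0, 3.0]))
grad = tf.gradients(loss, w)[0]                     # backward: [1. 2. 3.]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run([w_bin, grad]))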