I have a question similar to this one: how do I accumulate gradients in TensorFlow?
Because my resources are limited and I am using a deep model (VGG-16) to train a triplet network, I want to accumulate the gradients of 128 mini-batches of one training example each, and only then propagate the error and update the weights.
It is not clear to me how to do this. I work with TensorFlow, but any implementation/pseudocode is welcome.
Let's walk through the code proposed in one of the answers you liked:
import tensorflow as tf  # TF 1.x API

## Optimizer definition - nothing different from any classical example
opt = tf.train.AdamOptimizer()
## Retrieve all trainable variables you defined in your graph
tvs = tf.trainable_variables()
## Create a list of variables with the same shapes as the trainable ones,
## initialized with zeros
accum_vars = [tf.Variable(tf.zeros_like(tv.initialized_value()), trainable=False) for tv in tvs]
zero_ops = [tv.assign(tf.zeros_like(tv)) for tv in accum_vars]
## Call the optimizer's compute_gradients function to obtain the list of
## (gradient, variable) pairs; 'rmse' is your loss tensor
gvs = opt.compute_gradients(rmse, tvs)
## Add each gradient to the corresponding zero-initialized accumulator
## (works because accum_vars and gvs are in the same order)
accum_ops = [accum_vars[i].assign_add(gv[0]) for i, gv in enumerate(gvs)]
## Define the training step (the part that updates the variable values)
train_step = opt.apply_gradients([(accum_vars[i], gv[1]) for i, gv in enumerate(gvs)])
This first part basically adds new variables and ops to your graph, which let you accumulate the gradients with the ops accum_ops into the (list of) variables accum_vars, and update the model weights with train_step.
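One point worth flagging: the ops above sum the gradients over the accumulation steps, so the size of the update grows with the number of mini-batches. If you want the update to match the gradient of one true large batch, you can divide by the number of mini-batches before accumulating. A minimal sketch of that variant, assuming n_minibatches is a Python constant known when the graph is built (this averaging is my addition, not part of the original answer):

## Variant: average instead of sum, so the effective step size does not
## depend on how many mini-batches you accumulate
accum_ops = [accum_vars[i].assign_add(gv[0] / n_minibatches)
             for i, gv in enumerate(gvs)]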
Then, when using it during training, you have to follow these steps (still from the answer you linked):
## The while loop for training
while ...:
    # Run zero_ops to reset the accumulators to zero
    sess.run(zero_ops)
    # Accumulate the gradients 'n_minibatches' times in accum_vars using accum_ops
    for i in range(n_minibatches):
        sess.run(accum_ops, feed_dict={X: Xs[i], y: ys[i]})
    # Run the train_step op to update the weights based on the accumulated gradients
    sess.run(train_step)
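To see all the pieces together, here is a self-contained toy sketch of the same pattern in TF 1.x. The tiny linear model, the rmse loss, and the random data are illustrative stand-ins for your VGG-16 triplet setup, not part of the original answer:

import numpy as np
import tensorflow as tf  # TF 1.x API, as in the answer above

# Tiny linear model standing in for VGG-16 (illustrative only)
X = tf.placeholder(tf.float32, shape=[None, 10])
y = tf.placeholder(tf.float32, shape=[None, 1])
w = tf.Variable(tf.random_normal([10, 1]))
rmse = tf.sqrt(tf.reduce_mean(tf.square(tf.matmul(X, w) - y)))

# Same accumulation setup as above
opt = tf.train.AdamOptimizer()
tvs = tf.trainable_variables()
accum_vars = [tf.Variable(tf.zeros_like(tv.initialized_value()), trainable=False)
              for tv in tvs]
zero_ops = [tv.assign(tf.zeros_like(tv)) for tv in accum_vars]
gvs = opt.compute_gradients(rmse, tvs)
accum_ops = [accum_vars[i].assign_add(gv[0]) for i, gv in enumerate(gvs)]
train_step = opt.apply_gradients([(accum_vars[i], gv[1]) for i, gv in enumerate(gvs)])

# 128 mini-batches of a single example each, matching the question
n_minibatches = 128
Xs = [np.random.randn(1, 10).astype(np.float32) for _ in range(n_minibatches)]
ys = [np.random.randn(1, 1).astype(np.float32) for _ in range(n_minibatches)]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(zero_ops)                    # reset the accumulators
    for i in range(n_minibatches):        # accumulate one gradient per example
        sess.run(accum_ops, feed_dict={X: Xs[i], y: ys[i]})
    sess.run(train_step)                  # one weight update for all 128 examples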
Why don't you use the answers from the question you linked? – Pop
@Pop Because I didn't understand them. I'm looking for something more detailed (beginner level) –