下面是我從閱讀文檔,其他類似的解決方案以及試驗和錯誤中得出的結論。這是一個關於隨機數據的簡單自動編碼器。如果跑了,然後又跑了,它會從停下的地方繼續跑下去(即第一次跑的成本函數從大約0.5 - > 0.3秒開始跑〜0.3)。除非我錯過了一些東西,否則所有的保存,構造函數,模型構建,add_to_collection都需要並按照精確的順序進行,但可能有一個更簡單的方法。
是的,加載與import_meta_graph
圖形並不是真的需要在這裏,因爲代碼是正確的上面,但是我想在我的實際應用程序。
from __future__ import print_function
import tensorflow as tf
import os
import math
import numpy as np
output_dir = "/root/Data/temp"
model_checkpoint_file_base = os.path.join(output_dir, "model.ckpt")
input_length = 10
encoded_length = 3
learning_rate = 0.001
n_epochs = 10
n_batches = 10
if not os.path.exists(model_checkpoint_file_base + ".meta"):
print("Making new")
brand_new = True
x_in = tf.placeholder(tf.float32, [None, input_length], name="x_in")
W_enc = tf.Variable(tf.random_uniform([input_length, encoded_length],
-1.0/math.sqrt(input_length),
1.0/math.sqrt(input_length)), name="W_enc")
b_enc = tf.Variable(tf.zeros(encoded_length), name="b_enc")
encoded = tf.nn.tanh(tf.matmul(x_in, W_enc) + b_enc, name="encoded")
W_dec = tf.transpose(W_enc, name="W_dec")
b_dec = tf.Variable(tf.zeros(input_length), name="b_dec")
decoded = tf.nn.tanh(tf.matmul(encoded, W_dec) + b_dec, name="decoded")
cost = tf.sqrt(tf.reduce_mean(tf.square(decoded - x_in)), name="cost")
saver = tf.train.Saver()
else:
print("Reloading existing")
brand_new = False
saver = tf.train.import_meta_graph(model_checkpoint_file_base + ".meta")
g = tf.get_default_graph()
x_in = g.get_tensor_by_name("x_in:0")
cost = g.get_tensor_by_name("cost:0")
sess = tf.Session()
if brand_new:
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()
sess.run(init)
tf.add_to_collection("optimizer", optimizer)
else:
saver.restore(sess, model_checkpoint_file_base)
optimizer = tf.get_collection("optimizer")[0]
for epoch_i in range(n_epochs):
for batch in range(n_batches):
batch = np.random.rand(50, input_length)
_, curr_cost = sess.run([optimizer, cost], feed_dict={x_in: batch})
print("batch_cost:", curr_cost)
save_path = tf.train.Saver().save(sess, model_checkpoint_file_base)
來源
2017-04-06 00:12:11
Ken
我和AdamOptimizer有同樣的問題。我設法通過將我的操作放到集合中來讓事情發揮作用。這個例子幫助了我很多:http://www.seaandsailor.com/tensorflow-checkpointing.html –