2017-03-02

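I have built a CNN for image classification. During training I saved several checkpoints. The data is fed into the network through a feed_dict. TensorFlow complains about a missing feed_dict during graph restore.

Now I want to restore the model, but this fails and I don't know why. The important lines of the code are as follows: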

with tf.Graph().as_default():

    ....

    if checkpoint_dir is not None:
        checkpoint_saver = tf.train.Saver()
        session_hooks.append(tf.train.CheckpointSaverHook(
            checkpoint_dir,
            save_secs=flags.save_interval_secs,
            saver=checkpoint_saver))
    ....

    with tf.train.MonitoredTrainingSession(
            save_summaries_steps=flags.save_summaries_steps,
            hooks=session_hooks,
            config=tf.ConfigProto(
                log_device_placement=flags.log_device_placement)) as mon_sess:

        checkpoint = tf.train.get_checkpoint_state(checkpoint_dir)
        if checkpoint and checkpoint.model_checkpoint_path:

            # Restore from the checkpoint file.
            checkpoint_saver.restore(mon_sess, checkpoint.model_checkpoint_path)

            global_step_restore = checkpoint.model_checkpoint_path.split('/')[-1].split('-')[-1]
            print("Model restored from checkpoint: global_step = %s" % global_step_restore)

The line "checkpoint_saver.restore" throws an error:

Traceback (most recent call last):
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1022, in _do_call
    return fn(*args)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1004, in _run_fn
    status, run_metadata)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\contextlib.py", line 66, in __exit__
    next(self.gen)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_images' with dtype float
     [[Node: input_images = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

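Does anyone know how to solve this? Why do I need a populated feed_dict just to restore the graph?

Thanks in advance!

Update:

This is the code of the Saver object's restore method: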

def restore(self, sess, save_path):
    """Restores previously saved variables.

    This method runs the ops added by the constructor for restoring variables.
    It requires a session in which the graph was launched. The variables to
    restore do not have to have been initialized, as restoring is itself a way
    to initialize variables.

    The `save_path` argument is typically a value previously returned from a
    `save()` call, or a call to `latest_checkpoint()`.

    Args:
      sess: A `Session` to use to restore the parameters.
      save_path: Path where parameters were previously saved.
    """
    if self._is_empty:
        return
    sess.run(self.saver_def.restore_op_name,
             {self.saver_def.filename_tensor_name: save_path})

What I don't understand: why is the graph executed immediately? Am I using the wrong method? I only want to restore all trainable variables.


Name all variables and placeholders. Does this help? http://stackoverflow.com/questions/34793978/tensorflow-complaining-about-placeholder-after-model-restore – hars


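All variables are named. What is missing is the feed for my image input tensor. I think the problem is caused by using MonitoredTrainingSession together with feed_dict. MonitoredTrainingSession is intended for larger setups and may not be compatible with feed dictionaries?! I am trying to build a test case for my custom "training framework", so I want to keep the example model lightweight (using feed_dict instead of input queues). – monchi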

Answer


The problem was caused by the SessionRunHook used for progress logging:

Original hook:

class _LoggerHook(tf.train.SessionRunHook):
    """Logs loss and runtime."""

    def begin(self):
        self._step = -1

    def before_run(self, run_context):
        self._step += 1
        self._start_time = time.time()
        return tf.train.SessionRunArgs(loss)  # Asks for loss value.

    def after_run(self, run_context, run_values):
        duration = time.time() - self._start_time
        loss_value = run_values.results
        if self._step % 5 == 0:
            num_examples_per_step = FLAGS.batch_size
            examples_per_sec = num_examples_per_step / duration
            sec_per_batch = float(duration)

            format_str = ('%s: step %d, loss = %.2f (%.1f examples/sec; %.3f '
                          'sec/batch)')
            print(format_str % (datetime.now(), self._step, loss_value,
                                examples_per_sec, sec_per_batch))

Modified hook:

class _LoggerHook(tf.train.SessionRunHook):
    """Logs loss and runtime."""

    def __init__(self, flags, loss_op):
        self._flags = flags
        self._loss_op = loss_op
        self._start_time = time.time()

    def begin(self):
        self._step = 0

    def before_run(self, run_context):
        # Request no extra fetch on the first run() call (the checkpoint
        # restore), so that call does not need the input placeholder.
        if self._step == 0:
            run_args = None
        else:
            run_args = tf.train.SessionRunArgs(self._loss_op)

        return run_args

    def after_run(self, run_context, run_values):

        if self._step > 0:
            duration_n_steps = time.time() - self._start_time
            loss_value = run_values.results
            if self._step % self._flags.log_every_n_steps == 0:
                num_examples_per_step = self._flags.batch_size

                duration = duration_n_steps / self._flags.log_every_n_steps
                examples_per_sec = num_examples_per_step / duration
                sec_per_batch = float(duration)

                format_str = ('%s: step %d, loss = %.2f (%.1f examples/sec; %.3f '
                              'sec/batch)')
                print(format_str % (datetime.now(), self._step, loss_value,
                                    examples_per_sec, sec_per_batch))

                self._start_time = time.time()
        self._step += 1

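Explanation:

Logging now skips the first iteration. A MonitoredTrainingSession calls every hook's before_run for each run call, including the one issued by Saver.restore(..), so the original hook added the loss tensor (which depends on the input_images placeholder) to the restore call. With the first call skipped, the session.run executed by Saver.restore(..) no longer needs a populated feed dictionary.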
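The mechanism can be illustrated without TensorFlow. The following is only a toy sketch: ToyMonitoredSession, AlwaysFetchLossHook, SkipFirstRunHook, try_restore and the OP_DEPENDENCIES table are hypothetical stand-ins invented for illustration, not TensorFlow classes. It assumes that a monitored session merges the fetches requested by its hooks into every run call, which appears to be what makes the restore call depend on the placeholder.

```python
# Toy model only: every name here is a hypothetical stand-in, not a TensorFlow API.

# Placeholders each fake op needs before it can be evaluated.
OP_DEPENDENCIES = {
    "restore_all": set(),          # restoring variables reads no placeholder
    "train_op": {"input_images"},  # training needs the input batch
    "loss": {"input_images"},      # so does the loss requested by the logger
}


class ToyMonitoredSession:
    """Merges the fetches requested by hooks into every run() call."""

    def __init__(self, hooks):
        self._hooks = hooks

    def run(self, fetch, feed_dict=None):
        feed_dict = feed_dict or {}
        fetches = [fetch]
        for hook in self._hooks:
            extra = hook.before_run()
            if extra is not None:
                fetches.append(extra)
        # Every fetched op must have all of its placeholders fed.
        for op in fetches:
            missing = OP_DEPENDENCIES[op] - set(feed_dict)
            if missing:
                raise ValueError("no value fed for placeholder(s) %s" % sorted(missing))
        for hook in self._hooks:
            hook.after_run()
        return fetches


class AlwaysFetchLossHook:
    """Mimics the original hook: asks for the loss on every run."""

    def before_run(self):
        return "loss"

    def after_run(self):
        pass


class SkipFirstRunHook:
    """Mimics the modified hook: asks for nothing on the first run."""

    def __init__(self):
        self._step = 0

    def before_run(self):
        return None if self._step == 0 else "loss"

    def after_run(self):
        self._step += 1


def try_restore(hook):
    """Runs the fake restore op without any feed, as Saver.restore would."""
    session = ToyMonitoredSession([hook])
    try:
        session.run("restore_all")
        return "restored"
    except ValueError as err:
        return "failed: %s" % err


print(try_restore(AlwaysFetchLossHook()))
print(try_restore(SkipFirstRunHook()))
```

In this sketch the restore only succeeds with the second hook because the restore happens to be the first run call of the session. The same ordering assumption is built into the modified hook above: a restore issued after training has started would presumably hit the same error again.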