2017-02-05 84 views
Resuming training after a keyboard interrupt?

I was training a TensorFlow model, and then I mashed the keyboard for laughs:

INFO:tensorflow:global step 101: loss = 5.1761 (52.61 sec/step) 
INFO:tensorflow:global step 102: loss = 4.8679 (18.78 sec/step) 
INFO:tensorflow:global step 103: loss = 4.9662 (19.02 sec/step) 
INFO:tensorflow:global step 104: loss = 5.1126 (17.36 sec/step) 
^C^X^C^[^[^[^[^[ 





exit 
Traceback (most recent call last): 
    File "/Users/kristoffer/web/im2txt/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/train.py", line 114, in <module> 
    tf.app.run() 
    File "/Library/Python/2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run 
    sys.exit(main(sys.argv[:1] + flags_passthrough)) 
    File "/Users/kristoffer/web/im2txt/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/train.py", line 110, in main 
    saver=saver) 
    File "/Library/Python/2.7/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 782, in train 
    sess, train_op, global_step, train_step_kwargs) 
    File "/Library/Python/2.7/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 530, in train_step 
    run_metadata=run_metadata) 
    File "/Library/Python/2.7/site-packages/tensorflow/python/client/session.py", line 766, in run 
    run_metadata_ptr) 
    File "/Library/Python/2.7/site-packages/tensorflow/python/client/session.py", line 964, in _run 
    feed_dict_string, options, run_metadata) 
    File "/Library/Python/2.7/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run 
    target_list, options, run_metadata) 
    File "/Library/Python/2.7/site-packages/tensorflow/python/client/session.py", line 1021, in _do_call 
    return fn(*args) 
    File "/Library/Python/2.7/site-packages/tensorflow/python/client/session.py", line 1003, in _run_fn 
    status, run_metadata) 
KeyboardInterrupt 
Kristoffers-MacBook-Pro:im2txt kristoffer$ logout 
Saving session... 
...copying shared history... 
...saving history...truncating history files... 
...completed. 

[Process completed] 

When I tried to start training again, I got the following error:

$ bazel-bin/im2txt/train --input_file_pattern="${MSCOCO_DIR}/train-?????-of-00256" --inception_checkpoint_file="${INCEPTION_CHECKPOINT}" --train_dir="${MODEL_DIR}/train" --train_inception=false --number_of_steps=150 
CRITICAL:tensorflow:Found no input files matching /train-?????-of-00256 
Traceback (most recent call last): 
    File "/Users/kristoffer/web/im2txt/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/train.py", line 114, in <module> 
    tf.app.run() 
    File "/Library/Python/2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run 
    sys.exit(main(sys.argv[:1] + flags_passthrough)) 
    File "/Users/kristoffer/web/im2txt/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/train.py", line 65, in main 
    model.build() 
    File "/Users/kristoffer/web/im2txt/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/show_and_tell_model.py", line 353, in build 
    self.build_inputs() 
    File "/Users/kristoffer/web/im2txt/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/show_and_tell_model.py", line 153, in build_inputs 
    num_reader_threads=self.config.num_input_reader_threads) 
    File "/Users/kristoffer/web/im2txt/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/ops/inputs.py", line 98, in prefetch_input_data 
    data_files, shuffle=True, capacity=16, name=shard_queue_name) 
    File "/Library/Python/2.7/site-packages/tensorflow/python/training/input.py", line 211, in string_input_producer 
    raise ValueError(not_null_err) 
ValueError: string_input_producer requires a non-null input tensor 

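What causes this, and what can I do about it? Is there a proper way to pause or cancel a training run? (TensorFlow seems to pick up where it left off if you start training with 50 steps and then set the number of steps to 100.)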

Answer

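It looks like your problem is that TensorFlow is trying to load a session that was not saved properly when the code was interrupted. For now, your options are either to skip loading the last session when you restart the code (by commenting out the line that loads it) or to delete the saved session files, after which training should automatically restart from scratch. It is hard to give a more specific example because you did not share your code...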
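
Before deleting anything, it may help to look at what a restarted run would actually find on disk. The sketch below is not part of im2txt; `preflight` is a hypothetical helper, and it assumes the usual TensorFlow layout in which the training directory holds a `checkpoint` index file plus `model.ckpt-<step>.*` files. It also checks the input pattern, because the log in the question prints the pattern as `/train-?????-of-00256`, which suggests `${MSCOCO_DIR}` may have expanded to an empty string in the new shell after the old terminal session was logged out.

```python
import glob
import os


def preflight(input_file_pattern, train_dir):
    """Report what a restarted run would find on disk.

    Hypothetical helper, not part of im2txt or TensorFlow.
    Returns (input_files, checkpoint_files), both sorted lists of paths.
    """
    # Files the input pipeline would read. An empty list corresponds to the
    # "Found no input files matching ..." message shown in the question.
    input_files = sorted(glob.glob(input_file_pattern))

    # Assumed layout of a TensorFlow training directory: a "checkpoint"
    # index file plus "model.ckpt-<step>.*" files.
    checkpoint_files = sorted(
        glob.glob(os.path.join(train_dir, "checkpoint"))
        + glob.glob(os.path.join(train_dir, "model.ckpt-*"))
    )
    return input_files, checkpoint_files


if __name__ == "__main__":
    # Read the same shell variables that the question's command line uses.
    mscoco_dir = os.environ.get("MSCOCO_DIR", "")
    model_dir = os.environ.get("MODEL_DIR", "")
    if not mscoco_dir or not model_dir:
        print("MSCOCO_DIR or MODEL_DIR is unset in this shell")

    inputs, checkpoints = preflight(
        os.path.join(mscoco_dir, "train-?????-of-00256"),
        os.path.join(model_dir, "train"),
    )
    print("input shards found: %d" % len(inputs))
    print("checkpoint files found: %d" % len(checkpoints))
    for path in checkpoints:
        print("  " + path)
```

If the script reports zero input shards, re-exporting the variables in the new shell is worth trying before touching the saved files. If the checkpoint files turn out to be the problem, moving the listed files aside (rather than deleting them outright) should give the fresh start described above while keeping a way back.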