2017-06-01 36 views

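TensorFlow r1.2 input pipeline Dataset API: why does batching give Dimension(None)?

I built my own dataset, and after batching it with a batch size of 128, dataset.output_shapes reports Dimension(None) for the batch dimension.

I then feed the batches into an RNN.

Because dataset.output_shapes returns Dimension(None) in the first dimension, the RNN raises an error: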

Traceback (most recent call last): 
    File "untitled1.py", line 188, in <module> 
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed) 
    File "/home/harold/anaconda2/envs/tensorflow_py2.7/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run 
    _sys.exit(main(_sys.argv[:1] + flags_passthrough)) 
    File "untitled1.py", line 121, in main 
    run_training() 
    File "untitled1.py", line 57, in run_training 
    is_training=True) 
    File "/home/harold/huawei/ConvLSTM/ConvLSTM.py", line 216, in inference 
    initial_state=initial_state) 
    File "/home/harold/anaconda2/envs/tensorflow_py2.7/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 566, in dynamic_rnn 
    dtype=dtype) 
    File "/home/harold/anaconda2/envs/tensorflow_py2.7/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 636, in _dynamic_rnn_loop 
    "Input size (depth of inputs) must be accessible via shape inference," 
ValueError: Input size (depth of inputs) must be accessible via shape inference, but saw value None. 

I think this error is caused by the shape of the input: the first dimension should be the batch size, not None.

Here is the code:

origin_dataset = Dataset.BetweenS_Dataset(FLAGS.data_path) 
train_dataset = origin_dataset.train_dataset 
test_dataset = origin_dataset.test_dataset 
shuffle_train_dataset = train_dataset.shuffle(buffer_size=10000) 
shuffle_batch_train_dataset = shuffle_train_dataset.batch(128) 
batch_test_dataset = test_dataset.batch(FLAGS.batch_size) 

iterator = tf.contrib.data.Iterator.from_structure(
          shuffle_batch_train_dataset.output_types, 
          shuffle_batch_train_dataset.output_shapes) 
(images, labels) = iterator.get_next() 

training_init_op = iterator.make_initializer(shuffle_batch_train_dataset) 
test_init_op = iterator.make_initializer(batch_test_dataset) 

print(shuffle_batch_train_dataset.output_shapes) 

I printed output_shapes, and it gives:

(TensorShape([Dimension(None), Dimension(36), Dimension(100)]), TensorShape([Dimension(None)])) 

I expected the first dimension to be 128, since I batched the dataset:

(TensorShape([Dimension(128), Dimension(36), Dimension(100)]), TensorShape([Dimension(128)])) 

Why is 'shuffle_batch_test_dataset' (whose shape you print) not defined in your code snippet? Did you mean 'shuffle_batch_train_dataset' instead? –


Yes, I meant shuffle_batch_train_dataset. – HaroldZ


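I think having 'None' as the first dimension should not cause a problem, and from looking at the code it appears to be the expected behavior. The error you get is probably because the input you pass to dynamic_rnn has an undefined size in a dimension other than the batch dimension. Could you add the code where you set up the RNN? –

Answers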


The batch dimension is hard-coded in the implementation, so it always comes back as None (tf 1.3).

def _padded_shape_to_batch_shape(s):
    return tensor_shape.vector(None).concatenate(
        tensor_util.constant_value_as_shape(s))

This way it can batch every element, including a smaller final batch (for example dataset_size=14, batch_size=5, last_batch_size=4).

You can work around this with dataset.filter and dataset.map:
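To make the arithmetic concrete, here is a minimal plain-Python sketch (no TensorFlow involved) of the batch sizes produced when every element is kept. The helper name batch_sizes is hypothetical and exists only for illustration; it is not part of any TensorFlow API.

```python
def batch_sizes(dataset_size, batch_size):
    """Return the size of each batch when no element is dropped."""
    full_batches, remainder = divmod(dataset_size, batch_size)
    sizes = [batch_size] * full_batches
    if remainder:
        sizes.append(remainder)  # the final, smaller batch
    return sizes


sizes = batch_sizes(dataset_size=14, batch_size=5)
print(sizes)  # [5, 5, 4]
```

Since the last batch can be shorter than the others, no single constant describes the first dimension of every batch, which is consistent with the static shape reporting None there.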

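import tensorflow as tf
from tensorflow import contrib

d = contrib.data.Dataset.from_tensor_slices([[5] for x in range(14)])
batch_size = 5
d = d.batch(batch_size)
# Keep only full batches; the last batch here has 4 elements and is dropped.
d = d.filter(lambda e: tf.equal(tf.shape(e)[0], batch_size))
def batch_reshape(e):
    # Reshape so the static batch dimension is batch_size instead of None.
    return tf.reshape(e, [batch_size] + [s if s is not None else -1 for s in e.shape[1:].as_list()])
d = d.map(batch_reshape)
r = d.make_one_shot_iterator().get_next()
print('dataset_output_shape = %s' % r.shape)
with tf.Session() as sess:
    while True:  # ends with OutOfRangeError once the dataset is exhausted
        print(sess.run(r))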

Output

dataset_output_shape = (5, 1)

[[5][5][5][5][5]]

[[5][5][5][5][5]]

OutOfRangeError
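Only two batches are printed because the filter discards the final batch of 4 elements. The following plain-Python sketch mirrors that logic with lists instead of tensors; the helper name batch is hypothetical and it only emulates the behavior, it does not call TensorFlow.

```python
def batch(elements, batch_size):
    """Group elements into consecutive lists, keeping a short final batch."""
    return [elements[i:i + batch_size]
            for i in range(0, len(elements), batch_size)]


elements = [[5] for _ in range(14)]
batch_size = 5

batches = batch(elements, batch_size)
# Emulate the filter step: keep only batches that have exactly batch_size items.
full_batches = [b for b in batches if len(b) == batch_size]

print([len(b) for b in batches])       # [5, 5, 4]
print([len(b) for b in full_batches])  # [5, 5]
```

The trade-off is that the 4 leftover elements are never seen, which matches the output above: two full batches, then the iterator is exhausted.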


Thank you very much! – HaroldZ


There is a new function for this: [tf.contrib.data.batch_and_drop_remainder](https://www.tensorflow.org/api_docs/python/tf/contrib/data/batch_and_drop_remainder) –


Nice! That function is more convenient. – HaroldZ