2017-06-01 47 views

Updating the initial state of a recurrent neural network in TensorFlow

Currently, I have the following code:

init_state = tf.Variable(tf.zeros([batch_partition_length, state_size])) # -> [16, 1024]. 
final_state = tf.Variable(tf.zeros([batch_partition_length, state_size])) 

Inside my inference method, which is responsible for producing the output, I have the following:

def inference(frames): 
    # Note that I declare final_state as a global variable to avoid the shadowing issue, since it is assigned at the dynamic_rnn line.
    global final_state 
    # .... Here we have some conv layers and so on... 

    # Now the RNN cell 
    with tf.variable_scope('local1') as scope: 

     # Move everything into depth so we can perform a single matrix multiply. 
     shape_d = pool3.get_shape() 
     shape = shape_d[1] * shape_d[2] * shape_d[3] 
     # tf_shape = tf.stack(shape) 
     tf_shape = 1024 

     print("shape:", shape, shape_d[1], shape_d[2], shape_d[3]) 

     # So note that tf_shape = 1024, which means that 1024 features are fed into the network. And
     # the batch size = 1024. Therefore, the aim is to divide the batch_size into num_steps so that 
     reshape = tf.reshape(pool3, [-1, tf_shape]) 
     # Now we need to reshape/divide the batch_size into num_steps so that we would be feeding a sequence 
     rnn_inputs = tf.reshape(reshape, [batch_partition_length, step_size, tf_shape]) 

     print('RNN inputs shape: ', rnn_inputs.get_shape()) # -> (16, 64, 1024). 

     cell = tf.contrib.rnn.BasicRNNCell(state_size) 
     # note that rnn_outputs are the outputs but not multiplied by W. 
     rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, rnn_inputs, initial_state=init_state) 

    # linear Wx + b 
    with tf.variable_scope('softmax_linear') as scope: 
     weight_softmax = \ 
      tf.Variable(
       tf.truncated_normal([state_size, n_classes], stddev=1/state_size, dtype=tf.float32, name='weight_softmax')) 
     bias_softmax = tf.constant(0.0, tf.float32, [n_classes], name='bias_softmax') 

     softmax_linear = tf.reshape(
      tf.matmul(tf.reshape(rnn_outputs, [-1, state_size]), weight_softmax) + bias_softmax, 
      [batch_size, n_classes]) 

     print('Output shape:', softmax_linear.get_shape()) 

    return softmax_linear 

# Here we define the loss, the accuracy and the optimizer.
# Now run the graph:

with tf.Session() as sess: 
    _, accuracy_train, loss_train, summary = \ 
      sess.run([optimizer, accuracy, cost_scalar, merged], feed_dict={x: image_batch, 
                      y_valence: valences, 
                      confidence_holder: confidences}) 

    .... 

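Question: How can I assign the value stored in final_state to init_state? That is, how do I update one variable with the value of another?

I have used the following inside the session, after running the sess.run command:

tf.assign(init_state, final_state.eval())

But this throws an error: "You must feed a value for placeholder tensor 'inputs' with dtype float", where 'inputs' is the placeholder declared as follows: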

x = tf.placeholder(tf.float32, [None, 112, 112, 3], name='inputs') 

It is fed with images that are read from the TFRecords file using the following code:

example = tf.train.Example() 
example.ParseFromString(string_record) 

height = int(example.features.feature['height'] 
      .int64_list 
      .value[0]) 

width = int(example.features.feature['width'] 
      .int64_list 
      .value[0]) 

img_string = (example.features.feature['image_raw'] 
       .bytes_list 
       .value[0]) 

img_1d = np.fromstring(img_string, dtype=np.uint8) 
reconstructed_img = img_1d.reshape((height, width, -1)) # Where this is added to the image_batch list, which is fed into the placeholder. 

If I try the following instead:

img_1d = np.fromstring(img_string, dtype=np.float32) 

it produces the following error:

ValueError: cannot reshape array of size 9408 into shape (112,112,newaxis)

Any help is much appreciated!!

Answer


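So here are the mistakes I had made so far. After making some modifications, I came up with the following:

  1. I should not create final_state as a tf.Variable. tf.nn.dynamic_rnn already returns the final state as a tensor (which comes back as an ndarray once it is run in a session), so there is no need to instantiate final_state at the beginning, and I should not use global final_state inside the function definition.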

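  2. To assign final_state to the initial state, I used:

    tf.assign(init_state, final_state)

and things worked. Note: in TensorFlow, running an operation in a session returns its data as a numpy ndarray in Python, and as a tensorflow::Tensor in C and C++.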
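
The assignment in point 2 can be sketched as follows. This is a minimal illustration rather than the code from the question: it assumes TensorFlow 1.13 or later (reached through tf.compat.v1 so it also runs on 2.x), uses toy sizes, and replaces the tf.nn.dynamic_rnn output with a hypothetical stand-in expression, tanh(init_state + x). The name update_state is also an assumption. The point it shows is that tf.assign only builds an op, and that the op has to be run with the same feed_dict because final_state depends on the placeholder. That dependency is probably why final_state.eval() without a feed raised the placeholder error.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1              # TF 1.x graph-mode API (assumption: TF >= 1.13)
tf1.disable_eager_execution()

batch, state_size = 2, 3        # toy sizes, not the 16 x 1024 from the question

graph = tf.Graph()
with graph.as_default():
    x = tf1.placeholder(tf.float32, [batch, state_size], name='inputs')
    # Non-trainable variable that keeps the state between sess.run calls.
    init_state = tf.Variable(tf.zeros([batch, state_size]), trainable=False)
    # Hypothetical stand-in for the final_state tensor from tf.nn.dynamic_rnn.
    final_state = tf.tanh(init_state + x)
    # tf.assign only builds an op; nothing is copied until the op is run.
    update_state = tf1.assign(init_state, final_state)
    init_op = tf1.global_variables_initializer()

with tf1.Session(graph=graph) as sess:
    sess.run(init_op)
    feed = {x: np.ones([batch, state_size], dtype=np.float32)}
    # final_state depends on the placeholder, so the feed_dict is required here.
    state_1 = sess.run(update_state, feed_dict=feed)
    state_2 = sess.run(update_state, feed_dict=feed)
    # The variable now holds the state produced by the second step.
    stored = sess.run(init_state)
```

In a training loop, update_state could be added to the list of fetches in the existing sess.run call, so the state is carried over in the same step that computes the loss.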

See https://www.tensorflow.org/versions/r0.10/get_started/basic_usage for more information.
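
On the reshape error from the question: the size 9408 is what one would expect if a 112 x 112 x 3 uint8 image (37632 bytes) is reinterpreted as 4-byte floats, since 37632 / 4 = 9408, and 9408 is not a multiple of 112 * 112. The sketch below assumes the image was written as raw uint8 bytes with 3 channels, and it uses a zero-filled stand-in for the 'image_raw' bytes. It also swaps in np.frombuffer for np.fromstring, which is deprecated for binary data. It decodes with uint8 first and converts to float32 afterwards.

```python
import numpy as np

height, width, channels = 112, 112, 3

# Stand-in for the 'image_raw' bytes from the record: one uint8 per channel value.
img_string = np.zeros((height, width, channels), dtype=np.uint8).tobytes()

# Reading with dtype=np.float32 reinterprets every 4 bytes as a single float,
# so the element count shrinks by a factor of 4 and no longer fits the image shape.
n_as_float32 = np.frombuffer(img_string, dtype=np.float32).size

# Decode with the dtype the bytes were written in, then convert the values.
img_1d = np.frombuffer(img_string, dtype=np.uint8)
reconstructed_img = img_1d.reshape((height, width, -1)).astype(np.float32)
```

The dtype passed to the decoder describes how the bytes are laid out, while astype changes the value type afterwards, so the two are not interchangeable.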