You can see that the weights are not shared by running the following script:
import numpy as np
import tensorflow as tf

with tf.variable_scope("scope1") as vs:
    cell = tf.nn.rnn_cell.GRUCell(10)
    stacked_cell = tf.nn.rnn_cell.MultiRNNCell([cell] * 2)
    # Build the graph once with dummy input and state variables.
    stacked_cell(tf.Variable(np.zeros((100, 100), dtype=np.float32), name="moo"),
                 tf.Variable(np.zeros((100, 100), dtype=np.float32), name="bla"))
    # Retrieve just the GRU variables.
    vars = [v.name for v in tf.all_variables()
            if v.name.startswith(vs.name)]
    print(vars)
You will see that, apart from the dummy variables, it returns two sets of GRU weights: those under "Cell1" and those under "Cell0".
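The two sets appear even though `[cell] * 2` repeats the same Python object, because the variables are looked up by scope name rather than by cell object. Here is a toy sketch of that idea in plain Python. The `get_variable` helper, the `store` dictionary and `ToyCell` are hypothetical stand-ins for illustration, not TensorFlow's API, and the key names are simplified:

```python
# Toy model of name-keyed variable creation; this is not TensorFlow code.
store = {}

def get_variable(scope, name):
    """Return the variable stored under scope/name, creating it on first use."""
    key = scope + "/" + name
    if key not in store:
        store[key] = object()  # stand-in for a weight tensor
    return store[key]

class ToyCell(object):
    def __call__(self, scope):
        # The cell asks for its weights under whatever scope the caller opened.
        return get_variable(scope, "Gates/Matrix")

cell = ToyCell()
cells = [cell] * 2               # the same object, listed twice
assert cells[0] is cells[1]

# A stacking wrapper that opens a fresh scope per layer ("Cell0", "Cell1").
weights = [c("Cell%d" % i) for i, c in enumerate(cells)]
print(sorted(store))             # two distinct keys, one per layer
print(weights[0] is weights[1])  # the layers do not share weights
```

Under this model, sharing requires the cell itself to keep asking for the same scope, whichever layer calls it.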
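To make them share weights, you can implement your own cell class that inherits from GRUCell and reuses the weights by always using the same variable scope: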
import numpy as np
import tensorflow as tf

class SharedGRUCell(tf.nn.rnn_cell.GRUCell):
    def __init__(self, num_units, input_size=None, activation=tf.nn.tanh):
        tf.nn.rnn_cell.GRUCell.__init__(self, num_units, input_size, activation)
        self.my_scope = None

    def __call__(self, a, b):
        if self.my_scope is None:
            # First call: remember the scope the weights are created in.
            self.my_scope = tf.get_variable_scope()
        else:
            # Later calls: reuse the weights from the remembered scope.
            self.my_scope.reuse_variables()
        return tf.nn.rnn_cell.GRUCell.__call__(self, a, b, self.my_scope)

with tf.variable_scope("scope2") as vs:
    cell = SharedGRUCell(10)
    stacked_cell = tf.nn.rnn_cell.MultiRNNCell([cell] * 2)
    stacked_cell(tf.Variable(np.zeros((20, 10), dtype=np.float32), name="moo"),
                 tf.Variable(np.zeros((20, 10), dtype=np.float32), name="bla"))
    # Retrieve just the GRU variables.
    vars = [v.name for v in tf.all_variables()
            if v.name.startswith(vs.name)]
    print(vars)
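This way the variables are shared between the two GRUCells. Note that you need to be careful with shapes, because the same cell has to process both the raw input and its own output.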
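The shape constraint can be checked with a few lines of plain Python. This is a rough sketch that assumes the usual GRU layout, where the input and the state are concatenated before being multiplied by the gate and candidate matrices. The helper names `gru_weight_shapes` and `can_share` are made up for illustration, and biases are left out:

```python
def gru_weight_shapes(input_size, num_units):
    """Shapes of the two weight matrices of one GRU layer (biases omitted)."""
    rows = input_size + num_units          # input and state are concatenated
    return {
        "gates": (rows, 2 * num_units),    # reset and update gates
        "candidate": (rows, num_units),    # candidate state
    }

def can_share(input_size, num_units):
    """Layer 0 sees the raw input; layer 1 sees layer 0's output (num_units wide)."""
    first_layer = gru_weight_shapes(input_size, num_units)
    second_layer = gru_weight_shapes(num_units, num_units)
    return first_layer == second_layer

print(can_share(100, 10))  # input width 100, 10 units: shapes differ
print(can_share(10, 10))   # input width 10, 10 units: shapes match
```

In other words, one set of weights can only serve both layers when the raw input is as wide as the cell's output, which is why the second script feeds a 10-wide input into a 10-unit cell.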