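Computing the norm of the gradient with respect to the weights in Keras

I am trying to compute the norm of the gradient with respect to the weights of a neural network in Keras (as a diagnostic tool). Eventually I want to create a callback for this, but on the way there I have been struggling to write a function that computes the gradient and returns actual values as a numpy array or scalar (rather than just a TensorFlow tensor). The code is as follows: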

import numpy as np 
import keras.backend as K 
from keras.layers import Dense 
from keras.models import Sequential 


def get_gradient_norm_func(model): 
    grads = K.gradients(model.total_loss, model.trainable_weights) 
    summed_squares = [K.sum(K.square(g)) for g in grads] 
    norm = K.sqrt(sum(summed_squares)) 
    func = K.function([model.input], [norm]) 
    return func 


def main(): 
    x = np.random.random((128,)).reshape((-1, 1)) 
    y = 2 * x 
    model = Sequential(layers=[Dense(2, input_shape=(1,)), 
           Dense(1)]) 
    model.compile(loss='mse', optimizer='RMSprop') 
    get_gradient = get_gradient_norm_func(model) 
    history = model.fit(x, y, epochs=1) 
    print(get_gradient([x])) 

if __name__ == '__main__': 
    main()

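The code fails at the call to get_gradient(). The traceback is long and mentions many shapes, but says little about what the correct shape would be. How can I fix this?

Ideally I would like a backend-agnostic solution, but a TensorFlow-based solution is also an option.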

2017-08-15 15:39:14.914388: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1,-1] has negative dimensions 
2017-08-15 15:39:14.914414: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1,-1] has negative dimensions 
     [[Node: dense_2_target = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] 
2017-08-15 15:39:14.915026: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1,-1] has negative dimensions 
2017-08-15 15:39:14.915038: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1,-1] has negative dimensions 
     [[Node: dense_2_target = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] 
2017-08-15 15:39:14.915310: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1] has negative dimensions 
2017-08-15 15:39:14.915321: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1] has negative dimensions 
     [[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] 
Traceback (most recent call last): 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call 
    return fn(*args) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn 
    status, run_metadata) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/contextlib.py", line 89, in __exit__ 
    next(self.gen) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status 
    pywrap_tensorflow.TF_GetCode(status)) 
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape [-1] has negative dimensions 
     [[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
    File "gradientlog.py", line 45, in <module> 
    main() 
    File "gradientlog.py", line 42, in main 
    print(get_gradient([x])) 
    File "/home/josteb/sandbox/keras/keras/backend/tensorflow_backend.py", line 2251, in __call__ 
    **self.session_kwargs) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run 
    run_metadata_ptr) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run 
    feed_dict_string, options, run_metadata) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run 
    target_list, options, run_metadata) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call 
    raise type(e)(node_def, op, message) 
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape [-1] has negative dimensions 
     [[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] 

Caused by op 'dense_2_sample_weights', defined at: 
    File "gradientlog.py", line 45, in <module> 
    main() 
    File "gradientlog.py", line 39, in main 
    model.compile(loss='mse', optimizer='RMSprop') 
    File "/home/josteb/sandbox/keras/keras/models.py", line 783, in compile 
    **kwargs) 
    File "/home/josteb/sandbox/keras/keras/engine/training.py", line 799, in compile 
    name=name + '_sample_weights')) 
    File "/home/josteb/sandbox/keras/keras/backend/tensorflow_backend.py", line 435, in placeholder 
    x = tf.placeholder(dtype, shape=shape, name=name) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1530, in placeholder 
    return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1954, in _placeholder 
    name=name) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op 
    op_def=op_def) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op 
    original_op=self._default_original_op, op_def=op_def) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__ 
    self._traceback = _extract_stack() 

InvalidArgumentError (see above for traceback): Shape [-1] has negative dimensions 
     [[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Source

2017-08-15 josteinb

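Several placeholders are involved in Keras's gradient computation:

1. Input x
2. Target y
3. Sample weights: even if you do not provide them to model.fit(), Keras still creates a placeholder for the sample weights and feeds np.ones((y.shape[0],), dtype=K.floatx()) into the graph during training.
4. Learning phase: this placeholder is connected to the gradient tensor only if some layer uses it (e.g. Dropout).

So in the example you provided, you need to feed x, y and sample_weights into the graph in order to compute the gradients. That is the root cause of the error.

Inside Model._make_train_function(), the following lines show how to build the necessary inputs to K.function() in this case: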
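To see why all three values are needed, here is a minimal pure-Python sketch that computes the same kind of gradient norm by hand for a one-weight linear model. It is not Keras code: the function name `weighted_mse_gradient_norm`, the model `w * x + b`, and the plain division by the number of samples are assumptions made for illustration, and Keras's own sample-weight normalization may differ in detail.

```python
import math


def weighted_mse_gradient_norm(x, y, sample_weights, w, b):
    """Global L2 norm of d(loss)/d(w, b) for the model y_hat = w * x + b.

    Assumed loss (illustrative, not Keras's exact formula):
        loss = sum_i s_i * (w * x_i + b - y_i) ** 2 / n
    """
    n = len(x)
    grad_w = 0.0
    grad_b = 0.0
    for x_i, y_i, s_i in zip(x, y, sample_weights):
        residual = w * x_i + b - y_i
        # Each term needs the input, the target AND the sample weight.
        grad_w += 2.0 * s_i * residual * x_i / n
        grad_b += 2.0 * s_i * residual / n
    # Same aggregation as in the question: sqrt of the summed squares.
    return math.sqrt(grad_w ** 2 + grad_b ** 2)


x = [0.0, 1.0, 2.0]
y = [0.0, 2.0, 4.0]          # y = 2 * x, as in the question
ones = [1.0] * len(y)        # default sample weights

norm_at_start = weighted_mse_gradient_norm(x, y, ones, w=0.0, b=0.0)
norm_at_solution = weighted_mse_gradient_norm(x, y, ones, w=2.0, b=0.0)
print(norm_at_start, norm_at_solution)
```

The gradient cannot be evaluated from x alone: the residual depends on y, and every term is scaled by its sample weight, which is why the graph refuses to run when those placeholders are left unfed.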

inputs = self._feed_inputs + self._feed_targets + self._feed_sample_weights 
if self.uses_learning_phase and not isinstance(K.learning_phase(), int): 
    inputs += [K.learning_phase()] 

with K.name_scope('training'): 
    ... 
    self.train_function = K.function(inputs, 
            [self.total_loss] + self.metrics_tensors, 
            updates=updates, 
            name='train_function', 
            **self._function_kwargs)

By mimicking this function, you should be able to get the norm value:

def get_gradient_norm_func(model): 
    grads = K.gradients(model.total_loss, model.trainable_weights) 
    summed_squares = [K.sum(K.square(g)) for g in grads] 
    norm = K.sqrt(sum(summed_squares)) 
    inputs = model.model._feed_inputs + model.model._feed_targets + model.model._feed_sample_weights 
    func = K.function(inputs, [norm]) 
    return func 

def main(): 
    x = np.random.random((128,)).reshape((-1, 1)) 
    y = 2 * x 
    model = Sequential(layers=[Dense(2, input_shape=(1,)), 
           Dense(1)]) 
    model.compile(loss='mse', optimizer='rmsprop') 
    get_gradient = get_gradient_norm_func(model) 
    history = model.fit(x, y, epochs=1) 
    print(get_gradient([x, y, np.ones(len(y))]))

Output:

Epoch 1/1 
128/128 [==============================] - 0s - loss: 2.0073  
[4.4091368]

Note that because you are using Sequential rather than Model, you need model.model._feed_* instead of model._feed_*.

Source
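If the same function should work for both kinds of model, the attribute lookup can be hidden in a small helper. The sketch below is hypothetical: `gradient_feed_placeholders` is not part of Keras, and it assumes that only Sequential exposes an inner `model` attribute while a plain Model does not. The stand-in objects merely mimic the private `_feed_*` attributes so that the snippet runs without Keras installed.

```python
from types import SimpleNamespace


def gradient_feed_placeholders(model):
    """Return inputs + targets + sample-weight placeholders, in that order.

    Hypothetical helper: falls back to `model` itself when there is no
    inner `model` attribute (assumed to be the case for a plain Model).
    """
    inner = getattr(model, "model", model)
    return (list(inner._feed_inputs)
            + list(inner._feed_targets)
            + list(inner._feed_sample_weights))


# Stand-ins for real Keras objects; strings play the role of placeholders.
functional_like = SimpleNamespace(
    _feed_inputs=["x"], _feed_targets=["y"], _feed_sample_weights=["w"])
sequential_like = SimpleNamespace(model=functional_like)

print(gradient_feed_placeholders(sequential_like))
print(gradient_feed_placeholders(functional_like))
```

Because these are private attributes, their names may change between Keras versions, so a helper like this is best treated as a diagnostic convenience rather than a stable interface.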

2017-08-18 20:46:03

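Thank you for the very clear answer. With the pointer to '_make_train_function', I was also able to figure out how to plug an arbitrary Keras tensor into Keras's metrics system, so that the tensor's value is logged at every iteration (this can be done by appending to 'model.metrics_tensors' and 'model.metrics_names', both of which are lists, after compiling the model). – josteinb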
