2017-08-15 27 views
1

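Calculating gradient norm with respect to weights with Keras

I'm trying to calculate the gradient norm with respect to the weights of a neural network with Keras (as a diagnostic tool). Eventually I want to create a callback for this, but on the way there I've been struggling to create a function that computes the gradient and returns actual values as a numpy array/scalar (and not just a TensorFlow tensor). The code is as follows: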

import numpy as np 
import keras.backend as K 
from keras.layers import Dense 
from keras.models import Sequential 


def get_gradient_norm_func(model): 
    grads = K.gradients(model.total_loss, model.trainable_weights) 
    summed_squares = [K.sum(K.square(g)) for g in grads] 
    norm = K.sqrt(sum(summed_squares)) 
    func = K.function([model.input], [norm]) 
    return func 


def main(): 
    x = np.random.random((128,)).reshape((-1, 1)) 
    y = 2 * x 
    model = Sequential(layers=[Dense(2, input_shape=(1,)), 
           Dense(1)]) 
    model.compile(loss='mse', optimizer='RMSprop') 
    get_gradient = get_gradient_norm_func(model) 
    history = model.fit(x, y, epochs=1) 
    print(get_gradient([x])) 

if __name__ == '__main__': 
    main() 

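The code fails on the call to get_gradient(). The traceback is lengthy and mentions a lot of shapes, but gives little information about what the correct shape is. How can I fix this?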

Ideally I would like a backend-agnostic solution, but a TensorFlow-based solution is also an option.

2017-08-15 15:39:14.914388: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1,-1] has negative dimensions 
2017-08-15 15:39:14.914414: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1,-1] has negative dimensions 
     [[Node: dense_2_target = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] 
2017-08-15 15:39:14.915026: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1,-1] has negative dimensions 
2017-08-15 15:39:14.915038: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1,-1] has negative dimensions 
     [[Node: dense_2_target = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] 
2017-08-15 15:39:14.915310: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1] has negative dimensions 
2017-08-15 15:39:14.915321: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1] has negative dimensions 
     [[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] 
Traceback (most recent call last): 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call 
    return fn(*args) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn 
    status, run_metadata) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/contextlib.py", line 89, in __exit__ 
    next(self.gen) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status 
    pywrap_tensorflow.TF_GetCode(status)) 
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape [-1] has negative dimensions 
     [[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
    File "gradientlog.py", line 45, in <module> 
    main() 
    File "gradientlog.py", line 42, in main 
    print(get_gradient([x])) 
    File "/home/josteb/sandbox/keras/keras/backend/tensorflow_backend.py", line 2251, in __call__ 
    **self.session_kwargs) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run 
    run_metadata_ptr) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run 
    feed_dict_string, options, run_metadata) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run 
    target_list, options, run_metadata) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call 
    raise type(e)(node_def, op, message) 
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape [-1] has negative dimensions 
     [[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] 

Caused by op 'dense_2_sample_weights', defined at: 
    File "gradientlog.py", line 45, in <module> 
    main() 
    File "gradientlog.py", line 39, in main 
    model.compile(loss='mse', optimizer='RMSprop') 
    File "/home/josteb/sandbox/keras/keras/models.py", line 783, in compile 
    **kwargs) 
    File "/home/josteb/sandbox/keras/keras/engine/training.py", line 799, in compile 
    name=name + '_sample_weights')) 
    File "/home/josteb/sandbox/keras/keras/backend/tensorflow_backend.py", line 435, in placeholder 
    x = tf.placeholder(dtype, shape=shape, name=name) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1530, in placeholder 
    return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1954, in _placeholder 
    name=name) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op 
    op_def=op_def) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op 
    original_op=self._default_original_op, op_def=op_def) 
    File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__ 
    self._traceback = _extract_stack() 

InvalidArgumentError (see above for traceback): Shape [-1] has negative dimensions 
     [[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] 

Answer

2

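Several placeholders are involved in the gradient computation in Keras:

  1. Input x
  2. Target y
  3. Sample weights: even if you don't pass them to model.fit(), Keras still creates a placeholder for the sample weights and feeds np.ones((y.shape[0],), dtype=K.floatx()) into the graph during training.
  4. Learning phase: this placeholder is connected to the gradient tensor only if some layer uses it (e.g. Dropout).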
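The third item can be illustrated without Keras. The following is a minimal NumPy sketch, assuming a per-sample squared-error loss combined by a simple weighted mean; weighted_mse is a hypothetical helper written for this illustration, not a Keras function, and Keras' own weighting logic has further details that are not reproduced here. It shows that an all-ones weight vector of shape (y.shape[0],) leaves the loss unchanged, which is why the default is invisible during fit() but must still be fed when the graph is evaluated by hand.

```python
import numpy as np


def weighted_mse(y_true, y_pred, sample_weight):
    """Hypothetical helper: per-sample MSE combined with per-sample weights."""
    # Squared error averaged over the output dimension: one value per sample.
    per_sample = np.mean((y_true - y_pred) ** 2, axis=-1)
    # Weight each sample, then average over the batch.
    return np.mean(per_sample * sample_weight)


rng = np.random.RandomState(0)
y_true = rng.random_sample((8, 1))
y_pred = rng.random_sample((8, 1))

# Same shape as the default described above; "float32" is assumed here
# as a stand-in for K.floatx().
ones = np.ones((y_true.shape[0],), dtype="float32")

plain = np.mean((y_true - y_pred) ** 2)
weighted = weighted_mse(y_true, y_pred, ones)
print(plain, weighted)
```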

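So, in the example you provided, you need to feed x, y and sample_weights into the graph in order to compute the gradients. That is the underlying cause of the error: the function was built with model.input only, so the target and sample-weight placeholders (dense_2_target and dense_2_sample_weights in the traceback) were never given values.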

Inside Model._make_train_function(), the following lines show how to construct the necessary inputs to K.function() in this case:

inputs = self._feed_inputs + self._feed_targets + self._feed_sample_weights 
if self.uses_learning_phase and not isinstance(K.learning_phase(), int): 
    inputs += [K.learning_phase()] 

with K.name_scope('training'): 
    ... 
    self.train_function = K.function(inputs, 
            [self.total_loss] + self.metrics_tensors, 
            updates=updates, 
            name='train_function', 
            **self._function_kwargs) 

By mimicking this function, you should be able to get the norm value:

def get_gradient_norm_func(model): 
    grads = K.gradients(model.total_loss, model.trainable_weights) 
    summed_squares = [K.sum(K.square(g)) for g in grads] 
    norm = K.sqrt(sum(summed_squares)) 
    inputs = model.model._feed_inputs + model.model._feed_targets + model.model._feed_sample_weights 
    func = K.function(inputs, [norm]) 
    return func 

def main(): 
    x = np.random.random((128,)).reshape((-1, 1)) 
    y = 2 * x 
    model = Sequential(layers=[Dense(2, input_shape=(1,)), 
           Dense(1)]) 
    model.compile(loss='mse', optimizer='rmsprop') 
    get_gradient = get_gradient_norm_func(model) 
    history = model.fit(x, y, epochs=1) 
    print(get_gradient([x, y, np.ones(len(y))])) 

Output:

Epoch 1/1 
128/128 [==============================] - 0s - loss: 2.0073  
[4.4091368] 

Note that since you're using Sequential instead of Model, you need model.model._feed_* instead of model._feed_*.
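As a backend-independent sanity check, the same quantity can be computed by hand. The following is a minimal NumPy sketch, assuming the two-layer network from the question with linear activations, a plain mean-squared-error loss and no regularization terms; gradient_norm is a hypothetical helper, and the random weights merely stand in for the list that model.get_weights() would return for this model ([kernel, bias, kernel, bias]).

```python
import numpy as np


def gradient_norm(weights, x, y):
    """Global L2 norm of d(MSE)/d(weights) for Dense(2) -> Dense(1), linear."""
    w1, b1, w2, b2 = weights
    n = x.shape[0]

    # Forward pass.
    h = x @ w1 + b1          # hidden layer output, shape (n, 2)
    y_hat = h @ w2 + b2      # network output, shape (n, 1)

    # Backward pass for loss = mean((y_hat - y) ** 2).
    d_out = 2.0 * (y_hat - y) / n   # dL/dy_hat
    d_hidden = d_out @ w2.T         # dL/dh
    grads = [
        x.T @ d_hidden,             # dL/dw1
        d_hidden.sum(axis=0),       # dL/db1
        h.T @ d_out,                # dL/dw2
        d_out.sum(axis=0),          # dL/db2
    ]

    # Same reduction as in get_gradient_norm_func: sqrt of the summed squares.
    return np.sqrt(sum(np.sum(g ** 2) for g in grads))


rng = np.random.RandomState(0)
x = rng.random_sample((128, 1))
y = 2 * x

# Stand-in weights; with Keras these would come from model.get_weights().
weights = [rng.randn(1, 2), np.zeros(2), rng.randn(2, 1), np.zeros(1)]

norm = gradient_norm(weights, x, y)
print(norm)
```

If the weights are taken from the trained model, the result should agree with get_gradient([x, y, np.ones(len(y))]) up to float32 rounding, under the assumptions stated above.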

+1

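Thanks for the very clear answer. Using the pointer to '_make_train_function', I was also able to figure out how to insert an arbitrary Keras tensor into Keras' metrics system, which ensures that the value of that tensor is logged at every iteration (this can be done by appending the tensor to 'model.metrics_tensors' and its name to 'model.metrics_names' (both are lists) after compiling the model). – josteinb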
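The approach described in this comment might look roughly like the sketch below. It assumes a compiled model of the Keras version discussed in this thread, exposing the list attributes metrics_names and metrics_tensors; add_metric_tensor is a hypothetical helper name, and the requirement to call it before the first fit() is an assumption based on the train function being built from these lists in the _make_train_function() excerpt quoted in the answer.

```python
def add_metric_tensor(model, name, tensor):
    """Hypothetical helper: report `tensor` alongside the loss during training.

    Assumes `model` is already compiled and exposes the list attributes
    `metrics_names` and `metrics_tensors`. Presumably this has to happen
    before the first call to fit(), since the train function is built from
    these lists.
    """
    model.metrics_names.append(name)
    model.metrics_tensors.append(tensor)


# Intended usage with the names from the answer (not executed here):
#
#   grads = K.gradients(model.total_loss, model.trainable_weights)
#   norm = K.sqrt(sum(K.sum(K.square(g)) for g in grads))
#   add_metric_tensor(model, 'grad_norm', norm)
#   model.fit(x, y, epochs=1)
```

Because the helper only touches two list attributes, it is independent of the backend; whether those attributes exist depends on the installed Keras version.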