
This is about creating a custom error function using the CNTK module.

batch_axis = C.Axis.default_batch_axis() 
input_seq_axis = C.Axis.default_dynamic_axis() 

input_dynamic_axes = [batch_axis, input_seq_axis] 
input_dynamic_axes2 = [batch_axis, input_seq_axis] 

input = C.input_variable(n_ins, dynamic_axes=input_dynamic_axes, dtype=numpy.float32) 
output = C.input_variable(n_outs, dynamic_axes=input_dynamic_axes2, dtype=numpy.float32) 

dnn_model = cntk_model.create_model(input, hidden_layer_type, hidden_layer_size, n_outs) 

loss = C.squared_error(dnn_model, output) 
error = C.squared_error(dnn_model, output) 

lr_schedule = C.learning_rate_schedule(current_finetune_lr, C.UnitType.minibatch)
momentum_schedule = C.momentum_schedule(current_momentum)

learner = C.adam(dnn_model.parameters, lr_schedule, momentum_schedule, unit_gain=False,
                 l1_regularization_weight=l1_reg, l2_regularization_weight=l2_reg)

trainer = C.Trainer(dnn_model, (loss, error), [learner]) 
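The snippet above relies on names defined elsewhere in the question's code: n_ins, n_outs, hidden_layer_type, hidden_layer_size, the learning settings, and the author's own cntk_model module. A minimal sketch of those definitions with purely illustrative values (none of these are stated in the question):

import numpy
import cntk as C
import cntk_model  # the question's own model-building module

# Assumed, illustrative values -- not taken from the question.
n_ins, n_outs = 100, 6          # n_outs = 6 would match the [0:5] / [5] indexing used further below
hidden_layer_type = ['TANH', 'LSTM']
hidden_layer_size = [512, 256]
current_finetune_lr = 0.001
current_momentum = 0.9
l1_reg, l2_reg = 0.0, 1e-5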

The above is part of my current Python code for NN training. Here is the code that creates the neural network model:

def create_model(features, hidden_layer_type, hidden_layer_size, n_out):
    logger.debug('Creating cntk model')
    assert len(hidden_layer_size) == len(hidden_layer_type)

    n_layers = len(hidden_layer_size)

    my_layers = list()
    for i in range(n_layers):
        if hidden_layer_type[i] == 'TANH':
            my_layers.append(C.layers.Dense(hidden_layer_size[i], activation=C.tanh, init=C.layers.glorot_uniform()))
        elif hidden_layer_type[i] == 'LSTM':
            my_layers.append(C.layers.Recurrence(C.layers.LSTM(hidden_layer_size[i])))
        else:
            raise Exception('Unknown hidden layer type')

    # Linear output layer
    my_layers.append(C.layers.Dense(n_out, activation=None))

    my_model = C.layers.Sequential(my_layers)
    my_model = my_model(features)

    return my_model

Now I want to change the backpropagation so that the error is not computed directly on the network output, but on the output after some additional computation. I tried to define something like this:

def create_error_function(self, prediction, target):
    # De-normalize the prediction
    prediction_denorm = C.element_times(prediction, self.std_vector)
    prediction_denorm = C.plus(prediction_denorm, self.mean_vector)
    # Round the first five outputs to the nearest multiple of 1 / round(prediction_denorm[5])
    prediction_denorm_rounded = C.round(C.element_times(prediction_denorm[0:5], C.round(prediction_denorm[5])))
    prediction_denorm_rounded = C.element_divide(prediction_denorm_rounded, C.round(prediction_denorm[5]))

    # Re-normalize the rounded values
    prediction_norm = C.minus(prediction_denorm_rounded, self.mean_vector[0:5])
    prediction_norm = C.element_divide(prediction_norm, self.std_vector[0:5])

    first = C.squared_error(prediction_norm, target[0:5])
    # Normalized error on the rounded sixth output
    second = C.minus(C.round(prediction_denorm[5]), self.mean_vector[5])
    second = C.element_divide(second, self.std_vector[5])

    return C.plus(first, C.squared_error(second, target[5]))

and used it instead of the standard squared_error. For the NN training part:

dnn_model = cntk_model.create_model(input, hidden_layer_type, hidden_layer_size, n_outs) 
error_function = cntk_model.ErrorFunction(cmp_mean_vector, cmp_std_vector) 
loss = error_function.create_error_function(dnn_model, output) 
error = error_function.create_error_function(dnn_model, output) 
lr_schedule = C.learning_rate_schedule(current_finetune_lr, C.UnitType.minibatch) 
momentum_schedule = C.momentum_schedule(current_momentum) 

learner = C.adam(dnn_model.parameters, lr_schedule, momentum_schedule, unit_gain=False,
                 l1_regularization_weight=l1_reg, l2_regularization_weight=l2_reg)

trainer = C.Trainer(dnn_model, (loss, error), [learner]) 
trainer.train_minibatch({input: temp_train_x, output: temp_train_y}) 

But after two epochs I keep getting the same average loss, and my network is not learning.

Answer


Any time you want to change how backprop works, you need to use stop_gradient. It is the only function whose gradient is different from the gradient of its forward operation: in the forward pass stop_gradient acts as the identity, while in the backward pass it blocks the gradient from propagating.
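A tiny sketch (assuming CNTK 2.x; not part of the original answer) illustrating both behaviors:

import numpy as np
import cntk as C

x = C.input_variable(1, needs_gradient=True)
data = {x: np.array([[2.0]], dtype=np.float32)}

y = C.stop_gradient(C.square(x))
print(y.eval(data))  # forward pass runs normally: [[4.]]
print(y.grad(data))  # backward pass is blocked: [[0.]] instead of 2*x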

To perform an operation f(x) on some x in the forward pass, while pretending in the backward pass that it never happened, you need to do something like this: C.stop_gradient(f(x) - x) + x. In your case that would be

norm_features = C.stop_gradient(features/normalization - features) + features
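Applied to the error function above, the most likely culprit is C.round: its derivative is zero almost everywhere, so no gradient reaches the network parameters, which would explain the flat average loss. A sketch of how the pattern could be dropped in (round_straight_through is a hypothetical helper name, not from the original thread):

def round_straight_through(x):
    # Forward pass: plain C.round(x).
    # Backward pass: the (round(x) - x) term sits inside stop_gradient,
    # so only the trailing + x carries a gradient -- round acts as the
    # identity during backprop (a straight-through estimator).
    return C.stop_gradient(C.round(x) - x) + x

Replacing each C.round(...) call inside create_error_function with round_straight_through(...) keeps the forward computation identical while letting gradients flow back to the model.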


I updated my question. I managed to create a working example with the new loss function, but there seems to be some problem in my implementation, because I get the same average loss in all epochs. I am also not sure where I should add the modification you suggest – sinisha