
How do I create a custom loss function in MXNet? For example, instead of computing the cross-entropy loss for a single label (using the standard mx.sym.SoftmaxOutput layer, which computes the cross-entropy loss and returns a symbol that can be passed as the loss to the fit function), I want to compute a weighted cross-entropy loss over every possible label. The MXNet tutorial on custom loss functions and eval_metric mentions using

mx.symbol.MakeLoss(scalar_loss_symbol, normalization='batch') 
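
For illustration, a minimal sketch of what such a scalar_loss_symbol could look like for a class-weighted cross-entropy (the names fc_output and class_weights are hypothetical placeholders, not part of the tutorial):

# hypothetical sketch: a class-weighted cross-entropy as the scalar_loss_symbol
fc_output = mx.sym.var('fc_output')          # logits, shape (batch, num_classes)
one_hot = mx.sym.var('label')                # one-hot labels, shape (batch, num_classes)
class_weights = mx.sym.var('class_weights')  # per-class weights, shape (1, num_classes)
log_prob = mx.sym.log_softmax(data=fc_output)
scalar_loss_symbol = -mx.sym.sum(mx.sym.broadcast_mul(log_prob * one_hot, class_weights))
loss = mx.sym.MakeLoss(scalar_loss_symbol, normalization='batch')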

However, when I use the MakeLoss function, the standard eval_metric "acc" does not work (apparently because the model does not know what my predicted probability vector is), so I need to write my own eval_metric. In addition, at prediction time I also need the predicted probability vector, which cannot be accessed unless I group the final probability vector with the loss symbol and apply block_grad to it.
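
The grouping described above looks roughly like this (a sketch, reusing log_prob and loss from the snippet above):

# sketch: expose the probability output alongside the loss
prob_output = mx.sym.BlockGrad(log_prob, name='prob')  # gradients are blocked on this branch
net = mx.sym.Group([prob_output, loss])                # the module now produces both outputs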

Answer


The code below is a modification of the MXNet MNIST tutorial http://mxnet.io/tutorials/python/mnist.html, in which the standard SoftmaxOutput loss function is rewritten as a custom weighted loss function and the required custom eval_metric is written.

import logging 
logging.getLogger().setLevel(logging.DEBUG) 
import mxnet as mx 
import numpy as np 
mnist = mx.test_utils.get_mnist() 

batch_size = 100 
weighted_train_labels = np.zeros((mnist['train_label'].shape[0], np.max(mnist['train_label']) + 1))
weighted_train_labels[np.arange(mnist['train_label'].shape[0]), mnist['train_label']] = 1  # one-hot encode the labels
train_iter = mx.io.NDArrayIter(mnist['train_data'], {'label':weighted_train_labels}, batch_size, shuffle=True) 

weighted_test_labels = np.zeros((mnist['test_label'].shape[0],np.max(mnist['test_label'])+ 1)) 
weighted_test_labels[np.arange(mnist['test_label'].shape[0]),mnist['test_label']] = 1 
val_iter = mx.io.NDArrayIter(mnist['test_data'], {'label':weighted_test_labels}, batch_size) 

data = mx.sym.var('data') 
# first conv layer 
conv1 = mx.sym.Convolution(data=data, kernel=(5,5), num_filter=20) 
tanh1 = mx.sym.Activation(data=conv1, act_type="tanh") 
pool1 = mx.sym.Pooling(data=tanh1, pool_type="max", kernel=(2,2), stride=(2,2)) 
# second conv layer 
conv2 = mx.sym.Convolution(data=pool1, kernel=(5,5), num_filter=50) 
tanh2 = mx.sym.Activation(data=conv2, act_type="tanh") 
pool2 = mx.sym.Pooling(data=tanh2, pool_type="max", kernel=(2,2), stride=(2,2)) 
# first fullc layer 
flatten = mx.sym.flatten(data=pool2) 
fc1 = mx.symbol.FullyConnected(data=flatten, num_hidden=500) 
tanh3 = mx.sym.Activation(data=fc1, act_type="tanh") 
# second fullc 
fc2 = mx.sym.FullyConnected(data=tanh3, num_hidden=10) 
# softmax loss 
#lenet = mx.sym.SoftmaxOutput(data=fc2, name='softmax') 

# custom loss: cross-entropy against the one-hot 'label' input
label = mx.sym.var('label')
softmax = mx.sym.log_softmax(data=fc2)
softmax_output = mx.sym.BlockGrad(data=softmax, name='softmax')  # expose log-probabilities without gradients
ce = -mx.sym.sum(mx.sym.sum(mx.sym.broadcast_mul(softmax, label), 1))  # sum over classes, then over the batch
lenet = mx.symbol.MakeLoss(ce, normalization='batch')

sym = mx.sym.Group([softmax_output, lenet])  # two outputs: log-probabilities and the loss
print(sym.list_outputs())

def custom_metric(label, softmax):
    # accuracy: fraction of samples whose arg-max prediction matches the one-hot label
    return len(np.where(np.argmax(softmax, 1) == np.argmax(label, 1))[0]) / float(label.shape[0])

eval_metrics = mx.metric.CustomMetric(custom_metric,name='custom-accuracy', output_names=['softmax_output'],label_names=['label']) 

lenet_model = mx.mod.Module(symbol=sym, context=mx.gpu(), data_names=['data'], label_names=['label'])  # use mx.cpu() if no GPU is available
lenet_model.fit(train_iter, 
       eval_data=val_iter, 
       optimizer='sgd', 
       optimizer_params={'learning_rate':0.1}, 
       eval_metric=eval_metrics,#mx.metric.Loss(),#'acc', 
       #batch_end_callback = mx.callback.Speedometer(batch_size, 100), 
       num_epoch=10)
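
At prediction time, the blocked-gradient probabilities can be read back from the grouped symbol. A minimal usage sketch (assuming the output order of the Group above, where the softmax output comes first):

# usage sketch: recover the log-probabilities after training
outputs = lenet_model.predict(val_iter)   # one NDArray per grouped output
log_probs = outputs[0].asnumpy()          # the BlockGrad 'softmax' output came first in the Group
predicted_labels = np.argmax(log_probs, axis=1)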