Lasagne MLP outputs zero
While experimenting with a simple MLP, I stripped the code down to what seemed like the bare essentials, but I still get strange results.
Code:
import numpy as np
import theano
import theano.tensor as T
import lasagne
dtype = np.float32
states = np.eye(3, dtype=dtype).reshape(3,1,1,3)
values = np.array([[147, 148, 135,147], [147,147,149,148], [148,147,147,147]], dtype=dtype)
output_dim = values.shape[1]
hidden_units = 50
#Network setup
inputs = T.tensor4('inputs')
targets = T.matrix('targets')
network = lasagne.layers.InputLayer(shape=(None, 1, 1, 3), input_var=inputs)
network = lasagne.layers.DenseLayer(network, hidden_units, nonlinearity=lasagne.nonlinearities.rectify)
network = lasagne.layers.DenseLayer(network, output_dim)
prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.squared_error(prediction, targets).mean()
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.sgd(loss, params, learning_rate=0.01)
f_learn = theano.function([inputs, targets], loss, updates=updates)
f_test = theano.function([inputs], prediction)
#Training
it = 5000
for i in range(it):
    l = f_learn(states, values)

print("Loss: " + str(l))
print("Expected:")
print(values)
print("Learned:")
print(f_test(states))
print("Last layer weights:")
print(lasagne.layers.get_all_param_values(network)[-1])
I would expect the network to learn the numbers in the `values` variable, and it often does, but just as often some output nodes end up stuck at zero, leaving a huge loss.
Sample output:
Loss: 5426.83349609
Expected:
[[ 147. 148. 135. 147.]
[ 147. 147. 149. 148.]
[ 148. 147. 147. 147.]]
Learned:
[[ 146.99993896 0. 134.99993896 146.99993896]
[ 146.99993896 0. 148.99993896 147.99993896]
[ 147.99995422 0. 146.99996948 146.99993896]]
Last layer weights:
[ 11.40957355 0. 11.36747837 10.98625183]
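For what it's worth, outputs stuck exactly at zero are consistent with "dead" ReLU units: once a hidden unit's pre-activation is negative for every training input, both its output and its gradient are zero, so plain SGD can never revive it. A minimal NumPy sketch of this effect (the values here are hypothetical, not taken from the Lasagne run above):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # ReLU derivative: 1 where the pre-activation is positive, else 0
    return (x > 0).astype(np.float64)

# pre-activations of one hidden unit over three inputs (hypothetical)
healthy = np.array([0.5, 1.2, 0.3])
dead = np.array([-2.0, -0.7, -1.5])   # negative for every input

# gradient flowing back from the loss (hypothetical)
upstream = np.array([1.0, 1.0, 1.0])

print(relu(dead))                      # all zeros: the unit never fires
print(upstream * relu_grad(dead))      # all zeros: SGD cannot update it
print(upstream * relu_grad(healthy))   # nonzero: this unit keeps learning
```

With targets around 147 and a learning rate of 0.01, the first few squared-error updates are large, which makes it easy for a unit to be knocked into this regime.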
Why exactly is this happening?