The network code below works, but it's far too slow. This site suggests the network should reach 99% accuracy after 100 epochs with a learning rate of 0.2, yet mine never gets past 97% even after 1900 epochs. Why won't this simple neural network converge on XOR?
Epoch 0, Inputs [0 0], Outputs [-0.83054376], Targets [0]
Epoch 100, Inputs [0 1], Outputs [ 0.72563824], Targets [1]
Epoch 200, Inputs [1 0], Outputs [ 0.87570863], Targets [1]
Epoch 300, Inputs [0 1], Outputs [ 0.90996706], Targets [1]
Epoch 400, Inputs [1 1], Outputs [ 0.00204791], Targets [0]
Epoch 500, Inputs [0 1], Outputs [ 0.93396672], Targets [1]
Epoch 600, Inputs [0 0], Outputs [ 0.00006375], Targets [0]
Epoch 700, Inputs [0 1], Outputs [ 0.94778227], Targets [1]
Epoch 800, Inputs [1 1], Outputs [-0.00149935], Targets [0]
Epoch 900, Inputs [0 0], Outputs [-0.00122716], Targets [0]
Epoch 1000, Inputs [0 0], Outputs [ 0.00457281], Targets [0]
Epoch 1100, Inputs [0 1], Outputs [ 0.95921556], Targets [1]
Epoch 1200, Inputs [0 1], Outputs [ 0.96001748], Targets [1]
Epoch 1300, Inputs [1 0], Outputs [ 0.96071742], Targets [1]
Epoch 1400, Inputs [1 1], Outputs [ 0.00110912], Targets [0]
Epoch 1500, Inputs [0 0], Outputs [-0.00], Targets [0]
Epoch 1600, Inputs [1 0], Outputs [ 0.9640324], Targets [1]
Epoch 1700, Inputs [1 0], Outputs [ 0.96431516], Targets [1]
Epoch 1800, Inputs [0 1], Outputs [ 0.97004973], Targets [1]
Epoch 1900, Inputs [1 0], Outputs [ 0.96616225], Targets [1]
The dataset I'm using:
0 0 0
1 0 1
0 1 1
1 1 0
The training set is read in by a function from a helper file, but that part isn't relevant to the network.
import numpy as np
import helper
FILE_NAME = 'data.txt'
EPOCHS = 2000
TESTING_FREQ = 5
LEARNING_RATE = 0.2
INPUT_SIZE = 2
HIDDEN_LAYERS = [5]
OUTPUT_SIZE = 1
class Classifier:
    def __init__(self, layer_sizes):
        np.set_printoptions(suppress=True)
        self.activ = helper.tanh
        self.dactiv = helper.dtanh
        network = list()
        for i in range(1, len(layer_sizes)):
            layer = dict()
            layer['weights'] = np.random.randn(layer_sizes[i], layer_sizes[i-1])
            layer['biases'] = np.random.randn(layer_sizes[i])
            network.append(layer)
        self.network = network

    def forward_propagate(self, x):
        for i in range(0, len(self.network)):
            self.network[i]['outputs'] = self.network[i]['weights'].dot(x) + self.network[i]['biases']
            if i != len(self.network)-1:
                self.network[i]['outputs'] = x = self.activ(self.network[i]['outputs'])
            else:
                self.network[i]['outputs'] = self.activ(self.network[i]['outputs'])
        return self.network[-1]['outputs']

    def backpropagate_error(self, x, targets):
        self.forward_propagate(x)
        self.network[-1]['deltas'] = (self.network[-1]['outputs'] - targets) * self.dactiv(self.network[-1]['outputs'])
        for i in reversed(range(len(self.network)-1)):
            self.network[i]['deltas'] = self.network[i+1]['deltas'].dot(self.network[i+1]['weights'] * self.dactiv(self.network[i]['outputs']))

    def adjust_weights(self, inputs, learning_rate):
        self.network[0]['weights'] -= learning_rate * np.atleast_2d(self.network[0]['deltas']).T.dot(np.atleast_2d(inputs))
        self.network[0]['biases'] -= learning_rate * self.network[0]['deltas']
        for i in range(1, len(self.network)):
            self.network[i]['weights'] -= learning_rate * np.atleast_2d(self.network[i]['deltas']).T.dot(np.atleast_2d(self.network[i-1]['outputs']))
            self.network[i]['biases'] -= learning_rate * self.network[i]['deltas']

    def train(self, inputs, targets, epochs, testfreq, lrate):
        for epoch in range(epochs):
            i = np.random.randint(0, len(inputs))
            if epoch % testfreq == 0:
                predictions = self.forward_propagate(inputs[i])
                print('Epoch %s, Inputs %s, Outputs %s, Targets %s' % (epoch, inputs[i], predictions, targets[i]))
            self.backpropagate_error(inputs[i], targets[i])
            self.adjust_weights(inputs[i], lrate)
inputs, outputs = helper.readInput(FILE_NAME, INPUT_SIZE, OUTPUT_SIZE)
print('Input data: {0}'.format(inputs))
print('Output targets: {0}\n'.format(outputs))
np.random.seed(1)
nn = Classifier([INPUT_SIZE] + HIDDEN_LAYERS + [OUTPUT_SIZE])
nn.train(inputs, outputs, EPOCHS, TESTING_FREQ, LEARNING_RATE)
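For comparison, here is a minimal full-batch baseline I put together as a sanity check. It is a sketch, not my actual code: it does not use the helper module (I substitute np.tanh and the identity 1 - tanh(z)**2 for helper.tanh/helper.dtanh), but it keeps the same 2-5-1 tanh architecture and learning rate 0.2 as the constants above, and it does reach the XOR targets within 2000 epochs on my machine:

```python
import numpy as np

np.random.seed(0)

# XOR truth table, trained full-batch instead of one random sample per epoch.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# 2 -> 5 -> 1 network, matching INPUT_SIZE, HIDDEN_LAYERS, OUTPUT_SIZE above.
W1 = np.random.randn(2, 5)
b1 = np.zeros(5)
W2 = np.random.randn(5, 1)
b2 = np.zeros(1)

lr = 0.2
for epoch in range(2000):
    # Forward pass: tanh on both hidden and output layers.
    h = np.tanh(X @ W1 + b1)      # shape (4, 5)
    out = np.tanh(h @ W2 + b2)    # shape (4, 1)

    # Backward pass for squared error. Note the gradient flowing into the
    # hidden layer is (delta . W2^T) computed FIRST, and only then multiplied
    # elementwise by tanh'(h) = 1 - h**2.
    d_out = (out - y) * (1 - out ** 2)     # (4, 1)
    d_h = (d_out @ W2.T) * (1 - h ** 2)    # (4, 5)

    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 3))  # should end up close to [0, 1, 1, 0]
```

So XOR is clearly learnable at this scale and learning rate, which makes me suspect something in my backprop rather than the hyperparameters.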
Have you tried other learning rates? 0.2 may be too low, and it can also become unstable. – eventHandler
@eventHandler I've updated the post. Per this benchmark it doesn't converge fast enough or accurately enough: https://stackoverflow.com/questions/30688527/how-many-epochs-should-a-neural-net-need-to-learn-to-square-testing-results-in –