
I am trying to implement a simple XNOR neural network using Theano, and I am getting a dimension mismatch: a Theano ValueError from gemm in which my 2D array's dimensions are apparently interpreted as 1D:

ValueError: dimension mismatch in args to gemm (8,1)x(2,1)->(8,1)

Even though the input has shape (4, 2) and the output has shape (4, 1), I don't understand why it reads the input's dimensions as (8, 1). The product should be (4,2) x (2,1) -> (4,1), but somehow it is treated as (8,1) x (2,1) -> (8,1).

Any idea why it is reading an input of shape (n, m) as (n*m, 1)?
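For reference, a quick NumPy sanity check of the product I expect (assuming np.dot follows the same shape rules here as T.dot):

import numpy as np 

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype='float32')  # shape (4, 2) 
w = np.ones((2, 1), dtype='float32')                             # shape (2, 1) 
print(np.dot(x, w).shape)  # (4, 1): the (4,2) x (2,1) -> (4,1) product I expect 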

The simple neural-network XNOR implementation:

print 'Importing Theano Library ...' 
import theano 
print 'Importing General Libraries ...' 
import numpy as np 
import theano.tensor as T 
from theano import function 
from theano import shared 
from theano.ifelse import ifelse 
import os 
from random import random 
import time 

print(theano.config.device) 

print 'Building Neural Network ...' 
startTime = time.clock() 
rng = np.random 
#Define variables: 
x = T.matrix('x') 
w1 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)])) 
w2 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)])) 
w3 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)])) 
b1 = shared(np.asarray(1., dtype=theano.config.floatX)) 
b2 = shared(np.asarray(1., dtype=theano.config.floatX)) 
learning_rate = 0.01 

#Hidden layer: two sigmoid units sharing the bias b1
a1 = 1/(1+T.exp(-T.dot(x,w1)-b1)) 
a2 = 1/(1+T.exp(-T.dot(x,w2)-b1)) 
#Output layer: stack the hidden activations and apply a third sigmoid unit
x2 = T.stack([a1,a2],axis=1) 
a3 = 1/(1+T.exp(-T.dot(x2,w3)-b2)) 

a_hat = T.vector('a_hat') #Actual output 
cost = -(a_hat*T.log(a3) + (1-a_hat)*T.log(1-a3)).sum() 
dw1,dw2,dw3,db1,db2 = T.grad(cost,[w1,w2,w3,b1,b2]) 

train = function(
    inputs=[x, a_hat],
    outputs=[a3, cost],
    updates=[[w1, w1-learning_rate*dw1],
             [w2, w2-learning_rate*dw2],
             [w3, w3-learning_rate*dw3],
             [b1, b1-learning_rate*db1],
             [b2, b2-learning_rate*db2]])

print 'Neural Network Built' 
TimeDelta = time.clock() - startTime 
print 'Building Time: %.2f seconds' %TimeDelta 


inputs = np.array([[0,0],[0,1],[1,0],[1,1]]).astype(theano.config.floatX) 
outputs = np.array([1,0,0,1]).astype(theano.config.floatX) 

#Iterate through all inputs and find outputs: 

print 'Training the network ...' 
startTime = time.clock() 
cost = [] 
print 'input shape', inputs.shape 
print 'output shape', outputs.shape 

for iteration in range(60000): 
    print 'Iteration no. %d \r' %iteration, 
    pred, cost_iter = train(inputs, outputs) 
    cost.append(cost_iter) 

TimeDelta = time.clock() - startTime 
print 'Training Time: %.2f seconds' %TimeDelta 

#Print the outputs: 
print 'The outputs of the NN are: ' 

for i in range(len(inputs)): 
    print 'The output for x1=%d | x2=%d is %.2f' % (inputs[i][0], inputs[i][1], pred[i]) 

predict = function([x],a3) 

print predict([[0,0]]) 
print predict([[0,1]]) 
print predict([[1,0]]) 
print predict([[1,1]]) 

Terminal output:

Importing Theano Library ... 
Using gpu device 0: NVIDIA Tegra X1 (CNMeM is enabled with initial size: 75.0% of memory, cuDNN 5005) 
Importing General Libraries ... 
gpu 
Building Neural Network ... 
Neural Network Built 
Building Time: 1.78 seconds 
Training the network ... 
input shape (4, 2) 
output shape (4,) 
Traceback (most recent call last): 
  File "neuron2.py", line 59, in <module> 
    pred, cost_iter = train(inputs, outputs) 
  File "/home/ubuntu/Theano/theano/compile/function_module.py", line 879, in __call__ 
    storage_map=getattr(self.fn, 'storage_map', None)) 
  File "/home/ubuntu/Theano/theano/gof/link.py", line 325, in raise_with_op 
    reraise(exc_type, exc_value, exc_trace) 
  File "/home/ubuntu/Theano/theano/compile/function_module.py", line 866, in __call__ 
    self.fn() if output_subset is None else\ 
ValueError: dimension mismatch in args to gemm (8,1)x(2,1)->(8,1) 
Apply node that caused the error: GpuDot22(GpuReshape{2}.0, GpuReshape{2}.0) 
Toposort index: 68 
Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)] 
Inputs shapes: [(8, 1), (2, 1)] 
Inputs strides: [(1, 0), (1, 0)] 
Inputs values: ['not shown', CudaNdarray([[ 0.14762458] 
[ 0.12991147]])] 
Outputs clients: [[GpuReshape{3}(GpuDot22.0, Join.0)]] 

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'. 
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node. 

Answer


The shared variables w1, w2 and w3 are being created as matrices when they should be vectors; they should be cast as shown below (see the shape sketch after the corrected code).

These lines:

w1 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)])) 
w2 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)])) 
w3 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)])) 

should be:

from random import random 
w1 = shared(np.asarray([random(), random()], dtype=theano.config.floatX)) 
w2 = shared(np.asarray([random(), random()], dtype=theano.config.floatX)) 
w3 = shared(np.asarray([random(), random()], dtype=theano.config.floatX))
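
Why the original weights break: each np.array([rng.random(1), rng.random(1)]) is a (2, 1) matrix, so T.dot(x, w1) yields a (4, 1) matrix instead of a (4,) vector, and T.stack([a1,a2],axis=1) then builds a 3D (4, 2, 1) tensor, which Theano flattens to (8, 1) for gemm; those are exactly the shapes in the error above. A minimal NumPy sketch of the shapes (assuming NumPy's dot/stack shape rules mirror Theano's here):

import numpy as np 
rng = np.random 

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype='float32') 

# Original: each weight is built from two length-1 arrays -> a (2, 1) matrix 
w_bad = np.array([rng.random(1), rng.random(1)]) 
print(w_bad.shape)                             # (2, 1) 
a_bad = np.dot(x, w_bad)                       # (4, 2) x (2, 1) -> (4, 1) 
print(np.stack([a_bad, a_bad], axis=1).shape)  # (4, 2, 1): a 3D tensor, flattened 
                                               # to (8, 1) for the failing gemm 

# Fixed: each weight is a plain length-2 vector 
w_good = np.asarray([rng.random(), rng.random()], dtype='float32') 
a_good = np.dot(x, w_good)                       # (4, 2) x (2,) -> (4,) 
print(np.stack([a_good, a_good], axis=1).shape)  # (4, 2): a matrix, ready for 
                                                 # the final (4, 2) x (2,) dot 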