
Simple XOR example on Theano

I am new to Theano and I cannot get a simple XOR example to work with it. I have tried many things to make it work, but it feels like I am just doing shamanism. Look at the code, it is very simple, yet running it I get random results.

import numpy as np 
import collections 

import theano 
import theano.tensor as T 

INPUT_SIZE = 2 
HIDDEN_SIZE = 2 
OUTPUT_SIZE = 1 

def train_2(data, valid_set_x): 
    lr = 0.2 

    x, y = data 

    # symbol declarations 
    ep = T.scalar() 
    sx = T.matrix() 
    sy = T.matrix() 

    w1 = theano.shared(np.random.normal(loc=0, scale=1, size=(INPUT_SIZE, HIDDEN_SIZE))) 
    b1 = theano.shared(np.random.normal(loc=0, scale=1, size=(HIDDEN_SIZE))) 
    w2 = theano.shared(np.random.normal(loc=0, scale=1, size=(HIDDEN_SIZE, OUTPUT_SIZE))) 
    b2 = theano.shared(np.random.normal(loc=0, scale=1, size=(OUTPUT_SIZE))) 

    # symbolic expression-building 
    hid = T.tanh(T.dot(sx, w1) + b1) 
    out = T.tanh(T.dot(hid, w2) + b2) 

    err = 0.5 * T.sum(out - sy) ** 2 

    gw = T.grad(err, w1) 
    gb = T.grad(err, b1) 
    gv = T.grad(err, w2) 
    gc = T.grad(err, b2) 

    list = ((w1, w1 - (lr/ep) * gw), 
            (b1, b1 - (lr/ep) * gb), 
            (w2, w2 - (lr/ep) * gv), 
            (b2, b2 - (lr/ep) * gc)) 

    dict = collections.OrderedDict(list) 

    # compile a fast training function 
    train = theano.function([sx, sy, ep], err, updates=dict) 
    sample = theano.function([sx], out) 

    train_set_size = x.shape[0] 

    # now do the computations 
    batchsize = 1 
    for epoch in xrange(10): 
        err = 0 
        for i in xrange(train_set_size): 
            x_i = x[i * batchsize: (i + 1) * batchsize] 
            y_i = y[i * batchsize: (i + 1) * batchsize] 
            err += train(x_i, y_i, epoch + 1) 
        print "Error: " + str(err) 

    print "Weights:" 
    print w1.get_value() 
    print b1.get_value() 
    print w2.get_value() 
    print b2.get_value() 

    return sample(valid_set_x) 

def test__(files=None): 
    x_set = np.array([[-5, -5], 
                      [-5, 5], 
                      [5, -5], 
                      [5, 5]]).astype("float32") 
    y_set = np.array([[-0.9], [-0.9], [-0.9], [0.9]]).astype("float32") 

    print "Processing..." 
    result_set_x = train_2((x_set, y_set), x_set) 

    print x_set 
    print result_set_x 
    print y_set 

if __name__ == '__main__': 
    test__() 

You may want to follow this Theano tutorial (which includes an XOR example) to fully understand the basics of Theano before getting your hands dirty with it: http://outlace.com/Beginner-Tutorial-Theano/ Also check slides 33 and 34 of this pdf, they are a very concise and clear XOR example: http://speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015_2/Lecture/Theano%20DNN.pdf – gcucurull


I will read the links first, but I remember that when I did a simple XOR in C++, it took only 10 epochs to reach some quality. 10,000 epochs is just anomalous –


10 epochs does seem a bit low, but the number of iterations needed to converge can depend on the weight initialization, batch size, learning rate, activation function, choice of optimizer... Still, 10,000 iterations to train XOR does look anomalously high – gcucurull

Answer


The problem is in your updates section. I renamed the variable to 'updates', since 'list' and 'dict' shadow Python built-in names; not a good choice. Also, I don't know why you want to decay your learning rate so quickly: with `lr/ep`, the step size has already shrunk to 0.2/10 = 0.02 by the tenth epoch, which stalls learning, so I removed the decay. The updates should look like this:

updates = [(w1, w1 - lr * gw), 
           (b1, b1 - lr * gb), 
           (w2, w2 - lr * gv), 
           (b2, b2 - lr * gc)] 

# compile a fast training function 
train = theano.function([sx, sy], err, updates=updates) 
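
One consequence of this change: the compiled `train` function no longer takes the epoch input, so the call inside the training loop must drop its third argument as well. A minimal sketch of the adjusted loop, assuming the epoch count is simply raised (1000 is my assumption; the answer only says it takes more iterations):

# adjusted training loop: no epoch argument is passed to train(), 
# and the number of epochs is raised (1000 here is an assumed value) 
for epoch in xrange(1000): 
    err = 0 
    for i in xrange(train_set_size): 
        x_i = x[i * batchsize: (i + 1) * batchsize] 
        y_i = y[i * batchsize: (i + 1) * batchsize] 
        err += train(x_i, y_i)  # 'ep' input removed 
    print "Error: " + str(err) 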

I ran the changed example and got the results below. It needs more iterations to drive the loss down, but other than that it is fine.

Processing... 
Error: 1.4456279556 
... 
Error: 0.0767515052046 
Weights: 
[[ 0.52955082 -1.26936557] 
[-1.05887804 0.04998216]] 
[ 0.29209577 -0.22703456] 
[[-0.89983822] 
[-0.88619565]] 
[-0.86047891] 
[[-5. -5.] 
[-5. 5.] 
[ 5. -5.] 
[ 5. 5.]] 
[[-0.98989634] 
[-0.68941057] 
[-0.7034631 ] 
[ 0.72087948]] 
[[-0.89999998] 
[-0.89999998] 
[-0.89999998] 
[ 0.89999998]]
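
The sampled outputs (third array) have moved toward the ±0.9 targets (fourth array), so with the decay removed and more iterations the network does converge.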