Parameters of the RNN in the Theano tutorial

import numpy
import theano
import theano.tensor as T


class RNNSLU(object):
    ''' elman neural net model '''
    def __init__(self, nh, nc, ne, de, cs):
        '''
        nh :: dimension of the hidden layer
        nc :: number of classes
        ne :: number of word embeddings in the vocabulary
        de :: dimension of the word embeddings
        cs :: word window context size
        '''
        # parameters of the model
        self.emb = theano.shared(name='embeddings',
                                 value=0.2 * numpy.random.uniform(-1.0, 1.0,
                                 (ne+1, de))
                                 # add one for padding at the end
                                 .astype(theano.config.floatX))
        # input-to-hidden weights
        self.wx = theano.shared(name='wx',
                                value=0.2 * numpy.random.uniform(-1.0, 1.0,
                                (de * cs, nh))
                                .astype(theano.config.floatX))
        # hidden-to-hidden (recurrent) weights
        self.wh = theano.shared(name='wh',
                                value=0.2 * numpy.random.uniform(-1.0, 1.0,
                                (nh, nh))
                                .astype(theano.config.floatX))
        # hidden-to-output weights
        self.w = theano.shared(name='w',
                               value=0.2 * numpy.random.uniform(-1.0, 1.0,
                               (nh, nc))
                               .astype(theano.config.floatX))
        # hidden-layer bias
        self.bh = theano.shared(name='bh',
                                value=numpy.zeros(nh,
                                dtype=theano.config.floatX))
        # output-layer bias
        self.b = theano.shared(name='b',
                               value=numpy.zeros(nc,
                               dtype=theano.config.floatX))
        # initial hidden state
        self.h0 = theano.shared(name='h0',
                                value=numpy.zeros(nh,
                                dtype=theano.config.floatX))

        # bundle
        self.params = [self.emb, self.wx, self.wh, self.w, self.bh, self.b, self.h0]

        # x is not defined in the excerpt above; in the tutorial it is built
        # from a matrix of word-window indices, one row per position
        idxs = T.imatrix()
        x = self.emb[idxs].reshape((idxs.shape[0], de*cs))

        def recurrence(x_t, h_tm1):
            h_t = T.nnet.sigmoid(T.dot(x_t, self.wx)
                                 + T.dot(h_tm1, self.wh) + self.bh)
            s_t = T.nnet.softmax(T.dot(h_t, self.w) + self.b)
            return [h_t, s_t]

        # scan returns (outputs, updates); the updates are not needed here
        [h, s], _ = theano.scan(fn=recurrence,
                                sequences=x,
                                outputs_info=[self.h0, None],
                                n_steps=x.shape[0])

I am following this Theano tutorial on RNNs (http://deeplearning.net/tutorial/rnnslu.html), but I have two questions about it. First, the recurrence function in the tutorial looks like this:

def recurrence(x_t, h_tm1):
    h_t = T.nnet.sigmoid(T.dot(x_t, self.wx)
                         + T.dot(h_tm1, self.wh) + self.bh)
    s_t = T.nnet.softmax(T.dot(h_t, self.w) + self.b)
    return [h_t, s_t]

I wonder why h0 is not added to h_t (i.e. h_t = T.nnet.sigmoid(T.dot(x_t, self.wx) + T.dot(h_tm1, self.wh) + self.bh + self.h0)).

Second, why is outputs_info=[self.h0, None]? I know that outputs_info holds the initial values of the outputs, so I would have expected outputs_info=[self.bh+self.h0, T.nnet.softmax(T.dot(self.bh+self.h0, self.w_h2y) + self.b_h2y)].

Source

2016-03-29 Nils Cao

def recurrence(x_t, h_tm1): 
     h_t = T.nnet.sigmoid(T.dot(x_t, self.wx) 
          + T.dot(h_tm1, self.wh) + self.bh) 
     s_t = T.nnet.softmax(T.dot(h_t, self.w) + self.b) 
     return [h_t, s_t]

So, first you ask why we don't use h0 in the recurrence function. Let's break this part down:

h_t = T.nnet.sigmoid(T.dot(x_t, self.wx)+ T.dot(h_tm1, self.wh) + self.bh)

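We expect three terms.

The first term is the input layer multiplied by a weight matrix: T.dot(x_t, self.wx).
The second term is the previous hidden layer multiplied by another weight matrix (this is what makes the network recurrent): T.dot(h_tm1, self.wh). Note that the hidden state has to go through a weight matrix; what you propose would simply add self.h0 as an extra bias.
The third term is the bias of the hidden layer, self.bh.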
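The three terms can be checked numerically with a small NumPy stand-in for the Theano expression. This is only a sketch: the sizes, the random inputs, and the helper names sigmoid and softmax are assumptions made for illustration, not part of the tutorial. It also shows that adding a constant h0 inside the sigmoid gives the same result as folding it into the bias.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

# Illustrative sizes only (assumed, not the tutorial's settings).
nh, nc, de, cs = 4, 3, 5, 2
rng = np.random.RandomState(0)
wx = 0.2 * rng.uniform(-1.0, 1.0, (de * cs, nh))  # input-to-hidden
wh = 0.2 * rng.uniform(-1.0, 1.0, (nh, nh))       # hidden-to-hidden
w = 0.2 * rng.uniform(-1.0, 1.0, (nh, nc))        # hidden-to-output
bh = np.zeros(nh)
b = np.zeros(nc)

def recurrence(x_t, h_tm1):
    # term 1: input * wx, term 2: previous hidden state * wh, term 3: bias
    h_t = sigmoid(np.dot(x_t, wx) + np.dot(h_tm1, wh) + bh)
    s_t = softmax(np.dot(h_t, w) + b)
    return h_t, s_t

x_t = rng.uniform(-1.0, 1.0, de * cs)  # one embedded context window
h_tm1 = np.zeros(nh)                   # previous hidden state
h_t, s_t = recurrence(x_t, h_tm1)

# Adding a constant vector h0 is the same as using the bias (bh + h0).
h0 = rng.uniform(-1.0, 1.0, nh)
pre = np.dot(x_t, wx) + np.dot(h_tm1, wh)
with_h0 = sigmoid(pre + bh + h0)
merged_bias = sigmoid(pre + (bh + h0))
print(np.allclose(with_h0, merged_bias))
```

Because a constant added at every step is indistinguishable from a bias, it would add nothing that self.bh cannot already learn.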

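Now, at every step we need to keep track of the hidden-layer activations. What the recurrence needs is the activation from the previous step, h_tm1, not a fixed vector. self.h0 only supplies that value for the very first step, when there is no previous activation yet.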

[h, s], _ = theano.scan(fn=recurrence, 
          sequences=x, 
          outputs_info=[self.h0, None], 
          n_steps=x.shape[0])

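So, look at the scan function again. You are right that outputs_info=[self.h0, None] provides initial values, but its entries are also tied to the outputs. recurrence() has two outputs, namely [h_t, s_t].

What outputs_info does is this: after each iteration, the h_t returned by recurrence() (the first return value) takes the place of self.h0 as the value that is fed back in. The shared variable self.h0 itself is not changed by scan. The second element of outputs_info is None because s_t is never fed back into recurrence(), so it needs no initial value. (The entries of outputs_info line up one-to-one with the return values of the recurrence function.)

On the next iteration, the first entry of outputs_info is used as input again, so h_tm1 is the h_t of the previous step; only on the first step is it equal to self.h0. Since the function takes a parameter h_tm1, we must give it an initial value. Since the second output needs no initial value, we set the second entry to None.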
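The feedback described above can be written out as an ordinary loop. The sketch below is a plain-Python stand-in for theano.scan with outputs_info=[h0, None], not Theano's implementation; scan_like and toy_recurrence are hypothetical names, and the toy recurrence is deliberately simpler than the tutorial's.

```python
import numpy as np

def toy_recurrence(x_t, h_tm1):
    # Hypothetical stand-in for the tutorial's recurrence: same signature,
    # returns a hidden state and a second output derived from it.
    h_t = np.tanh(x_t + 0.5 * h_tm1)
    s_t = 2.0 * h_t
    return h_t, s_t

def scan_like(fn, sequences, h0):
    h_prev = h0              # initial value, used on the first step only
    hs, ss = [], []
    for x_t in sequences:
        h_t, s_t = fn(x_t, h_prev)
        hs.append(h_t)       # every h_t is collected ...
        ss.append(s_t)       # ... and so is every s_t
        h_prev = h_t         # only h_t is fed back as the next h_tm1
    return np.array(hs), np.array(ss)

x = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
h0 = np.zeros(2)
h0_before = h0.copy()
h, s = scan_like(toy_recurrence, x, h0)

print(h.shape, s.shape)
print(np.array_equal(h0, h0_before))  # h0 itself is left untouched
```

The second output needs no entry in the loop's carried state, which is what None expresses in outputs_info.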

Admittedly, theano.scan() can be very confusing at times, and I am new to it too. But this is what I learned from working through the same tutorial.

Source

2016-03-29 10:53:40

Thank you for your answer. It is very helpful, and I think I understand what you mean. Thanks a lot.
