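Theano HiddenLayer activation function

Is there any way to use rectified linear units (ReLU) as the activation function of a hidden layer in Theano, instead of tanh() or sigmoid()? The hidden layer is implemented as shown below, and as far as I could tell from searching the internet, ReLU is not implemented in Theano.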

import theano.tensor as T

class HiddenLayer(object):
    def __init__(self, rng, input, n_in, n_out, W=None, b=None, activation=T.tanh):
        pass

Source

2014-10-21 A.M.

ReLU is easy to write in Theano:

switch(x<0, 0, x)

To use it in your case, write a Python function that implements ReLU and pass it as the activation:

def relu(x): 
    return theano.tensor.switch(x<0, 0, x) 
HiddenLayer(..., activation=relu)

Some people use this implementation instead: x * (x > 0)

Update: newer versions of Theano provide theano.tensor.nnet.relu(x).
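How the `activation` argument is used is not shown in the question, since the body of `__init__` is elided. As a rough sketch only, the NumPy stand-in below assumes the layer computes `activation(dot(input, W) + b)`. The class name `NumpyHiddenLayer`, the uniform weight initialisation, and the `forward` method are illustrative assumptions, not Theano code.

```python
import numpy as np


def relu(x):
    # Elementwise ReLU, the NumPy analogue of switch(x < 0, 0, x).
    return np.where(x < 0, 0.0, x)


class NumpyHiddenLayer:
    """Hypothetical NumPy stand-in for the HiddenLayer in the question."""

    def __init__(self, rng, n_in, n_out, W=None, b=None, activation=np.tanh):
        # Assumed initialisation: small uniform weights and zero biases.
        self.W = rng.uniform(-0.1, 0.1, size=(n_in, n_out)) if W is None else W
        self.b = np.zeros(n_out) if b is None else b
        self.activation = activation

    def forward(self, x):
        # Affine map followed by whatever callable was passed in.
        lin_output = x @ self.W + self.b
        if self.activation is None:
            return lin_output
        return self.activation(lin_output)


rng = np.random.RandomState(0)
layer = NumpyHiddenLayer(rng, n_in=4, n_out=3, activation=relu)
out = layer.forward(rng.normal(size=(5, 4)))
print(out.shape, bool((out >= 0).all()))
```

Because the activation is an ordinary callable, swapping tanh for ReLU needs no change to the layer itself.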

Source

2014-10-22 00:19:43 nouiz

How is the non-differentiability at zero handled? – Chet 2015-01-20 21:26:48

The gradient is taken to be 0 at that point. – nouiz 2015-01-21 03:21:01
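To illustrate the reply above, the sketch below uses plain Python rather than Theano. It assumes the gradient is the indicator `x > 0`, which is what the benchmark assertion later in this thread compares against. The helper names are hypothetical.

```python
def relu(x):
    # Scalar ReLU.
    return x if x > 0 else 0.0


def relu_grad(x):
    # Convention assumed here: the derivative is the indicator x > 0,
    # so the kink at x == 0 is assigned the value 0.
    return 1.0 if x > 0 else 0.0


def one_sided_slopes(f, x, eps=1e-6):
    # Left and right finite-difference slopes of f at x.
    left = (f(x) - f(x - eps)) / eps
    right = (f(x + eps) - f(x)) / eps
    return left, right


left, right = one_sided_slopes(relu, 0.0)
print(left, right, relu_grad(0.0))
```

The two one-sided slopes disagree at zero, so any value between them is a valid subgradient; 0 is the choice reported in the reply.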

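@nouiz I just installed Theano on my laptop, and the library does not include nnet.relu. However, I can use nnet.relu on my desktop machine, where I installed Theano a few days ago. What could be the reason? – Amir 2016-01-09 20:45:59

I think it is more accurate to write it like this: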

x * (x > 0.) + 0. * (x < 0.)

Source

2014-10-23 18:52:43

`0. * (x < 0.)` will be optimized away, so the formula actually executed will be `x * (x > 0)`. – nouiz 2014-10-24 20:35:18

I write it like this:

lambda x: T.maximum(0,x)

Or:

lambda x: x * (x > 0)
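The variants proposed in this thread can be checked against each other outside Theano. The sketch below uses NumPy stand-ins (`np.where` for `switch`, `np.maximum` for `T.maximum`); the function names are illustrative, and agreement in NumPy does not by itself guarantee identical behaviour in Theano's compiled graphs.

```python
import numpy as np


def relu_switch(x):
    # Analogue of switch(x < 0, 0, x).
    return np.where(x < 0, 0.0, x)


def relu_maximum(x):
    # Analogue of T.maximum(0, x).
    return np.maximum(x, 0.0)


def relu_mask(x):
    # Analogue of x * (x > 0).
    return x * (x > 0)


def relu_mask_verbose(x):
    # Analogue of x * (x > 0.) + 0. * (x < 0.).
    return x * (x > 0.0) + 0.0 * (x < 0.0)


x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
expected = np.array([0.0, 0.0, 0.0, 0.5, 2.0])
for f in (relu_switch, relu_maximum, relu_mask, relu_mask_verbose):
    assert np.array_equal(f(x), expected), f.__name__
print("all variants agree on finite inputs")
```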

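Source

2014-10-28 09:47:24 grin

UPDATE: the latest version of Theano has native support for ReLU: T.nnet.relu, which should be preferred over custom solutions.

I decided to compare the speed of the solutions, since it matters a great deal for neural networks. I compared the speed of the function itself and of its gradient. For the function itself, switch is the fastest; for the gradient, x * (x > 0) is faster. All computed gradients are correct.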

import numpy
import theano
import theano.tensor as T

def relu1(x):
    return T.switch(x < 0, 0, x)

def relu2(x): 
    return T.maximum(x, 0) 

def relu3(x): 
    return x * (x > 0) 


z = numpy.random.normal(size=[1000, 1000]) 
for f in [relu1, relu2, relu3]: 
    x = theano.tensor.matrix() 
    fun = theano.function([x], f(x)) 
    %timeit fun(z) 
    assert numpy.all(fun(z) == numpy.where(z > 0, z, 0)) 

Output: (time to compute ReLU function) 
>100 loops, best of 3: 3.09 ms per loop 
>100 loops, best of 3: 8.47 ms per loop 
>100 loops, best of 3: 7.87 ms per loop 

for f in [relu1, relu2, relu3]: 
    x = theano.tensor.matrix() 
    fun = theano.function([x], theano.grad(T.sum(f(x)), x)) 
    %timeit fun(z) 
    assert numpy.all(fun(z) == (z > 0))

Output: time to compute gradient 
>100 loops, best of 3: 8.3 ms per loop 
>100 loops, best of 3: 7.46 ms per loop 
>100 loops, best of 3: 5.74 ms per loop

Finally, let's compare this with how the gradient should ideally be computed (the fastest way):

x = theano.tensor.matrix() 
fun = theano.function([x], x > 0) 
%timeit fun(z) 
Output: 
>100 loops, best of 3: 2.77 ms per loop

So Theano generates suboptimal code for the gradient. IMHO, the switch version should be preferred today.
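`%timeit` is an IPython magic, so the benchmark above will not run as a plain script. A rough sketch of the same kind of measurement with the standard `timeit` module follows. It uses NumPy stand-ins instead of compiled Theano functions, so its timings say nothing about Theano itself, and the array size and repeat counts are arbitrary choices.

```python
import timeit

import numpy as np


def relu1(x):
    return np.where(x < 0, 0.0, x)


def relu2(x):
    return np.maximum(x, 0.0)


def relu3(x):
    return x * (x > 0)


z = np.random.normal(size=(200, 200))
reference = np.where(z > 0, z, 0)

timings = {}
for f in (relu1, relu2, relu3):
    # Check correctness before timing, as the original benchmark does.
    assert np.array_equal(f(z), reference)
    # Best of 3 repeats of 20 calls each, converted to milliseconds per call.
    best = min(timeit.repeat(lambda: f(z), number=20, repeat=3)) / 20
    timings[f.__name__] = best * 1e3
    print(f"{f.__name__}: {timings[f.__name__]:.3f} ms per call")
```

Taking the minimum over repeats mirrors the "best of 3" figure that `%timeit` reports.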

Source

2015-02-28 00:21:37 Alleo

Is this from [here](https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/scripts/benchmark/time_relu.py)? Note that when you care about GPU speed, `T.maximum` is the fastest. See also [here](https://github.com/Theano/Theano/issues/2698). – Albert 2015-03-29 13:58:58

@Albert, no, I decided to compare the versions I found here (unfortunately I don't have a GPU, so these are all CPU results). Thanks for the first link! – Alleo 2015-03-29 18:10:32

Some follow-up discussion about speed is [here](https://github.com/Theano/Theano/issues/2698). – Albert 2015-04-14 08:18:47

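The function is very simple in Python:

def relu(input):
    # The built-in max works on a plain Python number, not elementwise on arrays.
    output = max(input, 0)
    return output

Source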

2017-05-03 21:20:14
