2016-03-02 105 views

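Using neural networks for regression (with Theano)

I am new to Theano and I am having trouble. I am trying to use Theano to create a neural network that can be used for regression tasks (rather than classification tasks). After reading a lot of tutorials, I came to the conclusion that I can do this by creating an output layer that only handles the regression, and prepending a "normal" neural network with a few hidden layers. (But that is still in the future.)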

So this is my "model":

 1 #!/usr/bin/env python
 2
 3 import numpy as np
 4 import theano
 5 import theano.tensor as T
 6
 7 class RegressionLayer(object):
 8     """Class that represents the linear regression, will be the output layer
 9     of the network."""
10     def __init__(self, input, n_in, learning_rate):
11         self.n_in = n_in
12         self.learning_rate = learning_rate
13         self.input = input
14
15         self.weights = theano.shared(
16             value=np.zeros((n_in, 1), dtype=theano.config.floatX),
17             name='weights',
18             borrow=True
19         )
20
21         self.bias = theano.shared(
22             value=0.0,
23             name='bias'
24         )
25
26         self.regression = T.dot(input, self.weights) + self.bias
27         self.params = [self.weights, self.bias]
28
29     def cost_function(self, y):
30         return (y - self.regression) ** 2

To train the model as in the Theano tutorial, I tried the following:

In [5]: x = T.dmatrix('x') 

In [6]: reg = r.RegressionLayer(x, 3, 0) 

In [8]: y = theano.shared(value = 0.0, name = "y") 

In [9]: cost = reg.cost_function(y) 

In [10]: T.grad(cost=cost, wrt=reg.weights) 


---------------------------------------------------------------------------
TypeError         Traceback (most recent call last) 
<ipython-input-10-0326df05c03f> in <module>() 
----> 1 T.grad(cost=cost, wrt=reg.weights) 

/home/name/PythonENVs/Theano/local/lib/python2.7/site-packages/theano/gradient.pyc in grad(cost, wrt, consider_constant, disconnected_inputs, add_names, known_grads, return_disconnected)
    430 
    431  if cost is not None and cost.ndim != 0: 
--> 432   raise TypeError("cost must be a scalar.") 
    433 
    434  if isinstance(wrt, set): 

TypeError: cost must be a scalar. 

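I feel like I did exactly the same thing (only with the math I need) as is done in Theano's regression tutorial (http://deeplearning.net/tutorial/logreg.html), but it does not work. So why can't I create the gradient?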

Answer


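Your cost function should probably be a sum of squares. At the moment it is a vector of squares, with one entry per input row, but to take the gradient of a scalar function you need to condense it to a single value. This is usually done like this (here with the mean rather than the sum):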

def cost_function(self, y): 
    return ((y - self.regression) ** 2).mean() 
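
The effect of the `.mean()` reduction can be checked without Theano. The following is a minimal NumPy sketch, not the asker's code: the batch size, the random data, and the helper names `analytic_grad` and `numeric_grad` are assumptions made for illustration. It shows that the reduced cost is a scalar and that its gradient with respect to the weights has the same shape as the weights.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_in = 5, 3                    # hypothetical sizes for illustration

x = rng.normal(size=(n_samples, n_in))    # batch of inputs, one row per sample
y = rng.normal(size=(n_samples, 1))       # one target per sample
weights = np.zeros((n_in, 1))
bias = 0.0

def cost_function(weights, bias):
    """Mean squared error: reduces the per-sample squares to one scalar."""
    regression = x.dot(weights) + bias
    return ((y - regression) ** 2).mean()

def analytic_grad(weights, bias):
    """Gradient of the mean squared error with respect to the weights."""
    residual = x.dot(weights) + bias - y           # shape (n_samples, 1)
    return 2.0 * x.T.dot(residual) / n_samples     # shape (n_in, 1)

def numeric_grad(weights, bias, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grad = np.zeros_like(weights)
    for i in range(weights.shape[0]):
        step = np.zeros_like(weights)
        step[i, 0] = eps
        grad[i, 0] = (cost_function(weights + step, bias)
                      - cost_function(weights - step, bias)) / (2 * eps)
    return grad

cost = cost_function(weights, bias)       # zero-dimensional, i.e. a scalar
grad_w = analytic_grad(weights, bias)     # same shape as weights
```

Using `.sum()` instead of `.mean()` would also give a scalar; the gradient would then simply be `n_samples` times larger, which matters mainly for the choice of learning rate.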

Why would line 26, self.regression = T.dot(input, self.weights) + self.bias, return a vector? I mean, a dot product returns a scalar, and the bias is a scalar too. – Uzaku


Ah, usually you would put in a batch of data, so 'input' is a matrix. – eickenberg


OK, thanks :) – Uzaku
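
The point about batches made in the comments can be illustrated with plain NumPy shapes. This is only a sketch under assumed sizes (a batch of 4 samples with 3 features, all-ones inputs), using NumPy arrays in place of Theano's symbolic variables. Because the weights have shape (n_in, 1), the product is never a true scalar, and in the question x is declared with T.dmatrix, so the symbolic result has two dimensions however many rows are passed.

```python
import numpy as np

n_in = 3
weights = np.zeros((n_in, 1))
bias = 0.0

single = np.ones(n_in)            # one sample, shape (3,)
batch = np.ones((4, n_in))        # hypothetical batch of 4 samples, shape (4, 3)

single_out = single.dot(weights) + bias    # shape (1,): an array, not a scalar
batch_out = batch.dot(weights) + bias      # shape (4, 1): one row per sample

squared_error = (0.0 - batch_out) ** 2     # elementwise, still shape (4, 1)
scalar_cost = squared_error.mean()         # reduced to a single value

print(single_out.shape, batch_out.shape, squared_error.ndim, np.ndim(scalar_cost))
```

Only the last value has zero dimensions, which is the condition the "cost must be a scalar" check in the traceback tests for.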