
SGD - loss starts increasing after some iterations

I am trying to implement stochastic gradient descent with two constraints, so I cannot use scikit-learn. Unfortunately, I am already struggling with plain SGD even without the two constraints. The loss (squared loss) on the training set drops for a number of iterations, but after a while it starts to increase, as shown in the linked plot. These are the functions I use:

def loss_prime_simple(w, node, feature, data):
    x = data[3]
    y = data[2]
    x_f = x[node][feature]
    y_node = y[node]
    ret = (y_node - w[feature] * x_f) * (-x_f)
    return ret

def update_weights(w, data, predecs, children, node, learning_rate):
    len_features = len(data[3][0])
    w_new = np.zeros(len_features)
    for feature_ in range(len_features):
        w_new[feature_] = loss_prime_simple(w, node, feature_, data)
    return w - learning_rate * w_new

def loss_simple(w, data):
    y_p = data[2]
    x = data[3]
    return ((y_p - np.dot(w, np.array(x).T)) ** 2).sum()
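
As a general debugging aid, one can compare the analytic derivative against a central finite-difference estimate of the loss; a structural mismatch points at a bug in the gradient. A minimal sketch, assuming the functions above and NumPy (the helper name numerical_grad is made up for illustration):

def numerical_grad(w, data, feature, eps=1e-6):
    # Central finite-difference estimate of d loss_simple / d w[feature].
    w_plus, w_minus = w.copy(), w.copy()
    w_plus[feature] += eps
    w_minus[feature] -= eps
    return (loss_simple(w_plus, data) - loss_simple(w_minus, data)) / (2 * eps)

If loss_prime_simple were a correct derivative of loss_simple on a single-sample data set, the two would agree up to the constant factor of 2 that the code drops; any structural disagreement reveals a bug.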

This shows the training loss for two different learning-rate settings (0.001, 0.0001): http://postimg.org/image/43nbmh8x5/

Can anyone spot a mistake, or does anyone have suggestions on how to debug this? Thanks.

Edit:

As lejlot pointed out, it would be good to include the data. Here is the data I use for x (a single sample): http://textuploader.com/5x0f1

y = 2

This gives the following loss: http://postimg.org/image/o9d97kt9v/

The updated code:

def loss_prime_simple(w, node, feature, data):
    x = data[3]
    y = data[2]
    x_f = x[node][feature]
    y_node = y[node]
    return -(y_node - w[feature] * x_f) * x_f

def update_weights(w, data, predecs, children, node, learning_rate):
    len_features = len(data[3][0])
    w_new = np.zeros(len_features)
    for feature_ in range(len_features):
        w_new[feature_] = loss_prime_simple(w, node, feature_, data)
    return w - learning_rate * w_new

def loss_simple2(w, data):
    y_p = data[2]
    x = data[3]
    return ((y_p - np.dot(w, np.array(x).T)) ** 2).sum()

import numpy as np

# X must be a 2-D array (one row per sample) so that X.shape[0] works below.
X = np.array([])  # put the array from http://textuploader.com/5x0f1 here
y = [2]

data = None, None, y, X

w = np.random.rand(4096)

a = [loss_simple2(w, data)]

for _ in range(200):
    for j in range(X.shape[0]):
        w = update_weights(w, data, None, None, j, 0.0001)
        a.append(loss_simple2(w, data))

from matplotlib import pyplot as plt
plt.figure()
plt.plot(a)
plt.show()

Answers


The problem was that I updated the weights with the per-coordinate residual (y - w[feature] * x_f) instead of the full residual (y - np.dot(w, x)).
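
For reference, a short derivation of the correct per-weight gradient (the 1/2 is a common convention; the corrected code simply absorbs constant factors into the learning rate):

\[
L(w) = \tfrac{1}{2}\bigl(y - w^{\top}x\bigr)^{2}
\quad\Longrightarrow\quad
\frac{\partial L}{\partial w_{f}} = -\bigl(y - w^{\top}x\bigr)\,x_{f},
\]

so every coordinate's update must use the full residual \(y - w^{\top}x\), not just \(y - w_{f}x_{f}\).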

So this works:

def update_weights(w, x, y, learning_rate):
    # Prediction = full inner product over all features.
    inner_product = 0.0
    for f_ in range(len(x)):
        inner_product += w[f_] * x[f_]
    dloss = inner_product - y
    # Step each weight against the gradient dloss * x_f.
    for f_ in range(len(x)):
        w[f_] += learning_rate * (-x[f_] * dloss)
    return w
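
The same update can also be written without the explicit Python loops. A minimal vectorized sketch, assuming w and x are equal-length 1-D NumPy arrays (the name update_weights_vec is made up):

import numpy as np

def update_weights_vec(w, x, y, learning_rate):
    # Full residual: the prediction uses the complete inner product.
    dloss = np.dot(w, x) - y
    # One gradient step on all weights at once.
    return w - learning_rate * dloss * x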

The main mistake to notice is that you reshape instead of transpose. Compare:

>>> import numpy as np
>>> X = np.array(range(10)).reshape(2,-1)
>>> X
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> X.reshape(-1, 2)
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])
>>> X.T
array([[0, 5],
       [1, 6],
       [2, 7],
       [3, 8],
       [4, 9]])
>>> X.reshape(-1, 2) == X.T
array([[ True, False],
       [False, False],
       [False, False],
       [False, False],
       [False,  True]], dtype=bool)
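
The difference: reshape only reinterprets the elements in their existing memory order, while .T actually swaps the axes, so the two generally yield different arrays.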

The next thing that looks wrong is calling sum(array); you should rather call array.sum():

>>> import numpy as np
>>> x = np.array(range(10)).reshape(2, 5)
>>> x
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> sum(x)
array([ 5,  7,  9, 11, 13])
>>> x.sum()
45
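
(The builtin sum iterates over the array's first axis and adds the rows element-wise, while array.sum() reduces over all elements at once.)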

After this, it works just fine:

def loss_prime_simple(w, node, feature, data):
    x = data[3]
    y = data[2]
    x_f = x[node][feature]
    y_node = y[node]
    return -(y_node - w[feature] * x_f) * x_f

def update_weights(w, data, predecs, children, node, learning_rate):
    len_features = len(data[3][0])
    w_new = np.zeros(len_features)
    for feature_ in range(len_features):
        w_new[feature_] = loss_prime_simple(w, node, feature_, data)
    return w - learning_rate * w_new

def loss_simple(w, data):
    y_p = data[2]
    x = data[3]
    return ((y_p - np.dot(w, np.array(x).T)) ** 2).sum()

import numpy as np 

X = np.random.randn(1000, 3) 
y = np.random.randn(1000) 

data = None, None, y, X 

w = np.array([1,3,3]) 

loss = [loss_simple(w, data)] 

for _ in range(20):
    for j in range(X.shape[0]):
        w = update_weights(w, data, None, None, j, 0.001)
        loss.append(loss_simple(w, data))

from matplotlib import pyplot as plt 
plt.figure() 
plt.plot(loss) 
plt.show() 

(Plot: the training loss now decreases steadily over the iterations.)


Thanks for the suggestion. I tried it, but it did not change anything. I edited it into my question – TobSta


If the problem persists you have to provide a minimal working example - the **full** code (not just a few of the methods; maybe you are running them incorrectly) and the data on which it fails – lejlot


Thanks for pointing that out :) I edited the question – TobSta
