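I want to use gradient descent to solve a system of linear equations, but I get the wrong result every time. To check my code I wrote a NumPy version in which I supply the gradient of the loss explicitly, and that version gives the correct result. The version that relies on GradientDescentOptimizer gives the wrong result.
So I don't understand why GradientDescentOptimizer doesn't work.
Here is my code without TF: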
import numpy as np


class SolveEquation:
    def __init__(self, rate: float, loss_threshold: float=0.0001, max_epochs: int=1000):
        self.__rate = rate
        self.__loss_threshold = loss_threshold
        self.__max_epochs = max_epochs
        self.__x = None

    def solve(self, coefficients, b):
        _a = np.array(coefficients)
        # Reshape b into a column vector so it lines up with A x.
        _b = np.array(b).reshape([len(b), 1])
        _x = np.zeros([_a.shape[1], 1])
        for epoch in range(self.__max_epochs):
            # Explicit gradient of 1/2 * ||Ax - b||^2, i.e. A^T (Ax - b).
            grad_loss = np.matmul(np.transpose(_a), np.matmul(_a, _x) - _b)
            _x -= self.__rate * grad_loss
            # Check the loss every 10 epochs and stop once it is small enough.
            if epoch % 10 == 0:
                loss = np.mean(np.square(np.subtract(np.matmul(_a, _x), _b)))
                print('loss = {:.8f}'.format(loss))
                if loss < self.__loss_threshold:
                    break
        return _x


# A single epoch, so the returned x equals the first update step.
s = SolveEquation(0.1, max_epochs=1)
print(s.solve([[1, 2], [1, 3]], [3, 4]))
Here is my code with TF:
import tensorflow as tf
import numpy as np


class TFSolveEquation:
    def __init__(self, rate: float, loss_threshold: float=0.0001, max_epochs: int=1000):
        self.__rate = rate
        self.__loss_threshold = tf.constant(loss_threshold)
        self.__max_epochs = max_epochs
        self.__session = tf.Session()
        self.__x = None

    def __del__(self):
        try:
            self.__session.close()
        finally:
            pass

    def solve(self, coefficients, b):
        coefficients_data = np.array(coefficients)
        b_data = np.array(b)
        # Placeholders for the coefficient matrix and the right-hand side.
        _a = tf.placeholder(tf.float32)
        _b = tf.placeholder(tf.float32)
        _x = tf.Variable(tf.zeros([coefficients_data.shape[1], 1]))
        loss = tf.reduce_mean(tf.square(tf.matmul(_a, _x) - _b))
        optimizer = tf.train.GradientDescentOptimizer(self.__rate)
        model = optimizer.minimize(loss)
        self.__session.run(tf.global_variables_initializer())
        for epoch in range(self.__max_epochs):
            self.__session.run(model, {_a: coefficients_data, _b: b_data})
            # Check the loss every 10 epochs and stop once it is small enough.
            if epoch % 10 == 0:
                if self.__session.run(loss < self.__loss_threshold, {_a: coefficients_data, _b: b_data}):
                    break
        return self.__session.run(_x)


# A single epoch, so the returned x equals the first update step.
s = TFSolveEquation(0.1, max_epochs=1)
print(s.solve([[1, 2], [1, 3]], [3, 4]))
I tested both versions on a very simple system of equations:
x_1 + 2 * x_2 = 3
x_1 + 3 * x_2 = 4
loss = 1/2 * || Ax - b ||^2
Init x_1 = 0, x_2 = 0, rate = 0.1
With gradient descent, the first iteration should therefore give delta x = (0.7, 1.8).
Unfortunately, my code with TF gives
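For reference, the expected first step can be reproduced by hand. The snippet below is only a minimal sketch using the values stated above (A, b, x = 0, rate = 0.1) and the gradient A^T (Ax - b) of the loss 1/2 * || Ax - b ||^2; the variable names are chosen for illustration.

```python
import numpy as np

# Values from the question: coefficients, right-hand side, start point, rate.
a = np.array([[1.0, 2.0], [1.0, 3.0]])
b = np.array([[3.0], [4.0]])          # column vector, shape (2, 1)
x0 = np.zeros((2, 1))
rate = 0.1

residual = np.matmul(a, x0) - b       # Ax - b at x = 0, i.e. -b
grad = np.matmul(a.T, residual)       # A^T (Ax - b)
delta_x = -rate * grad                # first gradient descent step

print(delta_x)                        # approximately [[0.7], [1.8]]
```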
delta x =
[[ 0.69999999]
[ 1.75 ]]
while my code without TF gives
delta x =
[[ 0.7]
[ 1.8]]
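The code without TF is definitely correct, so why is the second component of the update computed by TF 0.05 smaller than the correct result? I think this is the reason my code without TF can solve the system of equations while the TF version currently cannot.
Can someone tell me why the gradients differ? Thanks.
My platform is Win10 + tensorflow-gpu v1.0.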
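One difference between the two versions that may be worth checking (an assumption, not a confirmed diagnosis) is the shape of b. The NumPy version reshapes b to a column of shape (2, 1), while the TF version feeds it with shape (2,) into a placeholder that has no declared shape. Subtracting a (2,) array from a (2, 1) array broadcasts to a (2, 2) array, which changes the loss being minimized. The sketch below uses NumPy only, with a hypothetical finite-difference helper `numeric_gradient`, to compare the first step for both shapes under the mean-squared loss from the TF code.

```python
import numpy as np


def mean_squared_residual(x, a, b):
    # Same expression as the TF loss: mean((A x - b)^2), with NumPy broadcasting.
    return np.mean(np.square(np.matmul(a, x) - b))


def numeric_gradient(f, x, eps=1e-6):
    # Hypothetical helper: central finite differences, one coordinate at a time.
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step.flat[i] = eps
        grad.flat[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad


a = np.array([[1.0, 2.0], [1.0, 3.0]])
x0 = np.zeros((2, 1))
rate = 0.1

b_flat = np.array([3.0, 4.0])        # shape (2,), as fed in the TF version
b_col = b_flat.reshape(2, 1)         # shape (2, 1), as in the NumPy version

# The residual has a different shape depending on how b is laid out.
print((np.matmul(a, x0) - b_flat).shape)   # (2, 2)
print((np.matmul(a, x0) - b_col).shape)    # (2, 1)

delta_flat = -rate * numeric_gradient(lambda x: mean_squared_residual(x, a, b_flat), x0)
delta_col = -rate * numeric_gradient(lambda x: mean_squared_residual(x, a, b_col), x0)

print(delta_flat)   # close to [[0.7], [1.75]]
print(delta_col)    # close to [[0.7], [1.8]]
```

If this is indeed the cause, feeding b as a column (for example `np.array(b).reshape([len(b), 1])`, as the NumPy version already does) or declaring the placeholder shapes explicitly would be the natural thing to try.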