Regression loss functions incorrect during validation

I'm trying a basic averaging example, but the validation accuracy and the loss don't match, and the network fails to converge if I increase the training time. I'm training a network with 2 hidden layers, each 500 units wide, on inputs from the range [0, 9], with a learning rate of 1e-1, Adam, a batch size of 1, and dropout, for 3000 iterations, validating every 100 iterations. A prediction counts as correct if the absolute difference between the label and the hypothesis is less than a threshold, which I set to 1 here; I believe that is reasonable. Could someone tell me whether this is an issue with my choice of loss function, a bug in PyTorch, or something I'm doing? Below are some plots. The validation check looks like this:
val_diff = 1
acc_diff = torch.FloatTensor([val_diff]).expand(self.batch_size)
Inside the validation loop (100 batches per validation pass):
num_correct += torch.sum(torch.abs(val_h - val_y) < acc_diff)
Appended after each validation pass:
validate.append(num_correct/total_val)
Here is an example of the (hypothesis, label) pairs (a small sanity check on these exact pairs is sketched after the list):
[...(-0.7043088674545288, 6.0), (-0.15691305696964264, 2.6666667461395264),
(0.2827358841896057, 3.3333332538604736)]
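For reference, a minimal, self-contained sketch of the thresholded-accuracy check applied to those example pairs; the pairs and the threshold of 1 come from above, but the use of the newer torch.tensor API (rather than the Variable API in the question's code) is my own assumption:

import torch

# (hypothesis, label) pairs copied from the example above
pairs = [(-0.7043088674545288, 6.0),
         (-0.15691305696964264, 2.6666667461395264),
         (0.2827358841896057, 3.3333332538604736)]

val_h = torch.tensor([h for h, _ in pairs])
val_y = torch.tensor([y for _, y in pairs])

val_diff = 1.0
# a prediction counts as correct when |hypothesis - label| < val_diff
num_correct = torch.sum(torch.abs(val_h - val_y) < val_diff).item()
print(num_correct / float(len(pairs)))  # 0.0 here: none of the three hypotheses is within 1 of its label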
The loss functions from the API that I've tried, the ones typically used for regression (a quick comparison of what they return on a tiny example is sketched after the list):
torch.nn.L1Loss(size_average=False)
torch.nn.MSELoss(size_average=False)
torch.nn.SmoothL1Loss(size_average=False)
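A minimal sketch showing what the three losses return on a tiny hypothesis/label pair, so their scales can be compared directly; note that newer PyTorch versions replace size_average=False with reduction='sum', and the numbers here are purely illustrative:

import torch
import torch.nn as nn

h = torch.tensor([-0.70, -0.16, 0.28])  # hypotheses
y = torch.tensor([6.00, 2.67, 3.33])    # labels

# reduction='sum' plays the role of size_average=False in newer PyTorch
for loss_fn in (nn.L1Loss(reduction='sum'),
                nn.MSELoss(reduction='sum'),
                nn.SmoothL1Loss(reduction='sum')):
    print(type(loss_fn).__name__, loss_fn(h, y).item())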
Thanks.
Network code:
class Feedforward(nn.Module):
    def __init__(self, topology):
        super(Feedforward, self).__init__()
        self.input_dim = topology['features']
        self.num_hidden = topology['hidden_layers']
        self.hidden_dim = topology['hidden_dim']
        self.output_dim = topology['output_dim']
        self.input_layer = nn.Linear(self.input_dim, self.hidden_dim)
        self.hidden_layer = nn.Linear(self.hidden_dim, self.hidden_dim)
        self.output_layer = nn.Linear(self.hidden_dim, self.output_dim)
        self.dropout_layer = nn.Dropout(p=0.2)

    def forward(self, x):
        batch_size = x.size()[0]
        feat_size = x.size()[1]
        input_size = batch_size * feat_size

        # the input and output layers are re-created on every forward pass
        self.input_layer = nn.Linear(input_size, self.hidden_dim).cuda()
        hidden = self.input_layer(x.view(1, input_size)).clamp(min=0)

        for _ in range(self.num_hidden):
            hidden = self.dropout_layer(F.relu(self.hidden_layer(hidden)))

        output_size = batch_size * self.output_dim
        self.output_layer = nn.Linear(self.hidden_dim, output_size).cuda()
        return self.output_layer(hidden).view(output_size)
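Not the code from the question, but for comparison, a sketch of how the same topology is usually written with the layers created once in __init__ and reused in forward; the topology keys follow the question, the fixed input size and the class name are my assumptions:

import torch.nn as nn
import torch.nn.functional as F

class FixedFeedforward(nn.Module):
    def __init__(self, topology):
        super(FixedFeedforward, self).__init__()
        hidden_dim = topology['hidden_dim']
        self.input_layer = nn.Linear(topology['features'], hidden_dim)
        # independent weights per hidden layer, built once and registered with the module
        self.hidden_layers = nn.ModuleList(
            [nn.Linear(hidden_dim, hidden_dim)
             for _ in range(topology['hidden_layers'])])
        self.output_layer = nn.Linear(hidden_dim, topology['output_dim'])
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        hidden = F.relu(self.input_layer(x))
        for layer in self.hidden_layers:
            hidden = self.dropout(F.relu(layer(hidden)))
        return self.output_layer(hidden)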
Training code:
def train(self):
    if self.cuda:
        self.network.cuda()
    dh = DataHandler(self.data)

    # loss_fn = nn.L1Loss(size_average=False)
    # loss_fn = nn.L1Loss()
    # loss_fn = nn.SmoothL1Loss(size_average=False)
    # loss_fn = nn.SmoothL1Loss()
    # loss_fn = nn.MSELoss(size_average=False)
    loss_fn = torch.nn.MSELoss()

    losses = []
    validate = []
    hypos = []
    labels = []
    val_size = 100
    val_diff = 1
    total_val = float(val_size * self.batch_size)

    for i in range(self.iterations):
        x, y = dh.get_batch(self.batch_size)
        x = self.tensor_to_Variable(x)
        y = self.tensor_to_Variable(y)

        self.optimizer.zero_grad()
        loss = loss_fn(self.network(x), y)
        loss.backward()
        self.optimizer.step()
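A possible sketch of the validation pass that goes with this loop, using the current PyTorch API (torch.no_grad) rather than Variable; the DataHandler and network interfaces are assumed from the question. It calls network.eval() so the Dropout layer is disabled while the thresholded accuracy is measured, then switches back to train():

# assumed to run every 100 training iterations
def validate_pass(self, dh, val_size=100, val_diff=1.0):
    self.network.eval()                # disable dropout while validating
    num_correct = 0
    with torch.no_grad():              # gradients are not needed here
        for _ in range(val_size):
            val_x, val_y = dh.get_batch(self.batch_size)
            val_h = self.network(val_x)
            num_correct += torch.sum(torch.abs(val_h - val_y) < val_diff).item()
    self.network.train()               # restore training mode (dropout on)
    return num_correct / float(val_size * self.batch_size)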
Are you zeroing your gradients after updating the weights? That's a common mistake. Also, your learning rate seems very high. – mexmex
@mexmex Aren't I supposed to clear the local gradients on every iteration? Isn't that the purpose of optimizer.zero_grad? – Soubriquet
Yes, just checking that you are indeed doing it! Sorry if my wording was unclear. – mexmex
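To make the ordering mexmex is asking about concrete, the usual per-iteration pattern looks like this; the 1e-3 learning rate is only a common Adam default, not something from the question, and the names network, loss_fn, x, y, iterations are reused from the training code above for illustration:

optimizer = torch.optim.Adam(network.parameters(), lr=1e-3)

for i in range(iterations):
    optimizer.zero_grad()              # clear gradients left over from the previous step
    loss = loss_fn(network(x), y)      # forward pass
    loss.backward()                    # compute fresh gradients
    optimizer.step()                   # update the weights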