Why does my tanh activation function perform so badly?

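I have two Perceptron algorithms that are identical except for the activation function. One uses a single step function, 1 if u >= 0 else -1, and the other uses the tanh function, np.tanh(u).

I expected tanh to outperform the step function, but it actually performs terribly in comparison. Have I done something wrong here, or is there a reason it underperforms on this problem set?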

import numpy as np 
import matplotlib.pyplot as plt 

# generate 20 two-dimensional training data 
# data must be linearly separable 

# C1: u = (0,0)/E = [1 0; 0 1]; C2: u = (4,0), E = [1 0; 0 1] where u, E represent centre & covariance matrix of the 
# Gaussian distribution respectively 


def step(u): 
    return 1 if u >= 0 else -1 


def sigmoid(u): 
    return np.tanh(u) 

c1mean = [0, 0] 
c2mean = [4, 0] 
c1cov = [[1, 0], [0, 1]] 
c2cov = [[1, 0], [0, 1]] 
x = np.ones((40, 3)) 
w = np.zeros(3)  # [0, 0, 0] 
w2 = np.zeros(3) # second set of weights to see how another classifier compares 
t = [] # target array 

# +1 for the first 20 then -1 
for i in range(0, 40):
    if i < 20:
        t.append(1)
    else:
        t.append(-1)

x1, y1 = np.random.multivariate_normal(c1mean, c1cov, 20).T 
x2, y2 = np.random.multivariate_normal(c2mean, c2cov, 20).T 

# concatenate x1 & x2 within the first dimension of x and the same for y1 & y2 in the second dimension 
for i in range(len(x)):
    if i >= 20:
        x[i, 0] = x2[(i-20)]
        x[i, 1] = y2[(i-20)]
    else:
        x[i, 0] = x1[i]
        x[i, 1] = y1[i]

errors = [] 
errors2 = [] 
lr = 0.0001 
n = 10 

# train the step-function perceptron for n passes over the data
for i in range(n):
    count = 0
    for row in x:
        dot = np.dot(w, row)
        response = step(dot)
        errors.append(t[count] - response)
        w += lr * (row * (t[count] - response))
        count += 1

# train the tanh perceptron with the same update rule
for i in range(n):
    count = 0
    for row in x:
        dot = np.dot(w2, row)
        response = sigmoid(dot)
        errors2.append(t[count] - response)
        w2 += lr * (row * (t[count] - response))
        count += 1

print(errors[-1], errors2[-1]) 

# distribution 
plt.figure(1) 
plt.plot((-(w[2]/w[0]), 0), (0, -(w[2]/w[1]))) 
plt.plot(x1, y1, 'x') 
plt.plot(x2, y2, 'ro') 
plt.axis('equal') 
plt.title('Heaviside') 

# training error 
plt.figure(2) 
plt.ylabel('error') 
plt.xlabel('iterations') 
plt.plot(errors) 
plt.title('Heaviside Error') 

plt.figure(3) 
plt.plot((-(w2[2]/w2[0]), 0), (0, -(w2[2]/w2[1]))) 
plt.plot(x1, y1, 'x') 
plt.plot(x2, y2, 'ro') 
plt.axis('equal') 
plt.title('Sigmoidal') 

plt.figure(4) 
plt.ylabel('error') 
plt.xlabel('iterations') 
plt.plot(errors2) 
plt.title('Sigmoidal Error') 

plt.show()

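Edit: Even in the error plots I have shown, the tanh function shows some convergence, so it is reasonable to assume that simply increasing the iterations or changing the learning rate would let it reduce its error. What I am really asking, though, bearing in mind the significantly better performance of the step function, is: for what problem set is it ever viable to use tanh with a Perceptron?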
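One thing that may help when reading the two error plots: the step unit's error t - response can only be -2, 0 or 2, while the tanh unit's error is a real number that stays nonzero even for a correctly classified point, so the two curves are not on the same scale. The sketch below illustrates this with four hand-picked points and a hypothetical weight vector (neither comes from the question's data), assuming that thresholding the tanh output at 0 is an acceptable way to turn it into a class label.

```python
import numpy as np


def step(u):
    """Vectorised version of the question's step function."""
    return np.where(u >= 0, 1, -1)


# Four hand-picked points in (x, y, bias) form with their targets.
# These are illustrative values, not the question's random samples.
x = np.array([[0.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [4.0, 0.0, 1.0],
              [3.0, 0.0, 1.0]])
t = np.array([1, 1, -1, -1])

# Hypothetical weights that place the decision boundary at x = 2.
w = np.array([-1.0, 0.0, 2.0])

u = x @ w                      # pre-activations: 2, 1, -2, -1
step_errors = t - step(u)      # all zero: every point is on the right side
tanh_errors = t - np.tanh(u)   # small but nonzero for every point

# Thresholding the tanh output at 0 gives the same labels as the step unit.
misclassified = float(np.mean(step(np.tanh(u)) != t))

print("step errors:", step_errors)
print("tanh errors:", np.round(tanh_errors, 3))
print("misclassified fraction:", misclassified)
```

Here every point is classified correctly, yet the tanh errors are roughly ±0.04 and ±0.24 rather than 0. Comparing the fraction of misclassified points is probably a fairer way to judge the two units than comparing the raw error curves.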

Source

2016-03-04 Luke Vincent

When you change 'lr' to 0.1 or 1, the results look roughly the same, so your learning rate is too small. – Cleb

Alternatively, you can also increase 'n'. – Cleb
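The two suggestions above can be tried side by side with a minimal sketch like the one below. It reuses the question's tanh update rule but wraps it in helper functions (make_data, train_tanh and misclassified_fraction are names introduced here, not from the question), fixes an arbitrary random seed, and reports the fraction of points on the wrong side of the boundary instead of the raw error. The specific 'lr' and 'n' values are only examples.

```python
import numpy as np


def make_data(rng, n_per_class=20):
    """Two Gaussian blobs as in the question, plus a constant bias column."""
    c1 = rng.multivariate_normal([0, 0], np.eye(2), n_per_class)
    c2 = rng.multivariate_normal([4, 0], np.eye(2), n_per_class)
    points = np.vstack([c1, c2])
    x = np.hstack([points, np.ones((2 * n_per_class, 1))])
    t = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])
    return x, t


def train_tanh(x, t, lr, n):
    """Same per-sample update as the question's second training loop."""
    w = np.zeros(x.shape[1])
    for _ in range(n):
        for row, target in zip(x, t):
            w += lr * row * (target - np.tanh(np.dot(w, row)))
    return w


def misclassified_fraction(w, x, t):
    """Threshold the output at 0 and count wrongly labelled points."""
    predicted = np.where(x @ w >= 0, 1, -1)
    return float(np.mean(predicted != t))


rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
x, t = make_data(rng)

results = {}
# Original settings, a larger learning rate, and more passes over the data.
for lr, n in [(0.0001, 10), (0.1, 10), (0.0001, 1000)]:
    w = train_tanh(x, t, lr, n)
    results[(lr, n)] = (w, misclassified_fraction(w, x, t))
    print(f"lr={lr}, n={n}: w={w}, misclassified={results[(lr, n)][1]:.2f}")
```

With lr = 0.0001 and n = 10 the weights can move only a very short distance from zero, because each of the 400 updates is scaled by 0.0001 and the tanh error is bounded by 2. That is consistent with the learning rate being too small rather than tanh being unsuitable.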

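Why do you expect the sigmoidal activation function to do better than the step function, given that your example data appears to be linearly separable? –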