Why does my conditional logit gradient implementation fail?

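I have written a very simple implementation of the likelihood/gradient for the conditional logit model (explained here). The likelihood works fine, but the gradient is incorrect. My two questions are: is my gradient correct, and if so, is my implementation in Python correct? If this would be better asked in the math forum, feel free to move it.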

Model:

Log-likelihood:

Finally, the gradient:

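Here, i indexes the observations, j indexes the alternatives within observation i, c is the chosen alternative in observation i, Xij is the feature vector of alternative j in observation i, and B is the corresponding coefficient vector. Note: the likelihood formula should have the feature vector multiplied by the coefficient vector. My mistake.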
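For reference, here is a sketch of the textbook conditional logit expressions written with the symbols defined above. It assumes the usual softmax form of the model; the subscript k (the k-th feature/coefficient) and the notation y_i for the chosen alternative are introduced here for illustration only.

```latex
% Model: probability that alternative c is chosen in observation i
P(y_i = c \mid X_i) = \frac{\exp(X_{ic} \cdot B)}{\sum_{j} \exp(X_{ij} \cdot B)}

% Log-likelihood
LL(B) = \sum_{i} \left[ X_{ic} \cdot B - \log \sum_{j} \exp(X_{ij} \cdot B) \right]

% Gradient with respect to the k-th coefficient
\frac{\partial LL}{\partial B_k}
  = \sum_{i} \left[ X_{ick}
  - \frac{\sum_{j} X_{ijk} \exp(X_{ij} \cdot B)}{\sum_{j} \exp(X_{ij} \cdot B)} \right]
```

In this form the exponent is always the full dot product X_ij · B, even when differentiating with respect to a single coefficient B_k.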

My likelihood and gradient implementations are as follows:

Likelihood:

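import numpy as np

def log_likelihood(coefs, observations, config, lasso):
    def func(grp):
        # Feature matrix for the alternatives of one observation
        mtrx = grp.as_matrix(config.features)
        dp = np.dot(mtrx, coefs)          # X_ij . B for every alternative j
        sub = np.log(np.exp(dp).sum())    # log of the sum over alternatives
        inc = (dp * grp['choice']).sum()  # X_ic . B for the chosen alternative
        return inc - sub
    ll = observations.groupby(['observation_id']).apply(func).sum()
    if lasso is not None:
        ll -= (np.abs(coefs).sum() * lasso)  # L1 penalty
    neg_log = ll * -1
    return neg_log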

Gradient:

def gradient(coefs, observations, config, lasso):
    def func(grp):
        mtrx = grp.as_matrix([config.features])
        tmtrx = mtrx.transpose()
        # exp of each feature times its own coefficient, element by element
        tmp = np.exp(tmtrx * coefs[:, np.newaxis])
        # per-feature weighted average over the alternatives
        sub = (tmp * tmtrx).sum(1)/tmp.sum(1)
        # features of the chosen alternative
        inc = (mtrx * grp['choice'][:, np.newaxis]).sum(0)
        ret = inc - sub
        return ret
    return -1 * observations.groupby(['observation_id']).apply(func).sum()

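Here, coefs is a numpy array containing the coefficients. observations is a DataFrame in which each row represents one alternative within an observation; its columns include a choice column holding 0/1 to indicate whether that alternative was chosen, and an observation_id column in which all alternatives of the same observation share the same id. Finally, config is a dict-like object with a member 'features' (accessed as config.features in the code), which is the list of columns in the observations DataFrame that contain the features. Note that I am testing without the lasso parameter. An example of what the data looks like is below.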

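I have verified that the likelihood is correct; however, the error reported for the gradient by scipy.optimize.check_grad is very large. I am also able to solve for B when I do not pass the gradient to scipy.optimize.minimize. The gradient code evaluates the formula as I expect, so at this point I can only think that my derivation is incorrect, but I do not know why.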
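To illustrate the kind of check described here, below is a minimal sketch of a forward-difference gradient check in plain numpy. Like scipy.optimize.check_grad, it returns the 2-norm of the difference between an analytic gradient and a finite-difference approximation. The helper name grad_error and the toy objective are hypothetical and are not part of the original post.

```python
import numpy as np

def grad_error(func, grad, x0, eps=1.5e-8):
    """2-norm of (analytic gradient - forward-difference gradient) at x0."""
    x0 = np.asarray(x0, dtype=float)
    f0 = func(x0)
    approx = np.empty_like(x0)
    for k in range(x0.size):
        step = np.zeros_like(x0)
        step[k] = eps
        approx[k] = (func(x0 + step) - f0) / eps
    return np.linalg.norm(grad(x0) - approx)

# Toy objective with an interaction between its two parameters.
def f(x):
    return np.sum(x ** 2) + x[0] * x[1]

def good_grad(x):
    return 2 * x + np.array([x[1], x[0]])

def bad_grad(x):
    # Treats each parameter in isolation and drops the cross term.
    return 2 * x

x0 = np.array([0.5, -1.2])
print(grad_error(f, good_grad, x0))  # close to zero
print(grad_error(f, bad_grad, x0))   # clearly non-zero
```

A large value from a check like this means the analytic gradient and the objective disagree, but it does not say whether the derivation or the code is at fault.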

In [27]: df.head(14)
Out[27]:
          x1        x2        x3  observation_id  choice
0   0.187785  0.435922 -0.475349             211       1
1  -0.935956 -0.405833 -1.753128             211       0
2   0.210424  0.141579  0.415933             211       0
3   0.507025  0.307965 -0.198089             211       0
4   0.080658 -0.125473 -0.592301             211       0
5   0.605302  0.239491  0.287094             293       1
6   0.259580  0.415388 -0.396969             293       0
7  -0.637267 -0.984442 -1.376066             293       0
8   0.241874  0.435922  0.855742             293       0
9   0.831534  0.650425  0.930592             293       0
10 -1.682565  0.435922 -2.517229             293       0
11 -0.149186  0.300299  0.494513             293       0
12 -1.918179 -9.967421 -2.774450             293       0
13 -1.185817  0.295601 -1.974923             293       0

Source

2016-09-17 user3442536

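The derivation was incorrect. In the exponentiation I was including only the feature and coefficient belonging to the coefficient whose partial derivative was being taken. Instead, the exponent should be the dot product of all features and coefficients.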
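A minimal sketch of what that correction might look like in numpy, assuming the data is passed as plain arrays rather than a grouped DataFrame. The function names, the array-based signature, and the random test data are all hypothetical; this is not the author's final code.

```python
import numpy as np

def neg_log_likelihood(coefs, X, choice, groups):
    """Negative log-likelihood; one row of X per alternative."""
    total = 0.0
    for g in np.unique(groups):
        rows = groups == g
        dp = X[rows] @ coefs  # full dot product X_ij . B
        total += (dp * choice[rows]).sum() - np.log(np.exp(dp).sum())
    return -total

def neg_gradient(coefs, X, choice, groups):
    """Gradient of neg_log_likelihood with respect to coefs."""
    grad = np.zeros(X.shape[1])
    for g in np.unique(groups):
        rows = groups == g
        Xg = X[rows]
        dp = Xg @ coefs              # exponent uses ALL features and coefficients
        p = np.exp(dp - dp.max())    # shift by the max for numerical stability
        p /= p.sum()                 # choice probabilities within the observation
        grad += Xg.T @ choice[rows] - Xg.T @ p
    return -grad

# Made-up data: 3 observations with 3 alternatives each, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(9, 3))
groups = np.repeat([1, 2, 3], 3)
choice = np.tile([1.0, 0.0, 0.0], 3)
coefs = np.array([0.3, -0.5, 0.8])

print(neg_gradient(coefs, X, choice, groups))
```

The only substantive difference from a per-coefficient exponent is that the probabilities p are computed once per observation from the full dot product and then shared across all partial derivatives.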

Source

2016-09-19 05:00:40 user3442536
