ValueError: A value in x_new is below the interpolation range

I get this scikit-learn error when I use the following estimator:

my_estimator = LassoLarsCV(fit_intercept=False, normalize=False, positive=True, max_n_alphas=1e5)

Note that if I reduce max_n_alphas from 1e5 to 1e4, I no longer get this error.

Does anyone have any idea why?

The error occurs when I call

my_estimator.fit(x, y)

My data set has 40k data points in 40 dimensions.

The full stack trace looks like this:

File "/usr/lib64/python2.7/site-packages/sklearn/linear_model/least_angle.py", line 1113, in fit 
    axis=0)(all_alphas) 
    File "/usr/lib64/python2.7/site-packages/scipy/interpolate/polyint.py", line 79, in __call__ 
    y = self._evaluate(x) 
    File "/usr/lib64/python2.7/site-packages/scipy/interpolate/interpolate.py", line 498, in _evaluate 
    out_of_bounds = self._check_bounds(x_new) 
    File "/usr/lib64/python2.7/site-packages/scipy/interpolate/interpolate.py", line 525, in _check_bounds 
    raise ValueError("A value in x_new is below the interpolation " 
ValueError: A value in x_new is below the interpolation range.

Source

2016-03-30 Baron Yugovich

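When I run `from sklearn.linear_model import LassoLarsCV` followed by your line of code, I get no error. Please provide enough code to reproduce the error you are getting, along with the full traceback message. –

The error does not occur on that line, but when I call .fit(). Unfortunately it is hard to reproduce here, since my dataset has 40k points. –

The interpolators in scipy generally require the `x` values to be monotonically increasing. Are the `x` values of your dataset monotonically increasing? If they are not, try sorting the dataset with `x` as the key and then retry. If that works, let me know and I will add a proper answer for the bounty :) –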
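To make the traceback concrete, here is a minimal sketch of the bounds check behind this message. It calls scipy.interpolate.interp1d directly on made-up sample points; the arrays `x` and `y` and the helper `try_eval` are illustrative assumptions, not taken from the question. It does not reproduce the LassoLarsCV failure itself, only the condition under which scipy raises the error.

```python
import numpy as np
from scipy.interpolate import interp1d

# Illustrative sample points (not from the question): x spans [1.0, 4.0].
x = np.array([1.0, 2.0, 3.0, 4.0])
y = x ** 2

# By default, interp1d refuses to evaluate outside [min(x), max(x)].
f = interp1d(x, y)


def try_eval(interpolator, x_new):
    """Return interpolated values, or the error message if scipy rejects x_new."""
    try:
        return interpolator(x_new)
    except ValueError as exc:
        return str(exc)


inside = try_eval(f, np.array([1.5, 3.5]))  # both points lie inside the range
below = try_eval(f, np.array([0.5, 2.0]))   # 0.5 is smaller than min(x)
print(inside)
print(below)

# With bounds_error=False and a (below, above) fill_value pair,
# out-of-range points are clamped to the edge values instead of raising.
g = interp1d(x, y, bounds_error=False, fill_value=(y[0], y[-1]))
clamped = g(np.array([0.5, 5.0]))
print(clamped)
```

The second interpolator only shows that scipy can clamp instead of raising. Whether that would be appropriate inside LassoLarsCV is not something this sketch settles.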

There must be something special about your data. LassoLarsCV() seems to work fine with fairly well-behaved data, as this synthetic example shows:

import numpy 
import sklearn.linear_model 

# create 40000 x 40 sample data from linear model with a bit of noise 
npoints = 40000 
ndims = 40 
numpy.random.seed(1) 
X = numpy.random.random((npoints, ndims)) 
w = numpy.random.random(ndims) 
y = X.dot(w) + numpy.random.random(npoints) * 0.1 

clf = sklearn.linear_model.LassoLarsCV(fit_intercept=False, normalize=False, max_n_alphas=1e6) 
clf.fit(X, y) 

# coefficients are almost exactly recovered, this prints 0.00377
print(max(abs(clf.coef_ - w)))

# alphas actually used are 41 or ndims+1
print(clf.alphas_.shape)

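This is with sklearn 0.16, which does not give me the positive=True option.

I am not sure why you would want to use a very large max_n_alphas anyway. I do not know why 1e+4 works and 1e+5 does not in your case, but I suspect that for well-behaved data the paths you get with max_n_alphas=ndims+1 and max_n_alphas=1e+4 (or any larger value) would be identical. The optimal alpha estimated by cross-validation, stored in clf.alpha_, would also be identical. Check out the Lasso path using LARS example to see what alpha is trying to do.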
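The suspicion above can be checked on a small synthetic problem. The following is a sketch under stated assumptions, not a reproduction of the question's data: the sizes, the seed, and the two max_n_alphas values are arbitrary choices. `normalize` and `positive` are left out because they are not available in every scikit-learn version.

```python
import numpy as np
from sklearn.linear_model import LassoLarsCV

# Small, well-behaved synthetic data (sizes and seed are arbitrary choices).
rng = np.random.RandomState(1)
n_samples, n_features = 500, 5
X = rng.random_sample((n_samples, n_features))
w = rng.random_sample(n_features)
y = X.dot(w) + 0.1 * rng.random_sample(n_samples)

# Fit the same model with two different caps on the number of alphas
# used to evaluate the cross-validation residuals.
alphas_found = {}
for max_n_alphas in (100, 10000):
    clf = LassoLarsCV(fit_intercept=False, cv=5, max_n_alphas=max_n_alphas)
    clf.fit(X, y)
    alphas_found[max_n_alphas] = clf.alpha_
    print(max_n_alphas, clf.alpha_, clf.alphas_.shape)
```

With only a handful of features, the cross-validation folds together produce far fewer than 100 distinct alphas, so both caps are above that count and the selected alpha_ comes out the same.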

Also, from the LassoLars documentation:

alphas_ array, shape (n_alphas + 1,)

Maximum of covariances (in absolute value) at each iteration. n_alphas is either max_iter, n_features, or the number of nodes in the path with correlation greater than alpha, whichever is smaller.

So it makes sense that we end up with alphas_ of size ndims + 1 (i.e. n_features + 1) above.
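The relationship between the path length and the number of features can be inspected directly with sklearn.linear_model.lars_path. This is a minimal sketch on made-up data (the sizes and seed are assumptions). For well-behaved data the path typically has n_features + 1 nodes, but the code only relies on the returned shapes being consistent with each other.

```python
import numpy as np
from sklearn.linear_model import lars_path

# Made-up regression problem (sizes and seed are arbitrary choices).
rng = np.random.RandomState(0)
n_samples, n_features = 200, 8
X = rng.random_sample((n_samples, n_features))
w = rng.random_sample(n_features)
y = X.dot(w) + 0.1 * rng.random_sample(n_samples)

# Compute the full Lasso path with the LARS algorithm.
# alphas holds one value per node of the path,
# coefs holds one column of coefficients per node.
alphas, active, coefs = lars_path(X, y, method="lasso")

print("number of path nodes:", alphas.shape[0])
print("n_features + 1:      ", n_features + 1)
print("coefs shape:         ", coefs.shape)
```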

P.S. I also tested with sklearn 0.17.1 and positive=True, and with a mix of positive and negative coefficients. The result is the same: alphas_ has size ndims + 1 or less.

Source

2016-04-04 10:41:18

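It has nothing to do with the data. On the same dataset, as I said above, the problem goes away when n_alphas is reduced. Something goes wrong when the alphas are generated, not when the data set is processed. –

@BaronYugovich The code above shows a different dataset with the same dimensions and a huge max_n_alphas running without any problem. Why do you think the problem has nothing to do with the data? Please post a complete runnable example that reproduces your problem. Thanks :) –

Makes sense. Out of curiosity, in your experiment with random data, what do you get with Orthogonal Matching Pursuit? http://stackoverflow.com/questions/36287045/orthogonal-matching-pursuit-regression-am-i-using-it-wrong?noredirect=1#comment60438035_36287045 –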
