How to calculate standard deviation errors with scipy.optimize.least_squares

I am comparing fits with optimize.curve_fit and optimize.least_squares. With curve_fit I get the covariance matrix pcov as output, and I can calculate the standard deviation errors of my fitted variables with:

perr = np.sqrt(np.diag(pcov)) 

If I do the same with least_squares, I do not get any covariance matrix as output, and I am not able to calculate the standard deviation errors for my variables.

Here is my example:

#import modules 
import matplotlib 
import numpy as np 
import matplotlib.pyplot as plt 
from scipy.optimize import curve_fit 
from scipy.optimize import least_squares 

noise = 0.5 
N = 100 
t = np.linspace(0, 4*np.pi, N) 

# generate data 
def generate_data(t, freq, amplitude, phase, offset, noise=0, n_outliers=0, random_state=0): 
    #formula for data generation with noise and outliers 
    y = np.sin(t * freq + phase) * amplitude + offset 
    rnd = np.random.RandomState(random_state) 
    error = noise * rnd.randn(t.size) 
    outliers = rnd.randint(0, t.size, n_outliers) 
    error[outliers] *= 10 
    return y + error 

#generate data 
data = generate_data(t, 1, 3, 0.001, 0.5, noise, n_outliers=10) 

#initial guesses 
p0=np.ones(4) 
x0=np.ones(4) 

# create the function we want to fit 
def my_sin(x, freq, amplitude, phase, offset): 
    return np.sin(x * freq + phase) * amplitude + offset 

# create the residual function for least_squares
def my_sin_lsq(x, t, y): 
    # x[0] = freq, x[1] = amplitude, x[2] = phase, x[3] = offset 
    # least_squares expects the residuals, so subtract the data y 
    return (np.sin(t * x[0] + x[2]) * x[1] + x[3]) - y 

# now do the fit for curve_fit 
fit = curve_fit(my_sin, t, data, p0=p0) 
print('Curve fit output: ' + str(fit[0])) 

#now do the fit for least_square 
res_lsq = least_squares(my_sin_lsq, x0, args=(t, data)) 
print('Least_squares output: ' + str(res_lsq.x)) 


# we'll use this to plot our first estimate. This might already be good enough for you 
data_first_guess = my_sin(t, *p0) 

#data_first_guess_lsq = np.sin(t*x0[0] + x0[2])*x0[1] + x0[3]   # same thing written out 
data_first_guess_lsq = my_sin(t, *x0) 

# recreate the fitted curve using the optimized parameters 
data_fit = my_sin(t, *fit[0]) 
data_fit_lsq = my_sin(t, *res_lsq.x) 

#calculation of residuals 
residuals = data - data_fit 
residuals_lsq = data - data_fit_lsq 
ss_res = np.sum(residuals**2) 
ss_tot = np.sum((data-np.mean(data))**2) 
ss_res_lsq = np.sum(residuals_lsq**2) 
ss_tot_lsq = np.sum((data-np.mean(data))**2) 

#R squared 
r_squared = 1 - (ss_res/ss_tot) 
r_squared_lsq = 1 - (ss_res_lsq/ss_tot_lsq) 
print('R squared curve_fit is: ' + str(r_squared)) 
print('R squared least_squares is: ' + str(r_squared_lsq)) 

plt.figure() 
plt.plot(t, data) 
plt.title('curve_fit') 
plt.plot(t, data_first_guess) 
plt.plot(t, data_fit) 
plt.plot(t, residuals) 

plt.figure() 
plt.plot(t, data) 
plt.title('lsq') 
plt.plot(t, data_first_guess_lsq) 
plt.plot(t, data_fit_lsq) 
plt.plot(t, residuals_lsq) 

#error 
perr = np.sqrt(np.diag(fit[1])) 
print('The standard deviation errors for curve_fit are: ' + str(perr)) 

plt.show() 

I would be very grateful for any help. Best wishes.

PS: I got a lot of input from this source and used part of the code from Robust regression.

Can nobody help me? – strohfelder

Answer


The result of optimize.least_squares has an attribute called jac. From the documentation:

jac : ndarray, sparse matrix or LinearOperator, shape (m, n)

Modified Jacobian matrix at the solution, in the sense that J^T J is a Gauss-Newton approximation of the Hessian of the cost function. The type is the same as the one used by the algorithm.

This can be used to estimate the covariance matrix of the parameters using the formula Sigma = (J'J)^-1. In the source code of optimize.curve_fit, this estimate is additionally scaled by the MSE of the residuals. In code, this looks like:

J = res_lsq.jac 
cov = np.linalg.inv(J.T.dot(J)) * (residuals_lsq**2).mean() 
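
To get the standard deviation errors of the fitted parameters, analogous to perr from curve_fit, take the square root of the diagonal of this covariance matrix. A minimal sketch (the names perr_lsq, dof, cov_dof and perr_dof are mine; note that curve_fit actually divides the summed squared residuals by the degrees of freedom M - N rather than taking the plain mean, which matters for small samples):

perr_lsq = np.sqrt(np.diag(cov)) 
print('The standard deviation errors for least_squares are: ' + str(perr_lsq)) 

# Optional: reproduce curve_fit's scaling exactly, which uses 
# SSR / (M - N) instead of the plain mean of the squared residuals 
dof = len(residuals_lsq) - len(res_lsq.x) 
cov_dof = np.linalg.inv(J.T.dot(J)) * (residuals_lsq**2).sum() / dof 
perr_dof = np.sqrt(np.diag(cov_dof)) 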

Hope this helps.
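
As a quick sanity check, these values can be compared with the curve_fit estimate computed earlier in the question (fit[1] is the pcov returned by curve_fit); since both routines minimise the same sum of squared residuals on the same data, the two sets of errors should roughly agree:

# compare the two error estimates; they differ only by the 
# M vs. M - N scaling of the residual variance 
print('curve_fit perr:     ' + str(np.sqrt(np.diag(fit[1])))) 
print('least_squares perr: ' + str(perr_lsq)) 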