scipy兩個正態分佈的重疊概率

我有兩個scipy.stats.norm（mean，std）.pdf（0）正態分佈曲線，我試圖找出兩條曲線的差異（重疊）。scipy兩個正態分佈的重疊概率

我如何用Python中的scipy進行計算？由於

2015-09-13 desmond

你見過這樣的：http://stackoverflow.com/questions/22579434/python-finding-the-intersection-point-of-two-gaussian-curves – duhaime

是的，但那只是爲了找到交點對？我試圖找到像我看到的整個區域重疊係數（OVL） – desmond

這個問題是類似的，但不限於高斯：https://stackoverflow.com/questions/20381672/calculate-overlap-area-of-two-功能 – Gabriel

您可以使用@duhalme建議得到相交，然後利用這一點來定義的積分限範圍內的答案，

其中，此代碼的樣子，

import numpy as np 
import matplotlib.pyplot as plt 
from scipy.stats import norm 
norm.cdf(1.96) 

def solve(m1,m2,std1,std2): 
    a = 1/(2*std1**2) - 1/(2*std2**2) 
    b = m2/(std2**2) - m1/(std1**2) 
    c = m1**2 /(2*std1**2) - m2**2/(2*std2**2) - np.log(std2/std1) 
    return np.roots([a,b,c]) 

m1 = 2.5 
std1 = 1.0 
m2 = 5.0 
std2 = 1.0 

#Get point of intersect 
result = solve(m1,m2,std1,std2) 

#Get point on surface 
x = np.linspace(-5,9,10000) 
plot1=plt.plot(x,norm.pdf(x,m1,std1)) 
plot2=plt.plot(x,norm.pdf(x,m2,std2)) 
plot3=plt.plot(result,norm.pdf(result,m1,std1),'o') 

#Plots integrated area 
r = result[0] 
olap = plt.fill_between(x[x>r], 0, norm.pdf(x[x>r],m1,std1),alpha=0.3) 
olap = plt.fill_between(x[x<r], 0, norm.pdf(x[x<r],m2,std2),alpha=0.3) 

# integrate 
area = norm.cdf(r,m2,std2) + (1.-norm.cdf(r,m1,std1)) 
print("Area under curves ", area) 

plt.show()

儘管可以定義高斯的符號版本並且可以定義scipy.quad（或其他），但cdf用於獲得高斯的積分。或者，您可以使用像這樣的蒙特卡羅方法link（即生成隨機數並拒絕任何您想要的範圍之外的任何數）。

來源

2015-09-14 22:01:40

埃德的回答非常好。但是，我注意到，當有兩個或無限（完全重疊）的聯繫點時，它不起作用。以下是處理這兩種情況的代碼版本。

如果您還想繼續查看分佈圖，可以使用Ed的代碼。

import numpy as np 
import matplotlib.pyplot as plt 
from scipy.stats import norm 

def solve(m1,m2,std1,std2): 
    a = 1./(2.*std1**2) - 1./(2.*std2**2) 
    b = m2/(std2**2) - m1/(std1**2) 
    c = m1**2 /(2*std1**2) - m2**2/(2*std2**2) - np.log(std2/std1) 
    return np.roots([a,b,c]) 

m1 = 2.5 
std1 = 1.0 
m2 = 5.0 
std2 = 1.0 

result = solve(m1,m2,std1,std2) 
# 'lower' and 'upper' represent the lower and upper bounds of the space within which we are computing the overlap 
if(len(result)==0): # Completely non-overlapping 
    overlap = 0.0 

elif(len(result)==1): # One point of contact 
    r = result[0] 
    if(m1>m2): 
     tm,ts=m2,std2 
     m2,std2=m1,std1 
     m1,std1=tm,ts 
    if(r<lower): # point of contact is less than the lower boundary. order: r-l-u 
     overlap = (norm.cdf(upper,m1,std1)-norm.cdf(lower,m1,std1)) 
    elif(r<upper): # point of contact is more than the upper boundary. order: l-u-r 
     overlap = (norm.cdf(r,m2,std2)-norm.cdf(lower,m2,std2))+(norm.cdf(upper,m1,std1)-norm.cdf(r,m1,std1)) 
    else: # point of contact is within the upper and lower boundaries. order: l-r-u 
     overlap = (norm.cdf(upper,m2,std2)-norm.cdf(lower,m2,std2)) 

elif(len(result)==2): # Two points of contact 
    r1 = result[0] 
    r2 = result[1] 
    if(r1>r2): 
     temp=r2 
     r2=r1 
     r1=temp 
    if(std1>std2): 
     tm,ts=m2,std2 
     m2,std2=m1,std1 
     m1,std1=tm,ts 
    if(r1<lower): 
     if(r2<lower):   # order: r1-r2-l-u 
      overlap = (norm.cdf(upper,m1,std1)-norm.cdf(lower,m1,std1)) 
     elif(r2<upper):   # order: r1-l-r2-u 
      overlap = (norm.cdf(r2,m2,std2)-norm.cdf(lower,m2,std2))+(norm.cdf(upper,m1,std1)-norm.cdf(r2,m1,std1)) 
     else:     # order: r1-l-u-r2 
      overlap = (norm.cdf(upper,m2,std2)-norm.cdf(lower,m2,std2)) 
    elif(r1<upper): 
     if(r2<upper):   # order: l-r1-r2-u 
      print norm.cdf(r1,m1,std1), "-", norm.cdf(lower,m1,std1), "+", norm.cdf(r2,m2,std2), "-", norm.cdf(r1,m2,std2), "+", norm.cdf(upper,m1,std1), "-", norm.cdf(r2,m1,std1) 
      overlap = (norm.cdf(r1,m1,std1)-norm.cdf(lower,m1,std1))+(norm.cdf(r2,m2,std2)-norm.cdf(r1,m2,std2))+(norm.cdf(upper,m1,std1)-norm.cdf(r2,m1,std1)) 
     else:     # order: l-r1-u-r2 
      overlap = (norm.cdf(r1,m1,std1)-norm.cdf(lower,m1,std1))+(norm.cdf(upper,m2,std2)-norm.cdf(r1,m2,std2)) 
    else:      # l-u-r1-r2 
     overlap = (norm.cdf(upper,m1,std1)-norm.cdf(lower,m1,std1))

來源

2016-04-13 19:55:28 amitdatta

'upper'和'lower'是什麼？你能否解釋許多不同的情況？在這種情況下， – reschu

上限和下限將是global_upper = max（distribution1，distribution2）和global_min = min（distribution1，distribution2）。如果這個想法是計算相對重疊保持分佈1作爲基線，那麼global_min = min（distribution1）和global_max = max（distribution2） – Pramit

對於完全重疊（len（result）== 0），overlap = 1而不是？ –

scipy兩個正態分佈的重疊概率

回答

相關問題