2016-03-08

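I have included the code I wrote below. For some reason, the upper bound of 0.804 is heavily oversampled compared with the initial distribution. This happens with both of the distributions I am using. Why does scipy.stats.rv_continuous pick the upper bound so often?

Is this a common problem with rv_continuous, or am I missing something?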

import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as st

# Disk-dominated galaxies: custom distribution defined only through _pdf
class Disk_pdf(st.rv_continuous):
    def _pdf(self, x):
        return (x*(1-np.exp((x-0.804)/0.2539)))/((1+x)*(x**2+0.0256**2)**0.5)

Disk_cv = Disk_pdf(a=0, b=0.804, name='Disk_pdf')
Disk_dist = Disk_cv.rvs(size=10000)
plt.figure()
plt.hist(Disk_dist, 100)

# Bulge-dominated galaxies: same approach with a different _pdf
class Bulge_pdf(st.rv_continuous):
    def _pdf(self, x):
        return x*np.exp(-2.368*x-6.691*x**2)

Bulge_cv = Bulge_pdf(a=0, b=0.804, name='Bulge_pdf')
Bulge_dist = Bulge_cv.rvs(size=10000)
plt.figure()
plt.hist(Bulge_dist, 100)

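Images of the initial distributions and of the histograms created with rv_continuous are below. For each distribution I have two histogram images: one zoomed in, showing that the sampling does capture the distribution apart from the oversampled upper bound, and one at the full y scale, showing how bad the oversampling problem is.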

Initial distribution for disk galaxies, and histograms made using rv_continuous that show the oversampled upper bound.

Initial distribution for bulge-dominated galaxies, and histograms made using rv_continuous that show the oversampled upper bound.

Answer

The PDF must be normalized, and yours does not appear to be:

In [6]: from scipy.integrate import quad 

In [7]: quad(Disk_cv.pdf, 0, 0.804) 
Out[7]: (0.41121809643549406, 4.005573481922018e-09) 
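A rough illustration of why an integral of about 0.41 piles samples up at 0.804. As far as I understand it, when only `_pdf` is supplied, `rv_continuous` draws uniform numbers on [0, 1] and inverts the numerically integrated CDF. If the CDF never rises above 0.41, roughly 59% of those uniform draws have no inverse inside the support and end up at the upper bound. The sketch below is not SciPy's implementation: it mimics the idea with a tabulated CDF and `np.interp`, which clamps out-of-range values to the last grid point. The names `disk_shape`, `grid` and `frac_at_bound` are made up for this example.

```python
import numpy as np

UPPER = 0.804

def disk_shape(x):
    # Unnormalized disk density copied from the question.
    return (x * (1 - np.exp((x - UPPER) / 0.2539))) / (
        (1 + x) * np.sqrt(x**2 + 0.0256**2)
    )

# Tabulate the unnormalized CDF on a fine grid with the trapezoid rule.
grid = np.linspace(0.0, UPPER, 20001)
pdf_vals = disk_shape(grid)
steps = 0.5 * (pdf_vals[1:] + pdf_vals[:-1]) * np.diff(grid)
cdf_vals = np.concatenate(([0.0], np.cumsum(steps)))
total_mass = cdf_vals[-1]  # should agree with the quad result above, not 1

# Inverse-transform sampling: map uniform draws through the inverse CDF.
rng = np.random.default_rng(0)
u = rng.uniform(size=10_000)
# Draws with u > total_mass lie beyond the table, so np.interp returns
# the last grid point, i.e. the upper bound.
samples = np.interp(u, cdf_vals, grid)

frac_at_bound = np.mean(samples == UPPER)
print(total_mass, frac_at_bound)
```

The fraction of samples stuck at the bound should come out close to `1 - total_mass`, which would match the spike in the histograms.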

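Brilliant, thanks for the help. I had a bit of a dumb moment and thought I had already included a normalization constant in the PDF. My code works fine now that I have included the normalization. Thanks! – nium14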
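For anyone landing here later, a minimal sketch of one way to add the normalization, assuming the constant is computed once with `quad` and divided out inside `_pdf`. This is not necessarily what the asker ended up doing; `disk_shape`, `DISK_NORM` and `NormalizedDisk` are names invented for this example.

```python
import numpy as np
import scipy.stats as st
from scipy.integrate import quad

UPPER = 0.804

def disk_shape(x):
    # Unnormalized disk density copied from the question.
    return (x * (1 - np.exp((x - UPPER) / 0.2539))) / (
        (1 + x) * np.sqrt(x**2 + 0.0256**2)
    )

# Integrate the shape once over the support to get the normalization constant.
DISK_NORM, _ = quad(disk_shape, 0, UPPER)

class NormalizedDisk(st.rv_continuous):
    def _pdf(self, x):
        # Dividing by the constant makes the density integrate to 1 on [0, UPPER].
        return disk_shape(x) / DISK_NORM

disk = NormalizedDisk(a=0, b=UPPER, name="normalized_disk")

# Same check as in the answer: the integral should now be 1.
total, _ = quad(disk.pdf, 0, UPPER)
print(DISK_NORM, total)
```

With the density normalized, `disk.rvs(size=...)` can be called as in the question. The generic sampler is slow for large sizes because each draw inverts a numerically integrated CDF, so supplying `_cdf` or `_ppf` as well may be worth it if speed matters.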