Question about simulating the central limit theorem from the book Data Science from Scratch

I am reading the book "Data Science from Scratch" by Joel Grus. My question is specifically about Chapter 6, where the author uses binomial random variables to simulate the theorem.

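The result is a chart showing the probability distribution of the binomial trials together with its approximation by a normal distribution. The two plots should look very similar. The book shows a chart like this: (image: Author's Chart)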

The code he provides is:

import math
import random
from collections import Counter

from matplotlib import pyplot as plt

def normal_cdf(x, mu=0, sigma=1):
    # defined elsewhere in the book; same formula as quoted in the comments below
    return (1 + math.erf((x - mu) / math.sqrt(2) / sigma)) / 2

def bernoulli_trial(p):
    return 1 if random.random() < p else 0

def binomial(n, p):
    return sum(bernoulli_trial(p) for _ in range(n))

def make_hist(p, n, num_points):
    data = [binomial(n, p) for _ in range(num_points)]

    # use a bar chart to show the actual binomial samples
    histogram = Counter(data)
    plt.bar([x - 0.4 for x in histogram.keys()],
            [v / num_points for v in histogram.values()],
            0.8,
            color='0.75')

    mu = p * n
    sigma = math.sqrt(n * p * (1 - p))

    # use a line chart to show the normal approximation
    xs = range(min(data), max(data) + 1)
    ys = [normal_cdf(i + 0.5, mu, sigma) - normal_cdf(i - 0.5, mu, sigma) for i in xs]
    plt.plot(xs, ys)
    plt.title('Binomial Distribution vs. Normal Approximation')
    plt.show()

make_hist(0.75, 100, 10000)

My question is about this line: [normal_cdf(i+0.5, mu, sigma) - normal_cdf(i-0.5, mu, sigma) for i in xs]. Why does the author use +0.5 and -0.5? Is there a specific reason for this?

I am not sure whether anyone else has run into this question. Thanks in advance!

Source

2017-10-29 Tian Park

normal_cdf is evaluated over the vector 'x/10.0' for x in 'range(-50, 50)'. Right? – skrubber

Yes. The function is: def normal_cdf(x, mu=0, sigma=1): return (1 + math.erf((x - mu) / math.sqrt(2) / sigma)) / 2. The x-axis is: xs = [x / 10.0 for x in range(-50, 50)]. Why divide by 10? –

I guess it is just to scale the chart so the x-axis runs between -5 and 5? –
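A minimal sketch of what the division by 10 does, assuming the grid quoted in the comments above is used only to plot normal_cdf: the integers from -50 to 49 become x values from -5.0 to 4.9 spaced 0.1 apart, which gives a smoother curve than integer steps would.

```python
import math

def normal_cdf(x, mu=0, sigma=1):
    # same formula as quoted in the comment above
    return (1 + math.erf((x - mu) / math.sqrt(2) / sigma)) / 2

# integers -50..49 divided by 10 give a grid from -5.0 to 4.9 in steps of 0.1
xs = [x / 10.0 for x in range(-50, 50)]
ys = [normal_cdf(x) for x in xs]

print(xs[0], xs[-1], len(xs))
```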

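In the xs variable you have a list of X coordinates with a step of 1, e.g. [5, 6, 7, 8, 9, 10]. In the ys variable you need the corresponding Y coordinates, and normal_cdf(i+0.5, mu, sigma) - normal_cdf(i-0.5, mu, sigma) in the code is the integral of the normal density from i-0.5 to i+0.5, i.e. over a width of (i + 0.5) - (i - 0.5) = 1, which is the same as the step. In other words, the probability that the discrete binomial variable equals i is approximated by the probability that the continuous normal variable falls within half a step on either side of i (this is usually called a continuity correction).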

A more readable version of the code might look like this (normal_cdf is as defined in the book):

step = 1  # must be an int: range() does not accept a float step
xs = range(min(data), max(data) + 1, step)
ys = [normal_cdf(i + step / 2, mu, sigma) - normal_cdf(i - step / 2, mu, sigma) for i in xs]
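To check this numerically, the normal probability on the interval from i-0.5 to i+0.5 can be compared with the exact binomial probability of i. The sketch below is not from the book: binomial_pmf and normal_approx are helper names introduced here for illustration, and math.comb assumes Python 3.8 or later.

```python
import math

def normal_cdf(x, mu=0, sigma=1):
    # same formula as the book's normal_cdf quoted in the comments
    return (1 + math.erf((x - mu) / math.sqrt(2) / sigma)) / 2

def binomial_pmf(k, n, p):
    # exact P(X = k) for X ~ Binomial(n, p); hypothetical helper, not in the book
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_approx(k, mu, sigma, step=1):
    # normal probability on [k - step/2, k + step/2]; hypothetical helper
    return normal_cdf(k + step / 2, mu, sigma) - normal_cdf(k - step / 2, mu, sigma)

# same parameters as make_hist(0.75, 100, 10000)
n, p = 100, 0.75
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

for k in (70, 75, 80):
    exact = binomial_pmf(k, n, p)
    approx = normal_approx(k, mu, sigma)
    print(f"k={k}: exact={exact:.4f}, normal approx={approx:.4f}")
```

Because the intervals of width 1 sit side by side without gaps, the approximated probabilities over all i add up to almost exactly 1, just like the bar heights of the histogram.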

Source

2017-10-29 10:56:38

I see! So range(min(data), max(data) + 1) in the example actually means the step is 1. –
