2014-03-28 65 views
5

我正在學習pyMC 3並遇到一些麻煩。由於pyMC3的教程有限,我正在從Bayesian Methods for Hackers開始工作。我試圖在Bayesian A/B testing示例中將pyMC 2代碼移植到pyMC 3,但沒有成功。從我所看到的模型沒有考慮到觀察結果。將pyMC2貝葉斯A/B測試示例移植到pyMC3

我不得不作出從例如一些變化,如pyMC 3是完全不同的,所以應該是什麼樣子的: 進口pymc爲PM

# The parameters are the bounds of the Uniform. 
p = pm.Uniform('p', lower=0, upper=1) 

# set constants 
p_true = 0.05 # remember, this is unknown. 
N = 1500 

# sample N Bernoulli random variables from Ber(0.05). 
# each random variable has a 0.05 chance of being a 1. 
# this is the data-generation step 
occurrences = pm.rbernoulli(p_true, N) 

print occurrences # Remember: Python treats True == 1, and False == 0 
print occurrences.sum() 

# Occurrences.mean is equal to n/N. 
print "What is the observed frequency in Group A? %.4f" % occurrences.mean() 
print "Does this equal the true frequency? %s" % (occurrences.mean() == p_true) 

# include the observations, which are Bernoulli 
obs = pm.Bernoulli("obs", p, value=occurrences, observed=True) 

# To be explained in chapter 3 
mcmc = pm.MCMC([p, obs]) 
mcmc.sample(18000, 1000) 

figsize(12.5, 4) 
plt.title("Posterior distribution of $p_A$, the true effectiveness of site A") 
plt.vlines(p_true, 0, 90, linestyle="--", label="true $p_A$ (unknown)") 
plt.hist(mcmc.trace("p")[:], bins=25, histtype="stepfilled", normed=True) 
plt.legend() 

,而不是看起來像:

import pymc as pm 

import random 
import numpy as np 
import matplotlib.pyplot as plt 

with pm.Model() as model: 
    # Prior is uniform: all cases are equally likely 
    p = pm.Uniform('p', lower=0, upper=1) 

    # set constants 
    p_true = 0.05 # remember, this is unknown. 
    N = 1500 

    # sample N Bernoulli random variables from Ber(0.05). 
    # each random variable has a 0.05 chance of being a 1. 
    # this is the data-generation step 
    occurrences = [] # pm.rbernoulli(p_true, N) 
    for i in xrange(N): 
     occurrences.append((random.uniform(0.0, 1.0) <= p_true)) 
    occurrences = np.array(occurrences) 
    obs = pm.Bernoulli('obs', p_true, observed=occurrences) 

    start = pm.find_MAP() 
    step = pm.Metropolis() 
    trace = pm.sample(18000, step, start) 
    pm.traceplot(trace); 
    plt.show() 

道歉爲冗長的職位,但在我的適應有一些小的變化,例如手動生成觀測值,因爲pm.rbernoulli不再存在。我也不確定在運行跟蹤之前是否應該找到開始。我應該如何更改我的實現以正確運行?

回答

3

你確實很近。然而,這條線:

obs = pm.Bernoulli('obs', p_true, observed=occurrences) 

是錯了,因爲你是剛剛設置的p恆定值(p_true == 0.05)。因此,上面定義的具有統一先驗的隨機變量p不受可能性的限制,並且您的圖表顯示您只是從前一個採樣。如果在代碼中用p替換p_true,它應該可以工作。這裏是固定版本:

import pymc as pm 

import random 
import numpy as np 
import matplotlib.pyplot as plt 

with pm.Model() as model: 
    # Prior is uniform: all cases are equally likely 
    p = pm.Uniform('p', lower=0, upper=1) 

    # set constants 
    p_true = 0.05 # remember, this is unknown. 
    N = 1500 

    # sample N Bernoulli random variables from Ber(0.05). 
    # each random variable has a 0.05 chance of being a 1. 
    # this is the data-generation step 
    occurrences = [] # pm.rbernoulli(p_true, N) 
    for i in xrange(N): 
     occurrences.append((random.uniform(0.0, 1.0) <= p_true)) 
    occurrences = np.array(occurrences) 
    obs = pm.Bernoulli('obs', p, observed=occurrences) 

    start = pm.find_MAP() 
    step = pm.Metropolis() 
    trace = pm.sample(18000, step, start) 

pm.traceplot(trace); 
+0

啊,謝謝。我在文本中使用了p_true,它是硬編碼的(條件是我們不會知道現實生活中的價值)。這實際上將它從示例的直接端口更改爲對真實數據更有用的內容,謝謝! :) – Eoin

0

你非常接近 - 你只需要unindent最後兩行,這會產生traceplot。您可以考慮繪製traceplot作爲完成採樣後應該發生的診斷。以下作品適合我:

import pymc as pm 

import random 
import numpy as np 
import matplotlib.pyplot as plt 

with pm.Model() as model: 
    # Prior is uniform: all cases are equally likely 
    p = pm.Uniform('p', lower=0, upper=1) 

    # set constants 
    p_true = 0.05 # remember, this is unknown. 
    N = 1500 

    # sample N Bernoulli random variables from Ber(0.05). 
    # each random variable has a 0.05 chance of being a 1. 
    # this is the data-generation step 
    occurrences = [] # pm.rbernoulli(p_true, N) 
    for i in xrange(N): 
     occurrences.append((random.uniform(0.0, 1.0) <= p_true)) 
    occurrences = np.array(occurrences) 
    obs = pm.Bernoulli('obs', p_true, observed=occurrences) 

    start = pm.find_MAP() 
    step = pm.Metropolis() 
    trace = pm.sample(18000, step, start) 

#Now plot 
pm.traceplot(trace) 
plt.show() 
+0

好的,上下文讓我困惑! – Eoin

0

這對我有效。我在啓動模型之前生成了觀察結果。

true_p_A = 0.05 
true_p_B = 0.04 
N_A = 1500 
N_B = 750 

obs_A = np.random.binomial(1, true_p_A, size=N_A) 
obs_B = np.random.binomial(1, true_p_B, size=N_B) 

with pm.Model() as ab_model: 
    p_A = pm.Uniform('p_A', 0, 1) 
    p_B = pm.Uniform('p_B', 0, 1) 
    delta = pm.Deterministic('delta',p_A - p_B) 
    obs_A = pm.Bernoulli('obs_A', p_A, observed=obs_A) 
    osb_B = pm.Bernoulli('obs_B', p_B, observed=obs_B) 

with ab_model: 
    trace = pm.sample(2000) 

pm.traceplot(trace)