有沒有辦法加快這個簡單的PyMC模型?在20-40個數據點上,大概需要5-11秒。如何加快PyMC馬爾可夫模型?
import pymc
import time
import numpy as np
from collections import OrderedDict
# prior probability of rain
p_rain = 0.5
variables = OrderedDict()
# rain observations
data = [True, True, True, True, True,
False, False, False, False, False]*4
num_steps = len(data)
p_rain_given_rain = 0.9
p_rain_given_norain = 0.2
p_umbrella_given_rain = 0.8
p_umbrella_given_norain = 0.3
for n in range(num_steps):
if n == 0:
# Rain node at time t = 0
rain = pymc.Bernoulli("rain_%d" %(n), p_rain)
else:
rain_trans = \
pymc.Lambda("rain_trans",
lambda prev_rain=variables["rain_%d" %(n-1)]: \
prev_rain*p_rain_given_rain + (1-prev_rain)*p_rain_given_norain)
rain = pymc.Bernoulli("rain_%d" %(n), p=rain_trans)
umbrella_obs = \
pymc.Lambda("umbrella_obs",
lambda rain=rain: \
rain*p_umbrella_given_rain + (1-rain)*p_umbrella_given_norain)
umbrella = pymc.Bernoulli("umbrella_%d" %(n), p=umbrella_obs,
observed=True,
value=data[n])
variables["rain_%d" %(n)] = rain
variables["umbrella_%d" %(n)] = umbrella
print "running on %d points" %(len(data))
all_vars = variables.values()
t_start = time.time()
model = pymc.Model(all_vars)
m = pymc.MCMC(model)
m.sample(iter=2000)
t_end = time.time()
print "\n%.2f secs to run" %(t_end - t_start)
由於只有40個數據點,它需要11秒運行:
running on 40 points
[-----------------100%-----------------] 2000 of 2000 complete in 11.5 sec
11.54 secs to run
(用80分花費20秒)。這是一個玩具的例子。 Lambda()
中確定轉換的表達式實際上更復雜。這個基本的代碼結構是靈活的(而用轉移矩陣編碼模型的靈活性較差)。有沒有辦法保持類似的代碼結構,但獲得更好的性能?如有必要,很樂意切換到PyMC3。謝謝。
哪個pymc的版本,您使用的(pypy)?在2.3.6的pymc文檔中,我找不到Bernoulli函數,只有Bernoulli_like [Doc](https://pymc-devs.github.io/pymc/)。 – CodeMonkey
它存在於2.2 – slushy
我有一個類似的優化問題(https://stackoverflow.com/questions/42205123/how-to-fit-a-method-belonging-to-an-instance-with-pymc3) –