Posterior probability with pymc

(This question was originally posted on the statistics site (Cross Validated). I moved it here because it is about pymc and, more generally, about understanding how pymc works. If any of the moderators think it is not suitable for SO, I will delete it from here.)

I have read the pymc tutorial and many other questions, both here and on SO.

I am trying to understand how to apply Bayes' theorem to compute a posterior probability from some data. In particular, I have a set of independent parameters $\theta = (\theta_1, \dots, \theta_n)$, so that

$$p(\theta) = \prod_i p(\theta_i).$$

From the data, I would like to infer the likelihood $p(E \mid \theta)$, where $E$ is a specific event. The goal is then to compute

$$p(\theta \mid E) = \frac{p(E \mid \theta)\, p(\theta)}{p(E)}.$$

Additional comments:

  • This is a kind of unsupervised learning: I know that $E$ happened, and I want to find the parameters $\theta$ that maximise $p(\theta \mid E)$. (*)
  • I would also like, as an alternative procedure, to let pymc compute the likelihood of obtaining the data and then, for each set of parameters, get a posterior probability.

In what follows, I will assume that $p(\theta) = \mathcal{U}(0, 100) \times \mathcal{N}(0, 0.0001)$ and that the likelihood is a multivariate normal distribution with $\mu = \theta$ and $\Sigma = I_n$ (because of the independence).
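Spelled out (writing the observed data as a vector $x$, and reading the 0.0001 above as pymc's precision parameter $\tau$), the unnormalised posterior I am after would be

$$p(\theta_1, \theta_2 \mid E) \;\propto\; \exp\!\left(-\tfrac{1}{2}(x - \theta)^\top (x - \theta)\right)\, \mathcal{U}(\theta_1;\, 0, 100)\, \mathcal{N}(\theta_2;\, 0, \tau = 10^{-4}),$$

which is what I would like pymc to sample from.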

Below is the code I am using (for simplicity, suppose there are only two parameters). The code is still a work in progress (I know it cannot work as it stands!), but I find it useful to include it here and refine it following the comments and answers, so that it can serve as a skeleton for future reference.

    import numpy as np
    import pymc as pm


    class ObsData(object):
        def __init__(self, params):
            self.theta1 = params[0]
            self.theta2 = params[1]


    class Model(object):
        def __init__(self, data):
            # Priors
            self.theta1 = pm.Uniform('theta1', 0, 100)
            self.theta2 = pm.Normal('theta2', 0, 0.0001)

            @pm.deterministic
            def model(theta1=self.theta1, theta2=self.theta2):
                return (theta1, theta2)

            # Is this the actual likelihood?
            self.likelihood = pm.MvNormal(
                'likelihood',
                mu=model,
                tau=np.identity(2),
                value=data,      # is it correct to put the data here?
                observed=True,
            )


    def mcmc(observed_data):
        data = ObsData(observed_data)
        pymc_obj = Model(data)
        model = pm.MCMC(pymc_obj)
        model.sample(10000, verbose=0)
        # Does this line compute the likelihood and the normalisation factor?
        # How do I get the posterior distribution?

The following questions arise:

  • Does self.likelihood represent the Bayesian likelihood $p(E \mid \theta)$?
  • How do I use the data? (I suspect that value=data is incorrect...)
  • Does .sample() actually compute the posterior probability?
  • How do I get the information out of the posterior?
  • (*) Should I, at some point, include anything related to the fact that $E$ happened?
  • As a general question: is it possible to use pymc just to compute the probability of obtaining the data, given the priors?

Any comments, as well as links to other questions or tutorials, are welcome!

1 answer

For starters, I think you want to return (theta1 * theta2) from your model definition.

model.sample samples from (rather than computes) the posterior distributions of your parameters given your data (assuming adequate burn-in, etc.), and the likelihood of specific values for each tuple of parameters can be determined from the joint posterior after sampling.
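To make this concrete, here is a minimal, untested PyMC2 sketch of the model described in the question (the observed vector obs is made up, and the priors are the ones stated above), showing where the observed data enter and how to pull the posterior traces out after sampling:

    import numpy as np
    import pymc as pm

    # Hypothetical observed data: a single 2-dimensional observation standing in for E.
    obs = np.array([1.0, 2.0])

    # Priors: p(theta) = U(0, 100) x N(0, tau=0.0001)
    theta1 = pm.Uniform('theta1', lower=0, upper=100)
    theta2 = pm.Normal('theta2', mu=0, tau=0.0001)

    @pm.deterministic
    def mu(theta1=theta1, theta2=theta2):
        # Mean vector of the likelihood: just the two parameters stacked.
        return np.array([theta1, theta2])

    # Likelihood p(E | theta): the data go in through value=... with observed=True.
    likelihood = pm.MvNormal('likelihood', mu=mu, tau=np.identity(2),
                             value=obs, observed=True)

    model = pm.MCMC([theta1, theta2, mu, likelihood])
    model.sample(iter=10000, burn=1000)

    # Draws from the marginal posteriors; pairing them index by index
    # gives draws from the joint posterior.
    theta1_samples = model.trace('theta1')[:]
    theta2_samples = model.trace('theta2')[:]
    print(theta1_samples.mean(), theta2_samples.mean())

sample() only fills the traces; it does not hand back p(theta|E) directly, and it never needs the normalisation factor p(E), which is much of the point of MCMC.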

I think you have a fundamental misunderstanding of MCMC at the moment. I can't think of a better way to answer your questions than to point you to the wonderful Bayesian Methods for Hackers.


Source: https://habr.com/ru/post/1244740/

