Posterior probability with pymc

(This question was originally posted on the statistics site (Cross Validated). I moved it here because it is about pymc and, more generally, about understanding how pymc works. If any of the moderators think it is not suitable for SO, I will delete it from here.)

I have read the pymc tutorial and many other questions, both here and on SO.

I am trying to understand how to apply Bayes' theorem to compute a posterior probability from some data. In particular, I have a set of independent parameters $\theta = (\theta_1, \dots, \theta_n)$, so that

$$p(\theta) = \prod_i p(\theta_i).$$

From the data, I would like to infer the likelihood $p(E \mid \theta)$, where $E$ is a specific event. The goal is then to compute

$$p(\theta \mid E) = \frac{p(E \mid \theta)\, p(\theta)}{p(E)}.$$

Additional comments:

  • This is a kind of unsupervised learning: I know that $E$ happened, and I want to find the parameters $\theta$ that maximise $p(\theta \mid E)$. (*)
  • I would also like, as an alternative procedure, to let pymc compute the likelihood of obtaining the data and then, for each set of parameters, get a posterior probability.

In what follows, I will assume that $p(\theta) = \mathcal{U}(0, 100) \times \mathcal{N}(0, 0.0001)$ and that the likelihood is a multivariate normal distribution with $\mu = \theta$ and $\Sigma = I_n$ (because of the independence).
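Spelled out (writing the observed data as a vector $x$, and reading the 0.0001 above as pymc's precision parameter $\tau$), the unnormalised posterior I am after would be

$$p(\theta_1, \theta_2 \mid E) \;\propto\; \exp\!\left(-\tfrac{1}{2}(x - \theta)^\top (x - \theta)\right)\, \mathcal{U}(\theta_1;\, 0, 100)\, \mathcal{N}(\theta_2;\, 0, \tau = 10^{-4}),$$

which is what I would like pymc to sample from.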

Below is the code I am using (for simplicity, suppose there are only two parameters). The code is still a work in progress (I know it cannot work as it stands!), but I find it useful to include it here and refine it following the comments and answers, so that it can serve as a skeleton for future reference.

    import numpy as np
    import pymc as pm


    class ObsData(object):
        def __init__(self, params):
            self.theta1 = params[0]
            self.theta2 = params[1]


    class Model(object):
        def __init__(self, data):
            # Priors
            self.theta1 = pm.Uniform('theta1', 0, 100)
            self.theta2 = pm.Normal('theta2', 0, 0.0001)

            @pm.deterministic
            def model(theta1=self.theta1, theta2=self.theta2):
                return (theta1, theta2)

            # Is this the actual likelihood?
            self.likelihood = pm.MvNormal(
                'likelihood',
                mu=model,
                tau=np.identity(2),
                value=data,      # is it correct to put the data here?
                observed=True,
            )


    def mcmc(observed_data):
        data = ObsData(observed_data)
        pymc_obj = Model(data)
        model = pm.MCMC(pymc_obj)
        model.sample(10000, verbose=0)
        # Does this line compute the likelihood and the normalisation factor?
        # How do I get the posterior distribution?

The following questions arise:

  • Does self.likelihood represent the Bayesian likelihood $p(E \mid \theta)$?
  • How do I use the data? (I suspect that value=data is incorrect...)
  • Does .sample() actually compute the posterior probability?
  • How do I get the information out of the posterior?
  • (*) Should I, at some point, include anything related to the fact that $E$ happened?
  • As a general question: is it possible to use pymc just to compute the probability of obtaining the data, given the priors?

Any comments, as well as links to other questions or tutorials, are welcome!

1 answer

For starters, I think you want to return (theta1 * theta2) from your model definition.

model.sample samples from (rather than computes) the posterior distributions of your parameters given your data (assuming adequate burn-in, etc.), and the likelihood of specific values for each tuple of parameters can be determined from the joint posterior after sampling.
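To make this concrete, here is a minimal, untested PyMC2 sketch of the model described in the question (the observed vector obs is made up, and the priors are the ones stated above), showing where the observed data enter and how to pull the posterior traces out after sampling:

    import numpy as np
    import pymc as pm

    # Hypothetical observed data: a single 2-dimensional observation standing in for E.
    obs = np.array([1.0, 2.0])

    # Priors: p(theta) = U(0, 100) x N(0, tau=0.0001)
    theta1 = pm.Uniform('theta1', lower=0, upper=100)
    theta2 = pm.Normal('theta2', mu=0, tau=0.0001)

    @pm.deterministic
    def mu(theta1=theta1, theta2=theta2):
        # Mean vector of the likelihood: just the two parameters stacked.
        return np.array([theta1, theta2])

    # Likelihood p(E | theta): the data go in through value=... with observed=True.
    likelihood = pm.MvNormal('likelihood', mu=mu, tau=np.identity(2),
                             value=obs, observed=True)

    model = pm.MCMC([theta1, theta2, mu, likelihood])
    model.sample(iter=10000, burn=1000)

    # Draws from the marginal posteriors; pairing them index by index
    # gives draws from the joint posterior.
    theta1_samples = model.trace('theta1')[:]
    theta2_samples = model.trace('theta2')[:]
    print(theta1_samples.mean(), theta2_samples.mean())

sample() only fills the traces; it does not hand back p(theta|E) directly, and it never needs the normalisation factor p(E), which is much of the point of MCMC.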

I think you have a fundamental misunderstanding of MCMC at the moment. I can't think of a better way to answer your questions than to point you to the wonderful Bayesian Methods for Hackers.


Source: https://habr.com/ru/post/1244740/

