Troubleshoot distribution problems in Python

Question

Troubleshoot distribution problems in Python

I am coding the output of the Dirichlet-Multinomial posterior in a document that I have read, and I am having trouble getting the distribution equal to 1. Here is the code in its unsatisfactory form:

def pcn(X, n, N, c, alpha):
    pnc = np.math.factorial(np.sum([n[i] for i in range(len(n))]))/ \
          np.product([np.math.factorial(n[i]) for i in range(len(n))])* \
          np.product([c[i]**n[i] for i in range(len(n))])
    pc = G(len(X)*alpha)/ \
         np.product([G(alpha) for i in range(len(n)) if i in X])* \
         np.product([(c[i])**(alpha - 1) for i in range(len(n)) if i in X])
    pn = np.math.factorial(N)/ \
         np.product([np.math.factorial(n[i]) for i in range(len(n)) if i in X])* \
         G(len(X)*alpha)/ \
         G(len(X)*alpha + N)* \
         np.product([G(alpha + n[i])/G(alpha) for i in range(len(n)) if i in X])
    return pnc

which I simplified here by removing the parts that are separated:

def pcns(X, n, N, c, alpha):
    pnc = np.product([c[i]**n[i] for i in range(len(n))])

    pc = np.product([(c[i])**(alpha - 1) for i in range(len(n))])/ \
         np.product([G(alpha) for i in range(len(n))])

    pn = np.product([G(alpha + n[i])/G(alpha) for i in range(len(n))])/ \
         G(len(X)*alpha + N)

    return pnc * pc / pn

I set the input variables and initialize the c array for input:

X = [0,1]
n = [6.0, 3.0]
N = sum(n)
alpha = 20
c = np.linspace(0, 1, 1000)

and then I iterate over c and evaluate my back function for every c and picture:

dist = []
for i in c:
    dist.append(pcns(X, n, N, [i, 1-i], alpha))

plt.plot(c,dist)

The value that I get when summing over distis 999 or len(c) - 1. Does anyone possibly find out why this is not to 1?

+4

python bayesian

user5590942 Nov 22 '15 at 6:08

source share

1 answer

DSM · Accepted Answer · 2015-11-22T13:06:27+0000

tl; dr: dx ~ = delta_x.

, dist 1? G, , . 0.5 stddev - , 0 1 :

>>> import scipy.stats
>>> c = np.linspace(0, 1, 1000)
>>> p = scipy.stats.norm(0.5,0.02).pdf(c)
>>> sum(p)
999.00000000000045

999. , 1? , , . .

: , x ^ 2 0 1 1/3,

>>> x = np.linspace(0, 1, 1000)
>>> sum(x**2)
333.50016683350009

- ( , , x = 1 ):

>>> sum(x**2) * (x[1]-x[0])
0.3338340008343344

>>> sum(p) * (c[1]-c[0])
1.0000000000000004

, * (c[1]-c[0]), c , "dx" Int(p(x) dx) ~= sum(p[x] * delta_x) . , - sum(p[:-1] * np.diff(c)).

Troubleshoot distribution problems in Python

More articles: