
# Conjugate priors

A conjugate prior has the same functional form in q as the likelihood function, which leads to a posterior distribution belonging to the same distribution family as the prior. For example, the Beta(a1, a2, 1) distribution has probability density function f(q) given by:

f(q) = q^(a1-1) (1-q)^(a2-1) / B(a1, a2)

The denominator is a constant for particular values of a1 and a2, so we can rewrite the equation as:

f(q) ∝ q^(a1-1) (1-q)^(a2-1)        (1)

If we had observed s successes in n trials and were attempting to estimate the true probability of success p, the likelihood function l(s, n; q) would be given by the binomial distribution probability mass function, written (using q to represent the unknown parameter p):

l(s, n; q) = C(n, s) q^s (1-q)^(n-s)

Since the binomial coefficient C(n, s) is constant for the given data set (i.e. known n, s), we can rewrite the equation as:

l(s, n; q) ∝ q^s (1-q)^(n-s)        (2)

We can see that the Beta distribution and the binomial likelihood function have the same functional form in q, i.e. q^a (1-q)^b, where a and b are constants. Since the posterior distribution is a product of the prior and likelihood function, it too will have the same functional form; combining Equations 1 and 2 we have:

f(q | s, n) ∝ q^(a1+s-1) (1-q)^(a2+n-s-1)        (3)

We know from this form that it is a Beta(a1+s, a2+n-s, 1) distribution, so the posterior density is actually:

f(q | s, n) = q^(a1+s-1) (1-q)^(a2+n-s-1) / B(a1+s, a2+n-s)

With a bit of practice, one starts to recognize distributions by their functional form, without having to go through the step of obtaining the normalized equation. Thus, if one uses a Beta distribution as a prior for p with a binomial likelihood function, the posterior distribution is also a Beta. The value of using conjugate priors is that we can avoid actually doing any of the mathematics and get directly to the posterior distribution by simply updating the parameters of the prior distribution. Conjugate priors are often called convenience priors for obvious reasons.
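This conjugate shortcut is easy to verify numerically. The following sketch (with illustrative prior parameters and data, not values from the text) normalizes prior × likelihood on a grid and compares the result with the directly updated Beta posterior:

```python
# Numerical check of Beta-binomial conjugacy: a minimal sketch with
# illustrative prior parameters and data.
import numpy as np
from scipy import stats

a1, a2 = 2.0, 3.0        # prior Beta parameters (arbitrary choice)
n, s = 20, 7             # observed: s successes in n trials

q = np.linspace(0.001, 0.999, 999)   # grid over the probability q
dq = q[1] - q[0]

prior = stats.beta.pdf(q, a1, a2)
likelihood = stats.binom.pmf(s, n, q)

# Bayes' rule done numerically: normalize prior x likelihood on the grid
unnorm = prior * likelihood
numeric_post = unnorm / (unnorm.sum() * dq)

# Conjugate shortcut: simply update the Beta parameters
conjugate_post = stats.beta.pdf(q, a1 + s, a2 + n - s)

print(float(np.max(np.abs(numeric_post - conjugate_post))))  # close to zero
```

The two curves agree up to grid discretization error, which is the point of Equation 3: no integration is ever needed.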

The Beta(1, 1, 1) distribution is the same as a Uniform(0, 1) distribution, so if we want to start with a Uniform(0, 1) prior for p, which makes intuitive sense and also mathematical sense from the viewpoint of MaxEnt, our posterior distribution is given by Beta(s+1, n-s+1, 1). This is a particularly useful result in modeling binomial processes. The Jeffreys prior for a binomial probability is a Beta(½, ½, 1), which peaks at zero and one, but holds to one philosophy of an uninformed prior. Some modelers use a Beta(0, 0, 1) prior, which is mathematically undefined and therefore meaningless by itself, giving a posterior distribution of Beta(s, n-s). This posterior has a mean of s/n: in other words, it provides an unbiased estimate of the binomial probability (a property many statisticians prefer), but it has a mode of (s-1)/(n-2), which is not intuitive, and it doesn't work if s = 0 or s = n.
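A short sketch comparing the three priors above makes the trade-offs concrete (the data values are illustrative; "Haldane" is a common name for the Beta(0, 0) prior):

```python
# Posterior summaries for three common "uninformed" priors for a binomial
# probability. Prior Beta(a1, a2) -> posterior Beta(a1+s, a2+n-s).
n, s = 50, 12            # illustrative data: s successes in n trials

priors = {
    "Uniform  Beta(1, 1)":     (1.0, 1.0),
    "Jeffreys Beta(1/2, 1/2)": (0.5, 0.5),
    "Haldane  Beta(0, 0)":     (0.0, 0.0),  # improper; posterior proper only if 0 < s < n
}

for name, (a1, a2) in priors.items():
    pa, pb = a1 + s, a2 + n - s
    mean = pa / (pa + pb)
    mode = (pa - 1) / (pa + pb - 2)          # defined when pa, pb > 1
    print(f"{name}: posterior Beta({pa}, {pb}), mean={mean:.4f}, mode={mode:.4f}")
```

For the Beta(0, 0) prior the posterior mean works out to exactly s/n, as stated above, while the Uniform and Jeffreys priors pull the mean slightly toward ½.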

The following table lists other conjugate priors and the associated likelihood functions. Exponential families of distributions, from which one often draws the likelihood function, all have conjugate priors, so the technique can be used frequently in practice. Conjugate priors are also often used to provide approximate but very convenient representations of subjective priors.


**Table of likelihood functions and their conjugate distributions**

| Likelihood function | Information | Estimated parameter | Prior | Posterior |
| --- | --- | --- | --- | --- |
| Multinomial | s1, s2, …, sk successes in k categories | Probabilities p1, p2, …, pk | Dirichlet(a1, a2, …, ak) | Dirichlet(a1+s1, a2+s2, …, ak+sk) |
| Binomial | s successes in n trials | Probability p | Beta(a1, a2, 1) | Beta(a1+s, a2+n-s, 1) |
| Exponential | n observed times x1, …, xn | 1/mean = λ | Gamma(0, b, a) | Gamma(0, b/(1 + b·Σxi), a+n) |
| Normal (with known σ) | n data values with mean x̄ | Mean μ | Normal(μm, σm) | Normal((μm/σm² + n·x̄/σ²)/(1/σm² + n/σ²), (1/σm² + n/σ²)^(-1/2)) |
| Poisson | α observations in time t | Mean events per unit time λ | Gamma(0, b, a) | Gamma(0, b/(1 + b·t), a+α) |
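As a worked instance of the Poisson row, the sketch below performs the Gamma update and checks it against a grid calculation. It assumes a shape–scale Gamma parameterization (prior shape a, scale b) for the rate λ; all numeric values are illustrative:

```python
# Poisson-Gamma conjugate update: a sketch assuming the Gamma prior for the
# Poisson rate lambda has shape a and scale b (illustrative values).
import numpy as np
from scipy import stats

a, b = 2.0, 1.5          # prior shape and scale
k, t = 17, 4.0           # k events observed over exposure time t

# Conjugate update: shape -> a + k, scale -> b / (1 + b*t)
post_shape = a + k
post_scale = b / (1.0 + b * t)

# Numerical check: prior pdf times Poisson likelihood, renormalized on a grid
lam = np.linspace(1e-3, 20.0, 20000)
dlam = lam[1] - lam[0]
unnorm = stats.gamma.pdf(lam, a, scale=b) * stats.poisson.pmf(k, lam * t)
numeric = unnorm / (unnorm.sum() * dlam)
exact = stats.gamma.pdf(lam, post_shape, scale=post_scale)

print(float(np.max(np.abs(numeric - exact))))  # close to zero
```

As with the Beta-binomial case, the posterior follows from a simple parameter update, with no integration required.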