

 

 

A Binomial(p, n) distribution has a mean and standard deviation given by:

 

\[ \mu = np \]

\[ \sigma = \sqrt{np(1-p)} \]

From the Central Limit Theorem, as n gets large the distribution of the number of observed successes s tends to:

 

 

\[ s \approx \text{Normal}\left(np, \sqrt{np(1-p)}\right) \]
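As a rough numerical check of this limit, the sketch below compares the exact Binomial cumulative distribution with the Normal(np, sqrt(np(1-p))) approximation and reports the largest gap between the two. The function name `max_cdf_gap`, the choice p = 0.5 and the values of n are illustrative assumptions, not part of the original text, and no continuity correction is applied.

```python
import math
from statistics import NormalDist


def max_cdf_gap(n, p):
    """Largest |exact Binomial CDF - Normal CDF| over k = 0..n."""
    # Normal with the Binomial's mean np and standard deviation sqrt(np(1-p)).
    approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
    cumulative = 0.0
    worst = 0.0
    for k in range(n + 1):
        # Exact probability of exactly k successes, accumulated into the CDF.
        cumulative += math.comb(n, k) * p**k * (1 - p) ** (n - k)
        worst = max(worst, abs(cumulative - approx.cdf(k)))
    return worst


# Illustrative values only: p = 0.5 and these n are assumptions.
for n in (10, 50, 500):
    print(n, round(max_cdf_gap(n, 0.5), 4))
```

The gap should shrink as n grows, which is the behaviour the Central Limit Theorem argument relies on.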

 

Equation 2 for the binomial method can then be rewritten so that, when n is large, p is approximated by a Normal distribution:

 

 

\[ p \approx \frac{\text{Normal}\left(n\frac{s}{n}, \sqrt{n\frac{s}{n}\left(1-\frac{s}{n}\right)}\right)}{n} \tag{1} \]

 

which can be rearranged to:

 

 

\[ p \approx \text{Normal}\left(\frac{s}{n}, \sqrt{s\left(\frac{1}{n^{2}}-\frac{s}{n^{3}}\right)}\right) \tag{2} \]

 

which simplifies to:

 

 

\[ p \approx \text{Normal}\left(\frac{s}{n}, \sqrt{\frac{s(n-s)}{n^{3}}}\right) \tag{3} \]
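The rearrangement from Equation 1 to Equation 3 only changes how the standard deviation is written. As a small sanity check, the sketch below evaluates the three forms for a range of s and n and confirms they agree numerically. The function names are hypothetical helpers introduced for this illustration.

```python
import math


def sd_equation_1(s, n):
    """Standard deviation of Normal(n(s/n), sqrt(n(s/n)(1 - s/n))) / n."""
    p_hat = s / n
    return math.sqrt(n * p_hat * (1 - p_hat)) / n


def sd_equation_2(s, n):
    """sqrt(s(1/n^2 - s/n^3)), the form in Equation 2."""
    return math.sqrt(s * (1 / n**2 - s / n**3))


def sd_equation_3(s, n):
    """sqrt(s(n - s)/n^3), the form in Equation 3."""
    return math.sqrt(s * (n - s) / n**3)


# Check agreement for 0 < s < n (illustrative choice of n values).
for n in (10, 100, 1000):
    for s in range(1, n):
        a, b, c = sd_equation_1(s, n), sd_equation_2(s, n), sd_equation_3(s, n)
        assert math.isclose(a, b, rel_tol=1e-9)
        assert math.isclose(b, c, rel_tol=1e-9)

print(sd_equation_3(5, 10), sd_equation_3(1, 10))
```

Equation 3 is the most convenient form in practice because it involves only s and n, with a single subtraction.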

 

Figure 1: Example of Equation 3 estimate of p where s = 5, n = 10

 

 

Figure 2: Example of Equation 3 estimate of p where s = 1, n = 10

 

Equation 3 works well in Figure 1 even though n is small (10), because the number of successes was half of n: the uncertainty distribution is then symmetric about 0.5, which matches the symmetry of a Normal distribution. However, if one had observed just 1 success from 10 trials, the picture would be quite different, as shown in Figure 2. There the Normal approximation of Equation 3 is badly inaccurate: it assigns considerable confidence to negative values of p and fails to reflect the asymmetric nature of the uncertainty distribution.
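To put a number on "considerable confidence to negative values", the following sketch evaluates how much probability the Equation 3 Normal places outside the valid range [0, 1] for the two cases in Figures 1 and 2. The helper names are assumptions made for this illustration; only the formula itself comes from Equation 3.

```python
import math
from statistics import NormalDist


def equation_3_normal(s, n):
    """Normal(s/n, sqrt(s(n-s)/n^3)) as given by Equation 3."""
    return NormalDist(mu=s / n, sigma=math.sqrt(s * (n - s) / n**3))


def mass_outside_unit_interval(s, n):
    """Return (P(p < 0), P(p > 1)) under the Equation 3 approximation."""
    dist = equation_3_normal(s, n)
    below_zero = dist.cdf(0.0)
    above_one = 1.0 - dist.cdf(1.0)
    return below_zero, above_one


# The two cases shown in Figures 1 and 2.
for s in (5, 1):
    below, above = mass_outside_unit_interval(s, 10)
    print(f"s={s}, n=10: P(p<0)={below:.4f}, P(p>1)={above:.4f}")
```

For s = 5 the mass outside [0, 1] is negligible, whereas for s = 1 a little under 15% of the probability falls below zero, which is impossible for a probability.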