



A Binomial(p, n) distribution has a mean and standard deviation given by:

\mu = np

\sigma =\sqrt{np(1-p)}

From the Central Limit Theorem, as n gets large the distribution of the number of observed successes s will tend to:



s\approx Normal(np,\sqrt{np(1-p)})
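
The convergence described above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it compares the exact Binomial cumulative distribution with the Normal(np, sqrt(np(1-p))) cumulative distribution and reports the largest gap. The function name `max_cdf_error` and the choice p = 0.3 are illustrative assumptions, and no continuity correction is applied.

```python
import math
from statistics import NormalDist


def max_cdf_error(n, p):
    """Largest gap between the exact Binomial CDF and the Normal(np, sqrt(np(1-p))) CDF."""
    approx = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
    cumulative = 0.0  # running exact P(S <= k)
    worst = 0.0
    for k in range(n + 1):
        cumulative += math.comb(n, k) * p**k * (1 - p) ** (n - k)
        worst = max(worst, abs(cumulative - approx.cdf(k)))
    return worst


# p = 0.3 is an arbitrary illustrative value, not one taken from the text.
errors = {n: max_cdf_error(n, 0.3) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(f"n = {n:4d}: max CDF error = {err:.4f}")
```

The gap should shrink as n grows, which is the behaviour the Central Limit Theorem argument relies on.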


Equation 2 for the binomial method can then be rewritten: replacing p inside the Normal distribution with its estimate s/n and dividing by n, p can be approximated by a Normal distribution when n is large, as follows:



p\approx \frac{Normal\Big(n\frac{s}{n},\sqrt{n\frac{s}{n}\big(1-\frac{s}{n}\big)}\Big)}{n}



which can be rearranged to:



p\approx Normal\Bigg(\frac{s}{n},\sqrt{s\bigg(\frac{1}{n^{2}}-\frac{s}{n^{3}}\bigg)}\Bigg)



which simplifies to the following equation (Equation 3):



p\approx Normal \Big(\frac{s}{n},\sqrt{\frac{s(n-s)}{n^{3}}}\Big)
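
As a quick check on the algebra, the sketch below evaluates the standard deviation in each of the three forms above and confirms they agree. It uses only the Python standard library. The helper name `normal_approx_p` and the restriction 0 < s < n are assumptions of this sketch rather than part of the original text.

```python
import math
from statistics import NormalDist


def normal_approx_p(s, n):
    """Normal(s/n, sqrt(s(n-s)/n^3)) approximation for p after s successes in n trials."""
    if not 0 < s < n:
        # Assumption of this sketch: s = 0 or s = n would give a zero standard deviation.
        raise ValueError("requires 0 < s < n")
    return NormalDist(s / n, math.sqrt(s * (n - s) / n**3))


s, n = 5, 10  # the values used in Figure 1

# The three forms of the standard deviation in the derivation should agree.
sd_first = math.sqrt(n * (s / n) * (1 - s / n)) / n
sd_second = math.sqrt(s * (1 / n**2 - s / n**3))
sd_third = normal_approx_p(s, n).stdev
print(sd_first, sd_second, sd_third)

# Central 95% interval for p under the approximation.
dist = normal_approx_p(s, n)
lower, upper = dist.inv_cdf(0.025), dist.inv_cdf(0.975)
print(lower, upper)
```

For s = 5 and n = 10 the approximation is centred on 0.5 with a standard deviation of about 0.158.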



Figure 1: Example of Equation 3 estimate of p where s = 5, n = 10



Figure 2: Example of Equation 3 estimate of p where s = 1, n = 10


Equation 3 works well in Figure 1, even though n is small (10), because the number of successes was half of n: the uncertainty distribution is then symmetric about 0.5, which matches the symmetry of a Normal distribution. However, had one observed just 1 success from 10 trials, the result would look quite different, as shown in Figure 2. The Normal approximation of Equation 3 is now badly inaccurate: it assigns considerable probability to negative values of p and fails to reflect the asymmetric nature of the uncertainty distribution.
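
To put a number on the problem, the following sketch computes how much probability the Normal approximation of Equation 3 places on impossible negative values of p in each of the two examples. The function name `mass_below_zero` is an illustrative assumption.

```python
import math
from statistics import NormalDist


def mass_below_zero(s, n):
    """Probability that the Normal approximation assigns to impossible values p < 0."""
    approx = NormalDist(s / n, math.sqrt(s * (n - s) / n**3))
    return approx.cdf(0.0)


for s, n in ((5, 10), (1, 10)):
    print(f"s = {s}, n = {n}: P(p < 0) = {mass_below_zero(s, n):.4f}")
```

With s = 5 the mass below zero is under 0.1%, whereas with s = 1 it is roughly 15%. The mean of 0.1 then sits only about one standard deviation (about 0.095) above zero.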



