

 

 

For a given set of n data values randomly sampled from an assumed Normal distribution with unknown mean \mu and unknown standard deviation \sigma, the uncertainty distribution of the true mean is calculated from a Student-t distribution:

 

                   

\mu=t(n-1)\cdot\Big(\frac{\widehat{\sigma}}{\sqrt{n}}\Big)+\bar{x} \qquad (1)

 

where t(n-1) is a standard Student-t distribution with (n-1) degrees of freedom. [This page provides an explanation of the derivation of Equation 1].

 

\widehat{\sigma} is the single-point estimate of the true standard deviation (calculated by STDEV( ) in Excel). Its square is the unbiased estimate of the true variance. It is given by:

 

\widehat{\sigma}=\sqrt{\frac{\displaystyle\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}}{n-1}}
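
As a minimal sketch of how Equation 1 can be used in a simulation, the Python snippet below draws values of the uncertain true mean from a small data set. The data values, the function name and the number of draws are illustrative assumptions, not part of the original model. The Student-t variate is built from a standard Normal and a chi-square draw so that only the standard library is needed.

```python
import math
import random
import statistics

def sample_mean_uncertainty(data, draws=10_000, seed=0):
    """Draw values of the uncertain true mean using Equation 1."""
    rng = random.Random(seed)
    n = len(data)
    x_bar = statistics.fmean(data)        # sample mean
    sigma_hat = statistics.stdev(data)    # n-1 denominator, like Excel's STDEV()
    scale = sigma_hat / math.sqrt(n)
    dof = n - 1
    values = []
    for _ in range(draws):
        # Student-t(dof) = Normal(0,1) / sqrt(ChiSq(dof) / dof);
        # ChiSq(dof) is a Gamma with shape dof/2 and scale 2.
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(dof / 2.0, 2.0)
        t = z / math.sqrt(chi2 / dof)
        values.append(t * scale + x_bar)
    return values

# Hypothetical data set, for illustration only.
data = [9.8, 10.4, 10.1, 9.5, 10.9, 10.2, 9.7, 10.6]
mu_draws = sample_mean_uncertainty(data)
print(statistics.fmean(mu_draws), statistics.stdev(mu_draws))
```

The draws are centered on the sample mean, and their spread shrinks as more data values are added.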

 

The standard Student-t distribution is unimodal and symmetric about zero (its mode is 0). Equation 1 therefore centers the uncertainty distribution of the true mean \mu on the sample mean \bar{x}, which is the "best guess". The spread of that distribution increases with the standard deviation \widehat{\sigma} and decreases with the square root of the sample size n. The Student-t distribution looks much like a unit Normal distribution but flatter, with greater spread: a standard Student(0,1,\nu) or Student(\nu) distribution with \nu > 2 degrees of freedom has a standard deviation of \sqrt{\nu/(\nu-2)}, compared with a standard deviation of 1 for the unit Normal distribution:

 

 

Figure 1 Examples of the Student-t distribution
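
The standard deviation formula \sqrt{\nu/(\nu-2)} can be checked by simulation. The sketch below is an illustration only: the helper names, the choice of 10 degrees of freedom and the number of draws are assumptions made for the example.

```python
import math
import random
import statistics

def student_t_draw(rng, dof):
    """One standard Student-t draw: Normal(0,1) / sqrt(ChiSq(dof) / dof)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(dof / 2.0, 2.0)  # chi-square with dof degrees of freedom
    return z / math.sqrt(chi2 / dof)

def theoretical_sd(dof):
    """Standard deviation of a standard Student-t; finite only for dof > 2."""
    if dof <= 2:
        raise ValueError("standard deviation is finite only for dof > 2")
    return math.sqrt(dof / (dof - 2))

rng = random.Random(1)
dof = 10  # illustrative choice
draws = [student_t_draw(rng, dof) for _ in range(100_000)]
simulated_sd = statistics.pstdev(draws)
expected_sd = theoretical_sd(dof)
print(simulated_sd, expected_sd)
```

Both values exceed 1, the standard deviation of the unit Normal distribution, and the gap narrows as the degrees of freedom increase.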

 

In fact, the larger n gets, the more closely the Student-t distribution approaches a unit Normal distribution (i.e. Normal(0, 1)). So, for large n (greater than 20 is usually fine), Equation 1 is very well approximated by:

 

 

\mu \approx Normal(0,1)\Big(\frac{\widehat{\sigma}}{\sqrt{n}}\Big)+\bar{x}
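
To get a feel for how good the Normal approximation is, the sketch below compares the 97.5th percentile of a simulated standard Student-t distribution with that of the unit Normal for a small and a larger sample size. The function name, the sample sizes (n = 5 and n = 30) and the number of draws are assumptions chosen for illustration, and the Student-t percentiles are simulation estimates rather than exact values.

```python
import math
import random
from statistics import NormalDist

def empirical_t_quantile(dof, p, draws=100_000, seed=2):
    """Estimate the p-quantile of a standard Student-t(dof) by simulation."""
    rng = random.Random(seed)
    values = sorted(
        rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(dof / 2.0, 2.0) / dof)
        for _ in range(draws)
    )
    return values[int(p * (draws - 1))]

normal_q = NormalDist().inv_cdf(0.975)              # unit Normal percentile
t_q_small = empirical_t_quantile(dof=4, p=0.975)    # n = 5
t_q_large = empirical_t_quantile(dof=29, p=0.975)   # n = 30
print(normal_q, t_q_small, t_q_large)
```

The Student-t percentile for the larger sample sits much closer to the Normal value, so replacing t(n-1) with Normal(0,1) changes the tails of the uncertainty distribution only slightly once n is large.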

The following model lets you generate values from the above uncertainty distribution for \mu for a given data set.

 

Links to the software-specific models for Estimate Mean & StDev for Normal Distribution When Neither Known are provided here:

  Estimate_mean_and_stdev_for_Normal_distribution_when_neither_known

 

Note that in Crystal Ball, Equation 1 can be written as the following Student distribution:

 

\mu=Student(\bar{x},\frac{\widehat{\sigma}}{\sqrt{n}},n-1)

 

Comparison with the Bayesian approach

The Bayesian derivation of Equation 2 is given here.

 
 
 
 

 

