
The non-parametric Bootstrap makes no assumptions about the distributional form of the population or probability (parent) distribution. However, we often know which family of distributions the parent distribution belongs to. For example, the number of earthquakes each year and the number of Giardia cysts in litres of water drawn from a lake will logically both be approximately Poisson distributed; the time between phone calls to an exchange will be roughly exponentially distributed; and the number of males in randomly sampled groups of a certain size will be binomially distributed. The parametric Bootstrap gives us a means to use this extra information about the population distribution. The procedure is the same as the non-parametric Bootstrap approach except for the distribution estimation stage:

 

1. Estimate the distribution from the data

For the parametric Bootstrap, we select the distribution type we believe the data come from and then find the maximum likelihood estimates (MLEs) of the parameters for that distribution. That is, we find the parameter values for the distribution that give the highest probability of observing the data values we have.
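
For two of the families mentioned above, the MLE has a simple closed form: the Poisson mean is estimated by the sample mean, and the exponential rate by the reciprocal of the sample mean. The following is a minimal sketch assuming NumPy is available; the function names and the data values are hypothetical and serve only to illustrate the fitting step.

```python
import numpy as np

def poisson_mle(counts):
    """MLE of the Poisson mean: the sample mean of the counts."""
    return float(np.mean(counts))

def exponential_mle_rate(times):
    """MLE of the exponential rate: the reciprocal of the sample mean."""
    return 1.0 / float(np.mean(times))

# Hypothetical observations, for illustration only.
counts = np.array([3, 1, 4, 2, 0, 5, 2, 3])   # e.g. events per year
times = np.array([0.5, 1.5, 2.0, 4.0])        # e.g. minutes between calls

lam_hat = poisson_mle(counts)            # fitted Poisson mean
rate_hat = exponential_mle_rate(times)   # fitted exponential rate
print(lam_hat, rate_hat)
```

For families without a closed-form MLE, the same step would typically be carried out by numerically maximising the log-likelihood.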

 

2. Simulate the data collection

Just as with the non-parametric Bootstrap, we now replace each observation with a sample taken at random from the fitted population distribution.
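
As a rough illustration of this step, one Bootstrap replicate can be drawn as follows. This sketch assumes a Poisson parent distribution and NumPy's random generator; the data values and the seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed, for reproducibility only

# Hypothetical observed counts, for illustration only.
observed = np.array([3, 1, 4, 2, 0, 5, 2, 3])

# Fitted parameter: the Poisson MLE is the sample mean.
lam_hat = observed.mean()

# One Bootstrap replicate: n fresh draws from the fitted distribution,
# one for each original observation.
replicate = rng.poisson(lam=lam_hat, size=observed.size)
print(replicate)
```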

 

3. Calculate the sample statistic

We now run a large number of iterations, each one generating a new Bootstrap replicate, and for each Bootstrap replicate we calculate the sample estimate of the statistic in question. 

 

In summary, the parametric Bootstrap proceeds as follows:

 

  • Collect the data set of n samples {x1, …xn}

  • Determine the parameter(s) of the distribution that best fits the data from the known distribution family using maximum likelihood estimators (MLEs)

  • Generate B Bootstrap samples {x1*, …xn*} by randomly sampling from this fitted distribution

  • For each Bootstrap sample {x1*, …xn*} calculate the required statistic \widehat{\theta}. The distribution of these B estimates of θ represents the Bootstrap estimate of uncertainty about the true value of θ.
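
The steps summarised above can be sketched end to end as follows. This is a minimal illustration, not a definitive implementation: it assumes a Poisson parent distribution, and the function name `parametric_bootstrap_poisson`, the data values, the number of replicates and the seed are all hypothetical choices.

```python
import numpy as np

def parametric_bootstrap_poisson(data, statistic, n_boot=2000, seed=None):
    """Return n_boot Bootstrap estimates of `statistic`, assuming Poisson data."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.size

    # Fit the distribution: the Poisson MLE is the sample mean.
    lam_hat = data.mean()

    # Generate B Bootstrap samples of size n from the fitted distribution
    # (one sample per row).
    samples = rng.poisson(lam=lam_hat, size=(n_boot, n))

    # Calculate the statistic for each Bootstrap sample.
    return np.apply_along_axis(statistic, 1, samples)

# Hypothetical observed counts, for illustration only.
observed = [3, 1, 4, 2, 0, 5, 2, 3]

theta_star = parametric_bootstrap_poisson(observed, np.mean, n_boot=2000, seed=42)

# Summarise the uncertainty about the statistic with a percentile interval.
lower, upper = np.percentile(theta_star, [2.5, 97.5])
print(f"95% Bootstrap interval for the mean: ({lower:.2f}, {upper:.2f})")
```

Any other statistic of interest, such as `np.std` or a percentile, could be passed in place of `np.mean`; only the fitting and sampling lines would change for a different distribution family.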

 

 

Example:

Estimating the population mean, standard deviation and percentile range for a continuous variable

 

 

 

