# Transforming discrete data before performing a parametric distribution fit

Discrete parametric distributions take integer values and generally start at zero. The variable you are modeling may not. For example, the number of people in a household, the number of animals in an outbreak, or the number of people involved in a car crash must be at least one. In this situation, you should transform the data (e.g. by subtracting one) before fitting a distribution, and then add one back on to the fitted distribution afterwards (e.g. =1+Poisson(75)).
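The shift-fit-shift-back idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the household sizes are made-up data, NumPy stands in for the simulation package, and the Poisson is fitted by maximum likelihood (for the Poisson, the estimate of lambda is simply the sample mean).

```python
import numpy as np

# Hypothetical household sizes (people per household); the minimum possible value is 1.
household_sizes = np.array([1, 2, 2, 3, 1, 4, 2, 5, 3, 2, 1, 3])

# Step 1: subtract one so the data start at zero, like the Poisson distribution.
shifted = household_sizes - 1

# Step 2: fit the Poisson. Its maximum likelihood estimate of lambda is the sample mean.
lam_hat = shifted.mean()

# Step 3: sample from the fitted distribution and add the one back on,
# the equivalent of writing =1+Poisson(lambda) in a spreadsheet model.
rng = np.random.default_rng(seed=1)
simulated = 1 + rng.poisson(lam_hat, size=10_000)

print(f"fitted lambda = {lam_hat:.3f}")
print(f"smallest simulated household = {simulated.min()}")
```

Because the one is added back after sampling, no simulated household can have fewer than one person, which matches the constraint on the real variable.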

The model Paratetramol contains the data and the fitted model for the following example. The example illustrates how to transform discrete data before performing a parametric distribution fit, and also how to fit a first-order parametric discrete distribution to discrete data. As described in this same section, most simulation software packages can "automatically" fit first-order parametric continuous distributions to data, but cannot do so with discrete distributions. The example below therefore shows how one can fit a discrete distribution to discrete data.

The links to the Paratetramol software specific models are provided here:

To understand this example, it is useful to review the section about the Negative Binomial distribution. Note that simulation software packages may define the NegBinomial distribution in different ways. For example, the NegBinomial distribution in @Risk starts at zero, while in Crystal Ball it starts at s. To make the Crystal Ball NegBinomial distribution start at zero, we therefore just have to subtract s. This form of the distribution (NegBinomial(p, s) - s) models the number of failures before the s-th success, while Crystal Ball's default NegBinomial(p, s) distribution models the total number of trials needed to achieve s successes.
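The difference between the two conventions can be checked numerically. The sketch below uses SciPy rather than @Risk or Crystal Ball, so it is an analogy, not a reproduction of either package: `scipy.stats.nbinom` counts failures (support starting at zero), and shifting it by s with the `loc` argument gives the total number of trials (support starting at s). The values of s and p are arbitrary.

```python
from scipy import stats

s, p = 3, 0.4  # hypothetical values: s successes required, success probability p

# Failures form: number of failures before the s-th success, support starts at 0.
failures = stats.nbinom(s, p)

# Trials form: total number of trials needed for s successes, support starts at s.
# It is the failures form shifted right by s.
trials = stats.nbinom(s, p, loc=s)

# The two forms assign the same probability to matching outcomes:
# zero failures corresponds to exactly s trials.
print("P(0 failures)   =", failures.pmf(0))
print("P(s trials)     =", trials.pmf(s))
print("P(s - 1 trials) =", trials.pmf(s - 1))  # impossible, so probability is zero

# The means differ by exactly s.
print("mean failures =", failures.mean())
print("mean trials   =", trials.mean())
```

Whichever package you use, it is worth checking which of the two forms it implements before fitting, since the fitted parameters are only comparable once both are expressed on the same scale.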

All parametric discrete distributions take integer values. However, your data may not, because the variable may not be measured in integer units. For example, you may have data on the amount of a compound (e.g. Paratetramol) that people have taken. The Paratetramol will have come in pills of a specific dose (e.g. 25mg), so the observations will go up in 25mg steps. You would need to divide each data value by 25mg, fit a distribution to the resulting integers, and then multiply the fitted distribution by 25mg:

1. Original data are discrete but not with increments of 1
2. Data are transformed to integers
3. Transformed data are fit to a parametric discrete distribution
4. Parametric distribution is transformed back to original scale
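The four steps above can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the dose observations are invented, the negative binomial (failures form, starting at zero) is just one candidate discrete distribution, and the fit is done by maximum likelihood with SciPy's general-purpose optimizer. It is not the fitting procedure of the Paratetramol model itself.

```python
import numpy as np
from scipy import optimize, stats

DOSE_MG = 25  # pill size used in the text

# Step 1: hypothetical observations of the amount of the compound taken (mg).
# The values are discrete, but in steps of 25 rather than 1.
doses_mg = np.array([0, 25, 25, 50, 50, 50, 75, 75, 100, 100,
                     125, 150, 200, 250, 50, 25, 75, 0, 100, 175])

# Step 2: transform to integers (number of pills).
pills = doses_mg // DOSE_MG
assert np.all(pills * DOSE_MG == doses_mg), "data must be whole multiples of the dose"

# Step 3: fit a negative binomial by maximum likelihood.
def neg_log_likelihood(params):
    s, p = params
    return -stats.nbinom.logpmf(pills, s, p).sum()

# Method-of-moments starting values (these require variance > mean).
mean, var = pills.mean(), pills.var(ddof=1)
p0 = mean / var
s0 = mean * p0 / (1 - p0)

result = optimize.minimize(
    neg_log_likelihood,
    x0=[s0, p0],
    bounds=[(1e-6, None), (1e-6, 1 - 1e-6)],
    method="L-BFGS-B",
)
s_hat, p_hat = result.x

# Step 4: sample from the fitted distribution and transform back to mg.
rng = np.random.default_rng(seed=1)
simulated_mg = DOSE_MG * stats.nbinom.rvs(s_hat, p_hat, size=10_000, random_state=rng)

print(f"fitted s = {s_hat:.3f}, fitted p = {p_hat:.3f}")
print("first simulated doses (mg):", simulated_mg[:10])
```

Every simulated value is a whole multiple of 25mg, because the rescaling is applied to the integer output of the fitted distribution rather than to its parameters.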
