To learn more about EpiX Analytics' work, please visit our modeling applications, white papers, and training schedule.

Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Option 1: A Discrete Uniform distribution

 

How to construct a discrete uniform distribution varies for different simulation software packages. Model Empirical Distributions provides an example.

Anchor
Discrete Uniform
Discrete Uniform

The links to the Discrete Uniform software specific models are provided here:

Expand
titleCrystal Ball

Model Image AddedEmpirical Distributions (sheet Discrete) provides an example.

With Crystal Ball you can construct

...

Discrete Uniform

...

 distribution using Crystal Ball's Custom Distribution. The Discrete Uniform takes one parameter: a list of values. It then randomly picks any one of those values with equal probability (sampling with replacement). Thus, for example, a Discrete Uniform with data ({1,2,3,5}) will generate, with each iteration, one of the four values 1, 2, 3 or 5 (each value has during each iteration a probability of being picked of 25%). The figure below shows what the probability distribution looks like.

 

Image Modified
 

 

Let's imagine that we have our data in a column of Cells B1:B50. By simply

...

using Crystal Ball's Custom Distribution

...

 to generate

...

Discrete Uniform distribution, we will generate a distribution that replicates the pattern of the observed data. You can use the Discrete Uniform distribution for both discrete and continuous data providing you have sufficient observations.

Expand
title

...

@Risk

...

Model Image AddedEmpirical Distributions (sheet

...

DUniform) provides an example.


@RISK offers a Discrete Uniform distribution that takes one parameter: a list of values. It then randomly picks any one of those values with equal probability (sampling with replacement). Thus, for example, =RiskDUniform({1,2,3,5}) will generate, with each iteration, one of the four values 1, 2, 3 or 5 (each value has during each iteration a probability of being picked of 25%). The figure below shows what the probability distribution looks like.

 

Image Added 

 

 

Let’s imagine that we have our data in an array of Cells called ’Observations’. By simply writing =RiskDUniform(Observations) we will generate a distribution that replicates the pattern of the observed data. You can use the DUniform distribution for both discrete and continuous data providing you have sufficient observations. 

 

 

Option 2: A Cumulative distribution

...

If your  data are continuous you also have the option of using a Cumulative distribution, also generated using Crystal Ball's Custom Distribution. If you specify that the data are cumulative, Crystal Ball then constructs an empirical cumulative distribution by straight-line interpolation between the points defined on the curve.

 

Our best guess of the cumulative probability of a data point in a set of observations turns out to be r/(n+1) where r is the rank of the data point within the data set and n is the number of observations. Thus, when choosing this option, one needs to:

...

  • Rank the observations in ascending order (Excel has an icon that makes this simple);
  • In the column to the left of the observations, calculate the rank of the data: write a column of values 1, 2, … n;
  • In the column immediately to the right of the data, calculate the cumulative probability F(x) = rank/(n+1);
  • Use the data and F(x) columns (two neighboring columns ) as inputs to the Custom distribution, and specify in the "Define assumption" menu of the custom distribution that it is cumulative data.

 

Note that the minimum and maximum values of x only have any effect on the very first and last interpolating lines to create the Cumulative distribution, and so the distribution is less and less sensitive to the values chosen as more data are used in its construction.

 

Model Empirical Distributions provides an example.

Anchor
Cumulative Distribution
Cumulative Distribution

The links to the Cumulative Distribution software specific models are provided here:

Expand
titleCrystal Ball

Model Image AddedEmpirical Distributions 

...

(sheet Cumul) provides an example.


This model is generated using Crystal Ball's Custom Distribution. If you specify that the data are cumulative, Crystal Ball then constructs an empirical cumulative distribution by straight-line interpolation between the points defined on the curve.

 

It should be noted that for constructing this model in Crystal Ball, one should use the data and F(x) columns (two neighboring columns) as inputs to the Custom distribution, and specify in the "Define assumption" menu of the custom distribution that it is cumulative data.

Expand
title@Risk

Model Image AddedEmpirical Distributions (sheet Cumul) provides an example.

 

This is a distribution that @RISK offers that takes four parameters: a minimum, a maximum, a list of values, and a list of cumulative probabilities associated with those values. From these parameters, it then constructs an empirical cumulative distribution by straight-line interpolation between the points defined on the curve.

 

It should be noted that for constructing this model in @Risk, one should use the data and F(x) columns as inputs to the RiskCumul distribution, together with subjective estimates of what the minimum and maximum values might be.

 

 

Option 3: A histogram distribution

...

Sometimes (admittedly, not as often as we'd like) we have enormous amounts of random observations that we would like to construct a distribution from (for example, the generated values from another simulation). The Discrete Uniform and Cumulative options described above start to get a bit slow at that point, and model the variable in unnecessarily fine details. A more practical approach now is to create a histogram of the data and use that instead.

 

Model Empirical Distributions provides an example.

Anchor
Histogram Distribution
Histogram Distribution

The links to the Histogram Distribution software specific models are provided here:

Expand
titleCrystal Ball

Model Image AddedEmpirical Distributions (sheet Histogram) provides an example.

 

The array function FREQUENCY( ) in Excel will analyze a data set and say how many lie within any number of contiguous bin ranges. The Crystal Ball

...

distribution Histogram distribution, also constructed by

...

using Crystal Ball's Custom Distribution, needs three neighboring columns: the first has minimum possible value, the second the maximum possible value, and the third the bin frequencies (or probabilities), which is just the FREQUENCY( ) array.

Expand

...

title@Risk

Model 

...

Image AddedEmpirical

...

Distributions (sheet

...

Histogrm) provides an example.

 

The array function FREQUENCY( ) in Excel will analyze a data set and say how many lie within any number of contiguous bin ranges. The @RISK distribution Histogrm has three parameters: the minimum possible value, the maximum possible value, and an array of bin frequencies (or probabilities), which is just the FREQUENCY( ) array.

 

Creating an empirical joint distribution for two or more variables

For data that are collected in sets (pairs, triplets, etc), there may be correlation patterns inherent in the observations, and that we would like to maintain while fitting empirical distributions to data. An example is data of people's weight and  heightheight, where there is clearly  some some relationship between them.

 

Model Empirical Distributions provides an example.

Anchor
Empirical Joint Distribution
Empirical Joint Distribution

The links to the Empirical Joint Distribution software specific models are provided here:

Expand
titleCrystal Ball

Model Image Added Empirical Distributions (sheet Joint) provides an example.


A combination of

...

using Crystal Ball's Discrete Uniform distribution

...

 with an Excel VLOOKUP() or OFFSET( ) function allows us to do this easily.

Expand

...

title@Risk

Model 

...

Image Added Empirical Distributions (sheet Joint) provides an example.

...


A combination of using an IntUniform distribution with an Excel VLOOKUP() or OFFSET( ) function allows us to do this easily.