To learn more about EpiX Analytics' work, please visit our modeling applications, white papers, and training schedule.

Page tree


The table below lists all the Crystal Ball models available in ModelAssist. Links are provided to topics that discuss these models. Use Ctrl+F if you want to search for a particular Software feature or Excel function, or for a specific modeling technique. All models are unprotected to allow you to use the code for your own problems, but remember to first open Crystal Ball to be able to run the model. Note that all models are copyrighted: as a user of ModelAssist you are authorized to use the models with attribution, so please cite ModelAssist in your publications.


Model name

Topic link

Techniques used

Problem description

Distributions

Excel functions

Hypergeometric model

Binomial model

Direct calculation of probability

Numerical integration

Simulating a stochastic sequence

A bag contains 100 sweets, 20 of which contain arsenic. 20 people take 5 sweets each. You are one of them. A person eating just 1 arsenic sweet has a 50% probability of dying. A person eating 2 or more sweets will certainly die. Question 1: How many arsenic sweets will you get? Question 2: What is the probability you will die? Question 3: How many people will die? Question 4: How many sweets would you need to eat to have eaten 1 arsenic sweet? Question 5: How many sweets would you need to eat to have eaten 2 arsenic sweets?

Hypergeometric

Binomial

Discrete

IF

HYPGEOMDIST

COUNTIF

COMBIN

Gamma estimate of Poisson mean

Numerical integration

Different types of asphalt are considered for the resurfacing of a road from City A to City B. The current road has two types of asphalt, called Grey and Black. For the government, safety is the major consideration when choosing the type of asphalt that has to be used for resurfacing. The performance of asphalt of course depends to a large extend on the specific weather and climate circumstances. Both types of asphalt have been used for the current road and we can therefore compare the risks of an accident per mile of Grey asphalt with that of Black asphalt.

Gamma

GAMMADIST

IF

SUMPRODUCT

Loan default

Loss model

You are working for a bank that has a portfolio of 25 loans, ranging from $1,000 to $15,000 and belonging to seven different rating grades with long-term (historic) probability of default (PD) levels ranging from 0.5% to 5%. The short term PD is, however, influenced by a macro-economic factor (50%) and an idiosyncratic (random) factor (50%). What are the losses and their distribution parameters under the three different scenarios?

Beta

Binomial 

Exponential

Correlation Matrix Tool

VLOOKUP

Constructing a Dirichlet distribution

Compares a binomial and a beta-binomial distribution

Beta

Binomial


Constructs a classical statistics confidence distribution for a binomial probability


Cumulative Ascending

IF

BINOMDIST

Bootstrap/MoM fit

Performs a second-order fit of data to a Gamma distribution

Gamma

AVERAGE

VAR


Models a random walk where we impose a boundary condition on a stochastic time series.

Lognormal

IF

Poisson counts

Sum of random number of random variables

Output statistics generation in spreadsheet

Calculation of risk budget

You are a consultant, hired by a US beer brewery that makes a special beer exclusively for the UK market. The brewery has a contract for selling 1,000,000 cases next year at a price of 10 pounds per case. You are asked how much capital the firm needs. The firm is exposed to two types of risk; exchange risk and liability claims, but the firm does not want to use any insurance or hedging.

Beta

Normal

Poisson

EXP

IF

LN

ROUND

Hypergeometric sample

Numerical integration

I have three full packs of cards, and I draw 10 cards from each pack. What is the probability that I draw at least 10 hearts in total?

Hypergeometric

IF

Sum of random number of random variables

Numerical integration

Central Limit Theorem

You run a special forces division of your country's army. You need to recruit another 16 personnel. The selection process is very tough, and only 34% of people who have ever started the training have completed it. The selection process is also very expensive. It costs $10600 to start a person on the program and $3200 thereafter for each week of training. The training course lasts 13 weeks. A person who fails has equal probability of doing so at any point during the course. You have a budget of $2,200,000. What is the probability that this will be sufficient for your recruitment needs?

NegBinomial

Normal

Uniform

AVERAGE

IF

SQRT

Central Limit Theorem

Bernoulli

Normal

PERT

IF

Disaggregated

Poisson process

Clumps of cysts appear in water from a particular lake at a rate of 2.3 cysts per 1000m3. The number of cysts in a clump appears to take the distribution from empirical data shown in the table below. A filtration system is used that is able to capture these cyst clumps with varying efficiency depending on the clump size as shown in the third column of the table below. If a town draws its drinking water from this lake, uses the filtration system, and consumes 7600  m3 of water per year, how many cysts will be consumed? A cyst has a 25% probability of causing illness, how many people will get ill next year?

Binomial

Poisson

IF

Exercise in building a dynamic array

What is the distribution of the maximum number of heads I can get in a row by tossing a fair coin 10 times?

Binomial

IF

MAX

Combining divergent expert estimates of the same quantity

Three weather experts are estimating what the wind speed (in Beaufort) will be tomorrow. Each of them gives her own opinion on what values the parameter can take. Expert A thinks the speed will be minimal 3, most likely 4 and maximal 6. Expert C thinks it will be respectively 3, 7 and 10. Finally, expert B thinks it will be between 3 and 9, with a distribution that looks like the one on sheet "Expert B". We weight expert B's opinion twice as heavy as the opinion of A or C. Now, we are to combine their opinions into one estimate.

Beta

Discrete

PERT


Exercise in building a dynamic array

You need a replacement PC. The IT manager says there are 22 PCs stored in the basement, but 3 have bad hard disks only, two have bad motherboards only, and one has both a bad hard disk and motherboard. Of course, nobody can remember which ones. For reasons he alone understands, you can only take out one PC at a time, coming to him to ask for the key, and then returning it afterwards. Presuming you can dismantle PCs and rebuild them, how many trips will you have to make to the basement to get a working PC?

Binomial

AND

IF

OR

Contagious extreme value model and comparison with simulation

Model the extreme value distribution for the maximum explosion size, given the frequency of explosions and explosion intensity.

Exponential

Extremevalue

Poisson

IF

LN

MAX

Number of errors to achieve a success

Sum of resources expended to achieve success

A manufacture is trying to extrude a length of copper wire of 5 kilometres, and there is a certain failure rate which is described by .07 failures per kilometre. And if the failure occurs before we produced our 5 km. of wire, then we have to re-start again. Because for some reason we can't attach two pieces of wire together, we just need 5 km. of absolutely perfect wire. So we are trying to estimate the distribution for the total amount of wire we produce in kilometres in order to get our 5 kilometres of perfect wire and the distribution for the number of times we need to re-start the production.

Exponential

IF

Modeling correlated risks

We have a portfolio of risks (A to J), each of which is described by a PERT distribution. The risks are correlated in the following way: a) If A or C occurs, D increases in size by 50%; b) If A or D occurs, E increases to 45% probability and c) If all of B, C, E, F occur, H increases in size by 320% and to 13% probability. Our task is to calculate the total risk of the portfolio.

Binomial

Beta

AND

IF

OR

Modeling two correlated variables using the envelope method


Beta


Modeling correlation using a lookup table


Discrete

Triangular

HLOOKUP

Modeling project duration and cost from inter-related tasks

A new building is to be constructed by a consortium for a client. The project is divided into seven sections. The client wishes to see a risk analysis of the schedule and cost risks and how these relate to each other.

Discrete

Triangular

TODAY

MAX

A very simple project cost model

We need to build a cost model for a new bridge. All cost items are grouped into 8 categories. Three of these categories include a probability of occurrence of a particular event that will result in additional cost. Find the distribution of the total cost for this project.

Discrete

Triangular


Calculates the approximate sum of correlated random variables using a covariance matrix.

A demonstration of how covariance can be used to calculate the approximate sum of a number of number of correlated random variables, even when those variables are skewed.

Correlation Matrix Tool

Lognormal


COVAR

INDEX

SQRT

Shows how you can construct six different distributions with the Custom Distribution in Crystal Ball


Custom


Sum of random number of random variables

Numerical integration

Foci of cysts are randomly distributed in water with the average of 0.3 foci per litre. Each focus can contain from 1 to 5 cysts, with each value having the same probability of being true. Each cyst has a 20% probability of infecting a person. If I drink 12 litres of water, how many cysts will I consume? What is the probability of me getting infected from drinking 12 litres of water?

Integer Uniform

Poisson

IF

Generate a Dirichlet distribution

All cars in a city are divided into 9 different types. We have monitored 100 cars that were entering a particular motorway, and counted the number of cars of each type. What are the uncertainty distributions for the proportions of each type in a total number of cars?

Beta


Two methods of Dirichlet distribution construction.


Beta

Gamma


Two ways of determining the distance between an individual and its nearest, or next nearest, etc. neighbor.

It is known that the average density of herds in a region is 0.052/km2, but no information is available about the distribution of distance between these herds. We assume that the herds are randomly distributed, that their is no clustering in their position, and that the spread of a herd is very small compared to the distance between herds. What is the distribution of distance between a random herd and its nearest neighbor?

Exponential

Gamma

Uniform

IF

MIN

PI

SQRT

Distributing particles randomly scattered through a medium when the medium is divided up.

You are bottling 375 litres of wine into 500 bottles. There are in total 216 bacteria in the vat of wine. The task is to model the amount of bacteria in each bottle.

Binomial

IF

Constructing a Dirichlet distribution

Numerical integration

We have surveyed 175 people, asking them for which party they are intending to vote. The results are as shown below. Using the Dirichlet distribution and assuming that people don't change their mind between the poll and election time, answer the questions:

- How confident are we that SMP will win (get more votes than any other party)?

- If the SDP joins forces with the EDP, and the SMP joins forces with the PSM, how confident are we that SDP/EDP will get more votes than SMP/PSM?

Beta


MAX


An electricity re-seller is thinking of selling an option to its customers. The buyer of this option will have the right to purchase electricity at any time within a period of 10 days at the spot rate with a cap of USD 120/KWh, thus saving money if the electricity price rises above that figure. In order to calculate the price of this option, the company needs to estimate how much the spot price could exceed the cap and how much electricity it would sell to the option-holder when the latter decides to exercise the option.

Lognormal


IF

Three options to construct an empirical distribution


Cumulative Ascending

Discrete Uniform

Integer Uniform

Histogram

CONCATENATE

FREQUENCY

INDEX

VLOOKUP

Generating values from the Error distribution


General

ABS

EXP

GAMMALN

SQRT

classical statistics method of estimating the mean and the standard deviation for Normal distribution when neither are known.


Binomial

Gamma

Uniform

Student-t

AVERAGE

COUNT

SQRT

STDEV

classical statistics method of estimating the mean for Normal distribution when the distributions standard deviation is known


Normal

AVERAGE

COUNT

SQRT

classical statistics method of estimating the standard deviation of a Normal distribution when the distributions mean is known.


Gamma

COUNT

SQRT

SUMPRODUCT

Projected random time series with dominant shock component

Model exchange rate for 300 periods according to the formula: Xi+1=Normal(Xi,sigma*Xi). There is an election at period 100, with 3 possible outcomes (see the table below). The volatility of the exchange rate changes depending on the party that gets into power. If Conservatives get in, there's a 50:50 chance that sigma increases to 0.1 somewhere between period 180 and 200.

Discrete

Normal

Uniform

AND

IF

ROUND

VLOOKUP

Calculate the total worth of a retirement fund upon retirement.

You are trying to calculate the total worth of your retirement fund upon your retirement. You are 32-years-old now and the retirement age in your country is 60, but there is a 75% chance that it will be changed to 65 years. You contribute 5% of your salary to the retirement fund each year. Your annual salary this year is 20,000, and you expect it to rise with Lognormal(3%,1%) per year in real terms (above inflation). You estimate that the return on the pension fund will be minimum 3%, most likely 4% and maximum 7% (assuming a Pert distribution).

Binomial

Lognormal

Beta

IF

MAX

Offers a number of formats you can use so that a cell with the value 105,000, for example, will appear on-screen as 105k. You need only copy and paste cells from this file into your spreadsheet, or use Excels Format Painter tool to transfer the formats




Fit an Extreme Value minimum distribution to data


Extremevalue


A simple model to illustrate how to add uncertainty to the values of the components in the model by replacing a fixed value by a probability distribution describing our uncertainty.


Lognormal

Beta



A vast area of land in country A is below the average sea level. Dikes are built along the sea coast in order to protect the country from floods. Extreme wave-height and still water level are two very important factors for causing flooding along the sea coast. The scientists concerned with the safety of the land-space found out that a catastrophe can occur if the wave-height and still water level satisfy the following relation:
Catastrophe factor = 0.4 * wave-height [m.] + sea level [m.] > 6.2.
Taking into account the country’s geography, the area of flooded land-space can be calculated using the following formula:
LogNormal(1.38*Catastrophe factor^2, 3.52*Catastrophe factor^(1/4)) [100 sq. km.].
Giving that storms occur at a Poisson expected rate of 25 per year, what is the probability that within 5 years there will be a flood that will cover more that 7,000 sq. km?

Lognormal

Normal

EXP

IF

LOGNORMDIST

Generating a Frechet distribution


Uniform

LN

Numerical integration

Perform the following integral:

\int\limits_0^{10} x^3 ~ sin ^ 2 (x) ~ exp ~ \bigg(\frac{-1}{x}\bigg)dx

Uniform

EXP

IF

SIN

Numerical integration

Perform the following integral:

\int\limits_0^{10} \frac{\sin(x)}{log(x)} dx

Uniform

ABS

IF

LN

SIN

Numerical integration

Perform the following integral:

\int\limits_0^{10} x^3 ~ sin ^ 2 (x) ~ exp ~ \bigg(\frac{-1}{x}\bigg)dx

Uniform

EXP

IF

SIN

Bayesian inference

Informed prior

Hypergeometric sampling

20 people are randomly picked off a city street in France. Whether they are male or female is noted on 20 identical pieces of paper, put into a hat and the hat is brought to me. I have not seen these 20 people. I take out five pieces of paper from the hat and read them - three are female. I am then asked to estimate the number of females in the original group of twenty.


BINOMDIST

HYPGEOMDIST

IF

OR

Number of samples required in a random sampling of a population to obtain a particular minimum set.

Normal approximation to the NegBin

You are a government body doing research into the effects of marriage and smoking on peoples health. You are doing a random telephone survey and you require 50 people from each of the four possible categories. From previous studies you know that 32% of people agree to participate in this type of survey when called. How many calls will you need to make, given that previous studies show the population to be split into the four categories.

Binomial

NegBinomial

AND

IF

Simple project duration model

In order to build a hospital, a construction firm needs to finish several stages of different duration. What is the total duration of the project?

Beta

PERT


Constructing an Inverse Hypergeometric distribution and generating values


Discrete

COMBIN

Illustrating an incorrect way to use the Beta distribution to model variability in a Binomial probability

The probability of a trials success is a random variable and each trials probability is independent of the others. Model the probability for each trial separately as a random variable and calculate the total number of successes.

Beta

Binomial


Sum of random number of random variables

Numerical integration

Insurance

Hedging

Your company is exposed to two types of risk, market risk through price and volume fluctuations and risk of fire. Currently, the company does have insurance and you are asked to evaluate if this is the best way going forward.

Lognormal

Beta

Poisson

IF

TREND

Inverse Gaussian
InvGaussian

Jacknife analysis


Discrete Uniform

AVERAGE

COUNT

SQRT

STDEV

Array formulae

Constructs a Laplace distribution


Exponential

SQRT

Comparing uncertain quantities

Binomial estimation of p

Numerical integration

A performance measure for lawyers is the percentage of cases he or she has won. Let's say a person is looking for a lawyer to defend his case and would like to choose the one with the highest performance. There are three lawyers in the area who are knowledgeable in the field and their past performances are known. Assuming that each lawyer's cases were random samples from the same population of cases (so we can assume a Binomial process) since all three lawyers work in the same field, whom should the person choose as his lawyer?

Beta

BETADIST

Model a time series based on a relationship with a leading indicator variable


Normal

IF

INTERCEPT

ISNA

NA

RSQ

SLOPE

FALSE

Reliability of a set of components

A piece of electronic equipment is composed of six components A to F. The mean times between failures for each of the components are given below. The components are in serial and parallel configuration as shown on the chart. What is the probability that the machine will fail within 250 hours?

Exponential

MAX

MIN

Constructs a Logarithmic distribution


Discrete

LN

Model a lognormal random walk


Lognormal


Exercise to allow you to try out various modeling techniques

You are asked to determine an NPV for 10 years of cashflow (with no residual value, e.g. lease end), discounted at 8.5% on revenue converted back to sterling, for a new store in Times Square, New York to be opened in 2005

Discrete Uniform

Normal

Beta

EXP

TODAY

TREND

YEAR


You are an R&D company, that is planning to commercialize a new technology for oil refineries to help them meet a new EPA NOx emission requirements that will come into effect 1 year from now. However, you only want to develop this technology further if there are at least 55 refineries in the US that could benefit from your technology. Recently, a competing technology was brought in the market (technology X) that also helps refineries meet these emissions and quite a few refineries are already using this technology (before start of EPA requirements), but it is a lot more expensive. You are convinced that a refinery that used technology X is as likely to buy your new technology (if you develop it) as a refinery that doesn't use any technology yet. The total market size for your new technology (Y) is therefore any refinery that currently doesn't meet EPA requirements plus any refinery that used technology X. What is the probability that this total market is more than 55 refineries?

Uniform

AND

IF

Models the ratings and defaults of the portfolio over time with a Markov Chain model
Binomial

TRANSPOSE

Array Functions

IFERROR

Checks whether sufficient iterations have been run for a desired level of model output accuracy

How many iterations are needed to be confident that the true mean is within Y of my models output mean.


IF

NORMSDIST

analyze historic data for a random time series exhibiting mean reversion and then produce a forecast


Normal

Discrete Uniform

EXP

INTERCEPT

SLOPE

FALSE

Array formulae

Bayesian inference, estimating MTTF

Constructing posterior distribution for simulation

10 components of the same type are run for 3000 hours and 4 fail within that time. Assuming that the components have a constant instantaneous failure rate, what is the MTTF?

General

BINOMDIST

EXPONDIST

Construct a Multinomial distribution

All cars in a city are divided into 9 different types. We know the proportion of each type in a total number of cars. If we were to monitor 1000 cars that enter a particular motorway, how many cars of each type would we see?

Binomial

IF

Construct a Multivariate Hypergeometric distribution

There are 100 colored balls in a bag, from which 10 are red, 15 white, 20 blue, 25 green and 30 yellow. Without looking into the bag, you take 30 balls out. How many balls of each color will you take from the bag?

Hypergeometric

IF


Model sales of a new product that will either do very well, poorly or something in between.

Lognormal

Poisson

Triangular

BETADIST

Non-parametric Bootstrap technique to calculate the statistics of interest


Discrete Uniform

AVERAGE

PERCENTILE

STDEV

Non-parametric statistical estimate of difference in means between two populations


Discrete Uniform

AVERAGE

Non-parametric Bootstrap regression

Estimate regression parameters without assuming normally distributed residuals

Discrete Uniform

COUNT

FORECAST

INTERCEPT

SLOPE

SQRT

TREND

Fit a continuous non-parametric 1st order distribution to small and large data sets


Cumulative Ascending

COUNT

MAX

MIN

PERCENTILE

Fit a continuous non-parametric 2nd order distribution to small and large data sets


Beta

Cumulative Ascending

PERT

Uniform

COUNT

Exercise to allow you to try out various modeling techniques

You are evaluating a new company making fuel cells for hospital power plants. There are various uncertainties influencing the cashflows. Calculate the NPV of a project.

Binomial

Discrete Uniform

General

PERT

Triangular

AND

IF

Valuation model with three sheets which show how multiple risks are accounted for, including launch timing risk.A product launch opportunity is being evaluated with and without uncertainty about the assumptions, and with risk of launch delay using the envelope method to correlated variables. The NPV of the business opportunity is calculated for each model.PERT

NPV

MAX

MIN

IF

SUMIF

MaxExtreme distribution

During the construction of an offshore oil platform, we need to evaluate the yearly probability that the oil platform fails. The platform has 6 beams with individual strength distributions that can be described by a Weibull (0, 36.5, 20). The maximum yearly stresses on the platform (e.g. wind) can be described by a MaxExtreme (7.5, 5) and are highly correlated as they share the same environment. If 2 or more beams break, the oil-platform fails. What is the yearly probability that the platform fails?

Weibull


IF

Renewal process

Number of random variable to reach a defined sum

It is known that a certain type of light bulb has a lifetime that is Weibull (1.3,4020) hours distributed. If I have one light bulb working at all times, replacing each failed light bulb immediately with another, how many light bulbs will have failed in 10 000 hours?

Weibull

IF

Parametric Bootstrap

Calculate the statistics of interest for a Lognormal distribution fitted to a data set

Lognormal


AVERAGE

PERCENTILE

STDEV


You  have data on the amount of Paratetramol that people have taken. The Paratetramol will have come in pills of specific doses (e.g. 25mg), so the observations will take 25mg steps. How to fit a Negative Binomial distribution to this data?NegBinomial

Allocating particles in suspension to sub-divisions of equal, and unequal sizes

A number of particles N is distributed in some relatively large liquid volume V. This volume is divided into n volumes. What is the joint probability distribution of the number of particles in each division?

Binomial

IF

Bayesian inference through accept/reject simulation

Several interlinked stochastic processes

A herd of 25 cows is suspected of having disease X. You are going to look for this disease by taking samples from the cow pats they produce and test for antibodies to X. On average, this breed of cow produces 12 pats per day per cow. 10 fresh pats (i.e. produced that day) are tested. 2 test positive for contamination.

The test used is not perfect. If the cow pat contains antibodies to X there is a 90% probability (bases on a previous study with 18/20 success rate) that the test will come up positive. On the other hand, if the cow pat has no antibodies to X, here is still a 5% chance (based on a previous study with a 1/20 failure rate) that the test will be positive. How many of the cows are infected?

Beta

Binomial

Hypergeometric

Uniform

Poisson

IF



Gamma

Pearson5




Beta

Pearson6


Calculating (rather than simulating) the probability that a sum of random variables exceeds some target

The elevator is able to lift 500 kg, and if this limit is exceeded, it breaks down. The task is to calculate the optimal number of people allowed to enter the elevator at a time so that the risk of breaking the lift is acceptable.


NORMDIST

SQRT

Illustrates the steps to perform Monte Carlo Simulations in Crystal Ball.


Beta


Continuously calculates the number of iterations needed to achieve the desired level of accuracy for an output percentile

How many iterations are needed to be confident that the true X percentile is within Y of my models output percentile.


NORMSDIST

ROUNDUP

Sum of random number of random variables

Poisson process

A company insures airplanes. They crash at a rate of 0.23 crashes per month. Each crash costs $Lognormal(120,52) million. What is the distribution of cost to the company for the next five years?

Lognormal

Poisson

VLOOKUP

Poisson process with time log

Sum of random number of random variables

A company insures airplanes. They crash at a rate of 0.23 crashes per month. Each crash costs $Lognormal(120,52) million. What is the distribution of the value of the liability if discount it at the risk free rate of 5%?

Exponential

Lognormal

IF

Construct a classical statistics confidence distribution for the Poisson parameter


Cumulative Ascending

IF

POISSON

Poisson random walk with trend

A model that can be used, for example, to describe vehicle accident claims made to an insurance company, or cases of a disease for a health authority as the number of cars increases, the number of car crashes increases correspondingly according to some function; as the pollution level in a city increases, the number of people with respiratory disease increases

Poisson


Poisson random walk with trend and seasonality

An insurance company needs to model the number of car crashes in the country over the next 52 weeks period. The projection should include two factors: a) 10% annual upward trend due to the increase in the total number of cars in the country and b) seasonality factor - higher probability of a car crash during a winter season.

Poisson


Modeling continuous random process

4 pumps, A to D, serve a power station. In order to work, the station needs 2 working pumps. Each pump has a probability of failure in a day, which depends on number of pumps working. Each repair of the pump takes LN(20,15) days. How long will it take before a shutdown (before 3 of 4 pumps break down)? How many shutdowns will the station have per year? What is the probability of one or more shutdowns per year?

Exponential

Lognormal

AND

HLOOKUP

IF

LN

OR

Numerical integration

Multiple scenarios

Your company produces specialist power units. We deliver these units to the client in batches of ten. The client has a quality control procedure for each consignment, as follows: Three units are tested. If two or more of these samples are defective, the consignment is rejected. If one is defective, another three are tested, and if any of these second set are defective the consignment is also rejected. We want to construct a model that looks at the risk of rejection of a consignment for different numbers of defective power units.

Hypergeometric


IF

MAX

Bayesian inference

Prevalence estimate with imperfect test

We are doing a random survey of people in some region to determine the prevalence of infection with a particular micro-organism. The test consists of randomly selecting a person, drawing a sample of their blood and looking for antibodies. The test is imperfect, however, as there remains a possibility that antibodies may not be detected in the blood of an infected person, and that antibodies may be misidentified in samples from non-infected people. Knowing Se - the probability that an infected person will test positive and Sp - the probability that a non-infected person will test negative, estimate true population prevalence.

General

BINOMDIST

Estimating combined risk from several risk events

Probability of exceeding some threshold

Numerical integration

Risks A to J are ranked in descending order of expected cost. If a loss of $500,000 or more would ruin your company, how would you calculate the probability of ruin?

Binomial


Time series with cyclical shock

People are dying from some health problem X at a current rate of 88/month, and the rate appears to be increasing by 1.3 deaths/month with each month. However, a cure for this problem appears to be close to being finished. Estimates are that the cure will be available between 20 and 50 months from now, but nobody can say that any duration is more likely than any other. Tests so far show that it is expected that there is only a 30% chance the cure will work for a treated person.

Forecast the number of deaths there will be for the next 100 months if the cure is approved and if not approved.

Binomial

Poisson

IF

Exercise to allow you to try out various modeling techniques

Your company has been invited to bid for a project to design and construct a line of 16 electricity pylons. Estimates are based on PERT distributions (using absolute minimum, most likely and absolute maximum) unless indicated. Produce distributions of the total project cost and duration. Use a scatter plot to illustrate the relationship between the projects cost and duration. Plot iterations above and below 35 weeks separately on the same graph and fit regression lines through the two data sets.

Beta

Binomial

Discrete

Normal

PERT

IF

ROUNDUP

Exercise to allow you to try out various modeling techniques

Correlation matrix

Your company has been invited to bid for a project to design and construct a line of 16 electricity pylons. Estimates are based on PERT distributions (using absolute minimum, most likely and absolute maximum) unless indicated. Rerun the model Pylon, but make the design time and cost 50% correlated and the time to do the foundations 40% correlated with the associated plant cost and 50% correlated with the total foundation man hours. What is the effect of including these correlations?

Beta

Correlation Matrix Tool

Discrete

Normal

PERT

IF

ROUNDUP

Using Visual Basic macros with Crystal Ball

A post office has one counter that it recognizes is insufficient for its customer volume. It is considering putting in another counter and wishes to model the effect on the maximum number in a queue at any one time, which is considered to be a measure of the quality of its service. The post office is open from 9am to 5pm each working day. People arrive throughout the day at an average rate of 1 every 12 minutes. The amount of time it takes to serve each person is Lognorm (29,23) minutes. What is the maximum queue size in a day?

Discrete

Exponential

Lognormal

Poisson


Additive random walk with no memory

Model a random walk: St = St-1+ Normal(m,s)

Normal


Multiplicative random walk with no memory

Model a random walk: St = St-1* Normal(1+m,s)

Normal


Mixing calculation and simulation in risk models

You have determined that a raw egg has a 0.2% probability of being contaminated with Salmonella, that there is a 3.5% probability that a random egg would be consumed raw, a contaminated raw egg contains Poisson(50) bacteria, and that the probability a person would be ill is =1-Exp(-bacteria/5308)^-0.4059. You are asked to calculate the probability that a random egg will cause someone to become ill.

Binomial

Poisson

EXP

POISSON

SUMPRODUCT

Using optimization (Excels Solver) to obtain an MLE fit to a distribution

Find the parameter of a Rayleigh distribution that will best match the observed 18 data points.


ABS

EXP

MIN

Solver

Real option evaluation

Same model as  NPV of a capital investment, but with the addition of a real option to be evaluated

Discrete Uniform


Using recursive formulae to construct a distribution

Illustrates a method to construct a Binomial(1000000, 0.000001) distribution.

Discrete


Regression Bootstrap

Use of INDEX function

Illustrates the Parametric regression Bootstrap technique using Excels INDEX function to calculate the statistics of interest for two dependent datasets.

Integer Uniform

COUNT

INDEX

INTERCEPT

SLOPE

STEYX

Regression Bootstrap

Use of OFFSET function

Illustrates the Parametric regression Bootstrap technique using Excels OFFSET function to calculate the statistics of interest for two dependent datasets.

Integer Uniform

COUNT

INDEX

INTERCEPT

SLOPE

STEYX

Regression Bootstrap

Use of VLOOKUP function

Illustrates the Parametric regression Bootstrap technique using Excels VLOOKUP function to calculate the statistics of interest for two dependent datasets.

Integer Uniform

COUNT

INTERCEPT

SLOPE

FALSE

VLOOKUP

Modeling impact of correlated risks

Model a portfolio of risks that can influence the base estimated project cost and calculate the total cost of a project including inflation. There are 5 risk factors, some of which are correlated, plus the risk coming from the inflation. The probability of a strike doubles if the "bad weather" factor occurs. If both the "H&S" and "Strike" factors occur, the "Insolvency of sub-contractor" risk factor rises from 5% to 75%.

Beta

Binomial

Triangular

AND

IF

Poisson process

Summing large random number of random variables using Central Limit Theorem

A small shop wishes to estimate the revenue for the next year. Historical data shows that the total number of people passing the store is equal to 270,000 per year. There is a 3% probability that a passing the store person will enter the shop, and 10% probability that the person that entered the shop will make a purchase. The purchase size is distributed normally with mean - $12.71 and standard deviation - $7.27. What is the distribution of the annual revenue?

Normal

Poisson

SQRT

Model the sales over a period where it is known that there is a finite market for the product.


Binomial

Normal

Triangular

ROUND

SQRT 

Project schedule modeling with risks

There are three identified risks that can influence the duration of certain stages of the project. The GANTT chart below shows the diagram of the project schedule. Calculate the distribution of the total duration for this project.

Binomial

Triangular

MAX

Additive and multiplicative seasonalized forecasts with trend.


Normal


Time series projection of events occurring randomly in time with seasonality and cyclical shock

An insurance company needs to model the number of old people deaths over the next 52 weeks period. The projection should include three factors: a) 5% annual downward trend due to the improvement of medical care, b) seasonality factor - higher probability of a death during summer and winter, c) economic factor - the country is expecting a sharp increase in pension in June if the law is passed through parliament which should help pensioners with heating and food, but the resultant reduction in the death rate is uncertain.

Binomial

Uniform

Beta

Poisson

IF

MIN

MAX

Illustration of incorrect modeling of sum of random variables

An individual share price movement follows a Normal(m,s) distribution. Model the distribution of the change in share price after n number of turns.

Normal

SQRT

Number of random variables to reach a target

The farm sells sheep to the slaughterhouse each week by supplying the certain amount of kilograms. The task is to calculate the number of sheep that the farm needs to supply in order to make the total weight of the sheep equal the required number.

Lognormal


IF

MAX

Number of random variables to reach a target with a particular probability

How many sheep is needed to supply to be 95% sure of producing at least 10 tons

Lognormal

AND

IF

Simple project duration model

In order to build a silo, a construction firm needs to finish several inter-related stages of different duration. What is the total duration of the project?

Triangular

MAX

Simulating a stock price with a random walk model and valuation of a call option with the Black-Scholes model.

Lognormal

Normal

EXP

SQRT

MAX

NORMSDIST

Two ways of simulating a risk event with a random impact

Lets imagine that there will be an election this year and that two parties are running: the Socialist Democrats Party and the Democratic Socialists Party. The SDP are currently in power and have vowed to keep the corporate tax rate at 17% if they win the election. Political analysts reckon they have about a 65% chance of staying in power. The DSP promise to lower the corporate tax rate by one to four percent, most probably 3%. Model the next years corporate tax rate.

Beta

Binomial

Discrete

PERT



Splicing two distributions with smooth transfer function

Model to splice a Pareto distribution onto a Gamma distribution to extend its tail.

General

GAMMADIST

LOG

NORMDIST

Splicing two distributions by using a truncation to give the smooth transition

Model to splice a Pareto distribution onto a Gamma distribution to extend its tail.

General

GAMMADIST

Model to determine the probability that stress on a component exceeds its strength, and therefore causes it to fail


Gamma

IF

A template to plot a summary chart in Excel after exporting the required output statistics from Crystal Ball into Excel.




Modeling a parallel renewal process

It is known that a certain type of light bulb has a lifetime that is Weibull(1.3,4020) hours distributed. If I have 10 light bulbs going at all times, how many will fail in 1000 hours?

Weibull

IF 

Bayesian inference

Capture-release-recapture experiment

Hypergeometric sampling

A game warden on an island covered in jungle would like to know how many tigers she has on her island. She arranges for a capture-release-recapture survey to be carried From the experiments results, estimate how many tigers are there on the island.

Discrete

IF 

HYPGEOMDIST

Estimating total cost of Poisson failures at different rates. Note this is a VC/RS/US model

In 79 weeks, a company has observed 21 transaction failures. Each transaction failure costs are shown in the table below. What is the cost for these transaction failures next year?

Gamma

Poisson

SUMPRODUCT

Comparison of several uncertain binomial probabilities

Numerical integration

A government agency is charged with regulating the prescription of drugs. A pharmaceutical company have developed a new drug useful in the control of disease X. In field trials they have tried various dose:duration combinations with randomly selected individuals suffering from X. The results are shown in a table. How confident are we that combination D is the most effective of the four options? Is it plausible that combination A is actually better?

Beta

IF

MAX

A template to produce a Tukey Box Plot, which iscommonly used to represent variations between data sets.




Bayesian inference

Numerical integration

Hyperparameters

Constructing posterior

Your company manufactures gas turbines for power stations. One of the key performance characteristics is that the turbine blades do not, with high probability, develop micro-fractures beyond size X within an operation period T. There are 50 blades in a turbine. You test 18 turbines for the required period T, and perform an inspection of the blades using a sonic scanner. The inspection method is not foolproof: it has a 20% chance of failing to detect a fracture that is there. Your study identified 3 fractures. What is the probability that a turbine satisfies the performance requirement?

Contacting the manufacturer of the sonic scanner, you find that actually they are not so sure about this 20% failure rate: it is based on a study where 4 of 5 fractures were detected. How does this affect your estimate of the probability that a turbine satisfies the performance requirement?

Beta

General


BINOMDIST

Bayesian inference

Numerical integration

Hyperparameters

Simulating posterior

Beta

NegBinomial


Part 1 of a simple example of a VC/RC/US model: randomness R only

A VC/RC/US model, where the probability of each possible outcome is explicitly calculated (RC), the range of variability is incorporated by making separate calculations (VC) for each value of the variability parameter(s), and the uncertainty is simulated (US).


POISSON

SQRT

Array formulae

Part 2 of a simple example of a VC/RC/US model: randomness R and uncertainty U

An extension to model  VC RC US - 1: randomness and uncertainty

Gamma

POISSON

SQRT

Array formulae

Part 3 of a simple example of a VC/RC/US model: randomness R, uncertainty U and variability V

An extension to model  VC RC US - 2: randomness, uncertainty and variability



Simple example of a VC/RS/UL model part 1: generating values for uncertain parameters and

2: using generated values for uncertain parameters in loop

Variability is calculated, Randomness is simulated and Uncertainty is simulated in second loop (VC/RS/UL model)

Gamma

Poisson



Simple example of a VC/RS/US model

Variability is calculated, Randomness and Uncertainty are simulated (VC/RS/US model). Example 1.

Gamma

Poisson


Extension to  VC RC US - 1 model showing how additional variability extends complexity quickly

Variability is calculated, Randomness and Uncertainty are simulated (VC/RS/US model). Extension to the previous example with additional levels of variability.

Gamma

Poisson




Beta

Binomial

BETADIST

Simple example of a VS/RS/US model

Variability, Randomness and Uncertainty are simulated together (VS/RS/US model). Example 1

Gamma

Uniform

Poisson

COUNT

INDEX

Alternative approach to  VS RS US - 1

Variability, Randomness and Uncertainty are simulated together (VS/RS/US model). Example 2

Gamma

Poisson


Simulation thinking

Graphical illustration of model

Two people agreed to meet under a clock between 1pm and 2pm. Each agreed to wait 20 minutes for the other. What is the probability that they meet?

Uniform



Generating random values from a distribution using formula inversion method

Illustrates the technique of generating one of the three standard extreme value distributions - a Weibull distribution

Uniform

LN

Finding MLEs for a distribution fitted to complete data using optimization

Illustration of MLE fitting to the Weibull distribution using complete data


LOG

WEIBULL

Array formulae

Solver

Bayesian inference for two correlated parameters

Likelihood function for complete data

Determining the joint uncertainty distribution for parameters of a Weibull distribution where we have values for all observations

General

EXP

INDEX

HLOOKUP

LOG

WEIBULL

Array formulae

Finding MLEs for a distribution fitted to interval censored data using optimization

Illustration of MLE fitting to the Weibull distribution using  interval censored data


LOG

WEIBULL

Array formulae

Solver

Bayesian inference for two correlated parameters

Likelihood function for interval censored data

Determining the joint uncertainty distribution for parameters of a Weibull distribution where we only know the frequency of values within (usually contiguous) bin ranges.

General

EXP

INDEX

HLOOKUP

LOG

WEIBULL

Array formulae

Finding MLEs for a distribution fitted to left censored data using optimization

Illustration of MLE fitting to the Weibull distribution using  left censored data


LOG

WEIBULL

Array formulae

Solver

Bayesian inference for two correlated parameters

Likelihood function 

for left censored data

Determining the joint uncertainty distribution for parameters of a Weibull distribution where we only have values for observations above a threshold.

General

EXP

INDEX

HLOOKUP

LOG

WEIBULL

Array formulae

Finding MLEs for a distribution fitted to right censored data using optimization

Illustration of MLE fitting to the Weibull distribution using  right censored data


LOG

WEIBULL

Array formulae

Solver

Bayesian inference for two correlated parameters

Likelihood function for right censored data

Determining the joint uncertainty distribution for parameters of a Weibull distribution where we only have values for observations below a threshold.

General

EXP

INDEX

HLOOKUP

LOG

WEIBULL

Array formulae

Binomial process

Poisson process

Simple simulation

There is a 5% chance of incorrectly setting up a machine to produce widgets resulting in a bad batch of out of tolerance widgets 10 machines are set up on a days production run. Each machine will produce Poisson(250) widgets. What fraction of my production will be out of tolerance?

Binomial

Poisson


Bayesian inference

Binomial probability estimate

Numerical integration

Graphical presentation of model

Two wine experts are each asked to guess the year of 20 different wines. Expert A guesses 11 correctly, while expert B guesses 14 correctly. How confident can we be that Expert B is really better at this exercise than Expert A?

Beta


Binomial process

Poisson process

Simple dose-response model

20 people are invited to a party, where some wine, contaminated with bacteria, is being served. Let's say we know that there are on average 7 bacteria per litre of wine, and one glass contains 150 ml of wine. 12 out of 20 people decided to drink one glass of wine each.

Let's also say that we know that 7 people at the party are allergic to this particular bacterium, so if they consume one or more bacteria they have a higher chance of getting ill than those who are not allergic. The probabilities of getting ill for both allergic and non-allergic people are shown in the table below. How many people are going to get ill?

Binomial

Poisson

VLOOKUP



  • No labels