A herd of 25 cows is suspected of having disease X. You are going to look for this disease by taking samples from the cow pats they produce and test for antibodies to X.

 

On average, this breed of cow produces 12 pats per day per cow.

 

Ten "fresh" pats (i.e. produced that day) are tested. Two test positive.

 

The test used is not perfect. If the cow pat contains antibodies to X, there is a 90% probability (based on a previous study with an 18/20 success rate) that the test will come up positive. On the other hand, if the cow pat has no antibodies to X, there is still a 5% chance (based on a previous study with a 1/20 failure rate) that the test will be positive.

 

How many of the cows are infected?

 

This is a great example on which to practice your knowledge of stochastic processes, even if you are not too keen on the subject matter! We have nested binomial, Poisson and Hypergeometric processes all at work. Plus, it is a real world problem (this technique is actually used in veterinary science). Due to the complexity of the stochastic process that runs from the parameter we wish to estimate, to the observations we have, we use a simulation approach to Bayesian inference. This example is available in the simulation model: Pats.

Links to the software-specific versions of the Pats model are provided here:

  Pats

 

 

Explanation of model:

 

The model uses the following notation:

 

M = Herd size = 25

Lambda = pats/cow/day = 12

Tested = number of pats tested = 10

Positive = number of tested pats that were positive = 2

Se = P(test positive if pat is infected) = Beta(18+1,20-18+1,1) from binomial theory

Sp = P(test negative if pat is not infected) = Beta(19+1,20-19+1,1) from binomial theory
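
The Beta parameters above follow the usual pattern for a binomial probability estimated from s successes in n trials with a uniform prior: Beta(s+1, n-s+1). The short Python sketch below only illustrates that arithmetic; the helper names are hypothetical and not part of the Pats model.

```python
import random

def beta_params(successes, trials):
    """Beta(s+1, n-s+1): uncertainty about a binomial probability
    after s successes in n trials, assuming a uniform prior."""
    return successes + 1, trials - successes + 1

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Sensitivity: 18 of 20 infected pats tested positive in the earlier study.
SE_PARAMS = beta_params(18, 20)
# Specificity: 19 of 20 non-infected pats tested negative (1/20 failure rate).
SP_PARAMS = beta_params(19, 20)

# One random draw of each, as a simulation iteration would use.
rng = random.Random(0)
se_draw = rng.betavariate(*SE_PARAMS)
sp_draw = rng.betavariate(*SP_PARAMS)
```

Note that the means of these distributions (19/22 and 20/22) are slightly lower than the raw study proportions of 90% and 95%, because the uniform prior pulls the estimate toward 0.5.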

 

BadCows = Prior distribution for infected cows = ROUND(Uniform((0-0.5),(M+0.5)),0)

A discrete uniform distribution representing an uninformed prior

InfectedPats = Number of "fresh" infected cow pats

=IF(BadCows=0,0,Poisson(BadCows*Lambda))

The IF statement is there to prevent Crystal Ball's Poisson distribution from giving an error in this cell when its mean = 0. The Poisson distribution is modeling the number of pats that are laid by infected animals (BadCows), assuming that pats are laid randomly in time and that each cow lays on average "Lambda" pats per day.

NotInfectedPats = Number of "fresh" pats from cows not infected

=IF(M=BadCows,0,Poisson((M-BadCows)*Lambda))

The IF statement is there to prevent Crystal Ball's Poisson distribution from giving an error in this cell when its mean = 0. The Poisson distribution is modeling the number of pats that are laid by non-infected animals (M-BadCows), assuming that pats are laid randomly in time and that each cow lays on average "Lambda" pats per day. It assumes that cows lay the same average number of pats whether infected or not (maybe a big assumption).

TotalPats = Total number of pats produced that day

= InfectedPats + NotInfectedPats

InfectedInSample = Number of infected pats in the sample

=IF(InfectedPats=0,0,Hypergeometric(InfectedPats/TotalPats,Tested,TotalPats))

Recognises that this is a Hypergeometric sample from a total of "TotalPats", of which "InfectedPats" are infected, and "Tested" are sampled from this population.

TruePos = Number of infected pats that tested positive

=IF(InfectedInSample=0,0,Binomial(Se,InfectedInSample))

Each infected pat has probability Se of testing positive. The IF statement is there to prevent Crystal Ball's Binomial distribution from giving an error in this cell when the number of trials = 0.

FalsePos = Number of non-infected pats that tested positive

=IF(Tested-InfectedInSample=0,0,Binomial(1-Sp,Tested-InfectedInSample))

Each non-infected pat has probability (1-Sp) of testing positive.

Posterior = estimated number of infected cows in herd

=IF(TruePos+FalsePos=Positive,BadCows,NA())

Accepts values from the prior "BadCows" if the total number of positive pats (TruePos+FalsePos) equals the number observed, "Positive". The NA function generates an error that is not then processed by Crystal Ball (i.e. plays no part in calculation of output statistics) beyond noting that the error was produced.
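
The chain of cells above can be sketched outside a spreadsheet. The Python below is an illustrative re-implementation using only the standard library, not the Pats model itself: the function names, the simple Poisson sampler, and the decision to reject an iteration when fewer than "Tested" pats are produced are all assumptions of this sketch. Returning None plays the role of NA().

```python
import math
import random
from collections import Counter

M, LAMBDA, TESTED, POSITIVE = 25, 12, 10, 2

def poisson(rng, mean):
    """Poisson draw by Knuth's multiplication method (fine for means of a few hundred)."""
    if mean == 0:
        return 0  # same guard as the IF statements in the model
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def binomial(rng, trials, p):
    """Binomial draw as a sum of Bernoulli trials; returns 0 when trials = 0."""
    return sum(rng.random() < p for _ in range(trials))

def simulate_once(rng):
    """One iteration: returns BadCows if the simulated data match, else None."""
    se = rng.betavariate(18 + 1, 20 - 18 + 1)
    sp = rng.betavariate(19 + 1, 20 - 19 + 1)
    bad_cows = rng.randint(0, M)                        # discrete uniform prior
    infected_pats = poisson(rng, bad_cows * LAMBDA)
    not_infected_pats = poisson(rng, (M - bad_cows) * LAMBDA)
    total_pats = infected_pats + not_infected_pats
    if total_pats < TESTED:
        return None  # assumption of this sketch: cannot sample, so reject
    # Hypergeometric step: sample pats without replacement; indices below
    # infected_pats stand for the infected ones.
    sample = rng.sample(range(total_pats), TESTED)
    infected_in_sample = sum(1 for i in sample if i < infected_pats)
    true_pos = binomial(rng, infected_in_sample, se)
    false_pos = binomial(rng, TESTED - infected_in_sample, 1 - sp)
    return bad_cows if true_pos + false_pos == POSITIVE else None

def posterior(iterations=20000, seed=1):
    """Relative frequency of accepted BadCows values over all iterations."""
    rng = random.Random(seed)
    accepted = Counter()
    for _ in range(iterations):
        result = simulate_once(rng)
        if result is not None:
            accepted[result] += 1
    n = sum(accepted.values())
    return {k: accepted[k] / n for k in range(M + 1)}

if __name__ == "__main__":
    for cows, prob in posterior().items():
        print(cows, round(prob, 3))
```

Only a fraction of iterations are accepted, so a run needs enough iterations for the accepted values to give a stable histogram.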

 

 

The simulation model with a ROUND(Uniform(-0.5,25.5),0) prior produces the following posterior:

 

 

  Pats

 

 

Explanation of the model: 

 

The model uses the following notation:

 

M = Herd size = 25

Lambda = pats/cow/day = 12

Tested = number of pats tested = 10

Positive = number of tested pats that were positive = 2

Se = P(test positive if pat is infected) = Beta(18+1,20-18+1) from binomial theory

Sp = P(test negative if pat is not infected) = Beta(19+1,20-19+1) from binomial theory

 

BadCows = Prior distribution for infected cows = IntUniform(0,M)

A discrete uniform distribution representing an uninformed prior

InfectedPats = Number of "fresh" infected cow pats

=IF(BadCows=0,0,RiskPoisson(BadCows*Lambda))

The IF statement is there to prevent @RISK's Poisson distribution from producing an error when its mean = 0. The Poisson distribution is modeling the number of pats that are laid by infected animals (BadCows), assuming that pats are laid randomly in time and that each cow lays on average "Lambda" pats per day.

NotInfectedPats = Number of "fresh" pats from cows not infected

=IF(M=BadCows,0,RiskPoisson((M-BadCows)*Lambda))

The IF statement is there to prevent @RISK's Poisson distribution from producing an error when its mean = 0. The Poisson distribution is modeling the number of pats that are laid by non-infected animals (M-BadCows), assuming that pats are laid randomly in time and that each cow lays on average "Lambda" pats per day. It assumes that cows lay the same average number of pats whether infected or not (maybe a big assumption).

TotalPats = Total number of pats produced that day

= InfectedPats + NotInfectedPats

InfectedInSample = Number of infected pats in the sample

=IF(InfectedPats=0,0,RiskHypergeo(Tested,InfectedPats,TotalPats))

Recognises that this is a Hypergeometric sample from a total of "TotalPats", of which "InfectedPats" are infected, and "Tested" are sampled from this population.

TruePos = Number of infected pats that tested positive

=IF(InfectedInSample=0,0,RiskBinomial(InfectedInSample,Se))

Each infected pat has probability Se of testing positive. The IF statement is there to prevent @RISK's Binomial distribution from producing an error when the number of trials = 0.

FalsePos = Number of non-infected pats that tested positive

=IF(Tested-InfectedInSample=0,0,RiskBinomial(Tested-InfectedInSample,1-Sp))

Each non-infected pat has probability (1-Sp) of testing positive.

Posterior = estimated number of infected cows in herd

=RiskOutput("Posterior")+IF(TruePos+FalsePos=Positive,BadCows,NA())

Accepts values from the prior "BadCows" if the total number of positive pats (TruePos+FalsePos) equals the number observed, "Positive". The NA function generates an error that is not then processed by @RISK (i.e. plays no part in calculation of output statistics) beyond noting that the error was produced.
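
As a rough cross-check on a simulated posterior, the same inference can be approximated without rejecting iterations. The Python sketch below is not part of the Pats model. It assumes that each tested pat comes from an infected cow with probability BadCows/M, independently of the others, so that the number of positive pats is Binomial(Tested, p) with p = (BadCows/M)*Se + (1-BadCows/M)*(1-Sp). It then averages that likelihood over random draws of Se and Sp. All function names are illustrative.

```python
import random
from math import comb

M, TESTED, POSITIVE = 25, 10, 2

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def weighted_posterior(draws=5000, seed=1):
    """Posterior over BadCows = 0..M under a discrete uniform prior.

    Assumption of this sketch: tested pats are independent, each infected
    with probability BadCows/M. Se and Sp uncertainty is integrated out
    by Monte Carlo averaging over Beta draws.
    """
    rng = random.Random(seed)
    weights = [0.0] * (M + 1)
    for _ in range(draws):
        se = rng.betavariate(18 + 1, 20 - 18 + 1)
        sp = rng.betavariate(19 + 1, 20 - 19 + 1)
        for bad_cows in range(M + 1):
            frac = bad_cows / M
            p_positive = frac * se + (1 - frac) * (1 - sp)
            weights[bad_cows] += binom_pmf(POSITIVE, TESTED, p_positive)
    total = sum(weights)
    # The uniform prior is constant, so normalising the likelihoods is enough.
    return [w / total for w in weights]

if __name__ == "__main__":
    for cows, prob in enumerate(weighted_posterior()):
        print(cows, round(prob, 3))
```

Because every iteration contributes a weight instead of being accepted or discarded, this approach needs far fewer iterations for a smooth result, at the cost of relying on the independence assumption stated above.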

 

The simulation model with an IntUniform(0,25) prior produces the following posterior:

 

 

 
