Imagine that two wine experts are each asked to guess the year of 20 different wines. Expert A guesses 11 correctly, while Expert B guesses 14 correctly. How confident can we be that Expert B is really better at this exercise than Expert A?

If we allow that each guess is independent of every other guess, and that each expert has the same probability of guessing correctly for every wine, we can treat each expert's results as a binomial process. We are thus interested in whether one expert's true probability of guessing correctly is greater than the other's. We can model our uncertainty about the true probability of success for Expert A as Beta(12,10,1) and for Expert B as Beta(15,7,1), that is, Beta(s+1, n-s+1, 1) for s correct guesses out of n wines. See the Beta distribution for an explanation of why.
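The Beta parameters above can be reproduced from the raw counts. The following is a minimal sketch, assuming a Uniform(0,1) prior on each expert's success probability (which is what gives s+1 and n-s+1) and assuming the trailing 1 in Beta(12,10,1) is a scale parameter that can be ignored on the 0 to 1 range. The function names beta_params and beta_mean are illustrative and not part of any software package.

```python
def beta_params(successes, trials):
    """Return (alpha, beta) for the Beta distribution describing uncertainty
    about a binomial probability, assuming a Uniform(0,1) prior."""
    return successes + 1, trials - successes + 1


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution on [0, 1]."""
    return alpha / (alpha + beta)


# Counts from the example: 20 wines each, 11 and 14 correct guesses.
expert_a = beta_params(11, 20)  # (12, 10)
expert_b = beta_params(14, 20)  # (15, 7)

print("Expert A:", expert_a, "mean", round(beta_mean(*expert_a), 3))
print("Expert B:", expert_b, "mean", round(beta_mean(*expert_b), 3))
```

The two means (about 0.545 and 0.682) differ, but with only 20 wines each the distributions overlap considerably, which is why a simulation is needed to quantify the confidence.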

The model then randomly samples a value from each of the two distributions, and the outcome cell returns 1 if the value sampled for Expert B is greater than the value sampled for Expert A, and 0 otherwise. We run a simulation on this cell. The mean of the results equals the fraction of iterations (scenarios) in which Expert B's distribution generated the higher value, and thus represents our confidence that Expert B is indeed better at this exercise. In this case, we are 83% confident.
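The simulation described above can be sketched outside a spreadsheet as well. The following is an illustrative example using only the Python standard library, not the software-specific model itself. The function name prob_b_better, the iteration count and the seed are assumptions made for the sketch, and the Beta distributions are taken on the 0 to 1 range.

```python
import random


def prob_b_better(a_params, b_params, iterations=100_000, seed=1):
    """Estimate P(p_B > p_A) by sampling both Beta distributions.

    Each iteration plays the role of the outcome cell: it scores 1 when
    the value sampled for Expert B exceeds the value sampled for Expert A.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    wins = 0
    for _ in range(iterations):
        p_a = rng.betavariate(*a_params)
        p_b = rng.betavariate(*b_params)
        if p_b > p_a:
            wins += 1
    # The mean of the 0/1 outcomes is the fraction of iterations won by B.
    return wins / iterations


confidence = prob_b_better((12, 10), (15, 7))
print(f"Confidence that Expert B is better: {confidence:.1%}")
```

With a large number of iterations the estimate should land close to the 83% quoted above. The exact figure varies slightly with the seed and the iteration count.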

Links to the software-specific Wine Experts models are provided here: