News reports often cite a recent poll to say how people are expected to vote on some issue or in an election. If the question is a simple "yes" or "no", and the respondents are randomly and representatively sampled from the population, then the poll is a binomial process. In this case, our uncertainty about the fraction of voters p who will ultimately vote "yes" is described by the following distribution:
p = Beta(s+1,n-s+1,1)
where n is the number of people surveyed and s is the number among them who stated they would vote "yes". Built into this analysis is the assumption that people won't change their minds between the time the poll was conducted and the date of the vote – which is always a tricky assumption!
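As a rough illustration of the formula above, the sketch below samples from the distribution for p using only the Python standard library. The values of n and s are hypothetical, chosen purely for the example, and the sketch assumes the Beta here is the standard two-shape-parameter Beta on [0, 1] with shapes s+1 and n-s+1.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

n = 100  # hypothetical number of people surveyed
s = 55   # hypothetical number who said they would vote "yes"

# Shape parameters of the uncertainty distribution for p.
alpha = s + 1
beta = n - s + 1

# Mean of a Beta(alpha, beta) distribution.
posterior_mean = alpha / (alpha + beta)

# Monte Carlo estimate of our confidence that "yes" wins a majority.
draws = [random.betavariate(alpha, beta) for _ in range(20000)]
prob_yes_majority = sum(d > 0.5 for d in draws) / len(draws)

print(f"Mean of p: {posterior_mean:.3f}")
print(f"Estimated P(p > 0.5): {prob_yes_majority:.3f}")
```

The fraction of sampled values above 0.5 can be read as our confidence that "yes" would win, under the same assumption that nobody changes their mind before the vote.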
A more interesting case is when there are more than two possible outcomes, for example, an election where there are three or more significant competing parties. This is a multinomial process, and we would therefore employ the Dirichlet distribution to represent our uncertainty about the fraction of the population who would vote for each party.
For example, imagine that we have surveyed 175 people, asking them which party they intend to vote for. The results are as follows:
Voting choice | Number with this preference |
SDP | 25 |
SMP | 37 |
PSM | 16 |
EDP | 28 |
Abstaining | 69 |
Total | 175 |
Using the Dirichlet distribution and assuming that people don't change their minds between the poll and election time, we can answer questions like:
How confident are we that SMP will win (get more votes than any other party)?
If the SDP joins forces with the EDP, and the SMP joins forces with the PSM, how confident are we that SDP/EDP will get more votes than SMP/PSM?
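The two questions above can be approximated by simulation. The following is a minimal sketch rather than the Election model itself: it assumes a uniform prior, so that each Dirichlet parameter is the poll count plus one (by analogy with the Beta formula above), and it treats "winning" as having a larger share than every other party, ignoring abstentions. Dirichlet samples are generated by normalising independent gamma variates.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Poll counts taken from the table above.
counts = {"SDP": 25, "SMP": 37, "PSM": 16, "EDP": 28, "Abstaining": 69}

# Assumption: uniform prior, so each Dirichlet parameter is count + 1.
alphas = {name: c + 1 for name, c in counts.items()}


def sample_shares(alphas):
    """Draw one set of population shares from Dirichlet(alphas)."""
    gammas = {k: random.gammavariate(a, 1.0) for k, a in alphas.items()}
    total = sum(gammas.values())
    return {k: g / total for k, g in gammas.items()}


n_draws = 20000
parties = [name for name in counts if name != "Abstaining"]
smp_wins = 0
coalition_wins = 0

for _ in range(n_draws):
    shares = sample_shares(alphas)
    # Question 1: does SMP get more votes than any other party?
    if all(shares["SMP"] > shares[p] for p in parties if p != "SMP"):
        smp_wins += 1
    # Question 2: does SDP/EDP get more votes than SMP/PSM?
    if shares["SDP"] + shares["EDP"] > shares["SMP"] + shares["PSM"]:
        coalition_wins += 1

p_smp_wins = smp_wins / n_draws
p_coalition = coalition_wins / n_draws

print(f"Estimated P(SMP wins): {p_smp_wins:.3f}")
print(f"Estimated P(SDP/EDP beats SMP/PSM): {p_coalition:.3f}")
```

Note that the two coalitions have the same polled total (25 + 28 = 37 + 16 = 53), so under this symmetric prior the second estimate should come out close to 0.5.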
Model Election demonstrates how to construct the Dirichlet distribution to calculate the probabilities and their associated confidences. The Dirichlet distribution is explained more fully in the Multivariate Trials section.
Links to the software-specific Election models are provided here: