We sometimes need to recognize the inter-relationship between probabilities of values for two or more distributions.

In other words, these distributions are not independent of each other.

Some simulation software packages offer specific features, such as Rank Order Correlation in Crystal Ball, and other modeling methods that allow us to crudely model correlations between several distributions. However, there are certain situations where specific multivariate distributions are needed.

The following three common multivariate distributions are described here:

- Multinomial
- Dirichlet
- Multivariate Hypergeometric

**Multinomial distribution**

For a set of n trials, each of which could take one of k different outcomes (like different colours of balls in a huge urn) with probabilities p_{1}..p_{k}, the distribution of the outcomes is known as multinomial, which is just an extension of the binomial distribution. The only difference is the number of possible outcomes: only two for the binomial and multiple for the multinomial. The Multinomial distribution has the following probability mass function:

p(s_1,s_2,...,s_k) = \left( \begin{array}{c} n \\ {s_1 ~ ~ s_2 ~ ~ ... ~ ~ s_k} \end{array} \right) p_1^{s_1}...p_k^{s_k}

where \; \displaystyle\sum_{i=1}^{k} s_i = n \quad and \quad \displaystyle\sum_{i=1}^{k}p_i =1

\left( \begin{array}{c} n \\ s_1 ~ ~ s_2 ~ ~ ... ~ ~ s_k \end{array} \right) = \frac{n!}{s_1!s_2!...s_k!} is known as the multinomial coefficient.
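The probability mass function above can be evaluated directly. A minimal sketch in Python (the function name and the example counts and probabilities are illustrative, not from the original):

```python
from math import factorial, prod

def multinomial_pmf(s, p):
    """Multinomial PMF: n!/(s1!...sk!) * p1^s1 * ... * pk^sk."""
    n = sum(s)                      # total number of trials
    coef = factorial(n)             # multinomial coefficient numerator
    for si in s:
        coef //= factorial(si)      # divide by each s_i!
    return coef * prod(pi ** si for si, pi in zip(s, p))

# With two categories this reduces to the binomial PMF:
prob = multinomial_pmf([2, 1], [0.4, 0.6])  # C(3,2) * 0.4^2 * 0.6 = 0.288
```

Note that for large n the individual probabilities become very small, so in practice the log of the PMF is often computed instead.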

Let's consider the following problem: The cars in a city are divided into 9 different categories. We know the proportion of the city's cars that are in each category. If we were to monitor 1000 cars that enter a particular motorway, how many cars of each category would we see?

This is clearly a problem of multinomial trials since every car that enters the motorway can be any one of 9 types.

To sample from a multinomial distribution we need to proceed as follows:

We know p_{1}, p_{2}, …, p_{k} (the proportion of each type) and n (our sample size, 1000).

First we simulate s_{1} from a Binomial(n, p_{1}).

For each remaining category, we simulate s_{2}, s_{3}, …, s_{k} in order with s_{j} = Binomial(n - SUM(s_{1}…s_{j-1}), p_{j}/SUM(p_{j}…p_{k})).

Note that the marginal distribution for s_{j} (i.e. the distribution of the generated values for s_{j} looked at in isolation, with the uncertainty of all the other parameters integrated out) is simply a Binomial(n, p_{j}).
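The steps above can be sketched as follows (a minimal illustration; the nine category proportions are assumed values, not from the original):

```python
import numpy as np

def multinomial_sequential(n, p, rng):
    """Sample one multinomial outcome via the chain of binomials described above."""
    p = np.asarray(p, dtype=float)
    counts = np.empty(len(p), dtype=int)
    remaining_n = n            # trials not yet allocated to a category
    remaining_p = 1.0          # probability mass of the remaining categories
    for j in range(len(p) - 1):
        counts[j] = rng.binomial(remaining_n, p[j] / remaining_p)
        remaining_n -= counts[j]
        remaining_p -= p[j]
    counts[-1] = remaining_n   # whatever is left goes to the last category
    return counts

rng = np.random.default_rng(42)
p = [0.20, 0.15, 0.15, 0.10, 0.10, 0.10, 0.08, 0.07, 0.05]  # assumed proportions
cars = multinomial_sequential(1000, p, rng)  # counts of each of the 9 car types
```

In practice NumPy's built-in `rng.multinomial(1000, p)` does the same job in one call; the sequential construction is shown because it mirrors the steps above.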

An example of a multinomial distribution can be found in the Multinomial model.


**Dirichlet distribution**

The conjugate prior to the multinomial distribution is the Dirichlet distribution, just as the beta distribution is the conjugate prior to the binomial distribution. The Dirichlet distribution is used for modeling the uncertainty around the probabilities of success in multinomial trials.

The Dirichlet distribution has the following probability density function:

f(p_1,p_2,...,p_k)=\frac{\Gamma\big(\alpha_1+\alpha_2+...+\alpha_k\big)}{\Gamma\big(\alpha_1\big) \Gamma\big(\alpha_2\big) ...\Gamma \big(\alpha_k\big)} p_1^{\;\alpha_1-1}...p_k^{\;\alpha_k-1}

For example, if you've observed s_{1}, s_{2}, …, s_{k} of the different types of outcome from n trials, the Dirichlet distribution provides the confidence distribution for the probability that a random trial will produce each type of outcome by setting α_{i} = s_{i} + 1 for each i. Obviously these probabilities have to sum to 1, so their uncertainty distributions are inter-related.

Let's take the same problem that we used in the previous example: All cars in a city are divided into 9 different types. But now we have monitored 100 cars that were entering a particular motorway, and counted the number of cars of each type. What is the uncertainty distribution for the proportions of each type in the total population of cars?

In order to solve this problem we need to go through the following steps:

Simulate p_{1} = Beta(α_{1}, SUM(α_{2}..α_{k}), 1).

For each remaining probability (except p_{k}), simulate p_{2}, p_{3}, …, p_{k-1} in order with p_{j} = (1 - SUM(p_{1}..p_{j-1})) * Beta(α_{j}, SUM(α_{j+1}..α_{k}), 1).

Finally, set p_{k} = 1 - SUM(p_{1}..p_{k-1}).

Note that the marginal distribution of p_{j} is Beta(s_{j}+1, n+k-s_{j}-1, 1), which means that if we have no information (i.e. n = 0, s_{j} = 0), the prior is Beta(1,k-1,1) which has a mean of 1/k. This is not a Uniform distribution. That makes sense because the probabilities must add up to 1, so the means of their uncertainties must add up to 1.
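The chain of Beta draws can be sketched as follows (the observed counts for the nine car types are assumed values for illustration; NumPy's two-parameter `beta` corresponds to the Beta(a, b, 1) used above):

```python
import numpy as np

def dirichlet_sequential(alpha, rng):
    """Sample p_1..p_k from Dirichlet(alpha) via the chain of Beta draws above."""
    alpha = np.asarray(alpha, dtype=float)
    k = len(alpha)
    p = np.empty(k)
    remaining = 1.0          # probability mass not yet assigned
    tail = alpha.sum()       # running sum of alpha_j..alpha_k
    for j in range(k - 1):
        tail -= alpha[j]     # now the sum of alpha_{j+1}..alpha_k
        p[j] = remaining * rng.beta(alpha[j], tail)
        remaining -= p[j]
    p[-1] = remaining        # p_k = 1 - SUM(p_1..p_{k-1})
    return p

rng = np.random.default_rng(7)
s = np.array([20, 15, 12, 11, 10, 10, 9, 8, 5])  # illustrative counts, n = 100
p = dirichlet_sequential(s + 1, rng)             # alpha_i = s_i + 1
```

NumPy also provides `rng.dirichlet(s + 1)` directly, which is a convenient cross-check on the sequential construction.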

The solution to this problem can be found in the Dirichlet model.


The Dirichlet distribution is not as intuitive as the Multinomial distribution, but it is a very handy tool when modeling multinomial trials.

**Multivariate hypergeometric**

Sometimes we need to model sampling without replacement from a small population with multiple possible outcomes, where the process cannot be approximated by a multinomial because the probabilities of success do not remain constant. In this case we use the multivariate hypergeometric distribution, which is similar to the hypergeometric distribution; the difference is the number of possible outcomes from a trial (two for the hypergeometric, many for the multivariate hypergeometric).

The figure below shows the graphical representation of the multivariate hypergeometric process. D_{1}, D_{2}, D_{3} and so on are the number of individuals of different types in a population, and x_{1}, x_{2}, x_{3}, ... are the number of successes (the number of individuals in our random sample (circled) belonging to each category).

The Multivariate hypergeometric distribution has the following probability mass function:

f(x)=\frac {\left( \begin{array}{c} D_1 \\ x_1 \end{array} \right)\left( \begin{array}{c} D_2 \\ x_2 \end{array} \right)...\left( \begin{array}{c} D_k \\ x_k \end{array} \right)}{\left( \begin{array}{c} M \\ n \end{array} \right)}

where \displaystyle\sum_{i=1}^k D_i = M \quad and \quad \displaystyle\sum_{i=1}^k x_i = n
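This mass function is a ratio of binomial coefficients and can be evaluated directly. A minimal sketch (the function name is illustrative):

```python
from math import comb

def mvh_pmf(x, D):
    """Multivariate hypergeometric PMF: product of C(D_i, x_i) over C(M, n)."""
    M, n = sum(D), sum(x)       # population size and sample size
    numerator = 1
    for xi, Di in zip(x, D):
        numerator *= comb(Di, xi)
    return numerator / comb(M, n)

# Tiny example: 2 red and 2 blue balls, draw 2 without replacement;
# the probability of getting one of each colour is 4/6.
prob = mvh_pmf([1, 1], [2, 2])
```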

Let's imagine a problem where we have 100 colored balls in a bag, from which 10 are red, 15 purple, 20 blue, 25 green and 30 yellow. Without looking into the bag, you take 30 balls out. How many balls of each color will you take from the bag?

We cannot model this problem using the multinomial distribution, because when we take the first ball out, the proportions of the different colour balls in the bag change. The same happens when we take the second ball out, and so on.

Thus, we must proceed as follows:

Model the first colour (red, for example) as x_{1} = Hypergeometric(D_{1}, n, M), where D_{1} is the type 1 population (= 10), n is the sample size (= 30), and M is the total population size (= 100).

Model the rest in order as x_{i} = Hypergeometric(D_{i}, n - SUM(x_{1}..x_{i-1}), SUM(D_{i}..D_{k})), where D_{i} is the number of individuals of type i in the total population, x_{i} is the number of type i individuals in the sample, and D_{k} is the number of individuals of the last type in the total population.
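The sequential construction above can be sketched as follows (NumPy's `rng.hypergeometric(ngood, nbad, nsample)` parameterizes the same distribution with the "other types" count given explicitly; the guard for an exhausted sample is a defensive addition):

```python
import numpy as np

def mvh_sequential(D, n, rng):
    """Draw n items without replacement, one category at a time."""
    D = np.asarray(D)
    x = np.empty(len(D), dtype=int)
    remaining_sample = n
    other = D.sum()                # individuals belonging to the types not yet handled
    for i in range(len(D)):
        other -= D[i]              # now SUM(D_{i+1}..D_k)
        if remaining_sample == 0:  # sample already fully allocated
            x[i] = 0
        else:
            x[i] = rng.hypergeometric(D[i], other, remaining_sample)
        remaining_sample -= x[i]
    return x

rng = np.random.default_rng(3)
D = [10, 15, 20, 25, 30]           # red, purple, blue, green, yellow
balls = mvh_sequential(D, 30, rng) # counts of each colour among the 30 drawn
```

NumPy also offers `rng.multivariate_hypergeometric(D, 30)` as a one-call alternative, which is a useful cross-check on the sequential approach.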

The solution to this problem can be reached here: Multivariate Hypergeometric
