# Modeling Correlations

Independent variables may take any value from their distributions irrespective of the value from any other variable. For example, if one wished to model the weight of 10 random people queuing to get into an elevator, each person's weight is independent of every other person's. In other words, the probability that one variable will take a specific value is unrelated to the value that any of the other variables take.

For dependent variables, however, the probability that one variable takes a specific value is in some way related to the value that another variable, or other variables, take. For example, we might have three distributions representing the time it will take to design (D), code (C) and test (T) a new software application that our company is writing:


We might argue that the longer the time required to design the software, the more complicated it turned out to be, and thus the longer it would take to code. In that case, D and C will tend to both be large, both be small, or both be in the middle:

If we were to plot out random scenarios from these two distributions, they might look like this:

The figure below plots distributions of values of the code time C generated when D is in a low range (between 7 and 8) and again when D is in a high range (between 32 and 33). These are called conditional distributions for C: they are conditioned on the value that D takes.

This section looks at several different ways to model correlation:

1. Rank order correlation

This method gives a quick and easy, but not very intuitive, way of correlating variables. Rank order correlation does not model the direction of the influence, so one does not have to specify which variable is dependent on which. Software packages typically have two ways to implement rank order correlations between distributions: one in which just two probability distributions are correlated, and a second in which a correlation matrix allows the user to correlate two or more distributions. The rank order correlation technique is one of the few practical ways to correlate many distributions at once.
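The idea of correlating by rank can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of any particular software package: it draws correlated normal values as a ranking template and re-pairs two independent samples to follow those ranks. The function names, the triangular distributions for D and C, and the target value 0.8 are all assumptions made for the example. Note that `rho` here is the correlation of the normal template, so the resulting rank correlation is slightly lower.

```python
import numpy as np

def rank_correlate(x, y, rho, rng):
    """Re-pair samples x and y so their rank correlation is close to rho.

    The marginal samples are unchanged; only their pairing changes.
    """
    n = len(x)
    # Correlated standard normals act as a ranking template.
    cov = [[1.0, rho], [rho, 1.0]]
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # Rank of each template value within its column (0 = smallest).
    ranks = z.argsort(axis=0).argsort(axis=0)
    # Give each row the sample value that has the same rank as the template.
    return np.sort(x)[ranks[:, 0]], np.sort(y)[ranks[:, 1]]

def spearman(a, b):
    """Spearman rank correlation computed from ranks (assumes no ties)."""
    ra = a.argsort().argsort()
    rb = b.argsort().argsort()
    return np.corrcoef(ra, rb)[0, 1]

rng = np.random.default_rng(1)
design = rng.triangular(5, 15, 40, size=5000)   # hypothetical D, in days
code = rng.triangular(10, 30, 90, size=5000)    # hypothetical C, in days

d, c = rank_correlate(design, code, 0.8, rng)
print("rank correlation:", round(spearman(d, c), 2))
```

Because only the pairing changes, the histograms of D and C are exactly what they were before correlation was applied, which is the main attraction of the technique.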

2. Envelope method

This method models the dependent variable with a distribution whose parameters are functions of the independent variable. It is well suited to modeling expert opinion of correlated variables, and is easy to use and check. It can model one-to-many relationships, but is difficult to adapt to many-to-many relationships and would require determining a logical sequence of relationships. The envelope method can help with intuitively including important relationships in financial models.
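A minimal sketch of the envelope idea, assuming a triangular distribution for the code time C whose minimum, most likely, and maximum values are linear functions of the design time D. The distribution for D and every coefficient below are hypothetical and would normally come from expert opinion or data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Independent variable: hypothetical design time D, in days.
design = rng.triangular(5, 15, 40, size=n)

# Hypothetical envelope: each parameter of C is a linear function of D.
c_min = 1.0 * design + 5
c_mode = 1.5 * design + 8
c_max = 2.5 * design + 15

# Dependent variable: one draw of C per scenario, using that scenario's envelope.
code = rng.triangular(c_min, c_mode, c_max)

print("correlation of D and C:", round(np.corrcoef(design, code)[0, 1], 2))
```

Plotting `code` against `design` together with the three envelope lines is a quick way to check that the relationship looks the way the expert intended.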

3. Using lookup tables

This method modifies a distribution or selects from different distributions to model a variable, according to the value that is generated for a variable it is being influenced by. The lookup table method is well suited to modeling expert opinion of correlated relationships, and can model one-to-many influences, but is difficult to adapt to many-to-many influences and would require a sequence of influence.
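As a rough illustration, the lookup can be expressed as a table indexed by the band that the influencing variable falls into. The bands for D and the triangular parameters for C in each band are hypothetical values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000

# Influencing variable: hypothetical design time D, in days.
design = rng.triangular(5, 15, 40, size=n)

# Hypothetical lookup table. Bands: D < 12, 12 <= D < 25, D >= 25.
band_edges = np.array([12.0, 25.0])
table = np.array([
    [10.0, 20.0, 35.0],   # (min, most likely, max) of C for short designs
    [25.0, 40.0, 60.0],   # ... for medium designs
    [45.0, 65.0, 95.0],   # ... for long designs
])

# Look up the row for each scenario, then sample C from that row's distribution.
row = np.digitize(design, band_edges)
params = table[row]
code = rng.triangular(params[:, 0], params[:, 1], params[:, 2])

for r in range(3):
    print("band", r, "mean C:", round(code[row == r].mean(), 1))
```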

4. Conditional logic

There are various functions (e.g. IF(), AND(), OR()) in Excel that allow one to build logic that makes a cell switch between values according to the values of other cells. We can capitalize on these features to build up complex relationships between our model variables.
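The same kind of switching logic can be sketched outside a spreadsheet. The Python example below mirrors a formula of the form `IF(AND(D > 25, C > 50), T * 1.5, T)`; the thresholds, the 1.5 multiplier, and the three triangular distributions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000

# Hypothetical durations in days.
design = rng.triangular(5, 15, 40, size=n)      # D
code = rng.triangular(10, 30, 90, size=n)       # C
base_test = rng.triangular(5, 10, 20, size=n)   # T before any adjustment

# AND(D > 25, C > 50): scenarios where both design and coding ran long.
overrun = np.logical_and(design > 25, code > 50)

# IF(overrun, T * 1.5, T): testing takes 50% longer in those scenarios.
test = np.where(overrun, base_test * 1.5, base_test)

print("share of scenarios with extended testing:", round(overrun.mean(), 3))
```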

5. Copulas

Copulas provide a more flexible method to model correlations. They are conceptually similar to the rank order correlation method, but allow us to model more diverse correlation patterns. For example, some variables can be highly correlated at their extremes (often called tail dependence) and less correlated in their mid-range, whereas other variables can be more correlated at high values and less at low values (and vice versa). The stock market provides a good example of such correlation patterns: to the despair of investors, stocks tend to move together more closely when the market experiences extreme losses than when the market is doing very well.

There are two general types of copulas:

• Archimedean: these require only one parameter, so they are simple to use while still being able to model different correlation shapes. The most common in this group are the Frank, Clayton, and Gumbel copulas.
• Elliptical: the Gaussian and Student (or more simply, t) copulas are the members of this group in common use. They produce correlation patterns similar to those of the rank order correlation method, so they are useful in the same situations.
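To make the idea of tail dependence concrete, here is a minimal sketch that samples from a bivariate Clayton copula using the standard conditional-inversion formula and compares how often both variables fall in their lowest 5% versus their highest 5%. The parameter value `theta = 2` and the triangular marginals used for D and C are assumptions for illustration only.

```python
import numpy as np

def clayton_sample(n, theta, rng):
    """Draw n pairs (u, v) from a bivariate Clayton copula with theta > 0."""
    u = 1.0 - rng.random(n)   # uniform on (0, 1], avoids u == 0
    w = 1.0 - rng.random(n)
    # Invert the conditional distribution of v given u.
    v = (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u, v

rng = np.random.default_rng(5)
u, v = clayton_sample(20000, theta=2.0, rng=rng)

# The Clayton copula ties the variables together more tightly at low values.
lower = np.sum((u < 0.05) & (v < 0.05))
upper = np.sum((u > 0.95) & (v > 0.95))
print("pairs both in the lowest 5%:", lower)
print("pairs both in the highest 5%:", upper)

# Map the uniforms onto hypothetical marginals via empirical quantiles.
design = np.quantile(rng.triangular(5, 15, 40, size=20000), u)   # D
code = np.quantile(rng.triangular(10, 30, 90, size=20000), v)    # C
```

The copula controls only how the two variables move together; the marginal distributions are applied afterwards, so D and C keep whatever shapes were chosen for them.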
