What do we mean by a long-tailed distribution? One distribution is said to have a longer tail than another if its probability density (or mass) function is (asymptotically) larger than the other distribution's for very large values of the variable, i.e. for two distributions A and B:

$$
\begin{aligned}
&\text{If } \frac{f_B(x)}{f_A(x)} \rightarrow 0 \text{ as } x \rightarrow \infty, \text{ then B has a shorter tail than A} \\
&\text{If } \frac{f_B(x)}{f_A(x)} \rightarrow c \text{ as } x \rightarrow \infty, \text{ for some constant } 0 < c < \infty, \text{ then B has the same tail as A} \\
&\text{If } \frac{f_B(x)}{f_A(x)} \rightarrow \infty \text{ as } x \rightarrow \infty, \text{ then B has a longer tail than A}
\end{aligned}
$$
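The ratio test above can be checked numerically. The sketch below is only an illustration: it takes B to be a Gamma density and A to be a Pareto density, with parameter values chosen for the example (scale 12 and shape 3 for the Gamma, minimum 30 and shape 4 for the Pareto). The function names are hypothetical helpers, not part of any library.

```python
import math

def gamma_pdf(x, scale=12.0, shape=3.0):
    """Gamma density with location 0 (illustrative parameters)."""
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)

def pareto_pdf(x, minimum=30.0, shape=4.0):
    """Pareto density: shape * minimum^shape / x^(shape + 1) for x >= minimum."""
    if x < minimum:
        return 0.0
    return shape * minimum ** shape / x ** (shape + 1)

# Ratio f_B / f_A with B = Gamma and A = Pareto at increasing x.
xs = (100, 200, 400, 800)
ratios = [gamma_pdf(x) / pareto_pdf(x) for x in xs]
for x, r in zip(xs, ratios):
    print(f"x = {x:4d}   f_B/f_A = {r:.3e}")
```

The printed ratio shrinks towards zero as x grows, so under this definition the Gamma has the shorter tail of the two.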

Many socioeconomic and other natural random variables follow long-tailed distributions. Examples include city population sizes, occurrences of natural resources (e.g. the size of reserves in a certain geological region), stock price fluctuations, company sizes, and income.

The distribution most commonly fitted to the extremes of such data has been the __Pareto__. There is no decent theory to explain why the Pareto distribution tends to fit the tails of long-tailed variables, but most people accept that it works and use it anyway.

The Pareto is usually a poor fit for the main body of the variable, though. Thus, when modeling long-tailed distributions one usually splices one distribution (the Lognormal or Gamma, for example) for the main body with a Pareto distribution for the tail. The ideal splice ensures that the variable's composite distribution makes a smooth transition between the distribution for the main body and the distribution for the tail. That means that if there is an abrupt switch from one distribution to the other at some value x, the two distributions should have the same probability density at that point and, which is more challenging, the same gradient of the density (or mass) function. For example, one might wish to combine a Gamma(0, 12, 3) distribution for the main body of a distribution and a Pareto(30, 4) for the tail, with a splice at x = 80. The figure below plots these two distributions together.

The Pareto distribution has a density that is too low at the splice point x = 80. However, we can change that by truncating the Pareto distribution on its left side, which has the effect of increasing the density to the right of the truncation point. With a little testing of different truncation values, we can get the following:

Conveniently, the two distributions have the same gradient at the splice point x = 80. We can now create a spliced composite distribution, as shown in the model Splice Gamma and Pareto. This model plots the probability density function over all x, and then uses the General distribution to sample from the constructed density function.
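The same steps can be sketched in code. The parameter orderings are assumptions, since the text does not state them: Gamma(0, 12, 3) is read as location 0, scale 12, shape 3, and Pareto(30, 4) as minimum 30, shape 4. The sketch finds the truncation point by bisection instead of by trial and error, compares the two slopes at the splice, and then samples from a tabulated density by inverting a piecewise-linear cumulative curve. That last step is only a rough stand-in for a General distribution, not the software's own implementation, and `X_MAX` is an arbitrary cut-off chosen for the grid.

```python
import bisect
import math
import random

SPLICE = 80.0
GAMMA_SCALE, GAMMA_SHAPE = 12.0, 3.0   # assumed reading of Gamma(0, 12, 3)
PARETO_MIN, PARETO_SHAPE = 30.0, 4.0   # assumed reading of Pareto(30, 4)

def gamma_pdf(x):
    """Gamma density with location 0."""
    return (x ** (GAMMA_SHAPE - 1) * math.exp(-x / GAMMA_SCALE)
            / (math.gamma(GAMMA_SHAPE) * GAMMA_SCALE ** GAMMA_SHAPE))

def truncated_pareto_pdf(x, t):
    """Pareto density conditioned on X > t, for t >= PARETO_MIN."""
    if x < t:
        return 0.0
    pdf = PARETO_SHAPE * PARETO_MIN ** PARETO_SHAPE / x ** (PARETO_SHAPE + 1)
    survival_at_t = (PARETO_MIN / t) ** PARETO_SHAPE
    return pdf / survival_at_t

# Step 1: bisection for the truncation point that matches the densities at
# the splice. Raising t raises the truncated Pareto density at SPLICE.
target = gamma_pdf(SPLICE)
lo, hi = PARETO_MIN, SPLICE
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if truncated_pareto_pdf(SPLICE, mid) < target:
        lo = mid
    else:
        hi = mid
truncation = 0.5 * (lo + hi)

# Step 2: compare the gradients at the splice by central finite differences.
def slope(f, x, h=1e-4):
    return (f(x + h) - f(x - h)) / (2 * h)

gamma_slope = slope(gamma_pdf, SPLICE)
pareto_slope = slope(lambda x: truncated_pareto_pdf(x, truncation), SPLICE)

# Step 3: composite density (not yet normalised), tabulated on a grid.
def composite_density(x):
    return gamma_pdf(x) if x < SPLICE else truncated_pareto_pdf(x, truncation)

X_MAX, N = 2000.0, 20000               # arbitrary grid cut-off and resolution
xs = [X_MAX * i / N for i in range(N + 1)]
dens = [composite_density(x) for x in xs]
cdf = [0.0]
for i in range(1, N + 1):              # trapezoidal cumulative area
    cdf.append(cdf[-1] + 0.5 * (dens[i] + dens[i - 1]) * (xs[i] - xs[i - 1]))
total_area = cdf[-1]
cdf = [c / total_area for c in cdf]    # rescale so the curve ends at 1

def sample(rng):
    """Invert the tabulated cumulative curve by linear interpolation."""
    u = rng.random()
    i = min(max(bisect.bisect_right(cdf, u), 1), N)
    x0, x1, c0, c1 = xs[i - 1], xs[i], cdf[i - 1], cdf[i]
    return x0 if c1 == c0 else x0 + (x1 - x0) * (u - c0) / (c1 - c0)

rng = random.Random(1)
draws = [sample(rng) for _ in range(20000)]
tail_share = sum(d > SPLICE for d in draws) / len(draws)

print(f"truncation point     : {truncation:.2f}")
print(f"slopes at the splice : gamma {gamma_slope:.3e}, pareto {pareto_slope:.3e}")
print(f"area before rescaling: {total_area:.4f}")
print(f"share of draws > {SPLICE:g}: {tail_share:.4f}")
```

Gluing the body and the truncated tail together does not in general give an area of exactly 1, which is why the tabulated curve is rescaled before sampling.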

The above example was rather easier than one normally has to deal with, because there was a smooth transition between the two fitted distributions at the splice point. More usually, one can match the probability densities but the two distributions have quite different gradients. The figure below shows such a scenario.

We wish to splice a Pareto(10, 600) distribution, shifted by -600, onto a Gamma(0, 12, 3) distribution at x = 65. They have the same densities at the splice point, but quite different gradients. There is a simple way to splice these two distributions around the required splice point: create a function that gradually transfers from one distribution to the other. The function begins by taking 0% of the Pareto and smoothly increases up to 100% with increasing x. The Lognormal cumulative distribution function is convenient for this purpose because it extends to infinity, but not below zero. Excel offers the LOGNORMDIST(x,*m*,*s*) function, which returns the cumulative probability at a value x for a variable whose natural log is Normal(*m*,*s*). Because *m* is the mean of the log of the variable, setting *m* equal to the natural log of the splice point means that at this point the composite distribution takes its probability density equally from the Pareto and Gamma distributions. Varying the standard deviation *s* varies the smoothness (the larger *s*, the slower the transition), and we can usually achieve something acceptable. Model Smooth Splice Gamma and Pareto provides an example. The figure below shows the resultant combined distribution for varying values of *s*.
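The blending function can be sketched as follows. The parameter orderings are assumptions: Gamma(0, 12, 3) is read as location 0, scale 12, shape 3, and Pareto(10, 600) as shape 10, minimum 600, shifted by -600 so that it starts at zero. `lognormal_cdf` is a hypothetical helper mirroring LOGNORMDIST(x, m, s). Since m is the mean of ln x, the weight reaches 50% at the splice point when m is the natural log of the splice point. The blend does not in general integrate to 1, so the sketch also estimates the area numerically; the grid limits are arbitrary.

```python
import math

SPLICE = 65.0
GAMMA_SCALE, GAMMA_SHAPE = 12.0, 3.0     # assumed reading of Gamma(0, 12, 3)
PARETO_SHAPE, PARETO_MIN = 10.0, 600.0   # assumed reading of Pareto(10, 600)
SHIFT = -600.0                           # shift applied to the Pareto

def gamma_pdf(x):
    """Gamma density with location 0."""
    return (x ** (GAMMA_SHAPE - 1) * math.exp(-x / GAMMA_SCALE)
            / (math.gamma(GAMMA_SHAPE) * GAMMA_SCALE ** GAMMA_SHAPE))

def shifted_pareto_pdf(x):
    """Pareto density evaluated after undoing the shift."""
    y = x - SHIFT
    if y < PARETO_MIN:
        return 0.0
    return PARETO_SHAPE * PARETO_MIN ** PARETO_SHAPE / y ** (PARETO_SHAPE + 1)

def lognormal_cdf(x, m, s):
    """P(X <= x) when ln X ~ Normal(m, s); the same quantity as LOGNORMDIST."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - m) / (s * math.sqrt(2.0))))

def blended_density(x, s):
    """Unnormalised blend: weight w on the Pareto, 1 - w on the Gamma."""
    w = lognormal_cdf(x, math.log(SPLICE), s)
    return (1.0 - w) * gamma_pdf(x) + w * shifted_pareto_pdf(x)

def area(s, x_max=3000.0, n=30000):
    """Trapezoidal estimate of the area under the blend on [0, x_max]."""
    step = x_max / n
    total = 0.5 * (blended_density(0.0, s) + blended_density(x_max, s))
    total += sum(blended_density(i * step, s) for i in range(1, n))
    return total * step

areas = {s: area(s) for s in (0.05, 0.2, 0.5)}
for s, a in areas.items():
    print(f"s = {s:<4}  area before rescaling = {a:.4f}")
```

Dividing `blended_density(x, s)` by the corresponding area gives a proper density. Larger values of s spread the hand-over from Gamma to Pareto across a wider range of x.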
