Estimate of the mean number of events per period λ
Like the binomial probability p, the mean number of events per period λ is a fundamental property of the stochastic system in question. It can never be observed and it can never be exactly known. However, we can become progressively more certain about its value as more data are collected. Statistics provides us with a means of quantifying the state of our knowledge as we accumulate data.
We discuss two approaches: Bayesian and classical statistics.
1. Bayesian statistics
Assuming an uninformed prior π(λ) = 1/λ and the Poisson likelihood function for observing α events in period t:
$$L(\alpha \mid \lambda) = \frac{e^{-\lambda t} (\lambda t)^{\alpha}}{\alpha!} \propto e^{-\lambda t} \lambda^{\alpha}$$
The proportional statement is acceptable because we can ignore terms that don't involve λ, and we then get the posterior distribution:
$$p(\lambda \mid \alpha) \propto \pi(\lambda)\, L(\alpha \mid \lambda) \propto e^{-\lambda t} \lambda^{\alpha - 1}$$
which by comparison with a Gamma density function is a Gamma(0,1/t,α) distribution, i.e. scale 1/t and shape α. The Gamma distribution can also be used to describe our uncertainty about λ if we start off with an informed opinion and then observe α events in time t. If we can reasonably describe our prior belief with a Gamma(0,b,a) distribution, the posterior is given by a Gamma(0, b/(1 + bt), a + α) distribution.
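The update rule above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it reads Gamma(0, b, a) as scale b and shape a (which is what the Gamma(0, 1/t, α) posterior implies), the function names are hypothetical, and the numbers in the example are made up.

```python
import math

def gamma_posterior(prior_shape, prior_scale, events, period):
    """Posterior (shape, scale) for a Poisson rate with a Gamma prior.

    prior_shape, prior_scale: a and b of the Gamma(0, b, a) prior.
    events: number of events observed; period: length of observation t.
    """
    shape = prior_shape + events                         # a + alpha
    scale = prior_scale / (1.0 + prior_scale * period)   # b / (1 + b t)
    return shape, scale

def uninformed_posterior(events, period):
    """Posterior under the 1/lambda prior: shape = events, scale = 1/t.

    Only a proper distribution when at least one event has been observed.
    """
    return float(events), 1.0 / period

def gamma_mean_sd(shape, scale):
    """Mean and standard deviation of a Gamma with the given shape and scale."""
    return shape * scale, math.sqrt(shape) * scale

# Illustrative (made-up) data: 10 events observed over 4 periods.
print(gamma_mean_sd(*uninformed_posterior(10, 4.0)))
# Same data combined with an assumed Gamma prior of shape 2, scale 0.5.
print(gamma_mean_sd(*gamma_posterior(2.0, 0.5, 10, 4.0)))
```

Returning the posterior as a (shape, scale) pair keeps the sketch independent of any particular statistics library; the pair can be passed to whichever Gamma implementation is at hand.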
More difficult: the effect of the prior
The following discussion is fairly difficult (but interesting) and is not essential for understanding the use of the Gamma distribution in estimating the mean number of events per period (λ). The choice of π(λ) = 1/λ (which is equivalent to a Gamma(0,z,1/z) distribution where z is extremely large) as an uninformed prior is an uncomfortable one for many. We can get a feel for the importance of the prior with the following train of thought:
1. A π(λ) = 1/λ prior is equivalent to Gamma(0,z,1/z) where z approaches infinity. You can prove this by looking at the Gamma probability density function and setting the shape a to zero and the scale b to infinity.
2. A flat prior (the opposite extreme to the π(λ) = 1/λ prior) would be equivalent to a Gamma(0,z,1), where z approaches infinity, i.e. an infinitely drawn out Exponential distribution.
3. We have seen that for a Gamma(0,b,a) prior, the resultant posterior after observing α events in time t is Gamma(0, b/(1 + bt), a + α), which means that the posterior for prior 1 would be Gamma(0, 1/t, α) and for prior 2 would be Gamma(0, 1/t, α + 1).
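The limits used in the three steps above can be written out explicitly. The block below is a worked sketch, writing λ for the mean number of events per period and α for the number of events observed in time t, and reading Gamma(0, b, a) as scale b and shape a; it assumes the amsmath package for the `align*` environment.

```latex
\begin{align*}
\text{Prior 1, } \mathrm{Gamma}(0, z, 1/z):\quad
  & \pi(\lambda) \propto \lambda^{1/z - 1} e^{-\lambda/z}
    \;\longrightarrow\; \lambda^{-1}
    \quad (z \to \infty), \\
  & \text{posterior } \mathrm{Gamma}\!\left(0, \tfrac{z}{1 + z t}, \alpha + \tfrac{1}{z}\right)
    \;\longrightarrow\; \mathrm{Gamma}(0, 1/t, \alpha). \\[6pt]
\text{Prior 2, } \mathrm{Gamma}(0, z, 1):\quad
  & \pi(\lambda) \propto e^{-\lambda/z}
    \;\longrightarrow\; 1
    \quad (z \to \infty), \\
  & \text{posterior } \mathrm{Gamma}\!\left(0, \tfrac{z}{1 + z t}, \alpha + 1\right)
    \;\longrightarrow\; \mathrm{Gamma}(0, 1/t, \alpha + 1).
\end{align*}
```

In both cases the scale z/(1 + zt) tends to 1/t, so the two priors differ only in the shape parameter of the posterior.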
Thus, the sensitivity of the posterior Gamma distribution to the prior amounts to whether (α + 1) is approximately the same as α. Moreover, for integer a, Gamma(0,b,a) is the sum of a independent Exponential distributions with rate 1/b (mean b), so one can think of the choice of priors as being whether or not we add one extra Exponential distribution to the α Exponential distributions from the data. Thus, if α were 100, for example, the distribution would be roughly 1% influenced by the prior and 99% influenced by the data. In this model, the more events we observe, the more completely the data overpower the prior.
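A small simulation can make the "one extra Exponential" picture concrete. The sketch below is illustrative only: the function names are hypothetical, the data (100 events in 50 periods) are made up, and the posterior is sampled as a sum of independent exponentials each with mean 1/t, using only Python's standard library.

```python
import random

def prior_share(events):
    """Share of the flat-prior posterior shape (events + 1) due to the prior."""
    return 1.0 / (events + 1)

def sample_rate(events, period, flat_prior, rng):
    """Draw one rate from the posterior as a sum of exponentials.

    Each term has mean 1/period, i.e. rate parameter `period`.
    The flat prior contributes one extra term; the 1/lambda prior none.
    """
    n_terms = events + (1 if flat_prior else 0)
    return sum(rng.expovariate(period) for _ in range(n_terms))

def simulated_mean(events, period, flat_prior, draws=5000, seed=0):
    """Monte Carlo estimate of the posterior mean of the rate."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = sum(sample_rate(events, period, flat_prior, rng)
                for _ in range(draws))
    return total / draws

# How much the prior matters for different amounts of data.
for events in (1, 10, 100):
    print(events, round(prior_share(events), 4))

# Made-up data: 100 events in 50 periods, under each prior.
print(simulated_mean(100, 50.0, flat_prior=False))  # analytic mean: 100/50
print(simulated_mean(100, 50.0, flat_prior=True))   # analytic mean: 101/50
```

With 100 observed events the two simulated means differ by about 1%, in line with the argument above; with only one observed event the flat prior contributes half of the posterior shape.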
2. Classical statistics
Various classical statistics approaches to estimating λ are discussed here.
3. Comparison of classical and Bayesian methods
A comparison of the different approaches to estimating λ is given here.