Beta(0,0,1), which is improper (it does not exist as a normalisable distribution);
Beta(0.5,0.5,1), which has peaks at zero and one; and
Beta(1,1,1), which is a Uniform distribution.
When s is small (or, by reflection, close to n) the classical method and the Bayesian method with a Beta(0,0,1) prior give the widest distributions. The Bayesian method with a Beta(1,1,1) prior gives the estimate closest to 0.5, the Bayesian method with a Beta(0,0,1) prior gives the estimate furthest from 0.5, and the classical method and the Bayesian method with a Beta(0.5,0.5,1) prior lie in between.
The classical and Bayesian with Beta(0.5,0.5,1) prior give very similar results for n>9 and 0<s<n. All methods tend to the same result as n gets large, and tend more quickly to the same result as s approaches n/2. The Bayesian method with Beta(0,0,1) prior only works for 0<s<n.
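The behaviour of the three Bayesian priors can be checked numerically with a minimal sketch. It assumes the standard conjugate update, in which a Beta(a,b) prior combined with s successes in n trials gives a Beta(a+s, b+n-s) posterior, and it assumes the third argument in Beta(a,b,1) is a scale of 1. The names `PRIORS`, `posterior_params`, `posterior_mean` and `mean_spread` are illustrative only, and the classical method is left out because its formula is not given in this section.

```python
from typing import Tuple

# Illustrative labels for the three priors discussed above: (a, b) shape pairs.
PRIORS = {
    "Beta(0,0,1)": (0.0, 0.0),
    "Beta(0.5,0.5,1)": (0.5, 0.5),
    "Beta(1,1,1)": (1.0, 1.0),
}


def posterior_params(a: float, b: float, s: int, n: int) -> Tuple[float, float]:
    """Shape parameters of the Beta posterior for s successes in n trials."""
    if not 0 <= s <= n:
        raise ValueError("need 0 <= s <= n")
    alpha, beta = a + s, b + (n - s)
    # A Beta distribution needs both shape parameters strictly positive.
    # With the Beta(0,0,1) prior this fails when s = 0 or s = n.
    if alpha <= 0 or beta <= 0:
        raise ValueError("posterior is not a proper Beta distribution")
    return alpha, beta


def posterior_mean(a: float, b: float, s: int, n: int) -> float:
    """Mean of the Beta posterior, alpha / (alpha + beta)."""
    alpha, beta = posterior_params(a, b, s, n)
    return alpha / (alpha + beta)


def mean_spread(s: int, n: int) -> float:
    """Largest minus smallest posterior mean across the three priors."""
    means = [posterior_mean(a, b, s, n) for a, b in PRIORS.values()]
    return max(means) - min(means)


# The spread shrinks as n grows at a fixed ratio s/n, and vanishes at s = n/2.
for s, n in [(1, 10), (10, 100), (100, 1000), (5, 10)]:
    print(f"s={s}, n={n}: spread of posterior means = {mean_spread(s, n):.5f}")
```

Calling `posterior_params(0.0, 0.0, 0, 10)` raises a `ValueError`, which reflects the restriction that the Beta(0,0,1) prior only works for 0<s<n.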
It is interesting to see from density plots where these four techniques place their emphasis:
When s=1 and n=2, Bayesian inference with a Beta(0,0,1) prior and the classical method both give a Uniform(0,1) distribution for p: in other words, there appears to be no information contained in the data except to say that 0<p<1. A Bayesian has already stated that the probability exists (and therefore lies within this range), while the classical result must first determine that the probability is neither zero nor one.
When s=1, the classical method and the Bayesian method with a Beta(0,0,1) prior result in a mode at p=0 for any n>2, while the Bayesian method with a Beta(1,1,1) prior gives a mode at p=1/n, which is more intuitive.
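The mode locations for the Bayesian cases can be reproduced with a short sketch, again assuming the conjugate posterior Beta(a+s, b+n-s). The helpers `beta_mode` and `posterior_mode` are hypothetical names, and the classical method is not covered because its formula is not stated here.

```python
from typing import Optional


def beta_mode(alpha: float, beta: float) -> Optional[float]:
    """Mode of a Beta(alpha, beta) density on [0, 1], or None if not unique."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("shape parameters must be positive")
    if alpha > 1 and beta > 1:
        # Interior mode of a unimodal Beta density.
        return (alpha - 1) / (alpha + beta - 2)
    if alpha <= 1 and beta > 1:
        # Density is largest at the lower bound.
        return 0.0
    if alpha > 1 and beta <= 1:
        # Density is largest at the upper bound.
        return 1.0
    # Both shapes <= 1: uniform or U-shaped, so there is no single mode.
    return None


def posterior_mode(a: float, b: float, s: int, n: int) -> Optional[float]:
    """Mode of the posterior for a Beta(a, b) prior and s successes in n trials."""
    return beta_mode(a + s, b + (n - s))


for n in (2, 5, 20):
    print(
        f"n={n}: Beta(0,0,1) prior mode = {posterior_mode(0.0, 0.0, 1, n)}, "
        f"Beta(1,1,1) prior mode = {posterior_mode(1.0, 1.0, 1, n)}"
    )
```

With s=1 the Beta(1,1,1) prior returns 1/n in each case, while the Beta(0,0,1) prior returns 0 for n>2 and `None` for n=2, where the posterior is Uniform(0,1).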
The methods give the following means and modes:
The formulae for the modes have some parameter restrictions.
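For the three Bayesian priors, the means and modes can be written out as a sketch under the standard conjugate result that a Beta(a,b) prior with s successes in n trials gives a Beta(a+s, b+n-s) posterior. The classical formulae are not reproduced here because they are not stated in this section. The interior mode formula applies only when both posterior shape parameters exceed 1, which is the source of the restrictions shown.

```latex
\begin{align*}
\text{General:}\quad
  & \text{mean} = \frac{a+s}{a+b+n},
  & \text{mode} &= \frac{a+s-1}{a+b+n-2}
  && (a+s>1,\ b+n-s>1) \\
\text{Beta}(0,0,1):\quad
  & \text{mean} = \frac{s}{n},
  & \text{mode} &= \frac{s-1}{n-2}
  && (1<s<n-1) \\
\text{Beta}(0.5,0.5,1):\quad
  & \text{mean} = \frac{2s+1}{2n+2},
  & \text{mode} &= \frac{2s-1}{2n-2}
  && (0<s<n) \\
\text{Beta}(1,1,1):\quad
  & \text{mean} = \frac{s+1}{n+2},
  & \text{mode} &= \frac{s}{n}
  && (0<s<n)
\end{align*}
```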
At Epix Analytics we use Bayesian inference with a Beta(1,1,1) prior when we feel we know that there is a stochastic process, and the classical result when we do not. We have found use of the other priors difficult to justify.