[sect:optim Optimisation Examples]
[h4 Poisson Distribution - Optimization and Accuracy]

Balancing optimization and accuracy for the Poisson distribution is quite complicated.
But the case of small integral k is *very* common, so it is worth considering optimisation.
The first obvious step is to use a finite sum of each PDF (probability *density* function)
for each value of k to build up the CDF (*cumulative* distribution function).
This could be done using the `pdf` function for the distribution,
for which there are two equivalent formulae:
   return exp(-mean + log(mean) * k - lgamma(k+1));
   return gamma_p_derivative(k+1, mean);
The PDF would probably be more accurate using `gamma_p_derivative`.
The reason is that the expression:

   -mean + log(mean) * k - lgamma(k+1)

will produce a value much smaller than the largest of the terms, so you get
cancellation error; and then when you pass the result to `exp()`, which
converts the absolute error in its argument to a relative error in the
result (explanation available if required), you effectively amplify the
error further still.
`gamma_p_derivative` is just a thin wrapper around some of the internals of
the incomplete gamma; it does its utmost to avoid issues like this, because
this function is responsible for virtually all of the error in the result.
Hopefully further advances in the future might improve things even further
(what is really needed is an 'accurate' `pow(1+x)` function, but that's a whole
other story!).
But calling the `pdf` function in a loop:

   result += pdf(dist, i);

makes repeated, redundant checks on the values of `mean` and `k`.
And moreover, in the above series, each term can be calculated
from the previous one much more efficiently:
   cdf = sum from N = 0 to k of C[N]
with:
   C[0] = exp(-mean)
   C[N+1] = C[N] * mean / (N+1)
So hopefully that's:

   RealType term = exp(-mean);   // C[0]
   RealType result = term;
   for(int i = 1; i <= k; ++i)
   {
      term *= mean / i;          // C[N+1] = C[N] * mean / (N+1)
      result += term;
   }
   return result;
This is exactly the same finite sum as used by `gamma_p`/`gamma_q` internally.
As explained previously, it's only used when the result

   p > 0.5 or 1-p = q < 0.5.
The slight danger when using the sum directly like this, is that if
the mean is small and k is large then you're calculating a value ~1, so
conceivably you might overshoot slightly. For this and other reasons in the
case when p < 0.5 and q > 0.5 `gamma_p`/`gamma_q` use a different (infinite but
rapidly converging) sum, so that danger isn't present since you always
calculate the smaller of p and q.
So... it's tempting to suggest that you just call `gamma_p`/`gamma_q` as
required. However, there is a slight benefit for the k = 0 case because you
avoid all the internal logic inside `gamma_p`/`gamma_q` trying to figure out
which method to use etc.
For the incomplete beta function, there are no simple finite sums to fall
back on; but if the choice is between a finite sum of the PDF and an
incomplete beta call, the finite sum may indeed win out in that case.
[endsect] [/sect:optim Optimisation Examples]
[/
Copyright 2006 John Maddock and Paul A. Bristow.