libs/math/doc/policies/policy_tutorial.qbk

   1 [section:pol_tutorial Policy Tutorial]
   2
   3 [section:what_is_a_policy So Just What is a Policy Anyway?]
   4
   5 A policy is a compile-time mechanism for customising the behaviour of a
   6 special function, or a statistical distribution.  With Policies you can
   7 control:
   8
   9 * What action to take when an error occurs.
  10 * What happens when you call a function that is mathematically undefined
  11 (for example, if you ask for the mean of a Cauchy distribution).
  12 * What happens when you ask for a quantile of a discrete distribution.
  13 * Whether the library is allowed to internally promote `float` to `double`
  14 and `double` to `long double` in order to improve precision.
  15 * What precision to use when calculating the result.
  16
  17 Some of these policies could arguably be run-time variables, but then we couldn't
  18 use compile-time dispatch internally to select the best evaluation method
  19 for the given policies.
  20
  21 For this reason a Policy is a /type/: in fact it's an instance of the
  22 class template `boost::math::policies::policy<>`.  This class is just a
  23 compile-time-container of user-selected policies (sometimes called a type-list).
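
To illustrate why a policy being a /type/ matters, here is a minimal sketch of compile-time dispatch on a policy type. It uses only the standard library; the names `sketch_policy`, `throw_on_error_tag`, `errno_on_error_tag`, `evaluation_type` and `error_path` are hypothetical stand-ins invented for this sketch, and it is not how the library itself is implemented.

```cpp
#include <type_traits>

// Hypothetical stand-ins for this sketch only: NOT the Boost.Math types.
struct throw_on_error_tag {};
struct errno_on_error_tag {};

// A tiny "policy": a type that carries user choices as compile-time data.
// Unspecified choices fall back to the default template arguments.
template <class ErrorAction = throw_on_error_tag, bool PromoteDouble = true>
struct sketch_policy
{
   typedef ErrorAction error_action;
   static constexpr bool promote_double = PromoteDouble;
};

// The type used for internal evaluation is selected at compile time:
// double is widened to long double only if the policy allows it.
template <class T, class Policy>
struct evaluation_type
{
   typedef typename std::conditional<
      std::is_same<T, double>::value && Policy::promote_double,
      long double, T>::type type;
};

// One overload per error action, selected by tag dispatch.
// The return values are just markers so the selection can be observed.
constexpr int handle_error(throw_on_error_tag) { return 1; } // a real library would throw here
constexpr int handle_error(errno_on_error_tag) { return 2; } // a real library would set errno here

template <class Policy>
constexpr int error_path()
{
   // No run-time branch: the overload is fixed by the policy type.
   return handle_error(typename Policy::error_action());
}
```

Because every choice is encoded in the type, the compiler can pick the evaluation type and the error handler without any run-time test, which is the benefit the paragraph above describes.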
  24
  25 Over a dozen __policy_defaults are provided, so most of the time you can ignore the policy framework,
  26 but you can override the defaults with your own policies to gain detailed control, for example:
  27
  28    using namespace boost::math::policies;
  29
  30    // Define a policy that sets ::errno on domain errors,
  31    // does not promote double to long double internally,
  32    // and aims for only 3 decimal digits of precision,
  33    // usually to trade precision for speed:
  34
  35    typedef policy
  36    <
  37      domain_error<errno_on_error>,
  38      promote_double<false>,
  39      digits10<3>
  40    > my_policy;
  41
  42 [endsect] [/section:what_is_a_policy So Just What is a Policy Anyway?]
  43
  44 [section:policy_tut_defaults Policies Have Sensible Defaults]
  45
  46 Most of the time you can just ignore the policy framework.
  47
  48 [*['The defaults for the various policies are as follows;
  49 if these work for you then you can stop reading now!]]
  50
  51 [variablelist
  52 [[Domain Error][Throws a `std::domain_error` exception.]]
  53 [[Pole Error][Occurs when a function is evaluated at a pole: throws a `std::domain_error` exception.]]
  54 [[Overflow Error][Throws a `std::overflow_error` exception.]]
  55 [[Underflow][Ignores the underflow, and returns zero.]]
  56 [[Denormalised Result][Ignores the fact that the result is denormalised, and returns it.]]
  57 [[Rounding Error][Throws a `boost::math::rounding_error` exception.]]
  58 [[Internal Evaluation Error][Throws a `boost::math::evaluation_error` exception.]]
  59 [[Indeterminate Result Error][Returns a result that depends on the function where the error occurred.]]
  60 [[Promotion of float to double][Does occur by default - gives full float precision results.]]
  61 [[Promotion of double to long double][Does occur by default if long double offers
  62    more precision than double.]]
  63 [[Precision of Approximation Used][By default uses an approximation that
  64    will result in the lowest level of error for the type of the result.]]
  65 [[Behaviour of Discrete Quantiles]
  66    [
  67    The quantile function will by default return an integer result that has been
  68    /rounded outwards/.  That is to say lower quantiles (where the probability is
  69    less than 0.5) are rounded downward, and upper quantiles (where the probability
  70    is greater than 0.5) are rounded upwards.  This behaviour
  71    ensures that if an X% quantile is requested, then /at least/ the requested
  72    coverage will be present in the central region, and /no more than/
  73    the requested coverage will be present in the tails.
  74
  75 This behaviour can be changed so that the quantile functions are rounded
  76    differently, or even return a real-valued result using
  77    [link math_toolkit.pol_overview Policies].  It is strongly
  78    recommended that you read the tutorial
  79    [link math_toolkit.pol_tutorial.understand_dis_quant
  80    Understanding Quantiles of Discrete Distributions] before
  81    using the quantile function on a discrete distribution.  The
  82    [link math_toolkit.pol_ref.discrete_quant_ref reference docs]
  83    describe how to change the rounding policy
  84    for these distributions.
  85 ]]
  86 ]
  87
  88 What's more, if you define your own policy type, then it automatically
  89 inherits the defaults for any policies not explicitly set, so given:
  90
  91    using namespace boost::math::policies;
  92    //
  93    // Define a policy that sets ::errno on domain errors, and does
  94    // not promote double to long double internally:
  95
  96    typedef policy
  97    <
  98      domain_error<errno_on_error>,
  99      promote_double<false>
 100    > my_policy;
 101
 102 then `my_policy` defines a policy where only the domain error handling and
 103 `double`-promotion policies differ from the defaults.
 104
 105 We can also add a desired precision, for example, 9 bits or 3 decimal digits,
 106 to an error-handling policy, usually to trade precision for speed:
 107
 108    typedef policy<domain_error<errno_on_error>, digits2<9> > my_policy;
 109
 110 Or if you want to further modify an existing user policy, use `normalise`:
 111
 112   using boost::math::policies::normalise;
 113
 114   typedef normalise<my_policy, digits2<9> >::type my_policy_9; // errno on error, and limited precision.
 115
 116 [endsect] [/section:policy_tut_defaults Policies Have Sensible Defaults]
 117
 118 [section:policy_usage So How are Policies Used Anyway?]
 119
 120 The details follow later, but basically policies can be set by either:
 121
 122 * Defining some macros that change the default behaviour: [*this is the
 123    recommended method for setting installation-wide policies].
 124 * By instantiating a statistical distribution object with an explicit policy:
 125    this is mainly reserved for ad hoc policy changes.
 126 * By passing a policy to a special function as an optional final argument:
 127    this is mainly reserved for ad hoc policy changes.
 128 * By using some helper macros to define a set of functions or distributions
 129 in the current namespace that use a specific policy: [*this is the
 130 recommended method for setting policies on a project- or translation-unit-wide
 131 basis].
 132
 133 The following sections introduce these methods in more detail.
 134
 135 [endsect] [/section:policy_usage So How are Policies Used Anyway?]
 136
 137 [section:changing_policy_defaults Changing the Policy Defaults]
 138
 139 The default policies used by the library are changed by the usual
 140 configuration macro method.
 141
 142 For example, passing `-DBOOST_MATH_DOMAIN_ERROR_POLICY=errno_on_error` to
 143 your compiler will cause domain errors to set `::errno` and return a __NaN
 144 rather than the usual default behaviour of throwing a `std::domain_error`
 145 exception.
 146
 147 [tip For Microsoft Visual Studio, you can add definitions to the Project Property Page
 148 under C/C++, Preprocessor, Preprocessor Definitions, like:
 149
 150 ``BOOST_MATH_ASSERT_UNDEFINED_POLICY=0
 151 BOOST_MATH_OVERFLOW_ERROR_POLICY=errno_on_error``
 152
 153 This may help to avoid complications with pre-compiled headers,
 154 which can mean that the equivalent definitions in source code:
 155
 156 ``#define BOOST_MATH_ASSERT_UNDEFINED_POLICY false
 157 #define BOOST_MATH_OVERFLOW_ERROR_POLICY errno_on_error``
 158
 159 *may be ignored*.
 160
 161 The compiler command line shows:
 162
 163 ``/D "BOOST_MATH_ASSERT_UNDEFINED_POLICY=0"
 164 /D "BOOST_MATH_OVERFLOW_ERROR_POLICY=errno_on_error"``
 165 ] [/MSVC tip]
 166
 167 There is however a very important caveat to this:
 168
 169 [important
 170 [*['Default policies changed by setting configuration macros must be changed
 171 uniformly in every translation unit in the program.]]
 172
 173 Failure to follow this rule may result in violations of the "One
 174 Definition Rule (ODR)" and result in unpredictable program behaviour.]
 175
 176 That means there are only two safe ways to use these macros:
 177
 178 * Edit them in [@../../../../boost/math/tools/user.hpp boost/math/tools/user.hpp],
 179 so that the defaults are set on an installation-wide basis.
 180 Unfortunately this may not be convenient if
 181 you are using a pre-installed Boost distribution (on Linux for example).
 182 * Set the defines in your project's Makefile or build environment, so that they
 183 are set uniformly across all translation units.
 184
 185 What you should *not* do is:
 186
 187 * Set the defines in the source file using `#define` as doing so
 188 almost certainly will break your program, unless you're absolutely
 189 certain that the program is restricted to a single translation unit.
 190
 191 And, yes, you will find examples in our test programs where we break this
 192 rule, but only because we know there will always be a single
 193 translation unit: ['don't say that you weren't warned!]
 194
 195 [import ../../example/error_handling_example.cpp]
 196
 197 [error_handling_example]
 198
 199 [endsect] [/section:changing_policy_defaults Changing the Policy Defaults]
 200
 201 [section:ad_hoc_dist_policies Setting Policies for Distributions on an Ad Hoc Basis]
 202
 203 All of the statistical distributions in this library are class templates
 204 that accept two template parameters:
 205 real type (float, double ...) and policy (how to handle exceptional events),
 206 both with sensible defaults, for example:
 207
 208    namespace boost{ namespace math{
 209
 210    template <class RealType = double, class Policy = policies::policy<> >
 211    class fisher_f_distribution;
 212
 213    typedef fisher_f_distribution<> fisher_f;
 214
 215    }}
 216
 217 This policy gets used by all the accessor functions that accept
 218 a distribution as an argument, and is forwarded to all the functions called
 219 by these.  So if you use the shorthand-typedef for the distribution, then you get
 220 `double` precision arithmetic and all the default policies.
 221
 222 However, say for example we wanted to evaluate the quantile
 223 of the binomial distribution at float precision, without internal
 224 promotion to double, and with the result rounded to the /nearest/
 225 integer, then here's how it can be done:
 226
 227 [import ../../example/policy_eg_3.cpp]
 228
 229 [policy_eg_3]
 230
 231 Which outputs:
 232
 233 [pre quantile is: 40]
 234
 235 [endsect] [/section:ad_hoc_dist_policies Setting Policies for Distributions on an Ad Hoc Basis]
 236
 237 [section:ad_hoc_sf_policies Changing the Policy on an Ad Hoc Basis for the Special Functions]
 238
 239 All of the special functions in this library come in two overloaded forms,
 240 one with a final "policy" parameter, and one without.  For example:
 241
 242    namespace boost{ namespace math{
 243
 244    template <class RealType, class Policy>
 245    RealType tgamma(RealType, const Policy&);
 246
 247    template <class RealType>
 248    RealType tgamma(RealType);
 249
 250    }} // namespaces
 251
 252 Normally, the second version is just a forwarding wrapper to the first
 253 like this:
 254
 255    template <class RealType>
 256    inline RealType tgamma(RealType x)
 257    {
 258       return tgamma(x, policies::policy<>());
 259    }
 260
 261 So calling a special function with a specific policy
 262 is just a matter of defining the policy type to use
 263 and passing it as the final parameter.  For example,
 264 suppose we want `tgamma` to behave in a C-compatible
 265 fashion and set `::errno` when an error occurs, and never
 266 throw an exception:
 267
 268 [import ../../example/policy_eg_1.cpp]
 269
 270 [policy_eg_1]
 271
 272 which outputs:
 273
 274 [pre
 275 Result of tgamma(30000) is: 1.#INF
 276 errno = 34
 277 Result of tgamma(-10) is: 1.#QNAN
 278 errno = 33
 279 ]
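
The two-overload pattern described in this section can be sketched without the library. The following is a minimal illustration, assuming a made-up function `safe_sqrt` and two made-up policy types (`throwing_policy` and `errno_policy`); none of these names exist in Boost.Math, and the real library's error handling is considerably richer.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>
#include <limits>
#include <stdexcept>

namespace sketch {

// Hypothetical policy types for this sketch only.
struct throwing_policy {};
struct errno_policy {};

// One error handler per policy, chosen by overload resolution.
inline double raise_domain_error(const throwing_policy&)
{
   throw std::domain_error("safe_sqrt: negative argument");
}
inline double raise_domain_error(const errno_policy&)
{
   errno = EDOM; // C-compatible: record the error and never throw
   return std::numeric_limits<double>::quiet_NaN();
}

// Overload with a final policy parameter: this one does the work.
template <class Policy>
double safe_sqrt(double x, const Policy& pol)
{
   if (x < 0)
      return raise_domain_error(pol);
   return std::sqrt(x);
}

// Overload without a policy: just forwards using the default policy.
inline double safe_sqrt(double x)
{
   return safe_sqrt(x, throwing_policy());
}

// Usage: the same error is reported in two different ways.
inline void demo()
{
   bool threw = false;
   try { safe_sqrt(-1.0); } catch (const std::domain_error&) { threw = true; }
   assert(threw); // default policy throws

   errno = 0;
   const double r = safe_sqrt(-1.0, errno_policy());
   assert(std::isnan(r) && errno == EDOM); // explicit policy sets errno instead
}

} // namespace sketch
```

The caller changes behaviour only by changing the type of the final argument; the body of `safe_sqrt` is written once.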
 280
 281 Alternatively, for ad hoc use, we can use the `make_policy`
 282 helper function to create a policy for us: this usage is more
 283 verbose, so is probably only preferred when a policy is going
 284 to be used once only:
 285
 286 [import ../../example/policy_eg_2.cpp]
 287
 288 [policy_eg_2]
 289
 290 [endsect] [/section:ad_hoc_sf_policies Changing the Policy on an Ad Hoc Basis for the Special Functions]
 291
 292 [section:namespace_policies Setting Policies at Namespace or Translation Unit Scope]
 293
 294 Sometimes what you want to do is just change a set of policies within
 295 the current scope: *the one thing you should not do in this situation
 296 is use the configuration macros*, as this can lead to "One Definition
 297 Rule" violations.  Instead this library provides a pair of macros
 298 especially for this purpose.
 299
 300 Let's consider the special functions first: we can declare a set of
 301 forwarding functions that all use a specific policy using the
 302 macro BOOST_MATH_DECLARE_SPECIAL_FUNCTIONS(['Policy]).  This
 303 macro should be used either inside a unique namespace set aside for the
 304 purpose (for example, a C namespace for a C-style policy),
 305 or an unnamed namespace if you just want the functions
 306 visible in global scope for the current file only.
 307
 308 [import ../../example/policy_eg_4.cpp]
 309
 310 [policy_eg_4]
 311
 312 The same mechanism works well at file scope too: by using an unnamed
 313 namespace, we can ensure that these declarations don't conflict with any
 314 alternate policies present in other translation units:
 315
 316 [import ../../example/policy_eg_5.cpp]
 317
 318 [policy_eg_5]
 319
 320 Handling policies for the statistical distributions is very similar except that now
 321 the macro BOOST_MATH_DECLARE_DISTRIBUTIONS accepts two parameters: the
 322 floating point type to use, and the policy type to apply.  For example:
 323
 324    BOOST_MATH_DECLARE_DISTRIBUTIONS(double, my_policy)
 325
 326 This results in a set of typedefs being defined like this:
 327
 328    typedef boost::math::normal_distribution<double, my_policy> normal;
 329
 330 The name of each typedef is the same as the name of the distribution
 331 class template, but without the "_distribution" suffix.
 332
 333 [import ../../example/policy_eg_6.cpp]
 334
 335 [policy_eg_6]
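
To make the typedef-generating mechanism concrete, here is a rough sketch of how a macro of this kind could work. Everything in it is hypothetical: `sketch_lib`, `SKETCH_DECLARE_DISTRIBUTIONS`, `my_project` and the empty class templates are invented for illustration, and the real BOOST_MATH_DECLARE_DISTRIBUTIONS macro covers every distribution in the library.

```cpp
#include <type_traits>

// Hypothetical miniature "library" with two distribution class templates.
namespace sketch_lib {
struct default_policy {};

template <class RealType = double, class Policy = default_policy>
class normal_distribution {};

template <class RealType = double, class Policy = default_policy>
class binomial_distribution {};
} // namespace sketch_lib

// Hypothetical macro in the spirit of BOOST_MATH_DECLARE_DISTRIBUTIONS:
// each typedef is named after the class template minus "_distribution".
#define SKETCH_DECLARE_DISTRIBUTIONS(Type, Policy) \
   typedef sketch_lib::normal_distribution<Type, Policy> normal; \
   typedef sketch_lib::binomial_distribution<Type, Policy> binomial;

// Used inside a namespace set aside for the purpose, so the short names
// do not collide with declarations made under a different policy elsewhere.
namespace my_project {
struct my_policy {};
SKETCH_DECLARE_DISTRIBUTIONS(double, my_policy)
} // namespace my_project
```

After this, `my_project::normal` names the `double` distribution with `my_policy` applied, so client code never has to spell out the policy again.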
 336
 337 [note
 338 There is an important limitation to note: you can *not use the macros
 339 BOOST_MATH_DECLARE_DISTRIBUTIONS and BOOST_MATH_DECLARE_SPECIAL_FUNCTIONS
 340 ['in the same namespace]*,  as doing so creates ambiguities between functions
 341 and distributions of the same name.
 342 ]
 343
 344 As before, the same mechanism also works well at file scope: by using an unnamed
 345 namespace, we can ensure that these declarations don't conflict with any
 346 alternate policies present in other translation units:
 347
 348 [import ../../example/policy_eg_7.cpp]
 349
 350 [policy_eg_7]
 351
 352 [endsect][/section:namespace_policies Setting Policies at Namespace or Translation Unit Scope]
 353
 354 [section:user_def_err_pol Calling User Defined Error Handlers]
 355
 356 [import ../../example/policy_eg_8.cpp]
 357
 358 [policy_eg_8]
 359
 360 [import ../../example/policy_eg_9.cpp]
 361
 362 [policy_eg_9]
 363
 364 [endsect] [/section:user_def_err_pol Calling User Defined Error Handlers]
 365
 366 [section:understand_dis_quant Understanding Quantiles of Discrete Distributions]
 367
 368 Discrete distributions present us with a problem when calculating the
 369 quantile: we are starting from a continuous real-valued variable - the
 370 probability - but the result (the value of the random variable)
 371 should really be discrete.
 372
 373 Consider for example a Binomial distribution, with a sample size of
 374 50, and a success fraction of 0.5.  There are a variety of ways
 375 we can plot a discrete distribution, but if we plot the PDF
 376 as a step-function then it looks something like this:
 377
 378 [$../graphs/binomial_pdf.png]
 379
 380 Now let's suppose that the user asks for the quantile that corresponds
 381 to a probability of 0.05: if we zoom in on the CDF for that region here's
 382 what we see:
 383
 384 [$../graphs/binomial_quantile_1.png]
 385
 386 As can be seen there is no random variable that corresponds to
 387 a probability of exactly 0.05, so we're left with two choices as
 388 shown in the figure:
 389
 390 * We could round the result down to 18.
 391 * We could round the result up to 19.
 392
 393 In fact there's actually a third choice as well: we could "pretend" that the
 394 distribution was continuous and return a real valued result: in this case we
 395 would calculate a result of approximately 18.701 (this accurately
 396 reflects the fact that the result is nearer to 19 than 18).
 397
 398 By using policies we can offer any of the above as options, but that
 399 still leaves the question: ['What is actually the right thing to do?]
 400
 401 And in particular: ['What policy should we use by default?]
 402
 403 In coming to an answer we should realise that:
 404
 405 * Calculating an integer result is often much faster than
 406 calculating a real-valued result: in fact in our tests it
 407 was up to 20 times faster.
 408 * Normally people calculate quantiles so that they can perform
 409 a test of some kind: ['"If the random variable is less than N
 410 then we can reject our null-hypothesis with 90% confidence."]
 411
 412 So there is a genuine benefit to calculating an integer result
 413 as well as it being "the right thing to do" from a philosophical
 414 point of view.  What's more if someone asks for a quantile at 0.05,
 415 then we can normally assume that they are asking for
 416 ['[*at least] 95% of the probability to the right of the value chosen,
 417 and [*no more than] 5% of the probability to the left of the value chosen.]
 418
 419 In the above binomial example we would therefore round the result down to 18.
 420
 421 The converse applies to upper-quantiles: If the probability is greater than
 422 0.5 we would want to round the quantile up, ['so that [*at least] the requested
 423 probability is to the left of the value returned, and [*no more than] 1 - the
 424 requested probability is to the right of the value returned.]
 425
 426 Likewise for two-sided intervals, we would round lower quantiles down,
 427 and upper quantiles up.  This ensures that we have ['at least the requested
 428 probability in the central region] and ['no more than 1 minus the requested
 429 probability in the tail areas.]
 430
 431 For example, taking our 50 sample binomial distribution with a success fraction
 432 of 0.5, if we wanted a two sided 90% confidence interval, then we would ask
 433 for the 0.05 and 0.95 quantiles with the results ['rounded outwards] so that
 434 ['at least 90% of the probability] is in the central area:
 435
 436 [$../graphs/binomial_pdf_3.png]
 437
 438 So far so good, but there is in fact a trap waiting for the unwary here:
 439
 440    quantile(binomial(50, 0.5), 0.05);
 441
 442 returns 18 as the result, which is what we would expect from the graph above,
 443 and indeed there is no x greater than 18 for which:
 444
 445    cdf(binomial(50, 0.5), x) <= 0.05;
 446
 447 However:
 448
 449    quantile(binomial(50, 0.5), 0.95);
 450
 451 returns 31, and indeed while there is no x less than 31 for which:
 452
 453    cdf(binomial(50, 0.5), x) >= 0.95;
 454
 455 We might naively expect that for this symmetrical distribution the result
 456 would be 32 (since 32 = 50 - 18), but we need to remember that the cdf of
 457 the binomial is /inclusive/ of the random variable.  So while the left tail
 458 area /includes/ the quantile returned, the right tail area always excludes
 459 an upper quantile value: since that "belongs" to the central area.
 460
 461 Look at the graph above to see what's going on here: the lower quantile
 462 of 18 belongs to the left tail, so any value <= 18 is in the left tail.
 463 The upper quantile of 31 on the other hand belongs to the central area,
 464 so the tail area actually starts at 32, so any value > 31 is in the
 465 right tail.
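
The asymmetry can be checked by hand without the library. The sketch below tabulates the binomial CDF directly and then searches it; the helper names (`binomial_cdf_table`, `lower_quantile_rounded_down`, `upper_quantile_rounded_up`) are made up for this illustration, and the library itself does not find quantiles by a linear scan like this.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Cumulative probabilities P(X <= k) for k = 0..n of a binomial(n, p)
// distribution, using the recurrence pmf(k+1) = pmf(k) * (n-k)/(k+1) * p/(1-p).
inline std::vector<double> binomial_cdf_table(int n, double p)
{
   assert(n >= 0 && p > 0.0 && p < 1.0);
   std::vector<double> cdf(static_cast<std::size_t>(n) + 1);
   double pmf = 1.0;
   for (int i = 0; i < n; ++i)
      pmf *= (1.0 - p); // pmf(0) = (1-p)^n
   double sum = 0.0;
   for (int k = 0; k <= n; ++k)
   {
      sum += pmf;
      cdf[static_cast<std::size_t>(k)] = sum;
      if (k < n)
         pmf *= (static_cast<double>(n - k) / (k + 1)) * (p / (1.0 - p));
   }
   return cdf;
}

// Lower quantile rounded down: the largest k with cdf(k) <= prob.
// Assumption of this sketch: returns 0 if no such k exists.
inline int lower_quantile_rounded_down(const std::vector<double>& cdf, double prob)
{
   int result = 0;
   for (std::size_t k = 0; k < cdf.size() && cdf[k] <= prob; ++k)
      result = static_cast<int>(k);
   return result;
}

// Upper quantile rounded up: the smallest k with cdf(k) >= prob.
// Assumption of this sketch: returns n if accumulated rounding hides such a k.
inline int upper_quantile_rounded_up(const std::vector<double>& cdf, double prob)
{
   for (std::size_t k = 0; k < cdf.size(); ++k)
      if (cdf[k] >= prob)
         return static_cast<int>(k);
   return static_cast<int>(cdf.size()) - 1;
}
```

For the table built by `binomial_cdf_table(50, 0.5)`, the two searches give 18 for a probability of 0.05 and 31 for a probability of 0.95, matching the values quoted above: the lower quantile sits inside the left tail, while the upper quantile is the last value of the central region.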
 466
 467 Therefore if U and L are the upper and lower quantiles respectively, then
 468 a random variable X is in the tail area (where we would reject the null
 469 hypothesis) if:
 470
 471    X <= L || X > U
 472
 473 And a variable X is inside the central region if:
 474
 475    L < X && X <= U
 476
 477 The moral here is to ['always be very careful with your comparisons
 478 when dealing with a discrete distribution], and if in doubt,
 479 ['base your comparisons on CDFs instead].
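
As a small aid to getting the comparisons right, the two region tests can be written as a pair of helper functions. The names `in_tail` and `in_central_region` are invented for this sketch and are not part of the library. Note that in C++ a chained comparison of the form `L < X <= U` compiles but does not mean what the mathematical notation means, so the two halves are spelled out separately.

```cpp
// lower: the lower quantile (rounded down); upper: the upper quantile (rounded up).

// True when x falls in either tail, i.e. where the null hypothesis is rejected.
constexpr bool in_tail(int x, int lower, int upper)
{
   return x <= lower || x > upper; // lower quantile included, upper quantile excluded
}

// True when x falls in the central region; the exact complement of in_tail.
constexpr bool in_central_region(int x, int lower, int upper)
{
   return lower < x && x <= upper;
}
```

With the quantiles 18 and 31 from the binomial example, the values 18 and 32 are in the tails while 19 and 31 are in the central region.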
 480
 481 [heading Other Rounding Policies are Available]
 482
 483 As you would expect from a section on policies,
 484 other rounding options are available:
 485
 486 [variablelist
 487
 488 [[integer_round_outwards]
 489    [This is the default policy as described above: lower quantiles
 490    are rounded down (probability < 0.5), and upper quantiles
 491    (probability > 0.5) are rounded up.
 492
 493    This gives /no more than/ the requested probability
 494    in the tails, and /at least/ the requested probability
 495    in the central area.]]
 496 [[integer_round_inwards]
 497    [This is the exact opposite of the default policy:
 498    lower quantiles
 499    are rounded up (probability < 0.5),
 500    and upper quantiles (probability > 0.5) are rounded down.
 501
 502    This gives /at least/ the requested probability
 503    in the tails, and /no more than/ the requested probability
 504    in the central area.]]
 505 [[integer_round_down][This policy will always round the result down
 506    no matter whether it is an upper or lower quantile.]]
 507 [[integer_round_up][This policy will always round the result up
 508    no matter whether it is an upper or lower quantile.]]
 509 [[integer_round_nearest][This policy will always round the result
 510    to the nearest integer
 511    no matter whether it is an upper or lower quantile.]]
 512 [[real][This policy will return a real valued result
 513    for the quantile of a discrete distribution: this is
 514    generally much slower than finding an integer result
 515    but does allow for more sophisticated rounding policies.]]
 516
 517 ]
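
The effect of each integer rounding option can be pictured as a rule applied to the real-valued quantile. The sketch below is only a mental model: the enumeration and the function `apply_rounding` are hypothetical, the treatment of a probability of exactly 0.5 is an assumption made here, and the library is not claimed to compute its integer results by rounding a real-valued one.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical names for this sketch; they mirror the options listed above.
enum rounding_mode
{
   round_outwards,
   round_inwards,
   round_down,
   round_up,
   round_nearest
};

// x: the real-valued quantile, p: the probability that was requested.
inline double apply_rounding(double x, double p, rounding_mode mode)
{
   assert(p >= 0.0 && p <= 1.0);
   // Assumption of this sketch: p == 0.5 is treated as an upper quantile.
   const bool lower_quantile = p < 0.5;
   switch (mode)
   {
   case round_outwards: return lower_quantile ? std::floor(x) : std::ceil(x);
   case round_inwards:  return lower_quantile ? std::ceil(x) : std::floor(x);
   case round_down:     return std::floor(x);
   case round_up:       return std::ceil(x);
   case round_nearest:  return std::floor(x + 0.5);
   }
   return x; // not reached for the enumerators above
}
```

Applied to the real-valued result of approximately 18.701 from the binomial example at a probability of 0.05, rounding outwards or down gives 18, while rounding inwards, up or to nearest gives 19.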
 518
 519 [import ../../example/policy_eg_10.cpp]
 520
 521 [policy_eg_10]
 522
 523 [endsect] [/section:understand_dis_quant Understanding Quantiles of Discrete Distributions]
 524
 525 [endsect] [/section:pol_tutorial Policy Tutorial]
 526
 527
 528 [/ math.qbk
 529   Copyright 2007, 2013 John Maddock and Paul A. Bristow.
 530   Distributed under the Boost Software License, Version 1.0.
 531   (See accompanying file LICENSE_1_0.txt or copy at
 532   http://www.boost.org/LICENSE_1_0.txt).
 533 ]
 534
 535