2 Copyright (c) 2019 Nick Thompson
3 Use, modification and distribution are subject to the
4 Boost Software License, Version 1.0. (See accompanying file
5 LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
8 [section:runs_test Runs tests]
13 #include <boost/math/statistics/runs_test.hpp>
15 namespace boost::math::statistics {
17 template<typename RandomAccessContainer>
18 std::pair<Real, Real> runs_above_and_below_threshold(RandomAccessContainer const & v, typename RandomAccessContainer::value_type threshold);
20 template<typename RandomAccessContainer>
21 std::pair<Real, Real> runs_above_and_below_median(RandomAccessContainer const & v);
28 The "runs above and below median test" tries to determine if a sequence is random by looking at the number of consecutive values which exceed (or are below) the median of the sequence.
29 For example, if we are given data {5, 2, 0, 4, 7, 9, 10, 6, 1, 8, 3}, first we find the median (5), and transform the vector into a list of + and -'s, depending on whether the value is greater or less than the median.
30 (Values equal to the median we simply ignore-this is a convention we have inherited from the wonderful `randtests` package in R.)
31 Hence our data vector is transformed into
35 which is 5 runs, with /n/[sub a] = 5 values above and /n/[sub b] = 5 values below the median.
36 [@https://www.itl.nist.gov/div898/handbook/eda/section3/eda35d.htm NIST] tells us the expected number of runs and their variance:
38 [$../graphs/expected_runs_above_threshold.svg]
40 from which we derive the test statistic
42 [$../graphs/runs_test_statistic.svg]
44 whose distribution we approximate as normal to extract a /p/-value.
48 An example usage is as follows:
53 #include <boost/math/statistics/runs_test.hpp>
55 std::random_device rd;
56 std::vector<double> v{5, 2, 0, 4, 7, 9, 10, 6, 1, 8, 3};
57 auto [t, p] = boost::math::statistics::runs_above_and_below_median(v);
58 // t = -0.670820393249936919; p = 0.502334954360502017;
60 We see that we have about a 50% chance of seeing this number of runs if the null hypothesis of randomness is true, and hence the assumption of randomness seems reasonable.
61 As always, the test statistic is the first element of the pair, and the /p/-value is the second element.
66 There are two cases: Where the threshold (typically the median) has already been computed, and the case where the mean and sample variance must be computed on the fly.
67 Computing the median is fairly expensive (requiring a call to `boost::math::statistics::median`), and since the order of the original data must be preserved, it must allocate.
68 If you believe your data to come from a distribution where the means and median coincide, or if you've already computed the median in the course of some other analysis, then you can get away with a call to `runs_above_and_below_threshold` via
72 auto [t, p] = boost::math::statistics::runs_above_and_below_threshold(v, threshold);
75 The performance differences between these two cases are obvious:
78 ---------------------------------------------------------------------------------------
79 Benchmark Time Bytes/second
80 ---------------------------------------------------------------------------------------
81 BMRunsAboveAndBelowMedian<float>/8 260 ns 118.421M/s
82 BMRunsAboveAndBelowMedian<float>/16 318 ns 192.797M/s
83 BMRunsAboveAndBelowMedian<float>/32 417 ns 303.509M/s
84 BMRunsAboveAndBelowMedian<float>/64 625 ns 390.578M/s
85 BMRunsAboveAndBelowMedian<float>/128 743 ns 657.827M/s
86 BMRunsAboveAndBelowMedian<float>/256 1308 ns 767.181M/s
87 BMRunsAboveAndBelowMedian<float>/512 1896 ns 1034.31M/s
88 BMRunsAboveAndBelowMedian<float>/1024 6582 ns 594.458M/s
89 BMRunsAboveAndBelowMedian<float>/2048 26067 ns 300.001M/s
90 BMRunsAboveAndBelowMedian<float>/4096 62023 ns 252.125M/s
91 BMRunsAboveAndBelowMedian<float>/8192 124976 ns 250.256M/s
92 BMRunsAboveAndBelowMedian<float>/16384 242171 ns 258.29M/s
93 BMRunsAboveAndBelowMedian<float>/32768 528683 ns 236.714M/s
94 BMRunsAboveAndBelowMedian<float>/65536 965354 ns 259.185M/s
95 BMRunsAboveAndBelowMedian<float>/131072 2514693 ns 199.068M/s
96 BMRunsAboveAndBelowMedian<float>/262144 4223084 ns 237.058M/s
97 BMRunsAboveAndBelowMedian<float>/524288 8638963 ns 231.755M/s
98 BMRunsAboveAndBelowMedian<float>/1048576 16215682 ns 246.995M/s
99 BMRunsAboveAndBelowMedian<float>/2097152 39180496 ns 204.443M/s
100 BMRunsAboveAndBelowMedian<float>/4194304 82495779 ns 194.307M/s
101 BMRunsAboveAndBelowMedian<float>/8388608 142675936 ns 224.547M/s
102 BMRunsAboveAndBelowMedian<float>/16777216 287638068 ns 223.088M/s
103 BMRunsAboveAndBelowMedian<float>_BigO 17.25 N
104 BMRunsAboveAndBelowMedian<double>/8 191 ns 320.129M/s
105 BMRunsAboveAndBelowMedian<double>/16 233 ns 523.526M/s
106 BMRunsAboveAndBelowMedian<double>/32 334 ns 730.8M/s
107 BMRunsAboveAndBelowMedian<double>/64 456 ns 1070.93M/s
108 BMRunsAboveAndBelowMedian<double>/128 688 ns 1.38789G/s
109 BMRunsAboveAndBelowMedian<double>/256 1257 ns 1.51807G/s
110 BMRunsAboveAndBelowMedian<double>/512 2663 ns 1.43406G/s
111 BMRunsAboveAndBelowMedian<double>/1024 4100 ns 1.86266G/s
112 BMRunsAboveAndBelowMedian<double>/2048 23493 ns 665.851M/s
113 BMRunsAboveAndBelowMedian<double>/4096 57968 ns 539.551M/s
114 BMRunsAboveAndBelowMedian<double>/8192 142272 ns 439.746M/s
115 BMRunsAboveAndBelowMedian<double>/16384 260948 ns 479.639M/s
116 BMRunsAboveAndBelowMedian<double>/32768 551577 ns 453.623M/s
117 BMRunsAboveAndBelowMedian<double>/65536 1056583 ns 473.654M/s
118 BMRunsAboveAndBelowMedian<double>/131072 2123956 ns 471.35M/s
119 BMRunsAboveAndBelowMedian<double>/262144 5028745 ns 398.111M/s
120 BMRunsAboveAndBelowMedian<double>/524288 10399212 ns 384.981M/s
121 BMRunsAboveAndBelowMedian<double>/1048576 23089767 ns 348.496M/s
122 BMRunsAboveAndBelowMedian<double>/2097152 37626884 ns 425.962M/s
123 BMRunsAboveAndBelowMedian<double>/4194304 79281747 ns 404.088M/s
124 BMRunsAboveAndBelowMedian<double>/8388608 172055781 ns 373.391M/s
125 BMRunsAboveAndBelowMedian<double>/16777216 391377449 ns 332.01M/s
126 BMRunsAboveAndBelowMedian<double>_BigO 22.52 N
127 BMRunsAboveAndBelowThreshold<float>/8 41.6 ns 739.55M/s
128 BMRunsAboveAndBelowThreshold<float>/16 58.4 ns 1050.48M/s
129 BMRunsAboveAndBelowThreshold<float>/32 66.5 ns 1.79606G/s
130 BMRunsAboveAndBelowThreshold<float>/64 115 ns 2.0762G/s
131 BMRunsAboveAndBelowThreshold<float>/128 198 ns 2.41515G/s
132 BMRunsAboveAndBelowThreshold<float>/256 365 ns 2.61328G/s
133 BMRunsAboveAndBelowThreshold<float>/512 720 ns 2.65053G/s
134 BMRunsAboveAndBelowThreshold<float>/1024 1424 ns 2.68123G/s
135 BMRunsAboveAndBelowThreshold<float>/2048 3009 ns 2.5379G/s
136 BMRunsAboveAndBelowThreshold<float>/4096 16748 ns 933.699M/s
137 BMRunsAboveAndBelowThreshold<float>/8192 40190 ns 778.105M/s
138 BMRunsAboveAndBelowThreshold<float>/16384 86500 ns 723.067M/s
139 BMRunsAboveAndBelowThreshold<float>/32768 176692 ns 708.108M/s
140 BMRunsAboveAndBelowThreshold<float>/65536 356863 ns 701.198M/s
141 BMRunsAboveAndBelowThreshold<float>/131072 714807 ns 700.08M/s
142 BMRunsAboveAndBelowThreshold<float>/262144 1429078 ns 700.415M/s
143 BMRunsAboveAndBelowThreshold<float>/524288 2877227 ns 695.785M/s
144 BMRunsAboveAndBelowThreshold<float>/1048576 5795662 ns 691.222M/s
145 BMRunsAboveAndBelowThreshold<float>/2097152 11562715 ns 692.427M/s
146 BMRunsAboveAndBelowThreshold<float>/4194304 23364846 ns 686.464M/s
147 BMRunsAboveAndBelowThreshold<float>/8388608 46442540 ns 689.871M/s
148 BMRunsAboveAndBelowThreshold<float>/16777216 92284501 ns 694.006M/s
149 BMRunsAboveAndBelowThreshold<float>_BigO 5.51 N
150 BMRunsAboveAndBelowThreshold<double>/8 45.1 ns 1.32169G/s
151 BMRunsAboveAndBelowThreshold<double>/16 53.6 ns 2.22712G/s
152 BMRunsAboveAndBelowThreshold<double>/32 71.4 ns 3.34079G/s
153 BMRunsAboveAndBelowThreshold<double>/64 112 ns 4.24946G/s
154 BMRunsAboveAndBelowThreshold<double>/128 196 ns 4.87317G/s
155 BMRunsAboveAndBelowThreshold<double>/256 378 ns 5.04476G/s
156 BMRunsAboveAndBelowThreshold<double>/512 702 ns 5.44134G/s
157 BMRunsAboveAndBelowThreshold<double>/1024 1417 ns 5.3865G/s
158 BMRunsAboveAndBelowThreshold<double>/2048 3031 ns 5.03872G/s
159 BMRunsAboveAndBelowThreshold<double>/4096 16813 ns 1.81669G/s
160 BMRunsAboveAndBelowThreshold<double>/8192 41182 ns 1.48565G/s
161 BMRunsAboveAndBelowThreshold<double>/16384 86939 ns 1.40536G/s
162 BMRunsAboveAndBelowThreshold<double>/32768 177255 ns 1.37892G/s
163 BMRunsAboveAndBelowThreshold<double>/65536 356391 ns 1.3713G/s
164 BMRunsAboveAndBelowThreshold<double>/131072 718417 ns 1.36057G/s
165 BMRunsAboveAndBelowThreshold<double>/262144 1442288 ns 1.35583G/s
166 BMRunsAboveAndBelowThreshold<double>/524288 2942259 ns 1.33217G/s
167 BMRunsAboveAndBelowThreshold<double>/1048576 5870235 ns 1.33244G/s
168 BMRunsAboveAndBelowThreshold<double>/2097152 11743081 ns 1.33192G/s
169 BMRunsAboveAndBelowThreshold<double>/4194304 23521002 ns 1.32976G/s
170 BMRunsAboveAndBelowThreshold<double>/8388608 46917407 ns 1.33339G/s
171 BMRunsAboveAndBelowThreshold<double>/16777216 93823876 ns 1.33305G/s
172 BMRunsAboveAndBelowThreshold<double>_BigO 5.59 N 5.59 N