3 <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
4 <title>Comparison of Nth-root Finding Algorithms</title>
5 <link rel="stylesheet" href="../../math.css" type="text/css">
6 <meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
7 <link rel="home" href="../../index.html" title="Math Toolkit 2.11.0">
8 <link rel="up" href="../root_comparison.html" title="Comparison of Root Finding Algorithms">
9 <link rel="prev" href="cbrt_comparison.html" title="Comparison of Cube Root Finding Algorithms">
10 <link rel="next" href="elliptic_comparison.html" title="Comparison of Elliptic Integral Root Finding Algoritghms">
12 <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
13 <table cellpadding="2" width="100%"><tr>
14 <td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../boost.png"></td>
15 <td align="center"><a href="../../../../../../index.html">Home</a></td>
16 <td align="center"><a href="../../../../../../libs/libraries.htm">Libraries</a></td>
17 <td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
18 <td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
19 <td align="center"><a href="../../../../../../more/index.htm">More</a></td>
22 <div class="spirit-nav">
23 <a accesskey="p" href="cbrt_comparison.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../root_comparison.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="elliptic_comparison.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
26 <div class="titlepage"><div><div><h3 class="title">
27 <a name="math_toolkit.root_comparison.root_n_comparison"></a><a class="link" href="root_n_comparison.html" title="Comparison of Nth-root Finding Algorithms">Comparison
28 of Nth-root Finding Algorithms</a>
29 </h3></div></div></div>
31 A second example compares four generalized nth-root finding algorithms for
32 various n-th roots (5, 7 and 13) of a single value 28.0, for four floating-point
33 types, <code class="computeroutput"><span class="keyword">float</span></code>, <code class="computeroutput"><span class="keyword">double</span></code>, <code class="computeroutput"><span class="keyword">long</span>
34 <span class="keyword">double</span></code> and a <a href="../../../../../../libs/multiprecision/doc/html/index.html" target="_top">Boost.Multiprecision</a>
35 type <code class="computeroutput"><span class="identifier">cpp_bin_float_50</span></code>. In
36 each case the target accuracy was set using our "recomended" accuracy
37 limits (or at least limits that make a good starting point - which is likely
38 to give close to full accuracy without resorting to unnecessary iterations).
40 <div class="informaltable"><table class="table">
66 numeric_limits<T>::digits - 2
78 floor(numeric_limits<T>::digits * 0.6)
90 floor(numeric_limits<T>::digits * 0.4)
102 floor(numeric_limits<T>::digits * 0.4)
109 Tests used Microsoft Visual Studio 2013 (Update 1) and GCC 4.9.1 using source
110 code <a href="../../../../example/root_n_finding_algorithms.cpp" target="_top">root_n_finding_algorithms.cpp</a>.
113 The timing uncertainty (especially using MSVC) is at least 5% of normalized
117 To pick out the 'best' and 'worst' algorithms are highlighted in blue and
118 red. More than one result can be 'best' when normalized times are indistinguishable
119 within the uncertainty.
122 <a name="math_toolkit.root_comparison.root_n_comparison.h0"></a>
123 <span class="phrase"><a name="math_toolkit.root_comparison.root_n_comparison.program_example_root_n_finding_a"></a></span><a class="link" href="root_n_comparison.html#math_toolkit.root_comparison.root_n_comparison.program_example_root_n_finding_a">Program
124 ..\example\root_n_finding_algorithms.cpp, Microsoft Visual C++ version 14.1,
125 Dinkumware standard library version 650, Win32 Compiled in optimise mode.,
129 Fraction of full accuracy 1
132 <a name="math_toolkit.root_comparison.root_n_comparison.root_5"></a><p class="title"><b>Table 10.3. 5th root(28) for float, double, long double and cpp_bin_float_50 types,
133 using _X86_SSE2</b></p>
134 <div class="table-contents"><table class="table" summary="5th root(28) for float, double, long double and cpp_bin_float_50 types,
210 <td class="auto-generated"> </td>
211 <td class="auto-generated"> </td>
393 <span class="red">8.11</span>
422 <span class="blue">1.00</span>
444 <span class="blue">1.00</span>
488 <span class="blue">1.00</span>
561 <span class="blue">1.00</span>
656 <span class="blue">1.01</span>
692 <br class="table-break"><div class="table">
693 <a name="math_toolkit.root_comparison.root_n_comparison.root_7"></a><p class="title"><b>Table 10.4. 7th root(28) for float, double, long double and cpp_bin_float_50 types,
694 using _X86_SSE2</b></p>
695 <div class="table-contents"><table class="table" summary="7th root(28) for float, double, long double and cpp_bin_float_50 types,
771 <td class="auto-generated"> </td>
772 <td class="auto-generated"> </td>
910 <span class="red">4.06</span>
932 <span class="red">4.17</span>
954 <span class="red">8.12</span>
983 <span class="blue">1.00</span>
1005 <span class="blue">1.00</span>
1027 <span class="blue">1.00</span>
1049 <span class="blue">1.00</span>
1253 <br class="table-break"><div class="table">
1254 <a name="math_toolkit.root_comparison.root_n_comparison.root_11"></a><p class="title"><b>Table 10.5. 11th root(28) for float, double, long double and cpp_bin_float_50
1255 types, using _X86_SSE2</b></p>
1256 <div class="table-contents"><table class="table" summary="11th root(28) for float, double, long double and cpp_bin_float_50
1257 types, using _X86_SSE2">
1332 <td class="auto-generated"> </td>
1333 <td class="auto-generated"> </td>
1471 <span class="red">4.19</span>
1515 <span class="red">9.28</span>
1544 <span class="blue">1.00</span>
1566 <span class="blue">1.00</span>
1588 <span class="blue">1.00</span>
1610 <span class="blue">1.00</span>
1814 <br class="table-break"><h4>
1815 <a name="math_toolkit.root_comparison.root_n_comparison.h1"></a>
1816 <span class="phrase"><a name="math_toolkit.root_comparison.root_n_comparison.program_root_n_finding_algorithm"></a></span><a class="link" href="root_n_comparison.html#math_toolkit.root_comparison.root_n_comparison.program_root_n_finding_algorithm">Program
1817 root_n_finding_algorithms.cpp, Microsoft Visual C++ version 12.0, Dinkumware
1818 standard library version 610, Win32 Compiled in optimise mode., _X64_AVX</a>
1821 Fraction of full accuracy 1
1824 <a name="math_toolkit.root_comparison.root_n_comparison.root_5_0"></a><p class="title"><b>Table 10.6. 5th root(28) for float, double, long double and cpp_bin_float_50 types,
1825 using _X64_AVX</b></p>
1826 <div class="table-contents"><table class="table" summary="5th root(28) for float, double, long double and cpp_bin_float_50 types,
1902 <td class="auto-generated"> </td>
1903 <td class="auto-generated"> </td>
2085 <span class="red">7.51</span>
2114 <span class="blue">1.00</span>
2136 <span class="blue">1.00</span>
2158 <span class="blue">1.00</span>
2180 <span class="blue">1.00</span>
2384 <br class="table-break"><div class="table">
2385 <a name="math_toolkit.root_comparison.root_n_comparison.root_7_0"></a><p class="title"><b>Table 10.7. 7th root(28) for float, double, long double and cpp_bin_float_50 types,
2386 using _X64_AVX</b></p>
2387 <div class="table-contents"><table class="table" summary="7th root(28) for float, double, long double and cpp_bin_float_50 types,
2463 <td class="auto-generated"> </td>
2464 <td class="auto-generated"> </td>
2646 <span class="red">6.81</span>
2675 <span class="blue">1.00</span>
2697 <span class="blue">1.00</span>
2719 <span class="blue">1.00</span>
2741 <span class="blue">1.00</span>
2945 <br class="table-break"><div class="table">
2946 <a name="math_toolkit.root_comparison.root_n_comparison.root_11_0"></a><p class="title"><b>Table 10.8. 11th root(28) for float, double, long double and cpp_bin_float_50
2947 types, using _X64_AVX</b></p>
2948 <div class="table-contents"><table class="table" summary="11th root(28) for float, double, long double and cpp_bin_float_50
2949 types, using _X64_AVX">
3024 <td class="auto-generated"> </td>
3025 <td class="auto-generated"> </td>
3207 <span class="red">8.85</span>
3236 <span class="blue">1.00</span>
3258 <span class="blue">1.00</span>
3280 <span class="blue">1.00</span>
3302 <span class="blue">1.00</span>
3506 <br class="table-break"><h4>
3507 <a name="math_toolkit.root_comparison.root_n_comparison.h2"></a>
3508 <span class="phrase"><a name="math_toolkit.root_comparison.root_n_comparison.program_example_root_n_finding_0"></a></span><a class="link" href="root_n_comparison.html#math_toolkit.root_comparison.root_n_comparison.program_example_root_n_finding_0">Program
3509 ..\example\root_n_finding_algorithms.cpp, GNU C++ version 7.1.0, GNU libstdc++
3510 version 20170502, Win32 Compiled in optimise mode., _X64_SSE2</a>
3513 Fraction of full accuracy 1
3516 <a name="math_toolkit.root_comparison.root_n_comparison.root_5_1"></a><p class="title"><b>Table 10.9. 5th root(28) for float, double, long double and cpp_bin_float_50 types,
3517 using _X64_SSE2</b></p>
3518 <div class="table-contents"><table class="table" summary="5th root(28) for float, double, long double and cpp_bin_float_50 types,
3594 <td class="auto-generated"> </td>
3595 <td class="auto-generated"> </td>
3733 <span class="red">4.04</span>
3755 <span class="red">4.40</span>
3777 <span class="red">8.39</span>
3806 <span class="blue">1.00</span>
3828 <span class="blue">1.00</span>
3850 <span class="blue">1.00</span>
3872 <span class="blue">1.00</span>
4076 <br class="table-break"><div class="table">
4077 <a name="math_toolkit.root_comparison.root_n_comparison.root_7_1"></a><p class="title"><b>Table 10.10. 7th root(28) for float, double, long double and cpp_bin_float_50 types,
4078 using _X64_SSE2</b></p>
4079 <div class="table-contents"><table class="table" summary="7th root(28) for float, double, long double and cpp_bin_float_50 types,
4155 <td class="auto-generated"> </td>
4156 <td class="auto-generated"> </td>
4338 <span class="red">7.32</span>
4367 <span class="blue">1.00</span>
4389 <span class="blue">1.00</span>
4411 <span class="blue">1.00</span>
4433 <span class="blue">1.00</span>
4637 <br class="table-break"><div class="table">
4638 <a name="math_toolkit.root_comparison.root_n_comparison.root_11_1"></a><p class="title"><b>Table 10.11. 11th root(28) for float, double, long double and cpp_bin_float_50
4639 types, using _X64_SSE2</b></p>
4640 <div class="table-contents"><table class="table" summary="11th root(28) for float, double, long double and cpp_bin_float_50
4641 types, using _X64_SSE2">
4716 <td class="auto-generated"> </td>
4717 <td class="auto-generated"> </td>
4877 <span class="red">4.22</span>
4899 <span class="red">9.53</span>
4928 <span class="blue">1.00</span>
4950 <span class="blue">1.00</span>
4972 <span class="blue">1.00</span>
4994 <span class="blue">1.00</span>
5023 <span class="blue">1.04</span>
5198 <br class="table-break"><p>
5199 Some tentative conclusions can be drawn from this limited exercise.
5201 <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
5202 <li class="listitem">
5203 Perhaps surprisingly, there is little difference between the various
5204 algorithms for <a href="http://en.cppreference.com/w/cpp/language/types" target="_top">fundamental
5205 (built-in) types</a> floating-point types. Using the first derivatives
5206 (<a class="link" href="../roots_deriv.html#math_toolkit.roots_deriv.newton">Newton-Raphson iteration</a>)
5207 is usually the best, but while the improvement over the no-derivative
5208 <a href="http://portal.acm.org/citation.cfm?id=210111" target="_top">TOMS Algorithm
5209 748: enclosing zeros of continuous functions</a> is considerable
5210 in number of iterations, but little in execution time. This reflects
5211 the fact that the function we are finding the root for is trivial to
5212 evaluate, so runtimetimes are dominated by the time taken by the boilerplate
5213 code in each method.
5215 <li class="listitem">
5216 The extra cost of evaluating the second derivatives (<a class="link" href="../roots_deriv.html#math_toolkit.roots_deriv.halley">Halley</a>
5217 or <a class="link" href="../roots_deriv.html#math_toolkit.roots_deriv.schroder">Schröder</a>)
5218 is usually too much for any net benefit: as with the cube root, these
5219 functors are so cheap to evaluate that the runtime is largely dominated
5220 by the complexity of the root finding method.
5222 <li class="listitem">
5223 For a <a href="../../../../../../libs/multiprecision/doc/html/index.html" target="_top">Boost.Multiprecision</a>
5224 floating-point type, the <a class="link" href="../roots_deriv.html#math_toolkit.roots_deriv.newton">Newton-Raphson
5225 iteration</a> is a clear winner with a several-fold gain over <a href="http://portal.acm.org/citation.cfm?id=210111" target="_top">TOMS Algorithm 748:
5226 enclosing zeros of continuous functions</a>, and again no improvement
5227 from the second-derivative algorithms.
5229 <li class="listitem">
5230 The run-time of 50 decimal-digit <a href="../../../../../../libs/multiprecision/doc/html/index.html" target="_top">Boost.Multiprecision</a>
5231 is about 30-fold greater than <code class="computeroutput"><span class="keyword">double</span></code>.
5233 <li class="listitem">
5234 The column 'dis' showing the number of bits distance from the correct
5235 result. The Newton-Raphson algorithm shows a bit or two better accuracy
5236 than <a href="http://portal.acm.org/citation.cfm?id=210111" target="_top">TOMS Algorithm
5237 748: enclosing zeros of continuous functions</a>.
5239 <li class="listitem">
5240 The goodness of the 'guess' is especially crucial for <a href="../../../../../../libs/multiprecision/doc/html/index.html" target="_top">Boost.Multiprecision</a>.
5241 Separate experiments show that evaluating the 'guess' using <code class="computeroutput"><span class="keyword">double</span></code> allows convergence to the final
5242 exact result in one or two iterations. So in this contrived example,
5243 crudely dividing the exponent by N for a 'guess', it would be far better
5244 to use a <code class="computeroutput"><span class="identifier">pow</span><span class="special"><</span><span class="keyword">double</span><span class="special">></span></code>
5245 or , if more precise <code class="computeroutput"><span class="identifier">pow</span><span class="special"><</span><span class="keyword">long</span> <span class="keyword">double</span><span class="special">></span></code>,
5246 function to estimate a 'guess'. The limitation of this tactic is that
5247 the range of possible (exponent) values may be less than the multiprecision
5250 <li class="listitem">
5251 Using floating-point extension <a href="http://en.wikipedia.org/wiki/SSE2" target="_top">SSE2
5252 instructions</a> made a modest ten-percent speedup.
5254 <li class="listitem">
5255 Using MSVC, there was some improvement using 64-bit, markedly for <a href="../../../../../../libs/multiprecision/doc/html/index.html" target="_top">Boost.Multiprecision</a>.
5257 <li class="listitem">
5258 The GCC compiler 4.9.1 using 64-bit was at least five-folder faster that
5259 32-bit, apparently reflecting better optimization.
5263 Clearly, your mileage <span class="bold"><strong>will vary</strong></span>, but in
5264 summary, <a class="link" href="../roots_deriv.html#math_toolkit.roots_deriv.newton">Newton-Raphson iteration</a>
5265 seems the first choice of algorithm, and effort to find a good 'guess' the
5266 first speed-up target, especially for <a href="../../../../../../libs/multiprecision/doc/html/index.html" target="_top">Boost.Multiprecision</a>.
5267 And of course, compiler optimisation is crucial for speed.
5270 <table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
5271 <td align="left"></td>
5272 <td align="right"><div class="copyright-footer">Copyright © 2006-2019 Nikhar
5273 Agrawal, Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos,
5274 Hubert Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Matthew Pulver, Johan
5275 Råde, Gautam Sewani, Benjamin Sobotta, Nicholas Thompson, Thijs van den Berg,
5276 Daryle Walker and Xiaogang Zhang<p>
5277 Distributed under the Boost Software License, Version 1.0. (See accompanying
5278 file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
5283 <div class="spirit-nav">
5284 <a accesskey="p" href="cbrt_comparison.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../root_comparison.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="elliptic_comparison.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>