2 .. default-domain:: cpp
4 .. cpp:namespace:: ceres
6 .. _chapter-nnls_solving:
8 ================================
9 Solving Non-linear Least Squares
10 ================================
15 Effective use of Ceres requires some familiarity with the basic
16 components of a non-linear least squares solver, so before we describe
17 how to configure and use the solver, we will take a brief look at how
18 some of the core optimization algorithms in Ceres work.
Let :math:`x \in \mathbb{R}^n` be an :math:`n`-dimensional vector of
variables, and :math:`F(x) = \left[f_1(x), ... , f_{m}(x) \right]^{\top}`
be an :math:`m`-dimensional function of :math:`x`. We are interested in
24 solving the optimization problem [#f1]_
.. math::
   :label: nonlinsq

   \arg \min_{x} &\quad \frac{1}{2}\|F(x)\|^2 \\
   \text{such that} &\quad L \le x \le U,
where :math:`L` and :math:`U` are lower and upper bounds on the
31 parameter vector :math:`x`.
33 Since the efficient global minimization of :eq:`nonlinsq` for
34 general :math:`F(x)` is an intractable problem, we will have to settle
35 for finding a local minimum.
37 In the following, the Jacobian :math:`J(x)` of :math:`F(x)` is an
38 :math:`m\times n` matrix, where :math:`J_{ij}(x) = \partial_j f_i(x)`
and the gradient vector is :math:`g(x) = \nabla \frac{1}{2}\|F(x)\|^2 = J(x)^\top F(x)`.
42 The general strategy when solving non-linear optimization problems is
43 to solve a sequence of approximations to the original problem
44 [NocedalWright]_. At each iteration, the approximation is solved to
45 determine a correction :math:`\Delta x` to the vector :math:`x`. For
46 non-linear least squares, an approximation can be constructed by using
47 the linearization :math:`F(x+\Delta x) \approx F(x) + J(x)\Delta x`,
48 which leads to the following linear least squares problem:
.. math:: \min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2
   :label: linearapprox
53 Unfortunately, naively solving a sequence of these problems and
54 updating :math:`x \leftarrow x+ \Delta x` leads to an algorithm that
55 may not converge. To get a convergent algorithm, we need to control
56 the size of the step :math:`\Delta x`. Depending on how the size of
57 the step :math:`\Delta x` is controlled, non-linear optimization
58 algorithms can be divided into two major categories [NocedalWright]_.
1. **Trust Region** The trust region approach approximates the
objective function using a model function (often a quadratic)
over a subset of the search space known as the trust region. If the
model function succeeds in minimizing the true objective function,
the trust region is expanded; otherwise it is contracted and the
model optimization problem is solved again.
2. **Line Search** The line search approach first finds a descent
direction along which the objective function will be reduced and
then computes a step size that decides how far to move along
that direction. The descent direction can be computed by various
methods, such as gradient descent, Newton's method and Quasi-Newton
methods. The step size can be determined either exactly or
inexactly.
75 Trust region methods are in some sense dual to line search methods:
76 trust region methods first choose a step size (the size of the trust
77 region) and then a step direction while line search methods first
78 choose a step direction and then a step size. Ceres implements
79 multiple algorithms in both categories.
.. _section-trust-region-methods:

Trust Region Methods
====================
86 The basic trust region algorithm looks something like this.
1. Given an initial point :math:`x` and a trust region radius :math:`\mu`.
2. Solve

   .. math::
      \arg \min_{\Delta x}& \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 \\
      \text{such that} &\|D(x)\Delta x\|^2 \le \mu\\
      &L \le x + \Delta x \le U.

3. :math:`\rho = \frac{\displaystyle \|F(x + \Delta x)\|^2 -
   \|F(x)\|^2}{\displaystyle \|J(x)\Delta x + F(x)\|^2 -
   \|F(x)\|^2}`
4. if :math:`\rho > \epsilon` then :math:`x = x + \Delta x`.
5. if :math:`\rho > \eta_1` then :math:`\mu = 2 \mu`
6. else if :math:`\rho < \eta_2` then :math:`\mu = 0.5 * \mu`
7. Go to 2.
104 Here, :math:`\mu` is the trust region radius, :math:`D(x)` is some
105 matrix used to define a metric on the domain of :math:`F(x)` and
106 :math:`\rho` measures the quality of the step :math:`\Delta x`, i.e.,
how well the linear model predicted the decrease in the value of the
108 non-linear objective. The idea is to increase or decrease the radius
109 of the trust region depending on how well the linearization predicts
110 the behavior of the non-linear objective, which in turn is reflected
111 in the value of :math:`\rho`.
113 The key computational step in a trust-region algorithm is the solution
114 of the constrained optimization problem
.. math::
   :label: trp

   \arg \min_{\Delta x}&\quad \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 \\
   \text{such that} &\quad \|D(x)\Delta x\|^2 \le \mu\\
   &\quad L \le x + \Delta x \le U.
122 There are a number of different ways of solving this problem, each
123 giving rise to a different concrete trust-region algorithm. Currently,
124 Ceres implements two trust-region algorithms - Levenberg-Marquardt
125 and Dogleg, each of which is augmented with a line search if bounds
126 constraints are present [Kanzow]_. The user can choose between them by
127 setting :member:`Solver::Options::trust_region_strategy_type`.
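As an illustration, here is a minimal sketch of selecting the trust
region strategy through :class:`Solver::Options` (the function below is
hypothetical and assumes it is passed a ``ceres::Problem`` that has
already been populated with residual blocks):

.. code-block:: c++

   #include <iostream>
   #include "ceres/ceres.h"

   void SolveWithTrustRegion(ceres::Problem* problem) {
     ceres::Solver::Options options;
     // TRUST_REGION is the default minimizer type and LEVENBERG_MARQUARDT
     // the default strategy; DOGLEG is the other valid choice.
     options.minimizer_type = ceres::TRUST_REGION;
     options.trust_region_strategy_type = ceres::LEVENBERG_MARQUARDT;

     ceres::Solver::Summary summary;
     ceres::Solve(options, problem, &summary);
     std::cout << summary.BriefReport() << "\n";
   }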
129 .. rubric:: Footnotes
131 .. [#f1] At the level of the non-linear solver, the block structure is
132 not relevant, therefore our discussion here is in terms of an
133 optimization problem defined over a state vector of size
134 :math:`n`. Similarly the presence of loss functions is also
135 ignored as the problem is internally converted into a pure
136 non-linear least squares problem.
.. _section-levenberg-marquardt:

Levenberg-Marquardt
-------------------
144 The Levenberg-Marquardt algorithm [Levenberg]_ [Marquardt]_ is the
145 most popular algorithm for solving non-linear least squares problems.
146 It was also the first trust region algorithm to be developed
147 [Levenberg]_ [Marquardt]_. Ceres implements an exact step [Madsen]_
148 and an inexact step variant of the Levenberg-Marquardt algorithm
149 [WrightHolt]_ [NashSofer]_.
It can be shown that the solution to :eq:`trp` can be obtained by
solving an unconstrained optimization problem of the form
154 .. math:: \arg\min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 +\lambda \|D(x)\Delta x\|^2
where :math:`\lambda` is a Lagrange multiplier that is inversely
related to :math:`\mu`. In Ceres, we solve for
159 .. math:: \arg\min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 + \frac{1}{\mu} \|D(x)\Delta x\|^2
162 The matrix :math:`D(x)` is a non-negative diagonal matrix, typically
163 the square root of the diagonal of the matrix :math:`J(x)^\top J(x)`.
165 Before going further, let us make some notational simplifications. We
166 will assume that the matrix :math:`\sqrt{\mu} D` has been concatenated
167 at the bottom of the matrix :math:`J` and similarly a vector of zeros
168 has been added to the bottom of the vector :math:`f` and the rest of
our discussion will be in terms of :math:`J` and :math:`f`, i.e., the
170 linear least squares problem.
.. math:: \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + f(x)\|^2 .
   :label: simple
175 For all but the smallest problems the solution of :eq:`simple` in
176 each iteration of the Levenberg-Marquardt algorithm is the dominant
177 computational cost in Ceres. Ceres provides a number of different
178 options for solving :eq:`simple`. There are two major classes of
179 methods - factorization and iterative.
The factorization methods are based on computing an exact solution of
:eq:`simple` using a Cholesky or a QR factorization and lead to an exact
step Levenberg-Marquardt algorithm. But it is not clear if an exact
solution of :eq:`simple` is necessary at each step of the LM algorithm
to solve :eq:`nonlinsq`. In fact, we have already seen evidence
that this may not be the case, as :eq:`simple` is itself a regularized
version of :eq:`linearapprox`. Indeed, it is possible to
188 construct non-linear optimization algorithms in which the linearized
189 problem is solved approximately. These algorithms are known as inexact
190 Newton or truncated Newton methods [NocedalWright]_.
192 An inexact Newton method requires two ingredients. First, a cheap
193 method for approximately solving systems of linear
194 equations. Typically an iterative linear solver like the Conjugate
195 Gradients method is used for this
196 purpose [NocedalWright]_. Second, a termination rule for
197 the iterative solver. A typical termination rule is of the form
.. math:: \|H(x) \Delta x + g(x)\| \leq \eta_k \|g(x)\|.
   :label: inexact
202 Here, :math:`k` indicates the Levenberg-Marquardt iteration number and
203 :math:`0 < \eta_k <1` is known as the forcing sequence. [WrightHolt]_
204 prove that a truncated Levenberg-Marquardt algorithm that uses an
205 inexact Newton step based on :eq:`inexact` converges for any
206 sequence :math:`\eta_k \leq \eta_0 < 1` and the rate of convergence
207 depends on the choice of the forcing sequence :math:`\eta_k`.
209 Ceres supports both exact and inexact step solution strategies. When
210 the user chooses a factorization based linear solver, the exact step
211 Levenberg-Marquardt algorithm is used. When the user chooses an
iterative linear solver, the inexact step Levenberg-Marquardt
algorithm is used.
.. _section-dogleg:

Dogleg
------

Another strategy for solving the trust region problem :eq:`trp` was
introduced by M. J. D. Powell. The key idea there is to compute two
vectors

.. math::

   \Delta x^{\text{Gauss-Newton}} &= \arg \min_{\Delta x}\frac{1}{2} \|J(x)\Delta x + f(x)\|^2.\\
   \Delta x^{\text{Cauchy}} &= -\frac{\|g(x)\|^2}{\|J(x)g(x)\|^2}g(x).
229 Note that the vector :math:`\Delta x^{\text{Gauss-Newton}}` is the
230 solution to :eq:`linearapprox` and :math:`\Delta
231 x^{\text{Cauchy}}` is the vector that minimizes the linear
232 approximation if we restrict ourselves to moving along the direction
of the gradient. The Dogleg method finds a vector :math:`\Delta x`
defined by :math:`\Delta x^{\text{Gauss-Newton}}` and :math:`\Delta
x^{\text{Cauchy}}` that solves the trust region problem. Ceres
supports two variants that can be chosen by setting
:member:`Solver::Options::dogleg_type`.
239 ``TRADITIONAL_DOGLEG`` as described by Powell, constructs two line
240 segments using the Gauss-Newton and Cauchy vectors and finds the point
241 farthest along this line shaped like a dogleg (hence the name) that is
242 contained in the trust-region. For more details on the exact reasoning
243 and computations, please see Madsen et al [Madsen]_.
245 ``SUBSPACE_DOGLEG`` is a more sophisticated method that considers the
246 entire two dimensional subspace spanned by these two vectors and finds
the point that minimizes the trust region problem in this subspace
[ByrdSchnabel]_.
250 The key advantage of the Dogleg over Levenberg-Marquardt is that if
251 the step computation for a particular choice of :math:`\mu` does not
252 result in sufficient decrease in the value of the objective function,
253 Levenberg-Marquardt solves the linear approximation from scratch with
a smaller value of :math:`\mu`. Dogleg, on the other hand, only needs
to compute the interpolation between the Gauss-Newton and the Cauchy
vectors, as neither of them depends on the value of :math:`\mu`.
The Dogleg method can only be used with the exact factorization based
linear solvers.
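For illustration, a minimal sketch of selecting the Dogleg strategy
(assuming, as in the other snippets in this chapter, a populated
``ceres::Problem`` named ``problem``):

.. code-block:: c++

   ceres::Solver::Options options;
   options.trust_region_strategy_type = ceres::DOGLEG;
   // TRADITIONAL_DOGLEG is the default; SUBSPACE_DOGLEG is the alternative.
   options.dogleg_type = ceres::SUBSPACE_DOGLEG;
   // Dogleg requires an exact factorization based linear solver.
   options.linear_solver_type = ceres::DENSE_QR;

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);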
.. _section-inner-iterations:

Inner Iterations
----------------
Some non-linear least squares problems have additional structure in
the way the parameter blocks interact, such that it is beneficial to
modify the way the trust region step is computed. For example,
consider the following regression problem
.. math:: y = a_1 e^{b_1 x} + a_2 e^{b_2 x^2 + c_1}
274 Given a set of pairs :math:`\{(x_i, y_i)\}`, the user wishes to estimate
275 :math:`a_1, a_2, b_1, b_2`, and :math:`c_1`.
Notice that the expression on the right-hand side is linear in
:math:`a_1` and :math:`a_2`, and given any value for :math:`b_1, b_2`
and :math:`c_1`, it is possible to use linear regression to estimate
the optimal values of :math:`a_1` and :math:`a_2`. Indeed, it is
possible to analytically eliminate the variables :math:`a_1` and
:math:`a_2` from the problem entirely. Problems like these are known
as separable least squares problems and the most famous algorithm for
solving them is the Variable Projection algorithm invented by Golub &
Pereyra [GolubPereyra]_.
286 Similar structure can be found in the matrix factorization with
287 missing data problem. There the corresponding algorithm is known as
288 Wiberg's algorithm [Wiberg]_.
290 Ruhe & Wedin present an analysis of various algorithms for solving
291 separable non-linear least squares problems and refer to *Variable
292 Projection* as Algorithm I in their paper [RuheWedin]_.
294 Implementing Variable Projection is tedious and expensive. Ruhe &
295 Wedin present a simpler algorithm with comparable convergence
296 properties, which they call Algorithm II. Algorithm II performs an
297 additional optimization step to estimate :math:`a_1` and :math:`a_2`
298 exactly after computing a successful Newton step.
301 This idea can be generalized to cases where the residual is not
302 linear in :math:`a_1` and :math:`a_2`, i.e.,
.. math:: y = f_1(a_1, e^{b_1 x}) + f_2(a_2, e^{b_2 x^2 + c_1})
306 In this case, we solve for the trust region step for the full problem,
307 and then use it as the starting point to further optimize just `a_1`
308 and `a_2`. For the linear case, this amounts to doing a single linear
309 least squares solve. For non-linear problems, any method for solving
the :math:`a_1` and :math:`a_2` optimization problems will do. The
only constraint on :math:`a_1` and :math:`a_2` (if they are two
different parameter blocks) is that they do not co-occur in a residual
block.
315 This idea can be further generalized, by not just optimizing
316 :math:`(a_1, a_2)`, but decomposing the graph corresponding to the
317 Hessian matrix's sparsity structure into a collection of
318 non-overlapping independent sets and optimizing each of them.
320 Setting :member:`Solver::Options::use_inner_iterations` to ``true``
321 enables the use of this non-linear generalization of Ruhe & Wedin's
322 Algorithm II. This version of Ceres has a higher iteration
complexity, but also displays better convergence behavior per
iteration.
326 Setting :member:`Solver::Options::num_threads` to the maximum number
327 possible is highly recommended.
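A minimal sketch of enabling inner iterations (the thread count is
just an example value; ``problem`` is assumed to be a populated
``ceres::Problem``):

.. code-block:: c++

   ceres::Solver::Options options;
   options.use_inner_iterations = true;
   // Inner iterations benefit from threading; 4 is an example value.
   options.num_threads = 4;

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);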
.. _section-non-monotonic-steps:

Non-monotonic Steps
-------------------
334 Note that the basic trust-region algorithm described in
335 :ref:`section-trust-region-methods` is a descent algorithm in that it
only accepts a point if it strictly reduces the value of the objective
function.
339 Relaxing this requirement allows the algorithm to be more efficient in
the long term at the cost of some local increase in the value of the
objective function.
343 This is because allowing for non-decreasing objective function values
344 in a principled manner allows the algorithm to *jump over boulders* as
345 the method is not restricted to move into narrow valleys while
346 preserving its convergence properties.
348 Setting :member:`Solver::Options::use_nonmonotonic_steps` to ``true``
349 enables the non-monotonic trust region algorithm as described by Conn,
350 Gould & Toint in [Conn]_.
352 Even though the value of the objective function may be larger
353 than the minimum value encountered over the course of the
354 optimization, the final parameters returned to the user are the
355 ones corresponding to the minimum cost over all iterations.
The option to take non-monotonic steps is available for all trust
region strategies.
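Enabling non-monotonic steps is a one line change, sketched below
(``problem`` is again assumed to be a populated ``ceres::Problem``):

.. code-block:: c++

   ceres::Solver::Options options;
   options.use_nonmonotonic_steps = true;
   // Optional: the window over which non-monotonic steps are accepted;
   // 5 is just an example value.
   options.max_consecutive_nonmonotonic_steps = 5;

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);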
.. _section-line-search-methods:

Line Search Methods
===================
366 The line search method in Ceres Solver cannot handle bounds
367 constraints right now, so it can only be used for solving
368 unconstrained problems.
The basic line search algorithm looks something like this:

1. Given an initial point :math:`x`
2. :math:`\Delta x = -H^{-1}(x) g(x)`
3. :math:`\arg \min_\mu \frac{1}{2} \| F(x + \mu \Delta x) \|^2`
4. :math:`x = x + \mu \Delta x`
5. Go to 2.
378 Here :math:`H(x)` is some approximation to the Hessian of the
379 objective function, and :math:`g(x)` is the gradient at
380 :math:`x`. Depending on the choice of :math:`H(x)` we get a variety of
381 different search directions :math:`\Delta x`.
Step 3, which is a one dimensional optimization or `Line Search` along
:math:`\Delta x`, is what gives this class of methods its name.
386 Different line search algorithms differ in their choice of the search
387 direction :math:`\Delta x` and the method used for one dimensional
388 optimization along :math:`\Delta x`. The choice of :math:`H(x)` is the
389 primary source of computational complexity in these
390 methods. Currently, Ceres Solver supports three choices of search
391 directions, all aimed at large scale problems.
393 1. ``STEEPEST_DESCENT`` This corresponds to choosing :math:`H(x)` to
394 be the identity matrix. This is not a good search direction for
anything but the simplest of the problems. It is only included here
for completeness.
398 2. ``NONLINEAR_CONJUGATE_GRADIENT`` A generalization of the Conjugate
399 Gradient method to non-linear functions. The generalization can be
400 performed in a number of different ways, resulting in a variety of
401 search directions. Ceres Solver currently supports
``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and ``HESTENES_STIEFEL``
directions.
405 3. ``BFGS`` A generalization of the Secant method to multiple
406 dimensions in which a full, dense approximation to the inverse
407 Hessian is maintained and used to compute a quasi-Newton step
408 [NocedalWright]_. BFGS is currently the best known general
409 quasi-Newton algorithm.
411 4. ``LBFGS`` A limited memory approximation to the full ``BFGS``
412 method in which the last `M` iterations are used to approximate the
inverse Hessian used to compute a quasi-Newton step [Nocedal]_,
[ByrdNocedal]_.
416 Currently Ceres Solver supports both a backtracking and interpolation
417 based Armijo line search algorithm, and a sectioning / zoom
418 interpolation (strong) Wolfe condition line search algorithm.
419 However, note that in order for the assumptions underlying the
420 ``BFGS`` and ``LBFGS`` methods to be guaranteed to be satisfied the
421 Wolfe line search algorithm should be used.
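A minimal sketch of configuring a line search minimizer (assuming an
unconstrained, populated ``ceres::Problem`` named ``problem``):

.. code-block:: c++

   ceres::Solver::Options options;
   options.minimizer_type = ceres::LINE_SEARCH;
   options.line_search_direction_type = ceres::LBFGS;
   // The Wolfe line search is needed for the assumptions underlying
   // BFGS/LBFGS to be guaranteed to hold.
   options.line_search_type = ceres::WOLFE;

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);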
.. _section-linear-solver:

Linear Solvers
==============
428 Recall that in both of the trust-region methods described above, the
key computational cost is the solution of a linear least squares
problem of the form
.. math:: \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + f(x)\|^2 .
   :label: simple2
435 Let :math:`H(x)= J(x)^\top J(x)` and :math:`g(x) = -J(x)^\top
436 f(x)`. For notational convenience let us also drop the dependence on
437 :math:`x`. Then it is easy to see that solving :eq:`simple2` is
438 equivalent to solving the *normal equations*.
.. math:: H \Delta x = g
   :label: normal
Ceres provides a number of different options for solving :eq:`normal`.

``DENSE_QR``
------------
450 For small problems (a couple of hundred parameters and a few thousand
451 residuals) with relatively dense Jacobians, ``DENSE_QR`` is the method
452 of choice [Bjorck]_. Let :math:`J = QR` be the QR-decomposition of
453 :math:`J`, where :math:`Q` is an orthonormal matrix and :math:`R` is
454 an upper triangular matrix [TrefethenBau]_. Then it can be shown that
455 the solution to :eq:`normal` is given by
457 .. math:: \Delta x^* = -R^{-1}Q^\top f
460 Ceres uses ``Eigen`` 's dense QR factorization routines.
462 .. _section-cholesky:
464 ``DENSE_NORMAL_CHOLESKY`` & ``SPARSE_NORMAL_CHOLESKY``
465 ------------------------------------------------------
Large non-linear least squares problems are usually sparse. In such
cases, using a dense QR factorization is inefficient. Let :math:`H =
R^\top R` be the Cholesky factorization of the normal equations, where
:math:`R` is an upper triangular matrix. Then the solution to
:eq:`normal` is given by
.. math:: \Delta x^* = R^{-1} R^{-\top} g.
478 The observant reader will note that the :math:`R` in the Cholesky
479 factorization of :math:`H` is the same upper triangular matrix
480 :math:`R` in the QR factorization of :math:`J`. Since :math:`Q` is an
481 orthonormal matrix, :math:`J=QR` implies that :math:`J^\top J = R^\top
482 Q^\top Q R = R^\top R`. There are two variants of Cholesky
483 factorization -- sparse and dense.
``DENSE_NORMAL_CHOLESKY``, as the name implies, performs a dense
Cholesky factorization of the normal equations. Ceres uses
``Eigen`` 's dense LDLT factorization routines.

``SPARSE_NORMAL_CHOLESKY``, as the name implies, performs a sparse
Cholesky factorization of the normal equations. This leads to
substantial savings in time and memory for large sparse
problems. Ceres uses the sparse Cholesky factorization routines in
Professor Tim Davis' ``SuiteSparse`` or ``CXSparse`` packages [Chen]_
or the sparse Cholesky factorization algorithm in ``Eigen`` (which
incidentally is a port of the algorithm implemented inside ``CXSparse``).
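A minimal sketch of selecting a sparse factorization based solver
(this assumes Ceres was built with at least one sparse linear algebra
backend, and that ``problem`` is a populated ``ceres::Problem``):

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;
   // Optionally pick the sparse backend explicitly; SUITE_SPARSE is
   // used here purely as an example.
   options.sparse_linear_algebra_library_type = ceres::SUITE_SPARSE;

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);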
.. _section-cgnr:

``CGNR``
--------

For general sparse problems, if the problem is too large for
``CHOLMOD`` or a sparse linear algebra library is not linked into
Ceres, another option is the ``CGNR`` solver. This solver uses the
Conjugate Gradients solver on the *normal equations*, but without
forming the normal equations explicitly. It exploits the relation

.. math::
   H x = J^\top J x = J^\top(J x)
The convergence of Conjugate Gradients depends on the condition
number :math:`\kappa(H)`. Usually :math:`H` is poorly conditioned and
513 a :ref:`section-preconditioner` must be used to get reasonable
514 performance. Currently only the ``JACOBI`` preconditioner is available
515 for use with ``CGNR``. It uses the block diagonal of :math:`H` to
516 precondition the normal equations.
518 When the user chooses ``CGNR`` as the linear solver, Ceres
automatically switches from the exact step algorithm to an inexact
step algorithm.
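A minimal sketch of using ``CGNR`` (``problem`` is assumed to be a
populated ``ceres::Problem``; the iteration cap is just an example
value):

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::CGNR;
   // JACOBI is currently the only preconditioner available with CGNR.
   options.preconditioner_type = ceres::JACOBI;
   // Example cap on the Conjugate Gradients iterations per step.
   options.max_linear_solver_iterations = 100;

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);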
524 ``DENSE_SCHUR`` & ``SPARSE_SCHUR``
525 ----------------------------------
While it is possible to use ``SPARSE_NORMAL_CHOLESKY`` to solve bundle
adjustment problems, bundle adjustment problems have a special
structure, and a more efficient scheme for solving :eq:`normal` can be
constructed.
532 Suppose that the SfM problem consists of :math:`p` cameras and
533 :math:`q` points and the variable vector :math:`x` has the block
structure :math:`x = [y_{1}, ... ,y_{p},z_{1}, ... ,z_{q}]`, where
:math:`y` and :math:`z` correspond to camera and point parameters,
536 respectively. Further, let the camera blocks be of size :math:`c` and
537 the point blocks be of size :math:`s` (for most problems :math:`c` =
538 :math:`6`--`9` and :math:`s = 3`). Ceres does not impose any constancy
539 requirement on these block sizes, but choosing them to be constant
540 simplifies the exposition.
542 A key characteristic of the bundle adjustment problem is that there is
543 no term :math:`f_{i}` that includes two or more point blocks. This in
544 turn implies that the matrix :math:`H` is of the form
546 .. math:: H = \left[ \begin{matrix} B & E\\ E^\top & C \end{matrix} \right]\ ,
549 where :math:`B \in \mathbb{R}^{pc\times pc}` is a block sparse matrix
550 with :math:`p` blocks of size :math:`c\times c` and :math:`C \in
551 \mathbb{R}^{qs\times qs}` is a block diagonal matrix with :math:`q` blocks
552 of size :math:`s\times s`. :math:`E \in \mathbb{R}^{pc\times qs}` is a
553 general block sparse matrix, with a block of size :math:`c\times s`
554 for each observation. Let us now block partition :math:`\Delta x =
555 [\Delta y,\Delta z]` and :math:`g=[v,w]` to restate :eq:`normal`
556 as the block structured linear system
.. math::
   :label: linear2

   \left[ \begin{matrix} B & E\\ E^\top & C \end{matrix}
   \right]\left[ \begin{matrix} \Delta y \\ \Delta z
   \end{matrix} \right] = \left[ \begin{matrix} v\\ w
   \end{matrix} \right]\ ,
564 and apply Gaussian elimination to it. As we noted above, :math:`C` is
565 a block diagonal matrix, with small diagonal blocks of size
566 :math:`s\times s`. Thus, calculating the inverse of :math:`C` by
567 inverting each of these blocks is cheap. This allows us to eliminate
568 :math:`\Delta z` by observing that :math:`\Delta z = C^{-1}(w - E^\top
569 \Delta y)`, giving us
.. math:: \left[B - EC^{-1}E^\top\right] \Delta y = v - EC^{-1}w\ .
   :label: schur

The matrix
576 .. math:: S = B - EC^{-1}E^\top
578 is the Schur complement of :math:`C` in :math:`H`. It is also known as
579 the *reduced camera matrix*, because the only variables
580 participating in :eq:`schur` are the ones corresponding to the
581 cameras. :math:`S \in \mathbb{R}^{pc\times pc}` is a block structured
582 symmetric positive definite matrix, with blocks of size :math:`c\times
583 c`. The block :math:`S_{ij}` corresponding to the pair of images
584 :math:`i` and :math:`j` is non-zero if and only if the two images
585 observe at least one common point.
588 Now, :eq:`linear2` can be solved by first forming :math:`S`, solving for
589 :math:`\Delta y`, and then back-substituting :math:`\Delta y` to
590 obtain the value of :math:`\Delta z`. Thus, the solution of what was
591 an :math:`n\times n`, :math:`n=pc+qs` linear system is reduced to the
592 inversion of the block diagonal matrix :math:`C`, a few matrix-matrix
and matrix-vector multiplies, and the solution of a block sparse
594 :math:`pc\times pc` linear system :eq:`schur`. For almost all
595 problems, the number of cameras is much smaller than the number of
596 points, :math:`p \ll q`, thus solving :eq:`schur` is
597 significantly cheaper than solving :eq:`linear2`. This is the
598 *Schur complement trick* [Brown]_.
600 This still leaves open the question of solving :eq:`schur`. The
601 method of choice for solving symmetric positive definite systems
602 exactly is via the Cholesky factorization [TrefethenBau]_ and
603 depending upon the structure of the matrix, there are, in general, two
604 options. The first is direct factorization, where we store and factor
605 :math:`S` as a dense matrix [TrefethenBau]_. This method has
606 :math:`O(p^2)` space complexity and :math:`O(p^3)` time complexity and
607 is only practical for problems with up to a few hundred cameras. Ceres
608 implements this strategy as the ``DENSE_SCHUR`` solver.
611 But, :math:`S` is typically a fairly sparse matrix, as most images
612 only see a small fraction of the scene. This leads us to the second
613 option: Sparse Direct Methods. These methods store :math:`S` as a
614 sparse matrix, use row and column re-ordering algorithms to maximize
615 the sparsity of the Cholesky decomposition, and focus their compute
616 effort on the non-zero part of the factorization [Chen]_. Sparse
617 direct methods, depending on the exact sparsity structure of the Schur
618 complement, allow bundle adjustment algorithms to significantly scale
619 up over those based on dense factorization. Ceres implements this
620 strategy as the ``SPARSE_SCHUR`` solver.
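A minimal sketch of selecting a Schur based solver for a bundle
adjustment problem (``problem`` is assumed to be a populated
``ceres::Problem``; see :ref:`section-ordering` for how the
camera/point elimination ordering can be specified or left to Ceres):

.. code-block:: c++

   ceres::Solver::Options options;
   // DENSE_SCHUR for up to a few hundred cameras; SPARSE_SCHUR for
   // larger problems with a sparse Schur complement.
   options.linear_solver_type = ceres::SPARSE_SCHUR;

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);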
.. _section-iterative_schur:

``ITERATIVE_SCHUR``
-------------------
627 Another option for bundle adjustment problems is to apply
628 Preconditioned Conjugate Gradients to the reduced camera matrix
629 :math:`S` instead of :math:`H`. One reason to do this is that
630 :math:`S` is a much smaller matrix than :math:`H`, but more
631 importantly, it can be shown that :math:`\kappa(S)\leq \kappa(H)`.
632 Ceres implements Conjugate Gradients on :math:`S` as the
633 ``ITERATIVE_SCHUR`` solver. When the user chooses ``ITERATIVE_SCHUR``
634 as the linear solver, Ceres automatically switches from the exact step
635 algorithm to an inexact step algorithm.
The key computational operation when using Conjugate Gradients is the
638 evaluation of the matrix vector product :math:`Sx` for an arbitrary
639 vector :math:`x`. There are two ways in which this product can be
640 evaluated, and this can be controlled using
641 ``Solver::Options::use_explicit_schur_complement``. Depending on the
642 problem at hand, the performance difference between these two methods
643 can be quite substantial.
1. **Implicit** This is the default. Implicit evaluation is suitable for
646 large problems where the cost of computing and storing the Schur
647 Complement :math:`S` is prohibitive. Because PCG only needs
648 access to :math:`S` via its product with a vector, one way to
649 evaluate :math:`Sx` is to observe that
.. math::
   :label: schurtrick1

   x_1 &= E^\top x\\
   x_2 &= C^{-1} x_1\\
   x_3 &= Ex_2\\
   x_4 &= Bx\\
   Sx &= x_4 - x_3
658 Thus, we can run PCG on :math:`S` with the same computational
659 effort per iteration as PCG on :math:`H`, while reaping the
660 benefits of a more powerful preconditioner. In fact, we do not
661 even need to compute :math:`H`, :eq:`schurtrick1` can be
662 implemented using just the columns of :math:`J`.
664 Equation :eq:`schurtrick1` is closely related to *Domain
665 Decomposition methods* for solving large linear systems that
666 arise in structural engineering and partial differential
667 equations. In the language of Domain Decomposition, each point in
668 a bundle adjustment problem is a domain, and the cameras form the
669 interface between these domains. The iterative solution of the
670 Schur complement then falls within the sub-category of techniques
671 known as Iterative Sub-structuring [Saad]_ [Mathew]_.
673 2. **Explicit** The complexity of implicit matrix-vector product
674 evaluation scales with the number of non-zeros in the
675 Jacobian. For small to medium sized problems, the cost of
676 constructing the Schur Complement is small enough that it is
677 better to construct it explicitly in memory and use it to
678 evaluate the product :math:`Sx`.
686 In exact arithmetic, the choice of implicit versus explicit Schur
687 complement would have no impact on solution quality. However, in
688 practice if the Jacobian is poorly conditioned, one may observe
689 (usually small) differences in solution quality. This is a
690 natural consequence of performing computations in finite arithmetic.
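A minimal sketch of enabling the explicit Schur complement with
``ITERATIVE_SCHUR`` (note that this combination requires the
``SCHUR_JACOBI`` preconditioner; ``problem`` is assumed to be a
populated ``ceres::Problem``):

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::ITERATIVE_SCHUR;
   options.preconditioner_type = ceres::SCHUR_JACOBI;
   // Worth trying on small to medium sized bundle adjustment problems.
   options.use_explicit_schur_complement = true;

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);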
.. _section-preconditioner:

Preconditioner
==============
698 The convergence rate of Conjugate Gradients for
699 solving :eq:`normal` depends on the distribution of eigenvalues
700 of :math:`H` [Saad]_. A useful upper bound is
:math:`\sqrt{\kappa(H)}`, where :math:`\kappa(H)` is the condition
702 number of the matrix :math:`H`. For most bundle adjustment problems,
703 :math:`\kappa(H)` is high and a direct application of Conjugate
704 Gradients to :eq:`normal` results in extremely poor performance.
706 The solution to this problem is to replace :eq:`normal` with a
707 *preconditioned* system. Given a linear system, :math:`Ax =b` and a
708 preconditioner :math:`M` the preconditioned system is given by
:math:`M^{-1}Ax = M^{-1}b`. The resulting algorithm is known as the
710 Preconditioned Conjugate Gradients algorithm (PCG) and its worst case
711 complexity now depends on the condition number of the *preconditioned*
712 matrix :math:`\kappa(M^{-1}A)`.
714 The computational cost of using a preconditioner :math:`M` is the cost
715 of computing :math:`M` and evaluating the product :math:`M^{-1}y` for
716 arbitrary vectors :math:`y`. Thus, there are two competing factors to
717 consider: How much of :math:`H`'s structure is captured by :math:`M`
718 so that the condition number :math:`\kappa(HM^{-1})` is low, and the
719 computational cost of constructing and using :math:`M`. The ideal
720 preconditioner would be one for which :math:`\kappa(M^{-1}A)
721 =1`. :math:`M=A` achieves this, but it is not a practical choice, as
722 applying this preconditioner would require solving a linear system
723 equivalent to the unpreconditioned problem. It is usually the case
724 that the more information :math:`M` has about :math:`H`, the more
expensive it is to use. For example, Incomplete Cholesky factorization
726 based preconditioners have much better convergence behavior than the
727 Jacobi preconditioner, but are also much more expensive.
729 The simplest of all preconditioners is the diagonal or Jacobi
730 preconditioner, i.e., :math:`M=\operatorname{diag}(A)`, which for
731 block structured matrices like :math:`H` can be generalized to the
732 block Jacobi preconditioner. Ceres implements the block Jacobi
733 preconditioner and refers to it as ``JACOBI``. When used with
734 :ref:`section-cgnr` it refers to the block diagonal of :math:`H` and
735 when used with :ref:`section-iterative_schur` it refers to the block
736 diagonal of :math:`B` [Mandel]_.
738 Another obvious choice for :ref:`section-iterative_schur` is the block
739 diagonal of the Schur complement matrix :math:`S`, i.e, the block
740 Jacobi preconditioner for :math:`S`. Ceres implements it and refers to
it as the ``SCHUR_JACOBI`` preconditioner.
743 For bundle adjustment problems arising in reconstruction from
744 community photo collections, more effective preconditioners can be
745 constructed by analyzing and exploiting the camera-point visibility
746 structure of the scene [KushalAgarwal]_. Ceres implements the two
747 visibility based preconditioners described by Kushal & Agarwal as
748 ``CLUSTER_JACOBI`` and ``CLUSTER_TRIDIAGONAL``. These are fairly new
749 preconditioners and Ceres' implementation of them is in its early
stages and is not as mature as the other preconditioners described
above.
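For illustration, a minimal sketch of selecting a visibility based
preconditioner (this assumes a bundle adjustment style
``ceres::Problem`` named ``problem``):

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::ITERATIVE_SCHUR;
   // SCHUR_JACOBI is a good first choice; CLUSTER_JACOBI and
   // CLUSTER_TRIDIAGONAL exploit the camera-point visibility structure.
   options.preconditioner_type = ceres::CLUSTER_TRIDIAGONAL;
   options.visibility_clustering_type = ceres::CANONICAL_VIEWS;

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);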
.. _section-ordering:

Ordering
========
The order in which variables are eliminated in a linear solver can
have a significant impact on the efficiency and accuracy of the
method. For example, when doing sparse Cholesky factorization, there
are matrices for which a good ordering will give a Cholesky factor
with :math:`O(n)` storage, whereas a bad ordering will result in a
completely dense factor.
765 Ceres allows the user to provide varying amounts of hints to the
766 solver about the variable elimination ordering to use. This can range
767 from no hints, where the solver is free to decide the best ordering
768 based on the user's choices like the linear solver being used, to an
769 exact order in which the variables should be eliminated, and a variety
770 of possibilities in between.
772 Instances of the :class:`ParameterBlockOrdering` class are used to
773 communicate this information to Ceres.
Formally an ordering is an ordered partitioning of the parameter
blocks. Each parameter block belongs to exactly one group, and each
group has a unique integer associated with it that determines its
order in the set of groups. We call these groups *Elimination Groups*.
780 Given such an ordering, Ceres ensures that the parameter blocks in the
781 lowest numbered elimination group are eliminated first, and then the
782 parameter blocks in the next lowest numbered elimination group and so
783 on. Within each elimination group, Ceres is free to order the
parameter blocks as it chooses. For example, consider the linear system

.. math::
   x + y &= 3\\
   2x + y &= 5
790 There are two ways in which it can be solved. First eliminating
791 :math:`x` from the two equations, solving for :math:`y` and then back
792 substituting for :math:`x`, or first eliminating :math:`y`, solving
793 for :math:`x` and back substituting for :math:`y`. The user can
794 construct three orderings here.
796 1. :math:`\{0: x\}, \{1: y\}` : Eliminate :math:`x` first.
797 2. :math:`\{0: y\}, \{1: x\}` : Eliminate :math:`y` first.
798 3. :math:`\{0: x, y\}` : Solver gets to decide the elimination order.
800 Thus, to have Ceres determine the ordering automatically using
801 heuristics, put all the variables in the same elimination group. The
802 identity of the group does not matter. This is the same as not
803 specifying an ordering at all. To control the ordering for every
variable, create an elimination group per variable, ordering them in
the desired order.
807 If the user is using one of the Schur solvers (``DENSE_SCHUR``,
808 ``SPARSE_SCHUR``, ``ITERATIVE_SCHUR``) and chooses to specify an
809 ordering, it must have one important property. The lowest numbered
810 elimination group must form an independent set in the graph
corresponding to the Hessian, or in other words, no two parameter
blocks in the first elimination group should co-occur in the same
residual block. For the best performance, this elimination group
should be as large as possible. For standard bundle adjustment
problems, this corresponds to the first elimination group containing
all the 3d points, and the second containing all the camera parameter
blocks.
819 If the user leaves the choice to Ceres, then the solver uses an
820 approximate maximum independent set algorithm to identify the first
821 elimination group [LiSaad]_.
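For illustration, a sketch of specifying such an ordering for a
bundle adjustment problem; ``points`` and ``cameras`` are hypothetical
containers of the parameter block pointers that were added to the
``ceres::Problem``:

.. code-block:: c++

   ceres::ParameterBlockOrdering* ordering =
       new ceres::ParameterBlockOrdering;
   // Eliminate all the point blocks first (group 0), then the camera
   // blocks (group 1). Within a group, Ceres picks the order.
   for (double* point : points) {
     ordering->AddElementToGroup(point, 0);
   }
   for (double* camera : cameras) {
     ordering->AddElementToGroup(camera, 1);
   }

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::SPARSE_SCHUR;
   options.linear_solver_ordering.reset(ordering);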
823 .. _section-solver-options:
825 :class:`Solver::Options`
826 ========================
828 .. class:: Solver::Options
830 :class:`Solver::Options` controls the overall behavior of the
831 solver. We list the various settings and their default values below.
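The typical usage pattern, sketched below, is to fill out an options
struct, optionally validate it, and pass it to ``ceres::Solve``
(``problem`` is assumed to be a populated ``ceres::Problem``; the
specific values are only examples):

.. code-block:: c++

   ceres::Solver::Options options;
   options.max_num_iterations = 100;
   options.linear_solver_type = ceres::DENSE_QR;
   options.minimizer_progress_to_stdout = true;

   std::string error;
   if (options.IsValid(&error)) {
     ceres::Solver::Summary summary;
     ceres::Solve(options, &problem, &summary);
     std::cout << summary.FullReport() << "\n";
   } else {
     std::cout << "Invalid solver options: " << error << "\n";
   }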
833 .. function:: bool Solver::Options::IsValid(string* error) const
Validates the values in the options struct and returns true on
836 success. If there is a problem, the method returns false with
837 ``error`` containing a textual description of the cause.
839 .. member:: MinimizerType Solver::Options::minimizer_type
841 Default: ``TRUST_REGION``
843 Choose between ``LINE_SEARCH`` and ``TRUST_REGION`` algorithms. See
844 :ref:`section-trust-region-methods` and
845 :ref:`section-line-search-methods` for more details.
847 .. member:: LineSearchDirectionType Solver::Options::line_search_direction_type
851 Choices are ``STEEPEST_DESCENT``, ``NONLINEAR_CONJUGATE_GRADIENT``,
852 ``BFGS`` and ``LBFGS``.
854 .. member:: LineSearchType Solver::Options::line_search_type
858 Choices are ``ARMIJO`` and ``WOLFE`` (strong Wolfe conditions).
859 Note that in order for the assumptions underlying the ``BFGS`` and
860 ``LBFGS`` line search direction algorithms to be guaranteed to be
satisfied, the ``WOLFE`` line search should be used.
863 .. member:: NonlinearConjugateGradientType Solver::Options::nonlinear_conjugate_gradient_type
865 Default: ``FLETCHER_REEVES``
867 Choices are ``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and
868 ``HESTENES_STIEFEL``.
870 .. member:: int Solver::Options::max_lbfgs_rank
The L-BFGS Hessian approximation is a low rank approximation to the
inverse of the Hessian matrix. The rank of the approximation
determines (linearly) the space and time complexity of using the
approximation. The higher the rank, the better the quality of the
approximation. The increase in quality is however bounded for a
number of reasons.

1. The method only uses secant information and not actual
   derivatives.

2. The Hessian approximation is constrained to be positive
   definite.

So increasing this rank to a large number will cost time and space
complexity without the corresponding increase in solution
quality. There are no hard and fast rules for choosing the maximum
rank. The best choice usually requires some problem specific
experimentation.
893 .. member:: bool Solver::Options::use_approximate_eigenvalue_bfgs_scaling
897 As part of the ``BFGS`` update step / ``LBFGS`` right-multiply
898 step, the initial inverse Hessian approximation is taken to be the
899 Identity. However, [Oren]_ showed that using instead :math:`I *
900 \gamma`, where :math:`\gamma` is a scalar chosen to approximate an
901 eigenvalue of the true inverse Hessian can result in improved
902 convergence in a wide variety of cases. Setting
903 ``use_approximate_eigenvalue_bfgs_scaling`` to true enables this
scaling in ``BFGS`` (before first iteration) and ``LBFGS`` (at each
iteration).
907 Precisely, approximate eigenvalue scaling equates to
.. math:: \gamma = \frac{y_k' s_k}{y_k' y_k}

with

.. math:: y_k = \nabla f_{k+1} - \nabla f_k

.. math:: s_k = x_{k+1} - x_k
916 Where :math:`f()` is the line search objective and :math:`x` the
917 vector of parameter values [NocedalWright]_.
919 It is important to note that approximate eigenvalue scaling does
920 **not** *always* improve convergence, and that it can in fact
921 *significantly* degrade performance for certain classes of problem,
922 which is why it is disabled by default. In particular it can
923 degrade performance when the sensitivity of the problem to different
924 parameters varies significantly, as in this case a single scalar
925 factor fails to capture this variation and detrimentally downscales
926 parts of the Jacobian approximation which correspond to
927 low-sensitivity parameters. It can also reduce the robustness of the
928 solution to errors in the Jacobians.
930 .. member:: LineSearchIterpolationType Solver::Options::line_search_interpolation_type
934 Degree of the polynomial used to approximate the objective
function. Valid values are ``BISECTION``, ``QUADRATIC`` and
``CUBIC``.
938 .. member:: double Solver::Options::min_line_search_step_size
940 The line search terminates if:
942 .. math:: \|\Delta x_k\|_\infty < \text{min_line_search_step_size}
944 where :math:`\|\cdot\|_\infty` refers to the max norm, and
945 :math:`\Delta x_k` is the step change in the parameter values at
946 the :math:`k`-th iteration.
948 .. member:: double Solver::Options::line_search_sufficient_function_decrease
952 Solving the line search problem exactly is computationally
953 prohibitive. Fortunately, line search based optimization algorithms
954 can still guarantee convergence if instead of an exact solution,
955 the line search algorithm returns a solution which decreases the
956 value of the objective function sufficiently. More precisely, we
957 are looking for a step size s.t.
959 .. math:: f(\text{step_size}) \le f(0) + \text{sufficient_decrease} * [f'(0) * \text{step_size}]
961 This condition is known as the Armijo condition.
963 .. member:: double Solver::Options::max_line_search_step_contraction
967 In each iteration of the line search,
969 .. math:: \text{new_step_size} >= \text{max_line_search_step_contraction} * \text{step_size}
971 Note that by definition, for contraction:
973 .. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1
975 .. member:: double Solver::Options::min_line_search_step_contraction
979 In each iteration of the line search,
981 .. math:: \text{new_step_size} <= \text{min_line_search_step_contraction} * \text{step_size}
983 Note that by definition, for contraction:
985 .. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1
987 .. member:: int Solver::Options::max_num_line_search_step_size_iterations
991 Maximum number of trial step size iterations during each line
992 search, if a step size satisfying the search conditions cannot be
993 found within this number of trials, the line search will stop.
As this is an 'artificial' constraint (one imposed by the user, not
the underlying math), if ``WOLFE`` line search is being used, *and*
points satisfying the Armijo sufficient (function) decrease condition
have been found during the current search (in :math:`\leq`
``max_num_line_search_step_size_iterations``), then the step size
with the lowest function value which satisfies the Armijo condition
will be returned as the new valid step, even though it does *not*
satisfy the strong Wolfe conditions. This behaviour protects
against early termination of the optimizer at a sub-optimal point.
1005 .. member:: int Solver::Options::max_num_line_search_direction_restarts
1009 Maximum number of restarts of the line search direction algorithm
1010 before terminating the optimization. Restarts of the line search
1011 direction algorithm occur when the current algorithm fails to
1012 produce a new descent direction. This typically indicates a
1013 numerical failure, or a breakdown in the validity of the
1014 approximations used.
1016 .. member:: double Solver::Options::line_search_sufficient_curvature_decrease
1020 The strong Wolfe conditions consist of the Armijo sufficient
1021 decrease condition, and an additional requirement that the
1022 step size be chosen s.t. the *magnitude* ('strong' Wolfe
1023 conditions) of the gradient along the search direction
1024 decreases sufficiently. Precisely, this second condition
1025 is that we seek a step size s.t.
1027 .. math:: \|f'(\text{step_size})\| <= \text{sufficient_curvature_decrease} * \|f'(0)\|
1029 Where :math:`f()` is the line search objective and :math:`f'()` is the derivative
1030 of :math:`f` with respect to the step size: :math:`\frac{d f}{d~\text{step size}}`.
1032 .. member:: double Solver::Options::max_line_search_step_expansion
1036 During the bracketing phase of a Wolfe line search, the step size
1037 is increased until either a point satisfying the Wolfe conditions
1038 is found, or an upper bound for a bracket containing a point
satisfying the conditions is found. Precisely, at each iteration of
the expansion
1042 .. math:: \text{new_step_size} <= \text{max_step_expansion} * \text{step_size}
1044 By definition for expansion
1046 .. math:: \text{max_step_expansion} > 1.0
1048 .. member:: TrustRegionStrategyType Solver::Options::trust_region_strategy_type
1050 Default: ``LEVENBERG_MARQUARDT``
1052 The trust region step computation algorithm used by
1053 Ceres. Currently ``LEVENBERG_MARQUARDT`` and ``DOGLEG`` are the two
1054 valid choices. See :ref:`section-levenberg-marquardt` and
1055 :ref:`section-dogleg` for more details.
1057 .. member:: DoglegType Solver::Options::dogleg_type
1059 Default: ``TRADITIONAL_DOGLEG``
Ceres supports two different dogleg strategies: the
``TRADITIONAL_DOGLEG`` method by Powell and the ``SUBSPACE_DOGLEG``
method described by [ByrdSchnabel]_. See :ref:`section-dogleg` for
more details.
1066 .. member:: bool Solver::Options::use_nonmonotonic_steps
1070 Relax the requirement that the trust-region algorithm take strictly
decreasing steps. See :ref:`section-non-monotonic-steps` for more
details.
1074 .. member:: int Solver::Options::max_consecutive_nonmonotonic_steps
1078 The window size used by the step selection algorithm to accept
1079 non-monotonic steps.
1081 .. member:: int Solver::Options::max_num_iterations
1085 Maximum number of iterations for which the solver should run.
1087 .. member:: double Solver::Options::max_solver_time_in_seconds
1090 Maximum amount of time for which the solver should run.
1092 .. member:: int Solver::Options::num_threads
1096 Number of threads used by Ceres to evaluate the Jacobian.
1098 .. member:: double Solver::Options::initial_trust_region_radius
1102 The size of the initial trust region. When the
1103 ``LEVENBERG_MARQUARDT`` strategy is used, the reciprocal of this
1104 number is the initial regularization parameter.
1106 .. member:: double Solver::Options::max_trust_region_radius
1110 The trust region radius is not allowed to grow beyond this value.
1112 .. member:: double Solver::Options::min_trust_region_radius
The solver terminates when the trust region becomes smaller than
this value.
1119 .. member:: double Solver::Options::min_relative_decrease
Lower threshold for relative decrease before a trust-region step is
accepted.
1126 .. member:: double Solver::Options::min_lm_diagonal
The ``LEVENBERG_MARQUARDT`` strategy uses a diagonal matrix to
1131 regularize the trust region step. This is the lower bound on
1132 the values of this diagonal matrix.
1134 .. member:: double Solver::Options::max_lm_diagonal
The ``LEVENBERG_MARQUARDT`` strategy uses a diagonal matrix to
1139 regularize the trust region step. This is the upper bound on
1140 the values of this diagonal matrix.
1142 .. member:: int Solver::Options::max_num_consecutive_invalid_steps
1146 The step returned by a trust region strategy can sometimes be
1147 numerically invalid, usually because of conditioning
1148 issues. Instead of crashing or stopping the optimization, the
1149 optimizer can go ahead and try solving with a smaller trust
1150 region/better conditioned problem. This parameter sets the number
1151 of consecutive retries before the minimizer gives up.
1153 .. member:: double Solver::Options::function_tolerance
1157 Solver terminates if
1159 .. math:: \frac{|\Delta \text{cost}|}{\text{cost}} <= \text{function_tolerance}
1161 where, :math:`\Delta \text{cost}` is the change in objective
1162 function value (up or down) in the current iteration of
1163 Levenberg-Marquardt.
1165 .. member:: double Solver::Options::gradient_tolerance
1169 Solver terminates if
1171 .. math:: \|x - \Pi \boxplus(x, -g(x))\|_\infty <= \text{gradient_tolerance}
1173 where :math:`\|\cdot\|_\infty` refers to the max norm, :math:`\Pi`
1174 is projection onto the bounds constraints and :math:`\boxplus` is
1175 Plus operation for the overall local parameterization associated
1176 with the parameter vector.
1178 .. member:: double Solver::Options::parameter_tolerance
1182 Solver terminates if
1184 .. math:: \|\Delta x\| <= (\|x\| + \text{parameter_tolerance}) * \text{parameter_tolerance}
1186 where :math:`\Delta x` is the step computed by the linear solver in
1187 the current iteration.
1189 .. member:: LinearSolverType Solver::Options::linear_solver_type
1191 Default: ``SPARSE_NORMAL_CHOLESKY`` / ``DENSE_QR``
1193 Type of linear solver used to compute the solution to the linear
1194 least squares problem in each iteration of the Levenberg-Marquardt
1195 algorithm. If Ceres is built with support for ``SuiteSparse`` or
1196 ``CXSparse`` or ``Eigen``'s sparse Cholesky factorization, the
default is ``SPARSE_NORMAL_CHOLESKY``, otherwise it is ``DENSE_QR``.
1200 .. member:: PreconditionerType Solver::Options::preconditioner_type
1204 The preconditioner used by the iterative linear solver. The default
1205 is the block Jacobi preconditioner. Valid values are (in increasing
1206 order of complexity) ``IDENTITY``, ``JACOBI``, ``SCHUR_JACOBI``,
1207 ``CLUSTER_JACOBI`` and ``CLUSTER_TRIDIAGONAL``. See
1208 :ref:`section-preconditioner` for more details.
1210 .. member:: VisibilityClusteringType Solver::Options::visibility_clustering_type
1212 Default: ``CANONICAL_VIEWS``
1214 Type of clustering algorithm to use when constructing a visibility
1215 based preconditioner. The original visibility based preconditioning
1216 paper and implementation only used the canonical views algorithm.
This algorithm gives high quality results but for large dense
graphs can be particularly expensive, as its worst case complexity
is cubic in the size of the graph.
1222 Another option is to use ``SINGLE_LINKAGE`` which is a simple
1223 thresholded single linkage clustering algorithm that only pays
1224 attention to tightly coupled blocks in the Schur complement. This
1225 is a fast algorithm that works well.
1227 The optimal choice of the clustering algorithm depends on the
1228 sparsity structure of the problem, but generally speaking we
1229 recommend that you try ``CANONICAL_VIEWS`` first and if it is too
1230 expensive try ``SINGLE_LINKAGE``.
1232 .. member:: DenseLinearAlgebraLibrary Solver::Options::dense_linear_algebra_library_type
1236 Ceres supports using multiple dense linear algebra libraries for
1237 dense matrix factorizations. Currently ``EIGEN`` and ``LAPACK`` are
1238 the valid choices. ``EIGEN`` is always available, ``LAPACK`` refers
to the system ``BLAS + LAPACK`` library which may or may not be
available.

This setting affects the ``DENSE_QR``, ``DENSE_NORMAL_CHOLESKY``
and ``DENSE_SCHUR`` solvers. For small to moderate sized problems
``EIGEN`` is a fine choice but for large problems, an optimized
``LAPACK + BLAS`` implementation can make a substantial difference
in performance.
1248 .. member:: SparseLinearAlgebraLibrary Solver::Options::sparse_linear_algebra_library_type
1250 Default: The highest available according to: ``SUITE_SPARSE`` >
1251 ``CX_SPARSE`` > ``EIGEN_SPARSE`` > ``NO_SPARSE``
1253 Ceres supports the use of three sparse linear algebra libraries,
1254 ``SuiteSparse``, which is enabled by setting this parameter to
1255 ``SUITE_SPARSE``, ``CXSparse``, which can be selected by setting
1256 this parameter to ``CX_SPARSE`` and ``Eigen`` which is enabled by
1257 setting this parameter to ``EIGEN_SPARSE``. Lastly, ``NO_SPARSE``
1258 means that no sparse linear solver should be used; note that this is
1259 irrespective of whether Ceres was compiled with support for one.
1261 ``SuiteSparse`` is a sophisticated and complex sparse linear
1262 algebra library and should be used in general.
1264 If your needs/platforms prevent you from using ``SuiteSparse``,
1265 consider using ``CXSparse``, which is a much smaller, easier to
1266 build library. As can be expected, its performance on large
1267 problems is not comparable to that of ``SuiteSparse``.
Last but not least you can use the sparse linear algebra
routines in ``Eigen``. Currently the performance of this library is
the poorest of the three. But this should change in the near
future.
1274 Another thing to consider here is that the sparse Cholesky
1275 factorization libraries in Eigen are licensed under ``LGPL`` and
1276 building Ceres with support for ``EIGEN_SPARSE`` will result in an
1277 LGPL licensed library (since the corresponding code from Eigen is
1278 compiled into the library).
1280 The upside is that you do not need to build and link to an external
1281 library to use ``EIGEN_SPARSE``.
1283 .. member:: int Solver::Options::num_linear_solver_threads
1287 Number of threads used by the linear solver.
1289 .. member:: shared_ptr<ParameterBlockOrdering> Solver::Options::linear_solver_ordering
An instance of the ordering object informs the solver about the
desired order in which parameter blocks should be eliminated by the
linear solvers.

If ``NULL``, the solver is free to choose an ordering that it thinks
is suitable.
1300 See :ref:`section-ordering` for more details.
1302 .. member:: bool Solver::Options::use_explicit_schur_complement
1306 Use an explicitly computed Schur complement matrix with
1307 ``ITERATIVE_SCHUR``.
1309 By default this option is disabled and ``ITERATIVE_SCHUR``
evaluates matrix-vector products between the Schur
1311 complement and a vector implicitly by exploiting the algebraic
1312 expression for the Schur complement.
The cost of this evaluation scales with the number of non-zeros in
the Jacobian.
1317 For small to medium sized problems there is a sweet spot where
1318 computing the Schur complement is cheap enough that it is much more
1319 efficient to explicitly compute it and use it for evaluating the
1320 matrix-vector products.
1322 Enabling this option tells ``ITERATIVE_SCHUR`` to use an explicitly
1323 computed Schur complement. This can improve the performance of the
1324 ``ITERATIVE_SCHUR`` solver significantly.
This option can only be used with the ``SCHUR_JACOBI``
preconditioner.
.. member:: bool Solver::Options::use_postordering
1335 Sparse Cholesky factorization algorithms use a fill-reducing
1336 ordering to permute the columns of the Jacobian matrix. There are
1337 two ways of doing this.
1339 1. Compute the Jacobian matrix in some order and then have the
1340 factorization algorithm permute the columns of the Jacobian.
1342 2. Compute the Jacobian with its columns already permuted.
1344 The first option incurs a significant memory penalty. The
1345 factorization algorithm has to make a copy of the permuted Jacobian
1346 matrix, thus Ceres pre-permutes the columns of the Jacobian matrix
and generally speaking, there is no performance penalty for doing
so.
1350 In some rare cases, it is worth using a more complicated reordering
1351 algorithm which has slightly better runtime performance at the
1352 expense of an extra copy of the Jacobian matrix. Setting
1353 ``use_postordering`` to ``true`` enables this tradeoff.
1355 .. member:: bool Solver::Options::dynamic_sparsity
1357 Some non-linear least squares problems are symbolically dense but
1358 numerically sparse. i.e. at any given state only a small number of
1359 Jacobian entries are non-zero, but the position and number of
1360 non-zeros is different depending on the state. For these problems
it can be useful to factorize the sparse Jacobian at each solver
1362 iteration instead of including all of the zero entries in a single
1363 general factorization.
1365 If your problem does not have this property (or you do not know),
1366 then it is probably best to keep this false, otherwise it will
1367 likely lead to worse performance.
1369 This setting only affects the `SPARSE_NORMAL_CHOLESKY` solver.
1371 .. member:: int Solver::Options::min_linear_solver_iterations
1375 Minimum number of iterations used by the linear solver. This only
1376 makes sense when the linear solver is an iterative solver, e.g.,
1377 ``ITERATIVE_SCHUR`` or ``CGNR``.
1379 .. member:: int Solver::Options::max_linear_solver_iterations
Maximum number of iterations used by the linear solver. This only
1384 makes sense when the linear solver is an iterative solver, e.g.,
1385 ``ITERATIVE_SCHUR`` or ``CGNR``.
1387 .. member:: double Solver::Options::eta
1391 Forcing sequence parameter. The truncated Newton solver uses this
1392 number to control the relative accuracy with which the Newton step
1393 is computed. This constant is passed to
``ConjugateGradientsSolver`` which uses it to terminate the
iterations when
1397 .. math:: \frac{Q_i - Q_{i-1}}{Q_i} < \frac{\eta}{i}
1399 .. member:: bool Solver::Options::jacobi_scaling
1403 ``true`` means that the Jacobian is scaled by the norm of its
1404 columns before being passed to the linear solver. This improves the
1405 numerical conditioning of the normal equations.
1407 .. member:: bool Solver::Options::use_inner_iterations
1411 Use a non-linear version of a simplified variable projection
1412 algorithm. Essentially this amounts to doing a further optimization
1413 on each Newton/Trust region step using a coordinate descent
1414 algorithm. For more details, see :ref:`section-inner-iterations`.
1416 .. member:: double Solver::Options::inner_iteration_tolerance
1420 Generally speaking, inner iterations make significant progress in
1421 the early stages of the solve and then their contribution drops
down sharply, at which point the time spent doing inner iterations
is not worth it.
1425 Once the relative decrease in the objective function due to inner
1426 iterations drops below ``inner_iteration_tolerance``, the use of
inner iterations in subsequent trust region minimizer iterations is
disabled.
1430 .. member:: shared_ptr<ParameterBlockOrdering> Solver::Options::inner_iteration_ordering
If :member:`Solver::Options::use_inner_iterations` is true, then the
1435 user has two choices.
1437 1. Let the solver heuristically decide which parameter blocks to
1438 optimize in each inner iteration. To do this, set
1439 :member:`Solver::Options::inner_iteration_ordering` to ``NULL``.
2. Specify a collection of ordered independent sets. The lower
   numbered groups are optimized before the higher numbered groups
   during the inner optimization phase. Each group must be an
   independent set. Not all parameter blocks need to be included in
   the ordering.
1447 See :ref:`section-ordering` for more details.
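For illustration, the sketch below builds an inner iteration ordering
by hand; the parameter block pointers ``point`` and ``camera`` are
hypothetical and stand in for blocks in the user's problem.

.. code-block:: c++

   // Lower numbered groups are optimized before higher numbered ones.
   auto ordering = std::make_shared<ceres::ParameterBlockOrdering>();
   ordering->AddElementToGroup(point, 0);   // optimized first
   ordering->AddElementToGroup(camera, 1);  // optimized second

   ceres::Solver::Options options;
   options.use_inner_iterations = true;
   options.inner_iteration_ordering = ordering;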
1449 .. member:: LoggingType Solver::Options::logging_type
1451 Default: ``PER_MINIMIZER_ITERATION``
1453 .. member:: bool Solver::Options::minimizer_progress_to_stdout
1457 By default the :class:`Minimizer` progress is logged to ``STDERR``
1458 depending on the ``vlog`` level. If this flag is set to true, and
1459 :member:`Solver::Options::logging_type` is not ``SILENT``, the logging
1460 output is sent to ``STDOUT``.
1462 For ``TRUST_REGION_MINIMIZER`` the progress display looks like
1464 .. code-block:: bash
1466 iter cost cost_change |gradient| |step| tr_ratio tr_radius ls_iter iter_time total_time
1467 0 4.185660e+06 0.00e+00 1.09e+08 0.00e+00 0.00e+00 1.00e+04 0 7.59e-02 3.37e-01
1468 1 1.062590e+05 4.08e+06 8.99e+06 5.36e+02 9.82e-01 3.00e+04 1 1.65e-01 5.03e-01
1469 2 4.992817e+04 5.63e+04 8.32e+06 3.19e+02 6.52e-01 3.09e+04 1 1.45e-01 6.48e-01
1473 #. ``cost`` is the value of the objective function.
1474 #. ``cost_change`` is the change in the value of the objective
1475 function if the step computed in this iteration is accepted.
1476 #. ``|gradient|`` is the max norm of the gradient.
1477 #. ``|step|`` is the change in the parameter vector.
#. ``tr_ratio`` is the ratio of the actual change in the objective
   function value to the change in the value of the trust region
   model.
1481 #. ``tr_radius`` is the size of the trust region radius.
1482 #. ``ls_iter`` is the number of linear solver iterations used to
1483 compute the trust region step. For direct/factorization based
1484 solvers it is always 1, for iterative solvers like
1485 ``ITERATIVE_SCHUR`` it is the number of iterations of the
1486 Conjugate Gradients algorithm.
#. ``iter_time`` is the time taken by the current iteration.
1488 #. ``total_time`` is the total time taken by the minimizer.
1490 For ``LINE_SEARCH_MINIMIZER`` the progress display looks like
1492 .. code-block:: bash
1494 0: f: 2.317806e+05 d: 0.00e+00 g: 3.19e-01 h: 0.00e+00 s: 0.00e+00 e: 0 it: 2.98e-02 tt: 8.50e-02
1495 1: f: 2.312019e+05 d: 5.79e+02 g: 3.18e-01 h: 2.41e+01 s: 1.00e+00 e: 1 it: 4.54e-02 tt: 1.31e-01
1496 2: f: 2.300462e+05 d: 1.16e+03 g: 3.17e-01 h: 4.90e+01 s: 2.54e-03 e: 1 it: 4.96e-02 tt: 1.81e-01
1500 #. ``f`` is the value of the objective function.
1501 #. ``d`` is the change in the value of the objective function if
1502 the step computed in this iteration is accepted.
1503 #. ``g`` is the max norm of the gradient.
1504 #. ``h`` is the change in the parameter vector.
1505 #. ``s`` is the optimal step length computed by the line search.
#. ``it`` is the time taken by the current iteration.
1507 #. ``tt`` is the total time taken by the minimizer.
1509 .. member:: vector<int> Solver::Options::trust_region_minimizer_iterations_to_dump
1513 List of iterations at which the trust region minimizer should dump
1514 the trust region problem. Useful for testing and benchmarking. If
1515 ``empty``, no problems are dumped.
1517 .. member:: string Solver::Options::trust_region_problem_dump_directory
Directory to which the problems should be written. Should be
non-empty if
:member:`Solver::Options::trust_region_minimizer_iterations_to_dump` is
non-empty and
:member:`Solver::Options::trust_region_problem_dump_format_type` is not
``CONSOLE``.
.. member:: DumpFormatType Solver::Options::trust_region_problem_dump_format_type
1530 Default: ``TEXTFILE``
1532 The format in which trust region problems should be logged when
1533 :member:`Solver::Options::trust_region_minimizer_iterations_to_dump`
is non-empty. There are two options:
1536 * ``CONSOLE`` prints the linear least squares problem in a human
1537 readable format to ``stderr``. The Jacobian is printed as a
1538 dense matrix. The vectors :math:`D`, :math:`x` and :math:`f` are
printed as dense vectors. This should only be used for small
problems.
1542 * ``TEXTFILE`` Write out the linear least squares problem to the
1543 directory pointed to by
1544 :member:`Solver::Options::trust_region_problem_dump_directory` as
1545 text files which can be read into ``MATLAB/Octave``. The Jacobian
1546 is dumped as a text file containing :math:`(i,j,s)` triplets, the
vectors :math:`D`, :math:`x` and :math:`f` are dumped as text files
1548 containing a list of their values.
1550 A ``MATLAB/Octave`` script called
1551 ``ceres_solver_iteration_???.m`` is also output, which can be
1552 used to parse and load the problem into memory.
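As an illustrative (not prescriptive) configuration, a few iterations
could be dumped like this; the iteration numbers and the directory
path are assumptions chosen for the example.

.. code-block:: c++

   ceres::Solver::Options options;
   options.trust_region_minimizer_iterations_to_dump = {1, 5, 10};
   options.trust_region_problem_dump_directory = "/tmp/ceres_dumps";
   options.trust_region_problem_dump_format_type = ceres::TEXTFILE;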
1554 .. member:: bool Solver::Options::check_gradients
1558 Check all Jacobians computed by each residual block with finite
1559 differences. This is expensive since it involves computing the
1560 derivative by normal means (e.g. user specified, autodiff, etc),
1561 then also computing it using finite differences. The results are
1562 compared, and if they differ substantially, the optimization fails
1563 and the details are stored in the solver summary.
1565 .. member:: double Solver::Options::gradient_check_relative_precision
1569 Precision to check for in the gradient checker. If the relative
1570 difference between an element in a Jacobian exceeds this number,
1571 then the Jacobian for that cost term is dumped.
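A sketch of how these two options are typically used together; the
tolerance value is illustrative and ``problem`` is assumed to exist.

.. code-block:: c++

   ceres::Solver::Options options;
   options.check_gradients = true;
   options.gradient_check_relative_precision = 1e-6;  // illustrative value

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);
   // If a Jacobian fails the check, the optimization fails and the
   // details appear in the summary.
   std::cout << summary.FullReport() << "\n";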
1573 .. member:: double Solver::Options::gradient_check_numeric_derivative_relative_step_size
1579 This option only applies to the numeric differentiation used for
checking the user provided derivatives when
1581 `Solver::Options::check_gradients` is true. If you are using
1582 :class:`NumericDiffCostFunction` and are interested in changing
1583 the step size for numeric differentiation in your cost function,
1584 please have a look at :class:`NumericDiffOptions`.
1586 Relative shift used for taking numeric derivatives when
1587 `Solver::Options::check_gradients` is `true`.
1589 For finite differencing, each dimension is evaluated at slightly
shifted values, e.g., for forward differences, the numerical
derivative is

.. math::

   \delta &= gradient\_check\_numeric\_derivative\_relative\_step\_size \\
   \Delta f &= \frac{f((1 + \delta) x) - f(x)}{\delta x}
1598 The finite differencing is done along each dimension. The reason to
1599 use a relative (rather than absolute) step size is that this way,
1600 numeric differentiation works for functions where the arguments are
1601 typically large (e.g. :math:`10^9`) and when the values are small
1602 (e.g. :math:`10^{-5}`). It is possible to construct *torture cases*
1603 which break this finite difference heuristic, but they do not come
1604 up often in practice.
1606 .. member:: vector<IterationCallback> Solver::Options::callbacks
1608 Callbacks that are executed at the end of each iteration of the
1609 :class:`Minimizer`. They are executed in the order that they are
1610 specified in this vector. By default, parameter blocks are updated
1611 only at the end of the optimization, i.e., when the
1612 :class:`Minimizer` terminates. This behavior is controlled by
:member:`Solver::Options::update_state_every_iteration`. If the user
wishes to have access to the updated parameter blocks when his/her
1615 callbacks are executed, then set
1616 :member:`Solver::Options::update_state_every_iteration` to true.
1618 The solver does NOT take ownership of these pointers.
1620 .. member:: bool Solver::Options::update_state_every_iteration
1624 Normally the parameter blocks are only updated when the solver
terminates. Setting this to true updates them in every
1626 iteration. This setting is useful when building an interactive
1627 application using Ceres and using an :class:`IterationCallback`.
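For example, the wiring for a hypothetical user-defined
``VisualizerCallback`` (any :class:`IterationCallback` subclass) might
look like the sketch below.

.. code-block:: c++

   // VisualizerCallback is a hypothetical user-defined IterationCallback.
   VisualizerCallback callback;

   ceres::Solver::Options options;
   options.update_state_every_iteration = true;   // callback sees current values
   options.callbacks.push_back(&callback);        // solver does not take ownership

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);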
1629 :class:`ParameterBlockOrdering`
1630 ===============================
1632 .. class:: ParameterBlockOrdering
1634 ``ParameterBlockOrdering`` is a class for storing and manipulating
1635 an ordered collection of groups/sets with the following semantics:
1637 Group IDs are non-negative integer values. Elements are any type
1638 that can serve as a key in a map or an element of a set.
1640 An element can only belong to one group at a time. A group may
1641 contain an arbitrary number of elements.
1643 Groups are ordered by their group id.
1645 .. function:: bool ParameterBlockOrdering::AddElementToGroup(const double* element, const int group)
1647 Add an element to a group. If a group with this id does not exist,
1648 one is created. This method can be called any number of times for
1649 the same element. Group ids should be non-negative numbers. Return
1650 value indicates if adding the element was a success.
.. function:: void ParameterBlockOrdering::Clear()

Clear the ordering.
1656 .. function:: bool ParameterBlockOrdering::Remove(const double* element)
1658 Remove the element, no matter what group it is in. If the element
1659 is not a member of any group, calling this method will result in a
1660 crash. Return value indicates if the element was actually removed.
1662 .. function:: void ParameterBlockOrdering::Reverse()
1664 Reverse the order of the groups in place.
1666 .. function:: int ParameterBlockOrdering::GroupId(const double* element) const
1668 Return the group id for the element. If the element is not a member
1669 of any group, return -1.
1671 .. function:: bool ParameterBlockOrdering::IsMember(const double* element) const
1673 True if there is a group containing the parameter block.
1675 .. function:: int ParameterBlockOrdering::GroupSize(const int group) const
Return the number of elements in the group with the given id. This
function always succeeds, i.e., implicitly there exists a group for
every integer; a group to which no elements have been added has size
zero.
1680 .. function:: int ParameterBlockOrdering::NumElements() const
1682 Number of elements in the ordering.
1684 .. function:: int ParameterBlockOrdering::NumGroups() const
1686 Number of groups with one or more elements.
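A short usage sketch; ``x``, ``y`` and ``z`` are hypothetical
parameter block pointers from the user's problem.

.. code-block:: c++

   ceres::ParameterBlockOrdering ordering;
   ordering.AddElementToGroup(x, 0);
   ordering.AddElementToGroup(y, 0);
   ordering.AddElementToGroup(z, 1);

   // ordering.NumGroups() == 2, ordering.NumElements() == 3
   // ordering.GroupId(x) == 0, ordering.IsMember(z) == true
   ordering.Remove(y);
   // ordering.GroupSize(0) == 1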
1689 :class:`IterationCallback`
1690 ==========================
1692 .. class:: IterationSummary
1694 :class:`IterationSummary` describes the state of the minimizer at
1695 the end of each iteration.
1697 .. member:: int32 IterationSummary::iteration
1699 Current iteration number.
1701 .. member:: bool IterationSummary::step_is_valid
1703 Step was numerically valid, i.e., all values are finite and the
1704 step reduces the value of the linearized model.
1706 **Note**: :member:`IterationSummary::step_is_valid` is `false`
1707 when :member:`IterationSummary::iteration` = 0.
1709 .. member:: bool IterationSummary::step_is_nonmonotonic
1711 Step did not reduce the value of the objective function
1712 sufficiently, but it was accepted because of the relaxed
acceptance criterion used by the non-monotonic trust region
algorithm.
1716 **Note**: :member:`IterationSummary::step_is_nonmonotonic` is
`false` when :member:`IterationSummary::iteration` = 0.
1719 .. member:: bool IterationSummary::step_is_successful
Whether or not the minimizer accepted this step.
1723 If the ordinary trust region algorithm is used, this means that the
1724 relative reduction in the objective function value was greater than
1725 :member:`Solver::Options::min_relative_decrease`. However, if the
1726 non-monotonic trust region algorithm is used
1727 (:member:`Solver::Options::use_nonmonotonic_steps` = `true`), then
1728 even if the relative decrease is not sufficient, the algorithm may
1729 accept the step and the step is declared successful.
1731 **Note**: :member:`IterationSummary::step_is_successful` is `false`
when :member:`IterationSummary::iteration` = 0.
1734 .. member:: double IterationSummary::cost
1736 Value of the objective function.
1738 .. member:: double IterationSummary::cost_change
1740 Change in the value of the objective function in this
1741 iteration. This can be positive or negative.
1743 .. member:: double IterationSummary::gradient_max_norm
1745 Infinity norm of the gradient vector.
1747 .. member:: double IterationSummary::gradient_norm
1749 2-norm of the gradient vector.
1751 .. member:: double IterationSummary::step_norm
2-norm of the step computed in this iteration.
1755 .. member:: double IterationSummary::relative_decrease
1757 For trust region algorithms, the ratio of the actual change in cost
1758 and the change in the cost of the linearized approximation.
This field is not used when a line search minimizer is used.
1762 .. member:: double IterationSummary::trust_region_radius
1764 Size of the trust region at the end of the current iteration. For
1765 the Levenberg-Marquardt algorithm, the regularization parameter is
1.0 / :member:`IterationSummary::trust_region_radius`.
1768 .. member:: double IterationSummary::eta
1770 For the inexact step Levenberg-Marquardt algorithm, this is the
1771 relative accuracy with which the step is solved. This number is
1772 only applicable to the iterative solvers capable of solving linear
systems inexactly. Factorization-based exact solvers always have an
eta of 0.0.
1776 .. member:: double IterationSummary::step_size
Step size computed by the line search algorithm.
1780 This field is not used when a trust region minimizer is used.
1782 .. member:: int IterationSummary::line_search_function_evaluations
1784 Number of function evaluations used by the line search algorithm.
1786 This field is not used when a trust region minimizer is used.
1788 .. member:: int IterationSummary::linear_solver_iterations
Number of iterations taken by the linear solver to solve for the
trust region step.

Currently this field is not used when a line search minimizer is
used.
1796 .. member:: double IterationSummary::iteration_time_in_seconds
Time (in seconds) spent inside the minimizer loop in the current
iteration.
1801 .. member:: double IterationSummary::step_solver_time_in_seconds
1803 Time (in seconds) spent inside the trust region step solver.
1805 .. member:: double IterationSummary::cumulative_time_in_seconds
1807 Time (in seconds) since the user called Solve().
1810 .. class:: IterationCallback
1812 Interface for specifying callbacks that are executed at the end of
1813 each iteration of the minimizer.
.. code-block:: c++

   class IterationCallback {
    public:
     virtual ~IterationCallback() {}
     virtual CallbackReturnType operator()(const IterationSummary& summary) = 0;
   };
1824 The solver uses the return value of ``operator()`` to decide whether
to continue solving or to terminate. The user can return three
values.
1828 #. ``SOLVER_ABORT`` indicates that the callback detected an abnormal
1829 situation. The solver returns without updating the parameter
1830 blocks (unless ``Solver::Options::update_state_every_iteration`` is
1831 set true). Solver returns with ``Solver::Summary::termination_type``
1832 set to ``USER_FAILURE``.
1834 #. ``SOLVER_TERMINATE_SUCCESSFULLY`` indicates that there is no need
1835 to optimize anymore (some user specified termination criterion
1836 has been met). Solver returns with
``Solver::Summary::termination_type`` set to ``USER_SUCCESS``.
#. ``SOLVER_CONTINUE`` indicates that the solver should continue
   optimizing.
1842 For example, the following :class:`IterationCallback` is used
1843 internally by Ceres to log the progress of the optimization.
.. code-block:: c++

   class LoggingCallback : public IterationCallback {
    public:
     explicit LoggingCallback(bool log_to_stdout)
         : log_to_stdout_(log_to_stdout) {}

     ~LoggingCallback() {}

     CallbackReturnType operator()(const IterationSummary& summary) {
       const char* kReportRowFormat =
           "% 4d: f:% 8e d:% 3.2e g:% 3.2e h:% 3.2e "
           "rho:% 3.2e mu:% 3.2e eta:% 3.2e li:% 3d";
       string output = StringPrintf(kReportRowFormat,
                                    summary.iteration,
                                    summary.cost,
                                    summary.cost_change,
                                    summary.gradient_max_norm,
                                    summary.step_norm,
                                    summary.relative_decrease,
                                    summary.trust_region_radius,
                                    summary.eta,
                                    summary.linear_solver_iterations);
       if (log_to_stdout_) {
         cout << output << endl;
       } else {
         VLOG(1) << output;
       }
       return SOLVER_CONTINUE;
     }

    private:
     const bool log_to_stdout_;
   };
:class:`CRSMatrix`
==================

.. class:: CRSMatrix
1887 A compressed row sparse matrix used primarily for communicating the
1888 Jacobian matrix to the user.
.. member:: int CRSMatrix::num_rows

Number of rows.

.. member:: int CRSMatrix::num_cols

Number of columns.
1898 .. member:: vector<int> CRSMatrix::rows
1900 :member:`CRSMatrix::rows` is a :member:`CRSMatrix::num_rows` + 1
1901 sized array that points into the :member:`CRSMatrix::cols` and
1902 :member:`CRSMatrix::values` array.
1904 .. member:: vector<int> CRSMatrix::cols
1906 :member:`CRSMatrix::cols` contain as many entries as there are
1907 non-zeros in the matrix.
1909 For each row ``i``, ``cols[rows[i]]`` ... ``cols[rows[i + 1] - 1]``
1910 are the indices of the non-zero columns of row ``i``.
1912 .. member:: vector<int> CRSMatrix::values
1914 :member:`CRSMatrix::values` contain as many entries as there are
1915 non-zeros in the matrix.
1918 ``values[rows[i]]`` ... ``values[rows[i + 1] - 1]`` are the values
1919 of the non-zero columns of row ``i``.
e.g., consider the 3x4 sparse matrix

.. code-block:: c++

   0 10  0  4
   0  2 -3  2
   1  2  0  0

The three arrays will be:

.. code-block:: c++

            -row0-  ---row1---  -row2-
   rows   = [ 0,      2,          5,     7]
   cols   = [ 1,  3,  1,  2,  3,  0,  1]
   values = [10,  4,  2, -3,  2,  1,  2]
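The sketch below walks such a matrix row by row, which is how the
arrays above are typically consumed; the function name is hypothetical.

.. code-block:: c++

   #include <cstdio>
   #include "ceres/crs_matrix.h"

   // Print every non-zero entry of a CRSMatrix as (row, col) = value.
   void PrintCRSMatrix(const ceres::CRSMatrix& m) {
     for (int i = 0; i < m.num_rows; ++i) {
       for (int idx = m.rows[i]; idx < m.rows[i + 1]; ++idx) {
         std::printf("(%d, %d) = %f\n", i, m.cols[idx], m.values[idx]);
       }
     }
   }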
1939 :class:`Solver::Summary`
1940 ========================
1942 .. class:: Solver::Summary
1944 Summary of the various stages of the solver after termination.
1946 .. function:: string Solver::Summary::BriefReport() const
A brief one line description of the state of the solver after
termination.
1951 .. function:: string Solver::Summary::FullReport() const
A full multiline description of the state of the solver after
termination.
1956 .. function:: bool Solver::Summary::IsSolutionUsable() const
1958 Whether the solution returned by the optimization algorithm can be
relied on to be numerically sane. This will be the case if
`Solver::Summary::termination_type` is set to `CONVERGENCE`,
1961 `USER_SUCCESS` or `NO_CONVERGENCE`, i.e., either the solver
1962 converged by meeting one of the convergence tolerances or because
1963 the user indicated that it had converged or it ran to the maximum
1964 number of iterations or time.
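A typical post-solve check, sketched under the assumption that
``options`` and ``problem`` are already configured:

.. code-block:: c++

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);

   std::cout << summary.BriefReport() << "\n";
   if (!summary.IsSolutionUsable()) {
     // The parameter values cannot be trusted; inspect the full report.
     std::cout << summary.FullReport() << "\n";
   }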
1966 .. member:: MinimizerType Solver::Summary::minimizer_type
1968 Type of minimization algorithm used.
1970 .. member:: TerminationType Solver::Summary::termination_type
1972 The cause of the minimizer terminating.
1974 .. member:: string Solver::Summary::message
1976 Reason why the solver terminated.
1978 .. member:: double Solver::Summary::initial_cost
Cost of the problem (value of the objective function) before the
optimization.
1983 .. member:: double Solver::Summary::final_cost
Cost of the problem (value of the objective function) after the
optimization.
1988 .. member:: double Solver::Summary::fixed_cost
1990 The part of the total cost that comes from residual blocks that
1991 were held fixed by the preprocessor because all the parameter
1992 blocks that they depend on were fixed.
1994 .. member:: vector<IterationSummary> Solver::Summary::iterations
1996 :class:`IterationSummary` for each minimizer iteration in order.
1998 .. member:: int Solver::Summary::num_successful_steps
2000 Number of minimizer iterations in which the step was
accepted. Unless :member:`Solver::Options::use_nonmonotonic_steps`
2002 is `true` this is also the number of steps in which the objective
2003 function value/cost went down.
2005 .. member:: int Solver::Summary::num_unsuccessful_steps
2007 Number of minimizer iterations in which the step was rejected
2008 either because it did not reduce the cost enough or the step was
2009 not numerically valid.
2011 .. member:: int Solver::Summary::num_inner_iteration_steps
2013 Number of times inner iterations were performed.
2015 .. member:: int Solver::Summary::num_line_search_steps
2017 Total number of iterations inside the line search algorithm across
2018 all invocations. We call these iterations "steps" to distinguish
2019 them from the outer iterations of the line search and trust region
minimizer algorithms which call the line search algorithm as a
subroutine.
2023 .. member:: double Solver::Summary::preprocessor_time_in_seconds
2025 Time (in seconds) spent in the preprocessor.
2027 .. member:: double Solver::Summary::minimizer_time_in_seconds
2029 Time (in seconds) spent in the Minimizer.
2031 .. member:: double Solver::Summary::postprocessor_time_in_seconds
2033 Time (in seconds) spent in the post processor.
2035 .. member:: double Solver::Summary::total_time_in_seconds
2037 Time (in seconds) spent in the solver.
2039 .. member:: double Solver::Summary::linear_solver_time_in_seconds
Time (in seconds) spent in the linear solver computing the trust
region step.
2044 .. member:: double Solver::Summary::residual_evaluation_time_in_seconds
2046 Time (in seconds) spent evaluating the residual vector.
2048 .. member:: double Solver::Summary::jacobian_evaluation_time_in_seconds
2050 Time (in seconds) spent evaluating the Jacobian matrix.
2052 .. member:: double Solver::Summary::inner_iteration_time_in_seconds
2054 Time (in seconds) spent doing inner iterations.
2056 .. member:: int Solver::Summary::num_parameter_blocks
2058 Number of parameter blocks in the problem.
2060 .. member:: int Solver::Summary::num_parameters
2062 Number of parameters in the problem.
2064 .. member:: int Solver::Summary::num_effective_parameters
2066 Dimension of the tangent space of the problem (or the number of
2067 columns in the Jacobian for the problem). This is different from
2068 :member:`Solver::Summary::num_parameters` if a parameter block is
2069 associated with a :class:`LocalParameterization`.
2071 .. member:: int Solver::Summary::num_residual_blocks
2073 Number of residual blocks in the problem.
2075 .. member:: int Solver::Summary::num_residuals
2077 Number of residuals in the problem.
2079 .. member:: int Solver::Summary::num_parameter_blocks_reduced
2081 Number of parameter blocks in the problem after the inactive and
2082 constant parameter blocks have been removed. A parameter block is
2083 inactive if no residual block refers to it.
2085 .. member:: int Solver::Summary::num_parameters_reduced
2087 Number of parameters in the reduced problem.
2089 .. member:: int Solver::Summary::num_effective_parameters_reduced
2091 Dimension of the tangent space of the reduced problem (or the
2092 number of columns in the Jacobian for the reduced problem). This is
2093 different from :member:`Solver::Summary::num_parameters_reduced` if
2094 a parameter block in the reduced problem is associated with a
2095 :class:`LocalParameterization`.
2097 .. member:: int Solver::Summary::num_residual_blocks_reduced
2099 Number of residual blocks in the reduced problem.
2101 .. member:: int Solver::Summary::num_residuals_reduced
2103 Number of residuals in the reduced problem.
2105 .. member:: int Solver::Summary::num_threads_given
Number of threads specified by the user for Jacobian and residual
evaluation.
2110 .. member:: int Solver::Summary::num_threads_used
2112 Number of threads actually used by the solver for Jacobian and
2113 residual evaluation. This number is not equal to
:member:`Solver::Summary::num_threads_given` if `OpenMP` is not
available.
2117 .. member:: int Solver::Summary::num_linear_solver_threads_given
Number of threads specified by the user for solving the trust
region problem.
2122 .. member:: int Solver::Summary::num_linear_solver_threads_used
2124 Number of threads actually used by the solver for solving the trust
2125 region problem. This number is not equal to
2126 :member:`Solver::Summary::num_linear_solver_threads_given` if
2127 `OpenMP` is not available.
2129 .. member:: LinearSolverType Solver::Summary::linear_solver_type_given
2131 Type of the linear solver requested by the user.
2133 .. member:: LinearSolverType Solver::Summary::linear_solver_type_used
2135 Type of the linear solver actually used. This may be different from
2136 :member:`Solver::Summary::linear_solver_type_given` if Ceres
2137 determines that the problem structure is not compatible with the
2138 linear solver requested or if the linear solver requested by the
user is not available, e.g., the user requested
`SPARSE_NORMAL_CHOLESKY` but no sparse linear algebra library was
available.
2143 .. member:: vector<int> Solver::Summary::linear_solver_ordering_given
Size of the elimination groups given by the user as hints to the
linear solver.
2148 .. member:: vector<int> Solver::Summary::linear_solver_ordering_used
2150 Size of the parameter groups used by the solver when ordering the
columns of the Jacobian. This may be different from
2152 :member:`Solver::Summary::linear_solver_ordering_given` if the user
2153 left :member:`Solver::Summary::linear_solver_ordering_given` blank
2154 and asked for an automatic ordering, or if the problem contains
2155 some constant or inactive parameter blocks.
2157 .. member:: std::string Solver::Summary::schur_structure_given
2159 For Schur type linear solvers, this string describes the template
specialization which was detected in the problem and should be
used.
2163 .. member:: std::string Solver::Summary::schur_structure_used
2165 For Schur type linear solvers, this string describes the template
specialization that was actually instantiated and used. This will be
different from :member:`Solver::Summary::schur_structure_given` if
the corresponding template specialization does not exist.
2171 Template specializations can be added to ceres by editing
2172 ``internal/ceres/generate_template_specializations.py``
2174 .. member:: bool Solver::Summary::inner_iterations_given
`True` if the user asked for inner iterations to be used as part of
the optimization.
2179 .. member:: bool Solver::Summary::inner_iterations_used
2181 `True` if the user asked for inner iterations to be used as part of
2182 the optimization and the problem structure was such that they were
2183 actually performed. For example, in a problem with just one parameter
2184 block, inner iterations are not performed.
.. member:: vector<int> Solver::Summary::inner_iteration_ordering_given

Size of the parameter groups given by the user for performing inner
iterations.
.. member:: vector<int> Solver::Summary::inner_iteration_ordering_used

Size of the parameter groups used by the solver for
performing inner iterations. This may be different from
2195 :member:`Solver::Summary::inner_iteration_ordering_given` if the
2196 user left :member:`Solver::Summary::inner_iteration_ordering_given`
2197 blank and asked for an automatic ordering, or if the problem
2198 contains some constant or inactive parameter blocks.
2200 .. member:: PreconditionerType Solver::Summary::preconditioner_type_given
2202 Type of the preconditioner requested by the user.
2204 .. member:: PreconditionerType Solver::Summary::preconditioner_type_used
2206 Type of the preconditioner actually used. This may be different
from :member:`Solver::Summary::preconditioner_type_given` if Ceres
determines that the problem structure is not compatible with the
preconditioner requested or if the preconditioner requested by the
user is not available.
2212 .. member:: VisibilityClusteringType Solver::Summary::visibility_clustering_type
2214 Type of clustering algorithm used for visibility based
2215 preconditioning. Only meaningful when the
:member:`Solver::Summary::preconditioner_type_used` is
2217 ``CLUSTER_JACOBI`` or ``CLUSTER_TRIDIAGONAL``.
2219 .. member:: TrustRegionStrategyType Solver::Summary::trust_region_strategy_type
2221 Type of trust region strategy.
2223 .. member:: DoglegType Solver::Summary::dogleg_type
2225 Type of dogleg strategy used for solving the trust region problem.
2227 .. member:: DenseLinearAlgebraLibraryType Solver::Summary::dense_linear_algebra_library_type
2229 Type of the dense linear algebra library used.
2231 .. member:: SparseLinearAlgebraLibraryType Solver::Summary::sparse_linear_algebra_library_type
2233 Type of the sparse linear algebra library used.
2235 .. member:: LineSearchDirectionType Solver::Summary::line_search_direction_type
2237 Type of line search direction used.
2239 .. member:: LineSearchType Solver::Summary::line_search_type
2241 Type of the line search algorithm used.
2243 .. member:: LineSearchInterpolationType Solver::Summary::line_search_interpolation_type
2245 When performing line search, the degree of the polynomial used to
2246 approximate the objective function.
2248 .. member:: NonlinearConjugateGradientType Solver::Summary::nonlinear_conjugate_gradient_type
2250 If the line search direction is `NONLINEAR_CONJUGATE_GRADIENT`,
then this indicates the particular variant of non-linear conjugate
gradient used.
2254 .. member:: int Solver::Summary::max_lbfgs_rank
2256 If the type of the line search direction is `LBFGS`, then this
2257 indicates the rank of the Hessian approximation.