docs/source/interfacing_with_autodiff.rst

   1 .. default-domain:: cpp
   2
   3 .. cpp:namespace:: ceres
   4
   5 .. _chapter-interfacing_with_automatic_differentiation:
   6
   7 Interfacing with Automatic Differentiation
   8 ==========================================
   9
  10 Automatic differentiation is straightforward to use in cases where an
  11 explicit expression for the cost function is available. But this is
  12 not always possible. Often one has to interface with external routines
  13 or data. In this chapter we will consider a number of different ways
  14 of doing so.
  15
  16 To do this, we will consider the problem of finding parameters
  17 :math:`\theta` and :math:`t` that solve an optimization problem of the
  18 form:
  19
  20 .. math::
  21    \min & \quad \sum_i \left \|y_i - f\left (\|q_{i}\|^2\right) q_i
  22    \right \|^2\\
  23    \text{such that} & \quad q_i = R(\theta) x_i + t
  24
  25 Here, :math:`R` is a two dimensional rotation matrix parameterized
  26 using the angle :math:`\theta` and :math:`t` is a two dimensional
  27 vector. :math:`f` is an external distortion function.
  28
  29 We begin by considering the case, where we have a templated function
  30 :code:`TemplatedComputeDistortion` that can compute the function
  31 :math:`f`. Then the implementation of the corresponding residual
  32 functor is straightforward and will look as follows:
  33
  34 .. code-block:: c++
  35    :emphasize-lines: 21
  36
  37    template <typename T> T TemplatedComputeDistortion(const T r2) {
  38      const double k1 = 0.0082;
  39      const double k2 = 0.000023;
  40      return 1.0 + k1 * y2 + k2 * r2 * r2;
  41    }
  42
  43    struct Affine2DWithDistortion {
  44      Affine2DWithDistortion(const double x_in[2], const double y_in[2]) {
  45        x[0] = x_in[0];
  46        x[1] = x_in[1];
  47        y[0] = y_in[0];
  48        y[1] = y_in[1];
  49      }
  50
  51      template <typename T>
  52      bool operator()(const T* theta,
  53                      const T* t,
  54                      T* residuals) const {
  55        const T q_0 =  cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0];
  56        const T q_1 =  sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1];
  57        const T f = TemplatedComputeDistortion(q_0 * q_0 + q_1 * q_1);
  58        residuals[0] = y[0] - f * q_0;
  59        residuals[1] = y[1] - f * q_1;
  60        return true;
  61      }
  62
  63      double x[2];
  64      double y[2];
  65    };
  66
  67 So far so good, but let us now consider three ways of defining
  68 :math:`f` which are not directly amenable to being used with automatic
  69 differentiation:
  70
  71 #. A non-templated function that evaluates its value.
  72 #. A function that evaluates its value and derivative.
  73 #. A function that is defined as a table of values to be interpolated.
  74
  75 We will consider them in turn below.
  76
  77 A function that returns its value
  78 ----------------------------------
  79
  80 Suppose we were given a function :code:`ComputeDistortionValue` with
  81 the following signature
  82
  83 .. code-block:: c++
  84
  85    double ComputeDistortionValue(double r2);
  86
  87 that computes the value of :math:`f`. The actual implementation of the
  88 function does not matter. Interfacing this function with
  89 :code:`Affine2DWithDistortion` is a three step process:
  90
  91 1. Wrap :code:`ComputeDistortionValue` into a functor
  92    :code:`ComputeDistortionValueFunctor`.
  93 2. Numerically differentiate :code:`ComputeDistortionValueFunctor`
  94    using :class:`NumericDiffCostFunction` to create a
  95    :class:`CostFunction`.
  96 3. Wrap the resulting :class:`CostFunction` object using
  97    :class:`CostFunctionToFunctor`. The resulting object is a functor
  98    with a templated :code:`operator()` method, which pipes the
  99    Jacobian computed by :class:`NumericDiffCostFunction` into the
 100    approproate :code:`Jet` objects.
 101
 102 An implementation of the above three steps looks as follows:
 103
 104 .. code-block:: c++
 105    :emphasize-lines: 15,16,17,18,19,20, 29
 106
 107    struct ComputeDistortionValueFunctor {
 108      bool operator()(const double* r2, double* value) const {
 109        *value = ComputeDistortionValue(r2[0]);
 110        return true;
 111      }
 112    };
 113
 114    struct Affine2DWithDistortion {
 115      Affine2DWithDistortion(const double x_in[2], const double y_in[2]) {
 116        x[0] = x_in[0];
 117        x[1] = x_in[1];
 118        y[0] = y_in[0];
 119        y[1] = y_in[1];
 120
 121        compute_distortion.reset(new ceres::CostFunctionToFunctor<1, 1>(
 122             new ceres::NumericDiffCostFunction<ComputeDistortionValueFunctor,
 123                                                ceres::CENTRAL,
 124                                                1,
 125                                                1>(
 126                new ComputeDistortionValueFunctor)));
 127      }
 128
 129      template <typename T>
 130      bool operator()(const T* theta, const T* t, T* residuals) const {
 131        const T q_0 = cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0];
 132        const T q_1 = sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1];
 133        const T r2 = q_0 * q_0 + q_1 * q_1;
 134        T f;
 135        (*compute_distortion)(&r2, &f);
 136        residuals[0] = y[0] - f * q_0;
 137        residuals[1] = y[1] - f * q_1;
 138        return true;
 139      }
 140
 141      double x[2];
 142      double y[2];
 143      std::unique_ptr<ceres::CostFunctionToFunctor<1, 1> > compute_distortion;
 144    };
 145
 146
 147 A function that returns its value and derivative
 148 ------------------------------------------------
 149
 150 Now suppose we are given a function :code:`ComputeDistortionValue`
 151 thatis able to compute its value and optionally its Jacobian on demand
 152 and has the following signature:
 153
 154 .. code-block:: c++
 155
 156    void ComputeDistortionValueAndJacobian(double r2,
 157                                           double* value,
 158                                           double* jacobian);
 159
 160 Again, the actual implementation of the function does not
 161 matter. Interfacing this function with :code:`Affine2DWithDistortion`
 162 is a two step process:
 163
 164 1. Wrap :code:`ComputeDistortionValueAndJacobian` into a
 165    :class:`CostFunction` object which we call
 166    :code:`ComputeDistortionFunction`.
 167 2. Wrap the resulting :class:`ComputeDistortionFunction` object using
 168    :class:`CostFunctionToFunctor`. The resulting object is a functor
 169    with a templated :code:`operator()` method, which pipes the
 170    Jacobian computed by :class:`NumericDiffCostFunction` into the
 171    approproate :code:`Jet` objects.
 172
 173 The resulting code will look as follows:
 174
 175 .. code-block:: c++
 176    :emphasize-lines: 21,22, 33
 177
 178    class ComputeDistortionFunction : public ceres::SizedCostFunction<1, 1> {
 179     public:
 180      virtual bool Evaluate(double const* const* parameters,
 181                            double* residuals,
 182                            double** jacobians) const {
 183        if (!jacobians) {
 184          ComputeDistortionValueAndJacobian(parameters[0][0], residuals, NULL);
 185        } else {
 186          ComputeDistortionValueAndJacobian(parameters[0][0], residuals, jacobians[0]);
 187        }
 188        return true;
 189      }
 190    };
 191
 192    struct Affine2DWithDistortion {
 193      Affine2DWithDistortion(const double x_in[2], const double y_in[2]) {
 194        x[0] = x_in[0];
 195        x[1] = x_in[1];
 196        y[0] = y_in[0];
 197        y[1] = y_in[1];
 198        compute_distortion.reset(
 199            new ceres::CostFunctionToFunctor<1, 1>(new ComputeDistortionFunction));
 200      }
 201
 202      template <typename T>
 203      bool operator()(const T* theta,
 204                      const T* t,
 205                      T* residuals) const {
 206        const T q_0 =  cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0];
 207        const T q_1 =  sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1];
 208        const T r2 = q_0 * q_0 + q_1 * q_1;
 209        T f;
 210        (*compute_distortion)(&r2, &f);
 211        residuals[0] = y[0] - f * q_0;
 212        residuals[1] = y[1] - f * q_1;
 213        return true;
 214      }
 215
 216      double x[2];
 217      double y[2];
 218      std::unique_ptr<ceres::CostFunctionToFunctor<1, 1> > compute_distortion;
 219    };
 220
 221
 222 A function that is defined as a table of values
 223 -----------------------------------------------
 224
 225 The third and final case we will consider is where the function
 226 :math:`f` is defined as a table of values on the interval :math:`[0,
 227 100)`, with a value for each integer.
 228
 229 .. code-block:: c++
 230
 231    vector<double> distortion_values;
 232
 233 There are many ways of interpolating a table of values. Perhaps the
 234 simplest and most common method is linear interpolation. But it is not
 235 a great idea to use linear interpolation because the interpolating
 236 function is not differentiable at the sample points.
 237
 238 A simple (well behaved) differentiable interpolation is the `Cubic
 239 Hermite Spline
 240 <http://en.wikipedia.org/wiki/Cubic_Hermite_spline>`_. Ceres Solver
 241 ships with routines to perform Cubic & Bi-Cubic interpolation that is
 242 automatic differentiation friendly.
 243
 244 Using Cubic interpolation requires first constructing a
 245 :class:`Grid1D` object to wrap the table of values and then
 246 constructing a :class:`CubicInterpolator` object using it.
 247
 248 The resulting code will look as follows:
 249
 250 .. code-block:: c++
 251    :emphasize-lines: 10,11,12,13, 24, 32,33
 252
 253    struct Affine2DWithDistortion {
 254      Affine2DWithDistortion(const double x_in[2],
 255                             const double y_in[2],
 256                             const std::vector<double>& distortion_values) {
 257        x[0] = x_in[0];
 258        x[1] = x_in[1];
 259        y[0] = y_in[0];
 260        y[1] = y_in[1];
 261
 262        grid.reset(new ceres::Grid1D<double, 1>(
 263            &distortion_values[0], 0, distortion_values.size()));
 264        compute_distortion.reset(
 265            new ceres::CubicInterpolator<ceres::Grid1D<double, 1> >(*grid));
 266      }
 267
 268      template <typename T>
 269      bool operator()(const T* theta,
 270                      const T* t,
 271                      T* residuals) const {
 272        const T q_0 =  cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0];
 273        const T q_1 =  sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1];
 274        const T r2 = q_0 * q_0 + q_1 * q_1;
 275        T f;
 276        compute_distortion->Evaluate(r2, &f);
 277        residuals[0] = y[0] - f * q_0;
 278        residuals[1] = y[1] - f * q_1;
 279        return true;
 280      }
 281
 282      double x[2];
 283      double y[2];
 284      std::unique_ptr<ceres::Grid1D<double, 1> > grid;
 285      std::unique_ptr<ceres::CubicInterpolator<ceres::Grid1D<double, 1> > > compute_distortion;
 286    };
 287
 288 In the above example we used :class:`Grid1D` and
 289 :class:`CubicInterpolator` to interpolate a one dimensional table of
 290 values. :class:`Grid2D` combined with :class:`CubicInterpolator` lets
 291 the user to interpolate two dimensional tables of values. Note that
 292 neither :class:`Grid1D` or :class:`Grid2D` are limited to scalar
 293 valued functions, they also work with vector valued functions.