doc/tutorials/dnn/dnn_custom_layers/dnn_custom_layers.md

# Custom deep learning layers support {#tutorial_dnn_custom_layers}

## Introduction
Deep learning is a fast-growing area. New approaches to building neural networks
usually introduce new types of layers. They may be modifications of existing
ones or implementations of novel research ideas.

OpenCV can import and run networks from different deep learning
frameworks, and it implements the most popular layers. However, you may find
that your network cannot be imported using OpenCV because some of its layers are not implemented.

The first solution is to create a feature request at https://github.com/opencv/opencv/issues
mentioning details such as the source of the model and the type of the new layer. A new layer may
be implemented if the OpenCV community shares this need.

The second way is to define a **custom layer** so that OpenCV's deep learning engine
knows how to use it. This tutorial shows how to customize the import of deep
learning models.

## Define a custom layer in C++
A deep learning layer is a building block of a network's pipeline.
It has connections to **input blobs** and writes its results to **output blobs**.
It may also have trained **weights** and **hyper-parameters**.
Layers' names, types, weights and hyper-parameters are stored in files generated by
the native frameworks during training. If OpenCV meets an unknown layer type, it throws an
exception while trying to read the model:

```
Unspecified error: Can't create layer "layer_name" of type "MyType" in function getLayerInstance
```

To import the model correctly you have to derive a class from cv::dnn::Layer with
the following methods:

@snippet dnn/custom_layers.hpp A custom layer interface

And register it before the import:

@snippet dnn/custom_layers.hpp Register a custom layer

@note `MyType` is the type of the unimplemented layer from the thrown exception.

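Why the registration has to happen before the import can be pictured with a toy registry. This is a plain-Python sketch under stated assumptions, not OpenCV's implementation: `register_layer`, `create_layer` and the `scale` hyper-parameter are hypothetical names chosen only for illustration.

```python
# Toy model of a layer registry; every name here is illustrative, not OpenCV's API.
_registry = {}


def register_layer(type_name, factory):
    """Remember which factory builds layers of the given type."""
    _registry[type_name] = factory


def create_layer(name, type_name, params):
    """Build a layer, or fail the way an importer does for an unknown type."""
    if type_name not in _registry:
        raise KeyError('Can\'t create layer "%s" of type "%s"' % (name, type_name))
    return _registry[type_name](params)


class MyLayer:
    def __init__(self, params):
        self.scale = params.get("scale", 1.0)  # hypothetical hyper-parameter


# Creating the layer before its type is registered fails ...
try:
    create_layer("layer_name", "MyType", {})
    failed = False
except KeyError:
    failed = True

# ... and succeeds once the type is known.
register_layer("MyType", MyLayer)
layer = create_layer("layer_name", "MyType", {"scale": 2.0})
```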
Let's see what each of the methods does:

- Constructor

@snippet dnn/custom_layers.hpp MyLayer::MyLayer

Retrieves hyper-parameters from cv::dnn::LayerParams. If your layer has trainable
weights, they are already stored in the Layer's member cv::dnn::Layer::blobs.

- A static method `create`

@snippet dnn/custom_layers.hpp MyLayer::create

This method should create an instance of your layer and return a cv::Ptr to it.

- Output blobs' shape computation

@snippet dnn/custom_layers.hpp MyLayer::getMemoryShapes

Returns the layer's output shapes, which depend on the input shapes. You may request extra
memory using `internals`.

- Run a layer

@snippet dnn/custom_layers.hpp MyLayer::forward

Implement the layer's logic here. Compute outputs for the given inputs.

@note OpenCV manages the memory allocated for layers. In most cases the same memory
can be reused between layers, so your `forward` implementation should not rely on
`outputs` and `internals` holding the same data on the second invocation of `forward`.

- Optional `finalize` method

@snippet dnn/custom_layers.hpp MyLayer::finalize

The chain of method calls is the following: the OpenCV deep learning engine calls the `create`
method once, then it calls `getMemoryShapes` for every created layer, and then you
can make some preparations that depend on the known input dimensions in cv::dnn::Layer::finalize.
After the network is initialized, only the `forward` method is called for every network input.

@note By varying the sizes of input blobs, such as height, width or batch size, you make OpenCV
reallocate all the internal memory, which reduces efficiency. Try to initialize
and deploy models using a fixed batch size and fixed image dimensions.

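The call order described above can be mimicked with a small plain-Python driver. This is only a sketch: `ScaleLayer`, its `factor` parameter and `run_network` are hypothetical, and the real engine is considerably more involved. It shows shapes being computed once, `forward` being called for each input, and an output buffer that is reused and therefore must be fully overwritten on every call.

```python
class ScaleLayer:
    """Hypothetical layer: multiplies every input value by a constant factor."""

    def __init__(self, params):
        self.factor = params["factor"]  # hyper-parameter, read once
        self.shape_calls = 0
        self.forward_calls = 0

    def getMemoryShapes(self, input_shapes):
        self.shape_calls += 1
        return [list(shape) for shape in input_shapes]  # outputs match inputs

    def forward(self, inputs, outputs):
        self.forward_calls += 1
        # Overwrite every element; `outputs` may hold stale data from earlier runs.
        for src, dst in zip(inputs, outputs):
            for i, value in enumerate(src):
                dst[i] = value * self.factor


def run_network(layer, network_inputs):
    """Toy engine: compute shapes once, then call forward for each input."""
    out_shapes = layer.getMemoryShapes([[len(network_inputs[0])]])
    buffers = [[float("nan")] * shape[0] for shape in out_shapes]  # reused memory
    results = []
    for blob in network_inputs:
        layer.forward([blob], buffers)
        results.append(list(buffers[0]))
    return results


scale_layer = ScaleLayer({"factor": 3})
results = run_network(scale_layer, [[1, 2], [4, 5]])
print(results)  # [[3, 6], [12, 15]]
```

In this sketch `getMemoryShapes` runs once while `forward` runs twice, which is why per-input work belongs in `forward` and shape-dependent preparation belongs earlier.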
## Example: custom layer from Caffe
Let's create a custom layer `Interp` from https://github.com/cdmh/deeplab-public.
It's just a simple resize that takes an input blob of size `N x C x Hi x Wi` and returns
an output blob of size `N x C x Ho x Wo`, where `N` is the batch size, `C` is the number of channels,
and `Hi x Wi` and `Ho x Wo` are the input and output `height x width` respectively.
This layer has no trainable weights, but it has hyper-parameters that specify the output size.

For example,
~~~~~~~~~~~~~
layer {
  name: "output"
  type: "Interp"
  bottom: "input"
  top: "output"
  interp_param {
    height: 9
    width: 8
  }
}
~~~~~~~~~~~~~

Our implementation can then look like this:

@snippet dnn/custom_layers.hpp InterpLayer

Next we need to register the new layer type and try to import the model.

@snippet dnn/custom_layers.hpp Register InterpLayer

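To make the resize concrete, here is a plain-Python sketch of the shape rule and of a bilinear resize over a single `H x W` plane. It is an illustration, not the sample's C++ code: the helper names are hypothetical, and the corner-aligned scaling `(Hi - 1) / (Ho - 1)` is an assumption about how `Interp` maps output pixels to input pixels.

```python
def interp_output_shape(input_shape, height, width):
    """N x C x Hi x Wi -> N x C x Ho x Wo."""
    n, c = input_shape[0], input_shape[1]
    return [n, c, height, width]


def interp_plane(plane, out_h, out_w):
    """Bilinear resize of one H x W plane (list of lists), corner-aligned."""
    in_h, in_w = len(plane), len(plane[0])
    # Assumed scaling: first and last samples of input and output coincide.
    scale_h = (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
    scale_w = (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
    out = []
    for y in range(out_h):
        src_y = y * scale_h
        y0 = int(src_y)
        y1 = min(y0 + 1, in_h - 1)
        wy = src_y - y0
        row = []
        for x in range(out_w):
            src_x = x * scale_w
            x0 = int(src_x)
            x1 = min(x0 + 1, in_w - 1)
            wx = src_x - x0
            # Blend horizontally on both rows, then vertically between them.
            top = plane[y0][x0] * (1 - wx) + plane[y0][x1] * wx
            bottom = plane[y1][x0] * (1 - wx) + plane[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


print(interp_output_shape([2, 3, 4, 5], 9, 8))  # [2, 3, 9, 8]
```

A full layer would apply `interp_plane` to each of the `N x C` planes of the input blob.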
## Example: custom layer from TensorFlow
This is an example of how to import a network with the [tf.image.resize_bilinear](https://www.tensorflow.org/versions/master/api_docs/python/tf/image/resize_bilinear)
operation. This is also a resize, but its implementation differs from both OpenCV's and the `Interp` layer above.

Let's create a single-layer network:
~~~~~~~~~~~~~{.py}
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder)
inp = tf.placeholder(tf.float32, [2, 3, 4, 5], 'input')
resized = tf.image.resize_bilinear(inp, size=[9, 8], name='resize_bilinear')
~~~~~~~~~~~~~
OpenCV sees this TensorFlow graph in the following way:

```
node {
  name: "input"
  op: "Placeholder"
  attr {
    key: "dtype"
    value {
      type: DT_FLOAT
    }
  }
}
node {
  name: "resize_bilinear/size"
  op: "Const"
  attr {
    key: "dtype"
    value {
      type: DT_INT32
    }
  }
  attr {
    key: "value"
    value {
      tensor {
        dtype: DT_INT32
        tensor_shape {
          dim {
            size: 2
          }
        }
        tensor_content: "\t\000\000\000\010\000\000\000"
      }
    }
  }
}
node {
  name: "resize_bilinear"
  op: "ResizeBilinear"
  input: "input:0"
  input: "resize_bilinear/size"
  attr {
    key: "T"
    value {
      type: DT_FLOAT
    }
  }
  attr {
    key: "align_corners"
    value {
      b: false
    }
  }
}
library {
}
```
The import of custom layers from TensorFlow is designed to put all of a layer's `attr` entries into
cv::dnn::LayerParams, and the input `Const` blobs into cv::dnn::Layer::blobs.
In our case the resize's output shape is stored in the layer's `blobs[0]`.

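To see which values end up in that blob, the `tensor_content` bytes of the `resize_bilinear/size` node can be decoded by hand. The sketch below uses only Python's standard `struct` module and assumes the `DT_INT32` values are stored little-endian, four bytes each. It is a manual check, not how OpenCV reads the graph.

```python
import struct

# Bytes copied from the `tensor_content` field of the "resize_bilinear/size" node.
tensor_content = b"\t\000\000\000\010\000\000\000"

# Assumption: DT_INT32 values are stored little-endian, 4 bytes each.
count = len(tensor_content) // 4
out_height, out_width = struct.unpack("<%di" % count, tensor_content)
print(out_height, out_width)  # 9 8
```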
@snippet dnn/custom_layers.hpp ResizeBilinearLayer

Next we register the layer and try to import the model.

@snippet dnn/custom_layers.hpp Register ResizeBilinearLayer

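One way to picture how this resize can differ from a corner-aligned one is to compare the input coordinates sampled along a single axis. The sketch below is an assumption-laden illustration: `source_coords` is a hypothetical helper, the plain ratio `in / out` is assumed for `align_corners: false`, and `(in - 1) / (out - 1)` is assumed for the corner-aligned case. It requires an output size greater than 1.

```python
def source_coords(in_size, out_size, align_corners):
    """Input-space coordinate sampled for each output index along one axis."""
    if align_corners:
        # Corner-aligned mapping: first and last samples coincide (assumption).
        return [i * (in_size - 1) / (out_size - 1) for i in range(out_size)]
    # Plain ratio mapping, assumed for `align_corners: false`.
    return [i * in_size / out_size for i in range(out_size)]


plain = source_coords(4, 8, align_corners=False)
aligned = source_coords(4, 8, align_corners=True)
print(plain[-1], aligned[-1])  # 3.5 3.0
```

With the plain ratio, the last output pixel samples beyond the last input pixel, so an implementation has to clamp the neighbouring index to the image border.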
## Define a custom layer in Python
The following example shows how to customize OpenCV's layers in Python.

Let's consider the [Holistically-Nested Edge Detection](https://arxiv.org/abs/1504.06375)
deep learning model. It was trained with only one difference compared to
the current version of the [Caffe framework](http://caffe.berkeleyvision.org/): the `Crop`
layers, which receive two input blobs and crop the first one to match the spatial dimensions
of the second one, used to crop from the center. Nowadays Caffe's layer crops
from the top-left corner. So with the latest version of Caffe or OpenCV you'll
get shifted results with filled borders.

Next we're going to replace OpenCV's `Crop` layer, which crops from the top-left corner, with
a centered one.

- Create a class with `getMemoryShapes` and `forward` methods

@snippet dnn/edge_detection.py CropLayer

@note Both methods should return lists.

- Register a new layer.

@snippet dnn/edge_detection.py Register

That's it! We've replaced a layer implemented in OpenCV with a custom one.
You can find the full script in the [source code](https://github.com/opencv/opencv/tree/3.4/samples/dnn/edge_detection.py).

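The arithmetic behind a centered crop can be sketched in plain Python. The helpers `center_crop_window` and `crop_plane` are hypothetical and work on nested lists rather than on real blobs. They only illustrate how the offsets differ from a top-left crop, whose window would always start at `(0, 0)`.

```python
def center_crop_window(input_shape, target_shape):
    """Output shape and crop window for cropping an NCHW input to the target's H x W."""
    batch, channels, in_h, in_w = input_shape
    out_h, out_w = target_shape[2], target_shape[3]
    y_start = (in_h - out_h) // 2  # a top-left crop would use 0 here
    x_start = (in_w - out_w) // 2
    out_shape = [batch, channels, out_h, out_w]
    return out_shape, (y_start, y_start + out_h), (x_start, x_start + out_w)


def crop_plane(plane, rows, cols):
    """Cut the [rows) x [cols) window out of one H x W plane (list of lists)."""
    return [row[cols[0]:cols[1]] for row in plane[rows[0]:rows[1]]]


shape, rows, cols = center_crop_window([1, 1, 6, 8], [1, 1, 4, 4])
print(shape, rows, cols)  # [1, 1, 4, 4] (1, 5) (2, 6)
```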
<table border="0">
<tr>
<td>![](js_tutorials/js_assets/lena.jpg)</td>
<td>![](images/lena_hed.jpg)</td>
</tr>
</table>
 225 </tr>
 226 </table>