[batchnorm] Optimize batch norm layer
author Parichay Kapoor <pk.kapoor@samsung.com>
Fri, 24 Sep 2021 11:58:10 +0000 (20:58 +0900)
committer Jijoong Moon <jijoong.moon@samsung.com>
Fri, 1 Oct 2021 04:28:07 +0000 (13:28 +0900)
commit 9d6e0d01c81904aa3012870ac8e98cd9c424e36a
tree da7fd4f6cdb15da1be440849c113c29a16959286
parent 58d10086b11589854efc9a146bd0630473745d1c
[batchnorm] Optimize batch norm layer

This patch optimizes the batch norm layer by sharing the calculations
performed in calcGradient and calcDerivative:
- reuse the dbeta and dgamma calculations (see the sketch after this
list)
- reduce the number of required temporary variables
- create all the required tensor variables with the context
- add support for checking whether the layer is trainable via the run
context
- support the average operation with the output tensor already
allocated (sketched after the note below)
- this patch reduces memory usage as much as possible without
sacrificing speed. More memory optimization is possible at the expense
of speed but has been omitted for now.
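
A minimal, self-contained sketch of the dbeta/dgamma reuse, assuming a
hypothetical BNBackwardState in place of the context-managed tensors
(illustrative names and signatures, not the actual nntrainer API):

  #include <algorithm>
  #include <cstddef>
  #include <vector>

  // calcGradient computes the two reductions once; calcDerivative
  // reads them back instead of re-summing dy and dy * x_hat.
  struct BNBackwardState {      // hypothetical stand-in for the run context
    std::vector<float> x_hat;   // normalized input cached in forward, batch*feat
    std::vector<float> dgamma;  // sum over batch of dy * x_hat, per feature
    std::vector<float> dbeta;   // sum over batch of dy, per feature
    std::vector<float> inv_std; // 1 / sqrt(var + eps), per feature
    std::vector<float> gamma;   // scale parameter, per feature
  };

  void calcGradient(const std::vector<float> &dy, size_t batch, size_t feat,
                    BNBackwardState &s) {
    std::fill(s.dgamma.begin(), s.dgamma.end(), 0.0f);
    std::fill(s.dbeta.begin(), s.dbeta.end(), 0.0f);
    for (size_t b = 0; b < batch; ++b)
      for (size_t f = 0; f < feat; ++f) {
        s.dgamma[f] += dy[b * feat + f] * s.x_hat[b * feat + f];
        s.dbeta[f] += dy[b * feat + f];
      }
  }

  void calcDerivative(const std::vector<float> &dy, std::vector<float> &dx,
                      size_t batch, size_t feat, const BNBackwardState &s) {
    const float n = static_cast<float>(batch);
    for (size_t b = 0; b < batch; ++b)
      for (size_t f = 0; f < feat; ++f) {
        const size_t i = b * feat + f;
        // standard batch norm input gradient, reusing dbeta and dgamma
        dx[i] = s.gamma[f] * s.inv_std[f] *
                (dy[i] - s.dbeta[f] / n - s.x_hat[i] * s.dgamma[f] / n);
      }
  }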

Note: this patch slightly improves performance and adds no extra
operations.
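
The "average with the output tensor already allocated" item above is,
in spirit, a reduction that writes into a caller-provided buffer, so no
temporary is created inside the operation (which is also why no extra
operations are added). A hedged sketch with illustrative names, not the
actual Tensor average interface in nntrainer/tensor/tensor.h:

  #include <algorithm>
  #include <cassert>
  #include <cstddef>
  #include <vector>

  // Averages `in` (batch x feat, row-major) over the batch axis into
  // `out`, which the caller has already sized to feat elements; the
  // function itself performs no allocation.
  void average_over_batch(const std::vector<float> &in, size_t batch,
                          size_t feat, std::vector<float> &out) {
    assert(out.size() == feat && "output must be pre-allocated");
    std::fill(out.begin(), out.end(), 0.0f);
    for (size_t b = 0; b < batch; ++b)
      for (size_t f = 0; f < feat; ++f)
        out[f] += in[b * feat + f];
    for (size_t f = 0; f < feat; ++f)
      out[f] /= static_cast<float>(batch);
  }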

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
nntrainer/layers/bn_layer.cpp
nntrainer/layers/bn_layer.h
nntrainer/layers/layer_context.cpp
nntrainer/layers/layer_context.h
nntrainer/layers/layer_node.cpp
nntrainer/layers/time_dist.cpp
nntrainer/tensor/tensor.cpp
nntrainer/tensor/tensor.h
test/unittest/layers/layers_golden_tests.cpp