another round of dnn optimization (#9011)
author Vadim Pisarevsky <vadim.pisarevsky@gmail.com>
Wed, 28 Jun 2017 08:15:22 +0000 (11:15 +0300)
committer GitHub <noreply@github.com>
Wed, 28 Jun 2017 08:15:22 +0000 (11:15 +0300)
commit 8b3d6603d5469060d0217b3e743d3e75374afbeb
tree b5c5217a61af9f77783191a62fcc2f136fa759ba
parent 82ec76c123f0548806b25c15477719b4bcd6b201

* another round of dnn optimization:
* increased malloc alignment across OpenCV from 16 to 64 bytes to make it AVX2 and even AVX-512 friendly
* improved SIMD optimization of pooling layer, optimized average pooling
* cleaned up convolution layer implementation
* made activation layer "attachable" to all other layers, including the fully connected and addition layers.
* fixed bug in the fusion algorithm: "LayerData::consumers" should not be cleared, because it describes the topology.
* greatly optimized permutation layer, which improved SSD performance
* parallelized element-wise binary/ternary/... ops (sum, prod, max)
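
For readers unfamiliar with the alignment change in the list above, here is a minimal sketch of one common way to get 64-byte-aligned blocks on top of plain `malloc`: over-allocate, round the pointer up, and stash the raw pointer just before the aligned block. The names `kAlign`, `alignUp`, `alignedMalloc` and `alignedFree` are hypothetical and chosen for illustration only; this is not the code touched by this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical constant: 64 bytes covers a full AVX-512 register
// (and two AVX2 registers), and matches a typical cache line.
static const std::size_t kAlign = 64;

// Round ptr up to the next multiple of n; n must be a power of two.
static inline unsigned char* alignUp(unsigned char* ptr, std::size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    std::uintptr_t p = reinterpret_cast<std::uintptr_t>(ptr);
    p = (p + n - 1) & ~static_cast<std::uintptr_t>(n - 1);
    return reinterpret_cast<unsigned char*>(p);
}

// Allocate size bytes whose start address is a multiple of kAlign.
void* alignedMalloc(std::size_t size)
{
    // Over-allocate: room for the stashed raw pointer plus worst-case padding.
    unsigned char* raw = static_cast<unsigned char*>(
        std::malloc(size + sizeof(void*) + kAlign));
    if (!raw)
        return nullptr;
    unsigned char* aligned = alignUp(raw + sizeof(void*), kAlign);
    // Remember the pointer returned by malloc right before the aligned block.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return aligned;
}

// Release a block obtained from alignedMalloc; a null pointer is ignored.
void alignedFree(void* ptr)
{
    if (ptr)
        std::free(reinterpret_cast<void**>(ptr)[-1]);
}
```

With this scheme every buffer can be read with aligned 256-bit or 512-bit loads from its first element, at the cost of up to `sizeof(void*) + 64` extra bytes per allocation.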

* also, added missing copyrights to many of the layer implementation files

* temporarily disabled (again) the consistency check for intermediate blobs; fixed warnings from various builders
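
The idea behind an "attachable" activation, mentioned in the first list of this message, can be illustrated with a small sketch: the host layer applies the activation to each output value as it is produced, instead of a separate layer making a second pass over the blob. The types `Activation` and `AddLayer` and the function `relu` are hypothetical names for illustration, not the interfaces in `modules/dnn`.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical activation type: maps one value to one value.
using Activation = std::function<float(float)>;

// Hypothetical element-wise addition layer with an optional attached activation.
struct AddLayer
{
    Activation activ;  // empty when no activation has been attached

    std::vector<float> forward(const std::vector<float>& a,
                               const std::vector<float>& b) const
    {
        std::vector<float> out(std::min(a.size(), b.size()));
        for (std::size_t i = 0; i < out.size(); ++i)
        {
            float v = a[i] + b[i];
            // Fused step: apply the activation while the value is still in a register.
            out[i] = activ ? activ(v) : v;
        }
        return out;
    }
};

// ReLU as a plain function that can be attached to the layer above.
inline float relu(float x) { return std::max(x, 0.0f); }
```

Attaching is then a single assignment (`layer.activ = relu;`), and a layer with nothing attached behaves as a plain addition.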
29 files changed:
modules/core/include/opencv2/core/private.hpp
modules/dnn/include/opencv2/dnn/all_layers.hpp
modules/dnn/include/opencv2/dnn/dnn.hpp
modules/dnn/src/dnn.cpp
modules/dnn/src/layers/blank_layer.cpp
modules/dnn/src/layers/concat_layer.cpp
modules/dnn/src/layers/convolution_layer.cpp
modules/dnn/src/layers/crop_layer.cpp
modules/dnn/src/layers/detection_output_layer.cpp
modules/dnn/src/layers/elementwise_layers.cpp
modules/dnn/src/layers/eltwise_layer.cpp
modules/dnn/src/layers/flatten_layer.cpp
modules/dnn/src/layers/fully_connected_layer.cpp
modules/dnn/src/layers/layers_common.avx2.cpp
modules/dnn/src/layers/layers_common.cpp
modules/dnn/src/layers/layers_common.hpp
modules/dnn/src/layers/lrn_layer.cpp
modules/dnn/src/layers/mvn_layer.cpp
modules/dnn/src/layers/normalize_bbox_layer.cpp
modules/dnn/src/layers/permute_layer.cpp
modules/dnn/src/layers/pooling_layer.cpp
modules/dnn/src/layers/prior_box_layer.cpp
modules/dnn/src/layers/recurrent_layers.cpp
modules/dnn/src/layers/reshape_layer.cpp
modules/dnn/src/layers/slice_layer.cpp
modules/dnn/src/layers/softmax_layer.cpp
modules/dnn/src/layers/split_layer.cpp
modules/dnn/test/test_googlenet.cpp
modules/imgproc/src/templmatch.cpp