improve reduction logic and add fast transpose kernel
author YashasSamaga <yashas_2010@yahoo.com>
Mon, 23 Dec 2019 18:53:45 +0000 (00:23 +0530)
committer YashasSamaga <yashas_2010@yahoo.com>
Mon, 23 Dec 2019 18:53:45 +0000 (00:23 +0530)
commit 16bc505d26860b5d055deec9f0df5a4e6d59b622
tree e4b3caef9dc4c5b4a615ad55966e1125406b9530
parent ee4feb4b09d144c8bfabf539fd81b513cddf399e
modules/dnn/src/cuda/fill_copy.cu [moved from modules/dnn/src/cuda/fill.cu with 56% similarity]
modules/dnn/src/cuda/max_unpooling.cu
modules/dnn/src/cuda/normalize.cu
modules/dnn/src/cuda/permute.cu
modules/dnn/src/cuda4dnn/kernels/fill_copy.hpp [moved from modules/dnn/src/cuda4dnn/kernels/fill.hpp with 63% similarity]
modules/dnn/src/cuda4dnn/kernels/permute.hpp
modules/dnn/src/cuda4dnn/primitives/concat.hpp
modules/dnn/src/cuda4dnn/primitives/padding.hpp