review.tizen.org Git - platform/upstream/tensorflow.git/commit

author	A. Unique TensorFlower <gardener@tensorflow.org>
	Fri, 16 Feb 2018 21:20:13 +0000 (13:20 -0800)
committer	TensorFlower Gardener <gardener@tensorflow.org>
	Fri, 16 Feb 2018 21:24:00 +0000 (13:24 -0800)
commit	428d034227c9e7b637de0194d80cac3976a37eef
tree	09e7948680f12f13238254deb488473edaab8aa7
parent	96c2a846609d3a68f9a88c60c4c68a243f74ee44

Fix potential issue with the number of blocks launched for depthwise kernels: the number of work_elements was too small, which could return a block_count too small to cover all elements.

We have also been ignoring the suggested thread_per_block, so we were potentially launching more blocks than necessary to fill the GPU (which is inefficient, but functionally correct).

Change 'assert(false && ...)' to LOG(FATAL), because the check shouldn't be debug-only.
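
The three points above can be illustrated with a small host-side sketch. This is not the code from this commit: LaunchConfigSketch, MakeLaunchConfigSketch, CoversAllElements and DieIf are hypothetical names, and the real helpers in cuda_launch_config.h may compute things differently. It assumes the block count is a ceiling division of the work elements by the threads per block, capped at what the device can usefully run, and that a kernel without a grid-stride loop needs block_count * thread_per_block to be at least the number of elements.

```cpp
#include <algorithm>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for a launch configuration (not TensorFlow's type).
struct LaunchConfigSketch {
  int thread_per_block;
  int block_count;
};

// Aborts in every build mode. assert() is compiled out when NDEBUG is
// defined, which is why a debug-only check is not enough for fatal errors.
inline void DieIf(bool condition, const char* message) {
  if (condition) {
    std::fprintf(stderr, "FATAL: %s\n", message);
    std::abort();
  }
}

// Assumed policy: enough blocks to give each work element one thread,
// but never more than max_blocks (what the device can keep busy).
inline LaunchConfigSketch MakeLaunchConfigSketch(int work_elements,
                                                 int thread_per_block,
                                                 int max_blocks) {
  DieIf(work_elements <= 0, "work_elements must be positive");
  DieIf(thread_per_block <= 0, "thread_per_block must be positive");
  DieIf(max_blocks <= 0, "max_blocks must be positive");
  // Ceiling division, done in 64 bits to avoid overflow near INT_MAX.
  const long long blocks_needed =
      (static_cast<long long>(work_elements) + thread_per_block - 1) /
      thread_per_block;
  const int block_count = static_cast<int>(
      std::min<long long>(blocks_needed, max_blocks));
  return {thread_per_block, block_count};
}

// True when one pass (no grid-stride loop) gives every element a thread.
inline bool CoversAllElements(const LaunchConfigSketch& config,
                              int total_elements) {
  return static_cast<long long>(config.block_count) *
             config.thread_per_block >= total_elements;
}
```

Under these assumptions, a configuration built from an undercounted work_elements (say 100 when the kernel really touches 1000 elements) fails CoversAllElements, which is the failure mode the first paragraph describes. Launching with the thread_per_block the configuration was computed for, rather than a different hard-coded value, keeps block_count consistent with the actual launch.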

PiperOrigin-RevId: 186037306

tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc
tensorflow/core/util/cuda_launch_config.h
