[XLA] Initialize arrays using cudaMemset when possible.
author Justin Lebar <jlebar@google.com>
Wed, 21 Mar 2018 14:33:03 +0000 (07:33 -0700)
committer TensorFlower Gardener <gardener@tensorflow.org>
Wed, 21 Mar 2018 14:35:27 +0000 (07:35 -0700)
commit 39dd4ee6a3727a0eb30a8d5b8f39390383a1e761
tree 58e4db7ae151fe0ccd093771bd2fb2eefd9c01ab
parent abd5b15ababbb5601f02691620d4d8e094cff64e
[XLA] Initialize arrays using cudaMemset when possible.

Previously we were using our own hand-rolled initializer thunk.  This
worked OK for reduces, because the amount of data we were initializing
was usually small.  But for e.g. select-and-scatter, it's quite slow.

This patch lets us use cudaMemset instead.

PiperOrigin-RevId: 189904720
tensorflow/compiler/xla/service/gpu/BUILD
tensorflow/compiler/xla/service/gpu/ir_emitter_unnested.cc
tensorflow/compiler/xla/service/gpu/ir_emitter_unnested.h
tensorflow/compiler/xla/service/gpu/memset_thunk.cc [new file with mode: 0644]
tensorflow/compiler/xla/service/gpu/memset_thunk.h [new file with mode: 0644]
tensorflow/compiler/xla/service/gpu/thunk.h
tensorflow/compiler/xla/tests/reduce_test.cc