Add GPU support for float16 batched matmul (#18436)
author     Ben Barsdell <benbarsdell@gmail.com>
           Thu, 10 May 2018 18:06:01 +0000 (11:06 -0700)
committer  Jonathan Hseu <vomjom@vomjom.net>
           Thu, 10 May 2018 18:06:01 +0000 (11:06 -0700)
commit     f08f24cd559b5824a1874a0e76d339875e43f366
tree       ade423df2e77815bcc246064124fbd0ecbe8e286
parent     9201e2c002667047b1807745c4a7d6a8e5f2e9da
Add GPU support for float16 batched matmul (#18436)

* Add GPU support for float16 batched matmul

- Uses cublasGemmBatchedEx, introduced in CUDA 9.1.
- Includes support for Tensor Op math.
- Falls back to a loop over non-batched gemm calls on older CUDA
  versions or GPU architectures (see the sketches below).
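
A minimal sketch of what the CUDA 9.1+ path boils down to (this is not the actual stream_executor code; HalfGemmBatched and its parameters are hypothetical, and column-major NN layout with fp32 accumulation is assumed):

    #include <cublas_v2.h>
    #include <cuda_fp16.h>

    cublasStatus_t HalfGemmBatched(cublasHandle_t handle, int m, int n, int k,
                                   const __half** a_ptrs,  // device array of A pointers
                                   const __half** b_ptrs,  // device array of B pointers
                                   __half** c_ptrs,        // device array of C pointers
                                   int batch_count, bool use_tensor_ops) {
      const float alpha = 1.0f, beta = 0.0f;
      // Opt in to Tensor Core kernels on Volta+; a no-op on older parts.
      cublasSetMathMode(handle, use_tensor_ops ? CUBLAS_TENSOR_OP_MATH
                                               : CUBLAS_DEFAULT_MATH);
      // fp16 storage with fp32 accumulation. CUDA 9.x types computeType as
      // cudaDataType; cuBLAS 11+ uses cublasComputeType_t (CUBLAS_COMPUTE_32F).
      return cublasGemmBatchedEx(
          handle, CUBLAS_OP_N, CUBLAS_OP_N, m, n, k, &alpha,
          reinterpret_cast<const void**>(a_ptrs), CUDA_R_16F, /*lda=*/m,
          reinterpret_cast<const void**>(b_ptrs), CUDA_R_16F, /*ldb=*/k,
          &beta, reinterpret_cast<void**>(c_ptrs), CUDA_R_16F, /*ldc=*/m,
          batch_count, CUDA_R_32F,
          use_tensor_ops ? CUBLAS_GEMM_DEFAULT_TENSOR_OP : CUBLAS_GEMM_DEFAULT);
    }

Note that cublasGemmBatchedEx expects the per-batch pointer arrays themselves to reside in device memory.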

* Refactor GPU batched gemm into one internal function (fallback path sketched below)
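
A matching sketch of the fallback loop that the refactor folds into the same internal function (again with hypothetical names; cublasSgemmEx is one pre-9.1 routine that accepts fp16 inputs and outputs with fp32 accumulation, and its pointer arguments, unlike the batched call above, are ordinary host-visible pointers to device matrices):

    #include <cublas_v2.h>
    #include <cuda_fp16.h>

    cublasStatus_t HalfGemmLoop(cublasHandle_t handle, int m, int n, int k,
                                const __half* const* a, const __half* const* b,
                                __half* const* c, int batch_count) {
      const float alpha = 1.0f, beta = 0.0f;
      // One non-batched GEMM per batch entry; stop on the first failure.
      for (int i = 0; i < batch_count; ++i) {
        cublasStatus_t s = cublasSgemmEx(
            handle, CUBLAS_OP_N, CUBLAS_OP_N, m, n, k, &alpha,
            a[i], CUDA_R_16F, /*lda=*/m,
            b[i], CUDA_R_16F, /*ldb=*/k,
            &beta, c[i], CUDA_R_16F, /*ldc=*/m);
        if (s != CUBLAS_STATUS_SUCCESS) return s;
      }
      return CUBLAS_STATUS_SUCCESS;
    }
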
tensorflow/core/kernels/batch_matmul_op_impl.h
tensorflow/core/kernels/batch_matmul_op_real.cc
tensorflow/stream_executor/blas.h
tensorflow/stream_executor/cuda/cuda_blas.cc
tensorflow/stream_executor/cuda/cuda_blas.h
tensorflow/stream_executor/stream.cc
tensorflow/stream_executor/stream.h