(#14580)
author Jie <jiej@nvidia.com>
Thu, 6 Dec 2018 16:57:39 +0000 (08:57 -0800)
committer Facebook Github Bot <facebook-github-bot@users.noreply.github.com>
Thu, 6 Dec 2018 17:03:46 +0000 (09:03 -0800)
commit d2fdc33411a3ffeb0575604d60014869869c5653
tree 42f3279d5f7a81125f0cafe3a09db8c62a8902fe
parent eb3cabffd69e37162a3fe0bb1bbfa3de83404f3a

Summary:
Removes the explicit cast from half to float in torch.sum when the input
tensor is float16 and the output tensor is float32; instead, the data is
cast as each element is loaded inside the reduction kernel.

This should save a kernel launch as well as a full global-memory pass over
the promoted data type (float).
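The user-facing pattern this change optimizes can be sketched as follows. This is a minimal illustration, not part of the PR itself; it runs the reduction on CPU for portability, while the optimized path in this commit is the CUDA kernel (use device="cuda" to exercise it):

```python
import torch

# Summing a float16 tensor directly into a float32 accumulator.
# Before this change, the float16 input was first cast to a temporary
# float32 tensor (an extra kernel launch plus a full global-memory
# read/write of the promoted type); after it, the cast happens per
# element as the input is loaded inside the reduction kernel.
x = torch.randn(1024, dtype=torch.float16)
out = x.sum(dtype=torch.float32)

print(out.dtype)  # torch.float32
```

Accumulating in float32 also avoids the precision loss of a pure float16 reduction, which is why the promoted-output path exists in the first place.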
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14580

Differential Revision: D13356203

Pulled By: ezyang

fbshipit-source-id: 85e91225b880a65fe3ceb493371b9b36407fdf48
aten/src/ATen/native/ReduceOps.cpp
aten/src/ATen/native/TensorIterator.cpp
aten/src/ATen/native/TensorIterator.h
aten/src/ATen/native/cuda/Reduce.cuh
aten/src/ATen/native/cuda/ReduceOpsKernel.cu
test/test_cuda.py