review.tizen.org Git - platform/upstream/pytorch.git/commit

author	Jie <jiej@nvidia.com>
	Tue, 18 Dec 2018 04:08:15 +0000 (20:08 -0800)
committer	Facebook Github Bot <facebook-github-bot@users.noreply.github.com>
	Tue, 18 Dec 2018 04:13:30 +0000 (20:13 -0800)
commit	bd958cde685c2de67ecf691934470ef3c289e00d
tree	228d6d40e6ee57d5b0b1342848286d49faa5e10b
parent	71ee882157ea06b0e8facb510c44b5a3a55e5d91

TensorIterator fixing mean to output correct result for half precision (#14878)

Summary:
Reference: #12115

mean is calculated in two steps: sum() / numel(). For half precision, the
result of sum() is cast back to half before the division, so a sum above the
largest finite half value (65504) overflows to inf.
We fused the division into the reduction kernel by adding pre_op/post_op.

With this change, torch.ones(65536).cuda().half().mean() returns the correct
result.
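
The overflow described above can be reproduced without a GPU. The following is a minimal pure-Python sketch, not the ATen kernel: `to_half`, `mean_two_step`, and `mean_fused` are hypothetical names introduced for illustration, and the sketch assumes the reduction accumulates in a wider type and only the final cast targets half precision. It uses the standard library's `struct` format `e` to round values to IEEE 754 half precision.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values beyond the half range;
        # a hardware cast would produce infinity instead.
        return float("inf") if x > 0 else float("-inf")


def mean_two_step(values):
    # Old behaviour (modelled): cast the sum to half, then divide.
    total = to_half(sum(values))
    return to_half(total / len(values))


def mean_fused(values):
    # New behaviour (modelled): divide while still in the wider
    # accumulation type, then cast the final result to half once.
    total = sum(values)
    return to_half(total / len(values))


values = [to_half(1.0)] * 65536

print("two-step:", mean_two_step(values))
print("fused:   ", mean_fused(values))
```

In this model the two-step version yields inf, because 65536 exceeds the largest finite half value, while the fused version yields 1.0. The fused division plays the role that the commit assigns to the post_op applied inside the reduction.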
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14878

Differential Revision: D13491159

Pulled By: soumith

fbshipit-source-id: e83802e1628b6d2615c45e18d7acf991d143a09e

aten/src/ATen/native/ReduceOps.cpp
aten/src/ATen/native/ReduceOps.h
aten/src/ATen/native/TensorIterator.cpp
aten/src/ATen/native/cpu/Reduce.h
aten/src/ATen/native/cpu/ReduceOpsKernel.cpp
aten/src/ATen/native/cuda/Reduce.cuh
aten/src/ATen/native/cuda/ReduceOpsKernel.cu
test/test_cuda.py
tools/autograd/derivatives.yaml