review.tizen.org Git - platform/upstream/pytorch.git/commit

Implementing cuda kernel for tril_indices and triu_indices (#15203)

Summary:
Follow-up to #14904, and the stretch goal of #12653.

Compute each coordinate in the original tensor directly from its column index in the result tensor. Each GPU thread handles one column (two numbers) of the output tensor.
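
As an illustration of the index-to-coordinate mapping, here is a minimal Python sketch. It is not the kernel from this commit: it covers only the lower triangle of a square `n x n` matrix with zero offset, and the names `tril_coord` and `tril_indices_sketch` are hypothetical. Each loop iteration stands in for one GPU thread.

```python
import math


def tril_coord(k):
    """Return (row, col) of the k-th lower-triangular element (row-major, offset 0)."""
    # Row r is preceded by r * (r + 1) // 2 elements, so the row is the largest
    # r with r * (r + 1) // 2 <= k, i.e. floor((sqrt(8k + 1) - 1) / 2).
    row = (math.isqrt(8 * k + 1) - 1) // 2
    col = k - row * (row + 1) // 2
    return row, col


def tril_indices_sketch(n):
    """Build the 2 x N index lists, one independent computation per output column."""
    total = n * (n + 1) // 2
    coords = [tril_coord(k) for k in range(total)]  # each k plays the role of a thread
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return rows, cols
```

Because every output column depends only on its own index `k`, no thread needs to read another thread's result.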

The implementation detects and handles precision loss when computing the square root of an `int64_t` variable, and supports tensors with up to `row * column = 2^59` numbers.
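
One way to guard against this kind of precision loss, sketched in Python under the same simplifying assumptions (square matrix, zero offset, lower triangle), is to take a floating-point estimate of the row and then correct it with exact integer checks. The function name `row_from_index` is hypothetical, and this is not necessarily how the kernel does it. Python integers do not overflow, so the sketch shows only the rounding correction, not the `int64_t` range limits.

```python
import math


def row_from_index(k):
    """Row of the k-th lower-triangular element, robust to float rounding."""
    # A double has a 53-bit significand, so for large k the square root
    # (and therefore this estimate) can be off by a small amount.
    row = int((math.sqrt(8.0 * k + 1.0) - 1.0) / 2.0)

    # Correct the estimate with exact integer arithmetic:
    # the true row is the largest r with r * (r + 1) // 2 <= k.
    while row * (row + 1) // 2 > k:
        row -= 1
    while (row + 1) * (row + 2) // 2 <= k:
        row += 1
    return row
```

The two loops normally run zero or one time, since the floating-point estimate is already close to the true row.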

Algorithm details are described in the [comments of TensorFactories.cu](https://github.com/pytorch/pytorch/blob/23ddb6f58a1c8a7a660a793f174cf014230176c6/aten/src/ATen/native/cuda/TensorFactories.cu#L109-L255).

zou3519
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15203

Reviewed By: zou3519

Differential Revision: D13517695

Pulled By: mrshenli

fbshipit-source-id: 86b305d22cac08c8962a3b0cf8e9e620b7ec33ea

author	Shen Li <shenli@fb.com>
	Thu, 20 Dec 2018 18:21:02 +0000 (10:21 -0800)
committer	Facebook Github Bot <facebook-github-bot@users.noreply.github.com>
	Thu, 20 Dec 2018 18:23:38 +0000 (10:23 -0800)
commit	06a7cb59019ee57c679ba2cf7d51e36bd3710ad4
tree	f2626119c931d0665a731a1a84556793666f77a7
parent	5c66662e58c5b87b3f39913bde056d5ecfd4e58e

aten/src/ATen/native/TensorFactories.cpp
aten/src/ATen/native/TensorFactories.h	[new file with mode: 0644]
aten/src/ATen/native/cuda/TensorFactories.cu
aten/src/ATen/native/native_functions.yaml
test/common_methods_invocations.py
test/test_cuda.py
test/test_torch.py
torch/_torch_docs.py