Enable Half, BFloat16, and Complex dtypes for coo-coo sparse matmul [CUDA] (#59980)
author    Ivan Yashchuk <ivan.yashchuk@aalto.fi>
          Mon, 30 Aug 2021 22:03:15 +0000 (15:03 -0700)
committer Facebook GitHub Bot <facebook-github-bot@users.noreply.github.com>
          Mon, 30 Aug 2021 22:06:25 +0000 (15:06 -0700)
commit    ad4848565e1d9f4d408c60614f213acb52035181
tree      29a4f95a1262fe323b08c5edb5cf0c5733ed7fa8
parent    c3464e78a461c6275e9fbbe3dfa72ca3983cb4df
Enable Half, BFloat16, and Complex dtypes for coo-coo sparse matmul [CUDA] (#59980)

Summary:
This PR enables Half, BFloat16, ComplexFloat, and ComplexDouble support for matrix-matrix multiplication of COO sparse matrices.
The change is applied only to CUDA 11+ builds.

`cusparseSpGEMM` also supports `CUDA_C_16F` (complex float16) and `CUDA_C_16BF` (complex bfloat16). PyTorch likewise has a complex float16 dtype (`ScalarType::ComplexHalf`), but there is no convenient dispatch macro for it, so that dtype is omitted from this PR.
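A minimal usage sketch of what this change enables (the helper name below is invented for illustration; on a CUDA 11+ build the same call also accepts `torch.half`, `torch.bfloat16`, `torch.cfloat`, and `torch.cdouble`):

```python
import torch

def sparse_mm_demo(dtype=torch.float32, device="cpu"):
    # Build two sparse COO matrices (identity, for a trivially checkable result).
    a = torch.eye(3, dtype=dtype, device=device).to_sparse()
    b = torch.eye(3, dtype=dtype, device=device).to_sparse()
    # sparse @ sparse matmul of COO tensors; on CUDA 11+ this path
    # dispatches to cusparseSpGEMM, now including the dtypes added here.
    return torch.sparse.mm(a, b).to_dense()

# On a GPU with a CUDA 11+ build, e.g.:
#   sparse_mm_demo(torch.half, "cuda")
#   sparse_mm_demo(torch.cfloat, "cuda")
```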

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59980

Reviewed By: ngimel

Differential Revision: D29699456

Pulled By: cpuhrsch

fbshipit-source-id: 407ae53392acb2f92396a62a57cbaeb0fe6e950b
aten/src/ATen/cuda/CUDADataType.h [new file with mode: 0644]
aten/src/ATen/native/sparse/cuda/SparseMatMul.cu
test/test_sparse.py
torch/testing/_internal/common_cuda.py
torch/utils/hipify/cuda_to_hip_mappings.py