[NVPTX] Add more FMA intrinsics/builtins
author Jakub Chlanda <j.chlanda@gmail.com>
Tue, 22 Feb 2022 22:45:19 +0000 (14:45 -0800)
committer Artem Belevich <tra@google.com>
Wed, 23 Feb 2022 21:56:53 +0000 (13:56 -0800)
commit be672934ff885255b7e5e393bf4606e9fb8894a0
tree f2c11563982497cd74c4e8a61e3cb519341a2f81
parent e0dc4ac28f0080a10a51a4627c880ca795f07ba0

This patch adds builtins/intrinsics for the following variants of FMA:

- f16, f16x2
  - rn
  - rn_ftz
  - rn_sat
  - rn_ftz_sat
  - rn_relu
  - rn_ftz_relu
- bf16, bf16x2
  - rn
  - rn_relu
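
For readers unfamiliar with the suffixes above, here is a minimal scalar sketch of what the modifiers mean, based on the PTX ISA description of `fma` as I understand it. It is not code from this patch. The names `fma_model`, `flush_subnormal` and `F16_MIN_NORMAL` are hypothetical, the model works on Python floats, and it does not reproduce half-precision rounding or the single rounding step of a true fused multiply-add. The subnormal threshold is the f16 one, so the `ftz` path is only meaningful for the f16 variants.

```python
import math

# Smallest positive normal f16 value; smaller nonzero magnitudes are subnormal.
# (Assumption: the f16 threshold is used because ftz is listed only for f16.)
F16_MIN_NORMAL = 2.0 ** -14


def flush_subnormal(x):
    """Model of .ftz: a subnormal f16 magnitude becomes a sign-preserving zero."""
    if x != 0.0 and abs(x) < F16_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


def fma_model(a, b, c, ftz=False, sat=False, relu=False):
    """Illustrative model of a*b+c with the ftz/sat/relu modifiers applied.

    Half-precision rounding and true fused (single-rounding) behaviour
    are deliberately NOT modeled.
    """
    if sat and relu:
        # The variant list has no combined sat+relu form.
        raise ValueError("sat and relu are not combined in the listed variants")
    if ftz:
        a, b, c = (flush_subnormal(v) for v in (a, b, c))
    r = a * b + c
    if ftz:
        r = flush_subnormal(r)
    if sat:
        # .sat: clamp to [0.0, 1.0]; a NaN result becomes +0.0.
        if math.isnan(r) or r <= 0.0:
            return 0.0
        return min(r, 1.0)
    if relu:
        # .relu: negative results become 0; a NaN result stays NaN.
        if not math.isnan(r) and r < 0.0:
            return 0.0
    return r
```

For example, `fma_model(2.0, 3.0, 1.0, sat=True)` clamps the result 7.0 down to 1.0, while `fma_model(2.0, -3.0, 1.0, relu=True)` turns the result -5.0 into 0.0.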

ptxas (Cuda compilation tools, release 11.0, V11.0.194) accepts the generated assembly.

Differential Revision: https://reviews.llvm.org/D118977
llvm/include/llvm/IR/IntrinsicsNVVM.td
llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp
llvm/test/CodeGen/NVPTX/math-intrins-sm53-ptx42.ll [new file with mode: 0644]
llvm/test/CodeGen/NVPTX/math-intrins-sm80-ptx70-instcombine.ll
llvm/test/CodeGen/NVPTX/math-intrins-sm80-ptx70.ll
llvm/test/CodeGen/NVPTX/math-intrins-sm86-ptx72.ll