[NVPTX] Backend support for variadic functions
This patch adds lowering for calls to functions with a variable number
of arguments and enables support for the following
instructions/intrinsics:
- va_arg
- va_start
- va_end
- va_copy
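For reference, the four operations above correspond to the standard C
<stdarg.h> macros. The following is a minimal host-side sketch in plain C
(not CUDA, and not taken from this patch); the function name sum_twice is
hypothetical and only illustrates the semantics the backend has to
preserve. How a front end maps each macro onto the IR-level
instruction/intrinsics is front-end dependent.

```c
#include <stdarg.h>

/* Hypothetical example: sums `count` int arguments twice, once through
   the original va_list and once through a copy of it, so that all four
   operations are exercised. */
int sum_twice(int count, ...) {
    va_list ap, ap_copy;
    int total = 0;

    va_start(ap, count);   /* begin reading the variadic arguments */
    va_copy(ap_copy, ap);  /* independent cursor at the same position */

    for (int i = 0; i < count; ++i)
        total += va_arg(ap, int);      /* first pass */
    for (int i = 0; i < count; ++i)
        total += va_arg(ap_copy, int); /* second pass over the copy */

    va_end(ap_copy);       /* every va_start/va_copy needs a va_end */
    va_end(ap);
    return total;
}
```

Calling sum_twice(3, 1, 2, 3) walks the three arguments once per cursor,
so each value contributes twice to the result.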
Note that this patch does not add Clang support for variadic functions
in CUDA.
According to the PTX documentation:
  PTX version 6.0 supports passing unsized array parameter to a
  function which can be used to implement variadic functions. [0]

  The last parameter in the parameter list may be a .param array of
  type .b8 with no size specified. It is used to pass an arbitrary
  number of parameters to the function packed into a single array
  object.
  When calling a function with such an unsized last argument, the last
  argument may be omitted from the call instruction if no parameter is
  passed through it. Accesses to this array parameter must be within
  the bounds of the array. The result of an access is undefined if no
  array was passed, or if the access was outside the bounds of the
  actual array being passed. [1]
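The packing described in the quoted text can be modeled on the host as
follows. This is only an illustrative sketch in plain C, not the lowering
implemented by this patch: the names packed_args, pack_arg and unpack_arg,
the 64-byte capacity, and the choice of placing each value at a
caller-supplied alignment are all assumptions made for the example. The
explicit bounds check stands in for the requirement that accesses stay
within the array that was actually passed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an unsized byte-array parameter. */
typedef struct {
    unsigned char bytes[64]; /* assumed capacity, for illustration only */
    size_t size;             /* number of bytes the caller has packed */
} packed_args;

/* Rounds offset up to the next multiple of align (align must be > 0). */
static size_t align_up(size_t offset, size_t align) {
    return (offset + align - 1) / align * align;
}

/* Caller side: appends one value at the requested alignment.
   Returns 0 on success, -1 if the value does not fit. */
int pack_arg(packed_args *p, const void *value, size_t size, size_t align) {
    size_t off = align_up(p->size, align);
    if (off + size > sizeof p->bytes)
        return -1;
    memcpy(p->bytes + off, value, size);
    p->size = off + size;
    return 0;
}

/* Callee side: reads the next value and advances *cursor.
   Returns -1 instead of reading past the packed bytes. */
int unpack_arg(const packed_args *p, size_t *cursor, void *out,
               size_t size, size_t align) {
    size_t off = align_up(*cursor, align);
    if (off + size > p->size)
        return -1;
    memcpy(out, p->bytes + off, size);
    *cursor = off + size;
    return 0;
}
```

In this model the caller packs an int and then a double, the callee
unpacks them in the same order with the same sizes and alignments, and a
third read fails the bounds check rather than touching bytes that were
never passed.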
Note that aggregates passed by value as variadic arguments are not
currently supported.
[0] https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#variadic-functions
[1] https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#kernel-and-function-directives-func
Differential Revision: https://reviews.llvm.org/D138531