[Cuda] Add initial support for wrapping CUDA images in the new driver.
This patch adds the initial support for wrapping CUDA images. This
requires changing some of the logic for how we bundle images. We now
need to copy the image for all kinds that are active for the
architecture. Then we need to run a separate wrapping job if the Kind is
Cuda. For cuda wrapping we need to use the `fatbinary` program from the
CUDA SDK to bundle all the binaries together. This is then passed to a
new function to perfom the actual module code generation that will be
implemented in a later patch.
Depends on D120273 D123471
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D123810