GPUToCUDA: attach CUBIN to the nested module rather than to the function
author     Alex Zinenko <zinenko@google.com>
           Tue, 8 Oct 2019 12:11:00 +0000 (05:11 -0700)
committer  A. Unique TensorFlower <gardener@tensorflow.org>
           Tue, 8 Oct 2019 12:11:26 +0000 (05:11 -0700)
commit     11d12670daef546f55cc76d8fe0b32f137ab3bb6
tree       6b0b7a5100f517e966eb35001118df42b790d22f
parent     52e082b6ed964ad408abc637b995bc13ff2fb122

Previously, attributes containing CUBIN blobs were attached to the kernel
function called by `gpu.launch_func`. This kernel is now contained in a nested
module that is used as a compilation unit. Attach the compiled CUBIN blob to
the module rather than to the function, since the module is what gets
compiled. This also avoids duplicating the attribute on multiple kernels
within the same module.
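
The difference between the two attachment schemes can be illustrated with a
small standalone sketch. This is a toy model only: `ToyKernelFunc`,
`ToyKernelModule`, the attribute name `toy.cubin`, and all helper functions are
hypothetical names invented for illustration, not the MLIR API or the code
touched by this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins, NOT MLIR classes: an "op" is a name plus string attributes.
struct ToyKernelFunc {
  std::string name;
  std::map<std::string, std::string> attrs;
};

struct ToyKernelModule {
  std::string name;
  std::map<std::string, std::string> attrs;
  std::vector<ToyKernelFunc> kernels;
};

// Hypothetical attribute name chosen for this sketch.
const char *const kCubinAttrName = "toy.cubin";

// Old scheme: every kernel function carries its own copy of the blob.
void attachCubinPerFunction(ToyKernelModule &module, const std::string &blob) {
  for (ToyKernelFunc &kernel : module.kernels)
    kernel.attrs[kCubinAttrName] = blob;
}

// New scheme: the blob is attached once, to the module that was compiled.
void attachCubinToModule(ToyKernelModule &module, const std::string &blob) {
  module.attrs[kCubinAttrName] = blob;
}

// Counts how many copies of the blob attribute exist in the module and its
// kernels combined.
std::size_t countCubinCopies(const ToyKernelModule &module) {
  std::size_t copies = module.attrs.count(kCubinAttrName);
  for (const ToyKernelFunc &kernel : module.kernels)
    copies += kernel.attrs.count(kCubinAttrName);
  return copies;
}

// Launch-side lookup under the new scheme: check that the kernel exists in the
// module, then read the blob from the module. Returns nullptr if the kernel or
// the attribute is missing.
const std::string *lookupCubinForKernel(const ToyKernelModule &module,
                                        const std::string &kernelName) {
  for (const ToyKernelFunc &kernel : module.kernels) {
    if (kernel.name != kernelName)
      continue;
    auto it = module.attrs.find(kCubinAttrName);
    return it == module.attrs.end() ? nullptr : &it->second;
  }
  return nullptr;
}
```

In this model, a module with two kernels stores two copies of the blob under
the per-function scheme and one under the per-module scheme, and any kernel in
the module resolves to the same blob.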

PiperOrigin-RevId: 273497303
mlir/include/mlir/Conversion/GPUToCUDA/GPUToCUDAPass.h
mlir/lib/Conversion/GPUToCUDA/ConvertKernelFuncToCubin.cpp
mlir/lib/Conversion/GPUToCUDA/ConvertLaunchFuncToCudaCalls.cpp
mlir/test/Conversion/GPUToCUDA/lower-launch-func-to-cuda.mlir
mlir/test/Conversion/GPUToCUDA/lower-nvvm-kernel-to-cubin.mlir
mlir/tools/mlir-cuda-runner/mlir-cuda-runner.cpp