[CUDA, MemCpyOpt] Add a flag to force-enable memcpyopt and use it for CUDA.
author Artem Belevich <tra@google.com>
Tue, 20 Jul 2021 21:37:06 +0000 (14:37 -0700)
committer Artem Belevich <tra@google.com>
Fri, 6 Aug 2021 18:13:52 +0000 (11:13 -0700)
commit 6a9cf21f5a2dcd02f90075d6d3576a87f1abd8a9
tree c34ffe8d486f0b8e5c99436a034abb613d18ddd8
parent f59f6598790c8a74bcb1ab7e045484925e8bf551

An attempt to enable MemCpyOpt unconditionally in D104801 revealed that some
users do not expect LLVM to materialize the `memset` intrinsic.

While other passes can do that, too, MemCpyOpt triggers it more frequently and
breaks sanitizers and some downstream users.

For now, introduce a flag to force-enable the pass and opt in only CUDA
compilation with the NVPTX back-end.
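
A minimal sketch of the gating described above, assuming the pass is normally
skipped when the memcpy/memset libcalls are unavailable (as the test name
no-libcalls.ll suggests). All names here (LibcallInfo, shouldRunMemCpyOpt,
driverForcesMemCpyOpt) are hypothetical stand-ins for illustration, not the
identifiers used in the patch:

```cpp
// Hypothetical model of the target's libcall availability.
struct LibcallInfo {
  bool HasMemcpy; // the target can call memcpy
  bool HasMemset; // the target can call memset
};

// Assumed gating rule: run the pass when both libcalls are available,
// or when the force-enable flag overrides that check.
bool shouldRunMemCpyOpt(const LibcallInfo &LI, bool ForceEnable) {
  if (ForceEnable)
    return true;
  return LI.HasMemcpy && LI.HasMemset;
}

// Assumed driver policy: only CUDA compilation targeting NVPTX opts in;
// every other configuration keeps the default (flag off).
bool driverForcesMemCpyOpt(bool IsCudaCompilation, bool IsNVPTXBackend) {
  return IsCudaCompilation && IsNVPTXBackend;
}
```

With this shape the default stays conservative: a configuration without the
libcalls gets the pass only if it explicitly opts in.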

Differential Revision: https://reviews.llvm.org/D106401
clang/lib/Driver/ToolChains/Cuda.cpp
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
llvm/test/Transforms/MemCpyOpt/no-libcalls.ll