[OpenMP] Add `__CUDA_ARCH__` definition when offloading with OpenMP
author Joseph Huber <jhuber6@vols.utk.edu>
Mon, 9 May 2022 19:10:04 +0000 (15:10 -0400)
committer Joseph Huber <jhuber6@vols.utk.edu>
Fri, 13 May 2022 18:38:35 +0000 (14:38 -0400)
Currently we define the `__CUDA_ARCH__` macro only in CUDA mode. This
patch allows us to use this macro in OpenMP-offloading mode when
targeting NVPTX.

Reviewed By: tra, tianshilei1992

Differential Revision: https://reviews.llvm.org/D125256
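
As an illustration of what the change enables, the following is a minimal sketch of device code that branches on `__CUDA_ARCH__` inside an OpenMP target region. The function names `cuda_arch_or_zero` and `query_device_arch` are hypothetical and are not part of this patch.

```cpp
// Hypothetical example, not part of the patch.
// Returns the value of __CUDA_ARCH__ when it is defined (device compilation
// for NVPTX), or 0 when it is not (for example, the host compilation pass).
#pragma omp declare target
int cuda_arch_or_zero() {
#if defined(__CUDA_ARCH__)
  return __CUDA_ARCH__;
#else
  return 0;
#endif
}
#pragma omp end declare target

// Runs cuda_arch_or_zero() inside a target region and copies the result back.
// If no offload device is used, the region falls back to the host and the
// result is 0.
int query_device_arch() {
  int arch = -1;
#pragma omp target map(from : arch)
  { arch = cuda_arch_or_zero(); }
  return arch;
}
```

When built with offloading flags similar to the test's RUN line (for example `-fopenmp=libomp --offload-arch=sm_70`) and run on a matching GPU, `query_device_arch()` should report 700. A host-only build returns 0.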

clang/lib/Basic/Targets/NVPTX.cpp
clang/test/OpenMP/driver-openmp-target.c

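diff --git a/clang/lib/Basic/Targets/NVPTX.cpp b/clang/lib/Basic/Targets/NVPTX.cpp
index ffd6998..9dd60ad 100644
--- a/clang/lib/Basic/Targets/NVPTX.cpp
+++ b/clang/lib/Basic/Targets/NVPTX.cpp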
@@ -179,7 +179,7 @@ void NVPTXTargetInfo::getTargetDefines(const LangOptions &Opts,
                                        MacroBuilder &Builder) const {
   Builder.defineMacro("__PTX__");
   Builder.defineMacro("__NVPTX__");
-  if (Opts.CUDAIsDevice) {
+  if (Opts.CUDAIsDevice || Opts.OpenMPIsDevice) {
     // Set __CUDA_ARCH__ for the GPU specified.
     std::string CUDAArchCode = [this] {
       switch (GPU) {
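diff --git a/clang/test/OpenMP/driver-openmp-target.c b/clang/test/OpenMP/driver-openmp-target.c
index ae8430a..8809b54 100644
--- a/clang/test/OpenMP/driver-openmp-target.c
+++ b/clang/test/OpenMP/driver-openmp-target.c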
@@ -1,4 +1,8 @@
 // REQUIRES: x86-registered-target
+// REQUIRES: nvptx-registered-target
 // REQUIRES: clang-target-64-bits
+
 // RUN: %clang %s -c -E -dM -fopenmp=libomp -fopenmp-version=45 -fopenmp-targets=x86_64-unknown-unknown -o - | FileCheck --check-prefix=CHECK-45-VERSION %s
 // CHECK-45-VERSION: #define _OPENMP 201511
+// RUN: %clang %s -c -E -dM -fopenmp=libomp -nogpulib --offload-arch=sm_70 --offload-device-only -o - | FileCheck --check-prefix=CHECK-CUDA-ARCH %s
+// CHECK-CUDA-ARCH: #define __CUDA_ARCH__ 700
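
The new check expects `--offload-arch=sm_70` to produce `__CUDA_ARCH__` 700. The sketch below shows that naming convention (`sm_XY` maps to `XY0`) with a hypothetical helper, `cudaArchCodeFor`. It is not clang's implementation, which uses a `switch` over the GPU enum as seen in the hunk above, and it assumes a plain `sm_` prefix followed by at most three digits.

```cpp
#include <string>

// Hypothetical helper, not part of clang: derives the __CUDA_ARCH__ value
// from an "sm_XY" architecture name by multiplying the numeric suffix by 10.
// Returns 0 for names that do not match the assumed "sm_<digits>" form.
int cudaArchCodeFor(const std::string &offloadArch) {
  const std::string prefix = "sm_";
  if (offloadArch.compare(0, prefix.size(), prefix) != 0)
    return 0;

  const std::string digits = offloadArch.substr(prefix.size());
  if (digits.empty() || digits.size() > 3)
    return 0;

  for (char c : digits) {
    if (c < '0' || c > '9')
      return 0;
  }
  return std::stoi(digits) * 10;
}
```

For example, `cudaArchCodeFor("sm_70")` yields 700, matching the `CHECK-CUDA-ARCH` line, while a non-NVPTX name such as `gfx906` yields 0.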