[CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive
author Johannes Doerfert <johannes@jdoerfert.de>
Tue, 12 Jul 2022 02:42:16 +0000 (21:42 -0500)
committer Johannes Doerfert <johannes@jdoerfert.de>
Thu, 21 Jul 2022 17:36:54 +0000 (12:36 -0500)
commit 48d6f5240187573881f96cc9574ea09592f50723
tree f3a782e9031cf73f6cc625c440ff75a1be682091
parent d150152615074190d20492512da439cd5820b04a

A copy-paste error made the unsigned long long versions of the shfl and
shfl_sync intrinsics call themselves, so their definitions recursed without
end, which is UB. Reported and diagnosed by @trws.
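
The shape of such a recursion can be sketched on the host in plain C++. This is only an illustration, not the code from __clang_cuda_intrinsics.h: demo_shfl is a hypothetical stand-in that returns its input instead of exchanging values between lanes, and the commented-out overload is an assumed reconstruction of how a wrapper ends up calling itself when the argument is cast to the wrong type.

```cpp
#include <cassert>

// Hypothetical stand-in for the signed 64-bit shuffle. A real device
// implementation would exchange the value between lanes; this host-side
// model returns its input so only the overload structure is exercised.
inline long long demo_shfl(long long val, int offset) {
  (void)offset;
  return val;
}

// Assumed recursive shape, left commented out because calling it would
// never terminate:
//
//   inline unsigned long long demo_shfl(unsigned long long val, int offset) {
//     return static_cast<unsigned long long>(
//         demo_shfl(static_cast<unsigned long long>(val), offset));
//   }
//
// The argument keeps its unsigned type, so overload resolution selects
// this same function again.

// Non-recursive shape: cast the argument to long long so the call resolves
// to the signed overload, then cast the result back to unsigned.
inline unsigned long long demo_shfl(unsigned long long val, int offset) {
  return static_cast<unsigned long long>(
      demo_shfl(static_cast<long long>(val), offset));
}

// Small self-check for the model above.
inline void demo_self_check() {
  const unsigned long long all_ones = ~0ULL;
  assert(demo_shfl(all_ones, 1) == all_ones);
  assert(demo_shfl(42ULL, 1) == 42ULL);
}
```

Calling demo_self_check() from a host program exercises the unsigned overload and returns normally, because the inner call lands on the long long overload rather than on the wrapper itself.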

Differential Revision: https://reviews.llvm.org/D129536
clang/lib/Headers/__clang_cuda_intrinsics.h
clang/test/CodeGenCUDA/shuffle_long_long.cu [new file with mode: 0644]