As seen in #19203, the `__builtin_expect` compiler builtin isn't recognized as
a builtin in nvcc8, leading to compilation failures of the form
./tensorflow/core/kernels/gather_functor_gpu.cu.h(57): error: calling a __host__ function("__builtin_
expect") from a __global__ function("tensorflow::GatherOpKernel< ::Eigen::half, int, (bool)1> ") is n
ot allowed
when attempting to build TensorFlow.
This change fixes things by adding an additional check for `__NVCC__`, and
avoiding any branch prediction hints in that case.
PiperOrigin-RevId:
197067418
// analysis. Giving it this information can help it optimize for the
// common case in the absence of better information (ie.
// -fprofile-arcs).
-#if TF_HAS_BUILTIN(__builtin_expect) || (defined(__GNUC__) && __GNUC__ >= 3)
+//
+// We need to disable this for GPU builds, though, since nvcc8 and older
+// don't recognize `__builtin_expect` as a builtin, and fail compilation.
+#if (!defined(__NVCC__)) && \
+ (TF_HAS_BUILTIN(__builtin_expect) || (defined(__GNUC__) && __GNUC__ >= 3))
#define TF_PREDICT_FALSE(x) (__builtin_expect(x, 0))
#define TF_PREDICT_TRUE(x) (__builtin_expect(!!(x), 1))
#else