llvmpipe: Use lp_build_round_arch on IBM Z (s390x)
author Marius Hillenbrand <mhillen@linux.ibm.com>
Thu, 18 Nov 2021 17:27:35 +0000 (18:27 +0100)
committer Marge Bot <emma+marge@anholt.net>
Tue, 23 Nov 2021 17:49:02 +0000 (17:49 +0000)
LLVM has all the required intrinsics available on IBM Z, so use them for
rounding operations (they will be implemented as a single instruction).
This change makes the test case lp_test_arit pass, because it avoids
using the buggy generic code.

v2: update .gitlab-ci/cross-xfail-s390x to reflect passing lp_test_arit

Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13927>

.gitlab-ci/cross-xfail-s390x
src/gallium/auxiliary/gallivm/lp_bld_arit.c
src/gallium/drivers/llvmpipe/lp_test_arit.c

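diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
index 1c71c05..1718800 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c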
@@ -1887,6 +1887,8 @@ arch_rounding_available(const struct lp_type type)
       return TRUE;
    else if (util_get_cpu_caps()->has_neon)
       return TRUE;
+   else if (util_cpu_caps_has_zarch())
+      return TRUE;
 
    return FALSE;
 }
@@ -1994,7 +1996,8 @@ lp_build_round_arch(struct lp_build_context *bld,
                     LLVMValueRef a,
                     enum lp_build_round_mode mode)
 {
-   if (util_get_cpu_caps()->has_sse4_1 || util_get_cpu_caps()->has_neon) {
+   if (util_get_cpu_caps()->has_sse4_1 || util_get_cpu_caps()->has_neon ||
+       util_cpu_caps_has_zarch()) {
       LLVMBuilderRef builder = bld->gallivm->builder;
       const struct lp_type type = bld->type;
       const char *intrinsic_root;
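diff --git a/src/gallium/drivers/llvmpipe/lp_test_arit.c b/src/gallium/drivers/llvmpipe/lp_test_arit.c
index cbea1e2..ff6cdac 100644
--- a/src/gallium/drivers/llvmpipe/lp_test_arit.c
+++ b/src/gallium/drivers/llvmpipe/lp_test_arit.c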
@@ -480,6 +480,7 @@ test_unary(unsigned verbose, FILE *fp, const struct unary_test_t *test, unsigned
          }
 
          if (!util_get_cpu_caps()->has_neon &&
+             !util_cpu_caps_has_zarch() &&
              test->ref == &nearbyintf && length == 2 &&
              ref != roundf(testval)) {
             /* FIXME: The generic (non SSE) path in lp_build_iround, which is