llvmpipe: Optimize do_triangle_ccw for POWER8
author Oded Gabbay <oded.gabbay@gmail.com>
Sun, 13 Dec 2015 15:49:32 +0000 (17:49 +0200)
committer Oded Gabbay <oded.gabbay@gmail.com>
Wed, 6 Jan 2016 12:54:16 +0000 (14:54 +0200)
commit 3bbe16ea79bb5738109df36780cc99119a006d91
tree a986f3612cd79a73e1c0b5d25d56e1508daef92f
parent e99555ef0bf1b786a1bf1e93f3304507dbb6e939

This patch converts the SSE optimization done in do_triangle_ccw to
VMX/VSX.

I measured the results on a POWER8 machine with 32 cores at 3.4GHz and
16GB of RAM.

                      FPS/Score
  Name            Before     After    Delta
------------------------------------------------
glmark2 (score)   136.6      139.8    2.34%
openarena         16.14      16.35    1.30%
xonotic           4.655      4.707    1.11%

v2:

- Convert loads to use aligned loads
- Make sure the code is built only on POWER8 LE machines
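
The two v2 points can be illustrated with a minimal sketch. This is not the
code from the patch: sub4_i32 and HYPOTHETICAL_USE_POWER8_LE are hypothetical
names, and the guard uses the compiler-predefined _ARCH_PWR8 and byte-order
macros as an assumption (the patch itself may use Mesa's own macros). The
vector path uses the AltiVec aligned load vec_ld; any other target falls back
to scalar code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: GCC/Clang predefine _ARCH_PWR8 when targeting POWER8, and
 * __BYTE_ORDER__ / __ORDER_LITTLE_ENDIAN__ describe the target byte order. */
#if defined(_ARCH_PWR8) && defined(__BYTE_ORDER__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#define HYPOTHETICAL_USE_POWER8_LE 1
#include <altivec.h>
#else
#define HYPOTHETICAL_USE_POWER8_LE 0
#endif

/* Hypothetical helper: out[i] = a[i] - b[i] for four 32-bit integers.
 * All three pointers must be 16-byte aligned. */
static void sub4_i32(const int32_t *a, const int32_t *b, int32_t *out)
{
   /* vec_ld/vec_st ignore the low four address bits, so a misaligned
    * pointer would silently access the wrong memory. */
   assert(((uintptr_t)a & 15) == 0);
   assert(((uintptr_t)b & 15) == 0);
   assert(((uintptr_t)out & 15) == 0);

#if HYPOTHETICAL_USE_POWER8_LE
   /* Vector path: aligned loads, one vector subtract, aligned store. */
   __vector signed int va = vec_ld(0, a);
   __vector signed int vb = vec_ld(0, b);
   vec_st(vec_sub(va, vb), 0, out);
#else
   /* Scalar fallback for every other target. */
   for (int i = 0; i < 4; i++)
      out[i] = a[i] - b[i];
#endif
}
```

Callers would declare their buffers with 16-byte alignment, for example
`_Alignas(16) int32_t a[4];`, so that the aligned loads are valid.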

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
src/gallium/drivers/llvmpipe/lp_setup_tri.c