vmx: implement fast path vmx_fill
Based on the sse2 implementation.
It was benchmarked against commit id
e2d211a from pixman/master
Tested cairo trimmed benchmarks on POWER8, 8 cores, 3.4GHz,
RHEL 7.1 ppc64le:
speedups
========
t-swfdec-giant-steps       1383.09 ->  718.63 : 1.92x speedup
t-gnome-system-monitor     1403.53 ->  918.77 : 1.53x speedup
t-evolution                 552.34 ->  415.24 : 1.33x speedup
t-xfce4-terminal-a1        1573.97 -> 1351.46 : 1.16x speedup
t-firefox-paintball         847.87 ->  734.50 : 1.15x speedup
t-firefox-asteroids         565.99 ->  492.77 : 1.15x speedup
t-firefox-canvas-swscroll  1656.87 -> 1447.48 : 1.14x speedup
t-midori-zoomed             724.73 ->  642.16 : 1.13x speedup
t-firefox-planet-gnome      975.78 ->  911.92 : 1.07x speedup
t-chromium-tabs             292.12 ->  274.74 : 1.06x speedup
t-firefox-chalkboard        690.78 ->  653.93 : 1.06x speedup
t-firefox-talos-gfx        1375.30 -> 1303.74 : 1.05x speedup
t-firefox-canvas-alpha     1016.79 ->  967.24 : 1.05x speedup
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>