ARM: reuse common NEON code for over_{n_8|8888_n|8888_8}_0565
author Siarhei Siamashka <siarhei.siamashka@nokia.com>
Sat, 27 Nov 2010 02:47:39 +0000 (04:47 +0200)
committer Siarhei Siamashka <siarhei.siamashka@nokia.com>
Fri, 3 Dec 2010 13:37:19 +0000 (15:37 +0200)
commit 3990931bf6197eff1cec06cf24bce53ddf9a539a
tree 566a2626ad4f819aeb6dd2f75abfec6a5868d097
parent a7c36681c0c1955ff9110b81f1789e56abb10a95

Renamed the supplementary macros from 'over_n_8_0565' to 'over_8888_8_0565',
because they can actually support all variants of this operation:
over_8888_8_0565, over_n_8_0565 and over_8888_n_0565.
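
To illustrate why one set of macros can serve all three variants, here is a
scalar reference sketch in C. It is not the NEON code from this commit. The
helper names (mul_un8, add_sat_un8, over_pixel, over_row) and the "step of 0
means solid operand" convention are assumptions made for this example, and
the solid mask is taken as a plain 8-bit alpha for simplicity. The source is
assumed to be premultiplied a8r8g8b8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply two 8-bit values where 255 means 1.0, with rounding. */
static uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Saturating 8-bit add, so a non-premultiplied source cannot wrap around. */
static uint8_t add_sat_un8(uint8_t a, uint8_t b)
{
    unsigned t = (unsigned)a + b;
    return (uint8_t)(t > 255 ? 255 : t);
}

/* Per-pixel kernel: dst = (src IN mask) OVER dst, with dst in r5g6b5. */
static uint16_t over_pixel(uint32_t src, uint8_t mask, uint16_t dst)
{
    /* Scale all four source channels by the mask. */
    uint8_t sa = mul_un8((uint8_t)(src >> 24), mask);
    uint8_t sr = mul_un8((uint8_t)(src >> 16), mask);
    uint8_t sg = mul_un8((uint8_t)(src >> 8), mask);
    uint8_t sb = mul_un8((uint8_t)src, mask);
    uint8_t ia = (uint8_t)(255 - sa);

    /* Expand r5g6b5 to 8 bits per channel by bit replication. */
    uint8_t r5 = (dst >> 11) & 0x1f, g6 = (dst >> 5) & 0x3f, b5 = dst & 0x1f;
    uint8_t dr = (uint8_t)((r5 << 3) | (r5 >> 2));
    uint8_t dg = (uint8_t)((g6 << 2) | (g6 >> 4));
    uint8_t db = (uint8_t)((b5 << 3) | (b5 >> 2));

    /* Blend, then pack back to r5g6b5 by truncation. */
    uint8_t r = add_sat_un8(sr, mul_un8(dr, ia));
    uint8_t g = add_sat_un8(sg, mul_un8(dg, ia));
    uint8_t b = add_sat_un8(sb, mul_un8(db, ia));
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Shared row loop: a step of 0 re-reads the same element (solid operand). */
static void over_row(uint16_t *dst, const uint32_t *src, size_t src_step,
                     const uint8_t *mask, size_t mask_step, size_t width)
{
    assert(dst != NULL && src != NULL && mask != NULL);
    for (size_t i = 0; i < width; i++)
        dst[i] = over_pixel(src[i * src_step], mask[i * mask_step], dst[i]);
}

/* The three variants differ only in which operand is solid. */
static void over_8888_8_0565(uint16_t *dst, const uint32_t *src,
                             const uint8_t *mask, size_t width)
{
    over_row(dst, src, 1, mask, 1, width);
}

static void over_n_8_0565(uint16_t *dst, uint32_t solid_src,
                          const uint8_t *mask, size_t width)
{
    over_row(dst, &solid_src, 0, mask, 1, width);
}

static void over_8888_n_0565(uint16_t *dst, const uint32_t *src,
                             uint8_t solid_mask, size_t width)
{
    over_row(dst, src, 1, &solid_mask, 0, width);
}
```

In this sketch only the operand fetch differs between the variants, while
the per-pixel arithmetic is shared. How the NEON macros load and keep solid
operands in registers is not shown here.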

Also, 'over_8888_8_0565' now uses the more optimized common code instead of
its own variant, which improves performance a bit. Even though this operation
is still memory bandwidth limited, scaled variants of these fast paths may
put more stress on the CPU later.

Benchmarked on ARM Cortex-A8 @500MHz:

== before ==

    over_8888_8_0565 =  L1:  67.10  L2:  53.82  M: 44.70 (105.17%)
                        HT:  18.73  VT:  16.91  R: 14.25  RT:  4.80 (52Kops/s)

== after ==

    over_8888_8_0565 =  L1:  77.83  L2:  58.14  M: 44.82 (105.52%)
                        HT:  20.58  VT:  17.44  R: 15.05  RT:  4.88 (52Kops/s)
pixman/pixman-arm-neon-asm.S