perf bench: Also allow measuring alternative memcpy implementations
author Jan Beulich <JBeulich@suse.com>
Wed, 18 Jan 2012 13:28:56 +0000 (13:28 +0000)
committer Arnaldo Carvalho de Melo <acme@redhat.com>
Tue, 24 Jan 2012 21:51:01 +0000 (19:51 -0200)
commit 800eb01484b3ca1eaf4eb5186df13fb24de2db19
tree 689b76d267371bbb5a614cefccaa0bea96187134
parent 9ea811973d49a1df0be04ff6e4df449e4fca4fb5

Intended to support the current selection of the preferred memcpy()
implementation, this patch adds the ability to also measure the two
alternative implementations, again by way of some pre-processor
replacement.
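
The pre-processor approach can be illustrated with a minimal, self-contained
C sketch. It is not the code from this patch: copy_bytewise, copy_wordwise,
COPY_VARIANTS, struct copy_variant and time_variant are hypothetical names,
and the two C routines merely stand in for the assembly variants. The idea
shown is that a single macro list names every variant once and is then
expanded into a table that a benchmark loop can walk.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for one variant: copy one byte at a time. */
static void *copy_bytewise(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
	return dst;
}

/* Hypothetical stand-in for another variant: copy 8 bytes at a time. */
static void *copy_wordwise(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len >= sizeof(uint64_t)) {
		uint64_t w;

		/* Go through a temporary to stay alignment-safe. */
		memcpy(&w, s, sizeof(w));
		memcpy(d, &w, sizeof(w));
		d += sizeof(w);
		s += sizeof(w);
		len -= sizeof(w);
	}
	while (len--)
		*d++ = *s++;
	return dst;
}

/* One entry per variant: function, short name, description. */
#define COPY_VARIANTS(X)                                      \
	X(copy_bytewise, "bytewise", "byte-at-a-time copy")    \
	X(copy_wordwise, "wordwise", "8-bytes-at-a-time copy") \
	X(memcpy,        "libc",     "C library memcpy()")

struct copy_variant {
	const char *name;
	const char *desc;
	void *(*fn)(void *, const void *, size_t);
};

/* Expand the list into a table; adding a variant is a one-line change. */
#define AS_TABLE_ENTRY(fn, name, desc) { name, desc, fn },
static const struct copy_variant variants[] = {
	COPY_VARIANTS(AS_TABLE_ENTRY)
};
#undef AS_TABLE_ENTRY

/* Time `iterations` copies of `len` bytes; returns CPU seconds used. */
static double time_variant(const struct copy_variant *v, void *dst,
			   const void *src, size_t len, int iterations)
{
	clock_t start = clock();

	for (int i = 0; i < iterations; i++)
		v->fn(dst, src, len);
	return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A caller would loop over variants[], pass each entry to time_variant() with
the same buffers and size, and compare the returned times. An optimizing
compiler may shorten or drop copies whose result is never read, so a real
measurement needs more care than this sketch takes.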

On my Westmere system this shows that the movsb-based variant is worse
than the movsq-based one (since the ERMS feature isn't there). It also
shows that, on this system, the unrolled variant outperforms the movsq
one for the default size as well as for small sizes.
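
Whether a given machine reports ERMS can be checked from user space. The
sketch below is an illustration only and not part of this patch:
cpu_has_erms is a hypothetical helper, and it assumes a GCC or Clang
toolchain whose <cpuid.h> provides __get_cpuid_count(). ERMS is reported in
CPUID leaf 7, subleaf 0, EBX bit 9.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/*
 * Hypothetical helper: returns 1 if the CPU reports ERMS, 0 if it does
 * not, and -1 when built for a non-x86 target where CPUID is unavailable.
 */
static int cpu_has_erms(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* Leaf 7, subleaf 0 holds the structured extended feature flags. */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 0;	/* leaf 7 not supported, so no ERMS */
	return (ebx >> 9) & 1;	/* EBX bit 9 is the ERMS flag */
#else
	return -1;
#endif
}
```

On a system like the Westmere machine described above, this would be
expected to return 0.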

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4F16D728020000780006D732@nat28.tlf.novell.com
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools/perf/bench/mem-memcpy-x86-64-asm-def.h
tools/perf/bench/mem-memcpy-x86-64-asm.S