x86: Add avx2 optimized functions for the wchar_t strcpy family
author Noah Goldstein <goldstein.w.n@gmail.com>
Wed, 9 Nov 2022 01:38:41 +0000 (17:38 -0800)
committer Noah Goldstein <goldstein.w.n@gmail.com>
Wed, 9 Nov 2022 03:22:33 +0000 (19:22 -0800)
commit 52cf11004eb10f8ebbc193fbdf4094cfecb3dbff
tree dc3c5cf41a53bd42de548c3e0d04f37dac95b72b
parent 64b8b6516b3cba19dba4c8f4f9b97daa0556fd98

Implemented:
    wcscat-avx2  (+ 744 bytes)
    wcscpy-avx2  (+ 539 bytes)
    wcpcpy-avx2  (+ 577 bytes)
    wcsncpy-avx2 (+1108 bytes)
    wcpncpy-avx2 (+1214 bytes)
    wcsncat-avx2 (+1085 bytes)
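
For readers less familiar with the "p" variants, the return-value
contract that the optimized versions have to preserve can be sketched in
portable C. This is only an illustrative reference, not the code added by
this commit; the `ref_*` names are hypothetical and exist only for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical reference for wcpcpy: copy src, including the
   terminating L'\0', and return a pointer to the terminator in dst.  */
wchar_t *ref_wcpcpy(wchar_t *dst, const wchar_t *src)
{
    size_t len = wcslen(src);
    wmemcpy(dst, src, len + 1);
    return dst + len;
}

/* Hypothetical reference for wcscpy: same copy, but returns dst.  */
wchar_t *ref_wcscpy(wchar_t *dst, const wchar_t *src)
{
    ref_wcpcpy(dst, src);
    return dst;
}

/* Hypothetical reference for wcpncpy: copy at most n wide characters,
   pad the rest of the n-element window with L'\0', and return a pointer
   to the first L'\0' written, or dst + n if src was not terminated
   within n characters.  */
wchar_t *ref_wcpncpy(wchar_t *dst, const wchar_t *src, size_t n)
{
    size_t len = 0;
    while (len < n && src[len] != L'\0')
        len++;
    wmemcpy(dst, src, len);
    wmemset(dst + len, L'\0', n - len);
    return dst + len;
}
```

The only difference between the wcs* and wcp* copies is the returned
pointer, which is why the two groups can plausibly share most of one
implementation.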

Performance Changes:
    Times are from N = 10 runs of the benchmark suite and are reported
    as the geometric mean of all ratios of New Implementation / Best Old
    Implementation, so values below 1.0 mean the new implementation is
    faster. Best Old Implementation is the existing implementation for
    the highest ISA level.

    wcscat-avx2     -> 0.975
    wcscpy-avx2     -> 0.591
    wcpcpy-avx2     -> 0.698
    wcsncpy-avx2    -> 0.730
    wcpncpy-avx2    -> 0.711
    wcsncat-avx2    -> 0.954

Code Size Changes:
    This change increases the size of libc.so by ~5.5kb. For
    reference, the patch optimizing the normal strcpy family functions
    decreases libc.so by ~5.2kb.

Full check passes on x86-64 and build succeeds for all ISA levels w/
and w/o multiarch.
27 files changed:
sysdeps/x86_64/multiarch/Makefile
sysdeps/x86_64/multiarch/ifunc-impl-list.c
sysdeps/x86_64/multiarch/ifunc-wcs.h
sysdeps/x86_64/multiarch/wcpcpy-avx2.S [new file with mode: 0644]
sysdeps/x86_64/multiarch/wcpcpy-generic.c
sysdeps/x86_64/multiarch/wcpncpy-avx2.S [new file with mode: 0644]
sysdeps/x86_64/multiarch/wcpncpy-generic.c
sysdeps/x86_64/multiarch/wcscat-avx2.S [new file with mode: 0644]
sysdeps/x86_64/multiarch/wcscat-generic.c
sysdeps/x86_64/multiarch/wcscpy-avx2.S [new file with mode: 0644]
sysdeps/x86_64/multiarch/wcscpy-generic.c
sysdeps/x86_64/multiarch/wcscpy.c
sysdeps/x86_64/multiarch/wcsncat-avx2.S [new file with mode: 0644]
sysdeps/x86_64/multiarch/wcsncat-generic.c
sysdeps/x86_64/multiarch/wcsncpy-avx2.S [new file with mode: 0644]
sysdeps/x86_64/multiarch/wcsncpy-generic.c
sysdeps/x86_64/wcpcpy-generic.c
sysdeps/x86_64/wcpcpy.S
sysdeps/x86_64/wcpncpy-generic.c
sysdeps/x86_64/wcpncpy.S
sysdeps/x86_64/wcscat-generic.c
sysdeps/x86_64/wcscat.S
sysdeps/x86_64/wcscpy.S
sysdeps/x86_64/wcsncat-generic.c
sysdeps/x86_64/wcsncat.S
sysdeps/x86_64/wcsncpy-generic.c
sysdeps/x86_64/wcsncpy.S