x86: Unifies 'strlen-evex' and 'strlen-evex512' implementations.
author Matthew Sterrett <matthew.sterrett@intel.com>
Fri, 15 Dec 2023 20:04:05 +0000 (12:04 -0800)
committer Noah Goldstein <goldstein.w.n@gmail.com>
Mon, 18 Dec 2023 18:38:01 +0000 (12:38 -0600)
commit e957308723ac2e55dad360d602298632980bbd38
tree 75b99dddc0746f3e950e43eda54c51449dbbe612
parent 442983319ba70de801fc856e8dd4748fba8f7f1b

This commit uses a common implementation, 'strlen-evex-base.S', for both
'strlen-evex' and 'strlen-evex512'.

The motivation is to reduce the number of implementations to maintain.
This incidentally gives a small performance improvement.

All tests pass on x86.

Benchmarks were taken on SKX (Skylake-X, Intel Core i9-7900X):
https://www.intel.com/content/www/us/en/products/sku/123613/intel-core-i97900x-xseries-processor-13-75m-cache-up-to-4-30-ghz/specifications.html

Geometric mean of new/old for strlen-evex512 over all benchmarks (N=10): 0.939
Geometric mean of new/old for wcslen-evex512 over all benchmarks (N=10): 0.965
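
One way to read these figures, assuming each of k benchmarks contributes a
single new/old timing ratio (the exact aggregation is not spelled out here,
so the symbols below are illustrative):

```latex
\[
  G = \left( \prod_{i=1}^{k} \frac{t_i^{\mathrm{new}}}{t_i^{\mathrm{old}}} \right)^{1/k}
\]
```

Under that reading, and if the benchmarks report time, G = 0.939 corresponds
to roughly 6.1% less time on geometric average, and G = 0.965 to roughly
3.5% less.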

Code Size Changes:
    strlen-evex512.S    :  +24 bytes
    wcslen-evex512.S    :  +54 bytes

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
sysdeps/x86_64/multiarch/strlen-evex-base.S
sysdeps/x86_64/multiarch/strlen-evex.S
sysdeps/x86_64/multiarch/strnlen-evex512.S
sysdeps/x86_64/multiarch/wcslen-evex512.S
sysdeps/x86_64/multiarch/wcsnlen-evex512.S