aarch64: Optimized implementation of strnlen
author Xuelei Zhang <zhangxuelei4@huawei.com>
Thu, 19 Dec 2019 13:49:46 +0000 (13:49 +0000)
committer Adhemerval Zanella <adhemerval.zanella@linaro.org>
Thu, 19 Dec 2019 19:31:04 +0000 (16:31 -0300)
commit 2911cb68ed3d6c515ad1979237e74e1fefab3674
tree e7a53b39337a460ef833279f1d0a3c2ab54177fb
parent 0237b61526e716fa9597f521643908a4fda3b46a

Optimize the strnlen implementation by using vector operations and
loop unrolling in the main loop. Compared to the previous
aarch64/strnlen.S, it reduces latency of cases in bench-strnlen by
11%~24% when the length of src is greater than 64 bytes, with gains
throughout the benchmark.
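
The idea of an unrolled, wide-load main loop can be illustrated with a
portable C sketch. This is not the code in sysdeps/aarch64/strnlen.S: the
name sketch_strnlen, the helper has_zero_byte, and the 32-byte block size
are hypothetical, and 64-bit integer loads stand in for the vector
registers used by the assembly. The sketch also assumes the caller's buffer
has at least maxlen readable bytes, an assumption the real strnlen does not
make.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if any of the 8 bytes in w is zero. */
static int has_zero_byte(uint64_t w)
{
    return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

/* Illustrative only. Assumes s points to at least maxlen readable bytes. */
size_t sketch_strnlen(const char *s, size_t maxlen)
{
    size_t i = 0;

    /* Main loop, unrolled four times: test 32 bytes per iteration. */
    while (maxlen - i >= 32) {
        uint64_t w0, w1, w2, w3;

        /* memcpy keeps the loads free of alignment and aliasing issues. */
        memcpy(&w0, s + i, 8);
        memcpy(&w1, s + i + 8, 8);
        memcpy(&w2, s + i + 16, 8);
        memcpy(&w3, s + i + 24, 8);

        /* Leave the fast path as soon as a block contains a NUL byte. */
        if (has_zero_byte(w0) || has_zero_byte(w1) ||
            has_zero_byte(w2) || has_zero_byte(w3))
            break;

        i += 32;
    }

    /* Tail: locate the exact NUL, or stop at maxlen, one byte at a time. */
    while (i < maxlen && s[i] != '\0')
        i++;

    return i;
}
```

Testing several words per iteration reduces the number of loop branches per
byte scanned, which is the kind of effect that would matter most for longer
inputs.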

Checked on aarch64-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
sysdeps/aarch64/strnlen.S