x86: Optimize memchr-evex.S
author Noah Goldstein <goldstein.w.n@gmail.com>
Mon, 3 May 2021 07:03:19 +0000 (03:03 -0400)
committer Noah Goldstein <goldstein.w.n@gmail.com>
Tue, 4 May 2021 01:18:03 +0000 (21:18 -0400)
commit 2a76821c3081d2c0231ecd2618f52662cb48fccd
tree 053a73e993b4eb15341f53039a3b98dade784c28
parent acfd088a1963ba51cd83c78f95c0ab25ead79e04

No bug. This commit optimizes memchr-evex.S. The optimizations include
replacing some branches with cmovcc, avoiding some branches entirely
in the less_4x_vec case, making the page-cross logic less strict,
saving some ALU operations in the alignment process, and, most
importantly, increasing ILP (instruction-level parallelism) in the 4x
loop. test-memchr, test-rawmemchr, and test-wmemchr all pass.
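
As a rough illustration of what "increasing ILP in the 4x loop" can mean,
here is a minimal portable C sketch. It is not the code from
memchr-evex.S: the names sketch_memchr and any_match are hypothetical,
and 8-byte integer words stand in for EVEX vector registers. The four
comparisons per iteration do not depend on one another, so the CPU can
overlap them, and their results are OR-ed together so the loop takes one
branch per 32 bytes instead of one per word.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero iff some byte of w equals the byte
   replicated in pattern (the classic "has zero byte" bit trick
   applied to w ^ pattern). */
static uint64_t any_match(uint64_t w, uint64_t pattern)
{
    uint64_t x = w ^ pattern; /* matching bytes become 0x00 */
    return (x - 0x0101010101010101ULL) & ~x & 0x8080808080808080ULL;
}

/* Illustrative only: same contract as memchr, scanning 4 words
   (32 bytes) per loop iteration. */
void *sketch_memchr(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;
    uint64_t pattern = 0x0101010101010101ULL * ch;

    while (n >= 32) {
        uint64_t w0, w1, w2, w3;
        /* memcpy avoids unaligned-access and aliasing problems. */
        memcpy(&w0, p, 8);
        memcpy(&w1, p + 8, 8);
        memcpy(&w2, p + 16, 8);
        memcpy(&w3, p + 24, 8);

        /* Four independent tests, combined pairwise to keep the
           dependency chain short, then a single branch. */
        uint64_t m01 = any_match(w0, pattern) | any_match(w1, pattern);
        uint64_t m23 = any_match(w2, pattern) | any_match(w3, pattern);
        if (m01 | m23)
            break; /* a match lies somewhere in these 32 bytes */

        p += 32;
        n -= 32;
    }

    /* Locate the exact byte in the matching block, or scan the tail. */
    for (size_t i = 0; i < n; i++)
        if (p[i] == ch)
            return (void *)(p + i);
    return NULL;
}
```

The final byte-wise scan keeps the sketch independent of endianness; a
tuned implementation would instead derive the position from the match
masks directly.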

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
sysdeps/x86_64/multiarch/memchr-evex.S