powerpc64le: add optimized strlen for P9
author Paul E. Murphy <murphyp@linux.vnet.ibm.com>
Mon, 18 May 2020 16:16:06 +0000 (11:16 -0500)
committer Paul E. Murphy <murphyp@linux.vnet.ibm.com>
Fri, 5 Jun 2020 20:30:00 +0000 (15:30 -0500)
commit a23bd00f9d810c28d9e83ce1d7cf53968375937d
tree ad8b0472058d43b628bb9882d999fa3b3514cd7c
parent 6ef422750985f7e60a8d480f07ecda59e0311fdf
powerpc64le: add optimized strlen for P9

This started as a trivial change to Anton's rawmemchr.  I got
carried away.  This is a hybrid of P8's asymptotically
faster 64B checks and extremely efficient small string checks,
e.g. <64B (and sometimes a little bit more depending on alignment).
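
As a rough illustration of the hybrid shape described above, here is a
scalar C sketch.  It is not the committed assembly: the names
sketch_strlen and scan_block are hypothetical, the byte loop stands in
for a 16B vector compare, and the fixed count of four small checks is
an assumption made only to keep the example short.

```c
#include <stddef.h>

/* Hypothetical helper: scan at most n bytes starting at p and return the
   index of the first NUL, or n if there is none.  It stops at the NUL,
   so it never reads past the end of the string.  */
static size_t
scan_block (const char *p, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (p[i] == '\0')
      return i;
  return n;
}

/* Hypothetical scalar stand-in for the hybrid strategy: a few cheap
   16-byte checks for short strings, then 64-byte checks for the bulk.  */
size_t
sketch_strlen (const char *s)
{
  const char *p = s;
  size_t hit;

  /* Small-string phase: up to 64 bytes, 16 bytes at a time.  */
  for (int i = 0; i < 4; i++)
    {
      hit = scan_block (p, 16);
      if (hit < 16)
        return (size_t) (p - s) + hit;
      p += 16;
    }

  /* Bulk phase: 64 bytes per iteration.  */
  for (;;)
    {
      hit = scan_block (p, 64);
      if (hit < 64)
        return (size_t) (p - s) + hit;
      p += 64;
    }
}
```

Short strings return from the first phase without ever entering the
64-byte loop, which is the property the hybrid is after.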

The second trick is to align to 64B by running a 48B checking loop
16B at a time until we naturally align to 64B (i.e. checking 48/96/144
bytes/iteration based on the alignment after the first 5 comparisons).
This alleviates the need to check page boundaries.
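
The 48/96/144 figures follow from 48 being congruent to -16 modulo 64.
A small C sketch of that arithmetic is below.  The function name is
hypothetical, and it assumes the pointer is already 16B-aligned when
the 48B loop starts (the loop would never terminate otherwise).

```c
#include <stdint.h>

/* Hypothetical helper: number of bytes a 48-byte stepping loop covers
   before ADDR becomes 64-byte aligned.  Assumes ADDR is a multiple of
   16; each step moves the offset within the 64-byte block back by 16.  */
unsigned
bytes_until_64b_aligned (uintptr_t addr)
{
  unsigned bytes = 0;

  while (addr % 64 != 0)
    {
      addr += 48;
      bytes += 48;
    }
  return bytes;
}
```

Starting offsets of 16, 32, and 48 within a 64B block give 48, 96, and
144 bytes respectively, and an already aligned address gives 0.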

Finally, explicitly use the P7 strlen with the runtime loader when building
P9.  We need to be cautious about vector/vsx extensions here on P9-only
builds.
sysdeps/powerpc/powerpc64/le/power9/rtld-strlen.S [new file with mode: 0644]
sysdeps/powerpc/powerpc64/le/power9/strlen.S [new file with mode: 0644]
sysdeps/powerpc/powerpc64/multiarch/Makefile
sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
sysdeps/powerpc/powerpc64/multiarch/strlen-power9.S [new file with mode: 0644]
sysdeps/powerpc/powerpc64/multiarch/strlen.c