review.tizen.org Git - external/glibc.git/log

powerpc: refactor powerpc64 lrint/lrintf/llrint/llrintf

This patches consolidates all the powerpc llrint{f} implementations on
the generic sysdeps/powerpc/fpu/s_llrint{f}.

The IFUNC support is also moved only to powerpc64 only, since for
powerpc64le generic implementation resulting in optimized code.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_llrint-power8, s_llrint-power6x, and
s_llrint-ppc64.
(CFLAGS-s_llrint-power8.c, CFLAGS-s_llrint-power6x.c): New rule.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-power6x.c: New
file.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-power8.c:
Likewise.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-ppc64.c:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_lrint.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lrint.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrintf.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrintf.c: ... here.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lrint.c: New file.
* sysdeps/powerpc/powerpc64/fpu/Makefile: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
(libm-sysdep_routines): Remove s_llrint-* objects.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power6x.S: Remove
file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power8.S:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_llrint.c: New file.
* sysdeps/powerpc/powerpc64/fpu/s_llrintf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_lrint.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_lrintf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Remove file.
* sysdeps/powerpc/powerpc64/fpu/s_llrintf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_lrint.S: Likewise.
* sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>

Linux: Fix __glibc_has_include use for <sys/stat.h> and statx

The identifier linux is used as a predefined macro, so the actually
used path is 1/stat.h or 1/stat64.h. Using the quote-based version
triggers a file lookup for /usr/include/bits/linux/stat.h (or whatever
directory is used to store bits/statx.h), but since bits/ is pretty
much reserved by glibc, this appears to be acceptable.

This is related to GCC PR 80005: incorrect macro expansion of the
argument of __has_include.

Suggested by Zack Weinberg.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

<sys/cdefs.h>: Inhibit macro expansion for __glibc_has_include

This is currently ineffective with GCC because of GCC PR 80005, but
it makes sense to anticipate a fix for this defect.

Suggested by Zack Weinberg.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Add IPV6_ROUTER_ALERT_ISOLATE from Linux 5.1 to bits/in.h.

This patch adds the new constant IPV6_ROUTER_ALERT_ISOLATE from Linux
5.1 to sysdeps/unix/sysv/linux/bits/in.h.

Tested for x86_64.

* sysdeps/unix/sysv/linux/bits/in.h (IPV6_ROUTER_ALERT_ISOLATE):
New macro.

Allow memset local PLT reference for powerpc soft-float.

Some recent change on GCC mainline resulted in the localplt test
failing for powerpc soft-float (not sure exactly when, as the failure
appeared when there were other build test failures as well;
<https://sourceware.org/ml/libc-testresults/2019-q2/msg00261.html>
shows it remaining when other failures went away). The problem is a
call to memset that GCC now generates in the libgcc long double code.

Since memset is documented as a function GCC may always implicitly
generate calls to, it seems reasonable to allow that local PLT
reference (just like those for libgcc functions that GCC implicitly
generates calls to and that are also exported from libc.so), which
this patch does.

Tested for powerpc soft-float with build-many-glibcs.py.

* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/localplt.data:
Allow memset in libc.so.

aarch64: handle STO_AARCH64_VARIANT_PCS

Avoid lazy binding of symbols that may follow a variant PCS with different
register usage convention from the base PCS.

Currently the lazy binding entry code does not preserve all the registers
required for AdvSIMD and SVE vector calls. Saving and restoring all
registers unconditionally may break existing binaries, even if they never
use vector calls, because of the larger stack requirement for lazy
resolution, which can be significant on an SVE system.

The solution is to mark all symbols in the symbol table that may follow
a variant PCS so the dynamic linker can handle them specially. In this
patch such symbols are always resolved at load time, not lazily.

So currently LD_AUDIT for variant PCS symbols are not supported, for that
the _dl_runtime_profile entry needs to be changed e.g. to unconditionally
save/restore all registers (but pass down arg and retval registers to
pltentry/exit callbacks according to the base PCS).

This patch also removes a __builtin_expect from the modified code because
the branch prediction hint did not seem useful.

* sysdeps/aarch64/dl-dtprocnum.h: New file.
* sysdeps/aarch64/dl-machine.h (DT_AARCH64): Define.
(elf_machine_runtime_setup): Handle DT_AARCH64_VARIANT_PCS.
(elf_machine_lazy_rel): Check STO_AARCH64_VARIANT_PCS and bind such
symbols at load time.
* sysdeps/aarch64/linkmap.h (struct link_map_machine): Add variant_pcs.

aarch64: add STO_AARCH64_VARIANT_PCS and DT_AARCH64_VARIANT_PCS

STO_AARCH64_VARIANT_PCS is a non-visibility st_other flag for marking
symbols that reference functions that may follow a variant PCS with
different register usage convention from the base PCS.

DT_AARCH64_VARIANT_PCS is a dynamic tag that marks ELF modules that
have R_*_JUMP_SLOT relocations for symbols marked with
STO_AARCH64_VARIANT_PCS (i.e. have variant PCS calls via a PLT).

* elf/elf.h (STO_AARCH64_VARIANT_PCS): Define.
(DT_AARCH64_VARIANT_PCS): Define.

powerpc: Remove optimized finite

The powerpc finite optimization do not show much gain:

  - GCC will call libm iff -fsignaling-nans is used. This usage pattern
    is usually not performance oriented and for such calls PLT overhead
    should dominate execution time.

  - The power7 uses ftdiv to optimize for some input patterns, but at
    cost of others.  Comparing against generic C implementation built
    for powerpc64-linux-gnu-power7 (--with-cpu=power7):

    - Generic sysdeps/ieee754 implementation:
       "isfinite": {
        "": {
         "duration": 5.0082e+09,
         "iterations": 2.45299e+09,
         "max": 43.824,
         "min": 2.008,
         "mean": 2.04167
        },
        "INF": {
         "duration": 4.66554e+09,
         "iterations": 2.28288e+09,
         "max": 35.73,
         "min": 2.008,
         "mean": 2.04371
        },
        "NAN": {
         "duration": 4.66274e+09,
         "iterations": 2.28716e+09,
         "max": 34.161,
         "min": 2.009,
         "mean": 2.03866
        }
       }

    - power7 optimized one:
       "isfinite": {
        "": {
         "duration": 4.99111e+09,
         "iterations": 2.65566e+09,
         "max": 25.015,
         "min": 1.716,
         "mean": 1.87942
        },
        "INF": {
         "duration": 4.6783e+09,
         "iterations": 2.0999e+09,
         "max": 35.264,
         "min": 1.868,
         "mean": 2.22787
        },
        "NAN": {
         "duration": 4.67915e+09,
         "iterations": 2.08678e+09,
         "max": 38.099,
         "min": 1.869,
         "mean": 2.24228
        }
       }

     So it basically optimizes marginally for normal numbers while
     increasing the latency for other kind of FP.

  - The power8 implementation is just the generic implementation using
    ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
    So generic implementation is the best option for powerpc64le.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
(sysdeps_routines, libm-sysdep_routines): Remove s_finite*
objects.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-power7.S:
Remove file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef.c: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_finitef.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call):
Remove s_finite* objects.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power7.S: Remove file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef.c: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_finitef.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_finitef.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>

math: Use wordsize-64 version for finite

  - math.h will use compiler builtin for gcc 4.4 when built without
    -fsignaling-nans and the builtin is expanded inline for all
    support architectures.  As an example, there is no intra finite
    call on libm for the architecture I checked, x86, arm, aarch64,
    and powerpc.

  - The resulting binary difference on 32 bits architecture is minimum
    for the non hotspot symbol.

  - It helps wordsize-64 architectures that use ldbl-opt.

  - It add some code simplification with reduction of duplicated
    implementations.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

* sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c: Move to ...
* sysdeps/ieee754/dbl-64/s_finite.c: ... here and format code.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>

powerpc: Remove optimized isinf

The powerpc isinf optimizations onyl adds complexity:

  - GCC will call libm iff -fsignaling-nans is used. This usage pattern
    is usually not performance oriented and for such calls PLT overhead
    should dominate execution time.

  - The power7 uses ftdiv to optimize for some input pattern and branch
    implementation for INF and denormal that does:

    return (ix & UINT64_C (0x7fffffffffffffff)) == UINT64_C (0x7ff0000000000000)

    Although it does show slight better latency than generic algorithm
    (as below), it is only for power7 and requires it to override it
    for power8.

  - The power8 implementation is just the generic implementation using
    ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
    So generic implementation is the best option for powerpc64le.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
(sysdeps_routines, libm-sysdep_routines): Remove s_isinf* and s_isinf*
objects.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-power7.S:
Remove file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff.c: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_isinff.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call):
Remove s_isinf* and s_isinf* objects.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff.c: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_isinff.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_isinff.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>

math: Use wordsize-64 version for isinf

  - math.h will use compiler builtin for gcc 4.4 when built without
    -fsignaling-nans and the builtin is expanded inline for all
    support architectures.  As an example, there is no intra isinf
    call on libm for the architecture I checked, x86, arm, aarch64,
    and powerpc.

  - The resulting binary difference on 32 bits architecture is minimum
    for the non hotspot symbol.

  - It helps wordsize-64 architectures that use ldbl-opt.

  - It add some code simplification with reduction of duplicated
    implementations.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

        * sysdeps/ieee754/dbl-64/wordsize-64/s_isinf.c: Move to ...
        * sysdeps/ieee754/dbl-64/s_isinf.c: ... here and format code.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>

powerpc: Remove optimized isnan

The powerpc isnan optimizations are not really a gain:

  - GCC will call libm iff -fsignaling-nans is used. This usage pattern
    is usually not performance oriented and for such calls PLT overhead
    should dominate execution time.

  - The power5, power6, and power6x are just micro-optimization to
    improve the Load-Hit-Store hazards from floating-point to general
    register transfer, and current GCC already has support to minimize
    it by inserting either extra nops or group dispatch instructions.

  - The power7 uses ftdiv to optimize for some input patterns, but at
    cost of others.  Comparing against generic C implementation built
    for powerpc-linux-gnu-power4 (which uses the hp-timing support on
    benchtests):

    - Generic sysdeps/ieee754 implementation:
      "isnan": {
       "": {
        "duration": 4.98415e+09,
        "iterations": 2.34516e+09,
        "max": 45.925,
        "min": 2.052,
        "mean": 2.12529
       },
       "INF": {
        "duration": 4.74057e+09,
        "iterations": 1.69761e+09,
        "max": 91.01,
        "min": 2.052,
        "mean": 2.79249
       },
       "NAN": {
        "duration": 4.74071e+09,
        "iterations": 1.68768e+09,
        "max": 282.343,
        "min": 2.052,
        "mean": 2.809
       }
      }

    - power7 optimized one:
    $ ./testrun.sh benchtests/bench-isnan
      "isnan": {
       "": {
        "duration": 4.96842e+09,
        "iterations": 2.56297e+09,
        "max": 50.048,
        "min": 1.872,
        "mean": 1.93854
       },
       "INF": {
        "duration": 4.76648e+09,
        "iterations": 1.54213e+09,
        "max": 373.408,
        "min": 2.661,
        "mean": 3.09084
       },
       "NAN": {
        "duration": 4.76845e+09,
        "iterations": 1.54515e+09,
        "max": 51.016,
        "min": 2.736,
        "mean": 3.08607
       }
      }

    So it basically optimizes marginally for normal numbers while
    increasing the latency for other kind of FP.

  - The generic implementation requires getting the floating point
    status, disable the invalid operation bit, and restore the
    floating-point status.  Each operation is costly and requires
    flushing the FP pipeline.

    Using the same scenarion for the previous analysis:

      "isnan": {
       "": {
        "duration": 5.08284e+09,
        "iterations": 6.2898e+08,
        "max": 41.844,
        "min": 8.057,
        "mean": 8.08108
       },
       "INF": {
        "duration": 4.97904e+09,
        "iterations": 6.16176e+08,
        "max": 39.661,
        "min": 8.057,
        "mean": 8.08055
       },
       "NAN": {
        "duration": 4.98695e+09,
        "iterations": 5.95866e+08,
        "max": 29.728,
        "min": 8.345,
        "mean": 8.36925
       }
      }

  - The power8 implementation is just the generic implementation using
    ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation).
    So generic implementation is the best option for powerpc64le.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

* sysdeps/powerpc/fpu/s_isnan.c: Remove file.
* sysdeps/powerpc/fpu/s_isnanf.S: Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
(sysdeps_routines, libm-sysdep_routines): Remove s_isnan-* and
s_isnanf-* objects.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power5.S:
Remove file
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power6.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power7.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-ppc32.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power5.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power6.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf.c: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/s_isnanf.S: Likewise.
* sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc32/power6/fpu/s_isnanf.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_isnanf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_calls):
Remove s_isnan-* and s_isnanf-* objects.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power5.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6x.S:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnanf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_isnanf.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_isnanf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>

math: Use wordsize-64 version for isnan

  - math.h will use compiler builtin for gcc 4.4 when built without
    -fsignaling-nans and the builtin is expanded inline for all
    support architectures.  As an example, there is no intra isnan
    call on libm for the architecture I checked, x86, arm, aarch64,
    and powerpc.

  - The resulting binary difference on 32 bits architecture is minimum
    for the non hotspot symbol.

  - It helps wordsize-64 architectures that use ldbl-opt.

  - It add some code simplification with reduction of duplicated
    implementations.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

* sysdeps/ieee754/dbl-64/wordsize-64/s_isnan.c: Move to ...
* sysdeps/ieee754/dbl-64/s_isnan.c: ... here and format code.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>

benchtests: Add isnan/isinf/isfinite benchmark

* benchtests/Makefile (bench-math): Add isnan, isinf, and isfinite.
(CFLAGS-bench-isnan.c, CFLAGS-bench-isinf.c,
CFLAGS-bench-isfinite.c): New rule.
* benchtests/isnan-input: New file.
* benchtests/isinf-input: New file.
* benchtests/isfinite-input: New file.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>

powerpc: copysign cleanup

GCC always expand copysign{f} for all possible cpus, so calling the libm
is only done if user explicitly states to disable the builtin (which is
done usually not for performance reason). So to provide ifunc variant
for copysign is just unrequired complexity, since libm will be called
on non-performance critical code.

This patch removes both powerpc32 and powerpc64 ifunc variants and
consolidates the powerpc implementation on
sysdeps/powerpc/fpu/s_copysign{f}.c using compiler builtins.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

* sysdeps/powerpc/fpu/s_copysign.c: New file.
* sysdeps/powerpc/fpu/s_copysignf.c: Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Remove file.
* sysdeps/powerpc/powerpc32/fpu/s_copysignf.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
(sysdep_routines, libm-sysdep_routines): Remove s_copysign-power6 and
s_copysign-ppc32.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign-power6.S:
Remove file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign-ppc32.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c:
Likewise.
* sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Likewise.
* sysdeps/powerpc/powerpc32/power6/fpu/s_copysignf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdeps_calls):
Remove s_copysign-power6 s_copysign-ppc64.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-power6.S:
Remove file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-ppc64.S:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_copysignf.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/fpu/s_copysignf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>

powerpc: consolidate rint

This patches consolidates all the powerpc rint{f} implementations on
the generic sysdeps/powerpc/fpu/s_rint{f}.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

* sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode,
round_to_integer_float, round_mode): Add RINT handling.
(reset_fenv_mode): New symbol.
* sysdeps/powerpc/fpu/s_rint.c (__rint): Use generic implementation.
* sysdeps/powerpc/fpu/s_rintf.c (__rintf): Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_rint.S: Remove file.
* sysdeps/powerpc/powerpc32/fpu/s_rintf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_rint.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>

libio: freopen of default streams crashes in old programs [BZ #24632]

As seen with very old i386 GCC binaries.

Linux: Deprecate <sys/sysctl.h> and sysctl

Now that there are no internal users of __sysctl left, it is possible
to add an unconditional deprecation warning to <sys/sysctl.h>.

To avoid a test failure due this warning in check-install-headers,
skip the test for sys/sysctl.h.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

<sys/stat.h>: Use Linux UAPI header for statx if available and useful

This will automatically import new STATX_* constants. It also avoids
a conflict between <sys/stat.h> and <linux/stat.h>.

<sys/cdefs.h>: Add __glibc_has_include macro

Improve performance of memmem

This patch significantly improves performance of memmem using a novel
modified Horspool algorithm. Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.

By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.

Small needles up to size 2 use a dedicated linear search. Very long needles
use the Two-Way algorithm (to avoid increasing stack size or slowing down
the common case, inlining is disabled).

The performance gain is 6.6 times on English text on AArch64 using random
needles with average size 8.

Tested against GLIBC testsuite and randomized tests.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* string/memmem.c (__memmem): Rewrite to improve performance.

Improve performance of strstr

This patch significantly improves performance of strstr using a novel
modified Horspool algorithm. Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.

By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.

Small needles up to size 3 use a dedicated linear search. Very long needles
use the Two-Way algorithm.

The performance gain using the improved bench-strstr on Cortex-A72 is 5.8
times basic_strstr and 3.7 times twoway_strstr.

Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test
(https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c).

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* string/str-two-way.h (two_way_short_needle): Add inline to avoid
warning.
(two_way_long_needle): Block inlining.
* string/strstr.c (strstr2): Add new function.
(strstr3): Likewise.
(STRSTR): Completely rewrite strstr to improve performance.

Benchmark strstr hard needles

Benchmark needles which exhibit worst-case performance.  This shows that
basic_strstr is quadratic and thus unsuitable for large needles.
On the other hand the Two-way and new strstr implementations are linear with
increasing needle sizes.  The slowest cases of the two implementations are
within a factor of 2 on several different microarchitectures.  Two-way is
slowest on inputs which cause a branch mispredict on almost every character.
The new strstr is slowest on inputs which almost match and result in many
calls to memcmp.  Thanks to Szabolcs for providing various hard needles.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* benchtests/bench-strstr.c (test_hard_needle): New function.

Fix malloc tests build with GCC 10.

GCC mainline has recently added warn_unused_result attributes to some
malloc-like built-in functions, where glibc previously had them in its
headers only for __USE_FORTIFY_LEVEL > 0. This results in those
attributes being newly in effect for building the glibc testsuite, so
resulting in new warnings that break the build where tests
deliberately call such functions and ignore the result. Thus patch
duly adds calls to DIAG_* macros around those calls to disable the
warning.

Tested with build-many-glibcs.py for aarch64-linux-gnu.

* malloc/tst-calloc.c: Include <libc-diag.h>.
(null_test): Ignore -Wunused-result around calls to calloc.
* malloc/tst-mallocfork.c: Include <libc-diag.h>.
(do_test): Ignore -Wunused-result around call to malloc.

Linux: Add getdents64 system call

No 32-bit system call wrapper is added because the interface
is problematic because it cannot deal with 64-bit inode numbers
and 64-bit directory hashes.

A future commit will deprecate the undocumented getdirentries
and getdirentries64 functions.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

[powerpc] get_rounding_mode: utilize faster method to get rounding mode

Add support to use 'mffsl' instruction if compiled for POWER9 (or later).

Also, mask the result to avoid bleeding unrelated bits into the result of
_FPU_GET_RC().

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

riscv: Do not use __has_include__

The user-visible preprocessor construct is called __has_include.

[powerpc] fegetexcept: utilize function instead of duplicating code

fegetexcept() included code which exactly duplicates the code in
fenv_reg_to_exceptions(). Replace with a call to that function.

2019-06-05 Paul A. Clarke <pc@us.ibm.com>

* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
with call to equivalent function.

iconv: Use __twalk_r in __gconv_release_shlib

Fix iconv buffer handling with IGNORE error handler (bug #18830)

Add INADDR_ALLSNOOPERS_GROUP from Linux 5.1 to netinet/in.h.

This patch adds INADDR_ALLSNOOPERS_GROUP from Linux 5.1 to
netinet/in.h.

Tested for x86_64.

* inet/netinet/in.h (INADDR_ALLSNOOPERS_GROUP): New macro.

Fix data of ChangeLog entry

arm: Remove ioperm/iopl/inb/inw/inl/outb/outw/outl support

Linux only supports the required ISA sysctls on StrongARM devices,
which are armv4 and no longer tested during glibc development
and probably bit-rotted by this point. (No reported test results,
and the last discussion of armv4 support was in the glibc 2.19
release notes.)

Linux: Add oddly-named arm syscalls to syscall-names.list

<asm/unistd.h> on arm defines the following macros:

#define __ARM_NR_breakpoint             (__ARM_NR_BASE+1)
#define __ARM_NR_cacheflush             (__ARM_NR_BASE+2)
#define __ARM_NR_usr26                  (__ARM_NR_BASE+3)
#define __ARM_NR_usr32                  (__ARM_NR_BASE+4)
#define __ARM_NR_set_tls                (__ARM_NR_BASE+5)
#define __ARM_NR_get_tls                (__ARM_NR_BASE+6)

These do not follow the regular __NR_* naming convention and
have so far been ignored by the syscall-names.list consistency
checks.  This commit adds these names to the file, preparing
for the availability of these names in the regular __NR_*
namespace.

powerpc: Fix build failures with current GCC

Since GCC commit 271500 (svn), also known as the following commit on the
git mirror:

commit 61edec870f9fdfb5df3fa4e40f28cbaede28a5b1
Author: amodra <amodra@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Wed May 22 04:34:26 2019 +0000

    [RS6000] Don't pass -many to the assembler

glibc builds are failing when an assembly implementation does not
declare the correct '.machine' directive, or when no such directive is
declared at all.  For example, when a POWER6 instruction is used, but
'.machine power6' is not declared, the assembler will fail with an error
similar to the following:

    ../sysdeps/powerpc/powerpc64/power8/strcmp.S: Assembler messages:
     24 ../sysdeps/powerpc/powerpc64/power8/strcmp.S:55: Error: unrecognized opcode: `cmpb'

This patch adds '.machine powerN' directives where none existed, as well
as it updates '.machine power7' directives on POWER8 files, because the
minimum binutils version required to build glibc (binutils 2.25) now
provides this machine version.  It also adds '-many' to the assembler
command used to build tst-set_ppr.c.

Tested for powerpc, powerpc64, and powerpc64le, as well as with
build-many-glibcs.py for powerpc targets.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

Remove unused get_clockfreq files

The patch 6e8ba7fd574f meant to remove the all get_clockfreq.c. This
patch removes the missing files for sparcv9 and x86_64.

Checked against a build to x86_64-linux-gnu and sparcv9-linux-gnu.

* sysdeps/unix/sysv/linux/sparc/sparc32/sparcv9/get_clockfreq.c:
Remove file.
* sysdeps/unix/sysv/linux/x86_64/get_clockfreq.c: Likewise.

powerpc: generic nearbyint/nearbyintf

This patches consolidates all the powerpc nearbyint{f} implementations
on the generic sysdeps/powerpc/fpu/s_nearbyint{f}.

* sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode): Add
NEARBYINT handling.
* sysdeps/powerpc/fpu/s_nearbyint.c: New file.
* sysdeps/powerpc/fpu/s_nearbyintf.c: Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_nearbyint.S: Remove file.
* sysdeps/powerpc/powerpc32/fpu/s_nearbyintf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S: Likewise.

tt_RU: Add lang_name [BZ #24370]

This commit adds a lang_name according to CLDR-35.1.

[BZ #24370]
* localedata/locales/tt_RU (lang_name): Add from CLDR-35.1.

tt_RU: Fix orthographic mistakes in mon and abmon sections [BZ #24369]

This commit fixes some errors and converts all month names to lowercase.
The content is synchronized with CLDR-35.1 now but trailing dots are
removed from abmon values in order to maintain consistency with the
previous values and with many other locales which do the same.

[BZ #24369]
* localedata/locales/tt_RU (mon): Update from CLDR-35.1, fix errors.
(abmon): Likewise, but remove the trailing dots.

Add IGMP_MRDISC_ADV from Linux 5.1 to netinet/igmp.h.

This patch adds the IGMP_MRDISC_ADV macro from Linux 5.1 to
netinet/igmp.h.

Tested for x86_64.

* inet/netinet/igmp.h (IGMP_MRDISC_ADV): New macro.

nptl: Add comment to __pthread_get_minstack about external users

nss_dns: Check for proper A/AAAA address alignment

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Add F_SEAL_FUTURE_WRITE from Linux 5.1 to bits/fcntl-linux.h.

This patch adds the new F_SEAL_FUTURE_WRITE constant from Linux 5.1 to
bits/fcntl-linux.h.

Tested for x86_64.

* sysdeps/unix/sysv/linux/bits/fcntl-linux.h [__USE_GNU]
(F_SEAL_FUTURE_WRITE): New macro.

elf: Add tst-ldconfig-bad-aux-cache test [BZ #18093]

This test corrupts /var/cache/ldconfig/aux-cache and executes ldconfig
to check it will not segfault using the corrupted aux_cache. The test
uses the test-in-container framework. Verified no regressions on
x86_64.

Add ChangeLog entry for previous commit.

Remove support for PowerPC SPE extension (powerpc*-*-*gnuspe*).

GCC 9 dropped support for the SPE extensions to PowerPC, which means
powerpc*-*-*gnuspe* configurations are no longer buildable with that
compiler.  This ISA extension was peculiar to the “e500” line of
embedded PowerPC chips, which, as far as I can tell, are no longer
being manufactured, so I think we should follow suit.

This patch was developed by grepping for “e500”, “__SPE__”, and
“__NO_FPRS__”, and may not eliminate every vestige of SPE support.
Most uses of __NO_FPRS__ are left alone, as they are relevant to
normal embedded PowerPC with soft-float.

        * sysdeps/powerpc/preconfigure: Error out on powerpc-*-*gnuspe*
        host type.
        * scripts/build-many-glibcs.py: Remove powerpc-*-linux-gnuspe
        and powerpc-*-linux-gnuspe-e500v1 from list of build configurations.

        * sysdeps/powerpc/powerpc32/e500: Recursively delete.
        * sysdeps/unix/sysv/linux/powerpc/powerpc32/e500: Recursively delete.
        * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/context-e500.h:
        Delete.

        * sysdeps/powerpc/fpu_control.h: Remove SPE variant.
        Issue an #error if used with a compiler in SPE-float mode.
        * sysdeps/powerpc/powerpc32/__longjmp_common.S
        * sysdeps/powerpc/powerpc32/setjmp_common.S
        * sysdeps/unix/sysv/linux/powerpc/powerpc32/getcontext-common.S
        * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/getcontext.S
        * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/setcontext.S
        * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/swapcontext.S
        * sysdeps/unix/sysv/linux/powerpc/powerpc32/setcontext-common.S
        * sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S:
        Remove code to preserve SPE register state.

        * sysdeps/unix/sysv/linux/powerpc/elision-lock.c
        * sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
        * sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
        Remove __SPE__ ifndefs.

Improve string benchtest timing

Improve string benchtest timing.  Many tests run for 0.01s which is way too
short to give accurate results.  Other tests take over 40 seconds which is
way too long.  Significantly increase the iterations of the short running
tests.  Reduce number of alignment variations in the long running memcpy walk
tests so they take less than 5 seconds.

As a result most tests take at least 0.1s and all finish within 5 seconds.

* benchtests/bench-memcpy-random.c (do_one_test): Use medium iterations.
* benchtests/bench-memcpy-walk.c (test_main): Reduce alignment tests.
* benchtests/bench-memmem.c (do_one_test): Use small iterations.
* benchtests/bench-memmove-walk.c (test_main): Reduce alignment tests.
* benchtests/bench-memset-walk.c (test_main): Reduce alignment tests.
* benchtests/bench-strcasestr.c (do_one_test): Use small iterations.
* benchtests/bench-string.h (INNER_LOOP_ITERS): Increase iterations.
(INNER_LOOP_ITERS_MEDIUM): New define.
(INNER_LOOP_ITERS_SMALL): New define.
* benchtests/bench-strpbrk.c (do_one_test): Use medium iterations.
* benchtests/bench-strsep.c (do_one_test): Use small iterations.
* benchtests/bench-strspn.c (do_one_test): Use medium iterations.
* benchtests/bench-strstr.c (do_one_test): Use small iterations.
* benchtests/bench-strtok.c (do_one_test): Use small iterations.

sysvipc: Add missing bit of semtimedop s390 consolidation

This patch add the missing SEMTIMEDOP_IPC_ARGS definions on powerpc
and sparc ipc_priv.h.

Checked on powerpc64le-linux-gnu and with a build for sparc64-linux-gnu.

* sysdeps/unix/sysv/linux/powerpc/ipc_priv.h (SEMTIMEDOP_IPC_ARGS):
New define.
* sysdeps/unix/sysv/linux/sparc/sparc64/ipc_priv.h
(SEMTIMEDOP_IPC_ARGS): Likewise.

wcsmbs: Fix data race in __wcsmbs_clone_conv [BZ #24584]

This also adds an overflow check and documents the synchronization
requirement in <gconv.h>.

libio: Fix gconv-related memory leak [BZ #24583]

struct gconv_fcts for the C locale is statically allocated,
and __gconv_close_transform deallocates the steps object.
Therefore this commit introduces __wcsmbs_close_conv to avoid
freeing the statically allocated steps objects.

libio: Remove codecvt vtable [BZ #24588]

The codecvt vtable is not a real vtable because it also contains the
conversion state data. Furthermore, wide stream support was added to
GCC 3.0, after a C++ ABI bump, so there is no compatibility
requirement with libstdc++.

This change removes several unmangled function pointers which could
be used with a corrupted FILE object to redirect execution. (libio
vtable verification did not cover the codecvt vtable.)

Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

support: Expose sbindir as support_sbindir_prefix

support: Add missing EOL terminators on timespec

The original implementations of test_timespec_before_impl and
test_timespec_equal_or_after in 519839965197291924895a3988804e325035beee
were missing the backslash required for a newline.

Checked on x86_64-linux-gnu.

* support/timespec.c: Add backslash to correct newline in failure
message.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

support: Correct confusing comment

* support/timespec.h: Correct confusing comment.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

sysvipc: Consolidate semtimedop s390

This patch consolidates the s390-32 semtimedop implementation by defining
a arch-specific SEMTIMEDOP_IPC_ARGS to rearrange the arguments expected
by s390 Linux kABI. The idea is to avoid have multiples semtimedop
implementation changes for Linux v5.1 change to enable wire-up sysvipc
support.

Checked with a s390-linux-gnu and s390x-linux-gnu and checking that
resulting semtimedop objects did not change.

* sysdeps/unix/sysv/linux/ipc_priv.h (SEMTIMEDOP_IPC_ARGS): New
define.
* sysdpes/unix/sysv/linux/s390/ipc_priv.h: New file.
* sysdeps/unix/sysv/linux/s390/semtimedop.c: Remove file.
* sysdeps/unix/sysv/linux/semtimedop.c (semtimedop): Use
SEMTIMEDOP_IPC_ARGS for calls with __NR_ipc.

sysvipc: Fix compat msgctl (BZ#24570)

The __IPC64 flags is meant to be used to enable the new sysv struct
format when the architectures supports it (ARCH_WANT_IPC_PARSE_VERSION
config flag on Linux kernel).

This currently issue only affects alpha.

[BZ #24570]
* sysdeps/unix/sysv/linux/msgctl.c (__old_msgctl): Remove __IPC_64
usage.

Add NT_ARM_PACA_KEYS and NT_ARM_PACG_KEYS from Linux 5.1 to elf.h.

This patch adds the new NT_ARM_PACA_KEYS and NT_ARM_PACG_KEYS from
Linux 5.1 to glibc's elf.h.

Tested for x86_64.

* elf/elf.h (NT_ARM_PACA_KEYS): New macro.
(NT_ARM_PACG_KEYS): Likewise.

Small tcache improvements

Change the tcache->counts[] entries to uint16_t - this removes
the limit set by char and allows a larger tcache. Remove a few
redundant asserts.

bench-malloc-thread with 4 threads is ~15% faster on Cortex-A72.

Reviewed-by: DJ Delorie <dj@redhat.com>
* malloc/malloc.c (MAX_TCACHE_COUNT): Increase to UINT16_MAX.
(tcache_put): Remove redundant assert.
(tcache_get): Remove redundant asserts.
(__libc_malloc): Check tcache count is not zero.
* manual/tunables.texi (glibc.malloc.tcache_count): Update maximum.

manual: Document O_DIRECTORY

Update kernel-features.h files for Linux 5.1.

Linux 5.1 adds missing syscalls to the syscall table for many Linux
kernel architectures.  This patch updates the kernel-features.h
headers accordingly.  __ASSUME_DIRECT_SYSVIPC_SYSCALLS is not updated
because of the differences between new and old syscalls described in
<https://sourceware.org/ml/libc-alpha/2019-05/msg00235.html>.  The
statfs64 structure used by alpha matches what the new kernel syscalls
use.

Tested with build-many-glibcs.py.

* sysdeps/unix/sysv/linux/alpha/kernel-features.h
(__ASSUME_STATFS64): Only undefine if [__LINUX_KERNEL_VERSION <
0x050100].
* sysdeps/unix/sysv/linux/ia64/kernel-features.h (__ASSUME_STATX):
Likewise.
* sysdeps/unix/sysv/linux/sh/kernel-features.h
(__ASSUME_STATX): Likewise.

nss_nis, nss_nisplus: Remove RES_USE_INET6 handling

Since commit 3f8b44be0a658266adff5ece1e4bc3ce097a5dbe ("resolv:
Remove support for RES_USE_INET6 and the inet6 option"),
res_use_inet6 () always evaluates to false.

nss_files: Remove RES_USE_INET6 from hosts processing

Since commit 3f8b44be0a658266adff5ece1e4bc3ce097a5dbe ("resolv:
Remove support for RES_USE_INET6 and the inet6 option"),
res_use_inet6 () always evaluates to false.

support: Report NULL blobs explicitly in TEST_COMPARE

Provide an explicit diagnostic if the length is positive, and
do not just crash with a null pointer dereference. Null pointers
are only valid if the length is zero, so this can only happen with
a faulty test.

dlfcn: Guard __dlerror_main_freeres with __libc_once_get (once) [BZ# 24476]

dlerror.c (__dlerror_main_freeres) will try to free resources which only
have been initialized when init () has been called. That function is
called when resources are needed using __libc_once (once, init) where
once is a __libc_once_define (static, once) in the dlerror.c file.
Trying to free those resources if init () hasn't been called will
produce errors under valgrind memcheck. So guard the freeing of those
resources using __libc_once_get (once) and make sure we have a valid
key. Also add a similar guard to __dlerror ().

* dlfcn/dlerror.c (__dlerror_main_freeres): Guard using
__libc_once_get (once) and static_bug == NULL.
(__dlerror): Check we have a valid key, set result to static_buf
otherwise.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Add missing Changelog entry

For 5b06f538c5aee0389ed034f60d90a8884d6d54de

Fix crash in _IO_wfile_sync (bug 20568)

When computing the length of the converted part of the stdio buffer, use
the number of consumed wide characters, not the (negative) distance to the
end of the wide buffer.

nss: Turn __nss_database_lookup into a compatibility symbol

The function uses the internal service_user type, so it is not
really usable from the outside of glibc. Rename the function
to __nss_database_lookup2 for internal use, and change
__nss_database_lookup to always indicate failure to the caller.

__nss_next already was a compatibility symbol. The new
implementation always fails and no longer calls __nss_next2.

unscd, the alternative nscd implementation, does not use
__nss_database_lookup, so it is not affected by this change.

support: Add support_install_rootsbindir

Reviewed by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

iconv: Remove public declaration of __gconv_transliterate

Commit ba7b4d294b01870ce3497971e9d07ee261cdc540 ("Complete the
removal of __gconv_translit_find") added a declaration of the
GLIBC_PRIVATE function, __gconv_transliterate, to the installed
header <gconv.h>. It should have been added to the internal
<gconv_int.h> header.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Linux: Add the tgkill function

The tgkill function is sometimes used in crash handlers.

<bits/signal_ext.h> follows the same approach as <bits/unistd_ext.h>
(which was added for the gettid system call wrapper).

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

manual: Adjust twalk_r documentation.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Fix tst-pldd for non-default --prefix and/or --bindir (BZ#24544)

Use a new libsupport support_bindir_prefix instead of a hardcoded
/usr/bin to create the pldd path on container directory.

Checked on x86_64-linux-gnu with default and non-default --prefix and
--bindir paths, as well with --enable-hardcoded-path-in-tests.

[BZ #24544]
* elf/tst-pldd.c (do_test): Use support_bindir_prefix instead of
pre-defined value.

Reviewed-by: DJ Delorie <dj@redhat.com>

support: Export bindir path on support_path

Checked on x86_64-linux-gnu.

* support/Makefile (CFLAGS-support_paths.c): Add -DBINDIR_PATH.
* support/support.h (support_bindir_prefix): New variable.
* support/support_paths.c [BINDIR_PATH] (support_bindir_prefix):

Reviewed-by: DJ Delorie <dj@redhat.com>

Make --bindir effective

This allows sets a path using --bindir. Checked on x86_64-linux-gnu
with a non-default --bindir and checked resulting installed binaries
(pldd for instance).

* config.make.in (bindir): New variable.

Reviewed-by: DJ Delorie <dj@redhat.com>

x86: Remove arch-specific low level lock implementation

This patch removes the arch-specific x86 assembly implementation for
low level locking and consolidate both 64 bits and 32 bits in a
single implementation.

Different than other architectures, x86 lll_trylock, lll_lock, and
lll_unlock implements a single-thread optimization to avoid atomic
operation, using cmpxchgl instead. This patch implements by using
the new single-thread.h definitions in a generic way, although using
the previous semantic.

The lll_cond_trylock, lll_cond_lock, and lll_timedlock just use
atomic operations plus calls to lll_lock_wait*.

For __lll_lock_wait_private and __lll_lock_wait the generic implemtation
there is no indication that assembly implementation is required
performance-wise.

Checked on x86_64-linux-gnu and i686-linux-gnu.

* sysdeps/nptl/lowlevellock.h (__lll_trylock): New macro.
(lll_trylock): Call __lll_trylock.
* sysdeps/unix/sysv/linux/i386/libc-lowlevellock.S: Remove file.
* sysdeps/unix/sysv/linux/i386/lll_timedlock_wait.c: Likewise.
* sysdeps/unix/sysv/linux/i386/lowlevellock.S: Likewise.
* sysdeps/unix/sysv/linux/i386/lowlevellock.h: Likewise.
* sysdeps/unix/sysv/linux/x86_64/libc-lowlevellock.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/lll_timedlock_wait.c: Likewise.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.h: Likewise.
* sysdeps/unix/sysv/linux/x86/lowlevellock.h: New file.
* sysdeps/unix/sysv/linux/x86_64/cancellation.S: Include
lowlevellock-futex.h.

Assume LLL_LOCK_INITIALIZER is 0

Since hppa is not an outlier anymore regarding LLL_LOCK_INITIALIZER value,
we can now assume it 0 for all architectures.

Checked on a build for all major ABIs.

* nptl/nptl-init.c (__pthread_initialize_minimal_internal): Remove
initialization for LLL_LOCK_INITIALIZER different than 0.
* nptl/old_pthread_cond_broadcast.c (__pthread_cond_broadcast_2_0):
Assume LLL_LOCK_INITIALIZER being 0.
* nptl/old_pthread_cond_signal.c (__pthread_cond_signal_2_0): Likewise.
* nptl/old_pthread_cond_timedwait.c (__pthread_cond_timedwait_2_0):
Likewise.
* nptl/old_pthread_cond_wait.c (__pthread_cond_wait_2_0): Likewise.
* sysdeps/nptl/libc-lockP.h (__libc_lock_define_initialized): Likewise.

Small optimization for lowlevellock

This patch optimizes both __lll_lock_wait_private and __lll_lock_wait
by issuing only one lll_futex_wait.  Since it is defined as an inlined
syscall and inlined syscalls are defined using inlined assembly the
compiler usually can not see both calls are equal and optimize
accordingly.

On aarch64 the resulting binary is change from:

0000000000000060 <__lll_lock_wait>:
  60:   2a0103e5        mov     w5, w1
  64:   b9400001        ldr     w1, [x0]
  68:   aa0003e4        mov     x4, x0
  6c:   7100083f        cmp     w1, #0x2
  70:   540000e1        b.ne    8c <__lll_lock_wait+0x2c>  // b.any
  74:   521900a1        eor     w1, w5, #0x80
  78:   d2800042        mov     x2, #0x2                        // #2
  7c:   93407c21        sxtw    x1, w1
  80:   d2800003        mov     x3, #0x0                        // #0
  84:   d2800c48        mov     x8, #0x62                       // #98
  88:   d4000001        svc     #0x0
  8c:   521900a5        eor     w5, w5, #0x80
  90:   52800046        mov     w6, #0x2                        // #2
  94:   93407ca5        sxtw    x5, w5
  98:   14000008        b       b8 <__lll_lock_wait+0x58>
  9c:   d503201f        nop
  a0:   aa0403e0        mov     x0, x4
  a4:   aa0503e1        mov     x1, x5
  a8:   d2800042        mov     x2, #0x2                        // #2
  ac:   d2800003        mov     x3, #0x0                        // #0
  b0:   d2800c48        mov     x8, #0x62                       // #98
  b4:   d4000001        svc     #0x0
  b8:   885ffc80        ldaxr   w0, [x4]
  bc:   88017c86        stxr    w1, w6, [x4]
  c0:   35ffffc1        cbnz    w1, b8 <__lll_lock_wait+0x58>
  c4:   35fffee0        cbnz    w0, a0 <__lll_lock_wait+0x40>
  c8:   d65f03c0        ret

To:

0000000000000048 <__lll_lock_wait>:
  48:   aa0003e4        mov     x4, x0
  4c:   2a0103e5        mov     w5, w1
  50:   b9400000        ldr     w0, [x0]
  54:   7100081f        cmp     w0, #0x2
  58:   540000c0        b.eq    70 <__lll_lock_wait+0x28>  // b.none
  5c:   52800041        mov     w1, #0x2                        // #2
  60:   885ffc80        ldaxr   w0, [x4]
  64:   88027c81        stxr    w2, w1, [x4]
  68:   35ffffc2        cbnz    w2, 60 <__lll_lock_wait+0x18>
  6c:   34000120        cbz     w0, 90 <__lll_lock_wait+0x48>
  70:   521900a1        eor     w1, w5, #0x80
  74:   aa0403e0        mov     x0, x4
  78:   93407c21        sxtw    x1, w1
  7c:   d2800042        mov     x2, #0x2                        // #2
  80:   d2800003        mov     x3, #0x0                        // #0
  84:   d2800c48        mov     x8, #0x62                       // #98
  88:   d4000001        svc     #0x0
  8c:   17fffff4        b       5c <__lll_lock_wait+0x14>
  90:   d65f03c0        ret

I see similar changes on powerpc and other architectures.  It also aligns
with x86_64 implementation by adding the systemtap probes.

Checker on aarch64-linux-gnu.

* nptl/lowlevellock.c (__lll_lock_wait, __lll_lock_wait_private):
Optimize futex call and add systemtap probe.

Add single-thread.h header

This patch move the single-thread syscall optimization defintions from
syscall-cancel.h to new header file single-thread.h and also move the
cancellation definitions from pthreadP.h to syscall-cancel.h.

The idea is just simplify the inclusion of both syscall-cancel.h and
single-thread.h (without the requirement of including all pthreadP.h
defintions).

No semantic changes expected, checked on a build for all major ABIs.

* nptl/pthreadP.h (CANCEL_ASYNC, CANCEL_RESET, LIBC_CANCEL_ASYNC,
LIBC_CANCEL_RESET, __libc_enable_asynccancel,
__libc_disable_asynccancel, __librt_enable_asynccancel,
__libc_disable_asynccancel, __librt_enable_asynccancel,
__librt_disable_asynccancel): Move to ...
* sysdeps/unix/sysv/linux/sysdep-cancel.h: ... here.
(SINGLE_THREAD_P, RTLD_SINGLE_THREAD_P): Move to ...
* sysdeps/unix/sysv/linux/single-thread.h: ... here.
* sysdeps/generic/single-thread.h: New file.
* sysdeps/unix/sysdep.h: Include single-thread.h.
* sysdeps/unix/sysv/linux/futex-internal.h: Include sysdep-cancel.h.
* sysdeps/unix/sysv/linux/lowlevellock-futex.h: Likewise.

Bug 24535: Update to Unicode 12.1.0

Unicode 12.1.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 12.1.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).

Some info about the number of characters added or changed:

Total added characters in newly generated CHARMAP: 1
added: <U32FF> /xe3/x8b/xbf SQUARE ERA NAME REIWA
Total added characters in newly generated WIDTH: 1
added: <U32FF> 2 : eaw=W category=So bidi=L name=SQUARE ERA NAME REIWA
graph: Added 1 characters in new ctype which were not in old ctype
graph: Added: ㋿ U+32FF SQUARE ERA NAME REIWA
print: Added 1 characters in new ctype which were not in old ctype
print: Added: ㋿ U+32FF SQUARE ERA NAME REIWA
punct: Added 1 characters in new ctype which were not in old ctype
punct: Added: ㋿ U+32FF SQUARE ERA NAME REIWA

Fix tcache count maximum (BZ #24531)

The tcache counts[] array is a char, which has a very small range and thus
may overflow. When setting tcache_count tunable, there is no overflow check.
However the tunable must not be larger than the maximum value of the tcache
counts[] array, otherwise it can overflow when filling the tcache.

[BZ #24531]
* malloc/malloc.c (MAX_TCACHE_COUNT): New define.
(do_set_tcache_count): Only update if count is small enough.
* manual/tunables.texi (glibc.malloc.tcache_count): Document max value.

sem_close: Use __twalk_r

support: Fix timespec printf

The patch print timespec members as intmax_t instead of long int.
It avoid the -Werror=format= build issue on x32:

  timespec.c: In function 'test_timespec_before_impl':
  timespec.c:32:23: error: format '%ld' expects argument of type 'long int',
  but argument 4 has type '__time_t' {aka 'const long long int'} [-Werror=format=]

Checked on x86_64-linux-gnu-x32, x86_64-linux-gnu, and i686-linux-gnu.

* support/timespec.c (test_timespec_before_impl,
test_timespec_equal_or_after_impl): print timespec member as intmax_t
insted of long int.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

nptl/tst-abstime: Use libsupport

Checked on x86_64-linux-gnu and i686-linux-gnu.

* nptl/tst-abstime.c: Use libsupport.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

nptl: Convert some rwlock tests to use libsupport

Checked on x86_64-linux-gnu and i686-linux-gnu.

* nptl/tst-rwlock6.c: Use libsupport. This also happens to fix a
small bug where only tv.tv_usec was checked which could cause an
erroneous pass if pthread_rwlock_timedrdlock incorrectly took more
than a second.

* nptl/tst-rwlock7.c, nptl/tst-rwlock9.c, nptl/tst-rwlock14.c: Use
libsupport.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

nptl: Use recent additions to libsupport in tst-sem5

Checked on x86_64-linux-gnu and i686-linux-gnu.

* nptl/tst-sem5.c(do_test): Use xclock_gettime, timespec_add and
TEST_TIMESPEC_NOW_OR_AFTER from libsupport.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

nptl: Convert tst-cond11.c to use libsupport

Checked on x86_64-linux-gnu and i686-linux-gnu.

* nptl/tst-cond11.c: Use libsupport.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

support: Add timespec.h

It adds useful functions for tests that use struct timespec.

Checked on x86_64-linux-gnu and i686-linux-gnu.

* support/timespec.h: New file. Provide timespec helper functions
along with macros in the style of those in check.h.
* support/timespec.c: New file. Implement check functions declared
in support/timespec.h.
* support/timespec-add.c: New file from gnulib containing
timespec_add implementation that handles overflow.
* support/timespec-sub.c: New file from gnulib containing
timespec_sub implementation that handles overflow.
* support/README: Mention timespec.h.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Move nptl/tst-eintr1 to xtests

Don't run nptl/tst-eintr1 by normal make check because it can spuriously
break testing on various linux kernels. (Currently this affects the
aarch64 glibc buildbot machine which regularly fails and loses test
results.)

[BZ #24537]
* nptl/Makefile: Move tst-eintr1 to xtests.

powerpc: trunc/truncf refactor

This patches consolidates all the powerpc trunc{f} implementations on
the generic sysdeps/powerpc/fpu/s_trunc{f}. The generic implementation
uses either the compiler builts for ISA 2.03+ (which generates the
frim instruction) or a generic implementation which uses FP only
operations.

The IFUNC organization for powerpc64 is also change to be enabled only
for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not
require the fallback generic implementation).

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

* sysdeps/powerpc/fpu/trunc_to_integer.h (set_fenv_mode): Add
TRUNC handling.
(round_mode): Add definition for TRUNC.
* sysdeps/powerpc/fpu/s_trunc.c: New file.
* sysdeps/powerpc/fpu/s_truncf.c: New file.
* sysdeps/powerpc/powerpc32/fpu/s_trunc.S: Remove file.
* sysdeps/powerpc/powerpc32/fpu/s_truncf.S: Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-power5+.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-ppc32.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-power5+.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-ppc32.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-power5+.c: New
file.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-ppc32.c:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-power5+.c:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-ppc32.c:
Likewise.
* sysdep/powerpc/powerpc32/power5+/fpu/s_trunc.S: Remove file.
* sysdep/powerpc/powerpc32/power5+/fpu/s_truncf.S: Likewise.
* sysdep/powerpc/powerpc64/be/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_trunc-power5+, s_trunc-ppc64,
s_truncf-power5+, and s_truncf-ppc64.
(CFLAGS-s_trunc-power5+.c, CFLAGS-s_truncf-power5+.c): New rule.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_trunc-power5+.c: New
file.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_trunc-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_trunc.c: ... here.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_truncf-power5+.c: New
file.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_truncf-ppc64.c:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_truncf.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
(libm-sysdep_routines): Remove s_trunc-power5+, s_trunc-ppc64,
s_truncf-power5+, and s_truncf-ppc64.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_trunc-power5+.S: Remove
file.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_trunc-ppc64.S: Likewise.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_truncf-power5+.S:
Likewise.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_truncf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_trunc.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Likewise.
* sysdep/powerpc/powerpc64/power5+/fpu/s_trunc.S: Likewise.
* sysdep/powerpc/powerpc64/power5+/fpu/s_truncf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>

powerpc: round/roundf refactor

This patches consolidates all the powerpc round{f} implementations on
the generic sysdeps/powerpc/fpu/s_round{f}. The generic implementation
uses either the compiler builts for ISA 2.03+ (which generates the
frim instruction) or a generic implementation which uses FP only
operations.

The IFUNC organization for powerpc64 is also change to be enabled only
for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not
require the fallback generic implementation).

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

* sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode): Add
ROUND handling.
(round_mode): Add definition for ROUND.
(round_to_integer_float): Likewise.
* sysdeps/powerpc/fpu/s_round.c: New file.
* sysdeps/powerpc/fpu/s_roundf.c: New file.
* sysdeps/powerpc/powerpc32/fpu/s_round.S: Remove file.
* sysdeps/powerpc/powerpc32/fpu/s_roundf.S: Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-power5+.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-ppc32.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-power5+.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-ppc32.S:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-power5+.c: New
file.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-ppc32.c:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-power5+.c:
Likewise.
* sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-ppc32.c:
Likewise.
* sysdep/powerpc/powerpc32/power5+/fpu/s_round.S: Remove file.
* sysdep/powerpc/powerpc32/power5+/fpu/s_roundf.S: Likewise.
* sysdep/powerpc/powerpc64/be/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_round-power5+, s_round-ppc64,
s_roundf-power5+, and s_roundf-ppc64.
(CFLAGS-s_round-power5+.c, CFLAGS-s_roundf-power5+.c): New rule.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_round-power5+.c: New
file.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_round-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_round.c: ... here.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_roundf-power5+.c: New
file.
* sysdep/powerpc/powercp64/be/fpu/multiarch/s_roundf-ppc64.c:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_roundf.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
(libm-sysdep_routines): Remove s_round-power5+, s_round-ppc64,
s_roundf-power5+, and s_roundf-ppc64.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_round-power5+.S: Remove
file.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_round-ppc64.S: Likewise.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_roundf-power5+.S:
Likewise.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_roundf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_round.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Likewise.
* sysdep/powerpc/powerpc64/power5+/fpu/s_round.S: Likewise.
* sysdep/powerpc/powerpc64/power5+/fpu/s_roundf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>

powerpc: floor/floorf refactor

This patches consolidates all the powerpc floor{f} implementations on
the generic sysdeps/powerpc/fpu/s_floor{f}. The generic implementation
uses either the compiler builts for ISA 2.03+ (which generates the
frim instruction) or a generic implementation which uses FP only
operations.

The IFUNC organization for powerpc64 is also change to be enabled only
for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not
require the fallback generic implementation).

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

* sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode):
Add FLOOR option.
(round_mode): Add definition for FLOOR.
* sysdeps/powerpc/fpu/s_floor.c: New file.
* sysdeps/powerpc/fpu/s_floorf.c: Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_floor.S: Remove file.
* sysdeps/powerpc/powerpc32/fpu/s_floorf.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-power5+.S:
Remove file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-ppc32.S:
Likewise
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-power5+.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-ppc32.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-power5+.c:
New file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-power5+.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_floor.S: Remove file.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_floorf.S: Remove file.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_floor-power5+, s_floor-ppc64,
s_floorf-power5+, and s_floorf-ppc64.
(CFLAGS-s_floor-power5+.c, CFLAGS-s_floorf-power5+.c): New rule.
* sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floor-power5+.c: New
file.
* sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floor-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_floor.c: ... here.
* sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floorf-power5+.c: New
file.
* sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floorf-ppc64.c:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_floorf.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
(libm-sysdep_routines): Remove s_floor-power5+, s_floor-ppc64,
s_floorf-power5+, and s_floorf-ppc64.
* sysdep/powerpc/powerpc64/fpu/multiarch/s_floor-power5+.S: Remove
file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor-ppc64.S: Remove
file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-power5+.S:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-ppc64.S:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_floor.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_floor.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S: Likewise.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>

support: Add xclock_gettime

* support/xclock_gettime.c (xclock_gettime): New file. Provide
clock_gettime wrapper for use in tests that fails the test rather
than returning failure.

* support/xtime.h: New file to declare xclock_gettime.

* support/Makefile: Add xclock_gettime.c.

* support/README: Mention xtime.h.

malloc/tst-mallocfork2: Use process-shared barriers

This synchronization method has a lower overhead and makes
it more likely that the signal arrives during one of the critical
functions.

Also test for fork deadlocks explicitly.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

Update syscall-names.list for Linux 5.1.

This patch updates syscall-names.list for Linux 5.1 (which has many
new syscalls, mainly but not entirely ones for 64-bit time).

Tested with build-many-glibcs.py (before the revert of the move to
Linux 5.1 there; verified there were no tst-syscall-list failures).

* sysdeps/unix/sysv/linux/syscall-names.list: Update kernel
version to 5.1.
(clock_adjtime64) New syscall.
(clock_getres_time64) Likewise.
(clock_gettime64) Likewise.
(clock_nanosleep_time64) Likewise.
(clock_settime64) Likewise.
(futex_time64) Likewise.
(io_pgetevents_time64) Likewise.
(io_uring_enter) Likewise.
(io_uring_register) Likewise.
(io_uring_setup) Likewise.
(mq_timedreceive_time64) Likewise.
(mq_timedsend_time64) Likewise.
(pidfd_send_signal) Likewise.
(ppoll_time64) Likewise.
(pselect6_time64) Likewise.
(recvmmsg_time64) Likewise.
(rt_sigtimedwait_time64) Likewise.
(sched_rr_get_interval_time64) Likewise.
(semtimedop_time64) Likewise.
(timer_gettime64) Likewise.
(timer_settime64) Likewise.
(timerfd_gettime64) Likewise.
(timerfd_settime64) Likewise.
(utimensat_time64) Likewise.

Revert "Use Linux 5.1 in build-many-glibcs.py."

This reverts commit c2b11710fb4a2e8d337ae8f042724143c5ccf173.

Linux 5.1 headers are not in fact usable for glibc testing, because
"[PATCH] uapi: avoid namespace conflict in linux/posix_types.h"
<https://lore.kernel.org/lkml/20190319165123.3967889-1-arnd@arndb.de/>
did not get merged for 5.1 and so many conform/ tests fail.

Use Linux 5.1 in build-many-glibcs.py.

* scripts/build-many-glibcs.py (Context.checkout): Default Linux
version to 5.1.

Use GCC 9 in build-many-glibcs.py.

* scripts/build-many-glibcs.py (Context.checkout): Default GCC
version to 9 branch.

aarch64: thunderx2 memmove performance improvements

The performance improvement is about 20%-30% for
larger cases and about 1%-5% for smaller cases.

Used SIMD load/store instead of GPR for large
overlapping forward moves.

Reused existing memcpy implementation for smaller
or overlapping backward moves.

Fixed the existing memcpy implementation to allow it
to deal with the overlapping case.

Simplified loop tails in the memcpy implementation -
use branchless overlapping sequence of fixed length
load/stores instead of branching depending on the
size.

A cleanup/optimization converting str's to stp's.

Added __memmove_thunderx2 to the list of the
available implementations.

misc/tst-tsearch: Additional explicit error checking

This avoids an undefined variable warning with certain GCC versions.

Add missing bug number on CL entry for BZ#24506 (b2af6fb2ed239)