platform/upstream/glibc.git
4 years ago x86: Add thresholds for "rep movsb/stosb" to tunables
H.J. Lu [Mon, 6 Jul 2020 18:48:09 +0000 (11:48 -0700)]
x86: Add thresholds for "rep movsb/stosb" to tunables

Add x86_rep_movsb_threshold and x86_rep_stosb_threshold to tunables
to update thresholds for "rep movsb" and "rep stosb" at run-time.

Note that the user specified threshold for "rep movsb" smaller than
the minimum threshold will be ignored.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
4 years ago Use C2x return value from getpayload of non-NaN (bug 26073).
Joseph Myers [Mon, 6 Jul 2020 16:18:02 +0000 (16:18 +0000)]
Use C2x return value from getpayload of non-NaN (bug 26073).

In TS 18661-1, getpayload had an unspecified return value for a
non-NaN argument, while C2x requires the return value -1 in that case.

This patch implements the return value of -1.  I don't think this is
worth having a new symbol version that's an alias of the old one,
although occasionally we do that in such cases where the new function
semantics are a refinement of the old ones (to avoid programs relying
on the new semantics running on older glibc versions but not behaving
as intended).

Tested for x86_64 and x86; also ran math/ tests for aarch64 and
powerpc.
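
The rule described above can be pictured with a small model.  This is only
a sketch, not glibc's getpayload: the names sketch_getpayload and
sketch_make_nan are hypothetical, and the code assumes that double is IEEE
binary64 with the quiet bit in mantissa bit 51.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the C2x rule for a binary64 value; this is
   not the glibc implementation.  */
static double
sketch_getpayload (const double *x)
{
  uint64_t bits;
  memcpy (&bits, x, sizeof bits);   /* Read the object representation.  */

  uint64_t exponent = (bits >> 52) & 0x7ff;
  uint64_t mantissa = bits & 0xfffffffffffffULL;

  /* Not a NaN: C2x requires the return value -1.  */
  if (exponent != 0x7ff || mantissa == 0)
    return -1.0;

  /* The payload is the mantissa without the quiet bit (bit 51).  */
  return (double) (mantissa & 0x7ffffffffffffULL);
}

/* Hypothetical helper: build a quiet NaN that carries PAYLOAD.  */
static double
sketch_make_nan (uint64_t payload)
{
  uint64_t bits = 0x7ff8000000000000ULL | (payload & 0x7ffffffffffffULL);
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}
```

A program that depends on the -1 result from the real getpayload needs a
glibc that contains this change.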

4 years ago x86: Detect Extended Feature Disable (XFD)
H.J. Lu [Mon, 6 Jul 2020 13:57:08 +0000 (06:57 -0700)]
x86: Detect Extended Feature Disable (XFD)

Extended feature disable (XFD) is an extension to the XSAVE feature set,
added for Intel AMX, that allows an operating system to enable a feature
while preventing specific user threads from using the feature.

4 years ago x86: Correct bit_cpu_CLFSH [BZ #26208]
H.J. Lu [Mon, 6 Jul 2020 13:38:05 +0000 (06:38 -0700)]
x86: Correct bit_cpu_CLFSH [BZ #26208]

bit_cpu_CLFSH should be (1u << 19), not (1u << 20).
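
As an illustration of why the bit position matters, the sketch below tests
an EDX value from CPUID leaf 1 against the corrected mask.  The names
SKETCH_BIT_CPU_CLFSH and sketch_has_clfsh are hypothetical and are not the
glibc macros.

```c
/* CLFSH is reported in CPUID leaf 1, EDX bit 19.  */
#define SKETCH_BIT_CPU_CLFSH (1u << 19)

/* Hypothetical helper: return nonzero if EDX (from CPUID leaf 1)
   reports CLFSH.  */
static int
sketch_has_clfsh (unsigned int edx)
{
  return (edx & SKETCH_BIT_CPU_CLFSH) != 0;
}
```

With the old mask (1u << 20), a CPU that sets only bit 19 would have been
reported as lacking CLFSH.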

4 years ago manual: Document __libc_single_threaded
Florian Weimer [Wed, 24 Jun 2020 12:32:26 +0000 (14:32 +0200)]
manual: Document __libc_single_threaded

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
4 years ago Add the __libc_single_threaded variable
Florian Weimer [Wed, 20 May 2020 13:40:35 +0000 (15:40 +0200)]
Add the __libc_single_threaded variable

The variable is placed in libc.so, and it can be true only in
an outer libc, not libcs loaded via dlmopen or static dlopen.
Since thread creation from inner namespaces does not work,
pthread_create can update __libc_single_threaded directly.

Using __libc_early_init and its initial flag, implementation of this
variable is very straightforward.  A future version may reset the flag
during fork (but not in an inner namespace), or after joining all
threads except one.
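
A typical use of the variable is to skip atomic operations while the
process has only one thread.  The following is a minimal sketch under
stated assumptions: sketch_ref_inc and SKETCH_SINGLE_THREADED are
hypothetical names, the __atomic builtin requires GCC or Clang, and on a
libc without <sys/single_threaded.h> the sketch always takes the atomic
path.

```c
#ifdef __has_include
# if __has_include(<sys/single_threaded.h>)
#  include <sys/single_threaded.h>
#  define SKETCH_SINGLE_THREADED __libc_single_threaded
# endif
#endif
#ifndef SKETCH_SINGLE_THREADED
/* Header not available: always use the atomic path.  */
# define SKETCH_SINGLE_THREADED 0
#endif

/* Hypothetical reference-count increment.  */
static void
sketch_ref_inc (long *counter)
{
  if (SKETCH_SINGLE_THREADED)
    ++*counter;                     /* No other thread can observe this.  */
  else
    __atomic_fetch_add (counter, 1, __ATOMIC_ACQ_REL);
}
```

The check is made on every call because the flag can change once the
first additional thread is created.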

Reviewed-by: DJ Delorie <dj@redhat.com>
4 years ago Linux: rseq registration tests
Mathieu Desnoyers [Mon, 6 Jul 2020 08:21:35 +0000 (10:21 +0200)]
Linux: rseq registration tests

These tests validate that rseq is registered from various execution
contexts (main thread, destructor, other threads, other threads created
from destructor, forked process (without exec), pthread_atfork handlers,
pthread setspecific destructors, signal handlers, atexit handlers).

tst-rseq.c only links against libc.so, testing registration of rseq in
a non-multithreaded environment.

tst-rseq-nptl.c also links against libpthread.so, testing registration
of rseq in a multithreaded environment.

See the Linux kernel selftests for extensive rseq stress-tests.

4 years ago Linux: Use rseq in sched_getcpu if available
Mathieu Desnoyers [Mon, 6 Jul 2020 08:21:31 +0000 (10:21 +0200)]
Linux: Use rseq in sched_getcpu if available

When available, use the cpu_id field from __rseq_abi on Linux to
implement sched_getcpu().  Fall back on the vgetcpu vDSO if unavailable.

Benchmarks:

x86-64: Intel E5-2630 v3@2.40GHz, 16-core, hyperthreading

glibc sched_getcpu():                     13.7 ns (baseline)
glibc sched_getcpu() using rseq:           2.5 ns (speedup:  5.5x)
inline load cpuid from __rseq_abi TLS:     0.8 ns (speedup: 17.1x)

4 years ago Linux: Perform rseq registration at C startup and thread creation
Mathieu Desnoyers [Mon, 6 Jul 2020 08:21:16 +0000 (10:21 +0200)]
Linux: Perform rseq registration at C startup and thread creation

Register rseq TLS for each thread (including main), and unregister for
each thread (excluding main).  "rseq" stands for Restartable Sequences.

See the rseq(2) man page proposed here:
  https://lkml.org/lkml/2018/9/19/647

Those are based on glibc master branch commit 3ee1e0ec5c.
The rseq system call was merged into Linux 4.18.

The TLS_STATIC_SURPLUS define is increased to leave additional room for
dlopen'd initial-exec TLS, which keeps elf/tst-auditmany working.

The increase (76 bytes) is larger than 32 bytes because it has not been
increased in quite a while.  The cost in terms of additional TLS storage
is quite significant, but it will also obscure some initial-exec-related
dlopen failures.

4 years ago tst-cancel4: deal with ENOSYS errors
Samuel Thibault [Sun, 5 Jul 2020 17:21:45 +0000 (19:21 +0200)]
tst-cancel4: deal with ENOSYS errors

The Hurd port doesn't have support for sigwaitinfo, sigtimedwait, and msgget
yet, so let us ignore the test for these when they return ENOSYS.

* nptl/tst-cancel4.c (tf_sigwaitinfo): Fall back on sigwait when
sigwaitinfo returns ENOSYS.
(tf_sigtimedwait): Likewise with sigtimedwait.
(tf_msgrcv, tf_msgsnd): Fall back on tf_usleep when msgget returns ENOSYS.

4 years ago manual: Show copyright information not just in the printed manual
Florian Weimer [Fri, 3 Jul 2020 08:06:24 +0000 (10:06 +0200)]
manual: Show copyright information not just in the printed manual

@insertcopying was not used at all in the Info and HTML versions.
As a result, the notices that need to be present according to the
GNU Free Documentation License were missing.

This commit shows these notices above the table of contents in the
HTML version, and as part of the Main Menu node in the Info version.

Remove the "This file documents" line because it is redundant with the
following line.

Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
4 years ago Fix typo in comment in bug 26137 fix.
Joseph Myers [Wed, 1 Jul 2020 14:53:30 +0000 (14:53 +0000)]
Fix typo in comment in bug 26137 fix.

4 years ago Fix strtod multiple-precision division bug (bug 26137).
Joseph Myers [Tue, 30 Jun 2020 23:04:06 +0000 (23:04 +0000)]
Fix strtod multiple-precision division bug (bug 26137).

Bug 26137 reports spurious "inexact" exceptions from strtod, on 32-bit
systems only, for a decimal argument that is exactly 1 + 2^-32.  In
fact the same issue also appears for 1 + 2^-64 and 1 + 2^-96 as
arguments to strtof128 on 32-bit systems, and 1 + 2^-64 as an argument
to strtof128 on 64-bit systems.  In FE_DOWNWARD or FE_TOWARDZERO mode,
the return value is also incorrect.

The problem is in the multiple-precision division logic used in the
case of dividing by a denominator that occupies at least three GMP
limbs.  There was a comment "The division does not work if the upper
limb of the two-limb numerator is greater than the denominator.", but
in fact there were problems for the case of equality (that is, where
the high limbs are equal, offset by some multiple of the GMP limb
size) as well.  In such cases, the code used "quot = ~(mp_limb_t) 0;"
(with subsequent correction if that is an overestimate), because
udiv_qrnnd does not support the case of equality, but it's possible
for the shifted numerator to be greater than or equal to the
denominator, in which case that is an underestimate.  To avoid that,
this patch changes the ">" condition to ">=", meaning the first
division is done with a zero high word.

The tests added are all 1 + 2^-n for n from 1 to 113 except for those
that were already present in tst-strtod-round-data.

Tested for x86_64 and x86.
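
For reference, the input from the bug report can be reproduced as below.
This is a sketch only: sketch_strtod_exact is a hypothetical name, the
check covers the returned value in the default rounding mode, and testing
for the spurious "inexact" exception would additionally need <fenv.h>.

```c
#include <stdlib.h>

/* Hypothetical check: parse the decimal expansion of 1 + 2^-32 and
   compare it with the same value written as a hexadecimal constant.  */
static int
sketch_strtod_exact (void)
{
  /* 2^-32 = 0.00000000023283064365386962890625 exactly.  */
  const char *text = "1." "000000000" "23283064365386962890625";
  double parsed = strtod (text, NULL);
  return parsed == 0x1.00000001p0;
}
```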

4 years ago Linux: Fix UTC offset setting in settimeofday for __TIMESIZE != 64
Florian Weimer [Tue, 30 Jun 2020 19:19:43 +0000 (21:19 +0200)]
Linux: Fix UTC offset setting in settimeofday for __TIMESIZE != 64

The time argument is NULL in this case, and an attempt to convert it
leads to a null pointer dereference.

This fixes commit d2e3b697da2433c08702f95c76458c51545c3df1
("y2038: linux: Provide __settimeofday64 implementation").

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
4 years ago random: range is not portably RAND_MAX [BZ #7003]
John Marshall [Tue, 30 Jun 2020 18:16:03 +0000 (14:16 -0400)]
random: range is not portably RAND_MAX [BZ #7003]

On other platforms, RAND_MAX (which is the range of rand(3))
may differ from 2^31-1 (which is the range of random(3)).
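
In portable code this means scaling the result of random(3) by its own
range rather than by RAND_MAX.  A minimal sketch, where SKETCH_RANDOM_MAX
and sketch_random_unit are hypothetical names:

```c
#include <stdlib.h>

/* Range of random(3): 2^31 - 1, independent of RAND_MAX.  */
#define SKETCH_RANDOM_MAX 2147483647L

/* Hypothetical helper: map random () to the interval [0, 1).  */
static double
sketch_random_unit (void)
{
  return (double) random () / ((double) SKETCH_RANDOM_MAX + 1.0);
}
```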

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
4 years ago Update kernel version to 5.7 in tst-mman-consts.py.
Joseph Myers [Mon, 29 Jun 2020 14:06:32 +0000 (14:06 +0000)]
Update kernel version to 5.7 in tst-mman-consts.py.

This patch updates the kernel version in the test tst-mman-consts.py
to 5.7.  (There are no new constants covered by this test in 5.7 that
need any other header changes; there's a new MREMAP_DONTUNMAP, but
this test doesn't yet cover MREMAP_*.)

Tested with build-many-glibcs.py.

4 years ago powerpc: Add support for POWER10
Tulio Magno Quites Machado Filho [Wed, 24 Jun 2020 21:04:41 +0000 (18:04 -0300)]
powerpc: Add support for POWER10

1. Add the directories to hold POWER10 files.

2. Add support to select POWER10 libraries based on AT_PLATFORM.

3. Let submachine=power10 be set automatically.

4 years ago hurd: Simplify usleep timeout computation
Samuel Thibault [Mon, 29 Jun 2020 08:09:14 +0000 (10:09 +0200)]
hurd: Simplify usleep timeout computation

as suggested by Andreas Schwab

* sysdeps/mach/usleep.c (usleep): Divide timeout in an overflow-safe way.
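
One overflow-safe way to round microseconds up to milliseconds is
sketched below.  It is not necessarily the exact expression used in
usleep.c, and sketch_usec_to_msec_ceil is a hypothetical name.

```c
/* Hypothetical helper: ceil (usec / 1000) without computing
   usec + 999, which would wrap around for values near the maximum.  */
static unsigned int
sketch_usec_to_msec_ceil (unsigned int usec)
{
  return usec / 1000 + (usec % 1000 != 0);
}
```

Dividing first keeps every intermediate value no larger than the input, so
no clamping step is needed.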

4 years ago htl: Enable cancel*16 and cancel*20 tests
Samuel Thibault [Mon, 29 Jun 2020 00:14:52 +0000 (00:14 +0000)]
htl: Enable cancel*16 and cancel*20 tests

* nptl/tst-cancel16.c, tst-cancel20.c, tst-cancelx16.c, tst-cancelx20.c:
Move to...
* sysdeps/pthread: ... here.
* nptl/Makefile: Move corresponding references and rules to...
* sysdeps/pthread/Makefile: ... here.
* sysdeps/mach/hurd/i386/Makefile: Xfail tst-cancel*16 for now: missing
barrier pshared support, but test should be working otherwise.

4 years ago hurd: Add remaining cancellation points
Samuel Thibault [Sun, 28 Jun 2020 22:41:18 +0000 (22:41 +0000)]
hurd: Add remaining cancellation points

* hurd/hurdselect.c: Include <sysdep-cancel.h>.
(_hurd_select): Surround call to __mach_msg with enabling async cancel.
* sysdeps/mach/hurd/accept4.c: Include <sysdep-cancel.h>.
(__libc_accept4): Surround call to __socket_accept with enabling async cancel,
and use HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.
* sysdeps/mach/hurd/connect.c: Include <sysdep-cancel.h>.
(__connect): Surround call to __file_name_lookup and __socket_connect
with enabling async cancel, and use HURD_DPORT_USE_CANCEL instead of
HURD_DPORT_USE.
* sysdeps/mach/hurd/fdatasync.c: Include <sysdep-cancel.h>.
(fdatasync): Surround call to __file_sync with enabling async cancel, and use
HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.
* sysdeps/mach/hurd/fsync.c: Include <sysdep-cancel.h>.
(fsync): Surround call to __file_sync with enabling async cancel, and use
HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.
* sysdeps/mach/hurd/ioctl.c: Include <sysdep-cancel.h>.
(__ioctl): When request is TIOCDRAIN, surround call to send_rpc with enabling
async cancel, and use HURD_DPORT_USE_CANCEL instead of HURD_DPORT_USE.
* sysdeps/mach/hurd/msync.c: Include <sysdep-cancel.h>.
(msync): Surround call to __vm_object_sync with enabling async cancel.
* sysdeps/mach/hurd/sigsuspend.c: Include <sysdep-cancel.h>.
(__sigsuspend): Surround call to __mach_msg with enabling async cancel.
* sysdeps/mach/hurd/sigwait.c: Include <sysdep-cancel.h>.
(__sigwait): Surround wait code with enabling async cancel.
* sysdeps/mach/msync.c: Include <sysdep-cancel.h>.
(msync): Surround call to __vm_msync with enabling async cancel.
* sysdeps/mach/sleep.c: Include <sysdep-cancel.h>.
(__sleep): Surround call to __mach_msg with enabling async cancel.
* sysdeps/mach/usleep.c: Include <sysdep-cancel.h>.
(usleep): Surround call to __mach_msg with enabling async cancel.

4 years ago hurd: fix usleep(ULONG_MAX)
Samuel Thibault [Sun, 28 Jun 2020 22:39:03 +0000 (22:39 +0000)]
hurd: fix usleep(ULONG_MAX)

* sysdeps/mach/usleep.c (usleep): Clamp timeout when rounding up.

4 years ago hurd: Make fcntl(F_SETLKW*) cancellation points
Samuel Thibault [Sun, 28 Jun 2020 18:18:43 +0000 (18:18 +0000)]
hurd: Make fcntl(F_SETLKW*) cancellation points

and add _nocancel variant.

* sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add fcntl_nocancel.
* sysdeps/mach/hurd/fcntl.c [NOCANCEL]: Include <not-cancel.h>.
[!NOCANCEL]: Include <sysdep-cancel.h>.
(__libc_fcntl) [!NOCANCEL]: Surround __file_record_lock call with enabling async cancel, and use HURD_FD_PORT_USE_CANCEL instead of HURD_FD_PORT_USE.
* sysdeps/mach/hurd/fcntl_nocancel.c: New file, defines __fcntl_nocancel by including fcntl.c.
* sysdeps/mach/hurd/not-cancel.h (__fcntl64_nocancel): Replace macro with
    __fcntl_nocancel declaration with hidden proto, and make
    __fcntl64_nocancel call __fcntl_nocancel.

4 years ago hurd: make wait4 a cancellation point
Samuel Thibault [Sun, 28 Jun 2020 16:54:49 +0000 (16:54 +0000)]
hurd: make wait4 a cancellation point

and add _nocancel variant.

* sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add wait4_nocancel.
* sysdeps/mach/hurd/wait4.c: Include <sysdep-cancel.h>
(__wait4): Surround __proc_wait with enabling async cancel, and use
__USEPORT_CANCEL instead of __USEPORT.
* sysdeps/mach/hurd/wait4_nocancel.c: New file, contains previous
implementation of __wait4.
* sysdeps/mach/hurd/not-cancel.h (__waitpid_nocancel): Replace macro with
__wait4_nocancel declaration with hidden proto, and make
__waitpid_nocancel call __wait4_nocancel.

4 years ago hurd: Fix port definition in HURD_PORT_USE_CANCEL
Samuel Thibault [Sun, 28 Jun 2020 17:00:47 +0000 (17:00 +0000)]
hurd: Fix port definition in HURD_PORT_USE_CANCEL

* sysdeps/hurd/include/hurd/port.h: Include <libc-lock.h>.
(HURD_PORT_USE_CANCEL): Add local port variable.

4 years ago hurd: make close a cancellation point
Samuel Thibault [Sun, 28 Jun 2020 15:51:40 +0000 (15:51 +0000)]
hurd: make close a cancellation point

and add _nocancel variant.

* sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add close_nocancel.
* sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE, ld.GLIBC_PRIVATE): Add
__close_nocancel.
* sysdeps/mach/hurd/i386/localplt.data (__close_nocancel): Allow PLT.
* sysdeps/mach/hurd/close.c: Include <sysdep-cancel.h>
(__libc_close): Surround _hurd_fd_close with enabling async cancel.
* sysdeps/mach/hurd/close_nocancel.c: New file.
* sysdeps/mach/hurd/not-cancel.h (__close_nocancel): Replace macro with
declaration with hidden proto.

4 years ago hurd: make open and openat cancellation points
Samuel Thibault [Sun, 28 Jun 2020 14:27:36 +0000 (14:27 +0000)]
hurd: make open and openat cancellation points

and add _nocancel variants.

* sysdeps/mach/hurd/Makefile [io] (sysdep_routines): Add open_nocancel
openat_nocancel.
* sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE, ld.GLIBC_PRIVATE): Add
__open_nocancel.
* sysdeps/mach/hurd/dl-sysdep.c (__open_nocancel): Add alias, check it
is not hidden.
* sysdeps/mach/hurd/i386/localplt.data (__open_nocancel): Allow PLT.
* sysdeps/mach/hurd/not-cancel.h (__open_nocancel, __openat_nocancel):
Replace macros with declarations with hidden proto.
(__open64_nocancel, __openat64_nocancel): Call __open_nocancel and
__openat_nocancel instead of __open64 and __openat64.
* sysdeps/mach/hurd/open.c: Include <sysdep-cancel.h>
(__libc_open): Surround __file_name_lookup with enabling async cancel.
* sysdeps/mach/hurd/openat.c: Likewise.
* sysdeps/mach/hurd/open_nocancel.c,
sysdeps/mach/hurd/openat_nocancel.c: New files.

4 years ago hurd: clean fd and port on thread cancel
Samuel Thibault [Sun, 28 Jun 2020 00:15:56 +0000 (00:15 +0000)]
hurd: clean fd and port on thread cancel

HURD_*PORT_USE link fd and port with a stack-stored structure, so on
thread cancel we need to clean this up.

* hurd/fd-cleanup.c: New file.
* hurd/port-cleanup.c (_hurd_port_use_cleanup): New function.
* hurd/Makefile (routines): Add fd-cleanup.
* sysdeps/hurd/include/hurd.h (__USEPORT_CANCEL): New macro.
* sysdeps/hurd/include/hurd/fd.h (_hurd_fd_port_use_data): New
structure.
(_hurd_fd_port_use_cleanup): New prototype.
(HURD_DPORT_USE_CANCEL, HURD_FD_PORT_USE_CANCEL): New macros.
* sysdeps/hurd/include/hurd/port.h (_hurd_port_use_data): New structure.
(_hurd_port_use_cleanup): New prototype.
(HURD_PORT_USE_CANCEL): New macro.
* hurd/hurd/fd.h (HURD_FD_PORT_USE): Also refer to HURD_FD_PORT_USE_CANCEL.
* hurd/hurd.h (__USEPORT): Also refer to __USEPORT_CANCEL.
* hurd/hurd/port.h (HURD_PORT_USE): Also refer to HURD_PORT_USE_CANCEL.

* hurd/fd-read.c (_hurd_fd_read): Call HURD_FD_PORT_USE_CANCEL instead
of HURD_FD_PORT_USE.
* hurd/fd-write.c (_hurd_fd_write): Likewise.
* sysdeps/mach/hurd/send.c (__send): Call HURD_DPORT_USE_CANCEL instead
of HURD_DPORT_USE.
* sysdeps/mach/hurd/sendmsg.c (__libc_sendmsg): Likewise.
* sysdeps/mach/hurd/sendto.c (__sendto): Likewise.
* sysdeps/mach/hurd/recv.c (__recv): Likewise.
* sysdeps/mach/hurd/recvfrom.c (__recvfrom): Likewise.
* sysdeps/mach/hurd/recvmsg.c (__libc_recvmsg): Call __USEPORT_CANCEL
instead of __USEPORT, and HURD_DPORT_USE_CANCEL instead of
HURD_DPORT_USE.

4 years ago htl: Move cleanup handling to non-private libc-lock
Samuel Thibault [Sat, 27 Jun 2020 18:33:52 +0000 (18:33 +0000)]
htl: Move cleanup handling to non-private libc-lock

This adds sysdeps/htl/libc-lock.h which augments sysdeps/mach/libc-lock.h with
the htl-aware cleanup handling. Otherwise inclusion of libc-lock.h
without libc-lockP.h would keep only the mach-aware handling.

This also fixes cleanup getting called when the binary is
statically-linked without libpthread.

* sysdeps/htl/libc-lockP.h (__libc_cleanup_region_start,
__libc_cleanup_end, __libc_cleanup_region_end,
__pthread_get_cleanup_stack): Move to...
* sysdeps/htl/libc-lock.h: ... new file.
(__libc_cleanup_region_start): Always set handler and arg.
(__libc_cleanup_end): Always call the cleanup handler.
(__libc_cleanup_push, __libc_cleanup_pop): New macros.

4 years ago htl: Fix includes for lockfile
Samuel Thibault [Sat, 27 Jun 2020 12:20:24 +0000 (12:20 +0000)]
htl: Fix includes for lockfile

These files only need what is required to use __libc_ptf_call.

* sysdeps/htl/flockfile.c: Include <libc-lockP.h> instead of
<libc-lock.h>
* sysdeps/htl/ftrylockfile.c: Include <libc-lockP.h> instead of
<errno.h>, <pthread.h>, <stdio-lock.h>
* sysdeps/htl/funlockfile.c: Include <libc-lockP.h> instead of
<pthread.h> and <stdio-lock.h>

4 years ago htl: avoid cancelling threads inside critical sections
Samuel Thibault [Sat, 27 Jun 2020 00:34:18 +0000 (02:34 +0200)]
htl: avoid cancelling threads inside critical sections

Like hurd_thread_cancel does.

* sysdeps/mach/hurd/htl/pt-docancel.c: Include <hurd/signal.h>
(__pthread_do_cancel): Lock target thread's critical_section_lock and ss
lock around thread mangling.

4 years ago tst-cancel4-common.c: fix calling socketpair
Samuel Thibault [Fri, 26 Jun 2020 20:44:30 +0000 (22:44 +0200)]
tst-cancel4-common.c: fix calling socketpair

PF_UNIX was actually never intended to be passed as protocol parameter to
socket() calls: it is a protocol family, not a protocol.  It happens that
Linux introduced accepting it during its 2.0 development, but it shouldn't.
OpenBSD kernels accept it as well, but FreeBSD and NetBSD rightfully do not.
GNU/Hurd does not either.

* nptl/tst-cancel4-common.c (do_test): Pass 0 instead of PF_UNIX as
protocol.
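
The portable form of the call passes the family as the first argument and
0 as the protocol.  A minimal sketch, where sketch_unix_pair is a
hypothetical wrapper and not part of the test:

```c
#include <sys/socket.h>

/* Hypothetical wrapper: create a connected pair of local stream
   sockets.  The third argument is the protocol, so it is 0 (the
   default protocol for the family), not PF_UNIX.  */
static int
sketch_unix_pair (int fds[2])
{
  return socketpair (AF_UNIX, SOCK_STREAM, 0, fds);
}
```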

4 years ago x86: Detect Intel Advanced Matrix Extensions
H.J. Lu [Thu, 25 Jun 2020 22:12:57 +0000 (15:12 -0700)]
x86: Detect Intel Advanced Matrix Extensions

Intel Advanced Matrix Extensions (Intel AMX) is a new programming
paradigm consisting of two components: a set of 2-dimensional registers
(tiles) representing sub-arrays from a larger 2-dimensional memory image,
and accelerators able to operate on tiles.  Intel AMX is an extensible
architecture.  New accelerators can be added and the existing accelerator
may be enhanced to provide higher performance.  The initial features are
AMX-BF16, AMX-TILE and AMX-INT8, which are usable only if the operating
system supports both XTILECFG state and XTILEDATA state.

Add AMX-BF16, AMX-TILE and AMX-INT8 support to HAS_CPU_FEATURE and
CPU_FEATURE_USABLE.

4 years ago Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to UD7FB to 0 [BZ #26120]
Mike FABIAN [Tue, 16 Jun 2020 06:29:40 +0000 (08:29 +0200)]
Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to UD7FB to 0 [BZ #26120]

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
4 years ago S390: Optimize __memset_z196.
Stefan Liebler [Fri, 26 Jun 2020 07:45:11 +0000 (09:45 +0200)]
S390: Optimize __memset_z196.

It turned out that a 256b-mvc instruction which depends on the
result of a previous 256b-mvc instruction is counterproductive.
Therefore this patch adjusts the 256b-loop by storing the
first byte with stc and setting the remaining 255b with mvc.
Now the 255b-mvc instruction depends on the stc instruction.

4 years ago S390: Optimize __memcpy_z196.
Stefan Liebler [Fri, 26 Jun 2020 07:45:11 +0000 (09:45 +0200)]
S390: Optimize __memcpy_z196.

This patch introduces an extra loop without pfd instructions
as it turned out that the pfd instructions are useful
for copies >=64KB but are counterproductive for smaller copies.

4 years ago elf: Include <stddef.h> (for size_t), <sys/stat.h> in <ldconfig.h>
Florian Weimer [Thu, 25 Jun 2020 14:51:03 +0000 (16:51 +0200)]
elf: Include <stddef.h> (for size_t), <sys/stat.h> in <ldconfig.h>

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
4 years ago nptl: Don't madvise user provided stack
Szabolcs Nagy [Wed, 24 Jun 2020 06:47:15 +0000 (07:47 +0100)]
nptl: Don't madvise user provided stack

User provided stack should not be released nor madvised at
thread exit because it's owned by the user.

If the memory is shared or file based then MADV_DONTNEED
can have unwanted effects. With memory tagging on aarch64
linux the tags are dropped and thus it may invalidate
pointers.

Tested on aarch64-linux-gnu with MTE, it fixes

FAIL: nptl/tst-stack3
FAIL: nptl/tst-stack3-mem

4 years ago S390: Regenerate ULPs.
Stefan Liebler [Wed, 24 Jun 2020 12:51:06 +0000 (14:51 +0200)]
S390: Regenerate ULPs.

Updates needed after recent exp10f commits.

4 years ago htl: Add wrapper header for <semaphore.h> with hidden __sem_post
Florian Weimer [Wed, 24 Jun 2020 11:38:08 +0000 (13:38 +0200)]
htl: Add wrapper header for <semaphore.h> with hidden __sem_post

This is required to avoid a check-localplt failure due to a
sem_post call through the PLT.

Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
4 years ago elf: Include <stdbool.h> in <dl-tunables.h> because bool is used
Florian Weimer [Wed, 24 Jun 2020 09:02:33 +0000 (11:02 +0200)]
elf: Include <stdbool.h> in <dl-tunables.h> because bool is used

4 years ago htl: Fix case when sem_*wait is canceled while holding a token
Samuel Thibault [Wed, 24 Jun 2020 00:18:45 +0000 (00:18 +0000)]
htl: Fix case when sem_*wait is canceled while holding a token

* sysdeps/htl/sem-timedwait.c (struct cancel_ctx): Add cancel_wake
field.
(cancel_hook): When unblocking thread, set cancel_wake field to 1.
(__sem_timedwait_internal): Set cancel_wake field to 0 by default.
On cancellation exit, check whether we hold a token, to be put back.

4 years ago htl: Make sem_*wait cancellation points
Samuel Thibault [Tue, 23 Jun 2020 22:43:32 +0000 (22:43 +0000)]
htl: Make sem_*wait cancellation points

By aligning its implementation on pthread_cond_wait.

* sysdeps/htl/sem-timedwait.c (cancel_ctx): New structure.
(cancel_hook): New function.
(__sem_timedwait_internal): Check for cancellation and register
cancellation hook that wakes the thread up, and check again for
cancellation on exit.
* nptl/tst-cancel13.c, nptl/tst-cancelx13.c: Move to...
* sysdeps/pthread/: ... here.
* nptl/Makefile: Move corresponding references and rules to...
* sysdeps/pthread/Makefile: ... here.

4 years ago htl: Simplify non-cancel path of __pthread_cond_timedwait_internal
Samuel Thibault [Tue, 23 Jun 2020 22:41:18 +0000 (22:41 +0000)]
htl: Simplify non-cancel path of __pthread_cond_timedwait_internal

Since __pthread_exit does not return, we do not need to indent the
noncancel path.

* sysdeps/htl/pt-cond-timedwait.c (__pthread_cond_timedwait_internal):
Move cancelled path before non-cancelled path, to avoid "else"
indentation.

4 years ago htl: Enable tst-cancel25 test
Samuel Thibault [Tue, 23 Jun 2020 22:00:53 +0000 (22:00 +0000)]
htl: Enable tst-cancel25 test

* nptl/tst-cancel25.c: Move to...
* sysdeps/pthread/tst-cancel25.c: ... here.
(tf2): Do not test for SIGCANCEL when it is not defined.
* nptl/Makefile: Move corresponding reference to...
* sysdeps/pthread/Makefile: ... here.

4 years ago powerpc: Add new hwcap values
Tulio Magno Quites Machado Filho [Mon, 15 Jun 2020 14:15:57 +0000 (11:15 -0300)]
powerpc: Add new hwcap values

Linux commit ID ee988c11acf6f9464b7b44e9a091bf6afb3b3a49 reserved 2 new
bits in AT_HWCAP2:
 - PPC_FEATURE2_ARCH_3_1 indicates the availability of the POWER ISA
   3.1;
 - PPC_FEATURE2_MMA indicates the availability of the Matrix-Multiply
   Assist facility.
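
A program can test these bits in the AT_HWCAP2 value roughly as follows.
This is a sketch: the helper names are hypothetical, and the fallback
values are assumptions taken from the Linux uapi header and should be
checked against <bits/hwcap.h> on the target system.

```c
/* Fallback definitions for headers that predate the new bits
   (assumed values; verify against the installed headers).  */
#ifndef PPC_FEATURE2_ARCH_3_1
# define PPC_FEATURE2_ARCH_3_1 0x00040000
#endif
#ifndef PPC_FEATURE2_MMA
# define PPC_FEATURE2_MMA 0x00020000
#endif

/* Hypothetical helpers operating on an AT_HWCAP2 value.  */
static int
sketch_has_isa_3_1 (unsigned long hwcap2)
{
  return (hwcap2 & PPC_FEATURE2_ARCH_3_1) != 0;
}

static int
sketch_has_mma (unsigned long hwcap2)
{
  return (hwcap2 & PPC_FEATURE2_MMA) != 0;
}
```

On powerpc Linux the argument would normally come from
getauxval (AT_HWCAP2), declared in <sys/auxv.h>.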

4 years ago aarch64: MTE compatible strncmp
Alex Butler [Tue, 16 Jun 2020 12:44:24 +0000 (12:44 +0000)]
aarch64: MTE compatible strncmp

Add support for MTE to strncmp. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.
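
The constraint can be pictured in C as follows.  This is only an
illustration of the 16-byte granule arithmetic, not the assembly
implementation; SKETCH_MTE_GRANULE and sketch_bytes_left_in_granule are
hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Size of one MTE tag granule in bytes.  */
#define SKETCH_MTE_GRANULE 16u

/* Hypothetical helper: number of bytes from ADDR to the end of the
   tag granule that contains it.  A load that starts at ADDR and is
   no longer than this stays within a single granule.  */
static size_t
sketch_bytes_left_in_granule (uintptr_t addr)
{
  return SKETCH_MTE_GRANULE - (addr & (SKETCH_MTE_GRANULE - 1));
}
```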

Co-authored-by: Branislav Rankov <branislav.rankov@arm.com>
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
4 years ago aarch64: MTE compatible strcmp
Alex Butler [Tue, 16 Jun 2020 12:42:38 +0000 (12:42 +0000)]
aarch64: MTE compatible strcmp

Add support for MTE to strcmp. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Branislav Rankov <branislav.rankov@arm.com>
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
4 years ago aarch64: MTE compatible strrchr
Alex Butler [Tue, 9 Jun 2020 16:09:36 +0000 (16:09 +0000)]
aarch64: MTE compatible strrchr

Add support for MTE to strrchr. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
4 years ago aarch64: MTE compatible memrchr
Alex Butler [Tue, 9 Jun 2020 16:08:07 +0000 (16:08 +0000)]
aarch64: MTE compatible memrchr

Add support for MTE to memrchr. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
4 years ago aarch64: MTE compatible memchr
Alex Butler [Tue, 9 Jun 2020 16:06:03 +0000 (16:06 +0000)]
aarch64: MTE compatible memchr

Add support for MTE to memchr. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Gabor Kertesz <gabor.kertesz@arm.com>
4 years ago aarch64: MTE compatible strcpy
Alex Butler [Tue, 9 Jun 2020 15:57:03 +0000 (15:57 +0000)]
aarch64: MTE compatible strcpy

Add support for MTE to strcpy. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.

The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
4 years ago Add MREMAP_DONTUNMAP from Linux 5.7
Joseph Myers [Tue, 23 Jun 2020 14:42:45 +0000 (14:42 +0000)]
Add MREMAP_DONTUNMAP from Linux 5.7

Add the new constant MREMAP_DONTUNMAP from Linux 5.7 to
bits/mman-shared.h.

Tested with build-many-glibcs.py.
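
For callers, the new flag follows the same combination rule as
MREMAP_FIXED in mremap(2): it has to be used together with
MREMAP_MAYMOVE.  The sketch below models that rule with local constants;
the SKETCH_ names and sketch_mremap_flags_ok are hypothetical, and the
numeric values mirror the Linux definitions.

```c
/* Local copies of the Linux flag values, for illustration only.  */
#define SKETCH_MREMAP_MAYMOVE   1
#define SKETCH_MREMAP_FIXED     2
#define SKETCH_MREMAP_DONTUNMAP 4

/* Hypothetical check of a flag combination before calling mremap.  */
static int
sketch_mremap_flags_ok (int flags)
{
  const int known = (SKETCH_MREMAP_MAYMOVE | SKETCH_MREMAP_FIXED
                     | SKETCH_MREMAP_DONTUNMAP);

  if (flags & ~known)
    return 0;                       /* Unknown bits.  */

  /* FIXED and DONTUNMAP both require MAYMOVE.  */
  if ((flags & (SKETCH_MREMAP_FIXED | SKETCH_MREMAP_DONTUNMAP))
      && !(flags & SKETCH_MREMAP_MAYMOVE))
    return 0;

  return 1;
}
```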

4 years ago x86: Update CPU feature detection [BZ #26149]
H.J. Lu [Wed, 17 Jun 2020 13:34:46 +0000 (06:34 -0700)]
x86: Update CPU feature detection [BZ #26149]

1. Divide architecture features into the usable features and the preferred
features.  The usable features are for correctness and can be exported in
a stable ABI.  The preferred features are for performance and only for
glibc internal use.
2. Change struct cpu_features to

struct cpu_features
{
  struct cpu_features_basic basic;
  unsigned int *usable_p;
  struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
  unsigned int usable[USABLE_FEATURE_INDEX_MAX];
  unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX];
  ...
};

and initialize usable_p to point to the usable array so that

struct cpu_features
{
  struct cpu_features_basic basic;
  unsigned int *usable_p;
  struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
};

can be exported via a stable ABI.  The cpuid and usable arrays can be
expanded with backward binary compatibility for both .o and .so files.
3. Add COMMON_CPUID_INDEX_7_ECX_1 for AVX512_BF16.
4. Detect ENQCMD, PKS, AVX512_VP2INTERSECT, MD_CLEAR, SERIALIZE, HYBRID,
TSXLDTRK, L1D_FLUSH, CORE_CAPABILITIES and AVX512_BF16.
5. Rename CAPABILITIES to ARCH_CAPABILITIES.
6. Check if AVX512_VP2INTERSECT, AVX512_BF16 and PKU are usable.
7. Update CPU feature detection test.
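
The role of usable_p in item 2 can be sketched with a reduced model.  All
names and array sizes below are hypothetical; the point is only that
readers go through the pointer, so the cpuid array can grow without
changing how the usable bits are found.

```c
#include <stdbool.h>

/* Hypothetical, reduced sizes for illustration.  */
#define SKETCH_CPUID_INDEX_MAX  2
#define SKETCH_USABLE_INDEX_MAX 1

struct sketch_cpuid_registers
{
  unsigned int eax, ebx, ecx, edx;
};

struct sketch_cpu_features
{
  unsigned int *usable_p;           /* Exported: points at usable[].  */
  struct sketch_cpuid_registers cpuid[SKETCH_CPUID_INDEX_MAX];
  unsigned int usable[SKETCH_USABLE_INDEX_MAX];   /* Internal.  */
};

/* Point usable_p at the internal array, as the commit describes.  */
static void
sketch_init (struct sketch_cpu_features *cf)
{
  cf->usable_p = cf->usable;
}

/* Readers never use the offset of usable[] directly.  */
static bool
sketch_usable (const struct sketch_cpu_features *cf,
               unsigned int index, unsigned int bit)
{
  return (cf->usable_p[index] & (1u << bit)) != 0;
}
```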

4 years ago aarch64: Remove fpu Makefile
Adhemerval Zanella [Wed, 10 Jun 2020 16:20:26 +0000 (13:20 -0300)]
aarch64: Remove fpu Makefile

The -fno-math-errno is already added by default and the minimum
required GCC to build glibc (6.2) makes the -ffinite-math-only
superfluous.

Checked on aarch64-linux-gnu.

4 years ago m68k: Use sqrt{f} builtin for coldfire
Adhemerval Zanella [Tue, 9 Jun 2020 18:42:22 +0000 (15:42 -0300)]
m68k: Use sqrt{f} builtin for coldfire

Checked with a build for m68k-linux-gnu-coldfire.

4 years ago arm: Use sqrt{f} builtin
Adhemerval Zanella [Fri, 5 Jun 2020 15:08:44 +0000 (15:08 +0000)]
arm: Use sqrt{f} builtin

Checked on arm-linux-gnueabi and armv7-linux-gnueabihf

4 years ago riscv: Use sqrt{f} builtin
Adhemerval Zanella [Fri, 5 Jun 2020 14:44:44 +0000 (14:44 +0000)]
riscv: Use sqrt{f} builtin

Checked with a build for riscv64-linux-gnu-rv64imac-lp64 (no
builtin support), riscv64-linux-gnu-rv64imafdc-lp64, and
riscv64-linux-gnu-rv64imafdc-lp64d.

4 years agos390: Use sqrt{f} builtin
Adhemerval Zanella [Fri, 5 Jun 2020 14:31:30 +0000 (14:31 +0000)]
s390: Use sqrt{f} builtin

Checked on s390x-linux-gnu.

4 years agosparc: Use sqrt{f} builtin
Adhemerval Zanella [Fri, 5 Jun 2020 14:22:09 +0000 (14:22 +0000)]
sparc: Use sqrt{f} builtin

It also enables the use of fsqrtd on sparc64.

Checked on sparcv9-linux-gnu and sparc64-linux-gnu.

4 years agomips: Use sqrt{f} builtin
Adhemerval Zanella [Fri, 5 Jun 2020 14:00:25 +0000 (14:00 +0000)]
mips: Use sqrt{f} builtin

Checked with a build against mips-linux-gnu and mips64-linux-gnu
and comparing the resulting binaries.

4 years agoalpha: Use builtin sqrt{f}
Adhemerval Zanella [Fri, 5 Jun 2020 03:33:51 +0000 (03:33 +0000)]
alpha: Use builtin sqrt{f}

The generic implementation is simplified by removing the
'optimization' for !_IEEE_FP_INEXACT (which handles neither
inexact nor some values).

Checked on alpha-linux-gnu.

4 years agoi386: Use builtin sqrtl
Adhemerval Zanella [Fri, 5 Jun 2020 03:33:12 +0000 (03:33 +0000)]
i386: Use builtin sqrtl

Checked on i686-linux-gnu.

4 years agox86_64: Use builtin sqrt{f,l}
Adhemerval Zanella [Thu, 4 Jun 2020 20:46:34 +0000 (20:46 +0000)]
x86_64: Use builtin sqrt{f,l}

Checked on x86_64-linux-gnu.

4 years agopowerpc: Use sqrt{f} builtin
Adhemerval Zanella [Thu, 4 Jun 2020 19:47:16 +0000 (22:47 +0300)]
powerpc: Use sqrt{f} builtin

The powerpc sqrt implementation is also simplified:

  - the static constants are open coded within the implementation.
  - for !USE_SQRT_BUILTIN the function is implemented directly on
    __ieee754_sqrt (this avoids a superfluous extra jump).

Checked on powerpc-linux-gnu and powerpc64le-linux-gnu.

4 years agos390x: Use fma{f} builtin
Adhemerval Zanella [Thu, 4 Jun 2020 17:22:20 +0000 (17:22 +0000)]
s390x: Use fma{f} builtin

Checked on s390x-linux-gnu.

4 years agoaarch64: Use math-use-builtins for ceil{f}
Adhemerval Zanella [Thu, 4 Jun 2020 14:35:48 +0000 (14:35 +0000)]
aarch64: Use math-use-builtins for ceil{f}

The define is already set in math-use-builtins-ceil.h; the patch
just removes the implementations (this was missed in c9feb1be93).

Checked on aarch64-linux-gnu.

4 years agomath: Decompose math-use-builtins.h
Adhemerval Zanella [Thu, 4 Jun 2020 14:32:17 +0000 (14:32 +0000)]
math: Decompose math-use-builtins.h

Each symbol definition is moved to a separate file that covers
all of its type variants (float, double, long double, and
float128).

This allows architectures to enable support without the boilerplate
of copying the default values.

Checked with a build on the affected ABIs.

4 years agohurd: Add mremap
Samuel Thibault [Sat, 20 Jun 2020 13:48:04 +0000 (13:48 +0000)]
hurd: Add mremap

* sysdeps/mach/hurd/mremap.c: New file.
* sysdeps/mach/hurd/Makefile [misc] (sysdep_routines): Add mremap.
* sysdeps/mach/hurd/Versions (libc.GLIBC_2.32): Add mremap.
* sysdeps/mach/hurd/i386/libc.abilist: Add mremap.

4 years agoia64: Use generic exp10f
Adhemerval Zanella [Thu, 9 Apr 2020 19:38:27 +0000 (16:38 -0300)]
ia64: Use generic exp10f

The generic implementation is slightly worse (Itanium(R) Processor 9020):

Before new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 3.61582e+08,
    "iterations": 2.384e+07,
    "reciprocal-throughput": 14.8334,
    "latency": 15.5006,
    "max-throughput": 6.74153e+07,
    "min-throughput": 6.45136e+07
   }
  }

With new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 3.85549e+08,
    "iterations": 2.384e+07,
    "reciprocal-throughput": 15.8391,
    "latency": 16.5056,
    "max-throughput": 6.31348e+07,
    "min-throughput": 6.05857e+07
   }
  }

However it fixes all the issues on both:

  math/test-float-exp10
  math/test-float32-exp10

(all of the issues are wrong results for non-default rounding modes).

The existing ia64 libm interface uses matherrf and matherrl in addition
to matherr for SVID error handling. However, there is no such error
handling support for exp10f in ia64 libm.  So replacing it with the
generic implementation should be fine.

Checked on ia64-linux-gnu.

4 years agoNew exp10f version without SVID compat wrapper
Adhemerval Zanella [Wed, 8 Apr 2020 22:51:44 +0000 (19:51 -0300)]
New exp10f version without SVID compat wrapper

This patch changes the exp10f error handling semantics to only set
errno according to POSIX rules.  New symbol version is introduced at
GLIBC_2.32.  The old wrappers are kept for compat symbols.

There are some outliers that need special handling:

  - ia64 provides an optimized implementation of exp10f that uses ia64
    specific routines to set SVID compatibility.  The new symbol version
    is aliased to the exp10f one.

  - m68k also provides an optimized implementation, and the new version
    uses it instead of the sysdeps/ieee754/flt32 one.

  - riscv and csky use the generic template implementation that
    does not provide SVID support.  In both cases a new exp10f
    version is not added; instead, the symbol version of the
    generic sysdeps/ieee754/flt32 implementation is adjusted.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
powerpc64le-linux-gnu.

4 years agoi386: Use generic exp10f
Adhemerval Zanella [Wed, 8 Apr 2020 20:42:46 +0000 (17:42 -0300)]
i386: Use generic exp10f

The generic implementation is twice as fast.  Using the exp10f
benchmark:

 * master:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 1.02967e+09,
    "iterations": 4.768e+07,
    "reciprocal-throughput": 18.3579,
    "latency": 24.8331,
    "max-throughput": 5.44725e+07,
    "min-throughput": 4.02688e+07
   }
  }

 * patched:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 1.01821e+09,
    "iterations": 6.1984e+07,
    "reciprocal-throughput": 13.1975,
    "latency": 19.6563,
    "max-throughput": 7.57719e+07,
    "min-throughput": 5.08743e+07
   }
  }

Checked on i686-linux-gnu.

4 years agomath: Optimized generic exp10f with wrappers
Paul Zimmermann [Wed, 8 Apr 2020 20:32:28 +0000 (17:32 -0300)]
math: Optimized generic exp10f with wrappers

It is inspired by expf and reuses its tables and internal functions.
The error checks are inlined and errno setting is in separate tail
called functions, but the wrappers are kept in this patch to handle
the _LIB_VERSION==_SVID_ case.

Double precision arithmetic is used, which is expected to be faster on
most targets (including soft-float) than single precision, and it
makes it easier to get a precise result.

Results for x86_64 (i7-4790K CPU @ 4.00GHz) are:

Before new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 4.0414e+09,
    "iterations": 1.00128e+08,
    "reciprocal-throughput": 26.6818,
    "latency": 54.043,
    "max-throughput": 3.74787e+07,
    "min-throughput": 1.85038e+07
   }
  }

With new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 4.11951e+09,
    "iterations": 1.23968e+08,
    "reciprocal-throughput": 21.0581,
    "latency": 45.4028,
    "max-throughput": 4.74876e+07,
    "min-throughput": 2.20251e+07
   }
  }

Results for aarch64 (A72 @ 2GHz) are:

Before new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 4.62362e+09,
    "iterations": 3.3376e+07,
    "reciprocal-throughput": 127.698,
    "latency": 149.365,
    "max-throughput": 7.831e+06,
    "min-throughput": 6.69501e+06
   }
  }

With new code:
  "exp10f": {
   "workload-spec2017.wrf (adapted)": {
    "duration": 4.29108e+09,
    "iterations": 6.6752e+07,
    "reciprocal-throughput": 51.2111,
    "latency": 77.3568,
    "max-throughput": 1.9527e+07,
    "min-throughput": 1.29271e+07
   }
  }

Checked on x86_64-linux-gnu, powerpc64le-linux-gnu, aarch64-linux-gnu,
and sparc64-linux-gnu.

4 years agobenchtests: Add exp10f benchmark
Adhemerval Zanella [Wed, 8 Apr 2020 20:00:06 +0000 (17:00 -0300)]
benchtests: Add exp10f benchmark

It is based on the expf benchmark, with each input line converted by the formula:

  new_val = (float) log10 (exp ((double) old_val))

4 years agox86: Update F16C detection [BZ #26133]
H.J. Lu [Thu, 18 Jun 2020 12:34:15 +0000 (05:34 -0700)]
x86: Update F16C detection [BZ #26133]

Since F16C requires AVX, set F16C usable only when AVX is usable.
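
A minimal sketch of that dependency, assuming the caller has already read
CPUID leaf 1 ECX and XCR0 (the function names are illustrative, not
glibc's; the bit positions are the architectural ones for OSXSAVE, AVX
and F16C):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits.  */
#define CPUID1_ECX_OSXSAVE (1u << 27)
#define CPUID1_ECX_AVX     (1u << 28)
#define CPUID1_ECX_F16C    (1u << 29)

/* XCR0 state bits that the OS must enable before AVX can be used.  */
#define XCR0_SSE_STATE (1u << 1)
#define XCR0_AVX_STATE (1u << 2)

/* AVX is usable only if the CPU reports it and the OS saves YMM state.  */
bool
avx_usable (uint32_t cpuid1_ecx, uint64_t xcr0)
{
  const uint64_t needed = XCR0_SSE_STATE | XCR0_AVX_STATE;
  return (cpuid1_ecx & CPUID1_ECX_OSXSAVE) != 0
         && (cpuid1_ecx & CPUID1_ECX_AVX) != 0
         && (xcr0 & needed) == needed;
}

/* F16C is marked usable only when AVX is usable as well.  */
bool
f16c_usable (uint32_t cpuid1_ecx, uint64_t xcr0)
{
  return (cpuid1_ecx & CPUID1_ECX_F16C) != 0
         && avx_usable (cpuid1_ecx, xcr0);
}

/* Usage example with made-up register values.  */
void
f16c_demo (void)
{
  uint32_t all = CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX | CPUID1_ECX_F16C;
  assert (f16c_usable (all, 0x7));              /* OS enabled YMM state */
  assert (!f16c_usable (all, 0x1));             /* OS did not enable it */
  assert (!f16c_usable (CPUID1_ECX_F16C, 0x7)); /* F16C bit without AVX */
}
```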

4 years agoFix avx2 strncmp offset compare condition check [BZ #25933]
Sunil K Pandey [Fri, 12 Jun 2020 15:57:16 +0000 (08:57 -0700)]
Fix avx2 strncmp offset compare condition check [BZ #25933]

strcmp-avx2.S: In the avx2 strncmp function, strings are compared in
chunks of 4 vectors (i.e. 32x4=128 bytes for avx2). After the first
4-vector comparison, the code must check whether it has already passed
the given offset. This patch implements the avx2 offset check for
strncmp when both strings compare equal over the first 4 vectors.
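
The condition can be modeled in scalar C.  This is only a sketch of the
control flow, not the assembly in strcmp-avx2.S; chunked_strncmp and
CHUNK are illustrative names.  The point is that after every 128-byte
chunk the remaining length is re-checked before any further bytes are
compared:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 128   /* 4 vectors of 32 bytes each */

/* Compare at most n bytes, one CHUNK at a time.  */
int
chunked_strncmp (const char *a, const char *b, size_t n)
{
  size_t offset = 0;

  /* Re-check the length limit before starting each new chunk.  */
  while (offset < n)
    {
      size_t limit = n - offset < CHUNK ? n - offset : CHUNK;
      for (size_t i = 0; i < limit; i++)
        {
          unsigned char ca = (unsigned char) a[offset + i];
          unsigned char cb = (unsigned char) b[offset + i];
          if (ca != cb)
            return ca < cb ? -1 : 1;
          if (ca == '\0')
            return 0;
        }
      offset += limit;
    }
  return 0;
}

/* Usage example: the strings differ only beyond the first chunk.  */
void
chunked_strncmp_demo (void)
{
  char a[200], b[200];
  memset (a, 'x', sizeof a);
  memset (b, 'x', sizeof b);
  a[199] = b[199] = '\0';
  b[130] = 'y';
  assert (chunked_strncmp (a, b, 128) == 0);   /* limit ends at the chunk */
  assert (chunked_strncmp (a, b, 200) < 0);    /* difference is reached */
}
```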

4 years agonptl: Remove now-spurious tst-cancelx9 references
Samuel Thibault [Wed, 17 Jun 2020 13:55:52 +0000 (15:55 +0200)]
nptl: Remove now-spurious tst-cancelx9 references

They were to be moved to sysdeps/pthread/Makefile in 45fce058f ('htl:
Enable more cancellation tests')

* nptl/Makefile: (tests): Remove tst-cancelx9.
(CFLAGS-tst-cancelx9.c): Remove.

4 years agox86_64: Use %xmmN with vpxor to clear a vector register
H.J. Lu [Thu, 11 Jun 2020 19:41:18 +0000 (12:41 -0700)]
x86_64: Use %xmmN with vpxor to clear a vector register

Since "vpxor %xmmN, %xmmN, %xmmN" clears the whole vector register, use
%xmmN, instead of %ymmN, with vpxor to clear a vector register.

4 years agox86: Correct bit_cpu_CLFLUSHOPT [BZ #26128]
H.J. Lu [Wed, 17 Jun 2020 12:32:37 +0000 (05:32 -0700)]
x86: Correct bit_cpu_CLFLUSHOPT [BZ #26128]

bit_cpu_CLFLUSHOPT should be (1u << 23), not (1u << 22).
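
A small sketch of the corrected test, assuming the caller passes in the
EBX value from CPUID leaf 7, subleaf 0 (has_clflushopt is an
illustrative name, not a glibc function):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CLFLUSHOPT is reported in CPUID.(EAX=7,ECX=0):EBX bit 23.  */
#define BIT_CPU_CLFLUSHOPT (1u << 23)

bool
has_clflushopt (uint32_t cpuid7_ebx)
{
  return (cpuid7_ebx & BIT_CPU_CLFLUSHOPT) != 0;
}

/* Usage example with made-up register values.  */
void
clflushopt_demo (void)
{
  assert (has_clflushopt (1u << 23));
  assert (!has_clflushopt (1u << 22));   /* the old, incorrect bit */
}
```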

4 years agopowerpc64le: refactor e_sqrtf128.c
Paul E. Murphy [Tue, 7 Apr 2020 21:20:51 +0000 (16:20 -0500)]
powerpc64le: refactor e_sqrtf128.c

Combine both implementations into a single file to allow
building twice with appropriate multiarch support when possible.

4 years agoUpdate syscall-names.list for Linux 5.7.
Joseph Myers [Mon, 15 Jun 2020 22:58:22 +0000 (22:58 +0000)]
Update syscall-names.list for Linux 5.7.

Linux 5.7 has no new syscalls.  Update the version number in
syscall-names.list to reflect that it is still current for 5.7.

Tested with build-many-glibcs.py.

4 years agoieee754/dbl-64: Reduce the scope of temporary storage variables
Vineet Gupta [Fri, 8 Nov 2019 19:32:00 +0000 (11:32 -0800)]
ieee754/dbl-64: Reduce the scope of temporary storage variables

This came to light when adding hard-float support to the ARC glibc port
without hardware sqrt support, which caused the glibc build to fail:

| ../sysdeps/ieee754/dbl-64/e_sqrt.c: In function '__ieee754_sqrt':
| ../sysdeps/ieee754/dbl-64/e_sqrt.c:58:54: error: unused variable 'ty' [-Werror=unused-variable]
|   double y, t, del, res, res1, hy, z, zz, p, hx, tx, ty, s;

The reason is that the EMULV() macro uses the hardware-provided
__builtin_fma() variant, leaving the temporary variables 'p, hx, tx, hy, ty'
unused, hence the compiler warning and ensuing error.

The intent of the patch was to fix that error, but EMULV is pervasive
and used a fair bit indirectly via other macros, hence this patch.
Functionally it should not result in code gen changes, and if it does,
those would be improvements, since the scope of those temporaries is
greatly reduced now.
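
A simplified sketch of the scoping idea, not the actual glibc macro text:
the temporaries are declared inside the macro body, so a caller never
declares them and nothing is left unused when the FMA variant is chosen.
The use of __FP_FAST_FMA to pick the variant and the helper
exact_product are assumptions made for this example:

```c
#include <assert.h>

#ifdef __FP_FAST_FMA
/* FMA variant: no temporaries are needed at all.  */
# define EMULV(x, y, z, zz)                                   \
  do                                                          \
    {                                                         \
      (z) = (x) * (y);                                        \
      (zz) = __builtin_fma ((x), (y), -(z));                  \
    }                                                         \
  while (0)
#else
/* Dekker variant: temporaries live only inside the macro's block.  */
# define CN 134217729.0   /* 2^27 + 1, the splitting constant */
# define EMULV(x, y, z, zz)                                   \
  do                                                          \
    {                                                         \
      double p_, hx_, tx_, hy_, ty_;                          \
      p_ = CN * (x); hx_ = ((x) - p_) + p_; tx_ = (x) - hx_;  \
      p_ = CN * (y); hy_ = ((y) - p_) + p_; ty_ = (y) - hy_;  \
      (z) = (x) * (y);                                        \
      (zz) = (((hx_ * hy_ - (z)) + hx_ * ty_) + tx_ * hy_)    \
             + tx_ * ty_;                                     \
    }                                                         \
  while (0)
#endif

/* Return the rounded product and store the rounding error in *err.  */
double
exact_product (double x, double y, double *err)
{
  double z, zz;   /* the caller declares only the two results */
  EMULV (x, y, z, zz);
  *err = zz;
  return z;
}

/* Usage example: (1 + 2^-30)^2 = 1 + 2^-29 + 2^-60.  */
void
emulv_demo (void)
{
  double err;
  assert (exact_product (3.0, 4.0, &err) == 12.0 && err == 0.0);
  assert (exact_product (1.0 + 0x1p-30, 1.0 + 0x1p-30, &err)
          == 1.0 + 0x1p-29);
  assert (err == 0x1p-60);
}
```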

Build tested with aarch64-linux-gnu arm-linux-gnueabi arm-linux-gnueabihf hppa-linux-gnu x86_64-linux-gnu arm-linux-gnueabihf riscv64-linux-gnu-rv64imac-lp64 riscv64-linux-gnu-rv64imafdc-lp64 powerpc-linux-gnu microblaze-linux-gnu nios2-linux-gnu hppa-linux-gnu

Also, as suggested by Joseph [1], used --strip and compared the libs with
and without the patch; they are byte-for-byte unchanged (with gcc 9).

| for i in `find . -name libm-2.31.9000.so`;
| do
|    echo $i; diff $i /SCRATCH/vgupta/gnu2/install/glibcs/$i ; echo $?;
| done

| ./aarch64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabi/lib/libm-2.31.9000.so
| 0
| ./x86_64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabihf/lib/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imac-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imafdc-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./powerpc-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./microblaze-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./nios2-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./hppa-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./s390x-linux-gnu/lib64/libm-2.31.9000.so

[1] https://sourceware.org/pipermail/libc-alpha/2019-November/108267.html

4 years agomanual: Add pthread_attr_setsigmask_np, pthread_attr_getsigmask_np
Florian Weimer [Mon, 15 Jun 2020 10:18:38 +0000 (12:18 +0200)]
manual: Add pthread_attr_setsigmask_np, pthread_attr_getsigmask_np

And the PTHREAD_ATTR_NO_SIGMASK_NP constant.

4 years agold.so: Check for new cache format first and enhance corruption check
Florian Weimer [Mon, 15 Jun 2020 07:50:14 +0000 (09:50 +0200)]
ld.so: Check for new cache format first and enhance corruption check

Now that ldconfig defaults to the new format (only), check for it
first.  Also apply the corruption check added in commit 2954daf00bb4d
("Add more checks for valid ld.so.cache file (bug 18093)") to the
new-format-only case.
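
A rough sketch of that ordering, assuming a simplified header layout (the
structs sketch_header and sketch_entry are hypothetical and smaller than
glibc's real cache structures; only the magic strings are the real
ones).  The new-format magic is tested first, and the entry count is
checked against the file size before the cache is accepted:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAGIC_NEW "glibc-ld.so.cache1.1"   /* new-format magic + version */
#define MAGIC_OLD "ld.so-1.7.0"            /* old-format magic */

/* Hypothetical, simplified layouts for illustration only.  */
struct sketch_header
{
  char magic[sizeof MAGIC_NEW - 1];
  uint32_t nlibs;
  uint32_t len_strings;
};
struct sketch_entry { int32_t flags; uint32_t key; uint32_t value; };

enum cache_kind { CACHE_NEW, CACHE_OLD, CACHE_INVALID };

enum cache_kind
classify_cache (const void *data, size_t size)
{
  struct sketch_header h;

  /* Check for the new format first.  */
  if (size >= sizeof h)
    {
      memcpy (&h, data, sizeof h);
      if (memcmp (h.magic, MAGIC_NEW, sizeof h.magic) == 0)
        {
          /* Corruption check: all nlibs entries must fit in the file.  */
          size_t room = (size - sizeof h) / sizeof (struct sketch_entry);
          return room >= h.nlibs ? CACHE_NEW : CACHE_INVALID;
        }
    }

  /* Only then fall back to the old magic.  */
  if (size >= sizeof MAGIC_OLD - 1
      && memcmp (data, MAGIC_OLD, sizeof MAGIC_OLD - 1) == 0)
    return CACHE_OLD;
  return CACHE_INVALID;
}

/* Usage example with an in-memory "file" holding two entries.  */
void
cache_demo (void)
{
  unsigned char file[sizeof (struct sketch_header)
                     + 2 * sizeof (struct sketch_entry)] = { 0 };
  struct sketch_header h = { .nlibs = 2 };
  memcpy (h.magic, MAGIC_NEW, sizeof h.magic);
  memcpy (file, &h, sizeof h);
  assert (classify_cache (file, sizeof file) == CACHE_NEW);

  h.nlibs = 3;   /* claims more entries than the file can hold */
  memcpy (file, &h, sizeof h);
  assert (classify_cache (file, sizeof file) == CACHE_INVALID);
}
```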

Suggested-by: Josh Triplett <josh@joshtriplett.org>
4 years agohurd: Fix __writev_nocancel_nostatus
Samuel Thibault [Sun, 14 Jun 2020 17:44:57 +0000 (17:44 +0000)]
hurd: Fix __writev_nocancel_nostatus

* sysdeps/mach/hurd/Makefile [subdir=misc] (sysdep_routines): Add
writev_nocancel writev_nocancel_nostatus.
* sysdeps/mach/hurd/not-cancel.h (__writev_nocancel_nostatus): Replace
macro with function declaration (with hidden prototype in libc).
(__writev_nocancel): New function declaration (with hidden prototype in libc).
* sysdeps/mach/hurd/writev_nocancel_nostatus.c: New file.
* sysdeps/posix/writev_nocancel.c: New file, includes writev.c to make a
nocancel variant that calls __write_nocancel.
* sysdeps/posix/writev.c (writev): Do not define alias if __writev is
renamed.

4 years agohurd: Make send* cancellation points
Samuel Thibault [Sun, 14 Jun 2020 17:09:59 +0000 (17:09 +0000)]
hurd: Make send* cancellation points

* sysdeps/mach/hurd/send.c (__send): Make the __socket_send call
a cancellation point.
* sysdeps/mach/hurd/sendto.c (__sendto): Likewise.
* sysdeps/mach/hurd/sendmsg.c (__libc_sendmsg): Likewise.

4 years agohtl: Enable more cancellation tests
Samuel Thibault [Sun, 14 Jun 2020 16:15:39 +0000 (16:15 +0000)]
htl: Enable more cancellation tests

* nptl/tst-cancel-self-cancelstate.c, tst-cancel-self.c, tst-cancel9.c,
tst-cancelx9.c: Move to...
* sysdeps/pthread: ... here.
* nptl/Makefile: Move corresponding references and rules to...
* sysdeps/pthread/Makefile: ... here.

4 years agohurd: Make write and pwrite64 cancellation points
Samuel Thibault [Sun, 14 Jun 2020 15:50:44 +0000 (15:50 +0000)]
hurd: Make write and pwrite64 cancellation points

and add _nocancel variants.

* sysdeps/mach/hurd/write.c (__libc_write): Call __write_nocancel
surrounded by enabling async cancel, to replace implementation moved
to...
* sysdeps/mach/hurd/write_nocancel.c (__write_nocancel): ... here.
* sysdeps/mach/hurd/pwrite64.c (__libc_pwrite64): Call
__pwrite64_nocancel surrounded by enabling async cancel, to replace
implementation moved to...
* sysdeps/mach/hurd/pwrite64_nocancel.c (__pwrite64_nocancel): ... here.
* sysdeps/mach/hurd/Makefile (sysdep_routines): Add write_nocancel and
pwrite64_nocancel.
* sysdeps/mach/hurd/not-cancel.h (__write_nocancel,
__pwrite64_nocancel): Replace macros with prototypes with a hidden proto on
libc.

* sysdeps/mach/hurd/dl-sysdep.c (__write_nocancel): New alias, check
that it is not hidden.
* sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE): Add __write_nocancel.
(ld.GLIBC_PRIVATE): Add __write_nocancel.
* sysdeps/mach/hurd/i386/localplt.data (__write_nocancel): Add
reference.

4 years agohtl: Fix cleanup support for IO locking
Samuel Thibault [Sun, 14 Jun 2020 14:48:07 +0000 (14:48 +0000)]
htl: Fix cleanup support for IO locking

* sysdeps/htl/stdio-lock.h: New file, registers locking cleanup to htl.
* sysdeps/htl/libc-lockP.h: Include <libc-lock.h>.
(__libc_cleanup_region_start, __libc_cleanup_end,
__libc_cleanup_region_end): Override macros from <libc-lock.h> with
versions which register cleanup to htl.
(__pthread_get_cleanup_stack): Make reference weak to skip
registration in the static non-libpthread case.

4 years agohtl: Move cleanup stack to variable shared between libc and pthread
Samuel Thibault [Sun, 14 Jun 2020 12:56:54 +0000 (12:56 +0000)]
htl: Move cleanup stack to variable shared between libc and pthread

If libpthread gets loaded dynamically, the stack needs to already contain the
cleanup handlers of the main thread.

* htl/libc_pthread_init.c (__pthread_cleanup_stack): New per-thread variable.
* htl/Versions (libc): Add __pthread_cleanup_stack as private symbol.
* htl/pt-internal.h (struct __pthread): Remove cancelation_handlers
field.
(__pthread_cleanup_stack): Add variable declaration.
* htl/pt-alloc.c (initialize_pthread): Remove initialization of
cancelation_handlers field.
* htl/pt-cleanup.c (__pthread_get_cleanup_stack): Return the address of
__pthread_cleanup_stack instead of that of the cancelation_handlers
field.
* htl/forward.c: Include <pt-internal.h>.
(dummy_list): Remove variable.
(__pthread_get_cleanup_stack): Return the address of __pthread_cleanup_stack
instead of that of dummy_list.

4 years agohtl: initialize first and prevent from unloading
Samuel Thibault [Sun, 14 Jun 2020 15:47:14 +0000 (15:47 +0000)]
htl: initialize first and prevent from unloading

libc does not have codepaths for reverting the load of a libpthread.

* htl/Makefile (LDFLAGS-pthread.so): Pass -z nodelete -z initfirst to
linker.

4 years agohtl: Add noreturn attribute on __pthread_exit forward
Samuel Thibault [Sun, 14 Jun 2020 12:53:38 +0000 (12:53 +0000)]
htl: Add noreturn attribute on __pthread_exit forward

* sysdeps/htl/pthread-functions.h (__pthread_exit): Add noreturn
attribute.
(struct pthread_functions): Add noreturn attribute on ptr___pthread_exit
field.

4 years agohurd: Make recv* cancellation points
Samuel Thibault [Sun, 14 Jun 2020 00:19:35 +0000 (00:19 +0000)]
hurd: Make recv* cancellation points

* sysdeps/mach/hurd/recv.c (__recv): Make the __socket_recv call
cancellable.
* sysdeps/mach/hurd/recvfrom.c (__recvfrom): Make the __socket_recv and
__socket_whatis_address calls cancellable.
* sysdeps/mach/hurd/recvmsg.c (__libc_recvmsg): Make the __socket_recv,
__socket_whatis_address, __io_reauthenticate, and __auth_user_authenticate calls
cancellable.

4 years agopowerpc: Automatic CPU detection in preconfigure
Paul E. Murphy [Fri, 8 May 2020 13:27:56 +0000 (08:27 -0500)]
powerpc: Automatic CPU detection in preconfigure

Add a check to detect the CPU value in preconfigure, so that glibc is
built with the correct --with-cpu value.  Also move the existing checks
into preconfigure.ac.

Co-Authored-By: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Co-Authored-By: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
4 years agoUse Linux 5.7 in build-many-glibcs.py.
Joseph Myers [Wed, 10 Jun 2020 22:53:55 +0000 (22:53 +0000)]
Use Linux 5.7 in build-many-glibcs.py.

This patch makes build-many-glibcs.py use Linux 5.7.

Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).

4 years agohtl: Enable more cancel tests
Samuel Thibault [Wed, 10 Jun 2020 20:29:21 +0000 (20:29 +0000)]
htl: Enable more cancel tests

* nptl/tst-cancel11.c, tst-cancel21-static.c, tst-cancel21.c, tst-cancel6.c, tst-cancelx11.c, tst-cancelx21.c, tst-cancelx6.c: Move to...
* sysdeps/pthread: ... here.
* nptl/Makefile: Move corresponding references and rules to...
* sysdeps/pthread/Makefile: ... here.

4 years agohtl: Fix linking static tests by factorizing the symbols list
Samuel Thibault [Wed, 10 Jun 2020 20:03:52 +0000 (20:03 +0000)]
htl: Fix linking static tests by factorizing the symbols list

libpthread_syms.a will contain the symbols that libc tries to get from
libpthread, to be used by the system, but also by tests.

* htl/libpthread.a, htl/libpthread_pic.a: Link libpthread_syms.a and move EXTERN
references to...
* htl/libpthread_syms.a: ... new file. Add missing
__pthread_enable_asynccancel reference.
* htl/Makefile: Install libpthread_syms.a and link it into static tests.

4 years agoAdd "%d" support to _dl_debug_vdprintf
H.J. Lu [Tue, 9 Jun 2020 19:15:01 +0000 (12:15 -0700)]
Add "%d" support to _dl_debug_vdprintf

"%d" will be used to print out signed value.

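A signed conversion of this kind can be written without any libc
formatting routines, which is what a dynamic loader needs.  The sketch
below is not the _dl_debug_vdprintf code; signed_to_decimal and
DECIMAL_BUF_SIZE are illustrative names.  It builds the digits from the
end of a caller-supplied buffer and negates through an unsigned type so
that the most negative value is handled:

```c
#include <assert.h>
#include <string.h>

/* Three digits per byte is a safe upper bound; add sign and NUL.  */
#define DECIMAL_BUF_SIZE (sizeof (long) * 3 + 2)

/* Format value into buf (DECIMAL_BUF_SIZE bytes) and return a pointer
   to the first character of the result inside buf.  */
const char *
signed_to_decimal (long value, char *buf)
{
  unsigned long magnitude = value < 0
                            ? 0UL - (unsigned long) value
                            : (unsigned long) value;
  char *p = buf + DECIMAL_BUF_SIZE - 1;

  *p = '\0';
  do
    {
      *--p = (char) ('0' + magnitude % 10);
      magnitude /= 10;
    }
  while (magnitude != 0);
  if (value < 0)
    *--p = '-';
  return p;
}

/* Usage example.  */
void
signed_to_decimal_demo (void)
{
  char buf[DECIMAL_BUF_SIZE];
  assert (strcmp (signed_to_decimal (-42, buf), "-42") == 0);
  assert (strcmp (signed_to_decimal (0, buf), "0") == 0);
  assert (strcmp (signed_to_decimal (2147483647L, buf), "2147483647") == 0);
}
```
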
4 years agoaarch64: MTE compatible strlen
Andrea Corallo [Fri, 5 Jun 2020 15:22:26 +0000 (17:22 +0200)]
aarch64: MTE compatible strlen

Introduce an Arm MTE compatible strlen implementation.

The existing implementation assumes that any access to the pages in
which the string resides is safe.  This assumption is not true when
MTE is enabled.  This patch updates the algorithm to ensure that
accesses remain within the bounds of an MTE tag (16-byte chunks) and
improves overall performance on modern cores. On cores with a less
efficient Advanced SIMD implementation, such as Cortex-A53, it can
be slower.
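
The access pattern can be modeled in portable C.  This is a sketch of the
idea only, not the Advanced SIMD implementation: every wide load starts
at a 16-byte aligned address and covers exactly one tag granule, and
bytes in front of the string start are ignored.  The names
granule_strlen and granule_strlen_demo are illustrative, and the demo
keeps the string inside an aligned buffer so that each load stays
within the same object:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE 16   /* MTE tag granule size in bytes */

size_t
granule_strlen (const char *s)
{
  uintptr_t start = (uintptr_t) s;
  uintptr_t chunk = start & ~(uintptr_t) (GRANULE - 1);   /* align down */
  size_t skip = start - chunk;   /* bytes in front of the string */

  for (;;)
    {
      unsigned char bytes[GRANULE];

      /* One aligned 16-byte load; it never crosses a granule boundary.  */
      memcpy (bytes, (const void *) chunk, GRANULE);
      for (size_t i = skip; i < GRANULE; i++)
        if (bytes[i] == '\0')
          return (size_t) (chunk + i - start);
      skip = 0;
      chunk += GRANULE;
    }
}

/* Usage example: an unaligned string that spans two granules.  */
size_t
granule_strlen_demo (void)
{
  static _Alignas (GRANULE) char buf[4 * GRANULE];
  strcpy (buf + 5, "hello, granule");
  return granule_strlen (buf + 5);
}
```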

Benchmarked on Cortex-A72, Cortex-A53, Neoverse N1.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
4 years agoaarch64: MTE compatible strchr
Andrea Corallo [Fri, 5 Jun 2020 15:20:50 +0000 (17:20 +0200)]
aarch64: MTE compatible strchr

Introduce an Arm MTE compatible strchr implementation.

The existing implementation assumes that any access to the pages in
which the string resides is safe.  This assumption is not true when
MTE is enabled.  This patch updates the algorithm to ensure that
accesses remain within the bounds of an MTE tag (16-byte chunks) and
improves overall performance.

Benchmarked on Cortex-A72, Cortex-A53, Neoverse N1.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
4 years agoaarch64: MTE compatible strchrnul
Andrea Corallo [Fri, 5 Jun 2020 15:18:49 +0000 (17:18 +0200)]
aarch64: MTE compatible strchrnul

Introduce an Arm MTE compatible strchrnul implementation.

The existing implementation assumes that any access to the pages in
which the string resides is safe.  This assumption is not true when
MTE is enabled.  This patch updates the algorithm to ensure that
accesses remain within the bounds of an MTE tag (16-byte chunks) and
improves overall performance.

Benchmarked on Cortex-A72, Cortex-A53, Neoverse N1.

Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>