Sergey Bugaev [Fri, 14 Apr 2023 19:37:00 +0000 (22:37 +0300)]
hurd: Avoid leaking task & thread ports
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Sergey Bugaev [Fri, 14 Apr 2023 19:36:59 +0000 (22:36 +0300)]
hurd: Simplify _S_catch_exception_raise
_hurd_thread_sigstate () already handles finding an existing sigstate
before allocating a new one, so just use that. Bonus: this will only
lock the _hurd_siglock once.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Sergey Bugaev [Sat, 15 Apr 2023 19:08:56 +0000 (22:08 +0300)]
hurd: Run init_pids () before init_dtable ()
Much as the comment says, things on _hurd_subinit assume that _hurd_pid
is already initialized by the time _hurd_subinit is run, so
_hurd_proc_subinit has to run before it. Specifically, init_dtable ()
calls _hurd_port2fd (), which uses _hurd_pid and _hurd_pgrp to set up
ctty handling. With _hurd_subinit running before _hurd_proc_subinit,
ctty setup was broken:
13<--33(pid1255)->term_getctty () = 0 4<--39(pid1255)
task16(pid1255)->mach_port_deallocate (pn{ 10}) = 0
13<--33(pid1255)->term_open_ctty (0 0) = 0x40000016 (Invalid argument)
Fix this by running the _hurd_proc_subinit hook in the correct place --
just after _hurd_portarray is set up (so the proc server port is
available in its usual place) and just before running _hurd_subinit.
Fixes 1ccbb9258eed0f667edf459a28ba23a805549b36
("hurd: Notify the proc server later during initialization").
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Sergey Bugaev [Fri, 14 Apr 2023 19:36:56 +0000 (22:36 +0300)]
hurd: Fix restoring reply port in sigreturn
We must not use the user's reply port (scp->sc_reply_port) for any of
our own RPCs, otherwise various things break. So, use MACH_PORT_DEAD as
a reply port when destroying our reply port, and make sure to do this
after _hurd_sigstate_unlock (), which may do a gsync_wake () RPC.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Florian Weimer [Mon, 17 Apr 2023 13:41:08 +0000 (15:41 +0200)]
wcsmbs: Re-flow and sort routines, tests variables in Makefile
Eliminate strop-tests because it does not seem to be a simplification.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Florian Weimer [Mon, 17 Apr 2023 13:41:08 +0000 (15:41 +0200)]
debug: Re-flow and sort routines variable in Makefile
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Sergey Bugaev [Sat, 15 Apr 2023 16:17:18 +0000 (19:17 +0300)]
hurd: Avoid extra ctty RPCs in init_dtable ()
It is common to have (some of) stdin, stdout and stderr point to the
very same port. We were making the ctty RPCs that _hurd_port2fd () does
for each one of them separately:
1. term_getctty ()
2. mach_port_deallocate ()
3. term_open_ctty ()
Instead, let's detect this case and duplicate the ctty port we already
have. This means we do 1 RPC instead of 3 (and create a single protid
on the server side) if the file is our ctty, and no RPCs instead of 1
if it's not. A clear win!
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Wilco Dijkstra [Mon, 17 Apr 2023 11:42:18 +0000 (12:42 +0100)]
math: Improve fmod(f) performance
Optimize the fast paths (x < y) and (x/y < 2^12). Delay handling of special
cases to reduce the number of instructions executed before the fast paths.
Performance improvements for fmod:
Input           | Skylake | Zen2  | Neoverse V1
----------------|---------|-------|------------
subnormals      | 11.8%   | 4.2%  | 11.5%
normal          | 3.9%    | 0.01% | -0.5%
close-exponents | 6.3%    | 5.6%  | 19.4%
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Wilco Dijkstra [Tue, 21 Mar 2023 14:00:22 +0000 (14:00 +0000)]
Benchtests: Adjust timing
Adjust iteration counts so benchmarks don't run too slowly or quickly.
Ensure benchmarks take less than 10 seconds on older, slower cores and
more than 0.5 seconds on fast cores.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:07 +0000 (18:10 +0300)]
hurd: Only check for TLS initialization inside rtld or in static builds
When glibc is built as a shared library, TLS is always initialized by
the call of TLS_INIT_TP () macro made inside the dynamic loader, prior
to running the main program (see dl-call_tls_init_tp.h). We can take
advantage of this: we know for sure that __LIBC_NO_TLS () will evaluate
to 0 in all other cases, so let the compiler know that explicitly too.
Also, only define _hurd_tls_init () and TLS_INIT_TP () under the same
conditions (either !SHARED or inside rtld), to statically assert that
this is the case.
Other than a microoptimization, this also helps with avoiding awkward
sharing of the __libc_tls_initialized variable between ld.so and libc.so
that we would have to do otherwise -- we know for sure that no sharing
is required, simply because __libc_tls_initialized would always be set
to true inside libc.so.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-25-bugaevc@gmail.com>
Sergey Bugaev [Thu, 13 Apr 2023 11:58:12 +0000 (14:58 +0300)]
hurd: Remove __hurd_local_reply_port
Now that the signal code no longer accesses it, the only real user of it
was mig-reply.c, so move the logic for managing the port there.
If we're in SHARED and outside of rtld, we know that __LIBC_NO_TLS ()
always evaluates to 0, and a TLS reply port will always be used, not
__hurd_reply_port0. Still, the compiler does not see that
__hurd_reply_port0 is never used due to its address being taken. To deal
with this, explicitly compile out __hurd_reply_port0 when we know we
won't use it.
Also, instead of accessing the port via THREAD_SELF->reply_port, this
uses THREAD_GETMEM and THREAD_SETMEM directly, avoiding possible
miscompilations.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Adhemerval Zanella [Fri, 14 Apr 2023 11:22:40 +0000 (08:22 -0300)]
malloc: Assure that THP mode read does not write OOB past end of string
Adhemerval Zanella [Wed, 12 Apr 2023 12:36:54 +0000 (09:36 -0300)]
malloc: Assure that THP mode is always null terminated
Samuel Thibault [Thu, 13 Apr 2023 00:02:37 +0000 (02:02 +0200)]
hurd: Mark two tests as unsupported
They make the whole testsuite hang/crash.
Samuel Thibault [Wed, 12 Apr 2023 20:44:50 +0000 (20:44 +0000)]
hurd: Restore destroying receive rights on sigreturn
Just subtracting a ref is making signal/tst-signal signal/tst-raise
signal/tst-minsigstksz-5 htl/tst-raise1 fail.
Samuel Thibault [Tue, 11 Apr 2023 22:12:02 +0000 (00:12 +0200)]
aio: Fix freeing memory
The content of the pool array is initialized only until pool_size,
pointers between pool_size and pool_max_size were not initialized by the
realloc call in get_elem so they should not be freed.
This fixes aio tests crashing at their termination on GNU/Hurd.
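The rule described above (only entries below pool_size are valid, because realloc does not initialize the newly added tail) can be illustrated with a minimal sketch. The demo_pool type and its helpers are hypothetical names chosen for this illustration; they are not the actual aio code.

```c
#include <stdlib.h>

/* Hypothetical growable pool of row pointers.  */
struct demo_pool
{
  void **rows;      /* Array of row pointers.  */
  size_t size;      /* Number of initialized entries.  */
  size_t max_size;  /* Allocated capacity of ROWS.  */
};

/* Append one row, growing the array when it is full.  realloc leaves
   the entries in [size, max_size) uninitialized.  */
static int
demo_pool_add (struct demo_pool *p)
{
  if (p->size == p->max_size)
    {
      size_t new_max = p->max_size ? 2 * p->max_size : 4;
      void **rows = realloc (p->rows, new_max * sizeof *rows);
      if (rows == NULL)
        return -1;
      p->rows = rows;
      p->max_size = new_max;
    }
  void *row = malloc (16);
  if (row == NULL)
    return -1;
  p->rows[p->size++] = row;
  return 0;
}

/* Free only the initialized entries: iterate up to size, not max_size.
   Returns the number of rows freed.  */
static size_t
demo_pool_free (struct demo_pool *p)
{
  size_t freed = 0;
  for (size_t i = 0; i < p->size; i++)
    {
      free (p->rows[i]);
      freed++;
    }
  free (p->rows);
  p->rows = NULL;
  p->size = p->max_size = 0;
  return freed;
}
```

Looping up to max_size instead would pass indeterminate pointers to free, which is the kind of crash at termination that this commit describes.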
Samuel Thibault [Tue, 11 Apr 2023 18:06:03 +0000 (18:06 +0000)]
Revert "hurd: Only check for TLS initialization inside rtld or in static builds"
This reverts commit b37899d34d2190ef4b454283188f22519f096048.
Apparently we load libc.so (and thus start using its functions) before
calling TLS_INIT_TP, so libc.so functions should not actually assume
that TLS is always set up.
Sergey Bugaev [Sun, 19 Mar 2023 15:10:10 +0000 (18:10 +0300)]
hurd: Don't leak __hurd_reply_port0
Previously, once we set up TLS, we would implicitly switch from using
__hurd_reply_port0 to reply_port inside the TCB, leaving the former
unused. But we never deallocated it, so it got leaked.
Instead, migrate the port into the new TCB's reply_port slot. This
avoids both the port leak and an extra syscall to create a new reply
port for the TCB.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-28-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:08 +0000 (18:10 +0300)]
hurd: Improve reply port handling when exiting signal handlers
If we're doing signals, that means we've already got the signal thread
running, and that implies TLS having been set up. So we know that
__hurd_local_reply_port will resolve to THREAD_SELF->reply_port, and can
access that directly using the THREAD_GETMEM and THREAD_SETMEM macros.
This avoids potential miscompilations, and should also be a tiny bit
faster.
Also, use mach_port_mod_refs () and not mach_port_destroy () to destroy
the receive right. mach_port_destroy () should *never* be used on
mach_task_self (); this can easily lead to port use-after-free
vulnerabilities if the task has any other references to the same port.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-26-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:07 +0000 (18:10 +0300)]
hurd: Only check for TLS initialization inside rtld or in static builds
When glibc is built as a shared library, TLS is always initialized by
the call of TLS_INIT_TP () macro made inside the dynamic loader, prior
to running the main program (see dl-call_tls_init_tp.h). We can take
advantage of this: we know for sure that __LIBC_NO_TLS () will evaluate
to 0 in all other cases, so let the compiler know that explicitly too.
Also, only define _hurd_tls_init () and TLS_INIT_TP () under the same
conditions (either !SHARED or inside rtld), to statically assert that
this is the case.
Other than a microoptimization, this also helps with avoiding awkward
sharing of the __libc_tls_initialized variable between ld.so and libc.so
that we would have to do otherwise -- we know for sure that no sharing
is required, simply because __libc_tls_initialized would always be set
to true inside libc.so.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-25-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:06 +0000 (18:10 +0300)]
elf: Stop including tls.h in ldsodefs.h
Nothing in there needs tls.h
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-24-bugaevc@gmail.com>
Sergey Bugaev [Mon, 3 Apr 2023 11:56:21 +0000 (14:56 +0300)]
hurd: Port trampoline.c to x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230403115621.258636-3-bugaevc@gmail.com>
Sergey Bugaev [Mon, 3 Apr 2023 11:56:20 +0000 (14:56 +0300)]
hurd: Do not declare local variables volatile
These are just regular local variables that are not accessed in any
funny ways, not even through a pointer. There's absolutely no reason to
declare them volatile. It only ends up hurting the quality of the
generated machine code.
If anything, it would make sense to declare sigsp as *pointing* to
volatile memory (volatile void *sigsp), but evidently that's not needed
either.
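The distinction drawn above, between a volatile local variable and a pointer to volatile memory, can be sketched as follows. The function names are hypothetical and are not taken from trampoline.c.

```c
/* Pointer to volatile data: each access through P must really be
   performed, so the two loads below cannot be merged.  */
static int
demo_read_twice (volatile int *p)
{
  int first = *p;
  int second = *p;
  return first + second;
}

/* Plain locals: the compiler is free to keep TOTAL and I in registers.
   Declaring them volatile would force a memory access on every use
   without changing the result.  */
static int
demo_sum_to (int n)
{
  int total = 0;
  for (int i = 1; i <= n; i++)
    total += i;
  return total;
}
```

In the first function the qualifier applies to the pointed-to memory (as in volatile void *sigsp); in the second, no qualifier is needed at all.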
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230403115621.258636-2-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:00 +0000 (18:10 +0300)]
hurd: Implement x86_64/intr-msg.h
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-18-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:59 +0000 (18:09 +0300)]
hurd: Add sys/ucontext.h and sigcontext.h for x86_64
This is based on the Linux port's version, but laid out to match Mach's
struct i386_thread_state, much like the i386 version does.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Flavio Cruz [Mon, 10 Apr 2023 02:42:52 +0000 (22:42 -0400)]
hurd: Stop depending on the default_pager stubs provided by gnumach
The hurd source tree already provides the same stubs and they are only
needed there.
Message-Id: <ZDN3rDdjMowtUWf7@jupiter.tail36e24.ts.net>
Paul Eggert [Sat, 8 Apr 2023 20:51:26 +0000 (13:51 -0700)]
manual: update AddressSanitizer discussion
* manual/string.texi (Truncating Strings): Update obsolescent
reference and use the more-generic term “AddressSanitizer”.
Mention fortification, too. -fcheck-pointer-bounds is no longer
supported.
Paul Eggert [Sat, 8 Apr 2023 20:51:26 +0000 (13:51 -0700)]
manual: document snprintf truncation better
Paul Eggert [Sat, 8 Apr 2023 20:51:26 +0000 (13:51 -0700)]
manual: improve string section wording
* manual/string.texi: Editorial fixes. Do not say “text” when
“string” or “string contents” is meant, as a C string can contain
bytes that are not valid text in the current encoding.
When warning about strcat efficiency, warn similarly about strncat
and wcscat. “coping” → “copying”.
Mention at the start of the two problematic sections that problems
are discussed at section end.
Paul Eggert [Sat, 8 Apr 2023 20:51:26 +0000 (13:51 -0700)]
manual: fix texinfo typo
* manual/creature.texi (Feature Test Macros): Fix
“creature.texi:309: warning: `.' or `,' must follow @xref, not f”.
Florian Weimer [Thu, 6 Apr 2023 14:40:44 +0000 (16:40 +0200)]
<stdio.h>: Make fopencookie, vasprintf, asprintf available by default
FreeBSD makes these functions available by default, so we should
not treat them as GNU-specific and restrict them to _GNU_SOURCE.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Florian Weimer [Thu, 6 Apr 2023 14:40:44 +0000 (16:40 +0200)]
<string.h>: Make strchrnul, strcasestr, memmem available by default
FreeBSD makes them available by default, too, so there does not seem
to be a reason to restrict these functions to _GNU_SOURCE.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
H.J. Lu [Wed, 5 Apr 2023 16:21:44 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add PREFETCHI support
Add PREFETCHI support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:43 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add AMX-COMPLEX support
Add AMX-COMPLEX support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:42 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add AVX-NE-CONVERT support
Add AVX-NE-CONVERT support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:41 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add AVX-VNNI-INT8 support
Add AVX-VNNI-INT8 support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:40 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add MSRLIST support
Add MSRLIST support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:39 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add AVX-IFMA support
Add AVX-IFMA support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:38 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add AMX-FP16 support
Add AMX-FP16 support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:37 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add WRMSRNS support
Add WRMSRNS support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:36 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add ArchPerfmonExt support
Add Architectural Performance Monitoring Extended Leaf (EAX = 23H)
support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:35 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add CMPCCXADD support
Add CMPCCXADD support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:34 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add LASS support
Add Linear Address Space Separation (LASS) support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:33 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add RAO-INT support
Add RAO-INT support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:32 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add LBR support
Add architectural LBR support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:31 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add RTM_FORCE_ABORT support
Add RTM_FORCE_ABORT support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:30 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add SGX-KEYS support
Add SGX-KEYS support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:29 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add BUS_LOCK_DETECT support
Add Bus lock debug exceptions (BUS_LOCK_DETECT) support to
<sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:28 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add LA57 support
Add 57-bit linear addresses and five-level paging (LA57) support to
<sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:27 +0000 (09:21 -0700)]
platform.texi: Move LAM after LAHF64_SAHF64
Move LAM after LAHF64_SAHF64 to sort x86 features.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:26 +0000 (09:21 -0700)]
<bits/platform/x86.h>: Rename to x86_cpu_INDEX_7_ECX_15
Rename x86_cpu_INDEX_7_ECX_1 to x86_cpu_INDEX_7_ECX_15 for the unused bit
15 in ECX from CPUID with EAX == 0x7 and ECX == 0.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
John David Anglin [Wed, 5 Apr 2023 18:54:47 +0000 (18:54 +0000)]
hppa: Update struct __pthread_rwlock_arch_t comment.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
John David Anglin [Wed, 5 Apr 2023 18:35:38 +0000 (18:35 +0000)]
hppa: Revise __TIMESIZE define to use __WORDSIZE
Handle both 32 and 64-bit ABIs.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Adhemerval Zanella [Tue, 4 Apr 2023 19:44:40 +0000 (16:44 -0300)]
libio: Remove unused pragma weak on vtable
Both _IO_file_jumps_alias and _IO_wfile_jumps_alias are defined as
aliases.
Adhemerval Zanella [Tue, 4 Apr 2023 19:42:33 +0000 (16:42 -0300)]
malloc: Only set pragma weak for rpc freemem if required
Both __rpc_freemem and __rpc_thread_destroy are only used if
the compat symbols are required.
Guy-Fleury Iteriteka [Sat, 18 Mar 2023 09:58:26 +0000 (11:58 +0200)]
htl: move pthread_self into libc.
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230318095826.1125734-4-gfleury@disroot.org>
Guy-Fleury Iteriteka [Sat, 18 Mar 2023 09:58:25 +0000 (11:58 +0200)]
htl: move ___pthread_self into libc.
sysdeps/mach/hurd/htl/pt-pthread_self.c: New file.
htl/Makefile: Add it to libc routines.
sysdeps/mach/hurd/htl/pt-sysdep.c (__pthread_self): Remove it.
sysdeps/mach/hurd/htl/pt-sysdep.h (__pthread_self): Add hidden property.
htl/Versions (__pthread_self): Version it as a private symbol.
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230318095826.1125734-3-gfleury@disroot.org>
Guy-Fleury Iteriteka [Sat, 18 Mar 2023 09:58:24 +0000 (11:58 +0200)]
htl: move __pthread_total into libc
htl/pt-nthreads.c: New file.
htl/Makefile: Add it to routines.
htl/Versions: Version it as a private libc symbol.
htl/pt-create.c: Remove its definition here.
htl/pt-internal.h: Add property to its declaration.
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230318095826.1125734-2-gfleury@disroot.org>
Nisha Menon [Mon, 3 Apr 2023 14:11:05 +0000 (10:11 -0400)]
compare_strings.py : Add --gmean flag
To calculate geometric mean for string benchmark results.
Signed-off-by: Nisha Poyarekar <nisha.s.menon@gmail.com>
Andreas Schwab [Thu, 9 Feb 2023 13:56:21 +0000 (14:56 +0100)]
x86/dl-cacheinfo: remove unused parameter from handle_amd
Also replace an unreachable assert with __builtin_unreachable.
Adhemerval Zanella [Mon, 3 Apr 2023 19:10:43 +0000 (16:10 -0300)]
powerpc: Disable stack protector in early static initialization
Similar to fb95c316382679c0826cc8399760977cd95f15c9, also disable it
for string-ppc64.c (pulled into rtld as the default string
implementation).
Checked on powerpc64-linux-gnu.
Adhemerval Zanella [Mon, 3 Apr 2023 17:18:14 +0000 (14:18 -0300)]
nptl: Fix tst-cancel30 on sparc64
As indicated by sparc kernel-features.h, even though sparc64 defines
__NR_pause, it is not supported (ENOSYS). Always use ppoll or the
64 bit time_t variant instead.
Adhemerval Zanella Netto [Mon, 20 Mar 2023 16:01:18 +0000 (13:01 -0300)]
math: Remove the error handling wrapper from fmod and fmodf
The error handling is moved to sysdeps/ieee754 version with no SVID
support. The compatibility symbol versions still use the wrapper
with SVID error handling around the new code. There is no new symbol
version nor compatibility code on !LIBM_SVID_COMPAT targets
(e.g. riscv).
The ia64 is unchanged, since it still uses the arch specific
__libm_error_region on its implementation. For both i686 and m68k,
which provide arch-specific implementations, wrappers are added so
no new symbols are added (which would require changing the
implementations).
It shows a small improvement; the results for fmod:
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 12.5049 | 9.40992
x86_64 (Ryzen 9) | normal | 296.939 | 296.738
x86_64 (Ryzen 9) | close-exponents | 16.0244 | 13.119
aarch64 (N1) | subnormal | 6.81778 | 4.33313
aarch64 (N1) | normal | 155.620 | 152.915
aarch64 (N1) | close-exponents | 8.21306 | 5.76138
armhf (N1) | subnormal | 15.1083 | 14.5746
armhf (N1) | normal | 244.833 | 241.738
armhf (N1) | close-exponents | 21.8182 | 22.457
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Adhemerval Zanella Netto [Mon, 20 Mar 2023 16:01:17 +0000 (13:01 -0300)]
math: Improve fmodf
This uses a new algorithm similar to one already proposed earlier [1].
With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers),
the simplest implementation is:
mx * 2^ex == 2 * mx * 2^(ex - 1)
while (ex > ey)
{
mx *= 2;
--ex;
mx %= my;
}
With mx/my being the mantissas of single-precision floating-point
numbers, each step of the argument reduction can be improved to consume
8 bits (the width of uint32_t minus MANTISSA_WIDTH and the sign bit):
while (ex > ey)
{
mx <<= 8;
ex -= 8;
mx %= my;
}
The implementation uses builtin clz and ctz, along with shifts to
convert hx/hy back to floats. Unlike the original patch,
this patch assumes that the modulo/divide operation is slow, so it uses
multiplication by inverted values.
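The per-step shift widths quoted in these fmod/fmodf commits (8 for float, 11 for double) follow from simple arithmetic on the word size. The sketch below recomputes them from <float.h>; the demo_shift_width helper and its parameter names are hypothetical, and FLT_MANT_DIG/DBL_MANT_DIG count the implicit leading bit, so the stored mantissa width is one less.

```c
#include <float.h>
#include <limits.h>
#include <stdint.h>

/* Spare bits left in a WORD_BITS-wide integer once the stored mantissa
   (MANT_DIG - 1 bits, without the implicit leading one) and the sign
   bit are accounted for.  */
static int
demo_shift_width (int word_bits, int mant_dig)
{
  int stored_mantissa_bits = mant_dig - 1;
  return word_bits - stored_mantissa_bits - 1;
}

/* Width for float held in a uint32_t.  */
static int
demo_float_shift (void)
{
  return demo_shift_width ((int) (sizeof (uint32_t) * CHAR_BIT),
                           FLT_MANT_DIG);
}

/* Width for double held in a uint64_t.  */
static int
demo_double_shift (void)
{
  return demo_shift_width ((int) (sizeof (uint64_t) * CHAR_BIT),
                           DBL_MANT_DIG);
}
```

On IEEE 754 targets this gives 32 - 23 - 1 = 8 and 64 - 52 - 1 = 11, which also equal the exponent field widths of the two formats.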
I see the following performance improvements using fmod benchtests
(result only show the 'mean' result):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 17.2549 | 12.0318
x86_64 (Ryzen 9) | normal | 85.4096 | 49.9641
x86_64 (Ryzen 9) | close-exponents | 19.1072 | 15.8224
aarch64 (N1) | subnormal | 10.2182 | 6.81778
aarch64 (N1) | normal | 60.0616 | 20.3667
aarch64 (N1) | close-exponents | 11.5256 | 8.39685
I also see similar improvements on arm-linux-gnueabihf when running on
the N1 aarch64 chips, where it uses a lot of soft-fp implementations
(for modulo and multiplication):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
armhf (N1) | subnormal | 11.6662 | 10.8955
armhf (N1) | normal | 69.2759 | 34.1524
armhf (N1) | close-exponents | 13.6472 | 18.2131
Instead of using the math_private.h definitions, I used the
math_config.h ones, which are used in newer math implementations.
Co-authored-by: kirill <kirill.okhotnikov@gmail.com>
[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Adhemerval Zanella Netto [Mon, 20 Mar 2023 16:01:16 +0000 (13:01 -0300)]
math: Improve fmod
This uses a new algorithm similar to one already proposed earlier [1].
With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers),
the simplest implementation is:
mx * 2^ex == 2 * mx * 2^(ex - 1)
while (ex > ey)
{
mx *= 2;
--ex;
mx %= my;
}
With mx/my being the mantissas of double-precision floating-point
numbers, each step of the argument reduction can be improved to consume
11 bits (the width of uint64_t minus MANTISSA_WIDTH and the sign bit):
while (ex > ey)
{
mx <<= 11;
ex -= 11;
mx %= my;
}
The implementation uses builtin clz and ctz, along with shifts to
convert hx/hy back to doubles. Unlike the original patch,
this patch assumes that the modulo/divide operation is slow, so it uses
multiplication by inverted values.
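The two reduction loops described in this commit message can be sketched on plain integers. This is only an illustration of the idea, not the glibc implementation: the function names are hypothetical, the sketch uses % rather than multiplication by an inverse, and it assumes 0 < my < 2^53 and ex >= ey so that the shifts cannot overflow a uint64_t.

```c
#include <stdint.h>

/* Compute (mx * 2^(ex - ey)) mod my one bit per iteration.  */
static uint64_t
demo_reduce_bitwise (uint64_t mx, int ex, uint64_t my, int ey)
{
  mx %= my;
  while (ex > ey)
    {
      mx *= 2;
      --ex;
      mx %= my;
    }
  return mx;
}

/* Same result, consuming 11 bits per iteration.  Since mx < my < 2^53,
   mx << 11 stays below 2^64.  The final partial step handles the
   remaining 0..10 bits.  */
static uint64_t
demo_reduce_chunked (uint64_t mx, int ex, uint64_t my, int ey)
{
  mx %= my;
  while (ex - ey >= 11)
    {
      mx <<= 11;
      ex -= 11;
      mx %= my;
    }
  mx <<= ex - ey;
  mx %= my;
  return mx;
}
```

Both functions return the same remainder; the chunked variant simply needs about eleven times fewer modulo operations when the exponents are far apart, which is the case the "normal" benchmark rows exercise.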
I see the following performance improvements using fmod benchtests
(result only show the 'mean' result):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 19.1584 | 12.5049
x86_64 (Ryzen 9) | normal | 1016.51 | 296.939
x86_64 (Ryzen 9) | close-exponents | 18.4428 | 16.0244
aarch64 (N1) | subnormal | 11.153 | 6.81778
aarch64 (N1) | normal | 528.649 | 155.62
aarch64 (N1) | close-exponents | 11.4517 | 8.21306
I also see similar improvements on arm-linux-gnueabihf when running on
the N1 aarch64 chips, where it uses a lot of soft-fp implementations
(for modulo, clz, ctz, and multiplication):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
armhf (N1) | subnormal | 15.908 | 15.1083
armhf (N1) | normal | 837.525 | 244.833
armhf (N1) | close-exponents | 16.2111 | 21.8182
Instead of using the math_private.h definitions, I used the
math_config.h ones, which are used in newer math implementations.
Co-authored-by: kirill <kirill.okhotnikov@gmail.com>
[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Adhemerval Zanella Netto [Mon, 20 Mar 2023 16:01:15 +0000 (13:01 -0300)]
benchtests: Add fmodf benchmark
1. Subnormals: 128 inputs.
2. Normal numbers with large exponent difference (|x/y| > 2^8):
1024 inputs between FLT_MIN and FLT_MAX;
3. Close exponents (ey >= -103 and |x/y| < 2^8): 1024 inputs with
exponents between -10 and 10.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Adhemerval Zanella Netto [Mon, 20 Mar 2023 16:01:14 +0000 (13:01 -0300)]
benchtests: Add fmod benchmark
Add three different datasets, from random floating point numbers:
1. Subnormals: 128 inputs.
2. Normal numbers with large exponent difference (|x/y| > 2^52):
1024 inputs between DBL_MIN and DBL_MAX;
3. Close exponents (ey >= -907 and |x/y| < 2^52): 1024 inputs with
exponents between -10 and 10.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
H.J. Lu [Thu, 16 Mar 2023 00:42:54 +0000 (17:42 -0700)]
x86: Set FSGSBASE to active if enabled by kernel
Linux kernel uses AT_HWCAP2 to indicate if FSGSBASE instructions are
enabled. If the HWCAP2_FSGSBASE bit in AT_HWCAP2 is set, FSGSBASE
instructions can be used in user space. Define dl_check_hwcap2 to set
the FSGSBASE feature to active on Linux when the HWCAP2_FSGSBASE bit is
set.
Add a test to verify that FSGSBASE is active on current kernels.
NB: This test will fail if the kernel doesn't set the HWCAP2_FSGSBASE
bit in AT_HWCAP2 while fsgsbase shows up in /proc/cpuinfo.
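From user space, the kernel's signal described above can be read with getauxval. The sketch below is Linux-specific and only meaningful on x86; the fallback value for HWCAP2_FSGSBASE (bit 1, as in the kernel's asm/hwcap2.h) is supplied here as an assumption for the case where the macro is not already defined, and demo_fsgsbase_enabled is a hypothetical helper, not the dl_check_hwcap2 function added by the commit.

```c
#include <sys/auxv.h>

/* Assumed value from the Linux x86 UAPI header asm/hwcap2.h.  */
#ifndef HWCAP2_FSGSBASE
# define HWCAP2_FSGSBASE (1 << 1)
#endif

/* Return 1 if the kernel reports that FSGSBASE instructions may be
   used in user space, 0 otherwise.  On non-x86 targets the same bit
   has a different meaning, so the result is not meaningful there.  */
static int
demo_fsgsbase_enabled (void)
{
  unsigned long int hwcap2 = getauxval (AT_HWCAP2);
  return (hwcap2 & HWCAP2_FSGSBASE) != 0;
}
```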
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Florian Weimer [Mon, 3 Apr 2023 15:23:11 +0000 (17:23 +0200)]
x86_64: Fix asm constraints in feraiseexcept (bug 30305)
The divss instruction clobbers its first argument, and the constraints
need to reflect that. Fortunately, with GCC 12, generated code does
not actually change, so there is no externally visible bug.
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Siddhesh Poyarekar [Mon, 3 Apr 2023 14:20:04 +0000 (10:20 -0400)]
manual: Document __wur usage under _FORTIFY_SOURCE
The __warn_unused_result__ attribute is only enabled when fortification
is enabled. Mention that in the document. The rationale for this is
essentially to mitigate against CWE-252:
[1] https://cwe.mitre.org/data/definitions/252.html
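The conditional behaviour described above can be sketched with a stand-in macro. DEMO_WUR and demo_checked_store are hypothetical names for this illustration; the sketch keys the attribute on _FORTIFY_SOURCE directly, which approximates but is not the exact condition glibc uses for __wur.

```c
#include <stddef.h>

/* Expand to the attribute only when fortification is requested.  */
#if defined _FORTIFY_SOURCE && _FORTIFY_SOURCE > 0
# define DEMO_WUR __attribute__ ((__warn_unused_result__))
#else
# define DEMO_WUR
#endif

static int demo_checked_store (int *dst, int value) DEMO_WUR;

/* Store VALUE through DST.  Returns 0 on success, -1 if DST is NULL.
   Ignoring the result hides the failure, which is the CWE-252
   pattern.  */
static int
demo_checked_store (int *dst, int value)
{
  if (dst == NULL)
    return -1;
  *dst = value;
  return 0;
}
```

With fortification enabled, a bare call such as demo_checked_store (p, 1); draws a compiler warning; without it, the same call compiles silently.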
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:14 +0000 (18:10 +0300)]
hurd: Microoptimize _hurd_self_sigstate ()
When THREAD_GETMEM is defined with inline assembly, the compiler may not
optimize away the two reads of _hurd_sigstate. Help it out a little bit
by only reading it once. This also makes for slightly cleaner code.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-32-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:12 +0000 (18:10 +0300)]
hurd: Add vm_param.h for x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-30-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:11 +0000 (18:10 +0300)]
hurd: Implement _hurd_longjmp_thread_state for x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-29-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:05 +0000 (18:10 +0300)]
htl: Implement thread_set_pcsptp for x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-23-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:04 +0000 (18:10 +0300)]
x86_64: Add rtld-stpncpy & rtld-strncpy
Just like the other existing rtld-str* files, this provides rtld with
usable versions of stpncpy and strncpy.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-22-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:03 +0000 (18:10 +0300)]
htl: Add tcb-offsets.sym for x86_64
The source code is the same as sysdeps/i386/htl/tcb-offsets.sym, but of
course the produced tcb-offsets.h will be different.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-21-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:02 +0000 (18:10 +0300)]
hurd: Move a couple of signal-related files to x86
These do not need any changes to be used on x86_64.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-20-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:58 +0000 (18:09 +0300)]
hurd: Use uintptr_t for register values in trampoline.c
This is more correct, if only because these fields are defined as having
the type unsigned int in the Mach headers, so casting them to a signed
int and then back is suboptimal.
Also, remove an extra reassignment of uesp -- this is another remnant of
the ecx kludge.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-16-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:57 +0000 (18:09 +0300)]
hurd: Move rtld-strncpy-c.c out of mach/hurd/
There's nothing Mach- or Hurd-specific about it; any port that ends
up with rtld pulling in strncpy will need this.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-15-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:55 +0000 (18:09 +0300)]
hurd: More 64-bit integer casting fixes
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-13-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:54 +0000 (18:09 +0300)]
mach, hurd: Drop __libc_lock_self0
This was used for the value of libc-lock's owner when TLS is not yet set
up, so THREAD_SELF can not be used. Since the value need not be anything
specific -- it just has to be non-NULL -- we can just use a plain
constant, such as (void *) 1, for this. This avoids accessing the symbol
through GOT, and exporting it from libc.so in the first place.
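The idea above, that the owner value only has to be a non-NULL token while TLS is not yet available, can be sketched with a toy recursive lock. The names demo_lock, DEMO_EARLY_OWNER and the helpers are hypothetical; this is a single-threaded illustration and not the libc-lock implementation.

```c
#include <stddef.h>

/* Any non-NULL constant works as the owner before TLS is set up,
   since only one thread exists at that point.  */
#define DEMO_EARLY_OWNER ((void *) 1)

struct demo_lock
{
  void *owner;  /* NULL when unlocked.  */
  int count;    /* Recursion depth.  */
};

/* Acquire L on behalf of SELF.  Returns 1 on success, 0 if another
   owner holds the lock.  No atomics: illustration only.  */
static int
demo_lock_acquire (struct demo_lock *l, void *self)
{
  if (l->owner == NULL)
    {
      l->owner = self;
      l->count = 1;
      return 1;
    }
  if (l->owner == self)
    {
      l->count++;
      return 1;
    }
  return 0;
}

/* Drop one level of ownership, unlocking when the depth reaches 0.  */
static void
demo_lock_release (struct demo_lock *l)
{
  if (--l->count == 0)
    l->owner = NULL;
}
```

Because the token is a compile-time constant, no symbol has to be exported or looked up through the GOT to obtain it.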
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-12-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:53 +0000 (18:09 +0300)]
stdio-common: Fix building when !IS_IN (libc)
In this case, _itoa_word () is already defined inline in the header (see
sysdeps/generic/_itoa.h), and the second definition causes an error.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-11-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:52 +0000 (18:09 +0300)]
hurd: Fix _hurd_setup_sighandler () signature
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-10-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:51 +0000 (18:09 +0300)]
hurd: Disable O_TRUNC and FS_RETRY_MAGICAL in rtld
hurd/lookup-retry.c is compiled into rtld, the dynamic linker/loader. To
avoid pulling in file_set_size, file_utimens, tty/ctty stuff, more
string/memory code (memmove, strncpy, strcpy), and more strtoul/itoa
code, compile out support for O_TRUNC and FS_RETRY_MAGICAL when building
hurd/lookup-retry.c for rtld. None of that functionality is useful to
rtld during startup anyway. Keep support for FS_RETRY_MAGICAL("/"),
since that does not pull in much, and is required for following absolute
symlinks.
The large amount of extra code being pulled into rtld was noticed by
reviewing librtld.map & elf/librtld.os.map in the build tree.
It is worth noting that once libc.so is loaded, the real __open, __stat,
etc. replace the minimal versions used initially by rtld -- this is
especially important in the Hurd port, where the minimal rtld versions
do not use the dtable and just pass real Mach port names as fds. Thus,
once libc.so is loaded, rtld will gain access to the full
__hurd_file_name_lookup_retry () version, complete with FS_RETRY_MAGICAL
support, which is important in case the program decides to
dlopen ("/proc/self/fd/...") or some such.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-9-bugaevc@gmail.com>
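As a rough sketch of the "compile out a feature in the minimal build" technique described in the commit above: DEMO_MINIMAL_BUILD and demo_finish_open are hypothetical stand-ins, not the macro or code actually used in hurd/lookup-retry.c, and ftruncate merely stands in for the Hurd's file_set_size.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical build-time switch standing in for "is this being built
   for rtld?".  1 selects the minimal build.  */
#ifndef DEMO_MINIMAL_BUILD
# define DEMO_MINIMAL_BUILD 1
#endif

/* Post-open step.  Returns 0 on success, -1 on failure.  In the
   minimal build the O_TRUNC branch is not compiled at all, so the
   truncation routine is never referenced and never linked in.  */
static int
demo_finish_open (int fd, int flags)
{
#if !DEMO_MINIMAL_BUILD
  if (flags & O_TRUNC)
    return ftruncate (fd, 0);
#else
  (void) fd;
  (void) flags;
#endif
  return 0;
}
```

The point of using the preprocessor rather than a runtime check is that the reference to the truncation routine disappears from the object file, which is what keeps its dependencies out of the link.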
Sergey Bugaev [Sun, 19 Mar 2023 15:09:50 +0000 (18:09 +0300)]
hurd: Fix file name in #error
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-8-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:49 +0000 (18:09 +0300)]
hurd: Swap around two function calls
...to keep `sigexc' port initialization in one place, and match what the
comments say.
No functional change.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-7-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:48 +0000 (18:09 +0300)]
hurd: Remove __hurd_threadvar_stack_{offset,mask}
No one is or should be using __hurd_threadvar_stack_{offset,mask}; we
have proper TLS now. These two remaining variables are never set to
anything other than zero, so any code that would try to use them as
described would just dereference a zero pointer and crash. So remove
them entirely.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-6-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:47 +0000 (18:09 +0300)]
hurd: Make exception subcode a long
On EXC_BAD_ACCESS, exception subcode is used to pass the faulting memory
address, so it needs to be (at least) pointer-sized. Thus, make it into
a long. This matches the corresponding change in GNU Mach.
Message-Id: <20230319151017.531737-5-bugaevc@gmail.com>
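To illustrate why the subcode in the commit above has to be pointer-sized, here is a small sketch that pushes an address through an int-sized and a long-sized slot. The helper names are hypothetical, and unsigned types are used so the narrowing is well defined in C; it assumes long is at least as wide as a pointer, as on the usual ILP32 and LP64 ABIs.

```c
#include <stdint.h>

/* Round-trip an address through an int-sized field.  On a 64-bit
   target the upper bits are lost.  */
static uintptr_t
demo_via_int (uintptr_t addr)
{
  return (uintptr_t) (unsigned int) addr;
}

/* Round-trip an address through a long-sized field.  This preserves
   the value wherever long is at least pointer-sized.  */
static uintptr_t
demo_via_long (uintptr_t addr)
{
  return (uintptr_t) (unsigned long) addr;
}
```

On a 64-bit target, a faulting address above 4 GiB comes back unchanged from demo_via_long but truncated from demo_via_int, so a handler reading an int subcode would see the wrong address.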
Alejandro Colomar [Sun, 12 Mar 2023 00:08:10 +0000 (01:08 +0100)]
time: Fix strftime(3) API regarding nullability
strftime(3) doesn't accept null pointers in any of the parameters.
Cc: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Adhemerval Zanella [Thu, 30 Mar 2023 13:44:33 +0000 (13:44 +0000)]
Update arm libm-tests-ulps
For the next test from cf7ffdd8a5f6da55397e10b3860062944312824c.
Andreas Schwab [Wed, 15 Mar 2023 10:44:24 +0000 (11:44 +0100)]
getlogin_r: fix missing fallback if loginuid is unset (bug 30235)
When /proc/self/loginuid is not set, we should still fall back to using
the traditional utmp lookup, instead of failing right away.
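The fallback order described in the commit above might look roughly like this. All names here are hypothetical and the two lookups are stubs; the sketch only assumes that an unset loginuid reads back as (unsigned int) -1 and that a negative return from the first lookup means "no answer, try the next source".

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed value of an unset login uid.  */
#define DEMO_LOGINUID_UNSET ((unsigned int) -1)

/* Stub for the loginuid-based lookup.  Returns 0 on success, a
   positive errno value on failure, or -1 when loginuid gives no
   answer and the caller should fall back.  */
static int
demo_name_from_loginuid (unsigned int loginuid, char *buf, size_t len)
{
  if (loginuid == DEMO_LOGINUID_UNSET)
    return -1;
  int n = snprintf (buf, len, "uid%u", loginuid);
  return (n < 0 || (size_t) n >= len) ? ERANGE : 0;
}

/* Stub for the traditional utmp lookup.  */
static int
demo_name_from_utmp (char *buf, size_t len)
{
  static const char name[] = "utmp-user";
  if (sizeof name > len)
    return ERANGE;
  memcpy (buf, name, sizeof name);
  return 0;
}

/* Try loginuid first; fall back to utmp only when loginuid is unset,
   rather than failing right away.  */
static int
demo_getlogin_r (unsigned int loginuid, char *buf, size_t len)
{
  int result = demo_name_from_loginuid (loginuid, buf, len);
  if (result >= 0)
    return result;
  return demo_name_from_utmp (buf, len);
}
```

Real errors such as a too-small buffer are still reported directly; only the "unset" case is routed to the second source.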
DJ Delorie [Wed, 29 Mar 2023 04:18:40 +0000 (00:18 -0400)]
memalign: Support scanning for aligned chunks.
This patch adds a chunk scanning algorithm to the _int_memalign code
path that reduces heap fragmentation by reusing already aligned chunks
instead of always looking for chunks of larger sizes and splitting
them. The tcache macros are extended to allow removing a chunk from
the middle of the list.
The goal is to fix the pathological use cases where heaps grow
continuously in workloads that are heavy users of memalign.
Note that tst-memalign-2 checks for tcache operation, which
malloc-check bypasses.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
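A much simplified sketch of the scanning idea in the memalign commit above: before splitting a larger chunk, look for a free chunk that is already suitably aligned and large enough. The struct and function names are hypothetical, and the real _int_memalign works on malloc's bins and tcache lists rather than a flat array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free-chunk record: user address and usable size.  */
struct demo_chunk
{
  uintptr_t addr;
  size_t size;
};

/* Return the index of the first free chunk that is already aligned to
   ALIGNMENT and holds at least BYTES, or -1 if none is found and the
   caller has to fall back to the split-a-larger-chunk path.  */
static int
demo_find_aligned (const struct demo_chunk *chunks, size_t n,
                   size_t alignment, size_t bytes)
{
  for (size_t i = 0; i < n; ++i)
    if (chunks[i].addr % alignment == 0 && chunks[i].size >= bytes)
      return (int) i;
  return -1;
}
```

Reusing an already aligned chunk avoids carving leading and trailing remainders off a bigger chunk, which is where the fragmentation in memalign-heavy workloads comes from.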
Adhemerval Zanella [Fri, 11 Mar 2022 16:53:11 +0000 (13:53 -0300)]
malloc: Use C11 atomics on memusage
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Adhemerval Zanella Netto [Thu, 23 Mar 2023 13:13:51 +0000 (10:13 -0300)]
Remove --enable-tunables configure option
Tunables are now always supported. The configure option was added in
glibc 2.25 and some features require it (such as hwcap mask, huge pages
support, and lock elision tuning). Removing it also simplifies the build
permutations.
Changes from v1:
* Remove glibc.rtld.dynamic_sort changes, it is orthogonal and needs
more discussion.
* Cleanup more code.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Adhemerval Zanella [Tue, 28 Mar 2023 18:46:34 +0000 (15:46 -0300)]
Remove --disable-experimental-malloc option
It has been the default since 2.26 and it has bitrotted over the years.
Using it makes multiple malloc tests fail:
FAIL: malloc/tst-memalign-2
FAIL: malloc/tst-memalign-2-malloc-hugetlb1
FAIL: malloc/tst-memalign-2-malloc-hugetlb2
FAIL: malloc/tst-memalign-2-mcheck
FAIL: malloc/tst-mxfast-malloc-hugetlb1
FAIL: malloc/tst-mxfast-malloc-hugetlb2
FAIL: malloc/tst-tcfree2
FAIL: malloc/tst-tcfree2-malloc-hugetlb1
FAIL: malloc/tst-tcfree2-malloc-hugetlb2
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Flavio Cruz [Tue, 28 Mar 2023 13:16:17 +0000 (10:16 -0300)]
Allow building with --disable-nscd again
The change 88677348b4de breaks the build with undefined references to
the NSCD functions.
Joe Simmons-Talbott [Wed, 22 Mar 2023 18:04:30 +0000 (14:04 -0400)]
system: Add "--" after "-c" for sh (BZ #28519)
Prevent sh from interpreting a user string as shell options if it
starts with '-' or '+'. Since the version of /bin/sh used for testing
system() is different from the full-fledged system /bin/sh, add support
to it for handling "--" after "-c". Add a testcase to ensure the
expected behavior.
Signed-off-by: Joe Simmons-Talbott <josimmon@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
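The argument vector described in the system() commit above can be sketched with posix_spawn. demo_run is a hypothetical helper, not glibc's system() implementation, which additionally handles signals and cancellation; the sketch assumes /bin/sh accepts "--" after "-c", as POSIX shells do.

```c
#include <stddef.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Run COMMAND via "sh -c -- COMMAND" and return the wait status, or -1
   if the shell could not be started or waited for.  The "--" ends
   option parsing, so a COMMAND starting with '-' or '+' is taken as
   the command string rather than as shell options.  */
static int
demo_run (const char *command)
{
  char *const argv[] = { "sh", "-c", "--", (char *) command, NULL };
  pid_t pid;
  int status;

  if (posix_spawn (&pid, "/bin/sh", NULL, NULL, argv, environ) != 0)
    return -1;
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return status;
}
```

The returned value is a wait status, so callers would inspect it with WIFEXITED and WEXITSTATUS, as with system().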
Julian Squires [Wed, 22 Mar 2023 16:39:57 +0000 (14:09 -0230)]
posix: Fix some crashes in wordexp [BZ #18096]
Without these fixes, the first three included tests segfault (on a
NULL dereference); the fourth aborts on an assertion, which is itself
unnecessary.
Signed-off-by: Julian Squires <julian@cipht.net>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
caiyinyu [Tue, 28 Mar 2023 01:19:53 +0000 (09:19 +0800)]
LoongArch: ldconfig: Add comments for using EF_LARCH_OBJABI_V1
We added Adhemerval Zanella's comment to explain the reason for
using EF_LARCH_OBJABI_V1.
Romain Geissler [Sun, 26 Mar 2023 19:25:58 +0000 (19:25 +0000)]
elf: Take into account ${sysconfdir} in elf/tst-ldconfig-p.sh
Take into account ${sysconfdir} in elf/tst-ldconfig-p.sh.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>