Sergey Bugaev [Sun, 23 Apr 2023 16:05:47 +0000 (19:05 +0300)]
hurd: Only deallocate addrport when it's valid
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230423160548.126576-3-bugaevc@gmail.com>
Sergey Bugaev [Sun, 23 Apr 2023 21:55:23 +0000 (00:55 +0300)]
hurd: Implement MAP_32BIT
This is a flag that can be passed to mmap () to request that the mapping
being established should be located in the lower 2 GB area of the
address space, so only the lower 31 (not 32) bits can be set in its
address, and the address can be represented as a 32-bit integer without
truncating it.
This flag is intended to be compatible with Linux, FreeBSD, and Darwin
flags of the same name. Out of those systems, it appears Linux and
FreeBSD take MAP_32BIT to mean "map 31 bit", whereas Darwin allows the
32nd bit to be set in the address as well. The Hurd follows Linux and
FreeBSD behavior.
Unlike on those systems, on the Hurd MAP_32BIT is defined on all
supported architectures (which currently are only i386 and x86_64).
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230423215526.346009-1-bugaevc@gmail.com>
Sergey Bugaev [Wed, 19 Apr 2023 16:02:03 +0000 (19:02 +0300)]
Use O_CLOEXEC in more places (BZ #15722)
When opening a temporary file without O_CLOEXEC we risk leaking the
file descriptor if another thread calls (fork and then) exec while we
have the fd open. Fix this by consistently passing O_CLOEXEC everywhere
where we open a file for internal use (and not to return it to the user,
in which case the API defines whether or not the close-on-exec flag
shall be set on the returned fd).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230419160207.65988-4-bugaevc@gmail.com>
Sergey Bugaev [Wed, 19 Apr 2023 16:02:01 +0000 (19:02 +0300)]
misc: Convert daemon () to GNU coding style
This is nicer, and is going to be required for the following changes
to reasonably stay within the 79 column limit.
No functional change.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Message-Id: <
20230419160207.65988-2-bugaevc@gmail.com>
Joe Simmons-Talbott [Fri, 21 Apr 2023 13:24:25 +0000 (09:24 -0400)]
wcsmbs: Add wcsdup() tests. (BZ #30266)
Enable wide character testcases for wcsdup().
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Joe Simmons-Talbott [Fri, 21 Apr 2023 13:24:24 +0000 (09:24 -0400)]
string: Add tests for strndup (BZ #30266)
Copy strncpy tests for strndup. Covers some basic testcases with random
strings. Remove tests that set the destination's bytes and checked the
resulting buffer's bytes. Remove wide character test support since
wcsndup() doesn't exist.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Joe Simmons-Talbott [Fri, 21 Apr 2023 13:24:23 +0000 (09:24 -0400)]
string: Add tests for strdup (BZ #30266)
Copy strcpy tests for strdup. Covers some basic testcases with random
strings. Add a zero-length string testcase.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Joe Simmons-Talbott [Fri, 21 Apr 2023 13:24:22 +0000 (09:24 -0400)]
string: Allow use of test-string.h for non-ifunc implementations.
Mark two variables as unused to silence warning when using
test-string.h for non-ifunc implementations.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Sergey Bugaev [Thu, 20 Apr 2023 18:42:19 +0000 (21:42 +0300)]
hurd: Don't migrate reply port into __init1_tcbhead
Properly differentiate between setting up the real TLS with
TLS_INIT_TP, and setting up the early TLS (__init1_tcbhead) in static
builds. In the latter case, don't yet migrate the reply port into the
TCB, and don't yet set __libc_tls_initialized to 1.
This also lets us move the __init1_desc assignment inside
_hurd_tls_init ().
Fixes
cd019ddd892e182277fadd6aedccc57fa3923c8d
"hurd: Don't leak __hurd_reply_port0"
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Sergey Bugaev [Wed, 19 Apr 2023 16:02:05 +0000 (19:02 +0300)]
hurd: Make dl-sysdep's open () cope with O_IGNORE_CTTY
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230419160207.65988-6-bugaevc@gmail.com>
Cupertino Miranda [Fri, 14 Apr 2023 15:12:20 +0000 (16:12 +0100)]
Created tunable to force small pages on stack allocation.
Created tunable glibc.pthread.stack_hugetlb to control when hugepages
can be used for stack allocation.
In case THP are enabled and glibc.pthread.stack_hugetlb is set to
0, glibc will madvise the kernel not to use allow hugepages for stack
allocations.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Adhemerval Zanella [Wed, 19 Apr 2023 21:17:20 +0000 (18:17 -0300)]
malloc: Add missing shared thread library flags
So tst-memalign-3 builds on Hurd.
Adhemerval Zanella [Wed, 19 Apr 2023 21:18:15 +0000 (18:18 -0300)]
linux: Re-flow and sort multiline Makefile definitions
Adhemerval Zanella [Tue, 18 Apr 2023 16:58:18 +0000 (13:58 -0300)]
posix: Re-flow and sort multiline Makefile definitions
Jan-Benedict Glaw [Sat, 1 Apr 2023 19:09:19 +0000 (21:09 +0200)]
build-many-glibcs.py: --disable-gcov for gcc-first
This is also being tracked n GCC [1].
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100289
DJ Delorie [Mon, 3 Apr 2023 21:33:03 +0000 (17:33 -0400)]
malloc: set NON_MAIN_ARENA flag for reclaimed memalign chunk (BZ #30101)
Based on these comments in malloc.c:
size field is or'ed with NON_MAIN_ARENA if the chunk was obtained
from a non-main arena. This is only set immediately before handing
the chunk to the user, if necessary.
The NON_MAIN_ARENA flag is never set for unsorted chunks, so it
does not have to be taken into account in size comparisons.
When we pull a chunk off the unsorted list (or any list) we need to
make sure that flag is set properly before returning the chunk.
Use the rounded-up size for chunk_ok_for_memalign()
Do not scan the arena for reusable chunks if there's no arena.
Account for chunk overhead when determining if a chunk is a reuse
candidate.
mcheck interferes with memalign, so skip mcheck variants of
memalign tests.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Sergey Bugaev [Fri, 14 Apr 2023 13:32:45 +0000 (16:32 +0300)]
hurd: Microoptimize sigreturn
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Siddhesh Poyarekar [Tue, 18 Apr 2023 13:47:40 +0000 (09:47 -0400)]
rcmd.c: Fix indentation in last commit
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Frédéric Bérat [Tue, 18 Apr 2023 13:01:00 +0000 (09:01 -0400)]
inet/rcmd.c: fix warn unused result
Fix unused result warnings, detected when _FORTIFY_SOURCE is enabled in
glibc.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Sergey Bugaev [Fri, 14 Apr 2023 19:37:00 +0000 (22:37 +0300)]
hurd: Avoid leaking task & thread ports
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Sergey Bugaev [Fri, 14 Apr 2023 19:36:59 +0000 (22:36 +0300)]
hurd: Simplify _S_catch_exception_raise
_hurd_thread_sigstate () already handles finding an existing sigstate
before allocating a new one, so just use that. Bonus: this will only
lock the _hurd_siglock once.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Sergey Bugaev [Sat, 15 Apr 2023 19:08:56 +0000 (22:08 +0300)]
hurd: Run init_pids () before init_dtable ()
Much as the comment says, things on _hurd_subinit assume that _hurd_pid
is already initialized by the time _hurd_subinit is run, so
_hurd_proc_subinit has to run before it. Specifically, init_dtable ()
calls _hurd_port2fd (), which uses _hurd_pid and _hurd_pgrp to set up
ctty handling. With _hurd_subinit running before _hurd_proc_subinit,
ctty setup was broken:
13<--33(pid1255)->term_getctty () = 0 4<--39(pid1255)
task16(pid1255)->mach_port_deallocate (pn{ 10}) = 0
13<--33(pid1255)->term_open_ctty (0 0) = 0x40000016 (Invalid argument)
Fix this by running the _hurd_proc_subinit hook in the correct place --
just after _hurd_portarray is set up (so the proc server port is
available in its usual place) and just before running _hurd_subinit.
Fixes
1ccbb9258eed0f667edf459a28ba23a805549b36
("hurd: Notify the proc server later during initialization").
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Sergey Bugaev [Fri, 14 Apr 2023 19:36:56 +0000 (22:36 +0300)]
hurd: Fix restoring reply port in sigreturn
We must not use the user's reply port (scp->sc_reply_port) for any of
our own RPCs, otherwise various things break. So, use MACH_PORT_DEAD as
a reply port when destroying our reply port, and make sure to do this
after _hurd_sigstate_unlock (), which may do a gsync_wake () RPC.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Florian Weimer [Mon, 17 Apr 2023 13:41:08 +0000 (15:41 +0200)]
wcsmbs: Re-flow and sort routines, tests variables in Makefile
Eliminate strop-tests because it does not seem to be a simplification.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Florian Weimer [Mon, 17 Apr 2023 13:41:08 +0000 (15:41 +0200)]
debug: Re-flow and sort routines variable in Makefile
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Sergey Bugaev [Sat, 15 Apr 2023 16:17:18 +0000 (19:17 +0300)]
hurd: Avoid extra ctty RPCs in init_dtable ()
It is common to have (some of) stdin, stdout and stderr point to the
very same port. We were making the ctty RPCs that _hurd_port2fd () does
for each one of them separately:
1. term_getctty ()
2. mach_port_deallocate ()
3. term_open_ctty ()
Instead, let's detect this case and duplicate the ctty port we already
have. This means we do 1 RPC instead of 3 (and create a single protid
on the server side) if the file is our ctty, and no RPCs instead of 1
if it's not. A clear win!
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Wilco Dijkstra [Mon, 17 Apr 2023 11:42:18 +0000 (12:42 +0100)]
math: Improve fmod(f) performance
Optimize the fast paths (x < y) and (x/y < 2^12). Delay handling of special
cases to reduce the number of instructions executed before the fast paths.
Performance improvements for fmod:
Skylake Zen2 Neoverse V1
subnormals 11.8% 4.2% 11.5%
normal 3.9% 0.01% -0.5%
close-exponents 6.3% 5.6% 19.4%
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Wilco Dijkstra [Tue, 21 Mar 2023 14:00:22 +0000 (14:00 +0000)]
Benchtests: Adjust timing
Adjust iteration counts so benchmarks don't run too slowly or quickly.
Ensure benchmarks take less than 10 seconds on older, slower cores and
more than 0.5 seconds on fast cores.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:07 +0000 (18:10 +0300)]
hurd: Only check for TLS initialization inside rtld or in static builds
When glibc is built as a shared library, TLS is always initialized by
the call of TLS_INIT_TP () macro made inside the dynamic loader, prior
to running the main program (see dl-call_tls_init_tp.h). We can take
advantage of this: we know for sure that __LIBC_NO_TLS () will evaluate
to 0 in all other cases, so let the compiler know that explicitly too.
Also, only define _hurd_tls_init () and TLS_INIT_TP () under the same
conditions (either !SHARED or inside rtld), to statically assert that
this is the case.
Other than a microoptimization, this also helps with avoiding awkward
sharing of the __libc_tls_initialized variable between ld.so and libc.so
that we would have to do otherwise -- we know for sure that no sharing
is required, simply because __libc_tls_initialized would always be set
to true inside libc.so.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-25-bugaevc@gmail.com>
Sergey Bugaev [Thu, 13 Apr 2023 11:58:12 +0000 (14:58 +0300)]
hurd: Remove __hurd_local_reply_port
Now that the signal code no longer accesses it, the only real user of it
was mig-reply.c, so move the logic for managing the port there.
If we're in SHARED and outside of rtld, we know that __LIBC_NO_TLS ()
always evaluates to 0, and a TLS reply port will always be used, not
__hurd_reply_port0. Still, the compiler does not see that
__hurd_reply_port0 is never used due to its address being taken. To deal
with this, explicitly compile out __hurd_reply_port0 when we know we
won't use it.
Also, instead of accessing the port via THREAD_SELF->reply_port, this
uses THREAD_GETMEM and THREAD_SETMEM directly, avoiding possible
miscompilations.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Adhemerval Zanella [Fri, 14 Apr 2023 11:22:40 +0000 (08:22 -0300)]
malloc: Assure that THP mode read do write OOB end of stringt
Adhemerval Zanella [Wed, 12 Apr 2023 12:36:54 +0000 (09:36 -0300)]
malloc: Assure that THP mode is always null terminated
Samuel Thibault [Thu, 13 Apr 2023 00:02:37 +0000 (02:02 +0200)]
hurd: Mark two tests as unsupported
They make the whole testsuite hang/crash.
Samuel Thibault [Wed, 12 Apr 2023 20:44:50 +0000 (20:44 +0000)]
hurd: Restore destroying receive rights on sigreturn
Just subtracting a ref is making signal/tst-signal signal/tst-raise
signal/tst-minsigstksz-5 htl/tst-raise1 fail.
Samuel Thibault [Tue, 11 Apr 2023 22:12:02 +0000 (00:12 +0200)]
aio: Fix freeing memory
The content of the pool array is initialized only until pool_size,
pointers between pool_size and pool_max_size were not initialized by the
realloc call in get_elem so they should not be freed.
This fixes aio tests crashing at their termination on GNU/Hurd.
Samuel Thibault [Tue, 11 Apr 2023 18:06:03 +0000 (18:06 +0000)]
Revert "hurd: Only check for TLS initialization inside rtld or in static builds"
This reverts commit
b37899d34d2190ef4b454283188f22519f096048.
Apparently we load libc.so (and thus start using its functions) before
calling TLS_INIT_TP, so libc.so functions should not actually assume
that TLS is always set up.
Sergey Bugaev [Sun, 19 Mar 2023 15:10:10 +0000 (18:10 +0300)]
hurd: Don't leak __hurd_reply_port0
Previously, once we set up TLS, we would implicitly switch from using
__hurd_reply_port0 to reply_port inside the TCB, leaving the former
unused. But we never deallocated it, so it got leaked.
Instead, migrate the port into the new TCB's reply_port slot. This
avoids both the port leak and an extra syscall to create a new reply
port for the TCB.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-28-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:08 +0000 (18:10 +0300)]
hurd: Improve reply port handling when exiting signal handlers
If we're doing signals, that means we've already got the signal thread
running, and that implies TLS having been set up. So we know that
__hurd_local_reply_port will resolve to THREAD_SELF->reply_port, and can
access that directly using the THREAD_GETMEM and THREAD_SETMEM macros.
This avoids potential miscompilations, and should also be a tiny bit
faster.
Also, use mach_port_mod_refs () and not mach_port_destroy () to destroy
the receive right. mach_port_destroy () should *never* be used on
mach_task_self (); this can easily lead to port use-after-free
vulnerabilities if the task has any other references to the same port.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-26-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:07 +0000 (18:10 +0300)]
hurd: Only check for TLS initialization inside rtld or in static builds
When glibc is built as a shared library, TLS is always initialized by
the call of TLS_INIT_TP () macro made inside the dynamic loader, prior
to running the main program (see dl-call_tls_init_tp.h). We can take
advantage of this: we know for sure that __LIBC_NO_TLS () will evaluate
to 0 in all other cases, so let the compiler know that explicitly too.
Also, only define _hurd_tls_init () and TLS_INIT_TP () under the same
conditions (either !SHARED or inside rtld), to statically assert that
this is the case.
Other than a microoptimization, this also helps with avoiding awkward
sharing of the __libc_tls_initialized variable between ld.so and libc.so
that we would have to do otherwise -- we know for sure that no sharing
is required, simply because __libc_tls_initialized would always be set
to true inside libc.so.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-25-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:06 +0000 (18:10 +0300)]
elf: Stop including tls.h in ldsodefs.h
Nothing in there needs tls.h
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-24-bugaevc@gmail.com>
Sergey Bugaev [Mon, 3 Apr 2023 11:56:21 +0000 (14:56 +0300)]
hurd: Port trampoline.c to x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230403115621.258636-3-bugaevc@gmail.com>
Sergey Bugaev [Mon, 3 Apr 2023 11:56:20 +0000 (14:56 +0300)]
hurd: Do not declare local variables volatile
These are just regular local variables that are not accessed in any
funny ways, not even though a pointer. There's absolutely no reason to
declare them volatile. It only ends up hurting the quality of the
generated machine code.
If anything, it would make sense to decalre sigsp as *pointing* to
volatile memory (volatile void *sigsp), but evidently that's not needed
either.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230403115621.258636-2-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:00 +0000 (18:10 +0300)]
hurd: Implement x86_64/intr-msg.h
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-18-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:59 +0000 (18:09 +0300)]
hurd: Add sys/ucontext.h and sigcontext.h for x86_64
This is based on the Linux port's version, but laid out to match Mach's
struct i386_thread_state, much like the i386 version does.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Flavio Cruz [Mon, 10 Apr 2023 02:42:52 +0000 (22:42 -0400)]
hurd: Stop depending on the default_pager stubs provided by gnumach
The hurd source tree already provides the same stubs and they are only
needed there.
Message-Id: <ZDN3rDdjMowtUWf7@jupiter.tail36e24.ts.net>
Paul Eggert [Sat, 8 Apr 2023 20:51:26 +0000 (13:51 -0700)]
manual: update AddressSanitizer discussion
* manual/string.texi (Truncating Strings): Update obsolescent
reference and use the more-generic term “AddressSanitizer”.
Mention fortification, too. -fcheck-pointer-bounds is no longer
supported.
Paul Eggert [Sat, 8 Apr 2023 20:51:26 +0000 (13:51 -0700)]
manual: document snprintf truncation better
Paul Eggert [Sat, 8 Apr 2023 20:51:26 +0000 (13:51 -0700)]
manual: improve string section wording
* manual/string.texi: Editorial fixes. Do not say “text” when
“string” or “string contents” is meant, as a C string can contain
bytes that are not valid text in the current encoding.
When warning about strcat efficiency, warn similarly about strncat
and wcscat. “coping” → “copying”.
Mention at the start of the two problematic sections that problems
are discussed at section end.
Paul Eggert [Sat, 8 Apr 2023 20:51:26 +0000 (13:51 -0700)]
manual: fix texinfo typo
* manual/creature.texi (Feature Test Macros): Fix
“creature.texi:309: warning: `.' or `,' must follow @xref, not f”.
Florian Weimer [Thu, 6 Apr 2023 14:40:44 +0000 (16:40 +0200)]
<stdio.h>: Make fopencookie, vasprintf, asprintf available by default
FreeBSD makes these functions available by default, so we should
not treat them as GNU-specific and restrict them to _GNU_SOURCE.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Florian Weimer [Thu, 6 Apr 2023 14:40:44 +0000 (16:40 +0200)]
<string.h>: Make strchrnul, strcasestr, memmem available by default
FreeBSD makes them available by default, too, so there does not seem
to be a reason to restrict these functions to _GNU_SOURCE.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
H.J. Lu [Wed, 5 Apr 2023 16:21:44 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add PREFETCHI support
Add PREFETCHI support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:43 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add AMX-COMPLEX support
Add AMX-COMPLEX support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:42 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add AVX-NE-CONVERT support
Add AVX-NE-CONVERT support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:41 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add AVX-VNNI-INT8 support
Add AVX-VNNI-INT8 support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:40 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add MSRLIST support
Add MSRLIST support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:39 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add AVX-IFMA support
Add AVX-IFMA support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:38 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add AMX-FP16 support
Add AMX-FP16 support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:37 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add WRMSRNS support
Add WRMSRNS support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:36 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add ArchPerfmonExt support
Add Architectural Performance Monitoring Extended Leaf (EAX = 23H)
support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:35 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add CMPCCXADD support
Add CMPCCXADD support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:34 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add LASS support
Add Linear Address Space Separation (LASS) support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:33 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add RAO-INT support
Add RAO-INT support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:32 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add LBR support
Add architectural LBR support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:31 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add RTM_FORCE_ABORT support
Add RTM_FORCE_ABORT support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:30 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add SGX-KEYS support
Add SGX-KEYS support to <sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:29 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add BUS_LOCK_DETECT support
Add Bus lock debug exceptions (BUS_LOCK_DETECT) support to
<sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:28 +0000 (09:21 -0700)]
<sys/platform/x86.h>: Add LA57 support
Add 57-bit linear addresses and five-level paging (LA57) support to
<sys/platform/x86.h>.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:27 +0000 (09:21 -0700)]
platform.texi: Move LAM after LAHF64_SAHF64
Move LAM after LAHF64_SAHF64 to sort x86 features.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
H.J. Lu [Wed, 5 Apr 2023 16:21:26 +0000 (09:21 -0700)]
<bits/platform/x86.h>: Rename to x86_cpu_INDEX_7_ECX_15
Rename x86_cpu_INDEX_7_ECX_1 to x86_cpu_INDEX_7_ECX_15 for the unused bit
15 in ECX from CPUID with EAX == 0x7 and ECX == 0.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
John David Anglin [Wed, 5 Apr 2023 18:54:47 +0000 (18:54 +0000)]
hppa: Update struct __pthread_rwlock_arch_t comment.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
John David Anglin [Wed, 5 Apr 2023 18:35:38 +0000 (18:35 +0000)]
hppa: Revise __TIMESIZE define to use __WORDSIZE
Handle both 32 and 64-bit ABIs.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Adhemerval Zanella [Tue, 4 Apr 2023 19:44:40 +0000 (16:44 -0300)]
libio: Remove unused pragma weak on vtable
Both _IO_file_jumps_alias and _IO_wfile_jumps_alias are defined as
alias.
Adhemerval Zanella [Tue, 4 Apr 2023 19:42:33 +0000 (16:42 -0300)]
malloc: Only set pragma weak for rpc freemem if required
Both __rpc_freemem and __rpc_thread_destroy are only used if the
the compat symbols are required.
Guy-Fleury Iteriteka [Sat, 18 Mar 2023 09:58:26 +0000 (11:58 +0200)]
htl: move pthread_self info libc.
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <
20230318095826.1125734-4-gfleury@disroot.org>
Guy-Fleury Iteriteka [Sat, 18 Mar 2023 09:58:25 +0000 (11:58 +0200)]
htl: move ___pthread_self into libc.
sysdeps/mach/hurd/htl/pt-pthread_self.c: New file.
htl/Makefile: .. Add it to libc routine.
sysdeps/mach/hurd/htl/pt-sysdep.c(__pthread_self): Remove it.
sysdeps/mach/hurd/htl/pt-sysdep.h(__pthread_self): Add hidden propertie.
htl/Versions(__pthread_self) Version it as private symbol.
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <
20230318095826.1125734-3-gfleury@disroot.org>
Guy-Fleury Iteriteka [Sat, 18 Mar 2023 09:58:24 +0000 (11:58 +0200)]
htl: move __pthtread_total into libc
htl/pt-nthreads.c: new file.
htl/Makefile: Add it to routine.
htl/Versions: version it as private libc symbol.
htl/pt-create.c: remove his definition here.
htl/pt-internal.h: add propertie to it declaration.
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <
20230318095826.1125734-2-gfleury@disroot.org>
Nisha Menon [Mon, 3 Apr 2023 14:11:05 +0000 (10:11 -0400)]
compare_strings.py : Add --gmean flag
To calculate geometric mean for string benchmark results.
Signed-off-by: Nisha Poyarekar <nisha.s.menon@gmail.com>
Andreas Schwab [Thu, 9 Feb 2023 13:56:21 +0000 (14:56 +0100)]
x86/dl-cacheinfo: remove unsused parameter from handle_amd
Also replace an unreachable assert with __builtin_unreachable.
Adhemerval Zanella [Mon, 3 Apr 2023 19:10:43 +0000 (16:10 -0300)]
powerpc: Disable stack protector in early static initialization
Similar to
fb95c316382679c0826cc8399760977cd95f15c9, also disable
for string-ppc64.c (pulled on rltd as the default string
implementation).
Checked on powerpc64-linux-gnu.
Adhemerval Zanella [Mon, 3 Apr 2023 17:18:14 +0000 (14:18 -0300)]
nptl: Fix tst-cancel30 on sparc64
As indicated by sparc kernel-features.h, even though sparc64 defines
__NR_pause, it is not supported (ENOSYS). Always use ppoll or the
64 bit time_t variant instead.
Adhemerval Zanella Netto [Mon, 20 Mar 2023 16:01:18 +0000 (13:01 -0300)]
math: Remove the error handling wrapper from fmod and fmodf
The error handling is moved to sysdeps/ieee754 version with no SVID
support. The compatibility symbol versions still use the wrapper
with SVID error handling around the new code. There is no new symbol
version nor compatibility code on !LIBM_SVID_COMPAT targets
(e.g. riscv).
The ia64 is unchanged, since it still uses the arch specific
__libm_error_region on its implementation. For both i686 and m68k,
which provive arch specific implementation, wrappers are added so
no new symbol are added (which would require to change the
implementations).
It shows an small improvement, the results for fmod:
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 12.5049 | 9.40992
x86_64 (Ryzen 9) | normal | 296.939 | 296.738
x86_64 (Ryzen 9) | close-exponents | 16.0244 | 13.119
aarch64 (N1) | subnormal | 6.81778 | 4.33313
aarch64 (N1) | normal | 155.620 | 152.915
aarch64 (N1) | close-exponents | 8.21306 | 5.76138
armhf (N1) | subnormal | 15.1083 | 14.5746
armhf (N1) | normal | 244.833 | 241.738
armhf (N1) | close-exponents | 21.8182 | 22.457
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Adhemerval Zanella Netto [Mon, 20 Mar 2023 16:01:17 +0000 (13:01 -0300)]
math: Improve fmodf
This uses a new algorithm similar to already proposed earlier [1].
With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers),
the simplest implementation is:
mx * 2^ex == 2 * mx * 2^(ex - 1)
while (ex > ey)
{
mx *= 2;
--ex;
mx %= my;
}
With mx/my being mantissa of double floating pointer, on each step the
argument reduction can be improved 8 (which is sizeof of uint32_t minus
MANTISSA_WIDTH plus the signal bit):
while (ex > ey)
{
mx << 8;
ex -= 8;
mx %= my;
} */
The implementation uses builtin clz and ctz, along with shifts to
convert hx/hy back to doubles. Different than the original patch,
this path assume modulo/divide operation is slow, so use multiplication
with invert values.
I see the following performance improvements using fmod benchtests
(result only show the 'mean' result):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 17.2549 | 12.0318
x86_64 (Ryzen 9) | normal | 85.4096 | 49.9641
x86_64 (Ryzen 9) | close-exponents | 19.1072 | 15.8224
aarch64 (N1) | subnormal | 10.2182 | 6.81778
aarch64 (N1) | normal | 60.0616 | 20.3667
aarch64 (N1) | close-exponents | 11.5256 | 8.39685
I also see similar improvements on arm-linux-gnueabihf when running on
the N1 aarch64 chips, where it a lot of soft-fp implementation (for
modulo, and multiplication):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
armhf (N1) | subnormal | 11.6662 | 10.8955
armhf (N1) | normal | 69.2759 | 34.1524
armhf (N1) | close-exponents | 13.6472 | 18.2131
Instead of using the math_private.h definitions, I used the
math_config.h instead which is used on newer math implementations.
Co-authored-by: kirill <kirill.okhotnikov@gmail.com>
[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Adhemerval Zanella Netto [Mon, 20 Mar 2023 16:01:16 +0000 (13:01 -0300)]
math: Improve fmod
This uses a new algorithm similar to already proposed earlier [1].
With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers),
the simplest implementation is:
mx * 2^ex == 2 * mx * 2^(ex - 1)
while (ex > ey)
{
mx *= 2;
--ex;
mx %= my;
}
With mx/my being mantissa of double floating pointer, on each step the
argument reduction can be improved 11 (which is sizeo of uint64_t minus
MANTISSA_WIDTH plus the signal bit):
while (ex > ey)
{
mx << 11;
ex -= 11;
mx %= my;
} */
The implementation uses builtin clz and ctz, along with shifts to
convert hx/hy back to doubles. Different than the original patch,
this path assume modulo/divide operation is slow, so use multiplication
with invert values.
I see the following performance improvements using fmod benchtests
(result only show the 'mean' result):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 19.1584 | 12.5049
x86_64 (Ryzen 9) | normal | 1016.51 | 296.939
x86_64 (Ryzen 9) | close-exponents | 18.4428 | 16.0244
aarch64 (N1) | subnormal | 11.153 | 6.81778
aarch64 (N1) | normal | 528.649 | 155.62
aarch64 (N1) | close-exponents | 11.4517 | 8.21306
I also see similar improvements on arm-linux-gnueabihf when running on
the N1 aarch64 chips, where it a lot of soft-fp implementation (for
modulo, clz, ctz, and multiplication):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
armhf (N1) | subnormal | 15.908 | 15.1083
armhf (N1) | normal | 837.525 | 244.833
armhf (N1) | close-exponents | 16.2111 | 21.8182
Instead of using the math_private.h definitions, I used the
math_config.h instead which is used on newer math implementations.
Co-authored-by: kirill <kirill.okhotnikov@gmail.com>
[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Adhemerval Zanella Netto [Mon, 20 Mar 2023 16:01:15 +0000 (13:01 -0300)]
benchtests: Add fmodf benchmark
1. Subnormals: 128 inputs.
2. Normal numbers with large exponent difference (|x/y| > 2^8):
1024 inputs between FLT_MIN and FLT_MAX;
3. Close exponents (ey >= -103 and |x/y| < 2^8): 1024 inputs with
exponents between -10 and 10.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Adhemerval Zanella Netto [Mon, 20 Mar 2023 16:01:14 +0000 (13:01 -0300)]
benchtests: Add fmod benchmark
Add three different dataset, from random floating point numbers:
1. Subnormals: 128 inputs.
2. Normal numbers with large exponent difference (|x/y| > 2^52):
1024 inputs between DBL_MIN and DBL_MAX;
3. Close exponents (ey >= -907 and |x/y| < 2^52): 1024 inputs with
exponents between -10 and 10.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
H.J. Lu [Thu, 16 Mar 2023 00:42:54 +0000 (17:42 -0700)]
x86: Set FSGSBASE to active if enabled by kernel
Linux kernel uses AT_HWCAP2 to indicate if FSGSBASE instructions are
enabled. If the HWCAP2_FSGSBASE bit in AT_HWCAP2 is set, FSGSBASE
instructions can be used in user space. Define dl_check_hwcap2 to set
the FSGSBASE feature to active on Linux when the HWCAP2_FSGSBASE bit is
set.
Add a test to verify that FSGSBASE is active on current kernels.
NB: This test will fail if the kernel doesn't set the HWCAP2_FSGSBASE
bit in AT_HWCAP2 while fsgsbase shows up in /proc/cpuinfo.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Florian Weimer [Mon, 3 Apr 2023 15:23:11 +0000 (17:23 +0200)]
x86_64: Fix asm constraints in feraiseexcept (bug 30305)
The divss instruction clobbers its first argument, and the constraints
need to reflect that. Fortunately, with GCC 12, generated code does
not actually change, so there is no externally visible bug.
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Siddhesh Poyarekar [Mon, 3 Apr 2023 14:20:04 +0000 (10:20 -0400)]
manual: Document __wur usage under _FORTIFY_SOURCE
The __warn_unused_result__ attribute is only enabled when fortification
is enabled. Mention that in the document. The rationale for this is
essentially to mitigate against CWE-252:
[1] https://cwe.mitre.org/data/definitions/252.html
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:14 +0000 (18:10 +0300)]
hurd: Microoptimize _hurd_self_sigstate ()
When THREAD_GETMEM is defined with inline assembly, the compiler may not
optimize away the two reads of _hurd_sigstate. Help it out a little bit
by only reading it once. This also makes for a slightly cleaner code.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-32-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:12 +0000 (18:10 +0300)]
hurd: Add vm_param.h for x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-30-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:11 +0000 (18:10 +0300)]
hurd: Implement _hurd_longjmp_thread_state for x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-29-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:05 +0000 (18:10 +0300)]
htl: Implement thread_set_pcsptp for x86_64
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-23-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:04 +0000 (18:10 +0300)]
x86_64: Add rtld-stpncpy & rtld-strncpy
Just like the other existing rtld-str* files, this provides rtld with
usable versions of stpncpy and strncpy.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-22-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:03 +0000 (18:10 +0300)]
htl: Add tcb-offsets.sym for x86_64
The source code is the same as sysdeps/i386/htl/tcb-offsets.sym, but of
course the produced tcb-offsets.h will be different.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-21-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:10:02 +0000 (18:10 +0300)]
hurd: Move a couple of signal-related files to x86
These do not need any changes to be used on x86_64.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-20-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:58 +0000 (18:09 +0300)]
hurd: Use uintptr_t for register values in trampoline.c
This is more correct, if only because these fields are defined as having
the type unsigned int in the Mach headers, so casting them to a signed
int and then back is suboptimal.
Also, remove an extra reassignment of uesp -- this is another remnant of
the ecx kludge.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-16-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:57 +0000 (18:09 +0300)]
hurd: Move rtld-strncpy-c.c out of mach/hurd/
There's nothing Mach- or Hurd-specific about it; any port that ends
up with rtld pulling in strncpy will need this.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-15-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:55 +0000 (18:09 +0300)]
hurd: More 64-bit integer casting fixes
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-13-bugaevc@gmail.com>
Sergey Bugaev [Sun, 19 Mar 2023 15:09:54 +0000 (18:09 +0300)]
mach, hurd: Drop __libc_lock_self0
This was used for the value of libc-lock's owner when TLS is not yet set
up, so THREAD_SELF can not be used. Since the value need not be anything
specific -- it just has to be non-NULL -- we can just use a plain
constant, such as (void *) 1, for this. This avoids accessing the symbol
through GOT, and exporting it from libc.so in the first place.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <
20230319151017.531737-12-bugaevc@gmail.com>