review.tizen.org Git - platform/upstream/glibc.git/log

Khem Raj [Thu, 2 Dec 2021 07:13:13 +0000 (23:13 -0800)]

intl: Emit no lines in bison generated files

Improve reproducibility:
Do not put any #line preprocessor commands in bison generated files.
These lines contain absolute paths containing file locations on
the host build machine.
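
To illustrate why such directives hurt reproducibility, here is a minimal C sketch. The path `/home/builder/work/intl/plural.y` is a made-up host path and the helper names are hypothetical; the point is only that whatever path follows `#line` becomes the value of `__FILE__` for the code after it.

```c
/* Hypothetical illustration: a parser generator emits a directive like
   the one below so that diagnostics point back to the grammar file.
   The path is invented for this example.  */
#line 100 "/home/builder/work/intl/plural.y"
static const char *reported_file (void) { return __FILE__; }
static int reported_line (void) { return __LINE__; }
```

Here `reported_file ()` returns the invented absolute path, so two builds made from different directories would embed different strings wherever `__FILE__` is used.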

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Samuel Thibault [Tue, 14 Dec 2021 07:38:05 +0000 (08:38 +0100)]

hurd: Do not set PIE_UNSUPPORTED

This is now supported.

H.J. Lu [Tue, 14 Dec 2021 00:33:57 +0000 (16:33 -0800)]

NEWS: Move LD_PREFER_MAP_32BIT_EXEC

Move LD_PREFER_MAP_32BIT_EXEC to

Deprecated and removed features, and other changes affecting compatibility:

Samuel Thibault [Tue, 14 Dec 2021 00:01:48 +0000 (01:01 +0100)]

mach: Fix spurious inclusion of stack_chk_fail_local in libmachuser.a

When linking programs statically, stack_chk_fail_local already comes
from libc_nonshared, so we don't need it in lib{mach,hurd}user.a.

H.J. Lu [Wed, 8 Dec 2021 15:02:27 +0000 (07:02 -0800)]

Disable DT_RUNPATH on NSS tests [BZ #28455]

The glibc internal NSS functions should always load NSS modules from
the system. For testing purposes, disable DT_RUNPATH on NSS tests so
that the glibc internal NSS functions can load testing NSS modules
via DT_RPATH.

This partially fixes BZ #28455.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Akila Welihinda [Sun, 12 Dec 2021 18:35:03 +0000 (10:35 -0800)]

sysdeps: Simplify sin Taylor Series calculation

The macro TAYLOR_SIN adds the term `-0.5*da*a^2 + da` in hopes
of regaining some precision as a function of da. However the
comment says we add the term `-0.5*da*a^2 + 0.5*da` which is
different. This fix updates the comment to reflect the
code and also simplifies the calculation by replacing `a` with `x`
because they always have the same value.
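
The corrected comment can be motivated as follows. This is a sketch of the reasoning rather than text from the commit, assuming `da` is a small low-order part of the argument so that terms of second order in `da` are negligible:

```latex
\sin(a + da) \approx \sin(a) + da\,\cos(a)
             \approx \sin(a) + da\left(1 - \tfrac{1}{2}a^{2}\right)
             = \sin(a) + \left(-\tfrac{1}{2}\,da\,a^{2} + da\right)
```

The last term has the `-0.5*da*a^2 + da` form used by the code, not the `+ 0.5*da` form from the old comment.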

Signed-off-by: Akila Welihinda <akilawelihinda@ucla.edu>
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>

Adhemerval Zanella [Tue, 6 Apr 2021 17:33:14 +0000 (14:33 -0300)]

math: Remove the error handling wrapper from hypot and hypotf

The error handling is moved to the sysdeps/ieee754 version with no SVID
support. The compatibility symbol versions still use the wrapper with
SVID error handling around the new code. There is no new symbol version
nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).

Only ia64 is unchanged, since it still uses the arch-specific
__libm_error_region in its implementation.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Wilco Dijkstra [Wed, 1 Dec 2021 14:08:14 +0000 (11:08 -0300)]

math: Use fmin/fmax on hypot

It optimizes for architectures that provide fast builtins.

Checked on aarch64-linux-gnu.

Adhemerval Zanella [Wed, 1 Dec 2021 13:57:32 +0000 (10:57 -0300)]

aarch64: Add math-use-builtins-f{max,min}.h

It allows removing the arch-specific implementations.

Adhemerval Zanella [Wed, 1 Dec 2021 13:44:58 +0000 (10:44 -0300)]

math: Add math-use-builtins-fmin.h

It allows the architecture to use the builtin instead of generic
implementation.

Adhemerval Zanella [Wed, 1 Dec 2021 13:37:44 +0000 (10:37 -0300)]

math: Add math-use-builtins-fmax.h

It allows the architecture to use the builtin instead of generic
implementation.
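
A minimal sketch of how such a per-architecture switch could select between the compiler builtin and a generic fallback. The macro name `USE_FMAX_BUILTIN`, its default of 0, and the function name `sketch_fmax` are assumptions made for illustration; the fallback is simplified and ignores signaling NaNs and the sign of zero.

```c
#include <math.h>

/* Assumed convention: a per-architecture header sets this to 1 when the
   compiler builtin expands to a fast instruction.  The macro name and
   the default of 0 are illustrative assumptions.  */
#ifndef USE_FMAX_BUILTIN
# define USE_FMAX_BUILTIN 0
#endif

/* Hypothetical wrapper, not the library's implementation.  */
static double
sketch_fmax (double x, double y)
{
#if USE_FMAX_BUILTIN
  /* Let the compiler emit the architecture's max instruction.  */
  return __builtin_fmax (x, y);
#else
  /* Simplified generic path: if one operand is NaN, return the other.  */
  if (isnan (x))
    return y;
  if (isnan (y))
    return x;
  return x > y ? x : y;
#endif
}
```

Keeping the choice in one macro means an architecture opts in by overriding a single header instead of carrying its own source file.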

Adhemerval Zanella [Sun, 4 Apr 2021 02:52:45 +0000 (23:52 -0300)]

math: Remove powerpc e_hypot

The generic implementation shows only slightly worse performance:

POWER10    reciprocal-throughput    latency
master                   8.28478    13.7253
new hypot                7.21945    13.1933

POWER9     reciprocal-throughput    latency
master                   13.4024    14.0967
new hypot                14.8479    15.8061

POWER8     reciprocal-throughput    latency
master                   15.5767    16.8885
new hypot                16.5371    18.4057

One way to improve this might be to make gcc generate xsmaxdp/xsmindp for
fmax/fmin (it only does so for -ffast-math, while clang does so with
default options).

Checked on powerpc64-linux-gnu (power8) and powerpc64le-linux-gnu
(power9).

Adhemerval Zanella [Tue, 6 Apr 2021 15:32:06 +0000 (12:32 -0300)]

i386: Move hypot implementation to C

The generic hypotf is slightly slower, mostly due to the tricks the
assembly uses to optimize isinf/isnan/issignaling. The generic hypot is
much slower, since the optimized implementation relies on the i386 default
excess precision to issue the operation directly. A similar
implementation is provided instead of using the generic implementation.

Checked on i686-linux-gnu.

Adhemerval Zanella [Tue, 6 Apr 2021 02:55:55 +0000 (23:55 -0300)]

math: Use an improved algorithm for hypotl (ldbl-128)

This implementation is based on 'An Improved Algorithm for hypot(a,b)'
by Carlos F. Borges [1], using MyHypot3 with the following changes:

  - Handle qNaN and sNaN.
  - Tune the 'widely varying operands' case to avoid spurious underflow
    due to the multiplication and fix the return value for upwards
    rounding mode.
  - Handle required underflow exception for subnormal results.
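
The 'widely varying operands' case can be motivated as follows. This is a sketch of the reasoning, assuming $x \ge y > 0$ and writing $\varepsilon$ for the unit roundoff of the format; the exact threshold used by the implementation is not given in this message.

```latex
\sqrt{x^{2} + y^{2}} = x\sqrt{1 + (y/x)^{2}}
                     = x\left(1 + \tfrac{1}{2}(y/x)^{2} - \cdots\right)
```

When $(y/x)^{2}$ is far below $\varepsilon$, the result rounds to $x$ in round-to-nearest, so $y \cdot y$ need not be formed at all. Since $y^{2}$ can underflow even when the final result is a normal number, skipping the multiplication avoids a spurious underflow exception. The exact result is still slightly larger than $x$, which presumably is why the return value needs care under upward rounding.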

The main advantage of the new algorithm is its precision.  With
1e9 random input pairs in the range [LDBL_MIN, LDBL_MAX], the current
glibc implementation shows around 0.05% of results with an error of
1 ulp (453266 results), while the new implementation shows only
0.0001% of the total (1280).

Checked on aarch64-linux-gnu and x86_64-linux-gnu.

[1] https://arxiv.org/pdf/1904.09481.pdf

Adhemerval Zanella [Mon, 5 Apr 2021 20:28:48 +0000 (17:28 -0300)]

math: Use an improved algorithm for hypotl (ldbl-96)

This implementation is based on 'An Improved Algorithm for hypot(a,b)'
by Carlos F. Borges [1], using MyHypot3 with the following changes:

- Handle qNaN and sNaN.
- Tune the 'widely varying operands' case to avoid spurious underflow
   due to the multiplication and fix the return value for upwards
   rounding mode.
- Handle required underflow exception for subnormal results.

The main advantage of the new algorithm is its precision.  With
1e8 random input pairs in the range [LDBL_MIN, LDBL_MAX], the current
glibc implementation shows around 0.02% of results with an error of
1 ulp (23158 results), while the new implementation shows only
0.0001% of the total (111).

[1] https://arxiv.org/pdf/1904.09481.pdf

Wilco Dijkstra [Tue, 30 Nov 2021 19:29:25 +0000 (16:29 -0300)]

math: Improve hypot performance with FMA

Improve hypot performance significantly by using fma when available. The
fma version has twice the throughput of the previous version and 70% of
the latency. The non-fma version has 30% higher throughput and 10%
higher latency.

Max ULP error is 0.949 with fma and 0.792 without fma.

Passes GLIBC testsuite.

Wilco Dijkstra [Mon, 8 Mar 2021 20:07:39 +0000 (17:07 -0300)]

math: Use an improved algorithm for hypot (dbl-64)

This implementation is based on 'An Improved Algorithm for
hypot(a,b)' by Carlos F. Borges [1], using MyHypot3 with the
following changes:

- Handle qNaN and sNaN.
- Tune the 'widely varying operands' case to avoid spurious underflow
   due to the multiplication and fix the return value for upwards
   rounding mode.
- Handle required underflow exception for denormal results.

The main advantage of the new algorithm is its precision: with
1e9 random input pairs in the range [DBL_MIN, DBL_MAX], the current
glibc implementation shows around 0.34% of results with an error of
1 ulp (3424869 results), while the new implementation shows only
0.002% of the total (18851).

The performance results are also only slightly worse than the current
implementation.  On x86_64 (Ryzen 5900X) with gcc 12:

Before:

  "hypot": {
   "workload-random": {
    "duration": 3.73319e+09,
    "iterations": 1.12e+08,
    "reciprocal-throughput": 22.8737,
    "latency": 43.7904,
    "max-throughput": 4.37184e+07,
    "min-throughput": 2.28361e+07
   }
  }

After:

  "hypot": {
   "workload-random": {
    "duration": 3.7597e+09,
    "iterations": 9.8e+07,
    "reciprocal-throughput": 23.7547,
    "latency": 52.9739,
    "max-throughput": 4.2097e+07,
    "min-throughput": 1.88772e+07
   }
  }

Co-Authored-By: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Checked on x86_64-linux-gnu and aarch64-linux-gnu.

[1] https://arxiv.org/pdf/1904.09481.pdf

Adhemerval Zanella [Mon, 5 Apr 2021 17:49:47 +0000 (14:49 -0300)]

math: Simplify hypotf implementation

Use a more optimized comparison to check for NaN and infinity, and
add an inlined issignaling implementation for float. With gcc it
results in 2 FP comparisons.
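
A minimal sketch of what an inlined signaling-NaN test for float can look like. It assumes IEEE 754 binary32 with the common convention that the top mantissa bit marks a quiet NaN (some targets, such as hppa, use the opposite convention). The function names are hypothetical and the actual helper in the tree may be written differently.

```c
#include <stdint.h>
#include <string.h>

/* Classify a raw binary32 bit pattern.  Assumes the quiet bit is the top
   mantissa bit (bit 22) and that it is SET for quiet NaNs.  */
static int
is_signaling_bits (uint32_t bits)
{
  uint32_t abs_bits = bits & 0x7fffffffu;       /* drop the sign bit */
  int is_nan = abs_bits > 0x7f800000u;          /* exponent all ones, mantissa != 0 */
  int is_quiet = (abs_bits & 0x00400000u) != 0; /* quiet bit set */
  return is_nan && !is_quiet;
}

/* Hypothetical float front end: copy the bits out with memcpy so no
   floating-point operation can quiet the NaN or raise an exception.  */
static int
sketch_issignalingf (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  return is_signaling_bits (bits);
}
```

Working on the integer representation keeps the check free of floating-point side effects, which matters because most arithmetic on a signaling NaN raises the invalid exception.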

The file copyright is also changed to use GPL: the implementation was
completely changed by 7c10fd3515f to use double precision instead of
scaling, and this change removes all GET_FLOAT_WORD usage.

Checked on x86_64-linux-gnu.

Siddhesh Poyarekar [Mon, 13 Dec 2021 04:31:45 +0000 (10:01 +0530)]

Cleanup encoding in comments

Replace non-UTF-8 and non-ASCII characters in comments with their UTF-8
equivalents so that files don't end up with mixed encodings. With this,
all files (except tests that actually test different encodings) have a
single encoding.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

Siddhesh Poyarekar [Wed, 8 Dec 2021 05:51:26 +0000 (11:21 +0530)]

Replace --enable-static-pie with --disable-default-pie

Build glibc programs and tests as PIE by default and enable static-pie
automatically if the architecture and toolchain support it.

Also add a new configuration option --disable-default-pie to prevent
building programs as PIE.
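
As a small aid for checking what a given compiler invocation produces, the sketch below reports whether a translation unit was compiled as PIE code. It relies on the `__PIE__` macro that GCC and Clang predefine under -fPIE/-fpie; the function name is hypothetical, and the result reflects the compiler flags in effect, not the glibc configure option itself.

```c
/* Returns 1 when this translation unit was compiled as position-independent
   executable code (GCC and Clang predefine __PIE__ for -fPIE/-fpie),
   otherwise 0.  Hypothetical helper for illustration only.  */
static int
compiled_as_pie (void)
{
#ifdef __PIE__
  return 1;
#else
  return 0;
#endif
}
```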

Only the following architectures now have PIE disabled by default
because it does not work on them at the moment.  hppa, ia64, alpha and csky
don't work because the linker is unable to handle a pcrel relocation
generated from PIE objects.  The microblaze compiler is currently
failing with an ICE.  GNU hurd tries to enable static-pie, which does
not work and hence fails.  All these targets have default PIE disabled
at the moment and I have left it to the target maintainers to enable PIE
on their targets.

build-many-glibcs runs clean for all targets.  I also tested x86_64 on
Fedora and Ubuntu, to verify that the default build as well as
--disable-default-pie work as expected with both system toolchains.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Samuel Thibault [Sat, 11 Dec 2021 22:08:32 +0000 (23:08 +0100)]

hurd: Add rules for static PIE build

This fixes [BZ #28671].

Samuel Thibault [Sat, 11 Dec 2021 23:41:38 +0000 (00:41 +0100)]

hurd: Fix gmon-static

We need to use crt0 for gmon-static too.

H.J. Lu [Fri, 10 Dec 2021 21:00:09 +0000 (13:00 -0800)]

x86-64: Remove LD_PREFER_MAP_32BIT_EXEC support [BZ #28656]

Remove the LD_PREFER_MAP_32BIT_EXEC environment variable support since
the first PT_LOAD segment is no longer executable due to defaulting to
-z separate-code.

This fixes [BZ #28656].

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Florian Weimer [Fri, 10 Dec 2021 20:34:30 +0000 (21:34 +0100)]

elf: Use errcode instead of (unset) errno in rtld_chain_load

H.J. Lu [Thu, 9 Dec 2021 15:01:33 +0000 (07:01 -0800)]

Add a testcase to check alignment of PT_LOAD segment [BZ #28676]

Rongwei Wang [Fri, 10 Dec 2021 12:39:10 +0000 (20:39 +0800)]

elf: Properly align PT_LOAD segments [BZ #28676]

When PT_LOAD segment alignment > the page size, allocate enough space to
ensure that the segment can be properly aligned. This change makes it
simpler for code segments to use huge pages.
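
The general technique of over-allocating and then trimming can be sketched as follows. This is an illustration, not the loader's actual code: `align_up` and `map_aligned` are hypothetical names, and the sketch assumes `align` is a power of two and that both `len` and `align` are multiples of the page size.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Round addr up to the next multiple of align (a power of two).  */
static uintptr_t
align_up (uintptr_t addr, uintptr_t align)
{
  return (addr + align - 1) & ~(align - 1);
}

/* Hypothetical helper: reserve len bytes whose start is aligned to align.
   Returns MAP_FAILED on error.  */
static void *
map_aligned (size_t len, size_t align)
{
  /* Reserve enough extra space that an aligned start must exist.  */
  size_t total = len + align;
  unsigned char *raw = mmap (NULL, total, PROT_NONE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (raw == MAP_FAILED)
    return MAP_FAILED;

  unsigned char *start = (unsigned char *) align_up ((uintptr_t) raw, align);
  size_t head = (size_t) (start - raw);
  size_t tail = total - head - len;

  /* Give back the unused pages before and after the aligned window.  */
  if (head != 0)
    munmap (raw, head);
  if (tail != 0)
    munmap (start + len, tail);
  return start;
}
```

A caller would then map the real segment contents over the reserved window, for example with MAP_FIXED, knowing the start address satisfies the requested alignment.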

This fixes [BZ #28676].

Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>

Florian Weimer [Fri, 10 Dec 2021 15:06:36 +0000 (16:06 +0100)]

elf: Install a symbolic link to ld.so as /usr/bin/ld.so

This makes ld.so features such as --preload, --audit,
and --list-diagnostics more accessible to end users because they
do not need to know the ABI name of the dynamic loader.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Florian Weimer [Fri, 10 Dec 2021 04:14:24 +0000 (05:14 +0100)]

nptl: Add one more barrier to nptl/tst-create1

Without the bar_ctor_finish barrier, it was possible that thread2
re-locked user_lock before ctor had a chance to lock it. ctor then
blocked in its locking operation, xdlopen from the main thread
did not return, and thread2 was stuck waiting in bar_dtor:

thread 1: started.
thread 2: started.
thread 2: locked user_lock.
constructor started: 0.
thread 1: in ctor: started.
thread 3: started.
thread 3: done.
thread 2: unlocked user_lock.
thread 2: locked user_lock.

Fixes the test in commit 83b5323261bb72313bffcf37476c1b8f0847c736
("elf: Avoid deadlock between pthread_create and ctors [BZ #28357]").

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

Florian Weimer [Thu, 9 Dec 2021 16:57:11 +0000 (17:57 +0100)]

Remove TLS_TCB_ALIGN and TLS_INIT_TCB_ALIGN

TLS_INIT_TCB_ALIGN is not actually used.  TLS_TCB_ALIGN was likely
introduced to support a configuration where the thread pointer
does not have the same alignment as THREAD_SELF.  Only ia64 seems to use
that, but for the stack/pointer guard, not for storing tcbhead_t.
Some ports use TLS_TCB_OFFSET and TLS_PRE_TCB_SIZE to shift
the thread pointer, potentially landing in a different residue class
modulo the alignment, but the changes should not impact that.

In general, given that TLS variables have their own alignment
requirements, having different alignment for the (unshifted) thread
pointer and struct pthread would potentially result in dynamic
offsets, leading to more complexity.

hppa had different values before: __alignof__ (tcbhead_t), which
seems to be 4, and __alignof__ (struct pthread), which was 8
(old default) and is now 32.  However, it defines THREAD_SELF as:

/* Return the thread descriptor for the current thread.  */
# define THREAD_SELF \
  ({ struct pthread *__self; \
     __self = __get_cr27(); \
     __self - 1; \
   })

So the thread pointer points after struct pthread (hence __self - 1),
and they have to have the same alignment on hppa as well.
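
The alignment argument can be checked with a small stand-alone sketch. `struct fake_pthread`, its size of 96 bytes, and the helper names are stand-ins invented for this example; only the `__self - 1` pattern is taken from the macro above.

```c
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pthread with a 32-byte alignment.  */
struct fake_pthread
{
  alignas (32) unsigned char storage[96];
};

/* sizeof is always a multiple of alignof, so stepping back by one object
   preserves whatever alignment the thread pointer has.  */
_Static_assert (sizeof (struct fake_pthread) % alignof (struct fake_pthread) == 0,
                "size must be a multiple of the alignment");

/* Mirrors the "__self - 1" step: the thread pointer points just past
   the descriptor.  */
static struct fake_pthread *
self_from_thread_pointer (void *tp)
{
  return (struct fake_pthread *) tp - 1;
}

/* Nonzero if p is aligned to align (a power of two).  */
static int
is_aligned (const void *p, uintptr_t align)
{
  return ((uintptr_t) p & (align - 1)) == 0;
}
```

Because the descriptor sits exactly `sizeof (struct fake_pthread)` bytes below the thread pointer, the descriptor is correctly aligned only if the thread pointer itself has the descriptor's alignment.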

Similarly, on ia64, the definitions were different.  We have:

# define TLS_PRE_TCB_SIZE \
  (sizeof (struct pthread) \
   + (PTHREAD_STRUCT_END_PADDING < 2 * sizeof (uintptr_t) \
      ? ((2 * sizeof (uintptr_t) + __alignof__ (struct pthread) - 1) \
           & ~(__alignof__ (struct pthread) - 1)) \
      : 0))
# define THREAD_SELF \
  ((struct pthread *) ((char *) __thread_self - TLS_PRE_TCB_SIZE))

And TLS_PRE_TCB_SIZE is a multiple of the struct pthread alignment
(confirmed by the new _Static_assert in sysdeps/ia64/libc-tls.c).

On m68k, we have a larger gap between tcbhead_t and struct pthread.
But as far as I can tell, the port is fine with that.  The definition
of TCB_OFFSET is sufficient to handle the shifted TCB scenario.

This fixes commit 23c77f60181eb549f11ec2f913b4270af29eee38
("nptl: Increase default TCB alignment to 32").

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
