Jakub Jelinek [Mon, 7 Feb 2022 16:39:11 +0000 (17:39 +0100)]
testsuite: Fix up testsuite/gcc.c-torture/execute/builtins/lib/chk.c for powerpc [PR104380]
> > The following testcase FAILs when configured with
> > --with-long-double-format=ieee . Only happens in the -std=c* modes, not the
> > GNU modes; while the glibc headers have __asm redirects of
> > vsnprintf and __vsnprintf_chk to __vsnprintfieee128 and
> > __vsnprintf_chkieee128, the vsnprintf fortification extern inline gnu_inline
> > always_inline wrapper calls __builtin_vsnprintf_chk and we actually emit
> > a call to __vsnprintf_chk (i.e. with IBM extended long double) instead of
> > __vsnprintf_chkieee128.
> >
> > rs6000_mangle_decl_assembler_name already had cases for *printf and *scanf,
> > so this just adds another case for *printf_chk. *scanf_chk doesn't exist.
> > __ prefixing isn't done because *printf_chk already starts with __.
Unfortunately, while I've tested the testcase also with -mabi=ieeelongdouble
by hand, the full bootstrap/regtest was on GCCFarm where glibc is too old
to test with --with-long-double-format=ieee.
I've done full bootstrap/regtest with that option during the weekend and
the patch regressed:
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution, -O1
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution, -O2
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution, -O2 -flto -fno-use-linker-plugin -flto-partition=none
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution, -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution, -O3 -g
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution, -Og -g
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution, -O1
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution, -O2
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution, -O2 -flto -fno-use-linker-plugin -flto-partition=none
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution, -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution, -O3 -g
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution, -Og -g
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution, -O1
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution, -O2
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution, -O2 -flto -fno-use-linker-plugin -flto-partition=none
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution, -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution, -O3 -g
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution, -Og -g
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution, -O1
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution, -O2
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution, -O2 -flto -fno-use-linker-plugin -flto-partition=none
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution, -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution, -O3 -g
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution, -Og -g
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution, -Os
The problem is that the execute/builtins/ testsuite wants to override some
of the library functions and with the change we (correctly) call
__*printf_chkieee128 and so lib/chk.c is no longer called but the glibc
APIs are.
2022-02-07 Jakub Jelinek <jakub@redhat.com>
PR target/104380
* gcc.c-torture/execute/builtins/lib/chk.c (__sprintf_chkieee128,
__vsprintf_chkieee128, __snprintf_chkieee128,
__vsnprintf_chkieee128): New aliases to non-ieee128 suffixed functions
for powerpc -mabi=ieeelongdouble.
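A minimal sketch of the aliasing technique named in the ChangeLog entry above, assuming an ELF target and a compiler that supports GCC's alias attribute. The names demo_sprintf_chk and demo_sprintf_chkieee128 are hypothetical stand-ins, not the functions in lib/chk.c, and the body is a placeholder.

```c
/* Hypothetical stand-in for a checking function that the test library
   overrides.  The real functions in lib/chk.c have different signatures. */
int demo_sprintf_chk(int flag)
{
    return flag + 1;
}

/* Export the same definition under a second symbol name, so a call that
   the compiler redirects to the *ieee128 name still reaches the override
   instead of the C library.  Assumes ELF and the GCC alias attribute. */
extern __typeof__(demo_sprintf_chk) demo_sprintf_chkieee128
    __attribute__((alias("demo_sprintf_chk")));
```

Both names resolve to the same address, so either spelling of the call lands in the test library's definition.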
Tamar Christina [Mon, 7 Feb 2022 12:55:12 +0000 (12:55 +0000)]
AArch32: correct usdot-product RTL patterns.
There was a bug in the ACLE specification for dot product which has now
been fixed[1]. This means some intrinsics were missing and are added by this
patch.
Bootstrapped and regtested on arm-none-linux-gnueabihf and no issues.
Ok for master?
[1] https://github.com/ARM-software/acle/releases/tag/r2021Q3
gcc/ChangeLog:
* config/arm/arm_neon.h (vusdotq_s32, vusdot_laneq_s32,
vusdotq_laneq_s32, vsudot_laneq_s32, vsudotq_laneq_s32): New.
* config/arm/arm_neon_builtins.def (usdot): Add V16QI.
(usdot_laneq, sudot_laneq): New.
* config/arm/neon.md (neon_<sup>dot_laneq<vsi2qi>): New.
(neon_<sup>dot_lane<vsi2qi>): Remove unneeded code.
gcc/testsuite/ChangeLog:
* gcc.target/arm/simd/vdot-2-1.c: Add new tests.
* gcc.target/arm/simd/vdot-2-2.c: Likewise and fix output.
Tamar Christina [Mon, 7 Feb 2022 12:54:42 +0000 (12:54 +0000)]
AArch32: correct dot-product RTL patterns.
The previous fix for this problem was wrong due to a subtle difference between
where NEON expects the RMW values and where intrinsics expects them.
The insn pattern is modeled after the intrinsics and so needs an expand for
the vectorizer optab to switch the RTL.
However operand[3] is not expected to be written to so the current pattern is
bogus.
Instead we use the expand to shuffle around the RTL.
The vectorizer expects operands[3] and operands[0] to be
the same but the aarch64 intrinsics expanders expect operands[0] and
operands[1] to be the same.
This also fixes some issues with big-endian: each dot product performs four
8-bit multiplications. However, unlike on AArch64, on AArch32 we don't use GCC
lane indexing aside from loads/stores. This means no lane remappings
are done in arm-builtins.c and so none should be done at the instruction side.
There are some other instructions that need inspections as I think there are
more incorrect ones.
Third, there was a bug in the ACLE specification for dot product which has now been
fixed[1]. This means some intrinsics were missing and are added by this patch.
Bootstrapped and regtested on arm-none-linux-gnueabihf and no issues.
Ok for master, and for active branches after some stew?
[1] https://github.com/ARM-software/acle/releases/tag/r2021Q3
gcc/ChangeLog:
* config/arm/arm_neon.h (vdot_laneq_u32, vdotq_laneq_u32,
vdot_laneq_s32, vdotq_laneq_s32): New.
* config/arm/arm_neon_builtins.def (sdot_laneq, udot_laneq): New.
* config/arm/neon.md (neon_<sup>dot<vsi2qi>): New.
(<sup>dot_prod<vsi2qi>): Re-order rtl.
(neon_<sup>dot_lane<vsi2qi>): Fix rtl order and endianness.
(neon_<sup>dot_laneq<vsi2qi>): New.
gcc/testsuite/ChangeLog:
* gcc.target/arm/simd/vdot-compile.c: Add new cases.
* gcc.target/arm/simd/vdot-exec.c: Likewise.
Andreas Krebbel [Sun, 6 Feb 2022 08:07:41 +0000 (09:07 +0100)]
Check always_inline flag in s390_can_inline_p [PR104327]
MASK_MVCLE is set for -Os but not for other optimization levels. In
general it should not make much sense to inline across calls where the
flag is different but we have to allow it for always_inline.
The patch also rearranges the hook implementation a bit based on the
recommendations from Jakub and Martin in the PR.
Bootstrapped and regression tested on s390x with various arch flags.
Will commit after giving a few days for comments.
gcc/ChangeLog:
PR target/104327
* config/s390/s390.cc (s390_can_inline_p): Accept a few more flags
if always_inline is set. Don't inline when tune differs without
always_inline.
gcc/testsuite/ChangeLog:
PR target/104327
* gcc.c-torture/compile/pr104327.c: New test.
Richard Biener [Mon, 7 Feb 2022 08:31:07 +0000 (09:31 +0100)]
middle-end/104402 - split out _Complex compares from COND_EXPRs
This makes sure we always have a _Complex compare split to a
different stmt for the compare operand in a COND_EXPR on GIMPLE.
Complex lowering doesn't handle this and the change is something
we want for all kinds of compares at some point.
2022-02-07 Richard Biener <rguenther@suse.de>
PR middle-end/104402
* gimple-expr.cc (is_gimple_condexpr): _Complex typed
compares are not valid.
* tree-cfg.cc (verify_gimple_assign_ternary): For COND_EXPR
check is_gimple_condexpr.
* gcc.dg/torture/pr104402.c: New testcase.
Kewen Lin [Mon, 7 Feb 2022 03:30:02 +0000 (21:30 -0600)]
rs6000: Move the hunk affecting VSX/ALTIVEC ahead [PR103627]
The modified hunk can update the VSX and ALTIVEC flags, and we have some code
that checks and warns about flags related to VSX and ALTIVEC sitting where
the hunk is proposed to be moved to. Without this adjustment, the
VSX and ALTIVEC update happens too late, which can cause incompatibility
and result in unexpected behavior; the associated test case is one
typical case.
Since we already have code that sets TARGET_FLOAT128_TYPE and lies
after the new location, and OPTION_MASK_FLOAT128_KEYWORD relies on
TARGET_FLOAT128_TYPE, the patch simply removes them.
gcc/ChangeLog:
PR target/103627
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Move the
hunk affecting VSX and ALTIVEC to appropriate place.
gcc/testsuite/ChangeLog:
PR target/103627
* gcc.target/powerpc/pr103627-3.c: New test.
Kewen Lin [Mon, 7 Feb 2022 03:29:32 +0000 (21:29 -0600)]
rs6000: Disable MMA if no VSX support [PR103627]
As PR103627 shows, there is an unexpected case where !TARGET_VSX
and TARGET_MMA co-exist. As ISA3.1 claims, SIMD is a requirement
for MMA. By looking into the ICE, I noticed that the current
MMA implementation depends on vector pairs load/store which use
VSX registers, but we don't have a separate option to control
Power10 vector support and Segher pointed out "-mpower9-vector is
a workaround that should go away" and more explanations in [1].
So this patch makes MMA require VSX instead.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589303.html
gcc/ChangeLog:
PR target/103627
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Disable
MMA if !TARGET_VSX.
gcc/testsuite/ChangeLog:
PR target/103627
* gcc.target/powerpc/pr103627-1.c: New test.
* gcc.target/powerpc/pr103627-2.c: New test.
GCC Administrator [Mon, 7 Feb 2022 00:16:17 +0000 (00:16 +0000)]
Daily bump.
Patrick Palka [Sun, 6 Feb 2022 15:47:48 +0000 (10:47 -0500)]
c++: dependent noexcept-spec on defaulted comparison op [PR96242]
Here we're failing to instantiate the defaulted comparison op's
explicit dependent noexcept-spec. The problem is ultimately that
mark_used relies on maybe_instantiate_noexcept to synthesize a defaulted
comparison op, but the relevant DECL_MAYBE_DELETED fn handling in m_i_n
is intended for such functions whose noexcept-spec wasn't explicitly
provided (and is therefore determined via synthesis), so m_i_n just
exits early afterwards, without considering that the synthesized fn may
have an explicit noexcept-spec that needs instantiating.
This patch fixes this issue by making mark_used directly synthesize a
DECL_MAYBE_DELETED fn before calling maybe_instantiate_noexcept. And
in turn, we can properly restrict the DECL_MAYBE_DELETED fn synthesis
in m_i_n to only those without an explicit noexcept-spec.
PR c++/96242
gcc/cp/ChangeLog:
* decl2.cc (mark_used): Directly synthesize a DECL_MAYBE_DELETED
fn by calling maybe_synthesize_method instead of relying on
maybe_instantiate_noexcept. Move call to m_i_n after the
DECL_DELETED_FN handling.
* pt.cc (maybe_instantiate_noexcept): Restrict DECL_MAYBE_DELETED
fn synthesis to only those with an implicit noexcept-spec, and
return !DECL_DELETED_FN instead of !DECL_MAYBE_DELETED afterwards.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/spaceship-synth15.C: New test.
Jakub Jelinek [Sun, 6 Feb 2022 10:16:29 +0000 (11:16 +0100)]
c++: Further address_compare fixes [PR89074]
This patch introduces folding_cxx_constexpr. folding_initializer is used
for both C and C++ initializer/constant expression folding and enables more
optimizations over what we do normally at runtime, while folding_cxx_constexpr
is used only during C++ constant expression folding and disables some optimizations.
The patch improves STRING_CST vs. STRING_CST folding, for folding_initializer
FUNCTION_DECL vs. FUNCTION_DECL folding, disables some optimizations like
is_global_var != is_global_var or STRING_CST vs. DECL_P for folding_cxx_constexpr
etc.
2022-02-06 Jakub Jelinek <jakub@redhat.com>
PR c++/89074
PR c++/104033
* fold-const.h (folding_initializer): Adjust comment.
(folding_cxx_constexpr): Declare.
* fold-const.cc (folding_initializer): Adjust comment.
(folding_cxx_constexpr): New variable.
(address_compare): Restrict the decl vs. STRING_CST
or vice versa or STRING_CST vs. STRING_CST or
is_global_var != is_global_var optimizations to !folding_cxx_constexpr.
Punt for FUNCTION_DECLs with non-zero offsets. If folding_initializer,
assume non-aliased functions have non-zero size and have different
addresses. For folding_cxx_constexpr, punt on comparisons of start
of some object and end of another one, regardless whether it is a decl
or string literal. Also punt for folding_cxx_constexpr on
STRING_CST vs. STRING_CST comparisons if the two literals could be
overlapping.
* constexpr.cc (cxx_eval_binary_expression): Temporarily set
folding_cxx_constexpr.
* g++.dg/cpp1y/constexpr-89074-3.C: New test.
GCC Administrator [Sun, 6 Feb 2022 00:16:21 +0000 (00:16 +0000)]
Daily bump.
Jeff Law [Sat, 5 Feb 2022 17:17:56 +0000 (12:17 -0500)]
Fix expected output for s390 tests
Recent changes in diagnostic outputs have been triggering failures on the s390
testsuite. In particular, capitalization changed in one diagnostic and the
range representation changed in another. This patch makes the obvious updates
to the s390 testsuite.
gcc/testsuite
* gcc.target/s390/20150826-1.c: Update expected output.
* gcc.target/s390/zvector/imm-range-error-1.c: Likewise.
Jakub Jelinek [Sat, 5 Feb 2022 09:52:19 +0000 (10:52 +0100)]
match.pd: Fix x * 0.0 -> 0.0 folding [PR104389]
The recent PR95115 change to punt in const_binop on folding operation
with non-NaN operands into NaN if flag_trapping_math broke the following
testcase, because the x * 0.0 simplification punts just if
x may be a NaN (because NaN * 0.0 is NaN not 0.0) or if one of the operands
could be negative zero. But Inf * 0.0 or -Inf * 0.0 is also NaN, not
0.0, so when NaNs are honored we need to punt for possible infinities too.
2022-02-05 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/104389
* match.pd (x * 0 -> 0): Punt if x may be infinite and NaNs are
honored.
* gcc.dg/pr104389.c: New test.
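A small sketch of the arithmetic behind this fix, assuming IEEE 754 doubles and no -ffast-math style options. The function name times_zero is made up for illustration and is not taken from the new testcase.

```c
#include <math.h>

/* Multiply by 0.0 at run time.  Folding this to a constant 0.0 is only
   valid if x is known to be neither NaN nor infinite (and signed zeros
   do not matter), which is the condition the commit message describes. */
double times_zero(double x)
{
    return x * 0.0;
}
```

With these assumptions, times_zero(INFINITY) and times_zero(-INFINITY) both yield NaN, and times_zero(-2.0) yields negative zero, so none of those inputs may be folded to plain 0.0.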
Kito Cheng [Sat, 5 Feb 2022 09:24:46 +0000 (17:24 +0800)]
RISC-V: Fix detection of zifencei support for binutils
- binutils will complain version info is not found if default ISA spec
is 2.2 for binutils.
Error: cannot find default versions of the ISA extension `zifencei'
gcc/ChangeLog:
* configure.ac: Fix detection for zifencei support.
* configure: Regenerate.
Kito Cheng [Tue, 25 Jan 2022 12:44:04 +0000 (20:44 +0800)]
RISC-V: Always pass -misa-spec to assembler [PR104219]
Add -misa-spec to OPTION_DEFAULT_SPECS to make sure -misa-spec is
always passed to the assembler; that prevents GCC and binutils from
interpreting the ISA string in different ways.
gcc/ChangeLog:
PR target/104219
* config.gcc (riscv*-*-*): Normalize the with_isa_spec value.
(all_defaults): Add isa_spec.
* config/riscv/riscv.h (OPTION_DEFAULT_SPECS): Add isa_spec.
Jason Merrill [Wed, 2 Feb 2022 23:36:41 +0000 (18:36 -0500)]
c++: assignment, aggregate, array [PR104300]
The PR92385 fix meant that we see more VEC_INIT_EXPR outside of INIT_EXPR;
in such cases, we need to wrap them in TARGET_EXPR. I previously fixed
that in build_array_copy; we also need it in process_init_constructor.
After fixing that, I needed to adjust a few places to recognize the
VEC_INIT_EXPR even inside a TARGET_EXPR. And prevent cp_fully_fold_init
from lowering VEC_INIT_EXPR too soon. And handle COMPOUND_EXPR inside
TARGET_EXPR better.
PR c++/104300
PR c++/92385
gcc/cp/ChangeLog:
* cp-tree.h (get_vec_init_expr): New.
(target_expr_needs_replace): New.
* cp-gimplify.cc (cp_gimplify_init_expr): Use it.
(struct cp_fold_data): New.
(cp_fold_r): Only genericize inits at end of fn.
(cp_fold_function): Here.
(cp_fully_fold_init): Not here.
* init.cc (build_vec_init): Use get_vec_init_expr.
* tree.cc (build_vec_init_expr): Likewise.
* typeck2.cc (split_nonconstant_init_1): Likewise.
(process_init_constructor): Wrap VEC_INIT_EXPR in
TARGET_EXPR.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/initlist-array14.C: New test.
Jason Merrill [Thu, 3 Feb 2022 21:23:24 +0000 (16:23 -0500)]
c++: add comment
gcc/cp/ChangeLog:
* pt.cc (iterative_hash_template_arg): Add comment.
Ian Lance Taylor [Tue, 1 Feb 2022 22:44:20 +0000 (14:44 -0800)]
compiler: accept "any" as an alias for "interface{}"
For golang/go#33232
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/382248
GCC Administrator [Sat, 5 Feb 2022 00:16:31 +0000 (00:16 +0000)]
Daily bump.
Jonathan Wakely [Fri, 4 Feb 2022 23:54:17 +0000 (23:54 +0000)]
libstdc++: Fix std::filesystem build failure for Windows
The std::filesystem code needs to use posix::DIR not ::DIR, as that is
an alias for _WDIR on Windows.
libstdc++-v3/ChangeLog:
* src/filesystem/dir-common.h (_Dir_base::openat): Change return
type to use portable posix::DIR alias.
Jonathan Wakely [Fri, 4 Feb 2022 19:42:19 +0000 (19:42 +0000)]
libstdc++: Allow Clang to use <stdatomic.h> before C++23
There is code that only expects to be compiled with clang++ and uses its
<stdatomic.h>, which works because Clang supports the _Atomic specifier
in C++. The addition of <stdatomic.h> to libstdc++ broke this code, as
now it finds the C++ header instead, which is empty for any standard
mode before C++23.
This change allows that code to keep working as before, by forwarding to
clang's <stdatomic.h>.
libstdc++-v3/ChangeLog:
* include/c_compatibility/stdatomic.h [__clang__]: Use
#include_next <stdatomic.h>.
Jonathan Wakely [Fri, 4 Feb 2022 15:23:31 +0000 (15:23 +0000)]
libstdc++: Remove un-implementable noexcept from Filesystem TS operations
LWG 3014 removed these incorrect noexcept specifications from the C++17
std::filesystem operations. They are also incorrect on the experimental
TS versions and should be removed from them too.
libstdc++-v3/ChangeLog:
* include/experimental/bits/fs_ops.h (fs::copy_file): Remove
noexcept.
(fs::create_directories): Likewise.
(fs::remove_all): Likewise.
* src/filesystem/ops.cc (fs::copy_file): Remove noexcept.
(fs::create_directories): Likewise.
(fs::remove_all): Likewise.
Jonathan Wakely [Tue, 1 Feb 2022 22:04:46 +0000 (22:04 +0000)]
libstdc++: Fix filesystem::remove_all races [PR104161]
This fixes the remaining filesystem::remove_all race condition by using
POSIX openat to recurse into sub-directories and using POSIX unlinkat to
remove files. This avoids the remaining race where the directory being
removed is replaced with a symlink after the directory has been opened,
so that the filesystem::remove("subdir/file") resolves to "target/file"
instead, because "subdir" has been removed and replaced with a symlink.
The previous patch only fixed the case where the directory was replaced
with a symlink before we tried to open it, but it still used the full
(potentially compromised) path as an argument to filesystem::remove.
The first part of the fix is to use openat when recursing into a
sub-directory with recursive_directory_iterator. This means that opening
"dir/subdir" uses the file descriptor for "dir", and so is sure to open
"dir/subdir" and not "symlink/subdir". (The previous patch to use
O_NOFOLLOW already ensured we won't open "dir/symlink/" here.)
The second part of the fix is to use unlinkat for the remove_all
operation. Previously we used a directory_iterator to get the name of
each file in a directory and then used filesystem::remove(iter->path())
on that name. This meant that any checks (e.g. O_NOFOLLOW) done by the
iterator could be invalidated before the remove operation on that
pathname. The directory iterator contains an open DIR stream, which we
can use to obtain a file descriptor to pass to unlinkat. This ensures
that the file being deleted really is contained within the directory
we're iterating over, rather than using a pathname that could resolve to
some other file.
The filesystem::remove_all function previously used a (non-recursive)
filesystem::directory_iterator for each directory, and called itself
recursively for sub-directories. The new implementation uses a single
filesystem::recursive_directory_iterator object, and calls a new __erase
member function on that iterator. That new __erase member function does
the actual work of removing a file (or a directory after its contents
have been iterated over and removed) using unlinkat. That means we don't
need to expose the DIR stream or its file descriptor to the remove_all
function, it's still encapsulated by the iterator class.
It would be possible to add a __rewind member to directory iterators
too, to call rewinddir after each modification to the directory. That
would make it more likely for filesystem::remove_all to successfully
remove everything even if files are being written to the directory tree
while removing it. It's unclear if that is actually preferable, or if
it's better to fail and report an error at the first opportunity.
The necessary APIs (openat, unlinkat, fdopendir, dirfd) are defined in
POSIX.1-2008, and in Glibc since 2.10. But if the target doesn't provide
them, the original code (with race conditions) is still used.
This also reduces the number of small memory allocations needed for
std::filesystem::remove_all, because we do not store the full path to
every directory entry that is iterated over. The new filename_only
option means we only store the filename in the directory entry, as that
is all we need in order to use openat or unlinkat.
Finally, rather than duplicating everything for the Filesystem TS, the
std::experimental::filesystem::remove_all implementation now just calls
std::filesystem::remove_all to do the work.
libstdc++-v3/ChangeLog:
PR libstdc++/104161
* acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check for dirfd
and unlinkat.
* config.h.in: Regenerate.
* configure: Regenerate.
* include/bits/fs_dir.h (recursive_directory_iterator): Declare
remove_all overloads as friends.
(recursive_directory_iterator::__erase): Declare new member
function.
* include/bits/fs_fwd.h (remove, remove_all): Declare.
* src/c++17/fs_dir.cc (_Dir): Add filename_only parameter to
constructor. Pass file descriptor argument to base constructor.
(_Dir::dir_and_pathname, _Dir::open_subdir, _Dir::do_unlink)
(_Dir::unlink, _Dir::rmdir): Define new member functions.
(directory_iterator): Pass filename_only argument to _Dir
constructor.
(recursive_directory_iterator::_Dir_stack): Adjust constructor
parameters to take a _Dir rvalue instead of creating one.
(_Dir_stack::orig): Add data member for storing original path.
(_Dir_stack::report_error): Define new member function.
(__directory_iterator_nofollow): Move here from dir-common.h and
fix value to be a power of two.
(__directory_iterator_filename_only): Define new constant.
(recursive_directory_iterator): Construct _Dir object and move
into _M_dirs stack. Pass skip_permission_denied argument to first
advance call.
(recursive_directory_iterator::increment): Use _Dir::open_subdir.
(recursive_directory_iterator::__erase): Define new member
function.
* src/c++17/fs_ops.cc (ErrorReporter, do_remove_all): Remove.
(fs::remove_all): Use new recursive_directory_iterator::__erase
member function.
* src/filesystem/dir-common.h (_Dir_base): Add int parameter to
constructor and use openat to implement nofollow semantics.
(_Dir_base::fdcwd, _Dir_base::set_close_on_exec, _Dir_base::openat):
Define new member functions.
(__directory_iterator_nofollow): Move to fs_dir.cc.
* src/filesystem/dir.cc (_Dir): Pass file descriptor argument to
base constructor.
(_Dir::dir_and_pathname, _Dir::open_subdir): Define new member
functions.
(recursive_directory_iterator::_Dir_stack): Adjust constructor
parameters to take a _Dir rvalue instead of creating one.
(recursive_directory_iterator): Check for new nofollow option.
Construct _Dir object and move into _M_dirs stack. Pass
skip_permission_denied argument to first advance call.
(recursive_directory_iterator::increment): Use _Dir::open_subdir.
* src/filesystem/ops.cc (fs::remove_all): Use C++17 remove_all.
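The descriptor-relative approach described in this commit message can be sketched in plain C as follows. This is not the libstdc++ implementation: remove_tree and remove_contents are hypothetical helpers, error reporting is reduced to a 0/-1 result, and the sketch assumes a POSIX.1-2008 system providing openat, unlinkat, fdopendir and dirfd.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Remove everything inside the directory open on fd.  Takes ownership
   of fd.  Returns 0 on success, -1 on the first error. */
static int remove_contents(int fd)
{
    DIR *dir = fdopendir(fd);
    if (dir == NULL) {
        close(fd);
        return -1;
    }
    int rc = 0;
    struct dirent *ent;
    while (rc == 0 && (ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        /* Open the entry relative to the directory stream's descriptor,
           refusing to follow a symlink, so that a swapped-in symlink in
           the parent path cannot redirect us to another directory. */
        int sub = openat(dirfd(dir), ent->d_name,
                         O_RDONLY | O_DIRECTORY | O_NOFOLLOW | O_CLOEXEC);
        if (sub >= 0) {
            rc = remove_contents(sub);
            if (rc == 0)
                rc = unlinkat(dirfd(dir), ent->d_name, AT_REMOVEDIR);
        } else {
            /* Not a directory (or a symlink): unlink the name itself,
               again relative to the open directory. */
            rc = unlinkat(dirfd(dir), ent->d_name, 0);
        }
    }
    closedir(dir); /* also closes fd */
    return rc;
}

/* Remove the directory tree rooted at path. */
int remove_tree(const char *path)
{
    int fd = open(path, O_RDONLY | O_DIRECTORY | O_NOFOLLOW | O_CLOEXEC);
    if (fd < 0)
        return -1;
    if (remove_contents(fd) != 0)
        return -1;
    return rmdir(path);
}
```

Only the final rmdir of the top-level directory uses a full pathname here; every entry below it is removed by name relative to an already open descriptor, which is the property the commit relies on.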
Bill Schmidt [Thu, 3 Feb 2022 02:55:36 +0000 (20:55 -0600)]
rs6000: More factoring of overload processing
This patch continues the refactoring started with r12-6014. I had previously
noted that the resolve_vec* routines can be further simplified by processing
the argument list earlier, so that all routines can use the arrays of arguments
and types. I found that this was useful for some of the routines, but not for
all of them.
For several of the special-cased overloads, we don't specify all of the
possible type combinations in rs6000-overload.def, because the types don't
matter for the expansion we do. For these, we can't use generic error message
handling when the number of arguments is incorrect, because the result is
misleading error messages that indicate argument types are wrong.
So this patch goes halfway and improves the factoring on the remaining special
cases, but leaves vec_splats, vec_promote, vec_extract, vec_insert, and
vec_step alone.
2022-02-02 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-c.cc (resolve_vec_mul): Accept args and types
parameters instead of arglist and nargs. Simplify accordingly. Remove
unnecessary test for argument count mismatch.
(resolve_vec_cmpne): Likewise.
(resolve_vec_adde_sube): Likewise.
(resolve_vec_addec_subec): Likewise.
(altivec_resolve_overloaded_builtin): Move overload special handling
after the gathering of arguments into args[] and types[] and the test
for correct number of arguments. Don't perform the test for correct
number of arguments for certain special cases. Call the other special
cases with args and types instead of arglist and nargs.
Bill Schmidt [Fri, 4 Feb 2022 19:26:44 +0000 (13:26 -0600)]
rs6000: Clean up ISA 3.1 documentation [PR100808]
Due to a pasto error in the documentation, vec_replace_unaligned was
implemented with the same function prototypes as vec_replace_elt. It was
intended that vec_replace_unaligned always specify output vectors as having
type vector unsigned char, to emphasize that elements are potentially
misaligned by this built-in function. This patch corrects the
misimplementation.
2022-02-04 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
PR target/100808
* doc/extend.texi (Basic PowerPC Built-in Functions Available on ISA
3.1): Provide consistent type names. Remove unnecessary semicolons.
Fix bad line breaks.
Jakub Jelinek [Fri, 4 Feb 2022 17:30:59 +0000 (18:30 +0100)]
rs6000: Fix up -D_FORTIFY_SOURCE* with -mabi=ieeelongdouble [PR104380]
The following testcase FAILs when configured with
--with-long-double-format=ieee . Only happens in the -std=c* modes, not the
GNU modes; while the glibc headers have __asm redirects of
vsnprintf and __vsnprinf_chk to __vsnprintfieee128 and
__vsnprintf_chkieee128, the vsnprintf fortification extern inline gnu_inline
always_inline wrapper calls __builtin_vsnprintf_chk and we actually emit
a call to __vsnprinf_chk (i.e. with IBM extended long double) instead of
__vsnprintf_chkieee128.
rs6000_mangle_decl_assembler_name already had cases for *printf and *scanf,
so this just adds another case for *printf_chk. *scanf_chk doesn't exist.
__ prefixing isn't done because *printf_chk already starts with __.
2022-02-04 Jakub Jelinek <jakub@redhat.com>
PR target/104380
* config/rs6000/rs6000.cc (rs6000_mangle_decl_assembler_name): Also
adjust mangling of __builtin*printf_chk.
* gcc.dg/pr104380.c: New test.
Eric Botcazou [Fri, 4 Feb 2022 16:41:55 +0000 (17:41 +0100)]
Add optimization testcase for incorrect optimization in Ada
gcc/testsuite/
PR tree-optimization/104356
* gnat.dg/opt97.adb: New test.
Tobias Burnus [Fri, 4 Feb 2022 16:31:21 +0000 (17:31 +0100)]
libgomp.fortran/allocate-1.f90: Fix minor cleanup
libgomp/ChangeLog:
* testsuite/libgomp.fortran/allocate-1.f90: Remove spurious
STOP of previous commit.
Jonathan Wakely [Fri, 4 Feb 2022 13:23:25 +0000 (13:23 +0000)]
doc: Update references to "C++2a" in cpp.texi
gcc/ChangeLog:
* doc/cpp.texi (Variadic Macros): Replace C++2a with C++20.
Jonathan Wakely [Thu, 3 Feb 2022 13:17:05 +0000 (13:17 +0000)]
libstdc++: Add suggestion to std::uncaught_exception() warning
We should use the SUGGEST macro for std::uncaught_exception()
deprecation warnings.
libstdc++-v3/ChangeLog:
* include/bits/allocator.h: Qualify std::allocator_traits in
deprecated warnings.
* libsupc++/exception (uncaught_exception): Add suggestion to
deprecated warning.
David Edelsohn [Fri, 4 Feb 2022 15:08:58 +0000 (10:08 -0500)]
testsuite: -mbig/-mlittle are only valid for powerpc-linux.
A recent change to some powerpc tests added explicit -mbig and -mlittle
options, but those options are not valid outside of powerpc-linux.
This patch updates the testcase options to enable -mbig when valid
and to only use -mlittle for powerpc-linux.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-1.c: Limit -mbig.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c: Limit -mbig.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c: Limit -mbig.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c: Remove target selector.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c: Only powerpc*-linux.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c: Only powerpc*-linux*.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c: Limit -mbig.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c: Limit -mbig.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c: Remove target selector.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c: Only powerpc*-linux*.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c: Only powerpc*-linux*.
Tobias Burnus [Fri, 4 Feb 2022 13:51:01 +0000 (14:51 +0100)]
libgomp.fortran/allocate-1.f90: Minor cleanup
libgomp/ChangeLog:
* testsuite/libgomp.fortran/allocate-1.c (is_64bit_aligned): Renamed
from is_64bit_aligned_.
* testsuite/libgomp.fortran/allocate-1.f90: Fix interface decl
and use it, more implicit none, remove unused argument.
Richard Biener [Wed, 26 Jan 2022 08:35:57 +0000 (09:35 +0100)]
tree-optimization/100499 - niter analysis and multiple_of_p
niter analysis uses multiple_of_p which currently assumes
operations like MULT_EXPR do not wrap. We've got to rely on this
for optimizing size expressions like those in DECL_SIZE and those
generally use unsigned arithmetic with no indication that they
are not expected to wrap. To preserve that the following adds
a parameter to multiple_of_p, defaulted to true, indicating that
the TOP expression is not expected to wrap for outer computations
in TYPE. This mostly follows a patch proposed by Bin last year
with the conversion behavior added.
Applying to all users the new effect is that upon type conversions
in the TOP expression the behavior will switch to honor
TYPE_OVERFLOW_UNDEFINED for the converted sub-expressions.
The patch also changes the occurrence in niter analysis that we
know is problematic and we have testcases for to pass false
to multiple_of_p. The patch also contains a change to the
PR72817 fix from Bin to avoid regressing gcc.dg/tree-ssa/loop-42.c.
The intent for stage1 is to introduce a size_multiple_of_p and
internalize the added parameter so all multiple_of_p users will
honor TYPE_OVERFLOW_UNDEFINED and users dealing with size expressions
need to be switched to size_multiple_of_p.
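To illustrate why the no-wrap assumption matters, here is a minimal sketch in plain C, not GCC internals. The helper names times3_u8 and is_multiple_u8 are hypothetical and 8-bit arithmetic is chosen only to keep the numbers small. If a multiplication by 3 is assumed not to wrap, the product is trivially a multiple of 3; in wrapping unsigned arithmetic that no longer holds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: x * 3 evaluated in wrapping 8-bit unsigned
   arithmetic (the result is reduced modulo 256).  */
uint8_t times3_u8(uint8_t x)
{
    return (uint8_t)(x * 3u);
}

/* Hypothetical helper: is the 8-bit value V an exact multiple of M?
   M must be nonzero.  */
bool is_multiple_u8(uint8_t v, uint8_t m)
{
    return v % m == 0;
}
```

With x = 10 the product is 30 and is a multiple of 3. With x = 86 the mathematical product 258 wraps to 2, which is not a multiple of 3, so a caller that reasons about the wrapped value cannot rely on the "multiple of 3" answer.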
2022-01-26 Richard Biener <rguenther@suse.de>
PR tree-optimization/100499
* fold-const.h (multiple_of_p): Add nowrap parameter, defaulted
to true.
* fold-const.cc (multiple_of_p): Likewise. Honor it for
MULT_EXPR, PLUS_EXPR and MINUS_EXPR and pass it along,
switching to false for conversions.
* tree-ssa-loop-niter.cc (number_of_iterations_ne): Do not
claim the outermost expression does not wrap when calling
multiple_of_p. Refactor the check done to check the
original IV, avoiding a bias that might wrap.
* gcc.dg/torture/pr100499-1.c: New testcase.
* gcc.dg/torture/pr100499-2.c: Likewise.
* gcc.dg/torture/pr100499-3.c: Likewise.
Co-authored-by: Bin Cheng <bin.cheng@linux.alibaba.com>
Martin Liska [Fri, 4 Feb 2022 09:24:51 +0000 (10:24 +0100)]
fixincludes: Update pwd.
fixincludes/ChangeLog:
* fixinc.in: Use cd OLDDIR instead of cd .. .
Richard Biener [Mon, 24 Jan 2022 14:26:46 +0000 (15:26 +0100)]
Adjust LSHIFT_EXPR handling of multiple_of_p
This removes the odd check of size_type_node when handling left-shifts
as multiplications of 1 << N and instead uses the type as specified.
It also moves left-shift handling next to multiplications where it
semantically belongs.
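As a small illustration of the equivalence the message relies on, the following C sketch (hypothetical helper names, fixed 32-bit unsigned type, shift count assumed to be below 32) computes a left shift and the corresponding multiplication by 1 << N in the same type.

```c
#include <stdint.h>

/* x << n, for n < 32.  */
uint32_t shl_u32(uint32_t x, unsigned n)
{
    return x << n;
}

/* x * (1 << n), for n < 32, evaluated in the same unsigned type.  */
uint32_t mul_pow2_u32(uint32_t x, unsigned n)
{
    return x * ((uint32_t)1 << n);
}
```

Both functions return the same value for every input, including inputs where the high bits are shifted out, which is why the shift can be handled next to the multiplication case.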
2022-01-24 Richard Biener <rguenther@suse.de>
* fold-const.cc (multiple_of_p): Re-write and move LSHIFT_EXPR
handling.
Eric Botcazou [Fri, 4 Feb 2022 11:07:46 +0000 (12:07 +0100)]
Empty the base_types vector before (re)populating it
Otherwise Bad Things happen when it is populated several times.
gcc/
PR debug/104366
* dwarf2out.cc (dwarf2out_finish): Empty base_types.
(dwarf2out_early_finish): Likewise.
Eric Botcazou [Fri, 4 Feb 2022 11:03:49 +0000 (12:03 +0100)]
Disable new 1/X optimization with -fnon-call-exceptions
The trapping behavior of the operation needs to be preserved when the
-fnon-call-exceptions switch is in effect. This also adds the same
guards to similar optimizations.
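A minimal sketch of the issue, restricted to the unsigned case and using hypothetical function names: the folded form gives the same value as the division for every nonzero X, but it does not trap for X == 0, which is exactly the behavior that must be kept when a trapping division can raise an exception.

```c
#include <stdbool.h>

/* The original expression.  Division by zero is undefined in C and
   traps on many targets, so callers must pass x != 0.  */
unsigned recip_div(unsigned x)
{
    return 1u / x;
}

/* The folded form "X == 1".  It never traps, not even for x == 0.  */
unsigned recip_folded(unsigned x)
{
    return x == 1u;
}

/* Check that both forms agree for all x in [1, limit].  */
bool agree_on_nonzero(unsigned limit)
{
    for (unsigned x = 1; x <= limit && x != 0; x++)
        if (recip_div(x) != recip_folded(x))
            return false;
    return true;
}
```

The two forms differ only in what happens for x == 0: recip_folded quietly returns 0 where recip_div would trap.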
gcc/
PR tree-optimization/104356
* match.pd (X / bool_range_Y is X): Add guard.
(X / X is one): Likewise.
(X / abs (X) is X < 0 ? -1 : 1): Likewise.
(X / -X is -1): Likewise.
(1 / X -> X == 1): Likewise.
Richard Biener [Fri, 4 Feb 2022 08:26:57 +0000 (09:26 +0100)]
tree-optimization/103641 - improve vect_synth_mult_by_constant
The following happens to improve compile-time of the PR103641
testcase on aarch64 significantly. I did not investigate the
effect on the generated code but at least in theory
choose_mult_variant should do a better job when we tell it
the actual mode we are going to use for the operations it
synthesizes.
2022-02-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/103641
* tree-vect-patterns.cc (vect_synth_mult_by_constant):
Pass the vector mode to choose_mult_variant.
Roger Sayle [Fri, 4 Feb 2022 09:32:21 +0000 (09:32 +0000)]
[PATCH] PR rtl-optimization/101885: Prevent combine from clobbering flags
This patch addresses PR rtl-optimization/101885 which is a P2 wrong code
regression. In combine, if the resulting fused instruction is a parallel
of two sets which fails to be recognized by the backend, combine tries to
emit these as two sequential set instructions (known as split_i2i3).
As each set is recognized the backend may add any necessary "clobbers".
The code currently checks that any clobbers added to the first "set"
don't interfere with the second set, but doesn't currently handle the
case that clobbers added to the second set may interfere/kill the
destination of the first set (which must be live at this point).
The solution is to cut'n'paste the "clobber" logic from just a few
lines earlier, suitably adjusted for the second instruction.
One minor nit that may confuse a reviewer is that at this point in
the code we've lost track of which set was first and which was second
(combine chooses dynamically, and the recog processes that add the
clobbers may have obfuscated the original SET_DEST) so the actual test
below is to confirm that any newly added clobbers (on the second set
instruction) don't overlap either set0's or set1's destination.
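The shape of that test can be sketched as follows. This is only an illustration under simplifying assumptions: registers are modelled as bits in a hypothetical regset mask, whereas combine works on RTL, and split_is_safe is not a function in combine.cc.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: bit r set means hard register r is involved.  */
typedef uint32_t regset;

/* The split is only acceptable if none of the clobbers newly added to
   the second instruction overlaps the destination of either set,
   since we no longer know which set ended up first.  */
bool split_is_safe(regset set0_dest, regset set1_dest, regset new_clobbers)
{
    return (new_clobbers & (set0_dest | set1_dest)) == 0;
}
```

For example, a flags register clobber that matches neither destination is fine, while a clobber of the register written by the other set rejects the split.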
2022-02-04 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/101885
* combine.cc (try_combine): When splitting a parallel into two
sequential sets, check not only that the first doesn't clobber
the second but also that the second doesn't clobber the first.
gcc/testsuite/ChangeLog
PR rtl-optimization/101885
* gcc.dg/pr101885.c: New test case.
Richard Sandiford [Fri, 4 Feb 2022 08:08:59 +0000 (08:08 +0000)]
aarch64: Add test for PR104092
gcc/testsuite/
PR middle-end/104092
* gcc.target/aarch64/sve/acle/general/pr104092.c: New test.
Richard Biener [Wed, 2 Feb 2022 13:24:39 +0000 (14:24 +0100)]
Add CLOBBER_EOL to mark storage end-of-life clobbers
This adds a flag to CONSTRUCTOR nodes indicating that for
clobbers this marks the end-of-life of storage as opposed to
just ending the lifetime of the object that occupied it.
The dangling pointer diagnostics uses CLOBBERs but is confused
by those emitted by the C++ frontend for example which emits
them for the second purpose at the start of CTORs. The issue
is also apparent for aarch64 in PR104092.
Distinguishing the two cases is also necessary for the PR90348 fix.
Since I'm going to add another flag I added an enum clobber_flags
and a defaulted argument to build_clobber plus a convenient way to
query the enum from the CTOR tree and specify it for gimple_clobber_p.
Since 'CLOBBER' is already taken and I needed a name for the unspecified
clobber we have now I used 'CLOBBER_UNDEF'.
2022-02-03 Richard Biener <rguenther@suse.de>
PR middle-end/90348
PR middle-end/104092
gcc/
* tree-core.h (clobber_kind): New enum.
(tree_base::u::bits::address_space): Document use in CONSTRUCTORs.
* tree.h (CLOBBER_KIND): Add.
(build_clobber): Add clobber kind argument, defaulted to
CLOBBER_UNDEF.
* tree.cc (build_clobber): Likewise.
* gimple.h (gimple_clobber_p): New overload with specified kind.
* tree-streamer-in.cc (streamer_read_tree_bitfields): Stream
CLOBBER_KIND.
* tree-streamer-out.cc (streamer_write_tree_bitfields):
Likewise.
* tree-pretty-print.cc (dump_generic_node): Mark EOL CLOBBERs.
* gimplify.cc (gimplify_bind_expr): Build storage end-of-life clobbers
with CLOBBER_EOL.
(gimplify_target_expr): Likewise.
* tree-inline.cc (expand_call_inline): Likewise.
* tree-ssa-ccp.cc (insert_clobber_before_stack_restore): Likewise.
* gimple-ssa-warn-access.cc (pass_waccess::check_stmt): Only treat
CLOBBER_EOL clobbers as ending lifetime of storage.
gcc/lto/
* lto-common.cc (compare_tree_sccs_1): Compare CLOBBER_KIND.
gcc/testsuite/
* gcc.dg/pr87052.c: Adjust.
Martin Sebor [Fri, 4 Feb 2022 02:44:44 +0000 (19:44 -0700)]
Use auto_vec for pointer_query cache for auto cleanup.
gcc/ChangeLog:
* pointer-query.h (pointer_query::cache_type): Use auto_vec for auto
cleanup.
GCC Administrator [Fri, 4 Feb 2022 00:16:24 +0000 (00:16 +0000)]
Daily bump.
Patrick Palka [Thu, 3 Feb 2022 23:54:23 +0000 (18:54 -0500)]
c++: dependence of member noexcept-spec [PR104079]
Here a stale TYPE_DEPENDENT_P/_P_VALID value for f's function type
after replacing the type's DEFERRED_NOEXCEPT with the parsed dependent
noexcept-spec causes us to try to instantiate g's noexcept-spec ahead
of time (since it in turn appears non-dependent), leading to an ICE.
This patch fixes this by clearing TYPE_DEPENDENT_P_VALID in
fixup_deferred_exception_variants appropriately (as in
build_cp_fntype_variant).
That turns out to fix the testcase for C++17 but not for C++11/14,
because it's not until C++17 that a noexcept-spec is part of (and
therefore affects dependence of) the function type. Since dependence of
NOEXCEPT_EXPR is defined in terms of instantiation dependence, the most
appropriate fix for earlier dialects seems to be to make instantiation
dependence consider dependence of a noexcept-spec.
PR c++/104079
gcc/cp/ChangeLog:
* pt.cc (value_dependent_noexcept_spec_p): New predicate split
out from ...
(dependent_type_p_r): ... here.
(instantiation_dependent_r): Use value_dependent_noexcept_spec_p
to consider dependence of a noexcept-spec before C++17.
* tree.cc (fixup_deferred_exception_variants): Clear
TYPE_DEPENDENT_P_VALID.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/noexcept74.C: New test.
* g++.dg/cpp0x/noexcept74a.C: New test.
David Malcolm [Wed, 2 Feb 2022 21:39:12 +0000 (16:39 -0500)]
analyzer: fixes to realloc-handling [PR104369]
This patch fixes various issues with how -fanalyzer handles "realloc"
seen when debugging PR analyzer/104369.
Previously it wasn't correctly copying over the contents of the old
buffer for the success-with-move case, leading to false
-Wanalyzer-use-of-uninitialized-value diagnostics.
I also noticed that -fanalyzer failed to properly handle "realloc" for
cases where the ptr's region had unknown dynamic extents, and an ICE
for the case where a tainted value is used as a realloc size argument.
This patch fixes these issues, including the false uninit diagnostics
seen in PR analyzer/104369.
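For reference, the property that the success-with-move case has to model is that realloc preserves the old contents up to the smaller of the old and new sizes. A minimal C sketch (realloc_preserves_prefix is a hypothetical helper, not one of the new tests):

```c
#include <stdlib.h>
#include <string.h>

/* Fill a buffer of OLD_SIZE bytes, grow it to NEW_SIZE with realloc,
   and check that the first OLD_SIZE bytes survived.  Requires
   0 < old_size <= new_size.  Returns 1 if the contents were kept,
   0 if not, and -1 if an allocation failed.  */
int realloc_preserves_prefix(size_t old_size, size_t new_size)
{
    unsigned char *p = malloc(old_size);
    if (!p)
        return -1;
    memset(p, 0xAB, old_size);

    unsigned char *q = realloc(p, new_size);
    if (!q) {
        free(p);   /* on failure the old block is still valid */
        return -1;
    }

    int ok = 1;
    for (size_t i = 0; i < old_size; i++)
        if (q[i] != 0xAB)
            ok = 0;

    free(q);
    return ok;
}
```

Whether the block is grown in place or moved, the first old_size bytes read back through the new pointer are initialized, so reading them must not be reported as a use of an uninitialized value.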
gcc/analyzer/ChangeLog:
PR analyzer/104369
* engine.cc (exploded_graph::process_node): Use the node for any
diagnostics, avoiding ICE if a bifurcation update adds a
saved_diagnostic, such as for a tainted realloc size.
* region-model-impl-calls.cc
(region_model::impl_call_realloc::success_no_move::update_model):
Require the old pointer to be non-NULL to be able to successfully
grow in place. Use model->deref_rvalue rather than maybe_get_region
to support the old pointer being symbolic.
(region_model::impl_call_realloc::success_with_move::update_model):
Likewise. Add a constraint that the new pointer != the old pointer.
Use a sized_region when setting the value of the new region.
Handle the case where we don't know the dynamic size of the old
region by marking the new region as unknown.
* sm-taint.cc (tainted_allocation_size::tainted_allocation_size):
Update assertion to also allow for MEMSPACE_UNKNOWN.
(tainted_allocation_size::emit): Likewise.
(region_model::check_dynamic_size_for_taint): Likewise.
gcc/testsuite/ChangeLog:
PR analyzer/104369
* gcc.dg/analyzer/pr104369-1.c: New test.
* gcc.dg/analyzer/pr104369-2.c: New test.
* gcc.dg/analyzer/realloc-3.c: New test.
* gcc.dg/analyzer/realloc-4.c: New test.
* gcc.dg/analyzer/taint-realloc.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Thu, 3 Feb 2022 16:15:48 +0000 (11:15 -0500)]
analyzer: fix zero-fill of calloc
It turned out that the analyzer wasn't treating calloc regions
as zero-filled, due to binding_cluster::fill_region getting an
unknown value for the byte_size_size_sval, and thus
get_or_create_repeated_svalue returning an unknown_svalue, which
was then used to fill the region.
Fixed thusly.
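The behavior being modelled is simply that every byte of a calloc'd region reads as zero. A minimal C sketch with a hypothetical helper name:

```c
#include <stdlib.h>

/* Allocate N bytes with calloc and check that all of them are zero.
   Returns 1 if so, 0 if not, and -1 if the allocation failed.  */
int calloc_is_zero_filled(size_t n)
{
    unsigned char *p = calloc(n, 1);
    if (!p)
        return -1;

    int ok = 1;
    for (size_t i = 0; i < n; i++)
        if (p[i] != 0)
            ok = 0;

    free(p);
    return ok;
}
```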
gcc/analyzer/ChangeLog:
* region-model-impl-calls.cc (region_model::impl_call_calloc): Use
a sized_region when calling zero_fill_region.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/calloc-1.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Martin Sebor [Thu, 3 Feb 2022 21:51:46 +0000 (14:51 -0700)]
Adjust warn_access pass placement [PR104260].
Resolves:
PR middle-end/104260 - Misplaced waccess3 pass
gcc/ChangeLog:
PR middle-end/104260
* passes.def (pass_warn_access): Adjust pass placement.
Uros Bizjak [Thu, 3 Feb 2022 21:24:21 +0000 (22:24 +0100)]
i386: Do not use %ecx DRAP for functions that use __builtin_eh_return [PR104362]
%ecx can't be used for both DRAP register and eh_return. Adjust find_drap_reg
to choose %edi for functions that use __builtin_eh_return to avoid the assert
in ix86_expand_epilogue that enforces this rule.
2022-02-03 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/104362
* config/i386/i386.cc (find_drap_reg): For 32bit targets
return DI_REG if function uses __builtin_eh_return.
gcc/testsuite/ChangeLog:
PR target/104362
* gcc.target/i386/pr104362.c: New test.
Martin Sebor [Thu, 3 Feb 2022 20:59:39 +0000 (13:59 -0700)]
Enable pointer_query caching in -Wrestrict.
gcc/ChangeLog:
* gimple-ssa-warn-restrict.cc (class pass_wrestrict): Outline ctor.
(pass_wrestrict::m_ptr_qry): New member.
(wrestrict_walk): Rename...
(pass_wrestrict::check_block): ...to this.
(pass_wrestrict::execute): Set up and tear down pointer_query and
ranger.
(builtin_memref::builtin_memref): Change ctor argument. Simplify.
(builtin_access::builtin_access): Same.
(builtin_access::m_ptr_qry): New member.
(check_call): Rename...
(pass_wrestrict::check_call): ...to this.
(check_bounds_or_overlap): Change argument.
* gimple-ssa-warn-restrict.h (check_bounds_or_overlap): Same.
Martin Sebor [Thu, 3 Feb 2022 20:58:28 +0000 (13:58 -0700)]
Enable pointer_query caching in -Warray-bounds.
gcc/ChangeLog:
* gimple-array-bounds.cc (array_bounds_checker::array_bounds_checker):
Define ctor.
(array_bounds_checker::get_value_range): Use new member.
(array_bounds_checker::check_mem_ref): Same.
* gimple-array-bounds.h (array_bounds_checker::array_bounds_checker):
Outline ctor.
(array_bounds_checker::m_ptr_query): New member.
Martin Sebor [Thu, 3 Feb 2022 20:56:50 +0000 (13:56 -0700)]
Make pointer_query cache a private member.
gcc/ChangeLog:
* gimple-ssa-warn-access.cc (pass_waccess::pass_waccess): Remove
pointer_query cache.
* pointer-query.cc (pointer_query::pointer_query): Remove cache
argument. Zero-initialize new cache member.
(pointer_query::get_ref): Replace cache pointer with direct access.
(pointer_query::put_ref): Same.
(pointer_query::flush_cache): Same.
(pointer_query::dump): Same.
* pointer-query.h (class pointer_query): Remove cache argument from
ctor. Change cache pointer to cache subobject member.
* tree-ssa-strlen.cc: Remove pointer_query cache.
Martin Sebor [Thu, 3 Feb 2022 20:27:16 +0000 (13:27 -0700)]
Constrain conservative string lengths to array sizes [PR104119].
Resolves:
PR tree-optimization/104119 - unexpected -Wformat-overflow after strlen in ILP32 since Ranger integration
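The bound in question can be illustrated with plain C. This is a sketch with a hypothetical struct and helper, not one of the new tests: a valid string stored in an array of 8 chars has a length of at most 7, so that is also the most a "%s" directive can produce for it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record with a fixed-size name field.  */
struct rec {
    char name[8];
};

/* Number of bytes (excluding the terminating NUL) that formatting
   r->name with "%s" would produce.  */
int formatted_len(const struct rec *r)
{
    return snprintf(NULL, 0, "%s", r->name);
}
```

For the longest string that fits, "abcdefg", formatted_len returns 7, which is sizeof name minus 1.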
gcc/ChangeLog:
PR tree-optimization/104119
* gimple-ssa-sprintf.cc (struct directive): Change argument type.
(format_none): Same.
(format_percent): Same.
(format_integer): Same.
(format_floating): Same.
(get_string_length): Same.
(format_character): Same.
(format_string): Same.
(format_plain): Same.
(format_directive): Same.
(compute_format_length): Same.
(handle_printf_call): Same.
* tree-ssa-strlen.cc (get_range_strlen_dynamic): Same. Call
get_maxbound.
(get_range_strlen_phi): Same.
(get_maxbound): New function.
(strlen_pass::get_len_or_size): Adjust to parameter change.
* tree-ssa-strlen.h (get_range_strlen_dynamic): Change argument type.
gcc/testsuite/ChangeLog:
PR tree-optimization/104119
* gcc.dg/tree-ssa/builtin-snprintf-13.c: New test.
* gcc.dg/tree-ssa/builtin-sprintf-warn-29.c: New test.
Harald Anlauf [Tue, 1 Feb 2022 22:33:24 +0000 (23:33 +0100)]
Fortran: reject simplifying TRANSFER for MOLD with storage size 0
gcc/fortran/ChangeLog:
PR fortran/104311
* check.cc (gfc_calculate_transfer_sizes): Checks for case when
storage size of SOURCE is greater than zero while the storage size
of MOLD is zero and MOLD is an array shall not depend on SIZE.
gcc/testsuite/ChangeLog:
PR fortran/104311
* gfortran.dg/transfer_simplify_15.f90: New test.
Martin Liska [Thu, 3 Feb 2022 14:49:43 +0000 (15:49 +0100)]
Speed up fixincludes.
In my case:
$ rm ./stmp-fixinc ; time make -j16
takes 17 seconds, where I can reduce it easily with the suggested
change. Then I get to 11.2 seconds.
The script searches ~2500 folders in my case with a total of 20K header
files.
fixincludes/ChangeLog:
* fixinc.in: Use mkdir -p rather than a loop.
Bill Schmidt [Thu, 3 Feb 2022 03:30:27 +0000 (21:30 -0600)]
rs6000: Remove -m[no-]fold-gimple flag [PR103686]
The -m[no-]fold-gimple flag was really intended primarily for internal
testing while implementing GIMPLE folding for rs6000 vector built-in
functions. It ended up leaking into other places, causing problems such
as PR103686 identifies. Let's remove it.
There are a number of tests in the testsuite that require adjustment.
Some specify -mfold-gimple directly, which is the default, so that is
handled by removing the option. Others unnecessarily specify
-mno-fold-gimple, as the tests work fine without this. Again that is
handled by removing the option. There are a couple of extra variants of
tests specifically for -mno-fold-gimple; for those, we can just remove the
whole test.
gcc.target/powerpc/builtins-1.c was more problematic. It was written in
such a way as to be extremely fragile. For this one, I rewrote the whole
test in a different style, using individual functions to test each
built-in function. These same tests are also largely covered by
builtins-1-be-folded.c and builtins-1-le-folded.c, so I chose to
explicitly make this test -mbig for simplicity, and use -O2 for clean code
generation. I made some slight modifications to the expected instruction
counts as a result, and tested on both 32- and 64-bit.
2022-02-02 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
PR target/103686
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Remove
test for !rs6000_fold_gimple.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Likewise.
* config/rs6000/rs6000.opt (mfold-gimple): Remove.
gcc/testsuite/
PR target/103686
* gcc.target/powerpc/builtins-1-be-folded.c: Remove -mfold-gimple
option.
* gcc.target/powerpc/builtins-1-le-folded.c: Likewise.
* gcc.target/powerpc/builtins-1.c: Rewrite to use small functions and
restrict to -O2 -mbig for predictability. Adjust instruction counts.
* gcc.target/powerpc/builtins-5.c: Remove -mno-fold-gimple option.
* gcc.target/powerpc/p8-vec-xl-xst.c: Likewise.
* gcc.target/powerpc/pr83926.c: Likewise.
* gcc.target/powerpc/pr86731-nogimplefold-longlong.c: Delete.
* gcc.target/powerpc/pr86731-nogimplefold.c: Delete.
* gcc.target/powerpc/swaps-p8-17.c: Remove -mno-fold-gimple option.
Bill Schmidt [Thu, 3 Feb 2022 03:24:22 +0000 (21:24 -0600)]
rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]
These built-ins were misimplemented as always having big-endian semantics.
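A portable model of the problem, as a sketch only: it assumes that the hardware instructions count in register byte order and that on little-endian element i of a 16-byte vector lives in register byte 15 - i. All names are hypothetical and no PowerPC intrinsics are used.

```c
#include <stdint.h>

/* Count bytes from index 0 upward whose least-significant bit is 0.  */
unsigned count_leading_lsb_zero(const uint8_t b[16])
{
    unsigned n = 0;
    while (n < 16 && (b[n] & 1u) == 0)
        n++;
    return n;
}

/* Count bytes from index 15 downward whose least-significant bit is 0.  */
unsigned count_trailing_lsb_zero(const uint8_t b[16])
{
    unsigned n = 0;
    while (n < 16 && (b[15 - n] & 1u) == 0)
        n++;
    return n;
}

/* Assumed little-endian layout: element i sits in register byte 15 - i.  */
void elements_to_le_register(const uint8_t elems[16], uint8_t reg[16])
{
    for (int i = 0; i < 16; i++)
        reg[15 - i] = elems[i];
}
```

Under this model, the "leading" count in element order equals the "trailing" count in register order on little-endian, so always using the leading-count instruction gives the wrong answer there.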
2022-01-18 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
PR target/95082
* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Handle
endianness for vclzlsbb and vctzlsbb.
* config/rs6000/rs6000-builtins.def (VCLZLSBB_V16QI): Change
default pattern and indicate a different pattern will be used for
big endian.
(VCLZLSBB_V4SI): Likewise.
(VCLZLSBB_V8HI): Likewise.
(VCTZLSBB_V16QI): Likewise.
(VCTZLSBB_V4SI): Likewise.
(VCTZLSBB_V8HI): Likewise.
gcc/testsuite/
PR target/95082
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c: Restrict to -mbig.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c: Likewise.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c: New.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c: New.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c: Restrict to -mbig.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c: Likewise.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c: New.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c: New.
Bill Schmidt [Thu, 3 Feb 2022 16:26:29 +0000 (10:26 -0600)]
rs6000: Consolidate target built-ins code
Continuing with the refactoring effort, this patch moves as much of the
target-specific built-in support code into a new file, rs6000-builtin.cc.
However, we can't easily move the overloading support code out of
rs6000-c.cc, because the build machinery understands that as a special file
to be included with the C and C++ front ends.
This patch is just a straightforward move, with one exception. I found
that the builtin_mode_to_type[] array is no longer used, so I also removed
all code having to do with it.
The code in rs6000-builtin.cc is organized in related sections:
- General support functions
- Initialization support
- GIMPLE folding support
- Expansion support
Overloading support remains in rs6000-c.cc.
2022-02-03 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config.gcc (powerpc*-*-*): Add rs6000-builtin.o to extra_objs.
* config/rs6000/rs6000-builtin.cc: New file, containing code moved
from other files.
* config/rs6000/rs6000-call.cc (cpu_is_info): Move to
rs6000-builtin.cc.
(cpu_supports_info): Likewise.
(rs6000_type_string): Likewise.
(altivec_expand_predicate_builtin): Likewise.
(rs6000_htm_spr_icode): Likewise.
(altivec_expand_vec_init_builtin): Likewise.
(get_element_number): Likewise.
(altivec_expand_vec_set_builtin): Likewise.
(altivec_expand_vec_ext_builtin): Likewise.
(rs6000_invalid_builtin): Likewise.
(rs6000_fold_builtin): Likewise.
(fold_build_vec_cmp): Likewise.
(fold_compare_helper): Likewise.
(map_to_integral_tree_type): Likewise.
(fold_mergehl_helper): Likewise.
(fold_mergeeo_helper): Likewise.
(rs6000_builtin_valid_without_lhs): Likewise.
(rs6000_builtin_is_supported): Likewise.
(rs6000_gimple_fold_mma_builtin): Likewise.
(rs6000_gimple_fold_builtin): Likewise.
(rs6000_expand_ldst_mask): Likewise.
(cpu_expand_builtin): Likewise.
(elemrev_icode): Likewise.
(ldv_expand_builtin): Likewise.
(lxvrse_expand_builtin): Likewise.
(lxvrze_expand_builtin): Likewise.
(stv_expand_builtin): Likewise.
(mma_expand_builtin): Likewise.
(htm_spr_num): Likewise.
(htm_expand_builtin): Likewise.
(rs6000_expand_builtin): Likewise.
(rs6000_vector_type): Likewise.
(rs6000_init_builtins): Likewise. Remove initialization of
builtin_mode_to_type entries.
(rs6000_builtin_decl): Move to rs6000-builtin.cc.
* config/rs6000/rs6000.cc (rs6000_builtin_mask_for_load): New
external declaration.
(rs6000_builtin_md_vectorized_function): Likewise.
(rs6000_builtin_reciprocal): Likewise.
(altivec_builtin_mask_for_load): Move to rs6000-builtin.cc.
(rs6000_builtin_types): Likewise.
(builtin_mode_to_type): Remove.
(rs6000_builtin_mask_for_load): Move to rs6000-builtin.cc. Remove
static qualifier.
(rs6000_builtin_md_vectorized_function): Likewise.
(rs6000_builtin_reciprocal): Likewise.
* config/rs6000/rs6000.h (builtin_mode_to_type): Remove.
* config/rs6000/t-rs6000 (rs6000-builtin.o): New target.
David Seifert [Thu, 3 Feb 2022 14:47:10 +0000 (15:47 +0100)]
make `-Werror` optional in libatomic/libbacktrace/libgomp/libitm/libsanitizer
* `-Werror` can cause issues when a more recent version of GCC compiles
an older version:
- https://bugs.gentoo.org/229059
- https://bugs.gentoo.org/475350
- https://bugs.gentoo.org/667104
libatomic/ChangeLog:
* configure.ac: Support --disable-werror.
* configure: Regenerate.
libbacktrace/ChangeLog:
* configure.ac: Support --disable-werror.
* configure: Regenerate.
libgomp/ChangeLog:
* configure.ac: Support --disable-werror.
* configure: Regenerate.
libitm/ChangeLog:
* configure.ac: Support --disable-werror.
* configure: Regenerate.
libsanitizer/ChangeLog:
* configure.ac: Support --disable-werror.
* aclocal.m4: Include also ../config/warnings.m4.
* libbacktrace/Makefile.am (WARN_FLAGS): Remove.
* configure: Regenerate.
* Makefile.in: Regenerate.
* asan/Makefile.in: Regenerate.
* hwasan/Makefile.in: Regenerate.
* interception/Makefile.in: Regenerate.
* libbacktrace/Makefile.in: Regenerate.
* lsan/Makefile.in: Regenerate.
* sanitizer_common/Makefile.in: Regenerate.
* tsan/Makefile.in: Regenerate.
* ubsan/Makefile.in: Regenerate.
Co-Authored-By: Jakub Jelinek <jakub@redhat.com>
Richard Biener [Thu, 3 Feb 2022 10:20:59 +0000 (11:20 +0100)]
debug/104337 - avoid messing with the abstract origin chain in NRV
The following stops NRV from massaging DECL_ABSTRACT_ORIGIN after
variable creation since NRV runs _after_ the function was inlined and thus
affects the inlined variables copy indirectly. We may adjust the abstract
origin of a variable only at the point we create it, not further along the
path since otherwise the (new) invariant that the abstract origin is always
the ultimate origin cannot be maintained.
The intent of what NRV does is OK I guess and it may improve the debug
experience. But I also notice we do
SET_DECL_VALUE_EXPR (found, result);
DECL_HAS_VALUE_EXPR_P (found) = 1;
The code has been there since the merge from tree-ssa which added tree-nrv.c.
Jakub added the DECL_VALUE_EXPR in g:
938650d8fddb878f623e315f0b7fd94b217efa96
and Jason added the abstract origin setting conditional in g:
7716876bbd3a
The following takes the radical approach and removes the attempt
to "optimize" the debug info.
The gdb testsuites show no regressions.
2022-02-03 Richard Biener <rguenther@suse.de>
PR debug/104337
* tree-nrv.cc (pass_nrv::execute): Remove tying result and found
together via DECL_ABSTRACT_ORIGIN.
* gcc.dg/debug/pr104337.c: New testcase.
Bill Schmidt [Thu, 3 Feb 2022 02:59:00 +0000 (20:59 -0600)]
rs6000: Unify error messages for built-in constant restrictions
We currently give different error messages for built-in functions that
violate range restrictions on their arguments, depending on whether we
record them as requiring an n-bit literal or a literal between two values.
It's better to be consistent. Change the error message for the n-bit
literal to look like the other one.
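The two kinds of restriction describe the same sort of check, which is why one message style can cover both. A minimal sketch, assuming the n-bit literal is unsigned and 0 < n < 32; the helper names are hypothetical and not taken from rs6000-call.cc:

```c
/* Largest value of an unsigned n-bit literal, for 0 < n < 32.  */
unsigned max_nbit(unsigned n)
{
    return (1u << n) - 1u;
}

/* Is V within the inclusive range [LO, HI]?  */
int in_range(long v, long lo, long hi)
{
    return lo <= v && v <= hi;
}

/* An n-bit literal restriction expressed as a range restriction.  */
int fits_nbit(long v, unsigned n)
{
    return in_range(v, 0, (long)max_nbit(n));
}
```

For example, a 5-bit literal is the same restriction as a literal between 0 and 31, inclusive.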
2022-02-02 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-call.cc (rs6000_expand_builtin): Revise error
message for RES_BITS case.
gcc/testsuite/
* gcc.target/powerpc/bfp/scalar-test-data-class-10.c: Adjust error
messages.
* gcc.target/powerpc/bfp/scalar-test-data-class-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-data-class-3.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-data-class-4.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-data-class-5.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-data-class-9.c: Likewise.
* gcc.target/powerpc/bfp/vec-test-data-class-4.c: Likewise.
* gcc.target/powerpc/bfp/vec-test-data-class-5.c: Likewise.
* gcc.target/powerpc/bfp/vec-test-data-class-6.c: Likewise.
* gcc.target/powerpc/bfp/vec-test-data-class-7.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-12.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-14.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-17.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-19.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-2.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-22.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-24.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-27.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-29.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-32.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-34.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-37.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-39.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-4.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-42.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-44.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-47.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-49.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-52.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-54.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-57.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-59.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-62.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-64.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-67.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-69.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-7.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-72.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-74.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-77.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-79.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-9.c: Likewise.
* gcc.target/powerpc/pr80315-1.c: Likewise.
* gcc.target/powerpc/pr80315-2.c: Likewise.
* gcc.target/powerpc/pr80315-3.c: Likewise.
* gcc.target/powerpc/pr80315-4.c: Likewise.
* gcc.target/powerpc/pr82015.c: Likewise.
* gcc.target/powerpc/pr91903.c: Likewise.
* gcc.target/powerpc/test_fpscr_rn_builtin_error.c: Likewise.
* gcc.target/powerpc/vec-ternarylogic-10.c: Likewise.
Aldy Hernandez [Thu, 3 Feb 2022 14:45:55 +0000 (15:45 +0100)]
ranger: fix small thinko in fur_list constructor
The fur_list constructor for two ranges is leaving [1] in an undefined
state. The reason we haven't noticed is because after all the
shuffling in the last cycle there are no remaining users of it
(similarly for fur_list(unsigned, irange *)).
Since it's very late in the cycle, I would prefer to fix this, rather
than removing unused constructors altogether. Besides, we have uses
of them queued up for the next release.
gcc/ChangeLog:
* gimple-range-fold.cc (fur_list::fur_list): Set m_local[1] correctly.
Jakub Jelinek [Thu, 3 Feb 2022 13:34:21 +0000 (14:34 +0100)]
arm: Fix up help.exp regression
On Thu, Jan 20, 2022 at 11:27:20AM +0000, Richard Earnshaw via Gcc-patches wrote:
> gcc/ChangeLog:
>
> * config/arm/arm.opt (mfix-cortex-a57-aes-1742098): New command-line
> option.
> (mfix-cortex-a72-aes-1655431): New option alias.
> --- a/gcc/config/arm/arm.opt
> +++ b/gcc/config/arm/arm.opt
> @@ -272,6 +272,16 @@ mfix-cmse-cve-2021-35465
> Target Var(fix_vlldm) Init(2)
> Mitigate issues with VLLDM on some M-profile devices (CVE-2021-35465).
>
> +mfix-cortex-a57-aes-1742098
> +Target Var(fix_aes_erratum_1742098) Init(2) Save
> +Mitigate issues with AES instructions on Cortex-A57 and Cortex-A72.
> +Arm erratum #1742098
> +
> +mfix-cortex-a72-aes-1655431
> +Target Alias(mfix-cortex-a57-aes-1742098)
> +Mitigate issues with AES instructions on Cortex-A57 and Cortex-A72.
> +Arm erratum #1655431
> +
> munaligned-access
> Target Var(unaligned_access) Init(2) Save
> Enable unaligned word and halfword accesses to packed data.
This breaks:
Running /usr/src/gcc/gcc/testsuite/gcc.misc-tests/help.exp ...
FAIL: compiler driver --help=target option(s): "^ +-.*[^:.]$" absent from output: " -mfix-cortex-a57-aes-1742098 Mitigate issues with AES instructions on Cortex-A57 and Cortex-A72. Arm erratum #1742098"
help.exp with help of lib/options.exp tests whether all non-empty descriptions of
options are terminated with . or :.
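The rule can be sketched in C as follows. The function name is hypothetical; the real check is done by the regular expression in lib/options.exp quoted in the failure above.

```c
#include <stdbool.h>
#include <string.h>

/* A non-empty option description must end with '.' or ':'.
   Empty descriptions are not checked.  */
bool description_is_terminated(const char *desc)
{
    size_t len = strlen(desc);
    if (len == 0)
        return true;
    char last = desc[len - 1];
    return last == '.' || last == ':';
}
```

The description quoted in the FAIL line ends with "#1742098", so it is rejected; appending a full stop makes it pass.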
2022-02-03 Jakub Jelinek <jakub@redhat.com>
* config/arm/arm.opt (mfix-cortex-a57-aes-1742098,
mfix-cortex-a72-aes-1655431): Ensure description ends with full stop.
Aldy Hernandez [Fri, 21 Jan 2022 12:04:20 +0000 (13:04 +0100)]
Assert that backedges are available in path solver.
gcc/ChangeLog:
* cfganal.cc (verify_marked_backedges): New.
* cfganal.h (verify_marked_backedges): New.
* gimple-range-path.cc (path_range_query::path_range_query):
Verify freshness of back edges.
* tree-ssa-loop-ch.cc (ch_base::copy_headers): Call
mark_dfs_back_edges.
* tree-ssa-threadbackward.cc (back_threader::back_threader): Move
path_range_query construction after backedges have been
updated.
Eric Botcazou [Thu, 3 Feb 2022 12:12:37 +0000 (13:12 +0100)]
Skip gnat.dg/div_zero.adb on PowerPC
The hardware instruction does not trap on divide by zero there.
gcc/testsuite
PR tree-optimization/104356
* gnat.dg/div_zero.adb: Add dg-skip-if directive for PowerPC.
Richard Sandiford [Thu, 3 Feb 2022 10:44:01 +0000 (10:44 +0000)]
aarch64: Remove struct_vect_25.c XFAILs
At some point we started generating the intended code for
aarch64/sve/struct_vect_25.c. This patch removes the xfails
and the scan-assembler-times that replaced the xfailed forms.
gcc/testsuite/
* gcc.target/aarch64/sve/struct_vect_25.c: Remove XFAILs.
Richard Sandiford [Thu, 3 Feb 2022 10:44:01 +0000 (10:44 +0000)]
aarch64: Adjust tests after fix for PR102659
After the fix for PR102659, the vectoriser can no longer group
conditional accesses of the form:
    for (int i = 0; i < n; ++i)
      if (...)
        ...a[i * 2] + a[i * 2 + 1]...;
on LP64 targets. It has to treat them as two independent
gathers instead.
This was causing failures in the sve mask_struct*.c tests.
The tests weren't really testing that int iterators could
be used, so this patch switches to pointer-sized iterators
instead.
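A loop of the adjusted form might look like the following sketch. The function name, element type and condition are assumptions made for illustration and are not copied from the mask_struct*.c tests; only the use of an intptr_t iterator reflects the change described above.

```c
#include <stdint.h>

/* For each i with cond[i] != 0, store the sum of the pair
   a[2*i], a[2*i + 1] in dest[i].  The iterator is pointer-sized
   rather than int.  */
void sum_pairs(int *restrict dest, const int *restrict a,
               const int *restrict cond, intptr_t n)
{
    for (intptr_t i = 0; i < n; ++i)
        if (cond[i])
            dest[i] = a[i * 2] + a[i * 2 + 1];
}
```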
gcc/testsuite/
* gcc.target/aarch64/sve/mask_struct_load_1.c: Use intptr_t
iterators instead of int iterators.
* gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_5.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_6.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_7.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_8.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_3.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_4.c: Likewise.
Richard Sandiford [Thu, 3 Feb 2022 10:44:00 +0000 (10:44 +0000)]
aarch64: Add missing movmisalign patterns
The Advanced SIMD movmisalign patterns didn't handle 16-bit
FP modes, which meant that the vector loop for:
  void
  test (_Float16 *data)
  {
    _Pragma ("omp simd")
    for (int i = 0; i < 8; ++i)
      data[i] = 1.0;
  }
would be versioned for alignment.
This was causing some new failures in aarch64/sve/single_5.c:
FAIL: gcc.target/aarch64/sve/single_5.c scan-assembler-not \\tb
FAIL: gcc.target/aarch64/sve/single_5.c scan-assembler-not \\tcmp
FAIL: gcc.target/aarch64/sve/single_5.c scan-assembler-times \\tstr\\tq[0-9]+, 10
but I didn't look into what changed from earlier releases.
Adding the missing modes removes some existing xfails.
gcc/
* config/aarch64/aarch64-simd.md (movmisalign<mode>): Extend from
VALL to VALL_F16.
gcc/testsuite/
* gcc.target/aarch64/sve/single_5.c: Remove some XFAILs.
Richard Sandiford [Thu, 3 Feb 2022 10:44:00 +0000 (10:44 +0000)]
aarch64: Remove VALL_F16MOV iterator
The VALL_F16MOV iterator now has the same modes as VALL_F16,
in the same order. This patch removes the former in favour
of the latter.
This doesn't fix a bug as such, but it's ultra-safe (no change in
object code) and it saves a follow-up patch from having to make
a false choice between the iterators.
gcc/
* config/aarch64/iterators.md (VALL_F16MOV): Delete.
* config/aarch64/aarch64-simd.md (mov<mode>): Use VALL_F16 instead
of VALL_F16MOV.
Richard Sandiford [Thu, 3 Feb 2022 10:43:59 +0000 (10:43 +0000)]
testsuite: Remove TSVC XFAILs for SVE
Many of the XFAILed TSVC tests pass for SVE. This patch updates
the markup accordingly.
gcc/testsuite/
* gcc.dg/vect/tsvc/vect-tsvc-s1115.c: Don't XFAIL for SVE.
* gcc.dg/vect/tsvc/vect-tsvc-s114.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s1161.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s1232.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s124.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s1279.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s161.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s253.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s257.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s271.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s2711.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s2712.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s272.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s273.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s274.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s276.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s278.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s279.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s3111.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s4113.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s441.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s443.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s491.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-vas.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-vif.c: Likewise.
Richard Sandiford [Thu, 3 Feb 2022 10:43:59 +0000 (10:43 +0000)]
testsuite: Update guality xfails for aarch64*-*-*
Following on from GCC 11 patch g:f31ddad8ac8, this one gives clean
guality.exp test results for aarch64-linux-gnu with modern gdb
(this time gdb 11.2).
The justification is the same as previously:
------
For people using older gdbs, it will trade one set of noisy results for
another set. I still think it's better to have the xfails based on
one “clean” and “modern” run rather than have FAILs and XPASSes for
all runs.
It's hard to tell which of these results are aarch64-specific and
which aren't. If other target maintainers want to do something similar,
and are prepared to assume the same gdb version, then it should become
clearer over time which ones are target-specific and which aren't.
There are no new skips here, so changes in test results will still
show up as XPASSes.
I've not analysed the failures or filed PRs for them. In some
ways the guality directory itself seems like the best place to
start looking for xfails, if someone's interested in working
in this area.
------
gcc/testsuite/
* gcc.dg/guality/ipa-sra-1.c: Update aarch64*-*-* xfails.
* gcc.dg/guality/pr54519-1.c: Likewise.
* gcc.dg/guality/pr54519-3.c: Likewise.
Martin Liska [Thu, 3 Feb 2022 09:19:33 +0000 (10:19 +0100)]
Fix wording for: attribute ‘-xyz’ argument ‘target’ is unknown
gcc/ChangeLog:
* config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
Change subject and object in the error message.
* config/s390/s390.cc (s390_valid_target_attribute_inner_p):
Likewise.
Martin Liska [Thu, 3 Feb 2022 08:55:59 +0000 (09:55 +0100)]
s390x: Fix one more -Wformat-diag.
gcc/ChangeLog:
* config/s390/s390.cc (s390_valid_target_attribute_inner_p):
Use the error message for i386 target.
Jakub Jelinek [Thu, 3 Feb 2022 08:45:16 +0000 (09:45 +0100)]
ranger: Fix up wi_fold_in_parts for small precision types [PR104334]
The wide-int.h templates expect that when an int/long etc. operand is used
it will be sign-extended based on the type's precision.
wi_fold_in_parts passes 3 such non-zero constants (1, 3 and 4) to wi::lt_p,
wi::gt_p and wi::eq_p, which means it was doing weird things if some of
1, 3 or 4 weren't representable in type, or if type was an unsigned 3 bit
type, for which 4 should be written as -4.
The following patch promotes the subtraction operands to widest_int and
uses that as the type for ?h_range variables and compares them as such.
We don't need the overflow handling because there is never an overflow.
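The point about small precisions can be sketched in plain C++ without the wide-int.h machinery. The helper below is hypothetical (it is not part of GCC); it only shows what "sign-extended based on the type's precision" means for the constants mentioned above:

```cpp
#include <cassert>

// Hypothetical helper, not a GCC API: keep the low PREC bits of V and
// sign-extend them, i.e. reinterpret V as a PREC-bit signed value.
long long
sign_extend_to_precision (long long v, unsigned prec)
{
  assert (prec >= 1 && prec <= 64);
  unsigned long long mask = prec >= 64 ? ~0ULL : (1ULL << prec) - 1;
  unsigned long long low = static_cast<unsigned long long> (v) & mask;
  unsigned long long sign = 1ULL << (prec - 1);
  // Flipping and then subtracting the sign bit propagates it upwards.
  // The conversion back to signed is modular on GCC (and in C++20).
  return static_cast<long long> ((low ^ sign) - sign);
}
```

In a 3 bit type the bit pattern 100 is 4 when read as unsigned, but sign_extend_to_precision (4, 3) yields -4, while 1 and 3 are unchanged. In a 1 bit type even the constant 1 turns into -1.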
2022-02-02 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/104334
* range-op.cc (range_operator::wi_fold_in_parts): Change lh_range
and rh_range type to widest_int and subtract in widest_int. Remove
ov_rh, ov_lh and sign vars, always perform comparisons as signed
and use >, < and == operators for it.
* g++.dg/opt/pr104334.C: New test.
Jakub Jelinek [Thu, 3 Feb 2022 08:01:07 +0000 (09:01 +0100)]
openmp, fortran: Improve !$omp atomic checks [PR104328]
The testcase shows some cases that weren't verified and we ICE on
invalid because of that.
One problem is that unlike before, we weren't checking if some expression
is EXPR_VARIABLE with non-NULL symtree in the case where there was
a conversion around it.
The other two issues are that we check that in an IF ->block is non-NULL
and then immediately dereference ->block->next->op, but on invalid
code with no statements in the then clause ->block->next might be NULL.
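The shape of the missing check can be sketched as follows. The code_node type and the function are hypothetical stand-ins, not the gfortran data structures; the sketch only shows guarding ->block->next before reading ->op:

```cpp
// Hypothetical, simplified statement node (not gfortran's own type).
struct code_node
{
  int op;
  code_node *next;
  code_node *block;
};

// Returns the op of the first statement in the then clause, or -1 if
// the IF has no block or the block has no statements.
int
first_then_op (const code_node *if_stmt)
{
  if (if_stmt == nullptr
      || if_stmt->block == nullptr
      || if_stmt->block->next == nullptr)
    return -1;
  return if_stmt->block->next->op;
}
```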
2022-02-02 Jakub Jelinek <jakub@redhat.com>
PR fortran/104328
* openmp.cc (is_scalar_intrinsic_expr): If must_be_var && conv_ok
and expr is conversion, verify it is a conversion from EXPR_VARIABLE
with non-NULL symtree. Check ->block->next before dereferencing it.
* gfortran.dg/gomp/atomic-27.f90: New test.
Jason Merrill [Wed, 2 Feb 2022 22:49:02 +0000 (17:49 -0500)]
c++: dependent array bounds completion [PR104302]
The patch for PR55227 changed the minimal init-list handling in
cp_complete_array_type to a call to reshape_init, which broke on the
dependent initializer. It occurred to me that trying to deduce the array
size from a dependent init-list is wrong in general, so let's just not. I
also limited the reshape_init call to the case of a char array, as before
the patch for 55227; that's the only case where we want to strip a level of
braces from an array.
PR c++/104302
gcc/cp/ChangeLog:
* decl.cc (maybe_deduce_size_from_array_init): Give up
on type-dependent init.
(cp_complete_array_type): Only call reshape_init for character
array.
gcc/testsuite/ChangeLog:
* g++.dg/template/array35.C: New test.
* g++.dg/template/array36.C: New test.
Martin Sebor [Thu, 3 Feb 2022 00:47:52 +0000 (17:47 -0700)]
Correct typos in -Wuse-after-free description.
gcc/ChangeLog:
* common.opt (-Wuse-after-free): Correct typos.
GCC Administrator [Thu, 3 Feb 2022 00:16:22 +0000 (00:16 +0000)]
Daily bump.
David Malcolm [Wed, 2 Feb 2022 21:45:29 +0000 (16:45 -0500)]
docs: mention analyzer interaction with -ftrivial-auto-var-init [PR104270]
gcc/ChangeLog:
PR analyzer/104270
* doc/invoke.texi (-ftrivial-auto-var-init=): Add reference to
-Wanalyzer-use-of-uninitialized-value to paragraph documenting that
-ftrivial-auto-var-init= doesn't suppress warnings.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Segher Boessenkool [Wed, 2 Feb 2022 20:15:46 +0000 (20:15 +0000)]
rs6000/testsuite: Return 0 for powerpc_altivec_ok on other targets
2022-02-02 Segher Boessenkool <segher@kernel.crashing.org>
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_powerpc_altivec_ok):
Return 0 if the target is not Power. Restructure and add some comments.
Jonathan Wakely [Wed, 2 Feb 2022 11:40:28 +0000 (11:40 +0000)]
libstdc++: Fix -Wunused-variable warning for -fno-exceptions build
If _GLIBCXX_THROW_OR_ABORT expands to just __builtin_abort() then the
bool variable used in the filesystem_error constructor is unused. Mark
it as maybe_unused so there's no warning for -fno-exceptions builds.
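A minimal sketch of the situation, assuming a stand-in macro (SKETCH_THROW_OR_ABORT and pop_level are hypothetical names, not the libstdc++ code): when the macro discards its argument, a variable used only inside that argument becomes unused.

```cpp
#include <cstdlib>
#include <stdexcept>

// Hypothetical stand-in for a throw-or-abort macro: throws when
// exceptions are enabled, otherwise drops its argument and aborts.
#if defined(__cpp_exceptions)
# define SKETCH_THROW_OR_ABORT(exc) (throw (exc))
#else
# define SKETCH_THROW_OR_ABORT(exc) (std::abort ())
#endif

// Returns the depth after popping one level; reports an error if there
// is nothing left to pop.
int
pop_level (int depth)
{
  // Only read inside the macro argument, so without exceptions it would
  // be reported by -Wunused-variable unless marked maybe_unused.
  [[maybe_unused]] const bool negative = depth < 0;
  if (depth <= 0)
    SKETCH_THROW_OR_ABORT (std::runtime_error (negative
                                               ? "negative depth"
                                               : "already empty"));
  return depth - 1;
}
```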
libstdc++-v3/ChangeLog:
* src/c++17/fs_dir.cc (fs::recursive_directory_iterator::pop):
Add [[maybe_unused]] attribute.
* src/filesystem/dir.cc (fs::recursive_directory_iterator::pop):
Likewise.
Jonathan Wakely [Wed, 2 Feb 2022 17:04:58 +0000 (17:04 +0000)]
libstdc++: Fix invalid instantiations in tests
These tests instantiate std::multiset and std::set with a type that has
no operator< so they should use a custom comparison function.
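For illustration, assuming a made-up element type (Item and ItemLess are hypothetical and do not come from the tests), a set over a type without operator< needs its comparison passed as the second template argument:

```cpp
#include <cstddef>
#include <set>

// Hypothetical element type with no operator<.
struct Item
{
  int id;
};

// Custom strict weak ordering used in place of std::less<Item>.
struct ItemLess
{
  bool
  operator() (const Item &a, const Item &b) const
  {
    return a.id < b.id;
  }
};

// Counts distinct ids by inserting the range into a set that uses the
// custom comparison function.
std::size_t
count_distinct (const Item *first, const Item *last)
{
  std::set<Item, ItemLess> s (first, last);
  return s.size ();
}
```

The same second template argument works for std::multiset.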
libstdc++-v3/ChangeLog:
* testsuite/23_containers/multiset/operators/cmp_c++20.cc: Use
custom comparison function for multiset.
* testsuite/23_containers/set/operators/cmp_c++20.cc: Use custom
comparison function for set.
Jonathan Wakely [Tue, 25 Jan 2022 21:29:31 +0000 (21:29 +0000)]
libstdc++: Fix link failure in _OutputIteratorConcept
The C++98-style concept check for output iterators causes a link
failure on mingw-w64, because the __val() member function isn't defined.
Change it to use a function pointer instead. That pointer is never set
to anything meaningful, but it doesn't matter as the __constraints()
function only has to be instantiated, it's never called.
We could refactor all of these to use unevaluated contexts (e.g. sizeof
or __decltype) so that we only check the expressions are well-formed,
without any codegen at all. Any improvements to these are very low
priority though.
libstdc++-v3/ChangeLog:
* include/bits/boost_concept_check.h (_OutputIteratorConcept):
Change member function to data member of function pointer type.
Martin Liska [Wed, 2 Feb 2022 13:21:51 +0000 (14:21 +0100)]
lto: fix error handling for -Wl,-plugin-opt=debug
When one uses something like: -Wl,-plugin-opt=debug,
we end up with lto1 WPA invocation that has 'debug'
on command line. We interpret that as input filename.
The patch moves resolution checking later so that we end up with
a reasonable error message:
lto1: fatal error: open debug failed: No such file or directory
compilation terminated.
PR lto/104333
gcc/lto/ChangeLog:
* lto-common.cc (read_cgraph_and_symbols): Move resolution
checking for number of files later and report a reasonable
error message.
* lto-object.cc (lto_obj_file_open): Make error fatal.
Martin Liska [Wed, 2 Feb 2022 15:04:18 +0000 (16:04 +0100)]
Remove dead macro: TEXT_SECTION_NAME
gcc/ChangeLog:
* dwarf2out.cc (TEXT_SECTION_NAME): Remove unused macro.
David Malcolm [Fri, 28 Jan 2022 18:37:51 +0000 (13:37 -0500)]
analyzer: fix missing check for uninit of return values
When moving the -fanalyzer tests for -ftrivial-auto-var-init to the
"torture" subdirectory of gcc.dg/analyzer I noticed that -fanalyzer
wasn't always properly checking for initialization of return values.
The issue was that some "return" handling was using
region_model::copy_region to copy to the RESULT_DECL, and copy_region
wasn't checking for poisoned svalues.
This patch eliminates region_model::copy_region in favor of simply
doing a get_rvalue/set_value pair, fixing the issue.
gcc/analyzer/ChangeLog:
* region-model.cc (region_model::on_return): Replace usage of
copy_region with get_rvalue/set_value pair.
(region_model::pop_frame): Likewise.
(selftest::test_compound_assignment): Likewise.
* region-model.h (region_model::copy_region): Delete decl.
* region.cc (region_model::copy_region): Delete.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/torture/ubsan-1.c: Add missing return stmts.
* gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c: Move
to...
* gcc.dg/analyzer/torture/uninit-trivial-auto-var-init-pattern.c:
...here.
* gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c:
Move to...
* gcc.dg/analyzer/torture/uninit-trivial-auto-var-init-uninitialized.c:
...here.
* gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c: Move to...
* gcc.dg/analyzer/torture/uninit-trivial-auto-var-init-zero.c: ...here.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Tue, 1 Feb 2022 20:48:26 +0000 (15:48 -0500)]
analyzer: consolidate duplicate code in region::calc_offset
gcc/analyzer/ChangeLog:
* region.cc (region::calc_offset): Consolidate effectively
identical cases.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Fri, 28 Jan 2022 21:15:44 +0000 (16:15 -0500)]
analyzer: implement bit_range_region
GCC 12 has gained -Wanalyzer-use-of-uninitialized-value, and I'm
seeing various false positives from it due to region_model::get_lvalue
not properly handling BIT_FIELD_REF, and falling back to
using an UNKNOWN_REGION for them.
This patch fixes these false positives by implementing a new
bit_range_region region subclass for handling BIT_FIELD_REF.
gcc/analyzer/ChangeLog:
* analyzer.h (class bit_range_region): New forward decl.
* region-model-manager.cc (region_model_manager::get_bit_range):
New.
(region_model_manager::log_stats): Handle m_bit_range_regions.
* region-model.cc (region_model::get_lvalue_1): Handle
BIT_FIELD_REF.
* region-model.h (region_model_manager::get_bit_range): New decl.
(region_model_manager::m_bit_range_regions): New field.
* region.cc (region::get_base_region): Handle RK_BIT_RANGE.
(region::base_region_p): Likewise.
(region::calc_offset): Likewise.
(bit_range_region::dump_to_pp): New.
(bit_range_region::get_byte_size): New.
(bit_range_region::get_bit_size): New.
(bit_range_region::get_byte_size_sval): New.
(bit_range_region::get_relative_concrete_offset): New.
* region.h (enum region_kind): Add RK_BIT_RANGE.
(region::dyn_cast_bit_range_region): New vfunc.
(class bit_range_region): New.
(is_a_helper <const bit_range_region *>::test): New.
(default_hash_traits<bit_range_region::key_t>): New.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/torture/uninit-bit-field-ref.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Fri, 28 Jan 2022 16:02:09 +0000 (11:02 -0500)]
analyzer: stop -ftrivial-auto-var-init from suppressing uninit warnings [PR104270]
GCC 12 has gained two features for dealing with uninitialized variables:
(a) a new -Wanalyzer-use-of-uninitialized-value warning within -fanalyzer
for interprocedural path-sensitive detection of uninit uses, and
(b) a new -ftrivial-auto-var-init option for mitigating some uses of
uninit variables
It turns out that using (b) was thwarting (a), as it led to -fanalyzer
seeing calls to IFN_DEFERRED_INIT, which -fanalyzer wasn't
special-casing, thus treating it as initializing the variables in
question, and thus silencing -Wanalyzer-use-of-uninitialized-value on
them.
invoke.texi says:
"GCC still considers an automatic variable that doesn't have an explicit
initializer as uninitialized, @option{-Wuninitialized} will still report
warning messages on such automatic variables."
and thus -Wanalyzer-use-of-uninitialized-value ought to as well.
This patch adds special-case handling to -fanalyzer for
IFN_DEFERRED_INIT, so that -fanalyzer will warn on uninit uses of
variables that are mitigated by -ftrivial-auto-var-init.
gcc/analyzer/ChangeLog:
PR analyzer/104270
* region-model.cc (region_model::on_call_pre): Handle
IFN_DEFERRED_INIT.
gcc/testsuite/ChangeLog:
PR analyzer/104270
* gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c: New
test.
* gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c:
New test.
* gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Bernd Kuhls [Fri, 27 Mar 2020 20:23:53 +0000 (21:23 +0100)]
gcc: define _REENTRANT for OpenRISC when -pthread is passed
The detection of pthread support fails on OpenRISC unless _REENTRANT
is defined. Added the CPP_SPEC definition to correct this.
gcc/ChangeLog:
PR target/94372
* config/or1k/linux.h (CPP_SPEC): Define.
Signed-off-by: Bernd Kuhls <bernd.kuhls@t-online.de>
Tamar Christina [Wed, 2 Feb 2022 10:52:17 +0000 (10:52 +0000)]
AArch32: use canonical ordering for complex mul, fma and fms
After the first patch in the series this updates the optabs to expect the
canonical sequence.
gcc/ChangeLog:
PR tree-optimization/102819
PR tree-optimization/103169
* config/arm/vec-common.md (cml<fcmac1><conj_op><mode>4): Use
canonical order.
Tamar Christina [Wed, 2 Feb 2022 10:51:38 +0000 (10:51 +0000)]
AArch64: use canonical ordering for complex mul, fma and fms
After the first patch in the series this updates the optabs to expect the
canonical sequence.
gcc/ChangeLog:
PR tree-optimization/102819
PR tree-optimization/103169
* config/aarch64/aarch64-simd.md (cml<fcmac1><conj_op><mode>4): Use
canonical order.
* config/aarch64/aarch64-sve.md (cml<fcmac1><conj_op><mode>4): Likewise.
Tamar Christina [Wed, 2 Feb 2022 10:39:03 +0000 (10:39 +0000)]
vect: Simplify and extend the complex numbers validation routines.
This patch boosts the analysis for complex mul, fma and fms in order to ensure
that it doesn't create an incorrect output.
Essentially it adds an extra verification to check that the two nodes it's going
to combine do the same operations on compatible values. The reason it needs to
do this is that if one computation differs from the other then with the current
implementation we have no way to deal with it since we have to remove the
permute.
When we can keep the permute around we can probably handle these by unrolling.
While implementing this since I have to do the traversal anyway I took advantage
of it by simplifying the code a bit. Previously we would determine whether
something is a conjugate and then try to figure out which conjugate it is and
then try to see if the permutes match what we expect.
Now the code that does the traversal will detect this in one go and return to us
whether the operation is something that can be combined and whether a conjugate
is present.
Secondly because it does this I can now simplify the checking code itself to
essentially just try to apply fixed patterns to each operation.
The patterns represent the order operations should appear in. For instance a
complex MUL operation combines :
Left 1 + Right 1
Left 2 + Right 2
with a permute on the nodes consisting of:
{ Even, Even } + { Odd, Odd }
{ Even, Odd } + { Odd, Even }
By abstracting over these patterns the checking code becomes quite simple.
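One plausible reading of the lane pattern above, written as scalar code, is the usual complex multiply on interleaved data. This is only a sketch under that assumption (the function name and the double element type are hypothetical), with real parts in even lanes and imaginary parts in odd lanes:

```cpp
#include <cstddef>

// Multiplies n complex numbers stored interleaved as
// { re, im, re, im, ... }: even indices are real parts, odd indices
// are imaginary parts.
void
complex_mul_interleaved (const double *a, const double *b, double *out,
                         std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    {
      // Real part combines { Even, Even } with { Odd, Odd }.
      out[2 * i] = a[2 * i] * b[2 * i] - a[2 * i + 1] * b[2 * i + 1];
      // Imaginary part combines { Even, Odd } with { Odd, Even }.
      out[2 * i + 1] = a[2 * i] * b[2 * i + 1] + a[2 * i + 1] * b[2 * i];
    }
}
```

For example, (1 + 2i) * (3 + 4i) stored as {1, 2} and {3, 4} gives {-5, 10}.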
As part of this I was checking the order of the operands which was left in
"slp" order, i.e. the same order they showed up in during SLP, which means
that the accumulator is first. However it looks like I didn't document this
and the x86 optab was implemented assuming the same order as FMA, i.e. that
the accumulator is last.
I have thus changed the order to match that of FMA and FMS which corrects the
x86 codegen and will update the Arm targets. This has now also been
documented.
gcc/ChangeLog:
PR tree-optimization/102819
PR tree-optimization/103169
* doc/md.texi: Update docs for cfms, cfma.
* tree-data-ref.h (same_data_refs): Accept optional offset.
* tree-vect-slp-patterns.cc (is_linear_load_p): Fix issue with repeating
patterns.
(vect_normalize_conj_loc): Remove.
(is_eq_or_top): Change to take two nodes.
(enum _conj_status, compatible_complex_nodes_p,
vect_validate_multiplication): New.
(class complex_add_pattern, complex_add_pattern::matches,
complex_add_pattern::recognize, class complex_mul_pattern,
complex_mul_pattern::recognize, class complex_fms_pattern,
complex_fms_pattern::recognize, class complex_operations_pattern,
complex_operations_pattern::recognize, addsub_pattern::recognize): Pass
new cache.
(complex_fms_pattern::matches, complex_mul_pattern::matches): Pass new
cache and use new validation code.
* tree-vect-slp.cc (vect_match_slp_patterns_2, vect_match_slp_patterns,
vect_analyze_slp): Pass along cache.
(compatible_calls_p): Expose.
* tree-vectorizer.h (compatible_calls_p, slp_node_hash,
slp_compat_nodes_map_t): New.
(class vect_pattern): Update signatures to include new cache.
gcc/testsuite/ChangeLog:
PR tree-optimization/102819
PR tree-optimization/103169
* g++.dg/vect/pr99149.cc: xfail for now.
* gcc.dg/vect/complex/pr102819-1.c: New test.
* gcc.dg/vect/complex/pr102819-2.c: New test.
* gcc.dg/vect/complex/pr102819-3.c: New test.
* gcc.dg/vect/complex/pr102819-4.c: New test.
* gcc.dg/vect/complex/pr102819-5.c: New test.
* gcc.dg/vect/complex/pr102819-6.c: New test.
* gcc.dg/vect/complex/pr102819-7.c: New test.
* gcc.dg/vect/complex/pr102819-8.c: New test.
* gcc.dg/vect/complex/pr102819-9.c: New test.
* gcc.dg/vect/complex/pr103169.c: New test.
Martin Sebor [Wed, 2 Feb 2022 00:19:11 +0000 (17:19 -0700)]
Declare std::array members with attribute const [PR101831].
Resolves:
PR libstdc++/101831 - Spurious maybe-uninitialized warning on std::array::size
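A minimal sketch of the idea, assuming a made-up container (fixed_array is hypothetical and the exact attribute spelling used in libstdc++ may differ): a member whose result depends only on a template argument can carry the GNU const attribute, which states that the call does not read the object's contents.

```cpp
#include <cstddef>

// Hypothetical fixed-size container, not the real std::array.
template <typename T, std::size_t N>
struct fixed_array
{
  T elems[N];

  // The result depends only on N, never on the contents of *this, so
  // the function can be declared const in the GNU attribute sense.
  [[gnu::const]] constexpr std::size_t
  size () const noexcept
  {
    return N;
  }
};

// Leaves the elements uninitialized on purpose; only size () is queried
// and no element is ever read.
inline std::size_t
size_of_uninitialized ()
{
  fixed_array<int, 4> a;
  return a.size ();
}
```

The [[gnu::const]] spelling is understood by GCC and Clang; other compilers may ignore it with a warning.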
libstdc++-v3/ChangeLog:
PR libstdc++/101831
* include/std/array (begin): Declare const member function attribute
const.
(end, rbegin, rend, size, max_size, empty, data): Same.
* testsuite/23_containers/array/capacity/empty.cc: Add test cases.
* testsuite/23_containers/array/capacity/max_size.cc: Same.
* testsuite/23_containers/array/capacity/size.cc: Same.
* testsuite/23_containers/array/iterators/begin_end.cc: New test.
Hans-Peter Nilsson [Tue, 1 Feb 2022 23:00:10 +0000 (00:00 +0100)]
cris: Reload using special-regs before general-regs
On code where reload has an effect (i.e. quite rarely, just enough to be
noticeable), this change gets code quality back to the situation prior
to "Remove CRIS v32 ACR artefacts". We had from IRA a pseudoregister
marked to be reloaded from a union of all allocatable registers (here:
SPEC_GENNONACR_REGS) but where the register-class corresponding to the
constraint for the register-type alternative (here: GENERAL_REGS) was
*not* a subset of that class: SPEC_GENNONACR_REGS (and GENNONACR_REGS)
had a one-register "hole" for the ACR register, a register present in
GENERAL_REGS.
Code in reload.cc:find_reloads adds 4 to the cost of a register-type
alternative that is neither a subset of the preferred register class nor
vice versa and thus reload thinks it can't use. It would be preferable
to look for a non-empty intersection of the two, and use that
intersection for that alternative, something that can't be expressed
because a register class can't be formed from a random register set.
The effect was here that the GENERAL_REGS to/from memory alternatives
("r") had their cost raised such that the SPECIAL_REGS alternatives
("x") looked better. This happened to improve code quality just a
little bit compared to GENERAL_REGS being chosen.
Anyway, with the improved CRIS register-class topology, the
subset-checking code no longer has the GENERAL_REGS-demoting effect.
To get the same quality, we have to adjust the port such that
SPECIAL_REGS are specifically preferred when possible and advisable,
i.e. when there's at least two of those registers as for the CPU variant
with multiplication (which happens to be the variant maintained for
performance).
For the move-pattern, the obvious method may seem to simply "curse" the
constraints of some alternatives (by prepending one of the "?!^$"
characters) but that method can't be used, because we want the effect to
be conditional on the CPU variant. It'd also be a shame to split the
"*movsi_internal<setcc><setnz><setnzvc>" into two CPU-variants (with
different cursing). Iterators would help, but it still seems unwieldy.
Instead, add copies of the GENERAL_REGS variants (to the SPECIAL_REGS
alternatives) on the "other" side, and make use of the "enabled"
attribute to activate just the desired order of alternatives.
gcc:
* config/cris/cris.cc (cris_preferred_reload_class): Reject
"eliminated" registers and small-enough constants unless
reloaded into a class that is a subset of GENERAL_REGS.
* config/cris/cris.md (attribute "cpu_variant"): New.
(attribute "enabled"): Conditionalize on a matching attribute
cpu_variant, if specified.
("*movsi_internal<setcc><setnz><setnzvc>"): For moves to and from
memory, add cpu-variant-enabled variants for "r" alternatives on
the far side of the "x" alternatives, preferring the "x" ones
only for variants where MOF is present (in addition to SRP).
Hans-Peter Nilsson [Tue, 1 Feb 2022 23:00:10 +0000 (00:00 +0100)]
cris: Don't discriminate against ALL_REGS in TARGET_REGISTER_MOVE_COST
When the tightest class including both SPECIAL_REGS and GENERAL_REGS
is ALL_REGS, artificially special-casing for *either* to or from, hits
artificially hard. This gets the port back to the code quality before
the previous patch ("cris: Remove CRIS v32 ACR artefacts") - except
for _vfprintf_r and _vfiprintf_r in newlib (still .8 and .4% larger).
gcc:
* config/cris/cris.cc (cris_register_move_cost): Remove special pre-ira
extra cost for ALL_REGS.
Hans-Peter Nilsson [Tue, 1 Feb 2022 23:00:10 +0000 (00:00 +0100)]
cris: Remove CRIS v32 ACR artefacts
This is the change to which I alluded in r11-220 /
d0780379c1b6 as "causes extra register moves in libgcc". It has
unfortunate side-effects due to the change in register-class topology.
There's a slight improvement in coremark numbers (< 0.07%) though also
increase in code size total (< 0.7%) but looking at the individual
changes in functions, it's all-over (-7..+7%). Looking specifically
at functions that improved in speed, it's also both plus and minus in
code sizes. It's unworkable to separate improvements from regressions
for this case. I'll follow up with patches to restore the previous
code quality, in both size and speed.
gcc:
* config/cris/constraints.md (define_register_constraint "b"): Now
GENERAL_REGS.
* config/cris/cris.md (CRIS_ACR_REGNUM): Remove.
* config/cris/cris.h: (reg_class, REG_CLASS_NAMES)
(REG_CLASS_CONTENTS): Remove ACR_REGS, SPEC_ACR_REGS, GENNONACR_REGS,
and SPEC_GENNONACR_REGS.
* config/cris/cris.cc (cris_preferred_reload_class): Don't mention
ACR_REGS and return GENERAL_REGS instead of GENNONACR_REGS.
Hans-Peter Nilsson [Tue, 1 Feb 2022 23:00:10 +0000 (00:00 +0100)]
cris: For expanded movsi, don't match operands we know will be reloaded
In a session investigating unexpected fallout from a change, I
noticed reload needs one operand being a register to make an
informed decision. It can happen that there's just a constant
and a memory operand, as in:
(insn 668 667 42 104 (parallel [
(set (mem:SI (plus:SI (reg/v/f:SI 347 [ fs ])
(const_int 168 [0xa8])) \
[1 fs_126(D)->regs.cfa_how+0 S4 A8])
(const_int 2 [0x2]))
(clobber (reg:CC 19 dccr))
]) "<...>/gcc/libgcc/unwind-dw2.c":1121:21 22 {*movsi_internal}
(expr_list:REG_UNUSED (reg:CC 19 dccr)
(nil)))
This was helpfully created by combine. When this happens,
reload can't check for costs and preferred register classes,
(both operands will start with NO_REGS as the preferred class)
and will default to the constraints order in the insn in reload.
(Which also does its own temporary merge in find_reloads, but
that's a different story.) Better not to match the simple cases.
Beware that subregs have to be matched.
I'm doing this just for word_mode (SI) for now, but may repeat
this for the other valid modes as well. In particular, that
goes for DImode as I see the expanded movdi does *almost* this,
but uses register_operand instead of REG_S_P (from cris.h).
Using REG_S_P is the right choice here because register_operand
also matches (subreg (mem ...) ...) *until* reload is done.
By itself it's just a sub-0.1% performance win (coremark).
Also removing a stale comment.
gcc:
* config/cris/cris.md ("*movsi_internal<setcc><setnz><setnzvc>"):
Conditionalize on (sub-)register operands or operand 1 being 0.
Hans-Peter Nilsson [Tue, 1 Feb 2022 23:00:09 +0000 (00:00 +0100)]
cris: Don't default to -mmul-bug-workaround
This flips the default for the errata handling for an old version
(TL;DR: workaround: no multiply instruction last on a cache-line).
Newer versions of the CRIS cpu don't have that bug. While the impact
of the workaround is very marginal (coremark: less than .05% larger,
less than .0005% slower) it's an irritating pseudorandom factor when
assessing the impact of other changes.
Also, fix a wart requiring changes to more than TARGET_DEFAULT to flip
the default.
People building old kernels or operating systems to run on
ETRAX 100 LX are advised to pass "-mmul-bug-workaround".
gcc:
* config/cris/cris.h (TARGET_DEFAULT): Don't include MASK_MUL_BUG.
(MUL_BUG_ASM_DEFAULT): New macro.
(MAYBE_AS_NO_MUL_BUG_ABORT): Define in terms of MUL_BUG_ASM_DEFAULT.
* doc/invoke.texi (CRIS Options, -mmul-bug-workaround): Adjust
accordingly.
GCC Administrator [Wed, 2 Feb 2022 00:17:16 +0000 (00:17 +0000)]
Daily bump.
Jonathan Wakely [Tue, 1 Feb 2022 23:58:08 +0000 (23:58 +0000)]
libstdc++: Do not use dirent::d_type unconditionally
These new tests should not use the d_type member unless it's actually
present on the OS.
libstdc++-v3/ChangeLog:
* testsuite/27_io/filesystem/iterators/error_reporting.cc: Use
autoconf macro to check whether d_type is present.
* testsuite/experimental/filesystem/iterators/error_reporting.cc:
Likewise.