review.tizen.org Git - platform/upstream/gcc.git/log

tighten funcspec regexps

In -mcmodel=large, callee symbols are pulled ahead of the call insns.

The patterns in funcspec-[12].c tests in gcc.target/i386 match even
line breaks between 'call' and a function symbol expected to be
called, however, so it ends up unexpectedly matching a previous,
unrelated indirect call, up to the insn that loads the address of the
intended callee to a register, for all but the first callee, that
doesn't have a call insn before it.

All of these apparent passes are false positives.  We are NOT
generating the expected call insns.

This patch fixes only the patterns, so that they won't trigger false
positives any more.  There are several dozens of other tests that fail
with -mcmodel=large for similar reasons, but I'm still not sure about
how to deal with them.  I see no point in holding up this small
improvement over the lack of a larger solution of a different problem,
though.

for  gcc/testsuite/ChangeLog

* gcc.target/i386/funcspec-2.c: Tighten regexps to avoid false
positives with -mcmodel=large.
* gcc.target/i386/funcspec-3.c: Likewise.

fix ssse3_pshufbv8qi3 post-reload const pool load

The split in ssse3_pshufbv8qi3 forces a const vector into the constant
pool, and loads from it.  That runs after reload, so if the load
requires any reloading, we're out of luck.  Indeed, if the load
address is not legitimate, e.g. -mcmodel=large, the insn is no longer
recognized.

This patch turns the constant into an input operand, introduces an
expander to generate the constant unconditionally, and arranges for
this input operand to be retained as an unused immediate in the
alternatives that don't undergo splitting, and for it to be loaded
into the scratch register for those that do.

It is now the register allocator that arranges to load the const
vector into a register, so it deals with whatever legitimizing steps
needed for the target configuration.

for  gcc/ChangeLog

* config/i386/predicates.md (reg_or_const_vec_operand): New.
* config/i386/sse.md (ssse3_pshufbv8qi3): Add an expander for
the now *-prefixed insn_and_split, turn the splitter const vec
into an input for the insn, making it an ignored immediate for
non-split cases, and loaded into the scratch register
otherwise.

for  gcc/testsuite/ChangeLog

* gcc.target/i386/pr94467-3.c: New.

Fortran: Extend buffer, use snprintf to avoid overflows [PR99369]

gcc/fortran/ChangeLog:

PR fortran/99369
* resolve.c (resolve_operator): Make 'msg' buffer larger
and use snprintf.

gcc/testsuite/ChangeLog:

PR fortran/99369
* gfortran.dg/longnames.f90: New test.

Daily bump.

[PR99581] Use relaxed memory for more aarch64 memory constraints

The original patch for PR99581 resulted in GCC testsuite regression as
some constraints were not declared as relaxed memory ones. This patch
fixes this.

gcc/ChangeLog:

PR target/99581
* config/aarch64/constraints.md (Utq, UOb, UOh, UOw, UOd, UOty):
Use define_relaxed_memory_constraint for them.

Update gcc .po files.

* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po,
ja.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po, zh_CN.po,
zh_TW.po: Update.

Darwin : Address a translation comment.

Add a ':' to make the diagnostic read 'pch_address_space': xxx.

gcc/ChangeLog:

PR target/99733
* config/host-darwin.c (darwin_gt_pch_use_address): Add a
colon to the diagnostic message.

c++: Note duplicates in symbol table [PR 99283]

I ran into this reducing 99283, we were failing to mark binding
vectors when the current TU declares a duplicate decl (as opposed to
an import introduces a duplicate).

PR c++/99283
gcc/cp/
* name-lookup.c (check_module_override): Set global or partition
DUP on the binding vector.
gcc/testsuite/
* g++.dg/modules/pr99283-1_a.H: New.
* g++.dg/modules/pr99283-1_b.H: New.

fwprop: Fix single_use_p calculation

Commit efb6bc55a93a ("fwprop: Allow (subreg (mem)) simplifications")
introduced a check that was supposed to look at the propagated def's
number of uses.  It uses insn_info::num_uses (), which in reality
returns the number of uses def's insn has.  The whole change therefore
works only by accident.

Fix by looking at set_info's uses instead of insn_info's uses.  This
requires passing around set_info instead of insn_info.

gcc/ChangeLog:

2021-03-02  Ilya Leoshkevich  <iii@linux.ibm.com>

* fwprop.c (fwprop_propagation::fwprop_propagation): Look at
set_info's uses.
(try_fwprop_subst_note): Use set_info instead of insn_info.
(try_fwprop_subst_pattern): Likewise.
(try_fwprop_subst_notes): Likewise.
(try_fwprop_subst): Likewise.
(forward_propagate_subreg): Likewise.
(forward_propagate_and_simplify): Likewise.
(forward_propagate_into): Likewise.
* rtl-ssa/accesses.h (set_info::single_nondebug_use) New
method.
(set_info::single_nondebug_insn_use): Likewise.
(set_info::single_phi_use): Likewise.
* rtl-ssa/member-fns.inl (set_info::single_nondebug_use) New
method.
(set_info::single_nondebug_insn_use): Likewise.
(set_info::single_phi_use): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/long-double-asm-abi.c: New test.

libstdc++: Improve test for views::reverse

libstdc++-v3/ChangeLog:

* testsuite/std/ranges/adaptors/reverse.cc: Replace duplicated
line with a check that uses the const being/end overloads.

MAINTAINERS: add myself as static analyzer maintainer

ChangeLog:
* MAINTAINERS: Add myself as static analyzer maintainer.

libstdc++: Avoid accidental ADL when calling make_reverse_iterator

std::ranges::reverse_view uses make_reverse_iterator in its
implementation as described in [range.reverse.view]. This accidentally
allows ADL as an unqualified name is used in the call. According to
[contents], however, this should be treated as a qualified lookup into
the std namespace.

This leads to errors due to ambiguous name lookups when another
make_reverse_iterator function is found via ADL.

libstdc++-v3/Changelog:

* include/std/ranges (reverse_view::begin, reverse_view::end):
Qualify make_reverse_iterator calls to avoid ADL.
* testsuite/std/ranges/adaptors/reverse.cc: Test that
views::reverse works when make_reverse_iterator is defined
in an associated namespace.

Add forgotten attribution on PR target/99593 testcase.

testsuite/arm: Add arm_dsp_ok effective target and use it in arm/acle/dsp_arith.c

gcc.target/arm/acle/dsp_arith.c uses DSP intrinsics, which arm_acle.h
defines only with __ARM_FEATURE_DSP, so make the test check for that
property rather than arm_qbit_ok.

However, the existing arm_dsp effective target only checks if DSP
features are supported with the current multilib rather than trying
-march and -mfloat-abi options. Thus we introduce a similar effective
target, arm_dsp_ok and associated dg-add-options.

This makes dsp_arith.c unsupported rather than failed when no option
combination is suitable, for instance when running the tests with
-mcpu=cortex-m3.

2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>

gcc/
* doc/sourcebuild.texi (arm_dsp_ok, arm_dsp): Document.

gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_arm_dsp_ok_nocache)
(check_effective_target_arm_dsp_ok, add_options_for_arm_dsp): New.
* gcc.target/arm/acle/dsp_arith.c: Use arm_dsp_ok effective target
and add arm_dsp options.

testsuite/arm: Fix -mfloat-abi order in arm_v8_1m_mve_ok_nocache and arm_v8_1m_mve_fp_ok_nocache

Make the order in which we try -mfloat-abi options consistent with the
other similar effective targets: try softfp first, then hard.

This shows that a few tests implicitly rely on -mfloat-abi=hard, so we
add this option via dg-additional-options so that it comes after any
potential -mfloat-abi option that the preceding effective-targets
might have added.

armv8_1m-fpXX-move-1.c tests don't need arm_hard_ok because they don't
include arm_mve.h: adding -mfloat-abi=hard when using a soft/softfp
toolchain does not lead to the missing include gnu/stubs-*.h error.

This patch makes armv8_1m-fpXX-move-1.c pass on arm-linux-gnueabi, and
the other tests become unsupported (instead of fail) on this target.

On arm-eabi with default cpu/fpu/mode and a+rm multilibs, the same
mve/intrinsics/* tests become unsupported instead of pass because
arm_hard_ok fails with "selected processor lacks an FPU". Since we
also override the fpu via dg-options, we'd need another effective
target (say arm_hard_mve_ok) that would check -mfloat-abi=hard
-mfpu=auto -march=armv8.1-m.main+mve.fp at the same time. But we have
already so many arm effective targets, it doesn't seem like a good way
forward.

2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>

gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_arm_v8_1m_mve_fp_ok_nocache): Fix
-mfloat-abi= options order.
(check_effective_target_arm_v8_1m_mve_ok_nocache): Likewise
* gcc.target/arm/mve/intrinsics/mve_vector_float2.c: Add
arm_hard_ok effective target and -mfloat-abi=hard additional
option.
* gcc.target/arm/mve/intrinsics/mve_vector_int.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_uint.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_uint1.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_uint2.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c: Likewise.
* gcc.target/arm/armv8_1m-fp16-move-1.c: Add -mfloat-abi=hard
additional option.
* gcc.target/arm/armv8_1m-fp32-move-1.c: Likewise.
* gcc.target/arm/armv8_1m-fp64-move-1.c: Likewise.

testsuite/arm: Fix -mfloat-abi order in arm_v8_2a_bf16_neon_ok_nocache and arm_v8_2a_i8mm_ok_nocache

Make the order in which we try -mfloat-abi options consistent with the
other similar effective targets: try softfp first, then hard.

This shows that a few tests implicitly rely on -mfloat-abi=hard, so we
now check arm_hard_ok where needed.

This makes these tests unsupported rather than fail on
arm-linux-gnueabi.

2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>

gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_arm_v8_2a_i8mm_ok_nocache): Fix
-mfloat-abi= options order.
(check_effective_target_arm_v8_2a_bf16_neon_ok_nocache): Likewise.
* gcc.target/arm/bfloat16_scalar_1_1.c: Add arm_hard_ok effective
target and -mfloat-abi=hard additional option.
* gcc.target/arm/bfloat16_simd_1_1.c: Likewise.
* gcc.target/arm/simd/bf16_ma_1.c: Likewise.
* gcc.target/arm/simd/bf16_mmla_1.c: Likewise.
* gcc.target/arm/simd/vdot-2-1.c: Likewise.
* gcc.target/arm/simd/vdot-2-2.c: Likewise.

testsuite/arm: Add arm_hard_ok check in armv8_2-fp16-scalar-2.c

This test relies on -mfloat-abi=hard to pass (otherwise
test_mov_imm_[12] directly build the 1.0 fp16 representation via movw
r0, #15360 rather than using vmov.f16 s0, #1.0e+0 as expected by
scan-assembler-times)

Adding the arm_hard_ok check makes the test unsupported eg. on
arm-linux-gnueabi instead of reporting a failure.

2021-03-20 Christophe Lyon <christophe.lyon@linaro.org>

gcc/testsuite/
* gcc.target/arm/armv8_2-fp16-scalar-2.c: Add arm_hard_ok.

testsuite/arm: Add arm_softfp_ok or arm_hard_ok as needed.

Several tests override the -mfloat-abi option detected by their
effective targets. Make sure it is supported, so that these tests are
unsupported rather than failures (the inclusion of arm_neon.h
otherwise fails for lack of gnu/stubs-*.h)

This avoids failures with
bfloat16_simd_2_1.c
bfloat16_simd_3_1.c
bf16_vldn_1.c
bf16_vstn_1.c on arm-linux-gnueabi
and
pr51968.c
bfloat16_simd_1_2.c
bfloat16_simd_2_2.c
bfloat16_simd_3_2.c on arm-linux-gnueabihf.

On arm-eabi with default cpu/fpu/mode and a+rm multilibs,
bfloat16_simd_2_1.c, bfloat16_simd_3_1.c, bf16_vstn_1.c and
bf16_vldn_1.c become unsupported instead of pass because arm_hard_ok
fails with "selected processor lacks an FPU". Since we also override
the fpu in dg-additional-options, we'd need another effective target
(say arm_hard_neon_ok) that would check -mfloat-abi=hard -mfpu=neon at
the same time. But we have already so many arm effective targets, it
doesn't seem like a good way forward.

2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>

gcc/testsuite/
* gcc.target/arm/bfloat16_simd_1_2.c: Add arm_softfp_ok.
* gcc.target/arm/bfloat16_simd_2_2.c: Likewise.
* gcc.target/arm/bfloat16_simd_3_2.c: Likewise.
* gcc.target/arm/pr51968.c: Likewise.
* gcc.target/arm/bfloat16_simd_2_1.c: arm_hard_ok.
* gcc.target/arm/bfloat16_simd_3_1.c: Likewise.
* gcc.target/arm/simd/bf16_vldn_1.c: Likewise.
* gcc.target/arm/simd/bf16_vstn_1.c: Likewise.

testsuite/arm: Remove useless -mfloat-abi option

These tests pass with their current dg-add-options, no need to force
-mfloat=abi.

I've noticed no impact on armv8_1m-shift-imm-1.c and
armv8_1m-shift-reg-1.c, bf16_reinterpret.c now passes on
arm-linux-gnueabi and bf16_dup.c now passes on arm-linux-gnueabihf.

This allows pr51534.c to pass when forcing -mfloat-abi=soft in
runtestflags, otherwise we get an error '-mfloat-abi=soft and
-mfloat-abi=hard may not be used together' because we try to compile
with both flags.

2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>

gcc/testsuite/
* gcc.target/arm/armv8_1m-shift-imm-1.c: Remove -mfloat=abi option.
* gcc.target/arm/armv8_1m-shift-reg-1.c: Likewise.
* gcc.target/arm/bf16_dup.c: Likewise.
* gcc.target/arm/bf16_reinterpret.c: Likewise.
* gcc.target/arm/pr51534.c: Remove -mfloat=abi option.

testsuite/arm: Add arm_v8_2a_i8mm options in gcc.target/arm/simd/vmmla_1.c

We need to add the options corresponding to the arm_v8_2a_i8mm_ok
effective target in order to use the right float-abi option:
-mfloat-abi=softfp makes the test pass for arm-linux-gnueabi,
while no -mfloat-abi option is needed for arm-linux-gnueabihf.

2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>

gcc/testsuite/
* gcc.target/arm/simd/vmmla_1.c: Add arm_v8_2a_i8mm options.

testsuite/arm: Add arm_v8_2a_fp16_neon and arm_v8_2a_bf16_neon options

A few tests lack the dg-add-options directives associated with the
dg-require-effective-target they are using. Adding them enables to
pass the right float-abi option, and thus make the tests pass instead
of emit an error.

For instance, we now pass -mfloat-abi=softfp on arm-linux-gnueabi
targets and the tests pass.

2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>

gcc/testsuite/
* gcc.target/arm/bfloat16_scalar_typecheck.c: Add
arm_v8_2a_fp16_neon and arm_v8_2a_bf16_neon.
* gcc.target/arm/bfloat16_vector_typecheck_1.c: Likewise.
* gcc.target/arm/bfloat16_vector_typecheck_2.c: Likewise.

libstdc++: Disable "ALT128" long double support for Clang

Clang does not currently support the __ibm128 type [1] and only supports
the __ieee128 type in the unreleased 12.0.0 version [2]. That means it
is not possible to provide support for -mabi=ieeelongdouble with Clang
in an ABI compatible way (as we do for GCC by defining new facets and
other types in the __gnu_cxx_ldbl128 namespace).

By preventing the definition of _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT when
compiling with Clang, all uses of __ibm128 and __ieee128 types will be
disabled. This can be revisited in future when Clang supports the types
(and provides a way to detect that support using the preprocessor).

[1] https://reviews.llvm.org/D93377
[2] https://reviews.llvm.org/D97846

libstdc++-v3/ChangeLog:

* include/bits/c++config (_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT):
Do not define when compiling with Clang.

c++: Fix bogus warning in deprecated namespace [PR99318]

In GCC 10, I introduced cp_warn_deprecated_use_scopes so that we can
handle attribute deprecated on a namespace declaration.  This
function walks the decl's contexts so that we warn for code like

  namespace [[deprecated]] N { struct S { }; }
  N::S s;

We call cp_warn_deprecated_use_scopes when we encounter a TYPE_DECL.
But in the following testcase we have a TYPE_DECL whose context is
a deprecated function; that itself is not a reason to warn.  This
patch limits for which entities we call cp_warn_deprecated_use;
essentially it's what can follow ::.

I noticed that we didn't test that

  struct [[deprecated]] S { static void fn(); };
  S::fn();

produces the expected warning, so I've added gen-attrs-73.C.

gcc/cp/ChangeLog:

PR c++/99318
* decl2.c (cp_warn_deprecated_use_scopes): Only call
cp_warn_deprecated_use when decl is a namespace, class, or enum.

gcc/testsuite/ChangeLog:

PR c++/99318
* g++.dg/cpp0x/attributes-namespace6.C: New test.
* g++.dg/cpp0x/gen-attrs-73.C: New test.

Fortran: Fix func decl mismatch [PR93660]

gcc/fortran/ChangeLog:

PR fortran/93660
* trans-decl.c (build_function_decl): Add comment;
increment hidden_typelist for caf_token/caf_offset.
* trans-types.c (gfc_get_function_type): Add comment;
add missing caf_token/caf_offset args.

gcc/testsuite/ChangeLog:

PR fortran/93660
* gfortran.dg/gomp/declare-simd-coarray-lib.f90: New test.

aarch64: Make aarch64_add_offset work with -ftrapv [PR99540]

aarch64_add_offset uses expand_mult to multiply the SVE VL by an
out-of-range constant.  expand_mult takes an argument to indicate
whether the multiplication is signed or unsigned, but in this
context the multiplication is effectively signless and so the
choice seemed arbitrary.

However, one of the things that the signedness input does is
indicate whether signed overflow should be trapped for -ftrapv.
We don't want that here, so we must treat the multiplication
as unsigned.

gcc/
2021-03-23  Jakub Jelinek  <jakub@redhat.com>

PR target/99540
* config/aarch64/aarch64.c (aarch64_add_offset): Tell
expand_mult to perform an unsigned rather than a signed
multiplication.

gcc/testsuite/
2021-03-23  Richard Sandiford  <richard.sandiford@arm.com>

PR target/99540
* gcc.dg/vect/pr99540.c: New test.

x86: Add __volatile__ to __cpuid and __cpuid_count

Since CPUID instruction may return different values on hybrid core.
volatile is needed on asm statements in <cpuid.h>.

PR target/99704
* config/i386/cpuid.h (__cpuid): Add __volatile__.
(__cpuid_count): Likewise.

c++: Over-zealous assert [PR 99239]

This was simply an overzealous assert. Possibly correct thinking at
the time that code was written, but not true now. Of course we can
have imported artificial decls.

PR c++/99239
gcc/cp/
* decl.c (duplicate_decls): Remove assert about maybe-imported
artificial decls.
gcc/testsuite/
* g++.dg/modules/pr99239_a.H: New.
* g++.dg/modules/pr99239_b.H: New.

tree-optimization/99721 - avoid SLP nodes we cannot schedule

This makes sure we'll not run into SLP scheduling issues later by
rejecting all-constant children nodes without any scalar stmts early.

2021-03-23 Richard Biener <rguenther@suse.de>

PR tree-optimization/99721
* tree-vect-slp.c (vect_slp_analyze_node_operations):
Make sure we can schedule the node.

* gfortran.dg/vect/pr99721.f90: New testcase.

RISC-V: Fix riscv_subword() for big endian

gcc/
* config/riscv/riscv.c (riscv_subword): Take endianness into
account when calculating the byte offset.

RISC-V: Fix matches against subreg with a bytenum of 0 in riscv.md

These all intend the least significant subpart of the register.
Use the same endian-neutral "subreg_lowpart_operator" predicate that
ARM does instead.

gcc/
* config/riscv/predicates.md (subreg_lowpart_operator): New predicate
* config/riscv/riscv.md (*addsi3_extended2, *subsi3_extended2)
(*negsi2_extended2, *mulsi3_extended2, *<optab>si3_mask)
(*<optab>si3_mask_1, *<optab>di3_mask, *<optab>di3_mask_1)
(*<optab>si3_extend_mask, *<optab>si3_extend_mask_1): Use
new predicate "subreg_lowpart_operator"

RISC-V: Update shift-shift-5.c testcase for big endian

gcc/testsuite/

* gcc.target/riscv/shift-shift-5.c (sub): Change
order of struct fields depending on byteorder.

RISC-V: Fix trampoline generation on big endian

gcc/
* config/riscv/riscv.c (riscv_swap_instruction): New function
to byteswap an SImode rtx containing an instruction.
(riscv_trampoline_init): Byteswap the generated instructions
when needed.

RISC-V: Update soft-fp config for big-endian

libgcc/
* config/riscv/sfp-machine.h (__BYTE_ORDER): Set according
to __BYTE_ORDER__.

RISC-V: Add riscv{32,64}be with big endian as default

gcc/
* common/config/riscv/riscv-common.c
(TARGET_DEFAULT_TARGET_FLAGS): Set default endianness.
* config.gcc (riscv32be-*, riscv64be-*): Set
TARGET_BIG_ENDIAN_DEFAULT to 1.
* config/riscv/elf.h (LINK_SPEC): Change -melf* value
depending on default endianness.
* config/riscv/freebsd.h (LINK_SPEC): Likewise.
* config/riscv/linux.h (LINK_SPEC): Likewise.
* config/riscv/riscv.c (TARGET_DEFAULT_TARGET_FLAGS): Set
default endianness.
* config/riscv/riscv.h (DEFAULT_ENDIAN_SPEC): New macro.

RISC-V: Support -mlittle-endian and -mbig-endian

gcc/
* config/riscv/elf.h (LINK_SPEC): Pass linker endianness flag.
* config/riscv/freebsd.h (LINK_SPEC): Likewise.
* config/riscv/linux.h (LINK_SPEC): Likewise.
* config/riscv/riscv.h (ASM_SPEC): Pass -mbig-endian and
-mlittle-endian.
(BYTES_BIG_ENDIAN): Handle big endian.
(WORDS_BIG_ENDIAN): Define to BYTES_BIG_ENDIAN.
* config/riscv/riscv.opt (-mbig-endian, -mlittle-endian): New
options.
* doc/invoke.texi (-mbig-endian, -mlittle-endian): Document.

c++: Diagnose references to void in structured bindings [PR99650]

We ICE on the following testcase, because std::tuple_element<...,...>::type
is void and for structured bindings we therefore need to create
void & or void && which is invalid. We created such REFERENCE_TYPE and
later ICEd in the middle-end.
The following patch fixes it by diagnosing that.

2021-03-23 Jakub Jelinek <jakub@redhat.com>

PR c++/99650
* decl.c (cp_finish_decomp): Diagnose void initializers when
using tuple_element and get.

* g++.dg/cpp1z/decomp55.C: New test.

cprop_hardreg: Ensure replacement reg has compatible mode [PR99221]

In addition to the existing check also ask the target whether a
replacement register may be accessed in a different mode than it was set
before.

gcc/ChangeLog:

* regcprop.c (find_oldest_value_reg): Ask target whether
different mode is fine for replacement register.

mklog: fix test_mklog.py tests.

contrib/ChangeLog:

* mklog.py: Fix broken tests.

Handle setting of 1-bit anti-ranges uniformly.

PR tree-optimization/99296
* value-range.cc (irange::irange_set_1bit_anti_range): New.
(irange::irange_set_anti_range): Call irange_set_1bit_anti_range
* value-range.h (irange::irange_set_1bit_anti_range): New.

Update gcc sv.po.

* sv.po: Update.

Daily bump.

libstdc++: Implement string_view range constructor for C++20

This implements the new string_view constructor proposed by P1989R2.
This hasn't been voted into the C++23 draft yet, but it's been reviewed
by LWG and is expected to be approved at the next WG21 meeting.

libstdc++-v3/ChangeLog:

* include/std/string_view (basic_string_view(Range&&)): Define new
constructor and deduction guide.
* testsuite/21_strings/basic_string_view/cons/char/range_c++20.cc: New test.
* testsuite/21_strings/basic_string_view/cons/wchar_t/range_c++20.cc: New test.

c++: Cross-module partial specialiations [PR 99480]

We were not correctly handling cross-module redeclarations of
partial-specializations. They have their own TEMPLATE_DECL, which we
need to locate. I had a FIXME there about this case. Guess it's
fixed now.

PR c++/99480
gcc/cp/
* module.cc (depset::hash::make_dependency): Propagate flags for
partial specialization.
(module_may_redeclare): Handle partial specialization.
gcc/testsuite/
* g++.dg/modules/pr99480_a.H: New.
* g++.dg/modules/pr99480_b.H: New.

[PR99581] Define relaxed memory and use it for aarch64

aarch64 needs to skip memory address validation for LD1R insns.  Skipping
the address validation may result in LRA crash for some targets when usual
memory constraint is used.  This patch introduces define_relaxed_memory_constraint,
skipping address validation for it, and defining relaxed memory for
aarch64 LD1r insn memory operand.

gcc/ChangeLog:

PR target/99581
* config/aarch64/constraints.md (UtQ): Use
define_relaxed_memory_constraint for it.
* doc/md.texi (define_relaxed_memory_constraint): Describe it.
* genoutput.c (main): Process DEFINE_RELAXED_MEMORY_CONSTRAINT.
* genpreds.c (constraint_data): Add bitfield is_relaxed_memory.
(have_relaxed_memory_constraints): New static var.
(relaxed_memory_start, relaxed_memory_end): Ditto.
(add_constraint): Add arg is_relaxed_memory.  Check name for
relaxed memory.  Set up is_relaxed_memory in constraint_data and
have_relaxed_memory_constraints.  Adjust calls.
(choose_enum_order): Process relaxed memory.
(write_tm_preds_h): Ditto.
(main): Process DEFINE_RELAXED_MEMORY_CONSTRAINT.
* gensupport.c (process_rtx): Process DEFINE_RELAXED_MEMORY_CONSTRAINT.
* ira-costs.c (record_reg_classes): Process CT_RELAXED_MEMORY.
* ira-lives.c (single_reg_class): Use
insn_extra_relaxed_memory_constraint.
* ira.c (ira_setup_alts): CT_RELAXED_MEMORY.
* lra-constraints.c (valid_address_p): Use
insn_extra_relaxed_memory_constraint instead of other memory
constraints.
(process_alt_operands): Process CT_RELAXED_MEMORY.
(curr_insn_transform): Use insn_extra_relaxed_memory_constraint.
* recog.c (asm_operand_ok, preprocess_constraints): Process
CT_RELAXED_MEMORY.
* reload.c (find_reloads): Ditto.
* rtl.def (DEFINE_RELAXED_MEMORY_CONSTRAINT): New.
* stmt.c (parse_input_constraint): Use
insn_extra_relaxed_memory_constraint.

gcc/testsuite/ChangeLog:

PR target/99581
* gcc.target/powerpc/pr99581.c: New.

ubsan: Don't test for NaNs if those do not exist (PR97926)

2021-03-22 Segher Boessenkool <segher@kernel.crashing.org>

PR target/97926
* ubsan.c (ubsan_instrument_float_cast): Don't test for unordered if
there are no NaNs.

libstdc++: Add noexcept to std::begin etc as per LWG 2280 and 3537

This implements the proposed changes for LWG 3537 (which we're allowed
to do as an extension whatever the outcome of the issue). I noticed we
didn't implement LWG 2280 completely, as the std::begin and std::end
overloads for arrays were not noexcept.

libstdc++-v3/ChangeLog:

* include/bits/range_access.h (begin(T (&)[N]), end(T (&)[N])):
Add missing 'noexcept' as per LWG 2280.
(rbegin(T (&)[N]), rend(T (&)[N]), rbegin(initializer_list<T>))
(rend(initializer_list<T>)): Add 'noexcept' as per LWG 3537.
* testsuite/24_iterators/range_access/range_access.cc: Check for
expected noexcept specifiers. Check result types of generic
std::begin and std::end overloads.
* testsuite/24_iterators/range_access/range_access_cpp14.cc:
Check for expected noexcept specifiers.
* testsuite/24_iterators/range_access/range_access_cpp17.cc:
Likewise.

c++: duplicate alias templates with decltype [PR 99425]

This failure was ultimately from incorrect handling of alias
templates, but required a specific set of occurrences to happen in the
specialization hash table. This cleans up the specialization
streaming to add alias instantiations at the same point as other
instantiations. I also removed some unneeded global variables dealing
with mapping of duplicate decl contexts.

PR c++/99425
gcc/cp/
* cp-tree.h (map_context_from, map_context_to): Delete.
(add_mergeable_specialization): Add is_alias parm.
* pt.c (add_mergeable_specialization): Add is_alias parm, add them.
* module.cc (map_context_from, map_context_to): Delete.
(trees_in::decl_value): Add specializations later, adjust call.
Drop useless alias lookup. Set duplicate fn parm context.
(check_mergeable_decl): Drop context mapping.
(trees_in::is_matching_decl): Likewise.
(trees_in::read_function_def): Drop parameter context adjustment
here.
gcc/testsuite/
* g++.dg/modules/pr99425-1.h: New.
* g++.dg/modules/pr99425-1_a.H: New.
* g++.dg/modules/pr99425-1_b.H: New.
* g++.dg/modules/pr99425-1_c.C: New.
* g++.dg/modules/pr99425-2_a.X: New.
* g++.dg/modules/pr99425-2_b.X: New.
* g++.dg/template/pr99425.C: New.

arm: Fix MVE ICEs with vector moves and -mpure-code [PR97252]

This fixes around 500 ICEs in the testsuite which can be seen when
testing with -march=armv8.1-m.main+mve -mfloat-abi=hard -mpure-code
(leaving the testsuite free of ICEs in this configuration). All of the
ICEs are in arm_print_operand (which is expecting a mem and gets another
rtx, e.g. a const_vector) when running the output code for
*mve_mov<mode> in alternative 4.

The issue is that MVE vector moves were relying on the arm_reorg pass to
move constant vectors that we can't easily synthesize to the literal
pool. This doesn't work for -mpure-code where the literal pool is
disabled. LLVM puts these in .rodata: I've chosen to do the same here.

With this change, for -mpure-code, we no longer want to allow a constant
on the RHS of a vector load in RA. To achieve this, I added a new
constraint which matches constants only if the literal pool is
available.

gcc/ChangeLog:

PR target/97252
* config/arm/arm-protos.h (neon_make_constant): Add generate
argument to guard emitting insns, default to true.
* config/arm/arm.c (arm_legitimate_constant_p_1): Reject
CONST_VECTORs which neon_make_constant can't handle.
(neon_vdup_constant): Add generate argument, avoid emitting
insns if it's not set.
(neon_make_constant): Plumb new generate argument through.
* config/arm/constraints.md (Ui): New. Use it...
* config/arm/mve.md (*mve_mov<mode>): ... here.
* config/arm/vec-common.md (movv8hf): Use neon_make_constant to
synthesize constants.

Warn to not add debug hook targets

This adds a boiler-plate warning to the debug hooks structure to
strongly discourage people from adding new debug hook targets since
we want to get rid of the current abstraction in favor of maintaining
a DWARF view of debug in the middle-end and have support for alternate
output formats to be generated off that DWARF representation.

2021-03-22 Richard Biener <rguenther@suse.de>

* debug.h: Add deprecation warning.

tree-optimization/99694 - fix value-numbering PHIs

This avoids endless cycling when a PHI node with unchanged backedge
value (the PHI result appearing there) is subject to CSE since doing
that effectively alters the hash entry. The way to avoid this is
to ignore such edges when processing the PHI node.

2021-03-22 Richard Biener <rguenther@suse.de>

PR tree-optimization/99694
* tree-ssa-sccvn.c (visit_phi): Ignore edges with the
PHI result.

* gcc.dg/torture/pr99694.c: New testcase.

C++ modules: fix alloc-dealloc-mismatch ASAN issue

gcc/cp/ChangeLog:

PR c++/99687
* module.cc (fini_modules): Call vec_free instead of delete.

mklog: add new argument --directory.

The argument is handy when one needs to generate ChangeLog entries
for a different project (e.g. binutils).

contrib/ChangeLog:

* mklog.py: Add --directory argument.

PR target/99702: Check RTL type before get value

gcc/ChangeLog:

PR target/99702
* config/riscv/riscv.c (riscv_expand_block_move): Get RTL value
after type checking.

gcc/testsuite/ChangeLog:

PR target/99702
* gcc.target/riscv/pr99702.c: New.

Fortran: Fix 'name' bound size [PR99688]

gcc/fortran/ChangeLog:

PR fortran/99688
* match.c (select_type_set_tmp, gfc_match_select_type,
gfc_match_select_rank): Fix 'name' buffersize to avoid out of bounds.
* resolve.c (resolve_select_type): Likewise.

debug: Fix __int128 handling in dwarf2out [PR99562]

The PR66728 changes broke __int128 handling.
It emits wide_int numbers in their minimum unsigned precision
rather than in their full precision.
The problem is then that e.g. the DW_OP_implicit_value path:
          int_mode = as_a <scalar_int_mode> (mode);
          loc_result = new_loc_descr (DW_OP_implicit_value,
                                      GET_MODE_SIZE (int_mode), 0);
          loc_result->dw_loc_oprnd2.val_class = dw_val_class_wide_int;
          loc_result->dw_loc_oprnd2.v.val_wide = ggc_alloc<wide_int> ();
          *loc_result->dw_loc_oprnd2.v.val_wide = rtx_mode_t (rtl, int_mode);
emits invalid DWARF.  In particular this patch fixes there multiple
occurences of:
        .byte   0x9e    # DW_OP_implicit_value
        .uleb128 0x10
        .quad   0xffffffffffffffff
+       .quad   0
        .quad   .LVL46  # Location list begin address (*.LLST40)
        .quad   .LFE14  # Location list end address (*.LLST40)
where we said the value has 16 byte size but then only emitted 8 byte value.
My understanding is that most of the places that use val_wide expect
the precision they chose (the one of the mode they want etc.), the only
exception is the add_const_value_attribute case where it deals with
VOIDmode CONST_WIDE_INTs, for that I agree when we don't have a mode
we need to fallback to minimum precision (not sure if maximum of
min_precision UNSIGNED and SIGNED wouldn't be better, then consumers
would know if it is signed or unsigned by looking at the MSB),
but that code already computes the precision, just decided to
create the wide_int with much larger precision (e.g. 512 bit
on x86_64).

2021-03-22  Jakub Jelinek  <jakub@redhat.com>

PR debug/99562
PR debug/66728
* dwarf2out.c (get_full_len): Use get_precision rather than
min_precision.
(add_const_value_attribute): Make sure add_AT_wide argument has
precision prec rather than some very wide one.

rs6000: Fix some unexpected empty split conditions

This patch is to fix empty split-conditions of some
define_insn_and_split definitions where their conditions for
define_insn part aren't empty. As Segher and Mike pointed
out, they can sometimes lead to unexpected consequences.

Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8.

gcc/ChangeLog:

* config/rs6000/rs6000.md (*rotldi3_insert_sf,
*mov<SFDF:mode><SFDF2:mode>cc_p9, floatsi<mode>2_lfiwax,
floatsi<mode>2_lfiwax_mem, floatunssi<mode>2_lfiwzx,
floatunssi<mode>2_lfiwzx_mem, *floatsidf2_internal,
*floatunssidf2_internal, fix_trunc<mode>si2_stfiwx,
fix_trunc<mode>si2_internal, fixuns_trunc<mode>si2_stfiwx,
*round32<mode>2_fprs, *roundu32<mode>2_fprs,
*fix_trunc<mode>si2_internal): Fix empty split condition.
* config/rs6000/vsx.md (*vsx_le_undo_permute_<mode>,
vsx_reduc_<VEC_reduc_name>_v2df, vsx_reduc_<VEC_reduc_name>_v4sf,
*vsx_reduc_<VEC_reduc_name>_v2df_scalar,
*vsx_reduc_<VEC_reduc_name>_v4sf_scalar): Likewise.

rs6000: Convert the vector set variable idx to DImode [PR98914]

vec_insert defines the element argument type to be signed int by ELFv2
ABI.  When expanding a vector with a variable rtx, convert the rtx type
to DImode to support both intrinsic usage and other callers from
rs6000_expand_vector_init produced by v[k] = val when k is long type.

gcc/ChangeLog:

2021-03-21  Xionghu Luo  <luoxhu@linux.ibm.com>

PR target/98914
* config/rs6000/rs6000.c (rs6000_expand_vector_set_var_p9):
Convert idx to DImode.
(rs6000_expand_vector_set_var_p8): Likewise.

gcc/testsuite/ChangeLog:

2021-03-21  Xionghu Luo  <luoxhu@linux.ibm.com>

PR target/98914
* gcc.target/powerpc/pr98914.c: New test.

Daily bump.

dwarf2out: Fix debug info for 2 byte floats [PR99388]

Aarch64, ARM and a couple of other architectures have 16-bit floats, HFmode.
As can be seen e.g. on
void
foo (void)
{
  __fp16 a = 1.0;
  asm ("nop");
  a = 2.0;
  asm ("nop");
  a = 3.0;
  asm ("nop");
}
testcase, GCC mishandles this on the dwarf2out.c side by assuming all
floating point types have sizes in multiples of 4 bytes, so what GCC emits
is it says that e.g. the DW_OP_implicit_value will be 2 bytes but then
doesn't emit anything and so anything emitted after it is treated by
consumers as the value and then they get out of sync.
real_to_target which insert_float uses indeed fills it that way, but putting
into an array of long 32 bits each time, but for the half floats it puts
everything into the least significant 16 bits of the first long no matter
what endianity host or target has.

The following patch fixes it.  With the patch the -g -O2 -dA output changes
(in a cross without .uleb128 support):
        .byte   0x9e    // DW_OP_implicit_value
        .byte   0x2     // uleb128 0x2
+       .2byte  0x3c00  // fp or vector constant word 0
        .byte   0x7     // DW_LLE_start_end (*.LLST0)
        .8byte  .LVL1   // Location list begin address (*.LLST0)
        .8byte  .LVL2   // Location list end address (*.LLST0)
        .byte   0x4     // uleb128 0x4; Location expression size
        .byte   0x9e    // DW_OP_implicit_value
        .byte   0x2     // uleb128 0x2
+       .2byte  0x4000  // fp or vector constant word 0
        .byte   0x7     // DW_LLE_start_end (*.LLST0)
        .8byte  .LVL2   // Location list begin address (*.LLST0)
        .8byte  .LFE0   // Location list end address (*.LLST0)
        .byte   0x4     // uleb128 0x4; Location expression size
        .byte   0x9e    // DW_OP_implicit_value
        .byte   0x2     // uleb128 0x2
+       .2byte  0x4200  // fp or vector constant word 0
        .byte   0       // DW_LLE_end_of_list (*.LLST0)

Bootstrapped/regtested on x86_64-linux, aarch64-linux and
armv7hl-linux-gnueabi, ok for trunk?

I fear the CONST_VECTOR case is still broken, while HFmode elements of vectors
should be fine (it uses eltsize of the element sizes) and likewise SFmode could
be fine, DFmode vectors are emitted as two 32-bit ints regardless of endianity
and I'm afraid it can't be right on big-endian.  But I haven't been able to
create a testcase that emits a CONST_VECTOR, for e.g. unused vector vars
with constant operands we emit CONCATN during expansion and thus ...
DW_OP_*piece for each element of the vector and for
DW_TAG_call_site_parameter we give up (because we handle CONST_VECTOR only
in loc_descriptor, not mem_loc_descriptor).

2021-03-21  Jakub Jelinek  <jakub@redhat.com>

PR debug/99388
* dwarf2out.c (insert_float): Change return type from void to
unsigned, handle GET_MODE_SIZE (mode) == 2 and return element size.
(mem_loc_descriptor, loc_descriptor, add_const_value_attribute):
Adjust callers.

Daily bump.

x86: Check cfun != NULL before accessing silent_p

Since construct_container may be called with cfun == NULL, check
cfun != NULL before accessing silent_p.

gcc/

PR target/99679
* config/i386/i386.c (construct_container): Check cfun != NULL
before accessing silent_p.

gcc/testsuite/

PR target/99679
* g++.target/i386/pr99679-1.C: New test.
* g++.target/i386/pr99679-2.C: Likewise.

c-family: Fix PR94272 -fcompare-debug issue even for C [PR99230]

The following testcase results in -fcompare-debug failure.
The problem is the similar like in PR94272
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542562.html
When genericizing, with -g0 we have just a TREE_SIDE_EFFECTS DO_STMT
in a branch of if, while with -g we have that wrapped into
TREE_SIDE_EFFECTS STATEMENT_LIST containing DEBUG_BEGIN_STMT and that
DO_STMT.
The do loop is empty with 0 condition, so c_genericize_control_stmt
turns it into an empty statement (without TREE_SIDE_EFFECTS).
For -g0 that means that suddenly the if branch doesn't have side effects
and is expanded differently. But with -g we still have TREE_SIDE_EFFECTS
STATEMENT_LIST containing DEBUG_BEGIN_STMT and non-TREE_SIDE_EFFECTS stmt.
The following patch fixes that by detecting this case and removing
TREE_SIDE_EFFECTS.

And, so that we don't duplicate the same code, changes the C++ FE to
just call the c_genericize_control_stmt function that can now handle it.

2021-03-20 Jakub Jelinek <jakub@redhat.com>

PR debug/99230
* c-gimplify.c (c_genericize_control_stmt): Handle STATEMENT_LIST.

* cp-gimplify.c (cp_genericize_r) <case STATEMENT_LIST>: Remove
special code, instead call c_genericize_control_stmt.

* gcc.dg/pr99230.c: New test.

[PATCH] Fix typo in gcc/asan.c comment

gcc/
* asan.c: Fix typos in comments.

[PR99680] Check empty constraint before using CONSTRAINT_LEN.

It seems CONSTRAINT_LEN treats constraint '\0' as one having length 1. Therefore we
read after the constraint string. The patch fixes it.

gcc/ChangeLog:

PR rtl-optimization/99680
* lra-constraints.c (skip_contraint_modifiers): Rename to skip_constraint_modifiers.
(process_address_1): Check empty constraint before using
CONSTRAINT_LEN.

Daily bump.

c: Fix up -Wunused-but-set-* warnings for _Atomics [PR99588]

As the following testcases show, compared to -D_Atomic= case we have many
-Wunused-but-set-* warning false positives.
When an _Atomic variable/parameter is read, we call mark_exp_read on it in
convert_lvalue_to_rvalue, but build_atomic_assign does not.
For consistency with the non-_Atomic case where we mark_exp_read the lhs
for lhs op= ... but not for lhs = ..., this patch does that too.
But furthermore we need to pattern match the trees emitted by _Atomic store,
so that _Atomic store itself is not marked as being a variable read, but
when the result of the store is used, we mark it.

2021-03-19 Jakub Jelinek <jakub@redhat.com>

PR c/99588
* c-typeck.c (mark_exp_read): Recognize what build_atomic_assign
with modifycode NOP_EXPR produces and mark the _Atomic var as read
if found.
(build_atomic_assign): For modifycode of NOP_EXPR, use COMPOUND_EXPRs
rather than STATEMENT_LIST. Otherwise call mark_exp_read on lhs.
Set TREE_SIDE_EFFECTS on the TARGET_EXPR.

* gcc.dg/Wunused-var-5.c: New test.
* gcc.dg/Wunused-var-6.c: New test.

Regenerate gcc.pot.

* gcc.pot: Regenerate.

Add Power10 scheduling description.

2021-03-19 Pat Haugen <pthaugen@linux.ibm.com>

gcc/

* config/rs6000/rs6000.c (power10_cost): New.
(rs6000_option_override_internal): Set Power10 costs.
(rs6000_issue_rate): Set Power10 issue rate.
* config/rs6000/power10.md: Rewrite for Power10.

libstdc++: Add std::is_scoped_enum for C++23

Implement this C++23 feature, as proposed by P1048R1.

This implementation assumes that a C++23 compiler supports concepts
already. I don't see any point in using preprocessor hacks to detect
compilers which define __cplusplus to a post-C++20 value but don't
support concepts yet.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_scoped_enum): Define.
* include/std/version (__cpp_lib_is_scoped_enum): Define.
* testsuite/20_util/is_scoped_enum/value.cc: New test.
* testsuite/20_util/is_scoped_enum/version.cc: New test.

Add size check to vector-matrix matmul.

It turns out the library version is much faster for vector-matrix
multiplications for large sizes than what inlining can produce.
Use size checks for switching between this and inlining for
that case to.

gcc/fortran/ChangeLog:

* frontend-passes.c (inline_limit_check): Add rank_a
argument. If a is rank 1, set the second dimension to 1.
(inline_matmul_assign): Pass rank_a argument to inline_limit_check.
(call_external_blas): Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/inline_matmul_6.f90: Adjust count for
_gfortran_matmul.

[PR99663] Don't use unknown constraint for address constraint in process_address_1.

s390x has insns using several alternatives with address constraints. Even
if we don't know at this stage what alternative will be used, we still can say
that is an address constraint. So don't use unknown constraint in this
case when there are multiple constraints or/and alternative.

gcc/ChangeLog:

PR target/99663
* lra-constraints.c (process_address_1): Don't use unknown
constraint for address constraint.

gcc/testsuite/ChangeLog:

PR target/99663
* gcc.target/s390/pr99663.c: New.

c++: Only reject reinterpret casts from pointers to integers for manifestly_const_eval evaluation [PR99456]

My PR82304/PR95307 fix moved reinterpret cast from pointer to integer
diagnostics from cxx_eval_outermost_constant_expr where it caught
invalid code only at the outermost level down into
cxx_eval_constant_expression.
Unfortunately, it regressed following testcase, we emit worse code
including dynamic initialization of some vars.
While the initializers are not constant expressions due to the
reinterpret_cast in there, there is no reason not to fold them as an
optimization.

I've tried to make this dependent on !ctx->quiet, but that regressed
two further tests, and on ctx->strict, which regressed other tests,
so this patch bases that on manifestly_const_eval.

The new testcase is now optimized as much as it used to be in GCC 10
and the only regression it causes is an extra -Wnarrowing warning
on vla22.C test on invalid code (which the patch adjusts).

2021-03-19 Jakub Jelinek <jakub@redhat.com>

PR c++/99456
* constexpr.c (cxx_eval_constant_expression): For CONVERT_EXPR from
INDIRECT_TYPE_P to ARITHMETIC_TYPE_P, when !ctx->manifestly_const_eval
don't diagnose it, set *non_constant_p nor return t.

* g++.dg/opt/pr99456.C: New test.
* g++.dg/ext/vla22.C: Expect a -Wnarrowing warning for c++11 and
later.

Darwin : Fix build failure for powerpc-darwin8 [PR99661].

A hunk had been missed from r11-6417, fixed thus:

gcc/ChangeLog:

PR target/99661
* config.gcc (powerpc-*-darwin8): Delete the reference to
the now removed darwin8.h.

target/99660 - missing VX_CPU_PREFIX for vxworksae

This fixes an oversight which causes make all-gcc
to fail for --target=*vxworksae or vxworksmils, a regression
introduced by the recent VxWorks7 related updates.

Both AE and MILS variants resort to a common config/vxworksae.h,
which misses a definition of VX_CPU_PREFIX expected by port
specific headers.

The change just provides the missing definition.

2021-03-19 Olivier Hainque <hainque@adacore.com>

gcc/
PR target/99660
* config/vxworksae.h (VX_CPU_PREFIX): Define.

Use memcpy instead of strncpy to avoid error with -Werror=stringop-truncation.

gcc/ChangeLog:

* config/pa/pa.c (import_milli): Use memcpy instead of strncpy.

slp: remove unneeded permute calculation (PR99656)

The attach testcase ICEs because as you showed on the PR we have one child
which is an internal with a PERM of EVENEVEN and one with TOP.

The problem is while we can conceptually merge the permute itself into EVENEVEN,
merging the lanes don't really make sense.

That said, we no longer even require the merged lanes as we create the permutes
based on the KIND directly.

This patch just removes all of that code.

Unfortunately it still won't vectorize with the cost model enabled due to the
blend that's created combining the load and the external

note: node 0x51f2ce8 (max_nunits=1, refcnt=1)
note: op: VEC_PERM_EXPR
note:       { }
note:       lane permutation { 0[0] 1[1] }
note:       children 0x51f23e0 0x51f2578
note: node 0x51f23e0 (max_nunits=2, refcnt=1)
note: op template: _16 = REALPART_EXPR <*t1_9(D)>;
note:       stmt 0 _16 = REALPART_EXPR <*t1_9(D)>;
note:       stmt 1 _16 = REALPART_EXPR <*t1_9(D)>;
note:       load permutation { 0 0 }
note: node (external) 0x51f2578 (max_nunits=1, refcnt=1)
note:       { _18, _18 }

which costs the cost for the load-and-split and the cost of the external splat,
and the one for blending them while in reality it's just a scalar load and
insert.

The compiler (with the cost model disabled) generates

ldr     q1, [x19]
dup     v1.2d, v1.d[0]
ldr     d0, [x19, 8]
fneg    d0, d0
ins     v1.d[1], v0.d[0]

while really it should be

ldp     d1, d0, [x19]
fneg    d0, d0
ins     v1.d[1], v0.d[0]

but that's for another time.

gcc/ChangeLog:

PR tree-optimization/99656
* tree-vect-slp-patterns.c (linear_loads_p,
complex_add_pattern::matches, is_eq_or_top,
vect_validate_multiplication, complex_mul_pattern::matches,
complex_fms_pattern::matches): Remove complex_perm_kinds_t.
* tree-vectorizer.h: (complex_load_perm_t): Removed.
(slp_tree_to_load_perm_map_t): Use complex_perm_kinds_t instead of
complex_load_perm_t.

gcc/testsuite/ChangeLog:

PR tree-optimization/99656
* gfortran.dg/vect/pr99656.f90: New test.

x86: Issue error for return/argument only with function body

If we never generate function body, we shouldn't issue errors for return
nor argument. Add silent_p to i386 machine_function to avoid issuing
errors for return and argument without function body.

gcc/

PR target/99652
* config/i386/i386-options.c (ix86_init_machine_status): Set
silent_p to true.
* config/i386/i386.c (init_cumulative_args): Set silent_p to
false.
(construct_container): Return early for return and argument
errors if silent_p is true.
* config/i386/i386.h (machine_function): Add silent_p.

gcc/testsuite/

PR target/99652
* gcc.dg/torture/pr99652-1.c: New test.
* gcc.dg/torture/pr99652-2.c: Likewise.
* gcc.target/i386/pr57655.c: Adjusted.
* gcc.target/i386/pr59794-6.c: Likewise.
* gcc.target/i386/pr70738-1.c: Likewise.
* gcc.target/i386/pr96744-1.c: Likewise.

analyzer: mark epath_finder with DISABLE_COPY_AND_ASSIGN [PR99614]

cppcheck warns that class epath_finder does dynamic memory allocation, but
is missing a copy constructor and operator=.

This class isn't meant to be copied or assigned, so mark it with
DISABLE_COPY_AND_ASSIGN.

gcc/analyzer/ChangeLog:
PR analyzer/99614
* diagnostic-manager.cc (class epath_finder): Add
DISABLE_COPY_AND_ASSIGN.

arm: Fix mve_vshlq* [PR99593]

As mentioned in the PR, before the r11-6708-gbfab355012ca0f5219da8beb04f2fdaf757d34b7
change v[al]shr<mode>3 expanders were expanding the shifts by register
to gen_ashl<mode>3_{,un}signed which don't support immediate CONST_VECTOR
shift amounts, but now expand to mve_vshlq_<supf><mode> which does.
The testcase ICEs, because the constraint doesn't match the predicate and
because LRA works solely with the constraints, so it can e.g. from REG_EQUAL
propagate there a CONST_VECTOR which matches the constraint but fails the
predicate and only later on other passes will notice the predicate fails
and ICE.

Fixed by adding a constraint that matches the immediate part of the
predicate.

PR target/99593
* config/arm/constraints.md (Ds): New constraint.
* config/arm/vec-common.md (mve_vshlq_<supf><mode>): Use w,Ds
constraint instead of w,Dm.

* g++.target/arm/pr99593.C: New test.

amdgcn: Typo fix

gcc/ChangeLog:

* config/gcn/gcn.c (gcn_parse_amdgpu_hsa_kernel_attribute): Fix quotes
in error message.

substitute @tie{} with a space for the man pages

contrib/

2021-03-19 Matthias Klose <doko@ubuntu.com>

* texi2pod.pl: Substitute @tie{} with a space for the man pages.

Require linker plugin for another LTO test

If it is not present, fat LTO is generated with an additional warning.

gcc/testsuite/
* g++.dg/lto/pr89335_0.C: Require the linker plugin.

Fix segfault during encoding of CONSTRUCTORs

The segfault occurs in native_encode_initializer when it is encoding the
CONSTRUCTOR for an array whose lower bound is negative (it's OK in Ada).
The computation of the current position is done in HOST_WIDE_INT and this
does not work for arrays whose original range has a negative lower bound
and a positive upper bound; the computation must be done in sizetype
instead so that it may wrap around.

gcc/
PR middle-end/99641
* fold-const.c (native_encode_initializer) <CONSTRUCTOR>: For an
array type, do the computation of the current position in sizetype.

Daily bump.

c++: Fix error-recovery with requires expression [PR99500]

This fixes an ICE on invalid code where one of the parameters was
error_mark_node and thus resetting its DECL_CONTEXT crashed.

gcc/cp/ChangeLog:

PR c++/99500
* parser.c (cp_parser_requirement_parameter_list): Handle
error_mark_node.

gcc/testsuite/ChangeLog:

PR c++/99500
* g++.dg/cpp2a/concepts-err3.C: New test.

c++: Remove FLOAT_EXPR assert in tsubst.

This assert triggered when pr85013.C was compiled with -fchecking=2
which the usual testing doesn't exercise. Let's remove it for now
and revisit in GCC 12.

gcc/cp/ChangeLog:

* pt.c (tsubst_copy_and_build) <case FLOAT_EXPR>: Remove.

[PR99422] LRA: Use lookup_constraint only for a single constraint in process_address_1.

This is an additional patch for PR99422. In process_address_1 we
look only at the first constraint in the 1st alternative
and ignore all other possibilities. As we don't know what
alternative and constraint will be used at this stage, we can be sure
only for a single constraint with one alternative and should use unknown
constraint for all other cases.

gcc/ChangeLog:

PR target/99422
* lra-constraints.c (process_address_1): Use lookup_constraint
only for a single constraint.

PR middle-end/99502 - missing -Warray-bounds on partial out of bounds

gcc/ChangeLog:

PR middle-end/99502
* gimple-array-bounds.cc (inbounds_vbase_memaccess_p): Rename...
(inbounds_memaccess_p): ...to this. Check the ending offset of
the accessed member.

gcc/testsuite/ChangeLog:

PR middle-end/99502
* g++.dg/warn/Warray-bounds-22.C: New test.
* g++.dg/warn/Warray-bounds-23.C: New test.
* g++.dg/warn/Warray-bounds-24.C: New test.

c++: Add assert to tsubst.

As discussed in the r11-7709 patch, we can now make sure that tsubst
never sees a FLOAT_EXPR, much like its counterpart FIX_TRUNC_EXPR.

gcc/cp/ChangeLog:

* pt.c (tsubst_copy_and_build): Add assert.

amdgcn: Silence warnings in gcn.c

This fixes a few cases of "unquoted identifier or keyword", one "spurious
trailing punctuation sequence", and a "may be used uninitialized".

gcc/ChangeLog:

* config/gcn/gcn.c (gcn_parse_amdgpu_hsa_kernel_attribute): Add %< and
%> quote markers to error messages.
(gcn_goacc_validate_dims): Likewise.
(gcn_conditional_register_usage): Remove exclaimation mark from error
message.
(gcn_vectorize_vec_perm_const): Ensure perm is fully uninitialized.

Fix idiv latencies for znver3

update costs of integer divides to match actual latencies (the scheduler model
already does the right thing).  It is essentially no-op, since we end up
expanding idiv for all sensible constants, so this only may end up disabling
vectorization in some cases, but I did not find any such examples.  However in
general it is better ot have actual latencies than random numbers.

gcc/ChangeLog:

2021-03-18  Jan Hubicka  <hubicka@ucw.cz>

* config/i386/x86-tune-costs.h (struct processor_costs): Fix costs of
integer divides1.

PR target/99314: Fix integer signedness issue for cpymem pattern expansion.

Third operand of cpymem pattern is unsigned HOST_WIDE_INT, however we
are interpret that as signed HOST_WIDE_INT, that not a problem in
most case, but when the value is large than signed HOST_WIDE_INT, it
might screw up since we have using that value to calculate the buffer
size.

2021-03-05 Sinan Lin <sinan@isrc.iscas.ac.cn>
Kito Cheng <kito.cheng@sifive.com>

gcc/ChangeLog:

* config/riscv/riscv.c (riscv_block_move_straight): Change type
to unsigned HOST_WIDE_INT for parameter and local variable with
HOST_WIDE_INT type.
(riscv_adjust_block_mem): Ditto.
(riscv_block_move_loop): Ditto.
(riscv_expand_block_move): Ditto.

testsuite: Fix up strlenopt-80.c on powerpc [PR99636]

Similar issue as in strlenopt-73.c, various spots in this test rely
on MOVE_MAX >= 8, this time it uses a target selector to pick up a couple
of targets, and all of them but powerpc 32-bit satisfy it, but powerpc
32-bit have MOVE_MAX just 4.

2021-03-18 Jakub Jelinek <jakub@redhat.com>

PR testsuite/99636
* gcc.dg/strlenopt-80.c: For powerpc*-*-*, only enable for lp64.

testsuite: Fix up strlenopt-73.c on powerpc [PR99626]

As mentioned in the testcase as well as in the PR, this testcase relies on
MOVE_MAX being sufficiently large that the memcpy call is folded early into
load + store.  Some popular targets define MOVE_MAX to 8 or even 16 (e.g.
x86_64 or some options on s390x), but many other targets define it to just 4
(e.g. powerpc 32-bit), or even 2.

The testcase has already one test routine guarded on one particular target
with MOVE_MAX 16 (but does it incorrectly, __i386__ is only defined on
32-bit x86 and __SIZEOF_INT128__ is only defined on 64-bit targets), this
patch fixes that, and guards another test that relies on memcpy (, , 8)
being folded that way (which therefore needs MOVE_MAX >= 8) on a couple of
common targets that are known to have such MOVE_MAX.

2021-03-18  Jakub Jelinek  <jakub@redhat.com>

PR testsuite/99626
* gcc.dg/strlenopt-73.c: Ifdef out test_copy_cond_unequal_length_i64
on targets other than x86, aarch64, s390 and 64-bit powerpc.  Use
test_copy_cond_unequal_length_i128 for __x86_64__ with int128 support
rather than __i386__.

Update email address for primary entry

/
* MAINTAINERS: Update primary entry.

testsuite: Skip c-c++-common/zero-scratch-regs-10.c on arm

As discussed in PR 97680, -fzero-call-used-regs is not supported on
arm.

Skip this test to avoid failure reports.

2021-03-18 Christophe Lyon <christophe.lyon@linaro.org>

gcc/testsuite/
PR testsuite/97680
* c-c++-common/zero-scratch-regs-10.c: Skip on arm

Fix building the V850 port using recent versions of gcc.

gcc/
* config/v850/v850.c (construct_restore_jr): Increase static
buffer size.
(construct_save_jarl): Likewise.
* config/v850/v850.h (DWARF2_DEBUGGING_INFO): Define.

Objective-C++ : Fix handling of unnamed message parms [PR49070].

When we are parsing an Objective-C++ message, a colon is a valid
terminator for a assignment-expression. That is:

[receiver meth:x:x:x:x];

Is a valid, if somewhat unreadable, construction; corresponding
to a method declaration like:

- (id) meth:(id)arg0 :(id)arg1 :(id)arg2 :(id)arg3;

Where three of the message params have no selector name.

If fact, although it might be unintentional, Objective-C/C++ can
accept message selectors with all the parms unnamed (this applies
to the clang implementation too, which is taken as the reference
for the language).

For regular C++, the pattern x:x is not valid in that position an
an error is emitted with a fixit for the expected scope token.

If we simply made that error conditional on !c_dialect_objc()
that would regress Objective-C++ diagnostics for cases outside a
message selector, so we add a state flag for this.

gcc/cp/ChangeLog:

PR objc++/49070
* parser.c (cp_debug_parser): Add Objective-C++ message
state flag.
(cp_parser_nested_name_specifier_opt): Allow colon to
terminate an assignment-expression when parsing Objective-
C++ messages.
(cp_parser_objc_message_expression): Set and clear message
parsing state on entry and exit.
* parser.h (struct cp_parser): Add a context flag for
Objective-C++ message state.

gcc/testsuite/ChangeLog:

PR objc++/49070
* obj-c++.dg/pr49070.mm: New test.
* objc.dg/unnamed-parms.m: New test.

aarch64: Improve generic SVE tuning defaults

This patch adds the recently-added tweak to split some SVE VL-based scalar
operations [1] to the generic tuning used for SVE, as enabled by adding +sve
to the -march flag, for example -march=armv8.2-a+sve.

The recommendation for best performance on a particular CPU remains unchanged:
use the -mcpu option for that CPU, where possible. -mcpu=native makes this
straightforward for native compilation.

The tweak to split out SVE VL-based scalar operations is a consistent win for
the Neoverse V1 CPU and should be neutral for the Fujitsu A64FX. A run of
SPEC2017 on A64FX with this tweak on didn't show any non-noise differences.
It is also expected to be neutral on SVE2 implementations.

Therefore, the patch enables the tweak for generic +sve tuning e.g.
-march=armv8.2-a+sve. No SVE2 CPUs are expected to benefit from it,
therefore the tweak is disabled for generic tuning when +sve2 is in
-march e.g. -march=armv8.2-a+sve2.

The implementation of this approach requires a bit of custom logic in
aarch64_override_options_internal to handle these kinds of
architecture-dependent decisions, but we do believe the user-facing principle
here is important to implement.

In general, for the generic target we're using a decision framework that looks
like:

* If all cores that are known to benefit from an optimization
are of architecture X, and all other cores that implement X or above
are not impacted, or have a very slight impact, we will consider it for
generic tuning for architecture X.
* We will not enable that optimisation for generic tuning for architecture X+1
if no known cores of architecture X+1 or above will benefit.

This framework allows us to improve generic tuning for CPUs of generation X
while avoiding accumulating tweaks for future CPUs of generation X+1, X+2...
that do not need them, and thus avoid even the slight negative effects of
these optimisations if the user is willing to tell us the desired architecture
accurately.

X above can mean either annual architecture updates (Armv8.2-a, Armv8.3-a etc)
or optional architecture extensions (like SVE, SVE2).

[1] http://gcc.gnu.org/g:a65b9ad863c5fc0aea12db58557f4d286a1974d7

gcc/ChangeLog:

* config/aarch64/aarch64.c (aarch64_adjust_generic_arch_tuning): Define.
(aarch64_override_options_internal): Use it.
(generic_tunings): Add AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS to
tune_flags.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/sve/aarch64-sve.exp: Add -moverride=tune=none to
sve_flags.
* g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
* g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.
* gcc.target/aarch64/sve/aarch64-sve.exp: Likewise.
* gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
* gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.

coroutines: init struct members to NULL

gcc/cp/ChangeLog:

PR c++/99617
* coroutines.cc (struct var_nest_node): Init then_cl and else_cl
to NULL.