review.tizen.org Git - platform/upstream/gcc.git/log

libstdc++: Optimize operator+(string/char*, char*/string) equally

Until now operator+(char*, const string&) and operator+(const string&,
char*) had different performance characteristics. The former required a
single memory allocation and the latter required two. This patch makes
the performance equal.
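
As a rough illustration of the single-allocation idea (a sketch only, not the libstdc++ implementation; the name concat_once is hypothetical), the result can be sized once up front and both operands appended into it:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper illustrating the idea; not the libstdc++ code.
std::string concat_once(const std::string& lhs, const char* rhs)
{
    const std::size_t rhs_len = std::strlen(rhs);
    std::string result;
    // Reserve the full length up front so the appends do not reallocate.
    result.reserve(lhs.size() + rhs_len);
    result.append(lhs);
    result.append(rhs, rhs_len);
    return result;
}
```

By contrast, copying lhs first and then appending rhs may allocate once for the copy and again when the append outgrows it.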

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (operator+(const string&, const char*)):
Remove naive implementation.
* include/bits/basic_string.tcc (operator+(const string&, const char*)):
Add single-allocation implementation.

Signed-off-by: Will Hawkins <whh8b@obs.cr>

tree.cc: Fix optimization of DFP default initialization

When an object of decimal floating-point type is default-initialized,
GCC is inconsistent about whether it is given the all-zero-bits
representation (zero with the least quantum exponent) or whether it
acts like a conversion of integer 0 to the DFP type (zero with quantum
exponent 0). In particular, the representation stored in memory can
have all zero bits, but optimization of access to the same object
based on its known constant value can then produce zero with quantum
exponent 0 instead.

C2x leaves the quantum exponent for default initialization
implementation-defined, but that doesn't allow such inconsistency in
the interpretation of a single object. All zero bits seems most
appropriate; change build_real to special-case dconst0 the same way
other constants are special-cased and ensure that the correct zero for
the type is generated.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/
* tree.cc (build_real): Give DFP dconst0 the minimum quantum
exponent for the type.

gcc/testsuite/
* gcc.dg/torture/dfp-default-init-1.c,
gcc.dg/torture/dfp-default-init-2.c,
gcc.dg/torture/dfp-default-init-3.c: New tests.

bpf: facilitate constant propagation of function addresses

eBPF effectively supports two kinds of call instructions:

- The so called pseudo-calls ("bpf to bpf").
- External calls ("bpf to kernel").

The BPF call instruction always gets an immediate argument, whose
interpretation varies depending on the purpose of the instruction:

- For pseudo-calls, the immediate argument is interpreted as a
  32-bit PC-relative displacement measured in number of 64-bit words
  minus one.

- For external calls, the immediate argument is interpreted as the
  identification of a kernel helper.

In order to differentiate the two flavors of CALL instruction, the SRC
field of the instruction (otherwise unused) is abused as an opcode:
if the field holds 0 the instruction is an external call, if it holds
BPF_PSEUDO_CALL the instruction is a pseudo-call.
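
The two encodings described above can be sketched as follows.  This is an illustration only: the struct is a simplified, decoded view of an instruction, the helper names are hypothetical, and the numeric values (0x85 for BPF_JMP | BPF_CALL, 1 for BPF_PSEUDO_CALL) are assumed from the Linux uapi headers.

```cpp
#include <cstdint>

// Assumed values, taken from the Linux uapi BPF headers.
const std::uint8_t kOpCall = 0x85;      // BPF_JMP | BPF_CALL
const std::uint8_t kPseudoCall = 1;     // BPF_PSEUDO_CALL

// Simplified, decoded view of a CALL instruction (hypothetical type).
struct DecodedCall {
    std::uint8_t code;     // opcode
    std::uint8_t src_reg;  // SRC field, abused to select the call flavor
    std::int32_t imm;      // immediate argument
};

// External call: SRC is 0 and the immediate identifies a kernel helper.
inline DecodedCall make_external_call(std::int32_t helper_id)
{
    return DecodedCall{kOpCall, 0, helper_id};
}

// Pseudo-call: SRC is BPF_PSEUDO_CALL and the immediate is the
// displacement in 64-bit words, minus one.  Offsets are byte offsets
// of the call and of the target within the same section.
inline DecodedCall make_pseudo_call(std::int64_t call_offset_bytes,
                                    std::int64_t target_offset_bytes)
{
    const std::int64_t words = (target_offset_bytes - call_offset_bytes) / 8;
    return DecodedCall{kOpCall, kPseudoCall,
                       static_cast<std::int32_t>(words - 1)};
}
```

For example, a call at byte offset 0 targeting a function at byte offset 24 (three instructions further on) gets an immediate of 2.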

C-to-BPF toolchains, including the GNU toolchain, use the following
practical heuristic at assembly time in order to determine what kind
of CALL instruction to generate: call instructions requiring a fixup
at assembly time are interpreted as pseudo-calls.  This means that in
practice a call instruction involving symbols at assembly time (such
as `call foo') is assembled into a pseudo-call instruction, whereas
something like `call 12' is assembled into an external call
instruction.

In both cases, the argument of CALL is an immediate: at the time of
writing eBPF lacks support for indirect calls, i.e. there is no
call-to-register instruction.

This is the reason why BPF programs, in practice, rely on certain
optimizations to happen in order to generate calls to immediates.
This is a typical example involving a kernel helper:

  static void * (*bpf_map_lookup_elem)(void *map, const void *key)
    = (void *) 1;

  int foo (...)
  {
    char *ret;

    ret = bpf_map_lookup_elem (args...);
    if (ret)
      return 1;
    return 0;
  }

Note how the code above relies on the compiler to do constant
propagation so the call to bpf_map_lookup_elem can be compiled to a
`call 1' instruction.

While GCC provides a kernel_helper function declaration attribute that
can be used in a robust way to tell GCC to generate an external call
regardless of optimization level and any other consideration, the Linux
kernel bpf_helpers.h file relies on tricks like the above.

This patch modifies the BPF backend to prevent SSA sparse constant
propagation from being "undone" by the expander loading the function
address into a register.  A new test is also added.

Tested in bpf-unknown-linux-gnu.
No regressions.

gcc/ChangeLog:

PR target/106733
* config/bpf/bpf.cc (bpf_legitimate_address_p): Recognize integer
constants as legitimate addresses for functions.
(bpf_small_register_classes_for_mode_p): Define target hook.

gcc/testsuite/ChangeLog:

PR target/106733
* gcc.target/bpf/constant-calls.c: Rename to ...
* gcc.target/bpf/constant-calls-1.c: and modify to not expect
failure anymore.
* gcc.target/bpf/constant-calls-2.c: New test.

libstdc++: Add check for LWG 3741 problem case

This LWG issue was closed as NAD, as it was just a bug in an
implementation, not a defect in the standard. Libstdc++ never had that
bug and always worked for the problem case. Add a test to ensure we
don't regress.

The problem occurs when abs is implemented using a ternary expression:

return d >= d.zero() ? d : -d;

If decltype(-d) is not the same as decltype(d) then this is ambiguous,
because each type can be converted to the other, so there is no common
type.
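
A minimal sketch of an abs that avoids the ternary, so that no common type of decltype(d) and decltype(-d) is ever needed (abs_no_ternary is a hypothetical name used for illustration):

```cpp
#include <chrono>
#include <ratio>

// Hypothetical stand-in for an abs written without a ternary expression.
template <class Rep, class Period>
constexpr std::chrono::duration<Rep, Period>
abs_no_ternary(std::chrono::duration<Rep, Period> d)
{
    if (d >= d.zero())
        return d;
    // -d may have a different (reduced-period) type; it is converted
    // to the return type here, which only needs a one-way conversion.
    return -d;
}
```

With a non-reduced period such as std::ratio<60, 30>, each return statement converts independently to the declared return type.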

libstdc++-v3/ChangeLog:

* testsuite/20_util/duration_cast/rounding.cc: Check abs with
non-reduced duration.

Move things around in predicate analysis

This moves a few functions, notably normalization after a big comment
documenting it. I've left the rest unorganized for now.

* gimple-predicate-analysis.cc: Move predicate normalization
after the comment documenting it.

Split uninit analysis from predicate analysis

This splits the API collected in gimple-predicate-analysis.h into
what I'd call a predicate, with its assorted functionality, and the
utility code used by the uninit pass that happens to use it. I've
tried to be minimalistic with the refactoring: there's still recursive
instantiation of uninit_analysis, the new class encapsulating a
series of uninit analysis queries from the uninit pass. But it
should at least make the predicate part actually reusable, and
which predicate is dealt with is a little clearer in the
uninit_analysis part.

I will follow up by moving the predicate implementation bits
together in the gimple-predicate-analysis.cc file.

* gimple-predicate-analysis.h (predicate): Split out
non-predicate related functionality into ..
(uninit_analysis): .. this new class.
* gimple-predicate-analysis.cc: Refactor into two classes.
* tree-ssa-uninit.cc (find_uninit_use): Use uninit_analysis.

Some more predicate analysis TLC

This limits the simple control dep also to the cd_root plus avoids
filling the lazily computed PHI def predicate in the early out path
which would leave it not simplified and normalized if it were
re-used. It also avoids computing the use predicates when the
post-dominance early out doesn't need it. It also syncs
predicate::use_cannot_happen with init_from_phi_def, adding the
missing PHI edge to the computed chains (the simple control dep
code already adds it).

* gimple-predicate-analysis.cc (predicate::use_cannot_happen):
Do simple_control_dep_chain only up to cd_root, add the PHI
operand edge to the chains like init_from_phi_def does.
(predicate::is_use_guarded): Speedup early out, avoid half-way
initializing the PHI def predicate.

i386: Fix up mode iterators that weren't expanded [PR106721]

Currently, when the md file reader sees <something> and something is a
valid mode (or code) attribute that has no case for the current mode
(or code), it just leaves the <something> untouched.
I went through all cases matching <[a-zA-Z] in tmp-mddump.md after make mddump.
Most of the cases were related to the recent V*BF mode additions, some
to V*HF mode too, and there was one typo.

2022-08-24 Jakub Jelinek <jakub@redhat.com>

PR target/106721
* config/i386/sse.md (shuffletype): Add V32BF, V16BF and V8BF entries.
Change V32HF, V16HF and V8HF entries from "f" to "i".
(iptr): Add V32BF, V16BF, V8BF and BF entries.
(i128vldq): Add V16HF and V16BF entries.
(avx512er_vmrcp28<mode><mask_name><round_saeonly_name>): Fix typo,
mask_opernad3 -> mask_operand3.

* gcc.target/i386/avx512vl-pr106721.c: New test.

preprocessor: Implement C++23 P2437R1 - Support for #warning [PR106646]

On Thu, Aug 18, 2022 at 11:02:44PM +0000, Joseph Myers wrote:
> ISO C2x standardizes the existing #warning extension. Arrange
> accordingly for it not to be diagnosed with -std=c2x -pedantic, but to
> be diagnosed with -Wc11-c2x-compat.

And here is the corresponding C++ version.
Don't pedwarn about this for C++23/GNU++23 and tweak the diagnostics
for C++ otherwise, + testsuite coverage.
The diagnostic wording is similar e.g. to the #elifdef diagnostics.

2022-08-24 Jakub Jelinek <jakub@redhat.com>

PR c++/106646
* init.cc: Implement C++23 P2437R1 - Support for #warning.
(lang_defaults): Set warning_directive for GNUCXX23 and CXX23.
* directives.cc (directive_diagnostics): Use different wording of
#warning pedwarn for C++.

* g++.dg/cpp/warning-1.C: New test.
* g++.dg/cpp/warning-2.C: New test.
* g++.dg/cpp/warning-3.C: New test.

gcov: fix file and function summary information

gcc/ChangeLog:

* gcov.cc (add_line_counts): Add group functions to coverage
summary.
(accumulate_line_counts): Similarly for files.

Co-Authored-By: Jørgen Kvalsvik <j@lambda.is>

LoongArch: Add new code model 'medium'.

In normal mode a function jump uses the 'bl' instruction,
so the range of a function jump is +-128MB.

This patch adds support for 'medium' mode, which performs
the function jump with two instructions:
pcalau12i + jirl
So in this mode the function jump range is increased to +-2GB.

Compared with 'normal' mode, 'medium' mode only affects the
jump range of functions.
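
The two ranges follow from the immediate widths.  The sketch below assumes, from the LoongArch instruction encoding rather than from this patch, that 'bl' carries a 26-bit signed offset counted in 4-byte units and that 'pcalau12i' carries a 20-bit signed immediate shifted left by 12:

```cpp
#include <cstdint>

// Reach in bytes (in each direction) of a signed immediate of `bits`
// bits that is scaled by 2^shift.
constexpr std::int64_t signed_reach(int bits, int shift)
{
    return std::int64_t{1} << (bits - 1 + shift);
}

constexpr std::int64_t kMiB = std::int64_t{1} << 20;
constexpr std::int64_t kGiB = std::int64_t{1} << 30;

// Assumption: 'bl' has a 26-bit signed offset in units of 4 bytes.
constexpr std::int64_t bl_reach = signed_reach(26, 2);
// Assumption: 'pcalau12i' has a 20-bit signed immediate, shifted by 12.
constexpr std::int64_t medium_reach = signed_reach(20, 12);

static_assert(bl_reach == 128 * kMiB, "normal mode reaches +-128MB");
static_assert(medium_reach == 2 * kGiB, "medium mode reaches +-2GB");
```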

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Support code model medium.
* config/loongarch/genopts/loongarch.opt.in: Likewise.
* config/loongarch/loongarch-def.c: Likewise.
* config/loongarch/loongarch-def.h (CMODEL_LARGE): Likewise.
(CMODEL_EXTREME): Likewise.
(N_CMODEL_TYPES): Likewise.
(CMODEL_MEDIUM): Likewise.
* config/loongarch/loongarch-opts.cc: Likewise.
* config/loongarch/loongarch-opts.h (TARGET_CMODEL_MEDIUM): Likewise.
* config/loongarch/loongarch-str.h (STR_CMODEL_MEDIUM): Likewise.
* config/loongarch/loongarch.cc (loongarch_call_tls_get_addr):
Support medium mode for TLS symbol loading.
(loongarch_legitimize_call_address): When medium mode, make a symbolic
jump with two instructions.
(loongarch_option_override_internal): Support medium.
* config/loongarch/loongarch.md (@pcalau12i<mode>): New template.
(@sibcall_internal_1<mode>): New function call templates added to support
medium mode.
(@sibcall_value_internal_1<mode>): Likewise.
(@sibcall_value_multiple_internal_1<mode>): Likewise.
(@call_internal_1<mode>): Likewise.
(@call_value_internal_1<mode>): Likewise.
(@call_value_multiple_internal_1<mode>): Likewise.
* config/loongarch/loongarch.opt: Support medium.
* config/loongarch/predicates.md: Add processing about medium mode.
* doc/invoke.texi: Document for '-mcmodel=medium'.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/func-call-medium-1.c: New test.
* gcc.target/loongarch/func-call-medium-2.c: New test.
* gcc.target/loongarch/func-call-medium-3.c: New test.
* gcc.target/loongarch/func-call-medium-4.c: New test.
* gcc.target/loongarch/func-call-medium-5.c: New test.
* gcc.target/loongarch/func-call-medium-6.c: New test.
* gcc.target/loongarch/func-call-medium-7.c: New test.
* gcc.target/loongarch/func-call-medium-8.c: New test.
* gcc.target/loongarch/tls-gd-noplt.c: Add compile parameter '-mexplicit-relocs'.

Speedup path discovery in predicate::use_cannot_happen

The following reverts a hunk from r8-5789-g11ef0b22d68cd1 that
made compute_control_dep_chain start from function entry rather
than the immediate dominator of the source block of the edge with
the undefined value on the PHI node.  Reverting at that point
does not reveal any testsuite FAIL, in particular the added
testcase still passes.  The following adjusts this to the other
function that computes predicates that hold on the PHI incoming
edges with undefined values, predicate::init_from_phi_def, which
starts at the immediate dominator of the PHI.  That's much less
likely to run into the CFG walking limit.

* gimple-predicate-analysis.cc (predicate::use_cannot_happen):
Start the compute_control_dep_chain walk from the immediate
dominator of the PHI.

Daily bump.

c++: Quash bogus -Wredundant-move warning

This patch fixes a pretty stoopid thinko.  When I added code to warn
about pessimizing std::move in initializations like

  T t{std::move(T())};

I also added code to unwrap the expression from { }.  But when we have

  return {std::move(t)};

we cannot warn about a redundant std::move because the implicit move
wouldn't happen for "return {t};" because the expression isn't just
a name.  However, we still want to warn about

  return {std::move(T())};

so let's not disable the -Wpessimizing-move warning.  Tests added for
both cases.
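
To see why the std::move in "return {std::move(t)};" is not redundant, here is a small sketch that counts copies and moves.  The types Tracked and Wrapper and both functions are hypothetical and are not taken from the new tests:

```cpp
#include <utility>

int copies = 0;  // number of Tracked copy constructions
int moves = 0;   // number of Tracked move constructions

struct Tracked {
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};

// Aggregate, so "return {x};" initializes the member 'value' from x.
struct Wrapper {
    Tracked value;
};

Wrapper wrap_plain()
{
    Tracked t;
    return {t};             // 't' is inside braces: no implicit move, so it is copied
}

Wrapper wrap_moved()
{
    Tracked t;
    return {std::move(t)};  // the explicit std::move turns the copy into a move
}
```

Calling wrap_plain() performs one copy and no move, while wrap_moved() performs one move and no copy.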

gcc/cp/ChangeLog:

* typeck.cc (maybe_warn_pessimizing_move): Don't warn about
redundant std::move when the expression was wrapped in { }.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/Wpessimizing-move10.C: New test.
* g++.dg/cpp0x/Wredundant-move12.C: New test.

x86: Replace vmovdqu with movdqu in BF16 XMM ABI tests

Since XMM BF16 tests only require SSE2, replace vmovdqu with movdqu in
BF16 XMM ABI tests to support SSE2 machines without AVX.

Tested on x86-64 machines with and without AVX.

* gcc.target/x86_64/abi/bf16/asm-support.S: Replace vmovdqu with
movdqu.

Update gcc .po files

* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po,
ja.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po, zh_CN.po,
zh_TW.po: Update.

libstdc++: Implement std::pair/tuple/misc enhancements from P2321R2

This implements the non-<ranges> changes from P2321R2, which primarily
consist of additional converting constructors, assignment operator and
swap overloads for std::pair and std::tuple.

libstdc++-v3/ChangeLog:

* include/bits/stl_bvector.h (_Bit_reference::operator=): Define
const overload for C++23 as per P2321R2.
* include/bits/stl_pair.h (pair::swap): Likewise.
(pair::pair): Define additional converting constructors for
C++23 as per P2321R2.
(pair::operator=): Define const overloads for C++23 as per
P2321R2.
(swap): Define overload taking const pair& for C++23 as per
P2321R2.
(basic_common_reference): Define partial specialization for
pair for C++23 as per P2321R2.
(common_type): Likewise.
* include/bits/uses_allocator_args.h
(uses_allocator_construction_args): Define additional pair
overloads for C++23 as per P2321R2.
* include/std/tuple (_Tuple_impl::_Tuple_impl): Define
additional converting constructors for C++23 as per P2321R2.
(_Tuple_impl::_M_assign): Define const overloads for C++23
as per P2321R2.
(_Tuple_impl::_M_swap): Likewise.
(tuple::__constructible): Define as a convenient renaming of
_TCC<true>::__constructible.
(tuple::__convertible): As above but for _TCC<true>::__convertible.
(tuple::tuple): Define additional converting constructors for
C++23 as per P2321R2.
(tuple::operator=): Define const overloads for C++23 as per
P2321R2.
(tuple::swap): Likewise.
(basic_common_reference): Define partial specialization for
tuple for C++23 as per P2321R2.
(common_type): Likewise.
* testsuite/20_util/pair/p2321r2.cc: New test.
* testsuite/20_util/tuple/p2321r2.cc: New test.
* testsuite/23_containers/vector/bool/element_access/1.cc: New test.

libstdc++: Separate construct/convertibility tests for std::tuple

P2321R2 adds additional conditionally explicit constructors to std::tuple
which we'll concisely implement in a subsequent patch using explicit(bool),
like in our C++20 std::pair implementation. This prerequisite patch
adds member typedefs to _TupleConstraints for testing element-wise
constructibility and convertibility separately; we'll use the first in
the new constructors' constraints, and the second in their explicit
specifier.

In passing, this patch also redefines the existing member predicates
__is_ex/implicitly_constructible in terms of these new members. This
seems to reduce compile time and memory usage by about 10% for large
tuples when using the converting constructors that're constrained by
_Explicit/_ImplicitCtor.

libstdc++-v3/ChangeLog:

* include/std/tuple (_TupleConstraints::__convertible): Define.
(_TupleConstraints::__constructible): Define.
(_TupleConstraints::__is_explicitly_constructible): Redefine this
in terms of __convertible and __constructible.
(_TupleConstraints::__is_implicitly_constructible): Likewise.

libstdc++: Fix visit<void>(v) for non-void visitors [PR106589]

The optimization for the common case of std::visit forgot to handle the
edge case of passing zero variants to a non-void visitor and converting
the result to void.

libstdc++-v3/ChangeLog:

PR libstdc++/106589
* include/std/variant (__do_visit): Handle is_void<R> for zero
argument case.
* testsuite/20_util/variant/visit_r.cc: Check std::visit<void>(v).

x86: Cast stride to __PTRDIFF_TYPE__ in AMX intrinsics

On 64-bit Windows, long is 32 bits and can't be used as stride in memory
operand when base is a pointer which is 64 bits. Cast stride to
__PTRDIFF_TYPE__, instead of long.

PR target/106714
* config/i386/amxtileintrin.h (_tile_loadd_internal): Cast to
__PTRDIFF_TYPE__.
(_tile_stream_loadd_internal): Likewise.
(_tile_stored_internal): Likewise.

tree-optimization/106722 - uninit analysis with long def -> use path

The following applies similar measures as r13-2133-ge66cf626c72d58
to the computation of the use predicate when the path from PHI def
to use is too long and we run into compute_control_dep_chain limits.

It also moves the preprocessor define limits internal.

This resolves the reduced testcase but not the original one.

PR tree-optimization/106722
* gimple-predicate-analysis.h (MAX_NUM_CHAINS, MAX_CHAIN_LEN,
MAX_POSTDOM_CHECK, MAX_SWITCH_CASES): Move ...
* gimple-predicate-analysis.cc: ... here and document.
(simple_control_dep_chain): New function, factored from
predicate::use_cannot_happen.
(predicate::use_cannot_happen): Adjust.
(predicate::predicate): Use simple_control_dep_chain as fallback.

* g++.dg/uninit-pr106722-1.C: New testcase.

testsuite: Add test for r11-4123

r11-4123 came without a test but I happened upon a nice test case that
got fixed by that revision.  So I think it'd be good to add it.  The
ICE was:

phi-1.C: In constructor 'ElementManager::ElementManager()':
phi-1.C:28:1: error: missing definition
   28 | ElementManager::ElementManager() : array_(makeArray()) {}
      | ^~~~~~~~~~~~~~
for SSA_NAME: _12 in statement:
_10 = PHI <_12(3), _11(5)>
PHI argument
_12
for PHI node
_10 = PHI <_12(3), _11(5)>
during GIMPLE pass: fixup_cfg
phi-1.C:28:1: internal compiler error: verify_ssa failed

gcc/testsuite/ChangeLog:

* g++.dg/torture/phi-1.C: New test.

New uninit testcase

I've reduced the following which doesn't seem covered in a good enough
way in the testsuite.

* gcc.dg/uninit-pred-10.c: New testcase.

gfortran.dg/gomp/depend-6.f90: Minor fix

Exactly the same as previous commit for depend-4.f90, r13-2151.

gcc/testsuite/

* gfortran.dg/gomp/depend-6.f90: Fix array index use for
depobj var + update scan-tree-dump-times.

gfortran.dg/gomp/depend-4.f90: Minor fix

gcc/testsuite/

* gfortran.dg/gomp/depend-4.f90: Fix array index use for
depobj var + update scan-tree-dump-times.

Copy range from op2 in foperator_equal::op1_range.

Like the integer version, when op1 == op2 is known to be true the
ranges are also equal.

gcc/ChangeLog:

* range-op-float.cc (foperator_equal::op1_range): Set range to
range of op2.

Refactor is_non_loop_exit_postdominating

That's a weird function in predicate analysis that currently looks like

/* Return true if BB1 is postdominating BB2 and BB1 is not a loop exit
   bb.  The loop exit bb check is simple and does not cover all cases.  */
static bool
is_non_loop_exit_postdominating (basic_block bb1, basic_block bb2)
{
  if (!dominated_by_p (CDI_POST_DOMINATORS, bb2, bb1))
    return false;
  if (single_pred_p (bb1) && !single_succ_p (bb2))
    return false;
  return true;
}

One can refactor this to

  return (dominated_by_p (CDI_POST_DOMINATORS, bb2, bb1)
          && !(single_pred_p (bb1) && !single_succ_p (bb2)));

Notable is that the comment refers to BB1 with respect to a loop
exit but the test seems to be written with an exit edge bb1 -> bb2
in mind.  None of the three callers are guaranteed to have bb1 and
bb2 connected directly with an edge.
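
That the one-expression form quoted above matches the original function can be checked exhaustively.  In this sketch plain bools stand in for the three CFG queries, and all names are hypothetical:

```cpp
// postdom        stands for dominated_by_p (CDI_POST_DOMINATORS, bb2, bb1)
// single_pred_1  stands for single_pred_p (bb1)
// single_succ_2  stands for single_succ_p (bb2)
bool original_form(bool postdom, bool single_pred_1, bool single_succ_2)
{
    if (!postdom)
        return false;
    if (single_pred_1 && !single_succ_2)
        return false;
    return true;
}

bool refactored_form(bool postdom, bool single_pred_1, bool single_succ_2)
{
    return postdom && !(single_pred_1 && !single_succ_2);
}

// Compare both forms on all eight input combinations.
bool forms_agree()
{
    for (int mask = 0; mask < 8; ++mask) {
        const bool a = (mask & 1) != 0;
        const bool b = (mask & 2) != 0;
        const bool c = (mask & 4) != 0;
        if (original_form(a, b, c) != refactored_form(a, b, c))
            return false;
    }
    return true;
}
```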

The patch now introduces a is_loop_exit function and inlines
the post-dominance check which makes the find_control_equiv_block
case simpler because the post-dominance check can be elided.
It also avoids the double negation in compute_control_dep_chain
and makes it obvious this is the case where we do look at an edge.
For the main is_use_guarded API I chose to elide the loop exit
test, if the use block post-dominates the definition block of the
PHI node the use is always unconditional.  I don't quite understand
the loop exit special-casing of the remaining two uses though.

* gimple-predicate-analysis.cc (is_loop_exit): Split out
from ...
(is_non_loop_exit_postdominating): ... here.  Remove after
inlining ...
(find_control_equiv_block): ... here.
(compute_control_dep_chain): ... and here.
(predicate::is_use_guarded): Do not exempt loop exits
from short-cutting the case of the use post-dominating the
PHI definition.

Add __m128bf16/__m256bf16/__m512bf16 type for bf16 abi test

Fix the ABI test failure caused by the missing types.

gcc/testsuite/ChangeLog:

* gcc.target/x86_64/abi/bf16/bf16-helper.h:
Add __m128bf16/__m256bf16/__m512bf16.
* gcc.target/x86_64/abi/bf16/m512bf16/bf16-zmm-check.h:
Include bf16-helper.h.

Return the correct relation

With an input condition of op1 > op2, and evaluating the unsigned expression:
LHS = op1 - op2
range-ops was returning LHS < op1, which is incorrect as op2 could be
zero. This patch adjusts it to return LHS <= op1.
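
The difference between the two relations can be checked by brute force on a small unsigned type.  This is a sketch with hypothetical function names, using 8-bit values as a stand-in for any unsigned type:

```cpp
#include <cstdint>

// For every 8-bit unsigned pair with op1 > op2, is LHS < op1 true?
bool strict_less_always_holds()
{
    for (unsigned op1 = 0; op1 < 256; ++op1)
        for (unsigned op2 = 0; op2 < op1; ++op2) {
            const std::uint8_t lhs = static_cast<std::uint8_t>(op1 - op2);
            if (!(lhs < op1))
                return false;  // fails when op2 == 0, since then lhs == op1
        }
    return true;
}

// For every 8-bit unsigned pair with op1 > op2, is LHS <= op1 true?
bool less_equal_always_holds()
{
    for (unsigned op1 = 0; op1 < 256; ++op1)
        for (unsigned op2 = 0; op2 < op1; ++op2) {
            const std::uint8_t lhs = static_cast<std::uint8_t>(op1 - op2);
            if (!(lhs <= op1))
                return false;
        }
    return true;
}
```

The strict relation is refuted by any pair with op2 equal to zero, while the non-strict one holds for all pairs.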

PR tree-optimization/106687
gcc/
* range-op.cc (operator_minus::lhs_op1_relation): Return VREL_LE
for the VREL_GT case as well.

gcc/testsuite/
* g++.dg/pr106687.C: New.

Daily bump.

libstdc++: Document linker option for C++23 <stacktrace> [PR105678]

libstdc++-v3/ChangeLog:

PR libstdc++/105678
* doc/xml/manual/using.xml: Document -lstdc++_libbacktrace
requirement for using std::stacktrace. Also adjust -frtti and
-fexceptions to document non-default (i.e. negative) forms.
* doc/html/*: Regenerate.

libstdc++: Fix for explicit copy ctors in <thread> and <future> [PR106695]

When I changed std::thread and std::async to avoid unnecessary move
construction of temporaries, I introduced a regression where types with
an explicit copy constructor could not be passed to std::thread or
std::async. The fix is to add a constructor instead of using aggregate
initialization of an unnamed temporary.

libstdc++-v3/ChangeLog:

PR libstdc++/106695
* include/bits/std_thread.h (thread::_State_impl): Forward
individual arguments to _Invoker constructor.
(thread::_Invoker): Add constructor. Delete copies.
* include/std/future (__future_base::_Deferred_state): Forward
individual arguments to _Invoker constructor.
(__future_base::_Async_state_impl): Likewise.
* testsuite/30_threads/async/106695.cc: New test.
* testsuite/30_threads/thread/106695.cc: New test.

libstdc++: Check for overflow in regex back-reference [PR106607]

Currently we fail to notice integer overflow when parsing a
back-reference expression, or when converting the parsed result from
long to int. This changes the result to be int, so no conversion is
needed, and uses the overflow-checking built-ins to detect an
out-of-range back-reference.
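
A minimal sketch of overflow-checked digit accumulation with the GCC built-ins.  The helper parse_backref is hypothetical and is not the code in _M_cur_int_value:

```cpp
#include <string>

// Parse a decimal back-reference number into an int.
// Returns false for empty input, non-digits, or a value that overflows int.
bool parse_backref(const std::string& digits, int& value)
{
    if (digits.empty())
        return false;
    int v = 0;
    for (char ch : digits) {
        if (ch < '0' || ch > '9')
            return false;
        // Both built-ins return true when the exact result does not fit.
        if (__builtin_mul_overflow(v, 10, &v)
            || __builtin_add_overflow(v, ch - '0', &v))
            return false;
    }
    value = v;
    return true;
}
```

Accumulating directly into an int means no narrowing conversion is needed afterwards.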

libstdc++-v3/ChangeLog:

PR libstdc++/106607
* include/bits/regex_compiler.tcc (_Compiler::_M_cur_int_value):
Use built-ins to check for integer overflow in back-reference
number.
* testsuite/28_regex/basic_regex/106607.cc: New test.

pru: Optimize 64-bit logical operations

The earlyclobber in the pattern yields inefficient code due to
unnecessarily generated moves.  Optimize by removing the earlyclobber
for two special alternatives:
  - If OP2 is a small constant integer.
  - If the logical bit operation has only two operands.

gcc/ChangeLog:

* config/pru/pru.md (pru_<code>di3): New alternative for
two operands but without earlyclobber.

gcc/testsuite/ChangeLog:

* gcc.target/pru/bitop-di.c: New test.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

pru: Add mov variants to load const -1

Use the FILL instruction to efficiently load -1 constants.

gcc/ChangeLog:

* config/pru/pru.md (prumov<mode>, mov<mode>): Add
variants for loading -1 consts.

gcc/testsuite/ChangeLog:

* gcc.target/pru/mov-m1.c: New test.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

PR target/106564: pru: Optimize 64-bit sign- and zero-extend

Add new patterns to optimize 64-bit sign- and zero-extend operations for
the PRU target.

The new 64-bit zero-extend patterns are straightforward define_insns.

The old 16/32-bit sign-extend pattern has been rewritten from scratch
in order to add 64-bit support. The new pattern expands into several
optimized insns for filling bytes with zeros or ones, and for
conditional branching on bit-test. The bulk of this patch is to
implement the patterns for those new optimized insns.

PR target/106564

gcc/ChangeLog:

* config/pru/constraints.md (Um): New constraint for -1.
(Uf): New constraint for IOR fill-bytes constants.
(Uz): New constraint for AND zero-bytes constants.
* config/pru/predicates.md (const_fillbytes_operand): New
predicate for IOR fill-bytes constants.
(const_zerobytes_operand): New predicate for AND zero-bytes
constants.
* config/pru/pru-protos.h (pru_output_sign_extend): Remove.
(struct pru_byterange): New struct to describe a byte range.
(pru_calc_byterange): New declaration.
* config/pru/pru.cc (pru_rtx_costs): Add penalty for
64-bit zero-extend.
(pru_output_sign_extend): Remove.
(pru_calc_byterange): New helper function to extract byte
range info from a constant.
(pru_print_operand): Remove 'y' and 'z' print modifiers.
* config/pru/pru.md (zero_extendqidi2): New pattern.
(zero_extendhidi2): New pattern.
(zero_extendsidi2): New pattern.
(extend<EQS0:mode><EQD:mode>2): Rewrite as an expand.
(@pru_ior_fillbytes<mode>): New pattern.
(@pru_and_zerobytes<mode>): New pattern.
(<code>di3): Rewrite as an expand and handle ZERO and FILL
special cases.
(pru_<code>di3): New name for <code>di3.
(@cbranch_qbbx_const_<BIT_TEST:code><HIDI:mode>): New pattern to
handle bit-test for 64-bit registers.

gcc/testsuite/ChangeLog:

* gcc.target/pru/pr106564-1.c: New test.
* gcc.target/pru/pr106564-2.c: New test.
* gcc.target/pru/pr106564-3.c: New test.
* gcc.target/pru/pr106564-4.c: New test.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

Fortran: fix simplification of intrinsics IBCLR and IBSET [PR106557]

gcc/fortran/ChangeLog:

PR fortran/106557
* simplify.cc (gfc_simplify_ibclr): Ensure consistent results of
the simplification by dropping a redundant memory representation
of argument x.
(gfc_simplify_ibset): Likewise.

gcc/testsuite/ChangeLog:

PR fortran/106557
* gfortran.dg/pr106557.f90: New test.

Remove dead predicate analysis GENERIC expr building code

The following removes the unused def_expr, use_expr and expr APIs
from the predicate class including the unconditional build of the
GENERIC use_expr on each uninit analysis run.

* gimple-predicate-analysis.h (predicate::m_use_expr): Remove.
(predicate::def_expr): Likewise.
(predicate::use_expr): Likewise.
(predicate::expr): Likewise.
* gimple-predicate-analysis.cc (predicate::def_expr): Remove.
(predicate::use_expr): Likewise.
(predicate::expr): Likewise.
(predicate::is_use_guarded): Do not build m_use_expr.

jobserver: detect properly O_NONBLOCK

PR lto/106700

gcc/ChangeLog:

* configure.ac: Detect O_NONBLOCK flag for open.
* config.in: Regenerate.
* configure: Regenerate.
* opts-common.cc (jobserver_info::connect): Set is_connected
properly based on O_NONBLOCK.
* opts-jobserver.h (struct jobserver_info): Add is_connected
member variable.

gcc/lto/ChangeLog:

* lto.cc (wait_for_child): Ask if we are connected to jobserver.
(stream_out_partitions): Likewise.

middle-end: Fix issue of poly_uint16 (1, 1) in self test

This patch fixes the issue with poly_uint16 (1, 1) in the machine mode self test.

gcc/ChangeLog:

* simplify-rtx.cc (test_vector_subregs_fore_back): Make first value
and repeat value different.

lto-wrapper.cc: Delete offload_names temp files in case of error [PR106686]

Usually, the caller takes care of the .o files for the offload compilers
(suffix: ".target.o"). However, if an error occurs during processing
(e.g. fatal error by lto1), they were not deleted.

gcc/ChangeLog:

PR lto/106686
* lto-wrapper.cc (free_array_of_ptrs): Move before tool_cleanup.
(tool_cleanup): Unlink offload_names.
(compile_offload_image): Take filename argument to set it early.
(compile_images_for_offload_targets): Update call; set
offload_names to NULL after freeing the array.

tree-optimization/105937 - avoid uninit diagnostics crossing iterations

The following avoids adding PHIs to the worklist for uninit processing
if we reach them following backedges.  That confuses predicate analysis
because it assumes the use is happening in the same iteration as the
definition.  For the testcase in the PR the situation is like

void foo (int val)
{
  int uninit;
  # val = PHI <..> (B)
  for (..)
    {
      if (..)
        {
          .. = val; (C)
          val = uninit;
        }
      # val = PHI <..> (A)
    }
}

and starting from (A) with 'uninit' as argument we arrive at (B)
and from there at (C).  Predicate analysis then tries to prove that
the predicate of (B) (not the backedge) makes the path from (B) to
(C) unreachable, which isn't really what is necessary - that's what
we'd need to do if the preheader edge of the loop were the edge
with the uninitialized def.

So the following makes those cases intentionally false negatives.

PR tree-optimization/105937
* tree-ssa-uninit.cc (find_uninit_use): Do not queue PHIs
on backedges.
(execute_late_warn_uninitialized): Mark backedges.

* g++.dg/uninit-pr105937.C: New testcase.

Improve uninit analysis

The following reduces the number of false positives in uninit analysis
by providing fallbacks for situations where the current analysis gives
up and thus warns because it cannot prove initialization.

The first situation is when compute_control_dep_chain gives up walking
because it runs into either param_uninit_control_dep_attempts or
MAX_CHAIN_LEN.  If in the process it did not collect a single path
from function entry to the interesting PHI edge then we'll give up
and diagnose.  The following patch instead provides a sparse path
including only those predicates that always hold when the PHI edge
is reached in that case.  That's cheap to produce but may in some
odd cases prove less precise than what the code tries now (enumerating
all possible paths from function entry to the PHI edge, but only
use the first N of those and only require unreachability of those N).

The second situation is when the set of predicates computed to hold
on the use stmt was formed from multiple paths (there's a similar
enumeration of all paths and their predicates from the PHI def to the
use).  In that case use_preds.use_cannot_happen gives up because
it doesn't know which of the predicates from which path from PHI to
the use it can use to prove unreachability of the PHI edge that has
the uninitialized def.  The patch for this case simply computes
the intersection of the predicates and uses that for further analysis,
but in a crude way since the predicate vectors are not sorted.
Fortunately the total size is limited - we have max MAX_NUM_CHAINS
number of predicates each of length MAX_CHAIN_LEN so the brute
force intersection code should behave quite reasonably in practice.
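
As a rough illustration of the brute-force intersection described above, here is a minimal sketch. The `pred` struct and `intersect_chains` function are hypothetical stand-ins, not the types or helpers used in gimple-predicate-analysis.cc; the sketch only assumes that predicates can be compared for equality and that the chains are unsorted.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a predicate; GCC's real predicates carry
// trees and a comparison code, which are not modelled here.
struct pred {
  int lhs;
  int rhs;
  bool operator==(const pred &o) const { return lhs == o.lhs && rhs == o.rhs; }
};

// Keep only the predicates that occur in every chain.  The chains are
// unsorted, so each candidate is searched linearly in every other chain.
std::vector<pred>
intersect_chains(const std::vector<std::vector<pred>> &chains)
{
  std::vector<pred> common;
  if (chains.empty())
    return common;
  for (const pred &p : chains[0]) {
    bool in_all = true;
    for (std::size_t i = 1; i < chains.size() && in_all; ++i)
      in_all = std::find(chains[i].begin(), chains[i].end(), p)
               != chains[i].end();
    // Skip duplicates so each common predicate is reported once.
    if (in_all && std::find(common.begin(), common.end(), p) == common.end())
      common.push_back(p);
  }
  return common;
}
```

With the number of chains and the chain length both bounded by small constants, the nested linear searches stay cheap, which is the argument the message makes for not sorting first.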

* gimple-predicate-analysis.cc (predicate::use_cannot_happen):
If the use is guarded with multiple predicate paths compute
the predicates intersection before going forward.  When
compute_control_dep_chain wasn't able to come up with at
least one path from function entry to the PHI edge compute
a conservative sparse path instead.

analyzer: add missing final keyword

Fixes the following clang warning:
gcc/analyzer/region-model.cc:5096:8: warning: 'subclass_equal_p' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]

gcc/analyzer/ChangeLog:

* region-model.cc: Add missing final keyword.

Daily bump.

fortran: Drop -static-lib{gfortran,quadmath} from f951 [PR46539]

As discussed earlier, all other -static-lib* options are Driver only;
these 2 are Driver in common.opt and Fortran in lang.opt.

The spec files never pass the -static-lib* options down to any compiler
(f951 etc.), so the 2 errors below are reported only when one
runs ./f951 -static-libgfortran by hand.

The following patch just removes f951 support of these options, the
gfortran driver behavior remains as before. For other -static-lib*
options (and even these, because they are never passed to f951) we never
error if we can't support those options, and e.g. Darwin is actually
able to handle those options through other means.

2022-08-20 Jakub Jelinek <jakub@redhat.com>

PR fortran/46539
* lang.opt (static-libgfortran, static-libquadmath): Change Fortran
to Driver.
* options.cc (gfc_handle_option): Don't handle OPT_static_libgfortran
nor OPT_static_libquadmath here.

LoongArch: Add support for code model extreme.

Use five instructions to calculate a signed 64-bit offset relative to the pc.
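
To make the bit ranges named in the ChangeLog entries concrete (bits 0-11, 12-31, 32-51 and 52-63), here is a minimal sketch that splits a 64-bit offset into those four fields and reassembles it. The names `offset_fields`, `split_offset` and `join_offset` are hypothetical, and the sketch deliberately ignores the sign-extension adjustments that real instructions and relocations apply; it only illustrates the field layout.

```cpp
#include <cstdint>

// Field layout assumed from the ChangeLog: four pieces of a 64-bit value.
struct offset_fields {
  std::uint64_t lo12;  // bits 0-11
  std::uint64_t hi20;  // bits 12-31
  std::uint64_t lo20;  // bits 32-51
  std::uint64_t hi12;  // bits 52-63
};

// Split a signed 64-bit offset into its four unsigned bit fields.
offset_fields split_offset(std::int64_t offset) {
  std::uint64_t u = static_cast<std::uint64_t>(offset);
  return {u & 0xfff, (u >> 12) & 0xfffff, (u >> 32) & 0xfffff,
          (u >> 52) & 0xfff};
}

// Reassemble the fields; the inverse of split_offset.
std::int64_t join_offset(const offset_fields &f) {
  std::uint64_t u =
      f.lo12 | (f.hi20 << 12) | (f.lo20 << 32) | (f.hi12 << 52);
  return static_cast<std::int64_t>(u);
}
```

The round trip `join_offset(split_offset(x)) == x` holds for any 64-bit value, which is why four pieces of 12, 20, 20 and 12 bits are enough to cover the full offset.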

gcc/ChangeLog:

* config/loongarch/loongarch-opts.cc: Allow cmodel to be extreme.
* config/loongarch/loongarch.cc (loongarch_call_tls_get_addr):
Add extreme support for TLS GD and LD types.
(loongarch_legitimize_tls_address): Add extreme support for TLS LE
and IE.
(loongarch_split_symbol): When compiling with -mcmodel=extreme,
the symbol address will be obtained through five instructions.
(loongarch_print_operand_reloc): Add support.
(loongarch_print_operand): Add support.
(loongarch_print_operand_address): Add support.
(loongarch_option_override_internal): Set '-mcmodel=extreme' option
incompatible with '-mno-explicit-relocs'.
* config/loongarch/loongarch.md (@lui_l_hi20<mode>):
Loads bits 12-31 of data into registers.
(lui_h_lo20): Load bits 32-51 of the data and splice them with bits
0-31 of the source register.
(lui_h_hi12): Load bits 52-63 of the data and splice them with bits
0-51 of the source register.
* config/loongarch/predicates.md: Symbols need to be decomposed
when defining the macro TARGET_CMODEL_EXTREME.
* doc/invoke.texi: Modify the description information of cmodel in the document.
Document -W[no-]extreme-plt.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/func-call-1.c: Add option '-mcmodel=normal'.
* gcc.target/loongarch/func-call-2.c: Likewise.
* gcc.target/loongarch/func-call-3.c: Likewise.
* gcc.target/loongarch/func-call-4.c: Likewise.
* gcc.target/loongarch/func-call-5.c: Likewise.
* gcc.target/loongarch/func-call-6.c: Likewise.
* gcc.target/loongarch/func-call-7.c: Likewise.
* gcc.target/loongarch/func-call-8.c: Likewise.
* gcc.target/loongarch/relocs-symbol-noaddend.c: Likewise.
* gcc.target/loongarch/func-call-extreme-1.c: New test.
* gcc.target/loongarch/func-call-extreme-2.c: New test.

libcpp: Implement C++23 P2290R3 - Delimited escape sequences [PR106645]

The following patch implements the C++23 P2290R3 paper.

2022-08-20  Jakub Jelinek  <jakub@redhat.com>

PR c++/106645
libcpp/
* include/cpplib.h (struct cpp_options): Implement
P2290R3 - Delimited escape sequences.  Add delimited_escape_seqs
member.
* init.cc (struct lang_flags): Likewise.
(lang_defaults): Add delim column.
(cpp_set_lang): Copy over delimited_escape_seqs.
* charset.cc (extend_char_range): New function.
(_cpp_valid_ucn): Use it.  Handle delimited escape sequences.
(convert_hex): Likewise.
(convert_oct): Likewise.
(convert_ucn): Use extend_char_range.
(convert_escape): Call convert_oct even for \o.
(_cpp_interpret_identifier): Handle delimited escape sequences.
* lex.cc (get_bidi_ucn_1): Likewise.  Add end argument, fill it in.
(get_bidi_ucn): Adjust get_bidi_ucn_1 caller.  Use end argument to
compute num_bytes.
gcc/testsuite/
* c-c++-common/cpp/delimited-escape-seq-1.c: New test.
* c-c++-common/cpp/delimited-escape-seq-2.c: New test.
* c-c++-common/cpp/delimited-escape-seq-3.c: New test.
* c-c++-common/Wbidi-chars-24.c: New test.
* gcc.dg/cpp/delimited-escape-seq-1.c: New test.
* gcc.dg/cpp/delimited-escape-seq-2.c: New test.
* g++.dg/cpp/delimited-escape-seq-1.C: New test.
* g++.dg/cpp/delimited-escape-seq-2.C: New test.

Daily bump.

mkoffload: Cleanup temporary omp_requires_file

The file (suffix ".mkoffload.omp_requires") used to save the 'omp requires'
data has to be passed to maybe_unlink for cleanup or -v -save-temps stderr
diagnostic. That was missed before. For GCN, the same has to be done for
the files with suffix ".mkoffload.dbg.o".

gcc/ChangeLog:

* config/gcn/mkoffload.cc (main): Add omp_requires_file and dbgobj to
files_to_cleanup.
* config/i386/intelmic-mkoffload.cc (prepare_target_image): Add
omp_requires_file to temp_files.
* config/nvptx/mkoffload.cc (omp_requires_file): New global static var.
(main): Remove local omp_requires_file var.
(tool_cleanup): Handle omp_requires_file.

Remove path_range_query constructor that takes an edge.

The path_range_query constructor that takes an edge is really a
convenience function for the loop-ch pass. It feels wrong to pollute
the API with such a specialized function that could be done with
a small inline function closer to its user.

As an added benefit, we remove one use of reset_path. The last
remaining one is the forward threader one.

Tested, thread-counted, and benchmarked on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::path_range_query):
Remove constructor that takes edge.
* gimple-range-path.h (class path_range_query): Same.
* tree-ssa-loop-ch.cc (edge_range_query): New.
(entry_loop_condition_is_static): Call edge_range_query.

Add further FOR_EACH_ macros

contrib/ChangeLog:

* clang-format: Add further FOR_EACH_ macros.

i386: Add ABI test for __bf16 type

gcc/testsuite/ChangeLog:

* gcc.target/x86_64/abi/bf16/abi-bf16.exp: New test.
* gcc.target/x86_64/abi/bf16/args.h: Ditto.
* gcc.target/x86_64/abi/bf16/asm-support.S: Ditto.
* gcc.target/x86_64/abi/bf16/bf16-check.h: Ditto.
* gcc.target/x86_64/abi/bf16/bf16-helper.h: Ditto.
* gcc.target/x86_64/abi/bf16/defines.h: Ditto.
* gcc.target/x86_64/abi/bf16/m256bf16/abi-bf16-ymm.exp: Ditto.
* gcc.target/x86_64/abi/bf16/m256bf16/args.h: Ditto.
* gcc.target/x86_64/abi/bf16/m256bf16/asm-support.S: Ditto.
* gcc.target/x86_64/abi/bf16/m256bf16/bf16-ymm-check.h: Ditto.
* gcc.target/x86_64/abi/bf16/m256bf16/test_m256_returning.c: Ditto.
* gcc.target/x86_64/abi/bf16/m256bf16/test_passing_m256.c: Ditto.
* gcc.target/x86_64/abi/bf16/m256bf16/test_passing_structs.c: Ditto.
* gcc.target/x86_64/abi/bf16/m256bf16/test_passing_unions.c: Ditto.
* gcc.target/x86_64/abi/bf16/m256bf16/test_varargs-m256.c: Ditto.
* gcc.target/x86_64/abi/bf16/m512bf16/abi-bf16-zmm.exp: Ditto.
* gcc.target/x86_64/abi/bf16/m512bf16/args.h: Ditto.
* gcc.target/x86_64/abi/bf16/m512bf16/asm-support.S: Ditto.
* gcc.target/x86_64/abi/bf16/m512bf16/bf16-zmm-check.h: Ditto.
* gcc.target/x86_64/abi/bf16/m512bf16/test_m512_returning.c: Ditto.
* gcc.target/x86_64/abi/bf16/m512bf16/test_passing_m512.c: Ditto.
* gcc.target/x86_64/abi/bf16/m512bf16/test_passing_structs.c: Ditto.
* gcc.target/x86_64/abi/bf16/m512bf16/test_passing_unions.c: Ditto.
* gcc.target/x86_64/abi/bf16/m512bf16/test_varargs-m512.c: Ditto.
* gcc.target/x86_64/abi/bf16/macros.h: Ditto.
* gcc.target/x86_64/abi/bf16/test_3_element_struct_and_unions.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_basic_alignment.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_basic_array_size_and_align.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_basic_returning.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_basic_sizes.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_basic_struct_size_and_align.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_basic_union_size_and_align.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_m128_returning.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_passing_floats.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_passing_m128.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_passing_structs.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_passing_unions.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_struct_returning.c: Ditto.
* gcc.target/x86_64/abi/bf16/test_varargs-m128.c: Ditto.

Daily bump.

preprocessor: Support #warning for standard C2x

ISO C2x standardizes the existing #warning extension. Arrange
accordingly for it not to be diagnosed with -std=c2x -pedantic, but to
be diagnosed with -Wc11-c2x-compat.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/testsuite/
* gcc.dg/cpp/c11-warning-1.c, gcc.dg/cpp/c11-warning-2.c,
gcc.dg/cpp/c11-warning-3.c, gcc.dg/cpp/c11-warning-4.c,
gcc.dg/cpp/c2x-warning-1.c, gcc.dg/cpp/c2x-warning-2.c,
gcc.dg/cpp/gnu11-warning-1.c, gcc.dg/cpp/gnu11-warning-2.c,
gcc.dg/cpp/gnu11-warning-3.c, gcc.dg/cpp/gnu11-warning-4.c,
gcc.dg/cpp/gnu2x-warning-1.c, gcc.dg/cpp/gnu2x-warning-2.c: New
tests.

libcpp/
* include/cpplib.h (struct cpp_options): Add warning_directive.
* init.cc (struct lang_flags, lang_defaults): Add
warning_directive.
* directives.cc (DIRECTIVE_TABLE): Mark #warning as STDC2X not
EXTENSION.
(directive_diagnostics): Diagnose #warning with -Wc11-c2x-compat,
or with -pedantic for a standard not supporting #warning.

xtensa: Improve indirect sibling call handling

Indirect sibling calls no longer need the dedicated hard register (A11) for
the address of the call, nor the split patterns for fixups, thanks to the
introduction of an appropriate register class and constraint.

(Note: "ISC_REGS" contains the hard register A8, which is used as the
"static chain" pointer for nested functions, but this is not a problem: a
pointer to a nested function actually points to a "trampoline", and the
trampoline itself doesn't receive a "static chain" pointer to its parent's
stack frame from the caller.)

gcc/ChangeLog:

* config/xtensa/xtensa.h
(enum reg_class, REG_CLASS_NAMES, REG_CLASS_CONTENTS):
Add new register class "ISC_REGS".
* config/xtensa/constraints.md (c): Add new register constraint.
* config/xtensa/xtensa.md (define_constants): Remove "A11_REG".
(sibcall_internal, sibcall_value_internal):
Change to use the new register constraint, and remove two split
patterns for fixups that are no longer needed.

gcc/testsuite/ChangeLog:

* gcc.target/xtensa/sibcalls.c: Add a new test function to ensure
that registers for arguments (occupy from A2 to A7) and for indirect
sibcall (should be assigned to A8) neither conflict nor spill out.

Revert "Fortran: fix invalid rank error in ASSOCIATED when rank is remapped [PR77652]"

This reverts commit 0110cfd5449bae3a772f45ea2e4c5dab5b7a8ccd.

RISC-V: Standardize formatting of SFB ALU conditional move

Standardize the formatting of SFB ALU conditional move operations from:

beq a2,zero,1f; mv a0,zero; 1: # movcc

to:

beq a2,zero,1f # movcc
mv a0,zero
1:

for consistency with other assembly code produced. No functional change.

gcc/
* config/riscv/riscv.md (*mov<GPR:mode><X:mode>cc): Fix output
pattern formatting.

contrib: Fix a typo in contrib/git-fetch-vendor.sh

2022-08-18 Andrea Corallo <andrea.corallo@arm.com>

* git-fetch-vendor.sh : Fix typo.

analyzer: warn on the use of floating-point operands in the size argument [PR106181]

This patch fixes the ICE reported in PR106181 and adds a new warning to
the analyzer complaining about the use of floating-point operands.
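
To show why floating-point operands in a size computation are worth a warning, here is a minimal sketch. The helpers `size_via_float` and `size_via_integer` are hypothetical and do not come from the patch or its tests; the sketch assumes `float` is an IEEE 754 binary32 type with round-to-nearest conversion.

```cpp
#include <cstddef>

// Compute a size through float arithmetic, the kind of pattern the new
// warning is about.  A binary32 float has a 24-bit significand, so large
// counts are silently rounded.
std::size_t size_via_float(std::size_t count) {
  float scaled = static_cast<float>(count) * 1.0f;
  return static_cast<std::size_t>(scaled);
}

// The same computation kept in integer arithmetic stays exact.
std::size_t size_via_integer(std::size_t count) {
  return count * 1;
}
```

For a count of 16777217 (2^24 + 1), the float path cannot represent the value and yields 16777216, so a buffer sized this way would be one element short, while the integer path returns the count unchanged.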

Regrtested on Linux x86_64.

2022-08-17 Tim Lange <mail@tim-lange.me>

gcc/analyzer/ChangeLog:

PR analyzer/106181
* analyzer.opt: Add Wanalyzer-imprecise-fp-arithmetic.
* region-model.cc (is_any_cast_p): Formatting.
(region_model::check_region_size): Ensure precondition.
(class imprecise_floating_point_arithmetic): New abstract
diagnostic class for all floating-point related warnings.
(class float_as_size_arg): Concrete diagnostic class to complain
about floating-point operands inside the size argument.
(class contains_floating_point_visitor):
New visitor to find floating-point operands inside svalues.
(region_model::check_dynamic_size_for_floats): New function.
(region_model::set_dynamic_extents):
Call to check_dynamic_size_for_floats.
* region-model.h (class region_model):
Add region_model::check_dynamic_size_for_floats.

gcc/ChangeLog:

PR analyzer/106181
* doc/invoke.texi: Add Wanalyzer-imprecise-fp-arithmetic.

gcc/testsuite/ChangeLog:

PR analyzer/106181
* gcc.dg/analyzer/allocation-size-1.c: New test.
* gcc.dg/analyzer/imprecise-floating-point-1.c: New test.
* gcc.dg/analyzer/pr106181.c: New test.

Make path_range_query standalone and add reset_path.

These are a bunch of cleanups inspired by Richi's suggestion of making
path_range_query standalone, instead of having to call
compute_ranges() for each path.

I've made the ranger need explicit, and moved the responsibility for
its creation to the caller.

I've also investigated and documented why the forward threader needs its
own compute exit dependencies variant. I can't wait for it to go away
:-/.

I've also added constructors that take a path and dependencies, and
made compute_ranges() private. Unfortunately, reinstantiating
path_range_query in the forward threader caused a 14% performance
regression in DOM, because the old threader calls it over and over on
the same path to simplify statements (some of which are not even in the
IL, but that's old news).

In the meantime, I've left the ability to reset a path, but this time
appropriately called reset_path().

Tested, benchmarked, and thread counted on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::path_range_query): Add
various constructors to take a path.
(path_range_query::~path_range_query): Remove m_alloced_ranger.
(path_range_query::range_on_path_entry): Adjust for m_ranger being
a reference.
(path_range_query::set_path): Rename to...
(path_range_query::reset_path): ...this and call compute_ranges.
(path_range_query::ssa_range_in_phi): Adjust for m_ranger
reference.
(path_range_query::range_defined_in_block): Same.
(path_range_query::compute_ranges_in_block): Same.
(path_range_query::adjust_for_non_null_uses): Same.
(path_range_query::compute_exit_dependencies): Use m_path instead
of argument.
(path_range_query::compute_ranges): Remove path argument.
(path_range_query::range_of_stmt): Adjust for m_ranger reference.
(path_range_query::compute_outgoing_relations): Same.
* gimple-range-path.h (class path_range_query): Add various
constructors.
Make compute_ranges and compute_exit_dependencies private.
Rename set_path to reset_path.
Make m_ranger a reference.
Remove m_alloced_ranger.
* tree-ssa-dom.cc (pass_dominator::execute): Adjust constructor to
path_range_query.
* tree-ssa-loop-ch.cc (entry_loop_condition_is_static): Take a
ranger and instantiate a new path_range_query every time.
(ch_base::copy_headers): Pass ranger instead of path_range_query.
* tree-ssa-threadbackward.cc (class back_threader): Remove m_solver.
(back_threader::~back_threader): Remove m_solver.
(back_threader::find_taken_edge_switch): Adjust for m_ranger
reference.
(back_threader::find_taken_edge_cond): Same.
(back_threader::dump): Remove m_solver.
(back_threader::back_threader): Move verify_marked_backedges
here from the path_range_query constructor.
* tree-ssa-threadedge.cc (hybrid_jt_simplifier::simplify): Move
some code from compute_ranges_from_state here.
(hybrid_jt_simplifier::compute_ranges_from_state): Rename...
(hybrid_jt_simplifier::compute_exit_dependencies): ...to this.
* tree-ssa-threadedge.h (class hybrid_jt_simplifier): Rename
compute_ranges_from_state to compute_exit_dependencies.
Remove m_path.

middle-end/106617 - fix fold_binary_op_with_conditional_arg pattern issue

Now that we have parts of fold_binary_op_with_conditional_arg duplicated
in match.pd and are using ! to take or throw away the result we have to
be careful to not have both implementations play games with each other,
causing quadratic behavior. In particular the match.pd implementation
requires both arms to simplify while the fold-const.cc is happy with
just one arm simplifying (something we cannot express in match.pd).

The fix is to simply not enable the match.pd pattern for GENERIC.

PR middle-end/106617
* match.pd ((a ? b : c) > d -> a ? (b > d) : (c > d)): Fix
guard, disable on GENERIC to not cause quadratic behavior
with the fold-const.cc implementation and the use of !

* gcc.dg/pr106617.c: New testcase.

gcov-dump: properly use INCLUDE_VECTOR

PR gcov-profile/106659

gcc/ChangeLog:

* gcov-dump.cc (INCLUDE_VECTOR): Include vector.h with
INCLUDE_VECTOR.

x86: Support vector __bf16 type

gcc/ChangeLog:

* config/i386/i386-expand.cc (ix86_expand_sse_movcc): Handle vector
BFmode.
(ix86_expand_vector_init_duplicate): Support vector BFmode.
(ix86_expand_vector_init_one_nonzero): Ditto.
(ix86_expand_vector_init_one_var): Ditto.
(ix86_expand_vector_init_concat): Ditto.
(ix86_expand_vector_init_interleave): Ditto.
(ix86_expand_vector_init_general): Ditto.
(ix86_expand_vector_init): Ditto.
(ix86_expand_vector_set_var): Ditto.
(ix86_expand_vector_set): Ditto.
(ix86_expand_vector_extract): Ditto.
* config/i386/i386.cc (classify_argument): Add BF vector modes.
(function_arg_64): Ditto.
(ix86_gimplify_va_arg): Ditto.
(ix86_get_ssemov): Ditto.
* config/i386/i386.h (VALID_AVX256_REG_MODE): Add BF vector modes.
(VALID_AVX512F_REG_MODE): Ditto.
(host_detect_local_cpu): Ditto.
(VALID_SSE2_REG_MODE): Ditto.
* config/i386/i386.md: Add BF vector modes.
(MODE_SIZE): Ditto.
(ssemodesuffix): Add bf suffix for BF vector modes.
(ssevecmode): Ditto.
* config/i386/sse.md (VMOVE): Adjust for BF vector modes.
(VI12HFBF_AVX512VL): Ditto.
(V_256_512): Ditto.
(VF_AVX512HFBF16): Ditto.
(VF_AVX512BWHFBF16): Ditto.
(VIHFBF): Ditto.
(avx512): Ditto.
(VIHFBF_256): Ditto.
(VIHFBF_AVX512BW): Ditto.
(VI2F_256_512): Ditto.
(V8_128): Ditto.
(V16_256): Ditto.
(V32_512): Ditto.
(sseinsnmode): Ditto.
(sseconstm1): Ditto.
(sseintmodesuffix): New mode_attr.
(avx512fmaskmode): Ditto.
(avx512fmaskmodelower): Ditto.
(ssedoublevecmode): Ditto.
(ssehalfvecmode): Ditto.
(ssehalfvecmodelower): Ditto.
(ssescalarmode): Add vector BFmode mapping.
(ssescalarmodelower): Ditto.
(ssexmmmode): Ditto.
(ternlogsuffix): Ditto.
(ssescalarsize): Ditto.
(sseintprefix): Ditto.
(i128): Ditto.
(xtg_mode): Ditto.
(bcstscalarsuff): Ditto.
(<avx512>_blendm<mode>): New define_insn for BFmode.
(<avx512>_store<mode>_mask): Ditto.
(vcond_mask_<mode><avx512fmaskmodelower>): Ditto.
(vec_set<mode>_0): New define_insn for BF vector set.
(V8BFH_128): New mode_iterator for BFmode.
(avx512fp16_mov<mode>): Ditto.
(vec_set<mode>): New define_insn for BF vector set.
(@vec_extract_hi_<mode>): Ditto.
(@vec_extract_lo_<mode>): Ditto.
(vec_set_hi_<mode>): Ditto.
(vec_set_lo_<mode>): Ditto.
(*vec_extract<mode>_0): New define_insn_and_split for BF
vector extract.
(*vec_extract<mode>): New define_insn.
(VEC_EXTRACT_MODE): Add BF vector modes.
(PINSR_MODE): Add V8BF.
(sse2p4_1): Ditto.
(pinsr_evex_isa): Ditto.
(<sse2p4_1>_pinsr<ssemodesuffix>): Adjust to support
insert for V8BFmode.
(pbroadcast_evex_isa): Add BF vector modes.
(AVX2_VEC_DUP_MODE): Ditto.
(VEC_INIT_MODE): Ditto.
(VEC_INIT_HALF_MODE): Ditto.
(avx2_pbroadcast<mode>): Adjust to support BF vector mode
broadcast.
(avx2_pbroadcast<mode>_1): Ditto.
(<avx512>_vec_dup<mode>_1): Ditto.
(<mask_codefor><avx512>_vec_dup_gpr<mode><mask_name>):
Ditto.

gcc/testsuite/ChangeLog:

* g++.target/i386/vect-bfloat16-1.C: New test.
* gcc.target/i386/vect-bfloat16-1.c: New test.
* gcc.target/i386/vect-bfloat16-2a.c: New test.
* gcc.target/i386/vect-bfloat16-2b.c: New test.
* gcc.target/i386/vect-bfloat16-typecheck_1.c: New test.
* gcc.target/i386/vect-bfloat16-typecheck_2.c: New test.

build: regenerate gcc/configure

After the change 71f068a9b3332a2179dfc807cf9138f691d77461, gcc/configure
needs to be regenerated.

gcc/ChangeLog:

* configure: Regenerate.

Makefile.def: drop remnants of unused libelf

Use of libelf was removed from gcc in r0-104274-g48215350c24d52 ("re PR
lto/46273 (Failed to bootstrap)") around 2010, before gcc-4.6.0.

This change removes unused references to libelf from top-level configure
and Makefile.

/
* Makefile.def: Drop libelf module and gcc-configure dependency
on it.
* Makefile.in: Regenerate with 'autogen Makefile.def'.
* Makefile.tpl (HOST_EXPORTS): Drop unused LIBELFLIBS and
LIBELFINC.
* configure: Regenerate.
* configure.ac (host_libs): Drop unused libelf.

Add libgo dependency on libbacktrace.

Noticed the missing dependency when regenerating Makefile.in for an
unrelated change with 'autogen Makefile.def'.

The change was lost in r12-6861-gaeac414923aa1e ("Revert "Fix PR 67102:
Add libstdc++ dependancy to libffi" [PR67102]").

/
* Makefile.in: Regenerate.

rs6000: Add expand pattern for multiply-add (PR103109)

gcc/
PR target/103109
* config/rs6000/rs6000.md (<u>maddditi4): New pattern for multiply-add.
(<u>madddi4_highpart): New.
(<u>madddi4_highpart_le): New.

gcc/testsuite/
PR target/103109
* gcc.target/powerpc/pr103109.h: New.
* gcc.target/powerpc/pr103109-1.c: New.
* gcc.target/powerpc/pr103109-2.c: New.

Use gimple_range_ssa_names in path_range_query.

gcc/ChangeLog:

* gimple-range-path.cc
(path_range_query::compute_exit_dependencies): Use
gimple_range_ssa_names.

RISC-V: Add runtime invariant support

The RISC-V 'V' extension supports scalable vectors, like ARM SVE.
To support RVV, we need to introduce runtime invariants.

- For zve32*, the runtime invariant uses a 32-bit chunk.
- For zve64*, the runtime invariant uses a 64-bit chunk.

[1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#sec-vector-extensions

This patch prepares for RVV support.
Because we haven't introduced vector machine_modes yet, it is safe to just change HOST_WIDE_INT into poly_int.
It is also safe to use the "to_constant()" function for scalar operations.
This patch has been tested with a full DejaGnu regression run.
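
To illustrate the idea of a runtime invariant, here is a minimal sketch of a two-coefficient value of the form c0 + c1 * x, where x is only known at run time. The `poly2` type is a hypothetical model written for this note; it is not GCC's poly_int, although the method names are chosen to echo the "to_constant()" call mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a value c0 + c1 * x, where x is a quantity that is
// fixed at run time but unknown at compile time (e.g. a vector-length factor).
struct poly2 {
  std::int64_t c0;  // constant part
  std::int64_t c1;  // multiplier of the runtime quantity x

  // True when the value does not depend on x at all.
  bool is_constant() const { return c1 == 0; }

  // Only meaningful for values without a runtime part.
  std::int64_t to_constant() const {
    assert(is_constant());
    return c0;
  }

  // Evaluate once the runtime quantity is known.
  std::int64_t at(std::int64_t x) const { return c0 + c1 * x; }
};

// Addition works coefficient by coefficient.
poly2 operator+(poly2 a, poly2 b) { return {a.c0 + b.c0, a.c1 + b.c1}; }
```

In this model a purely scalar quantity always has c1 == 0, so extracting the constant cannot fail, which matches the reasoning that scalar code can keep treating such values as plain integers until vector modes exist.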

gcc/ChangeLog:

* config/riscv/predicates.md: Adjust runtime invariant.
* config/riscv/riscv-modes.def (MAX_BITSIZE_MODE_ANY_MODE): New.
(NUM_POLY_INT_COEFFS): New.
* config/riscv/riscv-protos.h (riscv_initial_elimination_offset): Adjust
runtime invariant.
* config/riscv/riscv-sr.cc (riscv_remove_unneeded_save_restore_calls):
Adjust runtime invariant.
* config/riscv/riscv.cc (struct riscv_frame_info): Adjust runtime
invariant.
(enum riscv_microarchitecture_type): Ditto.
(riscv_valid_offset_p): Ditto.
(riscv_valid_lo_sum_p): Ditto.
(riscv_address_insns): Ditto.
(riscv_load_store_insns): Ditto.
(riscv_legitimize_move): Ditto.
(riscv_binary_cost): Ditto.
(riscv_rtx_costs): Ditto.
(riscv_output_move): Ditto.
(riscv_extend_comparands): Ditto.
(riscv_flatten_aggregate_field): Ditto.
(riscv_get_arg_info): Ditto.
(riscv_pass_by_reference): Ditto.
(riscv_elf_select_rtx_section): Ditto.
(riscv_stack_align): Ditto.
(riscv_compute_frame_info): Ditto.
(riscv_initial_elimination_offset): Ditto.
(riscv_set_return_address): Ditto.
(riscv_for_each_saved_reg): Ditto.
(riscv_first_stack_step): Ditto.
(riscv_expand_prologue): Ditto.
(riscv_expand_epilogue): Ditto.
(riscv_can_use_return_insn): Ditto.
(riscv_secondary_memory_needed): Ditto.
(riscv_hard_regno_nregs): Ditto.
(riscv_convert_vector_bits): New.
(riscv_option_override): Adjust runtime invariant.
(riscv_promote_function_mode): Ditto.
* config/riscv/riscv.h (POLY_SMALL_OPERAND_P): New.
(BITS_PER_RISCV_VECTOR): New.
(BYTES_PER_RISCV_VECTOR): New.
* config/riscv/riscv.md: Adjust runtime invariant.

LoongArch: Get __tls_get_addr address through the GOT when PLT is disabled.

Fix an ICE with TLS GD/LD variables when compiling with -fno-plt.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_call_tls_get_addr):
Get __tls_get_addr address through the GOT when PLT is disabled.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/tls-gd-noplt.c: New test.

xtensa: Optimize stack pointer updates in function pro/epilogue under certain conditions

This patch enforces the use of "addmi" machine instruction instead of
addition/subtraction with two source registers for adjusting the stack
pointer, if the adjustment fits into a signed 16-bit value and is also a multiple
of 256.

    /* example */
    void test(void) {
      char buffer[4096];
      __asm__(""::"m"(buffer));
    }

    ;; before
    test:
        movi.n  a9, 1
        slli    a9, a9, 12
        sub     sp, sp, a9
        movi.n  a9, 1
        slli    a9, a9, 12
        add.n   sp, sp, a9
        addi    sp, sp, 0
        ret.n

    ;; after
    test:
        addmi   sp, sp, -0x1000
        addmi   sp, sp, 0x1000
        ret.n
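
The condition stated in the message (the adjustment fits into a signed 16-bit value and is a multiple of 256) can be sketched as a small predicate. The function name `fits_addmi` is hypothetical and is not taken from xtensa.cc; it only restates the two constraints from the text.

```cpp
// Hypothetical predicate restating the condition from the commit message:
// the stack adjustment must fit in a signed 16-bit value and be a multiple
// of 256 for a single "addmi" to be usable.
bool fits_addmi(long adjustment) {
  return adjustment >= -32768 && adjustment <= 32767
         && adjustment % 256 == 0;
}
```

For the 4096-byte buffer in the example, both -0x1000 and 0x1000 satisfy the predicate, whereas an adjustment of 4100 (not a multiple of 256) or 65536 (out of range) would not.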

gcc/ChangeLog:

* config/xtensa/xtensa.cc (xtensa_expand_prologue):
Use an "addmi" machine instruction for updating the stack pointer
rather than addition/subtraction via hard register A9, if the amount
of change satisfies the literal value conditions of that instruction
when the CALL0 ABI is used.
(xtensa_expand_epilogue): Ditto.
And also inhibit the stack pointer addition of constant zero.

Daily bump.

RISC-V/testsuite: Restrict remaining `fmin'/`fmax' tests to hard float

Complement commit 7915f6551343 ("RISC-V/testsuite: constraint some of
tests to hard_float") and also restrict the remaining `fmin'/`fmax'
tests to hard-float test configurations.

gcc/testsuite/
* gcc.target/riscv/fmax-snan.c: Add `dg-require-effective-target
hard_float'.
* gcc.target/riscv/fmaxf-snan.c: Likewise.
* gcc.target/riscv/fmin-snan.c: Likewise.
* gcc.target/riscv/fminf-snan.c: Likewise.

[Committed] PR target/106640: Fix use of XINT in TImode compute_convert_gain.

Thanks to Zdenek Sojka for reporting PR target/106640 where an RTL checking
build reveals a thinko in my recent patch to support TImode shifts/rotates
in STV. My "senior moment" was to inappropriately use XINT where I should
be using INTVAL of XEXP.

2022-08-17 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
PR target/106640
* config/i386/i386-features.cc
(timode_scalar_chain::compute_convert_gain): Replace incorrect use
of XINT with INTVAL (XEXP (src, 1)).

c++: Add new std::move test [PR67906]

As discussed in 67906, let's make sure we don't warn about a std::move
when initializing when there's a T(const T&&) ctor.

PR c++/67906

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/Wredundant-move11.C: New test.

Reset root oracle from path_oracle::reset_path.

When we cross a backedge in the path solver, we reset the path
relations and nuke the root oracle.  However, we forget to reset it
for the next path.  This is causing us to miss threads because
subsequent paths will have no root oracle to use.

With this patch we get 201 more threads in the threadfull passes in my
.ii files and 118 more overall (DOM gets fewer because threadfull runs
before).

Normally, I'd recommend this for the GCC 12 branch, but considering
how sensitive other passes are to jump threading, and that there is no
PR associated with this, perhaps we should leave this out.  Up to the
release maintainers of course.

gcc/ChangeLog:

* gimple-range-path.cc
(path_range_query::compute_ranges_in_block): Remove
set_root_oracle call.
(path_range_query::compute_ranges): Pass ranger oracle to
reset_path.
* value-relation.cc (path_oracle::reset_path): Set root oracle.
* value-relation.h (path_oracle::reset_path): Add root oracle
argument.

c++: Extend -Wredundant-move for const-qual objects [PR90428]

In this PR, Jon suggested extending the -Wredundant-move warning
to warn when the user is moving a const object as in:

  struct T { };

  T f(const T& t)
  {
    return std::move(t);
  }

where the std::move is redundant, because T does not have
a T(const T&&) constructor (which is very unlikely).  Even with
the std::move, T(T&&) would not be used because it would mean
losing the const.  Instead, T(const T&) will be called.
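
The overload-resolution argument above can be checked with a small counting sketch. The counters `copies` and `moves` are additions made for this illustration and are not part of the example in the message.

```cpp
#include <utility>

// T with instrumented constructors; the static counters are assumptions
// added for this illustration.
struct T {
  static int copies;
  static int moves;
  T() = default;
  T(const T &) { ++copies; }
  T(T &&) noexcept { ++moves; }
};
int T::copies = 0;
int T::moves = 0;

T f(const T &t) {
  // std::move(t) has type const T&&, which cannot bind to T&&,
  // so overload resolution selects T(const T&).
  return std::move(t);
}
```

Calling `f` on any object increments `T::copies` and leaves `T::moves` untouched, which is exactly why the std::move adds nothing here.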

I had to restructure the function a bit, but it's better now.  This patch
depends on my other recent patches to maybe_warn_pessimizing_move.

PR c++/90428

gcc/cp/ChangeLog:

* typeck.cc (can_do_rvo_p): Rename to ...
(can_elide_copy_prvalue_p): ... this.
(maybe_warn_pessimizing_move): Extend the
-Wredundant-move warning to warn about std::move on a
const-qualified object.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/Wredundant-move1.C: Adjust dg-warning.
* g++.dg/cpp0x/Wredundant-move9.C: Likewise.
* g++.dg/cpp0x/Wredundant-move10.C: New test.

c++: Tweak for -Wpessimizing-move in templates [PR89780]

In my previous patches I've been extending our std::move warnings,
but this tweak actually dials it down a little bit.  As reported in
bug 89780, it's questionable to warn about expressions in templates
that were type-dependent, but aren't anymore because we're instantiating
the template.  As in,

  template <typename T>
  Dest withMove() {
    T x;
    return std::move(x);
  }

  template Dest withMove<Dest>(); // #1
  template Dest withMove<Source>(); // #2

Saying that the std::move is pessimizing for #1 is not incorrect, but
it's not useful, because removing the std::move would then pessimize #2.
So the user can't really win.  At the same time, disabling the warning
just because we're in a template would be going too far; I still want to
warn for

  template <typename>
  Dest withMove() {
    Dest x;
    return std::move(x);
  }

because the std::move therein will be pessimizing for any instantiation.
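
The snippets above leave `Dest` and `Source` undefined. One way to make the first template concrete is sketched below; the class definitions and counters are assumptions made for this illustration (the actual testcases may define them differently).

```cpp
#include <utility>

struct Source {};

// Assumed definition: Dest can be built from a Source, and counts which
// constructor the return statement ends up using.
struct Dest {
  static int moved_from_dest;
  static int moved_from_source;
  Dest() = default;
  Dest(const Dest &) = default;
  Dest(Dest &&) noexcept { ++moved_from_dest; }
  Dest(const Source &) {}
  Dest(Source &&) noexcept { ++moved_from_source; }
};
int Dest::moved_from_dest = 0;
int Dest::moved_from_source = 0;

template <typename T>
Dest withMove() {
  T x;
  return std::move(x);
}
```

With `T = Dest` the return statement calls `Dest(Dest&&)`, a move that returning the local directly would normally let the compiler elide. With `T = Source` the same std::move selects the converting constructor `Dest(Source&&)`, which is the case the message says the user would not want to lose.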

So I'm using the suppress_warning machinery to that effect.
Problem: I had to add a new group to nowarn_spec_t, otherwise
suppressing the -Wpessimizing-move warning would disable a whole bunch
of other warnings, which we really don't want.

PR c++/89780

gcc/cp/ChangeLog:

* pt.cc (tsubst_copy_and_build) <case CALL_EXPR>: Maybe suppress
-Wpessimizing-move.
* typeck.cc (maybe_warn_pessimizing_move): Don't issue warnings
if they are suppressed.
(check_return_expr): Disable -Wpessimizing-move when returning
a dependent expression.

gcc/ChangeLog:

* diagnostic-spec.cc (nowarn_spec_t::nowarn_spec_t): Handle
OPT_Wpessimizing_move and OPT_Wredundant_move.
* diagnostic-spec.h (nowarn_spec_t): Add NW_REDUNDANT enumerator.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/Wpessimizing-move3.C: Remove dg-warning.
* g++.dg/cpp0x/Wredundant-move2.C: Likewise.

c++: Extend -Wpessimizing-move to other contexts

In my recent patch which enhanced -Wpessimizing-move so that it warns
about class prvalues too I said that I'd like to extend it so that it
warns in more contexts where a std::move can prevent copy elision, such
as:

  T t = std::move(T());
  T t(std::move(T()));
  T t{std::move(T())};
  T t = {std::move(T())};
  void foo (T);
  foo (std::move(T()));

This patch does that by adding two maybe_warn_pessimizing_move calls.
These must happen before we've converted the initializers otherwise the
std::move will be buried in a TARGET_EXPR.
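
The cost of such a std::move can be observed with a counting sketch. The `moves` counter and the two helper functions are hypothetical additions for this illustration; the zero count in the second helper assumes copy elision, which is guaranteed from C++17 on and commonly performed in earlier modes.

```cpp
#include <utility>

// T with an instrumented move constructor; the counter is an assumption
// added for this illustration.
struct T {
  static int moves;
  T() = default;
  T(const T &) = default;
  T(T &&) noexcept { ++moves; }
};
int T::moves = 0;

// Initializing from std::move(T()) forces a move-constructor call.
int moves_with_std_move() {
  T::moves = 0;
  T t = std::move(T());
  (void)t;
  return T::moves;
}

// Initializing directly from the prvalue lets the temporary be elided.
int moves_without_std_move() {
  T::moves = 0;
  T t = T();
  (void)t;
  return T::moves;
}
```

The first helper reports one move and the second reports none under the stated assumption, which is the pessimization the extended warning is meant to point out.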

PR c++/106276

gcc/cp/ChangeLog:

* call.cc (build_over_call): Call maybe_warn_pessimizing_move.
* cp-tree.h (maybe_warn_pessimizing_move): Declare.
* decl.cc (build_aggr_init_full_exprs): Call
maybe_warn_pessimizing_move.
* typeck.cc (maybe_warn_pessimizing_move): Handle TREE_LIST and
CONSTRUCTOR.  Add a bool parameter and use it.  Adjust a diagnostic
message.
(check_return_expr): Adjust the call to maybe_warn_pessimizing_move.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/Wpessimizing-move7.C: Add dg-warning.
* g++.dg/cpp0x/Wpessimizing-move8.C: New test.

fortran: Add -static-libquadmath support [PR46539]

The following patch is a revival of the
https://gcc.gnu.org/legacy-ml/gcc-patches/2014-10/msg00771.html
patch.  While trunk configured against recent glibc and with linker
--as-needed support doesn't really need to link against -lquadmath
anymore, there are other targets where libquadmath is still in
use.
As has been discussed, making -static-libgfortran imply statically
linking both libgfortran and libquadmath is undesirable because of
the significant licensing differences between the two libraries.
Compared to the 2014 patch, this one doesn't handle -lquadmath
addition in the driver, which looks incorrect to me: libgfortran
configure determines where in libgfortran.spec -lquadmath should
be present, if at all, and what it should be wrapped with.  Instead,
this patch analyzes the stderr of gfortran -### -static-libgfortran
and based on that figures out what gcc/configure.ac determined.

2022-08-17  Francois-Xavier Coudert  <fxcoudert@gcc.gnu.org>
    Jakub Jelinek  <jakub@redhat.com>

PR fortran/46539
gcc/
* common.opt (static-libquadmath): New option.
* gcc.cc (driver_handle_option): Always accept -static-libquadmath.
* config/darwin.h (LINK_SPEC): Handle -static-libquadmath.
gcc/fortran/
* lang.opt (static-libquadmath): New option.
* invoke.texi (-static-libquadmath): Document it.
* options.cc (gfc_handle_option): Error out if -static-libquadmath
is passed but we do not support it.
libgfortran/
* acinclude.m4 (LIBQUADSPEC): From $FC -static-libgfortran -###
output determine -Bstatic/-Bdynamic, -bstatic/-bdynamic,
-aarchive_shared/-adefault linker support or Darwin remapping
of -lgfortran to libgfortran.a%s and use that around or instead
of -lquadmath in LIBQUADSPEC.
* configure: Regenerated.

Co-Authored-By: Francois-Xavier Coudert <fxcoudert@gcc.gnu.org>

Fortran: OpenMP fix declare simd inside modules and absent linear step [PR106566]

gcc/fortran/ChangeLog:

PR fortran/106566
* openmp.cc (gfc_match_omp_clauses): Fix setting linear-step value
to 1 when not specified.
(gfc_match_omp_declare_simd): Accept module procedures.

gcc/testsuite/ChangeLog:

PR fortran/106566
* gfortran.dg/gomp/declare-simd-4.f90: New test.
* gfortran.dg/gomp/declare-simd-5.f90: New test.
* gfortran.dg/gomp/declare-simd-6.f90: New test.

OpenMP requires: Fix diagnostic filename corner case

The issue occurs when there is, e.g., main._omp_fn.0 in two files with
different OpenMP requires clauses. The function entries in the offload
table end up having the same decl tree and, hence, the diagnostic showed
the same filename for both. Solution: Use the .o filename in this case.

Note that the issue does not occur with same-named 'static' functions and
that, without the fatal error from the requires diagnostic, there would
later be a linker error due to having two 'main'.

gcc/
* lto-cgraph.cc (input_offload_tables): Improve requires diagnostic
when filenames come out identically.

OpenMP: Fix var replacement with 'simd' and linear-step vars [PR106548]

gcc/ChangeLog:

PR middle-end/106548
* omp-low.cc (lower_rec_input_clauses): Use build_outer_var_ref
for 'simd' linear-step values that are variable.

libgomp/ChangeLog:

PR middle-end/106548
* testsuite/libgomp.c/linear-2.c: New test.

libgomp/splay-tree.h: Fix splay_tree_prefix handling

When splay_tree_prefix is defined, the .h file
defines splay_* macros to add the prefix. However,
previously those were only undefined when splay_tree_c
was also defined.
Additionally, for consistency, undefine splay_tree_c
also when no splay_tree_prefix is defined - there
is no interdependence either.

libgomp/ChangeLog:

* splay-tree.h: Fix splay_* macro unsetting if
splay_tree_prefix is defined.

OpenMP/C++: Allow classes with static members to be mappable [PR104493]

As this is the last lang-specific user of the omp_mappable_type hook,
the hook is removed, keeping only a generic omp_mappable_type for
incomplete types (or error_node).
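
A minimal sketch of the kind of code this change is meant to accept.
It is not taken from the testsuite; Samples and sum_on_device are
hypothetical names.  When compiled without -fopenmp the pragma is
ignored and the loop simply runs on the host:

```cpp
#include <cassert>

// A class with a static data member, used below in a map clause.
struct Samples {
  static int instances;
  int data[4];
};
int Samples::instances = 0;

// Sum the array inside a target region, mapping the whole object.
int sum_on_device(Samples& s) {
  int total = 0;
  #pragma omp target map(tofrom: s, total)
  for (int i = 0; i < 4; ++i)
    total += s.data[i];
  return total;
}
```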

PR c++/104493

gcc/c/ChangeLog:

* c-decl.cc (c_decl_attributes, finish_decl): Call omp_mappable_type
instead of removed langhook.
* c-typeck.cc (c_finish_omp_clauses): Likewise.

gcc/cp/ChangeLog:

* cp-objcp-common.h (LANG_HOOKS_OMP_MAPPABLE_TYPE): Remove.
* cp-tree.h (cp_omp_mappable_type, cp_omp_emit_unmappable_type_notes):
Remove.
* decl2.cc (cp_omp_mappable_type_1, cp_omp_mappable_type,
cp_omp_emit_unmappable_type_notes): Remove.
(cplus_decl_attributes): Call omp_mappable_type instead of
removed langhook.
* decl.cc (cp_finish_decl): Likewise; call cxx_incomplete_type_inform
in lieu of cp_omp_emit_unmappable_type_notes.
* semantics.cc (finish_omp_clauses): Likewise.

gcc/ChangeLog:

* gimplify.cc (omp_notice_variable): Call omp_mappable_type
instead of removed langhook.
* omp-general.h (omp_mappable_type): New prototype.
* omp-general.cc (omp_mappable_type): New; moved from ...
* langhooks.cc (lhd_omp_mappable_type): ... here.
* langhooks-def.h (lhd_omp_mappable_type,
LANG_HOOKS_OMP_MAPPABLE_TYPE): Remove.
(LANG_HOOKS_FOR_TYPES_INITIALIZER): Remove the latter.
* langhooks.h (struct lang_hooks_for_types): Remove
omp_mappable_type.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/unmappable-1.C: Remove dg-error; remove dg-note no
longer shown as TYPE_MAIN_DECL is NULL.
* c-c++-common/gomp/map-incomplete-type.c: New test.

Co-authored-by: Chung-Lin Tang <cltang@codesourcery.com>

arm: Define with_float to hard when target name ends with hf

On arm, the --with-float= configure option is used to define include
files search path (among other things).  However, when targeting
arm-linux-gnueabihf, one would expect to automatically default to the
hard-float ABI, but this is not the case. As a consequence, GCC
bootstrap fails on an arm-linux-gnueabihf target if --with-float=hard
is not used.

This patch checks if the target name ends with 'hf' and defines
with_float to hard if not already defined.  This is achieved in
gcc/config.gcc, just before selecting the default CPU depending on the
$with_float value.

2022-08-17  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config.gcc (arm): Define with_float to hard if target name ends
with 'hf'.

Refactor back_threader_profitability

The following refactors profitable_path_p in the backward threader,
splitting out parts that can be computed once the exit block is known,
parts that continuously update and that can be checked allowing
for the path to be later identified as FSM with larger limits,
possibly_profitable_path_p, and final checks done when the whole
path is known, profitable_path_p.

I've removed the back_threader_profitability instance from the
back_threader class and instead instantiate it once per path
discovery.  I've kept the size compute non-incremental to simplify
the patch and not worry about unwinding.

There are key changes to previous behavior - namely we apply
the param_max_jump_thread_duplication_stmts early only when
we know the path cannot become an FSM one (multiway + thread through
latch) but make sure to elide the path query when we didn't
yet discover that but are over this limit.  Similarly the
speed limit is now used even when we did not yet discover a
hot BB on the path.  Basically the idea is to only stop path
discovery when we know the path will never become profitable
but avoid the expensive path range query when we know it's
currently not.

I've done a few cleanups, merging functions, on the way.

* tree-ssa-threadbackward.cc
(back_threader_profitability): Split profitable_path_p
into possibly_profitable_path_p and itself, keep state
as new members.
(back_threader::m_profit): Remove.
(back_threader::find_paths): Likewise.
(back_threader::maybe_register_path): Take profitability
instance as parameter.
(back_threader::find_paths_to_names): Likewise.  Use
possibly_profitable_path_p and avoid the path range query
when the path is currently too large.
(back_threader::find_paths): Fold into ...
(back_threader::maybe_thread_block): ... this.
(get_gimple_control_stmt): Remove.
(back_threader_profitability::possibly_profitable_path_p):
Split out from profitable_path_p, do early profitability
checks.
(back_threader_profitability::profitable_path_p): Do final
profitability path after the taken edge has been determined.

Fix bug in emergency cxa pool free

This probably has never actually affected anyone in practice. The normal
ABI implementation just uses malloc and only falls back to the pool on
malloc failure. But if that happens a bunch of times the freelist gets out
of order which violates some of the invariants of the freelist (as well as
the comments that follow the bug). The bug is just a comparison reversal
when traversing the freelist in the case where the pointer being returned
to the pool is after the existing freelist.
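
For illustration only, the invariant in question can be sketched as an
address-ordered singly linked free list.  This is not the libsupc++
code; FreeEntry, insert_sorted and is_address_ordered are hypothetical
names.  The direction of the comparison in the traversal loop is what
keeps the list sorted, including when the returned entry lies after
every existing one:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

struct FreeEntry {
  std::size_t size;
  FreeEntry* next;
};

// Insert e so that the list stays sorted by ascending address and
// return the (possibly new) head.
FreeEntry* insert_sorted(FreeEntry* head, FreeEntry* e) {
  std::less<FreeEntry*> before;  // total order on pointers
  FreeEntry** link = &head;
  // Skip every entry located before e.  Reversing this comparison
  // would put e at the wrong position and break the ordering.
  while (*link && before(*link, e))
    link = &(*link)->next;
  e->next = *link;
  *link = e;
  return head;
}

// Check the invariant: each entry is located before its successor.
bool is_address_ordered(const FreeEntry* head) {
  std::less<const FreeEntry*> before;
  for (; head && head->next; head = head->next)
    if (!before(head, head->next))
      return false;
  return true;
}
```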

libstdc++-v3/
* libsupc++/eh_alloc.cc (pool::free): Invert comparison.

LoongArch: Provide fmin/fmax RTL pattern

We already had smin/smax RTL patterns using the fmin/fmax instructions. But
for smin/smax, it's unspecified what will happen if either operand is
NaN. So we would generate calls to libc fmin/fmax functions with
-fno-finite-math-only (the default for all optimization levels except
-Ofast).

However, the LoongArch fmin/fmax instructions are IEEE-754-2008
conformant, so we can also use them for the fmin/fmax patterns and avoid
the library function call.
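
The NaN difference can be seen in a small sketch, which is
illustrative only.  min_by_compare and min_ieee are hypothetical
helpers, and the conditional expression merely stands in for one
possible NaN-unaware minimum.  std::fmin returns the non-NaN operand,
while the comparison-based minimum depends on operand order:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// NaN-unaware minimum: any comparison with NaN is false, so the
// second operand is returned whenever either operand is NaN.
double min_by_compare(double a, double b) { return a < b ? a : b; }

// fmin semantics: if exactly one operand is NaN, return the other.
double min_ieee(double a, double b) { return std::fmin(a, b); }
```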

gcc/ChangeLog:

* config/loongarch/loongarch.md (fmax<mode>3): New RTL pattern.
(fmin<mode>3): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/fmax-fmin.c: New test.

Abstract interesting ssa-names from GORI.

Provide a routine to pick out the ssa-names from interesting statements.

* gimple-range-fold.cc (gimple_range_ssa_names): New.
* gimple-range-fold.h (gimple_range_ssa_names): New prototype.
* gimple-range-gori.cc (range_def_chain::get_def_chain): Move
code to new routine.

Daily bump.

c++: remove some xfails

These tests are now passing.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wstringop-overflow-4.C: Only xfail for C++98.
* g++.target/i386/bfloat_cpp_typecheck.C: Remove xfail.

c++: Fix pragma suppression of -Wc++20-compat diagnostics [PR106423]

GCC's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also
implicitly disable the option in C++20 and later modes.  These changes
are consistent with the definition of the -Wc++11-compat option.

This support is motivated by the need to suppress the following diagnostic
otherwise issued in C++17 and earlier modes due to the char8_t typedef
present in the uchar.h header file in glibc 2.36.
  warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]
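
A minimal sketch of the suppression this enables; it mimics the kind
of typedef described above rather than including the real header, and
utf8_unit_size is a hypothetical helper.  The typedef is only
compiled when char8_t is not already a keyword:

```cpp
#include <cassert>
#include <cstddef>

#ifndef __cpp_char8_t
// char8_t is an ordinary identifier in this mode, so a header may
// typedef it.  Silence the compatibility warning around the typedef.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wc++20-compat"
typedef unsigned char char8_t;
#pragma GCC diagnostic pop
#endif

// Either way char8_t names a one-byte character type.
std::size_t utf8_unit_size() { return sizeof(char8_t); }
```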

Tests are added to validate suppression of both -Wc++11-compat and
-Wc++20-compat related diagnostics (fixes were only needed for the C++20
case).

PR c++/106423

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Disable -Wc++20-compat
diagnostics in C++20 and later.
* c.opt (Wc++20-compat): Enable hooks for the preprocessor.

gcc/cp/ChangeLog:
* parser.cc (cp_lexer_saving_tokens): Add comment regarding
diagnostic requirements.

gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/keywords2.C: New test.
* g++.dg/cpp2a/keywords2.C: New test.

libcpp/ChangeLog:
* include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT.
* init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat.

docs: remove link to bullfreeware.com from install

As mentioned at https://gcc.gnu.org/PR106637#c2, they discontinued
providing binaries.

PR target/106637

gcc/ChangeLog:

* doc/install.texi: Remove link to www.bullfreeware.com

RISC-V: Support zfh and zfhmin extension

Zfh and Zfhmin are extensions for IEEE half precision; both were ratified
in Jan. 2022[1]:

- Zfh has a full set of operations, like F or D for single or double precision.
- Zfhmin provides only minimal support for half precision operations,
like conversion, load, store and move instructions.

[1] https://github.com/riscv/riscv-isa-manual/commit/b35a54079e0da11740ce5b1e6db999d1d5172768

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_implied_info): Add
zfh and zfhmin.
(riscv_ext_version_table): Ditto.
(riscv_ext_flag_table): Ditto.
* config/riscv/riscv-opts.h (MASK_ZFHMIN): New.
(MASK_ZFH): Ditto.
(TARGET_ZFHMIN): Ditto.
(TARGET_ZFH): Ditto.
* config/riscv/riscv.cc (riscv_output_move): Handle HFmode move
for zfh and zfhmin.
(riscv_emit_float_compare): Handle HFmode.
* config/riscv/riscv.md (ANYF): Add HF.
(SOFTF): Add HF.
(load): Ditto.
(store): Ditto.
(truncsfhf2): New.
(truncdfhf2): Ditto.
(extendhfsf2): Ditto.
(extendhfdf2): Ditto.
(*movhf_hardfloat): Ditto.
(*movhf_softfloat): Make sure not ZFHMIN.
* config/riscv/riscv.opt (riscv_zf_subext): New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/_Float16-zfh-1.c: New.
* gcc.target/riscv/_Float16-zfh-2.c: Ditto.
* gcc.target/riscv/_Float16-zfh-3.c: Ditto.
* gcc.target/riscv/_Float16-zfhmin-1.c: Ditto.
* gcc.target/riscv/_Float16-zfhmin-2.c: Ditto.
* gcc.target/riscv/_Float16-zfhmin-3.c: Ditto.
* gcc.target/riscv/arch-16.c: Ditto.
* gcc.target/riscv/arch-17.c: Ditto.
* gcc.target/riscv/predef-21.c: Ditto.
* gcc.target/riscv/predef-22.c: Ditto.

RISC-V: Support _Float16 type.

RISC-V decided to use _Float16 as the primary IEEE half precision type,
and this has already become part of the psABI. This patch adds the
following support for _Float16:

- Soft-float support for _Float16.
- Make sure _Float16 is available in C++ mode.
- Name mangling for _Float16 in C++ mode.

gcc/ChangeLog

* config/riscv/riscv-builtins.cc: include stringpool.h
(riscv_float16_type_node): New.
(riscv_init_builtin_types): Ditto.
(riscv_init_builtins): Call riscv_init_builtin_types.
* config/riscv/riscv-modes.def (HF): New.
* config/riscv/riscv.cc (riscv_output_move): Handle HFmode.
(riscv_mangle_type): New.
(riscv_scalar_mode_supported_p): Ditto.
(riscv_libgcc_floating_mode_supported_p): Ditto.
(riscv_excess_precision): Ditto.
(riscv_floatn_mode): Ditto.
(riscv_init_libfuncs): Ditto.
(TARGET_MANGLE_TYPE): Ditto.
(TARGET_SCALAR_MODE_SUPPORTED_P): Ditto.
(TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P): Ditto.
(TARGET_INIT_LIBFUNCS): Ditto.
(TARGET_C_EXCESS_PRECISION): Ditto.
(TARGET_FLOATN_MODE): Ditto.
* config/riscv/riscv.md (mode): Add HF.
(softload): Add HF.
(softstore): Ditto.
(fmt): Ditto.
(UNITMODE): Ditto.
(movhf): New.
(*movhf_softfloat): New.

libgcc/ChangeLog:

* config/riscv/sfp-machine.h (_FP_NANFRAC_H): New.
(_FP_NANFRAC_H): Ditto.
(_FP_NANSIGN_H): Ditto.
* config/riscv/t-softfp32 (softfp_extensions): Add HF related
routines.
(softfp_truncations): Ditto.
(softfp_extras): Ditto.
* config/riscv/t-softfp64 (softfp_extras): Add HF related routines.

gcc/testsuite/ChangeLog:

* g++.target/riscv/_Float16.C: New.
* gcc.target/riscv/_Float16-soft-1.c: Ditto.
* gcc.target/riscv/_Float16-soft-2.c: Ditto.
* gcc.target/riscv/_Float16-soft-3.c: Ditto.
* gcc.target/riscv/_Float16-soft-4.c: Ditto.
* gcc.target/riscv/_Float16.c: Ditto.

soft-fp: Update soft-fp from glibc

This patch updates all of soft-fp from glibc. Most changes are
copyright year updates, removal of "Contributed by" lines and license URL
updates. The remaining changes add conversion functions between IEEE half
and 32-bit/64-bit integers; those functions are required by RISC-V
_Float16 support.

libgcc/ChangeLog:

* soft-fp/fixhfdi.c: New.
* soft-fp/fixhfsi.c: Likewise.
* soft-fp/fixunshfdi.c: Likewise.
* soft-fp/fixunshfsi.c: Likewise.
* soft-fp/floatdihf.c: Likewise.
* soft-fp/floatsihf.c: Likewise.
* soft-fp/floatundihf.c: Likewise.
* soft-fp/floatunsihf.c: Likewise.
* soft-fp/adddf3.c: Updating copyright years, removing "Contributed by"
lines and update URL for license.
* soft-fp/addsf3.c: Likewise.
* soft-fp/addtf3.c: Likewise.
* soft-fp/divdf3.c: Likewise.
* soft-fp/divsf3.c: Likewise.
* soft-fp/divtf3.c: Likewise.
* soft-fp/double.h: Likewise.
* soft-fp/eqdf2.c: Likewise.
* soft-fp/eqhf2.c: Likewise.
* soft-fp/eqsf2.c: Likewise.
* soft-fp/eqtf2.c: Likewise.
* soft-fp/extenddftf2.c: Likewise.
* soft-fp/extended.h: Likewise.
* soft-fp/extendhfdf2.c: Likewise.
* soft-fp/extendhfsf2.c: Likewise.
* soft-fp/extendhftf2.c: Likewise.
* soft-fp/extendhfxf2.c: Likewise.
* soft-fp/extendsfdf2.c: Likewise.
* soft-fp/extendsftf2.c: Likewise.
* soft-fp/extendxftf2.c: Likewise.
* soft-fp/fixdfdi.c: Likewise.
* soft-fp/fixdfsi.c: Likewise.
* soft-fp/fixdfti.c: Likewise.
* soft-fp/fixhfti.c: Likewise.
* soft-fp/fixsfdi.c: Likewise.
* soft-fp/fixsfsi.c: Likewise.
* soft-fp/fixsfti.c: Likewise.
* soft-fp/fixtfdi.c: Likewise.
* soft-fp/fixtfsi.c: Likewise.
* soft-fp/fixtfti.c: Likewise.
* soft-fp/fixunsdfdi.c: Likewise.
* soft-fp/fixunsdfsi.c: Likewise.
* soft-fp/fixunsdfti.c: Likewise.
* soft-fp/fixunshfti.c: Likewise.
* soft-fp/fixunssfdi.c: Likewise.
* soft-fp/fixunssfsi.c: Likewise.
* soft-fp/fixunssfti.c: Likewise.
* soft-fp/fixunstfdi.c: Likewise.
* soft-fp/fixunstfsi.c: Likewise.
* soft-fp/fixunstfti.c: Likewise.
* soft-fp/floatdidf.c: Likewise.
* soft-fp/floatdisf.c: Likewise.
* soft-fp/floatditf.c: Likewise.
* soft-fp/floatsidf.c: Likewise.
* soft-fp/floatsisf.c: Likewise.
* soft-fp/floatsitf.c: Likewise.
* soft-fp/floattidf.c: Likewise.
* soft-fp/floattihf.c: Likewise.
* soft-fp/floattisf.c: Likewise.
* soft-fp/floattitf.c: Likewise.
* soft-fp/floatundidf.c: Likewise.
* soft-fp/floatundisf.c: Likewise.
* soft-fp/floatunditf.c: Likewise.
* soft-fp/floatunsidf.c: Likewise.
* soft-fp/floatunsisf.c: Likewise.
* soft-fp/floatunsitf.c: Likewise.
* soft-fp/floatuntidf.c: Likewise.
* soft-fp/floatuntihf.c: Likewise.
* soft-fp/floatuntisf.c: Likewise.
* soft-fp/floatuntitf.c: Likewise.
* soft-fp/gedf2.c: Likewise.
* soft-fp/gesf2.c: Likewise.
* soft-fp/getf2.c: Likewise.
* soft-fp/half.h: Likewise.
* soft-fp/ledf2.c: Likewise.
* soft-fp/lesf2.c: Likewise.
* soft-fp/letf2.c: Likewise.
* soft-fp/muldf3.c: Likewise.
* soft-fp/mulsf3.c: Likewise.
* soft-fp/multf3.c: Likewise.
* soft-fp/negdf2.c: Likewise.
* soft-fp/negsf2.c: Likewise.
* soft-fp/negtf2.c: Likewise.
* soft-fp/op-1.h: Likewise.
* soft-fp/op-2.h: Likewise.
* soft-fp/op-4.h: Likewise.
* soft-fp/op-8.h: Likewise.
* soft-fp/op-common.h: Likewise.
* soft-fp/quad.h: Likewise.
* soft-fp/single.h: Likewise.
* soft-fp/soft-fp.h: Likewise.
* soft-fp/subdf3.c: Likewise.
* soft-fp/subsf3.c: Likewise.
* soft-fp/subtf3.c: Likewise.
* soft-fp/truncdfhf2.c: Likewise.
* soft-fp/truncdfsf2.c: Likewise.
* soft-fp/truncsfhf2.c: Likewise.
* soft-fp/trunctfdf2.c: Likewise.
* soft-fp/trunctfhf2.c: Likewise.
* soft-fp/trunctfsf2.c: Likewise.
* soft-fp/trunctfxf2.c: Likewise.
* soft-fp/truncxfhf2.c: Likewise.
* soft-fp/unorddf2.c: Likewise.
* soft-fp/unordsf2.c: Likewise.
* soft-fp/unordtf2.c: Likewise.

Stop backwards thread discovery when leaving a loop

The backward threader copier cannot deal with the situation of
copying blocks belonging to different loops and will reject those
paths late. The following uses this to prune path discovery,
saving on compile-time. Note the off-loop block is still considered
as entry edge origin.

* tree-ssa-threadbackward.cc (back_threader::find_paths_to_names):
Do not walk further if we are leaving the current loop.