review.tizen.org Git - platform/upstream/gcc.git/log

vect: Simplify get_initial_def_for_reduction

After previous patches, we can now easily provide the neutral op
as an argument to get_initial_def_for_reduction.  This in turn
allows the adjustment calculation to be moved outside of
get_initial_def_for_reduction, which is the main motivation
of the patch.

gcc/
* tree-vect-loop.c (get_initial_def_for_reduction): Remove
adjustment handling.  Take the neutral value as an argument,
in place of the code argument.
(vect_transform_cycle_phi): Update accordingly.  Handle the
initial values of cond reductions separately from code reductions.
Choose the adjustment here rather than in
get_initial_def_for_reduction.  Sink the splat of vec_initial_def.

vect: Generalise neutral_op_for_slp_reduction

This patch generalises the interface to neutral_op_for_slp_reduction
so that it can be used for non-SLP reductions too. This isn't much
of a win on its own, but it helps later patches.

gcc/
* tree-vect-loop.c (neutral_op_for_slp_reduction): Replace with...
(neutral_op_for_reduction): ...this, providing a more general
interface.
(vect_create_epilog_for_reduction): Update accordingly.
(vectorizable_reduction): Likewise.
(vect_transform_cycle_phi): Likewise.

vect: Pass reduc_info to get_initial_def_for_reduction

Similarly to the previous patch, this one passes the reduc_info
to get_initial_def_for_reduction, rather than a stmt_vec_info that
lacks the metadata. This again becomes useful later.

gcc/
* tree-vect-loop.c (get_initial_def_for_reduction): Take the
reduc_info instead of the original stmt_vec_info.
(vect_transform_cycle_phi): Update accordingly.

vect: Pass reduc_info to get_initial_defs_for_reduction

This patch passes the reduc_info to get_initial_defs_for_reduction,
so that the function can get general information from there rather
than from the first SLP statement. This isn't a win on its own,
but it becomes important with later patches.

gcc/
* tree-vect-loop.c (get_initial_defs_for_reduction): Take the
reduc_info as an additional parameter.
(vect_transform_cycle_phi): Update accordingly.

vect: Add a vect_phi_initial_value helper function

This patch adds a helper function called vect_phi_initial_value
for returning the incoming value of a given loop phi. The main
reason for adding it is to ensure that the right preheader edge
is used when vectorising nested loops. (PHI_ARG_DEF_FROM_EDGE
itself doesn't assert that the given edge is for the right block,
although I guess that would be good to add separately.)

gcc/
* tree-vectorizer.h: Include tree-ssa-operands.h.
(vect_phi_initial_value): New function.
* tree-vect-loop.c (neutral_op_for_slp_reduction): Use it.
(get_initial_defs_for_reduction, info_for_reduction): Likewise.
(vect_create_epilog_for_reduction, vectorizable_reduction): Likewise.
(vect_transform_cycle_phi, vectorizable_induction): Likewise.

vect: Ensure reduc_inputs always have vectype

Vector reduction accumulators can differ in signedness from the
final scalar result. The conversions to handle that case were
distributed through vect_create_epilog_for_reduction; this patch
does the conversion up-front instead.

gcc/
* tree-vect-loop.c (vect_create_epilog_for_reduction): Convert
the phi results to vectype after creating them. Remove later
conversion code that thus becomes redundant.

vect: Remove new_phis from vect_create_epilog_for_reduction

vect_create_epilog_for_reduction had a variable called new_phis.
It collected the statements that produce the exit block definitions
of the vector reduction accumulators.  Although those statements
are indeed phis initially, they are often replaced with normal
statements later, leading to puzzling code like:

          FOR_EACH_VEC_ELT (new_phis, i, new_phi)
            {
              int bit_offset;
              if (gimple_code (new_phi) == GIMPLE_PHI)
                vec_temp = PHI_RESULT (new_phi);
              else
                vec_temp = gimple_assign_lhs (new_phi);

Also, although the array collects statements, in practice all users want
the lhs instead.

This patch therefore replaces new_phis with a vector of gimple values
called “reduc_inputs”.

Also, reduction chains and ncopies>1 were handled with identical code
(and there was a comment saying so).  The patch unites them into
a single “if”.

gcc/
* tree-vect-loop.c (vect_create_epilog_for_reduction): Replace
the new_phis vector with a reduc_inputs vector.  Combine handling
of reduction chains and ncopies > 1.

vect: Create array_slice of live-out stmts

This patch constructs an array_slice of the scalar statements that
produce live-out reduction results in the original unvectorised loop.
There are three cases:

- SLP reduction chains: the final SLP stmt is live-out
- full SLP reductions: all SLP stmts are live-out
- non-SLP reductions: the single scalar stmt is live-out

This is a slight simplification on its own, mostly because it maans
“group_size” has a consistent meaning throughout the function.
The main justification though is that it helps with later patches.

gcc/
* tree-vect-loop.c (vect_create_epilog_for_reduction): Truncate
scalar_results to group_size elements after reducing down from
N*group_size elements. Construct an array_slice of the live-out
stmts and assert that there is one stmt per scalar result.

vect: Simplify epilogue reduction code

vect_create_epilog_for_reduction only handles two cases: single-loop
reductions and double reductions. “nested cycles” (i.e. reductions
in the inner loop when vectorising an outer loop) are handled elsewhere
and don't need a vector->scalar reduction.

The function had variables called nested_in_vect_loop and double_reduc
and asserted that nested_in_vect_loop implied double_reduc, but it
still had code to handle nested_in_vect_loop && !double_reduc.
This patch removes that and uses double_reduc everywhere.

gcc/
* tree-vect-loop.c (vect_create_epilog_for_reduction): Remove
nested_in_vect_loop and use double_reduc everywhere. Remove dead
assignment to "loop".

ifcvt: Improve tests for predicated operations

-msve-vector-bits=128 causes the AArch64 port to list 128-bit Advanced
SIMD as the first-choice mode for vectorisation, with SVE being used for
things that Advanced SIMD can't handle as easily. However, ifcvt would
not then try to use SVE's predicated FP arithmetic, leading to tests
like TSVC ControlFlow-flt failing to vectorise.

The mask load/store code did try other vector modes, but could also be
improved to make sure that SVEness sticks when computing derived modes.

(Unlike mode_for_vector, related_vector_mode always returns a vector
mode, so there's no need to check VECTOR_MODE_P as well.)

gcc/
* internal-fn.c (vectorized_internal_fn_supported_p): Handle
vector types first. For scalar types, consider both the preferred
vector mode and the alternative vector modes.
* optabs-query.c (can_vec_mask_load_store_p): Use the same
structure as above, in particular using related_vector_mode
for modes provided by autovectorize_vector_modes.

gcc/testsuite/
* gcc.target/aarch64/sve/cond_arith_6.c: New test.

passes: Fix up subobject __bos [PR101419]

The following testcase is miscompiled, because VN during cunrolli changes
__bos argument from address of a larger field to address of a smaller field
and so __builtin_object_size (, 1) then folds into smaller value than the
actually available size.
copy_reference_ops_from_ref has a hack for this, but it was using
cfun->after_inlining as a check whether the hack can be ignored, and
cunrolli is after_inlining.

This patch uses a property to make it exact (set at the end of objsz
pass that doesn't do insert_min_max_p) and additionally based on discussions
in the PR moves the objsz pass earlier after IPA.

2021-07-13  Jakub Jelinek  <jakub@redhat.com>
    Richard Biener  <rguenther@suse.de>

PR tree-optimization/101419
* tree-pass.h (PROP_objsz): Define.
(make_pass_early_object_sizes): Declare.
* passes.def (pass_all_early_optimizations): Rename pass_object_sizes
there to pass_early_object_sizes, drop parameter.
(pass_all_optimizations): Move pass_object_sizes right after pass_ccp,
drop parameter, move pass_post_ipa_warn right after that.
* tree-object-size.c (pass_object_sizes::execute): Rename to...
(object_sizes_execute): ... this.  Add insert_min_max_p argument.
(pass_data_object_sizes): Move after object_sizes_execute.
(pass_object_sizes): Likewise.  In execute method call
object_sizes_execute, drop set_pass_param method and insert_min_max_p
non-static data member and its initializer in the ctor.
(pass_data_early_object_sizes, pass_early_object_sizes,
make_pass_early_object_sizes): New.
* tree-ssa-sccvn.c (copy_reference_ops_from_ref): Use
(cfun->curr_properties & PROP_objsz) instead of cfun->after_inlining.

* gcc.dg/builtin-object-size-10.c: Pass -fdump-tree-early_objsz-details
instead of -fdump-tree-objsz1-details in dg-options and adjust names
of dump file in scan-tree-dump.
* gcc.dg/pr101419.c: New test.

libgomp: Don't include limits.h instead of hidden visibility block

sem.h is included in between # pragma GCC visibility push(hidden)
and # pragma GCC visibility pop and includes limits.h there, which
since the introduction of sysconf declaration in recent glibcs
in there causes trouble.  libgomp assumes it is compiled by gcc,
so we don't really need to include limits.h there and can use
-__INT_MAX__ - 1 instead (which clang and icc support too for years).

2021-07-13  Jakub Jelinek  <jakub@redhat.com>
    Florian Weimer  <fweimer@redhat.com>

* config/linux/sem.h: Don't include limits.h.
(SEM_WAIT): Define to -__INT_MAX__ - 1 instead of INT_MIN.
* config/linux/affinity.c: Include limits.h.

docs: Add 'S' to Machine Constraints for RISC-V

It was undocument before, but it might used in linux kernel for resolve
code model issue, so LLVM community suggest we should document that,
so that make it become supported/documented/non-internal machine constraints.

gcc/ChangeLog:

PR target/101275
* config/riscv/constraints.md ("S"): Update description and remove
@internal.
* doc/md.texi (Machine Constraints): Document the 'S' constraints
for RISC-V.

Revert "Display the number of components BB vectorized"

This reverts commit c03cae4e066066278c8435c409829a9bf851e49f.

Deal with prefixed loads/stores in tests, PR testsuite/100166

This patch updates the various tests in the testsuite to treat plxv
and pstxv as being vector loads/stores. This shows up if you run the
testsuite with a compiler configured with the option: --with-cpu=power10.

2021-07-13 Michael Meissner <meissner@linux.ibm.com>

gcc/testsuite/
PR testsuite/100166
* gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c: Update
insn counts to account for power10 prefixed loads and stores.
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c:
Likewise.
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c:
Likewise.
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c:
Likewise.
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c:
Likewise.
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c:
Likewise.
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c:
Likewise.
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c: Likewise.
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c: Likewise.
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c: Likewise.
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c: Likewise.
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c:
Likewise.
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c: Likewise.
* gcc.target/powerpc/fold-vec-load-vec_xl-char.c: Likewise.
* gcc.target/powerpc/fold-vec-load-vec_xl-double.c: Likewise.
* gcc.target/powerpc/fold-vec-load-vec_xl-float.c: Likewise.
* gcc.target/powerpc/fold-vec-load-vec_xl-int.c: Likewise.
* gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c: Likewise.
* gcc.target/powerpc/fold-vec-load-vec_xl-short.c: Likewise.
* gcc.target/powerpc/fold-vec-splat-floatdouble.c: Likewise.
* gcc.target/powerpc/fold-vec-splat-longlong.c: Likewise.
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c:
Likewise.
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c:
Likewise.
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c:
Likewise.
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-int.c:
Likewise.
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-longlong.c:
Likewise.
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-short.c:
Likewise.
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c: Likewise.
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c:
Likewise.
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c: Likewise.
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-int.c: Likewise.
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-longlong.c:
Likewise.
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-short.c: Likewise.
* gcc.target/powerpc/fold-vec-store-vec_xst-char.c: Likewise.
* gcc.target/powerpc/fold-vec-store-vec_xst-double.c: Likewise.
* gcc.target/powerpc/fold-vec-store-vec_xst-float.c: Likewise.
* gcc.target/powerpc/fold-vec-store-vec_xst-int.c: Likewise.
* gcc.target/powerpc/fold-vec-store-vec_xst-longlong.c: Likewise.
* gcc.target/powerpc/fold-vec-store-vec_xst-short.c: Likewise.
* gcc.target/powerpc/lvsl-lvsr.c: Likewise.
* gcc.target/powerpc/pr86731-fwrapv-longlong.c: Likewise.

Fix vec-splati-runnable.c test.

I noticed that the vec-splati-runnable.c did not have an abort after one
of the tests.  If the test was run with optimization, the optimizer could
delete some of the tests and throw off the count.  However, due to the
fact that the value being loaded in that test is undefined, I did not
check what value was loaded, but I just stored it into a volatile global
variable.

2021-07-12  Michael Meissner  <meissner@linux.ibm.com>

gcc/testsuite/
* gcc.target/powerpc/vec-splati-runnable.c: Run test with -O2
optimization.  Do not check what XXSPLTIDP generates if the value
is undefined.

Change rs6000_const_f32_to_i32 return type.

The function rs6000_const_f32_to_i32 called REAL_VALUE_TO_TARGET_SINGLE
with a long long type and returns it. This patch changes the type to long
which is the proper type for REAL_VALUE_TO_TARGET_SINGLE.

2021-07-12 Michael Meissner <meissner@linux.ibm.com>

gcc/
* config/rs6000/altivec.md (xxspltiw_v4sf): Change local variable
value to to long.
* config/rs6000/rs6000-protos.h (rs6000_const_f32_to_i32): Change
return type to long.
* config/rs6000/rs6000.c (rs6000_const_f32_to_i32): Change return
type to long.

Daily bump.

Add relation processing to ubsan builtins.

Ubsan builtins call the plus/minus/multiple fold routines, but did not
use any relation information between the 2 operands that is available.
query and pass any relations.
This resolves gcc.dg/pr97505.c when operating in ranger-only mode.

* gimple-range-fold.cc (fold_using_range::range_of_builtin_ubsan_call):
Query relation between the 2 operands and use it.

docs: fix s/ei_safe_safe/ei_safe_edge/ typo

gcc/ChangeLog:
* doc/cfg.texi: Fix s/ei_safe_safe/ei_safe_edge/ typo.

c++: permit deduction guides at class scope [PR79501]

This adds support for declaring (class-scope) deduction guides for a
member class template. Fortunately it seems only a couple of changes
are needed in order for the existing CTAD machinery to handle them
properly: we need to make sure to give them a FUNCTION_TYPE instead of a
METHOD_TYPE, and we need to avoid using a BASELINK when looking them up.

PR c++/79501
PR c++/100983

gcc/cp/ChangeLog:

* decl.c (grokfndecl): Don't require that deduction guides are
declared at namespace scope. Check that class-scope deduction
guides have the same access as the member class template.
(grokdeclarator): Pretend class-scope deduction guides are static.
* search.c (lookup_member): Don't use a BASELINK for (class-scope)
deduction guides.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction92.C: New test.
* g++.dg/cpp1z/class-deduction93.C: New test.
* g++.dg/cpp1z/class-deduction94.C: New test.
* g++.dg/cpp1z/class-deduction95.C: New test.

i386: Fix vec_set<mode> expanders [PR101424]

AVX does not support 32-byte integer compares, required by
ix86_expand_vector_set_var.  The following patch fixes vec_set<mode>
expanders by introducing new vec_setm_avx2_operand predicate for AVX
vector modes.

gcc/

2021-07-12  Uroš Bizjak  <ubizjak@gmail.com>

PR target/101424
* config/i386/predicates.md (vec_setm_sse41_operand):
Rename from vec_setm_operand.
(vec_setm_avx2_operand): New predicate.
* config/i386/sse.md (vec_set<V_128:mode>): Use V_128 mode iterator.
Use vec_setm_sse41_operand as operand 2 predicate.
(vec_set<V_256_512:mode): New expander.
* config/i386/mmx.md (vec_setv2hi): Use vec_setm_sse41_operand
as operand 2 predicate.

gcc/testsuite/

2021-07-12  Uroš Bizjak  <ubizjak@gmail.com>

PR target/101424
* gcc.target/i386/pr101424.c: New test.

Do not register a cast as an equivalence.

Registering an equivalence between objects of the same size in a cast can
cause other relations to be incorrect.

gcc/
PR tree-optimization/101335
* range-op.cc (operator_cast::lhs_op1_relation): Delete.

gcc/testsuite/
* gcc.dg/tree-ssa/pr101335.c: New.

libstdc++: Constrain std::as_writable_bytes [PR101411]

The std::as_writable_bytes function should be constrained to only accept
writable spans. Currently it can be called but then gives an error in
the function body.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/101411
* include/std/span (as_writable_bytes): Add requires-clause.
* testsuite/23_containers/span/101411.cc: New test.

[PHIOPT/MATCH] Remove the statement to move if not used

Instead of waiting for DCE to remove the unused statement,
and maybe optimize another conditional, it is better if
we don't move the statement and have the statement
removed.

OK? Bootstrapped and tested on x86_64-linux-gnu.

Changes from v1:
* v2: Change the order of insertation and check to see if the lhs
is used rather than see if the lhs was used in the sequence.

gcc/ChangeLog:

* tree-ssa-phiopt.c (match_simplify_replacement): Move
insert of the sequence before the movement of the
statement. Check if to see if the statement is used
outside of the original phi to see if we should move it.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr96928-1.c: Update to similar as pr96928.c.

produce simple DOT graphs from SLP trees

This adds a dot_slp_tree debug function producing a simple DOT
graph from a starting node down the graph.  There's no fancy
direct invocation of dot but the output is directed to a specified
file.  It re-uses vect_print_slp_tree, naming nodes as their
address.

2021-07-12  Richard Biener  <rguenther@suse.de>

* dump-context.h (debug_dump_context::debug_dump_context):
Add FILE * parameter defaulted to stderr.
* dumpfile.c (debug_dump_context::debug_dump_context): Adjust.
* tree-vect-slp.c (dot_slp_tree): New functions.

tree-optimization/101373 - avoid PRE across externally throwing call

PRE already tries to avoid hoisting possibly trapping expressions
across calls that might not return normally but fails to consider
const calls that throw externally.  The following fixes that and
also plugs the hole of trapping references not pruned in case
they are not catched by the actuall call clobbering it.

At -Os we hit the same issue in RTL PRE and postreload-gcse has
even more incomplete checks so the patch adjusts both of those
as well.

2021-07-08  Richard Biener  <rguenther@suse.de>

PR tree-optimization/101373
* tree-ssa-pre.c (prune_clobbered_mems): Also prune trapping
references when the BB may not return.
(compute_avail): Pass in the function we're working on and
replace cfun references with it.  Externally throwing
const calls also possibly terminate the function.
(pass_pre::execute): Pass down the function we're working on.
* gcse.c (compute_hash_table_work): Externally throwing
const/pure calls also need record_last_mem_set_info.
* postreload-gcse.c (record_opr_changes): Looping or externally
throwing const/pure calls also need record_last_mem_set_info.

* g++.dg/torture/pr101373.C: New testcase, XFAILed.
* gnat.dg/opt95.adb: Likewise.

Change the type of memory classification functions to bool

2021-07-12 Uroš Bizjak <ubizjak@gmail.com>

gcc/
* recog.c (memory_address_addr_space_p): Change the type to bool.
Return true/false instead of 1/0.
(offsettable_memref_p): Ditto.
(offsettable_nonstrict_memref_p): Ditto.
(offsettable_address_addr_space_p): Ditto.
Change the type of addressp indirect function to bool.
* recog.h (memory_address_addr_space_p): Change the type to bool.
(strict_memory_address_addr_space_p): Ditto.
(offsettable_memref_p): Ditto.
(offsettable_nonstrict_memref_p): Ditto.
(offsettable_address_addr_space_p): Ditto.
* reload.c (maybe_memory_address_addr_space_p): Ditto.
(strict_memory_address_addr_space_p): Change the type to bool.
Return true/false instead of 1/0.
(maybe_memory_address_addr_space_p): Change the type to bool.

[Ada] adaint.c minor reformatting

gcc/ada/

* adaint.c (__gnat_number_of_cpus): Replace "#ifdef" by "#if
defined".

[Ada] Use GNAT encodings only when -fgnat-encodings=all is specified

gcc/ada/

* gcc-interface/decl.c (gnat_to_gnu_entity) <discrete_type>: Add a
parallel type only when -fgnat-encodings=all is specified.
<E_Array_Type>: Use the PAT name and special suffixes only when
-fgnat-encodings=all is specified.
<E_Array_Subtype>: Build a special type for debugging purposes only
when -fgnat-encodings=all is specified. Add a parallel type or use
the PAT name only when -fgnat-encodings=all is specified.
<E_Record_Type>: Generate debug info for the inner record types only
when -fgnat-encodings=all is specified.
<E_Record_Subtype>: Use a debug type for an artificial subtype only
except when -fgnat-encodings=all is specified.
(elaborate_expression_1): Reset need_for_debug when possible only
except when -fgnat-encodings=all is specified.
(components_to_record): Use XV encodings for variable size only
when -fgnat-encodings=all is specified.
(associate_original_type_to_packed_array): Add a parallel type only
when -fgnat-encodings=all is specified.
* gcc-interface/misc.c (gnat_get_array_descr_info): Do not return
full information only when -fgnat-encodings=all is specified.
* gcc-interface/utils.c (make_packable_type): Add a parallel type
only when -fgnat-encodings=all is specified.
(maybe_pad_type): Make the inner type a debug type only except when
-fgnat-encodings=all is specified. Create an XVS type for variable
size only when -fgnat-encodings=all is specified.
(rest_of_record_type_compilation): Add a parallel type only when
-fgnat-encodings=all is specified.

[Ada] Implement support for unconstrained array types with FLB

gcc/ada/

* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Array_Type>: Use a
fixed lower bound if the index subtype is marked so, as well as a
more efficient formula for the upper bound if the array cannot be
superflat.
(flb_cannot_be_superflat): New predicate.
(cannot_be_superflat): Rename into...
(range_cannot_be_superfla): ...this. Minor tweak.

[Ada] Clean up Uint fields

gcc/ada/

* uintp.ads, types.h: New subtypes of Uint: Valid_Uint, Unat,
Upos, Nonzero_Uint with predicates. These correspond to new
field types in Gen_IL.
* gen_il-types.ads (Valid_Uint, Unat, Upos, Nonzero_Uint): New
field types.
* einfo-utils.ads, einfo-utils.adb, fe.h (Known_Alignment,
Init_Alignment): Use the initial zero value to represent
"unknown". This will ensure that if Alignment is called before
Set_Alignment, the compiler will blow up (if assertions are
enabled).
* atree.ads, atree.adb, atree.h, gen_il-gen.adb
(Get_Valid_32_Bit_Field): New generic low-level getter for
subtypes of Uint.
(Copy_Alignment): New procedure to copy Alignment field even
when Unknown.
(Init_Object_Size_Align, Init_Size_Align): Do not bypass the
Init_ procedures.
* exp_pakd.adb, freeze.adb, layout.adb, repinfo.adb,
sem_util.adb: Protect calls to Alignment with Known_Alignment.
Use Copy_Alignment when it might be unknown.
* gen_il-gen-gen_entities.adb (Alignment,
String_Literal_Length): Use type Unat instead of Uint, to ensure
that the field is always Set_ before we get it, and that it is
set to a nonnegative value.
(Enumeration_Pos): Unat.
(Enumeration_Rep): Valid_Uint. Can be negative, but must be
valid before fetching.
(Discriminant_Number): Upos.
(Renaming_Map): Remove.
* gen_il-gen-gen_nodes.adb (Char_Literal_Value, Reason): Unat.
(Intval, Corresponding_Integer_Value): Valid_Uint.
* gen_il-internals.ads: New functions for dealing with special
defaults and new subtypes of Uint.
* scans.ads: Correct comments.
* scn.adb (Post_Scan): Do not set Intval to No_Uint; that is no
longer allowed.
* sem_ch13.adb (Analyze_Enumeration_Representation_Clause): Do
not set Enumeration_Rep to No_Uint; that is no longer allowed.
(Offset_Value): Protect calls to Alignment with Known_Alignment.
* sem_prag.adb (Set_Atomic_VFA): Do not use Uint_0 to mean
"unknown"; call Init_Alignment instead.
* sinfo.ads: Minor comment fix.
* treepr.adb: Deal with printing of new field types.
* einfo.ads, gen_il-fields.ads (Renaming_Map): Remove.
* gcc-interface/decl.c (gnat_to_gnu_entity): Use Known_Alignment
before calling Alignment. This preserve some probably buggy
behavior: if the alignment is not set, it previously defaulted
to Uint_0; we now make that explicit. Use Copy_Alignment,
because "Set_Alignment (Y, Alignment (X));" no longer works when
the Alignment of X has not yet been set.
* gcc-interface/trans.c (process_freeze_entity): Use
Copy_Alignment.

[Ada] Add DWARF 5 support to System.Dwarf_Line

gcc/ada/

* libgnat/s-dwalin.ads: Adjust a few comments left and right.
(Line_Info_Register): Comment out unused components.
(Line_Info_Header): Add DWARF 5 support.
(Dwarf_Context): Likewise. Rename "prologue" into "header".
* libgnat/s-dwalin.adb: Alphabetize "with" clauses.
(DWARF constants): Add DWARF 5 support and reorder.
(For_Each_Row): Adjust.
(Initialize_Pass): Likewise.
(Initialize_State_Machine): Likewise and fix typo.
(Open): Add DWARF 5 support.
(Parse_Prologue): Rename into...
(Parse_Header): ...this and add DWARF 5 support.
(Read_And_Execute_Isn): Rename into...
(Read_And_Execute_Insn): ...this and adjust.
(To_File_Name): Change parameter name and add DWARF 5 support.
(Read_Entry_Format_Array): New procedure.
(Skip_Form): Add DWARF 5 support and reorder.
(Seek_Abbrev): Do not count entries and add DWARF 5 support.
(Debug_Info_Lookup): Add DWARF 5 support.
(Symbolic_Address.Set_Result): Likewise.
(Symbolic_Address): Adjust.

[Ada] Duplicate Size/Value_Size clause

gcc/ada/

* sem_ch13.adb (Duplicate_Clause): Add a helper routine
Check_One_Attr, with a parameter for the attribute_designator we
are looking for, and one for the attribute_designator of the
current node (which are usually the same). For Size and
Value_Size, call it twice, once for each.
* errout.ads: Fix a typo.

[Ada] Avoid unnecessary work when expanding 'Image into 'Put_Image

gcc/ada/

* exp_imgv.adb (Expand_Image_Attribute): Move rewriting to
attribute Put_Image to the beginning of expansion of attribute
Image.

Display the number of components BB vectorized

This amends the optimization message printed when a basic-block
part is vectorized to mention the number of SLP graph entries.
This helps when debugging vectorization differences and we end up
merging SLP instances for costing purposes.

2021-07-07 Richard Biener <rguenther@suse.de>

* tree-vect-slp.c (vect_slp_region): Show the number of
SLP graph entries in the optimization message.

* g++.dg/vect/slp-pr87105.cc: Adjust.
* gcc.dg/vect/bb-slp-pr54400.c: Likewise.

tree-optimization/101394 - fix PRE full redundancy wrt abnormals

This avoids adding a copy from an abnormal picked up from PHI
translation much like we'd avoid inserting the translated
expression on pred edges.

2021-07-12 Richard Biener <rguenther@suse.de>

PR tree-optimization/101394
* tree-ssa-pre.c (do_pre_regular_insertion): Avoid inserting
copies from abnormals for a full redundancy.

* gcc.dg/torture/pr101394.c: New testcase.

middle-end/101423 - internal calls do not trap

This adjusts gimple_could_trap_p to not consider internal function
calls to trap compared to indirect calls or calls to weak functions.

2021-07-12 Richard Biener <rguenther@suse.de>

PR middle-end/101423
* gimple.c (gimple_could_trap_p_1): Internal function calls
do not trap.
* tree-eh.c (tree_could_trap_p): Likewise.

Tweak testcase for PR tree-optimization/101403.

Initialize unused variable u in compound expression.  Committed as obvious.

2021-07-12  Roger Sayle  <roger@nextmovesoftware.com>
    Jakub Jelinek  <jakub@redhat.com>

gcc/testsuite/ChangeLog
PR tree-optimization/101403
* gcc.dg/pr101403.c: Avoid (unimportant) uninitialized variable.

arm/66791: Replace builtins for unsigned and fp vmul_n intrinsics.

gcc/ChangeLog:
PR target/66791
* config/arm/arm_neon.h (vmul_n_u32): Replace call to builtin with
__a * __b.
(vmulq_n_u32): Likewise.
(vmul_n_f32): Gate __a * __b on __FAST_MATH__.
(vmulq_n_f32): Likewise.
(vmul_n_f16): Likewise.
(vmulq_n_f16): Likewise.

gcc/testsuite/ChangeLog:
PR target/66791
* gcc.target/arm/armv8_2-fp16-neon-2.c: Adjust.

offloading: fix -foffload hinting

PR sanitizer/101425

gcc/ChangeLog:

* gcc.c (check_offload_target_name): Call
candidates_list_and_hint only if we have a candidate.

arm/98435: Missed optimization in expanding vector constructor.

The patch moves vec_init pattern from neon.md to vec-common.md,
and adjusts the mode to VDQX to accomodate binary floats. Also,
the pattern is additionally gated on VALID_MVE_MODE.

gcc/ChangeLog:
PR target/98435
* config/arm/neon.md (vec_init): Move to ...
* config/arm/vec-common.md (vec_init): ... here.
Change the pattern's mode to VDQX and gate it on VALID_MVE_MODE.

gcc/testsuite/ChangeLog:
PR target/98435
* gcc.target/arm/simd/pr98435.c: New test.

PR tree-optimization/101403: Incorrect folding of ((T)bswap(x))>>C

My sincere apologies for the breakage.  My recent patch to fold
bswapN(x)>>C where the constant C was large enough that the result
only contains bits from the low byte, and can therefore avoid
the byte swap contains a minor logic error.  The pattern contains
a convert? allowing an extension to occur between the bswap and
the shift.  The logic is correct if there's no extension, or the
extension has the same sign as the shift, but I'd mistakenly
convinced myself that these couldn't have different signedness.

(T)bswap16(x)>>12 is (T)((unsigned char)x>>4) or (T)((signed char)x>>4).
The bug is that for zero-extensions to signed type T, we need to use
the unsigned char variant [the signedness of the byte shift is not
(always) the same as the signedness of T and the original shift].

Then because I'm now paranoid, I've also added a clause to handle
the hypothetical (but in practice impossible) sign-extension to an
unsigned type T, which can implemented as (T)(x<<8)>>12.

2021-07-12  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
PR tree-optimization/101403
* match.pd ((T)bswap(X)>>C): Correctly handle cases where
signedness of the shift is not the same as the signedness of
the type extension.

gcc/testsuite/ChangeLog
PR tree-optimization/101403
* gcc.dg/pr101403.c: New test case.

Daily bump.

Require target lra for tests using asm goto

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr100329.c: Require target lra.
* gcc.dg/torture/pr100519.c: Likewise.

runtime: remove direct assignments to memory locations

PR bootstrap/101374
They cause a warning with the updated GCC -Warray-bounds option.
Replace them with calls to abort, which for our purposes is fine.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/333409

c++: 'new T[N]' and SFINAE [PR82110]

Here we're failing to treat 'new T[N]' as erroneous in a SFINAE context
when T isn't default constructible because expand_aggr_init_1 doesn't
communicate to build_aggr_init (its only SFINAE caller) whether the
initialization was actually successful. To fix this, this patch makes
expand_aggr_init_1 and its subroutine expand_default_init return true on
success, false on failure so that build_aggr_init can properly return
error_mark_node on failure.

PR c++/82110

gcc/cp/ChangeLog:

* init.c (build_aggr_init): Return error_mark_node if
expand_aggr_init_1 returns false.
(expand_default_init): Change return type to bool. Return false
on error, true on success.
(expand_aggr_init_1): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/pr78765.C: Expect another conversion failure
diagnostic.
* g++.dg/template/sfinae14.C: Flip incorrect assertion.
* g++.dg/cpp2a/concepts-requires27.C: New test.

Daily bump.

libffi/x86: Always check __x86_64__ for x86 hosts

The upstream libffi has

commit cb8474368cdef3207638d047bd6c707ad8fcb339
Author: hjl-tools <hjl.tools@gmail.com>
Date:   Wed Dec 2 12:52:12 2020 -0800

    libffi/x86: Always check __x86_64__ for x32 hosts (#601) (#602)

    Since for x86_64-*x32 and x86_64-x32-* hosts, -m32 generates ia32 codes.
    We should always check __x86_64__ for x32 hosts.

Since for gnux32 hosts, -m32 generates i386 codes, always check __x86_64__
for x86 hosts.

PR libffi/101336
* configure.host: Always check __x86_64__ for x86 hosts.

c++: concepts TS and explicit specialization [PR101098]

duplicate_decls was not recognizing the explicit specialization as matching
the implicit specialization of g<Y> because
function_requirements_equivalent_p was seeing the C constraint on the
implicit one and not on the explicit.

PR c++/101098

gcc/cp/ChangeLog:

* decl.c (function_requirements_equivalent_p): Only compare
trailing requirements on a specialization.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/explicit-spec1.C: New test.

coroutines: Factor code. Match original source location in helpers [NFC].

This is primarily a source code refactoring, the only change is to
ensure that the outlined functions are marked to begin at the same
line as the original.  Otherwise, they get the default (which seems
to be input_location, which corresponds to the closing brace at the
point that this is done).  Having the source location point to that
confuses some debuggers.

This is a contributory fix to:
PR c++/99215 - coroutines: debugging with gdb

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (build_actor_fn): Move common code to
act_des_fn.
(build_destroy_fn): Likewise.
(act_des_fn): Build the void return here.  Ensure that the
source location matches the original function.

Improvement to signed division of integer constant on x86_64.

This patch tweaks the way GCC handles 32-bit integer division on
x86_64, when the numerator is constant.  Currently the function

int foo (int x) {
  return 100/x;
}

generates the code:
foo: movl    $100, %eax
        cltd
        idivl   %edi
        ret

where the sign-extension instruction "cltd" creates a long
dependency chain, as it depends on the "mov" before it, and
is depended upon by "idivl" after it.

With this patch, GCC now matches both icc and LLVM and uses
an xor instead, generating:
foo: xorl    %edx, %edx
        movl    $100, %eax
        idivl   %edi
        ret

Microbenchmarking confirms that this is faster on Intel
processors (Kaby lake), and no worse on AMD processors (Zen2),
which agrees with intuition, but oddly disagrees with the
llvm-mca cycle count prediction on godbolt.org.

The tricky bit is that this sign-extension instruction is only
produced by late (postreload) splitting, and unfortunately none
of the subsequent passes (e.g. cprop_hardreg) is able to
propagate and simplify its constant argument.  The solution
here is to introduce a define_insn_and_split that allows the
constant numerator operand to be captured (by combine) and
then split into an optimal form after reload.

The above microbenchmarking also shows that eliminating the
sign extension of negative values (using movl $-1,%edx) is also
a performance improvement, as performed by icc but not by LLVM.
Both the xor and movl sign-extensions are larger than cltd,
so this transformation is prevented for -Os.

2021-07-09  Roger Sayle  <roger@nextmovesoftware.com>
    Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
* config/i386/i386.md (*divmodsi4_const): Optimize SImode
divmod of a constant numerator with new define_insn_and_split.

gcc/testsuite/ChangeLog
* gcc.target/i386/divmod-9.c: New test case.

coroutines: Fix a typo in rewriting the function.

When amending the function re-write code, I made a typo in
the block connections. This has not shown up in any test
fails (as far as can be seen) but is a regression in debug
info.

Fixed thus.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc
(coro_rewrite_function_body): Connect the replacement
function block to the block nest correctly.

Darwin, X86: Adjust call clobbers to allow for lazy-binding [PR 100152].

We allow public functions defined in a TU to bind locally for PIC
code (the default) on 64bit Mach-O.

If such functions are not inlined, we cannot tell at compile-time if
they might be called via the lazy symbol resolver (this can depend on
options given at link-time). Therefore, we must assume that the lazy
resolver could be used which clobbers R11 and R10.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:

PR target/100152
* config/i386/i386-expand.c (ix86_expand_call): If a call is
to a non-local-binding, or local but to a public symbol, then
assume that it might be indirected via the lazy symbol binder.
Mark R10 and R10 as clobbered in that case.

Darwin, config: Revise host config fragment.

There were two uses for the Darwin host config fragment:

The first is to arrange for targets that support mdynamic-no-pic
to be built with that enabled (since it makes a significant
difference to the compiler performance). We can be more specific
in the application of this, since it only applies to 32b hosts
plus powerpc64-darwin9.

The second was to work around a tool bug where -fno-PIE was not
propagated to the link stage. This second use is redundant,
since the buggy toolchain cannot bootstrap current GCC sources
anyway.

This makes the host fragment more specific and reduces the number
of toolchains for which it is included which reduces clutter in
configure lines.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
config/ChangeLog:

* mh-darwin: Make this specific to handling the
mdynamic-no-pic case.

ChangeLog:

* configure: Regenerate.
* configure.ac: Adjust cases for which it is necessary to
include the Darwin host config fragment.

Missing piece in earlier change

gcc/ada/
* gcc-interface/utils.c (finish_subprog_decl): Remove obsolete line.

testsuite/101269: fix testcase when used with -m32

PR testsuite/101269 - new test case gcc.dg/debug/btf/btf-datasec-1.c
fails with its introduction in r12-1852

BTF datasec records for .rodata/.data are expected for now for all targets.
For powerpc based targets, use -msdata=none when ilp32 is enabled.

2021-07-09 Indu Bhagat <indu.bhagat@oracle.com>

gcc/testsuite/ChangeLog:

PR testsuite/101269
* gcc.dg/debug/btf/btf-datasec-1.c: Force -msdata=none with ilp32 for
powerpc based targets.

c++: requires-expr with dependent extra args [PR101181]

Here we're crashing ultimately because the mechanism for delaying
substitution into a requires-expression (and constexpr if and pack
expansions) doesn't expect to see dependent args.  But we end up
capturing dependent args here during substitution into the default
template argument as part of coerce_template_parms for the dependent
specialization p<T>.

This patch enables the commented out code in add_extra_args for handling
this situation.  This isn't needed for pack expansions (as the
accompanying comment points out), and it doesn't seem strictly necessary
for constexpr if either, but for requires-expressions delaying even
dependent substitution is important for ensuring we don't evaluate
requirements out of order.

It turns out we also need to make a copy of the arguments when capturing
them so that coerce_template_parms doesn't later add to them and form an
unexpected cycle (REQUIRES_EXPR_EXTRA_ARGS (t) would indirectly point to t).
We also need to make tsubst_template_args handle missing template
arguments, since the arguments we capture from coerce_template_parms
and are incomplete at that point.

PR c++/101181

gcc/cp/ChangeLog:

* constraint.cc (tsubst_requires_expr): Pass complain/in_decl to
add_extra_args.
* cp-tree.h (add_extra_args): Add complain/in_decl parameters.
* pt.c (build_extra_args): Make a copy of args.
(add_extra_args): Add complain/in_decl parameters.  Enable the
code for handling the case where the extra arguments are
dependent.
(tsubst_pack_expansion): Pass complain/in_decl to
add_extra_args.
(tsubst_template_args): Handle missing template arguments.
(tsubst_expr) <case IF_STMT>: Pass complain/in_decl to
add_extra_args.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-requires26.C: New test.
* g++.dg/cpp2a/lambda-uneval16.C: New test.

c++: find_template_parameters and TEMPLATE_DECLs [PR101247]

r12-1989 fixed the testcase in the PR, but unfortunately the fix is
buggy: it breaks the case where the common template between the
TEMPLATE_DECL t and ctx_parms is the innermost template (as in
concepts-memtmpl5.C below).  This can be fixed by instead passing the
TREE_TYPE of ctmpl to common_enclosing_class when ctmpl is a class
template.

But even after that's fixed, the analogous case where the innermost
template is a partial specialization is still broken (as in
concepts-memtmpl5a.C below), because ctmpl is always a primary template.

So this patch instead takes a diferent approach that doesn't rely on
ctx_parms at all: when looking for the template parameters of a
TEMPLATE_DECL that are shared with the current template context, just
walk its DECL_CONTEXT.  As long as the template is not overly general
(e.g. we didn't pass it through most_general_template), this should give
us exactly what we want, since if a TEMPLATE_DECL can be referred to
from some template context then the template parameters it uses must all
be in-scope and contained in its DECL_CONTEXT.  This effectively makes
us treat TEMPLATE_DECLs more similarly to other _DECLs (whose DECL_CONTEXT
we also walk).

PR c++/101247

gcc/cp/ChangeLog:

* pt.c (any_template_parm_r) <case TEMPLATE_DECL>: Just walk the
DECL_CONTEXT.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-memtmpl4.C: Uncomment the commented out
example, which we now handle correctly.
* g++.dg/cpp2a/concepts-memtmpl5.C: New test.
* g++.dg/cpp2a/concepts-memtmpl5a.C: New test.

libstdc++: Only use __gthread_yield if gthreads is available

libstdc++-v3/ChangeLog:

* include/std/mutex (__lock_impl): Check
_GLIBCXX_HAS_GTHREADS before using __gthread_yield.

[Ada] Fix style in expansion of attribute Put_Image

gcc/ada/

* exp_put_image.adb (Make_Put_Image_Name): Fix style.
(Image_Should_Call_Put_Image): Likewise.
(Build_Image_Call): Likewise.

[Ada] par-ch6: do not mark subprogram as missing "is" if imported

gcc/ada/

* par-ch6.adb (Contains_Import_Aspect): New function.
(P_Subprogram): Acknowledge `Import` aspects.

[Ada] Fix crash on type extensions with discriminants

gcc/ada/

* exp_put_image.adb (Make_Component_Attributes): Use
Implementation_Base_Type to get the parent type. Otherwise,
Parent_Type_Decl is actually an internally generated subtype
declaration, so we blow up on
Type_Definition (Parent_Type_Decl).

[Ada] Add missed OS constant values

gcc/ada/

* gsocket.h: Include net/if.h to get IF_NAMESIZE constant.
* s-oscons-tmplt.c: Define IPV6_FLOWINFO for Linux.

[Ada] Improve performance of Ada.Containers.Doubly_Linked_Lists.Generic_Sorting.Sort

gcc/ada/

* libgnat/a-cdlili.adb: Reimplement
Ada.Containers.Doubly_Linked_Lists.Generic_Sorting.Sort using
Mergesort instead of the previous Quicksort variant.

[Ada] Crash on expansion of BIP construct in -gnatf mode

gcc/ada/

* exp_ch6.adb (Is_Build_In_Place_Function_Call): Add check to
verify the Selector_Name of Exp_Node has been analyzed before
obtaining its entity.

[Ada] Typo corrections and minor reformatting

gcc/ada/

* libgnarl/s-osinte__vxworks.ads: Fix typo ("release" =>
"releases") plus comment reformatting.
* libgnat/s-os_lib.ads: In a comment, fix typo ("indended" =>
"intended"), add a hyphen and semicolon, plus reformatting. In
comment for subtype time_t, fix typo ("effect" => "affect"), add
hyphens, plus reformatting.
* libgnat/s-parame.ads, libgnat/s-parame__ae653.ads,
libgnat/s-parame__hpux.ads: Remove period from one-line comment.

[Ada] Add -gnatX support for casing on discriminated values

gcc/ada/

* exp_ch5.adb (Expand_General_Case_Statement): Add new function
Else_Statements to handle the case of invalid data analogously
to how it is handled when casing on a discrete value.
* sem_case.adb (Has_Static_Discriminant_Constraint): A new
Boolean-valued function.
(Composite_Case_Ops.Scalar_Part_Count): Include discriminants
when traversing components.
(Composite_Case_Ops.Choice_Analysis.Traverse_Discrete_Parts):
Include discriminants when traversing components; the component
range for a constrained discriminant is a single value.
(Composite_Case_Ops.Choice_Analysis.Parse_Choice): Eliminate
Done variable and modify how Next_Part is computed so that it is
always correct (as opposed to being incorrect when Done is
True). This includes changes in Update_Result (a local
procedure). Add new local procedure
Update_Result_For_Box_Component and call it not just for box
components but also for "missing" components (components
associated with an inactive variant).
(Check_Choices.Check_Composite_Case_Selector.Check_Component_Subtype):
Instead of disallowing all discriminated component types, allow
those that are unconstrained or statically constrained. Check
discriminant subtypes along with other component subtypes.
* doc/gnat_rm/implementation_defined_pragmas.rst: Update
documentation to reflect current implementation status.
* gnat_rm.texi: Regenerate.

[Ada] Crash on inlined separate subprogram

gcc/ada/

* sem_ch6.adb (Check_Pragma_Inline): Correctly use
Corresponding_Spec_Of_Stub when dealing subprogram body stubs.

[Ada] Declare time_t uniformly based on a system parameter

gcc/ada/

* Makefile.rtl: Add translations for s-parame__posix2008.ads
* libgnarl/s-linux.ads: Import System.Parameters.
(time_t): Declare using System.Parameters.time_t_bits.
* libgnarl/s-linux__alpha.ads: Likewise.
* libgnarl/s-linux__android.ads: Likewise.
* libgnarl/s-linux__hppa.ads: Likewise.
* libgnarl/s-linux__mips.ads: Likewise.
* libgnarl/s-linux__riscv.ads: Likewise.
* libgnarl/s-linux__sparc.ads: Likewise.
* libgnarl/s-linux__x32.ads: Likewise.
* libgnarl/s-qnx.ads: Likewise.
* libgnarl/s-osinte__aix.ads: Likewise.
* libgnarl/s-osinte__android.ads: Likewise.
* libgnarl/s-osinte__darwin.ads: Likewise.
* libgnarl/s-osinte__dragonfly.ads: Likewise.
* libgnarl/s-osinte__freebsd.ads: Likewise.
* libgnarl/s-osinte__gnu.ads: Likewise.
* libgnarl/s-osinte__hpux-dce.ads: Likewise.
* libgnarl/s-osinte__hpux.ads: Likewise.
* libgnarl/s-osinte__kfreebsd-gnu.ads: Likewise.
* libgnarl/s-osinte__lynxos178e.ads: Likewise.
* libgnarl/s-osinte__qnx.ads: Likewise.
* libgnarl/s-osinte__rtems.ads: Likewise.
* libgnarl/s-osinte__solaris.ads: Likewise.
* libgnarl/s-osinte__vxworks.ads: Likewise.
* libgnat/g-sothco.ads: Likewise.
* libgnat/s-osprim__darwin.adb: Likewise.
* libgnat/s-osprim__posix.adb: Likewise.
* libgnat/s-osprim__posix2008.adb: Likewise.
* libgnat/s-osprim__rtems.adb: Likewise.
* libgnat/s-osprim__x32.adb: Likewise.
* libgnarl/s-osinte__linux.ads: use type System.Linux.time_t.
* libgnat/s-os_lib.ads (time_t): Declare as subtype of
Long_Long_Integer.
* libgnat/s-parame.ads (time_t_bits): New constant.
* libgnat/s-parame__ae653.ads (time_t_bits): Likewise.
* libgnat/s-parame__hpux.ads (time_t_bits): Likewise.
* libgnat/s-parame__vxworks.ads (time_t_bits): Likewise.
* libgnat/s-parame__posix2008.ads: New file for 64 bit time_t.

[Ada] Add source file name to gnat bug box

gcc/ada/

* comperr.adb (Compiler_Abort): Print source file name.

[Ada] Fix layout of contracts

gcc/ada/

* libgnat/a-strunb.ads, libgnat/a-strunb__shared.ads: Fix layout
in contracts.

[Ada] Fix invalid JSON for derived variant record with -gnatRj

gcc/ada/

* repinfo.ads (JSON output format): Document adjusted key name.
* repinfo.adb (List_Record_Layout): Use Original_Record_Component
if the normalized position of the component is not known.
(List_Structural_Record_Layout): Rename Outer_Ent parameter into
Ext_End and add Ext_Level parameter. In an extension, if the parent
subtype has static discriminants, call List_Record_Layout on it.
Output "parent_" prefixes before "variant" according to Ext_Level.
Adjust recursive calls throughout the procedure.

[Ada] Fix typo in comment related to derived discriminated types

gcc/ada/

* exp_util.ads (Map_Types): Fix typo.

[Ada] Fix index range violations in krunch

gcc/ada/

* krunch.adb: Add safeguards against index range violations.

[Ada] Code cleanups in a-strfix.adb

gcc/ada/

* libgnat/a-strfix.adb: Take advantage of extended returns.

[Ada] Add paragraph about representation changes and Scalar_Storage_Order

gcc/ada/

* doc/gnat_rm/implementation_defined_attributes.rst
(Scalar_Storage_Order): Add paragraph about representation
changes.
* gnat_rm.texi: Regenerate.

[Ada] aarch64-rtems6: use wraplf variant for a-nallfl

gcc/ada/

* Makefile.rtl (LIBGNAT_TARGET_PAIRS) <aarch64*-*-rtems*>: Use
the wraplf variant of Aux_Long_Long_Float.

[Ada] Initialize local variables related to static expression functions

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Initialize Orig_N
and Typ variables.

[Ada] Inconsistency between declaration and body of predicate functions

gcc/ada/

* sem_ch13.adb (Resolve_Aspect_Expressions): Use the same
processing for Predicate, Static_Predicate and
Dynamic_Predicate. Do not build the predicate function spec.
Update comments.
(Resolve_Name): Only reset Entity when necessary to avoid
spurious visibility errors.
(Check_Aspect_At_End_Of_Declarations): Handle consistently all
Predicate aspects.
* sem_ch3.adb (Analyze_Subtype_Declaration): Fix handling of
private types with predicates.

[Ada] Incremental patch for restriction No_Dynamic_Accessibility_Checks

gcc/ada/

* sem_util.ads (Type_Access_Level): Add new optional parameter
Assoc_Ent.
* sem_util.adb (Accessibility_Level): Treat access discriminants
the same as components when the restriction
No_Dynamic_Accessibility_Checks is enabled.
(Deepest_Type_Access_Level): Remove exception for
Debug_Flag_Underscore_B when returning the result of
Type_Access_Level in the case where
No_Dynamic_Accessibility_Checks is active.
(Function_Call_Or_Allocator_Level): Correctly calculate the
level of Expr based on its containing subprogram instead of
using Current_Subprogram.
* sem_res.adb (Valid_Conversion): Add actual for new parameter
Assoc_Ent in call to Type_Access_Level, and add test of
No_Dynamic_Accessibility_Checks_Enabled to ensure that static
accessibility checks are performed for all anonymous access type
conversions.

[Ada] Update internal documentation of debugging information

gcc/ada/

* exp_dbug.ads: Update documentation of various items.

[Ada] Reorder preanalysis of static expression functions

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Reorder code.

[Ada] Decouple analysis of static expression functions from GNATprove

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Reorder code.

[Ada] Avoid repeated computing of type of expression functions

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Add variable to
avoid repeated calls to Etype.

[Ada] Fix comment related to analysis of expression functions

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Fix comment.

[Ada] Avoid repeated calls in analysis of expression functions

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Use Orig_N variable
instead of repeated calls to Original_Node.

[Ada] Refine types of local variables in analysis of expression functions

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Change types local
variables from Entity_Id to Node_Id.

[Ada] Remove an unnecessary local constant

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): A local Expr
constant was shadowing a global constant with the same name and
the same value.

[Ada] Avoid unnecessary call in preanalysis without freezing

gcc/ada/

* sem_res.adb (Preanalyze_And_Resolve): Only call
Set_Must_Not_Freeze when it is necessary to restore the previous
value.

Fix build failure on Windows with older binutils

This is the build failure on Windows with binutils for which GNU as accepts
the --gdwarf-5 switch but GNU ld generates broken binaries with DWARF 5.

We already have the HAVE_LD_BROKEN_PE_DWARF5 kludge to disable DWARF 5 in
this case but it only tames the DWARF version in the compiler, so the
driver still passes --gdwarf-5 when invoked on an assembly file with -g.

gcc/
PR target/101377
* gcc.c (ASM_DEBUG_DWARF_OPTION): Set again to --gdwarf2 in
the case where HAVE_AS_WORKING_DWARF_N_FLAG is not defined
and HAVE_LD_BROKEN_PE_DWARF5 is defined.

i386: Fix *udivmodsi4_pow2_zext_? patterns

In addition to the obvious cut-n-pasto where *udivmodsi4_pow2_zext_2
never matches, limit the range of the immediate operand to prevent
out of range immediate operand of AND instruction.

Found by inspection, the patterns rarely match (if at all), since
tree optimizers do the transformation before RTL is generated. But
according to the comment above *udivmod<mode>4_pow2, the constant can
materialize after expansion, so leave these patterns around for now.

2021-07-09 Uroš Bizjak <ubizjak@gmail.com>

gcc/
* config/i386/i386.md (*udivmodsi4_pow2_zext_1): Limit the
log2 range of operands[3] to [1,31].
(*udivmodsi4_pow2_zext_2): Ditto. Correct insn RTX pattern.

docs: don't split @smallexample in multiple @groups

Noticed multiple groups split in HTML documentation where example
was written in two columns:

                                                   ""
                                                   "
  (define_expand "addsi3"                          {
    [(match_operand:SI 0 "register_operand" "")      handle_add (...
     (match_operand:SI 1 "register_operand" "")      DONE;
     (match_operand:SI 2 "register_operand" "")]   }")

The change uses single @group/@endgroup to prevent such break.

gcc/ChangeLog:

* doc/md.texi: Don't split @smallexample in multiple @groups.

docs: add missing 'see' word

gcc/ChangeLog:

* doc/md.texi: Add missing 'see' word.

Improve early simplify and match for phiopt

Previously the idea was gimple_simplify_phiopt would call
resimplify with a NULL sequence but that sometimes fails
even if there was only one statement produced. The cases
where it fails is when there are two simplifications happen.
In the case of the min/max production, the first simplifcation
produces:
(convert (min @1 @2))
And then the convert is removed by a second one. The Min statement
will be in the sequence while the op will be a SSA name. This was
rejected before as could not produce something in the sequence.
So this patch changes the way resimplify is called to always passing
a pointer to the sequence and then decide based on if op is a
SSA_NAME or not.

OK? Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* tree-ssa-phiopt.c (phiopt_early_allow): Change arguments
to take sequence and gimple_match_op. Accept the case where
op is a SSA_NAME and one statement in the sequence.
Also allow constants.
(gimple_simplify_phiopt): Always pass a sequence to resimplify.
Update call to phiopt_early_allow. Discard the sequence if not
used.

testsuite: mips: use noinline attribute instead of -fno-inline

mips.exp does not support -fno-inline, causing the tests return "ERROR:
Unrecognised option: -fno-inline for dg-options ... ".

Use noinline attribute like other mips target tests, to workaround it.

gcc/testsuite/

* gcc.target/mips/cfgcleanup-jalr2.c: Remove -fno-inline and add
__attribute__((noinline)).
* gcc.target/mips/cfgcleanup-jalr3.c: Likewise.

mips: check MSA support for vector modes [PR100760,PR100761,PR100762]

Check if the vector mode is really supported by MSA in certain cases,
instead of testing ISA_HAS_MSA. Simply testing ISA_HAS_MSA can cause
ICE when MSA is enabled besides other MIPS SIMD extensions (notably,
Loongson MMI).

gcc/

PR target/100760
PR target/100761
PR target/100762
* config/mips/mips.c (mips_const_insns): Use MSA_SUPPORTED_MODE_P
instead of ISA_HAS_MSA.
(mips_expand_vec_unpack): Likewise.
(mips_expand_vector_init): Likewise.

gcc/testsuite/

PR target/100760
PR target/100761
PR target/100762
* gcc.target/mips/pr100760.c: New test.
* gcc.target/mips/pr100761.c: New test.
* gcc.target/mips/pr100762.c: New test.

rs6000: Support [u]mod<mode>3 for vector modulo insns

This patch is to make Power10 newly introduced vector
modulo instructions exploited in vectorized loops, it
just simply renames existing define_insns as standard
pattern names.

gcc/ChangeLog:

* config/rs6000/vsx.md (mods_<mode>): Rename to...
(mod<mode>3): ... this.
(modu_<mode>): Rename to...
(umod<mode>3): ... this.
* config/rs6000/rs6000-builtin.def (MODS_V2DI, MODS_V4SI, MODU_V2DI,
MODU_V4SI): Adjust.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/mod-vectorize.c: New test.

test/rs6000: Add case to cover vector division

This patch is to add one test case to check if vectorizer
can exploit vector division instrutions newly introduced
by Power10.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/div-vectorize-1.c: New test.