platform/upstream/gcc.git
4 years agoopenmp: Add support for non-rect simd and improve collapsed simd support
Jakub Jelinek [Fri, 25 Sep 2020 08:43:37 +0000 (10:43 +0200)]
openmp: Add support for non-rect simd and improve collapsed simd support

The following change adds support for non-rectangular simd loops.
While working on that, I've noticed we actually don't vectorize collapsed
simd loops at all, because the code that I thought would be vectorizable
actually is not vectorized.  While in theory for the constant lower/upper
bounds and constant step of all but the outermost loop we could in theory
vectorize by computing the seprate iterators using vectorized division
and modulo for each of them from the single iterator that increments
by 1 from 0 to total iteration count in the loop nest, I think that would
be fairly expensive and the chances of the loop body being vectorizable
would be low e.g. because of array indices unlikely to be linear and would
need scatters/gathers.
This patch changes the generated code to vectorize only the innermost
loop which has higher chance of being vectorized.  Below is the list of
tests and function names in which the patch resulted in vectorizing something
that hasn't been vectorized before (ok, the first line is a new test).
I've also found that the vectorizer will not vectorize loops with non-constant
steps, I plan to do something about those incrementally on the omp-expand.c
side (basically, compute number of iterations before the loop and use a 0 to
number_of_iterations step 1 IV as the main one).

I have problem with the composite simd vectorization though.
The point is that each thread (or task etc.) is given only a range of
consecutive iterations, so somewhere earlier it computes total number of iterations
and splits the work between the workers and then the intent is to try to vectorize it.
So, each thread is then given a begin ... end-1 range that it would handle.
This means that from the single begin value I need to compute the individual iteration
vars I should start at and then goto into the loop nest to begin iterating there
(and actually compute how many iterations the innermost loop should do each time
so that it stops before end).
Very roughly the IL I emit is something like:
int t[100][100][100];

void
foo (int a, int b, int c, int d, int e, int f, int g, int h, int u, int v, int w, int x)
{
  int i, j, k;
  int cnt;
  if (x)
    {
      i = u; j = v; k = w; goto doit;
    }
  for (i = a; i < b; i += c)
    for (j = d; j < e; j += f)
      {
        k = g;
        doit:
        for (; k < h; k++)
          t[i][j][k] += i + j + k;
      }
}
Unfortunately, some pass then turns the innermost loop to have more than 2 basic blocks
and it isn't vectorized because of that.

Also, I have disabled (for now) SIMTization of collapsed simd loops, because for SIMT
it would be using a single thread anyway and I didn't want to bother with checking
SIMT on all places I've been changing.  If SIMT support is added for some or all
collapsed loops, that omp-low.c change needs to be reverted.

Here is that list of what hasn't been vectorized before and is now:

gcc/testsuite/gcc.dg/vect/vect-simd-17.c doit
gcc/testsuite/gfortran.dg/gomp/openmp-simd-6.f90 bar
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-10.c f28_taskloop_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-10.c _Z24f28_taskloop_simd_normalv._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f25_t_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f26_t_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f27_t_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f28_tpf_simd_guided32._omp_fn.1
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f28_tpf_simd_runtime._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f25_t_simd_normaliiiiiii._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f26_t_simd_normaliiiixxi._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f27_t_simd_normalv._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z20f28_tpf_simd_runtimev._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z21f28_tpf_simd_guided32v._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f7_simd_normal
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f7_simd_normal
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f8_f_simd_guided32
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_f_simd_guided32
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f8_f_simd_runtime
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_f_simd_runtime
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_pf_simd_guided32._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_pf_simd_runtime._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c _Z18f8_pf_simd_runtimev._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c _Z19f8_pf_simd_guided32v._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-4.c f8_taskloop_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-4.c _Z23f8_taskloop_simd_normalv._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f7_t_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f8_tpf_simd_guided32._omp_fn.1
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f8_tpf_simd_runtime._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z16f7_t_simd_normalv._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z19f8_tpf_simd_runtimev._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z20f8_tpf_simd_guided32v._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f25_simd_normal
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f25_simd_normal
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f26_simd_normal
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f26_simd_normal
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f27_simd_normal
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f27_simd_normal
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f28_f_simd_guided32
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_f_simd_guided32
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f28_f_simd_runtime
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_f_simd_runtime
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_pf_simd_guided32._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_pf_simd_runtime._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c _Z19f28_pf_simd_runtimev._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c _Z20f28_pf_simd_guided32v._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/master-combined-1.c main._omp_fn.9
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/master-combined-1.c main._omp_fn.9
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/simd-1.c f2
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/simd-1.c f2
libgomp/testsuite/libgomp.c/pr70680-2.c f1._omp_fn.0
libgomp/testsuite/libgomp.c/pr70680-2.c f2._omp_fn.0
libgomp/testsuite/libgomp.c/pr70680-2.c f3._omp_fn.0
libgomp/testsuite/libgomp.c/pr70680-2.c f4._omp_fn.0
libgomp/testsuite/libgomp.c/simd-8.c foo
libgomp/testsuite/libgomp.c/simd-9.c bar
libgomp/testsuite/libgomp.c/simd-9.c foo

2020-09-25  Jakub Jelinek  <jakub@redhat.com>

gcc/
* omp-low.c (scan_omp_1_stmt): Don't call scan_omp_simd for
collapse > 1 loops as simt doesn't support collapsed loops yet.
* omp-expand.c (expand_omp_for_init_counts, expand_omp_for_init_vars):
Small tweaks to function comment.
(expand_omp_simd): Rewritten collapse > 1 support to only attempt
to vectorize the innermost loop and emit set of outer loops around it.
For non-composite simd with collapse > 1 without broken loop don't
even try to compute number of iterations first.  Add support for
non-rectangular simd loops.
(expand_omp_for): Don't sorry_at on non-rectangular simd loops.
gcc/testsuite/
* gcc.dg/vect/vect-simd-17.c: New test.
libgomp/
* testsuite/libgomp.c/loop-25.c: New test.

4 years agoAdd cgraph_edge::debug function.
Martin Liska [Thu, 24 Sep 2020 14:29:49 +0000 (16:29 +0200)]
Add cgraph_edge::debug function.

gcc/ChangeLog:

* cgraph.c (cgraph_edge::debug): New.
* cgraph.h (cgraph_edge::debug): New.

4 years agoFix spacing in cgraph_node::dump.
Martin Liska [Thu, 24 Sep 2020 14:37:41 +0000 (16:37 +0200)]
Fix spacing in cgraph_node::dump.

gcc/ChangeLog:

* cgraph.c (cgraph_node::dump): Always print space at the end
of a message.  Remove one extra space.

4 years ago[testsuite] Add missing require-effective-target alloca
Tom de Vries [Fri, 25 Sep 2020 06:42:10 +0000 (08:42 +0200)]
[testsuite] Add missing require-effective-target alloca

Add missing require-effect-target alloca directives.

Tested on nvptx.

gcc/testsuite/ChangeLog:

2020-09-25  Tom de Vries  <tdevries@suse.de>

* gcc.dg/analyzer/pr93355-localealias.c: Require effective target
alloca.

4 years ago[testsuite] Add effective target ident_directive
Tom de Vries [Thu, 24 Sep 2020 08:49:02 +0000 (10:49 +0200)]
[testsuite] Add effective target ident_directive

On nvptx we run into:
...
FAIL: c-c++-common/ident-1b.c  -Wc++-compat   scan-assembler GCC:
FAIL: c-c++-common/ident-2b.c  -Wc++-compat   scan-assembler GCC:
...

Using a scan-assembler directive adds -fno-indent to the compile options.
The test c-c++-common/ident-1b.c adds dg-options "-fident", and intends to
check that the -fident overrides the -fno-indent, by means of the
scan-assembler.  But for nvptx, there's no .ident directive, both with -fident
and -fno-ident.

Fix this by adding an effective target ident_directive, and requiring
it in both test-cases.

Tested on nvptx and x86_64.

gcc/testsuite/ChangeLog:

2020-09-24  Tom de Vries  <tdevries@suse.de>

* lib/target-supports.exp (check_effective_target_ident_directive): New proc.
* c-c++-common/ident-1b.c: Require effective target ident_directive.
* c-c++-common/ident-2b.c: Same.

4 years agoDaily bump.
GCC Administrator [Fri, 25 Sep 2020 00:16:27 +0000 (00:16 +0000)]
Daily bump.

4 years agolibiberty: Add get_DW_UT_name and update include/dwarf2.{def,h}
Mark Wielaard [Wed, 23 Sep 2020 14:10:41 +0000 (16:10 +0200)]
libiberty: Add get_DW_UT_name and update include/dwarf2.{def,h}

This adds a get_DW_UT_name function to dwarfnames using dwarf2.def
for use in binutils readelf to show the unit types in a DWARF5 header.

Also remove DW_CIE_VERSION which was already removed in binutils/gdb
and is not used in gcc.

include/ChangeLog:

* dwarf2.def: Add DWARF5 Unit type header encoding macros
DW_UT_FIRST, DW_UT and DW_UT_END.
* dwarf2.h (enum dwarf_unit_type): Removed and define using
DW_UT_FIRST, DW_UT and DW_UT_END macros.
(DW_CIE_VERSION): Removed.
(get_DW_UT_name): New function declaration.

libiberty/ChangeLog:

* dwarfnames.c (get_DW_UT_name): Define using DW_UT_FIRST, DW_UT
and DW_UT_END.

4 years agoc++: Cleanup some decl pushing apis
Nathan Sidwell [Thu, 24 Sep 2020 19:50:29 +0000 (12:50 -0700)]
c++: Cleanup some decl pushing apis

In cleaning up local decl handling, here's an initial patch that takes
advantage of C++'s default args for the is_friend parm of pushdecl,
duplicate_decls and push_template_decl_real and the scope & tpl_header
parms of xref_tag.  Then many of the calls simply not mention these.
I also rename push_template_decl_real to push_template_decl, deleting
the original forwarding function.  This'll make my later patches
changing their types less intrusive.  There are 2 functional changes:

1) push_template_decl requires is_friend to be correct, it doesn't go
checking for a friend function (an assert is added).

2) debug_overload prints out Hidden and Using markers for the overload set.

gcc/cp/
* cp-tree.h (duplicate_decls): Default is_friend to false.
(xref_tag): Default tag_scope & tpl_header_p to ts_current & false.
(push_template_decl_real): Default is_friend to false.  Rename to
...
(push_template_decl): ... here.  Delete original decl.
* name-lookup.h (pushdecl_namespace_level): Default is_friend to
false.
(pushtag): Default tag_scope to ts_current.
* coroutines.cc (morph_fn_to_coro): Drop default args to xref_tag.
* decl.c (start_decl): Drop default args to duplicate_decls.
(start_enum): Drop default arg to pushtag & xref_tag.
(start_preparsed_function): Pass DECL_FRIEND_P to
push_template_decl.
(grokmethod): Likewise.
* friend.c (do_friend): Rename push_template_decl_real calls.
* lambda.c (begin_lamnbda_type): Drop default args to xref_tag.
(vla_capture_type): Likewise.
* name-lookup.c (maybe_process_template_type_declaration): Rename
push_template_decl_real call.
(pushdecl_top_level_and_finish): Drop default arg to
pushdecl_namespace_level.
* pt.c (push_template_decl_real): Assert no surprising friend
functions.  Rename to ...
(push_template_decl): ... here.  Delete original function.
(lookup_template_class_1): Drop default args from pushtag.
(instantiate_class_template_1): Likewise.
* ptree.c (debug_overload): Print hidden and using markers.
* rtti.c (init_rtti_processing): Drop refault args from xref_tag.
(build_dynamic_cast_1, tinfo_base_init): Likewise.
* semantics.c (begin_class_definition): Drop default args to
pushtag.
gcc/objcp/
* objcp-decl.c (objcp_start_struct): Drop default args to
xref_tag.
(objcp_xref_tag): Likewise.
libcc1/
* libcp1plugin.cc (supplement_binding): Drop default args to
duplicate_decls.
(safe_pushtag): Drop scope parm.  Drop default args to pushtag.
(safe_pushdecl_maybe_friend): Rename to ...
(safe_pushdecl): ... here. Drop is_friend parm.  Drop default args
to pushdecl.
(plugin_build_decl): Adjust safe_pushdecl & safe_pushtag calls.
(plugin_build_constant): Adjust safe_pushdecl call.

4 years agoc++: add testcase [PR97177]
Nathan Sidwell [Thu, 24 Sep 2020 19:13:28 +0000 (12:13 -0700)]
c++: add testcase [PR97177]

Pr97177 is the local-var duplicate of pr97171.  So just adding the testcase.

gcc/testsuite/
* g++.dg/template/local-var1.C: New.

4 years agoc++: restrict test to c++>=11 [pr97171]
Nathan Sidwell [Thu, 24 Sep 2020 18:34:10 +0000 (11:34 -0700)]
c++: restrict test to c++>=11 [pr97171]

I'd missed an important restriction on use of noexcept.  Fixed thusly

gcc/testsuite/
* g++.dg/template/local-fn4.C: Add target c++11

4 years agoruntime: remove __go_ptrace on AIX
Clément Chigot [Wed, 23 Sep 2020 14:08:21 +0000 (16:08 +0200)]
runtime: remove __go_ptrace on AIX

AIX ptrace syscalls doesn't have the same semantic than the glibc one.
The syscall package is already handling it correctly so disable the new
__go_ptrace C function for AIX.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/256777

4 years agolibstdc++: assert that type traits are not misused with incomplete types [PR 71579]
Antony Polukhin [Thu, 24 Sep 2020 17:51:37 +0000 (18:51 +0100)]
libstdc++: assert that type traits are not misused with incomplete types [PR 71579]

libstdc++-v3/ChangeLog:

PR libstdc++/71579
* include/std/type_traits (invoke_result, is_invocable)
(is_invocable_r, is_nothrow_invocable, is_nothrow_invocable_r):
Add static_asserts to make sure that the arguments of the type
traits are not misused with incomplete types.
* testsuite/20_util/invoke_result/incomplete_args_neg.cc: New test.
* testsuite/20_util/is_invocable/incomplete_args_neg.cc: New test.
* testsuite/20_util/is_invocable/incomplete_neg.cc: New test.
* testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc:
New test.
* testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc: Check
for error on incomplete type usage in trait.

4 years agolibstdc++: Specialize ranges::__detail::__box for semiregular types
Patrick Palka [Thu, 24 Sep 2020 16:58:39 +0000 (12:58 -0400)]
libstdc++: Specialize ranges::__detail::__box for semiregular types

The class template semiregular-box<T> defined in [range.semi.wrap] is
used by a number of views to accomodate non-semiregular subobjects
while ensuring that the overall view remains semiregular.  It provides
a stand-in default constructor, copy assignment operator and move
assignment operator whenever the underlying type lacks them.  The
wrapper derives from std::optional<T> to support default construction
when T is not default constructible.

It would be nice for this wrapper to essentially be a no-op when the
underlying type is already semiregular, but this is currently not the
case due to its use of std::optional<T>, which incurs space overhead
compared to storing just T.

To that end, this patch specializes the semiregular wrapper for
semiregular T.  Compared to the primary template, this specialization
uses less space, and it allows [[no_unique_address]] to optimize away
wrapped data members whose underlying type is empty and semiregular
(e.g. a non-capturing lambda).  This patch also applies
[[no_unique_address]] to the five data members that use the wrapper.

libstdc++-v3/ChangeLog:

* include/std/ranges (__detail::__boxable): Split out the
associated constraints of __box into here.
(__detail::__box): Use the __boxable concept.  Define a leaner
partial specialization for semiregular types.
(single_view::_M_value): Give it [[no_unique_address]].
(filter_view::_M_pred): Likewise.
(transform_view::_M_fun): Likewise.
(take_while_view::_M_pred): Likewise.
(drop_while_view::_M_pred):: Likewise.
* testsuite/std/ranges/adaptors/detail/semiregular_box.cc: New
test.

4 years agolibstdc++: Fix misnamed configure option in manual
Jonathan Wakely [Thu, 24 Sep 2020 16:33:16 +0000 (17:33 +0100)]
libstdc++: Fix misnamed configure option in manual

libstdc++-v3/ChangeLog:

* doc/xml/manual/configure.xml: Correct name of option.
* doc/html/*: Regenerate.

4 years agoarm: Add support for Neoverse N2 CPU
Alex Coplan [Thu, 24 Sep 2020 16:22:44 +0000 (17:22 +0100)]
arm: Add support for Neoverse N2 CPU

This adds support for Arm's Neoverse N2 CPU to the AArch32 backend.
Neoverse N2 builds AArch32 at EL0 and therefore needs support in AArch32
GCC.

gcc/ChangeLog:

* config/arm/arm-cpus.in (neoverse-n2): New.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* doc/invoke.texi: Document support for Neoverse N2.

4 years agoaarch64: Add support for Neoverse N2 CPU
Alex Coplan [Thu, 24 Sep 2020 16:21:47 +0000 (17:21 +0100)]
aarch64: Add support for Neoverse N2 CPU

This patch adds support for Arm's Neoverse N2 CPU to the AArch64
backend.

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def: Add Neoverse N2.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Document AArch64 support for Neoverse N2.

4 years agoadd move CTOR to auto_vec, use auto_vec for get_loop_exit_edges
Richard Biener [Thu, 6 Aug 2020 12:50:56 +0000 (14:50 +0200)]
add move CTOR to auto_vec, use auto_vec for get_loop_exit_edges

This adds a move CTOR to auto_vec<T, 0> and makes use of a
auto_vec<edge> return value for get_loop_exit_edges denoting
that lifetime management of the vector is handed to the caller.

The move CTOR prompted the hash_table change because it appearantly
makes the copy CTOR implicitely deleted (good) and hash-table
expansion of the odr_enum_map which is
hash_map <nofree_string_hash, odr_enum> where odr_enum has an
auto_vec<odr_enum_val, 0> member triggers this.  Not sure if
there's a latent bug there before this (I think we're not
invoking DTORs, but we're invoking copy-CTORs).

2020-08-06  Richard Biener  <rguenther@suse.de>

* vec.h (auto_vec<T, 0>::auto_vec (auto_vec &&)): New move CTOR.
(auto_vec<T, 0>::operator=(auto_vec &&)): Delete.
* hash-table.h (hash_table::expand): Use std::move when expanding.
* cfgloop.h (get_loop_exit_edges): Return auto_vec<edge>.
* cfgloop.c (get_loop_exit_edges): Adjust.
* cfgloopmanip.c (fix_loop_placement): Likewise.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ira-build.c (create_loop_tree_nodes): Likewise.
(create_loop_tree_node_allocnos): Likewise.
(loop_with_complex_edge_p): Likewise.
* ira-color.c (ira_loop_edge_freq): Likewise.
* loop-unroll.c (analyze_insns_in_loop): Likewise.
* predict.c (predict_loops): Likewise.
* tree-predcom.c (last_always_executed_block): Likewise.
* tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
* tree-ssa-loop-im.c (store_motion_loop): Likewise.
* tree-ssa-loop-ivcanon.c (loop_edge_to_cancel): Likewise.
(canonicalize_loop_induction_variables): Likewise.
* tree-ssa-loop-manip.c (get_loops_exits): Likewise.
* tree-ssa-loop-niter.c (find_loop_niter): Likewise.
(finite_loop_p): Likewise.
(find_loop_niter_by_eval): Likewise.
(estimate_numbers_of_iterations): Likewise.
* tree-ssa-loop-prefetch.c (emit_mfence_after_loop): Likewise.
(may_use_storent_in_loop_p): Likewise.

4 years agoc++: local-decls are never member fns [PR97186]
Nathan Sidwell [Thu, 24 Sep 2020 13:17:00 +0000 (06:17 -0700)]
c++: local-decls are never member fns [PR97186]

This fixes an ICE in noexcept instantiation.  It was presuming
functions always have template_info, but that changed with my
DECL_LOCAL_DECL_P changes.  Fortunately DECL_LOCAL_DECL_P fns are
never member fns, so we don't need to go fishing out a this pointer.

Also I realized I'd misnamed local10.C, so renaming it local-fn3.C,
and while there adding the effective-target lto that David E pointed
out was missing.

PR c++/97186
gcc/cp/
* pt.c (maybe_instantiate_noexcept): Local externs are never
member fns.
gcc/testsuite/
* g++.dg/template/local10.C: Rename ...
* g++.dg/template/local-fn3.C: .. here.  Require lto.
* g++.dg/template/local-fn4.C: New.

4 years agoAdd modref testcase
Jan Hubicka [Thu, 24 Sep 2020 13:10:04 +0000 (15:10 +0200)]
Add modref testcase

* gcc.dg/tree-ssa/modref-1.c: New test.

4 years agoAdd access through parameter derference tracking to modref
Jan Hubicka [Thu, 24 Sep 2020 13:09:17 +0000 (15:09 +0200)]
Add access through parameter derference tracking to modref

re-add tracking of accesses which was unfinished in David's patch.
At the moment I only implemented tracking of the fact that access is based on
derefernece of the parameter (so we track THIS pointers).
Patch does not implement IPA propagation since it needs bit more work which
I will post shortly: ipa-fnsummary needs to track when parameter points to
local memory, summaries needs to be merged when function is inlined (because
jump functions are) and propagation needs to be turned into iterative dataflow
on SCC components.

Patch also adds documentation of -fipa-modref and params that was left uncommited
in my branch :(.

Even without this change it does lead to nice increase of disambiguations
for cc1plus build.

Alias oracle query stats:
  refs_may_alias_p: 62758323 disambiguations, 72935683 queries
  ref_maybe_used_by_call_p: 139511 disambiguations, 63654045 queries
  call_may_clobber_ref_p: 23502 disambiguations, 29242 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 37654 queries
  nonoverlapping_refs_since_match_p: 19417 disambiguations, 55555 must overlaps, 75721 queries
  aliasing_component_refs_p: 54665 disambiguations, 752449 queries
  TBAA oracle: 21917926 disambiguations 53054678 queries
               15763411 are in alias set 0
               10162238 queries asked about the same object
               124 queries asked about the same alias set
               0 access volatile
               3681593 are dependent in the DAG
               1529386 are aritificially in conflict with void *

Modref stats:
  modref use: 8311 disambiguations, 32527 queries
  modref clobber: 742126 disambiguations, 1036986 queries
  1987054 tbaa queries (1.916182 per modref query)
  125479 base compares (0.121004 per modref query)

PTA query stats:
  pt_solution_includes: 968314 disambiguations, 13609584 queries
  pt_solutions_intersect: 1019136 disambiguations, 13147139 queries

So compared to
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554605.html
we get 41% more use disambiguations (with similar number of queries) and 8% more
clobber disambiguations.

For tramp3d:
Alias oracle query stats:
  refs_may_alias_p: 2052256 disambiguations, 2312703 queries
  ref_maybe_used_by_call_p: 7122 disambiguations, 2089118 queries
  call_may_clobber_ref_p: 234 disambiguations, 234 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 4299 queries
  nonoverlapping_refs_since_match_p: 329 disambiguations, 10200 must overlaps, 10616 queries
  aliasing_component_refs_p: 857 disambiguations, 34555 queries
  TBAA oracle: 885546 disambiguations 1677080 queries
               132105 are in alias set 0
               469030 queries asked about the same object
               0 queries asked about the same alias set
               0 access volatile
               190084 are dependent in the DAG
               315 are aritificially in conflict with void *

Modref stats:
  modref use: 426 disambiguations, 1881 queries
  modref clobber: 10042 disambiguations, 16202 queries
  19405 tbaa queries (1.197692 per modref query)
  2775 base compares (0.171275 per modref query)

PTA query stats:
  pt_solution_includes: 313908 disambiguations, 526183 queries
  pt_solutions_intersect: 130510 disambiguations, 416084 queries

Here uses decrease by 4 disambiguations and clobber improve by 3.5%.  I think
the difference is caused by fact that gcc has much more alias set 0 accesses
originating from gimple and tree unions as I mentioned in original mail.

After pushing out the IPA propagation I will re-add code to track offsets and
sizes that further improve disambiguation. On tramp3d it enables a lot of DSE
for structure fields not acessed by uninlined function.

gcc/

* doc/invoke.texi: Document -fipa-modref, ipa-modref-max-bases,
ipa-modref-max-refs, ipa-modref-max-accesses, ipa-modref-max-tests.
* ipa-modref-tree.c (test_insert_search_collapse): Update.
(test_merge): Update.
(gt_ggc_mx): New function.
* ipa-modref-tree.h (struct modref_access_node): New structure.
(struct modref_ref_node): Add every_access and accesses array.
(modref_ref_node::modref_ref_node): Update ctor.
(modref_ref_node::search): New member function.
(modref_ref_node::collapse): New member function.
(modref_ref_node::insert_access): New member function.
(modref_base_node::insert_ref): Do not collapse base if ref is 0.
(modref_base_node::collapse): Copllapse also refs.
(modref_tree): Add accesses.
(modref_tree::modref_tree): Initialize max_accesses.
(modref_tree::insert): Add access parameter.
(modref_tree::cleanup): New member function.
(modref_tree::merge): Add parm_map; merge accesses.
(modref_tree::copy_from): New member function.
(modref_tree::create_ggc): Add max_accesses.
* ipa-modref.c (dump_access): New function.
(dump_records): Dump accesses.
(dump_lto_records): Dump accesses.
(get_access): New function.
(record_access): Record access.
(record_access_lto): Record access.
(analyze_call): Compute parm_map.
(analyze_function): Update construction of modref records.
(modref_summaries::duplicate): Likewise; use copy_from.
(write_modref_records): Stream accesses.
(read_modref_records): Sream accesses.
(pass_ipa_modref::execute): Update call of merge.
* params.opt (-param=modref-max-accesses): New.
* tree-ssa-alias.c (alias_stats): Add modref_baseptr_tests.
(dump_alias_stats): Update.
(base_may_alias_with_dereference_p): New function.
(modref_may_conflict): Check accesses.
(ref_maybe_used_by_call_p_1): Update call to modref_may_conflict.
(call_may_clobber_ref_p_1): Update call to modref_may_conflict.

4 years ago[testsuite, nvptx] Fix gcc.dg/tls/thr-cse-1.c
Tom de Vries [Thu, 24 Sep 2020 12:07:42 +0000 (14:07 +0200)]
[testsuite, nvptx] Fix gcc.dg/tls/thr-cse-1.c

With nvptx, we run into:
...
FAIL: gcc.dg/tls/thr-cse-1.c scan-assembler-not \
  emutls_get_address.*emutls_get_address.*
...
because the nvptx assembly looks like:
...
  call (%value_in), __emutls_get_address, (%out_arg1);
  ...
// BEGIN GLOBAL FUNCTION DECL: __emutls_get_address
.extern .func (.param.u64 %value_out) __emutls_get_address (.param.u64 %in_ar0);
...

Fix this by checking the slim final dump instead, where we have just:
...
   12: r35:DI=call [`__emutls_get_address'] argc:0
...

gcc/testsuite/ChangeLog:

2020-09-24  Tom de Vries  <tdevries@suse.de>

* gcc.dg/tls/thr-cse-1.c: Scan final dump instead of assembly for
nvptx.

4 years ago[testsuite] Scan final instead of asm in independent-cloneids-1.c
Tom de Vries [Thu, 24 Sep 2020 11:30:11 +0000 (13:30 +0200)]
[testsuite] Scan final instead of asm in independent-cloneids-1.c

When running test-case gcc.dg/independent-cloneids-1.c for nvptx, we get:
...
FAIL: scan-assembler-times (?n)^_*bar[.$_]constprop[.$_]0: 1
FAIL: scan-assembler-times (?n)^_*bar[.$_]constprop[.$_]1: 1
FAIL: scan-assembler-times (?n)^_*bar[.$_]constprop[.$_]2: 1
FAIL: scan-assembler-times (?n)^_*foo[.$_]constprop[.$_]0: 1
FAIL: scan-assembler-times (?n)^_*foo[.$_]constprop[.$_]1: 1
FAIL: scan-assembler-times (?n)^_*foo[.$_]constprop[.$_]2: 1
...

The test expects to find something like:
...
bar.constprop.0:
...
but instead on nvptx we have:
...
.func (.param.u32 %value_out) bar$constprop$0
...

Fix this by rewriting the scans to use the final dump instead.

Tested on x86_64.

gcc/testsuite/ChangeLog:

2020-09-24  Tom de Vries  <tdevries@suse.de>

* gcc.dg/independent-cloneids-1.c: Use scan-rtl-dump instead of
scan-assembler.

4 years agotarget/97192 - new testcase for fixed PR
Richard Biener [Thu, 24 Sep 2020 11:27:49 +0000 (13:27 +0200)]
target/97192 - new testcase for fixed PR

This adds another testcase for the PR97085 fix.

2020-09-24  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97085
* gcc.dg/pr97192.c: New testcase.

4 years agoThis patch fixes PR96495 - frees result components outside loop.
Paul Thomas [Thu, 24 Sep 2020 10:52:30 +0000 (11:52 +0100)]
This patch fixes PR96495 - frees result components outside loop.

2020-24-09  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
PR fortran/96495
* trans-expr.c (gfc_conv_procedure_call): Take the deallocation
of allocatable result components of a scalar result outside the
scalarization loop. Find and use the stored result.

gcc/testsuite/
PR fortran/96495
* gfortran.dg/alloc_comp_result_2.f90 : New test.

4 years ago[testsuite, nvptx] Fix string matching in gcc.dg/pr87314-1.c
Tom de Vries [Thu, 24 Sep 2020 10:22:13 +0000 (12:22 +0200)]
[testsuite, nvptx] Fix string matching in gcc.dg/pr87314-1.c

with nvptx we run into:
...
FAIL: gcc.dg/pr87314-1.c scan-assembler hellooo
...

The required string is part of the assembly, just in a different format than
expected:
...
        .const .align 1 .u8 $LC0[12] =
  { 104, 101, 108, 108, 111, 111, 111, 111, 98, 121, 101, 0 };
...

Fix this by adding an nvptx-specific scan-assembler directive.

Tested on nvptx and x86_64.

gcc/testsuite/ChangeLog:

2020-09-24  Tom de Vries  <tdevries@suse.de>

* gcc.dg/pr87314-1.c: Add nvptx-specific scan-assembler directive.

4 years agoarm: Add a couple of extra stack-protector tests
Richard Sandiford [Thu, 24 Sep 2020 09:06:11 +0000 (10:06 +0100)]
arm: Add a couple of extra stack-protector tests

These tests were inspired by corresponding aarch64 ones.
They already pass.

gcc/testsuite/
* gcc.target/arm/stack-protector-5.c: New test.
* gcc.target/arm/stack-protector-6.c: Likewise.

4 years agoarm: Fix canary address calculation for non-PIC
Richard Sandiford [Thu, 24 Sep 2020 09:06:11 +0000 (10:06 +0100)]
arm: Fix canary address calculation for non-PIC

For non-PIC, the stack protector patterns did:

  rtx mem = XEXP (force_const_mem (SImode, operands[1]), 0);
  emit_move_insn (operands[2], mem);

Here, operands[1] is the address of the canary (&__stack_chk_guard)
and operands[2] is the register that we want to move that address into.
However, the code above instead sets operands[2] to the address of a
constant pool entry that contains &__stack_chk_guard, rather than to
&__stack_chk_guard itself.  The sequence therefore does one less
pointer indirection than it should.

The net effect was to use &__stack_chk_guard for stack-smash detection,
instead of using __stack_chk_guard itself.

gcc/
* config/arm/arm.md (*stack_protect_combined_set_insn): For non-PIC,
load the address of the canary rather than the address of the
constant pool entry that points to it.
(*stack_protect_combined_test_insn): Likewise.

gcc/testsuite/
* gcc.target/arm/stack-protector-3.c: New test.
* gcc.target/arm/stack-protector-4.c: Likewise.

4 years agotree-optimization/97085 - fold some trivial bool vector ?:
Richard Biener [Thu, 24 Sep 2020 08:14:33 +0000 (10:14 +0200)]
tree-optimization/97085 - fold some trivial bool vector ?:

The following aovids the ICE in the testcase by doing some additional
simplification of VEC_COND_EXPRs for VECTOR_BOOLEAN_TYPE_P which
we don't really expect, esp. when they are not classical vectors,
thus AVX512 or SVE masks.

2020-09-24  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97085
* match.pd (mask ? { false,..} : { true, ..} -> ~mask): New.

* gcc.dg/vect/pr97085.c: New testcase.

4 years ago[testsuite] Require non_strict_align in pr94600-{1,3}.c
Tom de Vries [Thu, 24 Sep 2020 08:03:10 +0000 (10:03 +0200)]
[testsuite] Require non_strict_align in pr94600-{1,3}.c

With the nvptx target, we run into:
...
FAIL: gcc.dg/pr94600-1.c scan-rtl-dump-times final "\\(mem/v" 6
FAIL: gcc.dg/pr94600-1.c scan-rtl-dump-times final "\\(set \\(mem/v" 6
FAIL: gcc.dg/pr94600-3.c scan-rtl-dump-times final "\\(mem/v" 1
FAIL: gcc.dg/pr94600-3.c scan-rtl-dump-times final "\\(set \\(mem/v" 1
...
The scans attempt to check for volatile stores, but on nvptx we have memcpy
instead.

This is due to nvptx being a STRICT_ALIGNMENT target, which has the effect
that the TYPE_MODE for the store target is set to BKLmode in
compute_record_mode.

Fix the FAILs by requiring effective target non_strict_align.

Tested on nvptx.

gcc/testsuite/ChangeLog:

2020-09-24  Tom de Vries  <tdevries@suse.de>

* gcc.dg/pr94600-1.c: Require effective target non_strict_align for
scan-rtl-dump-times.
* gcc.dg/pr94600-3.c: Same.

4 years agoFix memory allocations in ipa-modref.
Jan Hubicka [Thu, 24 Sep 2020 06:28:09 +0000 (08:28 +0200)]
Fix memory allocations in ipa-modref.

Pair ggc_delete with ggc_alloc_no_dtor.  I copy same scheme as used by Martin
in ipa-fnsummary, that is creating a static member function create_ggc hidding
the ugly bits and using it in ipa-modref.c.

I also noticed that modref-tree leaks memory on destruction/collapse method and
fixed that.

Bootstrapped/regtested x86_64-linux.

gcc/ChangeLog:

2020-09-24  Jan Hubicka  <hubicka@ucw.cz>

* ipa-modref-tree.h (modref_base::collapse): Release memory.
(modref_tree::create_ggc): New member function.
(modref_tree::colapse): Release memory.
(modref_tree::~modref_tree): New destructor.
* ipa-modref.c (modref_summaries::create_ggc): New function.
(analyze_function): Use create_ggc.
(modref_summaries::duplicate): Likewise.
(read_modref_records): Likewise.
(modref_read): Likewise.

4 years ago[testsuite] Check target alias in builtin-has-attribute-3.c
Tom de Vries [Thu, 24 Sep 2020 06:02:29 +0000 (08:02 +0200)]
[testsuite] Check target alias in builtin-has-attribute-3.c

When running test-case c-c++-common/builtin-has-attribute-3.c on nvptx, I get:
...
FAIL: c-c++-common/builtin-has-attribute-3.c  -Wc++-compat \
  (test for excess errors)
Excess errors:
src/gcc/testsuite/c-c++-common/builtin-has-attribute-3.c:33:33: error: \
  alias definitions not supported in this configuration
...

Fix this by adding -DSKIP_ALIAS to the compilation options for effective
target ! alias.

Tested on nvptx.

gcc/testsuite/ChangeLog:

* c-c++-common/builtin-has-attribute-3.c: Compile with -DSKIP_ALIAS
for effective target ! alias.

4 years agotest: Adjust case p9-vec-length-full-6.c [PR97075]
Kewen Lin [Thu, 24 Sep 2020 05:40:47 +0000 (00:40 -0500)]
test: Adjust case p9-vec-length-full-6.c [PR97075]

The commit r11-3230 brings a nice improvement to use full
vectors instead of partial vectors when available.  This
patch is to fix the test failures on p9-vec-length-full-6.c,
where 64bit/32bit pairs are able to use full vector instead.

Bootstrapped/regtested on powerpc64le-linux-gnu P9.

gcc/testsuite/ChangeLog:

PR tree-optimization/97075
* gcc.target/powerpc/p9-vec-length-full-6.c: Adjust.

4 years agoRe: [RS6000] Power10 libffi fixes
Alan Modra [Thu, 24 Sep 2020 05:28:53 +0000 (14:58 +0930)]
Re: [RS6000] Power10 libffi fixes

Adding a nop broke ffi_closure_LINUX64!

* src/powerpc/linux64_closure.S (ffi_closure_LINUX64): Correct
location of .Lret.

4 years ago[RS6000] rs6000_rtx_costs for PLUS/MINUS constant
Alan Modra [Thu, 18 Jun 2015 10:49:55 +0000 (20:19 +0930)]
[RS6000] rs6000_rtx_costs for PLUS/MINUS constant

These functions do behave a little differently for SImode, so the
mode should be passed.

* config/rs6000/rs6000.c (rs6000_rtx_costs): Pass mode to
reg_or_add_cint_operand and reg_or_sub_cint_operand.

4 years ago[RS6000] Count rldimi constant insns
Alan Modra [Wed, 1 Apr 2020 03:04:47 +0000 (13:34 +1030)]
[RS6000] Count rldimi constant insns

rldimi is generated by rs6000_emit_set_long_const when the high and
low 32 bits of a 64-bit constant are equal.

PR target/93012
* config/rs6000/rs6000.c (num_insns_constant_gpr): Count rldimi
constants correctly.

4 years ago[RS6000] Power10 libffi fixes
Alan Modra [Fri, 18 Sep 2020 13:51:05 +0000 (23:21 +0930)]
[RS6000] Power10 libffi fixes

Power10 pc-relative code doesn't use or preserve r2 as a TOC pointer.
That means calling between pc-relative and TOC using code can't be
done without intervening linker stubs, and a call from TOC code to
pc-relative code must have a nop after the bl in order to restore r2.

Now the PowerPC libffi assembly code doesn't use r2 except for the
implicit use when making calls back to C, ffi_closure_helper_LINUX64
and ffi_prep_args64.  So changing the assembly to interoperate with
pc-relative code without stubs is easily done.

* src/powerpc/linux64.S (ffi_call_LINUX64): Don't emit global
entry when __PCREL__.  Call using @notoc.  Add nops.
* src/powerpc/linux64_closure.S (ffi_closure_LINUX64): Likewise.
(ffi_go_closure_linux64): Likewise.

4 years ago[RS6000] Built-in __PCREL__ define
Alan Modra [Wed, 23 Sep 2020 10:45:39 +0000 (20:15 +0930)]
[RS6000] Built-in __PCREL__ define

Useful in assembly to know details of power10 function calls.

* config/rs6000/rs6000-c.c (rs6000_target_modify_macros):
Conditionally define __PCREL__.

4 years ago[RS6000] PR97107, libgo fails to build for power10
Alan Modra [Fri, 18 Sep 2020 13:33:11 +0000 (23:03 +0930)]
[RS6000] PR97107, libgo fails to build for power10

Calls from split-stack code to non-split-stack code need to expand
mapped stack memory via __morestack.  Even tail calls.

__morestack is quite a surprising function on powerpc in that it calls
back to its caller, and a tail call will continue running in the
context of extra mapped stack.

PR target/97107
* config/rs6000/rs6000-internal.h (struct rs6000_stack): Improve
calls_p comment.
* config/rs6000/rs6000-logue.c (rs6000_stack_info): Likewise.
(rs6000_expand_split_stack_prologue): Emit the prologue for
functions that make a sibling call.

4 years agoanalyzer: add testcases for PR 93355 (intl/localealias.c leak)
David Malcolm [Thu, 19 Dec 2019 21:15:09 +0000 (16:15 -0500)]
analyzer: add testcases for PR 93355 (intl/localealias.c leak)

PR analyzer/93355 reports a missing diagnostic about a FILE leak in
intl/localealias.c.  This appears to be due to a issue in the
feasibility-checking code, though there is also a state explosion.

This patch adds test cases that I've been using when investigating this,
two of them currently requiring -fno-analyzer-feasibility, and one
currently requiring -Wno-analyzer-too-complex.

gcc/testsuite/ChangeLog:
PR analyzer/93355
* gcc.dg/analyzer/pr93355-localealias-feasibility.c: New test.
* gcc.dg/analyzer/pr93355-localealias-simplified.c: New test.
* gcc.dg/analyzer/pr93355-localealias.c: New test.

4 years agoanalyzer: add -fno-analyzer-feasibility
David Malcolm [Wed, 23 Sep 2020 10:55:51 +0000 (06:55 -0400)]
analyzer: add -fno-analyzer-feasibility

This patch provides a new option "-fno-analyzer-feasibility" as a way
to disable feasibility-checking of the constraints along the control
flow paths for -fanalyzer diagnostics.  I'm adding this in the hope of
making it easier to debug issues involving the feasibility-checking
logic.

The patch adds a new rejected_constraint object which is captured if
exploded_path::feasible_p fails, and adds logic that uses this to emit
an additional custom_event within the checker_path for the diagnostic,
showing where in the control flow path the diagnostic would have been
rejected, and giving details of why.

gcc/analyzer/ChangeLog:
* analyzer.h (struct rejected_constraint): New decl.
* analyzer.opt (fanalyzer-feasibility): New option.
* diagnostic-manager.cc (path_builder::path_builder): Add
"problem" param and use it to initialize new field.
(path_builder::get_feasibility_problem): New accessor.
(path_builder::m_feasibility_problem): New field.
(dedupe_winners::add): Remove inversion of logic in "if" clause,
swapping if/else suites.  In the !feasible_p suite, inspect
flag_analyzer_feasibility and add code to handle when this
is off, accepting the infeasible path, but recording the
feasibility_problem.
(diagnostic_manager::emit_saved_diagnostic): Pass the
feasibility_problem to the path_builder.
(diagnostic_manager::add_events_for_eedge): If we have
a feasibility_problem at this edge, use it to add a custom event.
* engine.cc (exploded_path::feasible_p): Pass a
rejected_constraint ** to model.maybe_update_for_edge and transfer
ownership of any created instance to any feasibility_problem.
(feasibility_problem::dump_to_pp): New.
* exploded-graph.h (feasibility_problem::feasibility_problem):
Drop "model" param; add rejected_constraint * param.
(feasibility_problem::~feasibility_problem): New.
(feasibility_problem::dump_to_pp): New decl.
(feasibility_problem::m_model): Drop field.
(feasibility_problem::m_rc): New field.
* program-point.cc (function_point::get_location): Handle
PK_BEFORE_SUPERNODE and PK_AFTER_SUPERNODE.
* program-state.cc (program_state::on_edge): Pass NULL to new
param of region_model::maybe_update_for_edge.
* region-model.cc (region_model::add_constraint): New overload
adding a rejected_constraint ** param.
(region_model::maybe_update_for_edge): Add rejected_constraint **
param and pass it to the various apply_constraints_for_ calls.
(region_model::apply_constraints_for_gcond): Add
rejected_constraint ** param and pass it to add_constraint calls.
(region_model::apply_constraints_for_gswitch): Likewise.
(region_model::apply_constraints_for_exception): Likewise.
(rejected_constraint::dump_to_pp): New.
* region-model.h (region_model::maybe_update_for_edge):
Add rejected_constraint ** param.
(region_model::add_constraint): New overload adding a
rejected_constraint ** param.
(region_model::apply_constraints_for_gcond): Add
rejected_constraint ** param.
(region_model::apply_constraints_for_gswitch): Likewise.
(region_model::apply_constraints_for_exception): Likewise.
(struct rejected_constraint): New.

gcc/ChangeLog:
* doc/analyzer.texi (Analyzer Paths): Add note about
-fno-analyzer-feasibility.
* doc/invoke.texi (Static Analyzer Options): Add
-fno-analyzer-feasibility.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/feasibility-2.c: New test.

4 years agolibgo: update to Go1.15.2 release
Ian Lance Taylor [Wed, 23 Sep 2020 03:30:08 +0000 (20:30 -0700)]
libgo: update to Go1.15.2 release

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/256618

4 years agoDaily bump.
GCC Administrator [Thu, 24 Sep 2020 00:16:31 +0000 (00:16 +0000)]
Daily bump.

4 years agors6000: Add 'd' for doubleword variant of vector insert
Paul A. Clarke [Wed, 23 Sep 2020 16:59:26 +0000 (11:59 -0500)]
rs6000: Add 'd' for doubleword variant of vector insert

When the "Vector Insert" section was added to the documentation,
the doubleword ('d') variant was omitted.  Add it.

2020-09-23  Paul A. Clarke  <pc@us.ibm.com>

gcc/
* doc/extend.texi: Add 'd' for doubleword variant of
vector insert instruction.

4 years agoBuild a zero element array type that reliably renders as T[0] in diagnostcs.
Martin Sebor [Wed, 23 Sep 2020 21:19:13 +0000 (15:19 -0600)]
Build a zero element array type that reliably renders as T[0] in diagnostcs.

gcc/ChangeLog:

* gimple-array-bounds.cc (build_zero_elt_array_type): New function.
(array_bounds_checker::check_mem_ref): Call it.

4 years agoHandle DECLs and EXPRESSIONs consistently (PR middle-end/97175).
Martin Sebor [Wed, 23 Sep 2020 21:04:32 +0000 (15:04 -0600)]
Handle DECLs and EXPRESSIONs consistently (PR middle-end/97175).

gcc/ChangeLog:

PR middle-end/97175
* builtins.c (maybe_warn_for_bound): Handle both DECLs and EXPRESSIONs
in pad->dst.ref, same is pad->src.ref.

gcc/testsuite/ChangeLog:

PR middle-end/97175
* gcc.dg/Wstringop-overflow-44.c: New test.

4 years agoCleanup modref interfaces.
Jan Hubicka [Wed, 23 Sep 2020 21:06:05 +0000 (23:06 +0200)]
Cleanup modref interfaces.

* ipa-fnsummary.c (refs_local_or_readonly_memory_p): New function.
(points_to_local_or_readonly_memory_p): New function.
* ipa-fnsummary.h (refs_local_or_readonly_memory_p): Declare.
(points_to_local_or_readonly_memory_p): Declare.
* ipa-modref.c (record_access_p): Use refs_local_or_readonly_memory_p.
* ipa-pure-const.c (check_op): Likewise.

* gcc.dg/tree-ssa/local-pure-const.c: Update template.

4 years agoAvoid assuming input corresponds to valid source code (PR c/97131).
Martin Sebor [Wed, 23 Sep 2020 21:02:01 +0000 (15:02 -0600)]
Avoid assuming input corresponds to valid source code (PR c/97131).

gcc/c-family/ChangeLog:

PR c/97131
* c-warn.c (warn_parm_ptrarray_mismatch): Handle more invalid input.

gcc/testsuite/ChangeLog:

PR c/97131
* gcc.dg/Warray-parameter-6.c: New test.

4 years ago[nvptx] Split up function ref plus const
Tom de Vries [Wed, 23 Sep 2020 15:35:23 +0000 (17:35 +0200)]
[nvptx] Split up function ref plus const

With test-case gcc.c-torture/compile/pr92231.c, we run into:
...
nvptx-as: ptxas terminated with signal 11 [Segmentation fault], core dumped^M
compiler exited with status 1
FAIL: gcc.c-torture/compile/pr92231.c   -O0  (test for excess errors)
...
due to using a function reference plus constant as operand:
...
  mov.u64 %r24,bar+4096';
...

Fix this by splitting such an insn into:
...
  mov.u64 %r24,bar';
  add.u64 %r24,%r24,4096';
...

Tested on nvptx.

gcc/ChangeLog:

* config/nvptx/nvptx.md: Don't allow operand containing sum of
function ref and const.

4 years agoaarch64: Prevent canary address being spilled to stack
Richard Sandiford [Wed, 23 Sep 2020 18:25:04 +0000 (19:25 +0100)]
aarch64: Prevent canary address being spilled to stack

This patch fixes the equivalent of arm bug PR85434/CVE-2018-12886
for aarch64: under high register pressure, the -fstack-protector
code might spill the address of the canary onto the stack and
reload it at the test site, giving an attacker the opportunity
to change the expected canary value.

This would happen in two cases:

- when generating PIC for -mstack-protector-guard=global
  (tested by stack-protector-6.c).  This is a direct analogue
  of PR85434, which was also about PIC for the global case.

- when using -mstack-protector-guard=sysreg.

The two problems were really separate bugs and caused by separate code,
but it was more convenient to fix them together.

The post-patch code still spills _GLOBAL_OFFSET_TABLE_ for
stack-protector-6.c, which is a more general problem.  However,
it no longer spills the canary address itself.

The patch also fixes an ICE when using -mstack-protector-guard=sysreg
with ILP32: even if the register read is SImode, the address
calculation itself should still be DImode.

gcc/
* config/aarch64/aarch64-protos.h (aarch64_salt_type): New enum.
(aarch64_stack_protect_canary_mem): Declare.
* config/aarch64/aarch64.md (UNSPEC_SALT_ADDR): New unspec.
(stack_protect_set): Forward to stack_protect_combined_set.
(stack_protect_combined_set): New pattern.  Use
aarch64_stack_protect_canary_mem.
(reg_stack_protect_address_<mode>): Add a salt operand.
(stack_protect_test): Forward to stack_protect_combined_test.
(stack_protect_combined_test): New pattern.  Use
aarch64_stack_protect_canary_mem.
* config/aarch64/aarch64.c (strip_salt): New function.
(strip_offset_and_salt): Likewise.
(tls_symbolic_operand_type): Use strip_offset_and_salt.
(aarch64_stack_protect_canary_mem): New function.
(aarch64_cannot_force_const_mem): Use strip_offset_and_salt.
(aarch64_classify_address): Likewise.
(aarch64_symbolic_address_p): Likewise.
(aarch64_print_operand): Likewise.
(aarch64_output_addr_const_extra): New function.
(aarch64_tls_symbol_p): Use strip_salt.
(aarch64_classify_symbol): Likewise.
(aarch64_legitimate_pic_operand_p): Use strip_offset_and_salt.
(aarch64_legitimate_constant_p): Likewise.
(aarch64_mov_operand_p): Use strip_salt.
(TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA): Override.

gcc/testsuite/
* gcc.target/aarch64/stack-protector-5.c: New test.
* gcc.target/aarch64/stack-protector-6.c: Likewise.
* gcc.target/aarch64/stack-protector-7.c: Likewise.

4 years agoaarch64: Add a couple of extra stack-protector tests
Richard Sandiford [Wed, 23 Sep 2020 18:21:56 +0000 (19:21 +0100)]
aarch64: Add a couple of extra stack-protector tests

These tests were inspired by corresponding arm ones.  They already pass.

gcc/testsuite/
* gcc.target/aarch64/stack-protector-3.c: New test.
* gcc.target/aarch64/stack-protector-4.c: Likewise.

4 years agoanalyzer: fix member call on null seen with ubsan [PR97178]
David Malcolm [Wed, 23 Sep 2020 15:18:43 +0000 (11:18 -0400)]
analyzer: fix member call on null seen with ubsan [PR97178]

gcc/analyzer/ChangeLog:
PR analyzer/97178
* engine.cc (impl_run_checkers): Update for change to ext_state
ctor.
* program-state.cc (selftest::test_sm_state_map): Pass an engine
instance to ext_state ctor.
(selftest::test_program_state_1): Likewise.
(selftest::test_program_state_2): Likewise.
(selftest::test_program_state_merging): Likewise.
(selftest::test_program_state_merging_2): Likewise.
* program-state.h (extrinsic_state::extrinsic_state): Remove NULL
default value for "eng" param.

4 years agoAArch64: Implement missing p128<->f64 reinterpret intrinsics
Kyrylo Tkachov [Wed, 23 Sep 2020 16:37:58 +0000 (17:37 +0100)]
AArch64: Implement missing p128<->f64 reinterpret intrinsics

This patch implements the missing reinterprets to and from poly128_t and
float64x2_t.
I've plugged in the appropriate testing in the advsimd-intrinsics.exp
too.

Bootstrapped and tested on aarch64-none-linux-gnu.
Tested advsimd-intrinsics.exp on arm-none-eabi too to make sure arm
testing isn't affected.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vreinterpretq_f64_p128,
vreinterpretq_p128_f64): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
(clean_results): Add float64x2_t cleanup.
(DECL_VARIABLE_128BITS_VARIANTS): Add float64x2_t variable.
* gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c: Add
testing of vreinterpretq_f64_p128, vreinterpretq_p128_f64.

4 years agoc++: Remove some gratuitous typedefing
Nathan Sidwell [Wed, 23 Sep 2020 15:13:25 +0000 (08:13 -0700)]
c++: Remove some gratuitous typedefing

This is C++, we don't need 'typedef struct foo foo;'. Oh, and bool
bitfields are a thing.

gcc/cp/
* name-lookup.h (typedef cxx_binding): Delete tdef.
(typedef cp_binding_level): Likewise.
(struct cxx_binding): Flags are bools.

4 years agoarm: Add support for Neoverse V1 CPU
Alex Coplan [Wed, 23 Sep 2020 14:21:00 +0000 (15:21 +0100)]
arm: Add support for Neoverse V1 CPU

This adds support for Arm's Neoverse V1 CPU to the AArch32 backend.

---

gcc/ChangeLog:

* config/arm/arm-cpus.in (neoverse-v1): New.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* doc/invoke.texi: Document support for Neoverse V1.

4 years agoaarch64: Add support for Neoverse V1 CPU
Alex Coplan [Wed, 23 Sep 2020 14:20:19 +0000 (15:20 +0100)]
aarch64: Add support for Neoverse V1 CPU

This adds support for Arm's Neoverse V1 CPU to the AArch64 backend.

---

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def: Add Neoverse V1.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Document support for Neoverse V1.

4 years agoc++: dependent local extern decl ICE [PR97171]
Nathan Sidwell [Wed, 23 Sep 2020 14:01:10 +0000 (07:01 -0700)]
c++: dependent local extern decl ICE [PR97171]

I'd missed the piece of substutution for the uses of a local extern
decl.  Just grab the local specialization.  We need to do this
regardless of dependentness because we always cloned the local extern.

PR c++/97171
gcc/cp/
* pt.c (tsubst_copy) [FUNCTION_DECL,VAR_DECL]: Retrieve local
specialization for DECL_LOCAL_P decls.
gcc/testsuite/
* g++.dg/template/local10.C: New.

4 years agoc: Fix -Wduplicated-branches ICE [PR97125]
Marek Polacek [Sun, 20 Sep 2020 20:11:00 +0000 (16:11 -0400)]
c: Fix -Wduplicated-branches ICE [PR97125]

We crash here because since r11-3302 the C FE uses codes like SWITCH_STMT
in the else branches in the attached test, and inchash::add_expr in
do_warn_duplicated_branches doesn't handle these front-end codes.  In
the C++ FE this works because by the time we get to do_warn_duplicated_branches
we've already cp_genericize'd the SWITCH_STMT tree into a SWITCH_EXPR.

The fix is to call do_warn_duplicated_branches_r only after loops and other
structured control constructs have been lowered.

gcc/c-family/ChangeLog:

PR c/97125
* c-gimplify.c (c_genericize): Only call do_warn_duplicated_branches_r
after loops and other structured control constructs have been lowered.

gcc/testsuite/ChangeLog:

PR c/97125
* c-c++-common/Wduplicated-branches-15.c: New test.

4 years agomiddle-end/96453 - relax gimple_expand_vec_cond_expr
Richard Biener [Wed, 23 Sep 2020 13:03:31 +0000 (15:03 +0200)]
middle-end/96453 - relax gimple_expand_vec_cond_expr

This relaxes the condition under which we also try NE_EXPR
for a fake generated compare in addition to LT_EXPR given
the fact the verification ICEd when it failed but obviously
was only implemented for constants.  Thus the patch removes
the verification and the restriction to constant operands.

2020-09-23  Richard Biener  <rguenther@suse.de>

PR middle-end/96453
* gimple-isel.cc (gimple_expand_vec_cond_expr): Remove
LT_EXPR -> NE_EXPR verification and also apply it for
non-constant masks.

* gcc.dg/pr96453.c: New testcase.

4 years agoMinor modref optimization and statistics fix
Jan Hubicka [Wed, 23 Sep 2020 13:12:18 +0000 (15:12 +0200)]
Minor modref optimization and statistics fix

this patch fixes bug in tracking memory stats and also I have noticed that while
the pass takes care to stop traking things when things are obviously out of hand
it still keeps summaries that have no useful info for loads or stores and also
many summaries are just copying const/pure attributes.  This patch thus also
adds logic to detect if summary is useful and drop it early otherwise.  This
reduces number of queries to the oracle and saves memory/lto streaming.

For cc1plus LTO build (configured with --disable-plugin
--enable-checking=release --with-build-config=lto) I now get:

Alias oracle query stats:
  refs_may_alias_p: 62488734 disambiguations, 72660949 queries
  ref_maybe_used_by_call_p: 128863 disambiguations, 63393551 queries
  call_may_clobber_ref_p: 16013 disambiguations, 21776 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 37628 queries
  nonoverlapping_refs_since_match_p: 19397 disambiguations, 55370 must overlaps, 75516 queries
  aliasing_component_refs_p: 54741 disambiguations, 752198 queries
  TBAA oracle: 21632692 disambiguations 52565147 queries
               15656420 are in alias set 0
               10108172 queries asked about the same object
               124 queries asked about the same alias set
               0 access volatile
               3640460 are dependent in the DAG
               1527279 are aritificially in conflict with void *

Modref stats:
  modref use: 5712 disambiguations, 31221 queries
  modref clobber: 684316 disambiguations, 1010000 queries
  1779717 tbaa queries (1.762096 per modref query)

PTA query stats:
  pt_solution_includes: 947334 disambiguations, 13601373 queries
  pt_solutions_intersect: 1011662 disambiguations, 13139565 queries

The number of queries should change, but the number of disambiguations should
not.  However comparing with stats here
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554309.html
I see about 50% drop in clobber disambiguations. There is however same drop in
other alias oracle stats.  I suppose someting changed in meanwhile on mainline
because I was basing that on older tree.  I tried to proofread changes between
mainline and branch and they seem all quite obvious.

This is consistent with what I get on tramp3d:

Alias oracle query stats:
  refs_may_alias_p: 2051320 disambiguations, 2312132 queries
  ref_maybe_used_by_call_p: 7058 disambiguations, 2088222 queries
  call_may_clobber_ref_p: 232 disambiguations, 232 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 4339 queries
  nonoverlapping_refs_since_match_p: 329 disambiguations, 10200 must overlaps, 10616 queries
  aliasing_component_refs_p: 857 disambiguations, 34639 queries
  TBAA oracle: 886768 disambiguations 1670635 queries
               131572 are in alias set 0
               461689 queries asked about the same object
               0 queries asked about the same alias set
               0 access volatile
               190291 are dependent in the DAG
               315 are aritificially in conflict with void *

Modref stats:
  modref use: 430 disambiguations, 1885 queries
  modref clobber: 9657 disambiguations, 16076 queries
  19027 tbaa queries (1.183566 per modref query)

PTA query stats:
  pt_solution_includes: 311756 disambiguations, 524179 queries
  pt_solutions_intersect: 129689 disambiguations, 415878 queries

In both cases the number of disambiguations should be same (queries are not
comparable).

Bootstrapped/regtested x86_64-linux, comitted.

gcc/ChangeLog:

2020-09-23  Jan Hubicka  <hubicka@ucw.cz>

* ipa-modref.c (modref_summary::lto_useful_p): New member function.
(modref_summary::useful_p): New member function.
(analyze_function): Drop useless summaries.
(modref_write): Skip useless summaries.
(pass_ipa_modref::execute): Drop useless summaries.
* ipa-modref.h (struct GTY): Declare useful_p and lto_useful_p.
* tree-ssa-alias.c (dump_alias_stats): Fix.
(modref_may_conflict): Fix stats.

4 years agomiddle-end/96466 - fix VEC_COND isel/expansion issue
Richard Biener [Wed, 23 Sep 2020 12:20:44 +0000 (14:20 +0200)]
middle-end/96466 - fix VEC_COND isel/expansion issue

We need to avoid forcing BLKmode for truth vectors, instead do as
other code and use VOIDmode so layout_type can pick a suitable and
consistent mode.  RTL expansion of vect_cond_mask also needs to deal
with CONST_INT operands which means passing the mode explicitely.

2020-09-23  Richard Biener  <rguenther@suse.de>

PR middle-end/96466
* internal-fn.c (expand_vect_cond_mask_optab_fn): Use
appropriate mode for force_reg.
* tree.c (build_truth_vector_type_for): Pass VOIDmode to
make_vector_type.

* gcc.dg/pr96466.c: New testcase.

4 years agovect: Fix epilogue loop handling of partial vectors
Richard Sandiford [Wed, 23 Sep 2020 11:29:40 +0000 (12:29 +0100)]
vect: Fix epilogue loop handling of partial vectors

This patch fixes the fallout that Kewen reported on Power after
the recent change to avoid unnecessary use of partial vectors.
As Kewen said, the problem is that vect_analyze_loop_2 doesn't
know how many epilogue iterations there will be, and so it
cannot make a final decision about whether the number of
iterations forces an epilogue loop to use partial vectors.

This is similar to the current situation for peeling: we don't know
during initial analysis whether an epilogue loop will itself require
peeling.  Instead we decide that during vect_do_peeling, where the
final number of epilogue loop iterations is known.

The patch takes a similar approach for the decision about whether
to use partial vectors.  As the comments in the patch say, the
idea is that vect_analyze_loop_2 should make peeling and partial-
vector decisions based on the assumption that the loop_vinfo will
be used as the main loop, while vect_do_peeling should make them
in the knowledge that the loop_vinfo will be used as an epilogue loop.

This allows the same analysis to be used for both cases, which we
rely on for implementing VECT_COMPARE_COSTS; see the big comment
in vect_analyze_loop for details.

I hope the patch makes the (mostly preexisting) structure a bit
more obvious.  It isn't what anyone would design from scratch,
but that's the nature of working with a mature vector framework.

Arranging things this way means that vect_verify_full_masking
and vect_verify_loop_lens now become part of the “can” rather
than “will” test for partial vectors.

Also, while splitting out the logic that handles epilogues with
constant iterations, I added a check to make sure that we don't
try to use partial vectors to vectorise a single-scalar loop.
This required some changes to the Power tests.

gcc/
* tree-vectorizer.h (determine_peel_for_niter): Delete in favor of...
(vect_determine_partial_vectors_and_peeling): ...this new function.
* tree-vect-loop-manip.c (vect_update_epilogue_niters): New function.
Reject using vector epilogue loops for single iterations.  Install
the constant number of epilogue loop iterations in the associated
loop_vinfo.  Rely on vect_determine_partial_vectors_and_peeling
to do the main part of the test.
(vect_do_peeling): Use vect_update_epilogue_niters to handle
epilogue loops with a known number of iterations.  Skip recomputing
the number of iterations later in that case.  Otherwise, use
vect_determine_partial_vectors_and_peeling to decide whether the
epilogue loop needs to use partial vectors or peeling.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Set the
default can_use_partial_vectors_p to false if partial-vector-usage=0.
(determine_peel_for_niter): Remove in favor of...
(vect_determine_partial_vectors_and_peeling): ...this new function,
split out from...
(vect_analyze_loop_2): ...here.  Reflect the vect_verify_full_masking
and vect_verify_loop_lens results in CAN_USE_PARTIAL_VECTORS_P
rather than USING_PARTIAL_VECTORS_P.

gcc/testsuite/
* gcc.target/powerpc/p9-vec-length-epil-1.c: Do not expect the
single-iteration epilogues of the 64-bit loops to be vectorized.
* gcc.target/powerpc/p9-vec-length-epil-7.c: Likewise.
* gcc.target/powerpc/p9-vec-length-epil-8.c: Likewise.

4 years agoAArch64: Implement missing vrndns_f32 intrinsic
Kyrylo Tkachov [Wed, 23 Sep 2020 11:02:29 +0000 (12:02 +0100)]
AArch64: Implement missing vrndns_f32 intrinsic

This patch implements the missing vrndns_f32 intrinsic. This operates on a scalar float32_t value.
It can be mapped down to a __builtin_aarch64_frintnsf builtin.

This patch does that.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/aarch64-simd-builtins.def (frintn): Use BUILTIN_VHSDF_HSDF
for modes.  Remove explicit hf instantiation.
* config/aarch64/arm_neon.h (vrndns_f32): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vrndns_f32_1.c: New test.

4 years agotree-optimization/97173 - extend assert in vectorizable_live_operation
Richard Biener [Wed, 23 Sep 2020 08:42:48 +0000 (10:42 +0200)]
tree-optimization/97173 - extend assert in vectorizable_live_operation

The condition we're expecting to eventually run into isn't fully
captured by checking for CTORs, instead we can also run into the
CTOR element conversion.

2020-09-23  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97173
* tree-vect-loop.c (vectorizable_live_operation): Extend
assert to also conver element conversions.

* gcc.dg/vect/pr97173.c: New testcase.

4 years agoAArch64: Implement missing _p64 intrinsics for vector permutes
Kyrylo Tkachov [Wed, 23 Sep 2020 10:07:50 +0000 (11:07 +0100)]
AArch64: Implement missing _p64 intrinsics for vector permutes

This patch implements some missing vector permute intrinsics operating on poly64x2_t types.
They are implemented identically to their uint64x2_t brethren.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vtrn1q_p64, vtrn2q_p64, vuzp1q_p64,
vuzp2q_p64, vzip1q_p64, vzip2q_p64): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/trn_zip_p64_1.c: New test.

4 years agoAArch64: Implement vldrq_p128 intrinsic
Kyrylo Tkachov [Wed, 23 Sep 2020 09:32:42 +0000 (10:32 +0100)]
AArch64: Implement vldrq_p128 intrinsic

This patch implements the missing vldrq_p128 intrinsic that just loads from the appropriate pointer.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vldrq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vldrq_p128_1.c: New test.

4 years agoAArch64: Implement vstrq_p128 intrinsic
Kyrylo Tkachov [Wed, 23 Sep 2020 09:29:17 +0000 (10:29 +0100)]
AArch64: Implement vstrq_p128 intrinsic

This patch implements the missing vstrq_p128 intrinsic.
It just performs a store of the poly128_t argument to a memory location.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vstrq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vstrq_p128_1.c: New test.

4 years agogcc/analyzer: Silence -Wpragma warns with GCC < 10
Tobias Burnus [Wed, 23 Sep 2020 09:07:40 +0000 (11:07 +0200)]
gcc/analyzer: Silence -Wpragma warns with GCC < 10

gcc/analyzer/ChangeLog:

* analyzer-logging.cc: Guard '#pragma ... ignored "-Wformat-diag"'
by '#if __GNUC__ >= 10'
* analyzer.h: Likewise.
* call-string.cc: Likewise.

4 years agotree-optimization/97151 - improve PTA for C++ operator delete
Richard Biener [Wed, 23 Sep 2020 08:11:03 +0000 (10:11 +0200)]
tree-optimization/97151 - improve PTA for C++ operator delete

C++ operator delete, when DECL_IS_REPLACEABLE_OPERATOR_DELETE_P,
does not cause the deleted object to be escaped.  It also has no
other interesting side-effects for PTA so skip it like we do
for BUILT_IN_FREE.

2020-09-23  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97151
* tree-ssa-structalias.c (find_func_aliases_for_call):
DECL_IS_REPLACEABLE_OPERATOR_DELETE_P has no effect on
arguments.

* g++.dg/cpp1y/new1.C: Adjust for two more handled transforms.

4 years agomiddle-end/97162 - fix ICE when building gamess
Richard Biener [Wed, 23 Sep 2020 08:07:37 +0000 (10:07 +0200)]
middle-end/97162 - fix ICE when building gamess

This appropriately guards the check for a hard register in
compare_base_decls which otherwise ICEs when passed a CONST_DECL.

2020-09-23  Richard Biener  <rguenther@suse.de>

PR middle-end/97162
* alias.c (compare_base_decls): Use DECL_HARD_REGISTER
and guard with VAR_P.

4 years agogcov: fix streaming corruption
Martin Liska [Mon, 21 Sep 2020 14:26:10 +0000 (16:26 +0200)]
gcov: fix streaming corruption

gcc/ChangeLog:

PR gcov-profile/97069
* profile.c (branch_prob): Line number must be at least 1.

gcc/testsuite/ChangeLog:

PR gcov-profile/97069
* g++.dg/gcov/pr97069.C: New test.

4 years ago[nvptx] Handle move from DF subreg to DF reg in nvptx_output_mov_insn
Tom de Vries [Tue, 22 Sep 2020 11:16:39 +0000 (13:16 +0200)]
[nvptx] Handle move from DF subreg to DF reg in nvptx_output_mov_insn

When compiling test-case gcc.dg/atomic/c11-atomic-exec-1.c, we run into
these ptxas errors:
...
line 100; error: Rounding modifier required for instruction 'cvt'
line 105; error: Rounding modifier required for instruction 'cvt'
...

The problem is that this move:
...
//(insn 13 11 14 2
//      (set (reg:DF 28 [ _9 ])
//           (subreg:DF (reg:TI 22 [ _1 ]) 0)) 9 {*movdf_insn}
//       (nil))
                cvt.f64.u64     %r28, %r22$0;
...
is emitted as cvt.f64.u64, while it should be a mov.b64 instead.

Fix this by handling this case in nvptx_output_mov_insn.

Tested on nvptx.

gcc/ChangeLog:

PR target/97158
* config/nvptx/nvptx.c (nvptx_output_mov_insn): Handle move from
DF subreg to DF reg.

4 years ago[testsuite] Add missing require-effective-target alloca
Tom de Vries [Wed, 23 Sep 2020 07:16:17 +0000 (09:16 +0200)]
[testsuite] Add missing require-effective-target alloca

Add missing require-effect-target alloca directives.

Tested on nvptx.

gcc/testsuite/ChangeLog:

* gcc.dg/Warray-bounds-63.c: Add require-effective-target alloca.
* gcc.dg/Warray-bounds-66.c: Same.
* gcc.dg/atomic/stdatomic-vm.c: Same.

4 years agosyscall: fix TestForeground for AIX
Clément Chigot [Fri, 3 May 2019 14:53:13 +0000 (16:53 +0200)]
syscall: fix TestForeground for AIX

Syscall function can't be used on AIX. Therefore, Ioctl in
TestForeground must call raw_ioctl.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/175080

4 years agosyscall: remove ptrace syscall on ppc64
Clément Chigot [Tue, 7 May 2019 11:57:40 +0000 (13:57 +0200)]
syscall: remove ptrace syscall on ppc64

ptrace is available only for 32 bits programs.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/252558

4 years agoAdd $(ZLIBINC) to CFLAGS-analyzer/engine.o
David Malcolm [Tue, 22 Sep 2020 21:42:27 +0000 (17:42 -0400)]
Add $(ZLIBINC) to CFLAGS-analyzer/engine.o

gcc/ChangeLog:
* Makefile.in: Add $(ZLIBINC) to CFLAGS-analyzer/engine.o.

4 years agoanalyzer: use switch in exploded_node::on_stmt
David Malcolm [Tue, 22 Sep 2020 15:29:02 +0000 (11:29 -0400)]
analyzer: use switch in exploded_node::on_stmt

This patch replaces a sequence of dyn_cast to different gimple stmt
types in exploded_node::on_stmt with a switch on the gimple_code.  This
makes clearer which kinds of stmt are currently treated as no-ops, as a
precursor to handling them properly.

No functional change intended.

gcc/analyzer/ChangeLog:
* engine.cc (exploded_node::on_stmt): Replace sequence of dyn_cast
with switch.

4 years agoruntime, net: fix build errors on AIX
Clément Chigot [Tue, 26 May 2020 09:31:37 +0000 (11:31 +0200)]
runtime, net: fix build errors on AIX

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/235158

4 years agolibbacktrace: handle pc == low correctly
Ian Lance Taylor [Wed, 23 Sep 2020 00:27:46 +0000 (17:27 -0700)]
libbacktrace: handle pc == low correctly

* dwarf.c (report_inlined_functions): Handle PC == -1 and PC ==
p->low.
(dwarf_lookup_pc): Likewise.

4 years agoDaily bump.
GCC Administrator [Wed, 23 Sep 2020 00:16:27 +0000 (00:16 +0000)]
Daily bump.

4 years agogo.test: update issue4458.go for recent change
Ian Lance Taylor [Tue, 22 Sep 2020 23:36:02 +0000 (16:36 -0700)]
go.test: update issue4458.go for recent change

4 years agoIgnore clobbers in modref
Jan Hubicka [Tue, 22 Sep 2020 20:36:01 +0000 (22:36 +0200)]
Ignore clobbers in modref

* ipa-modref.c (analyze_stmt): Ignore gimple clobber.

4 years agoc++: Return only in-scope tparms in keep_template_parm [PR95310]
Patrick Palka [Tue, 22 Sep 2020 20:26:52 +0000 (16:26 -0400)]
c++: Return only in-scope tparms in keep_template_parm [PR95310]

In the testcase below, the dependent specializations iter_reference_t<F>
and iter_reference_t<Out> share the same tree due to specialization
caching.  So when find_template_parameters walks through the
requires-expression (as part of normalization), it sees and includes the
out-of-scope template parameter F in the list of template parameters
it found within the requires-expression (along with Out and N).

From a correctness perspective this is harmless since the parameter mapping
routines only care about the level and index of each parameter, so F is
no different from Out in that sense.  And it's also harmless that two
parameters in the parameter mapping have the same level and index.

But having both Out and F in the parameter mapping means extra work for
hash_atomic_constrant, tsubst_parameter_mapping and get_mapped_args; and
it also means we print this irrelevant template parameter in the
testcase's diagnostics (via pp_cxx_parameter_mapping):

  in requirements with ‘Out o’ [with N = (const int&)&a; F = const int*; Out = const int*]

This patch makes keep_template_parm return only in-scope template
parameters by looking into ctx_parms for the corresponding in-scope
one, through a new helper function corresponding_template_parameter.

(That we sometimes print irrelevant template parameters in diagnostics
is also the subject of PR99 and PR66968, so the above diagnostic issue
could likely be fixed in a more general way, but this targeted fix to
keep_template_parm is perhaps worthwhile on its own.)

gcc/cp/ChangeLog:

PR c++/95310
* pt.c (corresponding_template_parameter): Define.
(keep_template_parm): Use it to adjust the given template
parameter to the corresponding in-scope one from ctx_parms.

gcc/testsuite/ChangeLog:

PR c++/95310
* g++.dg/concepts/diagnostic15.C: New test.

4 years agoc++: Add test for PR96652
Patrick Palka [Tue, 22 Sep 2020 20:26:49 +0000 (16:26 -0400)]
c++: Add test for PR96652

Fixed by r11-3361.

gcc/testsuite/ChangeLog:

PR c++/96652
* g++.dg/cpp0x/decltype-96652.C: New test.

4 years agoFix ipa-modref selftest and destructor
Jan Hubicka [Tue, 22 Sep 2020 20:16:00 +0000 (22:16 +0200)]
Fix ipa-modref selftest and destructor

* ipa-modref-tree.c: Add namespace selftest.
(modref_tree_c_tests): Rename to ...
(ipa_modref_tree_c_tests): ... this.
* ipa-modref.c (pass_modref): Remove destructor.
(ipa_modref_c_finalize): New function.
* ipa-modref.h (ipa_modref_c_finalize): Declare.
* selftest-run-tests.c (selftest::run_tests): Call
ipa_modref_c_finalize.
* selftest.h (ipa_modref_tree_c_tests): Declare.
* toplev.c: Include ipa-modref-tree.h and ipa-modref.h
(toplev::finalize): Call ipa_modref_c_finalize.

4 years agoc++: Remove a broken error-recovery path
Nathan Sidwell [Tue, 22 Sep 2020 19:21:13 +0000 (12:21 -0700)]
c++: Remove a broken error-recovery path

The remaining use of xref_tag_from_type was also suspicious.  It turns
out to be an error path.  At parse time we diagnose that a class
definition cannot appear, but we swallow the definition.  This code
was attempting to push it into the global scope (or find a conflict).
This seems needless, just return error_mark_node.  This was the
simpler fix than going through the parser and figuring out how to get
it to put in error_mark_node at the right point.

gcc/cp/
* cp-tree.h (xref_tag_from_type): Don't declare.
* decl.c (xref_tag_from_type): Delete.
* pt.c (lookup_template_class_1): Erroneously located class
definitions just give error_mark, don't try and inject it into the
namespace.

4 years agoc++: Ignore __sanitizer_ptr_{sub,cmp} builtin calls during constant expression evalua...
Jakub Jelinek [Tue, 22 Sep 2020 19:06:32 +0000 (21:06 +0200)]
c++: Ignore __sanitizer_ptr_{sub,cmp} builtin calls during constant expression evaluation [PR97145]

These two builtin calls are added already during parsing before pointer
subtractions or comparisons, normally they perform runtime verification
of whether the pointers point to the same object or different objects,
but during constant expressione valuation we don't really need those
builtins for anything.

2020-09-22  Jakub Jelinek  <jakub@redhat.com>

PR c++/97145
* constexpr.c (cxx_eval_builtin_function_call): Return void_node for
calls to __sanitize_ptr_{sub,cmp} builtins.

* g++.dg/asan/pr97145.C: New test.

4 years agolibstdc++: Fix out-of-bounds string_view access in filesystem::path [PR 97167]
Jonathan Wakely [Tue, 22 Sep 2020 19:02:58 +0000 (20:02 +0100)]
libstdc++: Fix out-of-bounds string_view access in filesystem::path [PR 97167]

libstdc++-v3/ChangeLog:

PR libstdc++/97167
* src/c++17/fs_path.cc (path::_Parser::root_path()): Check
for empty string before inspecting the first character.
* testsuite/27_io/filesystem/path/append/source.cc: Append
empty string_view to path.

4 years agoanalyzer: add -fdump-analyzer-json
David Malcolm [Fri, 18 Sep 2020 17:59:21 +0000 (13:59 -0400)]
analyzer: add -fdump-analyzer-json

I've found this useful for debugging state explosions in the analyzer.

gcc/analyzer/ChangeLog:
* analysis-plan.cc: Include "json.h".
* analyzer.opt (fdump-analyzer-json): New.
* call-string.cc: Include "json.h".
(call_string::to_json): New.
* call-string.h (call_string::to_json): New decl.
* checker-path.cc: Include "json.h".
* constraint-manager.cc: Include "json.h".
(equiv_class::to_json): New.
(constraint::to_json): New.
(constraint_manager::to_json): New.
* constraint-manager.h (equiv_class::to_json): New decl.
(constraint::to_json): New decl.
(constraint_manager::to_json): New decl.
* diagnostic-manager.cc: Include "json.h".
(saved_diagnostic::to_json): New.
(diagnostic_manager::to_json): New.
* diagnostic-manager.h (saved_diagnostic::to_json): New decl.
(diagnostic_manager::to_json): New decl.
* engine.cc: Include "json.h", <zlib.h>.
(exploded_node::status_to_str): New.
(exploded_node::to_json): New.
(exploded_edge::to_json): New.
(exploded_graph::to_json): New.
(dump_analyzer_json): New.
(impl_run_checkers): Call it.
* exploded-graph.h (exploded_node::status_to_str): New decl.
(exploded_node::to_json): New.
(exploded_edge::to_json): New.
(exploded_graph::to_json): New.
* pending-diagnostic.cc: Include "json.h".
* program-point.cc: Include "json.h".
(program_point::to_json): New.
* program-point.h (program_point::to_json): New decl.
* program-state.cc: Include "json.h".
(extrinsic_state::to_json): New.
(sm_state_map::to_json): New.
(program_state::to_json): New.
* program-state.h (extrinsic_state::to_json): New decl.
(sm_state_map::to_json): New decl.
(program_state::to_json): New decl.
* region-model-impl-calls.cc: Include "json.h".
* region-model-manager.cc: Include "json.h".
* region-model-reachability.cc: Include "json.h".
* region-model.cc: Include "json.h".
* region-model.h (svalue::to_json): New decl.
(region::to_json): New decl.
* region.cc: Include "json.h".
(region::to_json: New.
* sm-file.cc: Include "json.h".
* sm-malloc.cc: Include "json.h".
* sm-pattern-test.cc: Include "json.h".
* sm-sensitive.cc: Include "json.h".
* sm-signal.cc: Include "json.h".
(signal_delivery_edge_info_t::to_json): New.
* sm-taint.cc: Include "json.h".
* sm.cc: Include "diagnostic.h", "tree-diagnostic.h", and
"json.h".
(state_machine::state::to_json): New.
(state_machine::to_json): New.
* sm.h (state_machine::state::to_json): New.
(state_machine::to_json): New.
* state-purge.cc: Include "json.h".
* store.cc: Include "json.h".
(binding_key::get_desc): New.
(binding_map::to_json): New.
(binding_cluster::to_json): New.
(store::to_json): New.
* store.h (binding_key::get_desc): New decl.
(binding_map::to_json): New decl.
(binding_cluster::to_json): New decl.
(store::to_json): New decl.
* supergraph.cc: Include "json.h".
(supergraph::to_json): New.
(supernode::to_json): New.
(superedge::to_json): New.
* supergraph.h (supergraph::to_json): New decl.
(supernode::to_json): New decl.
(superedge::to_json): New decl.
* svalue.cc: Include "json.h".
(svalue::to_json): New.

gcc/ChangeLog:
* doc/analyzer.texi (Other Debugging Techniques): Mention
-fdump-analyzer-json.
* doc/invoke.texi (Static Analyzer Options): Add
-fdump-analyzer-json.

4 years agobpf: use xBPF signed div, mod insns when available
David Faust [Tue, 22 Sep 2020 18:31:35 +0000 (20:31 +0200)]
bpf: use xBPF signed div, mod insns when available

The 'mod' and 'div' operators in eBPF are unsigned, with no signed
counterpart. xBPF adds two new ALU operations, sdiv and smod, for
signed division and modulus, respectively. Update bpf.md with
'define_insn' blocks for signed div and mod to use them when targetting
xBPF, and add new tests to ensure they are used appropriately.

2020-09-17  David Faust  <david.faust@oracle.com>

gcc/
* config/bpf/bpf.md: Add defines for signed div and mod operators.

gcc/testsuite/
* gcc.target/bpf/diag-sdiv.c: New test.
* gcc.target/bpf/diag-smod.c: New test.
* gcc.target/bpf/xbpf-sdiv-1.c: New test.
* gcc.target/bpf/xbpf-smod-1.c: New test.

4 years agocompiler: call runtime.eqtype for non-interface type switch on aix
Clément Chigot [Wed, 16 Sep 2020 12:03:10 +0000 (14:03 +0200)]
compiler: call runtime.eqtype for non-interface type switch on aix

All type switch clauses must call runtime.eqtype if the linker isn't
able to merge type descriptors pointers. Previously, only interface-type
clauses were doing it.

Updates golang/go#39276

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/255202

4 years agolibgomp.fortran/pr66199-5.f90: Make stop codes unique
Tobias Burnus [Tue, 22 Sep 2020 17:15:44 +0000 (19:15 +0200)]
libgomp.fortran/pr66199-5.f90: Make stop codes unique

libgomp/ChangeLog:

PR fortran/95654
* testsuite/libgomp.fortran/pr66199-5.f90: Make stop codes unique.

4 years agolibstdc++: Fix overflow handling in std::align
Glen Joseph Fernandes [Tue, 22 Sep 2020 16:49:48 +0000 (17:49 +0100)]
libstdc++: Fix overflow handling in std::align

libstdc++-v3/ChangeLog:

* include/bits/align.h (align): Fix overflow handling.
* testsuite/20_util/align/3.cc: New test.

4 years agoc++: fix injected friend of template class
Nathan Sidwell [Tue, 22 Sep 2020 16:00:20 +0000 (09:00 -0700)]
c++: fix injected friend of template class

In working on fixing hiddenness, I discovered some suspicious code in
template instantiation.  I suspect it dates from when we didn't do the
hidden friend injection thing at all.  The xreftag finds the same
class, but makes it visible to name lookup.  Which is wrong.
hurrah, fixing a bug by deleting code!

gcc/cp/
* pt.c (instantiate_class_template_1): Do not repush and unhide
injected friend.
gcc/testsuite/
* g++.old-deja/g++.pt/friend34.C: Check injected friend is still
invisible.

4 years agolibstdc++: Introduce new headers for C++20 ranges components
Jonathan Wakely [Tue, 22 Sep 2020 14:45:54 +0000 (15:45 +0100)]
libstdc++: Introduce new headers for C++20 ranges components

This introduces two new headers:

<bits/ranges_base.h> defines the minimal components needed
for using C++20 ranges (customization point objects such as
std::ranges::begin, concepts such as std::ranges::range, etc.)

<bits/ranges_util.h> includes <bits/ranges_base.h> and additionally
defines subrange, which is needed by <bits/ranges_algo.h>.

Most of the content of <bits/ranges_base.h> was previously defined in
<bits/range_access.h>, but a few pieces were only defined in <ranges>.
This meant the entire <ranges> header was needed in <algorithm> and
<memory>, even though they don't use all the range adaptors.

By moving the ranges components out of <bits/range_access.h> that file
is left defining just the contents of [iterator.range] i.e. std::begin,
std::end, std::size etc. and not C++20 ranges components.

For consistency with other C++20 ranges headers, <bits/range_cmp.h> is
renamed to <bits/ranges_cmp.h>.

libstdc++-v3/ChangeLog:

* include/Makefile.am: Add new headers and adjust for renamed
header.
* include/Makefile.in: Regenerate.
* include/bits/iterator_concepts.h: Adjust for renamed header.
* include/bits/range_access.h (ranges::*): Move to new
<bits/ranges_base.h> header.
* include/bits/ranges_algobase.h: Include new <bits/ranges_base.h>
header instead of <ranges>.
* include/bits/ranges_algo.h: Include new <bits/ranges_util.h>
header.
* include/bits/range_cmp.h: Moved to...
* include/bits/ranges_cmp.h: ...here.
* include/bits/ranges_base.h: New header.
* include/bits/ranges_util.h: New header.
* include/experimental/string_view: Include new
<bits/ranges_base.h> header.
* include/std/functional: Adjust for renamed header.
* include/std/ranges (ranges::view_base, ranges::enable_view)
(ranges::dangling, ranges::borrowed_iterator_t): Move to new
<bits/ranges_base.h> header.
(ranges::view_interface, ranges::subrange)
(ranges::borrowed_subrange_t): Move to new <bits/ranges_util.h>
header.
* include/std/span: Include new <bits/ranges_base.h> header.
* include/std/string_view: Likewise.
* testsuite/24_iterators/back_insert_iterator/pr93884.cc: Add
missing <ranges> header.
* testsuite/24_iterators/front_insert_iterator/pr93884.cc:
Likewise.

4 years agotestsuite: Prune more output in timevar1.C.
Marek Polacek [Tue, 22 Sep 2020 13:25:35 +0000 (09:25 -0400)]
testsuite: Prune more output in timevar1.C.

gcc/testsuite/ChangeLog:

* g++.dg/ext/timevar1.C: Also prune N%.

4 years agotestsuite: Prune more output in timevar2.C.
Marek Polacek [Mon, 21 Sep 2020 19:39:53 +0000 (15:39 -0400)]
testsuite: Prune more output in timevar2.C.

gcc/testsuite/ChangeLog:

* g++.dg/ext/timevar2.C: Also prune N%.

4 years agoswitch lowering: limit number of cluster attemps
Martin Liska [Tue, 22 Sep 2020 10:23:35 +0000 (12:23 +0200)]
switch lowering: limit number of cluster attemps

gcc/ChangeLog:

PR tree-optimization/96979
* doc/invoke.texi: Document new param max-switch-clustering-attempts.
* params.opt: Add new parameter.
* tree-switch-conversion.c (jump_table_cluster::find_jump_tables):
Limit number of attempts.
(bit_test_cluster::find_bit_tests): Likewise.

gcc/testsuite/ChangeLog:

PR tree-optimization/96979
* g++.dg/tree-ssa/pr96979.C: New test.

4 years agoIBM Z: Try to make use of load-and-test instructions
Stefan Schulze Frielinghaus [Fri, 18 Sep 2020 07:10:19 +0000 (09:10 +0200)]
IBM Z: Try to make use of load-and-test instructions

This patch enables a peephole2 optimization which transforms a load of
constant zero into a temporary register which is then finally used to
compare against a floating-point register of interest into a single load
and test instruction.  However, the optimization is only applied if both
registers are dead afterwards and if we test for (in)equality only.
This is relaxed in case of fast math.

This is a follow up to PR88856.

gcc/ChangeLog:

* config/s390/s390.md ("*cmp<mode>_ccs_0", "*cmp<mode>_ccz_0",
"*cmp<mode>_ccs_0_fastmath"): Basically change "*cmp<mode>_ccs_0" into
"*cmp<mode>_ccz_0" and for fast math add "*cmp<mode>_ccs_0_fastmath".

gcc/testsuite/ChangeLog:

* gcc.target/s390/load-and-test-fp-1.c: Change test to include all
possible combinations of dead/live registers and comparisons (equality,
relational).
* gcc.target/s390/load-and-test-fp-2.c: Same as load-and-test-fp-1.c
but for fast math.
* gcc.target/s390/load-and-test-fp.h: New test included by
load-and-test-fp-{1,2}.c.

4 years ago[libgomp, nvptx] Print error log for link error
Tom de Vries [Tue, 22 Sep 2020 05:51:58 +0000 (07:51 +0200)]
[libgomp, nvptx] Print error log for link error

By running libgomp test-case libgomp.c/target-28.c with GOMP_NVPTX_PTXRW=w
(using a maintenance patch that adds support for this env var), we dump the
ptx in target-28.exe to file.  By editing one ptx file to rename
gomp_nvptx_main to gomp_nvptx_main2 in both declaration and call, and
running with GOMP_NVPTX_PTXRW=r, we trigger a link error:
...
$ GOMP_NVPTX_PTXRW=r ./target-28.exe
libgomp: cuLinkComplete error: unknown error
...
The error is somewhat uninformative.

Fix this by dumping the error log returned by the failing cuda call, such
that we have instead:
...
$ GOMP_NVPTX_PTXRW=r ./target-28.exe
libgomp: Link error log error   : \
  Undefined reference to 'gomp_nvptx_main2' in ''
libgomp: cuLinkComplete error: unknown error
...

Build on x86_64 with nvptx accelerator, tested libgomp.

libgomp/ChangeLog:

* plugin/plugin-nvptx.c (link_ptx): Print elog if cuLinkComplete call
fails.

4 years agoAArch64: Implement missing vcls intrinsics on unsigned types
Kyrylo Tkachov [Tue, 22 Sep 2020 11:03:49 +0000 (12:03 +0100)]
AArch64: Implement missing vcls intrinsics on unsigned types

This patch implements some missing intrinsics that perform a CLS on unsigned SIMD types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vcls_u8, vcls_u16, vcls_u32,
vclsq_u8, vclsq_u16, vclsq_u32): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vcls_unsigned_1.c: New test.