platform/upstream/gcc.git
Jakub Jelinek [Thu, 29 Jul 2021 12:17:55 +0000 (14:17 +0200)]
testsuite: Fix up two tests for recent libstdc++ header changes [PR101647]

After recent libstdc++ header changes, <functional> no longer includes
(parts of?) <array>, which it is not required to do, and <memory> no
longer includes (parts of?) <initializer_list>.
This patch fixes:
testsuite/g++.dg/pr71389.C:10:39: error: aggregate 'std::array<std::array<int, 16>, 16> v13' has incomplete type and cannot be defined
as well as
testsuite/g++.dg/cpp0x/initlist48.C:11:6: error: 'initializer_list' in namespace 'std' does not name a template type; did you mean 'uninitialized_fill'?

2021-07-29  Jakub Jelinek  <jakub@redhat.com>

PR testsuite/101647
* g++.dg/pr71389.C: Include <array> instead of <functional>.
* g++.dg/cpp0x/initlist48.C: Include also <initializer_list>.
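
As an illustration of the fix above (a minimal sketch, not the contents of the
actual testcases), code that uses std::array or std::initializer_list can
include the corresponding headers directly instead of relying on <functional>
or <memory> to pull them in.  The names grid_t, make_grid and count_items are
hypothetical.

```cpp
// Include what you use: do not rely on <functional> or <memory>
// to provide these declarations transitively.
#include <array>
#include <cstddef>
#include <initializer_list>

// Hypothetical type: needs the complete definition from <array>.
using grid_t = std::array<std::array<int, 16>, 16>;

// Returns a zero-initialized 16x16 grid.
inline grid_t make_grid() { return grid_t{}; }

// Hypothetical helper: needs std::initializer_list from <initializer_list>.
inline std::size_t count_items(std::initializer_list<int> items) {
  return items.size();
}
```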

Thomas Schwinge [Tue, 2 Mar 2021 12:20:11 +0000 (04:20 -0800)]
[OpenACC] Extract 'pass_oacc_loop_designation' out of 'pass_oacc_device_lower'

This really is a separate step, and another pass is to be added between
the two later on.

gcc/
* omp-offload.c (oacc_loop_xform_head_tail, oacc_loop_process):
'update_stmt' after modification.
(pass_oacc_loop_designation): New function, extracted out of...
(pass_oacc_device_lower): ... this.
(pass_data_oacc_loop_designation, pass_oacc_loop_designation)
(make_pass_oacc_loop_designation): New.
* passes.def: Add it.
* tree-parloops.c (create_parallel_loop): Adjust.
* tree-pass.h (make_pass_oacc_loop_designation): New.
gcc/testsuite/
* c-c++-common/goacc/classify-kernels-unparallelized.c:
's%oaccdevlow%oaccloops%g'.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-routine-nohost.c: Likewise.
* c-c++-common/goacc/classify-routine.c: Likewise.
* c-c++-common/goacc/classify-serial.c: Likewise.
* c-c++-common/goacc/routine-nohost-1.c: Likewise.
* g++.dg/goacc/template.C: Likewise.
* gcc.dg/goacc/loop-processing-1.c: Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/classify-parallel.f95: Likewise.
* gfortran.dg/goacc/classify-routine-nohost.f95: Likewise.
* gfortran.dg/goacc/classify-routine.f95: Likewise.
* gfortran.dg/goacc/classify-serial.f95: Likewise.
* gfortran.dg/goacc/routine-multiple-directives-1.f90: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/pr85486-2.c:
's%oaccdevlow%oaccloops%g'.
* testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/pr85486.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c:
Likewise.
* testsuite/libgomp.oacc-fortran/routine-nohost-1.f90: Likewise.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Haochen Gui [Thu, 29 Jul 2021 06:56:12 +0000 (14:56 +0800)]
Fix failed test cases caused by disabling mode promotion for pseudos [PR100952]

gcc/testsuite
PR target/100952
* gcc.target/powerpc/pr56605.c: Change matching
conditions.
* gcc.target/powerpc/pr81348.c: Likewise.

Aldy Hernandez [Tue, 15 Jun 2021 10:32:51 +0000 (12:32 +0200)]
Backwards jump threader rewrite with ranger.

This is a rewrite of the backwards threader with a ranger based solver.

The code is divided into two parts: the path solver in
gimple-range-path.*, and the path discovery bits in
tree-ssa-threadbackward.c.

The legacy code is still available with --param=threader-mode=legacy,
but will be removed shortly after.

gcc/ChangeLog:

* Makefile.in (tree-ssa-loop-im.o-warn): New.
* flag-types.h (enum threader_mode): New.
* params.opt: Add entry for --param=threader-mode.
* tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New.
(class back_threader): New.
(back_threader::back_threader): New.
(back_threader::~back_threader): New.
(back_threader::maybe_register_path): New.
(back_threader::find_taken_edge): New.
(back_threader::find_taken_edge_switch): New.
(back_threader::find_taken_edge_cond): New.
(back_threader::resolve_def): New.
(back_threader::resolve_phi): New.
(back_threader::find_paths_to_names): New.
(back_threader::find_paths): New.
(dump_path): New.
(debug): New.
(thread_jumps::find_jump_threads_backwards): Call ranger threader.
(thread_jumps::find_jump_threads_backwards_with_ranger): New.
(pass_thread_jumps::execute): Abstract out code...
(try_thread_blocks): ...here.
* tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges):
Abstract out threading candidate code to...
(single_succ_to_potentially_threadable_block): ...here.
* tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block):
New.
* tree-ssa-threadupdate.c (register_jump_thread): Return boolean.
* tree-ssa-threadupdate.h (class jump_thread_path_registry):
Return bool from register_jump_thread.

libgomp/ChangeLog:

* testsuite/libgomp.graphite/force-parallel-4.c: Adjust for
threader.
* testsuite/libgomp.graphite/force-parallel-8.c: Same.

gcc/testsuite/ChangeLog:

* g++.dg/debug/dwarf2/deallocator.C: Adjust for threader.
* gcc.c-torture/compile/pr83510.c: Same.
* gcc.dg/analyzer/pr94851-2.c: Same.
* gcc.dg/loop-unswitch-2.c: Same.
* gcc.dg/old-style-asm-1.c: Same.
* gcc.dg/pr68317.c: Same.
* gcc.dg/pr97567-2.c: Same.
* gcc.dg/predict-9.c: Same.
* gcc.dg/shrink-wrap-loop.c: Same.
* gcc.dg/sibcall-1.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-3.c: Same.
* gcc.dg/tree-ssa/pr21001.c: Same.
* gcc.dg/tree-ssa/pr21294.c: Same.
* gcc.dg/tree-ssa/pr21417.c: Same.
* gcc.dg/tree-ssa/pr21458-2.c: Same.
* gcc.dg/tree-ssa/pr21563.c: Same.
* gcc.dg/tree-ssa/pr49039.c: Same.
* gcc.dg/tree-ssa/pr61839_1.c: Same.
* gcc.dg/tree-ssa/pr61839_3.c: Same.
* gcc.dg/tree-ssa/pr77445-2.c: Same.
* gcc.dg/tree-ssa/split-path-4.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.
* gcc.dg/tree-ssa/ssa-fre-48.c: Same.
* gcc.dg/tree-ssa/ssa-thread-11.c: Same.
* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
* gcc.dg/tree-ssa/ssa-thread-14.c: Same.
* gcc.dg/tree-ssa/vrp02.c: Same.
* gcc.dg/tree-ssa/vrp03.c: Same.
* gcc.dg/tree-ssa/vrp05.c: Same.
* gcc.dg/tree-ssa/vrp06.c: Same.
* gcc.dg/tree-ssa/vrp07.c: Same.
* gcc.dg/tree-ssa/vrp09.c: Same.
* gcc.dg/tree-ssa/vrp19.c: Same.
* gcc.dg/tree-ssa/vrp20.c: Same.
* gcc.dg/tree-ssa/vrp33.c: Same.
* gcc.dg/uninit-pred-9_b.c: Same.
* gcc.dg/uninit-pr61112.c: Same.
* gcc.dg/vect/bb-slp-16.c: Same.
* gcc.target/i386/avx2-vect-aggressive.c: Same.
* gcc.dg/tree-ssa/ranger-threader-1.c: New test.
* gcc.dg/tree-ssa/ranger-threader-2.c: New test.
* gcc.dg/tree-ssa/ranger-threader-3.c: New test.
* gcc.dg/tree-ssa/ranger-threader-4.c: New test.
* gcc.dg/tree-ssa/ranger-threader-5.c: New test.

Richard Biener [Wed, 21 Jul 2021 07:14:24 +0000 (09:14 +0200)]
c/101512 - fix missing address-taking in c_common_mark_addressable_vec

c_common_mark_addressable_vec fails to look through C_MAYBE_CONST_EXPR
in the case it isn't at the toplevel.

2021-07-21  Richard Biener  <rguenther@suse.de>

PR c/101512
gcc/c-family/
* c-common.c (c_common_mark_addressable_vec): Look through
C_MAYBE_CONST_EXPR even if not at the toplevel.

gcc/testsuite/
* gcc.dg/torture/pr101512.c: New testcase.

Andreas Krebbel [Thu, 29 Jul 2021 06:03:36 +0000 (08:03 +0200)]
Adjust docu of TARGET_VECTORIZE_VEC_PERM_CONST

gcc/ChangeLog:

* target.def: in0 and in1 do not need to be registers.
* doc/tm.texi: Regenerate.

Ankur Saini [Sun, 25 Jul 2021 09:17:53 +0000 (14:47 +0530)]
analyzer: Refactor callstring to work with pairs of supernodes.

2021-07-25  Ankur Saini  <arsenic@sourceware.org>

gcc/analyzer/ChangeLog:
* call-string.cc (call_string::element_t::operator==): New operator.
(call_string::element_t::operator!=): New operator.
(call_string::element_t::get_caller_function): New function.
(call_string::element_t::get_callee_function): New function.
(call_string::call_string): Refactor to initialise m_elements.
(call_string::operator=): Refactor to work with m_elements.
(call_string::operator==): Likewise.
(call_string::to_json): Likewise.
(call_string::hash): Refactor to hash e.m_caller.
(call_string::push_call): Refactor to work with m_elements.
(call_string::push_call): New overload to push call via supernodes.
(call_string::pop): Refactor to work with m_elements.
(call_string::calc_recursion_depth): Likewise.
(call_string::cmp): Likewise.
(call_string::validate): Likewise.
(call_string::operator[]): Likewise.
* call-string.h (class supernode): New forward decl.
(struct call_string::element_t): New struct.
(call_string::call_string): Refactor to initialise m_elements.
(call_string::empty_p): Refactor to work with m_elements.
(call_string::get_callee_node): New decl.
(call_string::get_caller_node): New decl.
(m_elements): Replaces m_return_edges.
* program-point.cc (program_point::get_function_at_depth): Refactor to
work with new call-string format.
(program_point::validate): Likewise.
(program_point::on_edge): Likewise.

liuhongt [Thu, 29 Jul 2021 01:33:15 +0000 (09:33 +0800)]
Adjust/Refine testcases.

gcc/testsuite/ChangeLog:

PR target/99881
* gcc.target/i386/pr91446.c:
* gcc.target/i386/pr92658-avx512bw-2.c:
* gcc.target/i386/pr92658-sse4-2.c:
* gcc.target/i386/pr92658-sse4.c:
* gcc.target/i386/pr99881.c:

liuhongt [Wed, 28 Jul 2021 08:24:52 +0000 (16:24 +0800)]
Add a separate function to calculate cost for WIDEN_MULT_EXPR.

gcc/ChangeLog:

PR target/39821
* config/i386/i386.c (ix86_widen_mult_cost): New function.
(ix86_add_stmt_cost): Use ix86_widen_mult_cost for
WIDEN_MULT_EXPR.

gcc/testsuite/ChangeLog:

PR target/39821
* gcc.target/i386/sse2-pr39821.c: New test.
* gcc.target/i386/sse4-pr39821.c: New test.

Jiufu Guo [Thu, 15 Jul 2021 09:21:00 +0000 (17:21 +0800)]
Use preferred mode for doloop IV [PR61837]

Currently, the doloop.xx variable uses the same type as niter, which may be
narrower than the word size.  For some targets, it would be better to use
a word-size type.  For example, on a 64-bit system, accessing a 32-bit value
may require a subreg, so a 64-bit type may be better for niter if its value
can be represented in both 32 and 64 bits.

This patch adds a target hook to query the preferred mode for the doloop IV,
and updates the mode accordingly.

gcc/ChangeLog:

2021-07-29  Jiufu Guo  <guojiufu@linux.ibm.com>

PR target/61837
* config/rs6000/rs6000.c (TARGET_PREFERRED_DOLOOP_MODE): New hook.
(rs6000_preferred_doloop_mode): New hook.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Add hook preferred_doloop_mode.
* target.def (preferred_doloop_mode): New hook.
* targhooks.c (default_preferred_doloop_mode): New hook.
* targhooks.h (default_preferred_doloop_mode): New hook.
* tree-ssa-loop-ivopts.c (compute_doloop_base_on_mode): New function.
(add_iv_candidate_for_doloop): Call targetm.preferred_doloop_mode
and compute_doloop_base_on_mode.

gcc/testsuite/ChangeLog:

2021-07-29  Jiufu Guo  <guojiufu@linux.ibm.com>

PR target/61837
* gcc.target/powerpc/pr61837.c: New test.

GCC Administrator [Thu, 29 Jul 2021 00:16:43 +0000 (00:16 +0000)]
Daily bump.

Martin Sebor [Wed, 28 Jul 2021 22:25:40 +0000 (16:25 -0600)]
Correct uninitialized object offset and size computation [PR101494].

Resolves:
PR middle-end/101494 - -Wuninitialized false alarm with memrchr of size 0

gcc/ChangeLog:

PR middle-end/101494
* tree-ssa-uninit.c (maybe_warn_operand): Correct object offset
and size computation.

gcc/testsuite/ChangeLog:

PR middle-end/101494
* gcc.dg/uninit-pr101494.c: New test.

Martin Sebor [Wed, 28 Jul 2021 22:14:38 +0000 (16:14 -0600)]
Correct -Warray-bounds handling of function pointers [PR101601].

Resolves:
PR middle-end/101601 - -Warray-bounds triggers error: arrays of functions are not meaningful

PR middle-end/101601

gcc/ChangeLog:

* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref): Remove
a pointless test.
Handle pointers to functions.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Warray-bounds-25.C: New test.
* gcc.dg/Warray-bounds-85.c: New test.

Martin Sebor [Wed, 28 Jul 2021 21:28:10 +0000 (15:28 -0600)]
Add new gimple-ssa-warn-access pass.

gcc/ChangeLog:

* Makefile.in (OBJS): Add gimple-ssa-warn-access.o and pointer-query.o.
* attribs.h (fndecl_dealloc_argno): Move fndecl_dealloc_argno to tree.h.
* builtins.c (compute_objsize_r): Move to pointer-query.cc.
(access_ref::access_ref): Same.
(access_ref::phi): Same.
(access_ref::get_ref): Same.
(access_ref::size_remaining): Same.
(access_ref::offset_in_range): Same.
(access_ref::add_offset): Same.
(access_ref::inform_access): Same.
(ssa_name_limit_t::visit_phi): Same.
(ssa_name_limit_t::leave_phi): Same.
(ssa_name_limit_t::next): Same.
(ssa_name_limit_t::next_phi): Same.
(ssa_name_limit_t::~ssa_name_limit_t): Same.
(pointer_query::pointer_query): Same.
(pointer_query::get_ref): Same.
(pointer_query::put_ref): Same.
(pointer_query::flush_cache): Same.
(warn_string_no_nul): Move to gimple-ssa-warn-access.cc.
(check_nul_terminated_array): Same.
(unterminated_array): Same.
(maybe_warn_for_bound): Same.
(check_read_access): Same.
(warn_for_access): Same.
(get_size_range): Same.
(check_access): Same.
(gimple_call_alloc_size): Move to tree.c.
(gimple_parm_array_size): Move to pointer-query.cc.
(get_offset_range): Same.
(gimple_call_return_array): Same.
(handle_min_max_size): Same.
(handle_array_ref): Same.
(handle_mem_ref): Same.
(compute_objsize): Same.
(gimple_call_alloc_p): Move to gimple-ssa-warn-access.cc.
(call_dealloc_argno): Same.
(fndecl_dealloc_argno): Same.
(new_delete_mismatch_p): Same.
(matching_alloc_calls_p): Same.
(warn_dealloc_offset): Same.
(maybe_emit_free_warning): Same.
* builtins.h (check_nul_terminated_array): Move to
gimple-ssa-warn-access.h.
(check_nul_terminated_array): Same.
(warn_string_no_nul): Same.
(unterminated_array): Same.
(class ssa_name_limit_t): Same.
(class pointer_query): Same.
(struct access_ref): Same.
(class range_query): Same.
(struct access_data): Same.
(gimple_call_alloc_size): Same.
(gimple_parm_array_size): Same.
(compute_objsize): Same.
(class access_data): Same.
(maybe_emit_free_warning): Same.
* calls.c (initialize_argument_information): Remove call to
maybe_emit_free_warning.
* gimple-array-bounds.cc: Include new header.
* gimple-fold.c: Same.
* gimple-ssa-sprintf.c: Same.
* gimple-ssa-warn-restrict.c: Same.
* passes.def: Add pass_warn_access.
* tree-pass.h (make_pass_warn_access): Declare.
* tree-ssa-strlen.c: Include new headers.
* tree.c (fndecl_dealloc_argno): Move here from builtins.c.
* tree.h (fndecl_dealloc_argno): Move here from attribs.h.
* gimple-ssa-warn-access.cc: New file.
* gimple-ssa-warn-access.h: New file.
* pointer-query.cc: New file.
* pointer-query.h: New file.

gcc/cp/ChangeLog:

* init.c: Include new header.

Michael Meissner [Wed, 28 Jul 2021 21:24:23 +0000 (17:24 -0400)]
PR 100168: Fix call test on power10.

Fix a test that was checking for 64-bit TOC calls, to also allow for
PC-relative calls.

2021-07-28  Michael Meissner  <meissner@linux.ibm.com>

gcc/testsuite
PR testsuite/100168
* gcc.dg/pr56727-2.c: Add support for PC-relative calls.

David Malcolm [Wed, 28 Jul 2021 18:47:54 +0000 (14:47 -0400)]
analyzer: play better with -fsanitize=bounds

gcc/analyzer/ChangeLog:
* region-model.cc (region_model::on_call_pre): Treat
IFN_UBSAN_BOUNDS, BUILT_IN_STACK_SAVE, and BUILT_IN_STACK_RESTORE
as no-ops, rather than handling them as unknown functions.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/torture/ubsan-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Wed, 28 Jul 2021 18:45:59 +0000 (14:45 -0400)]
analyzer: remove redundant return value from various impl_call_*

gcc/analyzer/ChangeLog:
* region-model-impl-calls.cc (region_model::impl_call_alloca):
Drop redundant return value.
(region_model::impl_call_builtin_expect): Likewise.
(region_model::impl_call_calloc): Likewise.
(region_model::impl_call_malloc): Likewise.
(region_model::impl_call_memset): Likewise.
(region_model::impl_call_operator_new): Likewise.
(region_model::impl_call_operator_delete): Likewise.
(region_model::impl_call_strlen): Likewise.
* region-model.cc (region_model::on_call_pre): Fix return value of
known functions that don't have unknown side-effects.
* region-model.h (region_model::impl_call_alloca): Drop redundant
return value.
(region_model::impl_call_builtin_expect): Likewise.
(region_model::impl_call_calloc): Likewise.
(region_model::impl_call_malloc): Likewise.
(region_model::impl_call_memset): Likewise.
(region_model::impl_call_strlen): Likewise.
(region_model::impl_call_operator_new): Likewise.
(region_model::impl_call_operator_delete): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Harald Anlauf [Wed, 28 Jul 2021 17:11:27 +0000 (19:11 +0200)]
Fortran: ICE in resolve_allocate_deallocate for invalid STAT argument

gcc/fortran/ChangeLog:

PR fortran/101564
* expr.c (gfc_check_vardef_context): Add check for KIND and LEN
parameter inquiries.
* match.c (gfc_match): Fix comment for %v code.
(gfc_match_allocate, gfc_match_deallocate): Replace use of %v code
by %e in gfc_match to allow for function references as STAT and
ERRMSG arguments.
* resolve.c (resolve_allocate_deallocate): Avoid NULL pointer
dereferences and shortcut for bad STAT and ERRMSG argument to
(DE)ALLOCATE.  Remove bogus parts of checks for STAT and ERRMSG.

gcc/testsuite/ChangeLog:

PR fortran/101564
* gfortran.dg/allocate_stat_3.f90: New test.
* gfortran.dg/allocate_stat.f90: Adjust error messages.
* gfortran.dg/implicit_11.f90: Likewise.
* gfortran.dg/inquiry_type_ref_3.f90: Likewise.

Jakub Jelinek [Wed, 28 Jul 2021 16:43:15 +0000 (18:43 +0200)]
ubsan: Fix ICEs with DECL_REGISTER tests [PR101624]

The following testcase ICEs, because the base is a CONST_DECL for
the Fortran parameter, and ubsan/sanopt uses DECL_REGISTER macro on it.
 /* In VAR_DECL and PARM_DECL nodes, nonzero means declared `register'.  */
 #define DECL_REGISTER(NODE) (DECL_WRTL_CHECK (NODE)->decl_common.decl_flag_0)
while CONST_DECL doesn't satisfy DECL_WRTL_CHECK.

The following patch explicitly checks for VAR_DECL/PARM_DECL/RESULT_DECL
before using DECL_REGISTER, and assumes other decls aren't DECL_REGISTER.
Not really sure about RESULT_DECL but it at least satisfies DECL_WRTL_CHECK...
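
The guard described above can be pictured with a small stand-alone model.
This is only a sketch under stated assumptions: DeclKind, Decl,
has_register_flag and is_register_decl are hypothetical names, and the real
code works on GCC tree nodes, which are not reproduced here.

```cpp
// Hypothetical, simplified model of the guard described above; the real
// code operates on GCC's `tree` nodes, which are not reproduced here.
enum class DeclKind { Var, Parm, Result, Const, Function };

struct Decl {
  DeclKind kind;
  bool register_flag;  // only meaningful for Var, Parm and Result
};

// True when the register flag may be consulted for this kind of decl.
inline bool has_register_flag(const Decl &d) {
  return d.kind == DeclKind::Var || d.kind == DeclKind::Parm ||
         d.kind == DeclKind::Result;
}

// Check the kind first; every other decl is treated as not `register`.
inline bool is_register_decl(const Decl &d) {
  return has_register_flag(d) && d.register_flag;
}
```

Checking the kind before reading the flag mirrors the idea of testing the
tree code before applying a checked accessor macro.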

2021-07-28  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/101624
* ubsan.c (maybe_instrument_pointer_overflow,
instrument_object_size): Only test DECL_REGISTER on VAR_DECLs,
PARM_DECLs or RESULT_DECLs.
* sanopt.c (maybe_optimize_ubsan_ptr_ifn): Likewise.

* gfortran.dg/ubsan/ubsan.exp: New file.
* gfortran.dg/ubsan/pr101624.f90: New test.

Jakub Jelinek [Wed, 28 Jul 2021 16:41:50 +0000 (18:41 +0200)]
match.pd: Fix up recent __builtin_bswap16 simplifications [PR101642]

The following testcase ICEs.  The problem is that for __builtin_bswap16
(and only that, others are fine) the argument of the builtin is promoted
to int while the patterns assume it is not and is the same as that of
the return type.
For the bswap simplifications that predate these new ones, it just means we
fail to optimize things like __builtin_bswap16 (__builtin_bswap16 (x))
because there are casts in between.  The last new one, equality comparison
of __builtin_bswap16 with an integer constant, results in an ICE because
we create a comparison with incompatible operand types.  The other one
might be fine because usually we bitwise-and the operand before promoting,
but I think it is too dangerous to rely on that: one day we may find out
that, because it is an operand to such a builtin, we can throw away any
changes that affect the upper bits, and all of a sudden it would misbehave.

So, this patch introduces converts that shouldn't do anything for
bswap{32,64,128} and should fix these issues for bswap16.
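
The promotion issue can be sketched in plain C++.  This is an illustration
only, not the match.pd pattern: it assumes the simplification in question
rewrites bswap16 (x) == cst as x == bswap16 (cst), and it uses a portable
stand-in (the hypothetical bswap16 below) instead of __builtin_bswap16.

```cpp
#include <cstdint>

// Portable stand-in for __builtin_bswap16: swaps the two bytes of a
// 16-bit value.
inline std::uint16_t bswap16(std::uint16_t x) {
  return static_cast<std::uint16_t>((x << 8) | (x >> 8));
}

// Model of a caller whose argument has been promoted to int: convert back
// to the 16-bit type before comparing, so both operands have the same type.
inline bool bswap16_equals(int promoted_x, std::uint16_t cst) {
  const std::uint16_t x = static_cast<std::uint16_t>(promoted_x);
  return bswap16(x) == cst;
}

// The simplified form: compare the unswapped operand against the swapped
// constant, again in the 16-bit type.
inline bool bswap16_equals_folded(int promoted_x, std::uint16_t cst) {
  return static_cast<std::uint16_t>(promoted_x) == bswap16(cst);
}
```

With the explicit conversions, both forms agree even when the promoted
argument carries bits above the low 16.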

2021-07-28  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/101642
* match.pd (bswap16 (x) == bswap16 (y)): Cast both operands
to type of bswap16 for comparison.
(bswap16 (x) == cst): Cast bswap16 operand to type of cst.

* gcc.c-torture/compile/pr101642.c: New test.

Ilya Leoshkevich [Fri, 9 Jul 2021 11:27:55 +0000 (13:27 +0200)]
IBM Z: Fix 5 tests in 31-bit mode

gcc/testsuite/ChangeLog:

* gcc.target/s390/global-array-element-pic2.c: Add -mzarch, add
an expectation for 31-bit mode.
* gcc.target/s390/load-imm64-1.c: Use unsigned long long.
* gcc.target/s390/load-imm64-2.c: Likewise.
* gcc.target/s390/vector/long-double-vx-macro-off-on.c: Use
-mzarch.
* gcc.target/s390/vector/long-double-vx-macro-on-off.c:
Likewise.

Richard Biener [Wed, 28 Jul 2021 13:12:00 +0000 (15:12 +0200)]
tree-optimization/101615 - SLP permute opt with CTOR roots

CTOR roots are not explicitly represented, so we have to make sure
to materialize permutes on SLP graph entries to them.

2021-07-28  Richard Biener  <rguenther@suse.de>

PR tree-optimization/101615
* tree-vect-slp.c (vect_optimize_slp): Materialize permutes
at CTOR SLP graph entries.

* gcc.dg/vect/bb-slp-pr101615-2.c: New testcase.

Kyrylo Tkachov [Wed, 28 Jul 2021 15:34:03 +0000 (16:34 +0100)]
aarch64: Add smov alternative to sign_extend pattern

In the testcase here we were generating a umov + sxth to move
a half-word value from SIMD to GP regs with sign-extension.
We can use a single smov instruction for it instead but the
sign-extend pattern was missing the right alternative.
The *zero_extend<SHORT:mode><GPI:mode>2_aarch64 pattern for
zero-extension already has the right alternative for
the analogous umov instruction, so this mirrors that pattern.

Bootstrapped and tested on aarch64-none-linux-gnu.

The test gcc.target/aarch64/sve/clastb_4.c is adjusted to scan for
the clastb  h0, p0, h0, z0.h form
instead of
the clastb  w0, p0, w0, z0.h form.

This is an improvement as the W forms of the clast instructions are more expensive.

gcc/ChangeLog:

* config/aarch64/aarch64.md (*extend<SHORT:mode><GPI:mode>2_aarch64):
Add "r,w" alternative.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/smov_1.c: New test.
* gcc.target/aarch64/sve/clastb_4.c: Adjust clast scan-assembler.

H.J. Lu [Tue, 27 Jul 2021 14:46:04 +0000 (07:46 -0700)]
x86: Don't set AVX_U128_DIRTY when zeroing YMM/ZMM register

There is no SSE <-> AVX transition penalty if the upper bits of YMM/ZMM
registers are unchanged and YMM/ZMM store doesn't change the upper bits
of YMM/ZMM registers.

1. Since zeroing YMM/ZMM register is implemented with zeroing XMM
register, don't set AVX_U128_DIRTY when zeroing YMM/ZMM register.
2. Since store doesn't change the INIT state on the upper bits of
YMM/ZMM register, don't set AVX_U128_DIRTY on store if the source
of store was never non-zero.

Here are the vzeroupper count differences on SPEC CPU 2017 with

-Ofast -march=skylake-avx512

                 Before   After     Diff
500.perlbench_r     226     225   -0.44%
502.gcc_r          1263    1103  -12.67%
503.bwaves_r         14      14    0.00%
505.mcf_r            29      28   -3.45%
507.cactuBSSN_r    4651    4628   -0.49%
508.namd_r          433     432   -0.23%
510.parest_r      20380   19347   -5.07%
511.povray_r        495     452   -8.69%
519.lbm_r             2       2    0.00%
520.omnetpp_r      5954    5677   -4.65%
521.wrf_r         12353   12339   -0.11%
523.xalancbmk_r   13137   13001   -1.04%
525.x264_r          192     191   -0.52%
526.blender_r      2515    2366   -5.92%
527.cam4_r         4601    4583   -0.39%
531.deepsjeng_r      20      19   -5.00%
538.imagick_r       898     805  -10.36%
541.leela_r         427     399   -6.56%
544.nab_r            74      74    0.00%
548.exchange2_r      72      72    0.00%
549.fotonik3d_r     318     318    0.00%
554.roms_r          558     554   -0.72%
557.xz_r             79      52  -34.18%

and performance differences are within noise range.

gcc/

PR target/101456
* config/i386/i386.c (ix86_avx_u128_mode_needed): Don't set
AVX_U128_DIRTY when all bits are zero.

gcc/testsuite/

PR target/101456
* gcc.target/i386/pr101456-1.c: New test.
* gcc.target/i386/pr101456-2.c: Likewise.

Richard Biener [Wed, 28 Jul 2021 12:16:35 +0000 (14:16 +0200)]
tree-optimization/101615 - SLP permute opt of existing vectors

This fixes one issue discovered when analyzing PR101615, namely
we happily push permutes to pre-existing vectors but end up
not actually permuting them.  In fact we don't want to, so force
materialization on the external.

It doesn't fix the original testcase though.

2021-07-28  Richard Biener  <rguenther@suse.de>

PR tree-optimization/101615
* tree-vect-slp.c (vect_optimize_slp): Pre-existing vector
external nodes cannot be permuted so make them perm_out 0.

* gcc.dg/vect/bb-slp-pr101615-1.c: New testcase.

Andrew Stubbs [Tue, 27 Jul 2021 14:40:21 +0000 (15:40 +0100)]
amdgcn: Fix attributes for LLVM-12 [PR 100208]

This should work for a wider range of LLVM 12 variants now.
More work required for LLVM 13 though.

gcc/ChangeLog:

PR target/100208
* config.in: Regenerate.
* config/gcn/gcn-hsa.h (A_FIJI): New define.
(A_900): New define.
(A_906): New define.
(A_908): New define.
(ASM_SPEC): Use A_FIJI, A_900, A_906 and A_908.
* config/gcn/gcn.c (output_file_start): Adjust attributes according
to the assembler capabilities.
* config/gcn/mkoffload.c (main): Likewise.
* configure: Regenerate.
* configure.ac: Add tests for LLVM assembler attribute features.

Andrew MacLeod [Wed, 28 Jul 2021 12:30:02 +0000 (08:30 -0400)]
Return undefined on edges which are not executed.

When a branch has been folded, mark any range requests on the unexecutable edge as
UNDEFINED.

* gimple-range-gori.cc (gori_compute::outgoing_edge_range_p): Check for
cond_false and cond_true on branches.

Siddhesh Poyarekar [Wed, 28 Jul 2021 10:13:47 +0000 (15:43 +0530)]
analyzer: Handle strdup builtins

Consolidate allocator builtin handling and add support for
__builtin_strdup and __builtin_strndup.

gcc/analyzer/ChangeLog:

* analyzer.cc (is_named_call_p, is_std_named_call_p): Make
first argument a const_tree.
* analyzer.h (is_named_call_p, is_std_named_call_p): Likewise.
* sm-malloc.cc (known_allocator_p): New function.
(malloc_state_machine::on_stmt): Use it.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/strdup-1.c (test_4, test_5, test_6): New
tests.

Siddhesh Poyarekar [Wed, 28 Jul 2021 05:03:46 +0000 (10:33 +0530)]
analyzer: Recognize __builtin_free as a matching deallocator

Recognize __builtin_free as being equivalent to free when passed into
__attribute__((malloc ())), similar to how it is treated when it is
encountered as a call.  This fixes spurious warnings in glibc, where
the xmalloc family of allocators as well as reallocarray, memalign,
etc. are declared to have __builtin_free as the free function.
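
A declaration of the kind described above might look as follows.  This is a
minimal sketch, not glibc's actual declarations: my_xmalloc and the
DEALLOC_WITH_FREE macro are hypothetical, and the deallocator form of the
malloc attribute is assumed to be available only in GCC 11 and later, so it
is guarded and expands to nothing elsewhere.

```cpp
#include <cstddef>
#include <cstdlib>

// The deallocator form of the malloc attribute is assumed to be available
// only in GCC 11 and later; elsewhere the macro expands to nothing.
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11
#define DEALLOC_WITH_FREE __attribute__((malloc(__builtin_free, 1)))
#else
#define DEALLOC_WITH_FREE
#endif

// Hypothetical allocator: memory it returns is to be released with free.
DEALLOC_WITH_FREE void *my_xmalloc(std::size_t n);

// Aborts instead of returning a null pointer on allocation failure.
void *my_xmalloc(std::size_t n) {
  void *p = std::malloc(n ? n : 1);
  if (!p) std::abort();
  return p;
}
```

A caller that releases the result with free is then using the matching
deallocator named in the attribute.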

gcc/analyzer/ChangeLog:

* sm-malloc.cc
(malloc_state_machine::get_or_create_deallocator): Recognize
__builtin_free.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/attr-malloc-1.c (compatible_alloc,
compatible_alloc2): New extern allocator declarations.
(test_9, test_10): New tests.

Iain Buclaw [Tue, 27 Jul 2021 11:24:34 +0000 (13:24 +0200)]
d: Wrong evaluation order of binary expressions (PR101640)

The use of fold_build2 can in some cases swap the order of its operands
if that is the more optimal thing to do.  However, this breaks the semantic
guarantee of left-to-right evaluation in D.

PR d/101640

gcc/d/ChangeLog:

* expr.cc (binary_op): Use build2 instead of fold_build2.

gcc/testsuite/ChangeLog:

* gdc.dg/pr96429.d: Update test.
* gdc.dg/pr101640.d: New test.

Iain Buclaw [Mon, 26 Jul 2021 13:11:42 +0000 (15:11 +0200)]
d: fix ICE at convert_expr(tree_node*, Type*, Type*) (PR101490)

Both the front-end and code generator had a modulo by zero bug when testing if
a conversion from a static array to dynamic array was valid.
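
The class of bug described above can be sketched as follows.  This is a
hypothetical check, not the dmd or GDC code: sizes_convertible and its
parameters are illustrative names, and accepting only an empty source for
zero-sized elements is an assumption made for the example.

```cpp
#include <cstddef>

// Hypothetical check, not the dmd or GDC code: can `count` elements of
// `src_elem_size` bytes be viewed as a whole number of elements of
// `dst_elem_size` bytes?
inline bool sizes_convertible(std::size_t count, std::size_t src_elem_size,
                              std::size_t dst_elem_size) {
  const std::size_t total = count * src_elem_size;
  // Guard the zero-sized case first so the modulo below never divides by
  // zero; here only an empty source is accepted for zero-sized elements.
  if (dst_elem_size == 0) return total == 0;
  return total % dst_elem_size == 0;
}
```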

PR d/101490

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 27e388b4c.
* d-codegen.cc (build_array_index): Handle void arrays same as byte.
* d-convert.cc (convert_expr): Handle converting to zero-sized arrays.

gcc/testsuite/ChangeLog:

* gdc.dg/pr101490.d: New test.

Iain Buclaw [Mon, 26 Jul 2021 13:24:12 +0000 (15:24 +0200)]
d: __FUNCTION__ doesn't work in core.stdc.stdio functions without cast (PR101441)

Backports fix from upstream to allow __FUNCTION__ and
__PRETTY_FUNCTION__ to be used as C string literals.

Reviewed-on: https://github.com/dlang/dmd/pull/12923

PR d/101441

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd f8c1ca928.

Iain Buclaw [Sun, 25 Jul 2021 21:19:36 +0000 (23:19 +0200)]
d: Compile-time reflection for supported built-ins (PR101127)

In order to allow user-code to determine whether a back-end builtin is
available without error, LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE has been
defined to delay putting back-end builtin functions until the ISA that
defines them has been declared.

However in D, there is no global namespace.  All builtins get pushed
into the `gcc.builtins' module, which is constructed during the semantic
analysis pass, which has already finished by the time target attributes
are evaluated.  So builtins are not pushed by the new langhook because
they would be ultimately ignored.  Builtins exposed to D code can
therefore now only be altered from the command line.

PR d/101127

gcc/d/ChangeLog:

* d-builtins.cc (d_builtin_function_ext_scope): New function.
* d-lang.cc (LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE): Define.
* d-tree.h (d_builtin_function_ext_scope): Declare.

gcc/testsuite/ChangeLog:

* gdc.dg/pr101127a.d: New test.
* gdc.dg/pr101127b.d: New test.

Iain Buclaw [Sun, 25 Jul 2021 17:54:08 +0000 (19:54 +0200)]
d: Change in DotTemplateExp type semantics leading to regression (PR101619)

Giving dot templates a type meant that property resolution silently
started passing for code that should never have passed.  The simple fix
is to provide implementations for checkType and checkValue that give an
error about dot templates having neither a value nor type.

Reviewed-on: https://github.com/dlang/dmd/pull/12920

PR d/101619

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 1d8386a63.

3 years agoIBM Z: Enable LSan and TSan
Ilya Leoshkevich [Thu, 22 Jul 2021 13:51:56 +0000 (15:51 +0200)]
IBM Z: Enable LSan and TSan

libsanitizer/ChangeLog:

* configure.tgt (s390*-*-linux*): Enable LSan and TSan for
s390x.

3 years agoAArch64: use stable sorting in generating ldp/stp
Bin Cheng [Wed, 28 Jul 2021 09:50:59 +0000 (17:50 +0800)]
AArch64: use stable sorting in generating ldp/stp

In some corner cases, we have code as below:
  [base + 0x310] = A
  [base + 0x320] = B
  [base + 0x330] = C
  [base + 0x320] = D
Unstable sorting could leave the wrong value at offset 0x320.  The
patch fixes this by using gcc_stablesort.
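
As a rough illustration of why stability matters here, the sketch below sorts stores by offset and then replays them.  The `Store` record, `sort_stores` and `final_value` are hypothetical names invented for this example, and std::stable_sort stands in for gcc_stablesort; none of this is the aarch64 code itself.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for one store: target offset and stored value.
struct Store {
  int offset;
  char value;
};

// Sort stores by offset.  A stable sort keeps program order among
// stores that hit the same offset, so the last write still wins.
std::vector<Store> sort_stores(std::vector<Store> stores) {
  std::stable_sort(stores.begin(), stores.end(),
                   [](const Store &a, const Store &b) {
                     return a.offset < b.offset;
                   });
  return stores;
}

// Value left at OFFSET after executing the stores in the given order.
char final_value(const std::vector<Store> &stores, int offset) {
  char result = '\0';
  for (const Store &s : stores)
    if (s.offset == offset)
      result = s.value;
  return result;
}
```

With the four stores from the message above, B and D compare equal on offset, so an unstable sort would be free to swap them and leave B at 0x320.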

2021-07-28  Bin Cheng  <bin.cheng@linux.alibaba.com>

* config/aarch64/aarch64.c (aarch64_gen_adjusted_ldpstp): use
gcc_stablesort.

3 years agoDon't skip prologue/epilogue when initializing alias.
Bin Cheng [Wed, 28 Jul 2021 09:44:35 +0000 (17:44 +0800)]
Don't skip prologue/epilogue when initializing alias.

Register might be modified in prologue/epilogue, which shouldn't
be skipped in alias info analysis.

2021-07-28  Bin Cheng  <bin.cheng@linux.alibaba.com>

gcc/
* alias.c (init_alias_analysis): Don't skip prologue/epilogue.

3 years agoi386: Improve AVX2 expansion of vector >> vector DImode arithm. shifts [PR101611]
Jakub Jelinek [Wed, 28 Jul 2021 08:52:51 +0000 (10:52 +0200)]
i386: Improve AVX2 expansion of vector >> vector DImode arithm. shifts [PR101611]

AVX2 introduced vector >> vector shifts, but unfortunately for V{2,4}DImode
it only supports logical and not arithmetic shifts; only AVX512F for
V8DImode or AVX512VL for V{2,4}DImode fixed that omission.
Earlier in the GCC 12 cycle I committed vector >> scalar arithmetic shift
emulation using various sequences; this patch handles the vector >> vector
case.  No need to adjust costs, the previous cost adjustment actually
covers even the vector by vector shifts.
The patch emits the right arithmetic V{2,4}DImode shifts using 2 logical right
V{2,4}DImode shifts (one of the original operand and one of the sign mask
constant, each by the vector shift count), xor and subtraction.  On each
element, (long long) x >> y is done as
(((unsigned long long) x >> y) ^ (0x8000000000000000ULL >> y))
- (0x8000000000000000ULL >> y)
i.e. if x doesn't have the MSB set in some element, the result is just the
logical shift; if it does, the xor and subtraction cause all higher bits
to be set as well.
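
The per-element identity can be checked in scalar code.  The following is only a sketch of the arithmetic, not of the vector expansion; `ashr_emulated` is a name made up for this example, and it assumes a shift count in the range 0..63 and the usual two's complement conversion back to a signed type.

```cpp
#include <cassert>
#include <cstdint>

// Emulate a 64-bit arithmetic right shift with two logical shifts,
// an xor and a subtraction.  Hypothetical helper; requires y <= 63.
inline std::int64_t ashr_emulated(std::int64_t x, unsigned y) {
  assert(y < 64);
  const std::uint64_t sign = 0x8000000000000000ULL;
  std::uint64_t logical = static_cast<std::uint64_t>(x) >> y;  // logical shift of x
  std::uint64_t mask = sign >> y;  // where the sign bit ends up after the shift
  // If the shifted sign bit is set, xor clears it and the subtraction
  // borrows through all higher bits, setting them.  Otherwise the xor
  // sets the bit and the subtraction clears it again.
  return static_cast<std::int64_t>((logical ^ mask) - mask);
}
```

For example, `ashr_emulated(-8, 1)` gives -4 and `ashr_emulated(8, 1)` gives 4.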

2021-07-28  Jakub Jelinek  <jakub@redhat.com>

PR target/101611
* config/i386/sse.md (vashr<mode>3): Split into vashrv8di3 expander
and vashrv4di3 expander, where the latter requires just TARGET_AVX2
and has special !TARGET_AVX512VL expansion.
(vashrv2di3<mask_name>): Rename to ...
(vashrv2di3): ... this.  Change condition to TARGET_XOP || TARGET_AVX2
and add special !TARGET_XOP && !TARGET_AVX512VL expansion.

* gcc.target/i386/avx2-pr101611-1.c: New test.
* gcc.target/i386/avx2-pr101611-2.c: New test.

3 years agoCorrect a mistake in a warning for -Wnonnull.
Martin Uecker [Wed, 28 Jul 2021 06:41:38 +0000 (08:41 +0200)]
Correct a mistake in a warning for -Wnonnull.

In the -Wnonnull warning about array parameters that have bounds > 0
but are passed NULL, the numbers referring to the two arguments are
switched.  This patch corrects the mistake.

2021-07-28  Martin Uecker  <muecker@gwdg.de>

gcc/
* calls.c (maybe_warn_rdwr_sizes): Correct argument
numbers in warning that were switched.

gcc/testsuite/
* gcc.dg/Wnonnull-4.c: Correct argument numbers in warnings.

3 years agoBind(c): Improve error checking in CFI_* functions
Sandra Loosemore [Thu, 15 Jul 2021 15:48:45 +0000 (08:48 -0700)]
Bind(c): Improve error checking in CFI_* functions

This patch adds additional run-time checking for invalid arguments to
CFI_establish and CFI_setpointer.  It also changes existing messages
throughout the CFI_* functions to use PRIiPTR to format CFI_index_t
values instead of casting them to int and using %d (which may not work
on targets where int is a smaller type), simplifies wording of some
messages, and fixes issues with capitalization, typos, and the like.
Additionally some coding standards problems such as >80 character lines
are addressed.
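
A minimal sketch of the formatting change, assuming an index type with the width of intptr_t; `format_index` and the message text are invented for this example and are not taken from ISO_Fortran_binding.c.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: format an index with PRIiPTR rather than casting
// it to int and using %d, which could truncate where int is narrower.
std::string format_index(std::intptr_t index) {
  char buf[64];
  int n = std::snprintf(buf, sizeof buf,
                        "index %" PRIiPTR " out of range", index);
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
  return std::string(buf);
}
```

The PRIiPTR macro expands to the right length modifier for intptr_t on the target, so the format string needs no per-target adjustment.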

2021-07-24  Sandra Loosemore  <sandra@codesourcery.com>

PR libfortran/101317

libgfortran/
* runtime/ISO_Fortran_binding.c: Include <inttypes.h>.
(CFI_address): Tidy error messages and comments.
(CFI_allocate): Likewise.
(CFI_deallocate): Likewise.
(CFI_establish): Likewise.  Add new checks for validity of
elem_len when it's used, plus type argument and extents.
(CFI_is_contiguous): Tidy error messages and comments.
(CFI_section): Likewise.  Refactor some repetitive code to
make it more understandable.
(CFI_select_part): Likewise.
(CFI_setpointer): Likewise.  Check that source is not an
unallocated allocatable array or an assumed-size array.

gcc/testsuite/
* gfortran.dg/ISO_Fortran_binding_17.f90: Fix typo in error
message patterns.

3 years agoBind(c): Fix bugs in CFI_section
Sandra Loosemore [Sat, 17 Jul 2021 23:12:18 +0000 (16:12 -0700)]
Bind(c): Fix bugs in CFI_section

CFI_section was incorrectly adjusting the base pointer for the result
array twice in different ways.  It was also overwriting the array
dimension info in the result descriptor before computing the base
address offset from the source descriptor, which caused problems if
the two descriptors are the same.  This patch fixes both problems and
makes the code simpler, too.

A consequence of this patch is that the result array is now 0-based in
all dimensions instead of using a numbering that matches the first
element of the source array.  The Fortran standard only specifies the
shape of the result array, not its lower bounds, so this is permitted
and probably less confusing for users as well as implementors.

2021-07-17  Sandra Loosemore  <sandra@codesourcery.com>

PR libfortran/101310

libgfortran/
* runtime/ISO_Fortran_binding.c (CFI_section): Fix the base
address computation and simplify the code.

gcc/testsuite/
* gfortran.dg/ISO_Fortran_binding_1.c (section_c): Remove
incorrect assertions.

3 years agoFix ISO_Fortran_binding.h paths in gfortran testsuite
Sandra Loosemore [Thu, 8 Jul 2021 19:00:57 +0000 (12:00 -0700)]
Fix ISO_Fortran_binding.h paths in gfortran testsuite

ISO_Fortran_binding.h is now generated in the libgfortran build
directory where it is on the default include path.  Adjust includes in
the gfortran testsuite not to include an explicit path pointing at the
source directory.

2021-07-27  Sandra Loosemore  <sandra@codesourcery.com>

gcc/testsuite/
PR libfortran/101305
* gfortran.dg/ISO_Fortran_binding_1.c: Adjust include path.
* gfortran.dg/ISO_Fortran_binding_10.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_11.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_12.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_15.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_16.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_17.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_18.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_3.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_5.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_6.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_7.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_8.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_9.c: Likewise.
* gfortran.dg/PR94327.c: Likewise.
* gfortran.dg/PR94331.c: Likewise.
* gfortran.dg/bind_c_array_params_3_aux.c: Likewise.
* gfortran.dg/iso_fortran_binding_uint8_array_driver.c: Likewise.
* gfortran.dg/pr93524.c: Likewise.

3 years agoBind(C): Correct sizes of some types in CFI_establish
Sandra Loosemore [Thu, 8 Jul 2021 23:38:14 +0000 (16:38 -0700)]
Bind(C): Correct sizes of some types in CFI_establish

CFI_establish was failing to set the default elem_len correctly for
CFI_type_cptr, CFI_type_cfunptr, CFI_type_long_double, and
CFI_type_long_double_Complex.

2021-07-13  Sandra Loosemore  <sandra@codesourcery.com>

libgfortran/
PR libfortran/101305
* runtime/ISO_Fortran_binding.c (CFI_establish): Special-case
CFI_type_cptr and CFI_type_cfunptr.  Correct size of long double
on targets where it has kind 10.

3 years agoBind(C): Fix type encodings in ISO_Fortran_binding.h
Sandra Loosemore [Thu, 8 Jul 2021 15:21:20 +0000 (08:21 -0700)]
Bind(C): Fix type encodings in ISO_Fortran_binding.h

ISO_Fortran_binding.h had many incorrect hardwired kind encodings in
the definitions of the CFI_type_* macros.  Additionally, not all
targets support all the defined type encodings, and the Fortran
standard requires those macros to have a negative value.

This patch changes ISO_Fortran_binding.h to use sizeof instead of
hard-coded sizes, and assembles it from fragments that reflect the
set of types supported by the target.

2021-07-22  Sandra Loosemore  <sandra@codesourcery.com>
    Tobias Burnus  <tobias@codesourcery.com>

libgfortran/
PR libfortran/101305
* ISO_Fortran_binding.h: Fix hard-coded sizes and split into...
* ISO_Fortran_binding-1-tmpl.h: New file.
* ISO_Fortran_binding-2-tmpl.h: New file.
* ISO_Fortran_binding-3-tmpl.h: New file.
* Makefile.am: Add rule for generating ISO_Fortran_binding.h.
Adjust pathnames to that file.
* Makefile.in: Regenerated.
* mk-kinds-h.sh: New file.
* runtime/ISO_Fortran_binding.c: Fix include path.

3 years agovect: Fix wrong check in vect_recog_mulhs_pattern [PR101596]
Kewen Lin [Wed, 28 Jul 2021 03:04:22 +0000 (22:04 -0500)]
vect: Fix wrong check in vect_recog_mulhs_pattern [PR101596]

As PR101596 showed, vect_recog_mulhs_pattern uses target_precision to
check whether the scale_term is the expected one.  That check can be
wrong when the precision of the new_type actually used is larger than
target_precision, as shown by the example.

This patch uses the precision of new_type instead of target_precision
for the scale_term matching check.

Bootstrapped & regtested on powerpc64le-linux-gnu P10,
powerpc64-linux-gnu P8, x86_64-redhat-linux and aarch64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/101596
* tree-vect-patterns.c (vect_recog_mulhs_pattern): Fix wrong check
by using new_type's precision instead.

gcc/testsuite/ChangeLog:

PR tree-optimization/101596
* gcc.target/powerpc/pr101596-1.c: New test.
* gcc.target/powerpc/pr101596-2.c: Likewise.
* gcc.target/powerpc/pr101596-3.c: Likewise.

3 years agoAdd the member integer_to_sse to processor_cost as a cost simulation for movd/pinsrd...
liuhongt [Fri, 26 Mar 2021 02:56:47 +0000 (10:56 +0800)]
Add the member integer_to_sse to processor_cost as a cost simulation for movd/pinsrd. It will be used to calculate the cost of vec_construct.

gcc/ChangeLog:

PR target/99881
* config/i386/i386.h (processor_costs): Add new member
integer_to_sse.
* config/i386/x86-tune-costs.h (ix86_size_cost, i386_cost,
i486_cost, pentium_cost, lakemont_cost, pentiumpro_cost,
geode_cost, k6_cost, athlon_cost, k8_cost, amdfam10_cost,
bdver_cost, znver1_cost, znver2_cost, znver3_cost,
btver1_cost, btver2_cost, btver3_cost, pentium4_cost,
nocona_cost, atom_cost, atom_cost, slm_cost, intel_cost,
generic_cost, core_cost): Initialize integer_to_sse same value
as sse_op.
(skylake_cost): Initialize integer_to_sse twice as much as sse_op.
* config/i386/i386.c (ix86_builtin_vectorization_cost):
Use integer_to_sse instead of sse_op to calculate the cost of
vec_construct.

gcc/testsuite/ChangeLog:

PR target/99881
* gcc.target/i386/pr99881.c: New test.

3 years agoDaily bump.
GCC Administrator [Wed, 28 Jul 2021 00:16:25 +0000 (00:16 +0000)]
Daily bump.

3 years agors6000: Write static initializations for overload tables
Bill Schmidt [Tue, 27 Jul 2021 03:07:19 +0000 (23:07 -0400)]
rs6000: Write static initializations for overload tables

2021-06-07  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-gen-builtins.c (write_ovld_static_init): New
function.
(write_init_file): Call write_ovld_static_init.

3 years agors6000: Write static initializations for built-in table
Bill Schmidt [Tue, 27 Jul 2021 03:04:44 +0000 (23:04 -0400)]
rs6000: Write static initializations for built-in table

2021-07-26  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-gen-builtins.c (write_bif_static_init): New
function.
(write_init_file): Call write_bif_static_init.

3 years agors6000: Write output to the builtins init file, part 3 of 3
Bill Schmidt [Tue, 27 Jul 2021 15:31:20 +0000 (11:31 -0400)]
rs6000: Write output to the builtins init file, part 3 of 3

2021-07-27  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-gen-builtins.c (typemap): New struct.
(TYPE_MAP_SIZE): New macro.
(type_map): New initialized variable.
(typemap_cmp): New function.
(write_type_node): Likewise.
(write_fntype_init): Implement.

3 years agoLet -Wuninitialized assume built-ins don't change const arguments [PR101584].
Martin Sebor [Tue, 27 Jul 2021 22:02:54 +0000 (16:02 -0600)]
Let -Wuninitialized assume built-ins don't change const arguments [PR101584].

PR tree-optimization/101584 - missing -Wuninitialized with an allocated object after a built-in call

gcc/ChangeLog:

PR tree-optimization/101584
* tree-ssa-uninit.c (builtin_call_nomodifying_p): New function.
(check_defs): Call it.

gcc/testsuite/ChangeLog:

PR tree-optimization/101584
* gcc.dg/uninit-38.c: Remove assertions.
* gcc.dg/uninit-41.c: New test.

3 years agolibstdc++: Simplify std::optional::value()
Jonathan Wakely [Tue, 27 Jul 2021 13:50:28 +0000 (14:50 +0100)]
libstdc++: Simplify std::optional::value()

The structure of these functions likely dates from the time before G++
fully supported C++14 extended constexpr, so that the throw expression
had to be the operand of a conditional expression. That is not true now,
so we can use a more straightforward version of the code.

We can also simplify the declaration of __throw_bad_optional_access by
using the C++11-style [[noreturn]] attribute so that a separate
declaration isn't needed.
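
The two styles can be compared on a toy type.  This is a sketch under stated assumptions: `MaybeInt`, `throw_bad_access`, `value_old` and `value_new` are hypothetical names for illustration, not the libstdc++ implementation.

```cpp
#include <stdexcept>

// Hypothetical stand-in for __throw_bad_optional_access.  The C++11
// attribute sits on the definition, so no separate declaration is needed.
[[noreturn]] inline void throw_bad_access() {
  throw std::logic_error("bad optional access");
}

// Toy optional-like holder, for illustration only.
struct MaybeInt {
  bool engaged;
  int payload;

  // C++11 constexpr style: the throwing call has to be an operand of
  // the conditional expression.
  constexpr const int &value_old() const {
    return engaged ? payload : (throw_bad_access(), payload);
  }

  // C++14 extended constexpr allows a plain if statement.
  constexpr const int &value_new() const {
    if (engaged)
      return payload;
    throw_bad_access();
  }
};
```

Both members behave the same way: they return the payload when engaged and throw otherwise.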

libstdc++-v3/ChangeLog:

* include/experimental/optional (__throw_bad_optional_access):
Replace GNU attribute with C++11 attribute.
(optional::value, optional::value_or): Use if statements
instead of conditional expressions.
* include/std/optional (__throw_bad_optional_access)
(optional::value, optional::value_or): Likewise.

3 years agotestsuite: Add missing C++ includes to tests [PR101646]
Jonathan Wakely [Tue, 27 Jul 2021 20:29:10 +0000 (21:29 +0100)]
testsuite: Add missing C++ includes to tests [PR101646]

These tests stopped working after some libstdc++ refactoring, because
they aren't including what they use.
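
A generic illustration of "including what you use", not the actual change made to these two tests: each standard type that the code names gets its own header instead of relying on another header to pull it in.  The functions `sum` and `make_grid` are made up for this example.

```cpp
#include <array>             // std::array is named below
#include <initializer_list>  // so is std::initializer_list

// Sum a braced list; relies on <initializer_list> being included directly.
inline int sum(std::initializer_list<int> values) {
  int total = 0;
  for (int v : values)
    total += v;
  return total;
}

// Needs the complete std::array type, which only <array> guarantees.
inline std::array<std::array<int, 16>, 16> make_grid() {
  std::array<std::array<int, 16>, 16> grid{};
  grid[3][5] = 7;
  return grid;
}
```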

gcc/testsuite/ChangeLog:

PR testsuite/101646
* g++.dg/coroutines/pr99047.C:
* g++.dg/pr71655.C:

3 years agoUse OEP_DECL_NAME when comparing VLA bounds [PR101585].
Martin Sebor [Tue, 27 Jul 2021 19:51:55 +0000 (13:51 -0600)]
Use OEP_DECL_NAME when comparing VLA bounds [PR101585].

Resolves:
PR c/101585 - Bad interaction of -fsanitize=undefined and -Wvla-parameters

gcc/c-family:

PR c/101585
* c-warn.c (warn_parm_ptrarray_mismatch): Use OEP_DECL_NAME.

gcc/testsuite:
PR c/101585
* gcc.dg/Wvla-parameter-13.c: New test.

3 years agoImplement OpenMP 5.1 section 3.15: omp_display_env
Ulrich Drepper [Tue, 27 Jul 2021 19:08:41 +0000 (21:08 +0200)]
Implement OpenMP 5.1 section 3.15: omp_display_env

This is a new interface which is easily implemented using the
already existing code for the handling of the OMP_DISPLAY_ENV
environment variable.

libgomp/
* env.c (wait_policy, stacksize): New static variables,
move out of handle_omp_display_env.
(omp_display_env): New function.  The meat of the old
handle_omp_display_env function.
(handle_omp_display_env): Change to not take parameters
and instead use the global variables.  Only perform
parsing, defer to omp_display_env for the implementation.
(initialize_env): Remove local variables wait_policy and
stacksize.  Don't pass parameters to handle_omp_display_env.
* fortran.c: Add ialias_redirect for omp_display_env.
(omp_display_env_, omp_display_env_8_): New functions.
* libgomp.map (OMP_5.1): New version.  Add omp_display_env,
omp_display_env_, and omp_display_env_8_.
* omp.h.in: Declare omp_display_env.
* omp_lib.f90.in: Likewise.
* omp_lib.h.in: Likewise.

3 years agoFix argument to pthread_join
Jeff Law [Tue, 27 Jul 2021 18:14:05 +0000 (14:14 -0400)]
Fix argument to pthread_join

gcc/testsuite
* g++.dg/gcov/gcov-threads-1.C: Fix argument to pthread_join.

3 years agoAbstract out (forward) jump threader state handling.
Aldy Hernandez [Thu, 15 Jul 2021 13:06:36 +0000 (15:06 +0200)]
Abstract out (forward) jump threader state handling.

The *forward* jump threader has multiple places where it pushes and
pops state, and where it sets context up for the jump threading
simplifier callback.  Not only are the idioms repetitive, but the only
reason for passing const_and_copies, avail_exprs_stack, and the evrp
engine around is so we can set up context.

As part of my jump threading work, I will divorce the evrp engine from
the DOM jump threader, replacing it with a subset of the path solver I
have just contributed.  Since this will entail passing even more
context around, I've abstracted out the state handling so it can be
passed around in one object.  This cleans up the code, and also makes
it trivial to set up context with another engine in the future.

FWIW, I've used these cleanups and the path solver in a POC to improve
DOM's threaded edges by an additional 5%, and the overall threading
opportunities in the compiler by 1%.  This is in addition to the gains
I have documented in the backwards threader rewrite.

There are no functional changes with this patch.

Tested on x86-64 Linux.

gcc/ChangeLog:

* tree-ssa-dom.c (dom_jump_threader_simplifier):
Put avail_exprs_stack in the class, instead of passing it to
jump_threader_simplifier.
(dom_jump_threader_simplifier::simplify): Add state argument.
(dom_opt_dom_walker): Add state.
(pass_dominator::execute): Pass state to threader.
(dom_opt_dom_walker::before_dom_children): Use state.
* tree-ssa-threadedge.c (jump_threader::jump_threader): Replace
arguments by state.
(jump_threader::record_temporary_equivalences_from_phis):
Register equivalences through the state variable.
(jump_threader::record_temporary_equivalences_from_stmts_at_dest):
Record ranges in a statement through the state variable.
(jump_threader::simplify_control_stmt_condition): Pass state to
simplify.
(jump_threader::simplify_control_stmt_condition_1): Same.
(jump_threader::thread_around_empty_blocks): Remove obsolete
comment.
(jump_threader::thread_through_normal_block): Record equivalences
on edge through the state variable.
(jump_threader::thread_across_edge): Abstract state pushing.
(jt_state::jt_state): New.
(jt_state::push): New.
(jt_state::pop): New.
(jt_state::register_equiv): New.
(jt_state::record_ranges_from_stmt): New.
(jt_state::register_equivs_on_edge): New.
(jump_threader_simplifier::jump_threader_simplifier): Move from
header.
(jump_threader_simplifier::simplify): Add state argument.
* tree-ssa-threadedge.h (class jt_state): New.
(class jump_threader): Add state to constructor.
(class jump_threader_simplifier): Add state to simplify.  Remove
avail_exprs_stack from class.
* tree-vrp.c (vrp_jump_threader_simplifier::simplify): Add state
argument.
(vrp_jump_threader::vrp_jump_threader): Add state.
(vrp_jump_threader::~vrp_jump_threader): Cleanup state.

3 years agoc++: Reject ordered comparison of null pointers [PR99701]
Marek Polacek [Fri, 16 Jul 2021 19:58:01 +0000 (15:58 -0400)]
c++: Reject ordered comparison of null pointers [PR99701]

When implementing DR 1512 in r11-467 I neglected to reject ordered
comparison of two null pointers, like nullptr < nullptr.  This patch
fixes that omission.

DR 1512
PR c++/99701

gcc/cp/ChangeLog:

* cp-gimplify.c (cp_fold): Remove {LE,LT,GE,GT_EXPR} from
a switch.
* typeck.c (cp_build_binary_op): Reject ordered comparison
of two null pointers.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nullptr11.C: Remove invalid tests.
* g++.dg/cpp0x/nullptr46.C: Add dg-error.
* g++.dg/cpp2a/spaceship-err7.C: New test.
* g++.dg/expr/ptr-comp4.C: New test.

libstdc++-v3/ChangeLog:

* testsuite/20_util/tuple/comparison_operators/overloaded.cc:
Move a line...
* testsuite/20_util/tuple/comparison_operators/overloaded2.cc:
...here.  New test.

3 years agolibstdc++: Adjust whitespace in <bits/cow_string.h>
Jonathan Wakely [Tue, 27 Jul 2021 11:13:42 +0000 (12:13 +0100)]
libstdc++: Adjust whitespace in <bits/cow_string.h>

libstdc++-v3/ChangeLog:

* include/bits/cow_string.h: Consistently use tab for
indentation.

3 years agolibstdc++: Move COW string definitions to separate header
Jonathan Wakely [Mon, 26 Jul 2021 14:08:00 +0000 (15:08 +0100)]
libstdc++: Move COW string definitions to separate header

This moves the definitions of the COW string to a separate file, so that
they don't need to be preprocessed for the common case. We could also
move the SSO string definitions to a new file, so that they don't need
to be preprocessed for the old ABI case, but that would require more
shovel work because there are some parts of <bits/basic_string.h> and
<bits/basic_string.tcc> that are common to both definitions.

libstdc++-v3/ChangeLog:

* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/bits/basic_string.h [!_GLIBCXX_USE_CXX11_ABI]
(basic_string): Move definition of Copy-on-Write string to
new file.
* include/bits/basic_string.tcc: Likewise.
* include/bits/cow_string.h: New file.

3 years agolibstdc++: Remove unnecessary uses of <utility>
Jonathan Wakely [Thu, 22 Jul 2021 13:48:27 +0000 (14:48 +0100)]
libstdc++: Remove unnecessary uses of <utility>

The <algorithm> header includes <utility>, with a comment referring to
UK-300, a National Body comment on the C++11 draft. That comment
proposed to move std::swap to <utility> and then require <algorithm> to
include <utility>. The comment was rejected, so we do not need to
implement the suggestion. For backwards compatibility with C++03 we do
want <algorithm> to define std::swap, but it does so anyway via
<bits/move.h>. We don't need the whole of <utility> to do that.

A few other headers that need std::swap can include <bits/move.h> to
get it, instead of <utility>.

There are several headers that include <utility> to get std::pair, but
they can use <bits/stl_pair.h> to get it without also including the
rel_ops namespace and other contents of <utility>.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/std/algorithm: Do not include <utility>.
* include/std/functional: Likewise.
* include/std/regex: Include <bits/stl_pair.h> instead of
<utility>.
* include/debug/map.h: Likewise.
* include/debug/multimap.h: Likewise.
* include/debug/multiset.h: Likewise.
* include/debug/set.h: Likewise.
* include/debug/vector: Likewise.
* include/bits/fs_path.h: Likewise.
* include/bits/unique_ptr.h: Do not include <utility>.
* include/experimental/any: Likewise.
* include/experimental/executor: Likewise.
* include/experimental/memory: Likewise.
* include/experimental/optional: Likewise.
* include/experimental/socket: Use __exchange instead
of std::exchange.
* src/filesystem/ops-common.h: Likewise.
* testsuite/20_util/default_delete/48631_neg.cc: Adjust expected
errors to not use a hardcoded line number.
* testsuite/20_util/default_delete/void_neg.cc: Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/constrained.cc:
Include <utility> for std::as_const.
* testsuite/20_util/specialized_algorithms/uninitialized_default_construct/constrained.cc:
Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_move/constrained.cc:
Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_value_construct/constrained.cc:
Likewise.
* testsuite/23_containers/vector/cons/destructible_debug_neg.cc:
Adjust dg-error line number.

3 years agolibstdc++: Reduce header dependencies on <array> and <utility>
Jonathan Wakely [Thu, 22 Jul 2021 13:48:27 +0000 (14:48 +0100)]
libstdc++: Reduce header dependencies on <array> and <utility>

This refactoring reduces the memory usage and compilation time to parse
a number of headers that depend on std::pair, std::tuple or std::array.
Previously the headers for these class templates were all intertwined,
due to the common dependency on std::tuple_size, std::tuple_element and
their std::get overloads. This decouples the headers by moving some
parts of <utility> into a new <bits/utility.h> header. This means that
<array> and <tuple> no longer need to include the whole of <utility>,
and <tuple> no longer needs to include <array>.

This decoupling benefits headers such as <thread> and <scoped_allocator>
which only need std::tuple, and so no longer have to parse std::array.

Some other headers such as <any>, <optional> and <variant> no longer
need to include <utility> just for the std::in_place tag types, so
do not have to parse the std::pair definitions.

Removing direct uses of <utility> also means that the std::rel_ops
namespace is not transitively declared by other headers.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/Makefile.am: Add bits/utility.h header.
* include/Makefile.in: Regenerate.
* include/bits/utility.h: New file.
* include/std/utility (tuple_size, tuple_element): Move
to new header.
* include/std/type_traits (__is_tuple_like_impl<tuple<T...>>):
Move to <tuple>.
(_Index_tuple, _Build_index_tuple, integer_sequence): Likewise.
(in_place_t, in_place_index_t, in_place_type_t): Likewise.
* include/bits/ranges_util.h: Include new header instead of
<utility>.
* include/bits/stl_pair.h (tuple_size, tuple_element): Move
partial specializations for std::pair here.
(get): Move overloads for std::pair here.
* include/std/any: Include new header instead of <utility>.
* include/std/array: Likewise.
* include/std/memory_resource: Likewise.
* include/std/optional: Likewise.
* include/std/variant: Likewise.
* include/std/tuple: Likewise.
(__is_tuple_like_impl<tuple<T...>>): Move here.
(get): Declare overloads for std::array.
* include/std/version (__cpp_lib_tuples_by_type): Change type
to long.
* testsuite/20_util/optional/84601.cc: Include <utility>.
* testsuite/20_util/specialized_algorithms/uninitialized_fill/constrained.cc:
Likewise.
* testsuite/23_containers/array/tuple_interface/get_neg.cc:
Adjust dg-error line numbers.
* testsuite/std/ranges/access/cbegin.cc: Include <utility>.
* testsuite/std/ranges/access/cend.cc: Likewise.
* testsuite/std/ranges/access/end.cc: Likewise.
* testsuite/std/ranges/single_view.cc: Likewise.

3 years agoImplement basic block path solver.
Aldy Hernandez [Tue, 15 Jun 2021 10:20:43 +0000 (12:20 +0200)]
Implement basic block path solver.

This is the main basic block path solver for use in the ranger-based
backwards threader.  Given a path of BBs, the class can solve the final
conditional or any SSA name used in calculating the final conditional.

gcc/ChangeLog:

* Makefile.in (OBJS): Add gimple-range-path.o.
* gimple-range-path.cc: New file.
* gimple-range-path.h: New file.

3 years agosimplify-rtx: Push sign/zero-extension inside vec_duplicate
Jonathan Wright [Fri, 16 Jul 2021 14:34:38 +0000 (15:34 +0100)]
simplify-rtx: Push sign/zero-extension inside vec_duplicate

As a general principle, vec_duplicate should be as close to the root
of an expression as possible. Where unary operations have
vec_duplicate as an argument, these operations should be pushed
inside the vec_duplicate.

This patch modifies unary operation simplification to push
sign/zero-extension of a scalar inside vec_duplicate.

This patch also updates all RTL patterns in aarch64-simd.md to use
the new canonical form.

gcc/ChangeLog:

2021-07-19  Jonathan Wright  <jonathan.wright@arm.com>

* config/aarch64/aarch64-simd.md: Push sign/zero-extension
inside vec_duplicate for all patterns.
* simplify-rtx.c (simplify_context::simplify_unary_operation_1):
Push sign/zero-extension inside vec_duplicate.

3 years agoDon't use libgomp 'cbuf' buffering with OpenACC 'async'
Thomas Schwinge [Fri, 23 Jul 2021 20:01:32 +0000 (22:01 +0200)]
Don't use libgomp 'cbuf' buffering with OpenACC 'async'

The host data might not be computed yet (by an earlier asynchronous compute
region, for example).

libgomp/
* target.c (gomp_coalesce_buf_add): Update comment.
(gomp_copy_host2dev, gomp_map_vars_internal): Don't expect to see
'aq && cbuf'.
(gomp_map_vars_internal): Only 'if (!aq)', do
'gomp_coalesce_buf_add'.
* testsuite/libgomp.oacc-c-c++-common/async-data-1-2.c: Remove
XFAIL.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
3 years agoFix OpenACC "ephemeral" asynchronous host-to-device copies
Julian Brown [Tue, 29 Jun 2021 23:42:03 +0000 (16:42 -0700)]
Fix OpenACC "ephemeral" asynchronous host-to-device copies

This patch fixes several places in libgomp/target.c where "ephemeral" data
(on the stack or in temporary heap locations) may be used as the source of
an asynchronous host-to-device copy that may not complete before the host
data disappears.
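
The hazard and the snapshot approach can be sketched with a toy queue.  Everything here is an assumption made for illustration: `AsyncQueue` and `copy_host2dev` are invented names, and a real offloading plugin queues device transfers rather than std::function objects.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for an asynchronous queue: operations are only
// recorded when queued and run later, on flush.
struct AsyncQueue {
  std::vector<std::function<void()>> pending;
  void flush() {
    for (auto &op : pending)
      op();
    pending.clear();  // destroying the closures frees any snapshots
  }
};

// Queue a copy of SIZE bytes from SRC to DST.  When EPHEMERAL is true,
// the source is snapshotted immediately, so the caller's buffer may
// disappear before the deferred copy actually runs.
void copy_host2dev(AsyncQueue &q, void *dst, const void *src,
                   std::size_t size, bool ephemeral) {
  if (ephemeral) {
    const unsigned char *bytes = static_cast<const unsigned char *>(src);
    auto snapshot =
        std::make_shared<std::vector<unsigned char>>(bytes, bytes + size);
    q.pending.push_back([dst, snapshot] {
      std::memcpy(dst, snapshot->data(), snapshot->size());
    });
  } else {
    // Non-ephemeral source: the caller guarantees it outlives the copy.
    q.pending.push_back([dst, src, size] { std::memcpy(dst, src, size); });
  }
}
```

If a stack variable is queued with `ephemeral` set and then overwritten before `flush`, the destination still receives the value it held at queue time.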

An existing, but flawed, workaround for this problem in the AMD GCN
libgomp offloading plugin is currently present on mainline, and was
posted for the og9 branch here:

  https://gcc.gnu.org/legacy-ml/gcc-patches/2019-08/msg00901.html

and previous versions of this patch were posted here (for mainline/og9):

  https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg01482.html
  https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01026.html

libgomp/
* libgomp.h (gomp_copy_host2dev): Update prototype.
* oacc-mem.c (memcpy_tofrom_device, update_dev_host): Add new
argument to gomp_copy_host2dev (false).
* plugin/plugin-gcn.c (struct copy_data): Remove free_src field.
(copy_data): Don't free src.
(queue_push_copy): Remove free_src handling.
(GOMP_OFFLOAD_dev2dev): Update call to queue_push_copy.
(GOMP_OFFLOAD_openacc_async_host2dev): Remove source-data
snapshotting.
(GOMP_OFFLOAD_openacc_async_dev2host): Update call to
queue_push_copy.
* target.c (goacc_device_copy_async): Add SRCADDR_ORIG parameter.
(gomp_copy_host2dev): Add EPHEMERAL parameter.  Snapshot source
data when true, and set up deferred freeing of temporary buffer.
(gomp_copy_dev2host): Update call to goacc_device_copy_async.
(gomp_map_vars_existing, gomp_map_pointer, gomp_attach_pointer)
(gomp_detach_pointer, gomp_map_vars_internal, gomp_update): Update
calls to gomp_copy_host2dev with appropriate ephemeral argument.
* testsuite/libgomp.oacc-c-c++-common/async-data-1-1.c: Remove
XFAIL.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
3 years agoAdd 'libgomp.oacc-c-c++-common/async-data-1-{1,2}.c'
Thomas Schwinge [Tue, 9 Jan 2018 10:57:00 +0000 (11:57 +0100)]
Add 'libgomp.oacc-c-c++-common/async-data-1-{1,2}.c'

libgomp/
* testsuite/libgomp.oacc-c-c++-common/async-data-1-1.c: New file.
* testsuite/libgomp.oacc-c-c++-common/async-data-1-2.c: Likewise.

Co-Authored-By: Tom de Vries <tom@codesourcery.com>
3 years ago[OpenACC] Clarify sequencing of 'async' data copying vs. profiling events in 'libgomp...
Thomas Schwinge [Fri, 23 Jul 2021 13:07:34 +0000 (15:07 +0200)]
[OpenACC] Clarify sequencing of 'async' data copying vs. profiling events in 'libgomp.oacc-c-c++-common/acc_prof-{init,parallel}-1.c'

... as noticed with GCN offloading.

Fix-up for r271346 (commit 5fae049dc272144f8e61af94ee0ba42b270915e5)
"OpenACC Profiling Interface (incomplete)".

libgomp/
* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Clarify
sequencing of 'async' data copying vs. profiling events.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
Likewise.

3 years agoFix OpenACC 'async'/'wait' issues in 'libgomp.oacc-c-c++-common/lib-{94,95}.c', ...
Thomas Schwinge [Tue, 8 Jun 2021 17:32:22 +0000 (19:32 +0200)]
Fix OpenACC 'async'/'wait' issues in 'libgomp.oacc-c-c++-common/lib-{94,95}.c', 'libgomp.oacc-fortran/lib-16{,-2}.f90'

Fix-up for r265842 (commit 58168bbf6f8fb456280cca13343a498ad94878c7)
"[OpenACC 2.5, libgomp] Add *_async versions of runtime library API functions".

libgomp/
* testsuite/libgomp.oacc-c-c++-common/lib-94.c: Fix OpenACC
'async'/'wait' issue.
* testsuite/libgomp.oacc-c-c++-common/lib-95.c: Likewise.
* testsuite/libgomp.oacc-fortran/lib-16-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/lib-16.f90: Likewise.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
3 years agotree-optimization/101573 - improve uninit warning at -O0
Richard Biener [Thu, 22 Jul 2021 10:26:16 +0000 (12:26 +0200)]
tree-optimization/101573 - improve uninit warning at -O0

We can improve uninit warnings from the early pass by looking
at PHI arguments on fallthru edges that are uninitialized and
have uses that are before a possible loop exit.  This catches
some cases earlier that we'd otherwise only warn about, in a more
confusing way, after early inlining, as seen in the testcase adjustments.

It introduces

FAIL: gcc.dg/uninit-23.c (test for excess errors)

where we additionally warn

gcc.dg/uninit-23.c:21:13: warning: 't4' is used uninitialized [-Wuninitialized]

which I think is OK even if it's not obvious that the new
warning is an improvement when you look at the obvious source.

Somehow, in all cases I never get the `'foo' was declared here`
notes; I didn't dig into why that happens, but it's odd.

2021-07-22  Richard Biener  <rguenther@suse.de>

PR tree-optimization/101573
* tree-ssa-uninit.c (warn_uninit_phi_uses): New function
looking at uninitialized PHI arg defs in some constrained cases.
(warn_uninitialized_vars): Call it.
(execute_early_warn_uninitialized): Calculate dominators.

* gcc.dg/uninit-pr101573.c: New testcase.
* gcc.dg/uninit-15-O0.c: Adjust.
* gcc.dg/uninit-15.c: Likewise.
* gcc.dg/uninit-23.c: Likewise.
* c-c++-common/uninit-17.c: Likewise.

3 years agotree-optimization/39821 - fix cost classification for widening arith
Richard Biener [Tue, 27 Jul 2021 07:24:57 +0000 (09:24 +0200)]
tree-optimization/39821 - fix cost classification for widening arith

This adjusts the vectorizer to cost vector_stmt for widening
arithmetic instead of vec_promote_demote, in line with telling
the target that stmt_info->stmt is the meaningful piece we cost.

2021-07-27  Richard Biener  <rguenther@suse.de>

PR tree-optimization/39821
* tree-vect-stmts.c (vect_model_promotion_demotion_cost): Use
vector_stmt for widening arithmetic.
(vectorizable_conversion): Adjust.

3 years agoipa: Adjust references to identify read-only globals
Martin Jambor [Tue, 27 Jul 2021 08:02:38 +0000 (10:02 +0200)]
ipa: Adjust references to identify read-only globals

This patch has been motivated by SPEC 2017's 544.nab_r in which there is
a static variable which is never written to and so is zero throughout the
run-time of the benchmark.  However, it is passed by reference to a
function in which it is read and (after some multiplications) passed
into __builtin_exp which in turn unnecessarily consumes almost 10% of
the total benchmark run-time.  The situation is illustrated by the added
testcase remref-3.c.
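
The shape of the problem can be sketched in plain C.  This is not the
remref-3.c testcase itself; the names scale, expensive, use_by_ref and
caller are hypothetical, and expensive merely stands in for a costly call
such as __builtin_exp.

```c
#include <assert.h>

/* Never written anywhere in this translation unit, so it stays zero.  */
static double scale;

/* Hypothetical stand-in for an expensive call such as __builtin_exp.  */
static double expensive (double x)
{
  return x * x + 1.0;
}

/* The variable is only read here, but through a pointer, so the caller
   holds an address reference to it rather than a load reference.  */
static double use_by_ref (const double *p)
{
  return expensive (*p * 2.0 * 3.0);
}

double caller (void)
{
  return use_by_ref (&scale);
}

/* With the variable known to be read-only, the whole call could fold
   to expensive (0.0).  */
void check_caller (void)
{
  assert (caller () == 1.0);
}
```

Once the address reference in caller is replaced by a load reference on the
clone or inlined body, the variable can be recognized as read-only and the
computation folded.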

The patch adds a flag to ipa-prop descriptor of each parameter to mark
such parameters.  IPA-CP and inlining then take the effort to remove
IPA_REF_ADDR references in the caller and only add IPA_REF_LOAD
reference to the clone/overall inlined function.  This is sufficient
for subsequent symbol table analysis code to identify the read-only
variable as such and optimize the code.

There are two changes from the RFC version posted to the list earlier.
First, three missing calls to get_base_address were added (there was
another one in an assert).  Second, references are not stripped off
the callers if the cloned function cannot change the signature.  The
second change reveals a real shortcoming stemming from the fact we
cannot adjust function prototypes with fnspecs.  But that is a more
general problem.

gcc/ChangeLog:

2021-07-20  Martin Jambor  <mjambor@suse.cz>

* cgraph.h (ipa_replace_map): New field force_load_ref.
* ipa-prop.h (ipa_param_descriptor): Reduce precision of move_cost,
added new flag load_dereferenced, adjusted comments.
(ipa_get_param_dereferenced): New function.
(ipa_set_param_dereferenced): Likewise.
* cgraphclones.c (cgraph_node::create_virtual_clone): Follow it.
* ipa-cp.c: Include gimple.h.
(ipcp_discover_new_direct_edges): Take into account dereferenced flag.
(get_replacement_map): New parameter force_load_ref, set the
appropriate flag in ipa_replace_map if set.
(struct symbol_and_index_together): New type.
(adjust_refs_in_act_callers): New function.
(adjust_references_in_caller): Likewise.
(create_specialized_node): When appropriate, call
adjust_references_in_caller and force only load references.
* ipa-prop.c (load_from_dereferenced_name): New function.
(ipa_analyze_controlled_uses): Also detect loads from a
dereference, harden testing of call statements.
(ipa_write_node_info): Stream the dereferenced flag.
(ipa_read_node_info): Likewise.
(ipa_set_jf_constant): Also create refdesc when jump function
references a variable.
(cgraph_node_for_jfunc): Rename to symtab_node_for_jfunc, work
also on references of variables and return a symtab_node.  Adjust
all callers.
(propagate_controlled_uses): Also remove references to VAR_DECLs.

gcc/testsuite/ChangeLog:

2021-06-29  Martin Jambor  <mjambor@suse.cz>

* gcc.dg/ipa/remref-3.c: New test.
* gcc.dg/ipa/remref-4.c: Likewise.
* gcc.dg/ipa/remref-5.c: Likewise.
* gcc.dg/ipa/remref-6.c: Likewise.

3 years agogimple-fold: Fix up __builtin_clear_padding on classes with virtual inheritance ...
Jakub Jelinek [Tue, 27 Jul 2021 07:59:37 +0000 (09:59 +0200)]
gimple-fold: Fix up __builtin_clear_padding on classes with virtual inheritance [PR101586]

For the following testcase, B is a 16-byte type, containing an 8-byte
virtual pointer and a 1-byte A member, and C contains two FIELD_DECLs,
one with type B and a size of just 8 bytes and then a field with type
A and a size of 1 byte.
The __builtin_clear_padding code was upset about the B typed FIELD_DECL
containing FIELD_DECLs beyond the field size and triggered an
assertion failure.
This patch makes it ignore all FIELD_DECLs that are (fully) beyond the sz
passed from the caller (except for the flexible array member
diagnostics, which are kept).

2021-07-27  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/101586
* gimple-fold.c (clear_padding_type): Ignore FIELD_DECLs with byte
positions above or equal to sz except for diagnostics of flexible
array members.

* g++.dg/torture/builtin-clear-padding-4.C: New test.

3 years agoPR 100170: Fix eq/ne tests on power10.
Michael Meissner [Tue, 27 Jul 2021 01:27:00 +0000 (21:27 -0400)]
PR 100170: Fix eq/ne tests on power10.

This patch updates eq/ne tests in the testsuite to adjust the test if
power10 code generation is used.

2021-07-26  Michael Meissner  <meissner@linux.ibm.com>

gcc/testsuite/
PR testsuite/100170
* gcc.target/powerpc/ppc-eq0-1.c: Adjust insn counts if power10
code is generated.
* gcc.target/powerpc/ppc-ne0-1.c: (ne0): Adjust insn counts if
power10 code is generated.
(plus_ne0): Move to ppc-ne0-2.c.
(cmp_plus_ne): Likewise.
(plus_ne0_cmp): Likewise.
* gcc.target/powerpc/ppc-ne0-2.c: New file.

3 years agoDaily bump.
GCC Administrator [Tue, 27 Jul 2021 00:16:27 +0000 (00:16 +0000)]
Daily bump.

3 years agoConfirm and Handle only ASCII in toupper and tolower ranges.
Andrew MacLeod [Mon, 26 Jul 2021 21:25:06 +0000 (17:25 -0400)]
Confirm and Handle only ASCII in toupper and tolower ranges.

PR tree-optimization/78888
* gimple-range-fold.cc (get_letter_range): New.
(fold_using_range::range_of_builtin_call): Call get_letter_range.

3 years agoanalyzer: fix uninit false +ve when returning structs
David Malcolm [Mon, 26 Jul 2021 19:25:00 +0000 (15:25 -0400)]
analyzer: fix uninit false +ve when returning structs

This patch fixes some false positives from
 -Wanalyzer-use-of-uninitialized-value
when returning structs from functions (seen on the Linux kernel).

gcc/analyzer/ChangeLog:
* region-model.cc (region_model::on_call_pre): Always set conjured
LHS, not just for SSA names.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/sock-1.c: New test.
* gcc.dg/analyzer/sock-2.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
3 years agoAdjust ranges for to_upper and to_lower.
Andrew MacLeod [Mon, 26 Jul 2021 13:40:32 +0000 (09:40 -0400)]
Adjust ranges for to_upper and to_lower.

Exclude lower case chars from to_upper and upper case chars from to_lower.
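
The property behind these ranges can be checked with a small C sketch.
It assumes the default "C" locale and ASCII inputs only; the function
names are hypothetical and this is not the code added to
gimple-range-fold.cc.

```c
#include <assert.h>
#include <ctype.h>

/* Returns 1 if, for every ASCII input, toupper never yields 'a'..'z'
   and tolower never yields 'A'..'Z'.  */
int ascii_case_ranges_hold (void)
{
  for (int c = 0; c < 128; c++)
    {
      int u = toupper (c);
      int l = tolower (c);
      if (u >= 'a' && u <= 'z')
        return 0;
      if (l >= 'A' && l <= 'Z')
        return 0;
    }
  return 1;
}

/* A few spot checks: letters are converted, other characters pass
   through unchanged.  */
void ascii_case_spot_checks (void)
{
  assert (toupper ('q') == 'Q');
  assert (tolower ('Q') == 'q');
  assert (toupper ('7') == '7');
}
```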

gcc/
PR tree-optimization/78888
* gimple-range-fold.cc (fold_using_range::range_of_builtin_call): Add cases
for CFN_BUILT_IN_TOUPPER and CFN_BUILT_IN_TOLOWER.

gcc/testsuite/
* gcc.dg/pr78888.c: New.

3 years agoFold bswap32(x) != 0 to x != 0 (and related transforms)
Roger Sayle [Mon, 26 Jul 2021 16:30:26 +0000 (17:30 +0100)]
Fold bswap32(x) != 0 to x != 0 (and related transforms)

This patch to match.pd implements several closely related folding
simplifications at the tree-level, that make use of the property
that bit permutation functions, rotate and bswap have inverses.

[1] bswap(X) eq/ne C, for constant C, simplifies to X eq/ne C'
where C'=bswap(C), generalizing the transform in the subject.
[2] bswap(X) eq/ne bswap(Y) simplifies to X eq/ne Y.
[3] lrotate(X,C1) eq/ne C2 simplifies to X eq/ne C3, where
C3 = rrotate(C2,C1), i.e. apply the inverse rotation to C2.
[4] Likewise, rrotate(X,C1) eq/ne C2 simplifies to X eq/ne C3,
where C3 = lrotate(C2,C1).
[5] rotate(X,Z) eq/ne rotate(Y,Z) simplifies to X eq/ne Y, when
the bit-count Z (the same on both sides) has no side-effects.
[6] rotate(X,Y) eq/ne 0 simplifies to X eq/ne 0 if Y has no
side-effects.
[7] Likewise, rotate(X,Y) eq/ne -1 simplifies to X eq/ne -1,
if Y has no side-effects.
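
The identities above can be sanity-checked with a short C sketch.  It is
not the match.pd implementation; rotl32, rotr32 and fold_identities_hold
are hypothetical helpers, and only the 32-bit case is sampled.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate helpers; the count is reduced mod 32 so a count of 0 is
   well defined.  */
static uint32_t rotl32 (uint32_t x, unsigned c)
{
  c &= 31;
  return c ? (x << c) | (x >> (32 - c)) : x;
}

static uint32_t rotr32 (uint32_t x, unsigned c)
{
  c &= 31;
  return c ? (x >> c) | (x << (32 - c)) : x;
}

/* Returns 1 if the original and folded comparisons agree on every
   sampled value.  */
int fold_identities_hold (void)
{
  static const uint32_t s[] = { 0u, 1u, 0x12345678u, 0x80000000u,
                                0x00ff00ffu, 0xffffffffu };
  const unsigned n = sizeof s / sizeof s[0];

  for (unsigned i = 0; i < n; i++)
    for (unsigned j = 0; j < n; j++)
      {
        uint32_t x = s[i], k = s[j];

        /* [1] bswap(X) == C  <=>  X == bswap(C).  */
        if ((__builtin_bswap32 (x) == k) != (x == __builtin_bswap32 (k)))
          return 0;
        /* [2] bswap(X) == bswap(Y)  <=>  X == Y.  */
        if ((__builtin_bswap32 (x) == __builtin_bswap32 (k)) != (x == k))
          return 0;

        for (unsigned c = 0; c < 32; c++)
          {
            /* Rotations have inverses, which is what makes the folds valid.  */
            assert (rotr32 (rotl32 (x, c), c) == x);
            /* [3] lrotate(X,C1) == C2  <=>  X == rrotate(C2,C1).  */
            if ((rotl32 (x, c) == k) != (x == rotr32 (k, c)))
              return 0;
            /* [5] rotate(X,Z) == rotate(Y,Z)  <=>  X == Y.  */
            if ((rotl32 (x, c) == rotl32 (k, c)) != (x == k))
              return 0;
          }
      }
  return 1;
}
```

Cases [6] and [7] are covered by the [3] check when the sampled constant is
0 or 0xffffffff, since rotating either value leaves it unchanged.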

2021-07-26  Roger Sayle  <roger@nextmovesoftware.com>
    Marc Glisse  <marc.glisse@inria.fr>

gcc/ChangeLog
* match.pd (rotate): Simplify equality/inequality of rotations.
(bswap): Simplify equality/inequality tests of byte swapping.

gcc/testsuite/ChangeLog
* gcc.dg/fold-eqrotate-1.c: New test case.
* gcc.dg/fold-eqbswap-1.c: New test case.

3 years agoRegenerate .pot files.
Joseph Myers [Mon, 26 Jul 2021 15:27:23 +0000 (15:27 +0000)]
Regenerate .pot files.

gcc/po/
* gcc.pot: Regenerate.

libcpp/po/
* cpplib.pot: Regenerate.

3 years agoImplement operator_bitwise_xor::op1_op2_relation_effect.
Aldy Hernandez [Mon, 26 Jul 2021 11:08:24 +0000 (06:08 -0500)]
Implement operator_bitwise_xor::op1_op2_relation_effect.

This patch adjusts XORing of ranges where the operands are known to be
equal or not equal.
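
One plausible reading of the relation effect, stated here as an assumption
rather than a description of the range-op.cc code: if the operands are
known equal, the XOR is exactly zero, and if they are known unequal, the
XOR cannot be zero.  A minimal C check of that arithmetic fact, with a
hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if x ^ y is zero exactly when x == y, for all sampled pairs.  */
int xor_relation_holds (void)
{
  static const uint32_t s[] = { 0u, 1u, 2u, 0x7fffffffu, 0x80000000u,
                                0xffffffffu };
  const unsigned n = sizeof s / sizeof s[0];

  for (unsigned i = 0; i < n; i++)
    {
      /* Equal operands always cancel.  */
      assert ((s[i] ^ s[i]) == 0);
      for (unsigned j = 0; j < n; j++)
        if (((s[i] ^ s[j]) == 0) != (s[i] == s[j]))
          return 0;
    }
  return 1;
}
```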

We should probably do the same thing for the op[12]_range methods.

gcc/ChangeLog:

* range-op.cc (operator_bitwise_xor::op1_op2_relation_effect):
New.

3 years agoPass relationship to methods calling generic fold_range.
Aldy Hernandez [Mon, 26 Jul 2021 11:06:37 +0000 (06:06 -0500)]
Pass relationship to methods calling generic fold_range.

Fix a small oversight in methods calling the base class fold_range.

gcc/ChangeLog:

* range-op.cc (operator_lshift::fold_range): Pass rel to
base class fold_range.
(operator_rshift::fold_range): Same.

3 years agoRemove legacy external declarations in toplev.h [PR101447]
Ashimida [Mon, 26 Jul 2021 14:38:50 +0000 (10:38 -0400)]
Remove legacy external declarations in toplev.h [PR101447]

gcc/
PR driver/101447
* toplev.h (min_align_loops_log): Remove declaration.
(min_align_jumps_log, min_align_labels_log): Likewise.
(min_align_functions_log): Likewise.

3 years agoPR fortran/93308/93963/94327/94331/97046 problems raised by descriptor handling
Tobias Burnus [Mon, 26 Jul 2021 12:20:46 +0000 (14:20 +0200)]
PR fortran/93308/93963/94327/94331/97046 problems raised by descriptor handling

Fortran: Fix attributes and bounds in ISO_Fortran_binding.

2021-07-26  José Rui Faustino de Sousa  <jrfsousa@gmail.com>
    Tobias Burnus  <tobias@codesourcery.com>

PR fortran/93308
PR fortran/93963
PR fortran/94327
PR fortran/94331
PR fortran/97046

gcc/fortran/ChangeLog:

* trans-decl.c (convert_CFI_desc): Only copy out the descriptor
if necessary.
* trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): Update attribute
handling, which reflected a previous intermediate version of the
standard.  Only copy out the descriptor if necessary.

libgfortran/ChangeLog:

* runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc): Add code
to verify the descriptor. Correct bounds calculation.
(gfc_desc_to_cfi_desc): Add code to verify the descriptor.

gcc/testsuite/ChangeLog:

* gfortran.dg/ISO_Fortran_binding_1.f90: Add pointer attribute,
this test is still erroneous but now it compiles.
* gfortran.dg/bind_c_array_params_2.f90: Update regex to match
code changes.
* gfortran.dg/PR93308.f90: New test.
* gfortran.dg/PR93963.f90: New test.
* gfortran.dg/PR94327.c: New test.
* gfortran.dg/PR94327.f90: New test.
* gfortran.dg/PR94331.c: New test.
* gfortran.dg/PR94331.f90: New test.
* gfortran.dg/PR97046.f90: New test.

3 years agoAbstract out conditional simplification out of execute_vrp.
Aldy Hernandez [Mon, 26 Jul 2021 09:53:41 +0000 (11:53 +0200)]
Abstract out conditional simplification out of execute_vrp.

VRP simplifies conditionals involving casted values outside of the main
folding mechanism, because this optimization inhibits the VRP jump
threader from threading through the comparison.

As part of replacing VRP with an evrp instance, I am making sure we do
everything VRP does.  Hence, I am abstracting this functionality out so
we can call it from elsewhere.

ISTM that when the proposed ranger-based jump threader can handle
everything the forward threader does, there will be no need for this
optimization to be done outside of the evrp folder.  Perhaps we can fold
this into the substitute_using_ranges class.  But that's further down
the line.

Also, there is no need to pass a vr_values around, when the base
range_query class will do.  I fixed this, as it makes it trivial to pass
down a ranger or evrp instance.

Tested on x86-64 Linux.

gcc/ChangeLog:

* tree-vrp.c (vrp_simplify_cond_using_ranges): Rename vr_values
with range_query.
(execute_vrp): Abstract out simplification of conditionals...
(simplify_casted_conds): ...here.

3 years agoPass gimple context to array_bounds_checker.
Aldy Hernandez [Mon, 26 Jul 2021 07:47:42 +0000 (09:47 +0200)]
Pass gimple context to array_bounds_checker.

I have changed the use of the array_bounds_checker in VRP to use a
ranger in my local tree to make sure there are no regressions when using
either VRP or the ranger.  In doing so I noticed that the checker
does not pass context to get_value_range, which causes the ranger to miss a
few cases.  This patch fixes the oversight.

Tested on x86-64 Linux using the array bounds checker both with VRP and
the ranger.

gcc/ChangeLog:

* gimple-array-bounds.cc (array_bounds_checker::get_value_range):
Add gimple argument.
(array_bounds_checker::check_array_ref): Same.
(array_bounds_checker::check_addr_expr): Same.
(array_bounds_checker::check_array_bounds): Pass statement to
check_array_bounds and check_addr_expr.
* gimple-array-bounds.h (check_array_bounds): Add gimple argument.
(check_addr_expr): Same.
(get_value_range): Same.

3 years agoAArch64: correct dot-product RTL patterns for aarch64.
Tamar Christina [Mon, 26 Jul 2021 09:23:21 +0000 (10:23 +0100)]
AArch64: correct dot-product RTL patterns for aarch64.

The previous fix for this problem was wrong due to a subtle difference between
where NEON expects the RMW values and where intrinsics expects them.

The insn pattern is modeled after the intrinsics and so needs an expand for
the vectorizer optab to switch the RTL.

However operand[3] is not expected to be written to so the current pattern is
bogus.

Instead I rewrite the RTL to be in canonical ordering and merge them.

gcc/ChangeLog:

* config/aarch64/aarch64-simd-builtins.def (sdot, udot): Rename to...
(sdot_prod, udot_prod): ... this.
* config/aarch64/aarch64-simd.md (aarch64_<sur>dot<vsi2qi>): Merged
into...
(<sur>dot_prod<vsi2qi>): ... this.
(aarch64_<sur>dot_lane<vsi2qi>, aarch64_<sur>dot_laneq<vsi2qi>):
Change operands order.
(<sur>sadv16qi): Use new operands order.
* config/aarch64/arm_neon.h (vdot_u32, vdotq_u32, vdot_s32,
vdotq_s32): Use new RTL ordering.

3 years agoAArch64: correct usdot vectorizer and intrinsics optabs
Tamar Christina [Mon, 26 Jul 2021 09:22:23 +0000 (10:22 +0100)]
AArch64: correct usdot vectorizer and intrinsics optabs

There's a slight mismatch between the vectorizer optabs and the intrinsics
patterns for NEON.  The vectorizer expects operands[3] and operands[0] to be
the same but the aarch64 intrinsics expanders expect operands[0] and
operands[1] to be the same.

This means we need different patterns here.  This adds a separate usdot
vectorizer pattern which just shuffles around the RTL params.
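
The operand-order mismatch can be modelled in scalar C.  This is only an
illustration, not the RTL: the function names are hypothetical, and a
single 4-element group stands in for one lane of the vector operation.
The intrinsic form takes the accumulator first, as vusdot_s32 does, while
the optab form takes it last.

```c
#include <assert.h>
#include <stdint.h>

/* Intrinsic-style order: accumulator first, then the unsigned and
   signed multiplicands.  */
int32_t usdot_intrinsic_order (int32_t acc, const uint8_t a[4],
                               const int8_t b[4])
{
  for (int i = 0; i < 4; i++)
    acc += (int32_t) a[i] * (int32_t) b[i];
  return acc;
}

/* Optab-style order: accumulator last.  This only shuffles the
   arguments and forwards to the intrinsic-style form.  */
int32_t usdot_optab_order (const uint8_t a[4], const int8_t b[4],
                           int32_t acc)
{
  return usdot_intrinsic_order (acc, a, b);
}

/* 1*-1 + 2*2 + 3*-3 + 4*4 = 10, plus the accumulator 5.  */
int32_t usdot_example (void)
{
  const uint8_t a[4] = { 1, 2, 3, 4 };
  const int8_t b[4] = { -1, 2, -3, 4 };
  int32_t r = usdot_optab_order (a, b, 5);
  assert (r == usdot_intrinsic_order (5, a, b));
  return r;
}
```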

There's also an inconsistency between the usdot and (u|s)dot intrinsics RTL
patterns which is not corrected here.

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.c (TYPES_TERNOP_SUSS,
aarch64_types_ternop_suss_qualifiers): New.
* config/aarch64/aarch64-simd-builtins.def (usdot_prod): Use it.
* config/aarch64/aarch64-simd.md (usdot_prod<vsi2qi>): Re-organize RTL.
* config/aarch64/arm_neon.h (vusdot_s32, vusdotq_s32): Use it.

3 years agoopenmp: Add support for omp attributes section and scan directives
Jakub Jelinek [Mon, 26 Jul 2021 07:13:47 +0000 (09:13 +0200)]
openmp: Add support for omp attributes section and scan directives

This patch adds support for expressing the section and scan directives
using the attribute syntax and additionally fixes some bugs in the attribute
syntax directive handling.
For now it requires that the scan and section directives appear as the only
attribute, not combined with other OpenMP or non-OpenMP attributes on the same
statement.

2021-07-26  Jakub Jelinek  <jakub@redhat.com>

* parser.h (struct cp_lexer): Add orphan_p member.
* parser.c (cp_parser_statement): Don't change in_omp_attribute_pragma
upon restart from CPP_PRAGMA handling.  Fix up condition when a lexer
should be destroyed and adjust saved_tokens if it records tokens from
the to be destroyed lexer.
(cp_parser_omp_section_scan): New function.
(cp_parser_omp_scan_loop_body): Use it.  If
parser->lexer->in_omp_attribute_pragma, allow optional comma
after scan.
(cp_parser_omp_sections_scope): Use cp_parser_omp_section_scan.

* g++.dg/gomp/attrs-1.C: Use attribute syntax even for section
and scan directives.
* g++.dg/gomp/attrs-2.C: Likewise.
* g++.dg/gomp/attrs-6.C: New test.
* g++.dg/gomp/attrs-7.C: New test.
* g++.dg/gomp/attrs-8.C: New test.

3 years agoDaily bump.
GCC Administrator [Mon, 26 Jul 2021 00:16:23 +0000 (00:16 +0000)]
Daily bump.

3 years ago[Ada] Declare time_t uniformly based on a system parameter #2
Arnaud Charlet [Sun, 25 Jul 2021 13:23:44 +0000 (09:23 -0400)]
[Ada] Declare time_t uniformly based on a system parameter #2

gcc/ada/

* libgnat/s-osprim__x32.adb: Add missing with clause.

3 years agoDaily bump.
GCC Administrator [Sun, 25 Jul 2021 00:16:22 +0000 (00:16 +0000)]
Daily bump.

3 years agoinclude: Fix -Wundef warnings in ansidecl.h
Marek Polacek [Tue, 20 Jul 2021 20:26:28 +0000 (16:26 -0400)]
include: Fix -Wundef warnings in ansidecl.h

This quashes -Wundef warnings in ansidecl.h when compiled in C or C++.
In C, __cpp_constexpr and __cplusplus aren't defined so we evaluate
them to 0; conversely, __STDC_VERSION__ is not defined in C++.
This has caused grief when -Wundef is used with -Werror.
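
The guard pattern can be sketched as follows.  The EXAMPLE_* macros and the
function names are hypothetical and are not the macros in ansidecl.h; the
point is only that defined() is tested before the value, so -Wundef has
nothing to report in either language.

```c
#include <assert.h>

/* "#if __cplusplus >= 201103L" alone would warn under -Wundef when
   compiled as C, because __cplusplus is not defined there.  */
#if defined (__cplusplus) && __cplusplus >= 201103L
# define EXAMPLE_HAS_CXX11 1
#else
# define EXAMPLE_HAS_CXX11 0
#endif

/* Conversely, __STDC_VERSION__ is only consulted when not in C++.  */
#if !defined (__cplusplus) && defined (__STDC_VERSION__) \
    && __STDC_VERSION__ >= 199901L
# define EXAMPLE_HAS_C99 1
#else
# define EXAMPLE_HAS_C99 0
#endif

/* Bit 1 is set for C++11 or later, bit 0 for C99 or later.  */
int example_language_flags (void)
{
  return EXAMPLE_HAS_CXX11 * 2 + EXAMPLE_HAS_C99;
}

/* At most one of the two can be set for a given translation unit.  */
void example_flags_check (void)
{
  assert (EXAMPLE_HAS_CXX11 + EXAMPLE_HAS_C99 <= 1);
}
```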

I've also tested -traditional-cpp.

include/ChangeLog:

* ansidecl.h: Check if __cplusplus is defined before checking
the value of __cpp_constexpr and __cplusplus.  Don't check
__STDC_VERSION__ in C++.

3 years agoDaily bump.
GCC Administrator [Sat, 24 Jul 2021 00:16:44 +0000 (00:16 +0000)]
Daily bump.

3 years agoFortran: extend check for array arguments and reject CLASS array elements.
Harald Anlauf [Fri, 23 Jul 2021 19:00:10 +0000 (21:00 +0200)]
Fortran: extend check for array arguments and reject CLASS array elements.

gcc/fortran/ChangeLog:

PR fortran/101536
* check.c (array_check): Adjust check for the case of CLASS
arrays.

gcc/testsuite/ChangeLog:

PR fortran/101536
* gfortran.dg/pr101536.f90: New test.

3 years agoexpmed: Fix store_integral_bit_field [PR101562]
Jakub Jelinek [Fri, 23 Jul 2021 17:55:16 +0000 (19:55 +0200)]
expmed: Fix store_integral_bit_field [PR101562]

Our documentation says that paradoxical subregs shouldn't appear
in strict_low_part:
'(strict_low_part (subreg:M (reg:N R) 0))'
     This expression code is used in only one context: as the
     destination operand of a 'set' expression.  In addition, the
     operand of this expression must be a non-paradoxical 'subreg'
     expression.
but on the testcase below that triggers UB at runtime
store_integral_bit_field emits exactly that.

The following patch fixes it by ensuring the requirement is satisfied.

2021-07-23  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/101562
* expmed.c (store_integral_bit_field): Only use movstrict_optab
if the operand isn't paradoxical.

* gcc.c-torture/compile/pr101562.c: New test.

3 years agoUse range_query object in array bounds class.
Aldy Hernandez [Fri, 23 Jul 2021 14:19:59 +0000 (16:19 +0200)]
Use range_query object in array bounds class.

Now that all dependencies of array_bounds_checker take a range_query, we
can sever the relationship with vr_values.  Changing this will allow us
to use the array_bounds_checker with VRP, evrp, or the ranger.

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-array-bounds.h (class array_bounds_checker): Change
ranges type to range_query.

3 years agoaarch64: Use memcpy to copy vector tables in vst1[q]_x2 intrinsics
Jonathan Wright [Fri, 23 Jul 2021 12:41:39 +0000 (13:41 +0100)]
aarch64: Use memcpy to copy vector tables in vst1[q]_x2 intrinsics

Use __builtin_memcpy to copy vector structures instead of building
a new opaque structure one vector at a time in each of the vst1[q]_x2
Neon intrinsics in arm_neon.h. This simplifies the header file and
also improves code generation - superfluous move instructions were
emitted for every register extraction/set in this additional
structure.
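
The difference between the two copying styles can be sketched with portable
stand-in types.  Nothing here comes from arm_neon.h: vec128, vec128x2 and
opaque_pair are hypothetical substitutes for a Neon vector, a two-vector
structure and the opaque __builtin_aarch64_simd_oi type, and the per-vector
loop merely models building the opaque value one vector at a time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { int32_t lane[4]; } vec128;      /* stand-in for one vector */
typedef struct { vec128 val[2]; } vec128x2;      /* stand-in for an x2 struct */
typedef struct { unsigned char bytes[2 * sizeof (vec128)]; } opaque_pair;

_Static_assert (sizeof (vec128x2) == sizeof (opaque_pair),
                "layouts must match for a whole-object memcpy");

/* Old style: fill the opaque value one vector at a time.  */
opaque_pair pack_per_vector (vec128x2 v)
{
  opaque_pair o;
  for (int i = 0; i < 2; i++)
    memcpy (o.bytes + i * sizeof (vec128), &v.val[i], sizeof (vec128));
  return o;
}

/* New style: copy the whole aggregate with a single memcpy.  */
opaque_pair pack_whole (vec128x2 v)
{
  opaque_pair o;
  memcpy (&o, &v, sizeof o);
  return o;
}

/* Returns 1 if both styles produce identical bytes.  */
int packings_agree (void)
{
  vec128x2 v = { { { { 1, 2, 3, 4 } }, { { 5, 6, 7, 8 } } } };
  opaque_pair a = pack_per_vector (v);
  opaque_pair b = pack_whole (v);
  return memcmp (&a, &b, sizeof a) == 0;
}
```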

Add new code generation tests to verify that superfluous move
instructions are not generated for the vst1q_x2 intrinsics.

gcc/ChangeLog:

2021-07-23  Jonathan Wright  <jonathan.wright@arm.com>

* config/aarch64/arm_neon.h (vst1_s64_x2): Use
__builtin_memcpy instead of constructing
__builtin_aarch64_simd_oi one vector at a time.
(vst1_u64_x2): Likewise.
(vst1_f64_x2): Likewise.
(vst1_s8_x2): Likewise.
(vst1_p8_x2): Likewise.
(vst1_s16_x2): Likewise.
(vst1_p16_x2): Likewise.
(vst1_s32_x2): Likewise.
(vst1_u8_x2): Likewise.
(vst1_u16_x2): Likewise.
(vst1_u32_x2): Likewise.
(vst1_f16_x2): Likewise.
(vst1_f32_x2): Likewise.
(vst1_p64_x2): Likewise.
(vst1q_s8_x2): Likewise.
(vst1q_p8_x2): Likewise.
(vst1q_s16_x2): Likewise.
(vst1q_p16_x2): Likewise.
(vst1q_s32_x2): Likewise.
(vst1q_s64_x2): Likewise.
(vst1q_u8_x2): Likewise.
(vst1q_u16_x2): Likewise.
(vst1q_u32_x2): Likewise.
(vst1q_u64_x2): Likewise.
(vst1q_f16_x2): Likewise.
(vst1q_f32_x2): Likewise.
(vst1q_f64_x2): Likewise.
(vst1q_p64_x2): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vector_structure_intrinsics.c: Add new
tests.

3 years agoaarch64: Use memcpy to copy vector tables in vst1[q]_x3 intrinsics
Jonathan Wright [Fri, 23 Jul 2021 11:41:05 +0000 (12:41 +0100)]
aarch64: Use memcpy to copy vector tables in vst1[q]_x3 intrinsics

Use __builtin_memcpy to copy vector structures instead of building
a new opaque structure one vector at a time in each of the vst1[q]_x3
Neon intrinsics in arm_neon.h. This simplifies the header file and
also improves code generation - superfluous move instructions were
emitted for every register extraction/set in this additional
structure.

Add new code generation tests to verify that superfluous move
instructions are not generated for the vst1q_x3 intrinsics.

gcc/ChangeLog:

2021-07-23  Jonathan Wright  <jonathan.wright@arm.com>

* config/aarch64/arm_neon.h (vst1_s64_x3): Use
__builtin_memcpy instead of constructing
__builtin_aarch64_simd_ci one vector at a time.
(vst1_u64_x3): Likewise.
(vst1_f64_x3): Likewise.
(vst1_s8_x3): Likewise.
(vst1_p8_x3): Likewise.
(vst1_s16_x3): Likewise.
(vst1_p16_x3): Likewise.
(vst1_s32_x3): Likewise.
(vst1_u8_x3): Likewise.
(vst1_u16_x3): Likewise.
(vst1_u32_x3): Likewise.
(vst1_f16_x3): Likewise.
(vst1_f32_x3): Likewise.
(vst1_p64_x3): Likewise.
(vst1q_s8_x3): Likewise.
(vst1q_p8_x3): Likewise.
(vst1q_s16_x3): Likewise.
(vst1q_p16_x3): Likewise.
(vst1q_s32_x3): Likewise.
(vst1q_s64_x3): Likewise.
(vst1q_u8_x3): Likewise.
(vst1q_u16_x3): Likewise.
(vst1q_u32_x3): Likewise.
(vst1q_u64_x3): Likewise.
(vst1q_f16_x3): Likewise.
(vst1q_f32_x3): Likewise.
(vst1q_f64_x3): Likewise.
(vst1q_p64_x3): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vector_structure_intrinsics.c: Add new
tests.

3 years agox86: Don't return hard register when LRA is in progress
H.J. Lu [Thu, 22 Jul 2021 12:17:27 +0000 (05:17 -0700)]
x86: Don't return hard register when LRA is in progress

Don't return hard register in ix86_gen_scratch_sse_rtx when LRA is in
progress to avoid ICE when there are no available hard registers for
LRA.

gcc/

PR target/101504
* config/i386/i386.c (ix86_gen_scratch_sse_rtx): Don't return
hard register when LRA is in progress.

gcc/testsuite/

PR target/101504
* gcc.target/i386/pr101504.c: New test.