platform/upstream/gcc.git
Tobias Burnus [Wed, 16 Sep 2020 14:23:13 +0000 (16:23 +0200)]
Fortran: OpenMP - fix simd with (last)private (PR97061)

gcc/fortran/ChangeLog:

PR fortran/97061
* trans-openmp.c (gfc_trans_omp_do): Handle simd with (last)private.

gcc/testsuite/ChangeLog:

PR fortran/97061
* gfortran.dg/gomp/openmp-simd-6.f90: New test.

Nathan Sidwell [Wed, 16 Sep 2020 14:14:14 +0000 (07:14 -0700)]
c++: Break out actual instantiation from instantiate_decl

This refactors instantiate_decl, breaking out the actual instantiation
work to instantiate_body.  That'll allow me to address the OMP UDR
issue, but it also means we have slightly neater code in
instantiate_decl anyway.

gcc/cp/
* pt.c (instantiate_body): New, broken out of ..
(instantiate_decl): ... here.  Call it.

Andrea Corallo [Fri, 28 Aug 2020 15:01:15 +0000 (16:01 +0100)]
vec: don't select partial vectors when unnecessary

gcc/ChangeLog

2020-09-09  Andrea Corallo  <andrea.corallo@arm.com>

* tree-vect-loop.c (vect_need_peeling_or_partial_vectors_p): New
function.
(vect_analyze_loop_2): Make use of it not to select partial
vectors if no peel is required.
(determine_peel_for_niter): Move out some logic into
'vect_need_peeling_or_partial_vectors_p'.

gcc/testsuite/ChangeLog

2020-09-09  Andrea Corallo  <andrea.corallo@arm.com>

* gcc.target/aarch64/sve/cost_model_10.c: New test.
* gcc.target/aarch64/sve/clastb_8.c: Update test for new
vectorization strategy.
* gcc.target/aarch64/sve/cost_model_5.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_14.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_15.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_16.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_17.c: Likewise.

H.J. Lu [Mon, 14 Sep 2020 15:52:27 +0000 (08:52 -0700)]
rtl_data: Add sp_is_clobbered_by_asm

Add sp_is_clobbered_by_asm to rtl_data to inform backends that the stack
pointer is clobbered by asm statement.

gcc/

PR target/97032
* cfgexpand.c (asm_clobber_reg_kind): Set sp_is_clobbered_by_asm
to true if the stack pointer is clobbered by asm statement.
* emit-rtl.h (rtl_data): Add sp_is_clobbered_by_asm.
* config/i386/i386.c (ix86_get_drap_rtx): Set need_drap to true
if the stack pointer is clobbered by asm statement.

gcc/testsuite/

PR target/97032
* gcc.target/i386/pr97032.c: New test.

Feng Xue [Wed, 16 Sep 2020 08:21:14 +0000 (16:21 +0800)]
testsuite/97066 - minor change to bypass plusminus-with-convert rule

The following testcases are simplified by the new rule
(T)(A) +- (T)(B) -> (T)(A +- B), so they no longer keep the code pattern
that the tests check for.  Adjust the test code to suppress the simplification.

2020-09-16  Feng Xue  <fxue@os.amperecomputing.com>

gcc/testsuite/
PR testsuite/97066
* gcc.dg/ifcvt-3.c: Modified to suppress simplification.
* gcc.dg/tree-ssa/20030807-10.c: Likewise.

Ilya Leoshkevich [Wed, 2 Sep 2020 16:00:35 +0000 (18:00 +0200)]
IBM Z: Fix *vec_tf_to_v1tf constraints

Certain alternatives of *vec_tf_to_v1tf use the "v" constraint for the
TFmode source operand.  The operand is therefore assigned to the VEC_REGS
class, and when it is reloaded using *movtf_64, whose relevant alternatives
need FP_REGS, LRA loops and an ICE occurs.  The reason is that the register
class mismatch causes LRA to emit another reload, which triggers the same
issue again.

Fix by using "f" constraint, which is more appropriate for FP register
pairs anyway.

gcc/ChangeLog:

2020-09-02  Ilya Leoshkevich  <iii@linux.ibm.com>

* config/s390/vector.md(*vec_tf_to_v1tf): Use "f" instead of "v"
  for the source operand.

Jojo R [Wed, 16 Sep 2020 10:34:43 +0000 (18:34 +0800)]
C-SKY: Refine target name for elf target test

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_profiling_available): Refine name of elf target.

Jojo R [Wed, 16 Sep 2020 10:34:42 +0000 (18:34 +0800)]
C-SKY: Set use_gcc_stdint=wrap for elf target

gcc/ChangeLog:

* config.gcc (C-SKY): Set use_gcc_stdint=wrap for elf target.

Jojo R [Wed, 16 Sep 2020 10:34:41 +0000 (18:34 +0800)]
C-SKY: Enable crtbegin/crtend.o of libgcc for elf target

libgcc/ChangeLog:

* config.host (C-SKY): Enable crtbegin/crtend.o of libgcc for elf target.

Richard Biener [Wed, 16 Sep 2020 09:24:23 +0000 (11:24 +0200)]
remove STMT_VINFO_NUM_SLP_USES

This removes STMT_VINFO_NUM_SLP_USES by pushing the setting of
the shared stmt_vec_info vector type to where we actually need it
which is alignment analysis and vectorizable_* analysis (where
we could eventually elide it for non-load/store operations).

In particular "uses" in the cache and in disqualified SLP
subgraphs should no longer provide conflicting vector types
this way.

2020-09-16  Richard Biener  <rguenther@suse.de>

* tree-vectorizer.h (_stmt_vec_info::num_slp_uses): Remove.
(STMT_VINFO_NUM_SLP_USES): Likewise.
(vect_free_slp_instance): Adjust.
(vect_update_shared_vectype): Declare.
* tree-vectorizer.c (vec_info::~vec_info): Adjust.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
(vectorizable_live_operation): Use vector type from
SLP_TREE_REPRESENTATIVE.
(vect_transform_loop): Adjust.
* tree-vect-data-refs.c (vect_slp_analyze_node_alignment):
Set the shared vector type.
* tree-vect-slp.c (vect_free_slp_tree): Remove final_p
parameter, remove STMT_VINFO_NUM_SLP_USES updating.
(vect_free_slp_instance): Adjust.
(vect_create_new_slp_node): Remove STMT_VINFO_NUM_SLP_USES
updating.
(vect_update_shared_vectype): Always compare with the
present vector type, update if NULL.
(vect_build_slp_tree_1): Do not update the shared vector
type here.
(vect_build_slp_tree_2): Adjust.
(slp_copy_subtree): Likewise.
(vect_attempt_slp_rearrange_stmts): Likewise.
(vect_analyze_slp_instance): Likewise.
(vect_analyze_slp): Likewise.
(vect_slp_analyze_node_operations_1): Update the shared
vector type.
(vect_slp_analyze_operations): Adjust.
(vect_slp_analyze_bb_1): Likewise.

Jojo R [Wed, 16 Sep 2020 07:29:18 +0000 (15:29 +0800)]
C-SKY: Support multilib for mfloat-abi=.

gcc/ChangeLog:

* config/csky/t-csky-linux (CSKY_MULTILIB_OSDIRNAMES): Use mfloat-abi.
(MULTILIB_OPTIONS): Likewise.
* config/csky/t-csky-elf (MULTILIB_OPTIONS): Likewise.
(MULTILIB_EXCEPTIONS): Likewise.

Jakub Jelinek [Wed, 16 Sep 2020 08:11:53 +0000 (10:11 +0200)]
arm: Avoid unused parameter warning

2020-09-16  Jakub Jelinek  <jakub@redhat.com>

* config/arm/arm.c (arm_option_restore): Comment out opts argument
name to avoid unused parameter warnings.

Jakub Jelinek [Wed, 16 Sep 2020 08:04:32 +0000 (10:04 +0200)]
options, lto: Optimize streaming of optimization nodes

When working on the previous patch, I've noticed that all cl_optimization
fields apart from strings are streamed with bp_pack_value (..., 64); so we
waste quite a lot of space, given that many of the options are just booleans
or char options and there are 450-ish of them.

Fixed by streaming the number of bits the corresponding fields have.
While for char fields we have also range information, except for 3
it is either -128, 127 or 0, 255, so it didn't seem worth it to bother
with using range-ish packing.

2020-09-16  Jakub Jelinek  <jakub@redhat.com>

* optc-save-gen.awk: In cl_optimization_stream_out use
bp_pack_var_len_{int,unsigned} instead of bp_pack_value.  In
cl_optimization_stream_in use bp_unpack_var_len_{int,unsigned}
instead of bp_unpack_value.  Formatting fix.
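
The space saving described in this entry can be illustrated with a
minimal sketch.  It is not GCC's bitpack code: struct bitpack, pack_bits
and unpack_bits are hypothetical names, and the sketch handles only a
single 64-bit word.  It shows three small fields sharing 11 bits, where
fixed 64-bit slots would take 192.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-word bit packer; illustration only, not GCC's API.  */
struct bitpack
{
  uint64_t word;   /* packed bits, lowest field first */
  unsigned used;   /* number of bits already occupied */
};

/* Mask with the low NBITS bits set (1 <= NBITS <= 64).  */
static uint64_t
low_mask (unsigned nbits)
{
  return nbits == 64 ? ~UINT64_C (0) : (UINT64_C (1) << nbits) - 1;
}

/* Append the low NBITS bits of VALUE to BP.  */
void
pack_bits (struct bitpack *bp, uint64_t value, unsigned nbits)
{
  assert (nbits >= 1 && nbits <= 64 && bp->used + nbits <= 64);
  bp->word |= (value & low_mask (nbits)) << bp->used;
  bp->used += nbits;
}

/* Read NBITS bits starting at *POS and advance *POS past them.  */
uint64_t
unpack_bits (const struct bitpack *bp, unsigned *pos, unsigned nbits)
{
  assert (nbits >= 1 && nbits <= 64 && *pos + nbits <= 64);
  uint64_t value = (bp->word >> *pos) & low_mask (nbits);
  *pos += nbits;
  return value;
}
```

Packing a boolean in 1 bit, a char-sized option in 8 bits and a small
enum in 2 bits leaves bp.used at 11; reading them back in the same order
with the same widths returns the original values.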

Jakub Jelinek [Wed, 16 Sep 2020 07:42:33 +0000 (09:42 +0200)]
store-merging: Consider also overlapping stores earlier in the by bitpos sorting [PR97053]

As the testcases show, if we have something like:
  MEM <char[12]> [&b + 8B] = {};
  MEM[(short *) &b] = 5;
  _5 = *x_4(D);
  MEM <long long unsigned int> [&b + 2B] = _5;
  MEM[(char *)&b + 16B] = 88;
  MEM[(int *)&b + 20B] = 1;
then in sort_by_bitpos the stores are almost like in the given order,
except the first store is after the = _5; store.
We can't coalesce the = 5; store with = _5;, because the latter is MEM_REF,
while the former INTEGER_CST, and we can't coalesce the = _5 store with
the = {} store because the former is MEM_REF, the latter INTEGER_CST.
But we happily coalesce the remaining 3 stores, which is wrong, because the
= _5; store overlaps those and is in between them in the program order.
We already have code to deal with similar cases in check_no_overlap, but we
deal only with the following stores in sort_by_bitpos order, not the earlier
ones.

The following patch checks also the earlier ones.  In coalesce_immediate_stores
it computes the first one that needs to be checked (all the ones whose
bitpos + bitsize is smaller or equal to merged_store->start don't need to be
checked and don't need to be checked even for any following attempts because
of the sort_by_bitpos sorting) and the end of that (that is the first store
in the merged_store).

2020-09-16  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/97053
* gimple-ssa-store-merging.c (check_no_overlap): Add FIRST_ORDER,
START, FIRST_EARLIER and LAST_EARLIER arguments.  Return false if
any stores between FIRST_EARLIER inclusive and LAST_EARLIER exclusive
has order in between FIRST_ORDER and LAST_ORDER and overlaps the to
be merged store.
(imm_store_chain_info::try_coalesce_bswap): Add FIRST_EARLIER argument.
Adjust check_no_overlap caller.
(imm_store_chain_info::coalesce_immediate_stores): Add first_earlier
and last_earlier variables, adjust them during iterations.  Adjust
check_no_overlap callers, call check_no_overlap even when extending
overlapping stores by extra INTEGER_CST stores.

* gcc.dg/store_merging_31.c: New test.
* gcc.dg/store_merging_32.c: New test.
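
The extra check on earlier stores can be sketched as follows.  This is
an illustrative model under stated assumptions, not the code in
gimple-ssa-store-merging.c: struct store and no_intervening_overlap are
hypothetical names, positions are in bits, and the program-order
numbers are assigned by hand.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one recorded store; not GCC's own data structure.  */
struct store
{
  unsigned long bitpos;   /* first bit written */
  unsigned long bitsize;  /* number of bits written */
  unsigned order;         /* position in program order */
};

/* Return true if no store in STORES[0..N) both lies strictly between
   FIRST_ORDER and LAST_ORDER in program order and overlaps the bit
   range [START, END) of the store that is about to be merged.  */
bool
no_intervening_overlap (const struct store *stores, size_t n,
                        unsigned first_order, unsigned last_order,
                        unsigned long start, unsigned long end)
{
  for (size_t i = 0; i < n; i++)
    {
      const struct store *s = &stores[i];
      bool in_between = s->order > first_order && s->order < last_order;
      bool overlaps = s->bitpos < end && s->bitpos + s->bitsize > start;
      if (in_between && overlaps)
        return false;
    }
  return true;
}
```

For the example in the commit message, take the merged candidate as bits
64 to 192 with orders 0 to 4.  The "= _5" store (bits 16 to 80, order 2)
overlaps that range and sits in between, so the function returns false;
the "= 5" store (bits 0 to 16, order 1) does not overlap and would not
block the merge on its own.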

Jojo R [Wed, 16 Sep 2020 03:28:30 +0000 (11:28 +0800)]
C-SKY: Fix wrong ld name with option -mfloat-abi=hard.

gcc/ChangeLog:

* config/csky/csky-linux-elf.h (GLIBC_DYNAMIC_LINKER): Use mfloat-abi.

Kewen Lin [Wed, 16 Sep 2020 03:32:55 +0000 (22:32 -0500)]
rs6000: Remove useless insns fed into lvx/stvx [PR97019]

This patch extends the existing function find_alignment_op to
check that all definitions of base_reg are AND operations with mask -16B
to force the alignment.  If all are satisfied, it passes all AND
operations and instructions to the functions recombine_lvx_pattern
and recombine_stvx_pattern, which can then remove all the useless
ANDs.

Bootstrapped/regtested on powerpc64le-linux-gnu P8.

gcc/ChangeLog:

PR target/97019
* config/rs6000/rs6000-p8swap.c (find_alignment_op): Adjust to
support multiple definitions which are all AND operations with
the mask -16B.
(recombine_lvx_pattern): Adjust to handle multiple AND operations
from find_alignment_op.
(recombine_stvx_pattern): Likewise.

gcc/testsuite/ChangeLog:

PR target/97019
* gcc.target/powerpc/pr97019.c: New test.
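
Why such an AND is useless can be shown with a small sketch.  It
assumes, as the patch relies on, that an lvx/stvx-style access itself
ignores the low four bits of the address; align_down_16 and
effective_address are hypothetical helper names used only to model that.

```c
#include <stdint.h>

/* Model of "base_reg & -16": clear the low four address bits.  */
uintptr_t
align_down_16 (uintptr_t addr)
{
  return addr & ~(uintptr_t) 15;
}

/* Hypothetical model of the address an lvx/stvx-style access uses,
   assuming the instruction ignores the low four bits by itself.  */
uintptr_t
effective_address (uintptr_t addr)
{
  return addr & ~(uintptr_t) 15;
}
```

Under that assumption, effective_address (align_down_16 (p)) equals
effective_address (p) for every p, so the explicit masking adds nothing
when its only consumer is such an access.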

Jojo R [Tue, 15 Sep 2020 08:08:01 +0000 (16:08 +0800)]
C-SKY: Support -mfloat-abi=hard.

gcc/ChangeLog:

* config/csky/csky.md (CSKY_NPARM_FREGS): New.
(call_value_internal_vs/d): New.
(untyped_call): New.
* config/csky/csky.h (TARGET_SINGLE_FPU): New.
(TARGET_DOUBLE_FPU): New.
(FUNCTION_VARG_REGNO_P): New.
(CSKY_VREG_MODE_P): New.
(FUNCTION_VARG_MODE_P): New.
(CUMULATIVE_ARGS): Add extra regs info.
(INIT_CUMULATIVE_ARGS): Use csky_init_cumulative_args.
(FUNCTION_ARG_REGNO_P): Use FUNCTION_VARG_REGNO_P.
* config/csky/csky-protos.h (csky_init_cumulative_args): Extern.
* config/csky/csky.c (csky_cpu_cpp_builtins): Support TARGET_HARD_FLOAT_ABI.
(csky_function_arg): Likewise.
(csky_num_arg_regs): Likewise.
(csky_function_arg_advance): Likewise.
(csky_function_value): Likewise.
(csky_libcall_value): Likewise.
(csky_function_value_regno_p): Likewise.
(csky_arg_partial_bytes): Likewise.
(csky_setup_incoming_varargs): Likewise.
(csky_init_cumulative_args): New.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-apply2.c : Skip if CSKY.
* gcc.dg/torture/stackalign/builtin-apply-2.c : Likewise.

Bill Schmidt [Wed, 16 Sep 2020 01:34:22 +0000 (20:34 -0500)]
rs6000: Fix misnamed built-in

The description in rs6000-builtin.def provides for a builtin named
__builtin_altivec_xst_len_r.  However, it is hand-defined in
altivec_init_builtins as __builtin_xst_len_r, against the usual naming
practice.  Fix that.

2020-09-15  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-call.c (altivec_init_builtins): Fix name
of __builtin_altivec_xst_len_r.

Than McIntosh [Tue, 15 Sep 2020 12:31:30 +0000 (08:31 -0400)]
libgo: additional type/const references in sysinfo.c

Add a few more explicit references to enumeration constants
(RUSAGE_SELF, DT_UNKNOWN) in sysinfo.c to ensure that their hosting enums
are emitted into DWARF, when using a clang host compiler during
the gollvm build.

Updates golang/go#41382.
Updates golang/go#41404.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/254941

GCC Administrator [Wed, 16 Sep 2020 00:16:37 +0000 (00:16 +0000)]
Daily bump.

David Malcolm [Tue, 15 Sep 2020 09:51:04 +0000 (05:51 -0400)]
analyzer: fix ICE when merging constraints w/o transitivity [PR96650]

PR analyzer/96650 reports an assertion failure when merging the
intersection of two sets of constraints, due to the resulting
constraints being infeasible.

It turns out that the two input sets were each infeasible if
transitivity were considered, but -fanalyzer-transitivity was off.
However for this case, the merging code was "discovering" the
transitive infeasibility of the intersection of the constraints even
when -fanalyzer-transitivity is off, triggering an assertion failure.

I attempted various fixes for this, but each of them would have
introduced O(N^2) logic into the constraint-handling code into the
-fno-analyzer-transitivity case (with N == the number of constraints).

This patch fixes the ICE by tweaking the assertion, so that we
silently drop such constraints if -fanalyzer-transitivity is off.

gcc/analyzer/ChangeLog:
PR analyzer/96650
* constraint-manager.cc (merger_fact_visitor::on_fact): Replace
assertion that add_constraint succeeded with an assertion that
if it fails, -fanalyzer-transitivity is off.

gcc/testsuite/ChangeLog:
PR analyzer/96650
* gcc.dg/analyzer/pr96650-1-notrans.c: New test.
* gcc.dg/analyzer/pr96650-1-trans.c: New test.
* gcc.dg/analyzer/pr96650-2-notrans.c: New test.
* gcc.dg/analyzer/pr96650-2-trans.c: New test.

Tobias Burnus [Tue, 15 Sep 2020 19:28:40 +0000 (21:28 +0200)]
libgomp/target.c: Silence -Wuninitialized warning

libgomp/ChangeLog:

PR fortran/96668
* target.c (gomp_map_vars_internal): Initialize has_nullptr.

Ilya Leoshkevich [Tue, 8 Sep 2020 23:23:51 +0000 (01:23 +0200)]
rtlanal: fix subreg handling in set_noop_p ()

The following s390 rtx is erroneously considered a no-op:

(set (subreg:DF (reg:TF %f0) 8) (subreg:DF (reg:V1TF %f0) 8))

Here, SET_DEST is a second register in a floating-point register pair,
and SET_SRC is the second half of a vector register, so they refer to
different bits.

Fix by treating subregs of registers in different modes conservatively.

gcc/ChangeLog:

2020-09-11  Ilya Leoshkevich  <iii@linux.ibm.com>

* rtlanal.c (set_noop_p): Treat subregs of registers in
different modes conservatively.

Richard Biener [Tue, 15 Sep 2020 14:12:53 +0000 (16:12 +0200)]
make swap argument of vect_get_and_check_slp_defs readonly

For some time now we have only used this argument to communicate from
vect_build_slp_tree_1 to vect_get_and_check_slp_defs.  This change makes
the direction of information flow clear.

2020-09-15  Richard Biener  <rguenther@suse.de>

* tree-vect-slp.c (vect_get_and_check_slp_defs): Make swap
argument by-value and do not change it.
(vect_build_slp_tree_2): Adjust, set swap to NULL after last
use.

Nathan Sidwell [Tue, 15 Sep 2020 14:55:18 +0000 (07:55 -0700)]
c++: Partially revert: local externs in templates do not get template head

Turns out I didn't get OMP reductions correct.  To address those I
need to do some reorganization, so this patch just reverts the
OMP-specific pieces of the local decl changes.

gcc/cp/
* pt.c (push_template_decl_real): OMP reductions retain a template
header.
(tsubst_function_decl): Likewise.

Feng Xue [Mon, 17 Aug 2020 15:00:35 +0000 (23:00 +0800)]
tree-optimization/94234 - add plusminus-with-convert pattern

Add a rule (T)(A) +- (T)(B) -> (T)(A +- B), which works only when (A +- B)
could be folded to a simple value. By this rule, a plusminus-mult-with-convert
expression could be handed over to the rule (A * C) +- (B * C) -> (A +- B) * C.

2020-09-15  Feng Xue  <fxue@os.amperecomputing.com>

gcc/
PR tree-optimization/94234
* match.pd (T)(A) +- (T)(B) -> (T)(A +- B): New simplification.

gcc/testsuite/
PR tree-optimization/94234
* gcc.dg/pr94234-3.c: New test.
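
The shape of the rule can be illustrated with a minimal sketch.  It
covers only one case where the identity always holds, a truncating
conversion between unsigned types; the exact conditions match.pd places
on the rule are not shown here, and the function names are hypothetical.

```c
#include <stdint.h>

/* Left-hand side of the rule: convert each operand, then subtract.  */
uint32_t
convert_then_sub (uint64_t a, uint64_t b)
{
  return (uint32_t) a - (uint32_t) b;
}

/* Right-hand side: subtract in the wide type, then convert once.  */
uint32_t
sub_then_convert (uint64_t a, uint64_t b)
{
  return (uint32_t) (a - b);
}
```

Both functions compute the difference modulo 2^32, so they agree for
every pair of inputs, including ones where the subtraction wraps.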

H.J. Lu [Tue, 15 Sep 2020 14:21:58 +0000 (07:21 -0700)]
gcc.target/i386/pr78904-4a.c: Compile with -mtune=generic

commit e95395926a84a2406faefe0995295d199d595440
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Thu Jun 18 20:12:48 2020 +0200

    i386: Fix mode of ZERO_EXTRACT RTXes, remove ext_register_operand predicate.

caused

FAIL: gcc.target/i386/pr78904-4a.c scan-assembler [ \t]movb[\t ]+%.h, t

when compiled with --target_board='unix{-m32\ -march=cascadelake}'.  With
-mtune=generic:

movzwl 4(%esp), %edx
movl 8(%esp), %eax
movb %dh, t(%eax)
ret

With -mtune=cascadelake:

movzbl 5(%esp), %edx
movl 8(%esp), %eax
movb %dl, t(%eax)
ret

Add -mtune=generic for --target_board='unix{-m32\ -march=cascadelake}'.

* gcc.target/i386/pr78904-4a.c: Compile with -mtune=generic.

Segher Boessenkool [Mon, 14 Sep 2020 18:19:47 +0000 (18:19 +0000)]
bb-reorder: Fix for ICEs caused by 69ca5f3a9882

After the previous patch we are left with an unreachable BB.  This will
ICE if either we have -fschedule-fusion, or we do not have peephole2.

2020-09-15  Segher Boessenkool  <segher@kernel.crashing.org>

PR rtl-optimization/96475
* bb-reorder.c (duplicate_computed_gotos): If we did anything, run
cleanup_cfg.

Richard Biener [Tue, 15 Sep 2020 12:35:40 +0000 (14:35 +0200)]
Allow more BB vectorization

The following allows more BB vectorization by generally building leafs
from scalars rather than giving up.  Note this is only a first step
towards this and as can be seen with the exception for node splitting
it is generally hard to get this heuristic sound.  I've added variants
of the bb-slp-48.c testcase to make sure we still try permuting for
example.

2020-09-15  Richard Biener  <rguenther@suse.de>

* tree-vect-slp.c (vect_build_slp_tree_2): Also consider
building an operand from scalars when building it did not
fail fatally but avoid messing with the upcall splitting
of groups.

* gcc.dg/vect/bb-slp-48.c: New testcase.
* gcc.dg/vect/bb-slp-7.c: Adjust.

Andre Vieira [Mon, 14 Sep 2020 08:03:08 +0000 (09:03 +0100)]
arm: Fix testisms introduced with fix for pr target/95646

This patch changes the test to use the effective-target machinery and disables the
error message "ARMv8-M Security Extensions incompatible with selected FPU" when
-mfloat-abi=soft is used.
It further changes 'asm' to '__asm__' to avoid failures with '-std=' options.

gcc/ChangeLog:
2020-07-06  Andre Vieira  <andre.simoesdiasvieira@arm.com>

* config/arm/arm.c (arm_options_perform_arch_sanity_checks): Do not
check +D32 for CMSE if -mfloat-abi=soft

gcc/testsuite/ChangeLog:
2020-07-06  Andre Vieira  <andre.simoesdiasvieira@arm.com>

* gcc.target/arm/pr95646.c: Fix testism.

liuhongt [Mon, 24 Aug 2020 12:36:52 +0000 (20:36 +0800)]
Retune mask <->integer cost for non-AVX512 micro-architecture.

gcc/ChangeLog:

PR target/96744
* config/i386/x86-tune-costs.h (struct processor_costs):
Increase mask <-> integer cost for non AVX512 target to avoid
spill gpr to mask. Also retune mask <-> integer and
mask_load/store for skylake_cost.

Jakub Jelinek [Tue, 15 Sep 2020 07:37:48 +0000 (09:37 +0200)]
i386: Fix up vector mul and div with broadcasts in -masm=intel mode

These patterns printed bogus <>s around the {1to16} and similar strings.

2020-09-15  Jakub Jelinek  <jakub@redhat.com>

PR target/97028
* config/i386/sse.md (mul<mode>3<mask_name>_bcs,
<avx512>_div<mode>3<mask_name>_bcst): Use <avx512bcst> instead of
<<avx512bcst>>.

* gcc.target/i386/avx512f-pr97028.c: Untested fix.

Tobias Burnus [Tue, 15 Sep 2020 07:24:47 +0000 (09:24 +0200)]
OpenMP/Fortran: Fix (re)mapping of allocatable/pointer arrays [PR96668]

gcc/cp/ChangeLog:

PR fortran/96668
* cp-gimplify.c (cxx_omp_finish_clause): Add bool openacc arg.
* cp-tree.h (cxx_omp_finish_clause): Likewise
* semantics.c (handle_omp_for_class_iterator): Update call.

gcc/fortran/ChangeLog:

PR fortran/96668
* trans.h (gfc_omp_finish_clause): Add bool openacc arg.
* trans-openmp.c (gfc_omp_finish_clause): Ditto. Use
GOMP_MAP_ALWAYS_POINTER with PSET for pointers.
(gfc_trans_omp_clauses): Like the latter and also if the always
modifier is used.

gcc/ChangeLog:

PR fortran/96668
* gimplify.c (gimplify_omp_for): Add 'bool openacc' argument;
update omp_finish_clause calls.
(gimplify_adjust_omp_clauses_1, gimplify_adjust_omp_clauses,
gimplify_expr, gimplify_omp_loop): Update omp_finish_clause
and/or gimplify_for calls.
* langhooks-def.h (lhd_omp_finish_clause): Add bool openacc arg.
* langhooks.c (lhd_omp_finish_clause): Likewise.
* langhooks.h (lhd_omp_finish_clause): Likewise.
* omp-low.c (scan_sharing_clauses): Keep GOMP_MAP_TO_PSET clause for
'declare target' vars.

include/ChangeLog:

PR fortran/96668
* gomp-constants.h (GOMP_MAP_ALWAYS_POINTER_P): Define.

libgomp/ChangeLog:

PR fortran/96668
* libgomp.h (struct target_var_desc): Add has_null_ptr_assoc member.
* target.c (gomp_map_vars_existing): Add always_to_flag flag.
(gomp_map_vars_existing): Update call to it.
(gomp_map_fields_existing): Likewise
(gomp_map_vars_internal): Update PSET handling such that if a nullptr is
now allocated or if GOMP_MAP_POINTER is used PSET is updated and pointer
remapped.
(GOMP_target_enter_exit_data): Handle GOMP_MAP_ALWAYS_POINTER like
GOMP_MAP_POINTER.
* testsuite/libgomp.fortran/map-alloc-ptr-1.f90: New test.
* testsuite/libgomp.fortran/map-alloc-ptr-2.f90: New test.

Feng Xue [Tue, 1 Sep 2020 09:17:58 +0000 (17:17 +0800)]
tree-optimization/94234 - Fold plusminus_mult expr with multi-use operands

2020-09-03  Feng Xue  <fxue@os.amperecomputing.com>

gcc/
PR tree-optimization/94234
* genmatch.c (dt_simplify::gen_1): Emit check on final simplification
result when "!" is specified on toplevel output expr.
* match.pd ((A * C) +- (B * C) -> (A +- B) * C): Allow folding on expr
with multi-use operands if final result is a simple gimple value.

gcc/testsuite/
PR tree-optimization/94234
* gcc.dg/pr94234-2.c: New test.
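
The fold named in this entry rests on the distributive law.  A minimal
sketch, using unsigned int so that the arithmetic is modular and the
identity holds for all inputs; the function names are hypothetical and
the multi-use conditions checked by match.pd are not modelled.

```c
/* Original form: two multiplications and one subtraction.  */
unsigned int
mult_then_sub (unsigned int a, unsigned int b, unsigned int c)
{
  return a * c - b * c;
}

/* Folded form: one subtraction and one multiplication.  */
unsigned int
sub_then_mult (unsigned int a, unsigned int b, unsigned int c)
{
  return (a - b) * c;
}
```

The folded form saves a multiplication, which pays off when (A +- B)
itself reduces to a simple value, as the entry describes.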

GCC Administrator [Tue, 15 Sep 2020 00:16:37 +0000 (00:16 +0000)]
Daily bump.

Sergei Trofimovich [Fri, 11 Sep 2020 14:31:35 +0000 (15:31 +0100)]
doc: fix spelling of -fprofile-reproducibility

gcc/ChangeLog:

* doc/invoke.texi: fix '-fprofile-reproducibility' option
spelling in manual.

Ian Lance Taylor [Mon, 14 Sep 2020 21:01:56 +0000 (14:01 -0700)]
libbacktrace: support MiniDebugInfo

libbacktrace/ChangeLog:
PR libbacktrace/93608
Add support for MiniDebugInfo.
* elf.c (struct elf_view): Define.  Replace most uses of
backtrace_view with elf_view.
(elf_get_view): New static functions.  Replace most calls of
backtrace_get_view with elf_get_view.
(elf_release_view): New static functions.  Replace most calls of
backtrace_release_view with elf_release_view.
(elf_uncompress_failed): Rename from elf_zlib_failed.  Change all
callers.
(LZMA_STATES, LZMA_POS_STATES, LZMA_DIST_STATES): Define.
(LZMA_DIST_SLOTS, LZMA_DIST_MODEL_START): Define.
(LZMA_DIST_MODEL_END, LZMA_FULL_DISTANCES): Define.
(LZMA_ALIGN_SIZE, LZMA_LEN_LOW_SYMBOLS): Define.
(LZMA_LEN_MID_SYMBOLS, LZMA_LEN_HIGH_SYMBOLS): Define.
(LZMA_LITERAL_CODERS_MAX, LZMA_LITERAL_CODER_SIZE): Define.
(LZMA_PROB_IS_MATCH_LEN, LZMA_PROB_IS_REP_LEN): Define.
(LZMA_PROB_IS_REP0_LEN, LZMA_PROB_IS_REP1_LEN): Define.
(LZMA_PROB_IS_REP2_LEN, LZMA_PROB_IS_REP0_LONG_LEN): Define.
(LZMA_PROB_DIST_SLOT_LEN, LZMA_PROB_DIST_SPECIAL_LEN): Define.
(LZMA_PROB_DIST_ALIGN_LEN): Define.
(LZMA_PROB_MATCH_LEN_CHOICE_LEN): Define.
(LZMA_PROB_MATCH_LEN_CHOICE2_LEN): Define.
(LZMA_PROB_MATCH_LEN_LOW_LEN): Define.
(LZMA_PROB_MATCH_LEN_MID_LEN): Define.
(LZMA_PROB_MATCH_LEN_HIGH_LEN): Define.
(LZMA_PROB_REP_LEN_CHOICE_LEN): Define.
(LZMA_PROB_REP_LEN_CHOICE2_LEN): Define.
(LZMA_PROB_REP_LEN_LOW_LEN): Define.
(LZMA_PROB_REP_LEN_MID_LEN): Define.
(LZMA_PROB_REP_LEN_HIGH_LEN): Define.
(LZMA_PROB_LITERAL_LEN): Define.
(LZMA_PROB_IS_MATCH_OFFSET, LZMA_PROB_IS_REP_OFFSET): Define.
(LZMA_PROB_IS_REP0_OFFSET, LZMA_PROB_IS_REP1_OFFSET): Define.
(LZMA_PROB_IS_REP2_OFFSET): Define.
(LZMA_PROB_IS_REP0_LONG_OFFSET): Define.
(LZMA_PROB_DIST_SLOT_OFFSET): Define.
(LZMA_PROB_DIST_SPECIAL_OFFSET): Define.
(LZMA_PROB_DIST_ALIGN_OFFSET): Define.
(LZMA_PROB_MATCH_LEN_CHOICE_OFFSET): Define.
(LZMA_PROB_MATCH_LEN_CHOICE2_OFFSET): Define.
(LZMA_PROB_MATCH_LEN_LOW_OFFSET): Define.
(LZMA_PROB_MATCH_LEN_MID_OFFSET): Define.
(LZMA_PROB_MATCH_LEN_HIGH_OFFSET): Define.
(LZMA_PROB_REP_LEN_CHOICE_OFFSET): Define.
(LZMA_PROB_REP_LEN_CHOICE2_OFFSET): Define.
(LZMA_PROB_REP_LEN_LOW_OFFSET): Define.
(LZMA_PROB_REP_LEN_MID_OFFSET): Define.
(LZMA_PROB_REP_LEN_HIGH_OFFSET): Define.
(LZMA_PROB_LITERAL_OFFSET): Define.
(LZMA_PROB_TOTAL_COUNT): Define.
(LZMA_IS_MATCH, LZMA_IS_REP, LZMA_IS_REP0): Define.
(LZMA_IS_REP1, LZMA_IS_REP2, LZMA_IS_REP0_LONG): Define.
(LZMA_DIST_SLOT, LZMA_DIST_SPECIAL, LZMA_DIST_ALIGN): Define.
(LZMA_MATCH_LEN_CHOICE, LZMA_MATCH_LEN_CHOICE2): Define.
(LZMA_MATCH_LEN_LOW, LZMA_MATCH_LEN_MID): Define.
(LZMA_MATCH_LEN_HIGH, LZMA_REP_LEN_CHOICE): Define.
(LZMA_REP_LEN_CHOICE2, LZMA_REP_LEN_LOW): Define.
(LZMA_REP_LEN_MID, LZMA_REP_LEN_HIGH, LZMA_LITERAL): Define.
(elf_lzma_varint): New static function.
(elf_lzma_range_normalize): New static function.
(elf_lzma_bit, elf_lzma_integer): New static functions.
(elf_lzma_reverse_integer): New static function.
(elf_lzma_len, elf_uncompress_lzma_block): New static functions.
(elf_uncompress_lzma): New static function.
(backtrace_uncompress_lzma): New function.
(elf_add): Add memory and memory_size parameters.  Change all
callers.  Look for .gnu_debugdata section, and, if found,
decompress it and use it for symbols and debug info.  Permit the
descriptor parameter to be -1.
* internal.h (backtrace_uncompress_lzma): Declare.
* mtest.c: New file.
* xztest.c: New file.
* configure.ac: Check for nm, xz, and comm programs.  Check for
liblzma library.
(HAVE_MINIDEBUG): Define.
* Makefile.am (mtest_SOURCES): Define.
(mtest_CFLAGS, mtest_LDADD): Define.
(TESTS): Add mtest_minidebug if HAVE_MINIDEBUG.
(%_minidebug): New pattern rule, if HAVE_MINIDEBUG.
(xztest_SOURCES, xztest_CFLAGS, xztest_LDADD): Define.
(xztest_alloc_SOURCES, xztest_alloc_CFLAGS): Define
(xztest_alloc_LDADD): Define.
(BUILDTESTS): Add mtest, xztest, xztest_alloc.
(CLEANFILES): Add files created by minidebug pattern.
(btest.lo): Correct INCDIR reference.
(mtest.lo, xztest.lo, ztest.lo): New targets.
* configure: Regenerate.
* config.h.in: Regenerate.
* Makefile.in: Regenerate.

Marek Polacek [Mon, 14 Sep 2020 18:49:31 +0000 (14:49 -0400)]
c++: Use VAR_OR_FUNCTION_DECL_P.

gcc/cp/ChangeLog:

* pt.c (push_template_decl_real): Use VAR_OR_FUNCTION_DECL_P.

Jose E. Marchesi [Mon, 14 Sep 2020 18:35:22 +0000 (20:35 +0200)]
bpf: use the expected instruction for NOPs

The BPF ISA doesn't have a no-operation instruction, but in practice
the Linux kernel verifier performs some optimizations that rely on
these instructions to be encoded in a particular way.  As it turns
out, we were using the "wrong" instruction in GCC.

This patch makes GCC generate the expected instruction for NOP (a
`ja 0') and also adds a test to make sure this is the case.

Tested in bpf-unknown-none targets.

2020-09-14  Jose E. Marchesi  <jose.marchesi@oracle.com>

gcc/

* config/bpf/bpf.md ("nop"): Re-define as `ja 0'.

gcc/testsuite/

* gcc.target/bpf/nop-1.c: New test.
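
For readers unfamiliar with the encoding, the following sketch shows
what a `ja 0' looks like as bytes.  It is an illustration only, not the
bpf.md pattern: encode_ja is a hypothetical helper, the opcode value
0x05 (jump class, unconditional jump) and the 8-byte layout (opcode,
registers, 16-bit offset, 32-bit immediate) are taken from the eBPF
instruction format as commonly documented, and little-endian byte order
is assumed.

```c
#include <stdint.h>
#include <string.h>

enum
{
  BPF_INSN_SIZE = 8,   /* one eBPF instruction is 8 bytes */
  OPCODE_JA = 0x05     /* assumed: jump class, unconditional jump */
};

/* Write an unconditional jump with 16-bit offset OFF into OUT.
   Hypothetical helper; little-endian layout is an assumption.  */
void
encode_ja (int16_t off, uint8_t out[BPF_INSN_SIZE])
{
  memset (out, 0, BPF_INSN_SIZE);          /* registers and imm are zero */
  out[0] = OPCODE_JA;
  out[2] = (uint8_t) ((uint16_t) off & 0xff);
  out[3] = (uint8_t) ((uint16_t) off >> 8);
}
```

With an offset of 0 the jump targets the next instruction, so it has no
effect, and every byte after the opcode is zero.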

Iain Sandoe [Mon, 14 Sep 2020 18:37:30 +0000 (19:37 +0100)]
Darwin, X86, testsuite: Fix pr87767 tests for Darwin.

The tests assume non-PIC for m32 (which means that they fail on default
PIC targets, like Darwin).  There is also a missing space before the
closing '}' on one of the dg-require-effective-target which means that
test fails on machines without avx512f.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512f-broadcast-pr87767-1.c: Make the test
run as non-dynamic for m32 Darwin.
* gcc.target/i386/avx512f-broadcast-pr87767-3.c: Likewise.
* gcc.target/i386/avx512f-broadcast-pr87767-5.c: Likewise.
* gcc.target/i386/avx512f-broadcast-pr87767-7.c: Likewise.
* gcc.target/i386/avx512vl-broadcast-pr87767-1.c: Likewise.
* gcc.target/i386/avx512vl-broadcast-pr87767-3.c: Likewise.
* gcc.target/i386/avx512vl-broadcast-pr87767-5.c: Likewise.
* gcc.target/i386/avx512f-broadcast-pr87767-6.c: Adjust dg-requires
clause.

Nathan Sidwell [Mon, 14 Sep 2020 16:42:29 +0000 (09:42 -0700)]
c++: local externs in templates do not get template head

Now we consistently mark local externs with DECL_LOCAL_DECL_P, we can
teach the template machinery not to give them a TEMPLATE_DECL head,
and the instantiation machinery treat them as the local specializations
they are.  (openmp UDRs also fall into this category, and are dealt
with similarly.)

gcc/cp/
* pt.c (push_template_decl_real): Don't attach a template head to
local externs.
(tsubst_function_decl): Add support for headless local extern
decls.
(tsubst_decl): Add support for headless local extern decls.

David Malcolm [Sun, 13 Sep 2020 23:35:28 +0000 (19:35 -0400)]
analyzer: add -param=analyzer-max-constraints=

On attempting to run the full test suite with -fanalyzer via
  make check RUNTESTFLAGS="-v -v --target_board=unix/-fanalyzer"
I saw it get stuck on:
  gcc.c-torture/compile/20001226-1.c
It turns out this was on a debug build, rather than a release build;
but a release build with -fanalyzer took:
  real 1m33.689s
  user 1m30.239s
  sys  0m2.727s
as compared to:
  real 0m2.361s
  user 0m2.107s
  sys  0m0.214s
without -fanalyzer.

This torture test performs 64 * 64 uniquely-coded comparisons between
elements of a pair of arrays until it finds an element that's different,
leading to an accumulation of 4096 constraints along the path where
no difference is found.

"perf" shows most of the time is spent in canonicalizing and copying
constraint_manager instances, presumably as it copies and merges states
with increasingly more complex sets of constraints as it analyzes
further along the "no differences yet" path.

This patch crudely works around this by adding a
  -param=analyzer-max-constraints=
limit, defaulting to 20, above which constraints will be silently
dropped.  With this -fanalyzer takes:
  real 0m6.935s
  user 0m6.413s
  sys  0m0.396s
on the above case.
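
As a rough illustration of the workaround described above (not the actual
constraint_manager code), a capped store could behave like the following
C++ sketch.  The class name bounded_constraints, its members, and the use
of strings to represent constraints are all hypothetical; only the default
limit of 20 and the silent-drop behavior come from the description.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a constraint store with a size cap.
class bounded_constraints
{
public:
  explicit bounded_constraints (std::size_t max = 20) : m_max (max) {}

  // Returns true if the constraint was recorded, false if it was
  // silently dropped because the cap has already been reached.
  bool add (const std::string &c)
  {
    if (m_constraints.size () >= m_max)
      return false;
    m_constraints.push_back (c);
    return true;
  }

  std::size_t size () const { return m_constraints.size (); }

private:
  std::size_t m_max;
  std::vector<std::string> m_constraints;
};
```

Feeding 4096 constraints into such a store leaves only the first 20, which
bounds the cost of copying and merging at the price of lost precision.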

gcc/analyzer/ChangeLog:
* analyzer.opt (-param=analyzer-max-constraints=): New param.
* constraint-manager.cc
(constraint_manager::add_constraint_internal): Silently reject
attempts to add constraints when the above limit is reached.

4 years agoanalyzer: fix constraint explosion on many-cased switch [PR96653]
David Malcolm [Fri, 11 Sep 2020 01:23:38 +0000 (21:23 -0400)]
analyzer: fix constraint explosion on many-cased switch [PR96653]

PR analyzer/96653 reports a CPU-time and memory explosion in -fanalyzer
seen in Linux 5.9-rc1:drivers/media/v4l2-core/v4l2-ctrls.c on a switch
statement with many cases.

The issue is some old code in constraint_manager::get_or_add_equiv_class
for ensuring that comparisons between equivalence classes containing
constants work correctly.  The old code added constraints for every
pair of ECs containing constants, leading to O(N^2) constraints (for
N constants).  Given that get_or_add_equiv_class also involves O(N)
comparisons, this led to at least O(N^3) CPU time, and O(N^2) memory
consumption when handling the "default" case, where N is the number of
other cases in the switch statement.
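
To make the growth concrete, the sketch below counts the pairwise
constraints that such a scheme would accumulate for N constants.  The
function name is hypothetical, and this models only the counting, not the
analyzer's equivalence-class data structures.

```cpp
#include <cstddef>

// Number of constraints created if every pair of distinct constants
// gets its own explicit constraint: N * (N - 1) / 2.
constexpr std::size_t pairwise_constraint_count (std::size_t n)
{
  return n * (n - 1) / 2;
}
```

For a switch with 100 cases this gives 4950 constraints, and for 1000
cases it gives 499500.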

The state rewrite of r11-2694-g808f4dfeb3a95f50f15e71148e5c1067f90a126d
added checking for comparisons between constants, making these explicit
constraints redundant, but failed to remove the code mentioned above.

This patch removes it, fixing the blow-up of constraints in the default
case.

gcc/analyzer/ChangeLog:
PR analyzer/96653
* constraint-manager.cc
(constraint_manager::get_or_add_equiv_class): Don't accumulate
transitive closure of all constraints on constants.

gcc/testsuite/ChangeLog:
PR analyzer/96653
* gcc.dg/analyzer/pr96653.c: New test.

4 years agoanalyzer: add regression test for leak false positive
David Malcolm [Mon, 14 Sep 2020 13:05:50 +0000 (09:05 -0400)]
analyzer: add regression test for leak false positive

Downstream bug report:
  https://bugzilla.redhat.com/show_bug.cgi?id=1878600
describes a false positive from -Wanalyzer-file-leak seen with
gcc 10.2, but which has been fixed in gcc 11.

This patch adds the reproducer as a regression test.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/rhbz1878600.c: New test.

4 years agoanalyzer: fix ICE on setjmp with non-pointer-type [PR97029]
David Malcolm [Sat, 12 Sep 2020 13:28:05 +0000 (09:28 -0400)]
analyzer: fix ICE on setjmp with non-pointer-type [PR97029]

gcc/analyzer/ChangeLog:
PR analyzer/97029
* analyzer.cc (is_setjmp_call_p): Require the initial arg to be a
pointer.
* region-model.cc (region_model::deref_rvalue): Assert that the
svalue is of pointer type.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/pr97029.c: New test.

4 years agoFix dangling references in thunks at -O0
Eric Botcazou [Mon, 14 Sep 2020 15:24:32 +0000 (17:24 +0200)]
Fix dangling references in thunks at -O0

When a thunk cannot be emitted in assembly directly, expand_thunk
generates regular GIMPLE code but unconditionally forces a tail
call to the target of the thunk.  That's theoretically OK because
the thunk essentially forwards its parameters to the target, but
in practice the RTL expander can spill parameters passed by reference
on the stack in assign_parm_setup_reg.

gcc/ChangeLog:
* cgraphunit.c (cgraph_node::expand_thunk): Make sure to set
cfun->tail_call_marked when forcing a tail call.
* function.c (assign_parm_setup_reg): Always use a register to
load a parameter passed by reference if cfun->tail_call_marked.

gcc/testsuite/ChangeLog:
* gnat.dg/thunk1.adb: New test.
* gnat.dg/thunk1_pkg1.ads: New helper.
* gnat.dg/thunk1_pkg2.ads: Likewise.
* gnat.dg/thunk1_pkg2.adb: Likewise.

4 years agoRename mffgpr/mftgpr insn types and remove Power6 references.
Pat Haugen [Mon, 14 Sep 2020 14:37:34 +0000 (09:37 -0500)]
Rename mffgpr/mftgpr insn types and remove Power6 references.

The following is mostly a mechanical change to rename the mffgpr/mftgpr
insn types to mtvsr/mfvsr to be more clear. It also removes Power6
references to those insn types since we no longer generate those
instructions.

2020-09-14  Pat Haugen  <pthaugen@linux.ibm.com>

gcc/
* config/rs6000/power10.md (power10-mffgpr, power10-mftgpr): Rename to
power10-mtvsr/power10-mfvsr.
* config/rs6000/power6.md (X2F_power6, power6-mftgpr, power6-mffgpr):
Remove.
* config/rs6000/power8.md (power8-mffgpr, power8-mftgpr): Rename to
power8-mtvsr/power8-mfvsr.
* config/rs6000/power9.md (power9-mffgpr, power9-mftgpr): Rename to
power9-mtvsr/power9-mfvsr.
* config/rs6000/rs6000.c (rs6000_adjust_cost): Remove Power6
TYPE_MFFGPR cases.
* config/rs6000/rs6000.md (mffgpr, mftgpr, zero_extendsi<mode>2,
extendsi<mode>2, @signbit<mode>2_dm, lfiwax, lfiwzx, *movsi_internal1,
movsi_from_sf, *movdi_from_sf_zero_ext, *mov<mode>_internal,
movsd_hardfloat, movsf_from_si, *mov<mode>_hardfloat64, p8_mtvsrwz,
p8_mtvsrd_df, p8_mtvsrd_sf, p8_mfvsrd_3_<mode>, *movdi_internal64,
unpack<mode>_dm): Rename mffgpr/mftgpr to mtvsr/mfvsr.
* config/rs6000/vsx.md (vsx_mov<mode>_64bit, vsx_extract_<mode>,
vsx_extract_si, *vsx_extract_<mode>_p8): Likewise.

4 years agoarm: Fix up gcc.target/arm/lto/pr96939_* FAIL
Jakub Jelinek [Mon, 14 Sep 2020 08:53:50 +0000 (10:53 +0200)]
arm: Fix up gcc.target/arm/lto/pr96939_* FAIL

The following patch on top of the
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553801.html
patch fixes the gcc.target/arm/lto/pr96939_* test in certain ARM
configurations.
As said in the above mentioned patch, the generic code takes care of
saving/restoring TargetVariables or Target Save options, so this just
arranges for the generic code to save those instead of needing the
arm backend to do it manually.

2020-09-14  Jakub Jelinek  <jakub@redhat.com>

* config/arm/arm.opt (x_arm_arch_string, x_arm_cpu_string,
x_arm_tune_string): Remove TargetSave entries.
(march=, mcpu=, mtune=): Add Save keyword.
* config/arm/arm.c (arm_option_save): Remove.
(TARGET_OPTION_SAVE): Don't redefine.
(arm_option_restore): Don't restore x_arm_*_string here.

4 years agolibgccjit: Regenerate documentation for new entry point.
Andrea Corallo [Mon, 14 Sep 2020 07:01:18 +0000 (09:01 +0200)]
libgccjit: Regenerate documentation for new entry point.

Fix missing doc regeneration that should have been done by
4ecc0061c4 "libgccjit: Add new gcc_jit_global_set_initializer entry
point"
<https://gcc.gnu.org/onlinedocs/jit/internals/index.html#submitting-patches>.

gcc/jit/ChangeLog

2020-09-14  Andrea Corallo  <andrea.corallo@arm.com>

* docs/_build/texinfo/libgccjit.texi: Regenerate.

4 years agooptions: Save and restore opts_set for Optimization and Target options
Jakub Jelinek [Mon, 14 Sep 2020 07:04:45 +0000 (09:04 +0200)]
options: Save and restore opts_set for Optimization and Target options

> Seems a latent issue.
> Neither cl_optimization_{save,restore} nor cl_target_option_{save,restore}
> (nor any of the target hooks they call) saves or restores any opts_set
> values, so I think opts_set can be trusted only during option processing (if
> at all), but not later.
> So, short term a fix would be IMHO just stop using opts_set altogether in
> arm_configure_build_target, it doesn't make much sense to me, it should test
> if those strings are non-NULL instead, or at least do that when it is
> invoked from arm_option_restore (e.g. could be done by calling it with
> opts instead of &global_options_set ).
> Longer term, the question is if cl_optimization_{save,restore} and
> cl_target_option_{save,restore} shouldn't be changed not to only
> save/restore the options, but also save the opts_set flags.
> It could be done e.g. by adding a bool array or set of bool members
> to struct cl_optimization and struct cl_target_option , or even more compact
> by using bitmasks, pack each 64 adjacent option flags into a UHWI element
> of an array.

So, I've checked under the debugger how it behaves, and it seems
global_options_set is really an OR of whether an option has ever been seen
as explicit, either on the command line or in any of the option pragmas or
optimize/target attributes seen so far, so it isn't something that can be
relied on.

The following patch implements the saving/restoring of the opts_set bits,
though only for the options/variables saved by the generic options-save.c
code.  For the target-specific stuff that isn't handled by the generic code,
the opts_set argument is now passed to the hook, and the backends can choose
e.g. to use a TargetSave variable to save the flags either individually or
together in some bitmask (or ignore it if they never need opts_set for the
options).
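
One compact way to keep per-option "explicitly set" flags, as suggested in
the quoted text, is to pack 64 flags into each 64-bit word.  The sketch
below is only an illustration in standalone C++: explicit_mask_sketch, its
members, and the option count are hypothetical, and std::uint64_t stands in
for GCC's unsigned HOST_WIDE_INT.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical number of options tracked; not a real GCC value.
constexpr std::size_t num_options = 200;

struct explicit_mask_sketch
{
  // One bit per option, 64 options per word, rounded up.
  std::array<std::uint64_t, (num_options + 63) / 64> words{};

  // Record that option OPT was set explicitly.
  void set (std::size_t opt)
  {
    words[opt / 64] |= std::uint64_t (1) << (opt % 64);
  }

  // Query whether option OPT was set explicitly.
  bool test (std::size_t opt) const
  {
    return (words[opt / 64] >> (opt % 64)) & 1;
  }
};
```

Saving and restoring such a mask next to the option values is then a plain
array copy, and comparing or hashing two saved states can treat the words
like any other member.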

This patch itself doesn't fix the testcase failing on arm, but a follow up
patch will.

2020-09-14  Jakub Jelinek  <jakub@redhat.com>

gcc/
* opt-read.awk: Also initialize extra_target_var_types array.
* opth-gen.awk: Emit explicit_mask arrays to struct cl_optimization
and cl_target_option.  Adjust cl_optimization_save,
cl_optimization_restore, cl_target_option_save and
cl_target_option_restore declarations.
* optc-save-gen.awk: Add opts_set argument to cl_optimization_save,
cl_optimization_restore, cl_target_option_save and
cl_target_option_restore functions and save or restore opts_set
next to the opts values into or from explicit_mask arrays.
In cl_target_option_eq and cl_optimization_option_eq compare
explicit_mask arrays, in cl_target_option_hash and cl_optimization_hash
hash them and in cl_target_option_stream_out,
cl_target_option_stream_in, cl_optimization_stream_out and
cl_optimization_stream_in stream them.
* tree.h (build_optimization_node, build_target_option_node): Add
opts_set argument.
* tree.c (build_optimization_node): Add opts_set argument, pass it
to cl_optimization_save.
(build_target_option_node): Add opts_set argument, pass it to
cl_target_option_save.
* function.c (invoke_set_current_function_hook): Adjust
cl_optimization_restore caller.
* ipa-inline-transform.c (inline_call): Adjust cl_optimization_restore
and build_optimization_node callers.
* target.def (TARGET_OPTION_SAVE, TARGET_OPTION_RESTORE): Add opts_set
argument.
* target-globals.c (save_target_globals_default_opts): Adjust
cl_optimization_restore callers.
* toplev.c (process_options): Adjust build_optimization_node and
cl_optimization_restore callers.
(target_reinit): Adjust cl_optimization_restore caller.
* tree-streamer-in.c (lto_input_ts_function_decl_tree_pointers):
Adjust build_optimization_node and cl_optimization_restore callers.
* doc/tm.texi: Updated.
* config/aarch64/aarch64.c (aarch64_override_options): Adjust
build_target_option_node caller.
(aarch64_option_save, aarch64_option_restore): Add opts_set argument.
(aarch64_set_current_function): Adjust cl_target_option_restore
caller.
(aarch64_option_valid_attribute_p): Adjust cl_target_option_save,
cl_target_option_restore, cl_optimization_restore,
build_optimization_node and build_target_option_node callers.
* config/aarch64/aarch64-c.c (aarch64_pragma_target_parse): Adjust
cl_target_option_restore and build_target_option_node callers.
* config/arm/arm.c (arm_option_save, arm_option_restore): Add
opts_set argument.
(arm_option_override): Adjust cl_target_option_save,
build_optimization_node and build_target_option_node callers.
(arm_set_current_function): Adjust cl_target_option_restore caller.
(arm_valid_target_attribute_tree): Adjust build_target_option_node
caller.
(add_attribute): Formatting fix.
(arm_valid_target_attribute_p): Adjust cl_optimization_restore,
cl_target_option_restore, arm_valid_target_attribute_tree and
build_optimization_node callers.
* config/arm/arm-c.c (arm_pragma_target_parse): Adjust
cl_target_option_restore callers.
* config/csky/csky.c (csky_option_override): Adjust
build_target_option_node and cl_target_option_save callers.
* config/gcn/gcn.c (gcn_fixup_accel_lto_options): Adjust
build_optimization_node and cl_optimization_restore callers.
* config/i386/i386-builtins.c (get_builtin_code_for_version):
Adjust cl_target_option_save and cl_target_option_restore
callers.
* config/i386/i386-c.c (ix86_pragma_target_parse): Adjust
build_target_option_node and cl_target_option_restore callers.
* config/i386/i386-options.c (ix86_function_specific_save,
ix86_function_specific_restore): Add opts_set arguments.
(ix86_valid_target_attribute_tree): Adjust build_target_option_node
caller.
(ix86_valid_target_attribute_p): Adjust build_optimization_node,
cl_optimization_restore, cl_target_option_restore,
ix86_valid_target_attribute_tree and build_optimization_node callers.
(ix86_option_override_internal): Adjust build_target_option_node
caller.
(ix86_reset_previous_fndecl, ix86_set_current_function): Adjust
cl_target_option_restore callers.
* config/i386/i386-options.h (ix86_function_specific_save,
ix86_function_specific_restore): Add opts_set argument.
* config/nios2/nios2.c (nios2_option_override): Adjust
build_target_option_node caller.
(nios2_option_save, nios2_option_restore): Add opts_set argument.
(nios2_valid_target_attribute_tree): Adjust build_target_option_node
caller.
(nios2_valid_target_attribute_p): Adjust build_optimization_node,
cl_optimization_restore, cl_target_option_save and
cl_target_option_restore callers.
(nios2_set_current_function, nios2_pragma_target_parse): Adjust
cl_target_option_restore callers.
* config/pru/pru.c (pru_option_override): Adjust
build_target_option_node caller.
(pru_set_current_function): Adjust cl_target_option_restore
callers.
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Adjust
cl_target_option_save caller.
(rs6000_option_override_internal): Adjust build_target_option_node
caller.
(rs6000_valid_attribute_p): Adjust build_optimization_node,
cl_optimization_restore, cl_target_option_save,
cl_target_option_restore and build_target_option_node callers.
(rs6000_pragma_target_parse): Adjust cl_target_option_restore and
build_target_option_node callers.
(rs6000_activate_target_options): Adjust cl_target_option_restore
callers.
(rs6000_function_specific_save, rs6000_function_specific_restore):
Add opts_set argument.
* config/s390/s390.c (s390_function_specific_restore): Likewise.
(s390_option_override_internal): Adjust s390_function_specific_restore
caller.
(s390_option_override, s390_valid_target_attribute_tree): Adjust
build_target_option_node caller.
(s390_valid_target_attribute_p): Adjust build_optimization_node,
cl_optimization_restore and cl_target_option_restore callers.
(s390_activate_target_options): Adjust cl_target_option_restore
caller.
* config/s390/s390-c.c (s390_cpu_cpp_builtins): Adjust
cl_target_option_save caller.
(s390_pragma_target_parse): Adjust build_target_option_node and
cl_target_option_restore callers.
gcc/c-family/
* c-attribs.c (handle_optimize_attribute): Adjust
cl_optimization_save, cl_optimization_restore and
build_optimization_node callers.
* c-pragma.c (handle_pragma_optimize): Adjust
build_optimization_node caller.
(handle_pragma_push_options): Adjust
build_optimization_node and build_target_option_node callers.
(handle_pragma_pop_options, handle_pragma_reset_options):
Adjust cl_optimization_restore callers.
gcc/go/
* go-gcc.cc (Gcc_backend::function): Adjust
cl_optimization_save, cl_optimization_restore and
build_optimization_node callers.
gcc/ada/
* gcc-interface/trans.c (gigi): Adjust build_optimization_node
caller.

4 years ago[libgomp, nvptx] Add __sync_compare_and_swap_16
Tom de Vries [Sat, 5 Sep 2020 10:45:07 +0000 (12:45 +0200)]
[libgomp, nvptx] Add __sync_compare_and_swap_16

As reported here
( https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553070.html  ),
when running test-case libgomp.c-c++-common/reduction-16.c for powerpc host
with nvptx accelerator, we run into:
...
unresolved symbol __sync_val_compare_and_swap_16
...

I can reproduce the problem on x86_64 with a trigger patch that:
- initializes ix86_isa_flags2 to TARGET_ISA2_CX16
- enables define_expand "atomic_load<mode>" in gcc/config/i386/sync.md
  for TImode

The problem is that omp-expand.c generates atomic builtin calls based on
checks whether those are supported on the host, which forces the target to
support these, even though those checks fail for the accelerator target.

Fix this by:
- adding a __sync_val_compare_and_swap_16 in libgomp for nvptx,
  which falls back onto libatomic's __atomic_compare_and_swap_16
- adding -foffload=-latomic in the test-case

Tested libgomp on x86_64-linux with nvptx accelerator.

Tested libgomp with trigger patch on x86_64-linux with nvptx accelerator.

libgomp/ChangeLog:

* config/nvptx/atomic.c: New file.  Add
__sync_val_compare_and_swap_16.
* testsuite/libgomp.c-c++-common/reduction-16.c: Add -latomic for
target offload_target_nvptx.

4 years agoDaily bump.
GCC Administrator [Mon, 14 Sep 2020 00:16:23 +0000 (00:16 +0000)]
Daily bump.

4 years agoImprove costs for DImode shifts of integer constants.
John David Anglin [Sun, 13 Sep 2020 18:47:59 +0000 (18:47 +0000)]
Improve costs for DImode shifts of integer constants.

2020-09-13  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/pa/pa.c (hppa_rtx_costs) [ASHIFT, ASHIFTRT, LSHIFTRT]:
Provide accurate costs for DImode shifts of integer constants.

4 years agoDaily bump.
GCC Administrator [Sun, 13 Sep 2020 00:16:23 +0000 (00:16 +0000)]
Daily bump.

4 years agod: Return promoted types in d_type_promotes_to when linkage is not D
Iain Buclaw [Sat, 12 Sep 2020 14:48:58 +0000 (16:48 +0200)]
d: Return promoted types in d_type_promotes_to when linkage is not D

This enables warnings to be shown when a bad type is passed to va_arg
inside an extern(C) or extern(C++) function.

gcc/d/ChangeLog:

PR d/97002
* d-codegen.cc (d_build_call): Set input_location on CALL_EXPR.
* d-lang.cc: Include function.h.
(d_type_promotes_to): Do default conversions for C and C++ functions.
* intrinsics.cc (expand_intrinsic_vaarg): Use build1_loc to build
VA_ARG_EXPR.

gcc/testsuite/ChangeLog:

PR d/97002
* gdc.dg/pr97002.d: New test.

4 years agod: Build TYPE_DECLs for non-numeric enum types.
Iain Buclaw [Sat, 12 Sep 2020 14:26:58 +0000 (16:26 +0200)]
d: Build TYPE_DECLs for non-numeric enum types.

This is done so that the DWARF pass will emit a DW_TAG_typedef where the
member type of an enum can't be represented in an ENUMERAL_TYPE.

gcc/d/ChangeLog:

* d-builtins.cc (d_build_d_type_nodes): Call build_ctype() on all
basic front-end types.
* decl.cc (DeclVisitor::visit (EnumDeclaration *)): Always add decl to
current binding level.
(build_type_decl): Build TYPE_DECL as a typedef if not for an enum or
record type.
* types.cc (TypeVisitor::visit (TypeEnum *)): Set underlying type for
ENUMERAL_TYPEs.  Build TYPE_DECL for non-numeric enums.

4 years agoAdd new shrpsi instruction variants to gcc/config/pa/pa.md.
John David Anglin [Sat, 12 Sep 2020 14:38:51 +0000 (14:38 +0000)]
Add new shrpsi instruction variants to gcc/config/pa/pa.md.

2020-09-12  Roger Sayle  <roger@nextmovesoftware.com>
    John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog
* config/pa/pa.md (shrpsi4_1, shrpsi4_2): New define_insns split
out from previous shrpsi4 providing two commutative variants using
plus_xor_ior_operator as a predicate.
(shrpdi4_1, shrpdi4_2, shrpdi_3, shrpdi_4): Likewise DImode versions
where _1 and _2 take register shifts, and _3 and _4 for integers.
(rotlsi3_internal): Name this anonymous instruction.
(rotrdi3): New DImode insn copied from rotrsi3.
(rotldi3): New DImode expander copied from rotlsi3.
(rotldi4_internal): New DImode insn copied from rotsi3_internal.

4 years agoAdd preliminary support for 128-bit integer types
Eric Botcazou [Sat, 12 Sep 2020 10:59:09 +0000 (12:59 +0200)]
Add preliminary support for 128-bit integer types

This is only the gigi part, in preparation for the bulk of the
implementation.

gcc/ada/ChangeLog:
* fe.h: Fix pilot error in previous change.
* gcc-interface/gigi.h (enum standard_datatypes): Add ADT_mulv128_decl.
(mulv128_decl): New macro.
(get_target_long_long_long_size): Declare.
* gcc-interface/decl.c (gnat_to_gnu_entity): Use a maximum size of
128 bits for discrete types if Enable_128bit_Types is true.
* gcc-interface/targtyps.c: Include target.h.
(get_target_long_long_long_size): New function.
* gcc-interface/trans.c (gigi): Initialize mulv128_decl if need be.
(build_binary_op_trapv): Call it for 128-bit multiplication.
* gcc-interface/utils.c (make_type_from_size): Enforce a maximum
size of 128 bits if Enable_128bit_Types is true.

4 years agoFix small inconsistency in new predicate
Eric Botcazou [Sat, 12 Sep 2020 10:47:39 +0000 (12:47 +0200)]
Fix small inconsistency in new predicate

This can result on the mainline in a segfault when an object declared
at library level is used in the declaration of another, local object.

gcc/ada/ChangeLog:
* gcc-interface/trans.c (lvalue_for_aggr_p) <N_Object_Declaration>:
Return false unconditionally.

4 years agoMinor tweak to line debug info
Eric Botcazou [Sat, 12 Sep 2020 10:42:06 +0000 (12:42 +0200)]
Minor tweak to line debug info

This prevents the SLOC of the expression for a tag from being present
in the line debug info every time it is referenced for coverage purposes.

gcc/ada/ChangeLog:
* gcc-interface/trans.c (gnat_to_gnu) <N_Object_Declaration>: Clear
the SLOC of the expression of a tag.

4 years agoAccept absolute address clause for array of UNC nominal subtype
Eric Botcazou [Sat, 12 Sep 2020 10:36:30 +0000 (12:36 +0200)]
Accept absolute address clause for array of UNC nominal subtype

This changes the compiler to again accept an absolute address clause for
an aliased array of unconstrained nominal subtype, instead of erroring
out in this case.

gcc/ada/ChangeLog:
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Variable>: Only give
a warning for the overlay of an aliased array with an unconstrained
nominal subtype if the address is absolute.

4 years agoDaily bump.
GCC Administrator [Sat, 12 Sep 2020 00:16:30 +0000 (00:16 +0000)]
Daily bump.

4 years agoPowerPC: rename some functions.
Michael Meissner [Fri, 11 Sep 2020 22:10:14 +0000 (18:10 -0400)]
PowerPC: rename some functions.

gcc/
2020-09-11  Michael Meissner  <meissner@linux.ibm.com>

* config/rs6000/rs6000.c (rs6000_maybe_emit_maxc_minc): Rename
from rs6000_emit_p9_fp_minmax.  Change return type to bool.  Add
comments to document NaN/signed zero behavior.
(rs6000_maybe_emit_fp_cmove): Rename from rs6000_emit_p9_fp_cmove.
(have_compare_and_set_mask): New helper function.
(rs6000_emit_cmove): Update calls to new names and the new helper
function.

4 years agoi386: Fix array index in expander
Nathan Sidwell [Fri, 11 Sep 2020 21:13:52 +0000 (14:13 -0700)]
i386: Fix array index in expander

I noticed a compiler warning about out-of-bound access.  Fixed thusly.

gcc/
* config/i386/sse.md (mov<mode>): Fix operand indices.

4 years agolibstdc++: only pull in bits/align.h if C++11 or later
Thomas Rodgers [Fri, 11 Sep 2020 20:51:07 +0000 (13:51 -0700)]
libstdc++: only pull in bits/align.h if C++11 or later

libstdc++-v3/ChangeLog:

* include/std/memory: Move #include <bits/align.h> inside C++11
conditional includes.

4 years agoc++: Concepts and local externs
Nathan Sidwell [Fri, 11 Sep 2020 20:42:59 +0000 (13:42 -0700)]
c++: Concepts and local externs

I discovered that we'd accept constraints on block-scope function
decls inside templates.  This fixes that.

gcc/cp/
* decl.c (grokfndecl): Don't attach to local extern.

4 years ago[PATCH,rs6000] Testsuite fixup pr96139 tests
Will Schmidt [Wed, 9 Sep 2020 15:59:38 +0000 (10:59 -0500)]
[PATCH,rs6000] Testsuite fixup pr96139 tests

Hi,
  As reported, the recently added pr96139 tests will fail on older targets
  because the tests are missing the appropriate -mvsx or -maltivec options.
  This adds the options and clarifies the dg-require statements.

  The pr96139-c.c test needs -maltivec to work, but does not actually use
  vectors, so does not require -mvsx like the others.

  Sniff-regtested OK when specifying older targets on a power7 host.
  --target_board=unix/'{-mcpu=power4,-mcpu=power5,-mcpu=power6,-mcpu=power7,
  -mcpu=power8,-mcpu=power9}''{-m64,-m32}'"

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr96139-a.c: Specify -mvsx option and update the
dg-require stanza to match.
* gcc.target/powerpc/pr96139-b.c: Same.
* gcc.target/powerpc/pr96139-c.c: Specify -maltivec option and update
the dg-require stanza to match.

4 years agolibstdc++: Split std::align/assume_aligned to bits/align.h
Thomas Rodgers [Fri, 11 Sep 2020 18:55:18 +0000 (11:55 -0700)]
libstdc++: Split std::align/assume_aligned to bits/align.h

We would like to be able to use std::align and std::assume_aligned
without pulling in everything in <memory>.

libstdc++-v3/ChangeLog:

* include/Makefile.am (bits_headers): Add new header.
* include/Makefile.in: Regenerate.
* include/bits/align.h: New file.
* include/std/memory (align): Move definition to bits/align.h.
(assume_aligned): Likewise.

4 years agolibstdc++: Fix chrono::__detail::ceil to work with C++11
Jonathan Wakely [Fri, 11 Sep 2020 18:59:11 +0000 (19:59 +0100)]
libstdc++: Fix chrono::__detail::ceil to work with C++11

In C++11 constexpr functions can only have a return statement, so we
need to fix __detail::ceil to make it valid in C++11. This can be done
by moving the comparison and increment into a new function, __ceil_impl,
and calling that with the result of the duration_cast.

This would mean the standard C++17 std::chrono::ceil function would make
two further calls, which would add too much overhead when not inlined.
For C++17 and later use a using-declaration to add chrono::ceil to
namespace __detail. For C++11 and C++14 define chrono::__detail::__ceil
as a C++11-compatible constexpr function template.
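
A simplified sketch of the C++11-compatible shape described above follows.
It is not the libstdc++ source: the namespace sketch and the names
ceil_impl and ceil11 are placeholders, and the constraints of the real
header are omitted.

```cpp
#include <chrono>

namespace sketch
{
  // Single-return-statement helper: bump the truncated value by one
  // tick if truncation rounded it below the original duration.
  template <typename T, typename U>
  constexpr T ceil_impl (const T &t, const U &u)
  {
    return (t < u) ? (t + T{1}) : t;
  }

  // C++11-valid constexpr ceil: the body is one return statement that
  // passes the duration_cast result and the original to the helper.
  template <typename ToDur, typename Rep, typename Period>
  constexpr ToDur ceil11 (const std::chrono::duration<Rep, Period> &d)
  {
    return ceil_impl (std::chrono::duration_cast<ToDur> (d), d);
  }
}
```

With this shape, 1500 milliseconds rounds up to 2 seconds, while -1500
milliseconds stays at -1 second because duration_cast already truncated
toward zero.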

libstdc++-v3/ChangeLog:

* include/std/chrono [C++17] (chrono::__detail::ceil): Add
using declaration to make chrono::ceil available for internal
use with a consistent name.
(chrono::__detail::__ceil_impl): New function template.
(chrono::__detail::ceil): Use __ceil_impl to compare and
increment the value. Remove SFINAE constraint.

4 years agoFix fma test case [PR97018]
Sunil K Pandey [Fri, 11 Sep 2020 06:17:59 +0000 (23:17 -0700)]
Fix fma test case [PR97018]

These tests are written for 256-bit vectors. With -march=cascadelake,
the vector size changes to 512 bits, which doubles the number of fma
instructions and makes the tests fail. The fix is to explicitly disable
512-bit vectors by passing the additional option -mno-avx512f.

Tested on x86-64.

gcc/testsuite/ChangeLog:

PR target/97018
* gcc.target/i386/l_fma_double_1.c: Add option -mno-avx512f.
* gcc.target/i386/l_fma_double_2.c: Likewise.
* gcc.target/i386/l_fma_double_3.c: Likewise.
* gcc.target/i386/l_fma_double_4.c: Likewise.
* gcc.target/i386/l_fma_double_5.c: Likewise.
* gcc.target/i386/l_fma_double_6.c: Likewise.
* gcc.target/i386/l_fma_float_1.c: Likewise.
* gcc.target/i386/l_fma_float_2.c: Likewise.
* gcc.target/i386/l_fma_float_3.c: Likewise.
* gcc.target/i386/l_fma_float_4.c: Likewise.
* gcc.target/i386/l_fma_float_5.c: Likewise.
* gcc.target/i386/l_fma_float_6.c: Likewise.

4 years agoMove/correct offset adjustment (PR middle-end/96903).
Martin Sebor [Fri, 11 Sep 2020 15:40:45 +0000 (09:40 -0600)]
Move/correct offset adjustment (PR middle-end/96903).

Resolves:
PR middle-end/96903 - bogus warning on memcpy at negative offset from array end

gcc/ChangeLog:

PR middle-end/96903
* builtins.c (compute_objsize): Remove incorrect offset adjustment.
(compute_objsize): Adjust offset range here instead.

gcc/testsuite/ChangeLog:

PR middle-end/96903
* gcc.dg/Wstringop-overflow-42.c: Add comment.
* gcc.dg/Wstringop-overflow-43.c: New test.

4 years agoobjc++: Always pop scope with method definitions [PR97015]
Nathan Sidwell [Fri, 11 Sep 2020 15:23:32 +0000 (08:23 -0700)]
objc++: Always pop scope with method definitions [PR97015]

Syntax errors in method definition lists could leave us in a function
scope.  My recent change for block scope externs didn't like that.
This reimplements the parsing loop to finish the method definition we
started.  AFAICT the original code was attempting to provide some
error recovery.  Also while there, simply do the token peeking at the
top of the loop, rather than at the two(!) ends.

gcc/cp/
* parser.c (cp_parser_objc_method_definition_list): Reimplement
loop, make sure we pop scope.
gcc/testsuite/
* obj-c++.dg/syntax-error-9.mm: Adjust expected errors.

4 years agoc++: Remove LOOKUP_CONSTINIT.
Marek Polacek [Thu, 10 Sep 2020 23:18:34 +0000 (19:18 -0400)]
c++: Remove LOOKUP_CONSTINIT.

Since we now have DECL_DECLARED_CONSTINIT_P, we no longer need
LOOKUP_CONSTINIT.

gcc/cp/ChangeLog:

* cp-tree.h (LOOKUP_CONSTINIT): Remove.
(LOOKUP_REWRITTEN): Adjust.
* decl.c (duplicate_decls): Set DECL_DECLARED_CONSTINIT_P.
(check_initializer): Use DECL_DECLARED_CONSTINIT_P instead of
LOOKUP_CONSTINIT.
(cp_finish_decl): Don't set DECL_DECLARED_CONSTINIT_P.  Use
DECL_DECLARED_CONSTINIT_P instead of LOOKUP_CONSTINIT.
(grokdeclarator): Set DECL_DECLARED_CONSTINIT_P.
* decl2.c (grokfield): Don't handle LOOKUP_CONSTINIT.
* parser.c (cp_parser_decomposition_declaration): Remove
LOOKUP_CONSTINIT handling.
(cp_parser_init_declarator): Likewise.
* pt.c (tsubst_expr): Likewise.
(instantiate_decl): Likewise.
* typeck2.c (store_init_value): Use DECL_DECLARED_CONSTINIT_P instead
of LOOKUP_CONSTINIT.

4 years agolibstdc++: Fix build error in <bits/regex_error.h>
Jonathan Wakely [Fri, 11 Sep 2020 13:51:36 +0000 (14:51 +0100)]
libstdc++: Fix build error in <bits/regex_error.h>

libstdc++-v3/ChangeLog:

* include/bits/regex_error.h (__throw_regex_error): Fix
parameter declaration and use reserved attribute names.

4 years agolibstdc++: Avoid rounding errors on custom clocks in condition_variable
Mike Crowe [Fri, 11 Sep 2020 13:25:00 +0000 (14:25 +0100)]
libstdc++: Avoid rounding errors on custom clocks in condition_variable

The fix for PR68519 in 83fd5e73b3c16296e0d7ba54f6c547e01c7eae7b only
applied to condition_variable::wait_for. This problem can also apply to
condition_variable::wait_until but only if the custom clock is using a
more recent epoch so that a small enough delta can be calculated. Let's
use the newly-added chrono::__detail::ceil to fix this and also make use
of that function to simplify the previous wait_for fixes.

Also, simplify the existing test case for PR68519 a little and make its
variables local so we can add a new test case for the above problem.
Unfortunately, the test would only have started failing once sufficient
time had passed since the chrono::steady_clock epoch anyway, but it's
better than nothing.

libstdc++-v3/ChangeLog:

* include/std/condition_variable (condition_variable::wait_until):
Convert delta to steady_clock duration before adding to current
steady_clock time to avoid rounding errors described in PR68519.
(condition_variable::wait_for): Simplify calculation of absolute
time by using chrono::__detail::ceil in both overloads.
* testsuite/30_threads/condition_variable/members/68519.cc:
(test_wait_for): Renamed from test01. Replace unassigned val
variable with constant false. Reduce scope of mx and cv
variables to just test_wait_for function.
(test_wait_until): Add new test case.

4 years agolibstdc++: Avoid rounding errors in std::future::wait_* [PR 91486]
Mike Crowe [Fri, 11 Sep 2020 13:25:00 +0000 (14:25 +0100)]
libstdc++: Avoid rounding errors in std::future::wait_* [PR 91486]

Convert the specified duration to the target clock's duration type
before adding it to the current time in
__atomic_futex_unsigned::_M_load_when_equal_for and
_M_load_when_equal_until.  This removes the risk of the timeout being
rounded down to the current time resulting in there being no wait at all
when the duration type lacks sufficient precision to hold the
steady_clock current time.

Rather than using the style of fix from PR68519, let's expose the C++17
std::chrono::ceil function as std::chrono::__detail::ceil so that it can
be used in code compiled with earlier standards versions and simplify
the fix. This was suggested by John Salmon in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91486#c5 .
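
As a rough illustration of why rounding up matters here, the sketch below
contrasts the truncation done by std::chrono::duration_cast with a
ceil-style conversion.  The name ceil_duration is hypothetical and is not
the helper added by this commit; it only mirrors the documented behaviour
of C++17 std::chrono::ceil in a form that also compiles as C++11.

```cpp
#include <chrono>

// Hypothetical stand-in for a pre-C++17 ceil on durations (an assumption
// for illustration, not the libstdc++ implementation).
template <typename ToDur, typename Rep, typename Period>
constexpr ToDur
ceil_duration(const std::chrono::duration<Rep, Period>& d)
{
  // duration_cast truncates toward zero, so add one tick whenever the
  // truncated value ended up smaller than the input.
  return std::chrono::duration_cast<ToDur>(d) < d
           ? std::chrono::duration_cast<ToDur>(d) + ToDur{1}
           : std::chrono::duration_cast<ToDur>(d);
}
```

With truncation a timeout can land earlier than requested, which is the
"rounded down to the current time" risk described above; rounding up can
only make the wait slightly longer.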

This problem has become considerably less likely to trigger since I
switched the __atomic_futex_unsigned::__clock_t reference clock from
system_clock to steady_clock and added the loop, but the consequences of
triggering it have changed too.

By my calculations it takes just over 194 days from the epoch for the
current time not to be representable in a float. This means that
system_clock is always subject to the problem (with the standard 1970
epoch) whereas steady_clock with float duration only runs out of
resolution once the machine has been running for that long (assuming the
Linux implementation of CLOCK_MONOTONIC).
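
The 194-day figure is consistent with the 24-bit significand of an IEEE
single-precision float: every whole number of seconds up to 2^24 is exactly
representable, and 2^24 seconds is roughly 194.18 days.  The sketch below
works that out under the assumption that this is the calculation meant;
the function names are illustrative only.

```cpp
#include <cassert>

// 2^24: a float's 24-bit significand represents every integer up to this
// value exactly, but not 2^24 + 1.
constexpr double float_exact_limit_seconds = 16777216.0;

constexpr double seconds_to_days(double seconds)
{
  return seconds / 86400.0;
}

// True when advancing a float count of seconds by one second is lost to
// rounding, i.e. the sum still compares equal to the starting value.
inline bool one_second_is_lost(float now_seconds)
{
  assert(now_seconds >= 0.0f);
  // volatile forces the sum to be rounded to float precision.
  volatile float later = now_seconds + 1.0f;
  return later == now_seconds;
}
```

Past that limit, adding a short timeout to a float "now" can yield "now"
again, which is the no-wait scenario the commit message describes.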

The recently-added loop in
__atomic_futex_unsigned::_M_load_when_equal_until turns this scenario
into a busy wait.

Unfortunately the combination of both of these things means that it's
not possible to write a test case for this occurring in
_M_load_when_equal_until as it stands.

libstdc++-v3/ChangeLog:

PR libstdc++/91486
* include/bits/atomic_futex.h
(__atomic_futex_unsigned::_M_load_when_equal_for)
(__atomic_futex_unsigned::_M_load_when_equal_until): Use
__detail::ceil to convert delta to the reference clock
duration type to avoid resolution problems.
* include/std/chrono (__detail::ceil): Move implementation
of std::chrono::ceil into private namespace so that it's
available to pre-C++17 code.
* testsuite/30_threads/async/async.cc (test_pr91486):
Test __atomic_futex_unsigned::_M_load_when_equal_for.

4 years agolibstdc++: Loop when futex waits against arbitrary clock
Mike Crowe [Fri, 11 Sep 2020 13:25:00 +0000 (14:25 +0100)]
libstdc++: Loop when futex waits against arbitrary clock

If std::future::wait_until is passed a time point measured against a
clock that is neither std::chrono::steady_clock nor
std::chrono::system_clock then the generic implementation of
__atomic_futex_unsigned::_M_load_when_equal_until is called which
calculates the timeout based on __clock_t and calls the
_M_load_when_equal_until method for that clock to perform the actual
wait.

There's no guarantee that __clock_t is running at the same speed as the
caller's clock, so if the underlying wait times out we need to
check the timeout against the caller's clock again before potentially
looping.
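
A minimal sketch of that looping idea, assuming a generic waiter rather
than the real futex code: wait_until_any_clock, sleep_until_timeout and
this slow_clock (running at a tenth of steady_clock's speed, an arbitrary
choice) are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Illustrative clock that advances at a tenth of steady_clock's speed.
struct slow_clock
{
  using duration = std::chrono::steady_clock::duration;
  using rep = duration::rep;
  using period = duration::period;
  using time_point = std::chrono::time_point<slow_clock, duration>;
  static constexpr bool is_steady = true;

  static time_point now()
  {
    return time_point(std::chrono::steady_clock::now().time_since_epoch() / 10);
  }
};

// Stand-in for the underlying wait: sleeps until the reference-clock
// deadline and always reports a timeout (returns false).
inline bool sleep_until_timeout(std::chrono::steady_clock::time_point t)
{
  std::this_thread::sleep_until(t);
  return false;
}

// Waits until `deadline` on the caller's clock.  `ref_wait` receives a
// steady_clock deadline and returns true if the awaited condition was met.
template <typename Clock, typename Duration, typename RefWait>
bool wait_until_any_clock(const std::chrono::time_point<Clock, Duration>& deadline,
                          RefWait ref_wait)
{
  using ref_clock = std::chrono::steady_clock;
  for (;;)
    {
      const auto caller_now = Clock::now();
      if (caller_now >= deadline)
        return false;  // genuinely timed out on the caller's clock
      // Translate the remaining time onto the reference clock.
      const auto ref_deadline = ref_clock::now()
        + std::chrono::duration_cast<ref_clock::duration>(deadline - caller_now);
      if (ref_wait(ref_deadline))
        return true;
      // The reference clock timed out; loop to re-check the caller's clock.
    }
}
```

With a clock slower than the reference clock, the first reference-clock
timeout arrives before the caller's deadline, so the loop runs more than
once before reporting a timeout.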

Also add two extra tests to the testsuite's async.cc:

* run test03 with steady_clock_copy, which behaves identically to
  chrono::steady_clock, but isn't chrono::steady_clock. This causes
  the overload of __atomic_futex_unsigned::_M_load_when_equal_until
  that takes an arbitrary clock to be called.

* invent test04 which uses a deliberately slow running clock in order
  to exercise the looping behaviour of
  __atomic_futex_unsigned::_M_load_when_equal_until described above.

libstdc++-v3/ChangeLog:

* include/bits/atomic_futex.h
(__atomic_futex_unsigned::_M_load_when_equal_until): Add
loop on generic _Clock to check the timeout against _Clock
again after _M_load_when_equal_until returns indicating a
timeout.
* testsuite/30_threads/async/async.cc: Invent slow_clock
that runs at an eleventh of steady_clock's speed. Use it
to test the user-supplied-clock variant of
__atomic_futex_unsigned::_M_load_when_equal_until works
generally with test03 and loops correctly when the timeout
time hasn't been reached in test04.

4 years agolibstdc++: Use std::chrono::steady_clock as atomic_futex reference clock
Mike Crowe [Fri, 11 Sep 2020 13:25:00 +0000 (14:25 +0100)]
libstdc++: Use std::chrono::steady_clock as atomic_futex reference clock

The user-visible effect of this change is that std::future::wait_for now
uses std::chrono::steady_clock to determine the timeout.  This makes it
immune to changes made to the system clock.  It also means that anyone
using their own clock types with std::future::wait_until will have the
timeout converted to std::chrono::steady_clock rather than
std::chrono::system_clock.

Now that use of both std::chrono::steady_clock and
std::chrono::system_clock are correctly supported for the wait timeout, I
believe that std::chrono::steady_clock is a better choice for the reference
clock that all other clocks are converted to since it is guaranteed to
advance steadily.  The previous behaviour of converting to
std::chrono::system_clock risks timeouts changing dramatically when the
system clock is changed.

libstdc++-v3/ChangeLog:

* include/bits/atomic_futex.h (__atomic_futex_unsigned): Change
__clock_t typedef to use steady_clock so that unknown clocks are
synced to it rather than system_clock. Change existing __clock_t
overloads of _M_load_and_test_until_impl and
_M_load_when_equal_until to use system_clock explicitly. Remove
comment about DR 887 since these changes address that problem as
best as we are currently able.

4 years agolibstdc++: Support futex waiting on chrono::steady_clock directly
Mike Crowe [Fri, 11 Sep 2020 13:25:00 +0000 (14:25 +0100)]
libstdc++: Support futex waiting on chrono::steady_clock directly

The user-visible effect of this change is for std::future::wait_until to
use CLOCK_MONOTONIC when passed a timeout of std::chrono::steady_clock
type.  This makes it immune to any changes made to the system clock
CLOCK_REALTIME.

Add an overload of __atomic_futex_unsigned::_M_load_and_test_until_impl
that accepts a std::chrono::steady_clock, and correctly passes this
through to __atomic_futex_unsigned_base::_M_futex_wait_until_steady
which uses CLOCK_MONOTONIC for the timeout within the futex system call.
These functions are mostly just copies of the std::chrono::system_clock
versions with small tweaks.

Prior to this commit, a std::chrono::steady_clock timeout would be converted
via std::chrono::system_clock which risks reducing or increasing the
timeout if someone changes CLOCK_REALTIME whilst the wait is happening.
(The commit immediately prior to this one increases the window of
opportunity for that from a short period during the calculation of a
relative timeout, to the entire duration of the wait.)

FUTEX_WAIT_BITSET was added in kernel v2.6.25.  If futex reports ENOSYS
to indicate that this operation is not supported then the code falls
back to using clock_gettime(2) to calculate a relative time to wait for.

I believe that I've added this functionality in a way that it doesn't
break ABI compatibility, but that has made it more verbose and less type
safe.  I believe that it would be better to maintain the timeout as an
instance of the correct clock type all the way down to a single
_M_futex_wait_until function with an overload for each clock.  The
current scheme of separating out the seconds and nanoseconds early risks
accidentally calling the wait function for the wrong clock.
Unfortunately, doing this would break code that compiled against the old
header.

libstdc++-v3/ChangeLog:

* config/abi/pre/gnu.ver: Update for addition of
__atomic_futex_unsigned_base::_M_futex_wait_until_steady.
* include/bits/atomic_futex.h (__atomic_futex_unsigned_base):
Add comments to clarify that _M_futex_wait_until and
_M_load_and_test_until use CLOCK_REALTIME.
(__atomic_futex_unsigned_base::_M_futex_wait_until_steady)
(__atomic_futex_unsigned_base::_M_load_and_test_until_steady):
New member functions that use CLOCK_MONOTONIC.
(__atomic_futex_unsigned_base::_M_load_and_test_until_impl)
(__atomic_futex_unsigned_base::_M_load_when_equal_until): Add
overloads that accept a steady_clock time_point and use the
new member functions.
* src/c++11/futex.cc: Include headers required for
clock_gettime.
(futex_clock_monotonic_flag): New constant to tell futex to
use CLOCK_MONOTONIC to match existing futex_clock_realtime_flag.
(futex_clock_monotonic_unavailable): New global to store the
result of trying to use CLOCK_MONOTONIC.
(__atomic_futex_unsigned_base::_M_futex_wait_until_steady): Add
new variant of _M_futex_wait_until that uses CLOCK_MONOTONIC to
support waiting using steady_clock.

4 years agolibstdc++: Use FUTEX_CLOCK_REALTIME for futex wait
Mike Crowe [Fri, 11 Sep 2020 13:24:59 +0000 (14:24 +0100)]
libstdc++: Use FUTEX_CLOCK_REALTIME for futex wait

The futex system call supports waiting for an absolute time if
FUTEX_WAIT_BITSET is used rather than FUTEX_WAIT.  Doing so provides two
benefits:

1. The call to gettimeofday is not required in order to calculate a
   relative timeout.

2. If someone changes the system clock during the wait then the futex
   timeout will correctly expire earlier or later.  Currently that only
   happens if the clock is changed prior to the call to gettimeofday.

According to futex(2), support for FUTEX_CLOCK_REALTIME was added in the
v2.6.28 Linux kernel and FUTEX_WAIT_BITSET was added in v2.6.25.  To
ensure that the code still works correctly with earlier kernel versions,
an ENOSYS error from futex[1] results in the
futex_clock_realtime_unavailable flag being set.  This flag is used to
avoid the unnecessary unsupported futex call in the future and to fall
back to the previous gettimeofday and relative time implementation.

glibc applied an equivalent switch in pthread_cond_timedwait to use
FUTEX_CLOCK_REALTIME and FUTEX_WAIT_BITSET rather than FUTEX_WAIT for
glibc-2.10 back in 2009.  See
glibc:cbd8aeb836c8061c23a5e00419e0fb25a34abee7

The futex_clock_realtime_unavailable flag is accessed using
std::memory_order_relaxed to stop it becoming a bottleneck.  If the
first two calls to _M_futex_wait_until happen to occur simultaneously
then the only consequence is that both will try to use
FUTEX_CLOCK_REALTIME, both risk discovering that it doesn't work and, if
so, both set the flag.

[1] This is how glibc's nptl-init.c determines whether these flags are
    supported.
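
The probe-once-then-fall-back pattern described above can be sketched as
follows.  This is not the futex.cc code: the two callables are hypothetical
stand-ins for the absolute-time futex call and the gettimeofday-based
relative wait, and absolute_wait_unavailable plays the role of the
futex_clock_realtime_unavailable flag.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>

// Remembers that the preferred operation reported ENOSYS.  Relaxed
// ordering is enough: a race at worst repeats one failing probe.
std::atomic<bool> absolute_wait_unavailable{false};

// try_absolute: returns 0 when woken, -1 with errno set otherwise.
// relative_fallback: returns true when woken, false on timeout.
template <typename Absolute, typename Relative>
bool wait_with_fallback(Absolute try_absolute, Relative relative_fallback)
{
  if (!absolute_wait_unavailable.load(std::memory_order_relaxed))
    {
      if (try_absolute() == 0)
        return true;            // woken up
      if (errno != ENOSYS)
        return false;           // e.g. the wait timed out
      // Operation unsupported: remember that and never try it again.
      absolute_wait_unavailable.store(true, std::memory_order_relaxed);
    }
  return relative_fallback();
}
```

After the first ENOSYS, later calls skip the unsupported operation and go
straight to the fallback.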

libstdc++-v3/ChangeLog:

* src/c++11/futex.cc: Add new constants for required futex
flags.  Add futex_clock_realtime_unavailable flag to store
result of trying to use FUTEX_CLOCK_REALTIME.
(__atomic_futex_unsigned_base::_M_futex_wait_until): Try to
use FUTEX_WAIT_BITSET with FUTEX_CLOCK_REALTIME and only
fall back to using gettimeofday and FUTEX_WAIT if that's not
supported.

4 years agolibstdc++: Improve std::async test
Mike Crowe [Fri, 11 Sep 2020 13:24:59 +0000 (14:24 +0100)]
libstdc++: Improve std::async test

Add tests for waiting for the future using both chrono::steady_clock and
chrono::system_clock in preparation for dealing with those clocks
properly in futex.cc.

libstdc++-v3/ChangeLog:

* testsuite/30_threads/async/async.cc (test02): Test steady_clock
with std::future::wait_until.
(test03): Add new test templated on clock type waiting for future
associated with async to resolve.
(main): Call test03 to test both system_clock and steady_clock.

4 years agolibstdc++-v3/libsupc++/eh_call.cc: Avoid "set but not used" warning
Christophe Lyon [Fri, 11 Sep 2020 12:07:02 +0000 (12:07 +0000)]
libstdc++-v3/libsupc++/eh_call.cc: Avoid "set but not used" warning

When building with -fno-exceptions, bad_exception_allowed is set but
not used, causing a warning during the build.

This patch adds __attribute__((unused)) to avoid it.

2020-09-11  Torbjörn SVENSSON  <torbjorn.svensson@st.com>
    Christophe Lyon  <christophe.lyon@linaro.org>

libstdc++-v3/
* libsupc++/eh_call.cc: Avoid warning with -fno-exceptions.

4 years agolibstdc++-v3/libsupc++/eh_call.cc: Avoid warning with -fno-exceptions.
Christophe Lyon [Fri, 11 Sep 2020 09:21:55 +0000 (09:21 +0000)]
libstdc++-v3/libsupc++/eh_call.cc: Avoid warning with -fno-exceptions.

When building with -fno-exceptions, __throw_exception_again expands to
nothing, causing a "suggest braces around empty body in an 'if'
statement" warning.

This patch adds braces, like what was done in eh_personality.cc in svn
r193295 (git g:54ba39f599fc2f3d59fd3cd828a301ce9b731a20)

2020-09-11  Torbjörn SVENSSON  <torbjorn.svensson@st.com>
    Christophe Lyon  <christophe.lyon@linaro.org>

libstdc++-v3/
* libsupc++/eh_call.cc: Avoid warning with -fno-exceptions.

4 years agolibstdc++-v3/include/bits/regex_error.h: Avoid warning with -fno-exceptions.
Christophe Lyon [Fri, 11 Sep 2020 11:53:15 +0000 (11:53 +0000)]
libstdc++-v3/include/bits/regex_error.h: Avoid warning with -fno-exceptions.

When building with -fno-exceptions, _GLIBCXX_THROW_OR_ABORT expands to
abort(), causing warnings:
unused parameter '__ecode'
unused parameter '__what'

This patch adds __attribute__((unused)) to avoid them.
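
The situation can be reproduced in miniature as below.  SKETCH_THROW_OR_ABORT,
sketch_error and throw_error_sketch are hypothetical names that only imitate
the shape of the real macro and function; the attribute syntax is the
GCC/Clang extension, spelled with the reserved __unused__ form.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>

// Imitation of a throw-or-abort macro: the argument is discarded entirely
// when exceptions are disabled.
#if defined(__cpp_exceptions) || defined(__EXCEPTIONS)
# define SKETCH_THROW_OR_ABORT(exc) (throw (exc))
#else
# define SKETCH_THROW_OR_ABORT(exc) (std::abort())
#endif

struct sketch_error : std::runtime_error
{
  int code;
  sketch_error(int c, const char* w) : std::runtime_error(w), code(c) { }
};

// Without the attributes, -fno-exceptions builds would warn that both
// parameters are unused, because the macro no longer mentions them.
[[noreturn]] inline void
throw_error_sketch(int code __attribute__((__unused__)),
                   const char* what __attribute__((__unused__)))
{
  SKETCH_THROW_OR_ABORT(sketch_error(code, what));
}
```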

2020-09-11  Torbjörn SVENSSON <torbjorn.svensson@st.com>
    Christophe Lyon  <christophe.lyon@linaro.org>

libstdc++-v3/
* include/bits/regex_error.h: Avoid warning with -fno-exceptions.

4 years agotree-optimization/97020 - account SLP cost in loop vect again
Richard Biener [Fri, 11 Sep 2020 11:51:58 +0000 (13:51 +0200)]
tree-optimization/97020 - account SLP cost in loop vect again

The previous re-org made the cost of SLP vector stmts in loop
vectorization ignored.  The following rectifies this mistake.

2020-09-11  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97020
* tree-vect-slp.c (vect_slp_analyze_operations): Apply
SLP costs when doing loop vectorization.

4 years agotestsuite: gimplefe-44 requires exceptions
Andrew Stubbs [Thu, 10 Sep 2020 13:58:15 +0000 (14:58 +0100)]
testsuite: gimplefe-44 requires exceptions

This avoids an ICE on amdgcn.

gcc/testsuite/ChangeLog:

* gcc.dg/gimplefe-44.c: Require exceptions.

4 years agolibgccjit: Add new gcc_jit_global_set_initializer entry point
Andrea Corallo [Sat, 30 May 2020 09:33:08 +0000 (10:33 +0100)]
libgccjit: Add new gcc_jit_global_set_initializer entry point

gcc/jit/ChangeLog

2020-08-01  Andrea Corallo  <andrea.corallo@arm.com>

* docs/topics/compatibility.rst (LIBGCCJIT_ABI_14): New ABI tag.
* docs/topics/expressions.rst (gcc_jit_global_set_initializer):
Document new entry point in section 'Global variables'.
* jit-playback.c (global_new_decl, global_finalize_lvalue): New
method.
(playback::context::new_global): Make use of global_new_decl,
global_finalize_lvalue.
(load_blob_in_ctor): New template function in use by the
following.
(playback::context::new_global_initialized): New method.
* jit-playback.h (class context): Decl 'new_global_initialized',
'global_new_decl', 'global_finalize_lvalue'.
(lvalue::set_initializer): Add implementation.
* jit-recording.c (recording::memento_of_get_pointer::get_size)
(recording::memento_of_get_type::get_size): Add implementation.
(recording::global::write_initializer_reproducer): New function in
use by 'recording::global::write_reproducer'.
(recording::global::replay_into)
(recording::global::write_to_dump)
(recording::global::write_reproducer): Handle
initialized case.
* jit-recording.h (class type): Decl 'get_size' and
'num_elements'.
* libgccjit++.h (class lvalue): Declare new 'set_initializer'
method.
(class lvalue): Decl 'is_global' and 'set_initializer'.
(class global): Decl 'write_initializer_reproducer'. Add
'm_initializer', 'm_initializer_num_bytes' fields.  Implement
'set_initializer'. Add a destructor to free 'm_initializer'.
* libgccjit.c (gcc_jit_global_set_initializer): New function.
* libgccjit.h (gcc_jit_global_set_initializer): New function
declaration.
* libgccjit.map (LIBGCCJIT_ABI_14): New ABI tag.

gcc/testsuite/ChangeLog

2020-08-01  Andrea Corallo  <andrea.corallo@arm.com>

* jit.dg/all-non-failing-tests.h: Add test-blob.c.
* jit.dg/test-global-set-initializer.c: New testcase.

4 years ago[libatomic] Add nvptx support
Tom de Vries [Mon, 7 Sep 2020 08:47:25 +0000 (10:47 +0200)]
[libatomic] Add nvptx support

Add nvptx support to libatomic.

Given that atomic_test_and_set is not implemented for nvptx (PR96964), the
compiler expands __atomic_test_and_set by falling back onto the "Failing all
else, assume a single threaded environment and simply perform the operation"
case in expand_atomic_test_and_set, so it doesn't map onto an actual atomic
operation.

Still, that counts as supported for the configure test of libatomic, so we
end up with HAVE_ATOMIC_TAS_1/2/4/8/16 == 1, and the corresponding
__atomic_test_and_set_1/2/4/8/16 in libatomic all using that non-atomic
implementation.

Fix this by adding an atomic_test_and_set expansion for nvptx, that uses
libatomic's __atomic_test_and_set_1.

This again makes the configure tests for HAVE_ATOMIC_TAS_1/2/4/8/16 fail, so
instead we use this case in tas_n.c:
...
/* If this type is smaller than word-sized, fall back to a word-sized
   compare-and-swap loop.  */
bool
SIZE(libat_test_and_set) (UTYPE *mptr, int smodel)
...
which for __atomic_test_and_set_8 uses INVERT_MASK_8.

Add INVERT_MASK_8 in libatomic_i.h, as well as MASK_8.
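
For readers unfamiliar with the technique, a word-sized compare-and-swap
loop for a sub-word test-and-set can be sketched roughly as below.  This is
not the tas_n.c code: byte_test_and_set is a hypothetical helper, the
32-bit word size is an assumption, and the byte is simply set to 1.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Sets one byte inside a 32-bit word using only a word-sized CAS and
// reports whether that byte was already non-zero.  byte_index counts
// from the least significant byte.
inline bool byte_test_and_set(std::atomic<std::uint32_t>& word,
                              unsigned byte_index)
{
  assert(byte_index < 4);
  const unsigned shift = byte_index * 8;
  const std::uint32_t mask = std::uint32_t{0xff} << shift;
  const std::uint32_t set_bits = std::uint32_t{1} << shift;

  std::uint32_t old_word = word.load(std::memory_order_relaxed);
  std::uint32_t new_word;
  do
    // Keep the neighbouring bytes, replace only the target byte.
    new_word = (old_word & ~mask) | set_bits;
  while (!word.compare_exchange_weak(old_word, new_word,
                                     std::memory_order_seq_cst,
                                     std::memory_order_relaxed));

  return (old_word & mask) != 0;
}
```

The mask and its inverse play the part that MASK_n and INVERT_MASK_n play
in libatomic: they isolate the target bytes within the wider word.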

Tested libatomic testsuite on nvptx.

gcc/ChangeLog:

PR target/96964
* config/nvptx/nvptx.md (define_expand "atomic_test_and_set"): New
expansion.

libatomic/ChangeLog:

PR target/96898
* configure.tgt: Add nvptx.
* libatomic_i.h (MASK_8, INVERT_MASK_8): New macro definition.
* config/nvptx/host-config.h: New file.
* config/nvptx/lock.c: New file.

4 years agoamdgcn: align TImode registers
Andrew Stubbs [Thu, 10 Sep 2020 09:10:32 +0000 (10:10 +0100)]
amdgcn: align TImode registers

This prevents execution failures caused by partially overlapping input and
output registers.  This is the same solution already used for DImode.

gcc/ChangeLog:

* config/gcn/gcn.c (gcn_hard_regno_mode_ok): Align TImode registers.
* config/gcn/gcn.md: Assert that TImode registers do not early clobber.

4 years agoimprove BB vectorization dump locations
Richard Biener [Fri, 11 Sep 2020 07:57:18 +0000 (09:57 +0200)]
improve BB vectorization dump locations

This tries to improve BB vectorization dumps by providing more
precise locations.  Currently the vect_location is simply the
very last stmt in a basic-block that has a location.  So for

double a[4], b[4];
int x[4], y[4];
void foo()
{
  a[0] = b[0]; // line 5
  a[1] = b[1];
  a[2] = b[2];
  a[3] = b[3];
  x[0] = y[0]; // line 9
  x[1] = y[1];
  x[2] = y[2];
  x[3] = y[3];
} // line 13

we show the user with -O3 -fopt-info-vec

t.c:13:1: optimized: basic block part vectorized using 16 byte vectors

while with the patch we point to both independently vectorized
opportunities:

t.c:5:8: optimized: basic block part vectorized using 16 byte vectors
t.c:9:8: optimized: basic block part vectorized using 16 byte vectors

There's the possibility that the location regresses in case the
root stmt in the SLP instance has no location.  For an SLP subgraph
with multiple entries the location is also that of one arbitrarily
chosen entry; it is not clear in which cases we would want to dump both.

Still as the plan is to extend the basic-block vectorization
scope from single basic-block to multiple ones this is a first
step to preserve something sensible.

Implementation-wise this makes both costing and code-generation
happen on the subgraphs as analyzed.

2020-09-11  Richard Biener  <rguenther@suse.de>

* tree-vectorizer.h (_slp_instance::location): New method.
(vect_schedule_slp): Adjust prototype.
* tree-vectorizer.c (vec_info::remove_stmt): Adjust
the BB region begin if we removed the stmt it points to.
* tree-vect-loop.c (vect_transform_loop): Adjust.
* tree-vect-slp.c (_slp_instance::location): Implement.
(vect_analyze_slp_instance): For BB vectorization set
vect_location to that of the instance.
(vect_slp_analyze_operations): Likewise.
(vect_bb_vectorization_profitable_p): Remove wrapper.
(vect_slp_analyze_bb_1): Remove cost check here.
(vect_slp_region): Cost check and code generate subgraphs separately,
report optimized locations and missed optimizations due to
profitability for each of them.
(vect_schedule_slp): Get the vector of SLP graph entries to
vectorize as argument.

4 years agoFix ICE on nested packed variant record type
Eric Botcazou [Fri, 11 Sep 2020 09:14:49 +0000 (11:14 +0200)]
Fix ICE on nested packed variant record type

This is a regression present on the mainline and 10 branch: the compiler
aborts on code accessing a component of a packed record type whose type
is a packed discriminated record type with variant part.

gcc/ada/ChangeLog:
* gcc-interface/utils.c (type_has_variable_size): New function.
(create_field_decl): In the packed case, also force byte alignment
when the type of the field has variable size.

gcc/testsuite/ChangeLog:
* gnat.dg/pack27.adb: New test.
* gnat.dg/pack27_pkg.ads: New helper.

4 years agoAdd missing stride entry in debug info
Eric Botcazou [Fri, 11 Sep 2020 09:13:54 +0000 (11:13 +0200)]
Add missing stride entry in debug info

This adds a missing stride entry for bit-packed arrays of record types.

gcc/ada/ChangeLog:
* gcc-interface/misc.c (get_array_bit_stride): Return TYPE_ADA_SIZE
for record and union types.

4 years agoDrop GNAT encodings for fixed-point types
Eric Botcazou [Fri, 11 Sep 2020 08:54:11 +0000 (10:54 +0200)]
Drop GNAT encodings for fixed-point types

GDB can now deal with the DWARF representation just fine.

gcc/ada/ChangeLog:
* gcc-interface/misc.c (gnat_get_fixed_point_type): Bail out only
when the GNAT encodings are specifically used.

4 years agoFix crash on array component with nonstandard index type
Eric Botcazou [Fri, 11 Sep 2020 08:41:28 +0000 (10:41 +0200)]
Fix crash on array component with nonstandard index type

This is a regression present on mainline, 10 and 9 branches: the compiler
goes into an infinite recursion eventually exhausting the stack for the
declaration of a discriminated record type with an array component having
a discriminant as bound and an index type that is an enumeration type with
a non-standard representation clause.

gcc/ada/ChangeLog:
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Array_Subtype>: Only
create extra subtypes for discriminants if the RM size of the base
type of the index type is lower than that of the index type.

gcc/testsuite/ChangeLog:
* gnat.dg/specs/discr7.ads: New test.

4 years agoAdjust email address
Eric Botcazou [Fri, 11 Sep 2020 08:16:17 +0000 (10:16 +0200)]
Adjust email address

4 years agoAdjust email address
Eric Botcazou [Fri, 11 Sep 2020 08:12:28 +0000 (10:12 +0200)]
Adjust email address

4 years agoAdjust email address
Eric Botcazou [Fri, 11 Sep 2020 08:09:59 +0000 (10:09 +0200)]
Adjust email address

4 years agotree-optimization/97013 - avoid duplicate 'vectorization is not profitable'
Richard Biener [Fri, 11 Sep 2020 06:59:58 +0000 (08:59 +0200)]
tree-optimization/97013 - avoid duplicate 'vectorization is not profitable'

This avoids dumping 'vectorization is not profitable' one more time
if none of the opportunities in a BB is profitable.

2020-09-11  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97013
* tree-vect-slp.c (vect_slp_analyze_bb_1): Remove duplicate dumping.

4 years agorandom vectorizer fixes
Richard Biener [Thu, 10 Sep 2020 14:23:29 +0000 (16:23 +0200)]
random vectorizer fixes

This fixes random things found when doing SLP discovery from
arbitrary sets of stmts.

2020-09-10  Richard Biener  <rguenther@suse.de>

* tree-vect-slp.c (vect_build_slp_tree_1): Check vector
types for all lanes are compatible.
(vect_analyze_slp_instance): Appropriately check for stores.
(vect_schedule_slp): Likewise.

4 years ago[nvptx] Fix UB in nvptx_assemble_value
Tom de Vries [Fri, 11 Sep 2020 05:13:25 +0000 (07:13 +0200)]
[nvptx] Fix UB in nvptx_assemble_value

When nvptx_assemble_value is called with size == 16, this bitshift runs
into UB:
...
  val &= ((unsigned  HOST_WIDE_INT)2 << (size * BITS_PER_UNIT - 1)) - 1;
...

Fix this by checking the shift amount.
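
The hazard is that shifting a 64-bit integer by 64 bits or more is
undefined.  With size == 8 the expression above shifts by 63 and wraps to
the intended all-ones mask, but size == 16 asks for a shift of 127.  A
minimal sketch of a guarded mask, assuming a 64-bit value type and using
the hypothetical name low_bytes_mask (this is not the committed patch):

```cpp
#include <climits>
#include <cstdint>

// Mask that keeps the low `size` bytes of a 64-bit value.  The shift is
// only performed when the shift amount is smaller than the type width.
constexpr std::uint64_t low_bytes_mask(unsigned size)
{
  return size * CHAR_BIT >= 64
           ? ~std::uint64_t{0}
           : (std::uint64_t{1} << (size * CHAR_BIT)) - 1;
}
```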

Tested on nvptx.

gcc/ChangeLog:

* config/nvptx/nvptx.c (nvptx_assemble_value): Fix undefined
behaviour.