review.tizen.org Git - platform/upstream/gcc.git/log

testsuite: Uncomment __cpp_consteval test

The __cpp_consteval macro and corresponding test were initially
commented out because the consteval support lacked virtual consteval
method support. The r11-1789-ge6321c4508b2a85c21246c1c06a8208e2a151e48
change enabled the macro but didn't enable the corresponding test.
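
As a rough illustration of what a feature-test macro check looks like in
user code (this is not the contents of feat-cxx2a.C; the SKETCH_ and
sketch_ names are made up for the example), a translation unit can key
its use of consteval on __cpp_consteval:

```cpp
#include <cassert>

// Use consteval only when the compiler advertises it through the
// feature-test macro; otherwise fall back to constexpr so the sketch
// still compiles with older language modes.
#if defined(__cpp_consteval)
#  define SKETCH_IMMEDIATE consteval
#else
#  define SKETCH_IMMEDIATE constexpr
#endif

// Hypothetical helper, not part of the GCC testsuite.
SKETCH_IMMEDIATE int sketch_square(int x) { return x * x; }

// Reports whether the macro was defined for this translation unit.
constexpr bool sketch_has_consteval() {
#if defined(__cpp_consteval)
  return true;
#else
  return false;
#endif
}
```

With the macro enabled, sketch_square becomes an immediate function;
without it, the same call still works as an ordinary constexpr call.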

2021-06-10 Jakub Jelinek <jakub@redhat.com>

* g++.dg/cpp2a/feat-cxx2a.C: Uncomment __cpp_consteval test.
* g++.dg/cpp23/feat-cxx2b.C: Likewise.

testsuite: Fix up libgomp.fortran/pr100981-2.f90 testcase [PR100981]

The dsdotr and dsdoti variables are uninitialized and the testcase fails e.g.
on i686-linux. Fixed by zero initialization.

2021-06-10 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/100981
* testsuite/libgomp.fortran/pr100981-2.f90 (cdcdot): Initialize
dsdotr and dsdoti to 0.

ifcvt: Fix -fcompare-debug bug [PR100852]

The following testcase fails -fcompare-debug, because it is ifcvt optimized
into umin only with -g0 and not with -g - the function(s) use
prev_nonnote_insn, which without -g finds a real insn the code is looking
for, while with -g finds a DEBUG_INSN.
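
The difference between the two walks can be sketched with a toy model.
The types and functions below are simplified stand-ins invented for this
illustration; they are not GCC's RTL data structures or the real
prev_nonnote_insn / prev_nonnote_nondebug_insn implementations.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for insn kinds; not GCC's real RTL representation.
enum class ToyKind { Note, Debug, Real };

// Walk backwards from index `from`, skipping notes only
// (a simplified analogue of prev_nonnote_insn). Returns -1 if none.
inline int toy_prev_nonnote(const std::vector<ToyKind>& insns, int from) {
  for (int i = from - 1; i >= 0; --i)
    if (insns[i] != ToyKind::Note) return i;
  return -1;
}

// Same walk, but also skipping debug insns
// (a simplified analogue of prev_nonnote_nondebug_insn).
inline int toy_prev_nonnote_nondebug(const std::vector<ToyKind>& insns,
                                     int from) {
  for (int i = from - 1; i >= 0; --i)
    if (insns[i] != ToyKind::Note && insns[i] != ToyKind::Debug) return i;
  return -1;
}

// The same instruction stream without (-g0) and with (-g) a debug insn.
inline std::vector<ToyKind> toy_stream(bool with_debug) {
  std::vector<ToyKind> v;
  v.push_back(ToyKind::Real);                   // insn the pass looks for
  if (with_debug) v.push_back(ToyKind::Debug);  // only present with -g
  v.push_back(ToyKind::Note);
  v.push_back(ToyKind::Real);                   // starting point of the walk
  return v;
}
```

In the -g stream the notes-only walk stops at the debug insn, so any
decision based on it differs from the -g0 build; the second walk lands
on the same real insn in both streams.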

2021-06-10 Jakub Jelinek <jakub@redhat.com>

PR debug/100852
* ifcvt.c (noce_get_alt_condition, noce_try_abs): Use
prev_nonnote_nondebug_insn instead of prev_nonnote_insn.

* g++.dg/opt/pr100852.C: New test.

aix: Power10 assembler invocation.

gcc/ChangeLog:

2021-06-09 Clement Chigot <clement.chigot@atos.net>

* config/rs6000/aix71.h (ASM_CPU_SPEC): Add Power10 directive.
* config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.

Daily bump.

analyzer: make various region_model member functions const

gcc/analyzer/ChangeLog:
* region-model.cc (region_model::get_lvalue_1): Make const.
(region_model::get_lvalue): Likewise.
(region_model::get_rvalue_1): Likewise.
(region_model::get_rvalue): Likewise.
(region_model::deref_rvalue): Likewise.
(region_model::get_rvalue_for_bits): Likewise.
* region-model.h (region_model::get_lvalue): Likewise.
(region_model::get_rvalue): Likewise.
(region_model::deref_rvalue): Likewise.
(region_model::get_rvalue_for_bits): Likewise.
(region_model::get_lvalue_1): Likewise.
(region_model::get_rvalue_1): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

libstdc++: Only support atomic_ref::wait tests which are always lockfree

Fixes a regression on arm32 targets.

libstdc++-v3/ChangeLog:
* testsuite/29_atomics/atomic_ref/wait_notify.cc: Guard
test logic with constexpr check for is_always_lock_free.

Fix PR 100925: Limit some a?CST1:CST2 optimizations to integral types only

The problem here is that with offset (and pointer) types we produce
a negative expression when this optimization hits.
It is easier to disable this optimization for all non-integral types
instead of finding an integer type which has the same precision as the
type to do the negative expression on it.
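
For orientation, one identity of this family on a plain integer type is
sketched below. This is an assumption-laden illustration of "producing a
negative expression", not the match.pd patterns themselves, and the
sketch_ names are invented for the example.

```cpp
#include <cassert>

// Straightforward form: select between two integer constants.
inline int sketch_select(bool a) { return a ? -1 : 0; }

// Branchless form: negate the condition converted to an integer.
// This relies on negation being meaningful in the result type, which
// is the part that does not carry over to pointer or offset types.
inline int sketch_negated(bool a) { return -static_cast<int>(a); }
```

Both functions return the same value for either input, which is why the
rewrite is safe when the result type is integral.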

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/100925
* match.pd (a ? CST1 : CST2): Limit transformations
that would produce a negative to integral types only.
Change !POINTER_TYPE_P to INTEGRAL_TYPE_P also.

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr100925.C: New test.

Revert "Finish last change"

This reverts commit 4af4d9a458b9baa019fe15d37d95306ab1f0f2e4.

Finish last change

gcc/
* doc/tm.texi: Correctly update.

Update doc/tm.texi.in to fix commit 4a0c4eaea32

PR other/100735
* doc/tm.texi.in (Trampolines): Add a missing blank line.

d: TypeInfo error when using slice copy on Structs (PR100964)

Known limitation: does not work for struct with postblit or dtor.

Reviewed-on: https://github.com/dlang/dmd/pull/12648

gcc/d/ChangeLog:

PR d/100964
* dmd/MERGE: Merge upstream dmd 4a4e46a6f.

d: Respect explicit align(N) type alignment (PR100935)

It was previously the natural type alignment, defined as the maximum of
the field alignments for an aggregate. Make sure an explicit align(N)
overrides it.

Reviewed-on: https://github.com/dlang/dmd/pull/12646

gcc/d/ChangeLog:

PR d/100935
* dmd/MERGE: Merge upstream dmd f3fdeb578.

libgomp: Compile tests with -march=i486 only if needed

Don't add -march=i486 if atomic compare-and-swap is supported on 'int'.
This fixes libgomp tests with "-march=x86-64 -m32 -fcf-protection".

* testsuite/lib/libgomp.exp (libgomp_init): Don't add -march=i486
if atomic compare-and-swap is supported on 'int'.

Document that -fno-trampolines is for Ada only [PR100735]

gcc/
PR other/100735
* doc/invoke.texi (Code Gen Options): Document that -fno-trampolines
and -ftrampolines work only with Ada.
* doc/tm.texi.in (Trampolines): Likewise.
* doc/tm.texi: Regenerated.

RS6000 Add 128-bit Binary Integer sign extend operations

This patch adds the 128-bit sign extension instruction support and
corresponding builtin support.

2021-06-08  Carl Love  <cel@us.ibm.com>

gcc/ChangeLog

* config/rs6000/altivec.h (vec_signextll, vec_signexti, vec_signextq):
Add define for new builtins.
* config/rs6000/altivec.md (altivec_vreveti2): Add define_expand.
* config/rs6000/rs6000-builtin.def (VSIGNEXTI, VSIGNEXTLL):  Add
overloaded builtin definitions.
(VSIGNEXTSB2W, VSIGNEXTSH2W, VSIGNEXTSB2D, VSIGNEXTSH2D, VSIGNEXTSW2D,
VSIGNEXTSD2Q): Add builtin expansions.
(SIGNEXT): Add P10 overload definition.
* config/rs6000/rs6000-call.c (P9V_BUILTIN_VEC_VSIGNEXTI, P9V_BUILTIN_VEC_VSIGNEXTLL,
P10_BUILTIN_VEC_SIGNEXT): Add overloaded argument definitions.
* config/rs6000/vsx.md (vsx_sign_extend_v2di_v1ti): Add define_insn.
(vsignextend_v2di_v1ti, vsignextend_qi_<mode>, vsignextend_hi_<mode>,
vsignextend_si_v2di)[VIlong]: Add define_expand.
Make define_insn vsx_sign_extend_si_v2di visible.
* doc/extend.texi:  Add documentation for the vec_signexti,
vec_signextll builtins and vec_signextq.

gcc/testsuite/ChangeLog

* gcc.target/powerpc/int_128bit-runnable.c (extsd2q): Update expected
count.
Add tests for vec_signextq.
* gcc.target/powerpc/p9-sign_extend-runnable.c:  New test case.

Conversions between 128-bit integer and floating point values.

The files fixkfti-sw.c and fixunskfti-sw.c are renamed versions of
fixkfti.c and fixunskfti.c respectively to do the conversions in software.
The function names in the files were updated with the rename, along with
some whitespace fixes. The file float128-p10.c contains the functions
for using the ISA 3.1 hardware instructions to perform the conversions.
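
The software-or-hardware selection can be pictured as choosing a
function pointer once, based on a capability flag. The sketch below is a
deliberately simplified model: the names are hypothetical, double and
long long stand in for the 128-bit types, and it does not use the real
ifunc machinery or the SW_OR_HW_ISA3_1 macro.

```cpp
#include <cassert>

// Signature shared by both hypothetical implementations.
typedef long long (*sketch_conv_fn)(double);

// Stand-in for the software routine (the *_sw flavour).
inline long long sketch_fix_sw(double x) { return static_cast<long long>(x); }

// Stand-in for the routine that would use hardware instructions
// (the *_hw flavour); it simply truncates here as well.
inline long long sketch_fix_hw(double x) { return static_cast<long long>(x); }

// Resolver: pick an implementation from a capability flag. In the real
// library this decision is made from the CPU features at load time.
inline sketch_conv_fn sketch_resolve(bool has_isa_3_1) {
  return has_isa_3_1 ? sketch_fix_hw : sketch_fix_sw;
}
```

Callers only ever see the unsuffixed entry point; which body runs is
decided by the resolver.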

2021-06-08 Carl Love <cel@us.ibm.com>

gcc/ChangeLog

* config/rs6000/rs6000.c (__fixkfti, __fixunskfti, __floattikf,
__floatuntikf): Names changed to __fixkfti_sw, __fixunskfti_sw,
__floattikf_sw, __floatuntikf_sw respectively.
* config/rs6000/rs6000.md (floatti<mode>2, floatunsti<mode>2,
fix_trunc<mode>ti2, fixuns_trunc<mode>ti2): Add
define_insn for mode IEEE 128.

gcc/testsuite/ChangeLog

* gcc.target/powerpc/fp128_conversions.c: New file.
* gcc.target/powerpc/int_128bit-runnable.c (vextsd2q,
vcmpuq, vcmpsq, vcmpequq, vcmpequq., vcmpgtsq, vcmpgtsq.,
vcmpgtuq, vcmpgtuq.): Update scan-assembler-times.
(ppc_native_128bit): Remove dg-require-effective-target.

libgcc/ChangeLog

* config.host: Add if test and set for
libgcc_cv_powerpc_3_1_float128_hw.
* config/rs6000/fixkfti.c: Renamed to fixkfti-sw.c.
Change calls of __fixkfti to __fixkfti_sw.
* config/rs6000/fixunskfti.c: Renamed to fixunskfti-sw.c.
Change calls of __fixunskfti to __fixunskfti_sw.
* config/rs6000/float128-p10.c (__floattikf_hw,
__floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw): New file.
* config/rs6000/float128-ifunc.c (SW_OR_HW_ISA3_1): New macro.
(__floattikf_resolve, __floatuntikf_resolve, __fixkfti_resolve,
__fixunskfti_resolve): Add resolve functions.
(__floattikf, __floatuntikf, __fixkfti, __fixunskfti): New functions.
* config/rs6000/float128-sed (floattitf, __floatuntitf,
__fixtfti, __fixunstfti): Add editor commands to change names.
* config/rs6000/float128-sed-hw (__floattitf,
__floatuntitf, __fixtfti, __fixunstfti): Add editor commands to
change names.
* config/rs6000/floattikf.c: Renamed to floattikf-sw.c.
* config/rs6000/floatuntikf.c: Renamed to floatuntikf-sw.c.
* config/rs6000/quad-float128.h (__floattikf_sw,
__floatuntikf_sw, __fixkfti_sw, __fixunskfti_sw, __floattikf_hw,
__floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw, __floattikf,
__floatuntikf, __fixkfti, __fixunskfti): New extern declarations.
* config/rs6000/t-float128 (floattikf, floatuntikf,
fixkfti, fixunskfti): Remove file names from fp128_ppc_funcs.
(floattikf-sw, floatuntikf-sw, fixkfti-sw, fixunskfti-sw): Add
file names to fp128_ppc_funcs.
* config/rs6000/t-float128-hw (fp128_3_1_hw_funcs,
fp128_3_1_hw_src, fp128_3_1_hw_static_obj, fp128_3_1_hw_shared_obj,
fp128_3_1_hw_obj): Add variables for ISA 3.1 support.
* config/rs6000/t-float128-p10-hw: New file.
* configure: Update script for isa 3.1 128-bit float support.
* configure.ac: Add check for 128-bit float hardware support.

rs6000, Add test 128-bit shifts for just the int128 type.

This patch also renames and moves the VSX_TI iterator from vsx.md to
VEC_TI in vector.md. The uses of VEC_TI are also updated.

2021-04-29 Carl Love <cel@us.ibm.com>

gcc/ChangeLog

* config/rs6000/altivec.md (altivec_vslq, altivec_vsrq):
Rename to altivec_vslq_<mode>, altivec_vsrq_<mode>, mode VEC_TI.
* config/rs6000/vector.md (VEC_TI): Was named VSX_TI in vsx.md.
(vashlv1ti3): Change to vashl<mode>3, mode VEC_TI.
(vlshrv1ti3): Change to vlshr<mode>3, mode VEC_TI.
* config/rs6000/vsx.md (VSX_TI): Remove define_mode_iterator. Update
uses of VSX_TI to VEC_TI.

gcc/testsuite/ChangeLog

* gcc.target/powerpc/int_128bit-runnable.c: Add shift_right, shift_left
tests.

Add 128-bit int to 128-bit DFP (floattitd2) and 128-bit DFP to 128-bit int (fixtdti2) support

2021-06-08 Carl Love <cel@us.ibm.com>

gcc/ChangeLog

* config/rs6000/dfp.md (floattitd2, fixtdti2): New define_insns.

gcc/testsuite/ChangeLog

* gcc.target/powerpc/int_128bit-runnable.c: Add 128-bit DFP
conversion tests.

RS6000 add 128-bit Integer Operations part 1

2021-06-07 Carl Love <cel@us.ibm.com>

gcc/ChangeLog

* config/rs6000/altivec.h (vec_dive, vec_mod): Add define for new
builtins.
* config/rs6000/altivec.md (UNSPEC_VMULEUD, UNSPEC_VMULESD,
UNSPEC_VMULOUD, UNSPEC_VMULOSD): New unspecs.
(altivec_eqv1ti, altivec_gtv1ti, altivec_gtuv1ti, altivec_vmuleud,
altivec_vmuloud, altivec_vmulesd, altivec_vmulosd, altivec_vrlq,
altivec_vrlqmi, altivec_vrlqmi_inst, altivec_vrlqnm,
altivec_vrlqnm_inst, altivec_vslq, altivec_vsrq, altivec_vsraq,
altivec_vcmpequt_p, altivec_vcmpgtst_p, altivec_vcmpgtut_p): New
define_insn.
(vec_widen_umult_even_v2di, vec_widen_smult_even_v2di,
vec_widen_umult_odd_v2di, vec_widen_smult_odd_v2di, altivec_vrlqmi,
altivec_vrlqnm): New define_expands.
* config/rs6000/rs6000-builtin.def (VCMPEQUT_P, VCMPGTST_P,
VCMPGTUT_P): Add macro expansions.
(BU_P10V_AV_P): Add builtin predicate definition.
(VCMPGTUT, VCMPGTST, VCMPEQUT, CMPNET, CMPGE_1TI,
CMPGE_U1TI, CMPLE_1TI, CMPLE_U1TI, VNOR_V1TI_UNS, VNOR_V1TI, VCMPNET_P,
VCMPAET_P, VMULEUD, VMULESD, VMULOUD, VMULOSD, VRLQ,
VSLQ, VSRQ, VSRAQ, VRLQNM, DIV_V1TI, UDIV_V1TI, DIVES_V1TI, DIVEU_V1TI,
MODS_V1TI, MODU_V1TI, VRLQMI): New macro expansions.
(VRLQ, VSLQ, VSRQ, VSRAQ, DIVE, MOD): New overload expansions.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VCMPEQUT,
P10V_BUILTIN_CMPGE_1TI, P10V_BUILTIN_CMPGE_U1TI,
P10V_BUILTIN_VCMPGTUT, P10V_BUILTIN_VCMPGTST,
P10V_BUILTIN_CMPLE_1TI, P10V_BUILTIN_VCMPLE_U1TI,
P10V_BUILTIN_DIV_V1TI, P10V_BUILTIN_UDIV_V1TI,
P10V_BUILTIN_VMULESD, P10V_BUILTIN_VMULEUD,
P10V_BUILTIN_VMULOSD, P10V_BUILTIN_VMULOUD,
P10V_BUILTIN_VNOR_V1TI, P10V_BUILTIN_VNOR_V1TI_UNS,
P10V_BUILTIN_VRLQ, P10V_BUILTIN_VRLQMI,
P10V_BUILTIN_VRLQNM, P10V_BUILTIN_VSLQ,
P10V_BUILTIN_VSRQ, P10V_BUILTIN_VSRAQ,
P10V_BUILTIN_VCMPGTUT_P, P10V_BUILTIN_VCMPGTST_P,
P10V_BUILTIN_VCMPEQUT_P, P10V_BUILTIN_VCMPGTUT_P,
P10V_BUILTIN_VCMPGTST_P, P10V_BUILTIN_CMPNET,
P10V_BUILTIN_VCMPNET_P, P10V_BUILTIN_VCMPAET_P,
P10V_BUILTIN_DIVES_V1TI, P10V_BUILTIN_MODS_V1TI,
P10V_BUILTIN_MODU_V1TI):
New overloaded definitions.
(rs6000_gimple_fold_builtin) [P10V_BUILTIN_VCMPEQUT,
P10V_BUILTIN_CMPNET, P10V_BUILTIN_CMPGE_1TI,
P10V_BUILTIN_CMPGE_U1TI, P10V_BUILTIN_VCMPGTUT,
P10V_BUILTIN_VCMPGTST, P10V_BUILTIN_CMPLE_1TI,
P10V_BUILTIN_CMPLE_U1TI]: New case statements.
(rs6000_init_builtins) [bool_V1TI_type_node, int_ftype_int_v1ti_v1ti]:
New assignments.
(altivec_init_builtins): New E_V1TImode case statement.
(builtin_function_type)[P10_BUILTIN_128BIT_VMULEUD,
P10_BUILTIN_128BIT_VMULOUD, P10_BUILTIN_128BIT_DIVEU_V1TI,
P10_BUILTIN_128BIT_MODU_V1TI, P10_BUILTIN_CMPGE_U1TI,
P10_BUILTIN_VCMPGTUT, P10_BUILTIN_VCMPEQUT]: New case statements.
* config/rs6000/rs6000.c (rs6000_handle_altivec_attribute) [E_TImode,
E_V1TImode]: New case statements.
* config/rs6000/rs6000.h (rs6000_builtin_type_index): New enum
value RS6000_BTI_bool_V1TI.
* config/rs6000/vector.md (vector_gtv1ti, vector_nltv1ti,
vector_gtuv1ti, vector_nltuv1ti, vector_ngtv1ti, vector_ngtuv1ti,
vector_eq_v1ti_p, vector_ne_v1ti_p, vector_ae_v1ti_p,
vector_gt_v1ti_p, vector_gtu_v1ti_p, vrotlv1ti3, vashlv1ti3,
vlshrv1ti3, vashrv1ti3): New define_expands.
* config/rs6000/vsx.md (UNSPEC_VSX_DIVSQ, UNSPEC_VSX_DIVUQ,
UNSPEC_VSX_DIVESQ, UNSPEC_VSX_DIVEUQ, UNSPEC_VSX_MODSQ,
UNSPEC_VSX_MODUQ): New unspecs.
(mulv2di3, vsx_div_v1ti, vsx_udiv_v1ti, vsx_dives_v1ti,
vsx_diveu_v1ti, vsx_mods_v1ti, vsx_modu_v1ti, xxswapd_v1ti): New
define_insns.
(vcmpnet): New define_expand.
* doc/extend.texi: Add documentation for the new builtins vec_rl,
vec_rlmi, vec_rlnm, vec_sl, vec_sr, vec_sra, vec_mule, vec_mulo,
vec_div, vec_dive, vec_mod, vec_cmpeq, vec_cmpne, vec_cmpgt, vec_cmplt,
vec_cmpge, vec_cmple, vec_all_eq, vec_all_ne, vec_all_gt, vec_all_lt,
vec_all_ge, vec_all_le, vec_any_eq, vec_any_ne, vec_any_gt, vec_any_lt,
vec_any_ge, vec_any_le.

gcc/testsuite/ChangeLog

* gcc.target/powerpc/int_128bit-runnable.c: New test file.

rs6000, Fix arguments in altivec_vrlwmi and altivec_rlwdi builtins

2021-06-07 Carl Love <cel@us.ibm.com>

gcc/
* config/rs6000/altivec.md (altivec_vrl<VI_char>mi): Fix
bug in argument generation.

gcc/testsuite/
* gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c:
New runnable test case.
* gcc.target/powerpc/vec-rlmi-rlnm.c: Update scan assembler times
for xxlor instruction.

arm: Auto-vectorization for MVE: vclz

This patch adds support for auto-vectorization of clz for MVE.

It does so by removing the unspec from mve_vclzq_<supf><mode> and using
'clz' instead. It moves the neon_vclz<mode> expander from neon.md to
vec-common.md and renames it to the standard name clz<mode>2.
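
A loop of the following shape is the kind of source such a pattern is
meant to serve. This is only a sketch: the function names are invented,
__builtin_clz is the GCC/Clang builtin (undefined for a zero argument),
and whether a given loop is actually vectorized depends on target and
options.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise count-leading-zeros. Inputs must be nonzero because
// __builtin_clz(0) is undefined.
inline void sketch_clz_loop(const unsigned* in, int* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = __builtin_clz(in[i]);
}

// Convenience wrapper so the loop can be exercised on a std::vector.
inline std::vector<int> sketch_clz_all(const std::vector<unsigned>& in) {
  std::vector<int> out(in.size());
  if (!in.empty())
    sketch_clz_loop(in.data(), out.data(), in.size());
  return out;
}
```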

2021-06-09 Christophe Lyon <christophe.lyon@linaro.org>

gcc/
* config/arm/iterators.md (<supf>): Remove VCLZQ_U, VCLZQ_S.
(VCLZQ): Remove.
* config/arm/mve.md (mve_vclzq_<supf><mode>): Add '@' prefix,
remove <supf> iterator.
(mve_vclzq_u<mode>): New.
* config/arm/neon.md (clz<mode>2): Rename to neon_vclz<mode>.
(neon_vclz<mode>): Move to ...
* config/arm/unspecs.md (VCLZQ_U, VCLZQ_S): Remove.
* config/arm/vec-common.md: ... here. Add support for MVE.

gcc/testsuite/
* gcc.target/arm/simd/mve-vclz.c: New test.

arm: Auto-vectorization for MVE and Neon: vhadd/vrhadd

This patch adds support for auto-vectorization of average value
computation using vhadd or vrhadd, for both MVE and Neon.

The patch adds the needed [u]avg<mode>3_[floor|ceil] patterns to
vec-common.md. I'm not sure how to factorize them without introducing
an unspec iterator.

It also adds tests for 'floor' and for 'ceil', each for MVE and Neon.
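
For reference, the two averages differ only in rounding. The scalar
sketch below uses invented names and computes the sum in a wider type so
it cannot overflow; it shows the arithmetic a halving add (floor) and a
rounding halving add (ceil) perform on each element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Floor average: (a + b) >> 1.
inline std::uint8_t sketch_avg_floor(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>((static_cast<unsigned>(a) + b) >> 1);
}

// Ceil average: (a + b + 1) >> 1.
inline std::uint8_t sketch_avg_ceil(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>((static_cast<unsigned>(a) + b + 1u) >> 1);
}

// Loop form, i.e. the shape an auto-vectorizer would look at.
inline void sketch_avg_floor_loop(const std::uint8_t* a,
                                  const std::uint8_t* b,
                                  std::uint8_t* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = sketch_avg_floor(a[i], b[i]);
}
```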

2021-06-09 Christophe Lyon <christophe.lyon@linaro.org>

gcc/
* config/arm/mve.md (mve_vhaddq_<supf><mode>): Prefix with '@'.
(@mve_vrhaddq_<supf><mode>): Likewise.
* config/arm/neon.md (neon_v<r>hadd<sup><mode>): Likewise.
* config/arm/vec-common.md (avg<mode>3_floor, uavg<mode>3_floor)
(avg<mode>3_ceil, uavg<mode>3_ceil): New patterns.

gcc/testsuite/
* gcc.target/arm/simd/mve-vhadd-1.c: New test.
* gcc.target/arm/simd/mve-vhadd-2.c: New test.
* gcc.target/arm/simd/neon-vhadd-1.c: New test.
* gcc.target/arm/simd/neon-vhadd-2.c: New test.

Fix doc/typo

gcc/

* doc/invoke.texi: Fix typo.

[PATCH] PR middle-end/53267: Constant fold BUILT_IN_FMOD.

gcc/ChangeLog
PR middle-end/53267
* fold-const-call.c (fold_const_call_sss) [CASE_CFN_FMOD]:
Support evaluation of fmod/fmodf/fmodl at compile-time.

gcc/testsuite/ChangeLog
* gcc.dg/builtins-70.c: New test.
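
As a small illustration of the fmod change (the function name is made
up): with constant arguments, a compiler that folds fmod may replace the
call below by the literal result; the observable value is the same
either way.

```cpp
#include <cassert>
#include <cmath>

// 5.5 = 2 * 2.0 + 1.5, so the remainder is exactly 1.5.
inline double sketch_fmod_const() { return std::fmod(5.5, 2.0); }
```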

Fix p10 fusion test cases for -m32

The counts of fusion insns are slightly different for 32-bit compiles
so we need different scan-assembler-times counts for 32 and 64 bit
in the test cases for p10 fusion.

gcc/testsuite/ChangeLog

* gcc.target/powerpc/fusion-p10-2logical.c: Update fused insn
counts to test 32 and 64 bit separately.
* gcc.target/powerpc/fusion-p10-addadd.c: Update fused insn
counts to test 32 and 64 bit separately.
* gcc.target/powerpc/fusion-p10-ldcmpi.c: Update fused insn
counts to test 32 and 64 bit separately.
* gcc.target/powerpc/fusion-p10-logadd.c: Update fused insn
counts to test 32 and 64 bit separately.

tree-optimization/100981 - fix SLP patterns involving reductions

The following fixes the SLP FMA patterns to preserve reduction
info and the reduction vectorization to consider internal function
call defs for the reduction stmt.

2021-06-09 Richard Biener <rguenther@suse.de>

PR tree-optimization/100981
gcc/
* tree-vect-loop.c (vect_create_epilog_for_reduction): Use
gimple_get_lhs to also handle calls.
* tree-vect-slp-patterns.c (complex_pattern::build): Transfer
reduction info.

gcc/testsuite/
* gfortran.dg/vect/pr100981-1.f90: New testcase.

libgomp/
* testsuite/libgomp.fortran/pr100981-2.f90: New testcase.

tree-optimization/97832 - handle associatable chains in SLP discovery

This makes SLP discovery handle associatable (including mixed
plus/minus) chains better by swapping operands across the whole
chain.  To work this adds caching of the 'matches' lanes for
failed SLP discovery attempts, thereby fixing a failed SLP
discovery for the slp-pr98855.cc testcase which results in
building an operand from scalars as expected.  Unfortunately
this makes us trip over the cost threshold so I'm XFAILing the
testcase for now.

For BB vectorization all this doesn't work because we have no way
to distinguish good from bad associations as we eventually build
operands from scalars and thus not fail in the classical sense.

2021-05-31  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97832
* tree-vectorizer.h (_slp_tree::failed): New.
* tree-vect-slp.c (_slp_tree::_slp_tree): Initialize
failed member.
(_slp_tree::~_slp_tree): Free failed.
(vect_build_slp_tree): Retain failed nodes and record
matches in them, copying that back out when running
into a cached fail.  Dump start and end of discovery.
(dt_sort_cmp): New.
(vect_build_slp_tree_2): Handle associatable chains
together doing more aggressive operand swapping.

* gcc.dg/vect/pr97832-1.c: New testcase.
* gcc.dg/vect/pr97832-2.c: Likewise.
* gcc.dg/vect/pr97832-3.c: Likewise.
* g++.dg/vect/slp-pr98855.cc: XFAIL.

Always enable DT_INIT_ARRAY/DT_FINI_ARRAY on Linux

DT_INIT_ARRAY/DT_FINI_ARRAY support was added to glibc 2.1 by

commit fcf70d4114db9ff7923f5dfeb3fea6e2d623e5c2
Author: Ulrich Drepper <drepper@redhat.com>
Date:   Sat Jul 24 19:45:13 1999 +0000

    Update.

    1999-07-24  Ulrich Drepper  <drepper@cygnus.com>

            * elf/dl-fini.c: Handle DT_FINI_ARRAY.
            * elf/link.h (struct link_map): Remove l_init_running.  Add l_runcount
            and l_initcount.
            * elf/dl-init.c: Handle DT_INIT_ARRAY.
...

and added to binutils 2.12 by

commit e9682144c14fc809af72bd6c0b8c69731d38679c
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Mar 4 20:40:48 2002 +0000

    2002-03-04  H.J. Lu <hjl@gnu.org>

            * config/obj-elf.c (special_section): Add .init_array,
            .fini_array and .preinit_array.

            * config/tc-ia64.h (ELF_TC_SPECIAL_SECTIONS): Remove
            .init_array and .fini_array.

gcc/

PR target/100896
* config.gcc (gcc_cv_initfini_array): Set to yes for Linux and
GNU targets.
* doc/install.texi: Require glibc 2.1 and binutils 2.12 for
Linux and GNU targets.
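
For context, a function marked with the GCC/Clang constructor attribute
is one common source of init-array entries. The sketch below uses
invented names and only assumes that the toolchain runs such functions
before main(); on targets using .init_array, a pointer to the function
is placed in that section.

```cpp
#include <cassert>

// Constant-initialized, so it is already 0 before any constructor runs.
static int sketch_init_ran = 0;

// Runs during program startup, before main().
__attribute__((constructor)) static void sketch_on_load() {
  sketch_init_ran = 1;
}

// Lets later code observe whether the startup hook has run.
static int sketch_initialized() { return sketch_init_ran; }
```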

libstdc++: Fix constraint on std::optional assignment [PR 100982]

libstdc++-v3/ChangeLog:

PR libstdc++/100982
* include/std/optional (optional::operator=(const optional<U>&)):
Fix value category used in is_assignable check.
* testsuite/20_util/optional/assignment/100982.cc: New test.
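
Why the value category in such a check matters can be shown with type
traits alone. The types below are hypothetical and the sketch does not
reproduce the exact expression used in the header; it only shows that
the same pair of types can be assignable from a const lvalue yet not
from an rvalue.

```cpp
#include <cassert>
#include <type_traits>

struct SketchU {};

// Hypothetical type: assignable from an lvalue SketchU, but assignment
// from an rvalue SketchU is deleted.
struct SketchT {
  SketchT& operator=(const SketchU&) { return *this; }
  SketchT& operator=(SketchU&&) = delete;
};

// Copy-assigning from a const optional<U>& assigns from a const U
// lvalue, so this is the category a constraint for it should test.
constexpr bool sketch_from_const_lvalue =
    std::is_assignable<SketchT&, const SketchU&>::value;

// Testing with a non-reference U asks about rvalue assignment instead.
constexpr bool sketch_from_rvalue =
    std::is_assignable<SketchT&, SketchU>::value;
```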

docs: add missing @headitem in Intrinsic Procedures

gcc/fortran/ChangeLog:

* intrinsic.texi: Add missing @headitem to tables with a header.

Simplify vect_is_simple_use

This simplifies vect_is_simple_use to always get the def-type from
the stmt_info instead of singling out some gimple stmt kinds.

2021-06-09 Richard Biener <rguenther@suse.de>

* tree-vect-stmts.c (vect_is_simple_use): Always get dt
from the stmt.

Fix my e-mail in the ChangeLog

libstdc++: Add warnings for some C++23 deprecations

LWG 3036 deprecates std::pmr::polymorphic_allocator<T>::destroy in
favour of the equivalent member of std::allocator_traits.

LWG 3170 deprecates std::allocator<T>::is_always_equal in favour of
the equivalent member of std::allocator_traits.

This also updates a comment to note that we support the LWG 3541 change
(even before the issue was opened).
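
Code that trips the new warning for is_always_equal can query the
property through std::allocator_traits instead. A minimal sketch,
assuming a C++17 (or later) library and an invented function name:

```cpp
#include <cassert>
#include <memory>

// Ask the traits class rather than the allocator's own member typedef.
inline bool sketch_always_equal() {
  return std::allocator_traits<std::allocator<int>>::is_always_equal::value;
}
```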

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

* include/bits/allocator.h (allocator::is_always_equal): Deprecate.
* include/bits/iterator_concepts.h (indirectly_readable_traits):
Add LWG issue number to comment.
* include/std/memory_resource (polymorphic_allocator::destroy):
Deprecate.
* testsuite/20_util/allocator/requirements/typedefs.cc: Add
dg-warning for deprecation. Also check std::allocator<void>.

arc: Update doloop_end patterns

ARC processors can use the LP instruction to implement zero-overhead loops.
The current implementation doesn't handle the unlikely situation when
the loop iterator is located in memory. Refurbish the loop_end insn
pattern into a define_insn_and_split pattern.

gcc/
2021-07-09 Claudiu Zissulescu <claziss@synopsys.com>

* config/arc/arc.md (loop_end): Change it to
define_insn_and_split.

Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>

arc: Fix (u)maddhisi patterns

Rework the (u)maddhisi4 patterns and use VMAC2H(U) instruction instead
of the 64bit MAC(U) instruction.
This fixes the following execute.exp failures:
     arith-rand-ll.c   -O2  execution test
     arith-rand-ll.c   -O3  execution test
     pr78726.c   -O2  execution test
     pr78726.c   -O3  execution test

gcc/
2021-06-09  Claudiu Zissulescu  <claziss@synopsys.com>

* config/arc/arc.md (maddhisi4): Use VMAC2H instruction.
(machi): New pattern.
(umaddhisi4): Use VMAC2HU instruction.
(umachi): New pattern.

Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>

arc: Update 64bit move split patterns.

ARCv2HS can use a limited number of instructions to implement 64bit
moves. The VADD2 is used as a 64bit move, the LDD/STD are 64 bit loads
and stores. All those instructions are not baseline, hence we need to
provide alternatives when they are not available or cannot be generated
due to instruction restrictions.

This patch cleans up those move patterns and updates the split
instruction lengths.

gcc/
2021-06-09 Claudiu Zissulescu <claziss@synopsys.com>

* config/arc/arc-protos.h (arc_split_move_p): New prototype.
* config/arc/arc.c (arc_split_move_p): New function.
(arc_split_move): Clean up.
* config/arc/arc.md (movdi_insn): Clean up, use arc_split_move_p.
(movdf_insn): Likewise.
* config/arc/simdext.md (mov<VWH>_insn): Likewise.

Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>

openmp: Gimplify OMP_CLAUSE_SIZE during gfc_omp_finish_clause [PR100965]

As the testcase shows, we need to gimplify OMP_CLAUSE_SIZE, so that we
don't end up with SAVE_EXPR or anything similar non-gimple in it.

2021-06-08 Jakub Jelinek <jakub@redhat.com>

PR fortran/100965
* trans-openmp.c (gfc_omp_finish_clause): Gimplify OMP_CLAUSE_SIZE.

* gfortran.dg/gomp/pr100965.f90: New test.

i386: Do not emit segment overrides for %p and %P [PR100936]

Using %p to move the address of a symbol using LEA:

  asm ("lea %p1, %0" : "=r"(addr) : "m"(var));

emits an assembler warning when VAR is declared in a non-generic address space:

  Warning: segment override on `lea' is ineffectual

The problem is with the %p operand modifier, which should emit the raw symbol name:

  p -- print raw symbol name.

A similar problem exists with the %P modifier when trying to CALL or JMP to an
overridden symbol, e.g.:

        call %gs:zzz
        jmp %gs:zzz

emits assembler warnings:

  Warning: skipping prefixes on `call'
  Warning: skipping prefixes on `jmp'

Ensure that %p and %P never emit segment overrides.

2021-06-08  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
PR target/100936
* config/i386/i386.c (print_operand_address_as): Rename "no_rip"
argument to "raw".  Do not emit segment overrides when "raw" is true.

gcc/testsuite/

PR target/100936
* gcc.target/i386/pr100936.c: New test.

Improve JSON examples.

gcc/ChangeLog:

* doc/gcov.texi: Create proper JSON files.
* doc/invoke.texi: Remove dots in order to make it a valid
JSON object.

rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

On P8LE, extra rot64+rot64 load or store instructions are generated
in float128 to vector __int128 conversion.

This patch teaches the swaps pass to also handle such patterns to remove
extra swap instructions.

(insn 7 6 8 2 (set (subreg:V1TI (reg:KF 123) 0)
        (rotate:V1TI (mem/u/c:V1TI (reg/f:DI 121) [0  S16 A128])
            (const_int 64 [0x40]))) {*vsx_le_permute_v1ti})
(insn 8 7 9 2 (set (subreg:V1TI (reg:KF 122) 0)
        (rotate:V1TI (subreg:V1TI (reg:KF 123) 0)
            (const_int 64 [0x40])))  {*vsx_le_permute_v1ti})
=>
(insn 22 6 23 2 (set (subreg:V1TI (reg:KF 123) 0)
        (mem/u/c:V1TI (and:DI (reg/f:DI 121)
          (const_int -16 [0xfffffffffffffff0])) [0  S16 A128])))
(insn 23 22 25 2 (set (subreg:V1TI (reg:KF 122) 0)
        (subreg:V1TI (reg:KF 123) 0)))

gcc/ChangeLog:

2021-06-09  Xionghu Luo  <luoxhu@linux.ibm.com>

* config/rs6000/rs6000-p8swap.c (pattern_is_rotate64): New.
(insn_is_load_p): Use pattern_is_rotate64.
(insn_is_swap_p): Likewise.
(quad_aligned_load_p): Likewise.
(const_load_sequence_p): Likewise.
(replace_swapped_aligned_load): Likewise.
(recombine_lvx_pattern): Likewise.
(recombine_stvx_pattern): Likewise.

gcc/testsuite/ChangeLog:

2021-06-09  Xionghu Luo  <luoxhu@linux.ibm.com>

* gcc.target/powerpc/float128-call.c: Adjust.
* gcc.target/powerpc/pr100085.c: New test.

Virtualize fur_source and turn it into a proper API.

No more accessing the local info.  Also add fur_source/fold_stmt where ranges
are provided via being specified, or a vector to replace gimple_fold_range.

* gimple-range-gori.cc (gori_compute::outgoing_edge_range_p): Use a
fur_stmt source record.
* gimple-range.cc (fur_source::get_operand): Generic range query.
(fur_source::get_phi_operand): New.
(fur_source::register_dependency): New.
(fur_source::query): New.
(class fur_edge): New.  Edge source for operands.
(fur_edge::fur_edge): New.
(fur_edge::get_operand): New.
(fur_edge::get_phi_operand): New.
(fur_edge::query): New.
(fur_stmt::fur_stmt): New.
(fur_stmt::get_operand): New.
(fur_stmt::get_phi_operand): New.
(fur_stmt::query): New.
(class fur_depend): New.  Statement source and process dependencies.
(fur_depend::fur_depend): New.
(fur_depend::register_dependency): New.
(class fur_list): New.  List source for operands.
(fur_list::fur_list): New.
(fur_list::get_operand): New.
(fur_list::get_phi_operand): New.
(fold_range): New.  Instantiate appropriate fur_source class and fold.
(fold_using_range::range_of_range_op): Use new API.
(fold_using_range::range_of_address): Ditto.
(fold_using_range::range_of_phi): Ditto.
(gimple_ranger::fold_range_internal): Use fur_depend class.
(fold_using_range::range_of_ssa_name_with_loop_info): Use new API.
* gimple-range.h (class fur_source): Now a base class.
(class fur_stmt): New.
(fold_range): New prototypes.
(fur_source::fur_source): Delete.

c++: remove redundant warning [PR100879]

Before my r277864, build_new_op promoted enums to int before passing them on
to cp_build_binary_op; after that commit, it doesn't, so
warn_for_sign_compare sees the enum operands and gives a redundant warning.
This warning dates back to 1995, and seems to have been dead code for a long
time--likely since build_new_op was added in 1997--so let's just remove it.

PR c++/100879

gcc/c-family/ChangeLog:

* c-warn.c (warn_for_sign_compare): Remove C++ enum mismatch
warning.

gcc/testsuite/ChangeLog:

* g++.dg/diagnostic/enum3.C: New test.

Daily bump.

libstdc++: Fix wrong param type in atomic_ref<_Tp*>::wait [PR100889]

libstdc++-v3/ChangeLog:

PR libstdc++/100889
* include/bits/atomic_base.h (atomic_ref<_Tp*>::wait):
Change parameter type from _Tp to _Tp*.
* testsuite/29_atomics/atomic_ref/wait_notify.cc: Extend
coverage of types tested.

[libstdc++] Remove unused hasher instance.

This is a remnant of poorly executed refactoring.

libstdc++-v3/ChangeLog:

* include/std/barrier (__tree_barrier::_M_arrive): Remove
unnecessary hasher instantiation.

c++: explicit() ignored on deduction guide [PR100065]

When we have explicit() with a value-dependent argument, we can't
evaluate it at parsing time, so cp_parser_function_specifier_opt stashes
the argument into the decl-specifiers and grokdeclarator then stores it
into explicit_specifier_map, which is then used when substituting the
function decl. grokdeclarator stores it for constructors and conversion
functions, but we also need to do it for deduction guides, otherwise
we'll forget that we've seen an explicit-specifier as in the attached
test.

PR c++/100065

gcc/cp/ChangeLog:

* decl.c (grokdeclarator): Store a value-dependent
explicit-specifier even for deduction guides.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/explicit18.C: New test.

Improve match_simplify_replacement in phi-opt

This improves match_simplify_replacement in phi-opt to handle the
case where there is one cheap (non-call) preparation statement in the
middle basic block, similar to xor_replacement and others.
This allows removing xor_replacement, which the patch also does.
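
A source shape with a single cheap statement in the conditional block
might look like the sketch below. The functions are invented for
illustration; which simplification GCC actually applies, if any, depends
on the match.pd patterns and is not claimed here.

```cpp
#include <cassert>

// Branchy form: the conditional block holds exactly one cheap,
// non-call statement (~b) whose result feeds the merged value.
inline int sketch_branchy(int a, int b) {
  int r = b;
  if (a < 0)
    r = ~b;  // the single "preparation" statement
  return r;
}

// Branch-free equivalent: xor with an all-ones mask when a < 0,
// or with zero otherwise.
inline int sketch_branchless(int a, int b) {
  return b ^ -static_cast<int>(a < 0);
}
```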

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

Changes since v1:
v3 - Just minor changes to using gimple_assign_lhs
instead of gimple_lhs and fixing a comment.
v2 - change the check on the preparation statement to
allow only assignments and no calls and only assignments
that feed into the phi.

gcc/ChangeLog:

PR tree-optimization/25290
* tree-ssa-phiopt.c (xor_replacement): Delete.
(tree_ssa_phiopt_worker): Delete use of xor_replacement.
(match_simplify_replacement): Allow one cheap preparation
statement that can be moved to before the if.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr96928-1.c: Fix testcase for now that ~
happens on the outside of the bit_xor.

c++: update diagnostic messages

These tests needed updating for the diagnostic change in r12-1306.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/pr60209-neg.C: Update diagnostic.
* g++.dg/diagnostic/string-literal-concat.C: Likewise.
* g++.dg/ext/utf-badconcat.C: Likewise.
* g++.dg/ext/utf-badconcat2.C: Likewise.

Fix bootstrap2 breakage due to re-use of obj-c checksum

gcc/objc:
2021-06-08 Bernd Edlinger <bernd.edlinger@hotmail.de>

* Make-lang.in (cc1-obj-checksum.c): Check previous
stage checksum exists.

gcc/objcp:
2021-06-08 Bernd Edlinger <bernd.edlinger@hotmail.de>

* Make-lang.in (cc1objplus-checksum.c): Check previous
stage checksum exists.

c++: Test for mixed string literal concatenation

From wg21.link/p2201r1

gcc/cp/ChangeLog:

* parser.c (cp_parser_string_literal): Adjust diagnostic.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/mixed-concat1.C: New test.

c++: Test for whitespace and line splice

From wg21.link/P2223R2

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/whitespace-splice1.C: New test.

c++: Add test for C++23 narrowing conv to bool

From wg21.link/P1401R5.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/narrowing-bool1.C: New test.

analyzer: bitfield fixes [PR99212]

This patch verifies the previous fix for bitfield sizes by implementing
enough support for bitfields in the analyzer to get the test cases to pass.

The patch implements support in the analyzer for reading from a
BIT_FIELD_REF, and support for folding BIT_AND_EXPR of a mask, to handle
the cases generated in tests.

The existing bitfields tests in data-model-1.c turned out to rely on
undefined behavior, in that they were assigning values to a signed
bitfield that were outside of the valid range of values.  I believe that
that's why we were seeing target-specific differences in the test
results (PR analyzer/99212).  The patch updates the test to remove the
undefined behaviors.
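
To illustrate the range issue described above, here is a minimal sketch.
It assumes a struct shaped roughly like the test's struct sbits; the field
names, widths and helper functions are hypothetical and are not copied from
data-model-1.c. An N-bit signed bitfield holds values from -2^(N-1) to
2^(N-1)-1, so the sketch only stores values inside that range.

```cpp
#include <cassert>

// Hypothetical layout for illustration; not copied from data-model-1.c.
struct sbits
{
  signed int b0 : 1;    // 1-bit signed field: holds -1 or 0
  signed int b123 : 3;  // 3-bit signed field: holds -4 .. 3
};

// Smallest and largest values representable in an NBITS-wide signed
// bitfield (two's complement).
constexpr int sbit_min (int nbits) { return -(1 << (nbits - 1)); }
constexpr int sbit_max (int nbits) { return (1 << (nbits - 1)) - 1; }

// True if VALUE can be stored in an NBITS-wide signed bitfield unchanged.
constexpr bool fits_in_sbits (int value, int nbits)
{
  return value >= sbit_min (nbits) && value <= sbit_max (nbits);
}

// Store in-range values only and check that they read back unchanged.
inline bool roundtrip_ok ()
{
  sbits s;
  s.b0 = -1;
  s.b123 = 3;
  return s.b0 == -1 && s.b123 == 3;
}
```

A test written this way does not depend on how a target treats
out-of-range stores, which is the kind of difference the patch removes.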

gcc/analyzer/ChangeLog:
PR analyzer/99212
* region-model-manager.cc
(region_model_manager::maybe_fold_binop): Add support for folding
BIT_AND_EXPR of compound_svalue and a mask constant.
* region-model.cc (region_model::get_rvalue_1): Implement
BIT_FIELD_REF in terms of...
(region_model::get_rvalue_for_bits): New function.
* region-model.h (region_model::get_rvalue_for_bits): New decl.
* store.cc (bit_range::from_mask): New function.
(selftest::test_bit_range_intersects_p): New selftest.
(selftest::assert_bit_range_from_mask_eq): New.
(ASSERT_BIT_RANGE_FROM_MASK_EQ): New macro.
(selftest::assert_no_bit_range_from_mask_eq): New.
(ASSERT_NO_BIT_RANGE_FROM_MASK): New macro.
(selftest::test_bit_range_from_mask): New selftest.
(selftest::analyzer_store_cc_tests): Call the new selftests.
* store.h (bit_range::intersects_p): New.
(bit_range::from_mask): New decl.
(concrete_binding::get_bit_range): New accessor.
(store_manager::get_concrete_binding): New overload taking
const bit_range &.

gcc/testsuite/ChangeLog:
PR analyzer/99212
* gcc.dg/analyzer/bitfields-1.c: New test.
* gcc.dg/analyzer/data-model-1.c (struct sbits): Make bitfields
explicitly signed.
(test_44): Update test values assigned to the bits to ones that
fit in the range of the bitfield type.  Remove xfails.
(test_45): Remove xfails.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: fix region::get_bit_size for bitfields

gcc/analyzer/ChangeLog:
* analyzer.h (int_size_in_bits): New decl.
* region.cc (int_size_in_bits): New function.
(region::get_bit_size): Reimplement in terms of the above.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: split out struct bit_range from class concrete_binding

gcc/analyzer/ChangeLog:
* store.cc (concrete_binding::dump_to_pp): Move bulk of
implementation to...
(bit_range::dump_to_pp): ...this new function.
(bit_range::cmp): New.
(concrete_binding::overlaps_p): Update for use of bit_range.
(concrete_binding::cmp_ptr_ptr): Likewise.
* store.h (struct bit_range): New.
(class concrete_binding): Replace fields m_start_bit_offset and
m_size_in_bits with new field m_bit_range.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: remove redundant typedef

Delete an overzealous copy&paste.

gcc/analyzer/ChangeLog:
* svalue.h (conjured_svalue::iterator_t): Delete.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

c++: braced-list overload resolution [PR100963]

My PR969626 patch made us ignore template candidates when there's a perfect
non-template candidate. In this case, we were considering B(int) a perfect
match for B({0}), but the brace elision makes it imperfect.
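
A minimal sketch of the kind of overload set involved, assuming a class
with both a B(int) constructor and an initializer_list constructor
template. The class below is hypothetical and is not taken from
initlist124.C. As far as I understand the ranking rules, a
list-initialization that converts to std::initializer_list is preferred
over one that does not, so the template should win for B({0}).

```cpp
#include <initializer_list>

// Hypothetical class for illustration; not the testcase from the PR.
struct B
{
  int which;  // records which constructor ran

  B (int) : which (1) {}

  template <class T>
  B (std::initializer_list<T>) : which (2) {}
};

// Direct-initialize from a braced list and report the chosen constructor.
inline int chosen_for_braced_zero ()
{
  B b ({0});
  return b.which;
}
```

If the non-template B(int) were treated as a perfect match here, the
template candidate would be dropped before it could be compared, which is
the behavior the patch avoids.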

PR c++/100963

gcc/cp/ChangeLog:

* call.c (perfect_conversion_p): Check check_narrowing.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist124.C: New test.

Update Power10 scheduling description for new fused instruction types.

gcc/ChangeLog:

* config/rs6000/power10.md (power10-fused-load, power10-fused-store,
power10-fused_alu, power10-fused-vec, power10-fused-branch): New.

Further improve redundant test/compare removal on the H8

gcc/
* config/h8300/logical.md (andqi3_1): Move BCLR case into define_insn_and_split.
Create length attribute on define_insn_and_split. Only split for cases which we
know will use AND.
(andqi3_1<cczn>): Renamed from andqi3_1_clobber_flags. Only handle AND here and
fix length computation.
(b<code><mode>msx): Combine QImode and HImode H8/SX patterns using iterator.

libstdc++: Finish implementing LWG 3413 for propagate_const

We already have conditional noexcept so this just constrains the
non-member swap overload.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/experimental/propagate_const (swap): Constrain.
* testsuite/experimental/propagate_const/swap/lwg3413.cc: New test.

tree-optimization/100923 - fix alias-ref construction wrt availability

This PR shows that building an ao_ref from value-numbers is prone to
expose bogus contextual alias info to the oracle.  The following makes
sure to construct ao_refs from SSA names available at the program point
only.

On the way it modifies the awkward valueize_refs[_1] API.

2021-06-08  Richard Biener  <rguenther@suse.de>

PR tree-optimization/100923
* tree-ssa-sccvn.c (valueize_refs_1): Take a pointer to
the operand vector to be valueized.
(valueize_refs): Likewise.
(valueize_shared_reference_ops_from_ref): Adjust.
(valueize_shared_reference_ops_from_call): Likewise.
(vn_reference_lookup_3): Likewise.
(vn_reference_lookup_pieces): Likewise.  Re-valueize
with honoring availability when we are about to create
the ao_ref and valueized before.
(vn_reference_lookup): Likewise.
(vn_reference_insert_pieces): Adjust.

* gcc.dg/torture/pr100923.c: New testcase.

Make SLP root stmt a vector

This fixes a TODO noticed when adding vectorization of
BIT_INSERT_EXPRs and what's now useful for vectorization of
BB reductions.

2021-06-08 Richard Biener <rguenther@suse.de>

* tree-vectorizer.h (_slp_instance::root_stmt): Change to...
(_slp_instance::root_stmts): ... a vector.
(SLP_INSTANCE_ROOT_STMT): Rename to ...
(SLP_INSTANCE_ROOT_STMTS): ... this.
(slp_root::root): Change to...
(slp_root::roots): ... a vector.
(slp_root::slp_root): Adjust.
* tree-vect-slp.c (_slp_instance::location): Adjust.
(vect_free_slp_instance): Release the root stmt vector.
(vect_build_slp_instance): Adjust.
(vect_analyze_slp): Likewise.
(_bb_vec_info::~_bb_vec_info): Likewise.
(vect_slp_analyze_operations): Likewise.
(vect_bb_vectorization_profitable_p): Likewise. Adjust
costs for the root stmt.
(vect_slp_check_for_constructors): Gather all BIT_INSERT_EXPRs
as root stmts.
(vect_slp_analyze_bb_1): Simplify by marking all root stmts
as pure_slp.
(vectorize_slp_instance_root_stmt): Adjust.
(vect_schedule_slp): Likewise.

Implement a context aware pointer equivalency class for use in evrp.

The substitute_and_fold_engine which evrp uses is expecting symbolics
from value_of_expr / value_on_edge / etc, which ranger does not provide.
In some cases, these provide important folding cues, as in the case of
aliases for pointers.  For example, legacy evrp may return [&foo, &foo]
for the value of "bar" where bar is on an edge where bar == &foo, or
when bar has been globally set to &foo.  This information is then used
by the subst & fold engine to propagate the known value of bar.

Currently this is a major source of discrepancies between evrp and
ranger.  Of the 284 cases legacy evrp is getting over ranger, 237 are
for pointer equality as discussed above.

This patch implements a context aware pointer equivalency class which
ranger-evrp can use to query what an SSA pointer is currently
equivalent to.  With it, we reduce the 284 cases legacy evrp is getting
to 47.

The API for the pointer equivalency analyzer is the following:

class pointer_equiv_analyzer
{
public:
  pointer_equiv_analyzer (gimple_ranger *r);
  ~pointer_equiv_analyzer ();
  void enter (basic_block);
  void leave (basic_block);
  void visit_stmt (gimple *stmt);
  tree get_equiv (tree ssa) const;
...
};

The enter(), leave(), and visit_stmt() methods are meant to be called
from a DOM walk.   At any point throughout the walk, one can call
get_equiv() to get whatever an SSA is equivalent to.
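
The enter()/leave() protocol can be pictured with a toy scoped table.
This is only a sketch of the general technique (an undo log unwound when a
scope is left), not the code in gimple-ssa-evrp.c. It keys on strings
instead of SSA trees, and the class name and set_equiv() are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy model of a scoped equivalence table.  NOT the GCC implementation.
// An empty string means "no known equivalence", so stored values are
// assumed to be non-empty.
class scoped_equiv_table
{
public:
  // Open a scope: remember where the undo log stood on entry.
  void enter () { m_marks.push_back (m_undo.size ()); }

  // Close a scope: restore every binding changed since the matching
  // enter().  Must only be called after a matching enter().
  void leave ()
  {
    std::size_t mark = m_marks.back ();
    m_marks.pop_back ();
    while (m_undo.size () > mark)
      {
        const std::pair<std::string, std::string> &entry = m_undo.back ();
        if (entry.second.empty ())
          m_equiv.erase (entry.first);
        else
          m_equiv[entry.first] = entry.second;
        m_undo.pop_back ();
      }
  }

  // Record NAME == VALUE, saving the previous binding in the undo log.
  void set_equiv (const std::string &name, const std::string &value)
  {
    m_undo.emplace_back (name, get_equiv (name));
    m_equiv[name] = value;
  }

  // Return what NAME is currently known to equal, or "" if nothing.
  std::string get_equiv (const std::string &name) const
  {
    auto it = m_equiv.find (name);
    return it == m_equiv.end () ? std::string () : it->second;
  }

private:
  std::unordered_map<std::string, std::string> m_equiv;
  std::vector<std::pair<std::string, std::string>> m_undo;
  std::vector<std::size_t> m_marks;
};
```

In a dominator walk, enter() would be called before visiting a block and
leave() after all of its dominated children, so an equivalence recorded in
a block is visible only in the blocks it dominates.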

Tested on x86-64 Linux with a regular bootstrap/tests and by comparing
EVRP folds over ranger before and after this patch.

gcc/ChangeLog:

* gimple-ssa-evrp.c (class ssa_equiv_stack): New.
(ssa_equiv_stack::ssa_equiv_stack): New.
(ssa_equiv_stack::~ssa_equiv_stack): New.
(ssa_equiv_stack::enter): New.
(ssa_equiv_stack::leave): New.
(ssa_equiv_stack::push_replacement): New.
(ssa_equiv_stack::get_replacement): New.
(is_pointer_ssa): New.
(class pointer_equiv_analyzer): New.
(pointer_equiv_analyzer::pointer_equiv_analyzer): New.
(pointer_equiv_analyzer::~pointer_equiv_analyzer): New.
(pointer_equiv_analyzer::set_global_equiv): New.
(pointer_equiv_analyzer::set_cond_equiv): New.
(pointer_equiv_analyzer::get_equiv): New.
(pointer_equiv_analyzer::enter): New.
(pointer_equiv_analyzer::leave): New.
(pointer_equiv_analyzer::get_equiv_expr): New.
(pta_valueize): New.
(pointer_equiv_analyzer::visit_stmt): New.
(pointer_equiv_analyzer::visit_edge): New.
(hybrid_folder::value_of_expr): Call PTA.
(hybrid_folder::value_on_edge): Same.
(hybrid_folder::pre_fold_bb): New.
(hybrid_folder::post_fold_bb): New.
(hybrid_folder::pre_fold_stmt): New.
(rvrp_folder::pre_fold_bb): New.
(rvrp_folder::post_fold_bb): New.
(rvrp_folder::pre_fold_stmt): New.
(rvrp_folder::value_of_expr): Call PTA.
(rvrp_folder::value_on_edge): Same.

[GCN] Fix run-time variable 'num_workers'

... which currently has *not* been forced to 'num_workers (1)'.

In addition to the testcases modified here, this also fixes:

    FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/mode-transitions.c -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O0  execution test
    [Etc.]

    mode-transitions.exe: [...]/libgomp.oacc-c-c++-common/mode-transitions.c:702: t17: Assertion `arr_b[i] == (i ^ 31) * 8' failed.

libgomp/
* plugin/plugin-gcn.c (gcn_exec): Force 'num_workers (1)'
unconditionally.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c:
Update.
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c: Likewise.

Enable more 'libgomp.oacc-*/lib-*' testcases for non-'openacc_nvidia_accel_selected'

libgomp/
* testsuite/libgomp.oacc-c-c++-common/lib-11.c: Enable for all but
'-DACC_MEM_SHARED=0'.
* testsuite/libgomp.oacc-c-c++-common/lib-13.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-14.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-15.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-20.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-23.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-24.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-34.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-42.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-44.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-48.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-88.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-89.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-92.c: Likewise.
* testsuite/libgomp.oacc-fortran/lib-14.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-5.c: Add
'acc_device_radeon' testing.
* testsuite/libgomp.oacc-c-c++-common/lib-6.c: Likewise.
* testsuite/libgomp.oacc-fortran/lib-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/lib-7.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-52.c: Enable for all.
* testsuite/libgomp.oacc-c-c++-common/lib-53.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-54.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-86.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-87.c: Likewise.
* testsuite/libgomp.oacc-fortran/lib-10.f90: Likewise.
* testsuite/libgomp.oacc-fortran/lib-8.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-57.c: Improve checking
for non-'openacc_nvidia_accel_selected'.
* testsuite/libgomp.oacc-c-c++-common/lib-58.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-62.c: Clarify that "Not
all implement this checking".
* testsuite/libgomp.oacc-c-c++-common/lib-63.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-64.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-65.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-67.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-68.c: Likewise.

Fix 'libgomp.oacc-fortran/parallel-dims.f90' for 'acc_device_radeon'

..., by simplifying 'libgomp.oacc-c-c++-common/parallel-dims.c', and updating
the former correspondingly. '__builtin_goacc_parlevel_id' does the right thing
for all 'acc_device_*'.

Follow-up to commit 09e0ad6253f4330977e1b2f116b5e289dc2c2a02 "Update OpenACC
tests for amdgcn".

libgomp/
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Simplify.
* testsuite/libgomp.oacc-fortran/parallel-dims-aux.c: Update.

Fix 'libgomp.oacc-c-c++-common/acc_prof-kernels-1.c' for 'acc_device_radeon'

... on top of r279378 (commit 26b74ed0223d108d7d7818c3c860f20cfe81a4af)
"Update OpenACC tests for amdgcn".

libgomp/
* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Fix
for 'acc_device_radeon'.

Enhance 'libgomp.oacc-c-c++-common/firstprivate-1.c' for non-'acc_device_nvidia'

libgomp/
* testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Enhance
for non-'acc_device_nvidia'.

Add 'acc_device_radeon' testing to 'libgomp.oacc-*/acc_on_device-*'

libgomp/
* testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: Add
'acc_device_radeon' testing.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise.

Don't require 'openacc_nvidia_accel_selected' in 'libgomp.oacc-c-c++-common/async_queue-1.c'

That is, re-enable it for host-fallback, and enable it for GCN offloading.

Fix-up for r279378 (commit 26b74ed0223d108d7d7818c3c860f20cfe81a4af)
"Update OpenACC tests for amdgcn".

libgomp/
* testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Don't
require 'openacc_nvidia_accel_selected'. Fix up for
'ACC_DEVICE_TYPE_radeon'.

Don't require 'openacc_nvidia_accel_selected' in additional 'libgomp.oacc-*/declare-*'

Like r253779 (commit 92d5d01ac65e395ceaecc5d930f6017952aa4934)
"Enable libgomp.oacc-*/declare-*.{c,f90} for non-nvidia devices".

libgomp/
* testsuite/libgomp.oacc-c++/declare-1.C: Don't require
'openacc_nvidia_accel_selected'.
* testsuite/libgomp.oacc-c-c++-common/declare-3.c: Likewise.

openmp: Fix ICE on depend(source) clause during cdtor cloning [PR100957]

The depend(source) clause has NULL OMP_CLAUSE_DECL, it has just the
depend kind specified and no arguments. So copy_tree_body_r shouldn't
check TREE_CODE on it without checking it is non-NULL.

2021-06-08 Jakub Jelinek <jakub@redhat.com>

PR c++/100957
* tree-inline.c (copy_tree_body_r): For OMP_CLAUSE_DEPEND don't
check TREE_CODE if OMP_CLAUSE_DECL is NULL.

* g++.dg/gomp/doacross-2.C: New test.

[GCN] Streamline 'libgomp/testsuite/lib/libgomp.exp:check_effective_target_openacc_radeon_accel_selected'

The GCN support that got added in r278935 (commit
83caa34e2a618842e05f59cbb3e2dda93dc23270) "Enable OpenACC GCN testing" was
forked before my r269107 (commit ee332b4a9a19552d160a23155f59b11692d8f07e)
"[libgomp] Clarify difference between offload target, offload plugin, and
OpenACC device type", and didn't later pick up these changes.

No functional change.

libgomp/
* testsuite/lib/libgomp.exp
(check_effective_target_openacc_radeon_accel_selected):
Streamline.

Revert PR80547 workaround in 'libgomp.oacc-c-c++-common/parallel-dims.c'

This problem has been fixed long ago, in r267934 (commit
d41d952c9bbdffe6fd2badc9c4f2c18d241ce412) "[nvptx] Handle assignment to
gang-level reduction variable".

libgomp/
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Revert
PR80547 workaround.

[nvptx] Update comment in 'libgomp.oacc-c-c++-common/parallel-dims.c'

Small fix-up for r267889 (commit 2b9d9e393766d2fa6e2dd5f361d0db14872cf261)
"[nvptx] Enable large vectors":

> * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Expect vector
> length 2097152 to be reduced to 1024 instead of 32.

libgomp/
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
<acc_device_nvidia>: Update comment.

middle-end/100951 - make sure to generate VECTOR_CST in lowering

When vector lowering creates piecewise ops make sure to create
VECTOR_CSTs instead of CONSTRUCTORs when possible.

gcc/

2021-06-07 Richard Biener <rguenther@suse.de>

PR middle-end/100951
* tree-vect-generic.c (expand_vector_piecewise): Build a
VECTOR_CST if all elements are constant.
(expand_vector_condition): Likewise.
(lower_vec_perm): Likewise.
(expand_vector_conversion): Likewise.

gcc/testsuite/

2021-06-07 H.J. Lu <hjl.tools@gmail.com>

PR middle-end/100951
* gcc.target/i386/pr100951.c: New test.

testsuite: Add -Wno-psabi -w to pr100887.c test [PR100943]

On x86 the test is using -mavx512f and so never reports the various
-Wpsabi notes/warnings, but on other targets it can.

2021-06-08 Jakub Jelinek <jakub@redhat.com>

PR target/100887
PR testsuite/100943
* gcc.dg/pr100887.c: Add -Wno-psabi -w to dg-options.

Fortran/OpenMP: Fix clause splitting for target/parallel/teams [PR99928]

PR middle-end/99928

gcc/fortran/ChangeLog:

* trans-openmp.c (gfc_add_clause_implicitly): New.
(gfc_split_omp_clauses): Use it.
(gfc_free_split_omp_clauses): New.
(gfc_trans_omp_do_simd, gfc_trans_omp_parallel_do,
gfc_trans_omp_parallel_do_simd, gfc_trans_omp_distribute,
gfc_trans_omp_teams, gfc_trans_omp_target, gfc_trans_omp_taskloop,
gfc_trans_omp_master_taskloop, gfc_trans_omp_parallel_master): Use it.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/openmp-simd-6.f90: Update scan-tree-dump.
* gfortran.dg/gomp/scan-5.f90: Likewise.
* gfortran.dg/gomp/loop-1.f90: Likewise; remove xfail.
* gfortran.dg/gomp/pr99928-1.f90: Remove xfail.
* gfortran.dg/gomp/pr99928-2.f90: Likewise.
* gfortran.dg/gomp/pr99928-3.f90: Likewise.
* gfortran.dg/gomp/pr99928-8.f90: Likewise.

docs: document evrp-sparse-threshold param

gcc/ChangeLog:

* doc/invoke.texi: Document new param evrp-sparse-threshold.

Fix "tailing" typo.

gcc/fortran/ChangeLog:

* intrinsic.texi: Fix typo.
* trans-expr.c (gfc_trans_pointer_assignment): Likewise.

gcc/ChangeLog:

* genautomata.c (create_automata): Fix typo.

libgfortran/ChangeLog:

* intrinsics/chmod.c (chmod_internal): Fix typo.
* io/transfer.c (read_sf): Likewise.

libquadmath/ChangeLog:

* libquadmath.texi: Fix typo.

gcc/testsuite/ChangeLog:

* gcc.dg/format/strfmon-1.c: Fix typo.
* gfortran.dg/char4-subscript.f90: Likewise.

predcom: Enabled by loop vect at O2 [PR100794]

As PR100794 shows, in the current implementation PRE skips some
optimizations to avoid introducing loop-carried dependences, which
would stop the loop vectorizer from vectorizing the loop. At -O2,
no downstream pass re-catches this kind of opportunity if the
loop vectorizer fails to vectorize that loop.

This patch follows Richi's suggestion in the PR: if the predcom flag
isn't set explicitly, loop vectorization enables predcom implicitly,
without any unrolling. The Power9 SPEC2017 evaluation showed it
can speed up 521.wrf_r by 3.30% and 554.roms_r by 1.08% with the
very-cheap cost model, with no remarkable impact with the cheap cost
model; the build time and size impact is fine (see the PR for details).
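
The gating described above can be sketched as follows. This is a toy
model, not the code in tree-predcom.c: the struct, its fields and both
function names are hypothetical stand-ins for the real option state, and
the rule for unrolling is my reading of the description.

```cpp
// Toy model of the relevant option state; names are hypothetical.
struct toy_options
{
  bool predictive_commoning;      // effective value of the predcom flag
  bool predictive_commoning_set;  // true if the user set it explicitly
  bool tree_loop_vectorize;       // effective value of loop vectorization
};

// Should the predcom pass run at all?
inline bool predcom_gate (const toy_options &opts)
{
  if (opts.predictive_commoning)
    return true;
  // Loop vectorization enables predcom implicitly, but only when the
  // user did not set the predcom flag themselves.
  return opts.tree_loop_vectorize && !opts.predictive_commoning_set;
}

// Assumption: unrolling is allowed only when predcom itself is enabled,
// not when the pass runs merely because of loop vectorization.
inline bool predcom_allow_unroll_p (const toy_options &opts)
{
  return opts.predictive_commoning;
}
```

With this shape, explicitly disabling predcom still turns the pass off
even when loop vectorization is on.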

By the way, I also tested another proposal, which guards PRE so that it
does not skip the optimization for the cheap and very-cheap vect cost
models. The evaluation results showed it's fine with the very-cheap cost
model, but it can degrade some benchmarks, e.g. 521.wrf_r -9.17% and
549.fotonik3d_r -2.07%.

Bootstrapped/regtested on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/100794
* tree-predcom.c (tree_predictive_commoning_loop): Add parameter
allow_unroll_p and only allow unrolling when it's true.
(tree_predictive_commoning): Add parameter allow_unroll_p and
adjust for it.
(run_tree_predictive_commoning): Likewise.
(pass_predcom::gate): Check flag_tree_loop_vectorize and
global_options_set.x_flag_predictive_commoning.
(pass_predcom::execute): Adjust for allow_unroll_p.

gcc/testsuite/ChangeLog:

PR tree-optimization/100794
* gcc.dg/tree-ssa/pr100794.c: New test.

predcom: Adjust some unnecessary update_ssa calls

As Richi suggested in PR100794, this patch removes
some unnecessary update_ssa calls with flag
TODO_update_ssa_only_virtuals, and also does some refactoring.

Bootstrapped/regtested on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu, built well
on Power9 ppc64le with --with-build-config=bootstrap-O3,
and passed both P8 and P9 SPEC2017 full build with
{-O3, -Ofast} + {,-funroll-loops}.

gcc/ChangeLog:

* tree-predcom.c (execute_pred_commoning): Remove update_ssa call.
(tree_predictive_commoning_loop): Factor some cleanup code into
lambda function cleanup, remove scev_reset call, and adjust return
value.
(tree_predictive_commoning): Adjust for different changed values,
only set flag TODO_update_ssa_only_virtuals if changed.
(pass_data pass_data_predcom): Remove TODO_update_ssa_only_virtuals
from todo_flags_finish.

c++: preserve BASELINK from lookup [PR91706]

In the earlier patch for PR91706 I fixed the BASELINK built by
baselink_for_fns, but since we already had one from lookup, we should keep
that one around instead of stripping it. The removed hunk in
get_class_binding was a weirdly large amount of code to decide whether to
pull out BASELINK_FUNCTIONS.

gcc/cp/ChangeLog:

PR c++/91706
* name-lookup.c (get_class_binding): Keep a BASELINK.
(set_inherited_value_binding_p): Adjust.
* lambda.c (is_lambda_ignored_entity): Adjust.
* pt.c (lookup_template_function): Copy a BASELINK before
modifying it.

c++: alias with same name as base fn [PR91706]

This is a bit complex.  Looking up c<T> in the definition of D::c finds
C::c, OK.  Looking up c in the definition of E finds D::c, OK.  Since the
alias is not dependent, we strip it from the template argument, leaving

using E = A<decltype(c<T>())>;

where 'c' still refers to C::c.  But instantiating E looks up 'c' again and
finds D::c, which isn't a function, and sadness ensues.

I think the bug here is looking up 'c' in D at instantiation time; the
declaration we found before is not dependent.  This seems to happen because
baselink_for_fns gets BASELINK_BINFO wrong; it is supposed to be the base
where lookup found the functions, C in this case.

gcc/cp/ChangeLog:

PR c++/91706
* semantics.c (baselink_for_fns): Fix BASELINK_BINFO.

gcc/testsuite/ChangeLog:

PR c++/91706
* g++.dg/template/lookup17.C: New test.

c++: fix modules binfo merging

My coming fix for PR91706 caused some regressions in the modules testsuite.
This turned out to be because the change to properly use the base subobject
BINFO as BASELINK_BINFO hit problems with the code for merging binfos.  The
tree reader needed a typo fix.  The duplicate_hash function was crashing on
the BINFO for a variadic base in <variant>.  I started fixing the hash
function, but then noticed that there's no ::equal function defined;
duplicate_hash just uses pointer equality, so we might as well also
use the normal pointer hash for the moment.

gcc/cp/ChangeLog:

* module.cc (duplicate_hash::hash): Comment out.
(trees_in::tree_value): Adjust loop counter.

c++: alias member template [PR100102]

Patrick already fixed the primary cause of this bug. But while I was
looking at this testcase I noticed that with the qualified name k::o we
ended up with a plain FUNCTION_DECL, whereas without the k:: we got a
BASELINK. There seems to be no good reason not to return the BASELINK
in this case as well.

PR c++/100102

gcc/cp/ChangeLog:

* init.c (build_offset_ref): Return the BASELINK for a static
member function.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-73.C: New test.

Daily bump.

Implement a sparse bitmap representation for Ranger's on-entry cache.

Use a sparse representation for the on entry cache, and utilize it when
the number of basic blocks in the function exceeds param_evrp_sparse_threshold.

PR tree-optimization/100299
* gimple-range-cache.cc (class sbr_sparse_bitmap): New.
(sbr_sparse_bitmap::sbr_sparse_bitmap): New.
(sbr_sparse_bitmap::bitmap_set_quad): New.
(sbr_sparse_bitmap::bitmap_get_quad): New.
(sbr_sparse_bitmap::set_bb_range): New.
(sbr_sparse_bitmap::get_bb_range): New.
(sbr_sparse_bitmap::bb_range_p): New.
(block_range_cache::block_range_cache): Initialize bitmap obstack.
(block_range_cache::~block_range_cache): Destruct obstack.
(block_range_cache::set_bb_range): Decide when to utilize the
sparse on entry cache.
* gimple-range-cache.h (block_range_cache): Add bitmap obstack.
* params.opt (-param=evrp-sparse-threshold): New.

Implement multi-bit aligned accessors for sparse bitmap.

Provide set/get routines to allow sparse bitmaps to be treated as an array
of multiple bit values. Only chunk sizes that are powers of 2 are supported.
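
A toy sketch of the idea, treating a sparse bitmap as an array of
fixed-width chunks. This is not the bitmap.c implementation: the class and
member names are hypothetical, words are kept in a std::map, and chunk
sizes are limited to powers of 2 up to 32 to keep the masking simple.
Because the chunk size divides the 64-bit word size, a chunk never
straddles two words.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Toy sparse bitmap: only words with at least one set bit are stored.
// NOT GCC's bitmap implementation; all names are hypothetical.
class toy_sparse_bitmap
{
public:
  // Store VALUE in chunk number CHUNK, each chunk being CHUNK_SIZE bits.
  // Returns false if CHUNK_SIZE is unsupported or VALUE does not fit.
  bool set_chunk (unsigned chunk, unsigned chunk_size, std::uint64_t value)
  {
    if (!valid_chunk_size (chunk_size))
      return false;
    const std::uint64_t mask = (std::uint64_t (1) << chunk_size) - 1;
    if (value > mask)
      return false;
    const std::uint64_t bit = std::uint64_t (chunk) * chunk_size;
    const std::uint64_t word = bit / 64;
    const unsigned shift = static_cast<unsigned> (bit % 64);
    std::uint64_t bits = get_word (word);
    bits = (bits & ~(mask << shift)) | (value << shift);
    if (bits == 0)
      m_words.erase (word);   // keep the representation sparse
    else
      m_words[word] = bits;
    return true;
  }

  // Read chunk number CHUNK; unset chunks (and bad sizes) read as 0.
  std::uint64_t get_chunk (unsigned chunk, unsigned chunk_size) const
  {
    if (!valid_chunk_size (chunk_size))
      return 0;
    const std::uint64_t mask = (std::uint64_t (1) << chunk_size) - 1;
    const std::uint64_t bit = std::uint64_t (chunk) * chunk_size;
    const unsigned shift = static_cast<unsigned> (bit % 64);
    return (get_word (bit / 64) >> shift) & mask;
  }

  // Number of words actually stored.
  std::size_t stored_words () const { return m_words.size (); }

private:
  // Powers of 2 from 1 to 32 are accepted in this sketch.
  static bool valid_chunk_size (unsigned n)
  {
    return n != 0 && n <= 32 && (n & (n - 1)) == 0;
  }

  std::uint64_t get_word (std::uint64_t index) const
  {
    auto it = m_words.find (index);
    return it == m_words.end () ? std::uint64_t (0) : it->second;
  }

  std::map<std::uint64_t, std::uint64_t> m_words;
};
```

Writing 0 to a chunk clears its bits, so an all-zero word disappears from
the map again and a mostly empty array stays cheap.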

* bitmap.c (bitmap_set_aligned_chunk): New.
(bitmap_get_aligned_chunk): New.
(test_aligned_chunk): New.
(bitmap_c_tests): Call test_aligned_chunk.
* bitmap.h (bitmap_set_aligned_chunk, bitmap_get_aligned_chunk): New.

i386: Add init pattern for V4QI vectors [PR100637]

2021-06-07 Uroš Bizjak <ubizjak@gmail.com>

gcc/
PR target/100637
* config/i386/i386-expand.c (ix86_expand_vector_init_duplicate):
Handle V4QI mode.
(ix86_expand_vector_init_one_nonzero): Ditto.
(ix86_expand_vector_init_one_var): Ditto.
(ix86_expand_vector_init_general): Ditto.
* config/i386/mmx.md (vec_initv4qiqi): New expander.

gcc/testsuite/

PR target/100637
* gcc.target/i386/pr100637-5b.c: New test.
* gcc.target/i386/pr100637-5w.c: Ditto.

x86: Don't compile pr82735-[345].c for x32

Since -mabi=ms isn't compatible with x32, skip pr82735-[345].c for x32.

PR target/82735
* gcc.target/i386/pr82735-3.c: Don't compile for x32.
* gcc.target/i386/pr82735-4.c: Likewise.
* gcc.target/i386/pr82735-5.c: Likewise.

Fix old thinko in warning on pointer for storage order purposes

gcc/c
PR c/100920
* c-typeck.c (convert_for_assignment): Test fndecl_built_in_p to
spot built-in functions.
gcc/testsuite/
* gcc.dg/sso-14.c: Adjust.

c++: access of dtor named by qualified template-id [PR100918]

Here, when resolving the destructor named by Inner<int>::~Inner<int>
(which is valid until C++20) we end up in cp_parser_lookup_name called
indirectly from cp_parser_template_id to look up the name Inner from
the scope Inner<int>.  The lookup naturally finds the injected-class-name,
and because the flag is_template is true, we adjust this lookup result
to the TEMPLATE_DECL Inner.  We then check access of this adjusted
lookup result.  But this access check fails because the lookup scope is
Inner<int> and the context_for_name_lookup for the TEMPLATE_DECL is
Outer (whereas for the injected-class-name it's also Inner<int>).

The simplest fix seems to be to check access of the original lookup
result (the injected-class-name) instead of the adjusted result (the
TEMPLATE_DECL).  So this patch moves the access check in
cp_parser_lookup_name to before the injected-class-name adjustment.

PR c++/100918

gcc/cp/ChangeLog:

* parser.c (cp_parser_lookup_name): Check access of the lookup
result before we potentially adjust an injected-class-name to
its TEMPLATE_DECL.

gcc/testsuite/ChangeLog:

* g++.dg/template/access38.C: New test.

libstdc++: add missing typename for dependent type in ranges::elements_view [PR100900]

Clang complains about the missing typename. I believe it's not required
in a more complete implementation of C++, but it's nicer to support
less complete implementations.
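
For illustration, a small sketch of a dependent nested type that is
spelled with typename. The helper below is hypothetical and is not the
libstdc++ code. My assumption is that "more complete implementation"
refers to the C++20 rules that let typename be omitted in some contexts;
writing it out works either way.

```cpp
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical helper for illustration; not the elements_view code.
template <typename Range>
struct range_iter_category
{
  using iterator = decltype (std::begin (std::declval<Range &> ()));

  // iterator_category depends on Range, so 'typename' tells the compiler
  // that the qualified name is a type.
  using type = typename std::iterator_traits<iterator>::iterator_category;
};

static_assert (std::is_same<range_iter_category<std::vector<int>>::type,
                            std::random_access_iterator_tag>::value,
               "vector iterators are random access");
```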

PR libstdc++/100900

libstdc++-v3/ChangeLog:

* include/std/ranges (elements_view::__iter_cat::_S_iter_cat):
Add missing typename.

x86: Update g++.target/i386/pr100885.C

Since long is 32 bits for x32, update g++.target/i386/pr100885.C to cast
__m64 to long long for x32.

PR target/100885
* g++.target/i386/pr100885.C (_mm_set_epi64): Cast __m64 to long
long.

libstdc++: Constrain three-way comparison for std::optional [PR 98842]

The operator<=>(const optional<T>&, const U&) operator is supposed to be
constrained with three_way_comparable_with<U, T> so that it can only be
used when T and U are weakly-equality-comparable and also three-way
comparable.

Adding that constraint completely breaks std::optional comparisons,
because it causes constraint recursion. To avoid that, an additional
check that U is not a specialization of std::optional is needed. That
appears to be a defect in the standard and should be reported to LWG.
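
The extra check can be sketched with a small trait that detects
specializations of std::optional. The trait name below is hypothetical and
is not the one used in libstdc++. It needs C++17 for std::optional.

```cpp
#include <optional>
#include <type_traits>

// Hypothetical trait: true only for specializations of std::optional.
// cv-qualified or reference types are not stripped in this sketch.
template <typename T>
struct is_optional : std::false_type {};

template <typename T>
struct is_optional<std::optional<T>> : std::true_type {};

template <typename T>
constexpr bool is_optional_v = is_optional<T>::value;

static_assert (is_optional_v<std::optional<int>>, "optional is detected");
static_assert (!is_optional_v<int>, "plain int is not an optional");
```

A mixed comparison between optional<T> and U can then require
!is_optional_v<U> in addition to the three-way comparability constraint,
so that checking the constraint does not recurse into the same operator.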

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/98842
* include/std/optional (operator<=>(const optional<T>&, const U&)):
Add missing constraint and add workaround for template
recursion.
* testsuite/20_util/optional/relops/three_way.cc: Check that
type without equality comparison cannot be compared when wrapped
in std::optional.

Use moves to eliminate redundant test/compare instructions

gcc/

* config/h8300/movepush.md: Change most _clobber_flags
patterns to instead use <cczn> subst.
(movsi_cczn): New pattern with usable CC cases split out.
(movsi_h8sx_cczn): Likewise.

Reformat target.def for better parsing.

gcc/c-family/ChangeLog:

* c-target.def: Split long lines and replace them
with '\n\'.

gcc/ChangeLog:

* common/common-target.def: Split long lines and replace them
with '\n\'.
* target.def: Likewise.
* doc/tm.texi: Re-generated.

For obj-c stage-final re-use the checksum from the previous stage

This silences the stage compare.

gcc/objc:
2021-06-07 Bernd Edlinger <bernd.edlinger@softing.com>

* Make-lang.in (cc1obj-checksum.c): For stage-final re-use
the checksum from the previous stage.

gcc/objcp:
2021-06-07 Bernd Edlinger <bernd.edlinger@softing.com>

* Make-lang.in (cc1objplus-checksum.c): For stage-final re-use
the checksum from the previous stage.