liuhongt [Tue, 9 Apr 2019 06:38:33 +0000 (14:38 +0800)]
AVX512FP16: Add scalar fma instructions.
Add vfmadd[132,213,231]sh/vfnmadd[132,213,231]sh/
vfmsub[132,213,231]sh/vfnmsub[132,213,231]sh.
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h (_mm_fmadd_sh):
New intrinsic.
(_mm_mask_fmadd_sh): Likewise.
(_mm_mask3_fmadd_sh): Likewise.
(_mm_maskz_fmadd_sh): Likewise.
(_mm_fmadd_round_sh): Likewise.
(_mm_mask_fmadd_round_sh): Likewise.
(_mm_mask3_fmadd_round_sh): Likewise.
(_mm_maskz_fmadd_round_sh): Likewise.
(_mm_fnmadd_sh): Likewise.
(_mm_mask_fnmadd_sh): Likewise.
(_mm_mask3_fnmadd_sh): Likewise.
(_mm_maskz_fnmadd_sh): Likewise.
(_mm_fnmadd_round_sh): Likewise.
(_mm_mask_fnmadd_round_sh): Likewise.
(_mm_mask3_fnmadd_round_sh): Likewise.
(_mm_maskz_fnmadd_round_sh): Likewise.
(_mm_fmsub_sh): Likewise.
(_mm_mask_fmsub_sh): Likewise.
(_mm_mask3_fmsub_sh): Likewise.
(_mm_maskz_fmsub_sh): Likewise.
(_mm_fmsub_round_sh): Likewise.
(_mm_mask_fmsub_round_sh): Likewise.
(_mm_mask3_fmsub_round_sh): Likewise.
(_mm_maskz_fmsub_round_sh): Likewise.
(_mm_fnmsub_sh): Likewise.
(_mm_mask_fnmsub_sh): Likewise.
(_mm_mask3_fnmsub_sh): Likewise.
(_mm_maskz_fnmsub_sh): Likewise.
(_mm_fnmsub_round_sh): Likewise.
(_mm_mask_fnmsub_round_sh): Likewise.
(_mm_mask3_fnmsub_round_sh): Likewise.
(_mm_maskz_fnmsub_round_sh): Likewise.
* config/i386/i386-builtin-types.def
(V8HF_FTYPE_V8HF_V8HF_V8HF_UQI_INT): New builtin type.
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/i386-expand.c: Handle new builtin type.
* config/i386/sse.md (fmai_vmfmadd_<mode><round_name>):
Ajdust to support FP16.
(fmai_vmfmsub_<mode><round_name>): Ditto.
(fmai_vmfnmadd_<mode><round_name>): Ditto.
(fmai_vmfnmsub_<mode><round_name>): Ditto.
(*fmai_fmadd_<mode>): Ditto.
(*fmai_fmsub_<mode>): Ditto.
(*fmai_fnmadd_<mode><round_name>): Ditto.
(*fmai_fnmsub_<mode><round_name>): Ditto.
(avx512f_vmfmadd_<mode>_mask<round_name>): Ditto.
(avx512f_vmfmadd_<mode>_mask3<round_name>): Ditto.
(avx512f_vmfmadd_<mode>_maskz<round_expand_name>): Ditto.
(avx512f_vmfmadd_<mode>_maskz_1<round_name>): Ditto.
(*avx512f_vmfmsub_<mode>_mask<round_name>): Ditto.
(avx512f_vmfmsub_<mode>_mask3<round_name>): Ditto.
(*avx512f_vmfmsub_<mode>_maskz_1<round_name>): Ditto.
(*avx512f_vmfnmsub_<mode>_mask<round_name>): Ditto.
(*avx512f_vmfnmsub_<mode>_mask3<round_name>): Ditto.
(*avx512f_vmfnmsub_<mode>_mask<round_name>): Ditto.
(*avx512f_vmfnmadd_<mode>_mask<round_name>): Renamed to ...
(avx512f_vmfnmadd_<mode>_mask<round_name>) ... this, and
adjust to support FP16.
(avx512f_vmfnmadd_<mode>_mask3<round_name>): Ditto.
(avx512f_vmfnmadd_<mode>_maskz_1<round_name>): Ditto.
(avx512f_vmfnmadd_<mode>_maskz<round_expand_name>): New
expander.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add test for new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Add test for new intrinsics.
* gcc.target/i386/sse-22.c: Ditto.
H.J. Lu [Fri, 5 Apr 2019 21:57:41 +0000 (14:57 -0700)]
AVX512FP16: Enable FP16 mask load/store.
gcc/ChangeLog:
* config/i386/sse.md (avx512fmaskmodelower): Extend to support
HF modes.
(maskload<mode><avx512fmaskmodelower>): Ditto.
(maskstore<mode><avx512fmaskmodelower>): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-xorsign-1.c: New test.
liuhongt [Mon, 2 Mar 2020 09:41:56 +0000 (17:41 +0800)]
AVX512FP16: Add testcase for fp16 bitwise operations.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-neg-1a.c: New test.
* gcc.target/i386/avx512fp16-neg-1b.c: Ditto.
* gcc.target/i386/avx512fp16-scalar-bitwise-1a.c: Ditto.
* gcc.target/i386/avx512fp16-scalar-bitwise-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vector-bitwise-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vector-bitwise-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-neg-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-neg-1b.c: Ditto.
H.J. Lu [Wed, 3 Apr 2019 04:44:55 +0000 (21:44 -0700)]
AVX512FP16: Add scalar/vector bitwise operations, including
1. FP16 vector xor/ior/and/andnot/abs/neg
2. FP16 scalar abs/neg/copysign/xorsign
gcc/ChangeLog:
* config/i386/i386-expand.c (ix86_expand_fp_absneg_operator):
Handle HFmode.
(ix86_expand_copysign): Ditto.
(ix86_expand_xorsign): Ditto.
* config/i386/i386.c (ix86_build_const_vector): Handle HF vector
modes.
(ix86_build_signbit_mask): Ditto.
(ix86_can_change_mode_class): Ditto.
* config/i386/i386.md
(SSEMODEF): Add HFmode.
(ssevecmodef): Ditto.
(<code>hf2): New define_expand.
(*<code>hf2_1): New define_insn_and_split.
(copysign<mode>): Extend to support HFmode under AVX512FP16.
(xorsign<mode>): Ditto.
* config/i386/sse.md (VFB): New mode iterator.
(VFB_128_256): Ditto.
(VFB_512): Ditto.
(sseintvecmode2): Support HF vector mode.
(<code><mode>2): Use new mode iterator.
(*<code><mode>2): Ditto.
(copysign<mode>3): Ditto.
(xorsign<mode>3): Ditto.
(<code><mode>3<mask_name>): Ditto.
(<code><mode>3<mask_name>): Ditto.
(<sse>_andnot<mode>3<mask_name>): Adjust for HF vector mode.
(<sse>_andnot<mode>3<mask_name>): Ditto.
(*<code><mode>3<mask_name>): Ditto.
(*<code><mode>3<mask_name>): Ditto.
liuhongt [Mon, 2 Mar 2020 09:31:47 +0000 (17:31 +0800)]
AVX512FP16: Add testcase for fma instructions
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-vfmaddXXXph-1a.c: New test.
* gcc.target/i386/avx512fp16-vfmaddXXXph-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vfmsubXXXph-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vfmsubXXXph-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vfnmaddXXXph-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vfnmaddXXXph-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vfnmsubXXXph-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vfnmsubXXXph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfmaddXXXph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfmaddXXXph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfmsubXXXph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfmsubXXXph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfnmaddXXXph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfnmaddXXXph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfnmsubXXXph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfnmsubXXXph-1b.c: Ditto.
liuhongt [Thu, 21 Mar 2019 03:25:58 +0000 (11:25 +0800)]
AVX512FP16: Add FP16 fma instructions.
Add vfmadd[132,213,231]ph/vfnmadd[132,213,231]ph/vfmsub[132,213,231]ph/
vfnmsub[132,213,231]ph.
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h (_mm512_mask_fmadd_ph):
New intrinsic.
(_mm512_mask3_fmadd_ph): Likewise.
(_mm512_maskz_fmadd_ph): Likewise.
(_mm512_fmadd_round_ph): Likewise.
(_mm512_mask_fmadd_round_ph): Likewise.
(_mm512_mask3_fmadd_round_ph): Likewise.
(_mm512_maskz_fmadd_round_ph): Likewise.
(_mm512_fnmadd_ph): Likewise.
(_mm512_mask_fnmadd_ph): Likewise.
(_mm512_mask3_fnmadd_ph): Likewise.
(_mm512_maskz_fnmadd_ph): Likewise.
(_mm512_fnmadd_round_ph): Likewise.
(_mm512_mask_fnmadd_round_ph): Likewise.
(_mm512_mask3_fnmadd_round_ph): Likewise.
(_mm512_maskz_fnmadd_round_ph): Likewise.
(_mm512_fmsub_ph): Likewise.
(_mm512_mask_fmsub_ph): Likewise.
(_mm512_mask3_fmsub_ph): Likewise.
(_mm512_maskz_fmsub_ph): Likewise.
(_mm512_fmsub_round_ph): Likewise.
(_mm512_mask_fmsub_round_ph): Likewise.
(_mm512_mask3_fmsub_round_ph): Likewise.
(_mm512_maskz_fmsub_round_ph): Likewise.
(_mm512_fnmsub_ph): Likewise.
(_mm512_mask_fnmsub_ph): Likewise.
(_mm512_mask3_fnmsub_ph): Likewise.
(_mm512_maskz_fnmsub_ph): Likewise.
(_mm512_fnmsub_round_ph): Likewise.
(_mm512_mask_fnmsub_round_ph): Likewise.
(_mm512_mask3_fnmsub_round_ph): Likewise.
(_mm512_maskz_fnmsub_round_ph): Likewise.
* config/i386/avx512fp16vlintrin.h (_mm256_fmadd_ph):
New intrinsic.
(_mm256_mask_fmadd_ph): Likewise.
(_mm256_mask3_fmadd_ph): Likewise.
(_mm256_maskz_fmadd_ph): Likewise.
(_mm_fmadd_ph): Likewise.
(_mm_mask_fmadd_ph): Likewise.
(_mm_mask3_fmadd_ph): Likewise.
(_mm_maskz_fmadd_ph): Likewise.
(_mm256_fnmadd_ph): Likewise.
(_mm256_mask_fnmadd_ph): Likewise.
(_mm256_mask3_fnmadd_ph): Likewise.
(_mm256_maskz_fnmadd_ph): Likewise.
(_mm_fnmadd_ph): Likewise.
(_mm_mask_fnmadd_ph): Likewise.
(_mm_mask3_fnmadd_ph): Likewise.
(_mm_maskz_fnmadd_ph): Likewise.
(_mm256_fmsub_ph): Likewise.
(_mm256_mask_fmsub_ph): Likewise.
(_mm256_mask3_fmsub_ph): Likewise.
(_mm256_maskz_fmsub_ph): Likewise.
(_mm_fmsub_ph): Likewise.
(_mm_mask_fmsub_ph): Likewise.
(_mm_mask3_fmsub_ph): Likewise.
(_mm_maskz_fmsub_ph): Likewise.
(_mm256_fnmsub_ph): Likewise.
(_mm256_mask_fnmsub_ph): Likewise.
(_mm256_mask3_fnmsub_ph): Likewise.
(_mm256_maskz_fnmsub_ph): Likewise.
(_mm_fnmsub_ph): Likewise.
(_mm_mask_fnmsub_ph): Likewise.
(_mm_mask3_fnmsub_ph): Likewise.
(_mm_maskz_fnmsub_ph): Likewise.
* config/i386/i386-builtin.def: Add corresponding new builtins.
* config/i386/sse.md
(<avx512>_fmadd_<mode>_maskz<round_expand_name>): Adjust to
support HF vector modes.
(<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name><round_name>):
Ditto.
(*<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name>_bcst_1): Ditto.
(*<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name>_bcst_2): Ditto.
(*<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name>_bcst_3): Ditto.
(<avx512>_fmadd_<mode>_mask<round_name>): Ditto.
(<avx512>_fmadd_<mode>_mask3<round_name>): Ditto.
(<avx512>_fmsub_<mode>_maskz<round_expand_name>): Ditto.
(<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name><round_name>):
Ditto.
(*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_1): Ditto.
(*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_2): Ditto.
(*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_3): Ditto.
(<avx512>_fmsub_<mode>_mask<round_name>): Ditto.
(<avx512>_fmsub_<mode>_mask3<round_name>): Ditto.
(<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name><round_name>):
Ditto.
(*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_1): Ditto.
(*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_2): Ditto.
(*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_3): Ditto.
(<avx512>_fnmadd_<mode>_mask<round_name>): Ditto.
(<avx512>_fnmadd_<mode>_mask3<round_name>): Ditto.
(<avx512>_fnmsub_<mode>_maskz<round_expand_name>): Ditto.
(<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name><round_name>):
Ditto.
(*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_1): Ditto.
(*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_2): Ditto.
(*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_3): Ditto.
(<avx512>_fnmsub_<mode>_mask<round_name>): Ditto.
(<avx512>_fnmsub_<mode>_mask3<round_name>): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add test for new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Add test fot new intrinsics.
* gcc.target/i386/sse-22.c: Ditto.
liuhongt [Mon, 2 Mar 2020 09:29:15 +0000 (17:29 +0800)]
AVX512FP16: Add testcase for vfmaddsub[132,213,231]ph/vfmsubadd[132,213,231]ph.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-vfmaddsubXXXph-1a.c: New test.
* gcc.target/i386/avx512fp16-vfmaddsubXXXph-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vfmsubaddXXXph-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vfmsubaddXXXph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfmaddsubXXXph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfmaddsubXXXph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfmsubaddXXXph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfmsubaddXXXph-1b.c: Ditto.
liuhongt [Wed, 20 Mar 2019 05:24:58 +0000 (13:24 +0800)]
AVX512FP16: Add vfmaddsub[132,213,231]ph/vfmsubadd[132,213,231]ph.
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h (_mm512_fmaddsub_ph):
New intrinsic.
(_mm512_mask_fmaddsub_ph): Likewise.
(_mm512_mask3_fmaddsub_ph): Likewise.
(_mm512_maskz_fmaddsub_ph): Likewise.
(_mm512_fmaddsub_round_ph): Likewise.
(_mm512_mask_fmaddsub_round_ph): Likewise.
(_mm512_mask3_fmaddsub_round_ph): Likewise.
(_mm512_maskz_fmaddsub_round_ph): Likewise.
(_mm512_mask_fmsubadd_ph): Likewise.
(_mm512_mask3_fmsubadd_ph): Likewise.
(_mm512_maskz_fmsubadd_ph): Likewise.
(_mm512_fmsubadd_round_ph): Likewise.
(_mm512_mask_fmsubadd_round_ph): Likewise.
(_mm512_mask3_fmsubadd_round_ph): Likewise.
(_mm512_maskz_fmsubadd_round_ph): Likewise.
* config/i386/avx512fp16vlintrin.h (_mm256_fmaddsub_ph):
New intrinsic.
(_mm256_mask_fmaddsub_ph): Likewise.
(_mm256_mask3_fmaddsub_ph): Likewise.
(_mm256_maskz_fmaddsub_ph): Likewise.
(_mm_fmaddsub_ph): Likewise.
(_mm_mask_fmaddsub_ph): Likewise.
(_mm_mask3_fmaddsub_ph): Likewise.
(_mm_maskz_fmaddsub_ph): Likewise.
(_mm256_fmsubadd_ph): Likewise.
(_mm256_mask_fmsubadd_ph): Likewise.
(_mm256_mask3_fmsubadd_ph): Likewise.
(_mm256_maskz_fmsubadd_ph): Likewise.
(_mm_fmsubadd_ph): Likewise.
(_mm_mask_fmsubadd_ph): Likewise.
(_mm_mask3_fmsubadd_ph): Likewise.
(_mm_maskz_fmsubadd_ph): Likewise.
* config/i386/i386-builtin.def: Add corresponding new builtins.
* config/i386/sse.md (VFH_SF_AVX512VL): New mode iterator.
* (<avx512>_fmsubadd_<mode>_maskz<round_expand_name>): New expander.
* (<avx512>_fmaddsub_<mode>_maskz<round_expand_name>): Use
VFH_SF_AVX512VL.
* (<sd_mask_codefor>fma_fmaddsub_<mode><sd_maskz_name><round_name>):
Ditto.
* (<avx512>_fmaddsub_<mode>_mask<round_name>): Ditto.
* (<avx512>_fmaddsub_<mode>_mask3<round_name>): Ditto.
* (<sd_mask_codefor>fma_fmsubadd_<mode><sd_maskz_name><round_name>):
Ditto.
* (<avx512>_fmsubadd_<mode>_mask<round_name>): Ditto.
* (<avx512>_fmsubadd_<mode>_mask3<round_name>): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add test for new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Add test for new intrinsics.
* gcc.target/i386/sse-22.c: Ditto.
liuhongt [Sat, 18 Sep 2021 04:14:32 +0000 (12:14 +0800)]
Support embedded broadcast for AVX512FP16 instructions.
gcc/ChangeLog:
PR target/87767
* config/i386/i386.c (ix86_print_operand): Handle
V8HF/V16HF/V32HFmode.
* config/i386/i386.h (VALID_BCST_MODE_P): Add HFmode.
* config/i386/sse.md (avx512bcst): Remove.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-broadcast-1.c: New test.
* gcc.target/i386/avx512fp16-broadcast-2.c: New test.
Jason Merrill [Fri, 17 Sep 2021 18:18:55 +0000 (14:18 -0400)]
c++: improve lookup of member-qualified names
I've been working on the resolution of CWG1835 by P1787, which among many
other things clarified that a name after -> or . is looked up first in the
class of the object expression even if it's dependent. This patch does not
make that change; this is a smaller change extracted from that work in
progress to make the lookup in the object type work better in cases where
unqualified lookup doesn't find anything.
Basically, if we see "t.foo::" we know that looking up foo in t needs to
find a type, so we build an implicit TYPENAME_TYPE for it.
This also implements the change from P1787 to assume that a name followed by
< in a type-only context names a template, since the less-than operator
can't appear in a type context. This makes some of the lines in dtor11.C
work.
I introduce the predicate 'dependentish_scope_p' for the case where the
current instantiation has dependent bases, so even though we can perform
name lookup, we can't conclude that a lookup failure is conclusive.
gcc/cp/ChangeLog:
* cp-tree.h (dependentish_scope_p): Declare.
* pt.c (dependentish_scope_p): New.
* parser.c (cp_parser_lookup_name): Return a TYPENAME_TYPE
for lookup of a type in a dependent object.
(cp_parser_template_id): Handle TYPENAME_TYPE.
(cp_parser_template_name): If we're looking for a type,
a name followed by < names a template.
gcc/testsuite/ChangeLog:
* g++.dg/template/dtor5.C: Adjust expected error.
* g++.dg/cpp23/lookup2.C: New test.
* g++.dg/template/dtor11.C: New test.
Jason Merrill [Fri, 17 Sep 2021 17:55:42 +0000 (13:55 -0400)]
c++: fix comment typo
gcc/cp/ChangeLog:
* cp-tree.h: Fix typo in LANG_FLAG list.
GCC Administrator [Sat, 18 Sep 2021 00:16:36 +0000 (00:16 +0000)]
Daily bump.
Martin Sebor [Fri, 17 Sep 2021 21:39:13 +0000 (15:39 -0600)]
Factor predidacte analysis out of tree-ssa-uninit.c into its own module.
gcc/ChangeLog:
* Makefile.in (OBJS): Add gimple-predicate-analysis.o.
* tree-ssa-uninit.c (max_phi_args): Move to gimple-predicate-analysis.
(MASK_SET_BIT, MASK_TEST_BIT, MASK_EMPTY): Same.
(check_defs): Add comment.
(can_skip_redundant_opnd): Update comment.
(compute_uninit_opnds_pos): Adjust to namespace change.
(find_pdom): Move to gimple-predicate-analysis.cc.
(find_dom): Same.
(struct uninit_undef_val_t): New.
(is_non_loop_exit_postdominating): Move to gimple-predicate-analysis.cc.
(find_control_equiv_block): Same.
(MAX_NUM_CHAINS, MAX_CHAIN_LEN, MAX_POSTDOM_CHECK): Same.
(MAX_SWITCH_CASES): Same.
(compute_control_dep_chain): Same.
(find_uninit_use): Use predicate analyzer.
(struct pred_info): Move to gimple-predicate-analysis.
(convert_control_dep_chain_into_preds): Same.
(find_predicates): Same.
(collect_phi_def_edges): Same.
(warn_uninitialized_phi): Use predicate analyzer.
(find_def_preds): Move to gimple-predicate-analysis.
(dump_pred_info): Same.
(dump_pred_chain): Same.
(dump_predicates): Same.
(destroy_predicate_vecs): Remove.
(execute_late_warn_uninitialized): New.
(get_cmp_code): Move to gimple-predicate-analysis.
(is_value_included_in): Same.
(value_sat_pred_p): Same.
(find_matching_predicate_in_rest_chains): Same.
(is_use_properly_guarded): Same.
(prune_uninit_phi_opnds): Same.
(find_var_cmp_const): Same.
(use_pred_not_overlap_with_undef_path_pred): Same.
(pred_equal_p): Same.
(is_neq_relop_p): Same.
(is_neq_zero_form_p): Same.
(pred_expr_equal_p): Same.
(is_pred_expr_subset_of): Same.
(is_pred_chain_subset_of): Same.
(is_included_in): Same.
(is_superset_of): Same.
(pred_neg_p): Same.
(simplify_pred): Same.
(simplify_preds_2): Same.
(simplify_preds_3): Same.
(simplify_preds_4): Same.
(simplify_preds): Same.
(push_pred): Same.
(push_to_worklist): Same.
(get_pred_info_from_cmp): Same.
(is_degenerated_phi): Same.
(normalize_one_pred_1): Same.
(normalize_one_pred): Same.
(normalize_one_pred_chain): Same.
(normalize_preds): Same.
(can_one_predicate_be_invalidated_p): Same.
(can_chain_union_be_invalidated_p): Same.
(uninit_uses_cannot_happen): Same.
(pass_late_warn_uninitialized::execute): Define.
* gimple-predicate-analysis.cc: New file.
* gimple-predicate-analysis.h: New file.
Harald Anlauf [Fri, 17 Sep 2021 19:45:33 +0000 (21:45 +0200)]
Fortran - (large) arrays in the main shall be static
gcc/fortran/ChangeLog:
PR fortran/102366
* trans-decl.c (gfc_finish_var_decl): Disable the warning message
for variables moved from stack to static storange if they are
declared in the main, but allow the move to happen.
gcc/testsuite/ChangeLog:
PR fortran/102366
* gfortran.dg/pr102366.f90: New test.
Jonathan Wakely [Fri, 17 Sep 2021 11:28:35 +0000 (12:28 +0100)]
libstdc++: Add 'noexcept' to path::iterator members
All path::iterator operations are non-throwing.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/fs_path.h (path::iterator): Add noexcept to all
member functions and friend functions.
(distance): Add noexcept.
(advance): Add noexcept and inline.
* include/experimental/bits/fs_path.h (path::iterator):
Add noexcept to all member functions.
Jonathan Wakely [Fri, 17 Sep 2021 11:27:02 +0000 (12:27 +0100)]
libstdc++: Fix last std::tuple constructor missing 'constexpr' [PR102270]
Also rename the test so it actually runs.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/102270
* include/std/tuple (_Tuple_impl): Add constexpr to constructor
missed in previous patch.
* testsuite/20_util/tuple/cons/102270.C: Moved to...
* testsuite/20_util/tuple/cons/102270.cc: ...here.
* testsuite/util/testsuite_allocator.h (SimpleAllocator): Add
constexpr to constructor so it can be used for C++20 tests.
Julian Brown [Mon, 29 Jun 2020 20:17:30 +0000 (13:17 -0700)]
openacc: Remove unnecessary barriers (gimple worker partitioning/broadcast)
This is an optimisation for middle-end worker-partitioning support (used
to support multiple workers on AMD GCN). At present, barriers may be
emitted in cases where they aren't needed and cannot be optimised away.
This patch stops the extraneous barriers from being emitted in the
first place.
One exception to the above (where the barrier is still needed) is for
predicated blocks of code that perform a write to gang-private shared
memory from one worker. We must execute a barrier before other workers
read that shared memory location.
gcc/
* config/gcn/gcn.c (gimple.h): Include.
(gcn_fork_join): Emit barrier for worker-level joins.
* omp-oacc-neuter-broadcast.cc (find_local_vars_to_propagate): Add
writes_gang_private bitmap parameter. Set bit for blocks
containing gang-private variable writes.
(worker_single_simple): Don't emit barrier after predicated block.
(worker_single_copy): Don't emit barrier if we're not broadcasting
anything and the block contains no gang-private writes.
(neuter_worker_single): Don't predicate blocks that only contain
NOPs or internal marker functions. Pass has_gang_private_write
argument to worker_single_copy.
(oacc_do_neutering): Add writes_gang_private bitmap handling.
Julian Brown [Mon, 29 Jun 2020 20:16:52 +0000 (13:16 -0700)]
openacc: Shared memory layout optimisation
This patch implements an algorithm to lay out local data-share (LDS)
space. It currently works for AMD GCN. At the moment, LDS is used for
three things:
1. Gang-private variables
2. Reduction temporaries (accumulators)
3. Broadcasting for worker partitioning
After the patch is applied, (2) and (3) are placed at preallocated
locations in LDS, and (1) continues to be handled by the backend (as it
is at present prior to this patch being applied). LDS now looks like this:
+--------------+ (gang-private size + 1024, = 1536)
| free space |
| ... |
| - - - - - - -|
| worker bcast |
+--------------+
| reductions |
+--------------+ <<< -mgang-private-size=<number> (def. 512)
| gang-private |
| vars |
+--------------+ (32)
| low LDS vars |
+--------------+ LDS base
So, gang-private space is fixed at a constant amount at compile time
(which can be increased with a command-line switch if necessary
for some given code). The layout algorithm takes out a slice of the
remainder of usable space for reduction vars, and uses the rest for
worker partitioning.
The partitioning algorithm works as follows.
1. An "adjacency" set is built up for each basic block that might
do a broadcast. This is calculated by starting at each such block,
and doing a recursive DFS walk over successors to find the next
block (or blocks) that *also* does a broadcast
(dfs_broadcast_reachable_1).
2. The adjacency set is inverted to get adjacent predecessor blocks also.
3. Blocks that will perform a broadcast are sorted by size of that
broadcast: the biggest blocks are handled first.
4. A splay tree structure is used to calculate the spans of LDS memory
that are already allocated by the blocks adjacent to this one
(merge_ranges{,_1}.
5. The current block's broadcast space is allocated from the first free
span not allocated in the splay tree structure calculated above
(first_fit_range). This seems to work quite nicely and efficiently
with the splay tree structure.
6. Continue with the next-biggest broadcast block until we're done.
In this way, "adjacent" broadcasts will not use the same piece of
LDS memory.
PR96334 "openacc: Unshare reduction temporaries for GCN" got merged in:
The GCN backend uses tree nodes like MEM((__lds TYPE *) <constant>)
for reduction temporaries. Unlike e.g. var decls and SSA names, these
nodes cannot be shared during gimplification, but are so in some
circumstances. This is detected when appropriate --enable-checking
options are used. This patch unshares such nodes when they are reused
more than once.
gcc/
* config/gcn/gcn-protos.h
(gcn_goacc_create_worker_broadcast_record): Update prototype.
* config/gcn/gcn-tree.c (gcn_goacc_get_worker_red_decl): Use
preallocated block of LDS memory. Do not cache/share decls for
reduction temporaries between invocations.
(gcn_goacc_reduction_teardown): Unshare VAR on second use.
(gcn_goacc_create_worker_broadcast_record): Add OFFSET parameter
and return temporary LDS space at that offset. Return pointer in
"sender" case.
* config/gcn/gcn.c (acc_lds_size, gang_private_hwm, lds_allocs):
New global vars.
(ACC_LDS_SIZE): Define as acc_lds_size.
(gcn_init_machine_status): Don't initialise lds_allocated,
lds_allocs, reduc_decls fields of machine function struct.
(gcn_option_override): Handle default size for gang-private
variables and -mgang-private-size option.
(gcn_expand_prologue): Use LDS_SIZE instead of LDS_SIZE-1 when
initialising M0_REG.
(gcn_shared_mem_layout): New function.
(gcn_print_lds_decl): Update comment. Use global lds_allocs map and
gang_private_hwm variable.
(TARGET_GOACC_SHARED_MEM_LAYOUT): Define target hook.
* config/gcn/gcn.h (machine_function): Remove lds_allocated,
lds_allocs, reduc_decls. Add reduction_base, reduction_limit.
* config/gcn/gcn.opt (gang_private_size_opt): New global.
(mgang-private-size=): New option.
* doc/tm.texi.in (TARGET_GOACC_SHARED_MEM_LAYOUT): Place
documentation hook.
* doc/tm.texi: Regenerate.
* omp-oacc-neuter-broadcast.cc (targhooks.h, diagnostic-core.h):
Add includes.
(build_sender_ref): Handle sender_decl being pointer.
(worker_single_copy): Add PLACEMENT and ISOLATE_BROADCASTS
parameters. Pass placement argument to
create_worker_broadcast_record hook invocations. Handle
sender_decl being pointer and isolate_broadcasts inserting extra
barriers.
(blk_offset_map_t): Add typedef.
(neuter_worker_single): Add BLK_OFFSET_MAP parameter. Pass
preallocated range to worker_single_copy call.
(dfs_broadcast_reachable_1): New function.
(idx_decl_pair_t, used_range_vec_t): New typedefs.
(sort_size_descending): New function.
(addr_range): New class.
(splay_tree_compare_addr_range, splay_tree_free_key)
(first_fit_range, merge_ranges_1, merge_ranges): New functions.
(execute_omp_oacc_neuter_broadcast): Rename to...
(oacc_do_neutering): ... this. Add BOUNDS_LO, BOUNDS_HI
parameters. Arrange layout of shared memory for broadcast
operations.
(execute_omp_oacc_neuter_broadcast): New function.
(pass_omp_oacc_neuter_broadcast::gate): Remove num_workers==1
handling from here. Enable pass for all OpenACC routines in order
to call shared memory-layout hook.
* target.def (create_worker_broadcast_record): Add OFFSET
parameter.
(shared_mem_layout): New hook.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/broadcast-many.c: Update.
Julian Brown [Mon, 29 Jun 2020 20:16:51 +0000 (13:16 -0700)]
openacc: Turn off worker partitioning if num_workers==1
This patch turns off the middle-end worker-partitioning support if the
number of workers for an outlined offload function is one. In that case,
we do not need to perform the broadcasting/neutering code transformation.
gcc/
* omp-oacc-neuter-broadcast.cc
(pass_omp_oacc_neuter_broadcast::gate): Disable if num_workers is
1.
(execute_omp_oacc_neuter_broadcast): Adjust.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
Julian Brown [Mon, 29 Jun 2020 20:16:52 +0000 (13:16 -0700)]
Add 'libgomp.oacc-c-c++-common/broadcast-many.c'
libgomp/
* testsuite/libgomp.oacc-c-c++-common/broadcast-many.c: New test.
Andrew MacLeod [Fri, 17 Sep 2021 13:48:35 +0000 (09:48 -0400)]
Provide a relation oracle for paths.
This provides a path_oracle class which can optionally be used in conjunction
with another oracle to track relations on a path as it is walked.
* value-relation.cc (class equiv_chain): Move to header file.
(path_oracle::path_oracle): New.
(path_oracle::~path_oracle): New.
(path_oracle::register_relation): New.
(path_oracle::query_relation): New.
(path_oracle::reset_path): New.
(path_oracle::dump): New.
* value-relation.h (class equiv_chain): Move to here.
(class path_oracle): New.
Andrew MacLeod [Wed, 15 Sep 2021 18:43:51 +0000 (14:43 -0400)]
Virtualize relation oracle and various cleanups.
Standardize equiv_oracle API onto the new relation_oracle virtual base, and
then have dom_oracle inherit from that.
equiv_set always returns an equivalency set now, never NULL.
EQ_EXPR requires symmetry now. Each SSA name must be in the other equiv set.
Shuffle some routines around, simplify.
* gimple-range-cache.cc (ranger_cache::ranger_cache): Create a DOM
based oracle.
* gimple-range-fold.cc (fur_depend::register_relation): Use
register_stmt/edge routines.
* value-relation.cc (equiv_chain::find): Relocate from equiv_oracle.
(equiv_oracle::equiv_oracle): Create self equivalence cache.
(equiv_oracle::~equiv_oracle): Release same.
(equiv_oracle::equiv_set): Return entry from self equiv cache if there
are no equivalences.
(equiv_oracle::find_equiv_block): Move list find to equiv_chain.
(equiv_oracle::register_relation): Rename from register_equiv.
(relation_chain_head::find_relation): Relocate from dom_oracle.
(relation_oracle::register_stmt): New.
(relation_oracle::register_edge): New.
(dom_oracle::*): Rename from relation_oracle.
(dom_oracle::register_relation): Adjust to call equiv_oracle.
(dom_oracle::set_one_relation): Split from register_relation.
(dom_oracle::register_transitives): Consolidate 2 methods.
(dom_oracle::find_relation_block): Move core to relation_chain.
(dom_oracle::query_relation): Rename from find_relation_dom and adjust.
* value-relation.h (class relation_oracle): New pure virtual base.
(class equiv_oracle): Inherit from relation_oracle and adjust.
(class dom_oracle): Rename from old relation_oracle and adjust.
qing zhao [Fri, 17 Sep 2021 17:33:47 +0000 (10:33 -0700)]
testsuite: Fix gcc.target/i386/auto-init-* tests.
This set of tests failed on many different combination of -march, -mtune.
some of them failed with -fstack-protestor-all, or -mno-sse. And the
pattern matches are also different on lp64 or ia32.
The reason for these failures is that the RTL or assembly level patten
matches are only valid for -march=x86-64 -mtune=generic.
We restrict the testing only for -march=x86-64 and -mtune=generic. Also
add -fno-stack-protector or -msse for some of the testing cases.
gcc/testsuite/ChangeLog:
2021-09-17 qing zhao <qing.zhao@oracle.com>
* gcc.target/i386/auto-init-1.c: Restrict the testing only for
-march=x86-64 and -mtune=generic. Add -fno-stack-protector.
* gcc.target/i386/auto-init-2.c: Restrict the testing only for
-march=x86-64 and -mtune=generic -msse.
* gcc.target/i386/auto-init-3.c: Likewise.
* gcc.target/i386/auto-init-4.c: Likewise.
* gcc.target/i386/auto-init-5.c: Different pattern match for lp64 and
ia32.
* gcc.target/i386/auto-init-6.c: Restrict the testing only for
-march=x86-64 and -mtune-generic -msse. Add -fno-stack-protector.
* gcc.target/i386/auto-init-7.c: Likewise.
* gcc.target/i386/auto-init-8.c: Restrict the testing only for
-march=x86-64 and -mtune=generic -msse..
* gcc.target/i386/auto-init-padding-1.c: Likewise.
* gcc.target/i386/auto-init-padding-10.c: Likewise.
* gcc.target/i386/auto-init-padding-11.c: Likewise.
* gcc.target/i386/auto-init-padding-12.c: Likewise.
* gcc.target/i386/auto-init-padding-2.c: Likewise.
* gcc.target/i386/auto-init-padding-3.c: Restrict the testing only for
-march=x86-64. Different pattern match for lp64 and ia32.
* gcc.target/i386/auto-init-padding-4.c: Restrict the testing only for
-march=x86-64 and -mtune-generic -msse.
* gcc.target/i386/auto-init-padding-5.c: Likewise.
* gcc.target/i386/auto-init-padding-6.c: Likewise.
* gcc.target/i386/auto-init-padding-7.c: Restrict the testing only for
-march=x86-64 and -mtune-generic -msse. Add -fno-stack-protector.
* gcc.target/i386/auto-init-padding-8.c: Likewise.
* gcc.target/i386/auto-init-padding-9.c: Restrict the testing only for
-march=x86-64. Different pattern match for lp64 and ia32.
Martin Sebor [Fri, 17 Sep 2021 16:36:54 +0000 (10:36 -0600)]
Better handle MIN/MAX_EXPR of unrelated objects [PR102200].
Resolves:
PR middle-end/102200 - ICE on a min of a decl and pointer in a loop
gcc/ChangeLog:
PR middle-end/102200
* pointer-query.cc (access_ref::inform_access): Handle MIN/MAX_EXPR.
(handle_min_max_size): Change argument. Store original SSA_NAME for
operands to potentially distinct (sub)objects.
(compute_objsize_r): Adjust call to the above.
gcc/testsuite/ChangeLog:
PR middle-end/102200
* gcc.dg/Wstringop-overflow-62.c: Adjust text of an expected note.
* gcc.dg/Warray-bounds-89.c: New test.
* gcc.dg/Wstringop-overflow-74.c: New test.
* gcc.dg/Wstringop-overflow-75.c: New test.
* gcc.dg/Wstringop-overflow-76.c: New test.
Bill Schmidt [Fri, 17 Sep 2021 15:47:27 +0000 (10:47 -0500)]
rs6000: Support for vectorizing built-in functions
This patch just duplicates a couple of functions and adjusts them to use the
new builtin names. There's no logical change otherwise.
2021-09-17 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000.c (rs6000-builtins.h): New include.
(rs6000_new_builtin_vectorized_function): New function.
(rs6000_new_builtin_md_vectorized_function): Likewise.
(rs6000_builtin_vectorized_function): Call
rs6000_new_builtin_vectorized_function.
(rs6000_builtin_md_vectorized_function): Call
rs6000_new_builtin_md_vectorized_function.
Bill Schmidt [Fri, 17 Sep 2021 15:32:59 +0000 (10:32 -0500)]
rs6000: Handle some recent MMA builtin changes
Peter Bergner recently added two new builtins __builtin_vsx_lxvp and
__builtin_vsx_stxvp. These happened to break a pattern in MMA builtins that
I had been using to automate gimple folding of MMA builtins. Previously,
every MMA function that could be folded had an associated internal function
that it was folded into. The LXVP/STXVP builtins are just folded directly
into memory operations.
Instead of relying on this pattern, this patch adds a new attribute to
builtins called "mmaint," which is set for all MMA builtins that have an
associated internal builtin. The naming convention that adds _INTERNAL to
the builtin index name remains.
The rest of the patch is just duplicating Peter's patch, using the new
builtin infrastructure.
2021-09-17 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-builtin-new.def (ASSEMBLE_ACC): Add mmaint flag.
(ASSEMBLE_PAIR): Likewise.
(BUILD_ACC): Likewise.
(DISASSEMBLE_ACC): Likewise.
(DISASSEMBLE_PAIR): Likewise.
(PMXVBF16GER2): Likewise.
(PMXVBF16GER2NN): Likewise.
(PMXVBF16GER2NP): Likewise.
(PMXVBF16GER2PN): Likewise.
(PMXVBF16GER2PP): Likewise.
(PMXVF16GER2): Likewise.
(PMXVF16GER2NN): Likewise.
(PMXVF16GER2NP): Likewise.
(PMXVF16GER2PN): Likewise.
(PMXVF16GER2PP): Likewise.
(PMXVF32GER): Likewise.
(PMXVF32GERNN): Likewise.
(PMXVF32GERNP): Likewise.
(PMXVF32GERPN): Likewise.
(PMXVF32GERPP): Likewise.
(PMXVF64GER): Likewise.
(PMXVF64GERNN): Likewise.
(PMXVF64GERNP): Likewise.
(PMXVF64GERPN): Likewise.
(PMXVF64GERPP): Likewise.
(PMXVI16GER2): Likewise.
(PMXVI16GER2PP): Likewise.
(PMXVI16GER2S): Likewise.
(PMXVI16GER2SPP): Likewise.
(PMXVI4GER8): Likewise.
(PMXVI4GER8PP): Likewise.
(PMXVI8GER4): Likewise.
(PMXVI8GER4PP): Likewise.
(PMXVI8GER4SPP): Likewise.
(XVBF16GER2): Likewise.
(XVBF16GER2NN): Likewise.
(XVBF16GER2NP): Likewise.
(XVBF16GER2PN): Likewise.
(XVBF16GER2PP): Likewise.
(XVF16GER2): Likewise.
(XVF16GER2NN): Likewise.
(XVF16GER2NP): Likewise.
(XVF16GER2PN): Likewise.
(XVF16GER2PP): Likewise.
(XVF32GER): Likewise.
(XVF32GERNN): Likewise.
(XVF32GERNP): Likewise.
(XVF32GERPN): Likewise.
(XVF32GERPP): Likewise.
(XVF64GER): Likewise.
(XVF64GERNN): Likewise.
(XVF64GERNP): Likewise.
(XVF64GERPN): Likewise.
(XVF64GERPP): Likewise.
(XVI16GER2): Likewise.
(XVI16GER2PP): Likewise.
(XVI16GER2S): Likewise.
(XVI16GER2SPP): Likewise.
(XVI4GER8): Likewise.
(XVI4GER8PP): Likewise.
(XVI8GER4): Likewise.
(XVI8GER4PP): Likewise.
(XVI8GER4SPP): Likewise.
(XXMFACC): Likewise.
(XXMTACC): Likewise.
(XXSETACCZ): Likewise.
(ASSEMBLE_PAIR_V): Likewise.
(BUILD_PAIR): Likewise.
(DISASSEMBLE_PAIR_V): Likewise.
(LXVP): New.
(STXVP): New.
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_mma_builtin):
Handle RS6000_BIF_LXVP and RS6000_BIF_STXVP.
* config/rs6000/rs6000-gen-builtins.c (attrinfo): Add ismmaint.
(parse_bif_attrs): Handle ismmaint.
(write_decls): Add bif_mmaint_bit and bif_is_mmaint.
(write_bif_static_init): Handle ismmaint.
Bill Schmidt [Fri, 17 Sep 2021 15:06:22 +0000 (10:06 -0500)]
rs6000: Handle gimple folding of target built-ins
This is another patch that looks bigger than it really is. Because we
have a new namespace for the builtins, allowing us to have both the old
and new builtin infrastructure supported at once, we need versions of
these functions that use the new builtin namespace. Otherwise the code is
unchanged.
2021-09-17 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_builtin): New
forward decl.
(rs6000_gimple_fold_builtin): Call rs6000_gimple_fold_new_builtin.
(rs6000_new_builtin_valid_without_lhs): New function.
(rs6000_gimple_fold_new_mma_builtin): Likewise.
(rs6000_gimple_fold_new_builtin): Likewise.
Thomas Schwinge [Fri, 13 Aug 2021 16:03:38 +0000 (18:03 +0200)]
Fix 'hash_table::expand' to destruct stale Value objects
Thus plugging potentional memory leaks if these have non-trivial
constructor/destructor.
See
<https://stackoverflow.com/questions/6730403/how-to-delete-object-constructed-via-placement-new-operator>
and others.
As one example, compilation of 'g++.dg/warn/Wmismatched-tags.C' per
'valgrind --leak-check=full' improves as follows:
[...]
-
-104 bytes in 1 blocks are definitely lost in loss record 399 of 519
- at 0x483DFAF: realloc (vg_replace_malloc.c:836)
- by 0x223B62C: xrealloc (xmalloc.c:179)
- by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290)
- by 0xA8B373: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::reserve(unsigned int, bool) (vec.h:1858)
- by 0xA8B277: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::safe_push(class_decl_loc_t::class_key_loc_t const&) (vec.h:1967)
- by 0xA57481: class_decl_loc_t::add_or_diag_mismatched_tag(tree_node*, tag_types, bool, bool) (parser.c:32967)
- by 0xA573E1: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32941)
- by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819)
- by 0xA3AD12: cp_parser_elaborated_type_specifier(cp_parser*, bool, bool) (parser.c:20227)
- by 0xA37EF2: cp_parser_type_specifier(cp_parser*, int, cp_decl_specifier_seq*, bool, int*, bool*) (parser.c:18942)
- by 0xA31CDD: cp_parser_decl_specifier_seq(cp_parser*, int, cp_decl_specifier_seq*, int*) (parser.c:15517)
- by 0xA43C71: cp_parser_parameter_declaration(cp_parser*, int, bool, bool*) (parser.c:24242)
-
-168 bytes in 3 blocks are definitely lost in loss record 422 of 519
- at 0x483DFAF: realloc (vg_replace_malloc.c:836)
- by 0x223B62C: xrealloc (xmalloc.c:179)
- by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290)
- by 0xA8B373: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::reserve(unsigned int, bool) (vec.h:1858)
- by 0xA8B277: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::safe_push(class_decl_loc_t::class_key_loc_t const&) (vec.h:1967)
- by 0xA57481: class_decl_loc_t::add_or_diag_mismatched_tag(tree_node*, tag_types, bool, bool) (parser.c:32967)
- by 0xA573E1: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32941)
- by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819)
- by 0xA3AD12: cp_parser_elaborated_type_specifier(cp_parser*, bool, bool) (parser.c:20227)
- by 0xA37EF2: cp_parser_type_specifier(cp_parser*, int, cp_decl_specifier_seq*, bool, int*, bool*) (parser.c:18942)
- by 0xA31CDD: cp_parser_decl_specifier_seq(cp_parser*, int, cp_decl_specifier_seq*, int*) (parser.c:15517)
- by 0xA53385: cp_parser_single_declaration(cp_parser*, vec<deferred_access_check, va_gc, vl_embed>*, bool, bool, bool*) (parser.c:31072)
-
-488 bytes in 7 blocks are definitely lost in loss record 449 of 519
- at 0x483DFAF: realloc (vg_replace_malloc.c:836)
- by 0x223B62C: xrealloc (xmalloc.c:179)
- by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290)
- by 0xA8B373: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::reserve(unsigned int, bool) (vec.h:1858)
- by 0xA8B277: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::safe_push(class_decl_loc_t::class_key_loc_t const&) (vec.h:1967)
- by 0xA57481: class_decl_loc_t::add_or_diag_mismatched_tag(tree_node*, tag_types, bool, bool) (parser.c:32967)
- by 0xA573E1: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32941)
- by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819)
- by 0xA3AD12: cp_parser_elaborated_type_specifier(cp_parser*, bool, bool) (parser.c:20227)
- by 0xA37EF2: cp_parser_type_specifier(cp_parser*, int, cp_decl_specifier_seq*, bool, int*, bool*) (parser.c:18942)
- by 0xA31CDD: cp_parser_decl_specifier_seq(cp_parser*, int, cp_decl_specifier_seq*, int*) (parser.c:15517)
- by 0xA49508: cp_parser_member_declaration(cp_parser*) (parser.c:26440)
-
-728 bytes in 7 blocks are definitely lost in loss record 455 of 519
- at 0x483B7F3: malloc (vg_replace_malloc.c:309)
- by 0x223B63F: xrealloc (xmalloc.c:177)
- by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290)
- by 0xA8B373: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::reserve(unsigned int, bool) (vec.h:1858)
- by 0xA57508: class_decl_loc_t::add_or_diag_mismatched_tag(tree_node*, tag_types, bool, bool) (parser.c:32980)
- by 0xA573E1: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32941)
- by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819)
- by 0xA48BC6: cp_parser_class_head(cp_parser*, bool*) (parser.c:26090)
- by 0xA4674B: cp_parser_class_specifier_1(cp_parser*) (parser.c:25302)
- by 0xA47D76: cp_parser_class_specifier(cp_parser*) (parser.c:25680)
- by 0xA37E27: cp_parser_type_specifier(cp_parser*, int, cp_decl_specifier_seq*, bool, int*, bool*) (parser.c:18912)
- by 0xA31CDD: cp_parser_decl_specifier_seq(cp_parser*, int, cp_decl_specifier_seq*, int*) (parser.c:15517)
-
-832 bytes in 8 blocks are definitely lost in loss record 458 of 519
- at 0x483B7F3: malloc (vg_replace_malloc.c:309)
- by 0x223B63F: xrealloc (xmalloc.c:177)
- by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290)
- by 0xA901ED: bool vec_safe_reserve<class_decl_loc_t::class_key_loc_t, va_heap>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:697)
- by 0xA8F161: void vec_alloc<class_decl_loc_t::class_key_loc_t, va_heap>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int) (vec.h:718)
- by 0xA8D18D: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>::copy() const (vec.h:979)
- by 0xA8B0C3: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::copy() const (vec.h:1824)
- by 0xA896B1: class_decl_loc_t::operator=(class_decl_loc_t const&) (parser.c:32697)
- by 0xA571FD: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32899)
- by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819)
- by 0xA3AD12: cp_parser_elaborated_type_specifier(cp_parser*, bool, bool) (parser.c:20227)
- by 0xA37EF2: cp_parser_type_specifier(cp_parser*, int, cp_decl_specifier_seq*, bool, int*, bool*) (parser.c:18942)
-
-1,144 bytes in 11 blocks are definitely lost in loss record 466 of 519
- at 0x483B7F3: malloc (vg_replace_malloc.c:309)
- by 0x223B63F: xrealloc (xmalloc.c:177)
- by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290)
- by 0xA901ED: bool vec_safe_reserve<class_decl_loc_t::class_key_loc_t, va_heap>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:697)
- by 0xA8F161: void vec_alloc<class_decl_loc_t::class_key_loc_t, va_heap>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int) (vec.h:718)
- by 0xA8D18D: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>::copy() const (vec.h:979)
- by 0xA8B0C3: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::copy() const (vec.h:1824)
- by 0xA896B1: class_decl_loc_t::operator=(class_decl_loc_t const&) (parser.c:32697)
- by 0xA571FD: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32899)
- by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819)
- by 0xA48BC6: cp_parser_class_head(cp_parser*, bool*) (parser.c:26090)
- by 0xA4674B: cp_parser_class_specifier_1(cp_parser*) (parser.c:25302)
-
-1,376 bytes in 10 blocks are definitely lost in loss record 467 of 519
- at 0x483DFAF: realloc (vg_replace_malloc.c:836)
- by 0x223B62C: xrealloc (xmalloc.c:179)
- by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290)
- by 0xA8B373: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::reserve(unsigned int, bool) (vec.h:1858)
- by 0xA8B277: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::safe_push(class_decl_loc_t::class_key_loc_t const&) (vec.h:1967)
- by 0xA57481: class_decl_loc_t::add_or_diag_mismatched_tag(tree_node*, tag_types, bool, bool) (parser.c:32967)
- by 0xA573E1: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32941)
- by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819)
- by 0xA3AD12: cp_parser_elaborated_type_specifier(cp_parser*, bool, bool) (parser.c:20227)
- by 0xA37EF2: cp_parser_type_specifier(cp_parser*, int, cp_decl_specifier_seq*, bool, int*, bool*) (parser.c:18942)
- by 0xA31CDD: cp_parser_decl_specifier_seq(cp_parser*, int, cp_decl_specifier_seq*, int*) (parser.c:15517)
- by 0xA301E0: cp_parser_simple_declaration(cp_parser*, bool, tree_node**) (parser.c:14772)
-
-3,552 bytes in 33 blocks are definitely lost in loss record 483 of 519
- at 0x483B7F3: malloc (vg_replace_malloc.c:309)
- by 0x223B63F: xrealloc (xmalloc.c:177)
- by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290)
- by 0xA901ED: bool vec_safe_reserve<class_decl_loc_t::class_key_loc_t, va_heap>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:697)
- by 0xA8F161: void vec_alloc<class_decl_loc_t::class_key_loc_t, va_heap>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int) (vec.h:718)
- by 0xA8D18D: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>::copy() const (vec.h:979)
- by 0xA8B0C3: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::copy() const (vec.h:1824)
- by 0xA8964A: class_decl_loc_t::class_decl_loc_t(class_decl_loc_t const&) (parser.c:32689)
- by 0xA8F515: hash_table<hash_map<tree_decl_hash, class_decl_loc_t, simple_hashmap_traits<default_hash_traits<tree_decl_hash>, class_decl_loc_t> >::hash_entry, false, xcallocator>::expand() (hash-table.h:839)
- by 0xA8D4B3: hash_table<hash_map<tree_decl_hash, class_decl_loc_t, simple_hashmap_traits<default_hash_traits<tree_decl_hash>, class_decl_loc_t> >::hash_entry, false, xcallocator>::find_slot_with_hash(tree_node* const&, unsigned int, insert_option) (hash-table.h:1008)
- by 0xA8B1DC: hash_map<tree_decl_hash, class_decl_loc_t, simple_hashmap_traits<default_hash_traits<tree_decl_hash>, class_decl_loc_t> >::get_or_insert(tree_node* const&, bool*) (hash-map.h:200)
- by 0xA57128: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32888)
[...]
LEAK SUMMARY:
- definitely lost: 8,440 bytes in 81 blocks
+ definitely lost: 48 bytes in 1 blocks
indirectly lost: 12,529 bytes in 329 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 1,644,376 bytes in 768 blocks
gcc/
* hash-table.h (hash_table<Descriptor, Lazy, Allocator>::expand):
Destruct stale Value objects.
* hash-map-tests.c (test_map_of_type_with_ctor_and_dtor_expand):
Update.
Sandra Loosemore [Fri, 17 Sep 2021 15:20:51 +0000 (08:20 -0700)]
Fortran: Use _Float128 rather than __float128 for c_float128 kind.
The GNU Fortran manual documents that the c_float128 kind corresponds
to __float128, but in fact the implementation uses float128_type_node,
which is _Float128. Both refer to the 128-bit IEEE/ISO encoding, but
some targets including aarch64 only define _Float128 and not __float128,
and do not provide quadmath.h. This caused errors in some test cases
referring to __float128.
This patch changes the documentation (including code comments) and
test cases to use _Float128 to match the implementation.
2021-09-16 Sandra Loosemore <sandra@codesourcery.com>
gcc/fortran/
* intrinsic.texi (ISO_C_BINDING): Change C_FLOAT128 to correspond
to _Float128 rather than __float128.
* iso-c-binding.def (c_float128): Update comments.
* trans-intrinsic.c (gfc_builtin_decl_for_float_kind): Likewise.
(build_round_expr): Likewise.
(gfc_build_intrinsic_lib_fndcecls): Likewise.
* trans-types.h (gfc_real16_is_float128): Likewise.
gcc/testsuite/
* gfortran.dg/PR100914.c: Do not include quadmath.h. Use
_Float128 _Complex instead of __complex128.
* gfortran.dg/PR100914.f90: Add -Wno-pedantic to suppress error
about use of _Float128.
* gfortran.dg/c-interop/typecodes-array-float128-c.c: Use
_Float128 instead of __float128.
* gfortran.dg/c-interop/typecodes-sanity-c.c: Likewise.
* gfortran.dg/c-interop/typecodes-scalar-float128-c.c: Likewise.
* lib/target-supports.exp
(check_effective_target_fortran_real_c_float128): Update comments.
libgfortran/
* ISO_Fortran_binding.h: Update comments.
* runtime/ISO_Fortran_binding.c: Likewise.
Roger Sayle [Fri, 17 Sep 2021 14:57:34 +0000 (15:57 +0100)]
PR c/102245: Disable sign-changing optimization for shifts by zero.
Respecting Jakub's suggestion that it may be better to warn-on-valid for
"if (x << 0)" as the author might have intended "if (x < 0)" [which will
also warn when x is _Bool], the simplest way to resolve this regression
is to disable the recently added fold transformation for shifts by zero;
these will be optimized later (elsewhere). Guarding against integer_zerop
is the simplest of three alternatives; the second being to only apply
this transformation to GIMPLE and not GENERIC, and the third (potentially)
being to explicitly handle shifts by zero here, with an (if cond then else),
optimizing the expression to a convert, but awkwardly duplicating a
more general transformation earlier in match.pd's shift simplifications.
2021-09-17 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR c/102245
* match.pd (shift optimizations): Disable recent sign-changing
optimization for shifts by zero, these will be folded later.
gcc/testsuite/ChangeLog
PR c/102245
* gcc.dg/Wint-in-bool-context-4.c: New test case.
Bill Schmidt [Tue, 31 Aug 2021 20:13:01 +0000 (15:13 -0500)]
rs6000: Move __builtin_mffsl to the [always] stanza
I over-restricted use of __builtin_mffsl, since I was unaware that it
automatically uses mffs when mffsl is not available. Paul Clarke pointed
this out in discussion of his SSE 4.1 compatibility patches.
2021-08-31 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-builtin-new.def (__builtin_mffsl): Move from
[power9] to [always].
Tobias Burnus [Fri, 17 Sep 2021 13:43:30 +0000 (15:43 +0200)]
Fortran: Prefer GCC internal macros to float.h in ISO_Fortran_binding.h.
2021-09-17 Sandra Loosemore <sandra@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
libgfortran/
* ISO_Fortran_binding.h: Only include float.h if the C compiler
doesn't have predefined __LDBL_* and __DBL_* macros. Handle
LDBL_MANT_DIG == 53 for FreeBSD.
Iain Sandoe [Sat, 7 Aug 2021 13:45:56 +0000 (14:45 +0100)]
configure, jit: Allow for 'make check-gcc-jit'.
This is a convenience feature that allows the user to
do "make check-gcc-jit" at the top level of the build
to check that facility in isolation from others.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
ChangeLog:
* Makefile.def: Add a jit check target for the jit
language.
* Makefile.in: Regenerate.
Richard Biener [Fri, 17 Sep 2021 10:48:09 +0000 (12:48 +0200)]
Revert no longer needed fix for PR95539
The workaround is no longer necessary since we maintain alignment
info on the DR group leader only.
2021-09-17 Richard Biener <rguenther@suse.de>
* tree-vect-stmts.c (vectorizable_load): Do not frob
stmt_info for SLP.
Jonathan Wakely [Fri, 17 Sep 2021 11:25:40 +0000 (12:25 +0100)]
libstdc++: Rename tests with incorrect extension
The libstdc++ testsuite only runs .cc files, so these two old tests have
never been run.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* testsuite/26_numerics/valarray/dr630-3.C: Moved to...
* testsuite/26_numerics/valarray/dr630-3.cc: ...here.
* testsuite/27_io/basic_iostream/cons/16251.C: Moved to...
* testsuite/27_io/basic_iostream/cons/16251.cc: ...here.
Jakub Jelinek [Fri, 17 Sep 2021 10:33:36 +0000 (12:33 +0200)]
libgomp: Spelling error fix in OpenMP 5.1 conformance section
Fix spelling of OpenMP directive declare variant.
2021-09-17 Jakub Jelinek <jakub@redhat.com>
* libgomp.texi (OpenMP 5.1): Spelling fix,
declare variante -> declare variant.
Jakub Jelinek [Fri, 17 Sep 2021 09:28:31 +0000 (11:28 +0200)]
openmp: Add support for OpenMP 5.1 atomics for C++
Besides the C++ FE changes, I've noticed that the C FE didn't reject
#pragma omp atomic capture compare
{ v = x; x = y; }
and other forms of atomic swap, this patch fixes that too. And the
c-family/ routine needed quite a few changes so that the new code
in it works fine with both FEs.
2021-09-17 Jakub Jelinek <jakub@redhat.com>
gcc/c-family/
* c-omp.c (c_finish_omp_atomic): Avoid creating
TARGET_EXPR if test is true, use create_tmp_var_raw instead of
create_tmp_var and add a zero initializer to TARGET_EXPRs that
had NULL initializer. When omitting operands after v = x,
use type of v rather than type of x. Fix type of vtmp
TARGET_EXPR.
gcc/c/
* c-parser.c (c_parser_omp_atomic): Reject atomic swap if capture
is true.
gcc/cp/
* cp-tree.h (finish_omp_atomic): Add r and weak arguments.
* parser.c (cp_parser_omp_atomic): Update function comment for
OpenMP 5.1 atomics, parse OpenMP 5.1 atomics and fail, compare and
weak clauses.
* semantics.c (finish_omp_atomic): Add r and weak arguments, handle
them, handle COND_EXPRs.
* pt.c (tsubst_expr): Adjust for COND_EXPR forms that
finish_omp_atomic can now produce.
gcc/testsuite/
* c-c++-common/gomp/atomic-18.c: Expect same diagnostics in C++ as in
C.
* c-c++-common/gomp/atomic-25.c: Drop c effective target.
* c-c++-common/gomp/atomic-26.c: Likewise.
* c-c++-common/gomp/atomic-27.c: Likewise.
* c-c++-common/gomp/atomic-28.c: Likewise.
* c-c++-common/gomp/atomic-29.c: Likewise.
* c-c++-common/gomp/atomic-30.c: Likewise. Adjust expected diagnostics
for C++ when it differs from C.
(foo): Change return type from double to void.
* g++.dg/gomp/atomic-5.C: Adjust expected diagnostics wording.
* g++.dg/gomp/atomic-20.C: New test.
libgomp/
* testsuite/libgomp.c-c++-common/atomic-19.c: Drop c effective target.
Use /* */ comments instead of //.
* testsuite/libgomp.c-c++-common/atomic-20.c: Likewise.
* testsuite/libgomp.c-c++-common/atomic-21.c: Likewise.
* testsuite/libgomp.c++/atomic-16.C: New test.
* testsuite/libgomp.c++/atomic-17.C: New test.
H.J. Lu [Wed, 15 Sep 2021 06:18:21 +0000 (14:18 +0800)]
x86: Add TARGET_SSE_PARTIAL_REG_[FP_]CONVERTS_DEPENDENCY
1. Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY with
TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY in SSE FP to FP splitters.
2. Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY with
TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY in SSE INT to FP splitters.
3. Also check TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY and
TARGET_SSE_PARTIAL_REG_DEPENDENCY when handling avx_partial_xmm_update
attribute. Don't convert AVX partial XMM register update if there is no
partial SSE register dependency for SSE conversion.
gcc/
* config/i386/i386-features.c (remove_partial_avx_dependency):
Also check TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY and
and TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY before generating
vxorps.
* config/i386/i386.h (TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY):
New.
(TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Likewise.
* config/i386/i386.md (SSE FP to FP splitters): Replace
TARGET_SSE_PARTIAL_REG_DEPENDENCY with
TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY.
(SSE INT to FP splitter): Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY
with TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY.
* config/i386/x86-tune.def
(X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): New.
(X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Likewise.
gcc/testsuite/
* gcc.target/i386/avx-covert-1.c: New file.
* gcc.target/i386/avx-fp-covert-1.c: Likewise.
* gcc.target/i386/avx-int-covert-1.c: Likewise.
* gcc.target/i386/sse-covert-1.c: Likewise.
* gcc.target/i386/sse-fp-covert-1.c: Likewise.
* gcc.target/i386/sse-int-covert-1.c: Likewise.
H.J. Lu [Wed, 15 Sep 2021 06:17:58 +0000 (14:17 +0800)]
x86: Properly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS
Check TARGET_USE_VECTOR_FP_CONVERTS or TARGET_USE_VECTOR_CONVERTS when
handling avx_partial_xmm_update attribute. Don't convert AVX partial
XMM register update if vector packed SSE conversion should be used.
gcc/
PR target/101900
* config/i386/i386-features.c (remove_partial_avx_dependency):
Check TARGET_USE_VECTOR_FP_CONVERTS and TARGET_USE_VECTOR_CONVERTS
before generating vxorps.
gcc/testsuite
PR target/101900
* gcc.target/i386/pr101900-1.c: New test.
* gcc.target/i386/pr101900-2.c: Likewise.
* gcc.target/i386/pr101900-3.c: Likewise.
H.J. Lu [Wed, 15 Sep 2021 06:17:08 +0000 (14:17 +0800)]
x86: Update memcpy/memset inline strategies for -mtune=tremont
Simply memcpy and memset inline strategies to avoid branches for
-mtune=tremont:
1. Create Tremont cost model from generic cost model.
2. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
load and store for up to 16 * 16 (256) bytes when the data size is
fixed and known.
3. Inline only if data size is known to be <= 256.
a. Use "rep movsb/stosb" with simple code sequence if the data size
is a constant.
b. Use loop if data size is not a constant.
4. Use memcpy/memset libray function if data size is unknown or > 256.
* config/i386/i386-options.c (processor_cost_table): Use
tremont_cost for Tremont.
* config/i386/x86-tune-costs.h (tremont_memcpy): New.
(tremont_memset): Likewise.
(tremont_cost): Likewise.
* config/i386/x86-tune.def (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB):
Enable for Tremont.
H.J. Lu [Wed, 15 Sep 2021 06:15:10 +0000 (14:15 +0800)]
x86: Update -mtune=tremont
Initial -mtune=tremont update
1. Use Haswell scheduling model.
2. Assume that stack engine allows to execute push&pop instructions in
parall.
3. Prepare for scheduling pass as -mtune=generic.
4. Use the same issue rate as -mtune=generic.
5. Enable partial_reg_dependency.
6. Disable accumulate_outgoing_args
7. Enable use_leave
8. Enable push_memory
9. Disable four_jump_limit
10. Disable opt_agu
11. Disable avoid_lea_for_addr
12. Disable avoid_mem_opnd_for_cmove
13. Enable misaligned_move_string_pro_epilogues
14. Enable use_cltd
16. Enable avoid_false_dep_for_bmi
17. Enable avoid_mfence
18. Disable expand_abs
19. Enable sse_typeless_stores
20. Enable sse_load0_by_pxor
21. Disable split_mem_opnd_for_fp_converts
22. Disable slow_pshufb
23. Enable partial_reg_dependency
This is the first patch to tune for Tremont. With all patches applied,
performance impacts on SPEC CPU 2017 are:
500.perlbench_r 1.81%
502.gcc_r 0.57%
505.mcf_r 1.16%
520.omnetpp_r 0.00%
523.xalancbmk_r 0.00%
525.x264_r 4.55%
531.deepsjeng_r 0.00%
541.leela_r 0.39%
548.exchange2_r 1.13%
557.xz_r 0.00%
geomean for intrate 0.95%
503.bwaves_r 0.00%
507.cactuBSSN_r 6.94%
508.namd_r 12.37%
510.parest_r 1.01%
511.povray_r 3.70%
519.lbm_r 36.61%
521.wrf_r 8.79%
526.blender_r 2.91%
527.cam4_r 6.23%
538.imagick_r 0.28%
544.nab_r 21.99%
549.fotonik3d_r 3.63%
554.roms_r -1.20%
geomean for fprate 7.50%
gcc/ChangeLog
* common/config/i386/i386-common.c: Use Haswell scheduling model
for Tremont.
* config/i386/i386.c (ix86_sched_init_global): Prepare for Tremont
scheduling pass.
* config/i386/x86-tune-sched.c (ix86_issue_rate): Change Tremont
issue rate to 4.
(ix86_adjust_cost): Handle Tremont.
* config/i386/x86-tune.def (X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY):
Enable for Tremont.
(X86_TUNE_USE_LEAVE): Likewise.
(X86_TUNE_PUSH_MEMORY): Likewise.
(X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Likewise.
(X86_TUNE_USE_CLTD): Likewise.
(X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Likewise.
(X86_TUNE_AVOID_MFENCE): Likewise.
(X86_TUNE_SSE_TYPELESS_STORES): Likewise.
(X86_TUNE_SSE_LOAD0_BY_PXOR): Likewise.
(X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Disable for Tremont.
(X86_TUNE_FOUR_JUMP_LIMIT): Likewise.
(X86_TUNE_OPT_AGU): Likewise.
(X86_TUNE_AVOID_LEA_FOR_ADDR): Likewise.
(X86_TUNE_AVOID_MEM_OPND_FOR_CMOVE): Likewise.
(X86_TUNE_EXPAND_ABS): Likewise.
(X86_TUNE_SPLIT_MEM_OPND_FOR_FP_CONVERTS): Likewise.
(X86_TUNE_SLOW_PSHUFB): Likewise.
Eric Botcazou [Fri, 17 Sep 2021 08:12:12 +0000 (10:12 +0200)]
Fix PR rtl-optimization/102306
This is a duplication of volatile loads introduced during GCC 9 development
by the 2->2 mechanism of the RTL combiner. There is already a substantial
checking for volatile references in can_combine_p but it implicitly assumes
that the combination reduces the number of instructions, which is of course
not the case here. So the fix teaches try_combine to abort the combination
when it is about to make a copy of volatile references to preserve them.
gcc/
PR rtl-optimization/102306
* combine.c (try_combine): Abort the combination if we are about to
duplicate volatile references.
gcc/testsuite/
* gcc.target/sparc/
20210917-1.c: New test.
liuhongt [Tue, 25 Feb 2020 02:42:13 +0000 (10:42 +0800)]
AVX512FP16: Add intrinsics for casting between vector float16 and vector float32/float64/integer.
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h (_mm_undefined_ph):
New intrinsic.
(_mm256_undefined_ph): Likewise.
(_mm512_undefined_ph): Likewise.
(_mm_cvtsh_h): Likewise.
(_mm256_cvtsh_h): Likewise.
(_mm512_cvtsh_h): Likewise.
(_mm512_castph_ps): Likewise.
(_mm512_castph_pd): Likewise.
(_mm512_castph_si512): Likewise.
(_mm512_castph512_ph128): Likewise.
(_mm512_castph512_ph256): Likewise.
(_mm512_castph128_ph512): Likewise.
(_mm512_castph256_ph512): Likewise.
(_mm512_zextph128_ph512): Likewise.
(_mm512_zextph256_ph512): Likewise.
(_mm512_castps_ph): Likewise.
(_mm512_castpd_ph): Likewise.
(_mm512_castsi512_ph): Likewise.
* config/i386/avx512fp16vlintrin.h (_mm_castph_ps):
New intrinsic.
(_mm256_castph_ps): Likewise.
(_mm_castph_pd): Likewise.
(_mm256_castph_pd): Likewise.
(_mm_castph_si128): Likewise.
(_mm256_castph_si256): Likewise.
(_mm_castps_ph): Likewise.
(_mm256_castps_ph): Likewise.
(_mm_castpd_ph): Likewise.
(_mm256_castpd_ph): Likewise.
(_mm_castsi128_ph): Likewise.
(_mm256_castsi256_ph): Likewise.
(_mm256_castph256_ph128): Likewise.
(_mm256_castph128_ph256): Likewise.
(_mm256_zextph128_ph256): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-typecast-1.c: New test.
* gcc.target/i386/avx512fp16-typecast-2.c: Ditto.
* gcc.target/i386/avx512fp16vl-typecast-1.c: Ditto.
* gcc.target/i386/avx512fp16vl-typecast-2.c: Ditto.
liuhongt [Mon, 2 Mar 2020 09:25:19 +0000 (17:25 +0800)]
AVX512FP16: Add testcase for vcvtsh2sd/vcvtsh2ss/vcvtsd2sh/vcvtss2sh.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-vcvtsd2sh-1a.c: New test.
* gcc.target/i386/avx512fp16-vcvtsd2sh-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsh2sd-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsh2sd-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsh2ss-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsh2ss-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtss2sh-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtss2sh-1b.c: Ditto.
liuhongt [Tue, 19 Mar 2019 08:43:29 +0000 (16:43 +0800)]
AVX512FP16: Add vcvtsh2ss/vcvtsh2sd/vcvtss2sh/vcvtsd2sh.
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h (_mm_cvtsh_ss):
New intrinsic.
(_mm_mask_cvtsh_ss): Likewise.
(_mm_maskz_cvtsh_ss): Likewise.
(_mm_cvtsh_sd): Likewise.
(_mm_mask_cvtsh_sd): Likewise.
(_mm_maskz_cvtsh_sd): Likewise.
(_mm_cvt_roundsh_ss): Likewise.
(_mm_mask_cvt_roundsh_ss): Likewise.
(_mm_maskz_cvt_roundsh_ss): Likewise.
(_mm_cvt_roundsh_sd): Likewise.
(_mm_mask_cvt_roundsh_sd): Likewise.
(_mm_maskz_cvt_roundsh_sd): Likewise.
(_mm_cvtss_sh): Likewise.
(_mm_mask_cvtss_sh): Likewise.
(_mm_maskz_cvtss_sh): Likewise.
(_mm_cvtsd_sh): Likewise.
(_mm_mask_cvtsd_sh): Likewise.
(_mm_maskz_cvtsd_sh): Likewise.
(_mm_cvt_roundss_sh): Likewise.
(_mm_mask_cvt_roundss_sh): Likewise.
(_mm_maskz_cvt_roundss_sh): Likewise.
(_mm_cvt_roundsd_sh): Likewise.
(_mm_mask_cvt_roundsd_sh): Likewise.
(_mm_maskz_cvt_roundsd_sh): Likewise.
* config/i386/i386-builtin-types.def
(V8HF_FTYPE_V2DF_V8HF_V8HF_UQI_INT,
V8HF_FTYPE_V4SF_V8HF_V8HF_UQI_INT,
V2DF_FTYPE_V8HF_V2DF_V2DF_UQI_INT,
V4SF_FTYPE_V8HF_V4SF_V4SF_UQI_INT): Add new builtin types.
* config/i386/i386-builtin.def: Add corrresponding new builtins.
* config/i386/i386-expand.c: Handle new builtin types.
* config/i386/sse.md (VF48_128): New mode iterator.
(avx512fp16_vcvtsh2<ssescalarmodesuffix><mask_scalar_name><round_saeonly_scalar_name>):
New.
(avx512fp16_vcvt<ssescalarmodesuffix>2sh<mask_scalar_name><round_scalar_name>):
Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add test for new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Add test for new intrinsics.
* gcc.target/i386/sse-22.c: Ditto.
liuhongt [Mon, 2 Mar 2020 09:22:40 +0000 (17:22 +0800)]
AVX512FP16: Add testcase for vcvtph2pd/vcvtph2psx/vcvtpd2ph/vcvtps2phx.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-helper.h (V512): Add DF contents.
(src3f): New.
* gcc.target/i386/avx512fp16-vcvtpd2ph-1a.c: New test.
* gcc.target/i386/avx512fp16-vcvtpd2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2pd-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2pd-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2psx-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2psx-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtps2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtps2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtpd2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtpd2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2pd-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2pd-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2psx-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2psx-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtps2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtps2ph-1b.c: Ditto.
liuhongt [Mon, 21 Jun 2021 02:24:09 +0000 (10:24 +0800)]
AVX512FP16: Add vcvtph2pd/vcvtph2psx/vcvtpd2ph/vcvtps2phx.
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h (_mm512_cvtph_pd):
New intrinsic.
(_mm512_mask_cvtph_pd): Likewise.
(_mm512_maskz_cvtph_pd): Likewise.
(_mm512_cvt_roundph_pd): Likewise.
(_mm512_mask_cvt_roundph_pd): Likewise.
(_mm512_maskz_cvt_roundph_pd): Likewise.
(_mm512_cvtxph_ps): Likewise.
(_mm512_mask_cvtxph_ps): Likewise.
(_mm512_maskz_cvtxph_ps): Likewise.
(_mm512_cvtx_roundph_ps): Likewise.
(_mm512_mask_cvtx_roundph_ps): Likewise.
(_mm512_maskz_cvtx_roundph_ps): Likewise.
(_mm512_cvtxps_ph): Likewise.
(_mm512_mask_cvtxps_ph): Likewise.
(_mm512_maskz_cvtxps_ph): Likewise.
(_mm512_cvtx_roundps_ph): Likewise.
(_mm512_mask_cvtx_roundps_ph): Likewise.
(_mm512_maskz_cvtx_roundps_ph): Likewise.
(_mm512_cvtpd_ph): Likewise.
(_mm512_mask_cvtpd_ph): Likewise.
(_mm512_maskz_cvtpd_ph): Likewise.
(_mm512_cvt_roundpd_ph): Likewise.
(_mm512_mask_cvt_roundpd_ph): Likewise.
(_mm512_maskz_cvt_roundpd_ph): Likewise.
* config/i386/avx512fp16vlintrin.h (_mm_cvtph_pd):
New intrinsic.
(_mm_mask_cvtph_pd): Likewise.
(_mm_maskz_cvtph_pd): Likewise.
(_mm256_cvtph_pd): Likewise.
(_mm256_mask_cvtph_pd): Likewise.
(_mm256_maskz_cvtph_pd): Likewise.
(_mm_cvtxph_ps): Likewise.
(_mm_mask_cvtxph_ps): Likewise.
(_mm_maskz_cvtxph_ps): Likewise.
(_mm256_cvtxph_ps): Likewise.
(_mm256_mask_cvtxph_ps): Likewise.
(_mm256_maskz_cvtxph_ps): Likewise.
(_mm_cvtxps_ph): Likewise.
(_mm_mask_cvtxps_ph): Likewise.
(_mm_maskz_cvtxps_ph): Likewise.
(_mm256_cvtxps_ph): Likewise.
(_mm256_mask_cvtxps_ph): Likewise.
(_mm256_maskz_cvtxps_ph): Likewise.
(_mm_cvtpd_ph): Likewise.
(_mm_mask_cvtpd_ph): Likewise.
(_mm_maskz_cvtpd_ph): Likewise.
(_mm256_cvtpd_ph): Likewise.
(_mm256_mask_cvtpd_ph): Likewise.
(_mm256_maskz_cvtpd_ph): Likewise.
* config/i386/i386-builtin.def: Add corresponding new builtins.
* config/i386/i386-builtin-types.def: Add corresponding builtin types.
* config/i386/i386-expand.c: Handle new builtin types.
* config/i386/sse.md
(VF4_128_8_256): New.
(VF48H_AVX512VL): Ditto.
(ssePHmode): Add HF vector modes.
(castmode): Add new convertable modes.
(qq2phsuff): Ditto.
(ph2pssuffix): New.
(avx512fp16_vcvt<castmode>2ph_<mode><mask_name><round_name>): Ditto.
(avx512fp16_vcvt<castmode>2ph_<mode>): Ditto.
(*avx512fp16_vcvt<castmode>2ph_<mode>): Ditto.
(avx512fp16_vcvt<castmode>2ph_<mode>_mask): Ditto.
(*avx512fp16_vcvt<castmode>2ph_<mode>_mask): Ditto.
(*avx512fp16_vcvt<castmode>2ph_<mode>_mask_1): Ditto.
(avx512fp16_float_extend_ph<mode>2<mask_name><round_saeonly_name>):
Ditto.
(avx512fp16_float_extend_ph<mode>2<mask_name>): Ditto.
(*avx512fp16_float_extend_ph<mode>2_load<mask_name>): Ditto.
(avx512fp16_float_extend_phv2df2<mask_name>): Ditto.
(*avx512fp16_float_extend_phv2df2_load<mask_name>): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add test for new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Add test for new intrinsics.
* gcc.target/i386/sse-22.c: Ditto.
liuhongt [Fri, 15 Mar 2019 06:17:54 +0000 (14:17 +0800)]
AVX512FP16: Add vcvttsh2si/vcvttsh2usi.
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h (_mm_cvttsh_i32):
New intrinsic.
(_mm_cvttsh_u32): Likewise.
(_mm_cvtt_roundsh_i32): Likewise.
(_mm_cvtt_roundsh_u32): Likewise.
(_mm_cvttsh_i64): Likewise.
(_mm_cvttsh_u64): Likewise.
(_mm_cvtt_roundsh_i64): Likewise.
(_mm_cvtt_roundsh_u64): Likewise.
* config/i386/i386-builtin.def: Add corresponding new builtins.
* config/i386/sse.md
(avx512fp16_fix<fixunssuffix>_trunc<mode>2<round_saeonly_name>):
New.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-vcvttsh2si-1a.c: New test.
* gcc.target/i386/avx512fp16-vcvttsh2si-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttsh2si64-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttsh2si64-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttsh2usi-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttsh2usi-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttsh2usi64-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttsh2usi64-1b.c: Ditto.
* gcc.target/i386/avx-1.c: Add test for new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Add test for new intrinsics.
* gcc.target/i386/sse-22.c: Ditto.
liuhongt [Mon, 2 Mar 2020 09:20:14 +0000 (17:20 +0800)]
AVX512FP16: Add testcase for vcvttph2w/vcvttph2uw/vcvttph2dq/vcvttph2udq/vcvttph2qq/vcvttph2uqq.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-vcvttph2dq-1a.c: New test.
* gcc.target/i386/avx512fp16-vcvttph2dq-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttph2qq-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttph2qq-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttph2udq-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttph2udq-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttph2uqq-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttph2uqq-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttph2uw-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttph2uw-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttph2w-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvttph2w-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvttph2dq-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvttph2dq-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvttph2qq-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvttph2qq-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvttph2udq-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvttph2udq-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvttph2uqq-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvttph2uqq-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvttph2uw-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvttph2uw-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvttph2w-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvttph2w-1b.c: Ditto.
liuhongt [Wed, 13 Mar 2019 07:16:03 +0000 (15:16 +0800)]
AVX512FP16: Add vcvttph2w/vcvttph2uw/vcvttph2dq/vcvttph2qq/vcvttph2udq/vcvttph2uqq
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h (_mm512_cvttph_epi32):
New intrinsic.
(_mm512_mask_cvttph_epi32): Likewise.
(_mm512_maskz_cvttph_epi32): Likewise.
(_mm512_cvtt_roundph_epi32): Likewise.
(_mm512_mask_cvtt_roundph_epi32): Likewise.
(_mm512_maskz_cvtt_roundph_epi32): Likewise.
(_mm512_cvttph_epu32): Likewise.
(_mm512_mask_cvttph_epu32): Likewise.
(_mm512_maskz_cvttph_epu32): Likewise.
(_mm512_cvtt_roundph_epu32): Likewise.
(_mm512_mask_cvtt_roundph_epu32): Likewise.
(_mm512_maskz_cvtt_roundph_epu32): Likewise.
(_mm512_cvttph_epi64): Likewise.
(_mm512_mask_cvttph_epi64): Likewise.
(_mm512_maskz_cvttph_epi64): Likewise.
(_mm512_cvtt_roundph_epi64): Likewise.
(_mm512_mask_cvtt_roundph_epi64): Likewise.
(_mm512_maskz_cvtt_roundph_epi64): Likewise.
(_mm512_cvttph_epu64): Likewise.
(_mm512_mask_cvttph_epu64): Likewise.
(_mm512_maskz_cvttph_epu64): Likewise.
(_mm512_cvtt_roundph_epu64): Likewise.
(_mm512_mask_cvtt_roundph_epu64): Likewise.
(_mm512_maskz_cvtt_roundph_epu64): Likewise.
(_mm512_cvttph_epi16): Likewise.
(_mm512_mask_cvttph_epi16): Likewise.
(_mm512_maskz_cvttph_epi16): Likewise.
(_mm512_cvtt_roundph_epi16): Likewise.
(_mm512_mask_cvtt_roundph_epi16): Likewise.
(_mm512_maskz_cvtt_roundph_epi16): Likewise.
(_mm512_cvttph_epu16): Likewise.
(_mm512_mask_cvttph_epu16): Likewise.
(_mm512_maskz_cvttph_epu16): Likewise.
(_mm512_cvtt_roundph_epu16): Likewise.
(_mm512_mask_cvtt_roundph_epu16): Likewise.
(_mm512_maskz_cvtt_roundph_epu16): Likewise.
* config/i386/avx512fp16vlintrin.h (_mm_cvttph_epi32):
New intirnsic.
(_mm_mask_cvttph_epi32): Likewise.
(_mm_maskz_cvttph_epi32): Likewise.
(_mm256_cvttph_epi32): Likewise.
(_mm256_mask_cvttph_epi32): Likewise.
(_mm256_maskz_cvttph_epi32): Likewise.
(_mm_cvttph_epu32): Likewise.
(_mm_mask_cvttph_epu32): Likewise.
(_mm_maskz_cvttph_epu32): Likewise.
(_mm256_cvttph_epu32): Likewise.
(_mm256_mask_cvttph_epu32): Likewise.
(_mm256_maskz_cvttph_epu32): Likewise.
(_mm_cvttph_epi64): Likewise.
(_mm_mask_cvttph_epi64): Likewise.
(_mm_maskz_cvttph_epi64): Likewise.
(_mm256_cvttph_epi64): Likewise.
(_mm256_mask_cvttph_epi64): Likewise.
(_mm256_maskz_cvttph_epi64): Likewise.
(_mm_cvttph_epu64): Likewise.
(_mm_mask_cvttph_epu64): Likewise.
(_mm_maskz_cvttph_epu64): Likewise.
(_mm256_cvttph_epu64): Likewise.
(_mm256_mask_cvttph_epu64): Likewise.
(_mm256_maskz_cvttph_epu64): Likewise.
(_mm_cvttph_epi16): Likewise.
(_mm_mask_cvttph_epi16): Likewise.
(_mm_maskz_cvttph_epi16): Likewise.
(_mm256_cvttph_epi16): Likewise.
(_mm256_mask_cvttph_epi16): Likewise.
(_mm256_maskz_cvttph_epi16): Likewise.
(_mm_cvttph_epu16): Likewise.
(_mm_mask_cvttph_epu16): Likewise.
(_mm_maskz_cvttph_epu16): Likewise.
(_mm256_cvttph_epu16): Likewise.
(_mm256_mask_cvttph_epu16): Likewise.
(_mm256_maskz_cvttph_epu16): Likewise.
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/sse.md
(avx512fp16_fix<fixunssuffix>_trunc<mode>2<mask_name><round_saeonly_name>):
New.
(avx512fp16_fix<fixunssuffix>_trunc<mode>2<mask_name>): Ditto.
(*avx512fp16_fix<fixunssuffix>_trunc<mode>2_load<mask_name>): Ditto.
(avx512fp16_fix<fixunssuffix>_truncv2di2<mask_name>): Ditto.
(avx512fp16_fix<fixunssuffix>_truncv2di2_load<mask_name>): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add test for new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Add test for new intrinsics.
* gcc.target/i386/sse-22.c: Ditto.
liuhongt [Mon, 2 Mar 2020 09:18:26 +0000 (17:18 +0800)]
AVX512FP16: Add testcase for vcvtsh2si/vcvtsh2usi/vcvtsi2sh/vcvtusi2sh.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-helper.h (V512): Add int32
component.
* gcc.target/i386/avx512fp16-vcvtsh2si-1a.c: New test.
* gcc.target/i386/avx512fp16-vcvtsh2si-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsh2si64-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsh2si64-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsh2usi-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsh2usi-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsh2usi64-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsh2usi64-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsi2sh-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsi2sh-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsi2sh64-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtsi2sh64-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtusi2sh-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtusi2sh-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtusi2sh64-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtusi2sh64-1b.c: Ditto.
liuhongt [Mon, 11 Mar 2019 21:51:49 +0000 (14:51 -0700)]
AVX512FP16: Add vcvtsh2si/vcvtsh2usi/vcvtsi2sh/vcvtusi2sh.
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h (_mm_cvtsh_i32): New intrinsic.
(_mm_cvtsh_u32): Likewise.
(_mm_cvt_roundsh_i32): Likewise.
(_mm_cvt_roundsh_u32): Likewise.
(_mm_cvtsh_i64): Likewise.
(_mm_cvtsh_u64): Likewise.
(_mm_cvt_roundsh_i64): Likewise.
(_mm_cvt_roundsh_u64): Likewise.
(_mm_cvti32_sh): Likewise.
(_mm_cvtu32_sh): Likewise.
(_mm_cvt_roundi32_sh): Likewise.
(_mm_cvt_roundu32_sh): Likewise.
(_mm_cvti64_sh): Likewise.
(_mm_cvtu64_sh): Likewise.
(_mm_cvt_roundi64_sh): Likewise.
(_mm_cvt_roundu64_sh): Likewise.
* config/i386/i386-builtin-types.def: Add corresponding builtin types.
* config/i386/i386-builtin.def: Add corresponding new builtins.
* config/i386/i386-expand.c (ix86_expand_round_builtin):
Handle new builtin types.
* config/i386/sse.md
(avx512fp16_vcvtsh2<sseintconvertsignprefix>si<rex64namesuffix><round_name>):
New define_insn.
(avx512fp16_vcvtsh2<sseintconvertsignprefix>si<rex64namesuffix>_2): Likewise.
(avx512fp16_vcvt<floatsuffix>si2sh<rex64namesuffix><round_name>): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add test for new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Add test for new intrinsics.
* gcc.target/i386/sse-22.c: Ditto.
GCC Administrator [Fri, 17 Sep 2021 00:16:25 +0000 (00:16 +0000)]
Daily bump.
Ian Lance Taylor [Thu, 16 Sep 2021 23:09:35 +0000 (16:09 -0700)]
libgo: update to go1.17.1 release
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/350414
Maxim Blinov [Thu, 16 Sep 2021 22:32:58 +0000 (00:32 +0200)]
analyzer: Fix bootstrap with clang
gcc/analyzer/ChangeLog:
PR bootstrap/102242
* engine.cc (INCLUDE_UNIQUE_PTR): Define.
Jonathan Wakely [Thu, 16 Sep 2021 20:21:56 +0000 (21:21 +0100)]
libstdc++: Regenerate the src/debug Makefiles as needed
When the build configuration changes and Makefiles are recreated, the
src/debug/Makefile and src/debug/*/Makefile files are not recreated,
because they're not managed in the usual way by automake. This can lead
to build failures or surprising inconsistencies between the main and
debug versions of the library when doing incremental builds.
This causes them to be regenerated if any of the corresponding non-debug
makefiles is newer.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* src/Makefile.am (stamp-debug): Add all Makefiles as
prerequisites.
* src/Makefile.in: Regenerate.
Jonathan Wakely [Thu, 16 Sep 2021 19:10:02 +0000 (20:10 +0100)]
libstdc++: Increase timeout factor for slow pb_ds tests
Compiling these tests still times out too often when running the
testsuite with more parallel jobs than there are available cores.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* testsuite/ext/pb_ds/regression/tree_map_rand.cc: Increase
timeout factor to 3.
* testsuite/ext/pb_ds/regression/tree_set_rand.cc: Likewise.
Jonathan Wakely [Thu, 16 Sep 2021 14:36:31 +0000 (15:36 +0100)]
libstdc++: Update documentation that only refers to c++98 and c++11
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* doc/xml/manual/using.xml: Generalize to apply to more than
just -std=c++11.
* doc/html/manual/using_macros.html: Regenerate.
Jonathan Wakely [Thu, 16 Sep 2021 13:14:38 +0000 (14:14 +0100)]
libstdc++: Add noexcept to std::nullopt_t constructor
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/std/optional (nullptr_t): Make constructor noexcept.
Jonathan Wakely [Thu, 16 Sep 2021 12:35:24 +0000 (13:35 +0100)]
libstdc++: Remove non-deducible parameter for std::advance overload
This was just a copy and paste error.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/fs_path.h (advance): Remove non-deducible
template parameter.
Jonathan Wakely [Wed, 15 Sep 2021 20:53:35 +0000 (21:53 +0100)]
libstdc++: Add missing 'constexpr' to std::tuple [PR102270]
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/102270
* include/std/tuple (_Head_base, _Tuple_impl): Add
_GLIBCXX20_CONSTEXPR to allocator-extended constructors.
(tuple<>::swap(tuple&)): Add _GLIBCXX20_CONSTEXPR.
* testsuite/20_util/tuple/cons/102270.C: New test.
Jonathan Wakely [Wed, 15 Sep 2021 20:49:29 +0000 (21:49 +0100)]
libstdc++: Add missing constraint to std::span deduction guide [PR102280]
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/102280
* include/std/span (span(Range&&)): Add constraint to deduction
guide.
Jonathan Wakely [Wed, 15 Sep 2021 20:39:27 +0000 (21:39 +0100)]
libstdc++: Fix recipes for C++11-compiled files in src/c++98
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* src/c++98/Makefile.am: Use CXXCOMPILE not LTCXXCOMPILE.
* src/c++98/Makefile.in: Regenerate.
Jonathan Wakely [Wed, 15 Sep 2021 20:40:20 +0000 (21:40 +0100)]
libstdc++: Add noexcept to std::to_string overloads that don't allocate
When the values is guaranteed to fit in the SSO buffer we know the
string won't allocate, so the function can be noexcept. For 32-bit
integers, we know they need no more than 9 bytes (or 10 with a minus
sign) and the SSO buffer is 15 bytes.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h [_GLIBCXX_USE_CXX11_ABI]
(to_string): Add noexcept if the type width is 32 bits or less.
Jonathan Wakely [Tue, 14 Sep 2021 08:34:30 +0000 (09:34 +0100)]
libstdc++: Add noexcept to unique_ptr accessors
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/unique_ptr.h (__uniq_ptr_impl::_M_ptr)
(__uniq_ptr_impl::_M_deleter): Add noexcept.
Thomas Rodgers [Thu, 16 Sep 2021 21:42:58 +0000 (14:42 -0700)]
libstdc++: Fix UB in atomic_ref/wait_notify.cc [PR101761]
Remove UB in atomic_ref/wait_notify test.
Signed-off-by: Thomas Rodgers <trodgers@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/101761
* testsuite/29_atomics/atomic_ref/wait_notify.cc (test): Use
va and vb as arguments to wait/notify, remove unused bb local.
Bill Schmidt [Thu, 16 Sep 2021 19:52:42 +0000 (14:52 -0500)]
rs6000: Handle overloads during program parsing
Although this patch looks quite large, the changes are fairly minimal.
Most of it is duplicating the large function that does the overload
resolution using the automatically generated data structures instead of
the old hand-generated ones. This doesn't make the patch terribly easy to
review, unfortunately. Just be aware that generally we aren't changing
the logic and functionality of overload handling.
2021-09-16 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-c.c (rs6000-builtins.h): New include.
(altivec_resolve_new_overloaded_builtin): New forward decl.
(rs6000_new_builtin_type_compatible): New function.
(altivec_resolve_overloaded_builtin): Call
altivec_resolve_new_overloaded_builtin.
(altivec_build_new_resolved_builtin): New function.
(altivec_resolve_new_overloaded_builtin): Likewise.
* config/rs6000/rs6000-call.c (rs6000_new_builtin_is_supported):
Likewise.
* config/rs6000/rs6000-gen-builtins.c (write_decls): Remove _p from
name of rs6000_new_builtin_is_supported.
Patrick Palka [Thu, 16 Sep 2021 19:03:55 +0000 (15:03 -0400)]
c++: constrained variable template issues [PR98486]
This fixes some issues with constrained variable templates:
- Constraints aren't checked when explicitly specializing a variable
template.
- Constraints aren't attached to a static data member template at
parse time.
- Constraints don't get propagated when (partially) instantiating a
static data member template, so we need to make sure to look up
constraints using the most general template during satisfaction.
PR c++/98486
gcc/cp/ChangeLog:
* constraint.cc (get_normalized_constraints_from_decl): Always
look up constraints using the most general template.
* decl.c (grokdeclarator): Set constraints on a static data
member template.
* pt.c (determine_specialization): Check constraints on a
variable template.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-var-templ1.C: New test.
* g++.dg/cpp2a/concepts-var-templ1a.C: New test.
* g++.dg/cpp2a/concepts-var-templ1b.C: New test.
Harald Anlauf [Thu, 16 Sep 2021 18:12:21 +0000 (20:12 +0200)]
Fortran - fix handling of optional allocatable DT arguments with INTENT(OUT)
gcc/fortran/ChangeLog:
PR fortran/102287
* trans-expr.c (gfc_conv_procedure_call): Wrap deallocation of
allocatable components of optional allocatable derived type
procedure arguments with INTENT(OUT) into a presence check.
gcc/testsuite/ChangeLog:
PR fortran/102287
* gfortran.dg/intent_out_14.f90: New test.
Andrew Pinski [Wed, 15 Sep 2021 09:51:08 +0000 (09:51 +0000)]
Fix PR 67102: Add libstdc++ dependancy to libffi
The error message is obvious -funconfigured-libstdc++-v3 is used
on the g++ command line. So we just add the dependancy.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
ChangeLog:
PR bootstrap/67102
* Makefile.def: Have configure-target-libffi depend on
all-target-libstdc++-v3.
* Makefile.in: Regenerate.
Uros Bizjak [Thu, 16 Sep 2021 17:04:34 +0000 (19:04 +0200)]
[i386] Change ix86_decompose_address return type to bool.
After a recent change only a boolean value is returned.
2021-09-16 Uroš Bizjak <ubizjak@gmail.com>
gcc/
* config/i386/i386-protos.h (ix86_decompose_address):
Change return type to bool.
* config/i386/i386.c (ix86_decompose_address): Ditto.
Tobias Burnus [Thu, 16 Sep 2021 16:35:34 +0000 (18:35 +0200)]
PowerPC: Fix rs6000-gen-builtins with build != host [PR102353]
This mimics what the main Makefile.in does: compile the generator
files under build (with Makefile.in's 'build/%.o' rule for compilation).
It also adds $(RUN_GEN) to optionally run it with valgrind and
the $(build_exeext) suffix.
Before, the .o files were compiled with $(COMPILE), causing link
error with $(LINKER_FOR_BUILD) for build != host.
gcc/
PR target/102353
* config/rs6000/t-rs6000 (build/rs6000-gen-builtins.o, build/rbtree.o):
Added 'build/' to target, use build/%.o rule.
(build/rs6000-gen-builtins$(build_exeext)): Add 'build/' and
'$(build_exeext)' to target and 'build/' for the *.o files.
(rs6000-builtins.c): Update for those changes; run rs6000-gen-builtins
with $(RUN_GEN).
Martin Jambor [Thu, 16 Sep 2021 12:04:06 +0000 (14:04 +0200)]
cgraph: Do not warn about caller count mismatches of removed functions
To verify other changes in the patch series, I have been searching for
"Invalid sum of caller counts" string in symtab dump but found that
there are false warnings about functions which have their body removed
because they are now unreachable. Those are of course invalid and so
this patches avoids checking such cgraph_nodes.
gcc/ChangeLog:
2021-08-20 Martin Jambor <mjambor@suse.cz>
* cgraph.c (cgraph_node::dump): Do not check caller count sums if
the body has been removed. Remove trailing whitespace.
Iain Sandoe [Wed, 14 Jul 2021 09:13:23 +0000 (10:13 +0100)]
coroutines: Small cleanups to await_statement_walker [NFC].
There is no need to make a MODIFY_EXPR for any of the condition
vars that we synthesize.
Expansion of co_return can be carried out independently of any
co_awaits that might be contained which simplifies this.
Where we are rewriting statements to handle await expression
logic, there is no need to carry out any analysis - we just need
to detect the presence of any co_await.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:
* coroutines.cc (await_statement_walker): Code cleanups.
Richard Biener [Thu, 16 Sep 2021 09:19:14 +0000 (11:19 +0200)]
middle-end/102360 - adjust .DEFERRED_INIT expansion
This avoids using native_interpret_type when we cannot do it with
the original type of the variable, instead use an integer type
for the initialization and side-step the size limitation of
native_interpret_int.
2021-09-16 Richard Biener <rguenther@suse.de>
PR middle-end/102360
* internal-fn.c (expand_DEFERRED_INIT): Make pattern-init
of non-memory more robust.
* g++.dg/pr102360.C: New testcase.
Daniel Cederman [Mon, 25 Mar 2019 08:12:17 +0000 (09:12 +0100)]
sparc: Add scheduling information for LEON5
The LEON5 can often dual issue instructions from the same 64-bit aligned
double word if there are no data dependencies. Add scheduling information
to avoid scheduling unpairable instructions back-to-back.
gcc/ChangeLog:
* config/sparc/sparc-opts.h (enum sparc_processor_type): Add LEON5
* config/sparc/sparc.c (struct processor_costs): Add LEON5 costs
(leon5_adjust_cost): Increase cost of store with data dependency
on ALU instruction and FPU anti-dependencies.
(sparc_option_override): Add LEON5 costs
(sparc_adjust_cost): Add LEON5 cost adjustments
* config/sparc/sparc.h: Add LEON5
* config/sparc/sparc.md: Include LEON5 scheduling information
* config/sparc/sparc.opt: Add LEON5
* doc/invoke.texi: Add LEON5
* config/sparc/leon5.md: New file.
Daniel Cederman [Thu, 1 Oct 2020 07:11:38 +0000 (09:11 +0200)]
sparc: Add NOP in stack_protect_set32 if sparc_fix_b2bst enabled
This is needed to prevent the Store -> (Non-store or load) -> Store
sequence.
gcc/ChangeLog:
* config/sparc/sparc.md (stack_protect_set32): Add NOP to prevent
sensitive sequence for B2BST errata workaround.
Daniel Cederman [Mon, 12 Oct 2020 06:50:35 +0000 (08:50 +0200)]
sparc: Prevent atomic instructions in beginning of functions for UT700
A call to the function might have a load instruction in the delay slot
and a load followed by an atomic function could cause a deadlock.
gcc/ChangeLog:
* config/sparc/sparc.c (sparc_do_work_around_errata): Do not begin
functions with atomic instruction in the UT700 errata workaround.
Daniel Cederman [Fri, 16 Oct 2020 07:12:30 +0000 (09:12 +0200)]
sparc: Skip all empty assembly statements
This version detects multiple empty assembly statements in a row and also
detects non-memory barrier empty assembly statements (__asm__("")). It
can be used instead of next_active_insn().
gcc/ChangeLog:
* config/sparc/sparc.c (next_active_non_empty_insn): New function
that returns next active non empty assembly instruction.
(sparc_do_work_around_errata): Use new function.
Daniel Cederman [Fri, 25 Sep 2020 11:17:46 +0000 (13:17 +0200)]
sparc: Treat more instructions as load or store in errata workarounds
Check the attribute of instruction to determine if it performs a store
or load operation. This more generic approach sees the last instruction
in the GOTdata_op model as a potential load and treats the memory barrier
as a potential store instruction.
gcc/ChangeLog:
* config/sparc/sparc.c (store_insn_p): Add predicate for store
attributes.
(load_insn_p): Add predicate for load attributes.
(sparc_do_work_around_errata): Use new predicates.
Andreas Larsson [Wed, 5 Jul 2017 11:36:31 +0000 (13:36 +0200)]
sparc: Print out bit names for LEON and LEON3 with -mdebug
gcc/ChangeLog:
* config/sparc/sparc.c (dump_target_flag_bits): Print bit names for
LEON and LEON3.
Christophe Lyon [Thu, 16 Sep 2021 09:31:31 +0000 (09:31 +0000)]
testsuite: Support single-precision in g++.dg/eh/arm-vfp-unwind.C
g++.dg/eh/arm-vfp-unwind.C uses an asm statement relying on
double-precision FPU support. This patch extends it support
single-precision, useful for targets without double-precision.
2021-09-16 Richard Earnshaw <rearnsha@arm.com>
gcc/testsuite/
* g++.dg/eh/arm-vfp-unwind.C: Support single-precision.
Martin Liska [Thu, 16 Sep 2021 09:17:28 +0000 (11:17 +0200)]
mips: Fix macro typo
gcc/ChangeLog:
* config/mips/netbsd.h: Fix typo in name of a macro.
liuhongt [Thu, 2 Sep 2021 05:05:54 +0000 (13:05 +0800)]
Check mask type when doing cond_op related gimple simplification.
gcc/ChangeLog:
PR middle-end/102080
* match.pd: Check mask type when doing cond_op related gimple
simplification.
* tree.c (is_truth_type_for): New function.
* tree.h (is_truth_type_for): New declaration.
gcc/testsuite/ChangeLog:
PR middle-end/102080
* gcc.target/i386/pr102080.c: New test.
liuhongt [Mon, 2 Mar 2020 09:15:55 +0000 (17:15 +0800)]
AVX512FP16: Add testcase for vcvtw2ph/vcvtuw2ph/vcvtdq2ph/vcvtudq2ph/vcvtqq2ph/vcvtuqq2ph.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-vcvtdq2ph-1a.c: New test.
* gcc.target/i386/avx512fp16-vcvtdq2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtqq2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtqq2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtudq2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtudq2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtuqq2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtuqq2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtuw2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtuw2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtw2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtw2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtdq2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtdq2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtqq2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtqq2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtudq2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtudq2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtuqq2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtuqq2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtuw2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtuw2ph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtw2ph-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtw2ph-1b.c: Ditto.
liuhongt [Wed, 6 Mar 2019 23:22:18 +0000 (15:22 -0800)]
AVX512FP16: Add vcvtuw2ph/vcvtw2ph/vcvtdq2ph/vcvtudq2ph/vcvtqq2ph/vcvtuqq2ph
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h (_mm512_cvtepi32_ph): New
intrinsic.
(_mm512_mask_cvtepi32_ph): Likewise.
(_mm512_maskz_cvtepi32_ph): Likewise.
(_mm512_cvt_roundepi32_ph): Likewise.
(_mm512_mask_cvt_roundepi32_ph): Likewise.
(_mm512_maskz_cvt_roundepi32_ph): Likewise.
(_mm512_cvtepu32_ph): Likewise.
(_mm512_mask_cvtepu32_ph): Likewise.
(_mm512_maskz_cvtepu32_ph): Likewise.
(_mm512_cvt_roundepu32_ph): Likewise.
(_mm512_mask_cvt_roundepu32_ph): Likewise.
(_mm512_maskz_cvt_roundepu32_ph): Likewise.
(_mm512_cvtepi64_ph): Likewise.
(_mm512_mask_cvtepi64_ph): Likewise.
(_mm512_maskz_cvtepi64_ph): Likewise.
(_mm512_cvt_roundepi64_ph): Likewise.
(_mm512_mask_cvt_roundepi64_ph): Likewise.
(_mm512_maskz_cvt_roundepi64_ph): Likewise.
(_mm512_cvtepu64_ph): Likewise.
(_mm512_mask_cvtepu64_ph): Likewise.
(_mm512_maskz_cvtepu64_ph): Likewise.
(_mm512_cvt_roundepu64_ph): Likewise.
(_mm512_mask_cvt_roundepu64_ph): Likewise.
(_mm512_maskz_cvt_roundepu64_ph): Likewise.
(_mm512_cvtepi16_ph): Likewise.
(_mm512_mask_cvtepi16_ph): Likewise.
(_mm512_maskz_cvtepi16_ph): Likewise.
(_mm512_cvt_roundepi16_ph): Likewise.
(_mm512_mask_cvt_roundepi16_ph): Likewise.
(_mm512_maskz_cvt_roundepi16_ph): Likewise.
(_mm512_cvtepu16_ph): Likewise.
(_mm512_mask_cvtepu16_ph): Likewise.
(_mm512_maskz_cvtepu16_ph): Likewise.
(_mm512_cvt_roundepu16_ph): Likewise.
(_mm512_mask_cvt_roundepu16_ph): Likewise.
(_mm512_maskz_cvt_roundepu16_ph): Likewise.
* config/i386/avx512fp16vlintrin.h (_mm_cvtepi32_ph): New
intrinsic.
(_mm_mask_cvtepi32_ph): Likewise.
(_mm_maskz_cvtepi32_ph): Likewise.
(_mm256_cvtepi32_ph): Likewise.
(_mm256_mask_cvtepi32_ph): Likewise.
(_mm256_maskz_cvtepi32_ph): Likewise.
(_mm_cvtepu32_ph): Likewise.
(_mm_mask_cvtepu32_ph): Likewise.
(_mm_maskz_cvtepu32_ph): Likewise.
(_mm256_cvtepu32_ph): Likewise.
(_mm256_mask_cvtepu32_ph): Likewise.
(_mm256_maskz_cvtepu32_ph): Likewise.
(_mm_cvtepi64_ph): Likewise.
(_mm_mask_cvtepi64_ph): Likewise.
(_mm_maskz_cvtepi64_ph): Likewise.
(_mm256_cvtepi64_ph): Likewise.
(_mm256_mask_cvtepi64_ph): Likewise.
(_mm256_maskz_cvtepi64_ph): Likewise.
(_mm_cvtepu64_ph): Likewise.
(_mm_mask_cvtepu64_ph): Likewise.
(_mm_maskz_cvtepu64_ph): Likewise.
(_mm256_cvtepu64_ph): Likewise.
(_mm256_mask_cvtepu64_ph): Likewise.
(_mm256_maskz_cvtepu64_ph): Likewise.
(_mm_cvtepi16_ph): Likewise.
(_mm_mask_cvtepi16_ph): Likewise.
(_mm_maskz_cvtepi16_ph): Likewise.
(_mm256_cvtepi16_ph): Likewise.
(_mm256_mask_cvtepi16_ph): Likewise.
(_mm256_maskz_cvtepi16_ph): Likewise.
(_mm_cvtepu16_ph): Likewise.
(_mm_mask_cvtepu16_ph): Likewise.
(_mm_maskz_cvtepu16_ph): Likewise.
(_mm256_cvtepu16_ph): Likewise.
(_mm256_mask_cvtepu16_ph): Likewise.
(_mm256_maskz_cvtepu16_ph): Likewise.
* config/i386/i386-builtin-types.def: Add corresponding builtin types.
* config/i386/i386-builtin.def: Add corresponding new builtins.
* config/i386/i386-expand.c
(ix86_expand_args_builtin): Handle new builtin types.
(ix86_expand_round_builtin): Ditto.
* config/i386/i386-modes.def: Declare V2HF and V6HF.
* config/i386/sse.md (VI2H_AVX512VL): New.
(qq2phsuff): Ditto.
(sseintvecmode): Add HF vector modes.
(avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode><mask_name><round_name>):
New.
(avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>): Ditto.
(*avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>): Ditto.
(avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>_mask): Ditto.
(*avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>_mask): Ditto.
(*avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>_mask_1): Ditto.
(avx512fp16_vcvt<floatsuffix>qq2ph_v2di): Ditto.
(*avx512fp16_vcvt<floatsuffix>qq2ph_v2di): Ditto.
(avx512fp16_vcvt<floatsuffix>qq2ph_v2di_mask): Ditto.
(*avx512fp16_vcvt<floatsuffix>qq2ph_v2di_mask): Ditto.
(*avx512fp16_vcvt<floatsuffix>qq2ph_v2di_mask_1): Ditto.
* config/i386/subst.md (round_qq2phsuff): New subst_attr.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add test for new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Add test for new intrinsics.
* gcc.target/i386/sse-22.c: Ditto.
liuhongt [Mon, 2 Mar 2020 09:14:10 +0000 (17:14 +0800)]
AVX512FP16: Add testcase for vcvtph2w/vcvtph2uw/vcvtph2dq/vcvtph2udq/vcvtph2qq/vcvtph2uqq.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-helper.h (V512): Add QI
components.
* gcc.target/i386/avx512fp16-vcvtph2dq-1a.c: New test.
* gcc.target/i386/avx512fp16-vcvtph2dq-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2qq-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2qq-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2udq-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2udq-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2uqq-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2uqq-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2uw-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2uw-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2w-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vcvtph2w-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2dq-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2dq-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2qq-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2qq-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2udq-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2udq-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2uqq-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2uqq-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2uw-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2uw-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2w-1a.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcvtph2w-1b.c: Ditto.
liuhongt [Mon, 4 Mar 2019 21:46:01 +0000 (13:46 -0800)]
AVX512FP16: Add vcvtph2dq/vcvtph2qq/vcvtph2w/vcvtph2uw/vcvtph2uqq/vcvtph2udq
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h (_mm512_cvtph_epi32):
New intrinsic/
(_mm512_mask_cvtph_epi32): Likewise.
(_mm512_maskz_cvtph_epi32): Likewise.
(_mm512_cvt_roundph_epi32): Likewise.
(_mm512_mask_cvt_roundph_epi32): Likewise.
(_mm512_maskz_cvt_roundph_epi32): Likewise.
(_mm512_cvtph_epu32): Likewise.
(_mm512_mask_cvtph_epu32): Likewise.
(_mm512_maskz_cvtph_epu32): Likewise.
(_mm512_cvt_roundph_epu32): Likewise.
(_mm512_mask_cvt_roundph_epu32): Likewise.
(_mm512_maskz_cvt_roundph_epu32): Likewise.
(_mm512_cvtph_epi64): Likewise.
(_mm512_mask_cvtph_epi64): Likewise.
(_mm512_maskz_cvtph_epi64): Likewise.
(_mm512_cvt_roundph_epi64): Likewise.
(_mm512_mask_cvt_roundph_epi64): Likewise.
(_mm512_maskz_cvt_roundph_epi64): Likewise.
(_mm512_cvtph_epu64): Likewise.
(_mm512_mask_cvtph_epu64): Likewise.
(_mm512_maskz_cvtph_epu64): Likewise.
(_mm512_cvt_roundph_epu64): Likewise.
(_mm512_mask_cvt_roundph_epu64): Likewise.
(_mm512_maskz_cvt_roundph_epu64): Likewise.
(_mm512_cvtph_epi16): Likewise.
(_mm512_mask_cvtph_epi16): Likewise.
(_mm512_maskz_cvtph_epi16): Likewise.
(_mm512_cvt_roundph_epi16): Likewise.
(_mm512_mask_cvt_roundph_epi16): Likewise.
(_mm512_maskz_cvt_roundph_epi16): Likewise.
(_mm512_cvtph_epu16): Likewise.
(_mm512_mask_cvtph_epu16): Likewise.
(_mm512_maskz_cvtph_epu16): Likewise.
(_mm512_cvt_roundph_epu16): Likewise.
(_mm512_mask_cvt_roundph_epu16): Likewise.
(_mm512_maskz_cvt_roundph_epu16): Likewise.
* config/i386/avx512fp16vlintrin.h (_mm_cvtph_epi32):
New intrinsic.
(_mm_mask_cvtph_epi32): Likewise.
(_mm_maskz_cvtph_epi32): Likewise.
(_mm256_cvtph_epi32): Likewise.
(_mm256_mask_cvtph_epi32): Likewise.
(_mm256_maskz_cvtph_epi32): Likewise.
(_mm_cvtph_epu32): Likewise.
(_mm_mask_cvtph_epu32): Likewise.
(_mm_maskz_cvtph_epu32): Likewise.
(_mm256_cvtph_epu32): Likewise.
(_mm256_mask_cvtph_epu32): Likewise.
(_mm256_maskz_cvtph_epu32): Likewise.
(_mm_cvtph_epi64): Likewise.
(_mm_mask_cvtph_epi64): Likewise.
(_mm_maskz_cvtph_epi64): Likewise.
(_mm256_cvtph_epi64): Likewise.
(_mm256_mask_cvtph_epi64): Likewise.
(_mm256_maskz_cvtph_epi64): Likewise.
(_mm_cvtph_epu64): Likewise.
(_mm_mask_cvtph_epu64): Likewise.
(_mm_maskz_cvtph_epu64): Likewise.
(_mm256_cvtph_epu64): Likewise.
(_mm256_mask_cvtph_epu64): Likewise.
(_mm256_maskz_cvtph_epu64): Likewise.
(_mm_cvtph_epi16): Likewise.
(_mm_mask_cvtph_epi16): Likewise.
(_mm_maskz_cvtph_epi16): Likewise.
(_mm256_cvtph_epi16): Likewise.
(_mm256_mask_cvtph_epi16): Likewise.
(_mm256_maskz_cvtph_epi16): Likewise.
(_mm_cvtph_epu16): Likewise.
(_mm_mask_cvtph_epu16): Likewise.
(_mm_maskz_cvtph_epu16): Likewise.
(_mm256_cvtph_epu16): Likewise.
(_mm256_mask_cvtph_epu16): Likewise.
(_mm256_maskz_cvtph_epu16): Likewise.
* config/i386/i386-builtin-types.def: Add new builtin types.
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/i386-expand.c
(ix86_expand_args_builtin): Handle new builtin types.
(ix86_expand_round_builtin): Ditto.
* config/i386/sse.md (sseintconvert): New.
(ssePHmode): Ditto.
(UNSPEC_US_FIX_NOTRUNC): Ditto.
(sseintconvertsignprefix): Ditto.
(avx512fp16_vcvtph2<sseintconvertsignprefix><sseintconvert>_<mode><mask_name><round_name>):
Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add test for new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Add test for new intrinsics.
* gcc.target/i386/sse-22.c: Ditto.
liuhongt [Mon, 2 Mar 2020 10:04:45 +0000 (18:04 +0800)]
AVX512FP16: Add testcase for vmovsh/vmovw.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-vmovsh-1a.c: New test.
* gcc.target/i386/avx512fp16-vmovsh-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vmovw-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vmovw-1b.c: Ditto.
* gcc.target/i386/avx512fp16-vmovw-2a.c: Ditto.
* gcc.target/i386/avx512fp16-vmovw-2b.c: Ditto.
* gcc.target/i386/avx512fp16-vmovw-3a.c: Ditto.
* gcc.target/i386/avx512fp16-vmovw-3b.c: Ditto.
* gcc.target/i386/avx512fp16-vmovw-4a.c: Ditto.
* gcc.target/i386/avx512fp16-vmovw-4b.c: Ditto.
liuhongt [Thu, 28 Feb 2019 19:43:30 +0000 (11:43 -0800)]
AVX512FP16: Add vmovw/vmovsh.
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h: (_mm_cvtsi16_si128):
New intrinsic.
(_mm_cvtsi128_si16): Likewise.
(_mm_mask_load_sh): Likewise.
(_mm_maskz_load_sh): Likewise.
(_mm_mask_store_sh): Likewise.
(_mm_move_sh): Likewise.
(_mm_mask_move_sh): Likewise.
(_mm_maskz_move_sh): Likewise.
* config/i386/i386-builtin-types.def: Add corresponding builtin types.
* config/i386/i386-builtin.def: Add corresponding new builtins.
* config/i386/i386-expand.c
(ix86_expand_special_args_builtin): Handle new builtin types.
(ix86_expand_vector_init_one_nonzero): Adjust for FP16 target.
* config/i386/sse.md (VI2F): New mode iterator.
(vec_set<mode>_0): Use new mode iterator.
(avx512f_mov<ssescalarmodelower>_mask): Adjust for HF vector mode.
(avx512f_store<mode>_mask): Ditto.
Jason Merrill [Wed, 15 Sep 2021 23:01:14 +0000 (19:01 -0400)]
c++: Small location tweak
As Marek suggested.
gcc/cp/ChangeLog:
* constexpr.c (cxx_eval_outermost_constant_expr): Use
protected_set_expr_location.
Kewen Lin [Thu, 16 Sep 2021 01:02:00 +0000 (20:02 -0500)]
rs6000: Remove useless toc-fusion option
toc-fusion was intended for Power9 toc fusion previously,
but Power9 doesn't support fusion at all eventually, this
patch is to remove this useless option.
gcc/ChangeLog:
* config/rs6000/rs6000.opt (-mtoc-fusion): Remove.
GCC Administrator [Thu, 16 Sep 2021 00:16:28 +0000 (00:16 +0000)]
Daily bump.
Patrick Palka [Wed, 15 Sep 2021 22:41:21 +0000 (18:41 -0400)]
c++: shortcut bad convs during overload resolution, part 2 [PR101904]
The r12-3346 change makes us avoid computing excess argument conversions
during overload resolution, but only when it turns out there's a
strictly viable candidate in the overload set. If there's no such
candidate then we still need to compute more conversions than strictly
necessary because subsequent conversions after the first bad conversion
can turn a non-strictly viable candidate into an unviable one, and that
affects the outcome of overload resolution and the behavior of its
callers (because of -fpermissive).
But at least in a SFINAE context, the distinction between a non-strictly
viable and an unviable candidate shouldn't matter all that much since
performing a bad conversion is always an error (even with -fpermissive),
and so forming a call to a non-strictly viable candidate will end up
being a SFINAE error anyway, just like in the unviable case. Hence a
non-strictly viable candidate is effectively unviable (in a SFINAE
context), and we don't really need to distinguish between the two kinds.
We can take advantage of this observation to avoid computing excess
argument conversions even when there's no strictly viable candidate in
the overload set.
This patch implements this idea. We usually detect a SFINAE context by
looking for the absence of the tf_error flag, but that's not specific
enough: we can also get here from build_user_type_conversion with
tf_error cleared, and there the distinction between a non-strictly
viable candidate and an unviable candidate still matters (it determines
whether a user-defined conversion is bad or just doesn't exist). So this
patch sets and checks for the tf_conv flag to detect this situation too,
which avoids regressing conv2.C below.
Unlike the previous change, this one does affect the outcome of overload
resolution, but it should do so only in a way that preserves backwards
compatibility with -fpermissive.
PR c++/101904
gcc/cp/ChangeLog:
* call.c (build_user_type_conversion_1): Add tf_conv to complain.
(add_candidates): When in a SFINAE context, instead of adding a
candidate to bad_fns just mark it unviable.
gcc/testsuite/ChangeLog:
* g++.dg/ext/conv2.C: New test.
* g++.dg/template/conv17.C: Extend test.
David Edelsohn [Wed, 15 Sep 2021 21:10:35 +0000 (17:10 -0400)]
rs6000: fix xcoff section encoding
The encoding needs to be applied if the decl is not an alias: both a NULL
summary *OR* the decl alias flag is false. This patch updates the
earlier fix to continue with the encoding selection if the summary is
NULL.
gcc/ChangeLog:
* config/rs6000/rs6000.c (rs6000_xcoff_encode_section_info):
Proceed if no symbol summary or the symbol alias flag is false.
Jason Merrill [Fri, 10 Sep 2021 20:36:21 +0000 (16:36 -0400)]
c++: add parsing_function_declarator predicate
While looking at PR96184 I noticed that we were recognizing the situation of
parsing a function declarator based on current_binding_level, and that we
ought to make that a predicate function. This patch is just refactoring,
but I just suggested using it in a review of another patch.
gcc/cp/ChangeLog:
* cp-tree.h (parsing_function_declarator): Declare.
* name-lookup.c (set_decl_context_in_fn): Use it.
* parser.c (cp_parser_direct_declarator): Use it.
(parsing_function_declarator): New.
Jakub Jelinek [Wed, 15 Sep 2021 20:21:17 +0000 (22:21 +0200)]
c++: Fix handling of decls with flexible array members initialized with side-effects [PR88578]
> > Note, if the flexible array member is initialized only with non-constant
> > initializers, we have a worse bug that this patch doesn't solve, the
> > splitting of initializers into constant and dynamic initialization removes
> > the initializer and we don't have just wrong DECL_*SIZE, but nothing is
> > emitted when emitting those vars into assembly either and so the dynamic
> > initialization clobbers other vars that may overlap the variable.
> > I think we need keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
> > flexible array member in that case.
>
> Makes sense.
So, the following patch fixes that.
The typeck2.c change makes sure we keep those CONSTRUCTORs around (although
they should be empty because all their elts had side-effects/was
non-constant if it was removed earlier), and the varasm.c change is to avoid
ICEs on those as well as ICEs on other flex array members that had some
initializers without side-effects, but not on the last array element.
The code was already asserting that the (index of the last elt in the
CONSTRUCTOR + 1) times elt size is equal to TYPE_SIZE_UNIT of the local->val
type, which is true for C flex arrays or for C++ if they don't have any
side-effects or the last elt doesn't have side-effects, this patch changes
that to assertion that the TYPE_SIZE_UNIT is greater than equal to the
offset of the end of last element in the CONSTRUCTOR and uses TYPE_SIZE_UNIT
(int_size_in_bytes) in the code later on.
2021-09-15 Jakub Jelinek <jakub@redhat.com>
PR c++/88578
PR c++/102295
gcc/
* varasm.c (output_constructor_regular_field): Instead of assertion
that array_size_for_constructor result is equal to size of
TREE_TYPE (local->val) in bytes, assert that the type size is greater
or equal to array_size_for_constructor result and use type size as
fieldsize.
gcc/cp/
* typeck2.c (split_nonconstant_init_1): Don't throw away empty
initializers of flexible array members if they have non-zero type
size.
gcc/testsuite/
* g++.dg/ext/flexary39.C: New test.
* g++.dg/ext/flexary40.C: New test.
Patrick Palka [Wed, 15 Sep 2021 17:54:53 +0000 (13:54 -0400)]
c++: default ctor that's also a list ctor [PR102050]
In grok_special_member_properties we need to set TYPE_HAS_COPY_CTOR,
TYPE_HAS_DEFAULT_CONSTRUCTOR and TYPE_HAS_LIST_CTOR independently
from each other because a constructor can be both a default and list
constructor (as in the first testcase), or both a default and copy
constructor (as in the second testcase).
PR c++/102050
gcc/cp/ChangeLog:
* decl.c (grok_special_member_properties): Set
TYPE_HAS_COPY_CTOR, TYPE_HAS_DEFAULT_CONSTRUCTOR
and TYPE_HAS_LIST_CTOR independently from each other.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/initlist125.C: New test.
* g++.dg/cpp0x/initlist126.C: New test.
Alexandre Oliva [Wed, 15 Sep 2021 16:04:54 +0000 (13:04 -0300)]
zero-call-used-regs attr for ada
Make the zero_call_used_regs attribute usable as a Machine_Attribute
pragma.
for gcc/ada/ChangeLog
* gcc-interface/utils.c: Include opts.h.
(handle_zero_call_used_regs_attribute): New.
(gnat_internal_attribute_table): Add zero_call_used_regs.
for gcc/testsuite/ChangeLog
* gnat.dg/zcur_attr.adb, gnat.dg/zcur_attr.ads: New.
Martin Liska [Wed, 15 Sep 2021 15:45:21 +0000 (17:45 +0200)]
i386: port vxworks to TARGET_CPU_P macro
PR target/102351
gcc/ChangeLog:
* config/i386/vxworks.h: Use new macro TARGET_CPU_P.