Andrew MacLeod [Tue, 23 Aug 2022 14:17:02 +0000 (10:17 -0400)]
Process unsigned overflow relations for plus and minus in range-ops.
If a relation is available, calculate overflow and normal ranges. Then
apply as appropriate.
gcc/
* range-op.cc (plus_minus_ranges): New.
(adjust_op1_for_overflow): New.
(operator_plus::op1_range): Use new adjustment.
(operator_plus::op2_range): Ditto.
(operator_minus::op1_range): Ditto.
* value-relation.h (relation_lt_le_gt_ge_p): New.
gcc/testsuite/
* gcc.dg/tree-ssa/pr79095.c: Test evrp pass rather than vrp1.
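As a rough illustration of the kind of relation involved (a standalone
sketch, not the code in range-op.cc; the helper names add_wrapped and
sub_wrapped are made up here): for unsigned operands, a sum has wrapped
exactly when the result is less than an operand, and a difference has
wrapped exactly when the result is greater than the minuend. Knowing such
a relation is what allows the overflow and non-overflow ranges to be told
apart.

```cpp
#include <cstdint>

// Hypothetical helpers, 8-bit for readability.

// lhs = a + b wrapped around iff lhs < a.
constexpr bool add_wrapped(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t>(a + b) < a;
}

// lhs = a - b wrapped around iff lhs > a.
constexpr bool sub_wrapped(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t>(a - b) > a;
}

static_assert(add_wrapped(200, 100), "300 wraps to 44, and 44 < 200");
static_assert(!add_wrapped(100, 100), "200 fits, and 200 >= 100");
static_assert(sub_wrapped(5, 10), "-5 wraps to 251, and 251 > 5");
static_assert(!sub_wrapped(10, 5), "5 fits, and 5 <= 10");
```

The checks are done with static_assert, so compiling the file is enough
to confirm them.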
Andrew MacLeod [Thu, 22 Sep 2022 22:17:20 +0000 (18:17 -0400)]
Refine ranges using relations in GORI.
This allows GORI to recognize when a relation passed in applies to the
2 operands of the current statement. Check to see if further range
refinement is possible before proceeding.
* gimple-range-gori.cc (gori_compute::refine_using_relation): New.
(gori_compute::compute_operand1_range): Invoke
refine_using_relation when applicable.
(gori_compute::compute_operand2_range): Ditto.
* gimple-range-gori.h (class gori_compute): Adjust prototypes.
Andrew MacLeod [Thu, 22 Sep 2022 21:55:56 +0000 (17:55 -0400)]
Track value_relations in GORI.
This allows GORI to recognize and pass relations along the calculation chain.
This will allow relations between the LHS and the operand being calculated
to be utilized in op1_range and op2_range.
* gimple-range-gori.cc (gori_compute::compute_operand_range):
Create a relation record and pass it along when possible.
(gori_compute::compute_operand1_range): Pass relation along.
(gori_compute::compute_operand2_range): Ditto.
(gori_compute::compute_operand1_and_operand2_range): Ditto.
* gimple-range-gori.h (class gori_compute): Adjust prototypes.
* gimple-range-op.cc (gimple_range_op_handler::calc_op1): Pass
relation to op1_range call.
(gimple_range_op_handler::calc_op2): Pass relation to op2_range call.
* gimple-range-op.h (class gimple_range_op_handler): Adjust
prototypes.
Andrew MacLeod [Thu, 22 Sep 2022 21:27:36 +0000 (17:27 -0400)]
Move class value_relation to the header file.
* value-relation.cc (class value_relation): Move to .h file.
(value_relation::set_relation): Ditto.
(value_relation::value_relation): Ditto.
* value-relation.h (class value_relation): Move from .cc file.
(value_relation::set_relation): Ditto.
(value_relation::value_relation): Ditto.
Andrew MacLeod [Tue, 27 Sep 2022 23:12:06 +0000 (19:12 -0400)]
Audit op1_range and op2_range for undefined LHS.
If the LHS is undefined, GORI should cease looking. There are numerous
places where this happens, and a few potential traps.
* range-op.cc (operator_minus::op2_range): Check for undefined.
(operator_mult::op1_range): Ditto.
(operator_exact_divide::op1_range): Ditto.
(operator_lshift::op1_range): Ditto.
(operator_rshift::op1_range): Ditto.
(operator_cast::op1_range): Ditto.
(operator_bitwise_and::op1_range): Ditto.
(operator_bitwise_or::op1_range): Ditto.
(operator_trunc_mod::op1_range): Ditto.
(operator_trunc_mod::op2_range): Ditto.
(operator_bitwise_not::op1_range): Ditto.
(pointer_or_operator::op1_range): Ditto.
(range_op_handler::op1_range): Ditto.
(range_op_handler::op2_range): Ditto.
Andrew MacLeod [Tue, 27 Sep 2022 22:42:33 +0000 (18:42 -0400)]
Remove undefined behaviour from testcase.
There was a patch posted to remove the undefined behaviour from this
testcase, but it appears to never have been applied.
gcc/testsuite/
PR tree-optimization/102892
* gcc.dg/pr102892-1.c: Remove undefined behaviour.
Patrick Palka [Thu, 29 Sep 2022 20:27:30 +0000 (16:27 -0400)]
c++: implicit lookup of std::initializer_list [PR102576]
Here the lookup for the implicit use of std::initializer_list fails
because we do it using get_namespace_binding, which isn't import aware.
Fix this by using lookup_qualified_name instead.
PR c++/102576
gcc/cp/ChangeLog:
* pt.cc (listify): Use lookup_qualified_name instead of
get_namespace_binding.
gcc/testsuite/ChangeLog:
* g++.dg/modules/pr102576_a.H: New test.
* g++.dg/modules/pr102576_b.C: New test.
Jason Merrill [Thu, 29 Sep 2022 17:45:02 +0000 (13:45 -0400)]
c++: fix triviality of class with unsatisfied op=
cxx20_pair is trivially copyable because it has a trivial copy constructor
and only a deleted copy assignment operator; the non-triviality of the
unsatisfied copy assignment overload is not considered.
gcc/cp/ChangeLog:
* class.cc (check_methods): Call constraints_satisfied_p.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/cond-triv3.C: New test.
François Dumont [Thu, 22 Sep 2022 04:58:48 +0000 (06:58 +0200)]
libstdc++: [_GLIBCXX_INLINE_VERSION] Add gdb pretty print for _GLIBCXX_DEBUG
In _GLIBCXX_DEBUG mode containers are in std::__debug namespace but not template
parameters. In _GLIBCXX_INLINE_VERSION mode most types are in std::__8 namespace but
not std::__debug containers. We need to register specific type printers for this
combination.
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (add_one_template_type_printer): Register
printer for types in std::__debug namespace with template parameters in std::__8
namespace.
Olivier Hainque [Mon, 7 Mar 2022 11:50:27 +0000 (11:50 +0000)]
Improve comments and INITFINI macro use in vxcrtstuff.c
This change augments the comment attached to the use of auto-host.h
in vxcrtstuff.c to better describe the reason for including it and
for the associated series of #undef directives.
It also augments the comment on dso_handle and removes a redundant
guard on HAVE_INITFINI_ARRAY_SUPPORT for the shared version of the
objects, nested within a section guarded on USE_INITFINI_ARRAY.
2022-09-29 Olivier Hainque <hainque@adacore.com>
libgcc/
* config/vxcrtstuff.c: Improve the comment attached to the use
of auto-host.h and of __dso_handle. Remove redundant guard on
HAVE_INITFINI_ARRAY_SUPPORT within a USE_INITFINI_ARRAY section.
Jason Merrill [Tue, 20 Sep 2022 21:12:29 +0000 (17:12 -0400)]
c++: check DECL_INITIAL for constexpr
We were overlooking non-potentially-constant bits in variable initializers
because we didn't walk into DECL_INITIAL.
gcc/cp/ChangeLog:
* constexpr.cc (potential_constant_expression_1): Look into
DECL_INITIAL. Use location wrappers.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/constexpr-local4.C: Expect error sooner.
* g++.dg/cpp2a/consteval24.C: Likewise.
* g++.dg/cpp2a/consteval7.C: Likewise.
* g++.dg/cpp2a/inline-asm3.C: Likewise.
Jason Merrill [Fri, 23 Sep 2022 13:07:22 +0000 (09:07 -0400)]
c++: fix class-valued ?: extension
When the gimplifier encounters the same TARGET_EXPR twice, it evaluates
TARGET_EXPR_INITIAL the first time and clears it so that the later
evaluation is just the temporary. With this testcase, using the extension
to treat an omitted middle operand as repeating the first operand, that led
to doing a bitwise copy of the S(1) temporary on return rather than properly
calling the copy constructor.
We can't use S(1) to initialize the return value here anyway, because we
need to materialize it into a temporary so we can convert it to bool and
determine which arm we're evaluating. So let's just treat the middle
operand as an xvalue.
PR c++/93046
gcc/cp/ChangeLog:
* call.cc (build_conditional_expr): For a?:c extension, treat
a reused class prvalue as an xvalue.
gcc/testsuite/ChangeLog:
* g++.dg/ext/cond4.C: Add runtime test.
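For readers unfamiliar with the extension: GNU C++ lets the middle operand
of ?: be omitted, in which case a ?: c yields a if it converts to true and
c otherwise, with a evaluated only once. A minimal sketch with scalars
follows (value_or is a hypothetical name; the class-valued case fixed by
this patch is not reproduced here, and the code assumes a compiler that
accepts the GNU extension):

```cpp
// GNU extension: conditional with omitted middle operand.  Not ISO C++.
constexpr int value_or(int v, int fallback)
{
  return v ?: fallback;   // like v ? v : fallback, but v is evaluated once
}

static_assert(value_or(7, 42) == 7, "non-zero first operand is the result");
static_assert(value_or(0, 42) == 42, "zero first operand selects the fallback");
```

With a class type as the first operand, the same value has to serve both
as the condition and as a possible result, which is the situation the
commit message describes.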
Jason Merrill [Mon, 19 Sep 2022 17:08:10 +0000 (19:08 +0200)]
c++: reduce temporaries in ?:
When the sides of ?: are class prvalues, we wrap the COND_EXPR in a
TARGET_EXPR so that both sides will initialize the same temporary. But in
this case we were stripping the outer TARGET_EXPR and conditionally creating
different temporaries, unnecessarily using extra stack. The
recently added TARGET_EXPR_NO_ELIDE flag avoids this.
gcc/cp/ChangeLog:
* call.cc (build_conditional_expr): Set TARGET_EXPR_NO_ELIDE on the
outer TARGET_EXPR.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/cond-temp1.C: New test.
Andrew Stubbs [Mon, 26 Sep 2022 16:33:38 +0000 (17:33 +0100)]
amdgcn: remove unused variable
This was left over from a previous version of the SIMD clone patch.
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_simd_clone_compute_vecsize_and_simdlen):
Remove unused elt_bits variable.
Olivier Hainque [Thu, 10 Mar 2022 10:53:27 +0000 (10:53 +0000)]
Comment about HAVE_INITFINI_ARRAY_SUPPORT in vxworks.h
Explain that we rely on compiler .c files
to include auto-host.h before target configuration headers.
2022-09-29 Olivier Hainque <hainque@adacore.com>
gcc/
* config/vxworks.h: Add comment on our use of
HAVE_INITFINI_ARRAY_SUPPORT.
Olivier Hainque [Sun, 20 Mar 2022 17:39:15 +0000 (17:39 +0000)]
Add an mcmodel=large multilib for aarch64-vxworks
This makes good sense in general anyway, and in particular
with forthcoming support for shared libraries, which will
work for mrtp alone but not yet for mrtp+mcmodel=large.
2022-09-29 Olivier Hainque <hainque@adacore.com>
gcc/
* config/aarch64/t-aarch64-vxworks: Request multilib
variants for mcmodel=large.
Olivier Hainque [Tue, 19 Apr 2022 09:07:32 +0000 (09:07 +0000)]
Remove TARGET_FLOAT128_ENABLE_TYPE setting for VxWorks
We have, in vxworks.h:
/* linux64.h enables this, not supported in vxWorks. */
#undef TARGET_FLOAT128_ENABLE_TYPE
#define TARGET_FLOAT128_ENABLE_TYPE 0
We inherit linux64.h for a few reasons, but don't really support
float128 for vxworks, so the setting made sense.
Many tests rely on the linux default (1) though, so resetting is
causing lots of failures on compilation tests that would pass otherwise.
Not resetting lets users write code declaring float128
objects but linking will typically fail at some point, so
there's no real adverse effect.
Bottom line is we don't have any particular incentive to alter
the default, whatever the default, so better leave the parameter
alone.
2022-09-29 Olivier Hainque <hainque@adacore.com>
gcc/
* config/rs6000/vxworks.h (TARGET_FLOAT128_ENABLE_TYPE): Remove
resetting to 0.
Olivier Hainque [Thu, 10 Mar 2022 10:46:19 +0000 (10:46 +0000)]
Robustify DWARF2_UNWIND_INFO handling in vx-common.h
This adjusts vx-common.h to #define DWARF2_UNWIND_INFO to 0
when ARM_UNWIND_INFO is set, preventing defaults.h from
possibly setting DWARF2_UNWIND_INFO to 1 (as well) on its own
afterwards if the macro isn't defined.
2022-09-29 Olivier Hainque <hainque@adacore.com>
gcc/
* config/vx-common.h (DWARF2_UNWIND_INFO): #define to 0
when ARM_UNWIND_INFO is set.
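The ordering issue can be modelled in a few lines of preprocessor code.
This is a simplified sketch, not the contents of vx-common.h or
defaults.h: the value given to ARM_UNWIND_INFO and the shape of the
"defaults" step are assumptions made for illustration.

```cpp
// Assumed target setting for this sketch.
#define ARM_UNWIND_INFO 1

// Target-header step: pin the macro to 0 when ARM unwinding is used.
#if ARM_UNWIND_INFO
# define DWARF2_UNWIND_INFO 0
#endif

// Defaults step (modelled): only supplies a value when none is defined,
// so it can no longer turn DWARF2 unwind info on behind our back.
#ifndef DWARF2_UNWIND_INFO
# define DWARF2_UNWIND_INFO 1
#endif

constexpr int dwarf2_unwind_info = DWARF2_UNWIND_INFO;
static_assert(dwarf2_unwind_info == 0, "explicit 0 survives the defaults step");
```

Without the explicit definition in the first step, the second step would
leave both unwind schemes enabled in this model.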
Julian Brown [Tue, 27 Sep 2022 17:39:59 +0000 (17:39 +0000)]
OpenACC: whole struct vs. component mappings (PR107028)
This patch fixes an ICE when both a complete struct variable and
components of that struct are mapped on the same directive for OpenACC,
using a modified version of the scheme used for OpenMP in the following
patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601558.html
A new function has been added to make sure that the mapping kinds of
the whole struct and the member access are compatible -- conservatively,
so as not to copy more to/from the device than the user expects.
This version of the patch uses a different method to detect duplicate
clauses for OpenACC in oacc_resolve_clause_dependencies, and removes
the now-redundant check in omp_accumulate_sibling_lists. (The latter
check would no longer trigger when we map the whole struct on the same
directive because the component-mapping clauses are now deleted before
the check is executed.)
2022-09-28 Julian Brown <julian@codesourcery.com>
gcc/
PR middle-end/107028
* gimplify.cc (omp_check_mapping_compatibility,
oacc_resolve_clause_dependencies): New functions.
(omp_accumulate_sibling_list): Remove redundant duplicate clause
detection for OpenACC.
(build_struct_sibling_lists): Skip deleted groups. Don't build sibling
list for struct variables that are fully mapped on the same directive
for OpenACC.
(gimplify_scan_omp_clauses): Call oacc_resolve_clause_dependencies.
gcc/testsuite/
PR middle-end/107028
* c-c++-common/goacc/struct-component-kind-1.c: New test.
* g++.dg/goacc/pr107028-1.C: New test.
* g++.dg/goacc/pr107028-2.C: New test.
* gfortran.dg/goacc/mapping-tests-5.f90: New test.
Patrick Palka [Thu, 29 Sep 2022 13:18:40 +0000 (09:18 -0400)]
c++: implement __remove_cv, __remove_reference and __remove_cvref
This implements builtins for std::remove_cv, std::remove_reference and
std::remove_cvref using TRAIT_TYPE from the previous patch.
gcc/c-family/ChangeLog:
* c-common.cc (c_common_reswords): Add __remove_cv,
__remove_reference and __remove_cvref.
* c-common.h (enum rid): Add RID_REMOVE_CV, RID_REMOVE_REFERENCE
and RID_REMOVE_CVREF.
gcc/cp/ChangeLog:
* constraint.cc (diagnose_trait_expr): Handle CPTK_REMOVE_CV,
CPTK_REMOVE_REFERENCE and CPTK_REMOVE_CVREF.
* cp-objcp-common.cc (names_builtin_p): Likewise.
* cp-tree.h (enum cp_trait_kind): Add CPTK_REMOVE_CV,
CPTK_REMOVE_REFERENCE and CPTK_REMOVE_CVREF.
* cxx-pretty-print.cc (pp_cxx_trait): Handle CPTK_REMOVE_CV,
CPTK_REMOVE_REFERENCE and CPTK_REMOVE_CVREF.
* parser.cc (cp_keyword_starts_decl_specifier_p): Return true
for RID_REMOVE_CV, RID_REMOVE_REFERENCE and RID_REMOVE_CVREF.
(cp_parser_trait): Handle RID_REMOVE_CV, RID_REMOVE_REFERENCE
and RID_REMOVE_CVREF.
(cp_parser_simple_type_specifier): Likewise.
* semantics.cc (finish_trait_type): Likewise.
libstdc++-v3/ChangeLog:
* include/bits/unique_ptr.h (unique_ptr<_Tp[], _Dp>): Remove
__remove_cv and use __remove_cv_t instead.
gcc/testsuite/ChangeLog:
* g++.dg/ext/has-builtin-1.C: Test existence of __remove_cv,
__remove_reference and __remove_cvref.
* g++.dg/ext/remove_cv.C: New test.
* g++.dg/ext/remove_reference.C: New test.
* g++.dg/ext/remove_cvref.C: New test.
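For reference, the three standard traits that these builtins back differ
as sketched below. The example uses only the library traits so that it
compiles without the new builtins; remove_cvref_sketch is a hypothetical
alias standing in for std::remove_cvref, which is C++20.

```cpp
#include <type_traits>

// Portable stand-in for std::remove_cvref_t.
template<class T>
using remove_cvref_sketch =
  typename std::remove_cv<typename std::remove_reference<T>::type>::type;

// remove_cv strips only top-level cv-qualifiers; a reference has none.
static_assert(std::is_same<std::remove_cv<const int&>::type,
                           const int&>::value, "");
static_assert(std::is_same<std::remove_cv<const volatile int>::type,
                           int>::value, "");

// remove_reference drops the reference but keeps the const.
static_assert(std::is_same<std::remove_reference<const int&>::type,
                           const int>::value, "");

// remove_cvref does both, in that order.
static_assert(std::is_same<remove_cvref_sketch<const int&>, int>::value, "");
```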
Patrick Palka [Thu, 29 Sep 2022 13:18:11 +0000 (09:18 -0400)]
c++: introduce TRAIT_TYPE alongside TRAIT_EXPR
We already have generic support for predicate-like traits that yield a
boolean value via TRAIT_EXPR, but we lack the same support for traits
that yield a type instead of a value. Such support would streamline
implementing efficient builtins for the standard library type traits.
To that end this patch implements a generic TRAIT_TYPE type alongside
TRAIT_EXPR, and reimplements the existing UNDERLYING_TYPE builtin trait
using this new TRAIT_TYPE.
gcc/cp/ChangeLog:
* cp-objcp-common.cc (cp_common_init_ts): Replace
UNDERLYING_TYPE with TRAIT_TYPE.
* cp-tree.def (TRAIT_TYPE): Define.
(UNDERLYING_TYPE): Remove.
* cp-tree.h (TRAIT_TYPE_KIND_RAW): Define.
(TRAIT_TYPE_KIND): Define.
(TRAIT_TYPE_TYPE1): Define.
(TRAIT_TYPE_TYPE2): Define.
(WILDCARD_TYPE_P): Return true for TRAIT_TYPE.
(finish_trait_type): Declare.
* cxx-pretty-print.cc (cxx_pretty_printer::primary_expression):
Adjust after renaming pp_cxx_trait_expression.
(cxx_pretty_printer::simple_type_specifier) <case TRAIT_TYPE>:
New.
(cxx_pretty_printer::type_id): Replace UNDERLYING_TYPE with
TRAIT_TYPE.
(pp_cxx_trait_expression): Rename to ...
(pp_cxx_trait): ... this. Handle TRAIT_TYPE as well. Correct
pretty printing of the trailing arguments.
* cxx-pretty-print.h (pp_cxx_trait_expression): Rename to ...
(pp_cxx_trait): ... this.
* error.cc (dump_type) <case UNDERLYING_TYPE>: Remove.
<case TRAIT_TYPE>: New.
(dump_type_prefix): Replace UNDERLYING_TYPE with TRAIT_TYPE.
(dump_type_suffix): Likewise.
* mangle.cc (write_type) <case UNDERLYING_TYPE>: Remove.
<case TRAIT_TYPE>: New.
* module.cc (trees_out::type_node) <case UNDERLYING_TYPE>:
Remove.
<case TRAIT_TYPE>: New.
(trees_in::tree_node): Likewise.
* parser.cc (cp_parser_primary_expression): Adjust after
renaming cp_parser_trait_expr.
(cp_parser_trait_expr): Rename to ...
(cp_parser_trait): ... this. Call finish_trait_type for traits
that yield a type.
(cp_parser_simple_type_specifier): Adjust after renaming
cp_parser_trait_expr.
* pt.cc (for_each_template_parm_r) <case UNDERLYING_TYPE>:
Remove.
<case TRAIT_TYPE>: New.
(tsubst): Likewise.
(unify): Replace UNDERLYING_TYPE with TRAIT_TYPE.
(dependent_type_p_r): Likewise.
* semantics.cc (finish_underlying_type): Don't return
UNDERLYING_TYPE anymore when processing_template_decl.
(finish_trait_type): Define.
* tree.cc (strip_typedefs) <case UNDERLYING_TYPE>: Remove.
<case TRAIT_TYPE>: New.
(cp_walk_subtrees): Likewise.
* typeck.cc (structural_comptypes): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/alias-decl-59.C: Adjust expected error message.
* g++.dg/ext/underlying_type7.C: Likewise.
* g++.dg/ext/underlying_type13.C: New test.
* g++.dg/ext/underlying_type14.C: New test.
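From the user's side, the distinction between the two kinds of trait looks
roughly like this. The sketch uses the long-standing __is_enum and
__underlying_type builtins; the names colour and my_underlying are made up
for the example, and nothing here depends on how the front end represents
the trait internally.

```cpp
#include <type_traits>

enum class colour : unsigned char { red, green };

// Predicate-like trait: yields a boolean value.
static_assert(__is_enum(colour), "colour is an enumeration");

// Type-yielding trait: yields a type, usable wherever a type is expected.
using raw_colour = __underlying_type(colour);
static_assert(std::is_same<raw_colour, unsigned char>::value, "");

// Inside a template the operand is dependent, so the compiler has to
// carry the trait around unevaluated until instantiation.
template<class E>
struct my_underlying
{
  using type = __underlying_type(E);
};
```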
Jonathan Wakely [Thu, 29 Sep 2022 10:30:05 +0000 (11:30 +0100)]
libstdc++: Guard use of new built-in with __has_builtin
I forgot that non-GCC compilers don't have this built-in yet.
For Clang we could do something like the check below (as described in
P2255), but for now I'm just fixing the regression.
#if __has_builtin(__reference_binds_to_temporary)
bool _Dangle = __reference_binds_to_temporary(_Tp, _Res_t)
&& __and_<is_reference<_Tp>,
__not_<is_reference<_Res_t>>,
is_convertible<__remove_cvref_t<_Res_t>*,
__remove_cvref_t<_Tp>*>>::value
#endif
libstdc++-v3/ChangeLog:
* include/std/type_traits (__is_invocable_impl): Check
__has_builtin(__reference_converts_from_temporary) before using
built-in.
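The guard pattern itself can be sketched as follows. This is not the
libstdc++ code: SKETCH_HAVE_RCFT and known_to_dangle are hypothetical
names, and the fallback simply reports "not known to dangle" when the
built-in is unavailable.

```cpp
// Probe for the built-in without assuming __has_builtin itself exists.
#if defined(__has_builtin)
# if __has_builtin(__reference_converts_from_temporary)
#  define SKETCH_HAVE_RCFT 1
# endif
#endif
#ifndef SKETCH_HAVE_RCFT
# define SKETCH_HAVE_RCFT 0
#endif

// True only when the compiler can tell that initializing To from From
// binds a reference to a temporary.
template<class To, class From>
constexpr bool known_to_dangle()
{
#if SKETCH_HAVE_RCFT
  return __reference_converts_from_temporary(To, From);
#else
  return false;   // conservative fallback: nothing is diagnosed
#endif
}

static_assert(!known_to_dangle<int&, int&>(),
              "binding to an lvalue creates no temporary");
```

The nested #if keeps the __has_builtin(...) call away from preprocessors
that do not define __has_builtin at all.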
Nathan Sidwell [Wed, 28 Sep 2022 16:21:14 +0000 (09:21 -0700)]
c++: import/export NTTP objects
This adds smarts to the module machinery to handle NTTP object
VAR_DECLs. Like typeinfo objects, these must be ignored in the symbol
table, streamed specially and recreated on stream in.
gcc/cp/
PR c++/100616
* module.cc (enum tree_tag): Add tt_nttp_var.
(trees_out::decl_node): Handle NTTP objects.
(trees_in::tree_node): Handle tt_nttp_var.
(depset::hash::add_binding_entry): Skip NTTP objects.
gcc/testsuite/
PR c++/100616
* g++.dg/modules/100616_a.H: New.
* g++.dg/modules/100616_b.C: New.
* g++.dg/modules/100616_c.C: New.
* g++.dg/modules/100616_d.C: New.
Jose E. Marchesi [Thu, 4 Aug 2022 19:16:10 +0000 (21:16 +0200)]
place `const volatile' objects in read-only sections
It is common for C BPF programs to use variables that are implicitly
set by the BPF loader and run-time. It is also necessary for these
variables to be stored in read-only storage so the BPF verifier
recognizes them as such. This leads to declarations using both
`const' and `volatile' qualifiers, like this:
const volatile unsigned char is_allow_list = 0;
Here `volatile' prevents the compiler from optimizing out the
variable or turning it into a constant, and `const' makes sure it is
placed in .rodata.
Now, it happens that:
- GCC places `const volatile' objects in the .data section, under the
assumption that `volatile' somehow voids the `const'.
- LLVM places `const volatile' objects in .rodata, under the
assumption that `volatile' is orthogonal to `const'.
So there is a divergence that has practical consequences: BPF
programs compiled with GCC do not work properly.
When looking into this, I found this bugzilla:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25521
"change semantics of const volatile variables"
which was filed back in 2005, long ago. This report was already
asking to put `const volatile' objects in .rodata, questioning the
current behavior.
While discussing this in the #gcc IRC channel I was pointed to the
following excerpt from the C18 spec:
6.7.3 Type qualifiers / 5 The properties associated with qualified
types are meaningful only for expressions that are
lvalues [note 135]
135) The implementation may place a const object that is not
volatile in a read-only region of storage. Moreover, the
implementation need not allocate storage for such an object if
its address is never used.
This footnote may be interpreted as if const objects that are volatile
shouldn't be put in read-only storage. Even if I personally was not
very convinced of that interpretation (see my earlier comment in BZ
25521) I filed the following issue in the LLVM tracker in order to
discuss the matter:
https://github.com/llvm/llvm-project/issues/56468
As you can see, Aaron Ballman, one of the LLVM hackers, asked the WG14
reflectors about this. He reported that the reflectors don't think
footnote 135 has any normative value.
So, not having a normative mandate in either direction, there are two
options:
a) To change GCC to place `const volatile' objects in .rodata instead
of .data.
b) To change LLVM to place `const volatile' objects in .data instead
of .rodata.
Considering that:
- One target (bpf-unknown-none) breaks with the current GCC behavior.
- No target/platform relies on the GCC behavior, that we know of.
- Changing the LLVM behavior at this point would be very severely
traumatic for the BPF people and their users.
I think the right thing to do at this point is a).
Therefore this patch.
Regtested in x86_64-linux-gnu and bpf-unknown-none.
No regressions observed.
gcc/ChangeLog:
PR middle-end/25521
* varasm.cc (categorize_decl_for_section): Place `const volatile'
objects in read-only sections.
(default_select_section): Likewise.
gcc/testsuite/ChangeLog:
PR middle-end/25521
* lib/target-supports.exp (check_effective_target_elf): Define.
* gcc.dg/pr25521.c: New test.
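A small sketch of the usage pattern described above, written in C++ here
although the declaration itself is the same in C. The reader function
allow_list_enabled is a hypothetical name; which section the object ends
up in depends on the compiler and is not checked by this code.

```cpp
#include <type_traits>

// Value patched in from outside before the program runs; the program
// itself must never write to it.
const volatile unsigned char is_allow_list = 0;

// Every call performs a real load: volatile forbids folding the read
// to the initializer value.
inline bool allow_list_enabled()
{
  return is_allow_list != 0;
}

// Both qualifiers are part of the object's type.
static_assert(std::is_const<decltype(is_allow_list)>::value, "");
static_assert(std::is_volatile<decltype(is_allow_list)>::value, "");
```

On an ELF target, inspecting the object file with a tool such as objdump
or readelf shows whether the variable landed in .rodata or .data.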
Richard Sandiford [Thu, 29 Sep 2022 10:32:57 +0000 (11:32 +0100)]
data-ref: Fix ranges_maybe_overlap_p test
dr_may_alias_p rightly used poly_int_tree_p to guard a use of
ranges_maybe_overlap_p, but used the non-poly extractors.
This caused a few failures in the SVE ACLE asm tests.
gcc/
* tree-data-ref.cc (dr_may_alias_p): Use to_poly_widest instead
of to_widest.
Richard Sandiford [Thu, 29 Sep 2022 10:32:57 +0000 (11:32 +0100)]
aarch64: Remove redundant TARGET_* checks
After previous patches, it's possible to remove TARGET_*
options that are redundant due to (IMO) obvious dependencies.
gcc/
* config/aarch64/aarch64.h (TARGET_CRYPTO, TARGET_SHA3, TARGET_SM4)
(TARGET_DOTPROD): Don't depend on TARGET_SIMD.
(TARGET_AES, TARGET_SHA2): Likewise. Remove TARGET_CRYPTO test.
(TARGET_FP_F16INST): Don't depend on TARGET_FLOAT.
(TARGET_SVE2, TARGET_SVE_F32MM, TARGET_SVE_F64MM): Don't depend
on TARGET_SVE.
(TARGET_SVE2_AES, TARGET_SVE2_BITPERM, TARGET_SVE2_SHA3)
(TARGET_SVE2_SM4): Don't depend on TARGET_SVE2.
(TARGET_F32MM, TARGET_F64MM): Delete.
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Guard
float macros with just TARGET_FLOAT rather than TARGET_FLOAT
|| TARGET_SIMD.
* config/aarch64/aarch64-simd.md (copysign<mode>3): Depend
only on TARGET_SIMD, rather than TARGET_FLOAT && TARGET_SIMD.
(aarch64_crypto_aes<aes_op>v16qi): Depend only on TARGET_AES,
rather than TARGET_SIMD && TARGET_AES.
(aarch64_crypto_aes<aesmc_op>v16qi): Likewise.
(*aarch64_crypto_aese_fused): Likewise.
(*aarch64_crypto_aesd_fused): Likewise.
(aarch64_crypto_pmulldi): Likewise.
(aarch64_crypto_pmullv2di): Likewise.
(aarch64_crypto_sha1hsi): Likewise TARGET_SHA2.
(aarch64_crypto_sha1hv4si): Likewise.
(aarch64_be_crypto_sha1hv4si): Likewise.
(aarch64_crypto_sha1su1v4si): Likewise.
(aarch64_crypto_sha1<sha1_op>v4si): Likewise.
(aarch64_crypto_sha1su0v4si): Likewise.
(aarch64_crypto_sha256h<sha256_op>v4si): Likewise.
(aarch64_crypto_sha256su0v4si): Likewise.
(aarch64_crypto_sha256su1v4si): Likewise.
(aarch64_crypto_sha512h<sha512_op>qv2di): Likewise TARGET_SHA3.
(aarch64_crypto_sha512su0qv2di): Likewise.
(aarch64_crypto_sha512su1qv2di, eor3q<mode>4): Likewise.
(aarch64_rax1qv2di, aarch64_xarqv2di, bcaxq<mode>4): Likewise.
(aarch64_sm3ss1qv4si): Likewise TARGET_SM4.
(aarch64_sm3tt<sm3tt_op>qv4si): Likewise.
(aarch64_sm3partw<sm3part_op>qv4si): Likewise.
(aarch64_sm4eqv4si, aarch64_sm4ekeyqv4si): Likewise.
* config/aarch64/aarch64.md (<FLOATUORS:optab>dihf2)
(copysign<GPF:mode>3, copysign<GPF:mode>3_insn)
(xorsign<mode>3): Remove redundant TARGET_FLOAT condition.
Richard Sandiford [Thu, 29 Sep 2022 10:32:57 +0000 (11:32 +0100)]
aarch64: Tweak handling of -mgeneral-regs-only
-mgeneral-regs-only is effectively "+nofp for the compiler without
changing the assembler's ISA flags". Currently that's implemented
by making TARGET_FLOAT, TARGET_SIMD and TARGET_SVE depend on
!TARGET_GENERAL_REGS_ONLY and then making any feature that needs FP
registers depend (directly or indirectly) on one of those three TARGET
macros. The problem is that it's easy to forget to do the last bit.
This patch instead represents the distinction between "assembler
ISA flags" and "compiler ISA flags" more directly, funnelling
all updates through a new function that sets both sets of flags
together.
gcc/
* config/aarch64/aarch64.opt (aarch64_asm_isa_flags): New variable.
* config/aarch64/aarch64.h (aarch64_asm_isa_flags)
(aarch64_isa_flags): Redefine as read-only macros.
(TARGET_SIMD, TARGET_FLOAT, TARGET_SVE): Don't depend on
!TARGET_GENERAL_REGS_ONLY.
* common/config/aarch64/aarch64-common.cc
(aarch64_set_asm_isa_flags): New function.
(aarch64_handle_option): Call it when updating -mgeneral-regs.
* config/aarch64/aarch64-protos.h (aarch64_simd_switcher): Replace
m_old_isa_flags with m_old_asm_isa_flags.
(aarch64_set_asm_isa_flags): Declare.
* config/aarch64/aarch64-builtins.cc
(aarch64_simd_switcher::aarch64_simd_switcher)
(aarch64_simd_switcher::~aarch64_simd_switcher): Save and restore
aarch64_asm_isa_flags instead of aarch64_isa_flags.
* config/aarch64/aarch64-sve-builtins.cc
(check_required_extensions): Use aarch64_asm_isa_flags instead
of aarch64_isa_flags.
* config/aarch64/aarch64.cc (aarch64_set_asm_isa_flags): New function.
(aarch64_override_options, aarch64_handle_attr_arch)
(aarch64_handle_attr_cpu, aarch64_handle_attr_isa_flags): Use
aarch64_set_asm_isa_flags to set the ISA flags.
(aarch64_option_print, aarch64_declare_function_name)
(aarch64_start_file): Use aarch64_asm_isa_flags instead
of aarch64_isa_flags.
(aarch64_can_inline_p): Check aarch64_asm_isa_flags as well as
aarch64_isa_flags.
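The idea of funnelling updates through one setter can be sketched like
this. It is a toy model under stated assumptions: the flag names, the
FP_DEPENDENT mask and the isa_state struct are invented for the example,
and the real aarch64_set_asm_isa_flags may differ in signature and detail.

```cpp
#include <cstdint>

using feature_flags = std::uint64_t;

// Illustrative feature bits (not the real AARCH64_FL_* values).
constexpr feature_flags FL_FP = 1u << 0;
constexpr feature_flags FL_SIMD = 1u << 1;
constexpr feature_flags FL_SVE = 1u << 2;
constexpr feature_flags FL_CRC = 1u << 3;

// Everything that needs FP registers, in this toy model.
constexpr feature_flags FP_DEPENDENT = FL_FP | FL_SIMD | FL_SVE;

struct isa_state
{
  feature_flags asm_isa_flags;  // what the assembler is told
  feature_flags isa_flags;      // what the compiler may use
};

// Single entry point: both sets are always derived together.
constexpr isa_state set_asm_isa_flags(feature_flags flags,
                                      bool general_regs_only)
{
  return isa_state{ flags,
                    general_regs_only ? (flags & ~FP_DEPENDENT) : flags };
}

static_assert(set_asm_isa_flags(FL_FP | FL_SIMD | FL_CRC, true).isa_flags
              == FL_CRC, "compiler loses FP-dependent features");
static_assert(set_asm_isa_flags(FL_FP | FL_SIMD | FL_CRC, true).asm_isa_flags
              == (FL_FP | FL_SIMD | FL_CRC), "assembler flags are unchanged");
```

Because the compiler-side set is computed in one place, an individual
feature test no longer has to remember to check for general-regs-only.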
Richard Sandiford [Thu, 29 Sep 2022 10:32:56 +0000 (11:32 +0100)]
aarch64: Tweak contents of flags_on/off fields
After previous changes, it's more convenient if the flags_on and
flags_off fields of all_extensions include the feature flag itself.
gcc/
* common/config/aarch64/aarch64-common.cc (all_extensions):
Include the feature flag in flags_on and flags_off.
(aarch64_parse_extension): Update accordingly.
(aarch64_get_extension_string_for_isa_flags): Likewise.
Richard Sandiford [Thu, 29 Sep 2022 10:32:56 +0000 (11:32 +0100)]
aarch64: Make more use of aarch64_feature_flags
A previous patch added an aarch64_feature_flags typedef, to abstract
the representation of the feature flags. This patch makes existing
code use the typedef too. Hope I've caught them all!
gcc/
* common/config/aarch64/aarch64-common.cc: Use aarch64_feature_flags
for feature flags throughout.
* config/aarch64/aarch64-protos.h: Likewise.
* config/aarch64/aarch64-sve-builtins.h: Likewise.
* config/aarch64/aarch64-sve-builtins.cc: Likewise.
* config/aarch64/aarch64.cc: Likewise.
* config/aarch64/aarch64.opt: Likewise.
* config/aarch64/driver-aarch64.cc: Likewise.
Richard Sandiford [Thu, 29 Sep 2022 10:32:55 +0000 (11:32 +0100)]
aarch64: Tweak constness of option-related data
Some of the option structures have all-const member variables.
That doesn't seem necessary: we can just use const on the objects
that are supposed to be read-only.
Also, with the new, more C++-heavy option handling, it seems
better to use constexpr for the static data, to make sure that
we're not adding unexpected overhead.
gcc/
* common/config/aarch64/aarch64-common.cc (aarch64_option_extension)
(processor_name_to_arch, arch_to_arch_name): Remove const from
member variables.
(all_extensions, all_cores, all_architectures): Make a constexpr.
* config/aarch64/aarch64.cc (processor): Remove const from
member variables.
(all_architectures): Make a constexpr.
* config/aarch64/driver-aarch64.cc (aarch64_core_data)
(aarch64_arch_driver_info): Remove const from member variables.
(aarch64_cpu_data, aarch64_arches): Make a constexpr.
(get_arch_from_id): Return a pointer to const.
(host_detect_local_cpu): Update accordingly.
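In outline, the style being moved to looks something like the sketch
below. The struct, table contents and lookup function are hypothetical
and only illustrate the pattern: plain members, constness on the object,
constexpr on the static table.

```cpp
// Members are plain; nothing is individually const.
struct arch_entry
{
  const char *name;   // string literal, no run-time construction
  unsigned flags;
};

// Constness comes from the object.  constexpr additionally guarantees
// that the table is initialized at compile time.
constexpr arch_entry arches[] = {
  { "arch-a", 0x1 },
  { "arch-b", 0x3 },
};

// Lookups hand out pointers to const.
constexpr const arch_entry *arch_at(unsigned i)
{
  return i < sizeof(arches) / sizeof(arches[0]) ? &arches[i] : nullptr;
}

static_assert(arch_at(1)->flags == 0x3, "table is usable at compile time");
static_assert(arch_at(2) == nullptr, "out-of-range index yields null");
```

A non-constexpr table with a non-trivial member type would instead need a
dynamic initializer that runs at program start-up.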
Richard Sandiford [Thu, 29 Sep 2022 10:32:55 +0000 (11:32 +0100)]
aarch64: Avoid std::string in static data
Just a minor patch to avoid having to construct std::strings
in static data.
gcc/
* common/config/aarch64/aarch64-common.cc (processor_name_to_arch)
(arch_to_arch_name): Use const char * instead of std::string.
Richard Sandiford [Thu, 29 Sep 2022 10:32:55 +0000 (11:32 +0100)]
aarch64: Simplify generation of .arch strings
aarch64-common.cc has two arrays, one maintaining the original
definition order and one sorted by population count. Sorting
by population count was a way of ensuring topological ordering,
taking advantage of the fact that the entries are partially
ordered by the subset relation. However, the sorting is not
needed now that the .def file is forced to have topological
order from the outset.
Other changes are:
(1) The population count used:
uint64_t total_flags_a = opt_a->flag_canonical & opt_a->flags_on;
uint64_t total_flags_b = opt_b->flag_canonical & opt_b->flags_on;
int popcnt_a = popcount_hwi ((HOST_WIDE_INT)total_flags_a);
int popcnt_b = popcount_hwi ((HOST_WIDE_INT)total_flags_b);
where I think the & was supposed to be |. This meant that the
counts would always be 1 in practice, since flag_canonical is
a single bit. This led us to printing +nofp+nosimd even though
GCC "knows" (and GAS agrees) that +nofp disables simd.
(2) The .arch output code converts +aes+sha2 to +crypto. I think
the main reason for doing this is to support assemblers that
predate the individual per-feature crypto flags. It therefore
seems more natural to treat it as a special case, rather than
as an instance of a general pattern. Hopefully we won't do
something similar in future!
(There is already special handling of CRC, for different reasons.)
(3) Previously, if the /proc/cpuinfo code saw a feature like sve,
it would assume the presence of all the features that sve
depends on. It would be possible to keep that behaviour
if necessary, but it was simpler to assume the presence of
fp16 (say) only when fphp is present. There's an argument
that that's more conservatively correct too.
gcc/
* common/config/aarch64/aarch64-common.cc
(TARGET_OPTION_INIT_STRUCT): Delete.
(aarch64_option_extension): Remove is_synthetic_flag.
(all_extensions): Update accordingly.
(all_extensions_by_on, opt_ext, opt_ext_cmp): Delete.
(aarch64_option_init_struct, aarch64_contains_opt): Delete.
(aarch64_get_extension_string_for_isa_flags): Rewrite to use
all_extensions instead of all_extensions_by_on.
gcc/testsuite/
* gcc.target/aarch64/cpunative/info_8: Add all dependencies of sve.
* gcc.target/aarch64/cpunative/info_9: Likewise svesm4.
* gcc.target/aarch64/cpunative/info_15: Likewise.
* gcc.target/aarch64/cpunative/info_16: Likewise sve2.
* gcc.target/aarch64/cpunative/info_17: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_2.c: Expect just +nofp
rather than +nofp+nosimd.
* gcc.target/aarch64/cpunative/native_cpu_10.c: Likewise.
* gcc.target/aarch64/target_attr_15.c: Likewise.
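The effect of the & in point (1) is easy to check in isolation. The
sketch below uses made-up values (one canonical bit, and a flags_on set
that contains it plus two other features) and a hand-written popcount64
rather than popcount_hwi.

```cpp
#include <cstdint>

// Portable population count (clears one set bit per iteration).
constexpr int popcount64(std::uint64_t x)
{
  int n = 0;
  while (x != 0)
    {
      x &= x - 1;   // clear the lowest set bit
      ++n;
    }
  return n;
}

// Illustrative values only.
constexpr std::uint64_t flag_canonical = std::uint64_t(1) << 3;
constexpr std::uint64_t flags_on = flag_canonical | 0x3;

static_assert(popcount64(flag_canonical & flags_on) == 1,
              "& can never count more than the single canonical bit");
static_assert(popcount64(flag_canonical | flags_on) == 3,
              "| counts the whole feature set");
```

With every entry scoring the same, a sort keyed on that count gives no
useful ordering, which matches the behaviour described in the message.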
Richard Sandiford [Thu, 29 Sep 2022 10:32:54 +0000 (11:32 +0100)]
aarch64: Simplify feature definitions
Currently the aarch64-option-extensions.def entries, the
aarch64-cores.def entries, and the AARCH64_FL_FOR_* macros
have a transitive closure of dependencies that is maintained by hand.
This is a bit error-prone and is becoming less tenable as more features
are added. The main point of this patch is to maintain the closure
automatically instead.
For example, the +sve2-aes extension requires sve2 and aes.
This is now described using:
AARCH64_OPT_EXTENSION("sve2-aes", SVE2_AES, (SVE2, AES), ...)
If life was simple, we could just give the name of the feature
and the list of features that it requires/depends on. But sadly
things are more complicated. For example:
- the legacy +crypto option enables aes and sha2 only, but +nocrypto
disables all crypto-related extensions, including sm4.
- +fp16fml enables fp16, but armv8.4-a enables fp16fml without fp16.
fp16fml only has an effect when fp16 is also present; see the
comments for more details.
- +bf16 enables simd, but +bf16+nosimd is valid and enables just the
scalar bf16 instructions. rdma behaves similarly.
To handle cases like these, the option entries have extra fields to
specify what an explicit +foo enables and what an explicit +nofoo
disables, in addition to the absolute dependencies.
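As a rough illustration of maintaining the closure automatically, the
following sketch computes it from direct requirements only. All names
here (feature, direct_deps, closure) are hypothetical stand-ins rather
than the real GCC definitions. Apart from sve2-aes requiring sve2 and
aes, the dependency edges are assumptions chosen for the example.

```cpp
#include <cstdint>

// Hypothetical stand-in for the feature-flag typedef; not the GCC one.
using feature_flags = std::uint64_t;

enum feature : unsigned { FP, SIMD, AES, SVE, SVE2, SVE2_AES, NUM_FEATURES };

constexpr feature_flags bit(feature f) { return feature_flags{1} << f; }

// Direct requirements only; nothing here is closed by hand.
// (Edges other than SVE2_AES -> SVE2, AES are illustrative assumptions.)
constexpr feature_flags direct_deps(feature f)
{
  switch (f)
    {
    case SIMD: return bit(FP);
    case AES: return bit(SIMD) | bit(FP);
    case SVE: return bit(SIMD);
    case SVE2: return bit(SVE);
    case SVE2_AES: return bit(SVE2) | bit(AES);
    default: return 0;
    }
}

// Add requirements repeatedly until a fixed point is reached.
constexpr feature_flags closure(feature_flags flags)
{
  for (;;)
    {
      feature_flags next = flags;
      for (unsigned i = 0; i < NUM_FEATURES; ++i)
        if (flags & bit(static_cast<feature>(i)))
          next |= direct_deps(static_cast<feature>(i));
      if (next == flags)
        return flags;
      flags = next;
    }
}
```

Because everything is constexpr, the closed sets can be checked at
compile time, which is the property that makes hand-maintained lists
unnecessary in a scheme like this.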
The other main changes are:
- AARCH64_FL_* are now defined automatically.
- the feature list for each architecture level moves from aarch64.h
to aarch64-arches.def.
As a consequence, we now have a (redundant) V8A feature flag.
While there, the patch uses a new typedef, aarch64_feature_flags,
for the set of feature flags. This should make it easier to switch
to a class if we run out of bits in the uint64_t.
For now the patch hardcodes the fact that crypto is the only
synthetic option. A later patch will remove this field.
To test for things that might not be covered by the testsuite,
I made the driver print out the all_extensions, all_cores and
all_archs arrays before and after the patch, with the following
tweaks:
- renumber the old AARCH64_FL_* bit assignments to match the .def order
- remove the new V8A flag when printing the new tables
- treat CRYPTO and CRYPTO | AES | SHA2 the same way when printing the
core tables
(On the last point: some cores enabled just CRYPTO while others enabled
CRYPTO, AES and SHA2. This doesn't cause a difference in behaviour
because of how the dependent macros are defined. With the new scheme,
all entries with CRYPTO automatically get AES and SHA2 too.)
The only difference is that +nofp now turns off dotprod. This was
another instance of an incomplete transitive closure, but unlike the
instances fixed in a previous patch, it had no observable effect.
gcc/
* config/aarch64/aarch64-option-extensions.def: Switch to a new format.
* config/aarch64/aarch64-cores.def: Use the same format to specify
lists of features.
* config/aarch64/aarch64-arches.def: Likewise, moving that information
from aarch64.h.
* config/aarch64/aarch64-opts.h (aarch64_feature_flags): New typedef.
* config/aarch64/aarch64.h (aarch64_feature): New class enum.
Turn AARCH64_FL_* macros into constexprs, getting the definitions
from aarch64-option-extensions.def. Remove AARCH64_FL_FOR_* macros.
* common/config/aarch64/aarch64-common.cc: Include
aarch64-feature-deps.h.
(all_extensions): Update for new .def format.
(all_extensions_by_on, all_cores, all_architectures): Likewise.
* config/aarch64/driver-aarch64.cc: Include aarch64-feature-deps.h.
(aarch64_extensions): Update for new .def format.
(aarch64_cpu_data, aarch64_arches): Likewise.
* config/aarch64/aarch64.cc: Include aarch64-feature-deps.h.
(all_architectures, all_cores): Update for new .def format.
* config/aarch64/aarch64-sve-builtins.cc
(check_required_extensions): Likewise.
Richard Sandiford [Thu, 29 Sep 2022 10:32:54 +0000 (11:32 +0100)]
aarch64: Reorder an entry in aarch64-option-extensions.def
aarch64-option-extensions.def was topologically sorted except
for one case: crypto came before its aes and sha2 dependencies.
This patch moves crypto after sha2 instead.
gcc/
* config/aarch64/aarch64-option-extensions.def: Move crypto
after sha2.
gcc/testsuite/
* gcc.target/aarch64/cpunative/native_cpu_0.c: Expect +crypto
to come after +crc.
* gcc.target/aarch64/cpunative/native_cpu_13.c: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_16.c: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_17.c: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_6.c: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_7.c: Likewise.
* gcc.target/aarch64/options_set_2.c: Likewise.
* gcc.target/aarch64/options_set_3.c: Likewise.
* gcc.target/aarch64/options_set_4.c: Likewise.
Richard Sandiford [Thu, 29 Sep 2022 10:32:53 +0000 (11:32 +0100)]
aarch64: Fix transitive closure of features
aarch64-option-extensions.def requires us to maintain the transitive
closure of options by hand. This patch fixes a few cases where a
flag was missed.
+noaes and +nosha2 now disable +crypto, which IMO makes more
sense and is consistent with the Clang behaviour.
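A minimal sketch of the resulting on/off behaviour, assuming a toy
model in which each option has an "explicit on" set and an "explicit
off" set. The names (explicit_on, explicit_off, apply) are hypothetical
and sm4 is not modelled; only the relationships stated in this log are
encoded.

```cpp
#include <cstdint>

using feature_flags = std::uint64_t;  // hypothetical stand-in

enum feature : unsigned { AES, SHA2, SHA3, SVE2_SHA3, CRYPTO, NUM_FEATURES };

constexpr feature_flags bit(feature f) { return feature_flags{1} << f; }

// What an explicit +<feature> turns on.
constexpr feature_flags explicit_on(feature f)
{
  switch (f)
    {
    // The legacy +crypto option enables aes and sha2 only.
    case CRYPTO: return bit(CRYPTO) | bit(AES) | bit(SHA2);
    default: return bit(f);
    }
}

// What an explicit +no<feature> turns off.
constexpr feature_flags explicit_off(feature f)
{
  switch (f)
    {
    case AES: return bit(AES) | bit(CRYPTO);
    case SHA2: return bit(SHA2) | bit(CRYPTO) | bit(SHA3) | bit(SVE2_SHA3);
    case SHA3: return bit(SHA3) | bit(SVE2_SHA3);
    case CRYPTO:
      return bit(CRYPTO) | bit(AES) | bit(SHA2) | bit(SHA3) | bit(SVE2_SHA3);
    default: return bit(f);
    }
}

// Apply one +<feature> or +no<feature> modifier to a flag set.
constexpr feature_flags apply(feature_flags flags, feature f, bool enable)
{
  return enable ? (flags | explicit_on(f)) : (flags & ~explicit_off(f));
}
```

In this model +crypto+nosha2 leaves only aes enabled, which matches
what options_set_6.c is updated to expect.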
gcc/
* config/aarch64/aarch64-option-extensions.def (dotprod): Depend
on fp as well as simd.
(sha3): Likewise.
(aes): Likewise. Make +noaes disable crypto.
(sha2): Likewise +nosha2. Also make +nosha2 disable sha3 and
sve2-sha3.
(sve2-sha3): Depend on sha2 as well as sha3.
gcc/testsuite/
* gcc.target/aarch64/options_set_6.c: Expect +crypto+nosha2 to
disable crypto but keep aes.
* gcc.target/aarch64/pragma_cpp_predefs_4.c: New test.
Richard Sandiford [Thu, 29 Sep 2022 10:32:53 +0000 (11:32 +0100)]
aarch64: Remove AARCH64_FL_RCPC8_4 [PR107025]
AARCH64_FL_RCPC8_4 is an odd-one-out in that it has no associated
entry in aarch64-option-extensions.def. This means that, although
it is internally separated from AARCH64_FL_V8_4A, there is no
mechanism for turning it on and off individually, independently
of armv8.4-a.
The only place that the flag was used independently was in the
entry for thunderx3t110, which enabled it alongside V8_3A.
As noted in PR107025, this means that any use of the extension
will fail to assemble.
In the PR trail, Andrew suggested removing the core entry.
That might be best long-term, but since the barrier for removing
command-line options without a deprecation period is very high,
this patch instead just drops the flag from the core entry.
We'll still produce correct code.
gcc/
PR target/107025
* config/aarch64/aarch64.h (AARCH64_FL_RCPC8_4): Delete.
(AARCH64_FL_FOR_V8_4A): Update accordingly.
(AARCH64_ISA_RCPC8_4): Use AARCH64_FL_V8_4A directly.
* config/aarch64/aarch64-cores.def (thunderx3t110): Remove
AARCH64_FL_RCPC8_4.
Richard Sandiford [Thu, 29 Sep 2022 10:32:53 +0000 (11:32 +0100)]
aarch64: Avoid redundancy in aarch64-cores.def
The flags fields of the aarch64-cores.def always start with
AARCH64_FL_FOR_<ARCH>. After previous changes, <ARCH> is always
identical to the previous field, so we can drop the explicit
AARCH64_FL_FOR_<ARCH> and derive it programmatically.
This isn't a big saving in itself, but it helps with later patches.
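The idea of deriving the architecture flags programmatically can be
sketched with token pasting. The CORE macro, the core names and the
flag values below are hypothetical and much simpler than the real
AARCH64_CORE entries; they only show the mechanism.

```cpp
#include <cstdint>

// Hypothetical flag values, for illustration only.
constexpr std::uint64_t FL_FOR_V8A = 0x1;
constexpr std::uint64_t FL_FOR_V8_2A = FL_FOR_V8A | 0x2;
constexpr std::uint64_t FL_CRC = 0x100;

struct core_entry
{
  const char *name;
  std::uint64_t flags;
};

// The architecture field is pasted into the FL_FOR_<ARCH> name, so each
// entry lists only the flags it adds on top of its architecture.
#define CORE(NAME, ARCH, EXTRA_FLAGS) { NAME, FL_FOR_##ARCH | (EXTRA_FLAGS) },

constexpr core_entry all_cores_sketch[] = {
  CORE ("core-a", V8A, FL_CRC)
  CORE ("core-b", V8_2A, 0)
};

#undef CORE
```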
gcc/
* config/aarch64/aarch64-cores.def: Remove AARCH64_FL_FOR_<ARCH>
from the flags field.
* common/config/aarch64/aarch64-common.cc (all_cores): Add it
here instead.
* config/aarch64/aarch64.cc (all_cores): Likewise.
* config/aarch64/driver-aarch64.cc (all_cores): Likewise.
Richard Sandiford [Thu, 29 Sep 2022 10:32:52 +0000 (11:32 +0100)]
aarch64: Small config.gcc cleanups
The aarch64-option-extensions.def parsing in config.gcc had
some code left over from when it tried to parse the whole
macro definition. Also, config.gcc now only looks at the
first fields of the aarch64-arches.def entries.
gcc/
* config.gcc: Remove dead aarch64-option-extensions.def code.
* config/aarch64/aarch64-arches.def: Update comment.
Richard Sandiford [Thu, 29 Sep 2022 10:32:52 +0000 (11:32 +0100)]
aarch64: Add "V" to aarch64-arches.def names
This patch completes the renaming of architecture-level related
things by adding "V" to the name of the architecture in
aarch64-arches.def. Since the "V" is predictable, we can easily
drop it when we don't need it (as when matching /proc/cpuinfo).
Having a valid C identifier is necessary for later patches.
gcc/
* config/aarch64/aarch64-arches.def: Add a leading "V" to the
ARCH_IDENT fields.
* config/aarch64/aarch64-cores.def: Update accordingly.
* common/config/aarch64/aarch64-common.cc (all_cores): Likewise.
* config/aarch64/aarch64.cc (all_cores): Likewise.
* config/aarch64/driver-aarch64.cc (aarch64_arches): Skip the
leading "V".
Richard Sandiford [Thu, 29 Sep 2022 10:32:51 +0000 (11:32 +0100)]
aarch64: Rename AARCH64_FL_FOR_ARCH macros
This patch renames AARCH64_FL_FOR_ARCH* macros to follow the
same V<number><profile> names that we (now) use elsewhere.
The names are only temporary -- a later patch will move the
information to the .def file instead. However, it helps with
the sequencing to do this first.
gcc/
* config/aarch64/aarch64.h (AARCH64_FL_FOR_ARCH8): Rename to...
(AARCH64_FL_FOR_V8A): ...this.
(AARCH64_FL_FOR_ARCH8_1): Rename to...
(AARCH64_FL_FOR_V8_1A): ...this.
(AARCH64_FL_FOR_ARCH8_2): Rename to...
(AARCH64_FL_FOR_V8_2A): ...this.
(AARCH64_FL_FOR_ARCH8_3): Rename to...
(AARCH64_FL_FOR_V8_3A): ...this.
(AARCH64_FL_FOR_ARCH8_4): Rename to...
(AARCH64_FL_FOR_V8_4A): ...this.
(AARCH64_FL_FOR_ARCH8_5): Rename to...
(AARCH64_FL_FOR_V8_5A): ...this.
(AARCH64_FL_FOR_ARCH8_6): Rename to...
(AARCH64_FL_FOR_V8_6A): ...this.
(AARCH64_FL_FOR_ARCH8_7): Rename to...
(AARCH64_FL_FOR_V8_7A): ...this.
(AARCH64_FL_FOR_ARCH8_8): Rename to...
(AARCH64_FL_FOR_V8_8A): ...this.
(AARCH64_FL_FOR_ARCH8_R): Rename to...
(AARCH64_FL_FOR_V8R): ...this.
(AARCH64_FL_FOR_ARCH9): Rename to...
(AARCH64_FL_FOR_V9A): ...this.
(AARCH64_FL_FOR_ARCH9_1): Rename to...
(AARCH64_FL_FOR_V9_1A): ...this.
(AARCH64_FL_FOR_ARCH9_2): Rename to...
(AARCH64_FL_FOR_V9_2A): ...this.
(AARCH64_FL_FOR_ARCH9_3): Rename to...
(AARCH64_FL_FOR_V9_3A): ...this.
* common/config/aarch64/aarch64-common.cc (all_cores): Update
accordingly.
* config/aarch64/aarch64-arches.def: Likewise.
* config/aarch64/aarch64-cores.def: Likewise.
* config/aarch64/aarch64.cc (all_cores): Likewise.
Richard Sandiford [Thu, 29 Sep 2022 10:32:51 +0000 (11:32 +0100)]
aarch64: Rename AARCH64_FL architecture-level macros
Following on from the previous AARCH64_ISA patch, this one adds the
profile name directly to the end of architecture-level AARCH64_FL_*
macros.
gcc/
* config/aarch64/aarch64.h (AARCH64_FL_V8_1, AARCH64_FL_V8_2)
(AARCH64_FL_V8_3, AARCH64_FL_V8_4, AARCH64_FL_V8_5, AARCH64_FL_V8_6)
(AARCH64_FL_V9, AARCH64_FL_V8_7, AARCH64_FL_V8_8, AARCH64_FL_V9_1)
(AARCH64_FL_V9_2, AARCH64_FL_V9_3): Add "A" to the end of the name.
(AARCH64_FL_V8_R): Rename to AARCH64_FL_V8R.
(AARCH64_FL_FOR_ARCH8_1, AARCH64_FL_FOR_ARCH8_2): Update accordingly.
(AARCH64_FL_FOR_ARCH8_3, AARCH64_FL_FOR_ARCH8_4): Likewise.
(AARCH64_FL_FOR_ARCH8_5, AARCH64_FL_FOR_ARCH8_6): Likewise.
(AARCH64_FL_FOR_ARCH8_7, AARCH64_FL_FOR_ARCH8_8): Likewise.
(AARCH64_FL_FOR_ARCH8_R, AARCH64_FL_FOR_ARCH9): Likewise.
(AARCH64_FL_FOR_ARCH9_1, AARCH64_FL_FOR_ARCH9_2): Likewise.
(AARCH64_FL_FOR_ARCH9_3, AARCH64_ISA_V8_2A, AARCH64_ISA_V8_3A)
(AARCH64_ISA_V8_4A, AARCH64_ISA_V8_5A, AARCH64_ISA_V8_6A): Likewise.
(AARCH64_ISA_V8R, AARCH64_ISA_V9A, AARCH64_ISA_V9_1A): Likewise.
(AARCH64_ISA_V9_2A, AARCH64_ISA_V9_3A): Likewise.
Richard Sandiford [Thu, 29 Sep 2022 10:32:50 +0000 (11:32 +0100)]
aarch64: Rename AARCH64_ISA architecture-level macros
All AARCH64_ISA_* architecture-level macros except AARCH64_ISA_V8_R
are for the A profile: they cause __ARM_ARCH_PROFILE to be set to
'A' and they are associated with architecture names like armv8.4-a.
It's convenient for later patches if we make this explicit
by adding an "A" to the name. Also, rather than add an underscore
(as for V8_R) it's more convenient to add the profile directly
to the number, like we already do in the ARCH_IDENT field of the
aarch64-arches.def entries.
gcc/
* config/aarch64/aarch64.h (AARCH64_ISA_V8_2, AARCH64_ISA_V8_3)
(AARCH64_ISA_V8_4, AARCH64_ISA_V8_5, AARCH64_ISA_V8_6)
(AARCH64_ISA_V9, AARCH64_ISA_V9_1, AARCH64_ISA_V9_2)
(AARCH64_ISA_V9_3): Add "A" to the end of the name.
(AARCH64_ISA_V8_R): Rename to AARCH64_ISA_V8R.
(TARGET_ARMV8_3, TARGET_JSCVT, TARGET_FRINT, TARGET_MEMTAG): Update
accordingly.
* common/config/aarch64/aarch64-common.cc
(aarch64_get_extension_string_for_isa_flags): Likewise.
* config/aarch64/aarch64-c.cc
(aarch64_define_unconditional_macros): Likewise.
Richard Sandiford [Thu, 29 Sep 2022 10:32:50 +0000 (11:32 +0100)]
Add OPTIONS_H_EXTRA to GTFILES
I have a patch that adds a typedef to aarch64's <cpu>-opts.h.
The typedef is used for a TargetVariable in the .opt file,
which means that it is covered by PCH and so needs to be
visible to gengtype.
<cpu>-opts.h is not included directly in tm.h, but indirectly
by target headers (in this case aarch64.h). There was therefore
nothing that caused it to be added to GTFILES.
gcc/
* Makefile.in (GTFILES): Add OPTIONS_H_EXTRA.
Jakub Jelinek [Thu, 29 Sep 2022 10:04:24 +0000 (12:04 +0200)]
driver, cppdefault: Unbreak bootstrap on Debian/Ubuntu [PR107059]
My recent change to enable _Float{16,32,64,128,32x,64x,128x} for C++
apparently broke bootstrap on some Debian/Ubuntu setups.
Those multiarch targets put some headers into
/usr/include/x86_64-linux-gnu/bits/ etc. subdirectory instead of
/usr/include/bits/.
This is handled by
/* /usr/include comes dead last. */
{ NATIVE_SYSTEM_HEADER_DIR, NATIVE_SYSTEM_HEADER_COMPONENT, 0, 0, 1, 2 },
{ NATIVE_SYSTEM_HEADER_DIR, NATIVE_SYSTEM_HEADER_COMPONENT, 0, 0, 1, 0 },
in cppdefault.cc, where the 2 in the last element of the first initializer
means the entry is ignored on non-multiarch and suffixed by the multiarch
dir otherwise, so installed gcc has search path like:
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include-fixed
/usr/local/include
/usr/include/x86_64-linux-gnu
/usr/include
(when installed with DESTDIR=/home/jakub/gcc/obj01inst).
Now, when fixincludes is run, it is processing the whole /usr/include dir
and all its subdirectories, so floatn{,-common.h} actually go into
.../include-fixed/x86_64-linux-gnu/bits/floatn{,-common.h}
because that is where they appear in /usr/include too.
In some setups, /usr/include also contains /usr/include/bits -> x86_64-linux-gnu/bits
symlink and after the r13-2896 tweak it works.
In other setups there is no /usr/include/bits symlink and when one
#include <bits/floatn.h>
given the above search path, it doesn't find the fixincluded header,
as
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include-fixed/bits/floatn.h
doesn't exist and
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include-fixed/x86_64-linux-gnu/bits/floatn.h
isn't searched and so
/usr/include/x86_64-linux-gnu/bits/floatn.h
wins and we fail because of typedef whatever _Float128; and similar.
The following patch ought to fix this. The first hunk does so by arranging that
the installed search path actually looks like:
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include-fixed/x86_64-linux-gnu
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include-fixed
/usr/local/include
/usr/include/x86_64-linux-gnu
/usr/include
and thus for include-fixed it treats it the same as /usr/include.
The second FIXED_INCLUDE_DIR entry there is:
{ FIXED_INCLUDE_DIR, "GCC", 0, 0, 0,
/* A multilib suffix needs adding if different multilibs use
different headers. */
#ifdef SYSROOT_HEADERS_SUFFIX_SPEC
1
#else
0
#endif
},
where SYSROOT_HEADERS_SUFFIX_SPEC is defined only on vxworks or mips*-mti-linux
and arranges for multilib path to be appended there. Neither of those
systems is multiarch.
This isn't enough, because when using the -B option, the driver adds
-isystem .../include-fixed in another place, so the second hunk modifies
that spot the same.
/home/jakub/gcc/obj01/gcc/xgcc -B /home/jakub/gcc/obj01/gcc/
then has search path:
/home/jakub/gcc/obj01/gcc/include
/home/jakub/gcc/obj01/gcc/include-fixed/x86_64-linux-gnu
/home/jakub/gcc/obj01/gcc/include-fixed
/usr/local/include
/usr/include/x86_64-linux-gnu
/usr/include
which again is what I think we want to achieve.
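The lookup problem described above can be modelled with a small sketch.
The paths are abbreviated ("lib/include-fixed" stands for the long
.../include-fixed directory) and the helper names are hypothetical.
This is a toy model of first-match header search, not the
preprocessor's actual implementation.

```cpp
#include <cstddef>
#include <string_view>

// True if FILE is exactly DIR + "/" + HEADER.
constexpr bool is_header_in(std::string_view file, std::string_view dir,
                            std::string_view header)
{
  return file.size() == dir.size() + 1 + header.size()
         && file.substr(0, dir.size()) == dir
         && file[dir.size()] == '/'
         && file.substr(dir.size() + 1) == header;
}

// Index of the first directory in DIRS that contains HEADER, or -1.
template <std::size_t NDirs, std::size_t NFiles>
constexpr int first_match(const std::string_view (&dirs)[NDirs],
                          const std::string_view (&files)[NFiles],
                          std::string_view header)
{
  for (std::size_t d = 0; d < NDirs; ++d)
    for (std::size_t f = 0; f < NFiles; ++f)
      if (is_header_in(files[f], dirs[d], header))
        return static_cast<int>(d);
  return -1;
}

// Files that exist on a setup without the /usr/include/bits symlink.
constexpr std::string_view existing_files[] = {
  "lib/include-fixed/x86_64-linux-gnu/bits/floatn.h",
  "/usr/include/x86_64-linux-gnu/bits/floatn.h",
};

// Search path before the patch.
constexpr std::string_view old_path[] = {
  "lib/include", "lib/include-fixed", "/usr/local/include",
  "/usr/include/x86_64-linux-gnu", "/usr/include",
};

// Search path after the patch, with the multiarch include-fixed entry.
constexpr std::string_view new_path[] = {
  "lib/include", "lib/include-fixed/x86_64-linux-gnu", "lib/include-fixed",
  "/usr/local/include", "/usr/include/x86_64-linux-gnu", "/usr/include",
};
```

With the old path the first hit for bits/floatn.h is the unfixed system
header (index 3); with the new path the fixincluded copy is found first
(index 1).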
2022-09-29 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/107059
* cppdefault.cc (cpp_include_defaults): If SYSROOT_HEADERS_SUFFIX_SPEC
isn't defined, add FIXED_INCLUDE_DIR entry with multilib flag 2
before FIXED_INCLUDE_DIR entry with multilib flag 0.
* gcc.cc (do_spec_1): If multiarch_dir, add
include-fixed/multiarch_dir paths before include-fixed paths.
Martin Liska [Thu, 22 Sep 2022 12:30:44 +0000 (14:30 +0200)]
support -gz=zstd for both linker and assembler
PR driver/106897
gcc/ChangeLog:
* common.opt: Add -gz=zstd value.
* configure.ac: Detect --compress-debug-sections=zstd
for both linker and assembler.
* configure: Regenerate.
* gcc.cc (LINK_COMPRESS_DEBUG_SPEC): Handle -gz=zstd.
(ASM_COMPRESS_DEBUG_SPEC): Likewise.
Ronan Desplanques [Mon, 26 Sep 2022 14:55:28 +0000 (16:55 +0200)]
ada: Remove duplicated doc comment section
A documentation section was duplicated by mistake in r0-110752.
This commit removes the copy that was added by r0-110752, but
integrates the small editorial change that it brought to the
original.
gcc/ada/
* einfo.ads: Remove duplicated documentation section.
Eric Botcazou [Mon, 26 Sep 2022 20:50:28 +0000 (22:50 +0200)]
ada: Further tweak new expansion of contracts
The original extended return statement is mandatory for functions whose
result type is limited in Ada 2005 and later.
gcc/ada/
* contracts.adb (Build_Subprogram_Contract_Wrapper): Put back the
extended return statement if the result type is built-in-place.
* sem_attr.adb (Analyze_Attribute_Old_Result): Also expect an
extended return statement.
Bob Duff [Tue, 20 Sep 2022 18:43:49 +0000 (14:43 -0400)]
ada: Improve efficiency of slice-of-component assignment
This patch improves the efficiency of slice assignment when the left- or
right-hand side is a slice of a component or a slice of a slice.
Previously, the optimization was disabled in these cases, just in
case there might be a volatile or independent component lurking.
Now we explicitly check all the relevant subcomponents of
the prefix.
The previous version said (in exp_ch5.adb):
-- ...We could
-- complicate this code by actually looking for such volatile and
-- independent components.
and that's exactly what we are doing here.
gcc/ada/
* exp_ch5.adb
(Expand_Assign_Array_Loop_Or_Bitfield): Make the checks for
volatile and independent objects more precise.
Piotr Trojanek [Fri, 23 Sep 2022 17:06:54 +0000 (19:06 +0200)]
ada: Fix checking of Refined_State with nested package renamings
When collecting package state declared in package body, we should only
recursively examine the visible part of nested packages while ignoring other
entities related to packages (e.g. package bodies or package renamings).
gcc/ada/
* sem_util.adb (Collect_Visible_States): Ignore package renamings.
Richard Biener [Fri, 19 Aug 2022 13:11:14 +0000 (15:11 +0200)]
tree-optimization/105646 - re-interpret always executed in uninit diag
The following fixes PR105646, not diagnosing
int f1();
int f3(){
auto const & a = f1();
bool v3{v3};
return a;
}
with optimization because the early uninit diagnostic pass only
diagnoses always executed cases. The patch does this by
re-interpreting what always executed means and choosing to
ignore exceptional and abnormal control flow for this. At the
same time it improves things as suggested in a comment: when
the value-numbering run done without optimizing determines that there's
a fallthru path, blocks on it are considered always executed.
PR tree-optimization/105646
* tree-ssa-uninit.cc (warn_uninitialized_vars): Pre-compute
the set of fallthru reachable blocks from function entry
and use that to determine wlims.always_executed.
* g++.dg/uninit-pr105646.C: New testcase.
liuhongt [Wed, 28 Sep 2022 09:00:48 +0000 (17:00 +0800)]
Check nonlinear iv in vect_can_advance_ivs_p.
vectorizable_nonlinear_induction doesn't always guard
vect_peel_nonlinear_iv_init when it's called by
vect_update_ivs_after_vectorizer.
It's supposed to be guarded by vect_can_advance_ivs_p.
gcc/ChangeLog:
PR tree-optimization/107055
* tree-vect-loop-manip.cc (vect_can_advance_ivs_p): Check for
nonlinear induction variables.
* tree-vect-loop.cc (vect_can_peel_nonlinear_iv_p): New
function.
(vectorizable_nonlinear_induction): Move part of the code into
vect_can_peel_nonlinear_iv_p.
* tree-vectorizer.h (vect_can_peel_nonlinear_iv_p): Declare.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr107055.c: New test.
GCC Administrator [Thu, 29 Sep 2022 00:16:38 +0000 (00:16 +0000)]
Daily bump.
Jonathan Wakely [Wed, 28 Sep 2022 11:39:41 +0000 (12:39 +0100)]
libstdc++: Disable volatile-qualified std::bind for C++20
LWG 2487 added a precondition to std::bind for C++17, making
volatile-qualified uses undefined. We still support it, but with a
deprecated warning.
P1065R2 made it explicitly ill-formed for C++20, so we should no longer
accept it as deprecated. This implements that change.
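A small sketch of what remains valid and what does not. The add and
call_bound helpers are made up for the example; the volatile call is
left in a comment because it would draw a deprecation warning in C++17
mode and be rejected in C++20 mode with this change.

```cpp
#include <functional>
#include <type_traits>

inline int add(int a, int b) { return a + b; }

using bound_type = decltype(std::bind(add, 1, std::placeholders::_1));

// Calls through non-volatile call wrappers are unaffected by the change.
constexpr bool plain_ok = std::is_invocable_r_v<int, bound_type&, int>;
constexpr bool const_ok = std::is_invocable_r_v<int, const bound_type&, int>;

inline int call_bound()
{
  auto add_one = std::bind(add, 1, std::placeholders::_1);
  // volatile auto v = add_one;
  // v(41);  // volatile-qualified call: the case this change disables
  return add_one(41);
}
```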
libstdc++-v3/ChangeLog:
* doc/xml/manual/evolution.xml: Document std::bind API
changes.
* doc/xml/manual/intro.xml: Document LWG 2487 status.
* doc/xml/manual/using.xml: Clarify default value of
_GLIBCXX_USE_DEPRECATED.
* doc/html/*: Regenerate.
* include/std/functional (_Bind::operator()(Args&&...) volatile)
(_Bind::operator()(Args&&...) const volatile)
(_Bind_result::operator()(Args&&...) volatile)
(_Bind_result::operator()(Args&&...) const volatile): Replace
with deleted overload for C++20 and later.
* testsuite/20_util/bind/cv_quals.cc: Check for deprecated
warnings in C++17.
* testsuite/20_util/bind/cv_quals_2.cc: Likewise, and check for
ill-formed in C++20.
Jonathan Wakely [Tue, 27 Sep 2022 19:59:05 +0000 (20:59 +0100)]
libstdc++: Make INVOKE<R> refuse to create dangling references [PR70692]
This is the next part of the library changes from P2255R2. This makes
INVOKE<R> ill-formed if converting the INVOKE expression to R would bind
a reference to a temporary object.
The is_invocable_r trait is now false if the invocation would create a
dangling reference. This is done by adding the dangling check to the
__is_invocable_impl partial specialization used for INVOKE<R>
expressions. This change also slightly simplifies the nothrow checking
recently added to that partial specialization.
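For illustration, assuming two made-up functions by_value and
by_reference, the trait can be queried as below. The value of
value_as_ref depends on the library: with this change it should be
false, while older libstdc++ versions report true, so no particular
value is assumed for it here.

```cpp
#include <type_traits>

int by_value();
const int& by_reference();

// Converting the int prvalue returned by by_value to const int& would
// bind the reference to a temporary, i.e. create a dangling reference.
constexpr bool value_as_ref
  = std::is_invocable_r_v<const int&, decltype(&by_value)>;

// These two conversions never create a temporary bound to a reference.
constexpr bool ref_as_ref
  = std::is_invocable_r_v<const int&, decltype(&by_reference)>;
constexpr bool value_as_value
  = std::is_invocable_r_v<int, decltype(&by_value)>;
```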
This change also removes the is_invocable_r checks from the pre-C++17
implementation of std::__invoke_r, because there is no need for it to be
SFINAE-friendly. None of our C++11 and C++14 uses of INVOKE<R> require
those constraints. The std::function constructor needs to check
is_invocable_r, but that's already done explicitly, so we don't need to
recheck when calling __invoke_r in std::function::operator(). The
other uses of std::__invoke_r do not need to be constrained and can
just be ill-formed if the INVOKE<R> expression is ill-formed.
libstdc++-v3/ChangeLog:
PR libstdc++/70692
* include/bits/invoke.h [__cplusplus < 201703] (__invoke_r):
Remove is_invocable and is_convertible constraints.
* include/std/type_traits (__is_invocable_impl::_S_conv): Use
non-deduced context for parameter.
(__is_invocable_impl::_S_test): Remove _Check_noex template
parameter and use deduced noexcept value in its place. Add bool
parameter to detect dangling references.
(__is_invocable_impl::type): Adjust call to _S_test to avoid
deducing unnecessary noexcept property.
(__is_invocable_impl::__nothrow_type): Rename to ...
(__is_invocable_impl::__nothrow_conv): ... this. Adjust call
to _S_test to deduce noexcept property.
* testsuite/20_util/bind/dangling_ref.cc: New test.
* testsuite/20_util/function/cons/70692.cc: New test.
* testsuite/20_util/function_objects/invoke/dangling_ref.cc:
New test.
* testsuite/20_util/is_invocable/dangling_ref.cc: New test.
* testsuite/30_threads/packaged_task/cons/dangling_ref.cc:
New test.
Eugene Rozenfeld [Thu, 21 Apr 2022 22:42:15 +0000 (15:42 -0700)]
Add instruction level discriminator support.
This is the first in a series of patches to enable discriminator support
in AutoFDO.
This patch switches to tracking discriminators per statement/instruction
instead of per basic block. Tracking per basic block was problematic since
not all statements in a basic block needed a discriminator and, also, later
optimizations could move statements between basic blocks making correlation
during AutoFDO compilation unreliable. Tracking per statement also allows
us to assign different discriminators to multiple function calls in the same
basic block. A subsequent patch will add that support.
The idea of this patch is based on commit
4c311d95cf6d9519c3c20f641cc77af7df491fdf
by Dehao Chen in vendors/google/heads/gcc-4_8 but uses a slightly different
approach. In Dehao's work special (normally unused) location ids and side tables
were used to keep track of locations with discriminators. Things have changed
since then and I don't think we have unused location ids anymore. Instead,
I made discriminators a part of ad-hoc locations.
The difference from Dehao's work also includes support for discriminator
reading/writing in lto streaming and in modules.
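As a toy model of carrying the discriminator in the ad-hoc location
data, consider the sketch below. The struct layout, the helper names
and the convention that 0 means "no discriminator" are assumptions made
for the example; the real location_adhoc_data has more fields.

```cpp
#include <cstdint>

// Toy ad-hoc location record (hypothetical layout).
struct adhoc_location
{
  std::uint32_t locus;          // underlying source location id
  std::uint32_t discriminator;  // assumed: 0 means no discriminator
};

// Return a copy of LOC that carries discriminator D.
constexpr adhoc_location with_discriminator(adhoc_location loc,
                                            std::uint32_t d)
{
  return adhoc_location{loc.locus, d};
}

constexpr bool has_discr(adhoc_location loc)
{
  return loc.discriminator != 0;
}

// Two records are the same only if the discriminator matches too, so
// statements on one source line can still be told apart.
constexpr bool same_adhoc_data(adhoc_location a, adhoc_location b)
{
  return a.locus == b.locus && a.discriminator == b.discriminator;
}
```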
Tested on x86_64-pc-linux-gnu.
gcc/ChangeLog:
* basic-block.h: Remove discriminator from basic blocks.
* cfghooks.cc (split_block_1): Remove discriminator from basic blocks.
* final.cc (final_start_function_1): Switch from per-bb to per statement
discriminator.
(final_scan_insn_1): Don't keep track of basic block discriminators.
(compute_discriminator): Switch from basic block discriminators to
instruction discriminators.
(insn_discriminator): New function to return instruction discriminator.
(notice_source_line): Use insn_discriminator.
* gimple-pretty-print.cc (dump_gimple_bb_header): Remove dumping of
basic block discriminators.
* gimple-streamer-in.cc (input_bb): Remove reading of basic block
discriminators.
* gimple-streamer-out.cc (output_bb): Remove writing of basic block
discriminators.
* input.cc (make_location): Pass 0 discriminator to COMBINE_LOCATION_DATA.
(location_with_discriminator): New function to combine locus with
a discriminator.
(has_discriminator): New function to check if a location has a discriminator.
(get_discriminator_from_loc): New function to get the discriminator
from a location.
* input.h: Declarations of new functions.
* lto-streamer-in.cc (cmp_loc): Use discriminators in location comparison.
(apply_location_cache): Keep track of current discriminator.
(input_location_and_block): Read discriminator from stream.
* lto-streamer-out.cc (clear_line_info): Set current discriminator to
UINT_MAX.
(lto_output_location_1): Write discriminator to stream.
* lto-streamer.h: Add discriminator to cached_location.
Add current_discr to lto_location_cache.
Add current_discr to output_block.
* print-rtl.cc (print_rtx_operand_code_i): Print discriminator.
* rtl.h: Add extern declaration of insn_discriminator.
* tree-cfg.cc (assign_discriminator): New function to assign a unique
discriminator value to all statements in a basic block that have the given
line number.
(assign_discriminators): Assign discriminators to statement locations.
* tree-pretty-print.cc (dump_location): Dump discriminators.
* tree.cc (set_block): Preserve discriminator when setting block.
(set_source_range): Preserve discriminator when setting source range.
gcc/cp/ChangeLog:
* module.cc (write_location): Write discriminator.
(read_location): Read discriminator.
libcpp/ChangeLog:
* include/line-map.h: Add discriminator to location_adhoc_data.
(get_combined_adhoc_loc): Add discriminator parameter.
(get_discriminator_from_adhoc_loc): Add external declaration.
(get_discriminator_from_loc): Add external declaration.
(COMBINE_LOCATION_DATA): Add discriminator parameter.
* lex.cc (get_location_for_byte_range_in_cur_line): Pass 0 discriminator
in a call to COMBINE_LOCATION_DATA.
(warn_about_normalization): Pass 0 discriminator in a call to
COMBINE_LOCATION_DATA.
(_cpp_lex_direct): Pass 0 discriminator in a call to
COMBINE_LOCATION_DATA.
* line-map.cc (location_adhoc_data_hash): Use discriminator to compute
location_adhoc_data hash.
(location_adhoc_data_eq): Use discriminator when comparing
location_adhoc_data.
(can_be_stored_compactly_p): Check discriminator to determine
compact storage.
(get_combined_adhoc_loc): Add discriminator parameter.
(get_discriminator_from_adhoc_loc): New function to get the discriminator
from an ad-hoc location.
(get_discriminator_from_loc): New function to get the discriminator
from a location.
gcc/testsuite/ChangeLog:
* c-c++-common/ubsan/pr85213.c: Pass -gno-statement-frontiers.
Nathan Sidwell [Wed, 28 Sep 2022 16:20:27 +0000 (09:20 -0700)]
c++: Add DECL_NTTP_OBJECT_P lang flag
VAR_DECLs for NTTPs need to be handled specially by module streaming,
in the same manner as type info decls. This reworks their handling to
allow that work to drop in. We use DECL_LANG_FLAG_5 to indicate such
decls (I didn't notice template_parm_object_p, which looks at the
mangled name -- anyway a bit flag on the node is better, IMHO). We
break apart the creation routine, so there's now an entry point the
module machinery can use directly.
gcc/cp/
* cp-tree.h (DECL_NTTP_OBJECT_P): New.
(template_parm_object_p): Delete.
(build_template_parm_object): Declare.
* cxx-pretty-print.cc (pp_cxx_template_argument_list): Use DECL_NTTP_OBJECT_P.
* error.cc (dump_simple_decl): Likewise.
* mangle.cc (write_template_arg): Likewise.
* pt.cc (template_parm_object_p): Delete.
(create_template_parm_object): Separated out checking from ...
(get_template_parm_object): ... this, new external entry point.
H.J. Lu [Tue, 27 Sep 2022 23:19:11 +0000 (16:19 -0700)]
i386: Mark XMM4-XMM6 as clobbered by encodekey128/encodekey256
encodekey128 and encodekey256 operations clear XMM4-XMM6. But it is
documented that XMM4-XMM6 are reserved for future usages and software
should not rely upon them being zeroed. Change encodekey128 and
encodekey256 to clobber XMM4-XMM6.
gcc/
PR target/107061
* config/i386/predicates.md (encodekey128_operation): Check
XMM4-XMM6 as clobbered.
(encodekey256_operation): Likewise.
* config/i386/sse.md (encodekey128u32): Clobber XMM4-XMM6.
(encodekey256u32): Likewise.
gcc/testsuite/
PR target/107061
* gcc.target/i386/keylocker-encodekey128.c: Don't check
XMM4-XMM6.
* gcc.target/i386/keylocker-encodekey256.c: Likewise.
Ju-Zhe Zhong [Tue, 27 Sep 2022 09:26:08 +0000 (17:26 +0800)]
RISC-V: Add ABI-defined RVV types.
gcc/ChangeLog:
* config.gcc: Add riscv-vector-builtins.o.
* config/riscv/riscv-builtins.cc (riscv_init_builtins): Add RVV builtin function.
* config/riscv/riscv-protos.h (riscv_v_ext_enabled_vector_mode_p): New function.
* config/riscv/riscv.cc (ENTRY): New macro.
(riscv_v_ext_enabled_vector_mode_p): New function.
(riscv_mangle_type): Add RVV mangle.
(riscv_vector_mode_supported_p): Adjust RVV machine mode.
(riscv_verify_type_context): Add context check for RVV.
(riscv_vector_alignment): Add RVV alignment target hook support.
(TARGET_VECTOR_MODE_SUPPORTED_P): New target hook support.
(TARGET_VERIFY_TYPE_CONTEXT): Ditto.
(TARGET_VECTOR_ALIGNMENT): Ditto.
* config/riscv/t-riscv: Add riscv-vector-builtins.o
* config/riscv/riscv-vector-builtins.cc: New file.
* config/riscv/riscv-vector-builtins.def: New file.
* config/riscv/riscv-vector-builtins.h: New file.
* config/riscv/riscv-vector-switch.def: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/abi-1.c: New test.
* gcc.target/riscv/rvv/base/abi-2.c: New test.
* gcc.target/riscv/rvv/base/abi-3.c: New test.
* gcc.target/riscv/rvv/base/abi-4.c: New test.
* gcc.target/riscv/rvv/base/abi-5.c: New test.
* gcc.target/riscv/rvv/base/abi-6.c: New test.
* gcc.target/riscv/rvv/base/abi-7.c: New test.
* gcc.target/riscv/rvv/rvv.exp: New test.
Stefan Schulze Frielinghaus [Wed, 28 Sep 2022 15:27:11 +0000 (17:27 +0200)]
var-tracking: Add entry values up to max register mode
For parameters of integer type that do not consume a whole register
(modulo sign/zero extension), this patch adds entry values up to the
maximal register mode.
gcc/ChangeLog:
* var-tracking.cc (vt_add_function_parameter): Add entry values
up to maximal register mode.
Stefan Schulze Frielinghaus [Wed, 28 Sep 2022 15:27:11 +0000 (17:27 +0200)]
cselib: Keep track of further subvalue relations
Whenever a new cselib value is created, check whether a smaller value
exists that is contained in the bigger one. If so, add a subreg
relation to the locs of the smaller one.
gcc/ChangeLog:
* cselib.cc (new_cselib_val): Keep track of further subvalue
relations.
Andrea Corallo [Wed, 14 Sep 2022 15:38:30 +0000 (17:38 +0200)]
arm: Define __ARM_FEATURE_AES and __ARM_FEATURE_SHA2 when march +crypto is selected
This patch fixes the missing definition of __ARM_FEATURE_AES and
__ARM_FEATURE_SHA2 when the AES, SHA1 and SHA2 crypto instructions are
available [1] (that is, when -march with +crypto is selected).
[1] <https://raw.githubusercontent.com/ARM-software/acle/main/main/acle.md>
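User code would typically test these macros along the following lines.
The function name and the bit encoding are hypothetical; the sketch
only reports which of the two macros the compiler defined.

```cpp
// Bit 0: __ARM_FEATURE_AES defined, bit 1: __ARM_FEATURE_SHA2 defined.
// (The encoding is an assumption made for this example.)
constexpr unsigned crypto_feature_mask()
{
  unsigned mask = 0;
#if defined(__ARM_FEATURE_AES)
  mask |= 1u;
#endif
#if defined(__ARM_FEATURE_SHA2)
  mask |= 2u;
#endif
  return mask;
}
```

On targets where neither macro is defined the function simply returns
0, so the same source builds everywhere.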
gcc/ChangeLog
2022-09-14 Andrea Corallo <andrea.corallo@arm.com>
* config/arm/arm-c.cc (arm_cpu_builtins): Define
__ARM_FEATURE_AES and __ARM_FEATURE_SHA2.
gcc/testsuite/ChangeLog
2022-09-14 Andrea Corallo <andrea.corallo@arm.com>
* gcc.target/arm/attr-crypto.c: Update test.
Xi Ruoyao [Sat, 24 Sep 2022 12:47:22 +0000 (20:47 +0800)]
LoongArch: Use UNSPEC for fmin/fmax RTL pattern [PR105414]
I made a mistake defining fmin/fmax RTL patterns in r13-2085: I used
smin and smax in the definition mistakenly. This causes the optimizer
to perform constant folding as if fmin/fmax was "really" smin/smax
operations even with -fsignaling-nans. Then pr105414.c fails.
We don't have fmin/fmax RTL codes for now (PR107013) so we can only use
an UNSPEC for fmin and fmax patterns.
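To see why a plain "select the smaller operand" minimum cannot stand in for fmin, the sketch below compares the two on a NaN operand. It uses a quiet NaN and ordinary C++ functions only; select_min is a hypothetical stand-in and does not model GCC's actual RTL smin semantics or signaling NaNs.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Plain comparison-based minimum (illustrative stand-in, not RTL smin).
inline double select_min(double a, double b) { return a < b ? a : b; }

// fmin treats a quiet NaN operand as missing data and returns the other one.
inline double ieee_min(double a, double b) { return std::fmin(a, b); }

inline void demo_min_difference() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(ieee_min(1.0, nan) == 1.0);         // NaN is ignored
  assert(std::isnan(select_min(1.0, nan)));  // comparison is false, NaN is picked
}
```

Folding one operation as if it were the other can therefore change the result whenever a NaN is involved.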
gcc/ChangeLog:
PR tree-optimization/105414
* config/loongarch/loongarch.md (UNSPEC_FMAX): New unspec.
(UNSPEC_FMIN): Likewise.
(fmax<mode>3): Use UNSPEC_FMAX instead of smax.
(fmin<mode>3): Use UNSPEC_FMIN instead of smin.
Torbjörn SVENSSON [Tue, 27 Sep 2022 14:36:12 +0000 (16:36 +0200)]
testsuite: Skip intrinsics test if arm
In the test cases, it's clearly written that intrinsics are not
implemented on arm*. A simple xfail does not help since there are
link errors, which would cause an UNRESOLVED testcase rather than
XFAIL.
By changing to dg-skip-if, the entire test case is omitted.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Rephrase
to unimplemented.
* gcc.target/aarch64/advsimd-intrinsics/vld1x3.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst1x2.c: Replace
dg-xfail-if with dg-skip-if.
* gcc.target/aarch64/advsimd-intrinsics/vst1x3.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst1x4.c: Likewise.
Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Lulu Cheng [Wed, 28 Sep 2022 08:35:06 +0000 (16:35 +0800)]
LoongArch: Fixed a typo in the comment information of the function loongarch_asan_shadow_offset.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_asan_shadow_offset):
Fixed typo in "asan_mapping.h".
Tobias Burnus [Wed, 28 Sep 2022 08:24:58 +0000 (10:24 +0200)]
libgomp.texi: Status 'P' for 'assume', remove duplicated line
libgomp/
* libgomp.texi (OpenMP 5.1): Mark 'assume' as implemented
for C/C++. Remove duplicated 'begin declare target' entry.
Lulu Cheng [Mon, 26 Sep 2022 01:42:51 +0000 (09:42 +0800)]
LoongArch: Libitm add LoongArch support.
Co-Authored-By: Yang Yujie <yangyujie@loongson.cn>
libitm/ChangeLog:
* configure.tgt: Add loongarch support.
* config/loongarch/asm.h: New file.
* config/loongarch/sjlj.S: New file.
* config/loongarch/target.h: New file.
H.J. Lu [Thu, 14 Jul 2022 15:23:38 +0000 (08:23 -0700)]
stack-protector: Check stack canary before throwing exception
Check stack canary before throwing exception to avoid stack corruption.
gcc/
PR middle-end/58245
* calls.cc: Include "tree-eh.h".
(expand_call): Check stack canary before throwing exception.
gcc/testsuite/
PR middle-end/58245
* g++.dg/fstack-protector-strong.C: Adjusted.
* g++.dg/pr58245-1.C: New test.
Eugene Rozenfeld [Wed, 28 Sep 2022 00:28:20 +0000 (17:28 -0700)]
Fix AutoFDO tests to not look for hot/cold splitting.
AutoFDO counts are not reliable and we are currently not
performing hot/cold splitting based on them. This change adjusts
several tree-prof tests not to check for hot/cold splitting
when run with AutoFDO.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-prof/cold_partition_label.c: Don't check for hot/cold splitting with AutoFDO.
* gcc.dg/tree-prof/section-attr-1.c: Don't check for hot/cold splitting with AutoFDO.
* gcc.dg/tree-prof/section-attr-2.c: Don't check for hot/cold splitting with AutoFDO.
* gcc.dg/tree-prof/section-attr-3.c: Don't check for hot/cold splitting with AutoFDO.
GCC Administrator [Wed, 28 Sep 2022 00:17:27 +0000 (00:17 +0000)]
Daily bump.
Eugene Rozenfeld [Fri, 23 Sep 2022 01:12:01 +0000 (18:12 -0700)]
Fix profile count comparison.
The comparison was incorrect when the counts weren't PRECISE.
For example, crossmodule-indir-call-topn-1.c was failing
with AutoFDO: when count_sum is 0 with quality AFDO,
count_sum > profile_count::zero() evaluates to true. Taking that
branch then leads to an assert in the call to to_sreal().
Tested on x86_64-pc-linux-gnu.
gcc/ChangeLog:
* ipa-cp.cc (good_cloning_opportunity_p): Fix profile count comparison.
Marek Polacek [Wed, 31 Aug 2022 21:37:59 +0000 (17:37 -0400)]
c++: Implement C++23 P2266R1, Simpler implicit move [PR101165]
This patch implements https://wg21.link/p2266, which, once again,
changes the implicit move rules. Here's a brief summary of various
changes in this area:
r125211: Introduced moving from certain lvalues when returning them
r171071: CWG 1148, enable move from value parameter on return
r212099: CWG 1579, it's OK to call a converting ctor taking an rvalue
r251035: CWG 1579, do maybe-rvalue overload resolution twice
r11-2411: Avoid calling const copy ctor on implicit move
r11-2412: C++20 implicit move changes, remove the fallback overload
resolution, allow move on throw of parameters and implicit
move of rvalue references
P2266 enables the implicit move even for functions that return references.
That is, we will now perform a move in
X&& foo (X&& x) {
return x;
}
P2266 also removes the fallback overload resolution, but this was
resolved by r11-2412: we only do convert_for_initialization with
LOOKUP_PREFER_RVALUE in C++17 and older.
P2266 also says that a returned move-eligible id-expression is always an
xvalue. This required some further short, but nontrivial changes,
especially when it comes to deduction, because we have to pay attention
to whether we have auto, auto&& (which is like T&&), or decltype(auto)
with (un)parenthesized argument. In C++23,
decltype(auto) f(int&& x) { return (x); }
auto&& f(int x) { return x; }
both should deduce to 'int&&' but
decltype(auto) f(int x) { return x; }
should deduce to 'int'. A cornucopia of tests attached. I've also
verified that we behave like clang++.
xvalue_p seemed to be broken: since the introduction of clk_implicit_rval,
it cannot use '==' when checking for clk_rvalueref.
Since this change breaks code, it's only enabled in C++23. In
particular, this code will not compile in C++23:
int& g(int&& x) { return x; }
because x is now treated as an rvalue, and you can't bind a non-const lvalue
reference to an rvalue.
This patch also fixes PR106882 (the check_return_expr changes).
PR c++/101165
PR c++/106882
gcc/c-family/ChangeLog:
* c-cppbuiltin.cc (c_cpp_builtins): Define __cpp_implicit_move.
gcc/cp/ChangeLog:
* call.cc (reference_binding): Check clk_implicit_rval in C++20 only.
* cp-tree.h (unparenthesized_id_or_class_member_access_p): Declare.
* pt.cc (unparenthesized_id_or_class_member_access_p): New function,
broken out of...
(do_auto_deduction): ...here. Use it. In C++23, maybe call
treat_lvalue_as_rvalue_p.
* tree.cc (xvalue_p): Check & clk_rvalueref, not == clk_rvalueref.
* typeck.cc (check_return_expr): Allow implicit move for functions
returning a reference as well, or when the return value type is not
a scalar type.
gcc/testsuite/ChangeLog:
* g++.dg/conversion/pr41426.C: Add dg-error for C++23.
* g++.dg/cpp0x/elision_weak.C: Likewise.
* g++.dg/cpp0x/move-return3.C: Only link in c++20_down.
* g++.dg/cpp1y/decltype-auto2.C: Add dg-error for C++23.
* g++.dg/cpp1y/lambda-generic-89419.C: Likewise.
* g++.dg/cpp23/feat-cxx2b.C: Test __cpp_implicit_move.
* g++.dg/gomp/pr56217.C: Only compile in c++20_down.
* g++.dg/warn/Wno-return-local-addr.C: Add dg-error for C++23.
* g++.dg/warn/Wreturn-local-addr.C: Adjust dg-error.
* g++.old-deja/g++.brendan/crash55.C: Add dg-error for C++23.
* g++.old-deja/g++.jason/temporary2.C: Likewise.
* g++.old-deja/g++.mike/p2846b.C: Adjust.
* g++.dg/cpp1y/decltype-auto6.C: New test.
* g++.dg/cpp23/decltype1.C: New test.
* g++.dg/cpp23/decltype2.C: New test.
* g++.dg/cpp23/elision1.C: New test.
* g++.dg/cpp23/elision2.C: New test.
* g++.dg/cpp23/elision3.C: New test.
* g++.dg/cpp23/elision4.C: New test.
* g++.dg/cpp23/elision5.C: New test.
* g++.dg/cpp23/elision6.C: New test.
* g++.dg/cpp23/elision7.C: New test.
Harald Anlauf [Tue, 27 Sep 2022 18:54:28 +0000 (20:54 +0200)]
Fortran: error recovery while simplifying intrinsic UNPACK [PR107054]
gcc/fortran/ChangeLog:
PR fortran/107054
* simplify.cc (gfc_simplify_unpack): Replace assert by condition
that terminates simplification when there are not enough elements
in the constructor of argument VECTOR.
gcc/testsuite/ChangeLog:
PR fortran/107054
* gfortran.dg/pr107054.f90: New test.
Ian Lance Taylor [Mon, 26 Sep 2022 19:03:53 +0000 (15:03 -0400)]
runtime: portable access to sigev_notify_thread_id
Previously, libgo relied on the _sigev_un implementation-specific
field in struct sigevent, which is only available on glibc.
This patch uses the sigev_notify_thread_id macro instead which is
mandated by timer_create(2). In theory, this should work with any libc
implementation for Linux. Unfortunately, there is an open glibc bug
as glibc does not define this macro. For this reason, a glibc-specific
workaround is required. Other libcs (such as musl) define the macro
and don't require the workaround.
See https://sourceware.org/bugzilla/show_bug.cgi?id=27417
This makes libgo compatible with musl libc.
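A minimal sketch of the kind of workaround described, written in C++ rather than Go. It assumes that on glibc the thread id lives at _sigev_un._tid, which is an implementation detail; store_notify_tid is a hypothetical helper, and on non-Linux systems the sketch degrades to a no-op.

```cpp
#include <cassert>
#include <signal.h>

#if defined(__linux__)
// Fallback for C libraries that do not define the macro. The member path
// is glibc's internal layout and is an assumption of this sketch.
#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid
#endif
#endif

// Hypothetical helper: store a kernel thread id in a sigevent, read it back.
inline int store_notify_tid(int tid) {
#if defined(__linux__)
  struct sigevent sev = {};
  sev.sigev_notify = SIGEV_THREAD_ID;
  sev.sigev_notify_thread_id = tid;
  return sev.sigev_notify_thread_id;
#else
  return tid;  // no Linux-specific field on this platform
#endif
}
```

Code written against the macro name stays the same whether the C library provides the macro or the fallback definition is used.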
Based on patch by Sören Tempel.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/434755
melonedo [Mon, 19 Sep 2022 08:01:04 +0000 (16:01 +0800)]
runtime: synchronize empty struct field handling
In GCCGO and gollvm, the logic for allocating one byte for the last field is:
1. the last field has zero size
2. the struct itself does not have zero size
3. the last field is not blank
This commit adds the last two conditions to runtime.structToFFI.
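The three conditions can be written as a single predicate. The sketch below is in C++ for illustration only; LayoutInfo and needs_trailing_byte are hypothetical names and do not correspond to anything in libgo.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a struct layout; not a libgo type.
struct LayoutInfo {
  std::size_t struct_size;      // total size of the struct in bytes
  std::size_t last_field_size;  // size of the final field in bytes
  bool last_field_is_blank;     // true if the final field is named "_"
};

// True when one padding byte should be allocated for the last field.
inline bool needs_trailing_byte(const LayoutInfo& s) {
  return s.last_field_size == 0   // 1. the last field has zero size
      && s.struct_size != 0       // 2. the struct itself is not zero-sized
      && !s.last_field_is_blank;  // 3. the last field is not blank
}

inline void demo_trailing_byte() {
  assert(needs_trailing_byte(LayoutInfo{8, 0, false}));
  assert(!needs_trailing_byte(LayoutInfo{0, 0, false}));
}
```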
For golang/go#55146
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/431735
Kim Kuparinen [Tue, 27 Sep 2022 15:04:25 +0000 (11:04 -0400)]
docs: update abi version info
gcc/
* doc/invoke.texi: Update ABI version info.
Aldy Hernandez [Tue, 27 Sep 2022 08:30:00 +0000 (10:30 +0200)]
range-ops: Calculate the popcount of a singleton.
The legacy popcount folding didn't actually fold singleton ranges.
I don't think anyone noticed because there are match.pd patterns that
fold __builtin_popcount using the global nonzero bits set by CCP.
It's good form to handle this, even without CCP's help.
Tested on x86-64 Linux.
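As a rough model of the idea, a range that contains exactly one value lets the popcount be folded to a constant. Range and fold_popcount below are hypothetical stand-ins and not the irange or cfn_popcount interfaces.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical closed interval [lo, hi]; a stand-in for a value range.
struct Range {
  std::uint64_t lo;
  std::uint64_t hi;
};

inline int popcount64(std::uint64_t v) {
  int n = 0;
  while (v) {
    v &= v - 1;  // clear the lowest set bit
    ++n;
  }
  return n;
}

// Writes the folded popcount to *out and returns true when the range is a
// singleton; returns false (leaving *out untouched) otherwise.
inline bool fold_popcount(const Range& r, int* out) {
  if (r.lo != r.hi) return false;
  *out = popcount64(r.lo);
  return true;
}
```

For the singleton [0xF0, 0xF0] the fold yields 4; for [1, 2] no single answer exists and the function declines.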
gcc/ChangeLog:
* gimple-range-op.cc (cfn_popcount): Calculate the popcount of a
singleton.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/popcount6b.c: New test.
Marek Polacek [Fri, 23 Sep 2022 16:32:38 +0000 (12:32 -0400)]
c++: Don't quote nothrow in diagnostic
In <https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602057.html>
Jason noticed that we quote "nothrow" in diagnostics even though it's
not a keyword in C++. This patch removes the quotes and also drops
"nothrow" from c_keywords.
gcc/c-family/ChangeLog:
* c-format.cc (c_keywords): Drop nothrow.
gcc/cp/ChangeLog:
* constraint.cc (diagnose_trait_expr): Say "nothrow" without quotes
rather than in quotes.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-traits3.C: Adjust expected diagnostics.
Jonathan Wakely [Tue, 27 Sep 2022 08:51:10 +0000 (09:51 +0100)]
c++: Make __is_{,nothrow_}convertible SFINAE on access [PR107049]
The is_convertible built-ins should return false if the conversion fails
an access check, not report an error.
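The expected behavior can be observed through the standard trait, which may or may not be implemented with the built-in depending on the library version. Guarded and Open are hypothetical types used only for this sketch.

```cpp
#include <type_traits>

class Guarded {
  Guarded(int) {}  // private converting constructor
 public:
  Guarded() = default;
};

struct Open {
  Open(int) {}  // public converting constructor
};

// The inaccessible conversion yields false rather than a hard error.
static_assert(!std::is_convertible<int, Guarded>::value,
              "private constructor is not usable for conversion");
static_assert(std::is_convertible<int, Open>::value,
              "public constructor is usable for conversion");

constexpr bool int_converts_to_guarded() {
  return std::is_convertible<int, Guarded>::value;
}
```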
PR c++/107049
gcc/cp/ChangeLog:
* method.cc (is_convertible_helper): Use access check sentinel.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_convertible4.C: New test.
* g++.dg/ext/is_nothrow_convertible4.C: New test.
libstdc++-v3/ChangeLog:
* testsuite/20_util/is_convertible/requirements/access.cc: New
test.
Jonathan Wakely [Tue, 27 Sep 2022 10:25:51 +0000 (11:25 +0100)]
libstdc++: Adjust deduction guides for static operator() [PR106651]
Adjust the deduction guides for std::function and std::packaged_task to
work with static call operators. This finishes the implementation of
P1169R4 for C++23.
libstdc++-v3/ChangeLog:
PR c++/106651
* include/bits/std_function.h (__function_guide_t): New alias
template.
[__cpp_static_call_operator] (__function_guide_static_helper):
New class template.
(function): Use __function_guide_t in deduction guide.
* include/std/future (packaged_task): Use __function_guide_t in
deduction guide.
* testsuite/20_util/function/cons/deduction_c++23.cc: New test.
* testsuite/30_threads/packaged_task/cons/deduction_c++23.cc:
New test.
Jakub Jelinek [Tue, 27 Sep 2022 10:29:46 +0000 (12:29 +0200)]
fixincludes: Fix up for Debian/Ubuntu includes
As reported by Tobias, my C++ _Float{16,32,64,128,32x,64x,128x} support
patch broke Debian/Ubuntu bootstraps. The problem is that there the
glibc bits/floatn.h and bits/floatn-common.h headers aren't in /usr/include/
directly, but in a subdirectory like /usr/include/x86_64-linux-gnu/.
It seems other fixincludes rules for bits/* headers use
files = bits/whatever.h, "*/bits/whatever.h";
so this patch just follows suit.
2022-06-27 Jakub Jelinek <jakub@redhat.com>
* inclhack.def (glibc_cxx_floatn_1, glibc_cxx_floatn_2,
glibc_cxx_floatn_3): Add to files also "*/bits/floatn.h"
and "*/bits/floatn-common.h".
* fixincl.x: Regenerated.
Iain Buclaw [Tue, 27 Sep 2022 08:43:32 +0000 (10:43 +0200)]
d: Merge upstream dmd d579c467c1, phobos 88aa69b14.
D front-end changes:
- Throwing from contracts of `nothrow' functions has been
deprecated, as this breaks the guarantees of `nothrow'.
- Added language support for initializing the interior pointer of
associative arrays using `new' keyword.
Phobos changes:
- The std.digest.digest module has been removed.
- The std.xml module has been removed.
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd d579c467c1.
* decl.cc (layout_struct_initializer): Update for new front-end
interface.
* expr.cc (ExprVisitor::visit (AssignExp *)): Remove lowering of array
assignments.
(ExprVisitor::visit (NewExp *)): Add new lowering of new'ing
associative arrays to an _aaNew() library call.
* runtime.def (ARRAYSETASSIGN): Remove.
(AANEW): Define.
libphobos/ChangeLog:
* libdruntime/MERGE: Merge upstream druntime d579c467c1.
* libdruntime/Makefile.am (DRUNTIME_DSOURCES): Remove
rt/arrayassign.d.
* libdruntime/Makefile.in: Regenerate.
* src/MERGE: Merge upstream phobos 88aa69b14.
* src/Makefile.am (PHOBOS_DSOURCES): Remove std/digest/digest.d,
std/xml.d.
* src/Makefile.in: Regenerate.
Aldy Hernandez [Tue, 27 Sep 2022 06:05:30 +0000 (08:05 +0200)]
irange: keep better track of powers of 2.
When setting the nonzero bits to a mask containing only one bit, set
the range immediately, as it can be divined from the mask. This helps
us keep better track of powers of two.
For example, with this patch a nonzero mask of 0x8000 is set to a
range of [0,0][0x8000,0x8000] with a nonzero mask of 0x8000.
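The reasoning can be sketched as follows: if only one bit may be nonzero, the value is either 0 or that bit. TwoValues and values_from_mask are hypothetical names for illustration and are not the irange interface.

```cpp
#include <cassert>
#include <cstdint>

// True when exactly one bit is set in the mask.
inline bool single_bit(std::uint64_t mask) {
  return mask != 0 && (mask & (mask - 1)) == 0;
}

// Hypothetical result type: the two values a variable can take when its
// nonzero-bits mask has exactly one bit set.
struct TwoValues {
  bool known;
  std::uint64_t low;
  std::uint64_t high;
};

inline TwoValues values_from_mask(std::uint64_t mask) {
  if (single_bit(mask)) return TwoValues{true, 0, mask};
  return TwoValues{false, 0, 0};  // more than one candidate bit: no fold
}
```

For the mask 0x8000 the sketch reports the pair 0 and 0x8000, matching the two sub-ranges in the example above.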
gcc/ChangeLog:
* value-range.cc (irange::set_nonzero_bits): Set range when known.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/popcount6.c: New test.
Aldy Hernandez [Tue, 27 Sep 2022 06:00:40 +0000 (08:00 +0200)]
Add an irange setter for wide_ints.
Just the same way as we have real_value setters for franges, we should
have a wide_int version for irange. This matches the irange
constructor for wide_ints, and paves the way for the eventual
conversion of irange to wide ints.
gcc/ChangeLog:
* value-range.h (irange::set): New version taking wide_int_ref.
Jakub Jelinek [Tue, 27 Sep 2022 06:36:28 +0000 (08:36 +0200)]
c++: Implement C++23 P1169R4 - static operator() [PR106651]
The following patch attempts to implement C++23 P1169R4 - static operator()
paper's compiler side (there is some small library side too not implemented
yet). This allows static members as user operator() declarations and
static specifier on lambdas without lambda capture.
The synthesized conversion operator changes for static lambdas as it can just
return the operator() static method address, doesn't need to create a thunk
for it.
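A small sketch of a static call operator, guarded by the feature-test macro so that it also compiles where the feature is unavailable. Adder and add_via_call_operator are illustrative names only.

```cpp
#include <cassert>

struct Adder {
#ifdef __cpp_static_call_operator
  // C++23: the call operator needs no object state, so it can be static.
  static int operator()(int a, int b) { return a + b; }
#else
  // Fallback for compilers or language modes without the feature.
  int operator()(int a, int b) const { return a + b; }
#endif
};

inline int add_via_call_operator(int a, int b) {
  return Adder{}(a, b);  // same call syntax in both configurations
}
```

The call site is unchanged either way; the static form merely avoids passing an object pointer that the operator never uses.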
The change in call.cc (joust) is to avoid ICEs because we assumed that len
could be different only if both candidates are direct calls but it can be
one direct and one indirect call, and to implement the
[over.match.best.general]/1 and [over.best.ics.general] changes from
the paper (implemented always as Jason is sure it doesn't make a difference
in C++20 and earlier unless static member function operator() or
static lambda which we accept with pedwarn in earlier standards too appears
and my testing confirmed that).
2022-09-27 Jakub Jelinek <jakub@redhat.com>
PR c++/106651
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__cpp_static_call_operator=202207L for C++23.
gcc/cp/
* cp-tree.h (LAMBDA_EXPR_STATIC_P): Implement C++23
P1169R4 - static operator(). Define.
* parser.cc (CP_PARSER_FLAGS_ONLY_MUTABLE_OR_CONSTEXPR): Document
that it also allows static.
(cp_parser_lambda_declarator_opt): Handle static lambda specifier.
(cp_parser_decl_specifier_seq): Allow RID_STATIC for
CP_PARSER_FLAGS_ONLY_MUTABLE_OR_CONSTEXPR.
* decl.cc (grok_op_properties): If operator() isn't a method,
use a different error wording, if it is static member function,
allow it (for C++20 and older with a pedwarn unless it is
a lambda function or template instantiation).
* call.cc (joust): Don't ICE if one candidate is static member
function and the other is an indirect call. If the parameter
conversion on the other candidate is user defined conversion,
ellipsis or bad conversion, make static member function candidate
a winner for that parameter.
* lambda.cc (maybe_add_lambda_conv_op): Handle static lambdas.
* error.cc (dump_lambda_function): Print static for static lambdas.
gcc/testsuite/
* g++.dg/template/error30.C: Adjust expected diagnostics.
* g++.dg/cpp1z/constexpr-lambda13.C: Likewise.
* g++.dg/cpp23/feat-cxx2b.C: Test __cpp_static_call_operator.
* g++.dg/cpp23/static-operator-call1.C: New test.
* g++.dg/cpp23/static-operator-call2.C: New test.
* g++.old-deja/g++.jason/operator.C: Adjust expected diagnostics.
Jakub Jelinek [Tue, 27 Sep 2022 06:26:18 +0000 (08:26 +0200)]
reassoc: Handle OFFSET_TYPE like POINTER_TYPE in optimize_range_tests_cmp_bitwise [PR107029]
As the testcase shows, OFFSET_TYPE needs the same treatment as
POINTER_TYPE/REFERENCE_TYPE, otherwise we fail the same during the
newly added verification. OFFSET_TYPE is signed though, so unlike
POINTER_TYPE/REFERENCE_TYPE it can also trigger with the
x < 0 || y < 0 || z < 0 to (x | y | z) < 0
optimization.
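For signed two's-complement operands, the sign-bit identities behind this kind of rewrite are that "any operand is negative" matches the sign of the bitwise OR, and "all operands are negative" matches the sign of the bitwise AND. The sketch below states both with plain int; the function names are illustrative.

```cpp
#include <cassert>

inline bool any_negative(int x, int y, int z) {
  return x < 0 || y < 0 || z < 0;
}
inline bool any_negative_bitwise(int x, int y, int z) {
  return (x | y | z) < 0;  // sign bit set if set in any operand
}

inline bool all_negative(int x, int y, int z) {
  return x < 0 && y < 0 && z < 0;
}
inline bool all_negative_bitwise(int x, int y, int z) {
  return (x & y & z) < 0;  // sign bit set only if set in every operand
}
```

Unsigned or pointer-like types have no values below zero, which is why only a signed type can reach this case.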
2022-09-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/107029
* tree-ssa-reassoc.cc (optimize_range_tests_cmp_bitwise): Treat
OFFSET_TYPE like POINTER_TYPE, except that OFFSET_TYPE may be
signed and so can trigger even the (b % 4) == 3 case.
* g++.dg/torture/pr107029.C: New test.
Jakub Jelinek [Tue, 27 Sep 2022 06:23:08 +0000 (08:23 +0200)]
openmp: Add OpenMP assume, assumes and begin/end assumes support
The following patch implements OpenMP 5.1
#pragma omp assume
#pragma omp assumes
and
#pragma omp begin assumes
#pragma omp end assumes
directive support for C and C++. Currently it doesn't remember
anything from the assumption clauses for later, so is mainly
to support the directives and diagnose errors in their use.
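A minimal sketch of the assume directive with a holds clause on a structured block. It assumes a compiler that implements the OpenMP 5.1 directive when OpenMP is enabled; the pragma is guarded so the code also builds without OpenMP. sum_block is a hypothetical function.

```cpp
#include <cassert>

// Sums the first n elements; the caller promises n is a positive multiple of 4.
inline int sum_block(const int* a, int n) {
  int s = 0;
#ifdef _OPENMP
#pragma omp assume holds(n > 0 && n % 4 == 0)
#endif
  {
    for (int i = 0; i < n; ++i) s += a[i];
  }
  return s;
}
```

The result is the same with or without the directive; it only states an assumption the compiler may later use.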
If the recently posted C++23 [[assume (cond)]]; support makes it
in, the intent is that this can be easily adjusted at least for
the #pragma omp assume directive with holds clause(s) to use
the same infrastructure. Now, C++23 portable assumptions are slightly
different from OpenMP 5.1 assumptions' holds clause in that C++23
assumption holds just where it appears, while OpenMP 5.1 assumptions
hold everywhere in the scope of the directive. For assumes
directive which can appear at file or namespace scope it is the whole
TU and everything that functions from there call at runtime, for
begin assumes/end assumes pair all the functions in between those
directives and everything they call and for assume directive the
associated (currently structured) block. I have no idea how to
represent such holds to be usable for optimizers, except to
make
#pragma omp assume holds (cond)
block;
expand essentially to
[[assume (cond)]];
block;
or
[[assume (cond)]];
block;
[[assume (cond)]];
for now. Except for holds clause, the other assumptions are
OpenMP related, I'd say we should brainstorm where it would be
useful to optimize based on such information (I guess e.g. in target
regions it easily could) and only when we come up with something
like that think about how to propagate the assumptions to the optimizers.
2022-09-27 Jakub Jelinek <jakub@redhat.com>
gcc/c-family/
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_ASSUME,
PRAGMA_OMP_ASSUMES and PRAGMA_OMP_BEGIN. Rename
PRAGMA_OMP_END_DECLARE_TARGET to PRAGMA_OMP_END.
* c-pragma.cc (omp_pragmas): Add assumes and begin.
For end rename PRAGMA_OMP_END_DECLARE_TARGET to PRAGMA_OMP_END.
(omp_pragmas_simd): Add assume.
* c-common.h (c_omp_directives): Declare.
* c-omp.cc (omp_directives): Rename to ...
(c_omp_directives): ... this. No longer static. Uncomment
assume, assumes, begin assumes and end assumes entries.
In end declare target entry rename PRAGMA_OMP_END_DECLARE_TARGET
to PRAGMA_OMP_END.
(c_omp_categorize_directive): Adjust for omp_directives to
c_omp_directives renaming.
gcc/c/
* c-lang.h (current_omp_begin_assumes): Declare.
* c-parser.cc: Include bitmap.h.
(c_parser_omp_end_declare_target): Rename to ...
(c_parser_omp_end): ... this. Handle also end assumes.
(c_parser_omp_begin, c_parser_omp_assumption_clauses,
c_parser_omp_assumes, c_parser_omp_assume): New functions.
(c_parser_translation_unit): Also diagnose #pragma omp begin assumes
without corresponding #pragma omp end assumes.
(c_parser_pragma): Use %s in may only be used at file scope
diagnostics to decrease number of translatable messages. Handle
PRAGMA_OMP_BEGIN and PRAGMA_OMP_ASSUMES. Handle PRAGMA_OMP_END
rather than PRAGMA_OMP_END_DECLARE_TARGET and call c_parser_omp_end
for it rather than c_parser_omp_end_declare_target.
(c_parser_omp_construct): Handle PRAGMA_OMP_ASSUME.
* c-decl.cc (current_omp_begin_assumes): Define.
gcc/cp/
* cp-tree.h (struct omp_begin_assumes_data): New type.
(struct saved_scope): Add omp_begin_assumes member.
* parser.cc: Include bitmap.h.
(cp_parser_omp_assumption_clauses, cp_parser_omp_assume,
cp_parser_omp_assumes, cp_parser_omp_begin): New functions.
(cp_parser_omp_end_declare_target): Rename to ...
(cp_parser_omp_end): ... this. Handle also end assumes.
(cp_parser_omp_construct): Handle PRAGMA_OMP_ASSUME.
(cp_parser_pragma): Handle PRAGMA_OMP_ASSUME, PRAGMA_OMP_ASSUMES
and PRAGMA_OMP_BEGIN. Handle PRAGMA_OMP_END rather than
PRAGMA_OMP_END_DECLARE_TARGET and call cp_parser_omp_end
for it rather than cp_parser_omp_end_declare_target.
* pt.cc (apply_late_template_attributes): Also temporarily clear
omp_begin_assumes.
* semantics.cc (finish_translation_unit): Also diagnose
#pragma omp begin assumes without corresponding
#pragma omp end assumes.
gcc/testsuite/
* c-c++-common/gomp/assume-1.c: New test.
* c-c++-common/gomp/assume-2.c: New test.
* c-c++-common/gomp/assume-3.c: New test.
* c-c++-common/gomp/assumes-1.c: New test.
* c-c++-common/gomp/assumes-2.c: New test.
* c-c++-common/gomp/assumes-3.c: New test.
* c-c++-common/gomp/assumes-4.c: New test.
* c-c++-common/gomp/begin-assumes-1.c: New test.
* c-c++-common/gomp/begin-assumes-2.c: New test.
* c-c++-common/gomp/begin-assumes-3.c: New test.
* c-c++-common/gomp/begin-assumes-4.c: New test.
* c-c++-common/gomp/declare-target-6.c: New test.
* g++.dg/gomp/attrs-1.C (bar): Add n1 and n2 arguments, add
tests for assume directive.
* g++.dg/gomp/attrs-2.C (bar): Likewise.
* g++.dg/gomp/attrs-9.C: Add n1 and n2 variables, add tests for
begin assumes directive.
* g++.dg/gomp/attrs-15.C: New test.
* g++.dg/gomp/attrs-16.C: New test.
* g++.dg/gomp/attrs-17.C: New test.
Jakub Jelinek [Tue, 27 Sep 2022 06:20:05 +0000 (08:20 +0200)]
c++: Improve diagnostics about conflicting specifiers
On Sat, Sep 17, 2022 at 01:23:59AM +0200, Jason Merrill wrote:
> I wonder why we don't give an error when setting the
> conflicting_specifiers_p flag in cp_parser_set_storage_class? We should be
> able to give a better diagnostic at that point.
I didn't have time to update the whole patch last night, but this part
seems to be independent and I've managed to test it.
The diagnostics then look like:
a.C:1:9: error: ‘static’ specifier conflicts with ‘typedef’
1 | typedef static int a;
| ~~~~~~~ ^~~~~~
a.C:2:8: error: ‘typedef’ specifier conflicts with ‘static’
2 | static typedef int b;
| ~~~~~~ ^~~~~~~
a.C:3:8: error: duplicate ‘static’ specifier
3 | static static int c;
| ~~~~~~ ^~~~~~
a.C:4:8: error: ‘extern’ specifier conflicts with ‘static’
4 | static extern int d;
| ~~~~~~ ^~~~~~
2022-09-27 Jakub Jelinek <jakub@redhat.com>
gcc/cp/
* parser.cc (cp_parser_lambda_declarator_opt): Don't diagnose
conflicting specifiers here.
(cp_storage_class_name): New variable.
(cp_parser_decl_specifier_seq): When setting conflicting_specifiers_p
for the first time, diagnose which exact specifiers conflict.
(cp_parser_set_storage_class): Likewise. Move storage_class
computation earlier.
* decl.cc (grokdeclarator): Don't diagnose conflicting specifiers
here, just return error_mark_node.
gcc/testsuite/
* g++.dg/diagnostic/conflicting-specifiers-1.C: Adjust expected
diagnostics.
* g++.dg/parse/typedef8.C: Likewise.
* g++.dg/parse/crash39.C: Likewise.
* g++.dg/other/mult-stor1.C: Likewise.
* g++.dg/cpp2a/constinit3.C: Likewise.
Jeff Law [Tue, 27 Sep 2022 05:44:38 +0000 (01:44 -0400)]
Fix ICEs due to recent jump-to-return optimization
gcc/
* cfgrtl.cc (fixup_reorder_chain): Verify that simple_return
and return are available before trying to use them.
Jakub Jelinek [Tue, 27 Sep 2022 06:04:06 +0000 (08:04 +0200)]
c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]
The following patch implements the compiler part of C++23
P1467R9 - Extended floating-point types and standard names
by introducing _Float{16,32,64,128} as keywords and builtin types
like they are implemented for C already since GCC 7, with DF{16,32,64,128}_
mangling.
It also introduces _Float{32,64,128}x for C++ with the
https://github.com/itanium-cxx-abi/cxx-abi/pull/147
proposed mangling of DF{32,64,128}x.
The patch doesn't add anything for bfloat16_t support, as right now
__bf16 type refuses all conversions and arithmetic operations.
The patch wants to keep backwards compatibility with how __float128 has
been handled in C++ before, both for mangling and behavior in binary
operations, overload resolution etc. So, there are some backend changes
where for C __float128 and _Float128 are the same type (float128_type_node
and float128t_type_node are the same pointer), but for C++ they are distinct
types which mangle differently and _Float128 is treated as extended
floating-point type while __float128 is treated as non-standard floating
point type. The various C++23 changes about how floating-point types
are changed are actually implemented as written in the spec only if at least
one of the types involved is _Float{16,32,64,128,32x,64x,128x} (_FloatNx are
also treated as extended floating-point types); the previous behavior is kept
otherwise. For float/double/long double the rules are actually written so that
they behave the same as before.
There is some backwards incompatibility at least on x86 regarding _Float16,
because that type was already used by that name and with the DF16_ mangling
(but only since GCC 12 and I think it isn't that widely used in the wild
yet). E.g. config/i386/avx512fp16intrin.h shows the issues, where
in C or in GCC 12 in C++ one could pass 0.0f to a builtin taking _Float16
argument, but with the changes that is not possible anymore, one needs
to either use 0.0f16 or (_Float16) 0.0f.
We have also a problem with glibc headers, where since glibc 2.27
math.h and complex.h aren't compilable with these changes. One gets
errors like:
In file included from /usr/include/math.h:43,
from abc.c:1:
/usr/include/bits/floatn.h:86:9: error: multiple types in one declaration
86 | typedef __float128 _Float128;
| ^~~~~~~~~~
/usr/include/bits/floatn.h:86:20: error: declaration does not declare anything [-fpermissive]
86 | typedef __float128 _Float128;
| ^~~~~~~~~
In file included from /usr/include/bits/floatn.h:119:
/usr/include/bits/floatn-common.h:214:9: error: multiple types in one declaration
214 | typedef float _Float32;
| ^~~~~
/usr/include/bits/floatn-common.h:214:15: error: declaration does not declare anything [-fpermissive]
214 | typedef float _Float32;
| ^~~~~~~~
/usr/include/bits/floatn-common.h:251:9: error: multiple types in one declaration
251 | typedef double _Float64;
| ^~~~~~
/usr/include/bits/floatn-common.h:251:16: error: declaration does not declare anything [-fpermissive]
251 | typedef double _Float64;
| ^~~~~~~~
This is from snippets like:
/* The remaining of this file provides support for older compilers. */
# if __HAVE_FLOAT128
/* The type _Float128 exists only since GCC 7.0. */
# if !__GNUC_PREREQ (7, 0) || defined __cplusplus
typedef __float128 _Float128;
# endif
where it hardcodes that C++ doesn't have _Float{16,32,64,128,32x,64x,128x} support nor
{f,F}{16,32,64,128}{,x} literal suffixes nor _Complex _Float{16,32,64,128,32x,64x,128x}.
The patch fixincludes this for now and hopefully if this is committed, then
glibc can change those. The patch changes those
# if !__GNUC_PREREQ (7, 0) || defined __cplusplus
conditions to
# if !__GNUC_PREREQ (7, 0) || (defined __cplusplus && !__GNUC_PREREQ (13, 0))
Another thing is mangling, as said above, Itanium C++ ABI specifies
DF <number> _ as _Float{16,32,64,128} mangling, but GCC was implementing
a mangling incompatible with that starting with DF for fixed point types.
Fixed point was never supported in C++ though, I believe the reason why
the mangling has been added was that due to a bug it would leak into the
C++ FE through decltype (0.0r) etc. But that was fixed shortly after the
mangling was added (I think in the same GCC release cycle), so we
now reject 0.0r etc. in C++. If we ever need the fixed point mangling,
I think it can be readded but better with a different prefix so that it
doesn't conflict with the published standard manglings. So, this patch
also kills the fixed point mangling and implements the DF <number> _
demangling.
The patch predefines __STDCPP_FLOAT{16,32,64,128}_T__ macros when
those types are available, but only for C++23, while the underlying types
are available in C++98 and later including the {f,F}{16,32,64,128} literal
suffixes (but those with a pedwarn for C++20 and earlier). My understanding
is that it needs to be predefined by the compiler; on the other hand,
predefining even for older modes when <stdfloat> is a new C++23 header
would be weird. One can find out if _Float{16,32,64,128,32x,64x,128x} is
supported in C++ by
__GNUC__ >= 13 && defined(__FLT{16,32,64,128,32X,64X,128X}_MANT_DIG__)
(but that doesn't work well with older G++ 13 snapshots).
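The check above can be turned into a small guard, shown here for _Float64 only. HAVE_CXX_FLOAT64, f64 and half_of are hypothetical names; the extra !defined(__clang__) test is an assumption added because other compilers also define __GNUC__. Where the type is unavailable the sketch falls back to double.

```cpp
#include <cassert>

#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 13 \
    && defined(__FLT64_MANT_DIG__)
#define HAVE_CXX_FLOAT64 1  // illustrative macro, not defined by GCC
using f64 = _Float64;
#else
#define HAVE_CXX_FLOAT64 0
using f64 = double;  // fallback where _Float64 is not usable in C++
#endif

// Explicit casts avoid relying on implicit conversion rules or on the
// f64 literal suffix, which gets a pedwarn before C++23.
inline f64 half_of(f64 v) { return v / static_cast<f64>(2); }

inline void demo_half_of() {
  assert(half_of(static_cast<f64>(3)) == static_cast<f64>(1.5));
}
```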
As for std::bfloat16_t, three targets (aarch64, arm and x86) apparently
"support" __bf16 type which has the bfloat16 format, but isn't really
usable, e.g. {aarch64,arm,ix86}_invalid_conversion disallow any conversions
from or to type with BFmode, {aarch64,arm,ix86}_invalid_unary_op disallows
any unary operations on those except for ADDR_EXPR and
{aarch64,arm,ix86}_invalid_binary_op disallows any binary operation on
those. So, I think we satisfy:
"If the implementation supports an extended floating-point type with the
properties, as specified by ISO/IEC/IEEE 60559, of radix (b) of 2, storage
width in bits (k) of 16, precision in bits (p) of 8, maximum exponent (emax)
of 127, and exponent field width in bits (w) of 8, then the typedef-name
std::bfloat16_t is defined in the header <stdfloat> and names such a type,
the macro __STDCPP_BFLOAT16_T__ is defined, and the floating-point literal
suffixes bf16 and BF16 are supported."
because we don't really support those right now.
2022-09-27 Jakub Jelinek <jakub@redhat.com>
PR c++/106652
PR c++/85518
gcc/
* tree-core.h (enum tree_index): Add TI_FLOAT128T_TYPE
enumerator.
* tree.h (float128t_type_node): Define.
* tree.cc (build_common_tree_nodes): Initialize float128t_type_node.
* builtins.def (DEF_FLOATN_BUILTIN): Adjust comment now that
_Float<N> is supported in C++ too.
* config/i386/i386.cc (ix86_mangle_type): Only mangle as "g"
float128t_type_node.
* config/i386/i386-builtins.cc (ix86_init_builtin_types): Use
float128t_type_node for __float128 instead of float128_type_node
and create it if NULL.
* config/i386/avx512fp16intrin.h (_mm_setzero_ph, _mm256_setzero_ph,
_mm512_setzero_ph, _mm_set_sh, _mm_load_sh): Use 0.0f16 instead of
0.0f.
* config/ia64/ia64.cc (ia64_init_builtins): Use
float128t_type_node for __float128 instead of float128_type_node
and create it if NULL.
* config/rs6000/rs6000-c.cc (is_float128_p): Also return true
for float128t_type_node if non-NULL.
* config/rs6000/rs6000.cc (rs6000_mangle_type): Don't mangle
float128_type_node as "u9__ieee128".
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Use
float128t_type_node for __float128 instead of float128_type_node
and create it if NULL.
gcc/c-family/
* c-common.cc (c_common_reswords): Change _Float{16,32,64,128} and
_Float{32,64,128}x flags from D_CONLY to 0.
(shorten_binary_op): Punt if common_type returns error_mark_node.
(shorten_compare): Likewise.
(c_common_nodes_and_builtins): For C++ record _Float{16,32,64,128}
and _Float{32,64,128}x builtin types if available. For C++
clear float128t_type_node.
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__STDCPP_FLOAT{16,32,64,128}_T__ for C++23 if supported.
* c-lex.cc (interpret_float): For q/Q suffixes prefer
float128t_type_node over float128_type_node. Allow
{f,F}{16,32,64,128} suffixes for C++ if supported with pedwarn
for C++20 and older. Allow {f,F}{32,64,128}x suffixes for C++
with pedwarn. Don't call excess_precision_type for C++.
gcc/cp/
* cp-tree.h (cp_compare_floating_point_conversion_ranks): Implement
P1467R9 - Extended floating-point types and standard names except
for std::bfloat16_t for now. Declare.
(extended_float_type_p): New inline function.
* mangle.cc (write_builtin_type): Mangle float{16,32,64,128}_type_node
as DF{16,32,64,128}_. Mangle float{32,64,128}x_type_node as
DF{32,64,128}x. Remove FIXED_POINT_TYPE mangling that conflicts
with that.
* typeck2.cc (check_narrowing): If one of ftype or type is extended
floating-point type, compare floating-point conversion ranks.
* parser.cc (cp_keyword_starts_decl_specifier_p): Handle
CASE_RID_FLOATN_NX.
(cp_parser_simple_type_specifier): Likewise and diagnose missing
_Float<N> or _Float<N>x support if not supported by target.
* typeck.cc (cp_compare_floating_point_conversion_ranks): New function.
(cp_common_type): If both types are REAL_TYPE and one or both are
extended floating-point types, select common type based on comparison
of floating-point conversion ranks and subranks.
(cp_build_binary_op): Diagnose operation with floating point arguments
with unordered conversion ranks.
* call.cc (standard_conversion): For floating-point conversion, if
either from or to are extended floating-point types, set conv->bad_p
for implicit conversion from larger to smaller conversion rank or
with unordered conversion ranks.
(convert_like_internal): Emit a pedwarn on such conversions.
(build_conditional_expr): Diagnose operation with floating point
arguments with unordered conversion ranks.
(convert_arg_to_ellipsis): Don't promote extended floating-point types
narrower than double to double.
(compare_ics): Implement P1467R9 [over.ics.rank]/4 changes.
gcc/testsuite/
* g++.dg/cpp23/ext-floating1.C: New test.
* g++.dg/cpp23/ext-floating2.C: New test.
* g++.dg/cpp23/ext-floating3.C: New test.
* g++.dg/cpp23/ext-floating4.C: New test.
* g++.dg/cpp23/ext-floating5.C: New test.
* g++.dg/cpp23/ext-floating6.C: New test.
* g++.dg/cpp23/ext-floating7.C: New test.
* g++.dg/cpp23/ext-floating8.C: New test.
* g++.dg/cpp23/ext-floating9.C: New test.
* g++.dg/cpp23/ext-floating10.C: New test.
* g++.dg/cpp23/ext-floating.h: New file.
* g++.target/i386/float16-1.C: Adjust expected diagnostics.
libcpp/
* expr.cc (interpret_float_suffix): Allow {f,F}{16,32,64,128} and
{f,F}{32,64,128}x suffixes for C++.
include/
* demangle.h (enum demangle_component_type): Add
DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE.
(struct demangle_component): Add u.s_extended_builtin member.
libiberty/
* cp-demangle.c (d_dump): Handle
DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE. Don't handle
DEMANGLE_COMPONENT_FIXED_TYPE.
(d_make_extended_builtin_type): New function.
(cplus_demangle_builtin_types): Add _Float entry.
(cplus_demangle_type): For DF demangle it as _Float<N> or
_Float<N>x rather than fixed point which conflicts with it.
(d_count_templates_scopes): Handle
DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE. Just break; for
DEMANGLE_COMPONENT_FIXED_TYPE.
(d_find_pack): Handle DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE.
Don't handle DEMANGLE_COMPONENT_FIXED_TYPE.
(d_print_comp_inner): Likewise.
* cp-demangle.h (D_BUILTIN_TYPE_COUNT): Bump.
* testsuite/demangle-expected: Replace _Z3xxxDFyuVb test
with _Z3xxxDF16_DF32_DF64_DF128_CDF16_Vb. Add
_Z3xxxDF32xDF64xDF128xCDF32xVb test.
fixincludes/
* inclhack.def (glibc_cxx_floatn_1, glibc_cxx_floatn_2,
glibc_cxx_floatn_3): New fixes.
* tests/base/bits/floatn.h: New file.
* fixincl.x: Regenerated.
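As an illustration of the DF<N>_ and DF<N>x manglings named in the mangle.cc
entry above, here is a minimal sketch. mangle_floatn is a hypothetical helper
written for this note; it is not the code in write_builtin_type.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the mangled builtin-type code for _Float<N>
// (DF<N>_) or _Float<N>x (DF<N>x), following the scheme in the ChangeLog.
inline std::string mangle_floatn(int bits, bool extended)
{
  return "DF" + std::to_string(bits) + (extended ? "x" : "_");
}
```

Concatenating the results for 16, 32, 64 and 128 gives DF16_DF32_DF64_DF128_,
which is the leading part of the parameter list in the new demangler test name
_Z3xxxDF16_DF32_DF64_DF128_CDF16_Vb.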
Meghan Denny [Tue, 27 Sep 2022 03:51:52 +0000 (23:51 -0400)]
Updated constants from <https://dwarfstd.org/Languages.php>
include/
* dwarf2.h: Update with additional languages from dwarf
standard.
GCC Administrator [Tue, 27 Sep 2022 00:17:52 +0000 (00:17 +0000)]
Daily bump.
Jonathan Wakely [Mon, 26 Sep 2022 17:59:45 +0000 (18:59 +0100)]
libstdc++: Update std::pointer_traits to match new LWG 3545 wording
It was pointed out in recent LWG 3545 discussion that having a
constrained partial specialization of std::pointer_traits can cause
ambiguities with program-defined specializations. For example, the
addition to the testcase has:
template<typename P> requires std::derived_from<P, base_type>
struct std::pointer_traits<P>;
This would be ambiguous with the library's own constrained partial
specialization:
template<typename Ptr> requires requires { typename Ptr::element_type; }
struct std::pointer_traits<Ptr>;
Neither specialization is more specialized than the other for a type
that is derived from base_type and also has an element_type member.
The solution is to remove the library's partial specialization, and do
the check for Ptr::element_type in the __ptr_traits_elem helper (which
is what we already do for !__cpp_concepts anyway).
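The idea of moving the element_type check into a helper can be sketched as
follows. This is not the libstdc++ implementation: elem_helper, first_arg,
void_t_, derived_ptr and fancy are hypothetical names, and the sketch uses
plain SFINAE so that it does not depend on concepts.

```cpp
#include <cassert>
#include <type_traits>

// Local void_t equivalent so the sketch also builds before C++17.
template<typename...> struct make_void_ { using type = void; };
template<typename... Ts> using void_t_ = typename make_void_<Ts...>::type;

// Fallback: for Ptr<T, Args...>, use the first template argument T.
template<typename Ptr> struct first_arg {};
template<template<typename, typename...> class Ptr, typename T, typename... Args>
struct first_arg<Ptr<T, Args...>> { using type = T; };

// Primary template: Ptr has no element_type member, so use the fallback.
template<typename Ptr, typename = void>
struct elem_helper : first_arg<Ptr> {};

// Chosen when Ptr::element_type exists. The check lives in the helper,
// so the traits class itself needs no constrained partial specialization.
template<typename Ptr>
struct elem_helper<Ptr, void_t_<typename Ptr::element_type>>
{ using type = typename Ptr::element_type; };

struct base_type {};
struct derived_ptr : base_type { using element_type = int; };  // has the member
template<typename T> struct fancy {};                           // does not
```

Because the detection happens inside the helper, a program-defined partial
specialization of the traits class has no library specialization to compete
with.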
libstdc++-v3/ChangeLog:
* include/bits/ptr_traits.h (__ptr_traits_elem) [__cpp_concepts]:
Also define the __ptr_traits_elem class template for the
concepts case.
(pointer_traits<Ptr>): Remove constrained partial
specialization.
* testsuite/20_util/pointer_traits/lwg3545.cc: Check for
ambiguity with program-defined partial specialization.
Jonathan Wakely [Fri, 23 Sep 2022 22:21:11 +0000 (23:21 +0100)]
libstdc++: Use new built-ins for std::is_convertible traits
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_convertible, is_convertible_v):
Define using new built-in.
(is_nothrow_convertible, is_nothrow_convertible_v): Likewise.
Martin Liska [Mon, 26 Sep 2022 19:04:11 +0000 (21:04 +0200)]
docs: add missing dash in option name
gcc/ChangeLog:
* doc/invoke.texi: Add missing dash for
Wanalyzer-exposure-through-uninit-copy.
Marek Polacek [Fri, 23 Sep 2022 22:06:34 +0000 (18:06 -0400)]
c++: P2513R4, char8_t Compatibility and Portability Fix [PR106656]
P0482R6, which added char8_t, didn't allow
const char arr[] = u8"howdy";
because it said "Declarations of arrays of char may currently be initialized
with UTF-8 string literals. Under this proposal, such initializations would
become ill-formed." This caused too many issues, so P2513R4 alleviates some
of those compatibility problems. In particular, "Arrays of char or unsigned
char may now be initialized with a UTF-8 string literal." This restriction
has been lifted for initialization only, not implicit conversions. Also,
my reading is that 'signed char' was excluded from the allowable conversions.
This is supposed to be treated as a DR in C++20.
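A small sketch of code that relies on the relaxed rule while staying portable
to compilers that have not implemented it. It assumes the feature-test value
202207L that P2513R4 specifies for __cpp_char8_t; the commit message itself
does not state the number. The name arr is taken from the example above.

```cpp
#include <cassert>

// Assumption: __cpp_char8_t >= 202207L signals that P2513R4 is implemented.
// When char8_t is not enabled at all, u8"" literals are plain char arrays.
#if !defined(__cpp_char8_t) || __cpp_char8_t >= 202207L
const char arr[] = u8"howdy";   // initialization from a UTF-8 literal
#else
const char arr[] = "howdy";     // fallback for compilers without the fix
#endif
```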
PR c++/106656
gcc/c-family/ChangeLog:
* c-cppbuiltin.cc (c_cpp_builtins): Update value of __cpp_char8_t
for C++20.
gcc/cp/ChangeLog:
* typeck2.cc (array_string_literal_compatible_p): Allow
initializing arrays of char or unsigned char by a UTF-8 string literal.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/feat-cxx2b.C: Adjust.
* g++.dg/cpp2a/feat-cxx2a.C: Likewise.
* g++.dg/ext/char8_t-feature-test-macro-2.C: Likewise.
* g++.dg/ext/char8_t-init-2.C: Likewise.
* g++.dg/cpp2a/char8_t3.C: New test.
* g++.dg/cpp2a/char8_t4.C: New test.
Aldy Hernandez [Fri, 23 Sep 2022 17:47:19 +0000 (19:47 +0200)]
Optimize [0 = x & MASK] in range-ops.
For [0 = x & MASK], we can determine that the nonzero bits of x are limited to ~MASK. This is
something we're picking up in DOM thanks to maybe_set_nonzero_bits,
but is something we should handle natively.
This is a good example of how much easier to maintain the range-ops
entries are versus the ad-hoc pattern matching stuff we had to do
before. For the curious, compare the changes to range-op here,
versus maybe_set_nonzero_bits.
I'm leaving the call to maybe_set_nonzero_bits until I can properly
audit it to make sure we're catching it all in range-ops. It won't
hurt, since both set_range_info() and set_nonzero_bits() are
intersect operations, so we'll never lose information if we do both.
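The property being exploited can be checked by brute force on a narrow type.
This is a sketch of the arithmetic only, not of the range-op.cc code;
possible_nonzero_bits and holds_for_all_x are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// If (x & mask) == 0, every bit set in mask must be clear in x, so the
// only bits of x that may be nonzero are those in ~mask.
inline std::uint8_t possible_nonzero_bits(std::uint8_t mask)
{
  return static_cast<std::uint8_t>(~mask);
}

// Exhaustively verify the claim for every 8-bit value of x.
inline bool holds_for_all_x(std::uint8_t mask)
{
  const unsigned allowed = possible_nonzero_bits(mask);
  for (unsigned x = 0; x < 256; ++x)
    if ((x & mask) == 0 && (x & ~allowed) != 0)
      return false;
  return true;
}
```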
PR tree-optimization/107009
gcc/ChangeLog:
* range-op.cc (operator_bitwise_and::op1_range): Optimize 0 = x & MASK.
(range_op_bitwise_and_tests): New test.
Marek Polacek [Mon, 26 Sep 2022 14:21:38 +0000 (10:21 -0400)]
c++: Instantiate less when evaluating __is_convertible
Jon reported that evaluating __is_convertible in a test led to
instantiating something ill-formed and so we failed to compile the test.
__is_convertible doesn't and shouldn't need to instantiate so much, so
let's limit it with a cp_unevaluated guard. Use a helper function to
implement both built-ins.
PR c++/106784
gcc/cp/ChangeLog:
* method.cc (is_convertible_helper): New.
(is_convertible): Use it.
(is_nothrow_convertible): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_convertible3.C: New test.
* g++.dg/ext/is_nothrow_convertible3.C: New test.
Patrick Palka [Mon, 26 Sep 2022 15:30:17 +0000 (11:30 -0400)]
c++ modules: variable template partial spec fixes [PR107033]
In r13-2775-g32d8123cd6ce87 I missed that we need to adjust the call to
add_mergeable_specialization in the MK_partial case to correctly handle
variable template partial specializations (it currently assumes we're
always dealing with one for a class template). This fixes an ICE when
converting the testcase from that commit to use an importable header
instead of a named module.
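For readers unfamiliar with the construct, a variable template partial
specialization looks like the sketch below. is_pointer_sketch is a
hypothetical name and is not taken from the testcase.

```cpp
#include <cassert>

// Primary variable template.
template<typename T> constexpr bool is_pointer_sketch = false;

// Partial specialization of the variable template for pointer types.
// The specialized entity is a variable, not a class.
template<typename T> constexpr bool is_pointer_sketch<T*> = true;
```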
PR c++/107033
gcc/cp/ChangeLog:
* module.cc (trees_in::decl_value): In the MK_partial case for
a variable template partial specialization, pass decl_p=true to
add_mergeable_specialization, and set spec to the VAR_DECL not
the TEMPLATE_DECL.
* pt.cc (add_mergeable_specialization): For a variable template
partial specialization, set the TREE_TYPE of the new
DECL_TEMPLATE_SPECIALIZATIONS node to the TREE_TYPE of the
VAR_DECL not the VAR_DECL itself.
gcc/testsuite/ChangeLog:
* g++.dg/modules/partial-2.cc, g++.dg/modules/partial-2.h: New
files, factored out from ...
* g++.dg/modules/partial-2_a.C, g++.dg/modules/partial-2_b.C: ...
these.
* g++.dg/modules/partial-2_c.H: New test.
* g++.dg/modules/partial-2_d.C: New test.
Jeff Law [Mon, 26 Sep 2022 15:14:55 +0000 (09:14 -0600)]
Update my address and DCO entry in MAINTAINERS file
/
* MAINTAINERS: Update my email address and DCO entry.
Aldy Hernandez [Fri, 23 Sep 2022 17:47:33 +0000 (19:47 +0200)]
Set ranges from unreachable edges for all known ranges.
In the conversion of DOM+evrp to DOM+ranger, we missed that evrp was
exporting ranges for unreachable edges for all SSA names for which we
have ranges. Instead we have only been exporting ranges for the
SSA name in the final conditional to the BB involving the unreachable
edge.
This patch adjusts DOM to iterate over the exports, similarly
to what evrp was doing.
Note that I also noticed that we don't calculate the nonzero bit mask
for op1, when 0 = op1 & MASK. This isn't needed for this PR,
since maybe_set_nonzero_bits() is chasing the definition and
parsing the bitwise and on its own. However, I'll be adding the
functionality for completeness' sake, plus we could probably drop the
maybe_set_nonzero_bits legacy call entirely.
PR tree-optimization/107009
gcc/ChangeLog:
* tree-ssa-dom.cc
(dom_opt_dom_walker::set_global_ranges_from_unreachable_edges):
Iterate over exports.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr107009.c: New test.