platform/upstream/gcc.git
2 years agoc: Diagnose "enum tag;" after definition [PR107164]
Joseph Myers [Tue, 18 Oct 2022 23:25:47 +0000 (23:25 +0000)]
c: Diagnose "enum tag;" after definition [PR107164]

As noted in bug 101764, a declaration "enum tag;" is invalid in
standard C after a definition, as well as when no definition is
visible; we had a pedwarn-if-pedantic for the forward declaration
case, but were missing one for the other case.  Add that missing
diagnostic (if pedantic only).

(These diagnostics will need to be appropriately conditioned when
support is added for C2x enums with fixed underlying type, since "enum
tag : type;" is OK both before and after a definition.)

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

PR c/107164

gcc/c/
* c-decl.cc (shadow_tag_warned): If pedantic, diagnose "enum tag;"
with previous declaration visible.

gcc/testsuite/
* gcc.dg/c99-tag-4.c, gcc.dg/c99-tag-5.c, gcc.dg/c99-tag-6.c: New
tests.

2 years agotestsuite: Only run -fcf-protection test on i?86/x86_64 [PR107213]
Marek Polacek [Tue, 11 Oct 2022 16:51:40 +0000 (12:51 -0400)]
testsuite: Only run -fcf-protection test on i?86/x86_64 [PR107213]

This test fails on non-i?86/x86_64 targets because on those targets
we get

  error: '-fcf-protection=full' is not supported for this target

so this patch limits where the test is run.

PR testsuite/107213

gcc/testsuite/ChangeLog:

* c-c++-common/pointer-to-fn1.c: Only run on i?86/x86_64.

2 years agolibiberty: Fix C89-isms in configure tests
Florian Weimer [Tue, 18 Oct 2022 14:58:48 +0000 (16:58 +0200)]
libiberty: Fix C89-isms in configure tests

libiberty/

* acinclude.m4 (ac_cv_func_strncmp_works): Add missing
int return type and parameter list to the definition of main.
Include <stdlib.h> and <string.h> for prototypes.
(ac_cv_c_stack_direction): Add missing
int return type and parameter list to the definitions of
main, find_stack_direction.  Include <stdlib.h> for exit
prototype.
* configure: Regenerate.

2 years agolibsanitizer: Avoid implicit function declaration in configure test
Florian Weimer [Tue, 18 Oct 2022 14:58:48 +0000 (16:58 +0200)]
libsanitizer: Avoid implicit function declaration in configure test

libsanitizer/

* configure.ac (sanitizer_supported): Include <unistd.h> for
syscall prototype.
* configure: Regenerate.

2 years agoc++ modules: stream non-trailing default targs [PR105045]
Patrick Palka [Tue, 18 Oct 2022 14:57:30 +0000 (10:57 -0400)]
c++ modules: stream non-trailing default targs [PR105045]

This fixes the below testcase in which we neglect to stream the default
argument for T only because the subsequent parameter U doesn't also have
a default argument.

PR c++/105045

gcc/cp/ChangeLog:

* module.cc (trees_out::tpl_parms_fini): Don't assume default
template arguments must be trailing.
(trees_in::tpl_parms_fini): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr105045_a.C: New test.
* g++.dg/modules/pr105045_b.C: New test.

2 years agolibstdc++: Implement ranges::stride_view from P1899R3
Patrick Palka [Tue, 18 Oct 2022 14:51:52 +0000 (10:51 -0400)]
libstdc++: Implement ranges::stride_view from P1899R3

libstdc++-v3/ChangeLog:

* include/std/ranges (stride_view): Define.
(stride_view::_Iterator): Define.
(views::__detail::__can_stride_view): Define.
(views::_Stride, views::stride): Define.
* testsuite/std/ranges/adaptors/stride/1.cc: New test.

2 years agoc: C2x enums wider than int [PR36113]
Joseph Myers [Tue, 18 Oct 2022 14:07:27 +0000 (14:07 +0000)]
c: C2x enums wider than int [PR36113]

C2x has two enhancements to enumerations: allowing enumerations whose
values do not all fit in int (similar to an existing extension), and
allowing an underlying type to be specified (as in C++).  This patch
implements the first of those enhancements.

Apart from adjusting diagnostics to reflect this being a standard
feature, there are some semantics differences from the previous
extension:

* The standard feature gives all the values of such an enum the
  enumerated type (while keeping type int if that can represent all
  values of the enumeration), where previously GCC only gave those
  values outside the range of int the enumerated type.  This change
  was previously requested in PR 36113; it seems unlikely that people
  are depending on the detail of the previous extension that some
  enumerators had different types to others depending on whether their
  values could be represented as int, and this patch makes the change
  to types of enumerators unconditionally (if that causes problems in
  practice we could always make it conditional on C2x mode instead).

* The types *while the enumeration is being defined*, for enumerators
  that can't be represented as int, are now based more directly on the
  types of the expressions used, rather than a possibly different type
  with the right precision constructed using c_common_type_for_size.
  Again, this change is made unconditionally.

* Where overflow (or wraparound to 0, in the case of an unsigned type)
  when 1 is implicitly added to determine the value of the next
  enumerator was previously an error, it now results in a wider type
  with the same signedness (as the while-being-defined type of the
  previous enumerator) being chosen, if available.

When a type is chosen in such an overflow case, or when a type is
chosen for the underlying integer type of the enumeration, it's
possible that (unsigned) __int128 is chosen.  Although C2x allows for
such types wider than intmax_t to be considered extended integer
types, we don't have various features required to do so (integer
constant suffixes; sufficient library support would also be needed to
define the associated macros for printf/scanf conversions, and
<stdint.h> typedef names would need to be defined).  Thus, there are
also pedwarns for exceeding the range of intmax_t / uintmax_t, as
while in principle exceeding that range is OK, it's only OK in a
context where the relevant types meet the requirements for extended
integer types, which does not currently apply here.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.  Also
manually checked diagnostics for c2x-enum-3.c with -m32 to confirm the
diagnostics in that { target { ! int128 } } test are as expected.

PR c/36113

gcc/c-family/
* c-common.cc (c_common_type_for_size): Add fallback to
widest_unsigned_literal_type_node or
widest_integer_literal_type_node for precision that may not
exactly match the precision of those types.

gcc/c/
* c-decl.cc (finish_enum): If any enumerators do not fit in int,
convert all to the type of the enumeration.  pedwarn if no integer
type fits all enumerators and default to
widest_integer_literal_type_node in that case.  Otherwise pedwarn
for type wider than intmax_t.
(build_enumerator): pedwarn for enumerators outside the range of
uintmax_t or intmax_t, and otherwise use pedwarn_c11 for
enumerators outside the range of int.  On overflow, attempt to
find a wider type that can hold the value of the next enumerator.
Do not convert value to type determined with
c_common_type_for_size.

gcc/testsuite/
* gcc.dg/c11-enum-1.c, gcc.dg/c11-enum-2.c, gcc.dg/c11-enum-3.c,
gcc.dg/c2x-enum-1.c, gcc.dg/c2x-enum-2.c, gcc.dg/c2x-enum-3.c,
gcc.dg/c2x-enum-4.c, gcc.dg/c2x-enum-5.c: New tests.
* gcc.dg/pr30260.c: Explicitly use -std=gnu11.  Update expected
diagnostics.
* gcc.dg/torture/pr25183.c: Update expected diagnostics.

2 years agoipa-cp: Better representation of aggregate values in call contexts
Martin Jambor [Tue, 18 Oct 2022 12:14:26 +0000 (14:14 +0200)]
ipa-cp: Better representation of aggregate values in call contexts

This patch extends the previous one by using the same data structure
to represent aggregate values in classes ipa_auto_call_arg_values and
ipa_call_arg_values.

This usually simplifies handling and makes allocations of memory much
cheaper because only a single vectore is needed, as opposed to vectors
with each element pointing at other vecs.  The only functions which
unfortunately are a bit more complec are estimate_local_effects in
ipa-cp.cc and ipa_call_context::equal_to but I hope not too much - the
latter could probably be shorteneed at the expense of readability.

The patch removes types ipa_agg_value ipa_agg_value_set which is no
longer used with it.  This means that we could replace the "_argagg_"
part of the types introduced by the previous patches with more
reasonable "_agg_" - possibly as a follow-up patch.

gcc/ChangeLog:

2022-08-26  Martin Jambor  <mjambor@suse.cz>

* ipa-prop.h (ipa_agg_value): Remove type.
(ipa_agg_value_set): Likewise.
(ipa_copy_agg_values): Remove function.
(ipa_release_agg_values): Likewise.
(ipa_auto_call_arg_values) Add a forward declaration.
(ipa_call_arg_values): Likewise.
(class ipa_argagg_value_list): New constructors, added member function
value_for_index_p.
(class ipa_auto_call_arg_values): Removed the destructor and member
function safe_aggval_at.  Use ipa_argagg_values for m_known_aggs.
(class ipa_call_arg_values): Removed member function safe_aggval_at.
Use ipa_argagg_values for m_known_aggs.
(ipa_get_indirect_edge_target): Removed declaration.
(ipa_find_agg_cst_for_param): Likewise.
(ipa_find_agg_cst_from_init): New declaration.
(ipa_agg_value_from_jfunc): Likewise.
(ipa_agg_value_set_from_jfunc): Removed declaration.
(ipa_push_agg_values_from_jfunc): New declaration.
* ipa-cp.cc (ipa_agg_value_from_node): Renamed to
ipa_agg_value_from_jfunc, made public.
(ipa_agg_value_set_from_jfunc): Removed.
(ipa_push_agg_values_from_jfunc): New function.
(ipa_get_indirect_edge_target_1): Removed known_aggs parameter, use
avs for this purpose too.
(ipa_get_indirect_edge_target): Removed the overload working on
ipa_auto_call_arg_values, use ipa_argagg_value_list in the remaining
one.
(devirtualization_time_bonus): Use ipa_argagg_value_list and
ipa_get_indirect_edge_target_1 instead of
ipa_get_indirect_edge_target.
(context_independent_aggregate_values): Removed function.
(gather_context_independent_values): Work on ipa_argagg_value_list.
(estimate_local_effects): Likewise, define some iterator variables
only in the construct where necessary.
(ipcp_discover_new_direct_edges): Adjust the call to
ipa_get_indirect_edge_target_1.
(push_agg_values_for_index_from_edge): Adjust the call
ipa_agg_value_from_node which has been renamed to
ipa_agg_value_from_jfunc.
* ipa-fnsummary.cc (evaluate_conditions_for_known_args): Work on
ipa_argagg_value_list.
(evaluate_properties_for_edge): Replace manual filling in aggregate
values with call to ipa_push_agg_values_from_jfunc.
(estimate_calls_size_and_time): Work on ipa_argagg_value_list.
(ipa_cached_call_context::duplicate_from): Likewise.
(ipa_cached_call_context::release): Likewise.
(ipa_call_context::equal_to): Likewise.
* ipa-prop.cc (ipa_find_agg_cst_from_init): Make public.
(ipa_find_agg_cst_for_param): Removed function.
(ipa_find_agg_cst_from_jfunc_items): New function.
(try_make_edge_direct_simple_call): Replace calls to
ipa_agg_value_set_from_jfunc and ipa_find_agg_cst_for_param with
ipa_find_agg_cst_from_init and ipa_find_agg_cst_from_jfunc_items.
(try_make_edge_direct_virtual_call): Replace calls to
ipa_agg_value_set_from_jfunc and ipa_find_agg_cst_for_param with
simple query of constant jump function and a call to
ipa_find_agg_cst_from_jfunc_items.
(ipa_auto_call_arg_values::~ipa_auto_call_arg_values): Removed.

2 years agoipa-cp: Better representation of aggregate values we clone for
Martin Jambor [Tue, 18 Oct 2022 12:14:26 +0000 (14:14 +0200)]
ipa-cp: Better representation of aggregate values we clone for

This patch replaces linked lists of ipa_agg_replacement_value with
vectors of similar structures called ipa_argagg_value and simplifies
how we compute them in the first place.  Having a vector should also
result in less overhead when allocating and because we keep it sorted,
it leads to logarithmic searches.

The slightly obnoxious "argagg" bit in the name can be changed into
"agg" after the next patch removes our current ipa_agg_value type.

The patch also introduces type ipa_argagg_value_list which serves as a
common view into a vector of ipa_argagg_value structures regardless
whether they are stored in GC memory (required for IPA-CP
transformation summary because we store trees) or in an auto_vec which
is hopefully usually only allocated on stack.

The calculation of aggreagete costant values for a given subsert of
callers is then rewritten to compute known constants for each
edge (some pruning to skip obviously not needed is still employed and
should not be really worse than what I am replacing) and these vectors
are there intersected, which can be done linearly since they are
sorted.  The patch also removes a lot of heap allocations of small
lists of aggregate values and replaces them with stack based
auto_vecs.

As Richard Sandiford suggested, I use std::lower_bound from
<algorithm> rather than re-implementing bsearch for array_slice.  The
patch depends on the patch which adds the ability to construct
array_slices from gc-allocated vectors.

gcc/ChangeLog:

2022-10-17  Martin Jambor  <mjambor@suse.cz>

* ipa-prop.h (IPA_PROP_ARG_INDEX_LIMIT_BITS): New.
(ipcp_transformation): Added forward declaration.
(ipa_argagg_value): New type.
(ipa_argagg_value_list): New type.
(ipa_agg_replacement_value): Removed type.
(ipcp_transformation): Switch from using ipa_agg_replacement_value
to ipa_argagg_value_list.
(ipa_get_agg_replacements_for_node): Removed.
(ipa_dump_agg_replacement_values): Removed declaration.
* ipa-cp.cc: Define INCLUDE_ALGORITHM.
(values_equal_for_ipcp_p): Moved up in the file.
(ipa_argagg_value_list::dump): New function.
(ipa_argagg_value_list::debug): Likewise.
(ipa_argagg_value_list::get_elt): Likewise.
(ipa_argagg_value_list::get_elt_for_index): Likewise.
(ipa_argagg_value_list::get_value): New overloaded functions.
(ipa_argagg_value_list::superset_of_p): New function.
(new ipa_argagg_value_list::push_adjusted_values): Likewise.
(push_agg_values_from_plats): Likewise.
(intersect_argaggs_with): Likewise.
(get_clone_agg_value): Removed.
(ipa_agg_value_from_node): Make last parameter const, use
ipa_argagg_value_list to search values coming from clones.
(ipa_get_indirect_edge_target_1): Use ipa_argagg_value_list to search
values coming from clones.
(ipcp_discover_new_direct_edges): Pass around a vector of
ipa_argagg_values rather than a link list of replacement values.
(cgraph_edge_brings_value_p): Use ipa_argagg_value_list to search
values coming from clones.
(create_specialized_node): Work with a vector of ipa_argagg_values
rather than a link list of replacement values.
(self_recursive_agg_pass_through_p): Make the pointer parameters
const.
(copy_plats_to_inter): Removed.
(intersect_with_plats): Likewise.
(agg_replacements_to_vector): Likewise.
(intersect_with_agg_replacements): Likewise.
(intersect_aggregates_with_edge): Likewise.
(push_agg_values_for_index_from_edge): Likewise.
(push_agg_values_from_edge): Likewise.
(find_aggregate_values_for_callers_subset): Rewrite.
(cgraph_edge_brings_all_agg_vals_for_node): Likewise.
(ipcp_val_agg_replacement_ok_p): Use ipa_argagg_value_list to search
aggregate values.
(decide_about_value): Work with a vector of ipa_argagg_values rather
than a link list of replacement values.
(decide_whether_version_node): Likewise.
(ipa_analyze_node): Check number of parameters, assert that there
are no descriptors when bailing out.
* ipa-prop.cc (ipa_set_node_agg_value_chain): Switch to a vector of
ipa_argagg_value.
(ipa_node_params_t::duplicate): Removed superfluous handling of
ipa_agg_replacement_values.  Name of src parameter removed because
it is no longer used.
(ipcp_transformation_t::duplicate): Replaced duplication of
ipa_agg_replacement_values with copying vector m_agg_values.
(ipa_dump_agg_replacement_values): Removed.
(write_ipcp_transformation_info): Stream the new data-structure
instead of the old.
(read_ipcp_transformation_info): Likewise.
(adjust_agg_replacement_values): Work with ipa_argagg_values instead
of linked lists of ipa_agg_replacement_values, copy the items and
truncate the vector as necessary to keep it sorted instead of marking
items as invalid.  Return one bool if CFG should be updated.
(ipcp_modif_dom_walker): Store ipcp_transformation instead of
linked list of ipa_agg_replacement_values.
(ipcp_modif_dom_walker::before_dom_children): Use
ipa_argagg_value_list instead of walking a list of
ipa_agg_replacement_values.
(ipcp_transform_function): Switch to the new data structure, adjust
dumping.

gcc/testsuite/ChangeLog:

2022-08-15  Martin Jambor  <mjambor@suse.cz>

* gcc.dg/ipa/ipcp-agg-11.c: Adjust dumps.
* gcc.dg/ipa/ipcp-agg-8.c: Likewise.

2 years agolibgcc: Quote variable in Makefile.in
Jonathan Wakely [Wed, 12 Oct 2022 11:35:00 +0000 (12:35 +0100)]
libgcc: Quote variable in Makefile.in

If the xgcc executable has not been built (or has been removed by 'make
clean') then the command to print the multilib dir fails, and so the
MULTIOSDIR variable is empty. That then causes:
/bin/sh: line 0: test: !=: unary operator expected

We can avoid it by quoting the variable.

libgcc/ChangeLog:

* Makefile.in: Quote variable.

2 years agotree-optimization/107302 - fix vec_perm placement for recurrence vect
Richard Biener [Tue, 18 Oct 2022 08:01:45 +0000 (10:01 +0200)]
tree-optimization/107302 - fix vec_perm placement for recurrence vect

The following fixes the VEC_PERM_EXPR placement when the latch
definition is a PHI node.

PR tree-optimization/107302
* tree-vect-loop.cc (vectorizable_recurrence): Fix vec_perm
placement for a PHI latch def.

* gcc.dg/vect/pr107302.c: New testcase.

2 years agoifcvt: Do not lower bitfields if we can't analyze dr's [PR107275]
Andre Vieira [Tue, 18 Oct 2022 09:51:09 +0000 (10:51 +0100)]
ifcvt: Do not lower bitfields if we can't analyze dr's [PR107275]

The ifcvt dead code elimination code was not built to deal with inline
assembly, loops with such would never be if-converted in the past since we can't
do data-reference analysis on them and vectorization would eventually fail.  For
this reason we now also do not lower bitfields if the data-reference analysis
fails, as we would not end up vectorizing it.  As a consequence this also fixes
this PR as the dead code elimination will not run for such cases and wrongfully
eliminate inline assembly statements.

gcc/ChangeLog:

PR tree-optimization/107275
* tree-if-conv.cc (if_convertible_loop_p_1): Move
find_data_references_in_loop call from here...
(if_convertible_loop_p): And move data-reference vector initialization
from here...
(tree_if_conversion):... to here.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr107275.c: New test.

2 years agolibstdc++: Partial library support for std::float{16,32,64,128}_t and std::bfloat16_t
Jakub Jelinek [Tue, 18 Oct 2022 09:37:13 +0000 (11:37 +0200)]
libstdc++: Partial library support for std::float{16,32,64,128}_t and std::bfloat16_t

The following patch is partial support for std::float{16,32,64,128}_t
and std::bfloat16_t in libstdc++.
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1467r9.html
says that <ostream>, <istream>, <charconv> and <complex>
need changes toom, but that isn't implemented so far.
In <cmath> the only thing missing I'm aware of is
std::nextafter std::float16_t and std::bfloat16_t overloads (I think
we probably need to implement that out of line somewhere, or inline? - might
need inline asm barriers) and std::nexttoward overloads (those are
intentional, you said there is a LWG issue about that).
Also, this patch has the glibc 2.26+ std::float128_t support for platforms
where long double isn't IEEE quad format temporarily disabled
because it depends on
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603665.html
changes which aren't in yet.

The patch also doesn't include any testcases to cover the <type_traits>
changes, it isn't clear to me where to put that.

2022-10-18  Jakub Jelinek  <jakub@redhat.com>

PR c++/106652
* include/std/stdfloat: New file.
* include/std/numbers (__glibcxx_numbers): Define and use it
for __float128 explicit instantiations as well as
_Float{16,32,64,128} and __gnu_cxx::__bfloat16_t.
* include/std/atomic (atomic<_Float16>, atomic<_Float32>,
atomic<_Float64>, atomic<_Float128>, atomic<__gnu_cxx::__bfloat16_t>):
New explicit instantiations.
* include/std/type_traits (__is_floating_point_helper<_Float16>,
__is_floating_point_helper<_Float32>,
__is_floating_point_helper<_Float64>,
__is_floating_point_helper<_Float128>,
__is_floating_point_helper<__gnu_cxx::__bfloat16_t>): Likewise.
* include/std/limits (__glibcxx_concat3_, __glibcxx_concat3,
__glibcxx_float_n): Define.
(numeric_limits<_Float16>, numeric_limits<_Float32>,
numeric_limits<_Float64>, numeric_limits<_Float128>,
numeric_limits<__gnu_cxx::__bfloat16_t>): New explicit instantiations.
* include/bits/std_abs.h (abs): New overloads for
_Float{16,32,64,128} and __gnu_cxx::__bfloat16_t.
* include/bits/c++config (_GLIBCXX_LDOUBLE_IS_IEEE_BINARY128): Define
if long double is IEEE quad.
(__gnu_cxx::__bfloat16_t): New using.
* include/c_global/cmath (acos, asin, atan, atan2, ceil, cos, cosh,
exp, fabs, floor, fmod, frexp, ldexp, log, log10, modf, pow, sin,
sinh, sqrt, tan, tanh, fpclassify, isfinite, isinf, isnan, isnormal,
signbit, isgreater, isgreaterequal, isless, islessequal,
islessgreater, isunordered, acosh, asinh, atanh, cbrt, copysign, erf,
erfc, exp2, expm1, fdim, fma, fmax, fmin, hypot, ilogb, lgamma,
llrint, llround, log1p, log2, logb, lrint, lround, nearbyint,
nextafter, remainder, rint, round, scalbln, scalbn, tgamma, trunc,
lerp): New overloads with _Float{16,32,64,128} or
__gnu_cxx::__bfloat16_t types.
* config/os/gnu-linux/os_defines.h (_GLIBCXX_HAVE_FLOAT128_MATH):
Prepare for definition if glibc 2.26 and later implements *f128 APIs
but comment out the actual definition for now.
* include/ext/type_traits.h (__promote<_Float16>, __promote<_Float32>,
__promote<_Float64>, __promote<_Float128>,
__promote<__gnu_cxx::__bfloat16_t>): New specializations.
* include/Makefile.am (std_headers): Add stdfloat.
* include/Makefile.in: Regenerated.
* include/precompiled/stdc++.h: Include stdfloat.
* testsuite/18_support/headers/stdfloat/types_std.cc: New test.
* testsuite/18_support/headers/limits/synopsis_cxx23.cc: New test.
* testsuite/26_numerics/headers/cmath/c99_classification_macros_c++23.cc:
New test.
* testsuite/26_numerics/headers/cmath/functions_std_c++23.cc: New test.
* testsuite/26_numerics/numbers/4.cc: New test.
* testsuite/29_atomics/atomic_float/requirements_cxx23.cc: New test.

2 years agomiddle-end IFN_ASSUME support [PR106654]
Jakub Jelinek [Tue, 18 Oct 2022 08:31:20 +0000 (10:31 +0200)]
middle-end IFN_ASSUME support [PR106654]

My earlier patches gimplify the simplest non-side-effects assumptions
into if (cond) ; else __builtin_unreachable (); and throw the rest
on the floor.
The following patch attempts to do something with the rest too.
For -O0, it throws the more complex assumptions on the floor,
we don't expect optimizations and the assumptions are there to allow
optimizations.  Otherwise arranges for the assumptions to be
visible in the IL as
  .ASSUME (_Z2f4i._assume.0, i_1(D));
call where there is an artificial function like:
bool _Z2f4i._assume.0 (int i)
{
  bool _2;

  <bb 2> [local count: 1073741824]:
  _2 = i_1(D) == 43;
  return _2;

}
with the semantics that there is UB unless the assumption function
would return true.

Aldy, could ranger handle this?  If it sees .ASSUME call,
walk the body of such function from the edge(s) to exit with the
assumption that the function returns true, so above set _2 [true, true]
and from there derive that i_1(D) [43, 43] and then map the argument
in the assumption function to argument passed to IFN_ASSUME (note,
args there are shifted by 1)?

During gimplification it actually gimplifies it into
  [[assume (D.2591)]]
    {
      {
        i = i + 1;
        D.2591 = i == 44;
      }
    }
which is a new GIMPLE_ASSUME statement wrapping a GIMPLE_BIND and
specifying a boolean_type_node variable which contains the result.
The GIMPLE_ASSUME then survives just a couple of passes and is lowered
during gimple lowering into an outlined separate function and
IFN_ASSUME call.  Variables declared inside of the
condition (both static and automatic) just change context, automatic
variables from the caller are turned into parameters (note, as the code
is never executed, I handle this way even non-POD types, we don't need to
bother pretending there would be user copy constructors etc. involved).

The assume_function artificial functions are then optimized until the
new assumptions pass which doesn't do much right now but I'd like to see
there the backwards ranger walk and filling up of SSA_NAME_RANGE_INFO
for the parameters.

There are a few further changes I'd like to do, like ignoring the
.ASSUME calls in inlining size estimations (but haven't figured out where
it is done), or for LTO arrange for the assume functions to be emitted
in all partitions that reference those (usually there will be just one,
unless code with the assumption got inlined, versioned etc.).

2022-10-18  Jakub Jelinek  <jakub@redhat.com>

PR c++/106654
gcc/
* gimple.def (GIMPLE_ASSUME): New statement kind.
* gimple.h (struct gimple_statement_assume): New type.
(is_a_helper <gimple_statement_assume *>::test,
is_a_helper <const gimple_statement_assume *>::test): New.
(gimple_build_assume): Declare.
(gimple_has_substatements): Return true for GIMPLE_ASSUME.
(gimple_assume_guard, gimple_assume_set_guard,
gimple_assume_guard_ptr, gimple_assume_body_ptr, gimple_assume_body):
New inline functions.
* gsstruct.def (GSS_ASSUME): New.
* gimple.cc (gimple_build_assume): New function.
(gimple_copy): Handle GIMPLE_ASSUME.
* gimple-pretty-print.cc (dump_gimple_assume): New function.
(pp_gimple_stmt_1): Handle GIMPLE_ASSUME.
* gimple-walk.cc (walk_gimple_op): Handle GIMPLE_ASSUME.
* omp-low.cc (WALK_SUBSTMTS): Likewise.
(lower_omp_1): Likewise.
* omp-oacc-kernels-decompose.cc (adjust_region_code_walk_stmt_fn):
Likewise.
* tree-cfg.cc (verify_gimple_stmt, verify_gimple_in_seq_2): Likewise.
* function.h (struct function): Add assume_function bitfield.
* gimplify.cc (gimplify_call_expr): If the assumption isn't
simple enough, expand it into GIMPLE_ASSUME wrapped block or
for -O0 drop it.
* gimple-low.cc: Include attribs.h.
(create_assumption_fn): New function.
(struct lower_assumption_data): New type.
(find_assumption_locals_r, assumption_copy_decl,
adjust_assumption_stmt_r, adjust_assumption_stmt_op,
lower_assumption): New functions.
(lower_stmt): Handle GIMPLE_ASSUME.
* tree-ssa-ccp.cc (pass_fold_builtins::execute): Remove
IFN_ASSUME calls.
* lto-streamer-out.cc (output_struct_function_base): Pack
assume_function bit.
* lto-streamer-in.cc (input_struct_function_base): And unpack it.
* cgraphunit.cc (cgraph_node::expand): Don't verify assume_function
has TREE_ASM_WRITTEN set and don't release its body.
(symbol_table::compile): Allow assume functions not to have released
body.
* internal-fn.cc (expand_ASSUME): Remove gcc_unreachable.
* passes.cc (execute_one_pass): For TODO_discard_function don't
release body of assume functions.
* cgraph.cc (cgraph_node::verify_node): Don't verify cgraph nodes
of PROP_assumptions_done functions.
* tree-pass.h (PROP_assumptions_done): Define.
(TODO_discard_function): Adjust comment.
(make_pass_assumptions): Declare.
* passes.def (pass_assumptions): Add.
* timevar.def (TV_TREE_ASSUMPTIONS): New.
* tree-inline.cc (remap_gimple_stmt): Handle GIMPLE_ASSUME.
* tree-vrp.cc (pass_data_assumptions): New variable.
(pass_assumptions): New class.
(make_pass_assumptions): New function.
gcc/cp/
* cp-tree.h (build_assume_call): Declare.
* parser.cc (cp_parser_omp_assumption_clauses): Use build_assume_call.
* cp-gimplify.cc (build_assume_call): New function.
(process_stmt_assume_attribute): Use build_assume_call.
* pt.cc (tsubst_copy_and_build): Likewise.
gcc/testsuite/
* g++.dg/cpp23/attr-assume5.C: New test.
* g++.dg/cpp23/attr-assume6.C: New test.
* g++.dg/cpp23/attr-assume7.C: New test.

2 years agotree-optimization/107301 - check if we can duplicate block before doing so
Richard Biener [Tue, 18 Oct 2022 07:38:03 +0000 (09:38 +0200)]
tree-optimization/107301 - check if we can duplicate block before doing so

Path isolation failed to do that.

PR tree-optimization/107301
* gimple-ssa-isolate-paths.cc (handle_return_addr_local_phi_arg):
Check whether we can duplicate the block.
(find_implicit_erroneous_behavior): Likewise.

* gcc.dg/torture/pr107301.c: New testcase.

2 years agoMove scanning pass of forwprop-19.c to dse1 for r13-3212-gb88adba751da63
Liwei Xu [Mon, 17 Oct 2022 03:08:55 +0000 (11:08 +0800)]
Move scanning pass of forwprop-19.c to dse1 for r13-3212-gb88adba751da63

gcc/testsuite/ChangeLog:

PR testsuite/107220
* gcc.dg/tree-ssa/forwprop-19.c: Move scanning pass from
forwprop1 to dse1, This fixs the test case fail.

2 years agoMerge partial relation precisions properly
Andrew MacLeod [Mon, 17 Oct 2022 23:00:49 +0000 (19:00 -0400)]
Merge partial relation precisions properly

When merging 2 groups of PE's, one group was simply being set to the
other instead of properly merging them.

PR tree-optimization/107273
gcc/
* value-relation.cc (equiv_oracle::add_partial_equiv): Merge
instead of copying precison of each member.

gcc/testsuite/
* gcc.dg/tree-ssa/pr107273-1.c: New.
* gcc.dg/tree-ssa/pr107273-2.c: New.

2 years agoDaily bump.
GCC Administrator [Tue, 18 Oct 2022 00:17:40 +0000 (00:17 +0000)]
Daily bump.

2 years agoFix bogus RTL on the H8.
Jeff Law [Mon, 17 Oct 2022 23:52:18 +0000 (19:52 -0400)]
Fix bogus RTL on the H8.

This patch actually fixes the bogus RTL seen in PR101697.

Basically we continue to use the insn condition to catch most of the problem
cases related to autoinc addressing modes.  This patch adds constraints which
can guide reload (and hopefully LRA) away from doing blind replacements during
register elimination that would ultimately result in bogus RTL.  The idea is
from Paul K. who has done something very similar on the pdp11.  I guess it
shouldn't be a big surprise that the H8 and pdp11 need the same kind of
handling given some of the similarities in their architectures.

gcc/
PR target/101697
* config/h8300/combiner.md: Replace '<' preincment constraint with
ZA/Z1..ZH/Z7 combinations.
* config/h8300/movepush.md: Similarly

2 years agoMore infrastructure to avoid bogus RTL on H8.
Jeff Law [Mon, 17 Oct 2022 23:42:27 +0000 (19:42 -0400)]
More infrastructure to avoid bogus RTL on H8.

Continuing the work to add constraints to avoid invalid RTL
with autoinc addressing modes.  Specifically this patch adds
the memory constraints similar to the pdp11.

gcc/

* config/h8300/constraints.md (Za..Zh): New constraints for
autoinc addresses using a specific register.
* config/h8300/h8300.cc (pre_incdec_with_reg): New function.
* config/h8300/h8300-protos.h (pre_incdec_with_reg): Add prototype.

2 years agoRemove accidential commits
Jeff Law [Mon, 17 Oct 2022 23:33:52 +0000 (17:33 -0600)]
Remove accidential commits

gcc/
* config/i386/cet.c: Remove accidental commit.
* config/i386/driver-mingw32.c: Likewise.
* config/i386/i386-builtins.c: Likewise.
* config/i386/i386-d.c:  Likewise.
* config/i386/i386-expand.c: Likewise.
* config/i386/i386-features.c: Likewise.
* config/i386/i386-options.c: Likewise.
* config/i386/t-cet: Likewise.
* config/i386/x86-tune-sched-atom.c: Likewise.
* config/i386/x86-tune-sched-bd.c: Likewise.
* config/i386/x86-tune-sched-core.c: Likewise.
* config/i386/x86-tune-sched.c: Likewise.

2 years agoEnable REE for H8
Jeff Law [Mon, 17 Oct 2022 23:28:00 +0000 (19:28 -0400)]
Enable REE for H8

I was looking at H8 assembly code recently and noticed we had unnecessary
extensions.  As it turns out we never enabled redundant extension elimination
on the H8.  This patch fixes that oversight (and was the trigger for the
failure fixed my the prior patch).

gcc/common

* common/config/h8300/h8300-common.cc (h8300_option_optimization_table):
Enable redundant extension elimination at -O2 and above.

2 years agoAdd missing splitter for H8
Jeff Law [Mon, 17 Oct 2022 23:19:25 +0000 (19:19 -0400)]
Add missing splitter for H8

While testing a minor optimization on the H8 my builds failed due to
failure to split a zero-extended memory load.    That particular pattern
is a bit special on the H8 in that it's split at assembly time primarily
to get the length computations correct.  Arguably that alternative should
go away completely, but I haven't really looked into that.

Anyway, with the final-asm split we obviously need to match a define_split
somewhere.  But none was ever written after adding CCZN optimizations.  So
if we had a zero extend of a memory operand and it was used to eliminate
a compare, then we'd abort at final asm time.

Regression tested (in conjunction with various other in-progress patches) on
H8 without regressions.

gcc/
* config/h8300/extensions.md (CCZN setting zero extended load): Add
missing splitter.

2 years agox86: Check corrupted return address when unwinding stack
H.J. Lu [Thu, 11 Aug 2022 23:21:23 +0000 (16:21 -0700)]
x86: Check corrupted return address when unwinding stack

If shadow stack is enabled, when unwinding stack, we count how many stack
frames we pop to reach the landing pad and adjust shadow stack by the same
amount.  When counting the stack frame, we compare the return address on
normal stack against the return address on shadow stack.  If they don't
match, return _URC_FATAL_PHASE2_ERROR for the corrupted return address on
normal stack.  Don't check the return address for

1. Non-catchable exception where exception_class == 0.  Process will be
terminated.
2. Zero return address which marks the outermost stack frame.
3. Signal stack frame since kernel puts a restore token on shadow stack.

* unwind-generic.h (_Unwind_Frames_Increment): Add the EXC
argument.
* unwind.inc (_Unwind_RaiseException_Phase2): Pass EXC to
_Unwind_Frames_Increment.
(_Unwind_ForcedUnwind_Phase2): Likewise.
* config/i386/shadow-stack-unwind.h (_Unwind_Frames_Increment):
Take the EXC argument.  Return _URC_FATAL_PHASE2_ERROR if the
return address on normal stack doesn't match the return address
on shadow stack.

2 years agoFortran: NULL pointer dereference in gfc_simplify_image_index [PR104330]
Steve Kargl [Mon, 17 Oct 2022 20:42:40 +0000 (22:42 +0200)]
Fortran: NULL pointer dereference in gfc_simplify_image_index [PR104330]

gcc/fortran/ChangeLog:

PR fortran/104330
* simplify.cc (gfc_simplify_image_index): Do not dereference NULL
pointer.

gcc/testsuite/ChangeLog:

PR fortran/104330
* gfortran.dg/pr104330.f90: New test.

2 years agoMake sure exported range for SSA post-dominates the DEF in set_global_ranges_from_unr...
Aldy Hernandez [Mon, 17 Oct 2022 16:56:24 +0000 (18:56 +0200)]
Make sure exported range for SSA post-dominates the DEF in set_global_ranges_from_unreachable_edges.

The problem here is that we're exporting a range for an SSA range that
happens on the other side of a __builtin_unreachable, but the SSA does
not post-dominate the definition point.  This is causing ivcanon to
unroll things incorrectly.

This was a snafu when converting the code from evrp.

PR tree-optimization/107293

gcc/ChangeLog:

* tree-ssa-dom.cc
(dom_opt_dom_walker::set_global_ranges_from_unreachable_edges):
Check that condition post-dominates the definition point.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr107293.c: New test.

2 years agoFortran: handle bad array ctors with typespec [PR93483, PR107216, PR107219]
Harald Anlauf [Sat, 15 Oct 2022 19:56:56 +0000 (21:56 +0200)]
Fortran: handle bad array ctors with typespec [PR93483, PR107216, PR107219]

gcc/fortran/ChangeLog:

PR fortran/93483
PR fortran/107216
PR fortran/107219
* arith.cc (reduce_unary): Handled expressions are EXP_CONSTANT and
EXPR_ARRAY.  Do not attempt to reduce otherwise.
(reduce_binary_ac): Likewise.
(reduce_binary_ca): Likewise.
(reduce_binary_aa): Moved check for EXP_CONSTANT and EXPR_ARRAY
from here ...
(reduce_binary): ... to here.
(eval_intrinsic): Catch failed reductions.
* gfortran.h (GFC_INTRINSIC_OPS): New enum ARITH_NOT_REDUCED to keep
track of expressions that were not reduced by the arithmetic evaluation
code.

gcc/testsuite/ChangeLog:

PR fortran/93483
PR fortran/107216
PR fortran/107219
* gfortran.dg/array_constructor_56.f90: New test.
* gfortran.dg/array_constructor_57.f90: New test.

Co-authored-by: Mikael Morin <mikael@gcc.gnu.org>
2 years agoFortran: check type of operands of logical operations, comparisons [PR107272]
Harald Anlauf [Sun, 16 Oct 2022 18:32:27 +0000 (20:32 +0200)]
Fortran: check type of operands of logical operations, comparisons [PR107272]

gcc/fortran/ChangeLog:

PR fortran/107272
* arith.cc (gfc_arith_not): Operand must be of type BT_LOGICAL.
(gfc_arith_and): Likewise.
(gfc_arith_or): Likewise.
(gfc_arith_eqv): Likewise.
(gfc_arith_neqv): Likewise.
(gfc_arith_eq): Compare consistency of types of operands.
(gfc_arith_ne): Likewise.
(gfc_arith_gt): Likewise.
(gfc_arith_ge): Likewise.
(gfc_arith_lt): Likewise.
(gfc_arith_le): Likewise.

gcc/testsuite/ChangeLog:

PR fortran/107272
* gfortran.dg/pr107272.f90: New test.

2 years agoFortran: Fixes for kind=4 characters strings [PR107266]
Tobias Burnus [Mon, 17 Oct 2022 16:15:16 +0000 (18:15 +0200)]
Fortran: Fixes for kind=4 characters strings [PR107266]

PR fortran/107266

gcc/fortran/
* trans-expr.cc (gfc_conv_string_parameter): Use passed
type to honor character kind.
* trans-types.cc (gfc_sym_type): Honor character kind.
* trans-decl.cc (gfc_conv_cfi_to_gfc): Fix handling kind=4
character strings.

gcc/testsuite/
* gfortran.dg/char4_decl.f90: New test.
* gfortran.dg/char4_decl-2.f90: New test.

2 years agoc++ modules: streaming constexpr_fundef [PR101449]
Patrick Palka [Mon, 17 Oct 2022 15:14:36 +0000 (11:14 -0400)]
c++ modules: streaming constexpr_fundef [PR101449]

It looks like we currently avoid streaming the RESULT_DECL and PARM_DECLs
of a constexpr_fundef entry under the assumption that they're just copies
of the DECL_RESULT and DECL_ARGUMENTS of the FUNCTION_DECL.  Thus we can
just make new copies of DECL_RESULT and DECL_ARGUMENTS on stream in rather
than separately streaming them.

But the FUNCTION_DECL's DECL_RESULT and DECL_ARGUMENTS eventually get
genericized, whereas the constexpr_fundef entry consists of a copy of the
FUNCTION_DECL's pre-GENERIC trees.  And notably during genericization we
lower invisref parms (which entails changing their TREE_TYPE and setting
DECL_BY_REFERENCE), the lowered form of which the constexpr evaluator
doesn't expect to see, and so this copying approach causes us to ICE for
the below testcase.

This patch fixes this by faithfully streaming the RESULT_DECL and
PARM_DECLs of a constexpr_fundef entry, which seems to just work.

Nathan says[1]: Hm, the reason for the complexity was that I wanted to
recreate the tree graph where the fndecl came from one TU and the defn
came from another one -- we need the definition to refer to argument
decls from the already-read decl.  However, it seems that for constexpr
fns here, that is not needed, resulting in a significant simplification.

[1]: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603662.html

PR c++/101449

gcc/cp/ChangeLog:

* module.cc (trees_out::write_function_def): Stream the
parms and result of the constexpr_fundef entry.
(trees_in::read_function_def): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/modules/cexpr-3_a.C: New test.
* g++.dg/modules/cexpr-3_b.C: New test.

2 years ago[PR tree-optimization/105820] Add test.
Aldy Hernandez [Mon, 17 Oct 2022 13:32:35 +0000 (15:32 +0200)]
[PR tree-optimization/105820] Add test.

PR tree-optimization/105820

gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/pr105820.c: New test.

2 years agoDo not test for -Inf when flag_finite_math_only.
Aldy Hernandez [Mon, 17 Oct 2022 13:26:05 +0000 (15:26 +0200)]
Do not test for -Inf when flag_finite_math_only.

PR tree-optimization/107286

gcc/ChangeLog:

* value-range.cc (range_tests_floats): Do not test for -Inf when
flag_finite_math_only.

2 years agoAdd 3 floating NAN tests.
Aldy Hernandez [Mon, 10 Oct 2022 09:01:48 +0000 (11:01 +0200)]
Add 3 floating NAN tests.

x UNORD x should set NAN on the TRUE side.
The false side of x == x should set NAN.
The true side of x != x should set NAN.

gcc/testsuite/
* gcc.dg/tree-ssa/vrp-float-3a.c: New.
* gcc.dg/tree-ssa/vrp-float-4a.c: New.
* gcc.dg/tree-ssa/vrp-float-5a.c: New.

2 years agoAdd relation_trio class for range-ops.
Andrew MacLeod [Thu, 13 Oct 2022 22:03:58 +0000 (18:03 -0400)]
Add relation_trio class for range-ops.

There are 3 possible relations range-ops might care about, but only the one
most likely to be needed is supplied.   This patch provides a new class
relation_trio which allows 3 relations to be passed in a single word.

fold_range (), op1_range (), and op2_range () are adjusted to take a
relation_trio class instead of a relation_kind, then the routine can
extract which relation it wants to work with.

* gimple-range-fold.cc (fold_using_range::range_of_range_op):
Provide relation_trio class.
* gimple-range-gori.cc (gori_compute::refine_using_relation):
Provide relation_trio class.
(gori_compute::refine_using_relation): Ditto.
(gori_compute::compute_operand1_range): Provide lhs_op2 and
op1_op2 relations via relation_trio class.
(gori_compute::compute_operand2_range): Ditto.
* gimple-range-op.cc (gimple_range_op_handler::calc_op1): Use
relation_trio instead of relation_kind.
(gimple_range_op_handler::calc_op2): Ditto.
(*::fold_range): Ditto.
* gimple-range-op.h (gimple_range_op::calc_op1): Adjust prototypes.
(gimple_range_op::calc_op2): Adjust prototypes.
* range-op-float.cc (*::fold_range): Use relation_trio instead of
relation_kind.
(*::op1_range): Ditto.
(*::op2_range): Ditto.
* range-op.cc (*::fold_range): Use relation_trio instead of
relation_kind.
(*::op1_range): Ditto.
(*::op2_range): Ditto.
* range-op.h (class range_operator): Adjust prototypes.
(class range_operator_float): Ditto.
(class range_op_handler): Adjust prototypes.
(relop_early_resolve): Pickup op1_op2 relation from relation_trio.
* value-relation.cc (VREL_LAST): Adjust use to be one past the end of
the enum.
(relation_oracle::validate_relation): Use relation_trio in call
to fold_range.
* value-relation.h (enum relation_kind_t): Add VREL_LAST as
final element.
(class relation_trio): New.
(TRIO_VARYING, TRIO_SHIFT, TRIO_MASK): New.

2 years agoFix nan updating in range-ops.
Andrew MacLeod [Fri, 14 Oct 2022 13:29:23 +0000 (09:29 -0400)]
Fix nan updating in range-ops.

Calling clean_nan on an undefined type traps, set_varying first. Other
tweaks for correctness.

* range-op-float.cc (foperator_not_equal::op1_range): Check for
VREL_EQ after singleton.
(foperator_unordered::op1_range): Set VARYING before calling
clear_nan().
(foperator_ordered::op1_range): Set rather than clear NAN if both
operands are the same.

2 years agoDon't set useless relations.
Andrew MacLeod [Fri, 14 Oct 2022 18:31:02 +0000 (14:31 -0400)]
Don't set useless relations.

The oracle will not register nonssense/useless relations, class
value_relation shouldn't either.

* value-relation.cc (value_relation::dump): Change message.
* value-relation.h (value_relation::set_relation): If op1 is the
same as op2 do not create a relation.

2 years agoGCN: Restore build with GCC 4.8
Thomas Schwinge [Fri, 14 Oct 2022 22:10:29 +0000 (00:10 +0200)]
GCN: Restore build with GCC 4.8

For example, for "g++-4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4", the recent
commit r13-3220-g45381d6f9f4e7b5c7b062f5ad8cc9788091c2d07
"amdgcn: add multiple vector sizes" broke the build:

    In file included from [...]/source-gcc/gcc/coretypes.h:458:0,
                     from [...]/source-gcc/gcc/config/gcn/gcn.cc:24:
    [...]/source-gcc/gcc/config/gcn/gcn.cc: In function ‘machine_mode VnMODE(int, machine_mode)’:
    ./insn-modes.h:42:71: error: temporary of non-literal type ‘scalar_int_mode’ in a constant expression
     #define QImode (scalar_int_mode ((scalar_int_mode::from_int) E_QImode))
                                                                           ^
    [...]/source-gcc/gcc/config/gcn/gcn.cc:405:10: note: in expansion of macro ‘QImode’
         case QImode:
              ^
    In file included from [...]/source-gcc/gcc/coretypes.h:478:0,
                     from [...]/source-gcc/gcc/config/gcn/gcn.cc:24:
    [...]/source-gcc/gcc/machmode.h:410:7: note: ‘scalar_int_mode’ is not literal because:
     class scalar_int_mode
           ^
    [...]/source-gcc/gcc/machmode.h:410:7: note:   ‘scalar_int_mode’ is not an aggregate, does not have a trivial default constructor, and has no constexpr constructor that is not a copy or move constructor
    [...]

Addressing this like simiar issues have been addressed in the past.

gcc/
* config/gcn/gcn.cc (VnMODE): Use 'case E_QImode:' instead of
'case QImode:', etc.

2 years agoTag 'gcc/gimple-expr.cc:mark_addressable_2' as 'static'
Thomas Schwinge [Wed, 15 Dec 2021 21:00:53 +0000 (22:00 +0100)]
Tag 'gcc/gimple-expr.cc:mark_addressable_2' as 'static'

Added in 2015 r229696 (commit 1b223a9f3489296c625bdb7cc764196d04fd9231)
"defer mark_addressable calls during expand till the end of expand",
it has never been used 'extern'ally.

gcc/
* gimple-expr.cc (mark_addressable_2): Tag as 'static'.

2 years agoFix nvptx-specific '-foffload-options' syntax in 'libgomp.c/reverse-offload-sm30.c'
Thomas Schwinge [Fri, 23 Sep 2022 09:29:50 +0000 (11:29 +0200)]
Fix nvptx-specific '-foffload-options' syntax in 'libgomp.c/reverse-offload-sm30.c'

That is, '-mptx=_' is only valid in '-foffload-options=nvptx-none', too.

Fix test case added in recent
commit r13-2625-g6b43f556f392a7165582aca36a19fe7389d995b2 "nvptx/mkoffload.cc:
Warn instead of error when reverse offload is not possible".

libgomp/
* testsuite/libgomp.c/reverse-offload-sm30.c: Fix nvptx-specific
'-foffload-options' syntax.

2 years agoVectorization of first-order recurrences
Richard Biener [Thu, 6 Oct 2022 11:56:09 +0000 (13:56 +0200)]
Vectorization of first-order recurrences

The following picks up the prototype by Ju-Zhe Zhong for vectorizing
first order recurrences.  That solves two TSVC missed optimization PRs.

There's a new scalar cycle def kind, vect_first_order_recurrence
and it's handling of the backedge value vectorization is complicated
by the fact that the vectorized value isn't the PHI but instead
a (series of) permute(s) shifting in the recurring value from the
previous iteration.  I've implemented this by creating both the
single vectorized PHI and the series of permutes when vectorizing
the scalar PHI but leave the backedge values in both unassigned.
The backedge values are (for the testcases) computed by a load
which is also the place after which the permutes are inserted.
That placement also restricts the cases we can handle (without
resorting to code motion).

I added both costing and SLP handling though SLP handling is
restricted to the case where a single vectorized PHI is enough.

Missing is epilogue handling - while prologue peeling would
be handled transparently by adjusting iv_phi_p the epilogue
case doesn't work with just inserting a scalar LC PHI since
that a) keeps the scalar load live and b) that loads is the
wrong one, it has to be the last, much like when we'd vectorize
the LC PHI as live operation.  Unfortunately LIVE
compute/analysis happens too early before we decide on
peeling.  When using fully masked loop vectorization the
vect-recurr-6.c works as expected though.

I have tested this on x86_64 for now, but since epilogue
handling is missing there's probably no practical cases.
My prototype WHILE_ULT AVX512 patch can handle vect-recurr-6.c
just fine but I didn't feel like running SPEC within SDE nor
is the WHILE_ULT patch complete enough.

PR tree-optimization/99409
PR tree-optimization/99394
* tree-vectorizer.h (vect_def_type::vect_first_order_recurrence): Add.
(stmt_vec_info_type::recurr_info_type): Likewise.
(vectorizable_recurr): New function.
* tree-vect-loop.cc (vect_phi_first_order_recurrence_p): New
function.
(vect_analyze_scalar_cycles_1): Look for first order
recurrences.
(vect_analyze_loop_operations): Handle them.
(vect_transform_loop): Likewise.
(vectorizable_recurr): New function.
(maybe_set_vectorized_backedge_value): Handle the backedge value
setting in the first order recurrence PHI and the permutes.
* tree-vect-stmts.cc (vect_analyze_stmt): Handle first order
recurrences.
(vect_transform_stmt): Likewise.
(vect_is_simple_use): Likewise.
(vect_is_simple_use): Likewise.
* tree-vect-slp.cc (vect_get_and_check_slp_defs): Likewise.
(vect_build_slp_tree_2): Likewise.
(vect_schedule_scc): Handle the backedge value setting in the
first order recurrence PHI and the permutes.

* gcc.dg/vect/vect-recurr-1.c: New testcase.
* gcc.dg/vect/vect-recurr-2.c: Likewise.
* gcc.dg/vect/vect-recurr-3.c: Likewise.
* gcc.dg/vect/vect-recurr-4.c: Likewise.
* gcc.dg/vect/vect-recurr-5.c: Likewise.
* gcc.dg/vect/vect-recurr-6.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s252.c: Un-XFAIL.
* gcc.dg/vect/tsvc/vect-tsvc-s254.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s291.c: Likewise.

Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
2 years agolibgcc: Move cfa_how into potential padding in struct frame_state_reg_info
Florian Weimer [Mon, 17 Oct 2022 09:09:17 +0000 (11:09 +0200)]
libgcc: Move cfa_how into potential padding in struct frame_state_reg_info

On many architectures, there is a padding gap after the how array
member, and cfa_how can be moved there.  This reduces the size of the
struct and the amount of memory that uw_frame_state_for has to clear.

There is no measurable performance benefit from this on x86-64 (even
though the memset goes from 120 to 112 bytes), but it seems to be a
good idea to do anyway.

libgcc/

* unwind-dw2.h (struct frame_state_reg_info): Move cfa_how member
and reduce its size.

2 years agolibstdc++: Fix value of __cpp_lib_constexpr_charconv
Jonathan Wakely [Mon, 17 Oct 2022 08:38:02 +0000 (09:38 +0100)]
libstdc++: Fix value of __cpp_lib_constexpr_charconv

libstdc++-v3/ChangeLog:

* include/std/charconv (__cpp_lib_constexpr_charconv): Define to
correct value.
* include/std/version (__cpp_lib_constexpr_charconv): Likewise.
* testsuite/20_util/to_chars/constexpr.cc: Check correct value.
* testsuite/20_util/to_chars/version.cc: Likewise.

2 years agoRISC-V: Fix format[NFC]
Ju-Zhe Zhong [Mon, 17 Oct 2022 07:30:47 +0000 (15:30 +0800)]
RISC-V: Fix format[NFC]

gcc/ChangeLog:

* config/riscv/t-riscv: Change Tab into 2 space.

2 years agoRISC-V: Reorganize mangle_builtin_type.[NFC]
Ju-Zhe Zhong [Fri, 14 Oct 2022 23:02:36 +0000 (07:02 +0800)]
RISC-V: Reorganize mangle_builtin_type.[NFC]

Hi, this patch fixed my mistake in the previous commit patch.
Since "mangle_builtin_type" is a global function will be called in riscv.cc.
It's reasonable move it down and put them together stay with other global functions.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc (mangle_builtin_type): Move down the function.

2 years agoelf: ELF toolchain --without-{headers, newlib} should provide stdint.h
Arsen Arsenovic [Mon, 17 Oct 2022 06:58:07 +0000 (08:58 +0200)]
elf: ELF toolchain --without-{headers, newlib} should provide stdint.h

stdint.h is considered a freestanding headers by C, and a valid stdint.h
is required for certain parts of libstdc++' configuration, so we should
simply provide one when we have no other way (i.e. newlib or
user-specified sysroot) of getting one.

* config.gcc: --target=*-elf --without-{newlib,headers} should
provide stdint.h.

2 years agoInitial Meteorlake Support
Hu, Lin1 [Fri, 16 Sep 2022 03:25:13 +0000 (11:25 +0800)]
Initial Meteorlake Support

gcc/ChangeLog:

* common/config/i386/cpuinfo.h:
(get_intel_cpu): Handle Meteorlake.
* common/config/i386/i386-common.cc:
(processor_alias_table): Add Meteorlake.

2 years agoInitial Raptorlake Support
Haochen Jiang [Fri, 16 Sep 2022 05:59:01 +0000 (13:59 +0800)]
Initial Raptorlake Support

gcc/ChangeLog:

* common/config/i386/cpuinfo.h:
(get_intel_cpu): Handle Raptorlake.
* common/config/i386/i386-common.cc:
(processor_alias_table): Add Raptorlake.

2 years agoDaily bump.
GCC Administrator [Mon, 17 Oct 2022 00:16:37 +0000 (00:16 +0000)]
Daily bump.

2 years agoAdd new constraints for upcoming autoinc fixes
Jeff Law [Sun, 16 Oct 2022 16:43:25 +0000 (12:43 -0400)]
Add new constraints for upcoming autoinc fixes

GCC does not allow a the operand of an autoinc addressing mode to
overlap with another soure operand in the same insn.  This is primarly
enforced with insn conditions.  However, cases can slip through LRA
and reload.  To address those scenarios we'll take an idea from the
pdp11 port for describing the restriction in constraints as well.

To implement that we need register classes and constraints which are
"all general purpose hardware registers except r0".  And similarly for
r1..r7(sp).

This patch adds those register classes and constraints, but does not
yet use them.

gcc/
* config/h8300/constraints.md (Z0..Z7): New register
constraints.
* config/h8300/h8300.h (reg_class): Add new classes.
(REG_CLASS_NAMES): Similarly.
(REG_CLASS_CONTENTS): Similarly.

2 years agoRename "z" constraint to "Zz" on the H8/300
Jeff Law [Sun, 16 Oct 2022 14:58:52 +0000 (10:58 -0400)]
Rename "z" constraint to "Zz" on the H8/300

I want to use Z as a multi-letter constraint.  So first we have to
adjust the existing use of Z.  This does not affect code generation.

gcc/
* config/h8300/constraints.md (Zz constraint): Renamed
from "z".
* config/h8300/movepush.md (movqi_h8sx, movhi_h8sx): Adjust
constraint to use Zz instead of Z.

2 years agoFix bug in register move costing on H8/300
Jeff Law [Sun, 16 Oct 2022 03:38:20 +0000 (23:38 -0400)]
Fix bug in register move costing on H8/300

gcc/
* config/h8300/h8300.cc (h8300_register_move_cost): Fix typo.

2 years agoDaily bump.
GCC Administrator [Sun, 16 Oct 2022 00:16:25 +0000 (00:16 +0000)]
Daily bump.

2 years agolibstdc++: Fix -Wunused-function warning in src/c++11/debug.cc
Jonathan Wakely [Sat, 15 Oct 2022 20:15:51 +0000 (21:15 +0100)]
libstdc++: Fix -Wunused-function warning in src/c++11/debug.cc

The only remaining use of print_raw is conditionally compiled, so when
libstdc++ i built without debug backtrace support, there's an unused
warning function for it. Move it inside the conditional block.

libstdc++-v3/ChangeLog:

* src/c++11/debug.cc (print_raw): Move inside #if block.

2 years agolibstdc++: Implement constexpr std::to_chars for C++23 (P2291R3)
Jonathan Wakely [Sat, 15 Oct 2022 20:20:47 +0000 (21:20 +0100)]
libstdc++: Implement constexpr std::to_chars for C++23 (P2291R3)

Some of the helper functions use static constexpr local variables, which
is not permitted in a core constant expression. Removing the 'static'
seems to have negligible performance effect for __to_chars and
__to_chars_16. For __from_chars_alnum_to_val removing the 'static'
causes a significant performance impact for base 36 conversions. Use a
consteval lambda instead.

libstdc++-v3/ChangeLog:

* include/bits/charconv.h (__to_chars_10_impl): Add constexpr
for C++23. Remove 'static' from array.
* include/std/charconv (__cpp_lib_constexpr_charconv): Define.
(__to_chars, __to_chars_16): Remove 'static' from array, add
constexpr.
(__to_chars_10, __to_chars_8, __to_chars_2, __to_chars_i)
(to_chars, __raise_and_add, __from_chars_pow2_base)
(__from_chars_alnum, from_chars): Add constexpr.
(__from_chars_alnum_to_val): Avoid local static during constant
evaluation. Add constexpr.
* include/std/version (__cpp_lib_constexpr_charconv): Define.
* testsuite/20_util/from_chars/constexpr.cc: New test.
* testsuite/20_util/to_chars/constexpr.cc: New test.
* testsuite/20_util/to_chars/version.cc: New test.

2 years agolibstdc++: Fix uses_allocator_construction args for cv pair (LWG 3677)
Jonathan Wakely [Fri, 14 Oct 2022 13:25:48 +0000 (14:25 +0100)]
libstdc++: Fix uses_allocator_construction args for cv pair (LWG 3677)

The _Std_pair concept uses in <bits/uses_allocator_args.h> handles const
qualified pairs, but not volatile qualified. That's because it just uses
__is_pair which is specialized for const pairs.

This removes the partial specialization __is_pair<const pair<T,U>>, so
that __is_pair is now only true for cv-unqualified pairs. Then _Std_pair
needs to explicitly use remove_cv_t for the argument to __is_pair.

The other use of __is_pair is in map::insert(Pair&&) which doesn't want
to handle volatile so should just use remove_const_t.

libstdc++-v3/ChangeLog:

* include/bits/stl_map.h (map::insert(Pair&&)): Use
remove_const_t on argument to __is_pair.
* include/bits/stl_pair.h (__is_pair<const pair<T,U>>): Remove
partial specialization.
* include/bits/uses_allocator_args.h (_Std_pair): Use
remove_cv_t as per LWG 3677.
* testsuite/20_util/uses_allocator/lwg3677.cc: New test.

2 years agoDaily bump.
GCC Administrator [Sat, 15 Oct 2022 00:17:38 +0000 (00:17 +0000)]
Daily bump.

2 years agopreprocessor: C2x identifier rules
Joseph Myers [Fri, 14 Oct 2022 23:07:50 +0000 (23:07 +0000)]
preprocessor: C2x identifier rules

C2x has, like C++, adopted rules for identifiers based directly on an
unversioned normative reference to Unicode.  Make libcpp follow those
rules for c2x / gnu2x standards (this involves bringing back a flag
separate from the C++ one for whether to use these identifier rules,
but this time enabled for all C++ language versions since that was the
conclusion adopted for C++ identifier handling).

There is one change here that affects C++.  I believe the new
normative requirement for NFC only applies to identifiers, not to the
use of identifier-continue characters in pp-numbers, where there is no
such requirement and so the diagnostic ought to be a warning not a
pedwarn in pp-numbers, and that this is the case for both C and C++.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

libcpp/
* charset.cc (ucn_valid_in_identifier): Check xid_identifiers not
cplusplus to determine whether to use CXX23 and NXX23 flags.
* include/cpplib.h (struct cpp_options): Add xid_identifiers.
* init.cc (struct lang_flags, lang_defaults): Add xid_identifiers.
(cpp_set_lang): Set xid_identifiers.
* lex.cc (warn_about_normalization): Add parameter identifier.
Only pedwarn about non-NFC for identifiers, not pp-numbers.
(_cpp_lex_direct): Update calls to warn_about_normalization.

gcc/testsuite/
* gcc.dg/cpp/c2x-ucnid-1-utf8.c, gcc.dg/cpp/c2x-ucnid-1.c: New
tests.

2 years agoFortran: fix check of polymorphic elements in data transfers [PR100971]
Harald Anlauf [Sun, 9 Oct 2022 18:43:32 +0000 (20:43 +0200)]
Fortran: fix check of polymorphic elements in data transfers [PR100971]

gcc/fortran/ChangeLog:

PR fortran/100971
* resolve.cc (resolve_transfer): Extend check for permissibility
of polymorphic elements in a data transfer to arrays.

gcc/testsuite/ChangeLog:

PR fortran/100971
* gfortran.dg/der_io_5.f90: New test.

2 years agoImplement distinction between HONOR_SIGNED_ZEROS and MODE_HAS_SIGNED_ZEROS.
Aldy Hernandez [Fri, 14 Oct 2022 14:49:33 +0000 (16:49 +0200)]
Implement distinction between HONOR_SIGNED_ZEROS and MODE_HAS_SIGNED_ZEROS.

gcc/ChangeLog:

* value-range.cc (frange::set): Implement distinction between
HONOR_SIGNED_ZEROS and MODE_HAS_SIGNED_ZEROS.

2 years agoImplement range-op entry for __builtin_copysign.
Aldy Hernandez [Thu, 13 Oct 2022 15:51:29 +0000 (17:51 +0200)]
Implement range-op entry for __builtin_copysign.

copysign(MAGNITUDE, SIGN) is implemented as the absolute of MAGNITUDE,
with SIGN applied.  If the sign of "SIGN" cannot be determined, we
return a range of [-MAGNITUDE, +MAGNITUDE].

gcc/ChangeLog:

* gimple-range-op.cc (class cfn_copysign): New.
(gimple_range_op_handler::maybe_builtin_call): Add
CFN_BUILT_IN_COPYSIGN*.

2 years agogfortran.dg/c-interop/deferred-character-2.f90: Fix dg-do
Tobias Burnus [Fri, 14 Oct 2022 16:34:49 +0000 (18:34 +0200)]
gfortran.dg/c-interop/deferred-character-2.f90: Fix dg-do

gcc/testsuite/
* gfortran.dg/c-interop/deferred-character-2.f90: Use 'dg-do run'.

2 years agoCheck rvc_normal in real_isdenormal.
Aldy Hernandez [Fri, 14 Oct 2022 10:08:11 +0000 (12:08 +0200)]
Check rvc_normal in real_isdenormal.

[-Inf, -Inf] is being flushed to [-Inf, -0.0] because real_isdenormal
is being overly pessimistic.  It is missing a check for rvc_normal.
This doesn't cause problems in real.cc because all uses of
real_isdenormal are already on the rvc_normal path.  The uses in
value-range.cc however, are not.

This patch adds a check for rvc_normal.

gcc/ChangeLog:

* real.h (real_isdenormal): Check rvc_normal.
* value-range.cc (range_tests_floats): New test.

2 years agolibstdc++: Disable all emergency EH pool code if obj-count == 0
Jonathan Wakely [Wed, 12 Oct 2022 22:04:53 +0000 (23:04 +0100)]
libstdc++: Disable all emergency EH pool code if obj-count == 0

For a zero-sized static pool we can completely elide all code for the EH
pool.

We no longer need to adjust the static buffer size to ensure at least
one free_entry can be created in it, because we no longer use a static
buffer at all if obj_count == 0. If the buffer exists, obj_count >= 1
and the buffer will be much larger than sizeof(free_entry).

libstdc++-v3/ChangeLog:

* libsupc++/eh_alloc.cc [USE_POOL]: New macro.
[!USE_POOL] (__gnu_cxx::__freeres, pool): Do not define.
[_GLIBCXX_EH_POOL_STATIC] (pool::arena): Do not use std::max.
(__cxxabiv1::__cxa_allocate_exception) [!USE_POOL]: Do not use
pool.
(__cxxabiv1::__cxa_free_exception) [!USE_POOL]: Likewise.
(__cxxabiv1::__cxa_allocate_dependent_exception) [!USE_POOL]:
Likewise.
(__cxxabiv1::__cxa_free_dependent_exception) [!USE_POOL]:
Likewise.

2 years agolibstdc++: Simplify print_raw function for debug assertions
Jonathan Wakely [Wed, 12 Oct 2022 10:59:33 +0000 (11:59 +0100)]
libstdc++: Simplify print_raw function for debug assertions

Replace two uses of print_raw where it's clearer to just use fprintf
directly. Then the only remaining use of print_raw is as the print_func
argument of pretty_print. When called by pretty_print the count is
either a positive integer or -1, so we can simplify print_raw itself.

Remove the default argument, because it's never used. Remove the check
for nbc == 0, which never happens (but would be harmless if it did).
Replace the conditional expression with a single call to fprintf, using
INT_MAX as the maximum length.

libstdc++-v3/ChangeLog:

* src/c++11/debug.cc (print_raw): Simplify.
(print_word): Print indentation by calling fprintf directly.
(_Error_formatter::_M_error): Print unindented string by calling
fprintf directly.

2 years agoReplace CFN_BUILTIN_SIGNBIT* cases with CASE_FLT_FN.
Aldy Hernandez [Fri, 14 Oct 2022 13:02:06 +0000 (15:02 +0200)]
Replace CFN_BUILTIN_SIGNBIT* cases with CASE_FLT_FN.

gcc/ChangeLog:

* gimple-range-op.cc
(gimple_range_op_handler::maybe_builtin_call): Replace
CFN_BUILTIN_SIGNBIT* cases with CASE_FLT_FN.

2 years agoNormalize ranges over the range for both bounds when -ffinite-math-only.
Aldy Hernandez [Fri, 14 Oct 2022 10:07:40 +0000 (12:07 +0200)]
Normalize ranges over the range for both bounds when -ffinite-math-only.

[-Inf, +Inf] was being chopped correctly for -ffinite-math-only, but
[-Inf, -Inf] was not.  This was latent because a bug in
real_isdenormal is causing us to flush -Inf to zero.

gcc/ChangeLog:

* value-range.cc (frange::set): Normalize ranges for both bounds.

2 years agoDrop -0.0 in frange::set() for !HONOR_SIGNED_ZEROS.
Aldy Hernandez [Fri, 14 Oct 2022 10:06:56 +0000 (12:06 +0200)]
Drop -0.0 in frange::set() for !HONOR_SIGNED_ZEROS.

Similar to what we do for NANs when !HONOR_NANS and Inf when
flag_finite_math_only, we can remove -0.0 from the range at creation
time.

We were kinda sorta doing this because there is a bug in
real_isdenormal that is causing flush_denormals_to_zero to saturate
[x, -0.0] to [x, +0.0] when !HONOR_SIGNED_ZEROS.  Fixing this bug
(upcoming), causes us to leave -0.0 in places where we aren't
expecting it (the intersection code).

gcc/ChangeLog:

* value-range.cc (frange::set): Drop -0.0 for !HONOR_SIGNED_ZEROS.

2 years agoc++ modules: ICE with dynamic_cast [PR106304]
Patrick Palka [Fri, 14 Oct 2022 13:07:01 +0000 (09:07 -0400)]
c++ modules: ICE with dynamic_cast [PR106304]

The FUNCTION_DECL we build for __dynamic_cast has an empty DECL_CONTEXT
but trees_out::tree_node expects FUNCTION_DECLs to have non-empty
DECL_CONTEXT, thus we crash when streaming out the dynamic_cast in the
below testcase.

This patch naively fixes this by setting DECL_CONTEXT for __dynamic_cast
appropriately.  I suppose we should push it into the namespace too, like
we do for __cxa_atexit which is similarly lazily declared.

PR c++/106304

gcc/cp/ChangeLog:

* constexpr.cc (cxx_dynamic_cast_fn_p): Check for abi_node
instead of global_namespace.
* rtti.cc (build_dynamic_cast_1): Set DECL_CONTEXT and
DECL_SOURCE_LOCATION when building dynamic_cast_node.  Push
it into the namespace.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr106304_a.C: New test.
* g++.dg/modules/pr106304_b.C: New test.

2 years agoAdd cases for CFN_BUILT_IN_SIGNBIT[FL].
Aldy Hernandez [Thu, 13 Oct 2022 15:45:38 +0000 (17:45 +0200)]
Add cases for CFN_BUILT_IN_SIGNBIT[FL].

gcc/ChangeLog:

* gimple-range-op.cc
(gimple_range_op_handler::maybe_builtin_call): Add
CFN_BUILT_IN_SIGNBIT[FL]* entries.

2 years agotree-optimization/107254 - check and support live lanes from permutes
Richard Biener [Fri, 14 Oct 2022 09:14:59 +0000 (11:14 +0200)]
tree-optimization/107254 - check and support live lanes from permutes

The following fixes an omission from adding SLP permute nodes which
is live lanes originating from those.  We have to check that we
can extract the lane and have to actually code generate them.

PR tree-optimization/107254
* tree-vect-slp.cc (vect_slp_analyze_node_operations_1):
For permutes also analyze live lanes.
(vect_schedule_slp_node): For permutes also code generate
live lane extracts.

* gfortran.dg/vect/pr107254.f90: New testcase.

2 years agoFix PR target/107248
Eric Botcazou [Fri, 14 Oct 2022 09:52:04 +0000 (11:52 +0200)]
Fix PR target/107248

This is the infamous PR rtl-optimization/38644 rearing its ugly head for
leaf functions on SPARC more than a decade later...  Richard E.'s generic
solution has never been implemented so let's do as other RISC back-ends did.

gcc/
PR target/107248
* config/sparc/sparc.cc (sparc_expand_prologue): Emit a frame
blockage for leaf functions.
(sparc_flat_expand_prologue): Emit frame instead of full blockage.
(sparc_expand_epilogue): Emit a frame blockage for leaf functions.
(sparc_flat_expand_epilogue): Emit frame instead of full blockage.

2 years agolibstdc++: Use markdown in Doxygen comment
Jonathan Wakely [Wed, 12 Oct 2022 11:07:14 +0000 (12:07 +0100)]
libstdc++: Use markdown in Doxygen comment

This makes the comment easier to read in the source, without altering
the Doxygen output.

libstdc++-v3/ChangeLog:

* include/std/iostream: Use markdown in Doxygen comment.

2 years agogcov: test line count for label in then/else block
Jørgen Kvalsvik [Fri, 7 Oct 2022 11:29:20 +0000 (13:29 +0200)]
gcov: test line count for label in then/else block

Add a test to catch regression in line counts for labels on top of
then/else blocks. Only the 'goto <label>' should contribute to the line
counter for the label, not the if.

gcc/testsuite/ChangeLog:

* gcc.misc-tests/gcov-4.c: New testcase.

2 years agogcov: test switch/break line counts
Jørgen Kvalsvik [Tue, 4 Oct 2022 13:45:59 +0000 (15:45 +0200)]
gcov: test switch/break line counts

The coverage support will under some conditions decide to split edges to
accurately report coverage. By running the test suite with/without this
edge splitting a small diff shows up, addressed by this patch, which
should catch future regressions.

Removing the edge splitting:

$ diff --git a/gcc/profile.cc b/gcc/profile.cc
--- a/gcc/profile.cc
+++ b/gcc/profile.cc
@@ -1244,19 +1244,7 @@ branch_prob (bool thunk)
                Don't do that when the locuses match, so
                if (blah) goto something;
                is not computed twice.  */
-             if (last
-                 && gimple_has_location (last)
-                 && !RESERVED_LOCATION_P (e->goto_locus)
-                 && !single_succ_p (bb)
-                 && (LOCATION_FILE (e->goto_locus)
-                     != LOCATION_FILE (gimple_location (last))
-                     || (LOCATION_LINE (e->goto_locus)
-                         != LOCATION_LINE (gimple_location (last)))))
-               {
-                 basic_block new_bb = split_edge (e);
-                 edge ne = single_succ_edge (new_bb);
-                 ne->goto_locus = e->goto_locus;
-               }
+
        if ((e->flags & (EDGE_ABNORMAL | EDGE_ABNORMAL_CALL))
                && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
                need_exit_edge = 1;

Assuming the .gcov files from make chec-gcc RUNTESTFLAGS=gcov.exp are
kept:

$ diff -r no-split-edge with-split-edge | grep -C 2 -E "^[<>]\s\s"
diff -r sans-split-edge/gcc/gcov-4.c.gcov with-split-edge/gcc/gcov-4.c.gcov
   228c228
   <         -:  224:        break;
   ---
   >         1:  224:        break;
   231c231
   <         -:  227:        break;
   ---
   >     #####:  227:        break;
   237c237
   <         -:  233:        break;
   ---
   >         2:  233:        break;

gcc/testsuite/ChangeLog:

* g++.dg/gcov/gcov-1.C: Add line count check.
* gcc.misc-tests/gcov-4.c: Likewise.

2 years agomiddle-end, c++, i386, libgcc: std::bfloat16_t and __bf16 arithmetic support
Jakub Jelinek [Fri, 14 Oct 2022 07:37:01 +0000 (09:37 +0200)]
middle-end, c++, i386, libgcc: std::bfloat16_t and __bf16 arithmetic support

Here is a complete patch to add std::bfloat16_t support on
x86 (AArch64 and ARM left for later).  Almost no BFmode optabs
are added by the patch, so for binops/unops it extends to SFmode
first and then truncates back to BFmode.
For {HF,SF,DF,XF,TF}mode -> BFmode conversions libgcc has implementations
of all those conversions so that we avoid double rounding, for
BFmode -> {DF,XF,TF}mode conversions to avoid growing libgcc too much
it emits BFmode -> SFmode conversion first and then converts to the even
wider mode, neither step should be imprecise.
For BFmode -> HFmode, it first emits a precise BFmode -> SFmode conversion
and then SFmode -> HFmode, because neither format is subset or superset
of the other, while SFmode is superset of both.
expr.cc then contains a -ffast-math optimization of the BF -> SF and
SF -> BF conversions if we don't optimize for space (and for the latter
if -frounding-math isn't enabled either).
For x86, perhaps truncsfbf2 optab could be defined for TARGET_AVX512BF16
but IMNSHO should FAIL if !flag_finite_math || flag_rounding_math
|| !flag_unsafe_math_optimizations, because I think the insn doesn't
raise on sNaNs, hardcodes round to nearest and flushes denormals to zero.
By default (unless x86 -fexcess-precision=16) we use float excess
precision for BFmode, so truncate only on explicit casts and assignments.
The patch introduces a single __bf16 builtin - __builtin_nansf16b,
because (__bf16) __builtin_nansf ("") will drop the sNaN into qNaN,
and uses f16b suffix instead of bf16 because there would be ambiguity on
log vs. logb - __builtin_logbf16 could be either log with bf16 suffix
or logb with f16 suffix.  In other cases libstdc++ should mostly use
__builtin_*f for std::bfloat16_t overloads (we have a problem with
std::nextafter though but that one we have also for std::float16_t).

2022-10-14  Jakub Jelinek  <jakub@redhat.com>

gcc/
* tree-core.h (enum tree_index): Add TI_BFLOAT16_TYPE.
* tree.h (bfloat16_type_node): Define.
* tree.cc (excess_precision_type): Promote bfloat16_type_mode
like float16_type_mode.
(build_common_tree_nodes): Initialize bfloat16_type_node if
BFmode is supported.
* expmed.h (maybe_expand_shift): Declare.
* expmed.cc (maybe_expand_shift): No longer static.
* expr.cc (convert_mode_scalar): Don't ICE on BF -> HF or HF -> BF
conversions.  If there is no optab, handle BF -> {DF,XF,TF,HF}
conversions as separate BF -> SF -> {DF,XF,TF,HF} conversions, add
-ffast-math generic implementation for BF -> SF and SF -> BF
conversions.
* builtin-types.def (BT_BFLOAT16, BT_FN_BFLOAT16_CONST_STRING): New.
* builtins.def (BUILT_IN_NANSF16B): New builtin.
* fold-const-call.cc (fold_const_call): Handle CFN_BUILT_IN_NANSF16B.
* config/i386/i386.cc (classify_argument): Handle E_BCmode.
(ix86_libgcc_floating_mode_supported_p): Also return true for BFmode
for -msse2.
(ix86_mangle_type): Mangle BFmode as DF16b.
(ix86_invalid_conversion, ix86_invalid_unary_op,
ix86_invalid_binary_op): Remove.
(TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP,
TARGET_INVALID_BINARY_OP): Don't redefine.
* config/i386/i386-builtins.cc (ix86_bf16_type_node): Remove.
(ix86_register_bf16_builtin_type): Use bfloat16_type_node rather than
ix86_bf16_type_node, only create it if still NULL.
* config/i386/i386-builtin-types.def (BFLOAT16): Likewise.
* config/i386/i386.md (cbranchbf4, cstorebf4): New expanders.
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): If bfloat16_type_node,
predefine __BFLT16_*__ macros and for C++23 also
__STDCPP_BFLOAT16_T__.  Predefine bfloat16_type_node related
macros for -fbuilding-libgcc.
* c-lex.cc (interpret_float): Handle CPP_N_BFLOAT16.
gcc/c/
* c-typeck.cc (convert_arguments): Don't promote __bf16 to
double.
gcc/cp/
* cp-tree.h (extended_float_type_p): Return true for
bfloat16_type_node.
* typeck.cc (cp_compare_floating_point_conversion_ranks): Set
extended{1,2} if mv{1,2} is bfloat16_type_node.  Adjust comment.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_bfloat16,
check_effective_target_bfloat16_runtime, add_options_for_bfloat16):
New.
* gcc.dg/torture/bfloat16-basic.c: New test.
* gcc.dg/torture/bfloat16-builtin.c: New test.
* gcc.dg/torture/bfloat16-builtin-issignaling-1.c: New test.
* gcc.dg/torture/bfloat16-complex.c: New test.
* gcc.dg/torture/builtin-issignaling-1.c: Allow to be includable
from bfloat16-builtin-issignaling-1.c.
* gcc.dg/torture/floatn-basic.h: Allow to be includable from
bfloat16-basic.c.
* gcc.target/i386/vect-bfloat16-typecheck_2.c: Adjust expected
diagnostics.
* gcc.target/i386/sse2-bfloat16-scalar-typecheck.c: Likewise.
* gcc.target/i386/vect-bfloat16-typecheck_1.c: Likewise.
* g++.target/i386/bfloat_cpp_typecheck.C: Likewise.
libcpp/
* include/cpplib.h (CPP_N_BFLOAT16): Define.
* expr.cc (interpret_float_suffix): Handle bf16 and BF16 suffixes for
C++.
libgcc/
* config/i386/t-softfp (softfp_extensions): Add bfsf.
(softfp_truncations): Add tfbf xfbf dfbf sfbf hfbf.
(CFLAGS-extendbfsf2.c, CFLAGS-truncsfbf2.c, CFLAGS-truncdfbf2.c,
CFLAGS-truncxfbf2.c, CFLAGS-trunctfbf2.c, CFLAGS-trunchfbf2.c): Add
-msse2.
* config/i386/libgcc-glibc.ver (GCC_13.0.0): Export
__extendbfsf2 and __trunc{s,d,x,t,h}fbf2.
* config/i386/sfp-machine.h (_FP_NANSIGN_B): Define.
* config/i386/64/sfp-machine.h (_FP_NANFRAC_B): Define.
* config/i386/32/sfp-machine.h (_FP_NANFRAC_B): Define.
* soft-fp/brain.h: New file.
* soft-fp/truncsfbf2.c: New file.
* soft-fp/truncdfbf2.c: New file.
* soft-fp/truncxfbf2.c: New file.
* soft-fp/trunctfbf2.c: New file.
* soft-fp/trunchfbf2.c: New file.
* soft-fp/truncbfhf2.c: New file.
* soft-fp/extendbfsf2.c: New file.
libiberty/
* cp-demangle.h (D_BUILTIN_TYPE_COUNT): Increment.
* cp-demangle.c (cplus_demangle_builtin_types): Add std::bfloat16_t
entry.
(cplus_demangle_type): Demangle DF16b.
* testsuite/demangle-expected (_Z3xxxDF16b): New test.

2 years agoc++: Excess precision for ? int : float or int == float [PR107097, PR82071, PR87390]
Jakub Jelinek [Fri, 14 Oct 2022 07:33:23 +0000 (09:33 +0200)]
c++: Excess precision for ? int : float or int == float [PR107097, PR82071, PR87390]

The following incremental patch implements the C11 behavior (for all C++
versions) for
cond ? int : float
cond ? float : int
int cmp float
float cmp int
where int is any integral type, float any floating point type with
excess precision and cmp ==, !=, >, <, >=, <= and <=>.

2022-10-14  Jakub Jelinek  <jakub@redhat.com>

PR c/82071
PR c/87390
PR c++/107097
gcc/cp/
* cp-tree.h (cp_ep_convert_and_check): Remove.
* cvt.cc (cp_ep_convert_and_check): Remove.
* call.cc (build_conditional_expr): Use excess precision for ?: with
one arm floating and another integral.  Don't convert first to
semantic result type from integral types.
(convert_like_internal): Don't call cp_ep_convert_and_check, instead
just strip EXCESS_PRECISION_EXPR before calling cp_convert_and_check
or cp_convert.
* typeck.cc (cp_build_binary_op): Set may_need_excess_precision
for comparisons or SPACESHIP_EXPR with at least one operand integral.
Don't compute semantic_result_type if build_type is non-NULL.  Call
cp_convert_and_check instead of cp_ep_convert_and_check.
gcc/testsuite/
* gcc.target/i386/excess-precision-8.c: For C++ wrap abort and
exit declarations into extern "C" block.
* gcc.target/i386/excess-precision-10.c: Likewise.
* g++.target/i386/excess-precision-7.C: Remove.
* g++.target/i386/excess-precision-8.C: New test.
* g++.target/i386/excess-precision-9.C: Remove.
* g++.target/i386/excess-precision-10.C: New test.
* g++.target/i386/excess-precision-12.C: New test.

2 years agoc++: Implement excess precision support for C++ [PR107097, PR323]
Jakub Jelinek [Fri, 14 Oct 2022 07:28:57 +0000 (09:28 +0200)]
c++: Implement excess precision support for C++ [PR107097, PR323]

The following patch implements excess precision support for C++.
Like for C, it uses EXCESS_PRECISION_EXPR tree to say that its operand
is evaluated in excess precision and what the semantic type of the
expression is.
In most places I've followed what the C FE does in similar spots, so
e.g. for binary ops if one or both operands are already
EXCESS_PRECISION_EXPR, strip those away or for operations that might need
excess precision (+, -, *, /) check if the operands should use excess
precision and convert to that type and at the end wrap into
EXCESS_PRECISION_EXPR with the common semantic type.
This patch follows the C99 handling where it differs from C11 handling.

There are some cases which needed to be handled differently, the C FE can
just strip EXCESS_PRECISION_EXPR (replace it with its operand) when handling
explicit cast, but that IMHO isn't right for C++ - the discovery what exact
conversion should be used (e.g. if user conversion or standard or their
sequence) should be decided based on the semantic type (i.e. type of
EXCESS_PRECISION_EXPR), and that decision continues in convert_like* where
we pick the right user conversion, again, if say some class has ctor
from double and long double and we are on ia32 with standard excess
precision promoting float/double to long double, then we should pick the
ctor from double.  Or when some other class has ctor from just double,
and EXCESS_PRECISION_EXPR semantic type is float, we should choose the
user ctor from double, but actually just convert the long double excess
precision to double and not to float first.  We need to make sure
even identity conversion converts from excess precision to the semantic one
though, but if identity is chained with other conversions, we don't want
the identity next_conversion to drop to semantic precision only to widen
afterwards.

The existing testcases tweaks were for cases on i686-linux where excess
precision breaks those tests, e.g. if we have
  double d = 4.2;
  if (d == 4.2)
then it does the expected thing only with -fexcess-precision=fast,
because with -fexcess-precision=standard it is actually
  double d = 4.2;
  if ((long double) d == 4.2L)
where 4.2L is different from 4.2.  I've added -fexcess-precision=fast
to some tests and changed other tests to use constants that are exactly
representable and don't suffer from these excess precision issues.

There is one exception, pr68180.C looks like a bug in the patch which is
also present in the C FE (so I'd like to get it resolved incrementally
in both).  Reduced testcase:
typedef float __attribute__((vector_size (16))) float32x4_t;
float32x4_t foo(float32x4_t x, float y) { return x + y; }
with -m32 -std=c11 -Wno-psabi or -m32 -std=c++17 -Wno-psabi
it is rejected with:
pr68180.c:2:52: error: conversion of scalar ‘long double’ to vector ‘float32x4_t’ {aka ‘__vector(4) float’} involves truncation
but without excess precision (say just -std=c11 -Wno-psabi or -std=c++17 -Wno-psabi)
it is accepted.  Perhaps we should pass down the semantic type to
scalar_to_vector and use the semantic type rather than excess precision type
in the diagnostics.

2022-10-14  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/323
PR c++/107097
gcc/
* doc/invoke.texi (-fexcess-precision=standard): Mention that the
option now also works in C++.
gcc/c-family/
* c-common.def (EXCESS_PRECISION_EXPR): Remove comment part about
the tree being specific to C/ObjC.
* c-opts.cc (c_common_post_options): Handle flag_excess_precision
in C++ the same as in C.
* c-lex.cc (interpret_float): Set const_type to excess_precision ()
even for C++.
gcc/cp/
* parser.cc (cp_parser_primary_expression): Handle
EXCESS_PRECISION_EXPR with REAL_CST operand the same as REAL_CST.
* cvt.cc (cp_ep_convert_and_check): New function.
* call.cc (build_conditional_expr): Add excess precision support.
When type_after_usual_arithmetic_conversions returns error_mark_node,
use gcc_checking_assert that it is because of uncomparable floating
point ranks instead of checking all those conditions and make it
work also with complex types.
(convert_like_internal): Likewise.  Add NESTED_P argument, pass true
to recursive calls to convert_like.
(convert_like): Add NESTED_P argument, pass it through to
convert_like_internal.  For other overload pass false to it.
(convert_like_with_context): Pass false to NESTED_P.
(convert_arg_to_ellipsis): Add excess precision support.
(magic_varargs_p): For __builtin_is{finite,inf,inf_sign,nan,normal}
and __builtin_fpclassify return 2 instead of 1, document what it
means.
(build_over_call): Don't handle former magic 2 which is no longer
used, instead for magic 1 remove EXCESS_PRECISION_EXPR.
(perform_direct_initialization_if_possible): Pass false to NESTED_P
convert_like argument.
* constexpr.cc (cxx_eval_constant_expression): Handle
EXCESS_PRECISION_EXPR.
(potential_constant_expression_1): Likewise.
* pt.cc (tsubst_copy, tsubst_copy_and_build): Likewise.
* cp-tree.h (cp_ep_convert_and_check): Declare.
* cp-gimplify.cc (cp_fold): Handle EXCESS_PRECISION_EXPR.
* typeck.cc (cp_common_type): For COMPLEX_TYPEs, return error_mark_node
if recursive call returned it.
(convert_arguments): For magic 1 remove EXCESS_PRECISION_EXPR.
(cp_build_binary_op): Add excess precision support.  When
cp_common_type returns error_mark_node, use gcc_checking_assert that
it is because of uncomparable floating point ranks instead of checking
all those conditions and make it work also with complex types.
(cp_build_unary_op): Likewise.
(cp_build_compound_expr): Likewise.
(build_static_cast_1): Remove EXCESS_PRECISION_EXPR.
gcc/testsuite/
* gcc.target/i386/excess-precision-1.c: For C++ wrap abort and
exit declarations into extern "C" block.
* gcc.target/i386/excess-precision-2.c: Likewise.
* gcc.target/i386/excess-precision-3.c: Likewise.  Remove
check_float_nonproto and check_double_nonproto tests for C++.
* gcc.target/i386/excess-precision-7.c: For C++ wrap abort and
exit declarations into extern "C" block.
* gcc.target/i386/excess-precision-9.c: Likewise.
* g++.target/i386/excess-precision-1.C: New test.
* g++.target/i386/excess-precision-2.C: New test.
* g++.target/i386/excess-precision-3.C: New test.
* g++.target/i386/excess-precision-4.C: New test.
* g++.target/i386/excess-precision-5.C: New test.
* g++.target/i386/excess-precision-6.C: New test.
* g++.target/i386/excess-precision-7.C: New test.
* g++.target/i386/excess-precision-9.C: New test.
* g++.target/i386/excess-precision-11.C: New test.
* c-c++-common/dfp/convert-bfp-10.c: Add -fexcess-precision=fast
as dg-additional-options.
* c-c++-common/dfp/compare-eq-const.c: Likewise.
* g++.dg/cpp1z/constexpr-96862.C: Likewise.
* g++.dg/cpp1z/decomp12.C (main): Use 2.25 instead of 2.3 to
avoid excess precision differences.
* g++.dg/other/thunk1.C: Add -fexcess-precision=fast
as dg-additional-options.
* g++.dg/vect/pr64410.cc: Likewise.
* g++.dg/cpp1y/pr68180.C: Likewise.
* g++.dg/vect/pr89653.cc: Likewise.
* g++.dg/cpp0x/variadic-tuple.C: Likewise.
* g++.dg/cpp0x/nsdmi-union1.C: Use 4.25 instead of 4.2 to
avoid excess precision differences.
* g++.old-deja/g++.brendan/copy9.C: Add -fexcess-precision=fast
as dg-additional-options.
* g++.old-deja/g++.brendan/overload7.C: Likewise.

2 years agoc: C2x storage class specifiers in compound literals
Joseph Myers [Fri, 14 Oct 2022 02:18:45 +0000 (02:18 +0000)]
c: C2x storage class specifiers in compound literals

Implement the C2x feature of storage class specifiers in compound
literals.  Such storage class specifiers (static, register or
thread_local; also constexpr, but we don't yet have C2x constexpr
support implemented) can be used before the type name (not mixed with
type specifiers, unlike in declarations) and have the same semantics
and constraints as for declarations of named objects.  Also allow GNU
__thread to be used, given that thread_local can be.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c/
* c-decl.cc (build_compound_literal): Add parameter scspecs.
Handle storage class specifiers.
* c-parser.cc (c_token_starts_compound_literal)
(c_parser_compound_literal_scspecs): New.
(c_parser_postfix_expression_after_paren_type): Add parameter
scspecs.  Call pedwarn_c11 for use of storage class specifiers.
Update call to build_compound_literal.
(c_parser_cast_expression, c_parser_sizeof_expression)
(c_parser_alignof_expression): Handle storage class specifiers for
compound literals.  Update calls to
c_parser_postfix_expression_after_paren_type.
(c_parser_postfix_expression): Update syntax comment.
* c-tree.h (build_compound_literal): Update prototype.
* c-typeck.cc (c_mark_addressable): Diagnose taking address of
register compound literal.

gcc/testsuite/
* gcc.dg/c11-complit-1.c, gcc.dg/c11-complit-2.c,
gcc.dg/c11-complit-3.c, gcc.dg/c2x-complit-2.c,
gcc.dg/c2x-complit-3.c, gcc.dg/c2x-complit-4.c,
gcc.dg/c2x-complit-5.c, gcc.dg/c2x-complit-6.c,
gcc.dg/c2x-complit-7.c, gcc.dg/c90-complit-2.c,
gcc.dg/gnu2x-complit-1.c, gcc.dg/gnu2x-complit-2.c: New tests.

2 years agoDaily bump.
GCC Administrator [Fri, 14 Oct 2022 00:16:35 +0000 (00:16 +0000)]
Daily bump.

2 years agoFix bogus -Wstringop-overflow warning
Eric Botcazou [Thu, 13 Oct 2022 22:55:40 +0000 (00:55 +0200)]
Fix bogus -Wstringop-overflow warning

If you compile the testcase with -O2 -fno-inline -Wall, you get:

In function 'process_array3':
cc1: warning: 'process_array4' accessing 4 bytes in a region of size 3 [-
Wstringop-overflow=]
cc1: note: referencing argument 1 of type 'char[4]'
t.c:6:6: note: in a call to function 'process_array4'
    6 | void process_array4 (char a[4], int n)
      |      ^~~~~~~~~~~~~~
cc1: warning: 'process_array4' accessing 4 bytes in a region of size 3 [-
Wstringop-overflow=]
cc1: note: referencing argument 1 of type 'char[4]'
t.c:6:6: note: in a call to function 'process_array4'

That's because the ICF IPA pass has identified the two functions and turned
process_array3 into a wrapper of process_array4.

gcc/
* gimple-ssa-warn-access.cc (pass_waccess::check_call): Return
early for calls made from thunks.

gcc/testsuite/
* gcc.dg/Wstringop-overflow-89.c: New test.

2 years agoc++: trivial formatting cleanups
Jason Merrill [Tue, 29 Jun 2021 21:45:21 +0000 (17:45 -0400)]
c++: trivial formatting cleanups

Split out from the C++ contracts patch.

gcc/cp/ChangeLog:

* cp-tree.h: Fix whitespace.
* parser.h: Fix whitespace.
* decl.cc: Fix whitespace.
* parser.cc: Fix whitespace.
* pt.cc: Fix whitespace.

2 years agoanalyzer: fix ICE introduced in r13-3168 [PR107210]
David Malcolm [Thu, 13 Oct 2022 20:05:35 +0000 (16:05 -0400)]
analyzer: fix ICE introduced in r13-3168 [PR107210]

gcc/analyzer/ChangeLog:
PR analyzer/107210
* svalue.cc (constant_svalue::maybe_fold_bits_within): Only
attempt to extract individual bits when tree_fits_uhwi_p.

gcc/testsuite/ChangeLog:
PR analyzer/107210
* gfortran.dg/analyzer/pr107210.f90: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2 years agolibgomp: Add Fortran testcases for omp_in_explicit_task
Tobias Burnus [Thu, 13 Oct 2022 18:38:27 +0000 (20:38 +0200)]
libgomp: Add Fortran testcases for omp_in_explicit_task

Fortranized testcases of commits r13-3257-ga58a965eb73
and r13-3258-g0ec4e93fb9f.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/task-7.f90: New test.
* testsuite/libgomp.fortran/task-8.f90: New test.
* testsuite/libgomp.fortran/task-in-explicit-1.f90: New test.
* testsuite/libgomp.fortran/task-in-explicit-2.f90: New test.
* testsuite/libgomp.fortran/task-in-explicit-3.f90: New test.
* testsuite/libgomp.fortran/task-reduction-17.f90: New test.
* testsuite/libgomp.fortran/task-reduction-18.f90: New test.

2 years agoFix emit_group_store regression on big-endian
Eric Botcazou [Wed, 12 Oct 2022 07:27:19 +0000 (09:27 +0200)]
Fix emit_group_store regression on big-endian

The recent optimization implemented for complex modes contains an oversight
for big-endian platforms: it uses a lowpart SUBREG when the integer modes
have different sizes, but this does not match the semantics of the PARALLELs
which have a bundled byte offset; this offset is always zero in the code
path and the lowpart is not at offset zero on big-endian platforms.

gcc/
* expr.cc (emit_group_stote): Fix handling of modes of different
sizes for big-endian targets in latest change and add commentary.

2 years agouse proper DECL_INITIAL for VTV
Martin Liska [Thu, 13 Oct 2022 13:39:08 +0000 (15:39 +0200)]
use proper DECL_INITIAL for VTV

gcc/cp/ChangeLog:

* vtable-class-hierarchy.cc (vtv_generate_init_routine): Emit
an artificial variable that would be put into .preinit_array
section.

gcc/ChangeLog:

* output.h (assemble_vtv_preinit_initializer): Remove.
* varasm.cc (assemble_vtv_preinit_initializer): Remove.

2 years agopropagate partial equivs in the cache.
Andrew MacLeod [Wed, 5 Oct 2022 14:42:07 +0000 (10:42 -0400)]
propagate partial equivs in the cache.

Adjust on-entry cache propagation to look for and propagate both full
and partial equivalences.

gcc/
PR tree-optimization/102540
PR tree-optimization/102872
* gimple-range-cache.cc (ranger_cache::fill_block_cache):
Handle partial equivs.
(ranger_cache::range_from_dom): Cleanup dump output.

gcc/testsuite/
* gcc.dg/pr102540.c: New.
* gcc.dg/pr102872.c: New.

2 years agoAdd partial equivalence recognition to cast and bitwise and.
Andrew MacLeod [Thu, 6 Oct 2022 19:01:24 +0000 (15:01 -0400)]
Add partial equivalence recognition to cast and bitwise and.

This provides the hooks that will register partial equivalencies for
casts and bitwise AND operations with the appropriate bit pattern.

* range-op.cc (operator_cast::lhs_op1_relation): New.
(operator_bitwise_and::lhs_op1_relation): New.

2 years agoAdd equivalence iterator to relation oracle.
Andrew MacLeod [Fri, 7 Oct 2022 16:55:32 +0000 (12:55 -0400)]
Add equivalence iterator to relation oracle.

Instead of looping over an exposed equivalence bitmap, provide iterators
to loop over equivalences, partial equivalences, or both.

* gimple-range-cache.cc (ranger_cache::fill_block_cache): Use
iterator.
* value-relation.cc
  (equiv_relation_iterator::equiv_relation_iterator): New.
(equiv_relation_iterator::next): New.
(equiv_relation_iterator::get_name): New.
* value-relation.h (class relation_oracle): Privatize some methods.
(class equiv_relation_iterator): New.
(FOR_EACH_EQUIVALENCE): New.
(FOR_EACH_PARTIAL_EQUIV): New.
(FOR_EACH_PARTIAL_AND_FULL_EQUIV): New.

2 years agoAdd partial equivalence support to the relation oracle.
Andrew MacLeod [Thu, 6 Oct 2022 19:00:52 +0000 (15:00 -0400)]
Add partial equivalence support to the relation oracle.

This provides enhancements to the equivalence oracle to also track
partial equivalences.  They are tracked similar to equivalences, except
it tracks a 'slice' of another ssa name.   8, 16, 32 and 64 bit slices are
tracked.  This will allow casts and mask of the same value to compare
equal.

* value-relation.cc (equiv_chain::dump): Don't print empty
equivalences.
(equiv_oracle::equiv_oracle): Allocate a partial equiv table.
(equiv_oracle::~equiv_oracle): Release the partial equiv table.
(equiv_oracle::add_partial_equiv): New.
(equiv_oracle::partial_equiv_set): New.
(equiv_oracle::partial_equiv): New.
(equiv_oracle::query_relation): Check for partial equivs too.
(equiv_oracle::dump): Also dump partial equivs.
(dom_oracle::register_relation): Handle partial equivs.
(dom_oracle::query_relation): Check for partial equivs.
* value-relation.h (enum relation_kind_t): Add partial equivs.
(relation_partial_equiv_p): New.
(relation_equiv_p): New.
(class pe_slice): New.
(class equiv_oracle): Add prototypes.
(pe_to_bits): New.
(bits_to_pe): New.
(pe_min): New.

2 years agoc++: ICE with VEC_INIT_EXPR and defarg [PR106925]
Marek Polacek [Tue, 11 Oct 2022 18:16:54 +0000 (14:16 -0400)]
c++: ICE with VEC_INIT_EXPR and defarg [PR106925]

Since r12-8066, in cxx_eval_vec_init we perform expand_vec_init_expr
while processing the default argument in this test.  At this point
start_preparsed_function hasn't yet set current_function_decl.
expand_vec_init_expr then leads to maybe_splice_retval_cleanup which
checks DECL_CONSTRUCTOR_P (current_function_decl) without checking that
c_f_d is non-null first.  It seems correct that c_f_d is null here, so
it seems to me that maybe_splice_retval_cleanup should check c_f_d as
in the following patch.

PR c++/106925

gcc/cp/ChangeLog:

* except.cc (maybe_splice_retval_cleanup): Check current_function_decl.
Make the bool const.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-defarg3.C: New test.

2 years agotree-optimization/107247 - reduce SLP reduction accumulator
Richard Biener [Thu, 13 Oct 2022 12:56:01 +0000 (14:56 +0200)]
tree-optimization/107247 - reduce SLP reduction accumulator

The following makes sure to reduce a multi-vector SLP reduction
accumulator to a single vector using vector operations if
easily possible (if the number of lanes in the vector type is
a multiple of the number of scalar accumulators).

PR tree-optimization/107247
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Reduce multi vector SLP reduction accumulators.  Check
the adjusted number of accumulator vectors against
one for the re-use in the epilogue.

2 years agomachmode: Introduce GET_MODE_NEXT_MODE with previous GET_MODE_WIDER_MODE meaning...
Jakub Jelinek [Thu, 13 Oct 2022 14:22:21 +0000 (16:22 +0200)]
machmode: Introduce GET_MODE_NEXT_MODE with previous GET_MODE_WIDER_MODE meaning, add new GET_MODE_WIDER_MODE

On Wed, Oct 05, 2022 at 04:02:25PM -0400, Jason Merrill wrote:
> > > > @@ -5716,7 +5716,13 @@ emit_store_flag_1 (rtx target, enum rtx_
> > > >        {
> > > >         machine_mode optab_mode = mclass == MODE_CC ? CCmode : compare_mode;
> > > >         icode = optab_handler (cstore_optab, optab_mode);
> > > > -     if (icode != CODE_FOR_nothing)
> > > > +     if (icode != CODE_FOR_nothing
> > > > +        /* Don't consider [BH]Fmode as usable wider mode, as neither is
> > > > +           a subset or superset of the other.  */
> > > > +        && (compare_mode == mode
> > > > +            || !SCALAR_FLOAT_MODE_P (compare_mode)
> > > > +            || maybe_ne (GET_MODE_PRECISION (compare_mode),
> > > > +                         GET_MODE_PRECISION (mode))))
> > >
> > > Why do you need to do this here (and in prepare_cmp_insn, and similarly in
> > > can_compare_p)?  Shouldn't get_wider skip over modes that are not actually
> > > wider?
> >
> > I'm afraid too many places rely on all modes of a certain class to be
> > visible when walking from "narrowest" to "widest" mode, say
> > FOR_EACH_MODE_IN_CLASS/FOR_EACH_MODE/FOR_EACH_MODE_UNTIL/FOR_EACH_WIDER_MODE
> > etc. wouldn't work at all if GET_MODE_WIDER_MODE (BFmode) == SFmode
> > && GET_MODE_WIDER_MODE (HFmode) == SFmode.
>
> Yes, it seems they need to change now that their assumptions have been
> violated.  I suppose FOR_EACH_MODE_IN_CLASS would need to change to not use
> get_wider, and users of FOR_EACH_MODE/FOR_EACH_MODE_UNTIL need to decide
> whether they want an iteration that uses get_wider (likely with a new name)
> or not.

Here is a patch which does that.

Though I admit I didn't go carefully through all 24 GET_MODE_WIDER_MODE
uses, 54 FOR_EACH_MODE_IN_CLASS uses, 3 FOR_EACH_MODE uses, 24
FOR_EACH_MODE_FROM, 6 FOR_EACH_MODE_UNTIL and 15 FOR_EACH_WIDER_MODE uses.
It is more important to go through the GET_MODE_WIDER_MODE and
FOR_EACH_WIDER_MODE uses because the patch changes behavior for those,
the rest keep their previous meaning and so can be changed incrementally
if the other meaning is desirable to them (I've of course changed the 3
spots I had to change in the previous BFmode patch and whatever triggered
during the bootstraps).

2022-10-13  Jakub Jelinek  <jakub@redhat.com>

* genmodes.cc (emit_mode_wider): Emit previous content of
mode_wider array into mode_next array and for mode_wider
emit always VOIDmode for !CLASS_HAS_WIDER_MODES_P classes,
otherwise skip through modes with the same precision.
* machmode.h (mode_next): Declare.
(GET_MODE_NEXT_MODE): New inline function.
(mode_iterator::get_next, mode_iterator::get_known_next): New
function templates.
(FOR_EACH_MODE_IN_CLASS): Use get_next instead of get_wider.
(FOR_EACH_MODE): Use get_known_next instead of get_known_wider.
(FOR_EACH_MODE_FROM): Use get_next instead of get_wider.
(FOR_EACH_WIDER_MODE_FROM): Define.
(FOR_EACH_NEXT_MODE): Define.
* expmed.cc (emit_store_flag_1): Use FOR_EACH_WIDER_MODE_FROM
instead of FOR_EACH_MODE_FROM.
* optabs.cc (prepare_cmp_insn): Likewise.  Remove redundant
!CLASS_HAS_WIDER_MODES_P check.
(prepare_float_lib_cmp): Use FOR_EACH_WIDER_MODE_FROM instead of
FOR_EACH_MODE_FROM.
* config/i386/i386-expand.cc (get_mode_wider_vector): Use
GET_MODE_NEXT_MODE instead of GET_MODE_WIDER_MODE.

2 years ago[AArch64] Improve bit tests [PR105773]
Wilco Dijkstra [Thu, 13 Oct 2022 13:41:55 +0000 (14:41 +0100)]
[AArch64] Improve bit tests [PR105773]

Since AArch64 sets all flags on logical operations, comparisons with zero
can be combined into an AND even if the condition is LE or GT. Add a new
CC_NZV mode used by ANDS/BICS/TST instructions.

gcc/
PR target/105773
* config/aarch64/aarch64.cc (aarch64_select_cc_mode): Allow
GT/LE for merging compare with zero into AND.
(aarch64_get_condition_code_1): Add CC_NZVmode support.
* config/aarch64/aarch64-modes.def: Add CC_NZV.
* config/aarch64/aarch64.md: Use CC_NZV in cmp+and patterns.

gcc/testsuite/
PR target/105773
* gcc.target/aarch64/ands_2.c: Test for ANDS.
* gcc.target/aarch64/bics_2.c: Test for BICS.
* gcc.target/aarch64/tst_2.c: Test for TST.
* gcc.target/aarch64/tst_imm_split_1.c: Fix test.

2 years agotree-optimization/107160 - avoid reusing multiple accumulators
Richard Biener [Thu, 13 Oct 2022 12:24:05 +0000 (14:24 +0200)]
tree-optimization/107160 - avoid reusing multiple accumulators

Epilogue vectorization is not set up to re-use a vectorized
accumulator consisting of more than one vector.  For non-SLP
we always reduce to a single but for SLP that isn't happening.
In such case we currenlty miscompile the epilog so avoid this.

PR tree-optimization/107160
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Do not register accumulator if we failed to reduce it
to a single vector.

* gcc.dg/vect/pr107160.c: New testcase.

2 years agoAdd op1_op2_relation for float operands.
Aldy Hernandez [Thu, 13 Oct 2022 06:08:26 +0000 (08:08 +0200)]
Add op1_op2_relation for float operands.

op1_op2_relation can be called for relops (bool = a < b) as well as
regular binary operators (z = a + b).  This patch adds the overloaded
method for floating point results.

gcc/ChangeLog:

* range-op-float.cc (range_operator_float::op1_op2_relation): New.
(class foperator_equal): Add using.
(class foperator_not_equal): Same.
(class foperator_lt): Same.
(class foperator_le): Same.
(class foperator_gt): Same.
(class foperator_ge): Same.
* range-op.cc (range_op_handler::op1_op2_relation): New.
* range-op.h (range_operator_float::op1_op2_relation): New.

2 years agodiagnose return statement in match.pd (with { ... } expressions
Richard Biener [Thu, 13 Oct 2022 10:59:09 +0000 (12:59 +0200)]
diagnose return statement in match.pd (with { ... } expressions

The expression in (with { ... } is used like a statement expression
which means control flow that leaves it is not allowed.  The following
explicitely diagnoses 'return' and fixes up the few cases that crept
into match.pd (oops).  Any such return will prematurely end matching
the current expression.

* genmatch.cc (parser::parse_c_expr): Diagnose 'return'.
* match.pd: Replace 'return' statements in with expressions
with appropriate variants.

2 years agoifcvt: Fix bitpos calculation in bitfield lowering [PR107229]
Andre Vieira [Thu, 13 Oct 2022 11:09:38 +0000 (12:09 +0100)]
ifcvt: Fix bitpos calculation in bitfield lowering [PR107229]

The bitposition calculation for the bitfield lowering in loop if conversion was
not taking DECL_FIELD_OFFSET into account, which meant that it would result in
wrong bitpositions for bitfields that did not end up having representations
starting at the beginning of the struct.

gcc/ChangeLog:

PR tree-optimization/107229
* tree-if-conv.cc (get_bitfield_rep): Fix bitposition calculation.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr107229-1.c: New test.
* gcc.dg/vect/pr107229-2.c: New test.
* gcc.dg/vect/pr107229-3.c: New test.

2 years agoLoongArch: implement count_{leading,trailing}_zeros
Xi Ruoyao [Wed, 12 Oct 2022 14:06:07 +0000 (22:06 +0800)]
LoongArch: implement count_{leading,trailing}_zeros

LoongArch always support clz and ctz instructions, so we can always use
__builtin_{clz,ctz} for count_{leading,trailing}_zeros.  This improves
the code of libgcc, and also benefits Glibc once we merge longlong.h
there.

Bootstrapped and regtested on loongarch64-linux-gnu.

include/ChangeLog:

* longlong.h [__loongarch__] (count_leading_zeros): Define.
[__loongarch__] (count_trailing_zeros): Likewise.
[__loongarch__] (COUNT_LEADING_ZEROS_0): Likewise.

2 years agovect: Don't pattern match BITFIELD_REF's of non-integrals [PR107226]
Andre Vieira [Thu, 13 Oct 2022 09:34:27 +0000 (10:34 +0100)]
vect: Don't pattern match BITFIELD_REF's of non-integrals [PR107226]

The original patch supported matching the vect_recog_bitfield_ref_pattern for
BITFIELD_REF's where the first operand didn't have a INTEGRAL_TYPE_P type.
That means it would also match vectors, leading to regressions in targets that
supported vectorization of those.

gcc/ChangeLog:

PR tree-optimization/107226
* tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern): Reject
BITFIELD_REF's with non integral typed first operands.

2 years agoLoongArch: Fixed a bug in the loongarch architecture of libitm package.
Lulu Cheng [Wed, 12 Oct 2022 03:02:11 +0000 (11:02 +0800)]
LoongArch: Fixed a bug in the loongarch architecture of libitm package.

Add a soft floating point condition to the register recovery part of the code.

libitm/ChangeLog:

* config/loongarch/sjlj.S: Add a soft floating point condition to the
register recovery part of the code.