review.tizen.org Git - platform/upstream/gcc.git/log

c++: constexpr folding in unevaluated context [PR105931]

Changing the type of N from int to unsigned in decltype82.C (from
r13-986-g0ecb6b906f215e) reveals another spot where we perform constexpr
evaluation in an unevaluated context for the sake of warnings, this time
from the call to shorten_compare in cp_build_binary_op, which calls
fold_for_warn.

We could (and probably should) suppress the shorten_compare warnings
when in an unevaluated context, but there are probably other callers of
fold_for_warn that are similarly affected. So this patch takes the
approach of directly suppressing fold_for_warn when in an unevaluated
context.
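
For illustration only, an unevaluated operand of the kind discussed above
might look like the sketch below. The names limit and probe are
hypothetical and this is not the decltype82a.C testcase; it only shows a
mixed signed/unsigned comparison whose type is needed but whose value is
never computed.

```cpp
#include <type_traits>

// Hypothetical helper returning an unsigned bound.
constexpr unsigned limit() { return 8u; }

template <int N>
struct probe
{
  // The comparison appears only inside decltype, so it is an unevaluated
  // operand: its type matters, its value does not.
  using type = decltype(N < limit());
};

static_assert(std::is_same<probe<3>::type, bool>::value,
              "a comparison has type bool");

// Run-time view of the same fact, convenient for a quick check.
inline bool comparison_is_bool()
{
  return std::is_same<probe<3>::type, bool>::value;
}
```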

PR c++/105931

gcc/cp/ChangeLog:

* expr.cc (fold_for_warn): Don't fold when in an unevaluated
context.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/decltype82a.C: New test.

c++: context completion in lookup_template_class [PR105982]

The below testcase demonstrates that completion of the substituted
context during lookup_template_class can end up registering the desired
specialization for us in more cases than r13-1045-gcb7fd1ea85feea
anticipated. In particular this can happen for a non-dependent
specialization of a nested class as well.

For this testcase, during overload resolution with A's guides, we
substitute the deduced argument T=int into the TYPENAME_TYPE B::C,
during which we call lookup_template_class for A<T>::B with T=int,
which completes A<int> for the first time, which recursively registers
the desired specialization of B already. The parent call to
lookup_template_class then tries to register the same specialization,
triggering an ICE.

This patch fixes this by making lookup_template_class determine more
directly whether we need to recheck the specializations table after
completion of the context -- when and only when the call to complete_type
had an effect.

PR c++/105982

gcc/cp/ChangeLog:

* pt.cc (lookup_template_class): After calling complete_type for
the substituted context, check the table again iff the type was
previously incomplete and complete_type made it complete.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction111.C: New test.

compiler: in Sort_bindings return false if comparing value to itself

Some versions of std::sort may pass elements at the same iterator location.
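
A comparator handed to std::sort must be a strict weak ordering, so
cmp(x, x) has to be false. A minimal sketch of the idea, assuming a
hypothetical Binding type and SortByName comparator rather than the
gofrontend's actual Sort_bindings:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the frontend's binding type.
struct Binding
{
  std::string name;
};

// Orders bindings by name. The early return makes cmp(x, x) false
// without looking at the element itself.
struct SortByName
{
  bool operator()(const Binding* a, const Binding* b) const
  {
    if (a == b)
      return false;
    return a->name < b->name;
  }
};

// Returns the names of the given bindings in sorted order.
inline std::vector<std::string> sorted_names(const std::vector<Binding>& bindings)
{
  std::vector<const Binding*> ptrs;
  for (const Binding& b : bindings)
    ptrs.push_back(&b);
  std::sort(ptrs.begin(), ptrs.end(), SortByName());

  std::vector<std::string> out;
  for (const Binding* p : ptrs)
    out.push_back(p->name);
  return out;
}
```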

Fixes golang/go#53483

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/413434

compiler: unalias types for hash/equality functions

Test case is https://go.dev/cl/413694.

Fixes golang/go#52846

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/413660

diagnostics: add ability to associate diagnostics with rules from coding standards

gcc/ChangeLog:
* common.opt (fdiagnostics-show-rules): New option.
* diagnostic-format-json.cc (diagnostic_output_format_init_json):
Fix up context->show_rules.
* diagnostic-format-sarif.cc
(diagnostic_output_format_init_sarif): Likewise.
* diagnostic-metadata.h (diagnostic_metadata::rule): New class.
(diagnostic_metadata::precanned_rule): New class.
(diagnostic_metadata::add_rule): New.
(diagnostic_metadata::get_num_rules): New.
(diagnostic_metadata::get_rule): New.
(diagnostic_metadata::m_rules): New field.
* diagnostic.cc (diagnostic_initialize): Initialize show_rules.
(print_any_rules): New.
(diagnostic_report_diagnostic): Call it.
* diagnostic.h (diagnostic_context::show_rules): New field.
* doc/invoke.texi (-fno-diagnostics-show-rules): New option.
* opts.cc (common_handle_option): Handle
OPT_fdiagnostics_show_rules.
* toplev.cc (general_init): Set up global_dc->show_rules.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic-test-metadata.c: Expect " [STR34-C]" to
be emitted at the "gets" call.
* gcc.dg/plugin/diagnostic_plugin_test_metadata.c
(pass_test_metadata::execute): Associate the "gets" diagnostic
with a rule named "STR34-C".

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

libstdc++: Properly remove temporary directories in filesystem tests

Although these tests use filesystem::remove_all to clean up, that fails
because it uses recursive_directory_iterator which is intentionally
bodged by the custom readdir defined in the test.

Just use POSIX rmdir to clean up. We don't need to use _rmdir or _wrmdir
for Windows, because we'll never reach test02() on targets where the
custom readdir doesn't interpose the one from libc.

libstdc++-v3/ChangeLog:

* testsuite/27_io/filesystem/iterators/error_reporting.cc: Use
rmdir to remove directories.
* testsuite/experimental/filesystem/iterators/error_reporting.cc:
Likewise.

c++: -Waddress and if constexpr [PR94554]

Just as we avoid various warnings for seemingly tautological expressions when
substituting a template, we should avoid warning for the implicit conversion
to bool in an if statement. I considered also doing this for the conditions
in loop expressions, but that seems unnecessary, as a loop condition is
unlikely to be a constant.

The change to finish_if_stmt_cond isn't necessary since dependent_operand_p
looks through IMPLICIT_CONV_EXPR, but makes it more consistent with
e.g. build_x_binary_op that determines the type of an expression and then
builds it using the original operands.

PR c++/94554

gcc/cp/ChangeLog:

* pt.cc (dependent_operand_p): Split out from...
(tsubst_copy_and_build): ...here.
(tsubst_expr) [IF_STMT]: Use it.
* semantics.cc (finish_if_stmt_cond): Keep the pre-conversion
condition in the template tree.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-if38.C: New test.

c++: -Waddress and value-dependent expr [PR105885]

We already suppress various warnings for code that would be tautological if
written directly, but not when it's the result of template substitution. It
seems we need to do this for -Waddress as well.
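
A hypothetical example of such code (not the constexpr-if37.C testcase;
global_counter and has_target are illustrative names): the comparison
only becomes a tautology once the template argument is substituted.

```cpp
// Illustrative object whose address is used as a template argument.
int global_counter;

template <int* P>
bool has_target()
{
  // Written directly as `&global_counter != nullptr` this would be a
  // tautology. Here it is one only for some instantiations.
  return P != nullptr;
}

inline bool with_object() { return has_target<&global_counter>(); }
inline bool with_null() { return has_target<nullptr>(); }
```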

PR c++/105885

gcc/cp/ChangeLog:

* pt.cc (tsubst_copy_and_build): Also suppress -Waddress for
comparison of dependent operands.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-if37.C: New test.

c++: properly initialize UBSAN built-ins

PR c++/106062

gcc/ChangeLog:

* ubsan.cc (sanitize_unreachable_fn): Change order of calls
in order to initialize UBSAN built-ins.

gcc/testsuite/ChangeLog:

* gfortran.dg/ubsan/pr106062.f90: New test.

c++: Prune unneeded macro locations

This implements garbage collection on locations within macro
expansions, when streaming out a CMI.  When doing the reachability
walks, we now note which macro locations we need and then only write
those locations.  The complication here is that every macro expansion
location has an independently calculated offset.  This complicates
writing, but reading remains the same -- the macro locations of a CMI
continue to form a contiguous block.

For std headers this reduced the number of macro maps by 40% and the
number of locations by 16%.  For a GMF including iostream, it reduced
it by 80% and 60% respectively.

Ordinary locations are still transformed en masse.  They are somewhat
more complicated to apply a similar optimization to.

gcc/cp/
* module.cc (struct macro_info): New.
(struct macro_traits): New.
(macro_remap, macro_table): New globals.
(depset::hash::find_dependencies): Note namespace location.
(module_for_macro_loc): Adjust.
(module_state::note_location): New.
(module_state::Write_location): Note location when not
streaming. Adjust macro location streaming.
(module_state::read_location): Adjust macro location
streaming.
(module_state::write_init_maps): New.
(module_state::write_prepare_maps): Reimplement macro map
preparation.
(module_state::write_macro_maps): Reimplement.
(module_state::read_macro_maps): Likewise.
(module_state::write_begin): Adjust.
gcc/testsuite/
* g++.dg/modules/loc-prune-1.C: New.
* g++.dg/modules/loc-prune-2.C: New.
* g++.dg/modules/loc-prune-3.C: New.
* g++.dg/modules/pr98718_a.C: Adjust.
* g++.dg/modules/pr98718_b.C: Adjust.

testsuite: Compile slsr-39.c without vectorisation

The fix for PR106019 regressed slsr-39.c for -m32 -march=cascadelake
because we are now able to vectorise the code. (Whether the code model
should be allowing that is a different question -- the vectorised code
looked worse to me.)

The test runs at -O2 and predates vectorisation being enabled at -O2,
so this patch just adds -fno-tree-vectorize.

gcc/testsuite/
* gcc.dg/tree-ssa/slsr-39.c: Force vectorization off.

libstdc++: Simplify test by not using std::log2

This test uses std::log2 without including <cmath>, but it doesn't need
to use it at all. Just get the number of digits from numeric_limits
instead.
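
A minimal sketch of the replacement idea, assuming the test only needs
the number of value bits in unsigned (unsigned_bits is a hypothetical
name, not taken from entropy.cc):

```cpp
#include <climits>
#include <limits>

// Number of value bits in unsigned, with no floating-point math and no
// <cmath> dependency.
constexpr int unsigned_bits()
{
  return std::numeric_limits<unsigned>::digits;
}

// The value can never exceed the object size in bits.
static_assert(unsigned_bits() <= static_cast<int>(sizeof(unsigned) * CHAR_BIT),
              "digits is bounded by the object size");
```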

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/random/random_device/entropy.cc: Use
numeric_limits<unsigned>::digits.

ipa-icf: skip variables with body_removed

Similarly to cgraph_nodes, it may happen that body_removed is set
during merging of symbols.

PR ipa/105600

gcc/ChangeLog:

* ipa-icf.cc (sem_item_optimizer::filter_removed_items):
Skip variables with body_removed.

Replace REGNO with reg_or_subregno in pre_reload splitter.

gcc/ChangeLog:

* config/i386/sse.md (sse4_2_pcmpestr): Replace REGNO with
reg_or_subregno.
(sse4_2_pcmpistr): Ditto.

c++: tweak deduction with auto template parms

While looking at PR105964 I noticed that we were unnecessarily repeating
the deduction loop because of seeing a non-type parameter with type 'auto'.
It is indeed dependent, but not on any other deductions.

gcc/cp/ChangeLog:

* pt.cc (type_unification_real): An auto tparm can't
be affected by other deductions.

c++: dependence of baselink [PR105964]

helper<token>::c isn't dependent just because we haven't deduced its return
type yet. type_dependent_expression_p already knows how to deal with that
for bare FUNCTION_DECL, but needs to learn to look through a BASELINK.

PR c++/105964

gcc/cp/ChangeLog:

* pt.cc (type_dependent_expression_p): Look through BASELINK.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/nontype-auto21.C: New test.

Fix typo

Fix typo and commit as obvious.

Signed-off-by: Xionghu Luo <xionghuluo@tencent.com>
gcc/ChangeLog:

* cgraph.cc (cgraph_edge::redirect_call_stmt_to_callee): Fix
typo.
* tree-ssa-loop-ivopts.cc (struct iv_cand): Likewise.
* tree-switch-conversion.h: Likewise.

Daily bump.

c++: class scope function lookup [PR105908]

In r12-1273 for PR91706, I removed the code in get_class_binding that
stripped BASELINK. This testcase demonstrates that we still need to strip
it in outer_binding before putting the overload set in IDENTIFIER_BINDING,
for compatibility with bindings added directly for declarations.

PR c++/105908

gcc/cp/ChangeLog:

* name-lookup.cc (outer_binding): Strip BASELINK.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/trailing16.C: New test.

d:  Merge upstream dmd 6203135dc, druntime e150cca1, phobos a4a18d21c.

D front-end changes:

    - Input parameters can now be applied on extern(C++) functions to
      bind to `const &' when the `-fpreview=in' flag is in effect.

D runtime changes:

    - Run-time flag `--DRT-oncycle=deprecate' has been removed.

Phobos changes:

    - Removed std.experimental.logger's capability to set the minimal
      LogLevel at compile time.

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 6203135dc.
* typeinfo.cc (TypeInfoVisitor::visit (TypeInfoStructDeclaration *)):
Update for new front-end interface.
(SpeculativeTypeVisitor::visit (TypeStruct *)): Likewise.

libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime e150cca1.
* src/MERGE: Merge upstream phobos a4a18d21c.
* testsuite/libphobos.cycles/cycles.exp (cycle_test_list): Update
expected result of deprecate test.

c++: Remove ifdefed code

The only reason I chose to use DECL_UID on this hash table was to make
it stable against ASLR and perturbations due to other allocations.
It's not required for correctness, as the comment mentions the
equality fn uses pointer identity.

gcc/cp/
* module.cc (struct duplicate_hash): Remove.
(duplicate_hash_map): Adjust.

ubsan: default to trap on unreachable at -O0 and -Og [PR104642]

When not optimizing, we can't do anything useful with unreachability in
terms of code performance, so we might as well improve debugging by turning
__builtin_unreachable into a trap.  I think it also makes sense to do this
when we're explicitly optimizing for the debugging experience.

In the PR richi suggested introducing an -funreachable-traps flag for this.
This functionality is already implemented as -fsanitize=unreachable
-fsanitize-trap=unreachable, and we want to share the implementation, but it
does seem useful to have a separate flag that isn't affected by the various
sanitization controls.  -fsanitize=unreachable takes priority over
-funreachable-traps if both are enabled.

Jakub observed that this would slow down -O0 by default from running the
sanopt pass, so this revision avoids the need for sanopt by rewriting calls
introduced by the compiler immediately, and calls written by the user at
fold time.  Many of the calls introduced by the compiler are also rewritten
immediately to ubsan calls when not trapping, which fixes ubsan-8b.C;
previously the call to f() was optimized away before sanopt.  But this early
rewriting isn't practical for uses of __builtin_unreachable in
devirtualization and such, so sanopt rewriting is still done for
non-trapping sanitize.
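
As a sketch of user code affected by the flag (kind and cost are
hypothetical names): with -funreachable-traps in effect, the
__builtin_unreachable call below is expected to become a trap rather
than letting control run off the end of the function.

```cpp
// Illustrative enumeration with two valid values.
enum class kind { small, large };

inline int cost(kind k)
{
  switch (k)
    {
    case kind::small:
      return 1;
    case kind::large:
      return 10;
    }
  // Only reached if k holds a value outside the enumerators, which is
  // undefined behavior without trapping.
  __builtin_unreachable();
}
```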

PR c++/104642

gcc/ChangeLog:

* common.opt: Add -funreachable-traps.
* doc/invoke.texi (-funreachable-traps): Document it.
* opts.cc (finish_options): Enable at -O0 or -Og.
* tree.cc (build_common_builtin_nodes): Add __builtin_trap.
(builtin_decl_unreachable, build_builtin_unreachable): New.
* tree.h: Declare them.
* ubsan.cc (sanitize_unreachable_fn): Factor out.
(ubsan_instrument_unreachable): Use
gimple_build_builtin_unreachable.
* ubsan.h (sanitize_unreachable_fn): Declare.
* gimple.cc (gimple_build_builtin_unreachable): New.
* gimple.h: Declare it.
* builtins.cc (expand_builtin_unreachable): Add assert.
(fold_builtin_0): Call build_builtin_unreachable.
* sanopt.cc: Don't run for just SANITIZE_RETURN
or SANITIZE_UNREACHABLE when trapping.
* cgraphunit.cc (walk_polymorphic_call_targets): Use new
unreachable functions.
* gimple-fold.cc (gimple_fold_call)
(gimple_get_virt_method_for_vtable)
* ipa-fnsummary.cc (redirect_to_unreachable)
* ipa-prop.cc (ipa_make_edge_direct_to_target)
(ipa_impossible_devirt_target)
* ipa.cc (walk_polymorphic_call_targets)
* tree-cfg.cc (pass_warn_function_return::execute)
(execute_fixup_cfg)
* tree-ssa-loop-ivcanon.cc (remove_exits_and_undefined_stmts)
(unloop_loops)
* tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt):
Likewise.

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_builtin_function_call): Handle
unreachable/trap earlier.
* cp-gimplify.cc (cp_maybe_instrument_return): Use
build_builtin_unreachable.

gcc/testsuite/ChangeLog:

* g++.dg/ubsan/return-8a.C: New test.
* g++.dg/ubsan/return-8b.C: New test.
* g++.dg/ubsan/return-8d.C: New test.
* g++.dg/ubsan/return-8e.C: New test.

data-ref: Improve non-loop disambiguation [PR106019]

When dr_may_alias_p is called without a loop context, it tries
to use the tree-affine interface to calculate the difference
between the two addresses and use that difference to check whether
the gap between the accesses is known at compile time. However, as the
example in the PR shows, this doesn't expand SSA_NAMEs and so can easily
be defeated by things like reassociation.

One fix would have been to use aff_combination_expand to expand the
SSA_NAMEs, but we'd then need some way of maintaining the associated
cache. This patch instead reuses the innermost_loop_behavior fields
(which exist even when no loop context is provided).

It might still be useful to do the aff_combination_expand thing too,
if an example turns out to need it.

gcc/
PR tree-optimization/106019
* tree-data-ref.cc (dr_may_alias_p): Try using the
innermost_loop_behavior to disambiguate non-loop queries.

gcc/testsuite/
PR tree-optimization/106019
* gcc.dg/vect/bb-slp-pr106019.c: New test.

RISC-V: Add -mtune=thead-c906 to the invoke docs

gcc/ChangeLog

* doc/invoke.texi (RISC-V): Document -mtune=thead-c906.

libstdc++: eh_globals: gthreads: reset _S_init before deleting key

Clear __eh_globals_init's _S_init in the dtor before deleting the
gthread key.

This ensures that, in case any code involved in deleting the key
interacts with eh_globals, the key that is being deleted won't be
used, and the non-thread-specific eh_globals fallback will.

for libstdc++-v3/ChangeLog

* libsupc++/eh_globals.cc [!_GLIBCXX_HAVE_TLS]
(__eh_globals_init::~__eh_globals_init): Clear _S_init first.

libstdc++: testsuite: call sched_yield for nonpreemptive targets

As in the gcc testsuite, systems without preemptive multi-threading
require sched_yield calls to be placed at points in which a context
switch might be needed to enable the test to complete.
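
A minimal sketch of such a wait loop, assuming a POSIX target that
provides sched_yield in <sched.h>; wait_until_ready is a hypothetical
helper and not the code in 60421.cc:

```cpp
#include <atomic>
#include <sched.h>

// Spins until `ready` becomes true, yielding on every iteration so that
// a non-preemptive scheduler gets a chance to run the thread that sets
// the flag. Returns how many times it yielded.
inline unsigned wait_until_ready(const std::atomic<bool>& ready)
{
  unsigned yields = 0;
  while (!ready.load(std::memory_order_acquire))
    {
      sched_yield();
      ++yields;
    }
  return yields;
}
```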

for libstdc++-v3/ChangeLog

* testsuite/30_threads/this_thread/60421.cc (test02): Call
sched_yield.

libstdc++: testsuite: require cmath for nexttowardl

nexttowardl is only expected to be available with C99 math, but
20_util/to_chars/long_double.cc uses it unconditionally.

State the cmath requirement in the test.

for libstdc++-v3/ChangeLog

* testsuite/20_util/to_chars/long_double.cc: Require cmath.

libstdc++: testsuite: work around bitset namespace pollution

rtems6 declares a global struct bitset in a header file included
indirectly by sys/types.h, that ambiguates the unqualified references
to bitset after "using namespace std" in the testsuite.

Work around the namespace pollution with using declarations of
std::bitset.
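
The clash and one possible workaround can be sketched as follows. The
global struct is a hypothetical stand-in for the one in the rtems6
headers, and the block-scope placement of the using-declaration is an
assumption of this sketch, not necessarily what the tests do.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical stand-in for the global struct declared by system headers.
struct bitset
{
  int unused;
};

inline std::size_t count_low_bits()
{
  using namespace std;
  // Without this line, unqualified `bitset` would be ambiguous between
  // ::bitset and std::bitset. A block-scope using-declaration is found
  // first by name lookup and hides both.
  using std::bitset;
  bitset<8> b(0x0f);
  return b.count();
}
```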

for libstdc++-v3/ChangeLog

* testsuite/23_containers/bitset/cons/dr1325-2.cc: Work around
global struct bitset.
* testsuite/23_containers/bitset/ext/15361.cc: Likewise.
* testsuite/23_containers/bitset/input/1.cc: Likewise.
* testsuite/23_containers/bitset/to_string/1.cc: Likewise.
* testsuite/23_containers/bitset/to_string/dr396.cc: Likewise.

testsuite: outputs.exp: cleanup before running tests

Use the just-added dry-run infrastructure to clean up files that may
have been left over by interrupted runs of outputs.exp, which used to
lead to spurious non-repeatable (self-fixing) failures.

for gcc/testsuite/ChangeLog

* gcc.misc-tests/outputs.exp: Clean up left-overs first.

testsuite: outputs.exp: test for skip_atsave more thoroughly

The presence of -I or -L flags in link command lines changes the
driver's, and thus the linker's behavior, WRT naming files with
command-line options.  With such flags, the driver creates .args.0 and
.args.1 files, whereas without them it's the linker (collect2, really)
that creates .ld1_args.

I've hit some fails on a target system that doesn't have -I or -L
flags in the board config file, but it does add some of them
implicitly with configured-in driver self specs.  Alas, the test in
outputs.exp doesn't catch that, so we proceed to run rather than
skip_atsave tests.

I've reworked the outest procedure to allow dry runs and to return
would-have-been pass/fail results as lists, so we can now test whether
certain files are created and use that to configure the actual test
runs.

for gcc/testsuite/ChangeLog

* gcc.misc-tests/outputs.exp (outest): Introduce quiet mode,
create and return lists of passes and fails.  Use it to catch
skip_atsave cases where -L flags are implicitly added by
driver self specs.

c++: testsuite: require lto_incremental in pr90990_0.C

Other LTO tests that use -r require the lto_incremental effective
target. I suppose pr90990_0.C is missing it due to an oversight.
This patch arranges for this test to also be skipped on
non-lto_incremental targets.

for gcc/testsuite/ChangeLog

* g++.dg/lto/pr90990_0.C: Require lto_incremental target.

i386: Add syscall to enable AMX for latest kernels

gcc/testsuite/ChangeLog:

* gcc.target/i386/amx-check.h (request_perm_xtile_data):
New function to check if AMX is usable and enable AMX.
(main): Run test if AMX is usable.

xtensa: Fix buffer overflow

A fortify buffer overflow message was reported.
(see https://github.com/earlephilhower/esp-quick-toolchain/issues/36)

gcc/ChangeLog:

* config/xtensa/xtensa.md (bswapsi2_internal):
Enlarge the buffer that is obviously smaller than the template
string given to sprintf().
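
The fix itself only enlarges the buffer. As a separate, defensive
technique, the required size can be asked of snprintf before writing.
The sketch below uses a made-up template string and a hypothetical
format_insn helper; it is not the code from xtensa.md.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Formats a made-up instruction template into a string that is sized
// from snprintf's own answer, so the buffer cannot be too small.
inline std::string format_insn(const char* dst, const char* src, int shift)
{
  static const char tmpl[] = "insn\t%s, %s, %d";

  // First pass: a null buffer of size 0 only reports the needed length.
  int needed = std::snprintf(nullptr, 0, tmpl, dst, src, shift);
  if (needed < 0)
    return std::string();

  // Second pass: write into a buffer with room for the terminator.
  std::string out(static_cast<std::size_t>(needed) + 1, '\0');
  std::snprintf(&out[0], out.size(), tmpl, dst, src, shift);
  out.resize(static_cast<std::size_t>(needed));
  return out;
}
```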

Daily bump.

PR target/105991: Recognize PLUS and XOR forms of rldimi in rs6000.md.

This patch addresses PR target/105991 where a change to prefer representing
shifts and adds at the tree-level as multiplications causes problems for
the rldimi patterns in the powerpc backend.  The issue is that rs6000.md
models this pattern using IOR, and some variants that have the equivalent
PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns.
This is fixed in this patch by adding a define_insn_and_split to locally
canonicalize the PLUS and XOR forms to the backend's preferred IOR form.

An alternative fix might be for the RTL optimizers to define a canonical
form for these plus_xor_ior equivalent expressions, but the logical
choice might be plus (which may appear in an addressing mode), and such
a change may require a number of tweaks to update various backends
(i.e.  a more intrusive change than the one proposed here).
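
The equivalence relied on here is that IOR, PLUS and XOR agree whenever
the two operands have no set bits in common. A small sketch of that
identity (the function names are hypothetical and this is plain C++,
not the RTL pattern):

```cpp
#include <cstdint>

// Each function places hi in the upper 32 bits and the low 32 bits of
// lo in the lower half. The operands never overlap, so no carries or
// cancellations can occur and all three operators give the same result.
inline std::uint64_t insert_ior(std::uint64_t hi, std::uint64_t lo)
{
  return (hi << 32) | (lo & 0xffffffffu);
}

inline std::uint64_t insert_plus(std::uint64_t hi, std::uint64_t lo)
{
  return (hi << 32) + (lo & 0xffffffffu);
}

inline std::uint64_t insert_xor(std::uint64_t hi, std::uint64_t lo)
{
  return (hi << 32) ^ (lo & 0xffffffffu);
}
```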

Many thanks to Marek Polacek for bootstrapping and regression testing
this change without problems.

2022-06-22  Roger Sayle  <roger@nextmovesoftware.com>
    Marek Polacek  <polacek@redhat.com>
    Segher Boessenkool  <segher@kernel.crashing.org>
    Kewen Lin  <linkw@linux.ibm.com>

gcc/ChangeLog
PR target/105991
* config/rs6000/rs6000.md (rotl<mode>3_insert_3): Check that
exact_log2 doesn't return -1 (or zero).
(plus_xor): New code iterator.
(*rotl<mode>3_insert_3_<code>): New define_insn_and_split.

gcc/testsuite/ChangeLog
PR target/105991
* gcc.target/powerpc/pr105991.c: New test case.

libgomp: Fix up target-31.c test [PR106045]

The i variable is used inside of the parallel in:
      #pragma omp simd safelen(32) private (v)
      for (i = 0; i < 64; i++)
        {
          v = 3 * i;
          ll[i] = u1 + v * u2[0] + u2[1] + x + y[0] + y[1] + v + h[0] + u3[i];
        }
where i is predetermined linear (so while inside of the body
it is safe, private per SIMD lane var) the final value is written to
the shared variable, and in:
      for (i = 0; i < 64; i++)
        if (ll[i] != u1 + 3 * i * u2[0] + u2[1] + x + y[0] + y[1] + 3 * i + 13 + 14 + i)
          #pragma omp atomic write
            err = 1;
which is a normal loop and so it isn't in any way privatized there.
So we have a data race, fixed by adding private (i) clause to the
parallel.

2022-06-21  Jakub Jelinek  <jakub@redhat.com>
    Paul Iannetta  <piannetta@kalrayinc.com>

PR libgomp/106045
* testsuite/libgomp.c/target-31.c: Add private (i) clause.

libgo: #include <sys/types.h> when checking for loff_t

PR go/106033

Fixes golang/go#53469

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/413214

doc: Document module language-linkage supported

I missed that we documented this as unimplemented when I implemented it.

gcc/
* doc/invoke.texi (C++ Modules): Remove language-linkage
as missing feature.

match.pd: Remove "+ 0x80000000" in int comparisons [PR94899]

Expressions of the form "X + CST < Y + CST" where:

* CST is an unsigned integer constant with only the MSB set, and
* X and Y's types have integer conversion ranks <= CST's

can be simplified to "(signed) X < (signed) Y".

This is because, assuming 32-bit signed numbers,
(unsigned) INT_MIN + 0x80000000 is 0, and
(unsigned) INT_MAX + 0x80000000 is UINT_MAX.

i.e. the result increases monotonically with signed input.

This means:
((signed) X < (signed) Y) iff (X + 0x80000000 < Y + 0x80000000)
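
The identity can be checked in plain C++ with the sketch below. It
assumes a 32-bit unsigned type and that the out-of-range conversion to
int32_t wraps modulo 2^32, which GCC does and C++20 guarantees. The
function names are hypothetical.

```cpp
#include <cstdint>

// The form before simplification: both sides biased by the MSB constant.
inline bool biased_less(std::uint32_t x, std::uint32_t y)
{
  return x + 0x80000000u < y + 0x80000000u;
}

// The simplified form: a signed comparison of the same bit patterns.
inline bool signed_less(std::uint32_t x, std::uint32_t y)
{
  return static_cast<std::int32_t>(x) < static_cast<std::int32_t>(y);
}
```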

gcc/
PR tree-optimization/94899
* match.pd (X + C < Y + C -> (signed) X < (signed) Y, if C is
0x80000000): New simplification.
gcc/testsuite/
* gcc.dg/pr94899.c: New test.

ifcvt: Don't introduce trapping or faulting reads in noce_try_sign_mask [PR106032]

noce_try_sign_mask as documented will optimize
  if (c < 0)
    x = t;
  else
    x = 0;
into x = (c >> bitsm1) & t;
The optimization is done if either t is unconditional
(e.g. for
  x = t;
  if (c >= 0)
    x = 0;
) or if it is cheap.  We already check that t doesn't have side-effects,
but if t is conditional, we need to punt also if it may trap or fault,
as we make it unconditional.

I've briefly skimmed other noce_try* optimizations and didn't find one that
would suffer from the same problem.
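
In source terms the transformation and its hazard look roughly like the
sketch below. The function names are hypothetical, and the right shift
of a negative value is assumed to be arithmetic, as it is with GCC and
as C++20 requires.

```cpp
#include <cstdint>

// Branching form: t is only used when c is negative.
inline std::int32_t select_branchy(std::int32_t c, std::int32_t t)
{
  if (c < 0)
    return t;
  return 0;
}

// Sign-mask form: c >> 31 is all ones for negative c and zero otherwise.
inline std::int32_t select_masked(std::int32_t c, std::int32_t t)
{
  return (c >> 31) & t;
}

// Here computing t means reading memory. The mask form would perform
// the read even when c >= 0, so it must not be used if the read may
// fault, for example when p is null.
inline std::int32_t select_from_pointer(std::int32_t c, const std::int32_t* p)
{
  return c < 0 ? *p : 0;
}
```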

2022-06-21  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/106032
* ifcvt.cc (noce_try_sign_mask): Punt if !t_unconditional, and
t may_trap_or_fault_p, even if it is cheap.

* gcc.c-torture/execute/pr106032.c: New test.

expand: Fix up expand_cond_expr_using_cmove [PR106030]

If expand_cond_expr_using_cmove can't find a cmove optab for a particular
mode, it tries to promote the mode and perform the cmove in the promoted
mode.

The testcase in the patch ICEs on arm because in that case we pass temp which
has the promoted mode (SImode) as target to expand_operands where the
operands have the non-promoted mode (QImode).
Later on the function uses paradoxical subregs:
  if (GET_MODE (op1) != mode)
    op1 = gen_lowpart (mode, op1);

  if (GET_MODE (op2) != mode)
    op2 = gen_lowpart (mode, op2);
to change the operand modes.

The following patch fixes it by passing NULL_RTX as target if it has
promoted mode.

2022-06-21  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/106030
* expr.cc (expand_cond_expr_using_cmove): Pass NULL_RTX instead of
temp to expand_operands if mode has been promoted.

* gcc.c-torture/compile/pr106030.c: New test.

if-to-switch: Don't skip the first condition bb when find_conditions in if-to-switch [PR105740]

The if condition is the last statement of the first bb, so a side-effect
statement earlier in that bb doesn't matter, and the first if condition can
also be folded into the switch table.
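
A hypothetical function of the shape concerned (not the
if-to-switch-11.c testcase): the first block contains a side effect
followed by the first comparison of the chain. Whether the pass actually
converts it depends on the options and heuristics in use.

```cpp
// `classify` and `calls` are illustrative names.
inline int classify(int x, int* calls)
{
  ++*calls;  // side effect that precedes the first condition
  if (x == 1)
    return 10;
  else if (x == 2)
    return 20;
  else if (x == 3)
    return 30;
  else if (x == 4)
    return 40;
  return 0;
}
```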

gcc/ChangeLog:

PR target/105740
* gimple-if-to-switch.cc (find_conditions): Don't skip the first
condition bb.

gcc/testsuite/ChangeLog:

PR target/105740
* gcc.dg/tree-ssa/if-to-switch-11.c: New test.

Signed-off-by: Xionghu Luo <xionghuluo@tencent.com>

tree-object-size: Don't let error_mark_node escape for ADDR_EXPR [PR105736]

The addr_expr computation does not check for error_mark_node before
returning the size expression. This used to work in the constant case
because the conversion to uhwi would end up causing it to return
size_unknown, but that won't work for the dynamic case.

Modify the control flow to explicitly return size_unknown if the offset
computation returns an error_mark_node.

gcc/ChangeLog:

PR tree-optimization/105736
* tree-object-size.cc (addr_object_size): Return size_unknown
when object offset computation returns an error.

gcc/testsuite/ChangeLog:

PR tree-optimization/105736
* gcc.dg/builtin-dynamic-object-size-0.c (TV4): New struct.
(val3): New variable.
(test_pr105736): New test.
(main): Call it.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

Daily bump.

testsuite, asan: Avoid color in asan test output.

The presence of the color markers in some of the asan tests
appears to confuse the dg-output matching (possibly a platform
TCL or termios bug) on some Darwin platforms.

Since the color is not being tested, switch it off (makes the log
files easier to read too). This fixes a large number of spurious
test fails on AVX512 Darwin19.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:

* lib/asan-dg.exp: Do not apply color to asan output when
under test.

i386: Disallow sibcall for calling ifunc functions with PIC register

Disallow sibcall when calling ifunc functions with the PIC register so
that the PIC register can be restored.

gcc/

PR target/105960
* config/i386/i386.cc (ix86_function_ok_for_sibcall): Return
false if PIC register is used when calling ifunc functions.

gcc/testsuite/

PR target/105960
* gcc.target/i386/pr105960.c: New test.

testsuite, Darwin: Skip an unsupported test.

Darwin does not support patchable function entries, skip the test
there.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr105169_a.C: Skip the test on Darwin.
* g++.dg/modules/pr105169_b.C: Likewise.

testsuite, Darwin: Allow for two CTOR bodies in array61 test.

For targets without alias support, we emit two essentially identical function
bodies into the gimple (complete and base CTORs). So this test needs to allow
for that when the target does not support aliases. The target support alias
test does not seem to be usable in the context of a single scan-tree-dump so
the fix here uses the target designation.

Note that the array has 10 elements, so that if the test were failing (because
we were emitting 10 inits instead of a loop) the count would be expected to
exceed 2 on Darwin, and 1 where there's alias support.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:

* g++.dg/init/array61.C: Allow for two CTOR bodies on Darwin, where
aliases are not currently supported.

arm: more testsuite fallout for mve move-immediate changes

Unfortunately, there is more fall-out in the testsuite for my changes
to use MVE move-immediate operations instead of literal pool loads.
Fixed as follows:

gcc/testsuite/ChangeLog:
* gcc.target/arm/simd/mve-vcmp-f32-2.c: Adjust expected output.
* gcc.target/arm/simd/pr100757.c: Likewise.
* gcc.target/arm/simd/pr100757-2.c: Likewise.
* gcc.target/arm/simd/pr100757-3.c: Likewise.
* gcc.target/arm/simd/pr100757-4.c: Likewise.

testsuite: Add a missing USER_LABEL_PREFIX to a regex.

Fixes this test on Darwin.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:

* g++.dg/modules/init-2_b.C: Add a missing USER_LABEL_PREFIX
to a regex.

testsuite: Require init_priority target support in a test.

The attr-cdtor-1 test fails on targets without init priority since the
diagnostic emitted concerns the absence of support. Disable the test
on such targets.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:

* c-c++-common/attr-cdtor-1.c: Require init_priority support.

middle-end/106027 - fix types in needle folding

The fold_to_nonsharp_ineq_using_bound folding ends up creating invalidly
typed IL which confuses later foldings. The following fixes that.

2022-06-20 Richard Biener <rguenther@suse.de>

PR middle-end/106027
* fold-const.cc (fold_to_nonsharp_ineq_using_bound): Use the
type of the prevailing comparison for the new comparison type.
(fold_binary_loc): Use proper types for the A < X && A + 1 > Y
to A < X && A >= Y folding.

* gcc.dg/pr106027.c: New testcase.
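
For plain ints, the A < X && A + 1 > Y to A < X && A >= Y folding named
in the ChangeLog above can be checked with the sketch below (function
names are hypothetical). The bound A < X guarantees that A + 1 does not
overflow, which is what makes the rewrite safe.

```cpp
// The form as written in source. a + 1 is only evaluated when a < x,
// so it cannot overflow.
inline bool original_form(int a, int x, int y)
{
  return a < x && a + 1 > y;
}

// The folded form, with both comparisons on a itself.
inline bool folded_form(int a, int x, int y)
{
  return a < x && a >= y;
}

// Exhaustively compares the two forms for all a, x, y in [lo, hi).
inline bool forms_agree(int lo, int hi)
{
  for (int a = lo; a < hi; ++a)
    for (int x = lo; x < hi; ++x)
      for (int y = lo; y < hi; ++y)
        if (original_form(a, x, y) != folded_form(a, x, y))
          return false;
  return true;
}
```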

vect: Respect slp decision when applying suggested uf [PR105940]

This follows Richi's suggestion in PR105940.  It aims to avoid an
inconsistent slp decision between when the suggested unroll
factor is worked out and when the suggested unroll factor is
applied.

If the previous slp decision is true when the suggested unroll
factor is worked out, when we are applying unroll factor we
don't need to start over with slp off if the analysis with slp
on fails.  On the other hand, if the previous slp decision is
false when the suggested unroll factor is worked out, when we
are applying unroll factor we can skip the slp handlings.

Function vect_is_simple_reduction saves reduction chains for
subsequent slp analyses.  We have to disable this early, otherwise
there is an ICE in vectorizable_reduction at the assertion below:

  if (REDUC_GROUP_FIRST_ELEMENT (stmt_info))
    gcc_assert (slp_node
                && REDUC_GROUP_FIRST_ELEMENT (stmt_info)
                   == stmt_info);

PR tree-optimization/105940

gcc/ChangeLog:

* tree-vect-loop.cc (vect_analyze_loop_2): Add new parameter
slp_done_for_suggested_uf and adjust with it accordingly.
(vect_analyze_loop_1): Add new variable slp_done_for_suggested_uf,
pass it down to vect_analyze_loop_2 for the initial analysis and
applying suggested unroll factor.
(vect_is_simple_reduction): Add parameter slp and adjust with it.
(vect_analyze_scalar_cycles_1): Add parameter slp and pass down.
(vect_analyze_scalar_cycles): Likewise.

lto-plugin: support LDPT_GET_SYMBOLS_V3

That supports skipping of an object file (LDPS_NO_SYMS).

lto-plugin/ChangeLog:

* lto-plugin.c (struct plugin_file_info): Add skip_file flag.
(write_resolution): Write resolution only if get_symbols != LDPS_NO_SYMS.
(all_symbols_read_handler): Ignore file if skip_file is true.
(onload): Handle LDPT_GET_SYMBOLS_V3.

Add operators / and * for profile_{count,probability}.

gcc/ChangeLog:

* bb-reorder.cc (find_traces_1_round): Add operators / and * and
use them.
(better_edge_p): Likewise.
* cfgloop.cc (find_subloop_latch_edge_by_profile): Likewise.
* cfgloopmanip.cc (scale_loop_profile): Likewise.
* cfgrtl.cc (force_nonfallthru_and_redirect): Likewise.
* cgraph.cc (cgraph_edge::maybe_hot_p): Likewise.
* config/sh/sh.cc (expand_cbranchdi4): Likewise.
* dojump.cc (do_compare_rtx_and_jump): Likewise.
* final.cc (compute_alignments): Likewise.
* ipa-cp.cc (update_counts_for_self_gen_clones): Likewise.
(decide_about_value): Likewise.
* ipa-inline-analysis.cc (do_estimate_edge_time): Likewise.
* loop-unroll.cc (unroll_loop_runtime_iterations): Likewise.
* modulo-sched.cc (sms_schedule): Likewise.
* omp-expand.cc (extract_omp_for_update_vars): Likewise.
(expand_omp_ordered_sink): Likewise.
(expand_omp_for_ordered_loops): Likewise.
(expand_omp_for_static_nochunk): Likewise.
* predict.cc (maybe_hot_count_p): Likewise.
(probably_never_executed): Likewise.
(set_even_probabilities): Likewise.
(handle_missing_profiles): Likewise.
(expensive_function_p): Likewise.
* profile-count.h: Likewise.
* profile.cc (compute_branch_probabilities): Likewise.
* stmt.cc (emit_case_dispatch_table): Likewise.
* symtab-thunks.cc (expand_thunk): Likewise.
* tree-ssa-loop-manip.cc (tree_transform_and_unroll_loop): Likewise.
* tree-ssa-sink.cc (select_best_block): Likewise.
* tree-switch-conversion.cc (switch_decision_tree::analyze_switch_statement): Likewise.
(switch_decision_tree::balance_case_nodes): Likewise.
(switch_decision_tree::emit_case_nodes): Likewise.
* tree-vect-loop.cc (scale_profile_for_vect_loop): Likewise.

RISC-V: Fix a bug where the CMO builtins are missing a parameter

We changed the builtin format for the zicbom and zicboz subextensions and modified the test cases.

Differences from the previous version:
1. We modified the FUNCTION_TYPE from RISCV_VOID_FTYPE_SI/DI to RISCV_VOID_FTYPE_VOID_PTR.
2. We added a new RISCV_ATYPE_VOID_PTR in riscv-builtins.cc and a new DEF_RISCV_FTYPE (1, (VOID, VOID_PTR)) in riscv-ftypes.def.
3. We deleted DEF_RISCV_FTYPE (1, (VOID, SI/DI)).
4. We modified the input parameters of the test cases.

Thanks, Simon and Kito.

gcc/ChangeLog:

* config/riscv/riscv-builtins.cc (RISCV_ATYPE_VOID_PTR): New.
* config/riscv/riscv-cmo.def (RISCV_BUILTIN): Changed the FUNCTION_TYPE
of RISCV_BUILTIN.
* config/riscv/riscv-ftypes.def (0): Remove unused.
(1): New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cmo-zicbom-1.c: Modify the input parameters.
* gcc.target/riscv/cmo-zicbom-2.c: Modify the input parameters.
* gcc.target/riscv/cmo-zicboz-1.c: Modify the input parameters.
* gcc.target/riscv/cmo-zicboz-2.c: Modify the input parameters.

Daily bump.

xtensa: Fix RTL insn cost estimation about relaxed MOVI instructions

These instructions will all be converted to L32R ones with litpool entries
by the assembler.

gcc/ChangeLog:

* config/xtensa/xtensa.cc (xtensa_is_insn_L32R_p):
Consider relaxed MOVI instructions as L32R.

xtensa: Apply a few minor fixes

No functional changes.

gcc/ChangeLog:

* config/xtensa/xtensa.cc (xtensa_emit_move_sequence):
Use can_create_pseudo_p(), instead of using individual
reload_in_progress and reload_completed.
(xtensa_expand_block_set_small_loop): Use xtensa_simm8x256(),
the existing predicate function.
(xtensa_is_insn_L32R_p, gen_int_relational, xtensa_emit_sibcall):
Use the standard RTX code predicate macros such as MEM_P,
SYMBOL_REF_P and/or CONST_INT_P.
* config/xtensa/xtensa.md: Avoid using numeric literals to determine
if callee-saved register, at the split patterns for indirect sibcall
fixups.

Daily bump.

Fortran: check POS and LEN arguments simplifying bit intrinsics [PR105986]

gcc/fortran/ChangeLog:

PR fortran/105986
* simplify.cc (gfc_simplify_btest): Add check for POS argument.
(gfc_simplify_ibclr): Add check for POS argument.
(gfc_simplify_ibits): Add check for POS and LEN arguments.
(gfc_simplify_ibset): Add check for POS argument.

gcc/testsuite/ChangeLog:

PR fortran/105986
* gfortran.dg/check_bits_3.f90: New test.
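
The entry above does not spell out the checks themselves.  As a rough
sketch, assuming they follow the Fortran standard's constraints (POS
nonnegative and less than BIT_SIZE (I) for BTEST, IBCLR and IBSET; POS and
LEN nonnegative with POS + LEN <= BIT_SIZE (I) for IBITS), the range tests
could look like this in C.  Both helpers are hypothetical and are not the
gfortran implementation.

```c
#include <stdbool.h>

/* Range test assumed for BTEST, IBCLR and IBSET.  */
static bool
valid_pos (int pos, int bit_size)
{
  return pos >= 0 && pos < bit_size;
}

/* Range test assumed for IBITS.  Written with a subtraction so that
   pos + len is never computed and cannot overflow.  */
static bool
valid_pos_len (int pos, int len, int bit_size)
{
  return pos >= 0 && len >= 0 && pos <= bit_size && len <= bit_size - pos;
}
```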

ubsan: Add -fsanitize-trap= support

On Thu, Jun 16, 2022 at 09:32:02PM +0100, Jonathan Wakely wrote:
> It looks like clang has addressed this deficiency now:
>
> https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#usage

Thanks, that is roughly what I'd implement anyway and apparently they have
it already since 2015, we've added the -fsanitize-undefined-trap-on-error
support back in 2014 and didn't change it since then.

As a small divergence from clang, I chose -fsanitize-undefined-trap-on-error
to be a (deprecated) alias for -fsanitize-trap aka -fsanitize-trap=all
rather than -fsanitize-trap=undefined which seems to be what clang does,
because for a deprecated option backwards compatibility with what gcc
did over the past 8 years is IMHO more important than clang
compatibility.
Some sanitizers (e.g. asan, lsan, tsan) don't support traps,
-fsanitize-trap=address etc. will be rejected (if enabled at the end of
command line), -fno-sanitize-trap= can be specified even for them.
This is similar behavior to -fsanitize-recover=.
One complication is vptr sanitization, which can't easily trap,
as the whole slow path of the checking is inside of libubsan.
Previously, -fsanitize=vptr -fsanitize-undefined-trap-on-error
silently ignored vptr sanitization.
This patch similarly to what clang does will accept
-fsanitize-trap=all or -fsanitize-trap=undefined which enable
the vptr bit as trapping and again that causes silent disabling
of vptr sanitization, while -fsanitize-trap=vptr is rejected
(already during option processing).

2022-06-18  Jakub Jelinek  <jakub@redhat.com>

gcc/
* common.opt (flag_sanitize_trap): New variable.
(fsanitize-trap=, fsanitize-trap): New options.
(fsanitize-undefined-trap-on-error): Change into deprecated alias
for -fsanitize-trap=all.
* opts.h (struct sanitizer_opts_s): Add can_trap member.
* opts.cc (finish_options): Complain about unsupported
-fsanitize-trap= options.
(sanitizer_opts): Add can_trap values to all entries.
(get_closest_sanitizer_option): Ignore -fsanitize-trap=
options which have can_trap false.
(parse_sanitizer_options): Add support for -fsanitize-trap=.
For -fsanitize-trap=all, enable
SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT.  Disallow
-fsanitize-trap=vptr here.
(common_handle_option): Handle OPT_fsanitize_trap_ and
OPT_fsanitize_trap.
* sanopt.cc (maybe_optimize_ubsan_null_ifn): Check
flag_sanitize_trap & SANITIZE_{NULL,ALIGNMENT} instead of
flag_sanitize_undefined_trap_on_error.
* gcc.cc (sanitize_spec_function): Use
flag_sanitize & ~flag_sanitize_trap instead of flag_sanitize
and drop use of flag_sanitize_undefined_trap_on_error in
"undefined" handling.
* ubsan.cc (ubsan_instrument_unreachable): Use
flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.
(ubsan_expand_bounds_ifn, ubsan_expand_null_ifn,
ubsan_expand_objsize_ifn, ubsan_expand_ptr_ifn,
ubsan_build_overflow_builtin, instrument_bool_enum_load,
ubsan_instrument_float_cast, instrument_nonnull_arg,
instrument_nonnull_return, instrument_builtin): Likewise.
* doc/invoke.texi (-fsanitize-trap=, -fsanitize-trap): Document.
(-fsanitize-undefined-trap-on-error): Document as deprecated
alias of -fsanitize-trap.
gcc/c-family/
* c-ubsan.cc (ubsan_instrument_division, ubsan_instrument_shift):
Use flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.  If 2 sanitizers are involved
and flag_sanitize_trap differs for them, emit __builtin_trap only
for the comparison where trap is requested.
(ubsan_instrument_vla, ubsan_instrument_return): Use
flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.
gcc/cp/
* cp-ubsan.cc (cp_ubsan_instrument_vptr_p): Use
flag_sanitize_trap & SANITIZE_VPTR instead of
flag_sanitize_undefined_trap_on_error.
gcc/testsuite/
* c-c++-common/ubsan/nonnull-4.c: Use -fsanitize-trap=all
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/div-by-zero-4.c: Use
-fsanitize-trap=signed-integer-overflow instead of
-fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/overflow-add-4.c: Use -fsanitize-trap=undefined
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/pr56956.c: Likewise.
* c-c++-common/ubsan/pr68142.c: Likewise.
* c-c++-common/ubsan/pr80932.c: Use
-fno-sanitize-trap=all -fsanitize-trap=shift,undefined
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/align-8.c: Use -fsanitize-trap=alignment
instead of -fsanitize-undefined-trap-on-error.
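
The ChangeLog above replaces a single on/off flag with a per-sanitizer bit
mask.  The following C sketch illustrates that kind of mask test only; the
bit values and helper names are hypothetical and do not match GCC's real
SANITIZE_* definitions.

```c
#include <stdbool.h>

/* Hypothetical bit assignments, for illustration only.  */
enum
{
  SAN_SHIFT = 1 << 0,
  SAN_DIVIDE = 1 << 1,
  SAN_NULL = 1 << 2,
  SAN_ALIGNMENT = 1 << 3
};

/* True if a failed check of kind SAN should trap instead of calling
   into the runtime library.  */
static bool
check_traps_p (unsigned sanitize, unsigned sanitize_trap, unsigned san)
{
  return (sanitize & san) != 0 && (sanitize_trap & san) != 0;
}

/* Sanitizers that are enabled but not trapping, in the spirit of the
   "flag_sanitize & ~flag_sanitize_trap" test in the entry above.  */
static unsigned
needs_runtime (unsigned sanitize, unsigned sanitize_trap)
{
  return sanitize & ~sanitize_trap;
}
```

With shift and divide checks enabled and only shift trapping,
needs_runtime returns the divide bit alone, so only that check would still
depend on the runtime library.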

varasm: Fix up ICE in narrowing_initializer_constant_valid_p [PR105998]

The following testcase ICEs because there is NON_LVALUE_EXPR (location
wrapper) around a VAR_DECL and has TYPE_MODE V2SImode and
SCALAR_INT_TYPE_MODE on that ICEs. Or for -m32 -march=i386 TYPE_MODE
is DImode, but SCALAR_INT_TYPE_MODE still uses the raw V2SImode and ICEs
too.

2022-06-18 Jakub Jelinek <jakub@redhat.com>

PR middle-end/105998
* varasm.cc (narrowing_initializer_constant_valid_p): Check
SCALAR_INT_MODE_P instead of INTEGRAL_MODE_P, also break on
! INTEGRAL_TYPE_P and do the same check also on op{0,1}'s type.

* c-c++-common/pr105998.c: New test.

PR tree-optimization/105835: Two narrowing patterns for match.pd.

This patch resolves PR tree-optimization/105835, which is a code quality
(dead code elimination) regression at -O1 triggered/exposed by a recent
change to canonicalize X&-Y as X*Y.  The new (shorter) form exposes some
missed optimization opportunities that can be handled by adding some
extra simplifications to match.pd.

One transformation is to simplify "(short)(x ? 65535 : 0)" into the
equivalent "x ? -1 : 0", or more accurately "x ? (short)-1 : (short)0",
as INTEGER_CSTs record their type, and integer conversions can be
pushed inside COND_EXPRs reducing the number of gimple statements.

The other transformation simplifies (short)(X * 65535), where X is [0,1],
into the equivalent (short)X * -1, (or again (short)-1 where tree's
INTEGER_CSTs encode their type).  This is valid because multiplications
where one operand is [0,1] are guaranteed not to overflow, and hence
integer conversions can also be pushed inside these multiplications.

These narrowing conversion optimizations can be identified by range
analyses, such as EVRP, but these are only performed at -O2 and above,
which is why this regression is only visible with -O1.

2022-06-18  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
PR tree-optimization/105835
* match.pd (convert (mult zero_one_valued_p@1 INTEGER_CST@2)):
Narrow integer multiplication by a zero_one_valued_p operand.
(convert (cond @1 INTEGER_CST@2 INTEGER_CST@3)): Push integer
conversions inside COND_EXPR where both data operands are
integer constants.

gcc/testsuite/ChangeLog
PR tree-optimization/105835
* gcc.dg/pr105835.c: New test case.
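
Both narrowing identities described above can be checked at the source
level.  This is only a sketch: int16_t stands in for short, the function
names are hypothetical, and the match.pd patterns themselves operate on
GIMPLE rather than on C source.

```c
#include <stdint.h>

/* Converting 65535 to a 16-bit signed type is implementation-defined in
   ISO C; GCC documents it as reduction modulo 2^16, which gives -1.  */
static int16_t
cond_before (int x)
{
  return (int16_t) (x ? 65535 : 0);
}

static int16_t
cond_after (int x)
{
  return x ? (int16_t) -1 : (int16_t) 0;
}

/* X01 is assumed to be 0 or 1, so X01 * 65535 cannot overflow int.  */
static int16_t
mult_before (int x01)
{
  return (int16_t) (x01 * 65535);
}

static int16_t
mult_after (int x01)
{
  return (int16_t) ((int16_t) x01 * -1);
}
```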

xtensa: Defer storing integer constants into litpool until reload

Storing integer constants into litpool in the early stage of compilation
hinders some integer optimizations.  In fact, such integer constants are
not subject to the constant folding process.

For example:

    extern unsigned short value;
    extern void foo(void);
    void test(void) {
      if (value == 30001)
        foo();
    }

        .literal_position
        .literal .LC0, value
        .literal .LC1, 30001
    test:
        l32r a3, .LC0
        l32r a2, .LC1
        l16ui a3, a3, 0
        extui a2, a2, 0, 16  // runtime zero-extension despite constant
        bne a3, a2, .L1
        j.l foo, a9
    .L1:
        ret.n

This patch defers the placement of integer constants into litpool until
the start of reload:

        .literal_position
        .literal .LC0, value
        .literal .LC1, 30001
    test:
        l32r a3, .LC0
        l32r a2, .LC1
        l16ui a3, a3, 0
        bne a3, a2, .L1
        j.l foo, a9
    .L1:
        ret.n

gcc/ChangeLog:

* config/xtensa/constraints.md (Y):
Change to include integer constants until reload begins.
* config/xtensa/predicates.md (move_operand): Ditto.
* config/xtensa/xtensa.cc (xtensa_emit_move_sequence):
Change to allow storing integer constants into litpool only after
reload begins.

Daily bump.

libgo: permit loff_t and off_t to be macros

They are macros in musl libc, rather than typedefs, and -fgo-dump-spec
doesn't handle that case.

Based on patch by Sören Tempel.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/412075

c++: Use fold_non_dependent_expr rather than maybe_constant_value in __builtin_shufflevector handling [PR106001]

In this case the STATIC_CAST_EXPR expressions in the call aren't
type nor value dependent, but maybe_constant_value still ICEs on those
when processing_template_decl. Calling fold_non_dependent_expr on it
instead fixes the ICE and folds them to INTEGER_CSTs.

2022-06-17 Jakub Jelinek <jakub@redhat.com>

PR c++/106001
* typeck.cc (build_x_shufflevector): Use fold_non_dependent_expr
instead of maybe_constant_value.

* g++.dg/ext/builtin-shufflevector-4.C: New test.

alpha: Introduce target specific store_data_bypass_p function [PR105209]

This patch introduces an alpha-specific version of store_data_bypass_p that
ignores TRAP_IF, which would otherwise result in an assertion failure (and
an internal compiler error) in the generic store_data_bypass_p function.

While at it, also remove ev4_ist_c reservation, store_data_bypass_p
can handle the patterns with multiple sets since some time ago.

2022-06-17 Uroš Bizjak <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/105209
* config/alpha/alpha-protos.h (alpha_store_data_bypass_p): New.
* config/alpha/alpha.cc (alpha_store_data_bypass_p): New function.
(alpha_store_data_bypass_p_1): Ditto.
* config/alpha/ev4.md: Use alpha_store_data_bypass_p instead
of generic store_data_bypass_p.
(ev4_ist_c): Remove insn reservation.

gcc/testsuite/ChangeLog:

PR target/105209
* gcc.target/alpha/pr105209.c: New test.

i386: Fix assert in ix86_function_arg [PR105970]

The mode of pointer argument should equal ptr_mode, not Pmode.

2022-06-17 Uroš Bizjak <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/105970
* config/i386/i386.cc (ix86_function_arg): Assert that
the mode of pointer argument is equal to ptr_mode, not Pmode.

gcc/testsuite/ChangeLog:

PR target/105970
* gcc.target/i386/pr105970.c: New test.

i386: Fix VPMOV splitter [PR105993]

REGNO should not be used with register_operand before reload because
subregs of registers or even subregs of memory match the predicate.
The build with RTL checking enabled does not tolerate REGNO with
non-reg operand.
The patch splits the splitter into two related splitters and uses
(match_dup ...) RTXes instead of REGNO comparisons.

2022-06-17 Uroš Bizjak <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/105993
* config/i386/sse.md (vpmov splitter): Use (match_dup ...)
instead of REGNO comparisons in combine splitter.

gcc/testsuite/ChangeLog:

PR target/105993
* gcc.target/i386/pr105993.c: New test.

rs6000: Fix some error messages for invalid conversions

"* something" isn't a type. "something *" is.

2022-06-17 Segher Boessenkool <segher@kernel.crashing.org>

* config/rs6000/rs6000.cc (rs6000_invalid_conversion): Correct some
types.

RISC-V: Suppress warning for comparison of integer expressions of different signedness

gcc/ChangeLog:

* config/riscv/bitmanip.md: Suppress warning.

arm: fix checking ICE in arm_print_operand [PR106004]

Sigh, another instance where I incorrectly used XUINT instead of
UINTVAL.

I've also made the code here a little more robust (although I think
this case can't in fact be reached) if the 32-bit clear mask includes
bit 31. This case, if reached, would print out an out-of-range value
based on the size of the compiler's HOST_WIDE_INT type due to
sign-extension. We avoid this by masking the value after inversion.

gcc/ChangeLog:
PR target/106004
* config/arm/arm.cc (arm_print_operand, case 'V'): Use UINTVAL.
Clear bits in the mask above bit 31.
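
The effect of masking after inversion can be seen with ordinary 64-bit
arithmetic.  In this sketch uint64_t stands in for the host's wide integer
type and the function names are hypothetical; it shows only why the upper
bits need clearing, not the arm.cc code.

```c
#include <stdint.h>

/* Invert a 32-bit clear mask held in a 64-bit integer.  Every bit above
   bit 31 ends up set, so the value is out of range for a 32-bit operand.  */
static uint64_t
invert_unmasked (uint64_t mask)
{
  return ~mask;
}

/* Same inversion, followed by clearing the bits above bit 31.  */
static uint64_t
invert_masked (uint64_t mask)
{
  return ~mask & UINT64_C (0xffffffff);
}
```

For a mask with bit 31 set, such as 0x80000000, the unmasked result is
0xffffffff7fffffff while the masked result is 0x7fffffff.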

libstdc++: Add missing #include <string> to new test

Somehow I pushed a different version of this test to the one I actually
tested.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/cons/char/105995.cc: Add
missing #include.

docs: add missing table header

libgomp/ChangeLog:

* libgomp.texi: Add table header for new features of
OpenMP 5.2.

arm: mve: Don't force trivial vector literals to the pool

A bug in the ordering of the operands in the mve_mov<mode> pattern
meant that all literal values were being pushed to the literal pool.
This patch fixes that and simplifies some of the logic slightly so
that we can use a simple switch statement.

For example:
void f (uint32_t *a)
{
  int i;
  for (i = 0; i < 100; i++)
    a[i] += 1;
}

Now compiles to:
        push    {lr}
        mov     lr, #25
        vmov.i32        q2, #0x1  @ v4si
        ...

instead of

        push    {lr}
        mov     lr, #25
        vldr.64 d4, .L6
        vldr.64 d5, .L6+8
...
.L7:
        .align  3
.L6:
        .word   1
        .word   1
        .word   1
        .word   1

gcc/ChangeLog:
* config/arm/mve.md (*mve_mov<mode>): Re-order constraints
to avoid spilling trivial literals to the constant pool.

gcc/testsuite/ChangeLog:
* gcc.target/arm/acle/cde-mve-full-assembly.c: Adjust expected
output.

Daily bump.

gimple-ssa-warn-access.cc: add missing auto_diagnostic_group

Whilst working on SARIF output I noticed some places where followup notes
weren't being properly associated with their warnings in
gcc/gimple-ssa-warn-access.cc.

Fixed thusly.

gcc/ChangeLog:
* gimple-ssa-warn-access.cc (warn_string_no_nul): Add
auto_diagnostic_group to group any warning with its note.
(maybe_warn_for_bound): Likewise.
(check_access): Likewise.
(warn_dealloc_offset): Likewise.
(pass_waccess::maybe_warn_memmodel): Likewise.
(pass_waccess::maybe_check_dealloc_call): Likewise.
(pass_waccess::warn_invalid_pointer): Likewise.
(pass_waccess::check_dangling_stores): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

c-decl: fix "inform" grouping and conditionalization

Whilst working on SARIF output I noticed some places where followup notes
weren't being properly associated with their errors/warnings in c-decl.cc.

Whilst fixing those I noticed some places where we "inform" after a
"warning" without checking that the warning was actually emitted.

Fixed the various issues seen in gcc/c/c-decl.cc thusly.

gcc/c/ChangeLog:
* c-decl.cc (implicitly_declare): Add auto_diagnostic_group to
group the warning with any note.
(warn_about_goto): Likewise to group error or warning with note.
Bail out if the warning wasn't emitted, to avoid emitting orphan
notes.
(lookup_label_for_goto): Add auto_diagnostic_group to
group the error with the note.
(check_earlier_gotos): Likewise.
(c_check_switch_jump_warnings): Likewise for any error/warning.
Conditionalize emission of the notes.
(diagnose_uninitialized_cst_member): Likewise for warning,
conditionalizing emission of the note.
(grokdeclarator): Add auto_diagnostic_group to group the "array
type has incomplete element type" error with any note.
(parser_xref_tag): Add auto_diagnostic_group to group warnings
with their notes. Conditionalize emission of notes.
(start_struct): Add auto_diagnostic_group to group the
"redefinition of" errors with any note.
(start_enum): Likewise for "redeclaration of %<enum %E%>" error.
(check_for_loop_decls): Likewise for pre-C99 error.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: associate -Wanalyzer-va-arg-type-mismatch with CWE-686

gcc/analyzer/ChangeLog:
* varargs.cc (va_arg_type_mismatch::emit): Associate the warning
with CWE-686 ("Function Call With Incorrect Argument Type").

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/stdarg-1.c
(__analyzer_called_by_test_type_mismatch_1): Verify that
-Wanalyzer-va-arg-type-mismatch is associated with CWE-686.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: associate -Wanalyzer-va-list-exhausted with CWE-685

gcc/analyzer/ChangeLog:
* varargs.cc: Include "diagnostic-metadata.h".
(va_list_exhausted::emit): Associate the warning with
CWE-685 ("Function Call With Incorrect Number of Arguments").

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/stdarg-1.c
(__analyzer_called_by_test_not_enough_args): Verify that
-Wanalyzer-va-list-exhausted is associated with CWE-685.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: associate -Wanalyzer-double-fclose with CWE-1341

gcc/analyzer/ChangeLog:
* sm-file.cc (double_fclose::emit): Associate the warning with
CWE-1341 ("Multiple Releases of Same Resource or Handle").

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/file-1.c (test_1): Verify that double-fclose is
associated with CWE-1341.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

opts: fix opts_set->x_flag_sanitize

While working on PR104642 I noticed this wasn't getting set.

gcc/ChangeLog:

* opts.cc (common_handle_option) [OPT_fsanitize_]: Set
opts_set->x_flag_sanitize.

flags: add comment

gcc/ChangeLog:

* flags.h (issue_strict_overflow_warning): Comment #endif.

compiler: don't generate stubs for ambiguous direct interface methods

Current implementation checks whether it has to generate a stub method for a
promoted method of an embedded struct field in Type::build_stub_methods(). If
the promoted method is ambiguous it's simply skipped. But struct types that
can fit in an interface value (e.g. structs that consist of a single pointer
field) get a second chance in Type::build_direct_iface_stub_methods().

This patch adds the same check used by Type::build_stub_methods() to
Type::build_direct_iface_stub_methods().

Fixes golang/go#52870

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/405974

libstdc++: Support constexpr global std::string for size < 15 [PR105995]

I don't think this is required by the standard, but it's easy to
support.

libstdc++-v3/ChangeLog:

PR libstdc++/105995
* include/bits/basic_string.h (_M_use_local_data): Initialize
the entire SSO buffer.
* testsuite/21_strings/basic_string/cons/char/105995.cc: New test.

libstdc++: Apply r13-1096-g6abe341558abec change to vstring too [PR101482]

As recently done for std::basic_string, __gnu_cxx::__versa_string
equality comparisons can check lengths first for any character type and
traits type, not only for std::char_traits<char>.

libstdc++-v3/ChangeLog:

PR libstdc++/101482
* include/ext/vstring.h (operator==): Always check lengths
before comparing.

c++: Elide inactive initializer fns from init array

There's no point adding no-op initializer fns (that a module might
have) to the static initializer list. Also, we can add any objc
initializer call to a partial initializer function and simplify some
control flow.

gcc/cp/
* decl2.cc (finish_objects): Add startp parameter, adjust.
(generate_ctor_or_dtor_function): Detect empty fn, and don't
generate unnecessary code. Remove objc startup here ...
(c_parse_final_cleanups): ... do it here.

gcc/testsuite/
* g++.dg/modules/init-2_b.C: Add init check.
* g++.dg/modules/init-2_c.C: Add init check.

Clear invariant bit for inferred ranges.

The range of an invariant SSA (no outgoing edge range anywhere) is not tracked.
If an inferred range is registered, remove the invariant flag.

* gimple-range-cache.cc (ranger_cache::apply_inferred_ranges): If name
was invariant before, clear the invariant bit.
* gimple-range-gori.cc (gori_map::set_range_invariant): Add a flag.
* gimple-range-gori.h (gori_map::set_range_invariant): Adjust prototype.

Propagator should call value_of_stmt.

When evaluating the LHS of a stmt, it's more efficient/better to call
value_of_stmt directly rather than value_of_expr.

* tree-ssa-propagate.cc (before_dom_children): Call value_of_stmt.

match.pd: Improve y == MIN || x < y optimization [PR105983]

On the following testcase, we only optimize bar where this optimization
is performed at GENERIC folding time, but on GIMPLE it doesn't trigger
anymore, as we actually don't see
  (bit_and (ne @1 min_value) (ge @0 @1))
but
  (bit_and (ne @1 min_value) (le @1 @0))
genmatch handles :c modifier not just on commutative operations, but
also comparisons and in that case it means it swaps the comparison.

2022-06-16  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/105983
* match.pd (y == XXX_MIN || x < y -> x <= y - 1,
y != XXX_MIN && x >= y -> x > y - 1): Use :cs instead of :s
on non-equality comparisons.

* gcc.dg/tree-ssa/pr105983.c: New test.
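
The two rewrites named in the ChangeLog entry can be checked exhaustively
on a small type.  This sketch assumes a wrapping (unsigned) type, where
XXX_MIN is 0 and y - 1 wraps to the maximum value; the function name is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Check  y == MIN || x < y   ->  x <= y - 1
   and    y != MIN && x >= y  ->  x >  y - 1
   for every pair of 8-bit unsigned values.  */
static void
check_min_patterns (void)
{
  for (unsigned xi = 0; xi <= UINT8_MAX; xi++)
    for (unsigned yi = 0; yi <= UINT8_MAX; yi++)
      {
        uint8_t x = (uint8_t) xi, y = (uint8_t) yi;
        uint8_t ym1 = (uint8_t) (y - 1);   /* wraps to 255 when y == 0 */
        assert ((y == 0 || x < y) == (x <= ym1));
        assert ((y != 0 && x >= y) == (x > ym1));
      }
}
```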

match.pd: Fix up __builtin_mul_overflow_p signed type optimization [PR105984]

Earlier in the simplification pattern, we require that @0 has compatible
type to the type of IMAGPART_EXPR, but for @1 which is a non-zero constant
all we require is that the constant fits into that type.
Later the code checks if the constant is negative, because when min / max
values are divided by negative divisor, lo will be higher than hi.
In the following testcase, @1 has unsigned char type, while @0 has
int type, so @1 which is 254 is wi::neg_p and we were swapping lo and hi,
even when @1 cast to int isn't negative.

We could use tree_int_cst_sgn (@1) < 0 as the check instead and it would
work both for narrower types of @1 and even same or wider ones, but
I've noticed we probably don't want to call fold_convert (TREE_TYPE (@0), @1)
twice and when we save that result in a temporary, we can just use wi::neg_p
on that temporary.

2022-06-16 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/105984
* match.pd (__builtin_mul_overflow_p (x, cst, (stype) 0) ->
x > stype_max / cst || x < stype_min / cst): fold_convert @1
to TREE_TYPE (@0) just once and test for negative divisor
also on that folded constant instead of on @1.

* gcc.c-torture/execute/pr105984.c: New test.
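
The bounds form of the overflow test, including the swap of lo and hi for
a negative divisor, can be sketched in C as follows.  The helper names are
hypothetical, long long is assumed to be wider than int, and the constant
must be neither 0 nor -1.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Reference answer, computed in a wider type.  */
static bool
mul_overflows_ref (int x, int c)
{
  long long p = (long long) x * c;
  return p > INT_MAX || p < INT_MIN;
}

/* Bounds form: x > max / c || x < min / c, with the two bounds swapped
   when C is negative.  C must not be 0 or -1.  */
static bool
mul_overflows_bounds (int x, int c)
{
  int lo = INT_MIN / c, hi = INT_MAX / c;
  if (c < 0)
    {
      int tmp = lo;
      lo = hi;
      hi = tmp;
    }
  return x > hi || x < lo;
}

/* Compare both forms at and around the interesting boundaries.  */
static void
check_bounds (int c)
{
  const int xs[] = { 0, 1, -1, INT_MAX / c, INT_MIN / c, INT_MAX, INT_MIN };
  for (unsigned i = 0; i < sizeof xs / sizeof xs[0]; i++)
    for (int d = -1; d <= 1; d++)
      {
        /* Skip neighbours that would themselves overflow int.  */
        if ((d > 0 && xs[i] == INT_MAX) || (d < 0 && xs[i] == INT_MIN))
          continue;
        int x = xs[i] + d;
        assert (mul_overflows_bounds (x, c) == mul_overflows_ref (x, c));
      }
}
```

The sign test has to be made on the constant after conversion to the type
of x: an unsigned char holding 254 converts to the int value 254, which is
positive, so no swap is wanted, even though its 8-bit pattern has the top
bit set.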

expand: Fix up IFN_ATOMIC_{BIT*,*CMP_0} expansion [PR105951]

Both IFN_ATOMIC_BIT_TEST_AND_* and IFN_ATOMIC_*_FETCH_CMP_0 ifns
are matched if their corresponding optab is implemented for the particular
mode.  The fact that those optabs are implemented doesn't guarantee
they will succeed though, they can just FAIL in their expansion.
The expansion in that case uses expand_atomic_fetch_op as fallback, but
as has been reported and can be reproduced on the testcases,
even those can fail and we didn't have any fallback after that.
For IFN_ATOMIC_BIT_TEST_AND_* we actually have such calls.  One is
done whenever we lost lhs of the ifn at some point in between matching
it in tree-ssa-ccp.cc and expansion.  The following patch for that case
just falls through and expands as if there was a lhs, creates a temporary
for it.  For the other expand_atomic_fetch_op call in the same expander
and for the only expand_atomic_fetch_op call in the other, this falls
back the hard way, by constructing a CALL_EXPR to the call from which
the ifn has been matched and expanding that.  Either it is lucky and manages
to expand inline, or it emits a libatomic API call.
So that we don't have to rediscover which builtin function to call in the
fallback, we record at tree-ssa-ccp.cc time gimple_call_fn (call) in
an extra argument to the ifn.

2022-06-16  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/105951
* tree-ssa-ccp.cc (optimize_atomic_bit_test_and,
optimize_atomic_op_fetch_cmp_0): Remember gimple_call_fn (call)
as last argument to the internal functions.
* builtins.cc (expand_ifn_atomic_bit_test_and): Adjust for the
extra call argument to ifns.  If expand_atomic_fetch_op fails for the
lhs == NULL_TREE case, fall through into the optab code with
gen_reg_rtx (mode) as target.  If second expand_atomic_fetch_op
fails, construct a CALL_EXPR and expand that.
(expand_ifn_atomic_op_fetch_cmp_0): Adjust for the extra call argument
to ifns.  If expand_atomic_fetch_op fails, construct a CALL_EXPR and
expand that.

* gcc.target/i386/pr105951-1.c: New test.
* gcc.target/i386/pr105951-2.c: New test.

rs6000: add V1TI into vector comparison expand [PR103316]

This patch adds V1TI mode into a new mode iterator used in vector
comparison, shift and rotation expands.  It also merges some vector
comparison, shift and rotation expands for V1TI and other vector integer
modes as they have similar patterns.  The expands for V1TI only are
removed.

gcc/
PR target/103316
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Enable
gimple folding for RS6000_BIF_VCMPEQUT, RS6000_BIF_VCMPNET,
RS6000_BIF_CMPGE_1TI, RS6000_BIF_CMPGE_U1TI, RS6000_BIF_VCMPGTUT,
RS6000_BIF_VCMPGTST, RS6000_BIF_CMPLE_1TI, RS6000_BIF_CMPLE_U1TI.
* config/rs6000/vector.md (VEC_IC): New mode iterator.  Add support
for new Power10 V1TI instructions.
(vec_cmp<mode><mode>): Set mode iterator to VEC_IC.
(vec_cmpu<mode><mode>): Likewise.
(vector_nlt<mode>): Set mode iterator to VEC_IC.
(vector_nltv1ti): Remove.
(vector_gtu<mode>): Set mode iterator to VEC_IC.
(vector_gtuv1ti): Remove.
(vector_nltu<mode>): Set mode iterator to VEC_IC.
(vector_nltuv1ti): Remove.
(vector_geu<mode>): Set mode iterator to VEC_IC.
(vector_ngt<mode>): Likewise.
(vector_ngtv1ti): Remove.
(vector_ngtu<mode>): Set mode iterator to VEC_IC.
(vector_ngtuv1ti): Remove.
(vector_gtu_<mode>_p): Set mode iterator to VEC_IC.
(vector_gtu_v1ti_p): Remove.
(vrotl<mode>3): Set mode iterator to VEC_IC.  Emit insns for V1TI.
(vrotlv1ti3): Remove.
(vashr<mode>3): Set mode iterator to VEC_IC.  Emit insns for V1TI.
(vashrv1ti3): Remove.

gcc/testsuite/
PR target/103316
* gcc.target/powerpc/pr103316.c: New.
* gcc.target/powerpc/fold-vec-cmp-int128.c: New.

clang: fix -Wunused-parameter warning

Fixes:
gcc/cp/decl2.cc:158:54: warning: unused parameter 'entry' [-Wunused-parameter]

gcc/cp/ChangeLog:

* decl2.cc (struct priority_map_traits): Remove unused param.

gengtype: do not skip char after escape sequence

Right now, when a \$x escape sequence occurs, the
next character after $x is skipped, which is bogus.

The code has very low coverage right now.

gcc/ChangeLog:

* gengtype-state.cc (read_a_state_token): Do not skip extra
character after escaped sequence.
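
The kind of bug described above can be illustrated with a small unescaping
loop.  This is a hypothetical sketch, not the reader in gengtype-state.cc,
and the set of escapes it handles is an assumption made for the example.

```c
#include <string.h>

/* Copy SRC to DST, turning \n and \t into control characters and any
   other \X into X.  DST must have room for strlen (SRC) + 1 bytes.  */
static void
unescape (const char *src, char *dst)
{
  while (*src)
    {
      char c = *src++;
      if (c == '\\' && *src)
        {
          char e = *src++;   /* consume the escaped character only */
          c = e == 'n' ? '\n' : e == 't' ? '\t' : e;
          /* A further src++ here would silently drop the character that
             follows the escape sequence.  */
        }
      *dst++ = c;
    }
  *dst = '\0';
}
```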

opts: improve option suggestion

In the case where we have 2 equally good candidates like
-ftrivial-auto-var-init=
-Wtrivial-auto-var-init

for -ftrivial-auto-var-init, we should take the candidate that
has a difference in trailing sign symbol.

PR driver/105564

gcc/ChangeLog:

* spellcheck.cc (test_find_closest_string): Add new test.
* spellcheck.h (class best_match): Prefer a difference in
trailing sign symbol.

RISC-V/testsuite: Fix pr105666.c under rv32

In the rv32 regression test, this case will report an error:

"cc1: error: ABI requires '-march=rv32'"

Adding the '-mabi' option fixes this.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr105666.c: New options.

Simplify (B * v + C) * D -> BD * v + CD when B,C,D are all INTEGER_CST.

Similar for (v + B) * C + D -> C * v + BCD.
Don't simplify it when there's overflow and overflow is UB for type v.

gcc/ChangeLog:

PR tree-optimization/53533
* match.pd: Simplify (B * v + C) * D -> BD * v + CD and
(v + B) * C + D -> C * v + BCD when B,C,D are all INTEGER_CST,
and there's no overflow or !TYPE_OVERFLOW_UNDEFINED.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr53533-1.c: New test.
* gcc.target/i386/pr53533-2.c: New test.
* gcc.target/i386/pr53533-3.c: New test.
* gcc.target/i386/pr53533-4.c: New test.
* gcc.target/i386/pr53533-5.c: New test.
* gcc.dg/vect/slp-11a.c: Adjust testcase.
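
Both rewrites are ordinary distributive-law identities.  The sketch below
checks them with unsigned 32-bit arithmetic, where overflow wraps and the
identity always holds; for signed types the entry above restricts the
rewrite to cases without overflow.  The function names are hypothetical,
and BCD is read here as B * C + D.

```c
#include <stdint.h>

/* (B * v + C) * D  */
static uint32_t
form1 (uint32_t v, uint32_t b, uint32_t c, uint32_t d)
{
  return (b * v + c) * d;
}

/* BD * v + CD, with BD and CD foldable when B, C, D are constants.  */
static uint32_t
form1_simplified (uint32_t v, uint32_t b, uint32_t c, uint32_t d)
{
  return (b * d) * v + c * d;
}

/* (v + B) * C + D  */
static uint32_t
form2 (uint32_t v, uint32_t b, uint32_t c, uint32_t d)
{
  return (v + b) * c + d;
}

/* C * v + (B * C + D)  */
static uint32_t
form2_simplified (uint32_t v, uint32_t b, uint32_t c, uint32_t d)
{
  return c * v + (b * c + d);
}
```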