Takayuki 'January June' Suwa [Sun, 19 Jun 2022 19:12:48 +0000 (04:12 +0900)]
xtensa: Apply a few minor fixes
No functional changes.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_emit_move_sequence):
Use can_create_pseudo_p(), instead of using individual
reload_in_progress and reload_completed.
(xtensa_expand_block_set_small_loop): Use xtensa_simm8x256(),
the existing predicate function.
(xtensa_is_insn_L32R_p, gen_int_relational, xtensa_emit_sibcall):
Use the standard RTX code predicate macros such as MEM_P,
SYMBOL_REF_P and/or CONST_INT_P.
* config/xtensa/xtensa.md: Avoid using numeric literals to determine
if callee-saved register, at the split patterns for indirect sibcall
fixups.
GCC Administrator [Sun, 19 Jun 2022 00:16:23 +0000 (00:16 +0000)]
Daily bump.
Harald Anlauf [Wed, 15 Jun 2022 20:20:09 +0000 (22:20 +0200)]
Fortran: check POS and LEN arguments simplifying bit intrinsics [PR105986]
gcc/fortran/ChangeLog:
PR fortran/105986
* simplify.cc (gfc_simplify_btest): Add check for POS argument.
(gfc_simplify_ibclr): Add check for POS argument.
(gfc_simplify_ibits): Add check for POS and LEN arguments.
(gfc_simplify_ibset): Add check for POS argument.
gcc/testsuite/ChangeLog:
PR fortran/105986
* gfortran.dg/check_bits_3.f90: New test.
Jakub Jelinek [Sat, 18 Jun 2022 09:09:48 +0000 (11:09 +0200)]
ubsan: Add -fsanitize-trap= support
On Thu, Jun 16, 2022 at 09:32:02PM +0100, Jonathan Wakely wrote:
> It looks like clang has addressed this deficiency now:
>
> https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#usage
Thanks, that is roughly what I'd implement anyway and apparently they have
it already since 2015, we've added the -fsanitize-undefined-trap-on-error
support back in 2014 and didn't change it since then.
As a small divergence from clang, I chose -fsanitize-undefined-trap-on-error
to be a (deprecated) alias for -fsanitize-trap aka -fsanitize-trap=all
rather thn -fsanitize-trap=undefined which seems to be what clang does,
because for a deprecated option it is IMHO more important backwards
compatibility with what gcc did over the past 8 years rather than clang
compatibility.
Some sanitizers (e.g. asan, lsan, tsan) don't support traps,
-fsanitize-trap=address etc. will be rejected (if enabled at the end of
command line), -fno-sanitize-trap= can be specified even for them.
This is similar behavior to -fsanitize-recover=.
One complication is vptr sanitization, which can't easily trap,
as the whole slow path of the checking is inside of libubsan.
Previously, -fsanitize=vptr -fsanitize-undefined-trap-on-error
silently ignored vptr sanitization.
This patch similarly to what clang does will accept
-fsanitize-trap=all or -fsanitize-trap=undefined which enable
the vptr bit as trapping and again that causes silent disabling
of vptr sanitization, while -fsanitize-trap=vptr is rejected
(already during option processing).
2022-06-18 Jakub Jelinek <jakub@redhat.com>
gcc/
* common.opt (flag_sanitize_trap): New variable.
(fsanitize-trap=, fsanitize-trap): New options.
(fsanitize-undefined-trap-on-error): Change into deprecated alias
for -fsanitize-trap=all.
* opts.h (struct sanitizer_opts_s): Add can_trap member.
* opts.cc (finish_options): Complain about unsupported
-fsanitize-trap= options.
(sanitizer_opts): Add can_trap values to all entries.
(get_closest_sanitizer_option): Ignore -fsanitize-trap=
options which have can_trap false.
(parse_sanitizer_options): Add support for -fsanitize-trap=.
For -fsanitize-trap=all, enable
SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT. Disallow
-fsanitize-trap=vptr here.
(common_handle_option): Handle OPT_fsanitize_trap_ and
OPT_fsanitize_trap.
* sanopt.cc (maybe_optimize_ubsan_null_ifn): Check
flag_sanitize_trap & SANITIZE_{NULL,ALIGNMENT} instead of
flag_sanitize_undefined_trap_on_error.
* gcc.cc (sanitize_spec_function): Use
flag_sanitize & ~flag_sanitize_trap instead of flag_sanitize
and drop use of flag_sanitize_undefined_trap_on_error in
"undefined" handling.
* ubsan.cc (ubsan_instrument_unreachable): Use
flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.
(ubsan_expand_bounds_ifn, ubsan_expand_null_ifn,
ubsan_expand_objsize_ifn, ubsan_expand_ptr_ifn,
ubsan_build_overflow_builtin, instrument_bool_enum_load,
ubsan_instrument_float_cast, instrument_nonnull_arg,
instrument_nonnull_return, instrument_builtin): Likewise.
* doc/invoke.texi (-fsanitize-trap=, -fsanitize-trap): Document.
(-fsanitize-undefined-trap-on-error): Document as deprecated
alias of -fsanitize-trap.
gcc/c-family/
* c-ubsan.cc (ubsan_instrument_division, ubsan_instrument_shift):
Use flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error. If 2 sanitizers are involved
and flag_sanitize_trap differs for them, emit __builtin_trap only
for the comparison where trap is requested.
(ubsan_instrument_vla, ubsan_instrument_return): Use
lag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.
gcc/cp/
* cp-ubsan.cc (cp_ubsan_instrument_vptr_p): Use
flag_sanitize_trap & SANITIZE_VPTR instead of
flag_sanitize_undefined_trap_on_error.
gcc/testsuite/
* c-c++-common/ubsan/nonnull-4.c: Use -fsanitize-trap=all
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/div-by-zero-4.c: Use
-fsanitize-trap=signed-integer-overflow instead of
-fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/overflow-add-4.c: Use -fsanitize-trap=undefined
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/pr56956.c: Likewise.
* c-c++-common/ubsan/pr68142.c: Likewise.
* c-c++-common/ubsan/pr80932.c: Use
-fno-sanitize-trap=all -fsanitize-trap=shift,undefined
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/align-8.c: Use -fsanitize-trap=alignment
instead of -fsanitize-undefined-trap-on-error.
Jakub Jelinek [Sat, 18 Jun 2022 09:07:13 +0000 (11:07 +0200)]
varasm: Fix up ICE in narrowing_initializer_constant_valid_p [PR105998]
The following testcase ICEs because there is NON_LVALUE_EXPR (location
wrapper) around a VAR_DECL and has TYPE_MODE V2SImode and
SCALAR_INT_TYPE_MODE on that ICEs. Or for -m32 -march=i386 TYPE_MODE
is DImode, but SCALAR_INT_TYPE_MODE still uses the raw V2SImode and ICEs
too.
2022-06-18 Jakub Jelinek <jakub@redhat.com>
PR middle-end/105998
* varasm.cc (narrowing_initializer_constant_valid_p): Check
SCALAR_INT_MODE_P instead of INTEGRAL_MODE_P, also break on
! INTEGRAL_TYPE_P and do the same check also on op{0,1}'s type.
* c-c++-common/pr105998.c: New test.
Roger Sayle [Sat, 18 Jun 2022 08:06:20 +0000 (09:06 +0100)]
PR tree-optimization/105835: Two narrowing patterns for match.pd.
This patch resolves PR tree-optimization/105835, which is a code quality
(dead code elimination) regression at -O1 triggered/exposed by a recent
change to canonicalize X&-Y as X*Y. The new (shorter) form exposes some
missed optimization opportunities that can be handled by adding some
extra simplifications to match.pd.
One transformation is to simplify "(short)(x ? 65535 : 0)" into the
equivalent "x ? -1 : 0", or more accurately x ? (short)-1 : (short)0",
as INTEGER_CSTs record their type, and integer conversions can be
pushed inside COND_EXPRs reducing the number of gimple statements.
The other transformation is that (short)(X * 65535), where X is [0,1],
into the equivalent (short)X * -1, (or again (short)-1 where tree's
INTEGER_CSTs encode their type). This is valid because multiplications
where one operand is [0,1] are guaranteed not to overflow, and hence
integer conversions can also be pushed inside these multiplications.
These narrowing conversion optimizations can be identified by range
analyses, such as EVRP, but these are only performed at -O2 and above,
which is why this regression is only visible with -O1.
2022-06-18 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR tree-optimization/105835
* match.pd (convert (mult zero_one_valued_p@1 INTEGER_CST@2)):
Narrow integer multiplication by a zero_one_valued_p operand.
(convert (cond @1 INTEGER_CST@2 INTEGER_CST@3)): Push integer
conversions inside COND_EXPR where both data operands are
integer constants.
gcc/testsuite/ChangeLog
PR tree-optimization/105835
* gcc.dg/pr105835.c: New test case.
Takayuki 'January June' Suwa [Fri, 17 Jun 2022 13:47:49 +0000 (22:47 +0900)]
xtensa: Defer storing integer constants into litpool until reload
Storing integer constants into litpool in the early stage of compilation
hinders some integer optimizations. In fact, such integer constants are
not subject to the constant folding process.
For example:
extern unsigned short value;
extern void foo(void);
void test(void) {
if (value == 30001)
foo();
}
.literal_position
.literal .LC0, value
.literal .LC1, 30001
test:
l32r a3, .LC0
l32r a2, .LC1
l16ui a3, a3, 0
extui a2, a2, 0, 16 // runtime zero-extension despite constant
bne a3, a2, .L1
j.l foo, a9
.L1:
ret.n
This patch defers the placement of integer constants into litpool until
the start of reload:
.literal_position
.literal .LC0, value
.literal .LC1, 30001
test:
l32r a3, .LC0
l32r a2, .LC1
l16ui a3, a3, 0
bne a3, a2, .L1
j.l foo, a9
.L1:
ret.n
gcc/ChangeLog:
* config/xtensa/constraints.md (Y):
Change to include integer constants until reload begins.
* config/xtensa/predicates.md (move_operand): Ditto.
* config/xtensa/xtensa.cc (xtensa_emit_move_sequence):
Change to allow storing integer constants into litpool only after
reload begins.
GCC Administrator [Sat, 18 Jun 2022 00:16:19 +0000 (00:16 +0000)]
Daily bump.
Ian Lance Taylor [Tue, 14 Jun 2022 18:33:42 +0000 (11:33 -0700)]
libgo: permit loff_t and off_t to be macros
They are macros in musl libc, rather than typedefs, and -fgo-dump-spec
doesn't handle that case.
Based on patch by Sören Tempel.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/412075
Jakub Jelinek [Fri, 17 Jun 2022 15:40:49 +0000 (17:40 +0200)]
c++: Use fold_non_dependent_expr rather than maybe_constant_value in __builtin_shufflevector handling [PR106001]
In this case the STATIC_CAST_EXPR expressions in the call aren't
type nor value dependent, but maybe_constant_value still ICEs on those
when processing_template_decl. Calling fold_non_dependent_expr on it
instead fixes the ICE and folds them to INTEGER_CSTs.
2022-06-17 Jakub Jelinek <jakub@redhat.com>
PR c++/106001
* typeck.cc (build_x_shufflevector): Use fold_non_dependent_expr
instead of maybe_constant_value.
* g++.dg/ext/builtin-shufflevector-4.C: New test.
Uros Bizjak [Fri, 17 Jun 2022 15:19:44 +0000 (17:19 +0200)]
alpha: Introduce target specific store_data_bypass_p function [PR105209]
This patch introduces alpha-specific version of store_data_bypass_p that
ignores TRAP_IF that would result in assertion failure (and internal
compiler error) in the generic store_data_bypass_p function.
While at it, also remove ev4_ist_c reservation, store_data_bypass_p
can handle the patterns with multiple sets since some time ago.
2022-06-17 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/105209
* config/alpha/alpha-protos.h (alpha_store_data_bypass_p): New.
* config/alpha/alpha.cc (alpha_store_data_bypass_p): New function.
(alpha_store_data_bypass_p_1): Ditto.
* config/alpha/ev4.md: Use alpha_store_data_bypass_p instead
of generic store_data_bypass_p.
(ev4_ist_c): Remove insn reservation.
gcc/testsuite/ChangeLog:
PR target/105209
* gcc.target/alpha/pr105209.c: New test.
Uros Bizjak [Fri, 17 Jun 2022 15:01:31 +0000 (17:01 +0200)]
i386: Fix assert in ix86_function_arg [PR105970]
The mode of pointer argument should equal ptr_mode, not Pmode.
2022-06-17 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/105970
* config/i386/i386.cc (ix86_function_arg): Assert that
the mode of pointer argumet is equal to ptr_mode, not Pmode.
gcc/testsuite/ChangeLog:
PR target/105970
* gcc.target/i386/pr105970.c: New test.
Uros Bizjak [Fri, 17 Jun 2022 14:22:20 +0000 (16:22 +0200)]
i386: Fix VPMOV splitter [PR105993]
REGNO should not be used with register_operand before reload because
subregs of registers or even subregs of memory match the predicate.
The build with RTL checking enabled does not tolerate REGNO with
non-reg operand.
The patch splits the splitter into two related splitters and uses
(match_dup ...) RTXes instead of REGNO comparisons.
2022-06-17 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/105993
* config/i386/sse.md (vpmov splitter): Use (match_dup ...)
instead of REGNO comparisons in combine splitter.
gcc/testsuite/ChangeLog:
PR target/105993
* gcc.target/i386/pr105993.c: New test.
Segher Boessenkool [Fri, 17 Jun 2022 14:07:37 +0000 (14:07 +0000)]
rs6000: Fix some error messages for invalid conversions
"* something" isn't a type. "something *" is.
2022-06-17 Segher Boessenkool <segher@kernel.crashing.org>
* config/rs6000/rs6000.cc (rs6000_invalid_conversion): Correct some
types.
Kito Cheng [Fri, 17 Jun 2022 13:57:35 +0000 (21:57 +0800)]
RISC-V: Supress warning for comparison of integer expressions of different signedness
gcc/ChangeLog:
* config/riscv/bitmanip.md: Supress warning.
Richard Earnshaw [Fri, 17 Jun 2022 13:25:51 +0000 (14:25 +0100)]
arm: fix checking ICE in arm_print_operand [PR106004]
Sigh, another instance where I incorrectly used XUINT instead of
UINTVAL.
I've also made the code here a little more robust (although I think
this case can't in fact be reached) if the 32-bit clear mask includes
bit 31. This case, if reached, would print out an out-of-range value
based on the size of the compiler's HOST_WIDE_INT type due to
sign-extension. We avoid this by masking the value after inversion.
gcc/ChangeLog:
PR target/106004
* config/arm/arm.cc (arm_print_operand, case 'V'): Use UINTVAL.
Clear bits in the mask above bit 31.
Jonathan Wakely [Fri, 17 Jun 2022 12:29:05 +0000 (13:29 +0100)]
libstdc++: Add missing #include <string> to new test
Somehow I pushed a different version of this test to the one I actually
tested.
libstdc++-v3/ChangeLog:
* testsuite/21_strings/basic_string/cons/char/105995.cc: Add
missing #include.
Martin Liska [Fri, 17 Jun 2022 11:33:35 +0000 (13:33 +0200)]
docs: add missing table header
libgomp/ChangeLog:
* libgomp.texi: Add table header for new features of
OpenMP 5.2.
Richard Earnshaw [Fri, 17 Jun 2022 09:30:57 +0000 (10:30 +0100)]
arm: mve: Don't force trivial vector literals to the pool
A bug in the ordering of the operands in the mve_mov<mode> pattern
meant that all literal values were being pushed to the literal pool.
This patch fixes that and simplifies some of the logic slightly so
that we can use as simple switch statement.
For example:
void f (uint32_t *a)
{
int i;
for (i = 0; i < 100; i++)
a[i] += 1;
}
Now compiles to:
push {lr}
mov lr, #25
vmov.i32 q2, #0x1 @ v4si
...
instead of
push {lr}
mov lr, #25
vldr.64 d4, .L6
vldr.64 d5, .L6+8
...
.L7:
.align 3
.L6:
.word 1
.word 1
.word 1
.word 1
gcc/ChangeLog:
* config/arm/mve.md (*mve_mov<mode>): Re-order constraints
to avoid spilling trivial literals to the constant pool.
gcc/testsuite/ChangeLog:
* gcc.target/arm/acle/cde-mve-full-assembly.c: Adjust expected
output.
GCC Administrator [Fri, 17 Jun 2022 00:16:23 +0000 (00:16 +0000)]
Daily bump.
David Malcolm [Thu, 16 Jun 2022 21:37:15 +0000 (17:37 -0400)]
gimple-ssa-warn-access.cc: add missing auto_diagnostic_group
Whilst working on SARIF output I noticed some places where followup notes
weren't being properly associated with their warnings in
gcc/gimple-ssa-warn-access.cc.
Fixed thusly.
gcc/ChangeLog:
* gimple-ssa-warn-access.cc (warn_string_no_nul): Add
auto_diagnostic_group to group any warning with its note.
(maybe_warn_for_bound): Likewise.
(check_access): Likewise.
(warn_dealloc_offset): Likewise.
(pass_waccess::maybe_warn_memmodel): Likewise.
(pass_waccess::maybe_check_dealloc_call): Likewise.
(pass_waccess::warn_invalid_pointer): Likewise.
(pass_waccess::check_dangling_stores): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Thu, 16 Jun 2022 21:36:38 +0000 (17:36 -0400)]
c-decl: fix "inform" grouping and conditionalization
Whilst working on SARIF output I noticed some places where followup notes
weren't being properly associated with their errors/warnings in c-decl.cc.
Whilst fixing those I noticed some places where we "inform" after a
"warning" without checking that the warning was actually emitted.
Fixed the various issues seen in gcc/c/c-decl.cc thusly.
gcc/c/ChangeLog:
* c-decl.cc (implicitly_declare): Add auto_diagnostic_group to
group the warning with any note.
(warn_about_goto): Likewise to group error or warning with note.
Bail out if the warning wasn't emitted, to avoid emitting orphan
notes.
(lookup_label_for_goto): Add auto_diagnostic_group to
group the error with the note.
(check_earlier_gotos): Likewise.
(c_check_switch_jump_warnings): Likewise for any error/warning.
Conditionalize emission of the notes.
(diagnose_uninitialized_cst_member): Likewise for warning,
conditionalizing emission of the note.
(grokdeclarator): Add auto_diagnostic_group to group the "array
type has incomplete element type" error with any note.
(parser_xref_tag): Add auto_diagnostic_group to group warnings
with their notes. Conditionalize emission of notes.
(start_struct): Add auto_diagnostic_group to group the
"redefinition of" errors with any note.
(start_enum): Likewise for "redeclaration of %<enum %E%>" error.
(check_for_loop_decls): Likewise for pre-C99 error.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Thu, 16 Jun 2022 21:35:16 +0000 (17:35 -0400)]
analyzer: associate -Wanalyzer-va-arg-type-mismatch with CWE-686
gcc/analyzer/ChangeLog:
* varargs.cc (va_arg_type_mismatch::emit): Associate the warning
with CWE-686 ("Function Call With Incorrect Argument Type").
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/stdarg-1.c
(__analyzer_called_by_test_type_mismatch_1): Verify that
-Wanalyzer-va-arg-type-mismatch is associated with CWE-686.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Thu, 16 Jun 2022 21:33:40 +0000 (17:33 -0400)]
analyzer: associate -Wanalyzer-va-list-exhausted with CWE-685
gcc/analyzer/ChangeLog:
* varargs.cc: Include "diagnostic-metadata.h".
(va_list_exhausted::emit): Associate the warning with
CWE-685 ("Function Call With Incorrect Number of Arguments").
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/stdarg-1.c
(__analyzer_called_by_test_not_enough_args): Verify that
-Wanalyzer-va-list-exhausted is associated with CWE-685.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Thu, 16 Jun 2022 21:27:08 +0000 (17:27 -0400)]
analyzer: associate -Wanalyzer-double-fclose with CWE-1341
gcc/analyzer/ChangeLog:
* sm-file.cc (double_fclose::emit): Associate the warning with
CWE-1341 ("Multiple Releases of Same Resource or Handle").
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/file-1.c (test_1): Verify that double-fclose is
associated with CWE-1341.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jason Merrill [Tue, 14 Jun 2022 21:56:08 +0000 (17:56 -0400)]
opts: fix opts_set->x_flag_sanitize
While working on PR104642 I noticed this wasn't getting set.
gcc/ChangeLog:
* opts.cc (common_handle_option) [OPT_fsanitize_]: Set
opts_set->x_flag_sanitize.
Jason Merrill [Tue, 14 Jun 2022 21:56:08 +0000 (17:56 -0400)]
flags: add comment
gcc/ChangeLog:
* flags.h (issue_strict_overflow_warning): Comment #endif.
Mikhail Ablakatov [Thu, 12 May 2022 18:45:35 +0000 (21:45 +0300)]
compiler: don't generate stubs for ambiguous direct interface methods
Current implementation checks whether it has to generate a stub method for a
promoted method of an embedded struct field in Type::build_stub_methods(). If
the promoted method is ambiguous it's simply skipped. But struct types that
can fit in an interface value (e.g. structs that consist of a single pointer
field) get a second chance in Type::build_direct_iface_stub_methods().
This patch adds the same check used by Type::build_stub_methods() to
Type::build_direct_iface_stub_methods().
Fixes golang/go#52870
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/405974
Jonathan Wakely [Thu, 16 Jun 2022 13:57:32 +0000 (14:57 +0100)]
libstdc++: Support constexpr global std::string for size < 15 [PR105995]
I don't think this is required by the standard, but it's easy to
support.
libstdc++-v3/ChangeLog:
PR libstdc++/105995
* include/bits/basic_string.h (_M_use_local_data): Initialize
the entire SSO buffer.
* testsuite/21_strings/basic_string/cons/char/105995.cc: New test.
Jonathan Wakely [Thu, 16 Jun 2022 10:02:11 +0000 (11:02 +0100)]
libstdc++: Apply r13-1096-g6abe341558abec change to vstring too [PR101482]
As recently done for std::basic_string, __gnu_cxx::__versa_string
equality comparisons can check lengths first for any character type and
traits type, not only for std::char_traits<char>.
libstdc++-v3/ChangeLog:
PR libstdc++/101482
* include/ext/vstring.h (operator==): Always check lengths
before comparing.
Nathan Sidwell [Thu, 16 Jun 2022 17:14:56 +0000 (10:14 -0700)]
c++: Elide inactive initializer fns from init array
There's no point adding no-op initializer fns (that a module might
have) to the static initializer list. Also, we can add any objc
initializer call to a partial initializer function and simplify some
control flow.
gcc/cp/
* decl2.cc (finish_objects): Add startp parameter, adjust.
(generate_ctor_or_dtor_function): Detect empty fn, and don't
generate unnecessary code. Remove objc startup here ...
(c_parse_final_cleanyps): ... do it here.
gcc/testsuite/
* g++.dg/modules/init-2_b.C: Add init check.
* g++.dg/modules/init-2_c.C: Add init check.
Andrew MacLeod [Thu, 16 Jun 2022 16:44:33 +0000 (12:44 -0400)]
Clear invariant bit for inferred ranges.
The range of an invariant SSA (no outgoing edge range anywhere) is not tracked.
If an inferred range is registered, remove the invariant flag.
* gimple-range-cache.cc (ranger_cache::apply_inferred_ranges): If name
was invaraint before, clear the invariant bit.
* gimple-range-gori.cc (gori_map::set_range_invariant): Add a flag.
* gimple-range-gori.h (gori_map::set_range_invariant): Adjust prototype.
Andrew MacLeod [Thu, 31 Mar 2022 13:36:59 +0000 (09:36 -0400)]
Propagator should call value_of_stmt.
When evaluating the LHS of a stmt, its more efficent/better to call
value_of_stmt directly rather than value_of_expr.
* tree-ssa-propagate.cc (before_dom_children): Call value_of_stmt.
Jakub Jelinek [Thu, 16 Jun 2022 12:37:06 +0000 (14:37 +0200)]
match.pd: Improve y == MIN || x < y optimization [PR105983]
On the following testcase, we only optimize bar where this optimization
is performed at GENERIC folding time, but on GIMPLE it doesn't trigger
anymore, as we actually don't see
(bit_and (ne @1 min_value) (ge @0 @1))
but
(bit_and (ne @1 min_value) (le @1 @0))
genmatch handles :c modifier not just on commutative operations, but
also comparisons and in that case it means it swaps the comparison.
2022-06-16 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/105983
* match.pd (y == XXX_MIN || x < y -> x <= y - 1,
y != XXX_MIN && x >= y -> x > y - 1): Use :cs instead of :s
on non-equality comparisons.
* gcc.dg/tree-ssa/pr105983.c: New test.
Jakub Jelinek [Thu, 16 Jun 2022 12:36:04 +0000 (14:36 +0200)]
match.pd: Fix up __builtin_mul_overflow_p signed type optimization [PR105984]
Earlier in the simplification pattern, we require that @0 has compatible
type to the type of IMAGPART_EXPR, but for @1 which is a non-zero constant
all we require is that it the constant fits into that type.
Later the code checks if the constant is negative, because when min / max
values are divided by negative divisor, lo will be higher than hi.
In the following testcase, @1 has unsigned char type, while @0 has
int type, so @1 which is 254 is wi::neg_p and we were swapping lo and hi,
even when @1 cast to int isn't negative.
We could use tree_int_cst_sgn (@1) < 0 as the check instead and it would
work both for narrower types of @1 and even same or wider ones, but
I've noticed we probably don't want to call fold_convert (TREE_TYPE (@0), @1)
twice and when we save that result in a temporary, we can just use wi::neg_p
on that temporary.
2022-06-16 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/105984
* match.pd (__builtin_mul_overflow_p (x, cst, (stype) 0) ->
x > stype_max / cst || x < stype_min / cst): fold_convert @1
to TREE_TYPE (@0) just once and test for negative divisor
also on that folded constant instead of on @1.
* gcc.c-torture/execute/pr105984.c: New test.
Jakub Jelinek [Thu, 16 Jun 2022 08:58:58 +0000 (10:58 +0200)]
expand: Fix up IFN_ATOMIC_{BIT*,*CMP_0} expansion [PR105951]
Both IFN_ATOMIC_BIT_TEST_AND_* and IFN_ATOMIC_*_FETCH_CMP_0 ifns
are matched if their corresponding optab is implemented for the particular
mode. The fact that those optabs are implemented doesn't guarantee
they will succeed though, they can just FAIL in their expansion.
The expansion in that case uses expand_atomic_fetch_op as fallback, but
as has been reported and and can be reproduced on the testcases,
even those can fail and we didn't have any fallback after that.
For IFN_ATOMIC_BIT_TEST_AND_* we actually have such calls. One is
done whenever we lost lhs of the ifn at some point in between matching
it in tree-ssa-ccp.cc and expansion. The following patch for that case
just falls through and expands as if there was a lhs, creates a temporary
for it. For the other expand_atomic_fetch_op call in the same expander
and for the only expand_atomic_fetch_op call in the other, this falls
back the hard way, by constructing a CALL_EXPR to the call from which
the ifn has been matched and expanding that. Either it is lucky and manages
to expand inline, or it emits a libatomic API call.
So that we don't have to rediscover which builtin function to call in the
fallback, we record at tree-ssa-ccp.cc time gimple_call_fn (call) in
an extra argument to the ifn.
2022-06-16 Jakub Jelinek <jakub@redhat.com>
PR middle-end/105951
* tree-ssa-ccp.cc (optimize_atomic_bit_test_and,
optimize_atomic_op_fetch_cmp_0): Remember gimple_call_fn (call)
as last argument to the internal functions.
* builtins.cc (expand_ifn_atomic_bit_test_and): Adjust for the
extra call argument to ifns. If expand_atomic_fetch_op fails for the
lhs == NULL_TREE case, fall through into the optab code with
gen_reg_rtx (mode) as target. If second expand_atomic_fetch_op
fails, construct a CALL_EXPR and expand that.
(expand_ifn_atomic_op_fetch_cmp_0): Adjust for the extra call argument
to ifns. If expand_atomic_fetch_op fails, construct a CALL_EXPR and
expand that.
* gcc.target/i386/pr105951-1.c: New test.
* gcc.target/i386/pr105951-2.c: New test.
Haochen Gui [Mon, 30 May 2022 01:12:34 +0000 (09:12 +0800)]
rs6000: add V1TI into vector comparison expand [PR103316]
This patch adds V1TI mode into a new mode iterator used in vector comparison,shift and rotation expands. It also merges some vector comparison, shift and rotation expands for V1T1 and other vector integer modes as they have the similar patterns. The expands for V1TI only are removed.
gcc/
PR target/103316
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Enable
gimple folding for RS6000_BIF_VCMPEQUT, RS6000_BIF_VCMPNET,
RS6000_BIF_CMPGE_1TI, RS6000_BIF_CMPGE_U1TI, RS6000_BIF_VCMPGTUT,
RS6000_BIF_VCMPGTST, RS6000_BIF_CMPLE_1TI, RS6000_BIF_CMPLE_U1TI.
* config/rs6000/vector.md (VEC_IC): New mode iterator. Add support
for new Power10 V1TI instructions.
(vec_cmp<mode><mode>): Set mode iterator to VEC_IC.
(vec_cmpu<mode><mode>): Likewise.
(vector_nlt<mode>): Set mode iterator to VEC_IC.
(vector_nltv1ti): Remove.
(vector_gtu<mode>): Set mode iterator to VEC_IC.
(vector_gtuv1ti): Remove.
(vector_nltu<mode>): Set mode iterator to VEC_IC.
(vector_nltuv1ti): Remove.
(vector_geu<mode>): Set mode iterator to VEC_IC.
(vector_ngt<mode>): Likewise.
(vector_ngtv1ti): Remove.
(vector_ngtu<mode>): Set mode iterator to VEC_IC.
(vector_ngtuv1ti): Remove.
(vector_gtu_<mode>_p): Set mode iterator to VEC_IC.
(vector_gtu_v1ti_p): Remove.
(vrotl<mode>3): Set mode iterator to VEC_IC. Emit insns for V1TI.
(vrotlv1ti3): Remove.
(vashr<mode>3): Set mode iterator to VEC_IC. Emit insns for V1TI.
(vashrv1ti3): Remove.
gcc/testsuite/
PR target/103316
* gcc.target/powerpc/pr103316.c: New.
* gcc.target/powerpc/fold-vec-cmp-int128.c: New.
Martin Liska [Thu, 16 Jun 2022 06:48:39 +0000 (08:48 +0200)]
clang: fix -Wunused-parameter warning
Fixes:
gcc/cp/decl2.cc:158:54: warning: unused parameter 'entry' [-Wunused-parameter]
gcc/cp/ChangeLog:
* decl2.cc (struct priority_map_traits): Remove unused param.
Martin Liska [Wed, 4 May 2022 14:21:45 +0000 (16:21 +0200)]
gengtype: do not skip char after escape sequnce
Right now, when a \$x escape sequence occures, the
next character after $x is skipped, which is bogus.
The code has very low coverage right now.
gcc/ChangeLog:
* gengtype-state.cc (read_a_state_token): Do not skip extra
character after escaped sequence.
Martin Liska [Wed, 11 May 2022 14:07:25 +0000 (16:07 +0200)]
opts: improve option suggestion
In case where we have 2 equally good candidates like
-ftrivial-auto-var-init=
-Wtrivial-auto-var-init
for -ftrivial-auto-var-init, we should take the candidate that
has a difference in trailing sign symbol.
PR driver/105564
gcc/ChangeLog:
* spellcheck.cc (test_find_closest_string): Add new test.
* spellcheck.h (class best_match): Prefer a difference in
trailing sign symbol.
Jia-wei Chen [Wed, 8 Jun 2022 09:35:21 +0000 (17:35 +0800)]
RISC-V/testsuite: Fix pr105666.c under rv32
In rv32 regression test, this cases will report an error:
"cc1: error: ABI requires '-march=rv32'"
Add '-mabi' option will fix this.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr105666.c: New options.
liuhongt [Tue, 31 May 2022 09:13:21 +0000 (17:13 +0800)]
Simplify (B * v + C) * D -> BD* v + CD when B,C,D are all INTEGER_CST.
Similar for (v + B) * C + D -> C * v + BCD.
Don't simplify it when there's overflow and overflow is UB for type v.
gcc/ChangeLog:
PR tree-optimization/53533
* match.pd: Simplify (B * v + C) * D -> BD * v + CD and
(v + B) * C + D -> C * v + BCD when B,C,D are all INTEGER_CST,
and there's no overflow or !TYPE_OVERFLOW_UNDEFINED.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr53533-1.c: New test.
* gcc.target/i386/pr53533-2.c: New test.
* gcc.target/i386/pr53533-3.c: New test.
* gcc.target/i386/pr53533-4.c: New test.
* gcc.target/i386/pr53533-5.c: New test.
* gcc.dg/vect/slp-11a.c: Adjust testcase.
GCC Administrator [Thu, 16 Jun 2022 00:16:44 +0000 (00:16 +0000)]
Daily bump.
Takayuki 'January June' Suwa [Tue, 14 Jun 2022 03:53:04 +0000 (12:53 +0900)]
xtensa: Eliminate [DS]Cmode hard register clobber that is immediately followed by whole overwrite the register
RTL expansion of substitution to [DS]Cmode hard register includes obstructive
register clobber.
A simplest example:
double _Complex test(double _Complex c) {
return c;
}
will be converted to:
(set (reg:DF 42 [ c ]) (reg:DF 2 a2))
(set (reg:DF 43 [ c+8 ]) (reg:DF 4 a4))
(clobber (reg:DC 2 a2))
(set (reg:DF 2 a2) (reg:DF 42 [ c ]))
(set (reg:DF 4 a4) (reg:DF 43 [ c+8 ]))
(use (reg:DC 2 a2))
(return)
and then finally:
test:
mov a8, a2
mov a9, a3
mov a6, a4
mov a7, a5
mov a2, a8
mov a3, a9
mov a4, a6
mov a5, a7
ret
As you see, it is so ridiculous.
This patch eliminates such clobber in order to prune away the wasted move
instructions by the optimizer:
test:
ret
gcc/ChangeLog:
* config/xtensa/xtensa.md (DSC): New split pattern and mode iterator.
Takayuki 'January June' Suwa [Tue, 14 Jun 2022 03:39:49 +0000 (12:39 +0900)]
xtensa: Eliminate unwanted reg-reg moves during DFmode input reloads
When spilled DFmode registers are reloaded in, once loaded into a pair of
SImode regs and then copied from that regs. Such unwanted reg-reg moves
seems not to be eliminated at the "cprop_hardreg" stage, despite no problem
in output reloads.
Luckily it is easy to resolve such inefficiencies, with the use of peephole2
pattern.
gcc/ChangeLog:
* config/xtensa/predicates.md (reload_operand):
New predicate.
* config/xtensa/xtensa.md: New peephole2 pattern.
Takayuki 'January June' Suwa [Tue, 14 Jun 2022 03:37:54 +0000 (12:37 +0900)]
xtensa: Add some dedicated patterns that correspond to GIMPLE canonicalizations
This patch offers better RTL representations against straightforward
derivations from some tree optimizers' canonicalized forms.
- rounding up to even, such as '(x + (x & 1))', is canonicalized to
'((x + 1) & -2)', but the former is one instruction less than the latter
in Xtensa ISA.
- signed greater or equal to zero as logical value '((signed)x >= 0)',
is canonicalized to '((unsigned)(x ^ -1) >> 31)', but the equivalent
'(((signed)x >> 31) + 1)' is one instruction less.
gcc/ChangeLog:
* config/xtensa/xtensa.md (*round_up_to_even):
New insn-and-split pattern.
(*signed_ge_zero): Ditto.
Takayuki 'January June' Suwa [Wed, 15 Jun 2022 12:21:21 +0000 (21:21 +0900)]
xtensa: Add support for sibling call optimization
This patch introduces support for sibling call optimization, when call0
ABI is in effect.
gcc/ChangeLog:
* config/xtensa/xtensa-protos.h (xtensa_prepare_expand_call,
xtensa_emit_sibcall): New prototypes.
(xtensa_expand_epilogue): Add new argument that specifies whether
or not sibling call.
* config/xtensa/xtensa.cc (TARGET_FUNCTION_OK_FOR_SIBCALL):
New macro definition.
(xtensa_prepare_expand_call): New function in order to share
the common code.
(xtensa_emit_sibcall, xtensa_function_ok_for_sibcall):
New functions.
(xtensa_expand_epilogue): Add new argument sibcall_p and use it
for sibling call handling.
* config/xtensa/xtensa.md (call, call_value):
Use xtensa_prepare_expand_call.
(call_internal, call_value_internal):
Add the condition in order to be disabled if sibling call.
(sibcall, sibcall_value, sibcall_epilogue): New expansions.
(sibcall_internal, sibcall_value_internal): New insn patterns,
and split ones in order to take care of the indirect sibcalls.
gcc/testsuite/ChangeLog:
* gcc.target/xtensa/sibcalls.c: New.
Takayuki 'January June' Suwa [Tue, 14 Jun 2022 03:34:48 +0000 (12:34 +0900)]
xtensa: Document new -mextra-l32r-costs= Xtensa-specific option
gcc/ChangeLog:
* doc/invoke.texi: Document -mextra-l32r-costs= option.
David Malcolm [Wed, 15 Jun 2022 21:44:14 +0000 (17:44 -0400)]
analyzer: fix up paths for inlining (PR analyzer/105962)
-fanalyzer runs late compared to other code analysis tools, in that in
runs on the partially-optimized gimple-ssa representation. I chose this
point to run in the hope of easy integration with LTO.
As PR analyzer/105962 notes, this means that function inlining can occur
before the -fanalyzer "sees" the user's code. For example given:
void foo (void *p)
{
__builtin_free (p);
}
void bar (void *q)
{
foo (q);
foo (q);
}
Below -O2, -fanalyzer shows the calls and returns:
inline-1.c: In function ‘foo’:
inline-1.c:3:3: warning: double-‘free’ of ‘p’ [CWE-415] [-Wanalyzer-double-free]
3 | __builtin_free (p);
| ^~~~~~~~~~~~~~~~~~
‘bar’: events 1-2
|
| 6 | void bar (void *q)
| | ^~~
| | |
| | (1) entry to ‘bar’
| 7 | {
| 8 | foo (q);
| | ~~~~~~~
| | |
| | (2) calling ‘foo’ from ‘bar’
|
+--> ‘foo’: events 3-4
|
| 1 | void foo (void *p)
| | ^~~
| | |
| | (3) entry to ‘foo’
| 2 | {
| 3 | __builtin_free (p);
| | ~~~~~~~~~~~~~~~~~~
| | |
| | (4) first ‘free’ here
|
<------+
|
‘bar’: events 5-6
|
| 8 | foo (q);
| | ^~~~~~~
| | |
| | (5) returning to ‘bar’ from ‘foo’
| 9 | foo (q);
| | ~~~~~~~
| | |
| | (6) passing freed pointer ‘q’ in call to ‘foo’ from ‘bar’
|
+--> ‘foo’: events 7-8
|
| 1 | void foo (void *p)
| | ^~~
| | |
| | (7) entry to ‘foo’
| 2 | {
| 3 | __builtin_free (p);
| | ~~~~~~~~~~~~~~~~~~
| | |
| | (8) second ‘free’ here; first ‘free’ was at (4)
|
but at -O2, -fanalyzer "sees" this gimple:
void bar (void * q)
{
<bb 2> [local count:
1073741824]:
__builtin_free (q_2(D));
__builtin_free (q_2(D));
return;
}
where "foo" has been inlined away, leading to this unhelpful output:
In function ‘foo’,
inlined from ‘bar’ at inline-1.c:9:3:
inline-1.c:3:3: warning: double-‘free’ of ‘q’ [CWE-415] [-Wanalyzer-double-free]
3 | __builtin_free (p);
| ^~~~~~~~~~~~~~~~~~
‘bar’: events 1-2
|
| 3 | __builtin_free (p);
| | ^~~~~~~~~~~~~~~~~~
| | |
| | (1) first ‘free’ here
| | (2) second ‘free’ here; first ‘free’ was at (1)
where the stack frame information in the execution path suggests that these
events are happening in "bar", in the top stack frame.
This is what the analyzer sees, but I find it hard to decipher such
output. Hence, as a workaround for the fact that -fanalyzer runs so
late, this patch attempts to reconstruct the "true" stack frame
information, and to inject events showing inline calls, based on the
inlining chain information recorded in the location_t values for the events.
Doing so leads to this output at -O2 on the above example (with
-fdiagnostics-show-path-depths):
In function ‘foo’,
inlined from ‘bar’ at inline-1.c:9:3:
inline-1.c:3:3: warning: double-‘free’ of ‘q’ [CWE-415] [-Wanalyzer-double-free]
3 | __builtin_free (p);
| ^~~~~~~~~~~~~~~~~~
‘bar’: events 1-2 (depth 1)
|
| 6 | void bar (void *q)
| | ^~~
| | |
| | (1) entry to ‘bar’
| 7 | {
| 8 | foo (q);
| | ~
| | |
| | (2) inlined call to ‘foo’ from ‘bar’
|
+--> ‘foo’: event 3 (depth 2)
|
| 3 | __builtin_free (p);
| | ^~~~~~~~~~~~~~~~~~
| | |
| | (3) first ‘free’ here
|
<------+
|
‘bar’: event 4 (depth 1)
|
| 9 | foo (q);
| | ^
| | |
| | (4) inlined call to ‘foo’ from ‘bar’
|
+--> ‘foo’: event 5 (depth 2)
|
| 3 | __builtin_free (p);
| | ^~~~~~~~~~~~~~~~~~
| | |
| | (5) second ‘free’ here; first ‘free’ was at (3)
|
reconstructing the calls and returns.
The patch also adds a new option, -fno-analyzer-undo-inlining, which can
be used to disable this reconstruction, restoring the output listed
above (this time with -fdiagnostics-show-path-depths):
In function ‘foo’,
inlined from ‘bar’ at inline-1.c:9:3:
inline-1.c:3:3: warning: double-‘free’ of ‘q’ [CWE-415] [-Wanalyzer-double-free]
3 | __builtin_free (p);
| ^~~~~~~~~~~~~~~~~~
‘bar’: events 1-2 (depth 1)
|
| 3 | __builtin_free (p);
| | ^~~~~~~~~~~~~~~~~~
| | |
| | (1) first ‘free’ here
| | (2) second ‘free’ here; first ‘free’ was at (1)
|
gcc/analyzer/ChangeLog:
PR analyzer/105962
* analyzer.opt (fanalyzer-undo-inlining): New option.
* checker-path.cc: Include "diagnostic-core.h" and
"inlining-iterator.h".
(event_kind_to_string): Handle EK_INLINED_CALL.
(class inlining_info): New class.
(checker_event::checker_event): Move here from checker-path.h.
Store original fndecl and depth, and calculate effective fndecl
and depth based on inlining information.
(checker_event::dump): Emit original depth as well as effective
depth when they differ; likewise for fndecl.
(region_creation_event::get_desc): Use m_effective_fndecl.
(inlined_call_event::get_desc): New.
(inlined_call_event::get_meaning): New.
(checker_path::inject_any_inlined_call_events): New.
* checker-path.h (enum event_kind): Add EK_INLINED_CALL.
(checker_event::checker_event): Make protected, and move
definition to checker-path.cc.
(checker_event::get_fndecl): Use effective fndecl.
(checker_event::get_stack_depth): Use effective stack depth.
(checker_event::get_logical_location): Use effective stack depth.
(checker_event::get_original_stack_depth): New.
(checker_event::m_fndecl): Rename to...
(checker_event::m_original_fndecl): ...this.
(checker_event::m_depth): Rename to...
(checker_event::m_original_depth): ...this.
(checker_event::m_effective_fndecl): New field.
(checker_event::m_effective_depth): New field.
(class inlined_call_event): New checker_event subclass.
(checker_path::inject_any_inlined_call_events): New decl.
* diagnostic-manager.cc: Include "inlining-iterator.h".
(diagnostic_manager::emit_saved_diagnostic): Call
checker_path::inject_any_inlined_call_events.
(diagnostic_manager::prune_for_sm_diagnostic): Handle
EK_INLINED_CALL.
* engine.cc (tainted_args_function_custom_event::get_desc): Use
effective fndecl.
* inlining-iterator.h: New file.
gcc/testsuite/ChangeLog:
PR analyzer/105962
* gcc.dg/analyzer/inlining-1-multiline.c: New test.
* gcc.dg/analyzer/inlining-1-no-undo.c: New test.
* gcc.dg/analyzer/inlining-1.c: New test.
* gcc.dg/analyzer/inlining-2-multiline.c: New test.
* gcc.dg/analyzer/inlining-2.c: New test.
* gcc.dg/analyzer/inlining-3-multiline.c: New test.
* gcc.dg/analyzer/inlining-3.c: New test.
* gcc.dg/analyzer/inlining-4-multiline.c: New test.
* gcc.dg/analyzer/inlining-4.c: New test.
* gcc.dg/analyzer/inlining-5-multiline.c: New test.
* gcc.dg/analyzer/inlining-5.c: New test.
* gcc.dg/analyzer/inlining-6-multiline.c: New test.
* gcc.dg/analyzer/inlining-6.c: New test.
* gcc.dg/analyzer/inlining-7-multiline.c: New test.
* gcc.dg/analyzer/inlining-7.c: New test.
gcc/ChangeLog:
PR analyzer/105962
* doc/invoke.texi: Add -fno-analyzer-undo-inlining.
* tree-diagnostic-path.cc (default_tree_diagnostic_path_printer):
Extend -fdiagnostics-path-format=separate-events so that with
-fdiagnostics-show-path-depths it prints fndecls as well as stack
depths.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Wed, 15 Jun 2022 21:42:17 +0000 (17:42 -0400)]
value-relation.h: add 'final' and 'override' to relation_oracle vfunc impls
gcc/ChangeLog:
* value-relation.h: Add "final" and "override" to relation_oracle
vfunc implementations as appropriate.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Wed, 15 Jun 2022 21:40:33 +0000 (17:40 -0400)]
analyzer: show saved diagnostics as nodes in .eg.dot dumps
I've been using this tweak to the output of
-fdump-analyzer-exploded-graph in my working copies for a while;
the extra red nodes make it *much* easier to find the places where
diagnostics are being emitted (or rejected by the diagnostic_manager).
gcc/analyzer/ChangeLog:
* diagnostic-manager.cc (saved_diagnostic::dump_dot_id): New.
(saved_diagnostic::dump_as_dot_node): New.
* diagnostic-manager.h (saved_diagnostic::dump_dot_id): New decl.
(saved_diagnostic::dump_as_dot_node): New decl.
* engine.cc (exploded_node::dump_dot): Add nodes for saved
diagnostics.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Wed, 15 Jun 2022 21:39:42 +0000 (17:39 -0400)]
analyzer: add more uninit test coverage
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/uninit-1.c: Add test coverage of attempts
to jump through an uninitialized function pointer, and of attempts
to pass an uninitialized value to a function call.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Iain Buclaw [Wed, 15 Jun 2022 20:51:52 +0000 (22:51 +0200)]
d: Add `@no_sanitize' attribute to compiler and library.
The `@no_sanitize' attribute disables a particular sanitizer for this
function, analogous to `__attribute__((no_sanitize))'. The library also
defines `@noSanitize' to be compatible with the LLVM D compiler's
`ldc.attributes'.
gcc/d/ChangeLog:
* d-attribs.cc (d_langhook_attribute_table): Add no_sanitize.
(d_handle_no_sanitize_attribute): New function.
libphobos/ChangeLog:
* libdruntime/gcc/attributes.d (no_sanitize): Define.
(noSanitize): Define.
gcc/testsuite/ChangeLog:
* gdc.dg/asan/attr_no_sanitize1.d: New test.
* gdc.dg/ubsan/attr_no_sanitize2.d: New test.
François Dumont [Tue, 15 Feb 2022 08:47:52 +0000 (09:47 +0100)]
libstdc++: [_Hashtable] Insert range of types convertible to value_type PR 105717
Fix insertion of range of instances convertible to value_type.
libstdc++-v3/ChangeLog:
PR libstdc++/105717
* include/bits/hashtable_policy.h (_ConvertToValueType): New.
* include/bits/hashtable.h (_Hashtable<>::_M_insert_unique_aux): New.
(_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&, true_type)): Use latters.
(_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&, false_type)): Likewise.
(_Hashtable(_InputIterator, _InputIterator, size_type, const _Hash&, const _Equal&,
const allocator_type&, true_type)): Use this.insert range.
(_Hashtable(_InputIterator, _InputIterator, size_type, const _Hash&, const _Equal&,
const allocator_type&, false_type)): Use _M_insert.
* testsuite/23_containers/unordered_map/cons/56112.cc: Check how many times conversion
is done.
* testsuite/23_containers/unordered_map/insert/105717.cc: New test.
* testsuite/23_containers/unordered_set/insert/105717.cc: New test.
Iain Buclaw [Wed, 15 Jun 2022 17:44:36 +0000 (19:44 +0200)]
d: Add `@visibility' and `@hidden' attributes.
The `@visibility' attribute is functionality the same as
`__attribute__((visibility))', and `@hidden' is a convenience alias to
`@visibility("hidden")' defined in the `gcc.attributes' module.
As the visibility of a symbol is also indirectly controlled by the
`export' keyword, the handling of this in the code generation pass has
been improved so that conflicts will be appropriately diagnosed.
gcc/d/ChangeLog:
* d-attribs.cc (d_langhook_attribute_table): Add visibility.
(insert_type_attribute): Use decl_attributes instead of
merge_attributes.
(insert_decl_attribute): Likewise.
(apply_user_attributes): Do nothing when no UDAs applied.
(d_handle_visibility_attribute): New function.
* d-gimplify.cc (d_gimplify_binary_expr): Adjust.
* d-tree.h (set_visibility_for_decl): Declare.
* decl.cc (get_symbol_decl): Move setting of visibility flags to...
(set_visibility_for_decl): ... here. New function.
* types.cc (TypeVisitor::visit (TypeStruct *)): Call
set_visibility_for_decl().
(TypeVisitor::visit (TypeClass *)): Likewise.
gcc/testsuite/ChangeLog:
* gdc.dg/attr_visibility1.d: New test.
* gdc.dg/attr_visibility2.d: New test.
* gdc.dg/attr_visibility3.d: New test.
libphobos/ChangeLog:
* libdruntime/gcc/attributes.d (visibility): Define.
(hidden): Define.
David Edelsohn [Tue, 14 Jun 2022 17:07:24 +0000 (13:07 -0400)]
testsuite: AIX operator new
The testcase relies on C++ "operator new", which requires AIX
runtime linking to override the symbol at runtime.
* g++.dg/cpp1z/aligned-new9.C: Skip on AIX.
Richard Sandiford [Wed, 15 Jun 2022 16:40:09 +0000 (17:40 +0100)]
Revert recent internal-fn changes [PR105975]
The recent internal-fn “clean-ups” triggered problems on nvptx
because some of the omp_simt_* patterns had modeless operands.
I wondered about adapting expand_fn_using_insn to cope with that,
but then the problem becomes: what should the mode of operand 0
be when there is no lhs? The answer depends on the target insn.
For GOMP_SIMT_ENTER_ALLOC the answer was: use Pmode.
For GOMP_SIMT_ORDERED_PRED and others the answer was: elide the call.
(However, GOMP_SIMT_ORDERED_PRED doesn't seem to have ECF_* flags
that would normally allow it to be dropped at the gimple level.)
So these instructions seem to be special enough that they need
their own code after all. This patch reverts the second patch
and most of the first. The only part retained from the first
is splitting expand_fn_using_insn out of expand_direct_optab_fn,
since I think expand_fn_using_insn could still be useful in future.
gcc/
PR middle-end/105975
Revert everything apart from the expand_fn_using_insn and
expand_direct_optab_fn changes from:
* internal-fn.def (DEF_INTERNAL_INSN_FN): New macro.
(GOMP_SIMT_ENTER_ALLOC, GOMP_SIMT_EXIT, GOMP_SIMT_LANE)
(GOMP_SIMT_LAST_LANE, GOMP_SIMT_ORDERED_PRED, GOMP_SIMT_VOTE_ANY)
(GOMP_SIMT_XCHG_BFLY, GOMP_SIMT_XCHG_IDX): Use it.
* internal-fn.h (direct_internal_fn_info::directly_mapped): New
member variable.
(direct_internal_fn_info::vectorizable): Reduce to 1 bit.
(direct_internal_fn_p): Also return true for internal functions
that map directly to instructions defined target-insns.def.
(direct_internal_fn): Adjust comment accordingly.
* internal-fn.cc (direct_insn, optab1, optab2, vectorizable_optab1)
(vectorizable_optab2): New local macros.
(not_direct): Initialize directly_mapped.
(mask_load_direct, load_lanes_direct, mask_load_lanes_direct)
(gather_load_direct, len_load_direct, mask_store_direct)
(store_lanes_direct, mask_store_lanes_direct, vec_cond_mask_direct)
(vec_cond_direct, scatter_store_direct, len_store_direct)
(vec_set_direct, unary_direct, binary_direct, ternary_direct)
(cond_unary_direct, cond_binary_direct, cond_ternary_direct)
(while_direct, fold_extract_direct, fold_left_direct)
(mask_fold_left_direct, check_ptrs_direct): Use the macros above.
(expand_GOMP_SIMT_ENTER_ALLOC, expand_GOMP_SIMT_EXIT): Delete
(expand_GOMP_SIMT_LANE, expand_GOMP_SIMT_LAST_LANE): Likewise;
(expand_GOMP_SIMT_ORDERED_PRED, expand_GOMP_SIMT_VOTE_ANY): Likewise.
(expand_GOMP_SIMT_XCHG_BFLY, expand_GOMP_SIMT_XCHG_IDX): Likewise.
(direct_internal_fn_types): Handle functions that map to instructions
defined in target-insns.def.
(direct_internal_fn_types): Likewise.
(direct_internal_fn_supported_p): Likewise.
(internal_fn_expanders): Likewise.
(expand_fn_using_insn): New function,
split out and adapted from...
(expand_direct_optab_fn): ...here.
(expand_GOMP_SIMT_ENTER_ALLOC): Use it.
(expand_GOMP_SIMT_EXIT): Likewise.
(expand_GOMP_SIMT_LANE): Likewise.
(expand_GOMP_SIMT_LAST_LANE): Likewise.
(expand_GOMP_SIMT_ORDERED_PRED): Likewise.
(expand_GOMP_SIMT_VOTE_ANY): Likewise.
(expand_GOMP_SIMT_XCHG_BFLY): Likewise.
(expand_GOMP_SIMT_XCHG_IDX): Likewise.
Richard Earnshaw [Wed, 15 Jun 2022 15:07:20 +0000 (16:07 +0100)]
arm: big-endian issue in gen_cpymem_ldrd_strd [PR105981]
The code in gen_cpymem_ldrd_strd has been incorrect for big-endian
since r230663. The problem is that we use gen_lowpart, etc. to split
the 64-bit quantity, but fail to account for the fact that these
routines are really dealing with 64-bit /values/ and in big-endian the
ordering of the sub-registers changes.
To fix this, I've renamed the conceptually misnamed low_reg and hi_reg
as first_reg and second_reg, and then used different logic for
big-endian targets to initialize these values. This makes the logic
clearer than trying to think about high bits and low bits.
gcc/ChangeLog:
PR target/105981
* config/arm/arm.cc (gen_cpymem_ldrd_strd): Rename low_reg and hi_reg
to first_reg and second_reg respectively. Initialize them correctly
when generating big-endian code.
Nathan Sidwell [Fri, 10 Jun 2022 18:57:38 +0000 (11:57 -0700)]
c++: Use better module partition naming
It turns out that 'implementation partition' is not a term used in the
std, and is confusing to users. Let's use the better term 'internal
partition'. While there, adjust header unit naming.
gcc/cp/
* module.cc (module_state::write_readme): Use less confusing
importable unit names.
Richard Earnshaw [Wed, 15 Jun 2022 12:42:23 +0000 (13:42 +0100)]
arm: fix thinko in arm_bfi_1_p() [PR105974]
I clearly wasn't thinking straight when I wrote the arm_bfi_1_p
function and used XUINT rather than UINTVAL when extracting CONST_INT
values. It seemed to work in testing, but was incorrect and failed
RTL checking.
Fixed thusly:
gcc/ChangeLog:
PR target/105974
* config/arm/arm.cc (arm_bfi_1_p): Use UINTVAL instead of XUINT.
Iain Buclaw [Wed, 15 Jun 2022 11:20:15 +0000 (13:20 +0200)]
d: Set TYPE_ARTIFICIAL on internal TypeInfo types
Prevents them from triggering warnings when compiling with `-Wpadded'.
gcc/d/ChangeLog:
* typeinfo.cc (make_internal_typeinfo): Set TYPE_ARTIFICIAL.
gcc/testsuite/ChangeLog:
* gdc.dg/Wpadded.d: New test.
Richard Biener [Wed, 15 Jun 2022 09:27:31 +0000 (11:27 +0200)]
tree-optimization/105971 - less surprising refs_may_alias_p_2
When DSE asks whether __real a is using __imag a it gets a surprising
result when a is a FUNCTION_DECL. The following makes sure this case
is less surprising to callers but keeping the bail-out for the
non-decl case where it is true that PTA doesn't track aliases to code
correctly.
2022-06-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/105971
* tree-ssa-alias.cc (refs_may_alias_p_2): Put bail-out for
FUNCTION_DECL and LABEL_DECL refs after decl-decl disambiguation
to leak less surprising alias results.
* gcc.dg/torture/pr106971.c: New testcase.
Richard Biener [Wed, 15 Jun 2022 08:54:48 +0000 (10:54 +0200)]
tree-optimization/105969 - FPE with array diagnostics
For a [0][0] array we have to be careful when dividing by the element
size which is zero for the outermost dimension. Luckily the division
is only for an overflow check which is pointless for array size zero.
2022-06-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/105969
* gimple-ssa-sprintf.cc (get_origin_and_offset_r): Avoid division
by zero in overflow check.
* gcc.dg/pr105969.c: New testcase.
Iain Buclaw [Tue, 14 Jun 2022 13:56:59 +0000 (15:56 +0200)]
d: Delay completing aggregate and enum types until after attributes have been applied.
Because of forward/recursive references, the TYPE_SIZE, TYPE_ALIGN, and
TYPE_MODE of structs and enums were set before laying out its members.
This adds a new macro TYPE_FORWARD_REFERENCES for storing those forward
references against the incomplete type, laying them out after the type
has been completed. Construction of the TYPE_DECL has also been moved
on earlier in the type generation pass, which will allow the possibility
of adding gdc-specific type attributes to the D front-end in the future.
gcc/d/ChangeLog:
* d-attribs.cc (apply_user_attributes): Set ATTR_FLAG_TYPE_IN_PLACE
only on incomplete types.
* d-codegen.cc (copy_aggregate_type): Set TYPE_STUB_DECL after copy.
* d-compiler.cc (Compiler::onParseModule): Adjust.
* d-tree.h (AGGREGATE_OR_ENUM_TYPE_CHECK): Define.
(TYPE_FORWARD_REFERENCES): Define.
* decl.cc (gcc_attribute_p): Update documentation.
(DeclVisitor::visit (StructDeclaration *)): Exit before building type
node if gcc.attributes symbol.
(DeclVisitor::visit (ClassDeclaration *)): Build type node and add
TYPE_NAME to current binding level before emitting anything else.
(DeclVisitor::visit (InterfaceDeclaration *)): Likewise.
(DeclVisitor::visit (EnumDeclaration *)): Likewise.
(build_type_decl): Move rest_of_decl_compilation() call to
finish_aggregate_type().
* types.cc (insert_aggregate_field): Move layout_decl() call to
finish_aggregate_type().
(insert_aggregate_bitfield): Likewise.
(layout_aggregate_members): Adjust.
(finish_incomplete_fields): New function.
(finish_aggregate_type): Handle forward referenced field types. Call
rest_of_type_compilation() after completing the aggregate.
(TypeVisitor::visit (TypeEnum *)): Don't set size and alignment until
after apply_user_attributes(). Call rest_of_type_compilation() after
completing the enumeral.
(TypeVisitor::visit (TypeStruct *)): Call build_type_decl() before
apply_user_attributes(). Don't set size, alignment, and mode until
after apply_user_attributes().
(TypeVisitor::visit (TypeClass *)): Call build_type_decl() before
applly_user_attributes().
Richard Sandiford [Wed, 15 Jun 2022 10:12:51 +0000 (11:12 +0100)]
aarch64: Revert bogus fix for PR105254
In
f2ebf2d98efe0ac2314b58cf474f44cb8ebd5244 I'd forced the
chosen unroll factor to be a factor of the VF, in order to
work around an exact_div ICE in PR105254. This was completely
bogus -- clearly I didn't look in enough detail at why we ended
up with an unrolled VF that wasn't a multiple of the UF.
Kewen has since fixed the bug properly for PR105940, so this
patch reverts my earlier attempt. Sorry for the stupidity.
gcc/
PR tree-optimization/105254
PR tree-optimization/105940
Revert:
* config/aarch64/aarch64.cc
(aarch64_vector_costs::determine_suggested_unroll_factor): Take a
loop_vec_info as argument. Restrict the unroll factor to values
that divide the VF.
(aarch64_vector_costs::finish_cost): Update call accordingly.
gcc/testsuite/
* gcc.target/aarch64/sve/cost_model_14.c: New test.
Richard Sandiford [Wed, 15 Jun 2022 10:12:51 +0000 (11:12 +0100)]
gen: Allow unspec numbers in .md attributes
Tamar pointed out that:
(unspec:M ... <FOO>)
didn't work when a value of attribute FOO was defined by
define_constant, such as in:
(define_int_attribute FOO [(UNSPEC_A "UNSPEC_B") ...])
This is because symbolic constants are substituted during lexing
and only apply to bare symbol names, not strings.
One option would have been to extend this lexing substitution
to define_*_attribute values as well. However, that would replace
symbolic names with integer constants in the generated .cc code,
making it less readable.
This patch goes for the more localised approach of only
applying define_constants when we want their integer value.
I don't think any changes to the docs are needed. This isn't
adding a new feature, it's just making an existing one work in
the expected way.
gcc/
* read-rtl.cc (find_int): Substitute symbolic constants
before converting the string to an integer.
Jakub Jelinek [Wed, 15 Jun 2022 08:45:04 +0000 (10:45 +0200)]
openmp: Fix up get-mapped-ptr-1.{c,f90} tests
On Tue, Jun 14, 2022 at 06:41:37PM +0200, Thomas Schwinge wrote:
> In an offloading configuration, I'm seeing:
>
> PASS: libgomp.fortran/get-mapped-ptr-1.f90 -O (test for excess errors)
> [-PASS:-]{+FAIL:+} libgomp.fortran/get-mapped-ptr-1.f90 -O execution test
>
> Does that one need similar treatment?
I assume not just that but libgomp.c-c++-common/get-mapped-ptr-1.c too?
It both needs the same treatment, and in the get-mapped-ptr-1.c
case there is even UB, while the Fortran version was using c_loc (q)
as the host pointer, in C/C++ it was using q which was value of
uninitialized pointer.
2022-06-15 Jakub Jelinek <jakub@redhat.com>
* testsuite/libgomp.c-c++-common/get-mapped-ptr-1.c (main): Initialize
q to ddress of an automatic variable. Use -5 instead of -1 in
omp_get_mapped_ptr call. Add test with omp_initial_device.
* testsuite/libgomp.fortran/get-mapped-ptr-1.f90 (main): Use -5 instead
of -1 in omp_get_mapped_ptr call. Add test with omp_initial_device.
Renumber stop arguments afterwards.
Roger Sayle [Wed, 15 Jun 2022 07:31:13 +0000 (09:31 +0200)]
Fold truncations of left shifts in match.pd
Whilst investigating PR 55278, I noticed that the tree-ssa optimizers
aren't eliminating the promotions of shifts to "int" as inserted by the
c-family front-ends, instead leaving this simplification to be left to
the RTL optimizers. This patch allows match.pd to do this itself earlier,
narrowing (T)(X << C) to (T)X << C when the constant C is known to be
valid for the (narrower) type T.
Hence for this simple test case:
short foo(short x) { return x << 5; }
the .optimized dump currently looks like:
short int foo (short int x)
{
int _1;
int _2;
short int _4;
<bb 2> [local count:
1073741824]:
_1 = (int) x_3(D);
_2 = _1 << 5;
_4 = (short int) _2;
return _4;
}
but with this patch, now becomes:
short int foo (short int x)
{
short int _2;
<bb 2> [local count:
1073741824]:
_2 = x_1(D) << 5;
return _2;
}
This is always reasonable as RTL expansion knows how to use
widening optabs if it makes sense at the RTL level to perform
this shift in a wider mode.
Of course, there's often a catch. The above simplification not only
reduces the number of statements in gimple, but also allows further
optimizations, for example including the perception of rotate idioms
and bswap16. Alas, optimizing things earlier than anticipated
requires several testsuite changes [though all these tests have
been confirmed to generate identical assembly code on x86_64].
The only significant change is that the vectorization pass wouldn't
previously lower rotations of signed integer types. Hence this
patch includes a refinement to tree-vect-patterns to allow signed
types, by using the equivalent unsigned shifts.
2022-06-15 Roger Sayle <roger@nextmovesoftware.com>
Richard Biener <rguenther@suse.de>
gcc/ChangeLog
* match.pd (convert (lshift @1 INTEGER_CST@2)): Narrow integer
left shifts by a constant when the result is truncated, and the
shift constant is well-defined.
* tree-vect-patterns.cc (vect_recog_rotate_pattern): Add
support for rotations of signed integer types, by lowering
using unsigned vector shifts.
gcc/testsuite/ChangeLog
* gcc.dg/fold-convlshift-4.c: New test case.
* gcc.dg/optimize-bswaphi-1.c: Update found bswap count.
* gcc.dg/tree-ssa/pr61839_3.c: Shift is now optimized before VRP.
* gcc.dg/vect/vect-over-widen-1-big-array.c: Remove obsolete tests.
* gcc.dg/vect/vect-over-widen-1.c: Likewise.
* gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
* gcc.dg/vect/vect-over-widen-3.c: Likewise.
* gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
* gcc.dg/vect/vect-over-widen-4.c: Likewise.
liuhongt [Tue, 14 Jun 2022 08:27:04 +0000 (16:27 +0800)]
Fix ICE in extract_insn, at recog.cc:2791
(In reply to Uroš Bizjak from comment #1)
> Instruction does not accept memory operand for operand 3:
>
> (define_insn_and_split
> "*<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_ltint"
> [(set (match_operand:<ssebytemode> 0 "register_operand" "=Yr,*x,x")
> (unspec:<ssebytemode>
> [(match_operand:<ssebytemode> 1 "register_operand" "0,0,x")
> (match_operand:<ssebytemode> 2 "vector_operand" "YrBm,*xBm,xm")
> (subreg:<ssebytemode>
> (lt:VI48_AVX
> (match_operand:VI48_AVX 3 "register_operand" "Yz,Yz,x")
> (match_operand:VI48_AVX 4 "const0_operand")) 0)]
> UNSPEC_BLENDV))]
>
> The problematic insn is:
>
> (define_insn_and_split "*avx_cmp<mode>3_ltint_not"
> [(set (match_operand:VI48_AVX 0 "register_operand")
> (vec_merge:VI48_AVX
> (match_operand:VI48_AVX 1 "vector_operand")
> (match_operand:VI48_AVX 2 "vector_operand")
> (unspec:<avx512fmaskmode>
> [(subreg:VI48_AVX
> (not:<ssebytemode>
> (match_operand:<ssebytemode> 3 "vector_operand")) 0)
> (match_operand:VI48_AVX 4 "const0_operand")
> (match_operand:SI 5 "const_0_to_7_operand")]
> UNSPEC_PCMP)))]
>
> which gets split to the above pattern.
>
> In the preparation statements we have:
>
> if (!MEM_P (operands[3]))
> operands[3] = force_reg (<ssebytemode>mode, operands[3]);
> operands[3] = lowpart_subreg (<MODE>mode, operands[3], <ssebytemode>mode);
>
> Which won't fly when operand 3 is memory operand...
>
gcc/ChangeLog:
PR target/105953
* config/i386/sse.md (*avx_cmp<mode>3_ltint_not): Force_reg
operands[3].
gcc/testsuite/ChangeLog:
* g++.target/i386/pr105953.C: New test.
GCC Administrator [Wed, 15 Jun 2022 00:16:24 +0000 (00:16 +0000)]
Daily bump.
Ian Lance Taylor [Tue, 14 Jun 2022 04:32:25 +0000 (21:32 -0700)]
syscall: gofmt
Add blank lines after //sys comments where needed, and then run gofmt
on the syscall package with the new formatter.
This is the libgo version of CL 407136.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/412074
Jonathan Wakely [Tue, 14 Jun 2022 15:19:32 +0000 (16:19 +0100)]
libstdc++: Check lengths first in operator== for basic_string [PR62187]
As confirmed by LWG 2852, the calls to traits_type::compare do not need
to be obsvervable, so we can make operator== compare string lengths
first and return immediately for non-equal lengths. This avoids doing a
slow string comparison for "abc...xyz" == "abc...xy". Previously we only
did this optimization for std::char_traits<char>, but we can enable it
unconditionally thanks to LWG 2852.
For comparisons with a const char* we can call traits_type::length right
away to do the same optimization. That strlen call can be folded away
for constant arguments, making it very efficient.
For the pre-C++20 operator== and operator!= overloads we can swap the
order of the arguments to take advantage of the operator== improvements.
libstdc++-v3/ChangeLog:
PR libstdc++/62187
* include/bits/basic_string.h (operator==): Always compare
lengths before checking string contents.
[!__cpp_lib_three_way_comparison] (operator==, operator!=):
Reorder arguments.
Jonathan Wakely [Tue, 14 Jun 2022 13:54:27 +0000 (14:54 +0100)]
libstdc++: Inline all basic_string::compare overloads [PR59048]
Defining the compare member functions inline allows calls to
traits_type::length and std::min to be inlined, taking advantage of
constant expression arguments. When not inline, the compiler prefers to
use the explicit instantiation definitions in libstdc++.so and can't
take advantage of constant arguments.
libstdc++-v3/ChangeLog:
PR libstdc++/59048
* include/bits/basic_string.h (compare): Define inline.
* include/bits/basic_string.tcc (compare): Remove out-of-line
definitions.
* include/bits/cow_string.h (compare): Define inline.
* testsuite/21_strings/basic_string/operations/compare/char/3.cc:
New test.
Jonathan Wakely [Tue, 14 Jun 2022 13:50:49 +0000 (14:50 +0100)]
libstdc++: Fix indentation in allocator base classes
libstdc++-v3/ChangeLog:
* include/bits/new_allocator.h: Fix indentation.
* include/ext/malloc_allocator.h: Likewise.
Jonathan Wakely [Tue, 14 Jun 2022 13:37:25 +0000 (14:37 +0100)]
libstdc++: Check for size overflow in constexpr allocation [PR105957]
libstdc++-v3/ChangeLog:
PR libstdc++/105957
* include/bits/allocator.h (allocator::allocate): Check for
overflow in constexpr allocation.
* testsuite/20_util/allocator/105975.cc: New test.
Surya Kumari Jangala [Fri, 10 Jun 2022 14:22:57 +0000 (19:52 +0530)]
regrename: Fix -fcompare-debug issue in check_new_reg_p [PR105041]
In check_new_reg_p, the nregs of a du chain is computed by obtaining the
MODE of the first element in the chain, and then calling
hard_regno_nregs() with the MODE. But the first element of the chain can
be a DEBUG_INSN whose mode need not be the same as the rest of the
elements in the du chain. This was resulting in fcompare-debug failure
as check_new_reg_p was returning a different result with -g for the same
candidate register. We can instead obtain nregs from the du chain
itself.
2022-06-10 Surya Kumari Jangala <jskumari@linux.ibm.com>
gcc/
PR rtl-optimization/105041
* regrename.cc (check_new_reg_p): Use nregs value from du chain.
gcc/testsuite/
PR rtl-optimization/105041
* gcc.target/powerpc/pr105041.c: New test.
Segher Boessenkool [Thu, 9 Jun 2022 19:45:03 +0000 (19:45 +0000)]
rs6000: Delete VS_scalar
It is just the same as VEC_base, which is a more generic name.
2022-06-14 Segher Boessenkool <segher@kernel.crashing.org>
* config/rs6000/vsx.md (VS_scalar): Delete.
(rest of file): Adjust.
Nathan Sidwell [Thu, 9 Jun 2022 18:18:19 +0000 (11:18 -0700)]
c++: Elide calls to NOP module initializers
gcc/cp
* cp-tree.h (fini_modules): Add has_inits parm.
* decl2.cc (c_parse_final_cleanups): Check for
inits, adjust fini_modules flags.
* module.cc (module_state): Rename call_init_p to
active_init_p.
(module_state::write_config): Write active_init.
(module_state::read_config): Read it.
(module_determine_import_inits): Clear active_init_p
of covered inits.
(late_finish_module): Add has_init parm. Record it.
(fini_modules): Adjust.
gcc/testsuite/
* g++.dg/modules/init-2_a.C: Adjust.
* g++.dg/modules/init-2_c.C: Adjust.
* g++.dg/modules/init-2_d.C: New.
Jan Hubicka [Tue, 14 Jun 2022 12:05:53 +0000 (14:05 +0200)]
Fix ipa-cp wrt volatile loads
Check for volatile flag to ipa_load_from_parm_agg.
gcc/ChangeLog:
2022-06-10 Jan Hubicka <hubicka@ucw.cz>
PR ipa/105739
* ipa-prop.cc (ipa_load_from_parm_agg): Punt on volatile loads.
gcc/testsuite/ChangeLog:
2022-06-10 Jan Hubicka <hubicka@ucw.cz>
* gcc.dg/ipa/pr105739.c: New test.
Philipp Tomsich [Wed, 11 May 2022 10:12:57 +0000 (12:12 +0200)]
RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w
When encountering a prescaled (biased) value as a candidate for
sh[123]add.uw, the combine pass will present this as shifted by the
aggregate amount (prescale + shift-amount) with an appropriately
adjusted mask constant that has fewer than 32 bits set.
E.g., here's the failing expression seen in combine for a prescale of
1 and a shift of 2 (note how 0x3fffffff8 >> 3 is 0x7fffffff).
Trying 7, 8 -> 10:
7: r78:SI=r81:DI#0<<0x1
REG_DEAD r81:DI
8: r79:DI=zero_extend(r78:SI)
REG_DEAD r78:SI
10: r80:DI=r79:DI<<0x2+r82:DI
REG_DEAD r79:DI
REG_DEAD r82:DI
Failed to match this instruction:
(set (reg:DI 80 [ cD.1491 ])
(plus:DI (and:DI (ashift:DI (reg:DI 81)
(const_int 3 [0x3]))
(const_int
17179869176 [0x3fffffff8]))
(reg:DI 82)))
To address this, we introduce a splitter handling these cases.
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Co-developed-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
gcc/ChangeLog:
* config/riscv/bitmanip.md: Add split to handle opportunities
for slli + sh[123]add.uw
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zba-shadd.c: New test.
Philipp Tomsich [Tue, 24 May 2022 13:03:47 +0000 (15:03 +0200)]
RISC-V: add consecutive_bits_operand predicate
Provide an easy way to constrain for constants that are a a single,
consecutive run of ones.
gcc/ChangeLog:
* config/riscv/predicates.md (consecutive_bits_operand):
Implement new predicate.
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Richard Biener [Tue, 14 Jun 2022 09:10:13 +0000 (11:10 +0200)]
tree-optimization/105946 - avoid accessing excess args from uninit diag
uninit diagnostics uses passing via reference and access attributes
but that iterates over function type arguments which can in some
cases appearantly outrun the actual arguments leading to ICEs.
The following simply ignores not present arguments.
2022-06-14 Richard Biener <rguenther@suse.de>
PR tree-optimization/105946
* tree-ssa-uninit.cc (maybe_warn_pass_by_reference):
Do not look at arguments not specified in the function call.
Richard Biener [Tue, 14 Jun 2022 08:59:49 +0000 (10:59 +0200)]
middle-end/105965 - add missing v_c_e <{ el }> simplification
When we got the simplification of bit-field-ref to view-convert
we lost the ability to detect FMAs since we cannot look through
_1 = {_10};
_11 = VIEW_CONVERT_EXPR<float>(_1);
the following amends the (view_convert CONSTRUCTOR) pattern
to handle this case.
2022-06-14 Richard Biener <rguenther@suse.de>
PR middle-end/105965
* match.pd (view_convert CONSTRUCTOR): Handle single-element
CTOR case.
* gcc.target/i386/pr105965.c: New testcase.
Eric Botcazou [Tue, 14 Jun 2022 10:28:24 +0000 (12:28 +0200)]
Restore bootstrap on ARM
The -Wuse-after-free warning is explicitly disabled for destructors on ARM
because of the special ABI and the previous change to the warning machinery
uncovered another case where the warning data would be incorrectly erased.
gcc/
* warning-control.cc (copy_warning) [generic version]: Do not erase
the warning data of the destination location when the no-warning
bit is not set on the source.
(copy_warning) [tree version]: Return early if TO is equal to FROM.
(copy_warning) [gimple version]: Likewise.
gcc/testsuite/
* g++.dg/warn/Wuse-after-free5.C: New test.
Kewen Lin [Tue, 14 Jun 2022 05:57:01 +0000 (00:57 -0500)]
vect: Move suggested_unroll_factor applying [PR105940]
As PR105940 shown, when rs6000 port tries to assign
m_suggested_unroll_factor by 4 or so, there will be ICE on:
exact_div (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
loop_vinfo->suggested_unroll_factor);
In function vect_analyze_loop_2, the current place of
suggested_unroll_factor applying can't guarantee it's
applied for all cases. As the case shows, vectorizer
could retry with SLP forced off, the vf is reset by
saved_vectorization_factor which isn't applied with
suggested_unroll_factor before. It means it can end
up with one vf which neglects suggested_unroll_factor.
I think it's off design, we should move the applying
of suggested_unroll_factor after start_over.
PR tree-optimization/105940
gcc/ChangeLog:
* tree-vect-loop.cc (vect_analyze_loop_2): Move the place of
applying suggested_unroll_factor after start_over.
Takayuki 'January June' Suwa [Mon, 13 Jun 2022 16:28:43 +0000 (01:28 +0900)]
xtensa: Optimize bitwise AND operation with some specific forms of constants
This patch offers several insn-and-split patterns for bitwise AND with
register and constant that can be represented as:
i. 1's least significant N bits and the others 0's (17 <= N <= 31)
ii. 1's most significant N bits and the others 0's (12 <= N <= 31)
iii. M 1's sequence of bits and trailing N 0's bits, that cannot fit into a
"MOVI Ax, simm12" instruction (1 <= M <= 16, 1 <= N <= 30)
And also offers shortcuts for conditional branch if each of the abovementioned
operations is (not) equal to zero.
gcc/ChangeLog:
* config/xtensa/predicates.md (shifted_mask_operand):
New predicate.
* config/xtensa/xtensa.md (*andsi3_const_pow2_minus_one):
New insn-and-split pattern.
(*andsi3_const_negative_pow2, *andsi3_const_shifted_mask,
*masktrue_const_pow2_minus_one, *masktrue_const_negative_pow2,
*masktrue_const_shifted_mask): Ditto.
Takayuki 'January June' Suwa [Thu, 27 May 2021 10:04:12 +0000 (19:04 +0900)]
xtensa: Make use of BALL/BNALL instructions
In Xtensa ISA, there is no single machine instruction that calculates unary
bitwise negation, but a few similar fused instructions are exist:
"BALL Ax, Ay, label" // if ((~Ax & Ay) == 0) goto label;
"BNALL Ax, Ay, label" // if ((~Ax & Ay) != 0) goto label;
These instructions have never been emitted before, but it seems no reason not
to make use of them.
gcc/ChangeLog:
* config/xtensa/xtensa.md (*masktrue_bitcmpl): New insn pattern.
gcc/testsuite/ChangeLog:
* gcc.target/xtensa/BALL-BNALL.c: New.
Takayuki 'January June' Suwa [Mon, 31 Jan 2022 00:56:21 +0000 (09:56 +0900)]
xtensa: Simplify conditional branch/move insn patterns
No need to describe the "false side" conditional insn patterns anymore.
gcc/ChangeLog:
* config/xtensa/xtensa-protos.h (xtensa_emit_branch):
Remove the first argument.
(xtensa_emit_bit_branch): Remove it because now called only from the
output statement of *bittrue insn pattern.
* config/xtensa/xtensa.cc (gen_int_relational): Remove the last
argument 'p_invert', and make so that the condition is reversed by
itself as needed.
(xtensa_expand_conditional_branch): Share the common path, and remove
condition inversion code.
(xtensa_emit_branch, xtensa_emit_movcc): Simplify by removing the
"false side" pattern.
(xtensa_emit_bit_branch): Remove it because of the abovementioned
reason, and move the function body to *bittrue insn pattern.
* config/xtensa/xtensa.md (*bittrue): Transplant the output
statement from removed xtensa_emit_bit_branch().
(*bfalse, *ubfalse, *bitfalse, *maskfalse): Remove the "false side"
insn patterns.
Takayuki 'January June' Suwa [Mon, 13 Jun 2022 16:38:31 +0000 (01:38 +0900)]
xtensa: Improve shift operations more
This patch introduces funnel shifter utilization, and rearranges existing
"per-byte shift" insn patterns.
gcc/ChangeLog:
* config/xtensa/predicates.md (logical_shift_operator,
xtensa_shift_per_byte_operator): New predicates.
* config/xtensa/xtensa-protos.h (xtensa_shlrd_which_direction):
New prototype.
* config/xtensa/xtensa.cc (xtensa_shlrd_which_direction):
New helper function for funnel shift patterns.
* config/xtensa/xtensa.md (ior_op): New code iterator.
(*ashlsi3_1): Replace with new split pattern.
(*shift_per_byte): Unify *ashlsi3_3x, *ashrsi3_3x and *lshrsi3_3x.
(*shift_per_byte_omit_AND_0, *shift_per_byte_omit_AND_1):
New insn-and-split patterns that redirect to *xtensa_shift_per_byte,
in order to omit unnecessary bitwise AND operation.
(*shlrd_reg_<code>, *shlrd_const_<code>, *shlrd_per_byte_<code>,
*shlrd_per_byte_<code>_omit_AND):
New insn patterns for funnel shifts.
gcc/testsuite/ChangeLog:
* gcc.target/xtensa/funnel_shifter.c: New.
GCC Administrator [Tue, 14 Jun 2022 00:16:39 +0000 (00:16 +0000)]
Daily bump.
Iain Buclaw [Mon, 13 Jun 2022 22:05:35 +0000 (00:05 +0200)]
libphobos: Check in missing core.sync package module
This was meant to be part of r13-1062 in the merge with upstream
druntime
454471d8.
Jason Merrill [Fri, 10 Jun 2022 19:26:36 +0000 (15:26 -0400)]
ubsan: -Wreturn-type and ubsan trap-on-error
I noticed that -fsanitize=undefined -fsanitize-undefined-trap-on-error was
omitting the usual -Wreturn-type warning for control flowing off the end of
a function. This was because the warning code was looking for calls either
to __builtin_unreachable or the UBSan function, but these flags produce a
call to __builtin_trap instead.
gcc/c-family/ChangeLog:
* c-ubsan.cc (ubsan_instrument_return): Use BUILTINS_LOCATION.
gcc/ChangeLog:
* tree-cfg.cc (pass_warn_function_return::execute): Also check
BUILT_IN_TRAP.
gcc/testsuite/ChangeLog:
* g++.dg/ubsan/return-8.C: New test.
Maciej W. Rozycki [Mon, 13 Jun 2022 21:29:45 +0000 (22:29 +0100)]
RISC-V: Reset the length to the default of 4 for FP comparisons
The default length for floating-point compare operations is overridden
to 8, however the FEQ.fmt, FLT.fmt, FLE.fmt machine instructions and
FGE.fmt, FGT.fmt assembly idioms the relevant RTL insns produce are all
4 bytes long each. And all the floating-point compare RTL insns that
produce multiple machine instructions explicitly set their lengths.
Remove the override then, letting the default of 4 apply for the single
instruction case.
gcc/
* config/riscv/riscv.md (length): Remove the explicit setting
for "fcmp".
H.J. Lu [Fri, 10 Jun 2022 18:22:00 +0000 (11:22 -0700)]
x86: Require AVX for F16C and VAES
Since F16C and VAES are only usable with AVX, require AVX for F16C and
VAES.
libgcc/105920
* common/config/i386/cpuinfo.h (get_available_features): Require
AVX for F16C and VAES.
Mark Mentovai [Mon, 13 Jun 2022 15:40:19 +0000 (16:40 +0100)]
libstdc++: Rename __null_terminated to avoid collision with Apple SDK
The macOS 13 SDK (and equivalent-version iOS and other Apple OS SDKs)
contain this definition in <sys/cdefs.h>:
863 #define __null_terminated
This collides with the use of __null_terminated in libstdc++'s
experimental fs_path.h.
As libstdc++'s use of this token is entirely internal to fs_path.h, the
simplest workaround, renaming it, is most appropriate. Here, it's
renamed to __nul_terminated, referencing the NUL ('\0') value that is
used to terminate the strings in the context in which this tag structure
is used.
libstdc++-v3/ChangeLog:
* include/experimental/bits/fs_path.h (__detail::__null_terminated):
Rename to __nul_terminated to avoid colliding with a macro in
Apple's SDK.
Signed-off-by: Mark Mentovai <mark@mentovai.com>
Jonathan Wakely [Mon, 13 Jun 2022 15:36:14 +0000 (16:36 +0100)]
libstdc++: Use type_identity_t for non-deducible std::atomic_xxx args
This is LWG 3220 which is about to become Tentatively Ready.
libstdc++-v3/ChangeLog:
* include/std/atomic (__atomic_val_t): Use __type_identity_t
instead of atomic<T>::value_type, as per LWG 3220.
* testsuite/29_atomics/atomic/lwg3220.cc: New test.
Uros Bizjak [Mon, 13 Jun 2022 15:08:18 +0000 (17:08 +0200)]
i386: Return true for (SUBREG (MEM....)) in register_no_elim_operand [PR105927]
Under certain conditions register_operand predicate also allows
subregs of memory operands. When RTL checking is enabled, these
will fail with REGNO (op).
Allow subregs of memory operands, these are guaranteed
to be reloaded to a register.
2022-06-13 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/105927
* config/i386/predicates.md (register_no_elim_operand):
Return true for subreg of a memory operand.
gcc/testsuite/ChangeLog:
PR target/105927
* gcc.target/i386/pr105927.c: New test.
Iain Buclaw [Sat, 11 Jun 2022 10:40:00 +0000 (12:40 +0200)]
d: Match function declarations of gcc built-ins from any module.
Declarations of recognised gcc built-in functions are now matched from
any module. Previously, only the `core.stdc' package was scanned.
In addition to matching of the symbol, any user-applied `@attributes' or
`pragma(mangle)' name will be applied to the built-in decl as well.
Because there would now be no control over where built-in declarations
are coming from, the warning option `-Wbuiltin-declaration-mismatch' has
been implemented in the D front-end too.
gcc/d/ChangeLog:
* d-builtins.cc: Include builtins.h.
(gcc_builtins_libfuncs): Remove.
(strip_type_modifiers): New function.
(matches_builtin_type): New function.
(covariant_with_builtin_type_p): New function.
(maybe_set_builtin_1): Set front-end built-in if identifier matches
gcc built-in name. Apply user-specified attributes and assembler name
overrides to the built-in. Warn about built-in declaration mismatches.
(d_builtin_function): Set IDENTIFIER_DECL_TREE of built-in functions.
* d-compiler.cc (Compiler::onParseModule): Scan all modules for any
identifiers that match built-in function names.
* lang.opt (Wbuiltin-declaration-mismatch): New option.
gcc/testsuite/ChangeLog:
* gdc.dg/Wbuiltin_declaration_mismatch.d: New test.
* gdc.dg/builtins.d: New test.
Richard Sandiford [Mon, 13 Jun 2022 14:24:34 +0000 (15:24 +0100)]
Add a general mapping from internal fns to target insns
Several existing internal functions map directly to an instruction
defined in target-insns.def. This patch makes it easier to define
more such functions in future.
This should help to reduce cut-&-paste, but more importantly, it allows
the difference between optab functions and target-insns.def functions
to be abstracted away; both are now treated as “directly-mapped”.
gcc/
* internal-fn.def (DEF_INTERNAL_INSN_FN): New macro.
(GOMP_SIMT_ENTER_ALLOC, GOMP_SIMT_EXIT, GOMP_SIMT_LANE)
(GOMP_SIMT_LAST_LANE, GOMP_SIMT_ORDERED_PRED, GOMP_SIMT_VOTE_ANY)
(GOMP_SIMT_XCHG_BFLY, GOMP_SIMT_XCHG_IDX): Use it.
* internal-fn.h (direct_internal_fn_info::directly_mapped): New
member variable.
(direct_internal_fn_info::vectorizable): Reduce to 1 bit.
(direct_internal_fn_p): Also return true for internal functions
that map directly to instructions defined target-insns.def.
(direct_internal_fn): Adjust comment accordingly.
* internal-fn.cc (direct_insn, optab1, optab2, vectorizable_optab1)
(vectorizable_optab2): New local macros.
(not_direct): Initialize directly_mapped.
(mask_load_direct, load_lanes_direct, mask_load_lanes_direct)
(gather_load_direct, len_load_direct, mask_store_direct)
(store_lanes_direct, mask_store_lanes_direct, vec_cond_mask_direct)
(vec_cond_direct, scatter_store_direct, len_store_direct)
(vec_set_direct, unary_direct, binary_direct, ternary_direct)
(cond_unary_direct, cond_binary_direct, cond_ternary_direct)
(while_direct, fold_extract_direct, fold_left_direct)
(mask_fold_left_direct, check_ptrs_direct): Use the macros above.
(expand_GOMP_SIMT_ENTER_ALLOC, expand_GOMP_SIMT_EXIT): Delete
(expand_GOMP_SIMT_LANE, expand_GOMP_SIMT_LAST_LANE): Likewise;
(expand_GOMP_SIMT_ORDERED_PRED, expand_GOMP_SIMT_VOTE_ANY): Likewise.
(expand_GOMP_SIMT_XCHG_BFLY, expand_GOMP_SIMT_XCHG_IDX): Likewise.
(direct_internal_fn_types): Handle functions that map to instructions
defined in target-insns.def.
(direct_internal_fn_types): Likewise.
(direct_internal_fn_supported_p): Likewise.
(internal_fn_expanders): Likewise.
Richard Sandiford [Mon, 13 Jun 2022 14:24:34 +0000 (15:24 +0100)]
Factor out common internal-fn idiom
internal-fn.c has quite a few functions that simply map the result
of the call to an instruction's output operand (if any) and map
each argument to an instruction's input operand, in order.
This patch adds a single function for doing that. It's really
just a generalisation of expand_direct_optab_fn, but with the
output operand being optional.
Unfortunately, it isn't possible to do this for vcond_mask
because the internal function has a different argument order
from the optab.
gcc/
* internal-fn.cc (expand_fn_using_insn): New function,
split out and adapted from...
(expand_direct_optab_fn): ...here.
(expand_GOMP_SIMT_ENTER_ALLOC): Use it.
(expand_GOMP_SIMT_EXIT): Likewise.
(expand_GOMP_SIMT_LANE): Likewise.
(expand_GOMP_SIMT_LAST_LANE): Likewise.
(expand_GOMP_SIMT_ORDERED_PRED): Likewise.
(expand_GOMP_SIMT_VOTE_ANY): Likewise.
(expand_GOMP_SIMT_XCHG_BFLY): Likewise.
(expand_GOMP_SIMT_XCHG_IDX): Likewise.