Jakub Jelinek [Wed, 1 Dec 2021 09:21:20 +0000 (10:21 +0100)]
libcpp: Enable P1949R7 for C++98 too [PR100977]
On Mon, Nov 29, 2021 at 05:53:58PM -0500, Jason Merrill wrote:
> I'm inclined to go ahead and change C++98 as well; I doubt anyone is relying
> on the particular C++98 extended character set rules, and we already accept
> the union of the different sets when not pedantic.
Ok, here is an incremental patch to do that also for -std={c,gnu}++98.
2021-12-01 Jakub Jelinek <jakub@redhat.com>
PR c++/100977
* init.c (struct lang_flags): Remove cxx23_identifiers.
(lang_defaults): Remove cxx23_identifiers initializers.
(cpp_set_lang): Don't copy cxx23_identifiers.
* include/cpplib.h (struct cpp_options): Adjust comment about
c11_identifiers. Remove cxx23_identifiers field.
* lex.c (warn_about_normalization): Use cplusplus instead of
cxx23_identifiers.
* charset.c (ucn_valid_in_identifier): Likewise.
* g++.dg/cpp/ucnid-1.C: Adjust expected diagnostics.
* g++.dg/cpp/ucnid-1-utf8.C: Likewise.
Jakub Jelinek [Wed, 1 Dec 2021 09:16:57 +0000 (10:16 +0100)]
simplify-rtx: Punt on simplify_associative_operation with large operands [PR102356]
Seems simplify_associate_operation is quadratic, which isn't a big deal
for use during combine and other similar RTL passes, because those never
try to combine expressions from more than a few instructions and because
those instructions need to be recognized the machine description also bounds
how many expressions can appear in there.
var-tracking has depth limits only for some cases and unlimited depth
for the vt_expand_loc though:
/* This is the value used during expansion of locations. We want it
to be unbounded, so that variables expanded deep in a recursion
nest are fully evaluated, so that their values are cached
correctly. We avoid recursion cycles through other means, and we
don't unshare RTL, so excess complexity is not a problem. */
#define EXPR_DEPTH (INT_MAX)
/* We use this to keep too-complex expressions from being emitted as
location notes, and then to debug information. Users can trade
compile time for ridiculously complex expressions, although they're
seldom useful, and they may often have to be discarded as not
representable anyway. */
#define EXPR_USE_DEPTH (param_max_vartrack_expr_depth)
IMO for very large expressions it isn't worth trying to reassociate though,
in fact e.g. for the new testcase below keeping it as is has bigger chance
of generating smaller debug info which the dwarf2out.c part of the change
tries to achieve - if a binary operation has the same operands, we can
use DW_OP_dup and not bother computing the possibly large operand again.
The patch fixes it by adding a counter to simplify_context and counting
how many times simplify_associative_operation has been called during
a single outermost simplify_* call, and once it reaches some maximum
(currently 64), it stops reassociating.
Another possibility to deal with the power expressions in debug info
would be to introduce some new RTL operation for the pow{,i} (x, n)
case, allow that solely in debug insns and expand those into DWARF
using a loop. But that seems like quite a lot of work for something rarely
used (especially when powi for larger n is only useful for 0 and 1 inputs
because anything else overflows).
2021-12-01 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/102356
* rtl.h (simplify_context): Add assoc_count member and
max_assoc_count static member.
* simplify-rtx.c (simplify_associative_operation): Don't reassociate
more than max_assoc_count times within one outermost simplify_* call.
* dwarf2out.c (mem_loc_descriptor): Optimize binary operation
with both operands the same using DW_OP_dup.
* gcc.dg/pr102356.c: New test.
Jakub Jelinek [Wed, 1 Dec 2021 09:07:59 +0000 (10:07 +0100)]
libcpp: Fix up #__VA_OPT__ handling [PR103415]
stringify_arg uses pfile->u_buff to create the string literal.
Unfortunately, paste_tokens -> _cpp_lex_direct -> lex_number -> _cpp_unaligned_alloc
can in some cases use pfile->u_buff too, which results in losing everything
prepared for the string literal until the token pasting.
The following patch fixes that by not calling paste_token during the
construction of the string literal, but doing that before. All the tokens
we are processing have been pushed into a token buffer using
tokens_buff_add_token so it is fine if we paste some of them in that buffer
(successful pasting creates a new token in that buffer), move following
tokens if any to make it contiguous, pop (throw away) the extra tokens at
the end and then do stringify_arg.
Also, paste_tokens now copies over PREV_WHITE and PREV_FALLTHROUGH flags
from the original lhs token to the replacement token. Copying that way
the PREV_WHITE flag is needed for the #__VA_OPT__ handling and copying
over PREV_FALLTHROUGH fixes the new Wimplicit-fallthrough-38.c test.
2021-12-01 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/103415
libcpp/
* macro.c (stringify_arg): Remove va_opt argument and va_opt handling.
(paste_tokens): On successful paste or in PREV_WHITE and
PREV_FALLTHROUGH flags from the *plhs token to the new token.
(replace_args): Adjust stringify_arg callers. For #__VA_OPT__,
perform token pasting in a separate loop before stringify_arg call.
gcc/testsuite/
* c-c++-common/cpp/va-opt-8.c: New test.
* c-c++-common/Wimplicit-fallthrough-38.c: New test.
Tamar Christina [Wed, 1 Dec 2021 08:40:25 +0000 (08:40 +0000)]
middle-end: move bitmask match.pd pattern and update tests
Following the previous bugfix this addresses the cosmetic and test issues.
The vector tests are moved to vect and the scalar are left where they are.
gcc/ChangeLog:
* match.pd: Move below pattern that rewrites to EQ, NE.
* tree.c (bitmask_inv_cst_vector_p): Correct do .. while indentation.
gcc/testsuite/ChangeLog:
* gcc.dg/bic-bitmask-10.c: Moved to gcc.dg/vect/vect-bic-bitmask-10.c.
* gcc.dg/bic-bitmask-11.c: Moved to gcc.dg/vect/vect-bic-bitmask-11.c.
* gcc.dg/bic-bitmask-12.c: Moved to gcc.dg/vect/vect-bic-bitmask-12.c.
* gcc.dg/bic-bitmask-3.c: Moved to gcc.dg/vect/vect-bic-bitmask-3.c.
* gcc.dg/bic-bitmask-23.c: Moved to gcc.dg/vect/vect-bic-bitmask-23.c.
* gcc.dg/bic-bitmask-2.c: Moved to gcc.dg/vect/vect-bic-bitmask-2.c.
* gcc.dg/bic-bitmask-4.c: Moved to gcc.dg/vect/vect-bic-bitmask-4.c.
* gcc.dg/bic-bitmask-5.c: Moved to gcc.dg/vect/vect-bic-bitmask-5.c.
* gcc.dg/bic-bitmask-6.c: Moved to gcc.dg/vect/vect-bic-bitmask-6.c.
* gcc.dg/bic-bitmask-8.c: Moved to gcc.dg/vect/vect-bic-bitmask-8.c.
* gcc.dg/bic-bitmask-9.c: Moved to gcc.dg/vect/vect-bic-bitmask-9.c.
Siddhesh Poyarekar [Wed, 1 Dec 2021 07:28:12 +0000 (12:58 +0530)]
tree-optimization/103456 - Record only successes from object_sizes_set
Avoid overwriting osi->changed if object_sizes_set does not update the
size, so that a previous success in the same pass is not overwritten.
This fixes the bootstrap-ubsan build config, which was failing due to
incorrect object size.
gcc/ChangeLog:
PR tree-optimization/103456
* tree-object-size.c (merge_object_sizes): Update osi->changed
only if object_sizes_set succeeded.
gcc/testsuite/ChangeLog:
PR tree-optimization/103456
* gcc.dg/ubsan/pr103456.c: New test.
Co-authored-by: Martin Liška <mliska@suse.cz>
Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
GCC Administrator [Wed, 1 Dec 2021 00:17:04 +0000 (00:17 +0000)]
Daily bump.
liuhongt [Tue, 30 Nov 2021 08:24:39 +0000 (16:24 +0800)]
Fix ICE in ix86_attr_length_immediate_default.
ix86_attr_length_immediate_default assume TYPE ishift only have 1
constant operand,
but *x86_64_shld_1/*x86_shld_1/*x86_64_shrd_1/*x86_shrd_1 has 2, with
condition: INTVAL (operands[3]) == 32 - INTVAL (operands[2]) or
INTVAL (operands[3]) == 64 - INTVAL (operands[2]), and hit
gcc_assert.
Explicitly set_attr length_immediate for these patterns.
gcc/ChangeLog:
PR target/103463
PR target/103484
* config/i386/i386.md (*x86_64_shld_1): Set_attr
length_immediate to 1.
(*x86_shld_1): Ditto.
(*x86_64_shrd_1): Ditto.
(*x86_shrd_1): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr103463.c: New test.
* gcc.target/i386/pr103463-2.c: New test.
Bill Schmidt [Tue, 23 Nov 2021 16:22:58 +0000 (10:22 -0600)]
rs6000: Clarify overloaded builtin diagnostic
When a built-in function required by an overloaded function name is not
currently enabled, the diagnostic message is not as clear as it should be.
Saying that one built-in "requires" another is somewhat misleading. It is
better to explicitly state that the overloaded builtin is implemented by the
missing builtin.
2021-11-23 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Clarify diagnostic.
(altivec_resolve_new_overloaded_builtin): Likewise.
Jonathan Wakely [Tue, 30 Nov 2021 16:07:21 +0000 (16:07 +0000)]
libstdc++: Fix tests that fail with fully-dynamic-string
Fix some tests that assume that a moved-from string is empty, or that
default constructing a string doesn't allocate.
libstdc++-v3/ChangeLog:
* testsuite/21_strings/basic_string/cons/char/moveable.cc: Allow
moved-from string to be non-empty.
* testsuite/21_strings/basic_string/cons/char/moveable2.cc:
Likewise.
* testsuite/21_strings/basic_string/cons/char/moveable2_c++17.cc:
Likewise.
* testsuite/21_strings/basic_string/cons/wchar_t/moveable.cc:
Likewise.
* testsuite/21_strings/basic_string/cons/wchar_t/moveable2.cc:
Likewise.
* testsuite/21_strings/basic_string/cons/wchar_t/moveable2_c++17.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/assign/char/87749.cc:
Construct empty string before setting oom flag.
* testsuite/21_strings/basic_string/modifiers/assign/wchar_t/87749.cc:
Likewise.
Jonathan Wakely [Tue, 30 Nov 2021 12:54:10 +0000 (12:54 +0000)]
libstdc++: Fix fully-dynamic-string build
My last change to the fully-dynamic-string actually broke it. This fixes
the move constructor so it builds, and simplifies it slightly so that
more code is common between the fully-dynamic enabled/disabled cases.
libstdc++-v3/ChangeLog:
* include/bits/cow_string.h (basic_string(basic_string&&)): Fix
mem-initializer for _GLIBCXX_FULLY_DYNAMIC_STRING==0 case.
* testsuite/21_strings/basic_string/cons/char/noexcept_move_construct.cc:
Remove outdated comment.
* testsuite/21_strings/basic_string/cons/wchar_t/noexcept_move_construct.cc:
Likewise.
Jonathan Wakely [Tue, 30 Nov 2021 22:04:49 +0000 (22:04 +0000)]
libstdc++: Ensure C++20 std::stringstream definitions use correct ABI
The definitions of the new C++20 members of std::stringstream etc are
missing when --with-default-libstdcxx-abi=gcc4-compatible is used,
because all the explicit instantiations in src/c++20/sstream-inst.cc are
skipped.
This ensures the contents of that file are compiled with the new ABI, so
the same set of symbols are exported regardless of which ABI is active
by default.
libstdc++-v3/ChangeLog:
* src/c++20/sstream-inst.cc (_GLIBCXX_USE_CXX11_ABI): Define to
select new ABI.
David Malcolm [Tue, 30 Nov 2021 20:31:59 +0000 (15:31 -0500)]
analyzer: add regression test [PR94579]
gcc/testsuite/ChangeLog:
PR analyzer/94579
* gcc.dg/analyzer/pr94579.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Tue, 30 Nov 2021 19:47:24 +0000 (14:47 -0500)]
analyzer: add regression test [PR99269]
gcc/testsuite/ChangeLog:
PR analyzer/99269
* gcc.dg/analyzer/pr99269.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Tue, 30 Nov 2021 19:21:31 +0000 (14:21 -0500)]
analyzer: verify that -Wanalyzer-too-complex can be disabled via pragmas [PR100524]
gcc/testsuite/ChangeLog:
PR analyzer/100524
* gcc.dg/analyzer/pragma-2.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Harald Anlauf [Sat, 27 Nov 2021 20:43:52 +0000 (21:43 +0100)]
Fortran: improve expansion of constant array expressions within constructors
gcc/fortran/ChangeLog:
PR fortran/102787
* array.c (expand_constructor): When encountering a constant array
expression or array section within a constructor, simplify it to
enable better expansion.
gcc/testsuite/ChangeLog:
* gfortran.dg/array_constructor_54.f90: New test.
Jason Merrill [Thu, 25 Nov 2021 15:50:59 +0000 (10:50 -0500)]
c++: don't fold away 'if' with constant condition
richi's recent unreachable code warning experiments had trouble with the C++
front end folding away an 'if' with a constant condition. Let's do less
folding at the statement level.
gcc/cp/ChangeLog:
* cp-gimplify.c (genericize_if_stmt): Always build a COND_EXPR.
Jonathan Wakely [Tue, 30 Nov 2021 13:41:32 +0000 (13:41 +0000)]
libstdc++: Skip tag dispatching for _S_relocate in C++17
In C++17 mode all callers of _S_relocate have already done:
if constexpr (_S_use_relocate())
so we don't need to repeat that check and use tag dispatching to avoid
ill-formed instantiations.
libstdc++-v3/ChangeLog:
* include/bits/stl_vector.h (vector::_S_do_relocate): Remove
C++20 constexpr specifier.
(vector::_S_relocate) [__cpp_if_constexpr]: Call __relocate_a
directly without tag dispatching.
Jonathan Wakely [Tue, 30 Nov 2021 13:14:38 +0000 (13:14 +0000)]
libstdc++: Make Asan detection work for Clang [PR103453]
Clang doesn't define __SANITIZE_ADDRESS__ so use its __has_feature check
to detect Asan instead.
libstdc++-v3/ChangeLog:
PR libstdc++/103453
* config/allocator/malloc_allocator_base.h
(_GLIBCXX_SANITIZE_STD_ALLOCATOR): Define for Clang.
* config/allocator/new_allocator_base.h
(_GLIBCXX_SANITIZE_STD_ALLOCATOR): Likewise.
Harald Anlauf [Mon, 29 Nov 2021 21:56:30 +0000 (22:56 +0100)]
Fortran: error recovery when simplifying MINLOC/MAXLOC
gcc/fortran/ChangeLog:
PR fortran/103473
* simplify.c (simplify_minmaxloc_nodim): Avoid NULL pointer
dereference when shape is not set.
gcc/testsuite/ChangeLog:
PR fortran/103473
* gfortran.dg/minmaxloc_15.f90: New test.
Harald Anlauf [Mon, 29 Nov 2021 21:23:02 +0000 (22:23 +0100)]
Fortran: check type of SUB argument to IMAGE_INDEX
gcc/fortran/ChangeLog:
PR fortran/101565
* check.c (gfc_check_image_index): Verify that SUB argument to
IMAGE_INDEX is of type integer.
gcc/testsuite/ChangeLog:
PR fortran/101565
* gfortran.dg/coarray_49.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
Martin Jambor [Tue, 30 Nov 2021 17:45:11 +0000 (18:45 +0100)]
ipa-sra: Check also ECF_LOOPING_CONST_OR_PURE when evaluating calls
in PR 103267 Honza found out that IPA-SRA does not look at
ECF_LOOPING_CONST_OR_PURE when evaluating if a call can have side
effects. Fixed with this patch. The testcase infinitely loops in a
const function, so it would not make a good addition to the testsuite.
gcc/ChangeLog:
2021-11-29 Martin Jambor <mjambor@suse.cz>
PR ipa/103267
* ipa-sra.c (scan_function): Also check ECF_LOOPING_CONST_OR_PURE flag.
Richard Sandiford [Tue, 30 Nov 2021 17:32:36 +0000 (17:32 +0000)]
vect: Fix ncopies calculation for emulated gather/scatter [PR103494]
I was too eager about removing ncopies calculations in g:
10833849b55.
When emulating gather/scatter, the offset ncopies can be different from
the data ncopies. This patch restores the original calculation.
gcc/
PR tree-optimization/103494
* tree-vect-stmts.c (vect_get_gather_scatter_ops): Remove ncopies
argument and calculate ncopies from gs_info->offset_vectype
where necessary.
(vectorizable_store, vectorizable_load): Update accordingly.
gcc/testsuite/
PR tree-optimization/103494
* gcc.dg/vect/pr103494.c: New test.
* g++.dg/vect/pr103494.cc: Likewise.
Iain Buclaw [Tue, 18 Jun 2019 18:42:10 +0000 (20:42 +0200)]
d: Import dmd
b8384668f, druntime
e6caaab9, phobos
5ab9ad256 (v2.098.0-beta.1)
The D front-end is now itself written in D, in order to build GDC, you
will need a working GDC compiler (GCC version 9.1 or later).
GCC changes:
- Add support for bootstrapping the D front-end.
These add the required components in order to have a D front-end written
in D itself. Because the compiler front-end only depends on the core
runtime modules, only libdruntime is built for the bootstrap stages.
D front-end changes:
- Import dmd v2.098.0-beta.1.
Druntime changes:
- Import druntime v2.098.0-beta.1.
Phobos changes:
- Import phobos v2.098.0-beta.1.
The jump from v2.076.1 to v2.098.0 covers nearly 4 years worth of
development on the D programming language and run-time libraries.
ChangeLog:
* Makefile.def: Add bootstrap to libbacktrace, libphobos, zlib, and
libatomic.
* Makefile.in: Regenerate.
* Makefile.tpl (POSTSTAGE1_HOST_EXPORTS): Fix command for GDC.
(STAGE1_CONFIGURE_FLAGS): Add --with-libphobos-druntime-only if
target-libphobos-bootstrap.
(STAGE2_CONFIGURE_FLAGS): Likewise.
* configure: Regenerate.
* configure.ac: Add support for bootstrapping D front-end.
config/ChangeLog:
* acx.m4 (ACX_PROG_GDC): New m4 function.
gcc/ChangeLog:
* Makefile.in (GDC): New variable.
(GDCFLAGS): New variable.
* configure: Regenerate.
* configure.ac: Add call to ACX_PROG_GDC. Substitute GDCFLAGS.
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd
b8384668f.
* Make-lang.in (d-warn): Use strict warnings.
(DMD_WARN_CXXFLAGS): Remove.
(DMD_COMPILE): Remove.
(CHECKING_DFLAGS): Define.
(WARN_DFLAGS): Define.
(ALL_DFLAGS): Define.
(DCOMPILE.base): Define.
(DCOMPILE): Define.
(DPOSTCOMPILE): Define.
(DLINKER): Define.
(DLLINKER): Define.
(D_FRONTEND_OBJS): Add new dmd front-end objects.
(D_GENERATED_SRCS): Remove.
(D_GENERATED_OBJS): Remove.
(D_ALL_OBJS): Remove D_GENERATED_OBJS.
(d21$(exeext)): Build using DLLINKER and -static-libphobos.
(d.tags): Remove dmd/*.c and dmd/root/*.c.
(d.mostlyclean): Remove D_GENERATED_SRCS, d/idgen$(build_exeext),
d/impcnvgen$(build_exeext).
(D_INCLUDES): Include $(srcdir)/d/dmd/res.
(CFLAGS-d/id.o): Remove.
(CFLAGS-d/impcnvtab.o): Remove.
(d/%.o): Build using DCOMPILE and DPOSTCOMPILE. Update dependencies
from d/dmd/%.c to d/dmd/%.d.
(d/idgen$(build_exeext)): Remove.
(d/impcnvgen$(build_exeext)): Remove.
(d/id.c): Remove.
(d/id.h): Remove.
(d/impcnvtab.c): Remove.
(d/%.dmdgen.o): Remove.
(D_SYSTEM_H): Remove.
(d/idgen.dmdgen.o): Remove.
(d/impcnvgen.dmdgen.o): Remove.
* config-lang.in (boot_language): New variable.
* d-attribs.cc: Include dmd/expression.h.
* d-builtins.cc: Include d-frontend.h.
(build_frontend_type): Update for new front-end interface.
(d_eval_constant_expression): Likewise.
(d_build_builtins_module): Likewise.
(maybe_set_builtin_1): Likewise.
(d_build_d_type_nodes): Likewise.
* d-codegen.cc (d_decl_context): Likewise.
(declaration_reference_p): Likewise.
(declaration_type): Likewise.
(parameter_reference_p): Likewise.
(parameter_type): Likewise.
(get_array_length): Likewise.
(build_delegate_cst): Likewise.
(build_typeof_null_value): Likewise.
(identity_compare_p): Likewise.
(lower_struct_comparison): Likewise.
(build_filename_from_loc): Likewise.
(build_assert_call): Remove LIBCALL_SWITCH_ERROR.
(build_bounds_index_condition): Call LIBCALL_ARRAYBOUNDS_INDEXP on
bounds error.
(build_bounds_slice_condition): Call LIBCALL_ARRAYBOUNDS_SLICEP on
bounds error.
(array_bounds_check): Update for new front-end interface.
(checkaction_trap_p): Handle CHECKACTION_context.
(get_function_type): Update for new front-end interface.
(d_build_call): Likewise.
* d-compiler.cc: Remove include of dmd/scope.h.
(Compiler::genCmain): Remove.
(Compiler::paintAsType): Update for new front-end interface.
(Compiler::onParseModule): Likewise.
* d-convert.cc (convert_expr): Remove call to LIBCALL_ARRAYCAST.
(convert_for_rvalue): Update for new front-end interface.
(convert_for_assignment): Likewise.
(convert_for_condition): Likewise.
(d_array_convert): Likewise.
* d-diagnostic.cc (error): Remove.
(errorSupplemental): Remove.
(warning): Remove.
(warningSupplemental): Remove.
(deprecation): Remove.
(deprecationSupplemental): Remove.
(message): Remove.
(vtip): New.
* d-frontend.cc (global): Remove.
(Global::_init): Remove.
(Global::startGagging): Remove.
(Global::endGagging): Remove.
(Global::increaseErrorCount): Remove.
(Loc::Loc): Remove.
(Loc::toChars): Remove.
(Loc::equals): Remove.
(isBuiltin): Update for new front-end interface.
(eval_builtin): Likewise.
(getTypeInfoType): Likewise.
(inlineCopy): Remove.
* d-incpath.cc: Include d-frontend.h.
(add_globalpaths): Call d_gc_malloc to allocate Strings.
(add_filepaths): Likewise.
* d-lang.cc: Include dmd/id.h, dmd/root/file.h, d-frontend.h. Remove
include of dmd/mars.h, id.h.
(entrypoint_module): Remove.
(entrypoint_root_module): Remove.
(deps_write_string): Update for new front-end interface.
(deps_write): Likewise.
(d_init_options): Call rt_init. Remove setting global params that are
default initialized by the front-end.
(d_handle_option): Handle OPT_fcheckaction_, OPT_fdump_c___spec_,
OPT_fdump_c___spec_verbose, OPT_fextern_std_, OPT_fpreview,
OPT_revert, OPT_fsave_mixins_, and OPT_ftransition.
(d_post_options): Propagate dip1021 and dip1000 preview flags to
dip25, and flag_diagnostics_show_caret to printErrorContext.
(d_add_entrypoint_module): Remove.
(d_parse_file): Update for new front-end interface.
(d_type_promotes_to): Likewise.
(d_types_compatible_p): Likewise.
* d-longdouble.cc (CTFloat::zero): Remove.
(CTFloat::one): Remove.
(CTFloat::minusone): Remove.
(CTFloat::half): Remove.
* d-system.h (POSIX): Remove.
(realpath): Remove.
(isalpha): Remove.
(isalnum): Remove.
(isdigit): Remove.
(islower): Remove.
(isprint): Remove.
(isspace): Remove.
(isupper): Remove.
(isxdigit): Remove.
(tolower): Remove.
(_mkdir): Remove.
(INT32_MAX): Remove.
(INT32_MIN): Remove.
(INT64_MIN): Remove.
(UINT32_MAX): Remove.
(UINT64_MAX): Remove.
* d-target.cc: Include calls.h.
(target): Remove.
(define_float_constants): Remove initialization of snan.
(Target::_init): Update for new front-end interface.
(Target::isVectorTypeSupported): Likewise.
(Target::isVectorOpSupported): Remove cases for unordered operators.
(TargetCPP::typeMangle): Update for new front-end interface.
(TargetCPP::parameterType): Likewise.
(Target::systemLinkage): Likewise.
(Target::isReturnOnStack): Likewise.
(Target::isCalleeDestroyingArgs): Define.
(Target::preferPassByRef): Define.
* d-tree.h (d_add_entrypoint_module): Remove.
* decl.cc (gcc_attribute_p): Update for new front-end interface.
(apply_pragma_crt): Define.
(DeclVisitor::visit(PragmaDeclaration *)): Handle pragmas
crt_constructor and crt_destructor.
(DeclVisitor::visit(TemplateDeclaration *)): Update for new front-end
interface.
(DeclVisitor::visit): Likewise.
(DeclVisitor::finish_vtable): Likewise.
(get_symbol_decl): Error if template has more than one nesting
context. Update for new front-end interface.
(make_thunk): Update for new front-end interface.
(get_vtable_decl): Likewise.
* expr.cc (ExprVisitor::visit): Likewise.
(build_return_dtor): Likewise.
* imports.cc (ImportVisitor::visit): Likewise.
* intrinsics.cc: Include dmd/expression.h. Remove include of
dmd/mangle.h.
(maybe_set_intrinsic): Update for new front-end interface.
* intrinsics.def (INTRINSIC_ROL): Update intrinsic signature.
(INTRINSIC_ROR): Likewise.
(INTRINSIC_ROR_TIARG): Likewise.
(INTRINSIC_TOPREC): Likewise.
(INTRINSIC_TOPRECL): Likewise.
(INTRINSIC_TAN): Update intrinsic module and signature.
(INTRINSIC_ISNAN): Likewise.
(INTRINSIC_ISFINITE): Likewise.
(INTRINSIC_COPYSIGN): Define intrinsic.
(INTRINSIC_COPYSIGNI): Define intrinsic.
(INTRINSIC_EXP): Update intrinsic module.
(INTRINSIC_EXPM1): Likewise.
(INTRINSIC_EXP2): Likewise.
(INTRINSIC_LOG): Likewise.
(INTRINSIC_LOG2): Likewise.
(INTRINSIC_LOG10): Likewise.
(INTRINSIC_POW): Likewise.
(INTRINSIC_ROUND): Likewise.
(INTRINSIC_FLOORF): Likewise.
(INTRINSIC_FLOOR): Likewise.
(INTRINSIC_FLOORL): Likewise.
(INTRINSIC_CEILF): Likewise.
(INTRINSIC_CEIL): Likewise.
(INTRINSIC_CEILL): Likewise.
(INTRINSIC_TRUNC): Likewise.
(INTRINSIC_FMIN): Likewise.
(INTRINSIC_FMAX): Likewise.
(INTRINSIC_FMA): Likewise.
(INTRINSIC_VA_ARG): Update intrinsic signature.
(INTRINSIC_VASTART): Likewise.
* lang.opt (fcheck=): Add alternate aliases for contract switches.
(fcheckaction=): New option.
(check_action): New Enum and EnumValue entries.
(fdump-c++-spec-verbose): New option.
(fdump-c++-spec=): New option.
(fextern-std=): New option.
(extern_stdcpp): New Enum and EnumValue entries
(fpreview=): New options.
(frevert=): New options.
(fsave-mixins): New option.
(ftransition=): Update options.
* modules.cc (get_internal_fn): Replace Prot with Visibility.
(build_internal_fn): Likewise.
(build_dso_cdtor_fn): Likewise.
(build_module_tree): Remove check for __entrypoint module.
* runtime.def (P5): Define.
(ARRAYBOUNDS_SLICEP): Define.
(ARRAYBOUNDS_INDEXP): Define.
(NEWTHROW): Define.
(ADCMP2): Remove.
(ARRAYCAST): Remove.
(SWITCH_STRING): Remove.
(SWITCH_USTRING): Remove.
(SWITCH_DSTRING): Remove.
(SWITCH_ERROR): Remove.
* toir.cc (IRVisitor::visit): Update for new front-end interface.
(IRVisitor::check_previous_goto): Remove checks for case and default
statements.
(IRVisitor::visit(SwitchStatement *)): Remove handling of string
switch conditions.
* typeinfo.cc: Include d-frontend.h.
(get_typeinfo_kind): Update for new front-end interface.
(make_frontend_typeinfo): Likewise.
(TypeInfoVisitor::visit): Likewise.
(builtin_typeinfo_p): Likewise.
(get_typeinfo_decl): Likewise.
(build_typeinfo): Likewise.
* types.cc (valist_array_p): Likewise.
(make_array_type): Likewise.
(merge_aggregate_types): Likewise.
(TypeVisitor::visit(TypeBasic *)): Likewise.
(TypeVisitor::visit(TypeFunction *)): Likewise.
(TypeVisitor::visit(TypeStruct *)): Update comment.
* verstr.h: Removed.
* d-frontend.h: New file.
gcc/po/ChangeLog:
* EXCLUDES: Remove d/dmd sources from list.
gcc/testsuite/ChangeLog:
* gdc.dg/Wcastresult2.d: Update test.
* gdc.dg/asm1.d: Likewise.
* gdc.dg/asm2.d: Likewise.
* gdc.dg/asm3.d: Likewise.
* gdc.dg/gdc282.d: Likewise.
* gdc.dg/imports/gdc170.d: Likewise.
* gdc.dg/intrinsics.d: Likewise.
* gdc.dg/pr101672.d: Likewise.
* gdc.dg/pr90650a.d: Likewise.
* gdc.dg/pr90650b.d: Likewise.
* gdc.dg/pr94777a.d: Likewise.
* gdc.dg/pr95250.d: Likewise.
* gdc.dg/pr96869.d: Likewise.
* gdc.dg/pr98277.d: Likewise.
* gdc.dg/pr98457.d: Likewise.
* gdc.dg/simd1.d: Likewise.
* gdc.dg/simd2a.d: Likewise.
* gdc.dg/simd2b.d: Likewise.
* gdc.dg/simd2c.d: Likewise.
* gdc.dg/simd2d.d: Likewise.
* gdc.dg/simd2e.d: Likewise.
* gdc.dg/simd2f.d: Likewise.
* gdc.dg/simd2g.d: Likewise.
* gdc.dg/simd2h.d: Likewise.
* gdc.dg/simd2i.d: Likewise.
* gdc.dg/simd2j.d: Likewise.
* gdc.dg/simd7951.d: Likewise.
* gdc.dg/torture/gdc309.d: Likewise.
* gdc.dg/torture/pr94424.d: Likewise.
* gdc.dg/torture/pr94777b.d: Likewise.
* lib/gdc-utils.exp (gdc-convert-args): Handle new compiler options.
(gdc-convert-test): Handle CXXFLAGS, EXTRA_OBJC_SOURCES, and ARG_SETS
test directives.
(gdc-do-test): Only import modules in the test run directory.
* gdc.dg/pr94777c.d: New test.
* gdc.dg/pr96156b.d: New test.
* gdc.dg/pr96157c.d: New test.
* gdc.dg/simd_ctfe.d: New test.
* gdc.dg/torture/simd17344.d: New test.
* gdc.dg/torture/simd20052.d: New test.
* gdc.dg/torture/simd6.d: New test.
* gdc.dg/torture/simd7.d: New test.
libphobos/ChangeLog:
* libdruntime/MERGE: Merge upstream druntime
e6caaab9.
* libdruntime/Makefile.am (D_EXTRA_FLAGS): Build libdruntime with
-fpreview=dip1000, -fpreview=fieldwise, and -fpreview=dtorfields.
(ALL_DRUNTIME_SOURCES): Add DRUNTIME_DSOURCES_STDCXX.
(DRUNTIME_DSOURCES): Update list of C binding modules.
(DRUNTIME_DSOURCES_STDCXX): Likewise.
(DRUNTIME_DSOURCES_LINUX): Likewise.
(DRUNTIME_DSOURCES_OPENBSD): Likewise.
(DRUNTIME_DISOURCES): Remove __entrypoint.di.
* libdruntime/Makefile.in: Regenerated.
* libdruntime/__entrypoint.di: Removed.
* libdruntime/gcc/deh.d (_d_isbaseof): Update signature.
(_d_createTrace): Likewise.
(__gdc_begin_catch): Remove reference to the exception.
(_d_throw): Increment reference count of thrown object before unwind.
(__gdc_personality): Chain exceptions with Throwable.chainTogether.
* libdruntime/gcc/emutls.d: Update imports.
* libdruntime/gcc/sections/elf.d: Update imports.
(DSO.moduleGroup): Update signature.
* libdruntime/gcc/sections/macho.d: Update imports.
(DSO.moduleGroup): Update signature.
* libdruntime/gcc/sections/pecoff.d: Update imports.
(DSO.moduleGroup): Update signature.
* src/MERGE: Merge upstream phobos
5ab9ad256.
* src/Makefile.am (D_EXTRA_DFLAGS): Add -fpreview=dip1000 and
-fpreview=dtorfields flags.
(PHOBOS_DSOURCES): Update list of std modules.
* src/Makefile.in: Regenerate.
* testsuite/lib/libphobos.exp (libphobos-dg-test): Handle assembly
compile types.
(dg-test): Override.
(additional_prunes): Define.
(libphobos-dg-prune): Filter any additional_prunes set by tests.
* testsuite/libphobos.aa/test_aa.d: Update test.
* testsuite/libphobos.druntime/druntime.exp (version_flags): Add
-fversion=CoreUnittest.
* testsuite/libphobos.druntime_shared/druntime_shared.exp
(version_flags): Add -fversion=CoreUnittest -fversion=Shared.
* testsuite/libphobos.exceptions/unknown_gc.d: Update test.
* testsuite/libphobos.hash/test_hash.d: Update test.
* testsuite/libphobos.phobos/phobos.exp (version_flags): Add
-fversion=StdUnittest
* testsuite/libphobos.phobos_shared/phobos_shared.exp (version_flags):
Likewise.
* testsuite/libphobos.shared/host.c: Update test.
* testsuite/libphobos.shared/load.d: Update test.
* testsuite/libphobos.shared/load_13414.d: Update test.
* testsuite/libphobos.thread/fiber_guard_page.d: Update test.
* testsuite/libphobos.thread/tlsgc_sections.d: Update test.
* testsuite/testsuite_flags.in: Add -fpreview=dip1000 to --gdcflags.
* testsuite/libphobos.shared/link_mod_collision.d: Removed.
* testsuite/libphobos.shared/load_mod_collision.d: Removed.
* testsuite/libphobos.betterc/betterc.exp: New test.
* testsuite/libphobos.config/config.exp: New test.
* testsuite/libphobos.gc/gc.exp: New test.
* testsuite/libphobos.imports/imports.exp: New test.
* testsuite/libphobos.lifetime/lifetime.exp: New test.
* testsuite/libphobos.unittest/unittest.exp: New test.
Martin Jambor [Tue, 30 Nov 2021 14:35:18 +0000 (15:35 +0100)]
ipa-param-manip: Be careful about a reallocating hash_map
PR 103449 revealed that when I was storing result of one hash_map
lookup into another entry in the hash_map, I was still accessing the
entry in the table, which meanwhile could get reallocated, making the
accesses invalid-after-free.
Fixed with the following, which also simplifies the return statement
which must have been true even now.
gcc/ChangeLog:
2021-11-29 Martin Liska <mliska@suse.cz>
Martin Jambor <mjambor@suse.cz>
PR ipa/103449
* ipa-param-manipulation.c
(ipa_param_body_adjustments::prepare_debug_expressions): Be
careful about hash_map reallocating itself. Simpify a return
which always returns true.
Richard Biener [Tue, 30 Nov 2021 14:25:17 +0000 (15:25 +0100)]
Add comment to indicate tail recursion
My previous change removed an unreachable break; there (an
unreachable continue; would have been more to the point). The
following re-adds a comment explaining that WALK_SUBEXPR_TAIL
does not fall through but tail recurses.
2021-11-30 Richard Biener <rguenther@suse.de>
gcc/fortran/
* frontend-passes.c (gfc_expr_walker): Add comment to
indicate tail recursion.
Andrew MacLeod [Mon, 29 Nov 2021 14:19:34 +0000 (09:19 -0500)]
Always track arguments, even when ignoring equiv params.
To "ignore" ranges from equivalences, we should track the range separately,
but still do the other name processing which determiens if there is a single
name or not for equivalence. Otherwise we mistakently think we can introduce
an equivalences.
gcc/
PR tree-optimization/103440
* gimple-range-fold.cc (fold_using_range::range_of_phi): Continue
normal param processing for equiv params.
gcc/testsuite/
* gcc.dg/pr103440.c: New.
Richard Biener [Mon, 29 Nov 2021 12:19:57 +0000 (13:19 +0100)]
Remove more stray returns and gcc_unreachable ()s
This removes more cases that appear when bootstrap with
-Wunreachable-code-return progresses.
2021-11-29 Richard Biener <rguenther@suse.de>
* config/i386/i386.c (ix86_shift_rotate_cost): Remove
unreachable return.
* tree-chrec.c (evolution_function_is_invariant_rec_p):
Likewise.
* tree-if-conv.c (if_convertible_stmt_p): Likewise.
* tree-ssa-pre.c (fully_constant_expression): Likewise.
* tree-vrp.c (operand_less_p): Likewise.
* reload.c (reg_overlap_mentioned_for_reload_p): Remove
unreachable gcc_unreachable ().
* sel-sched-ir.h (bb_next_bb): Likewise.
* varasm.c (compare_constant): Likewise.
gcc/cp/
* logic.cc (cnf_size_r): Remove unreachable and inconsistently
placed gcc_unreachable ()s.
* pt.c (iterative_hash_template_arg): Remove unreachable
gcc_unreachable and return.
gcc/fortran/
* target-memory.c (gfc_element_size): Remove unreachable return.
gcc/objc/
* objc-act.c (objc_build_setter_call): Remove unreachable
return.
libcpp/
* charset.c (convert_escape): Remove unreachable break.
Richard Biener [Tue, 30 Nov 2021 13:08:19 +0000 (14:08 +0100)]
tree-optimization/103489 - fix ICE when bool pattern recog fails
bool pattern recog currently does not handle cycles correctly
and when it fails we can ICE later vectorizing PHIs with
mismatched bool and non-bool vector types. The following avoids
blindly trusting bool pattern recog here and verifies things
more thoroughly in vectorizable_phi. A bool pattern recog fix
is for GCC 13.
2021-11-30 Richard Biener <rguenther@suse.de>
PR tree-optimization/103489
* tree-vect-loop.c (vectorizable_phi): Verify argument
vector type compatibility to mitigate bool pattern recog
bug.
* gcc.dg/torture/pr103489.c: New testcase.
Martin Liska [Mon, 29 Nov 2021 15:24:12 +0000 (16:24 +0100)]
Change if-to-switch-conversion test.
PR tree-optimization/103278
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/if-to-switch-5.c: Make the test acceptable by
targets with no jump-tables.
Jonathan Wakely [Tue, 30 Nov 2021 13:05:08 +0000 (13:05 +0000)]
libstdc++: Use gender-agnostic pronoun in docs
libstdc++-v3/ChangeLog:
* doc/xml/manual/debug_mode.xml: Replace "his or her" with "they".
* doc/html/manual/debug_mode_design.html: Regenerate.
Jakub Jelinek [Tue, 30 Nov 2021 12:30:27 +0000 (13:30 +0100)]
libstdc++: Add [[nodiscard]] to std::byteswap
This patch adds [[nodiscard]] to std::byteswap, because the function
template doesn't do anything useful if the result isn't used.
2021-11-30 Jakub Jelinek <jakub@redhat.com>
* include/std/bit (byteswap): Add [[nodiscard]].
Thomas Schwinge [Fri, 26 Nov 2021 12:11:16 +0000 (13:11 +0100)]
[OpenACC] Remove erroneous "Orphan reductions cannot have gang partitioning" handling
That is:
-/* Ensure that the middle end does not assign gang level parallelism
- to orphan loop containing reductions. */
+/* Verify that we diagnose "gang reduction on an orphan loop" for automatically
+ assigned gang level of parallelism. */
... to implement what the OpenACC specification actually says.
Fix-up for preceding commit
2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors".
gcc/
* omp-offload.c (oacc_loop_auto_partitions): Remove erroneous
"Orphan reductions cannot have gang partitioning" handling.
gcc/testsuite/
* c-c++-common/goacc/nested-reductions-1-routine.c: Adjust.
* c-c++-common/goacc/nested-reductions-2-routine.c: Adjust.
* c-c++-common/goacc/orphan-reductions-2.c: Adjust.
* gfortran.dg/goacc/nested-reductions-1-routine.f90: Adjust.
* gfortran.dg/goacc/nested-reductions-2-routine.f90: Adjust.
* gfortran.dg/goacc/orphan-reductions-1.f90: Adjust.
* gfortran.dg/goacc/orphan-reductions-2.f90: Adjust.
Thomas Schwinge [Fri, 26 Nov 2021 11:29:26 +0000 (12:29 +0100)]
Consolidate OpenACC "gang reduction on an orphan loop" checking
No need to implement separately in all front ends what we may implement in the
middle end, once for all.
Follow-up to preceding commit
2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors".
gcc/
* omp-offload.c (oacc_loop_process): Implement "gang reduction on
an orphan loop" checking.
gcc/c/
* c-typeck.c (c_finish_omp_clauses): Remove "gang reduction on an
orphan loop" checking.
gcc/cp/
* semantics.c (finish_omp_clauses): Remove "gang reduction on an
orphan loop" checking.
gcc/fortran/
* openmp.c (resolve_oacc_loop_blocks): Remove "gang reduction on
an orphan loop" checking.
(oacc_is_parallel, oacc_is_kernels, oacc_is_serial)
(oacc_is_compute_construct): Remove.
gcc/testsuite/
* gfortran.dg/goacc/orphan-reductions-1.f90: Adjust.
Frederik Harwath [Mon, 20 Jul 2020 09:24:21 +0000 (11:24 +0200)]
Re OpenACC "gang reduction on an orphan loop" error message
Follow-up to preceding commit
2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors".
gcc/fortran/
* openmp.c (oacc_is_parallel_or_serial): Evolve into...
(oacc_is_compute_construct): ... this function.
(resolve_oacc_loop_blocks): Use "oacc_is_compute_construct"
instead of "oacc_is_parallel_or_serial" for checking that a
loop is not orphaned.
gcc/testsuite/
* gfortran.dg/goacc/orphan-reductions-3.f90: New test
verifying that the "gang reduction on an orphan loop" error message
is not emitted for non-orphaned loops.
* c-c++-common/goacc/orphan-reductions-3.c: Likewise for C and C++.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
Kwok Cheung Yeung [Fri, 13 Mar 2020 18:13:49 +0000 (11:13 -0700)]
[OpenACC] Allow gang reductions inside serial constructs
... fixing a regression introduced in the preceding
commit
2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors".
gcc/fortran/
* openmp.c (oacc_is_serial, oacc_is_parallel_or_serial): New.
(resolve_oacc_loop_blocks): Use oacc_is_parallel_or_serial instead of
oacc_is_parallel.
libgomp/
* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Remove
temporary skip.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
Cesar Philippidis [Tue, 2 May 2017 01:27:59 +0000 (18:27 -0700)]
Make OpenACC orphan gang reductions errors
This patch promotes all OpenACC gang reductions on orphan loops as
errors. Accord to the spec, orphan loops are those which are not
lexically nested inside an OpenACC parallel or kernels regions. I.e.,
acc loops inside acc routines.
At first I thought this could be a warning because the gang reduction
finalizer uses an atomic update. However, because there is no
synchronization between gangs, there is way to guarantee that reduction
will have completed once a single gang entity returns from the acc
routine call.
gcc/c/
* c-typeck.c (c_finish_omp_clauses): Emit an error on orphan
OpenACC gang reductions.
gcc/cp/
* semantics.c (finish_omp_clauses): Emit an error on orphan
OpenACC gang reductions.
gcc/fortran/
* openmp.c (oacc_is_parallel, oacc_is_kernels): New 'static'
functions.
(resolve_oacc_loop_blocks): Emit an error on orphan OpenACC gang
reductions.
gcc/
* omp-general.h (enum oacc_loop_flags): Add OLF_REDUCTION enum.
* omp-low.c (lower_oacc_head_mark): Use it to mark OpenACC
reductions.
* omp-offload.c (oacc_loop_auto_partitions): Don't assign gang
level parallelism to orphan reductions.
gcc/testsuite/
* c-c++-common/goacc/nested-reductions-1-routine.c: Adjust.
* c-c++-common/goacc/nested-reductions-2-routine.c: Likewise.
* gcc.dg/goacc/loop-processing-1.c: Likewise.
* gfortran.dg/goacc/nested-reductions-1-routine.f90: Likewise.
* gfortran.dg/goacc/nested-reductions-2-routine.f90: Likewise.
* c-c++-common/goacc/orphan-reductions-1.c: New test.
* c-c++-common/goacc/orphan-reductions-2.c: New test.
* gfortran.dg/goacc/orphan-reductions-1.f90: New test.
* gfortran.dg/goacc/orphan-reductions-2.f90: New test.
libgomp/
* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Temporarily
skip.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
Kwok Cheung Yeung [Tue, 28 Jul 2020 12:41:14 +0000 (05:41 -0700)]
Fix c-c++-common/goacc/routine-4.c and c-c++-common/goacc/routine-4-extern.c testcases
... in preparation for checks that we're introducing for OpenACC gang
reductions on orphan loops.
gcc/testsuite/
* c-c++-common/goacc/routine-4.c (seq, vector, worker, gang):
Remove loop reductions.
* c-c++-common/goacc/routine-4-extern.c (seq, vector, worker, gang):
Likewise.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
Roger Sayle [Tue, 30 Nov 2021 10:25:35 +0000 (10:25 +0000)]
[Committed] PR testsuite/103477: Fix big-endian mistake in new test case.
I missed a spot when adding the "#if __BYTE_ORDER__ == ..." guards to
the new test case for PR tree-optimization/103345. Committed as obvious.
2021-11-30 Roger Sayle <roger@nextmovesoftware.com>
gcc/testsuite/ChangeLog
PR testsuite/103477
* gcc.dg/tree-ssa/pr103345.c: Correct xor test for big-endian.
Aldy Hernandez [Mon, 29 Nov 2021 11:52:45 +0000 (12:52 +0100)]
Remove can_throw_non_call_exceptions special case from operator_div::wi_fold.
PR tree-optimization/103451
gcc/ChangeLog:
* range-op.cc (operator_div::wi_fold): Remove
can_throw_non_call_exceptions special case.
* tree-ssa-sink.c (sink_code_in_bb): Same.
gcc/testsuite/ChangeLog:
* gcc.dg/pr103451.c: New test.
Richard Sandiford [Tue, 30 Nov 2021 09:52:30 +0000 (09:52 +0000)]
vect: Support masked gather loads with SLP
This patch extends the previous SLP gather load support so
that it can handle masked loads too.
gcc/
* tree-vect-slp.c (arg1_arg4_map): New variable.
(vect_get_operand_map): Handle IFN_MASK_GATHER_LOAD.
(vect_build_slp_tree_1): Likewise.
(vect_build_slp_tree_2): Likewise.
* tree-vect-stmts.c (vectorizable_load): Expect the mask to be
the last SLP child node rather than the first.
gcc/testsuite/
* gcc.dg/vect/vect-gather-3.c: New test.
* gcc.dg/vect/vect-gather-4.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_8.c: Likewise.
Richard Sandiford [Tue, 30 Nov 2021 09:52:29 +0000 (09:52 +0000)]
if-conv: Apply VN to hoisted conversions
This patch is a prerequisite for a later one. At the moment,
if-conversion converts predicated POINTER_PLUS_EXPRs into
non-wrapping forms, which for:
… = base + offset
becomes:
tmp = (unsigned long) base
… = tmp + offset
It then hoists these conversions out of the loop where possible.
However, because “base” is a valid gimple operand, there can be
multiple POINTER_PLUS_EXPRs with the same base, which can in turn
lead to multiple instances of the same conversion. The later VN pass
is (and I think needs to be) restricted to the new if-converted code,
whereas here we're deliberately inserting the conversions before the
.LOOP_VECTORIZED condition:
/* If we versioned loop then make sure to insert invariant
stmts before the .LOOP_VECTORIZED check since the vectorizer
will re-use that for things like runtime alias versioning
whose condition can end up using those invariants. */
We can therefore enter the vectoriser with redundant conversions.
The easiest fix seemed to be to defer the hoisting until after VN.
This catches other hoisting opportunities too.
Hoisting the code from the (artificial) loop in pr99102.c means
that it's no longer worth vectorising. The patch forces vectorisation
instead of relying on the cost model.
The patch also reverts pr87007-4.c and pr87007-5.c back to their
original forms, undoing changes in
783dc66f9ccb0019c3dad.
The code at the time the tests were added was:
testl %edi, %edi
je .L10
vxorps %xmm1, %xmm1, %xmm1
vsqrtsd d3(%rip), %xmm1, %xmm0
vsqrtsd d2(%rip), %xmm1, %xmm1
...
.L10:
ret
with the operations being hoisted, and the vxorps was specifically
wanted (compared to the previous code). This patch restores the code
to that form, with the hoisted operations and the vxorps.
gcc/
* tree-if-conv.c: Include tree-eh.h.
(predicate_statements): Remove pe argument. Don't hoist
statements here.
(combine_blocks): Remove pe argument.
(ifcvt_available_on_edge_p, ifcvt_can_hoist): New functions.
(ifcvt_hoist_invariants): Likewise.
(tree_if_conversion): Update call to combine_blocks. Call
ifcvt_hoist_invariants after VN.
gcc/testsuite/
* gcc.dg/vect/pr99102.c: Add -fno-vect-cost-model.
Revert:
2020-09-09 Richard Biener [rguenther@suse.de]
* gcc.target/i386/pr87007-4.c: Adjust.
* gcc.target/i386/pr87007-5.c: Likewise.
Richard Sandiford [Tue, 30 Nov 2021 09:52:29 +0000 (09:52 +0000)]
vect: Support gather loads with SLP
This patch adds SLP support for IFN_GATHER_LOAD. Like the SLP
support for IFN_MASK_LOAD, it works by treating only some of the
arguments as child nodes. Unlike IFN_MASK_LOAD, it requires the
other arguments (base, scale, and extension type) to be the same
for all calls in the group. It does not require/expect the loads
to be in a group (which probably wouldn't make sense for gathers).
I was worried about the possible alias effect of moving gathers
around to be part of the same SLP group. The patch therefore
makes vect_analyze_data_ref_dependence treat gathers and scatters
as a top-level concern, punting if the accesses aren't completely
independent and if the user hasn't told us that a particular
VF is safe. I think in practice we already punted in the same
circumstances; the idea is just to make it more explicit.
gcc/
PR tree-optimization/102467
* doc/sourcebuild.texi (vect_gather_load_ifn): Document.
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence):
Commonize safelen handling. Punt for anything involving
gathers and scatters unless safelen says otherwise.
* tree-vect-slp.c (arg1_map): New variable.
(vect_get_operand_map): Handle IFN_GATHER_LOAD.
(vect_build_slp_tree_1): Likewise.
(vect_build_slp_tree_2): Likewise.
(compatible_calls_p): If vect_get_operand_map returns nonnull,
check that any skipped arguments are equal.
(vect_slp_analyze_node_operations_1): Tighten reduction check.
* tree-vect-stmts.c (check_load_store_for_partial_vectors): Take
an ncopies argument.
(vect_get_gather_scatter_ops): Take slp_node and ncopies arguments.
Handle SLP nodes.
(vectorizable_store, vectorizable_load): Adjust accordingly.
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_gather_load_ifn): New target test.
* gcc.dg/vect/vect-gather-1.c: New test.
* gcc.dg/vect/vect-gather-2.c: Likewise.
* gcc.target/aarch64/sve/gather_load_11.c: Likewise.
Richard Sandiford [Tue, 30 Nov 2021 09:52:28 +0000 (09:52 +0000)]
vect: Use generalised accessors to build SLP nodes
This patch adds:
- gimple_num_args
- gimple_arg
- gimple_arg_ptr
for accessing rhs operands of an assignment, call or PHI. This is
similar to the existing gimple_get_lhs.
I guess there's a danger that these routines could be overused,
such as in cases where gimple_assign_rhs1 etc. would be more
appropriate. I think the routines are still worth having though.
These days, most new operations are added as internal functions rather
than tree codes, so it's useful to be able to handle assignments and
calls in a consistent way.
The patch also generalises the way that SLP child nodes map
to gimple stmt operands. This is useful for later patches.
gcc/
* gimple.h (gimple_num_args, gimple_arg, gimple_arg_ptr): New
functions.
* tree-vect-slp.c (cond_expr_maps, arg2_map): New variables.
(vect_get_operand_map): New function.
(vect_get_and_check_slp_defs): Fix outdated comment.
Use vect_get_operand_map and new gimple argument accessors.
(vect_build_slp_tree_2): Likewise.
Richard Sandiford [Tue, 30 Nov 2021 09:52:28 +0000 (09:52 +0000)]
vect: Use code_helper when building SLP nodes
This patch uses code_helper to represent the common (and
alternative) operations when building an SLP node. It's not
much of a saving on its own, but it helps with later patches.
gcc/
* tree-vect-slp.c (vect_build_slp_tree_1): Use code_helper
to record the operations performed by statements, only using
CALL_EXPR for things that don't map to built-in or internal
functions. For shifts, require all shift amounts to be equal
if optab_vector is not supported but optab_scalar is.
Richard Sandiford [Tue, 30 Nov 2021 09:52:28 +0000 (09:52 +0000)]
vect: Fix SVE mask_gather_load/store_store tests
If-conversion now applies rewrite_to_defined_overflow to the
address calculation in an IFN_MASK_LOAD. This means that we
end up with:
cast_base = (uintptr_t) base;
uncast_sum = cast_base + offset;
sum = (orig_type *) uncast_sum;
If the target supports IFN_MASK_GATHER_LOAD with pointer-sized
offsets for the given vectype, we wouldn't look through the sum
cast and so would needlessly vectorise the uncast_sum addition.
This showed up as several failures in gcc.target/aarch64/sve.
gcc/
* tree-vect-data-refs.c (vect_check_gather_scatter): Continue
processing conversions if the current offset is a pointer.
Richard Sandiford [Tue, 30 Nov 2021 09:52:27 +0000 (09:52 +0000)]
vect: Fix vect_is_reduction
The current definition of vect_is_reduction (provided for target
costing) misses some pattern statements.
gcc/
* tree-vectorizer.h (vect_is_reduction): Use STMT_VINFO_REDUC_IDX.
gcc/testsuite/
* gcc.target/aarch64/sve/cost_model_13.c: New test.
Richard Sandiford [Tue, 30 Nov 2021 09:52:27 +0000 (09:52 +0000)]
vect: Pass mode to gather/scatter tests
vect_check_gather_scatter had a binary “does this target support
internal gather/scatter functions” test. This dates from the time when
we only handled gathers and scatters via direct target support, with
x86_64 using built-in functions and aarch64 using IFNs. But now that we
can emulate gathers, we need to check whether the gather for a particular
mode is going to be emulated or not.
Without this, enabling SVE regresses emulated Advanced SIMD gather
sequences in cases where SVE isn't used.
Livermore kernel 15 can now be vectorised with Advanced SIMD when
SVE is enabled.
gcc/
* genopinit.c (main): Turn supports_vec_gather_load and
supports_vec_scatter_store into signed char arrays and remove
supports_vec_gather_load_cached and supports_vec_scatter_store_cached.
* optabs-query.c (supports_vec_convert_optab_p): Add a mode parameter.
If the mode is not VOIDmode, test only for that mode.
(supports_vec_gather_load_p): Likewise.
(supports_vec_scatter_store_p): Likewise.
* optabs-query.h (supports_vec_gather_load_p): Likewise.
(supports_vec_scatter_store_p): Likewise.
* tree-vect-data-refs.c (vect_check_gather_scatter): Pass the
vector mode to supports_vec_gather_load_p and
supports_vec_scatter_store_p.
gcc/testsuite/
* gfortran.dg/vect/vect-8.f90: Bump number of vectorized loops
to 25 for SVE.
* gcc.target/aarch64/sve/gather_load_10.c: New test.
Richard Sandiford [Tue, 30 Nov 2021 09:52:27 +0000 (09:52 +0000)]
Mark IFN_ADD/MUL_OVERFLOW as commutative
gcc/
* internal-fn.c (commutative_binary_fn_p): Handle IFN_ADD_OVERFLOW
and IFN_MUL_OVERFLOW.
gcc/testsuite/
* gcc.dg/add-mul-overflow-1.c: New test.
Richard Sandiford [Tue, 30 Nov 2021 09:52:26 +0000 (09:52 +0000)]
Mark IFN_UBSAN_CHECK_ADD/MUL as commutative
gcc/
* internal-fn.c (commutative_binary_fn_p): Handle IFN_UBSAN_CHECK_ADD
and IFN_UBSAN_CHECK_MUL.
gcc/testsuite/
* gcc.dg/ubsan/commutative-1.c: New test.
Richard Sandiford [Tue, 30 Nov 2021 09:52:26 +0000 (09:52 +0000)]
Mark IFN_COMPLEX_MUL as commutative
gcc/
* internal-fn.c (commutative_binary_fn_p): Handle IFN_COMPLEX_MUL.
gcc/testsuite/
* gcc.target/aarch64/sve/complex_mul_1.c: New test.
Richard Sandiford [Tue, 30 Nov 2021 09:52:25 +0000 (09:52 +0000)]
Canonicalize argument order for commutative functions
This patch uses information about internal functions to canonicalize
the argument order of calls.
gcc/
* gimple-fold.c: Include internal-fn.h.
(fold_stmt_1): If a function maps to an internal one, use
first_commutative_argument to canonicalize the order of
commutative arguments.
* gimple-match-head.c (gimple_resimplify2, gimple_resimplify3)
(gimple_resimplify4, gimple_resimplify5): Extend commutativity
checks to functions.
gcc/testsuite/
* gcc.dg/fmax-fmin-1.c: New test.
Richard Sandiford [Tue, 30 Nov 2021 09:52:25 +0000 (09:52 +0000)]
vect: Add support for fmax and fmin reductions
This patch adds support for reductions involving calls to fmax*()
and fmin*(), without the -ffast-math flags that allow them to be
converted to MAX_EXPR and MIN_EXPR.
gcc/
* doc/md.texi (reduc_fmin_scal_@var{m}): Document.
(reduc_fmax_scal_@var{m}): Likewise.
* optabs.def (reduc_fmax_scal_optab): New optab.
(reduc_fmin_scal_optab): Likewise
* internal-fn.def (REDUC_FMAX, REDUC_FMIN): New functions.
* tree-vect-loop.c (reduction_fn_for_scalar_code): Handle
CASE_CFN_FMAX and CASE_CFN_FMIN.
(neutral_op_for_reduction): Likewise.
(needs_fold_left_reduction_p): Likewise.
* config/aarch64/iterators.md (FMAXMINV): New iterator.
(fmaxmin): Handle UNSPEC_FMAXNMV and UNSPEC_FMINNMV.
* config/aarch64/aarch64-simd.md (reduc_<optab>_scal_<mode>): Fix
unspec mode.
(reduc_<fmaxmin>_scal_<mode>): New pattern.
* config/aarch64/aarch64-sve.md (reduc_<fmaxmin>_scal_<mode>):
Likewise.
gcc/testsuite/
* gcc.dg/vect/vect-fmax-1.c: New test.
* gcc.dg/vect/vect-fmax-2.c: Likewise.
* gcc.dg/vect/vect-fmax-3.c: Likewise.
* gcc.dg/vect/vect-fmin-1.c: New test.
* gcc.dg/vect/vect-fmin-2.c: Likewise.
* gcc.dg/vect/vect-fmin-3.c: Likewise.
* gcc.target/aarch64/fmaxnm_1.c: Likewise.
* gcc.target/aarch64/fmaxnm_2.c: Likewise.
* gcc.target/aarch64/fminnm_1.c: Likewise.
* gcc.target/aarch64/fminnm_2.c: Likewise.
* gcc.target/aarch64/sve/fmaxnm_2.c: Likewise.
* gcc.target/aarch64/sve/fmaxnm_3.c: Likewise.
* gcc.target/aarch64/sve/fminnm_2.c: Likewise.
* gcc.target/aarch64/sve/fminnm_3.c: Likewise.
Richard Sandiford [Tue, 30 Nov 2021 09:52:24 +0000 (09:52 +0000)]
vect: Make reduction code handle calls
This patch extends the reduction code to handle calls. So far
it's a structural change only; a later patch adds support for
specific function reductions.
Most of the patch consists of using code_helper and gimple_match_op
to describe the reduction operations. The other main change is that
vectorizable_call now needs to handle fully-predicated reductions.
There are some new functions that are provided for ABI completeness
and aren't currently used:
first_commutative_argument
commutative_ternary_op_p
1- and 3-argument forms of gimple_build
gcc/
* builtins.h (associated_internal_fn): Declare overload that
takes a (combined_cfn, return type) pair.
* builtins.c (associated_internal_fn): Split new overload out
of original fndecl version. Also provide an overload that takes
a (combined_cfn, return type) pair.
* internal-fn.h (commutative_binary_fn_p): Declare.
(commutative_ternary_fn_p): Likewise.
(associative_binary_fn_p): Likewise.
* internal-fn.c (commutative_binary_fn_p, commutative_ternary_fn_p):
New functions, split out from...
(first_commutative_argument): ...here.
(associative_binary_fn_p): New function.
* gimple-match.h (code_helper): Add a constructor that takes
internal functions.
(commutative_binary_op_p): Declare.
(commutative_ternary_op_p): Likewise.
(first_commutative_argument): Likewise.
(associative_binary_op_p): Likewise.
(canonicalize_code): Likewise.
(directly_supported_p): Likewise.
(get_conditional_internal_fn): Likewise.
(gimple_build): New overloads that takes a code_helper.
* gimple-fold.c (gimple_build): Likewise.
* gimple-match-head.c (commutative_binary_op_p): New function.
(commutative_ternary_op_p): Likewise.
(first_commutative_argument): Likewise.
(associative_binary_op_p): Likewise.
(canonicalize_code): Likewise.
(directly_supported_p): Likewise.
(get_conditional_internal_fn): Likewise.
* tree-vectorizer.h: Include gimple-match.h.
(neutral_op_for_reduction): Take a code_helper instead of a tree_code.
(needs_fold_left_reduction_p): Likewise.
(reduction_fn_for_scalar_code): Likewise.
(vect_can_vectorize_without_simd_p): Declare a nNew overload that takes
a code_helper.
* tree-vect-loop.c: Include case-cfn-macros.h.
(fold_left_reduction_fn): Take a code_helper instead of a tree_code.
(reduction_fn_for_scalar_code): Likewise.
(neutral_op_for_reduction): Likewise.
(needs_fold_left_reduction_p): Likewise.
(use_mask_by_cond_expr_p): Likewise.
(build_vect_cond_expr): Likewise.
(vect_create_partial_epilog): Likewise. Use gimple_build rather
than gimple_build_assign.
(check_reduction_path): Handle calls and operate on code_helpers
rather than tree_codes.
(vect_is_simple_reduction): Likewise.
(vect_model_reduction_cost): Likewise.
(vect_find_reusable_accumulator): Likewise.
(vect_create_epilog_for_reduction): Likewise.
(vect_transform_cycle_phi): Likewise.
(vectorizable_reduction): Likewise. Make more use of
lane_reduc_code_p.
(vect_transform_reduction): Use gimple_extract_op but expect
a tree_code for now.
(vect_can_vectorize_without_simd_p): New overload that takes
a code_helper.
* tree-vect-stmts.c (vectorizable_call): Handle reductions in
fully-masked loops.
* tree-vect-patterns.c (vect_mark_pattern_stmts): Use
gimple_extract_op when updating STMT_VINFO_REDUC_IDX.
Richard Sandiford [Tue, 30 Nov 2021 09:52:24 +0000 (09:52 +0000)]
gimple-match: Make code_helper conversions explicit
code_helper provides conversions to tree_code and combined_fn.
Now that the codebase is C++11, we can mark these conversions as
explicit. This avoids accidentally using code_helpers with
functions that take tree_codes, which would previously entail
a hidden unchecked conversion.
gcc/
* gimple-match.h (code_helper): Provide == and != overloads.
(code_helper::operator tree_code): Make explicit.
(code_helper::operator combined_fn): Likewise.
* gimple-match-head.c (convert_conditional_op): Use explicit
conversions where necessary.
(gimple_resimplify1, gimple_resimplify2, gimple_resimplify3): Likewise.
(maybe_push_res_to_seq, gimple_simplify): Likewise.
* gimple-fold.c (replace_stmt_with_simplification): Likewise.
Richard Sandiford [Tue, 30 Nov 2021 09:52:24 +0000 (09:52 +0000)]
gimple-match: Add a gimple_extract_op function
code_helper and gimple_match_op seem like generally useful ways
of summing up a gimple_assign or gimple_call (or gimple_cond).
This patch adds a gimple_extract_op function that can be used
for that.
gcc/
* gimple-match.h (code_helper): Add functions for querying whether
the code represents an internal_fn or a built_in_function.
Provide explicit conversion operators for both cases.
(gimple_extract_op): Declare.
* gimple-match-head.c (gimple_extract): New function, extracted from...
(gimple_simplify): ...here.
(gimple_extract_op): New function.
Eric Botcazou [Tue, 30 Nov 2021 09:17:09 +0000 (10:17 +0100)]
Fix -freorder-blocks-and-partition glitch with Windows SEH (continued)
This fixes a thinko in the fix for the -freorder-blocks-and-partition
glitch with SEH on 64-bit Windows:
https://gcc.gnu.org/pipermail/gcc-patches/2021-February/565208.html
Even if no exceptions are active, e.g. in C, we need to consider calls.
gcc/
PR target/103274
* config/i386/i386.c (ix86_output_call_insn): Beef up comment about
nops emitted with SEH.
* config/i386/winnt.c (i386_pe_seh_unwind_emit): When switching to
the cold section, emit a nop before the directive if the previous
active instruction is a call.
Jakub Jelinek [Tue, 30 Nov 2021 08:50:52 +0000 (09:50 +0100)]
libcpp: Enable P1949R7 for C++11 and up as it was a DR [PR100977]
Jonathan mentioned on IRC that:
"Accept P1949R7 (C++ Identifier Syntax using Unicode Standard Annex 31) as
a Defect Report and apply the changes therein to the C++ working paper."
while I've actually implemented it only for -std={gnu,c}++{23,2b}.
As the C++98 rules were significantly different, I'm not trying to change
anything for C++98.
2021-11-30 Jakub Jelinek <jakub@redhat.com>
PR c++/100977
* init.c (lang_defaults): Enable cxx23_identifiers for
-std={gnu,c}++{11,14,17,20} too.
* c-c++-common/cpp/ucnid-2011-1-utf8.c: Expect errors in C++.
* c-c++-common/cpp/ucnid-2011-1.c: Likewise.
* g++.dg/cpp/ucnid-4-utf8.C: Add missing space to dg-options.
* g++.dg/cpp23/normalize3.C: Enable for c++11 rather than just c++23.
* g++.dg/cpp23/normalize4.C: Likewise.
* g++.dg/cpp23/normalize5.C: Likewise.
* g++.dg/cpp23/normalize7.C: Expect errors rather than just warnings
for c++11 and up rather than just c++23.
* g++.dg/cpp23/ucnid-2-utf8.C: Expect errors even for c++11 .. c++20.
Jakub Jelinek [Tue, 30 Nov 2021 08:48:59 +0000 (09:48 +0100)]
c++: Small incremental tweak to source_location::current() folding
I've already committed the patch, but perhaps we shouldn't do it in cp_fold
where it will be folded even for warnings etc. and the locations might not
be the final yet. This patch moves it to cp_fold_r so that it is done just
once for each function and just once for each static initializer.
2021-11-30 Jakub Jelinek <jakub@redhat.com>
* cp-gimplify.c (cp_fold_r): Perform folding of
std::source_location::current() calls here...
(cp_fold): ... rather than here.
Roger Sayle [Tue, 30 Nov 2021 08:35:39 +0000 (08:35 +0000)]
x86_64: PR target/100711: Splitters for pandn
This patch addresses PR target/100711 by introducing define_split
patterns so that not/broadcast/pand may be simplified (by combine)
to broadcast/pandn. This introduces two splitters one for optimizing
pandn on TARGET_SSE for V4SI and V2DI, and another for vpandn on
TARGET_AVX2 for V16QI, V8HI, V32QI, V16HI and V8SI. Each splitter
has its own new testcase.
I've also confirmed that not/broadcast/pandn is already getting
simplified to broadcast/pand by the middle-end optimizers.
2021-11-30 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/100711
* config/i386/sse.md (define_split): New splitters to simplify
not;vec_duplicate;and as vec_duplicate;andn.
gcc/testsuite/ChangeLog
PR target/100711
* gcc.target/i386/pr100711-1.c: New test case.
* gcc.target/i386/pr100711-2.c: New test case.
Richard Biener [Mon, 29 Nov 2021 11:26:39 +0000 (12:26 +0100)]
Only return after resetting type_param_spec_list
This fixes an appearant mistake in gfc_insert_parameter_exprs.
2021-11-29 Richard Biener <rguenther@suse.de>
gcc/fortran/
* decl.c (gfc_insert_parameter_exprs): Only return after
resetting type_param_spec_list.
Richard Biener [Tue, 30 Nov 2021 07:19:24 +0000 (08:19 +0100)]
middle-end/103485 - fix conversion kind for vectors
This makes sure to use a VIEW_CONVERT_EXPR for converting
vector signedness in the -((int)x >> (prec - 1)) to (unsigned)x >> (prec - 1)
simplification.
2021-11-30 Richard Biener <rguenther@suse.de>
PR middle-end/103485
* match.pd (-((int)x >> (prec - 1)) to (unsigned)x >> (prec - 1)):
Use VIEW_CONVERT_EXPR for vectors.
* gcc.dg/pr103485.c: New testcase.
Rasmus Villemoes [Fri, 29 Oct 2021 08:10:12 +0000 (10:10 +0200)]
libgcc: vxcrtstuff.c: add a few undefs
When vxcrtstuff.c was created, the set of #includes was copied from
crtstuff.c. But crtstuff.c also has a bunch of #undefs after the first
#include, because, as the comment says, including auto-host.h when
building objects that are meant for target is technically not
correct.
This manifests when I try do do a canadian cross, with build=linux,
host=windows and target=vxworks, in that we pick up a
#define caddr_t char *
from auto-host.h, which then of course creates a problem when we later
include a target header that has
typedef char * caddr_t;
I assume that the #undefs in crtstuff.c have been added for similar
reasons.
These potentially problematic #defines all seem to be guarded by
#ifndef USED_FOR_TARGET, which tconfig.h defines before including
auto-host.h. So at first, it seems that one could avoid the problem
by simply removing the initial include of auto-host.h. Unfortunately,
we do need some of the things defined in auto-host.h within such an
ifndef USED_FOR_TARGET, namely the define of
HAVE_INITFINI_ARRAY_SUPPORT, which is what later causes
initfini-array.h to define USE_INITFINI_ARRAY. So as the next best
fix, just copy the #undefs from crtstuff.c.
libgcc/
* config/vxcrtstuff.c: Undefine caddr_t, pid_t, rlim_t,
ssize_t and vfork after including auto-host.h.
Richard Biener [Mon, 29 Nov 2021 14:20:38 +0000 (15:20 +0100)]
Avoid some -Wunreachable-code-ctrl
This cleans up unreachable code diagnosed by -Wunreachable-code-ctrl.
It largely follows the previous series but discovers a few extra
cases, namely dead code after break or continue or loops without
exits.
2021-11-29 Richard Biener <rguenther@suse.de>
gcc/c/
* gimple-parser.c (c_parser_gimple_postfix_expression):
avoid unreachable code after break.
gcc/
* cfgrtl.c (skip_insns_after_block): Refactor code to
be more easily readable.
* expr.c (op_by_pieces_d::run): Remove unreachable
assert.
* sched-deps.c (sched_analyze): Remove unreachable
gcc_unreachable.
* sel-sched-ir.c (in_same_ebb_p): Likewise.
* tree-ssa-alias.c (nonoverlapping_refs_since_match_p):
Remove unreachable code.
* tree-vect-slp.c (vectorize_slp_instance_root_stmt):
Refactor to avoid unreachable loop iteration.
* tree.c (walk_tree_1): Remove unreachable break.
* vec-perm-indices.c (vec_perm_indices::series_p): Remove
unreachable return.
gcc/cp/
* parser.c (cp_parser_postfix_expression): Remove
unreachable code.
* pt.c (tsubst_expr): Remove unreachable breaks.
gcc/fortran/
* frontend-passes.c (gfc_expr_walker): Remove unreachable
break.
* scanner.c (skip_fixed_comments): Remove unreachable
gcc_unreachable.
* trans-expr.c (gfc_expr_is_variable): Refactor to make
control flow more obvious.
Kewen Lin [Tue, 30 Nov 2021 03:22:32 +0000 (21:22 -0600)]
rs6000: Remove builtin mask check from builtin_decl [PR102347]
As the discussion in PR102347, currently builtin_decl is invoked so
early, it's when making up the function_decl for builtin functions,
at that time the rs6000_builtin_mask could be wrong for those
builtins sitting in #pragma/attribute target functions, though it
will be updated properly later when LTO processes all nodes.
This patch is to align with the practice i386 port adopts, also
align with r10-7462 by relaxing builtin mask checking in some places.
gcc/ChangeLog:
PR target/102347
* config/rs6000/rs6000-call.c (rs6000_builtin_decl): Remove builtin mask
check.
gcc/testsuite/ChangeLog:
PR target/102347
* gcc.target/powerpc/pr102347.c: New test.
Kewen Lin [Tue, 30 Nov 2021 03:22:27 +0000 (21:22 -0600)]
rs6000: Modify the way for extra penalized cost
This patch follows the discussions here[1][2], where Segher
pointed out the existing way to guard the extra penalized
cost for strided/elementwise loads with a magic bound does
not scale.
The way with nunits * stmt_cost can get one much
exaggerated penalized cost, such as: for V16QI on P8, it's
16 * 20 = 320, that's why we need one bound. To make it
better and more readable, the penalized cost is simplified
as:
unsigned adjusted_cost = (nunits == 2) ? 2 : 1;
unsigned extra_cost = nunits * adjusted_cost;
For V2DI/V2DF, it uses 2 penalized cost for each scalar load
while for the other modes, it uses 1. It's mainly concluded
from the performance evaluations. One thing might be
related is that: More units vector gets constructed, more
instructions are used. It has more chances to schedule them
better (even run in parallelly when enough available units
at that time), so it seems reasonable not to penalize more
for them.
The SPEC2017 evaluations on Power8/Power9/Power10 at option
sets O2-vect and Ofast-unroll show this change is neutral.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579121.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580099.html
gcc/ChangeLog:
* config/rs6000/rs6000.c
(rs6000_cost_data::update_target_cost_per_stmt): Adjust the way to
compute extra penalized cost. Remove useless parameter.
(rs6000_cost_data::rs6000_add_stmt_cost): Adjust the call to function
update_target_cost_per_stmt.
Kewen Lin [Tue, 30 Nov 2021 01:26:20 +0000 (19:26 -0600)]
visium: Revert commit r12-5332
This reverts commit
b8ce19bb1a0592051e8f9a4c3252d12ae605b256
(r12-5332) "visium: Fix non-robust split condition in
define_insn_and_split".
Jeff found newlib failed to build for visium port since
r12-5332, as Eric confirmed, those split conditions in the
related define_insn_and_splits are intentional not to join
with insn condition (&&), since insn condition won't hold
after reload and the proposed concatenation will make the
splitting never happen wrongly.
Andrew MacLeod [Tue, 30 Nov 2021 00:53:50 +0000 (19:53 -0500)]
Don't reuse reference after potential resize.
When a new def chain is requested, any existing reference may no longer
be valid, so just use the object directly.
PR tree-optimization/103467
* gimple-range-gori.cc (range_def_chain::register_dependency): Don't
use an object reference after a potential resize.
GCC Administrator [Tue, 30 Nov 2021 00:16:44 +0000 (00:16 +0000)]
Daily bump.
David Malcolm [Mon, 29 Nov 2021 16:47:47 +0000 (11:47 -0500)]
analyzer: further false leak fixes due to overzealous state merging [PR103217]
Commit r12-5424-gf573d35147ca8433c102e1721d8c99fc432cb44b fixed a false
positive from -Wanalyzer-malloc-leak due to overzealous state merging,
erroneously merging two different svalues bound to a particular part
of the store when one has sm-state.
A further case was discovered by the reporter of PR analyzer/103217,
which this patch fixes. In this variant, different states have set
different fields of a struct, and on attempting to merge them, the
states have a different set of binding keys, leading to one state
having an svalue with sm-state, and its peer state having a NULL value
for that binding key. The state merger code was erroneously treating
them as mergeable to "UNKNOWN". This followup patch fixes things by
rejecting such mergers if the non-NULL svalue is not mergeable with
"UNKNOWN".
gcc/analyzer/ChangeLog:
PR analyzer/103217
* store.cc (binding_cluster::can_merge_p): For the "key is bound"
vs "key is not bound" merger case, check that the bound svalue
is mergeable before merging it to "unknown", rejecting the merger
otherwise.
gcc/testsuite/ChangeLog:
PR analyzer/103217
* gcc.dg/analyzer/pr103217-2.c: New test.
* gcc.dg/analyzer/pr103217-3.c: New test.
* gcc.dg/analyzer/pr103217-4.c: New test.
* gcc.dg/analyzer/pr103217-5.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Uros Bizjak [Mon, 29 Nov 2021 21:16:12 +0000 (22:16 +0100)]
i386: Fix and improve movhi_internal and movhf_internal some more.
An (*v,C) alternative can be added to movhi_internal to directly load
HImode constant 0 to xmm register. Also, V4SFmode moves can be used
for xmm->xmm moves instead of TImode moves when optimizing for size.
Fix invalid %vpinsrw insn template, which needs to duplicate %xmm
register for AVX targets.
Optimize GPR moves in movhf_internal in the same way as in movhi_internal.
Fix pinsrw and pextrw templates for AVX targets. Use sselog1
instead of sselog type. Also, handle TARGET_SSE_PARTIAL_REG_DEPENDENCY
and TARGET_SSE_SPLIT_REGS targets.
2021-11-29 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/102811
* config/i386/i386.md (*movhi_internal): Introduce (*v,C) alternative.
Do not allocate non-GPR registers. Optimize xmm->xmm moves when
optimizing for size. Fix vpinsrw insn template.
(*movhf_internal): Fix pinsrw and pextrw insn templates for
AVX targets. Use sselog1 type instead of sselog. Optimize GPR moves.
Optimize xmm->xmm moves for TARGET_SSE_PARTIAL_REG_DEPENDENCY
and TARGET_SSE_SPLIT_REGS targets.
Martin Sebor [Mon, 29 Nov 2021 20:13:30 +0000 (13:13 -0700)]
Prune out valid -Winfinite-recursion [PR103469].
gcc/testsuite/ChangeLog:
PR testsuite/103469
* c-c++-common/attr-retain-5.c: Prune out valid warning.
* c-c++-common/attr-retain-6.c: Same.
* c-c++-common/attr-retain-9.c: Same.
Eric Gallager [Mon, 29 Nov 2021 19:50:02 +0000 (14:50 -0500)]
Fix autoconf regeneration slip-up.
A stray _AC_FINALIZE somehow snuck into g:909b30a; this should fix it.
gcc/ChangeLog:
* configure: Re-regenerate.
Eric Gallager [Mon, 29 Nov 2021 18:24:12 +0000 (13:24 -0500)]
Make etags path used by build system configurable
This commit allows users to specify a path to their "etags"
executable for use when doing "make tags".
I based this patch off of this one from upstream automake:
https://git.savannah.gnu.org/cgit/automake.git/commit/m4?id=
d2ccbd7eb38d6a4277d6f42b994eb5a29b1edf29
This means that I just supplied variables that the user can override
for the tags programs, rather than having the configure scripts
actually check for them. I handle etags and ctags separately because
the intl subdirectory has separate targets for them. This commit
only affects the subdirectories that use handwritten Makefiles; the
ones that use automake will have to wait until we update the version
of automake used to be 1.16.4 or newer before they'll be fixed.
Addresses #103021
gcc/ChangeLog:
PR other/103021
* Makefile.in: Substitute CTAGS, ETAGS, and CSCOPE
variables. Use ETAGS variable in TAGS target.
* configure: Regenerate.
* configure.ac: Allow CTAGS, ETAGS, and CSCOPE
variables to be overridden.
gcc/ada/ChangeLog:
PR other/103021
* gcc-interface/Make-lang.in: Use ETAGS variable in
TAGS target.
gcc/c/ChangeLog:
PR other/103021
* Make-lang.in: Use ETAGS variable in TAGS target.
gcc/cp/ChangeLog:
PR other/103021
* Make-lang.in: Use ETAGS variable in TAGS target.
gcc/d/ChangeLog:
PR other/103021
* Make-lang.in: Use ETAGS variable in TAGS target.
gcc/fortran/ChangeLog:
PR other/103021
* Make-lang.in: Use ETAGS variable in TAGS target.
gcc/go/ChangeLog:
PR other/103021
* Make-lang.in: Use ETAGS variable in TAGS target.
gcc/objc/ChangeLog:
PR other/103021
* Make-lang.in: Use ETAGS variable in TAGS target.
gcc/objcp/ChangeLog:
PR other/103021
* Make-lang.in: Use ETAGS variable in TAGS target.
intl/ChangeLog:
PR other/103021
* Makefile.in: Use ETAGS variable in TAGS target,
CTAGS variable in CTAGS target, and MKID variable
in ID target.
* configure: Regenerate.
* configure.ac: Allow CTAGS, ETAGS, and MKID
variables to be overridden.
libcpp/ChangeLog:
PR other/103021
* Makefile.in: Use ETAGS variable in TAGS target.
* configure: Regenerate.
* configure.ac: Allow ETAGS variable to be overridden.
libiberty/ChangeLog:
PR other/103021
* Makefile.in: Use ETAGS variable in TAGS target.
* configure: Regenerate.
* configure.ac: Allow ETAGS variable to be overridden.
Paul A. Clarke [Thu, 21 Oct 2021 16:21:01 +0000 (11:21 -0500)]
rs6000: Add Power10 optimization for most _mm_movemask*
Power10 ISA added `vextract*` instructions which are realized in the
`vec_extractm` instrinsic.
Use `vec_extractm` for `_mm_movemask_ps`, `_mm_movemask_pd`, and
`_mm_movemask_epi8` compatibility intrinsics, when `_ARCH_PWR10`.
2021-11-29 Paul A. Clarke <pc@us.ibm.com>
gcc
* config/rs6000/xmmintrin.h (_mm_movemask_ps): Use vec_extractm
when _ARCH_PWR10.
* config/rs6000/emmintrin.h (_mm_movemask_pd): Likewise.
(_mm_movemask_epi8): Likewise.
Richard Biener [Mon, 29 Nov 2021 11:24:30 +0000 (12:24 +0100)]
Fix RTL FE issue with premature return
This fixes an issue discovered by -Wunreachable-code-return
2021-11-29 Richard Biener <rguenther@suse.de>
* read-rtl-function.c (function_reader::read_rtx_operand):
Return only after resetting m_in_call_function_usage.
Patrick Palka [Mon, 29 Nov 2021 12:52:47 +0000 (07:52 -0500)]
c++: redundant explicit 'this' capture before C++20 [PR100493]
As described in detail in the PR, in C++20 implicitly capturing 'this'
via a '=' capture default is deprecated, and in C++17 adding an explicit
'this' capture alongside a '=' capture default is diagnosed as redundant
(and is strictly speaking ill-formed). This means it's impossible to
write, in a forward-compatible way, a C++17 lambda that has a '=' capture
default and that also captures 'this' (implicitly or explicitly):
[=] { this; } // #1 deprecated in C++20, OK in C++17
// GCC issues a -Wdeprecated warning in C++20 mode
[=, this] { } // #2 ill-formed in C++17, OK in C++20
// GCC issues an unconditional warning in C++17 mode
This patch resolves this dilemma by downgrading the warning for #2 into
a -pedantic one. In passing, move it into the -Wc++20-extensions class
of warnings and adjust its wording accordingly.
PR c++/100493
gcc/cp/ChangeLog:
* parser.c (cp_parser_lambda_introducer): In C++17, don't
diagnose a redundant 'this' capture alongside a by-copy
capture default unless -pedantic. Move the diagnostic into
-Wc++20-extensions and adjust wording accordingly.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/lambda-this1.C: Adjust expected diagnostics.
* g++.dg/cpp1z/lambda-this8.C: New test.
* g++.dg/cpp2a/lambda-this3.C: Compile with -pedantic in C++17
to continue to diagnose redundant 'this' captures.
Roger Sayle [Mon, 29 Nov 2021 10:45:11 +0000 (10:45 +0000)]
x86_64: Improved V1TImode rotations by non-constant amounts.
This patch builds on the recent improvements to TImode rotations (and
Jakub's fixes to shldq/shrdq patterns). Now that expanding a TImode
rotation can never fail, it is safe to allow general_operand constraints
on the QImode shift amounts in rotlv1ti3 and rotrv1ti3 patterns.
I've also made an additional tweak to ix86_expand_v1ti_to_ti to use
vec_extract via V2DImode, which avoid using memory and takes advantage
vpextrq on recent hardware.
For the following test case:
typedef unsigned __int128 uv1ti __attribute__ ((__vector_size__ (16)));
uv1ti rotr(uv1ti x, unsigned int i) { return (x >> i) | (x << (128-i)); }
GCC with -O2 -mavx2 would previously generate:
rotr: vmovdqa %xmm0, -24(%rsp)
movq -16(%rsp), %rdx
movl %edi, %ecx
xorl %esi, %esi
movq -24(%rsp), %rax
shrdq %rdx, %rax
shrq %cl, %rdx
testb $64, %dil
cmovne %rdx, %rax
cmovne %rsi, %rdx
negl %ecx
xorl %edi, %edi
andl $127, %ecx
vmovq %rax, %xmm2
movq -24(%rsp), %rax
vpinsrq $1, %rdx, %xmm2, %xmm1
movq -16(%rsp), %rdx
shldq %rax, %rdx
salq %cl, %rax
testb $64, %cl
cmovne %rax, %rdx
cmovne %rdi, %rax
vmovq %rax, %xmm3
vpinsrq $1, %rdx, %xmm3, %xmm0
vpor %xmm1, %xmm0, %xmm0
ret
with this patch, we now generate:
rotr: movl %edi, %ecx
vpextrq $1, %xmm0, %rax
vmovq %xmm0, %rdx
shrdq %rax, %rdx
vmovq %xmm0, %rsi
shrdq %rsi, %rax
andl $64, %ecx
movq %rdx, %rsi
cmovne %rax, %rsi
cmove %rax, %rdx
vmovq %rsi, %xmm0
vpinsrq $1, %rdx, %xmm0, %xmm0
ret
2021-11-29 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-expand.c (ix86_expand_v1ti_to_ti): Perform the
conversion via V2DImode using vec_extractv2didi on TARGET_SSE2.
* config/i386/sse.md (rotlv1ti3, rotrv1ti3): Change constraint
on QImode shift amounts from const_int_operand to general_operand.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse2-v1ti-rotate.c: New test case.
Richard Biener [Wed, 24 Nov 2021 14:57:03 +0000 (15:57 +0100)]
Remove unreachable gcc_unreachable () at the end of functions
It seems to be a style to place gcc_unreachable () after a
switch that handles all cases with every case returning.
Those are unreachable (well, yes!), so they will be elided
at CFG construction time and the middle-end will place
another __builtin_unreachable "after" them to note the
path doesn't lead to a return when the function is not declared
void.
So IMHO those explicit gcc_unreachable () serve no purpose,
if they could be replaced by a comment. But since all cases
cover switches not handling a case or not returning will
likely cause some diagnostic to be emitted which is better
than running into an ICE only at runtime.
2021-11-24 Richard Biener <rguenther@suse.de>
* tree.h (reverse_storage_order_for_component_p): Remove
spurious gcc_unreachable.
* cfganal.c (dfs_find_deadend): Likewise.
* fold-const-call.c (fold_const_logb): Likewise.
(fold_const_significand): Likewise.
* gimple-ssa-store-merging.c (lhs_valid_for_store_merging_p):
Likewise.
gcc/c-family/
* c-format.c (check_format_string): Remove spurious
gcc_unreachable.
Richard Biener [Wed, 24 Nov 2021 14:57:03 +0000 (15:57 +0100)]
Remove unreachable returns
This removes unreachable return statements as diagnosed by
the -Wunreachable-code patch. Some cases are more obviously
an improvement than others - in fact some may get you the idea
to replace them with gcc_unreachable () instead, leading to
cases of the 'Remove unreachable gcc_unreachable () at the end
of functions' patch.
2021-11-25 Richard Biener <rguenther@suse.de>
* vec.c (qsort_chk): Do not return the void return value
from the noreturn qsort_chk_error.
* ccmp.c (expand_ccmp_expr_1): Remove unreachable return.
* df-scan.c (df_ref_equal_p): Likewise.
* dwarf2out.c (is_base_type): Likewise.
(add_const_value_attribute): Likewise.
* fixed-value.c (fixed_arithmetic): Likewise.
* gimple-fold.c (gimple_fold_builtin_fputs): Likewise.
* gimple-ssa-strength-reduction.c (stmt_cost): Likewise.
* graphite-isl-ast-to-gimple.c
(gcc_expression_from_isl_expr_op): Likewise.
(gcc_expression_from_isl_expression): Likewise.
* ipa-fnsummary.c (will_be_nonconstant_expr_predicate):
Likewise.
* lto-streamer-in.c (lto_input_mode_table): Likewise.
gcc/c-family/
* c-opts.c (c_common_post_options): Remove unreachable return.
* c-pragma.c (handle_pragma_target): Likewise.
(handle_pragma_optimize): Likewise.
gcc/c/
* c-typeck.c (c_tree_equal): Remove unreachable return.
* c-parser.c (get_matching_symbol): Likewise.
libgomp/
* oacc-plugin.c (GOMP_PLUGIN_acc_default_dim): Remove unreachable
return.
liuhongt [Mon, 29 Nov 2021 02:01:42 +0000 (10:01 +0800)]
Optimize _Float16 usage for non AVX512FP16.
1. No memory is needed to move HI/HFmode between GPR and SSE registers
under TARGET_SSE2 and above, pinsrw/pextrw are used for them w/o
AVX512FP16.
2. Use gen_sse2_pinsrph/gen_vec_setv4sf_0 to replace
ix86_expand_vector_set in extendhfsf2/truncsfhf2 so that redundant
initialization cound be eliminated.
gcc/ChangeLog:
PR target/102811
* config/i386/i386.c (inline_secondary_memory_needed): HImode
move between GPR and SSE registers is supported under
TARGET_SSE2 and above.
* config/i386/i386.md (extendhfsf2): Optimize expander.
(truncsfhf2): Ditto.
* config/i386/sse.md (sse2p4_1): Adjust attr for V8HFmode to
align with V8HImode.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr102811-2.c: New test.
* gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c: Add new
scan-assembler-times.
liuhongt [Fri, 26 Nov 2021 15:24:20 +0000 (23:24 +0800)]
Fix regression introduced by r12-5536.
There're several failures:
1. unsupported instruction `pextrw` for "pextrw $0, %xmm31, 16(%rax)"
%vpextrw should be used in output templates.
2. ICE in get_attr_memory for movhi_internal since some alternatives
are marked as TYPE_SSELOG.
use TYPE_SSELOG1 instead.
Also this patch fixs a typo and some latent bugs which are related to
moving HImode from/to sse register w/o TARGET_AVX512FP16.
gcc/ChangeLog:
PR target/102811
PR target/103463
* config/i386/i386.c (ix86_secondary_reload): Without
TARGET_SSE4_1, General register is needed to move HImode from
sse register to memory.
* config/i386/sse.md (*vec_extrachf): Use %vpextrw instead of
pextrw in output templates.
* config/i386/i386.md (movhi_internal): Ditto, also fix typo of
MEM_P (operands[1]) and adjust mode/prefix/type attribute for
alternatives related to sse register.
Richard Biener [Mon, 29 Nov 2021 08:15:47 +0000 (09:15 +0100)]
tree-optimization/103458 - avoid creating new loops in CD-DCE
When creating forwarders in CD-DCE we have to avoid creating loops
where we formerly did not consider those because of abnormal
predecessors. At this point simply excuse us when there are any
abnormal predecessors.
2021-11-29 Richard Biener <rguenther@suse.de>
PR tree-optimization/103458
* tree-ssa-dce.c (make_forwarders_with_degenerate_phis): Do not
create forwarders for blocks with abnormal predecessors.
* gcc.dg/torture/pr103458.c: New testcase.
Richard Biener [Fri, 26 Nov 2021 07:50:24 +0000 (08:50 +0100)]
Restore can_be_invalidated_p semantics to before refactoring
This restores the semantics of can_be_invalidated_p to the original
semantics of the function this was split out from tree-ssa-uninit.c.
The current semantics only ever look at the first predicate which
cannot be correct.
2021-11-26 Richard Biener <rguenther@suse.de>
* gimple-predicate-analysis.cc (can_be_invalidated_p):
Restore semantics to the one before the split from
tree-ssa-uninit.c.
Rasmus Villemoes [Mon, 11 Oct 2021 12:37:05 +0000 (14:37 +0200)]
libgcc: remove crt{begin,end}.o from powerpc-wrs-vxworks target
Since commit
78e49fb1bc (Introduce vxworks specific crtstuff support),
the generic crtbegin.o/crtend.o have been unnecessary to build. So
remove them from extra_parts.
This is effectively a revert of commit
9a5b8df70 (libgcc: add
crt{begin,end} for powerpc-wrs-vxworks target).
libgcc/
* config.host (powerpc-wrs-vxworks): Do not add crtbegin.o and
crtend.o to extra_parts.
Kewen Lin [Mon, 29 Nov 2021 01:59:59 +0000 (19:59 -0600)]
rs6000/test: Add emulated gather test case
As verified, the emulated gather capability of vectorizer
(r12-2733) can help to speed up SPEC2017 510.parest_r on
Power8/9/10 by 5% ~ 9% with option sets Ofast unroll and
Ofast lto.
This patch is to add a test case similar to the one in i386
to add testing coverage for 510.parest_r hotspots.
btw, different from the one in i386, this uses unsigned int
as INDEXTYPE since the unpack support for unsigned int
(r12-3134) also matters for the hotspots vectorization.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vect-gather-1.c: New test.
Andrew Pinski [Sun, 28 Nov 2021 02:16:50 +0000 (18:16 -0800)]
Fix PR 19089: Environment variable TMP may yield gcc: abort
Even though I cannot reproduce the ICE any more, this is still
a bug. We check already to see if we can access the directory
but never check to see if the path is actually a directory.
This adds the check and now we reject the file as not usable
as a tmp directory.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
libiberty/ChangeLog:
* make-temp-file.c (try_dir): Check to see if the dir
is actually a directory.
GCC Administrator [Mon, 29 Nov 2021 00:16:16 +0000 (00:16 +0000)]
Daily bump.
Andrew Pinski [Sun, 28 Nov 2021 01:14:59 +0000 (01:14 +0000)]
Fix PR 62157: disclean in libsanitizer not working
So what is happening is DIST_SUBDIRS contains the conditional
directories which is wrong, so we need to force DIST_SUBDIRS
to be the same as SUBDIRS as recommened by the automake manual.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Also now make distclean works inside libsanitizer directory.
libsanitizer/ChangeLog:
PR sanitizer/62157
* Makefile.am: Force DIST_SUBDIRS to be SUBDIRS.
* Makefile.in: Regenerate.
* asan/Makefile.in: Likewise.
* hwasan/Makefile.in: Likewise.
* interception/Makefile.in: Likewise.
* libbacktrace/Makefile.in: Likewise.
* lsan/Makefile.in: Likewise.
* sanitizer_common/Makefile.in: Likewise.
* tsan/Makefile.in: Likewise.
* ubsan/Makefile.in: Likewise.
Jan Hubicka [Sun, 28 Nov 2021 18:42:45 +0000 (19:42 +0100)]
Compare guessed and feedback frequencies during profile feedback stream-in
This patch adds simple code to dump and compare frequencies of basic blocks
read from the profile feedback and frequencies guessed statically.
It dumps basic blocks in the order of decreasing frequencies from feedback
along with guessed frequencies and histograms.
It makes it to possible spot basic blocks in hot regions that are considered
cold by guessed profile or vice versa.
I am trying to figure out how realistic our profile estimate is compared to
read one on exchange2 (looking again into PR98782. There IRA now places spills
into hot regions of code while with older (and worse) profile it did not.
Catch is that the function is very large and has 9 nested loops, so it is hard
to figure out how to improve the profile estimate and/or IRA.
gcc/ChangeLog:
2021-11-28 Jan Hubicka <hubicka@ucw.cz>
* profile.c: Include sreal.h
(struct bb_stats): New.
(cmp_stats): New function.
(compute_branch_probabilities): Output bb stats.
Jan Hubicka [Sun, 28 Nov 2021 18:25:33 +0000 (19:25 +0100)]
Improve -fprofile-report
Profile-report was never properly updated after switch to new profile
representation. This patch fixes the way profile mismatches are calculated:
we used to collect separately count and freq mismatches, while now we have
only counts & probabilities. So we verify
- in count: that total count of incomming edges is close to acutal count of
the BB
- out prob: that total sum of outgoing edge edge probabilities is close
to 1 (except for BB containing noreturn calls or EH).
Moreover I added dumping of absolute data which is useful to plot them: with
Martin Liska we plan to setup regular testing so we keep optimizers profie
updates bit under control.
Finally I added both static and dynamic stats about mismatches - static one is
simply number of inconsistencies in the cfg while dynamic is scaled by the
profile - I think in order to keep eye on optimizers the first number is quite
relevant. WHile when tracking why code quality regressed the second number
matters more.
2021-11-28 Jan Hubicka <hubicka@ucw.cz>
* cfghooks.c: Include sreal.h, profile.h.
(profile_record_check_consistency): Fix checking of count counsistency;
record also dynamic mismatches.
* cfgrtl.c (rtl_account_profile_record): Similarly.
* tree-cfg.c (gimple_account_profile_record): Likewise.
* cfghooks.h (struct profile_record): Remove num_mismatched_freq_in,
num_mismatched_freq_out, turn time to double, add
dyn_mismatched_prob_out, dyn_mismatched_count_in,
num_mismatched_prob_out; remove num_mismatched_count_out.
* passes.c (account_profile_1): New function.
(account_profile_in_list): New function.
(pass_manager::dump_profile_report): Rewrite.
(execute_one_ipa_transform_pass): Check profile consistency after
running all passes.
(execute_all_ipa_transforms): Remove cfun test; record all transform
methods.
(execute_one_pass): Fix collecting of profile stats.
Jakub Jelinek [Sun, 28 Nov 2021 15:32:24 +0000 (16:32 +0100)]
libstdc++: Implement std::byteswap for C++23
This patch attempts to implement P1272R4 (except for the std::bit_cast
changes in there which seem quite unrelated to this and will need to be
fixed on the compiler side).
While at least for GCC __builtin_bswap{16,32,64,128} should work fine
in constant expressions, I wonder about other compilers, so I'm using
a fallback implementation for constexpr evaluation always.
If you think that is unnecessary, I can drop the
__cpp_if_consteval >= 202106L &&
if !consteval
{
and
}
and reformat.
The fallback implementation is an attempt to make it work even for integral
types that don't have number of bytes divisible by 2 or when __CHAR_BIT__
is e.g. 16.
2021-11-28 Jakub Jelinek <jakub@redhat.com>
* include/std/bit (__cpp_lib_byteswap, byteswap): Define.
* include/std/version (__cpp_lib_byteswap): Define.
* testsuite/26_numerics/bit/bit.byteswap/byteswap.cc: New test.
* testsuite/26_numerics/bit/bit.byteswap/version.cc: New test.
Martin Liska [Sun, 28 Nov 2021 08:39:40 +0000 (09:39 +0100)]
d: fix thinko in optimize attr parsing
gcc/d/ChangeLog:
* d-attribs.cc (parse_optimize_options): Fix thinko.
GCC Administrator [Sun, 28 Nov 2021 00:16:20 +0000 (00:16 +0000)]
Daily bump.
John David Anglin [Sat, 27 Nov 2021 21:47:47 +0000 (21:47 +0000)]
Fix typo in t-dimode
2021-11-27 John David Anglin <danglin@gcc.gnu.org>
libgcc/ChangeLog:
* config/pa/t-dimode (lib2difuncs): Fix typo.
Petter Tomner [Sat, 27 Nov 2021 14:52:15 +0000 (15:52 +0100)]
jit: Change printf specifiers for size_t to %zu
Change four occurances of %ld specifier for size_t to %zu for clean 32bit builds.
Signed-off-by
2021-11-27 Petter Tomner <tomner@kth.se>
gcc/jit/
* libgccjit.c: %ld -> %zu
Jakub Jelinek [Sat, 27 Nov 2021 12:02:06 +0000 (13:02 +0100)]
x86: Fix up x86_{,64_}sh{l,r}d patterns [PR103431]
The following testcase is miscompiled because the x86_{,64_}sh{l,r}d
patterns don't properly describe what the instructions do. One thing
is left out, in particular that there is initial count &= 63 for
sh{l,r}dq and initial count &= 31 for sh{l,r}d{l,w}. And another thing
not described properly, in particular the behavior when count (after the
masking) is 0. The pattern says it is e.g.
res = (op0 << op2) | (op1 >> (64 - op2))
but that triggers UB on op1 >> 64. For op2 0 we actually want
res = (op0 << op2) | 0
When constants are propagated to these patterns during RTL optimizations,
both such problems trigger wrong-code issues.
This patch represents the patterns as e.g.
res = (op0 << (op2 & 63)) | (unsigned long long) ((uint128_t) op1 >> (64 - (op2 & 63)))
so there is both the initial masking and op2 == 0 behavior results in
zero being ored.
The patch introduces alternate patterns for constant op2 where
simplify-rtx.c will fold those expressions into simple numbers,
and define_insn_and_split pre-reload splitter for how the patterns
looked before into the new form, so that it can pattern match during
combine even computations that assumed the shift amount will be in
the range of 1 .. bitsize-1.
2021-11-27 Jakub Jelinek <jakub@redhat.com>
PR middle-end/103431
* config/i386/i386.md (x86_64_shld, x86_shld, x86_64_shrd, x86_shrd):
Change insn pattern to accurately describe the instructions.
(*x86_64_shld_1, *x86_shld_1, *x86_64_shrd_1, *x86_shrd_1): New
define_insn patterns.
(*x86_64_shld_2, *x86_shld_2, *x86_64_shrd_2, *x86_shrd_2): New
define_insn_and_split patterns.
(*ashl<dwi>3_doubleword_mask, *ashl<dwi>3_doubleword_mask_1,
*<insn><dwi>3_doubleword_mask, *<insn><dwi>3_doubleword_mask_1,
ix86_rotl<dwi>3_doubleword, ix86_rotr<dwi>3_doubleword): Adjust
splitters for x86_{,64_}sh{l,r}d pattern changes.
* gcc.dg/pr103431.c: New test.
Jakub Jelinek [Sat, 27 Nov 2021 12:00:55 +0000 (13:00 +0100)]
bswap: Fix UB in find_bswap_or_nop_finalize [PR103435]
On gcc.c-torture/execute/pr103376.c in the following code we trigger UB
in the compiler. n->range is 8 because it is 64-bit load and rsize is 0
because it is a bswap sequence with load and known to be 0:
/* Find real size of result (highest non-zero byte). */
if (n->base_addr)
for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++);
else
rsize = n->range;
The shifts then shift uint64_t by 64 bits. For this case mask is 0
and we want both *cmpxchg and *cmpnop as 0, the operation can be done as
both nop and bswap and callers will prefer nop.
2021-11-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/103435
* gimple-ssa-store-merging.c (find_bswap_or_nop_finalize): Avoid UB if
n->range - rsize == 8, just clear both *cmpnop and *cmpxchg in that
case.
Roger Sayle [Sat, 27 Nov 2021 10:13:31 +0000 (10:13 +0000)]
[Committed] Fix new ivopts-[89].c test cases for -m32.
2021-11-27 Roger Sayle <roger@nextmovesoftware.com>
gcc/testsuite/ChangeLog
* gcc.dg/tree-ssa/ivopts-8.c: Fix new test case for -m32.
* gcc.dg/tree-ssa/ivopts-9.c: Likewise.
GCC Administrator [Sat, 27 Nov 2021 00:16:19 +0000 (00:16 +0000)]
Daily bump.
Martin Jambor [Sat, 27 Nov 2021 00:00:56 +0000 (01:00 +0100)]
ipa: Fix CFG fix-up in IPA-CP transform phase (PR 103441)
I forgot that IPA passes before ipa-inline must not return
TODO_cleanup_cfg from their transformation function because ordinary
CFG cleanup does not remove call graph edges associated with removed
call statements but must use
delete_unreachable_blocks_update_callgraph instead. This patch fixes
that error.
gcc/ChangeLog:
2021-11-26 Martin Jambor <mjambor@suse.cz>
PR ipa/103441
* ipa-prop.c (ipcp_transform_function): Call
delete_unreachable_blocks_update_callgraph instead of returning
TODO_cleanup_cfg.