Andrew Pinski [Sat, 19 Jun 2021 17:08:21 +0000 (10:08 -0700)]
Duplicate the range information of the phi onto the new ssa_name
Since match_simplify_replacement uses gimple_simplify, a new ssa name is
sometimes created; when we then replace the phi edge with this new ssa
name, the range information on the phi is lost.
Placing this in replace_phi_edge_with_variable is the best option, instead
of doing it each time replace_phi_edge_with_variable is called, which is
what is done today.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
* tree-ssa-phiopt.c (replace_phi_edge_with_variable): Duplicate range
info if we're the only things setting the target PHI.
(value_replacement): Don't duplicate range here.
(minmax_replacement): Likewise.
Richard Biener [Mon, 28 Jun 2021 09:05:46 +0000 (11:05 +0200)]
tree-optimization/101229 - fix vectorizer SLP hybrid detection with PHIs
This fixes the missing handling of PHIs in gimple_walk_op which causes
the new vectorizer SLP hybrid detection scheme to fail.
2021-06-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/101229
* gimple-walk.c (gimple_walk_op): Handle PHIs.
* gcc.dg/torture/pr101229.c: New testcase.
Martin Liska [Wed, 23 Jun 2021 13:48:28 +0000 (15:48 +0200)]
v850: silence 2 warnings
Silences:
/home/marxin/Programming/gcc/gcc/config/v850/v850.c: In function ‘char* construct_dispose_instruction(rtx)’:
/home/marxin/Programming/gcc/gcc/config/v850/v850.c:2690:22: warning: ‘%s’ directive writing up to 99 bytes into a region of size between 79 and 89 [-Wformat-overflow=]
2690 | sprintf (buff, "dispose %d {%s}, r31", stack_bytes / 4, regs);
| ^~~~~~~~~~~~~~~~~~~~~~ ~~~~
/home/marxin/Programming/gcc/gcc/config/v850/v850.c:2690:15: note: ‘sprintf’ output between 18 and 127 bytes into a destination of size 100
2690 | sprintf (buff, "dispose %d {%s}, r31", stack_bytes / 4, regs);
| ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/marxin/Programming/gcc/gcc/config/v850/v850.c: In function ‘char* construct_prepare_instruction(rtx)’:
/home/marxin/Programming/gcc/gcc/config/v850/v850.c:2814:22: warning: ‘%s’ directive writing up to 99 bytes into a region of size 91 [-Wformat-overflow=]
2814 | sprintf (buff, "prepare {%s}, %d", regs, (- stack_bytes) / 4);
| ^~~~~~~~~~~~~~~~~~ ~~~~
/home/marxin/Programming/gcc/gcc/config/v850/v850.c:2814:15: note: ‘sprintf’ output between 14 and 123 bytes into a destination of size 100
2814 | sprintf (buff, "prepare {%s}, %d", regs, (- stack_bytes) / 4);
| ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gcc/ChangeLog:
* config/v850/v850.c (construct_dispose_instruction): Allocate
a bigger buffer.
(construct_prepare_instruction): Likewise.
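The class of overflow reported by -Wformat-overflow above can be reproduced outside GCC. The following is only a sketch under stated assumptions: format_dispose, REGS_MAX and BUF_SIZE are hypothetical names, and this is not the v850.c code (per the ChangeLog, the actual fix simply allocates a bigger buffer). It sizes the destination from the worst case of each formatted piece and uses snprintf so that truncation is detectable rather than silent.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Longest register list assumed here, matching the "up to 99 bytes" in the warning.
constexpr std::size_t REGS_MAX = 99;

// Worst case for "dispose %d {%s}, r31":
// "dispose " (8) + widest int (11) + " {" (2) + regs + "}, r31" (6) + NUL (1).
constexpr std::size_t BUF_SIZE = 8 + 11 + 2 + REGS_MAX + 6 + 1;

// Hypothetical helper: formats the instruction text, or returns an empty
// string if the output would not fit in the buffer.
std::string format_dispose(int stack_bytes, const char* regs) {
    char buff[BUF_SIZE];
    int n = std::snprintf(buff, sizeof buff, "dispose %d {%s}, r31",
                          stack_bytes / 4, regs);
    if (n < 0 || static_cast<std::size_t>(n) >= sizeof buff)
        return std::string();  // encoding error or truncation
    return std::string(buff, static_cast<std::size_t>(n));
}
```

With this sizing, a 99-character register list still fits, and anything longer is reported as a failure instead of writing past the end of the buffer.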
Martin Liska [Wed, 23 Jun 2021 13:46:22 +0000 (15:46 +0200)]
v850: add v850_can_inline_p target hook
gcc/ChangeLog:
* config/v850/v850.c (v850_option_override): Build default
target node.
(v850_can_inline_p): New. Allow MASK_PROLOG_FUNCTION to be
ignored for inlining.
(TARGET_CAN_INLINE_P): New.
Richard Biener [Mon, 28 Jun 2021 07:42:58 +0000 (09:42 +0200)]
tree-optimization/101207 - fix BB reduc permute elide with live stmts
This fixes breakage of live lane extracts from permuted loads we elide
from BB reduction vectorization by handling the un-permuting the same
way as in the regular eliding code - apply the reverse permute to
both the scalar stmts and the load permutation.
2021-06-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/101207
* tree-vect-slp.c (vect_optimize_slp): Do BB reduction
permute eliding for load permutations properly.
* gcc.dg/vect/bb-slp-pr101207.c: New testcase.
Richard Biener [Wed, 23 Jun 2021 07:59:28 +0000 (09:59 +0200)]
tree-optimization/101173 - fix interchange dependence checking
This adjusts the loop interchange dependence checking to disallow
an outer loop dependence distance of zero.
2021-06-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/101173
* gimple-loop-interchange.cc
(tree_loop_interchange::valid_data_dependences): Disallow outer
loop dependence distance of zero.
* gcc.dg/torture/pr101173.c: New testcase.
liuhongt [Mon, 24 May 2021 02:57:52 +0000 (10:57 +0800)]
For 128/256-bit vec_cond_expr, when the mask operand is (lt reg const0_rtx), blendv can be used instead of an AVX512 mask.
gcc/ChangeLog:
PR target/100648
* config/i386/sse.md (*avx_cmp<mode>3_lt): New
define_insn_and_split.
(*avx_cmp<mode>3_ltint): Ditto.
(*avx2_pcmp<mode>3_3): Ditto.
(*avx2_pcmp<mode>3_4): Ditto.
(*avx2_pcmp<mode>3_5): Ditto.
gcc/testsuite/ChangeLog:
PR target/100648
* g++.target/i386/avx2-pr54700-2.C: Adjust testcase.
* g++.target/i386/avx512vl-pr54700-1a.C: New test.
* g++.target/i386/avx512vl-pr54700-1b.C: New test.
* g++.target/i386/avx512vl-pr54700-2a.C: New test.
* g++.target/i386/avx512vl-pr54700-2b.C: New test.
* gcc.target/i386/avx512vl-pr100648.c: New test.
* gcc.target/i386/avx512vl-blendv-1.c: New test.
* gcc.target/i386/avx512vl-blendv-2.c: New test.
liuhongt [Fri, 21 May 2021 01:48:18 +0000 (09:48 +0800)]
Fold blendv builtins into gimple.
Fold __builtin_ia32_pblendvb128 (a, b, c) as VEC_COND_EXPR (c < 0, b,
a), similar for float version but with mask operand VIEW_CONVERT_EXPR
to same sized integer vectype.
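The selection rule described above can be modelled lane by lane in portable C++. This is only an illustrative scalar model, not the GCC folding code; v16qi and blendv_model are hypothetical names chosen for the sketch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-in for a 128-bit vector of 16 signed bytes.
using v16qi = std::array<std::int8_t, 16>;

// Models VEC_COND_EXPR (c < 0, b, a): each lane takes b when the mask
// lane is negative (sign bit set), otherwise a.
v16qi blendv_model(const v16qi& a, const v16qi& b, const v16qi& c) {
    v16qi r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = (c[i] < 0) ? b[i] : a[i];
    return r;
}
```

Only the sign of each mask lane matters in this model, which is why a comparison against zero is enough to express the selection.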
gcc/ChangeLog:
* config/i386/i386-builtin.def (IX86_BUILTIN_BLENDVPD256,
IX86_BUILTIN_BLENDVPS256, IX86_BUILTIN_PBLENDVB256,
IX86_BUILTIN_BLENDVPD, IX86_BUILTIN_BLENDVPS,
IX86_BUILTIN_PBLENDVB128): Replace icode with
CODE_FOR_nothing.
* config/i386/i386.c (ix86_gimple_fold_builtin): Fold blendv
builtins.
* config/i386/sse.md (*<sse4_1_avx2>_pblendvb_lt_subreg_not):
New pre_reload splitter.
gcc/testsuite/ChangeLog:
* gcc.target/i386/funcspec-8.c: Replace
__builtin_ia32_blendvpd with __builtin_ia32_roundps_az.
* gcc.target/i386/blendv-1.c: New test.
* gcc.target/i386/blendv-2.c: New test.
GCC Administrator [Mon, 28 Jun 2021 00:16:30 +0000 (00:16 +0000)]
Daily bump.
Andrew Pinski [Sun, 27 Jun 2021 20:14:48 +0000 (13:14 -0700)]
Fix PR 101230: ICE in fold_cond_expr_with_comparison
This fixes PR 101230 where I had messed up and forgot that
invert_tree_comparison can return ERROR_MARK if the comparison
is not invertible (floating point types).
Committed as obvious after a bootstrap/test on x86_64-linux-gnu.
gcc/ChangeLog:
PR middle-end/101230
* fold-const.c (fold_ternary_loc): Check
the return value of invert_tree_comparison.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr101230-1.c: New test.
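One way to see why floating-point comparisons resist simple inversion is NaN: every ordered comparison involving a NaN is false, so the logical negation of a < b is not the same predicate as a >= b. The sketch below shows this at the C++ source level only; the function names are hypothetical and it makes no claim about the exact conditions under which invert_tree_comparison gives up.

```cpp
#include <limits>

// Logical negation of a < b: true whenever a < b is false, which includes
// the case where either operand is NaN.
bool not_less(double a, double b) { return !(a < b); }

// The naive "inverse" comparison: false when either operand is NaN.
bool greater_equal(double a, double b) { return a >= b; }

// Convenience helper returning a quiet NaN.
double quiet_nan() { return std::numeric_limits<double>::quiet_NaN(); }
```

For ordinary operands the two predicates agree; they differ only when a NaN is involved (assuming the code is not built with options such as -ffast-math that let the compiler ignore NaNs).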
David Edelsohn [Thu, 24 Jun 2021 19:40:25 +0000 (15:40 -0400)]
aix: Add AIX 7.3 configuration and SPDX License Identifiers.
The anticipated release of AIX 7.3 has been announced. This
patch adds the configuration bits based on AIX 7.2 configuration.
gcc/ChangeLog:
* config.gcc: Add SPDX License Identifier.
(powerpc-ibm-aix789): Default to aix73.h.
(powerpc-ibm-aix7.2.*.*): New stanza.
* config/rs6000/aix72.h: Add SPDX License Identifier.
* config/rs6000/aix73.h: New file.
GCC Administrator [Sun, 27 Jun 2021 00:16:24 +0000 (00:16 +0000)]
Daily bump.
Patrick Palka [Sat, 26 Jun 2021 15:05:54 +0000 (11:05 -0400)]
c++: access scope during partial spec matching [PR96204]
Here, when determining whether the partial specialization matches
has_type_member<Child>, we do so from the scope of where the template-id
appears rather than from the scope of the specialization, and this
causes us to select the partial specialization (since Child::type is
accessible from Parent). When we later instantiate this partial
specialization, we've entered the scope of the specialization and so
substitution into e.g. the DECL_CONTEXT of has_type_member::value fails
with access errors since the friend declaration that we relied on to
choose the partial specialization no longer applies.
It seems the appropriate access scope from which to perform partial
specialization matching is the specialization itself (similar to how
we check access of base-clauses), which is what this patch implements.
PR c++/96204
gcc/cp/ChangeLog:
* pt.c (instantiate_class_template_1): Enter the scope of the
type when calling most_specialized_partial_spec.
gcc/testsuite/ChangeLog:
* g++.dg/template/access40.C: New test.
* g++.dg/template/access40a.C: New test.
Jason Merrill [Thu, 24 Jun 2021 14:37:42 +0000 (10:37 -0400)]
except: remove dwarf2out.h dependency
When thinking about the CTF debug patchset dwarf2out.h split, I noticed that
except.c only needs macros from dwarf2.h, nothing from dwarf2out.h.
gcc/ChangeLog:
* except.c: #include "dwarf2.h" instead of "dwarf2out.h".
Jason Merrill [Thu, 24 Jun 2021 21:32:02 +0000 (17:32 -0400)]
c++: constexpr aggr init of empty class [PR101040]
This is basically the aggregate initializer version of PR97566; as in that
bug, we are trying to initialize empty field 'obj' in 'single' when there's
no CONSTRUCTOR entry for the 'single' base class subobject of 'derived'. As
with that bug, the fix is to stop trying to add entries for empty fields,
this time in cxx_eval_bare_aggregate.
The change to the other function isn't necessary for this version of
the patch, but seems worthwhile for robustness anyway.
PR c++/101040
PR c++/97566
gcc/cp/ChangeLog:
* class.c (is_empty_field): Handle null argument.
* constexpr.c (cxx_eval_bare_aggregate): Discard initializer
for empty field.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/no_unique_address13.C: New test.
Andrew Pinski [Sun, 13 Jun 2021 01:58:03 +0000 (18:58 -0700)]
Lower for loops before lowering cond in genmatch
While converting some fold_cond_expr_with_comparison
to match, I found that I wanted to use "for cnd (cond vec_cond)"
but that was not causing the lowering of cond to happen.
What was happening was the lowering of the for loop
was happening after the lowering of the cond. So
swapping was the correct thing to do but it also
means we need to copy for_subst_vec in lower_cond.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
* genmatch.c (lower_cond): Copy for_subst_vec
for the simplify also.
(lower): Swap the order for lower_for and lower_cond.
Andrew Pinski [Sat, 19 Jun 2021 00:55:51 +0000 (17:55 -0700)]
Reset the range info on the moved instruction in PHIOPT
I had missed this when I wrote the patch which allowed the
gimple to be moved from inside the conditional as is. It
was also missed in the review. Anyway, the range information
needs to be reset for the moved gimple as it was under a
conditional and the flow has changed to be unconditional.
I have not seen any testcase in the wild that produces wrong code
yet which is why there is no testcase but this is similar to what
the other code in phiopt does so after moving those to match, there
might be some.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
* tree-ssa-phiopt.c (match_simplify_replacement): Reset
flow sensitive info on the moved ssa set.
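Why stale range information is dangerous once a statement runs unconditionally can be shown with a toy model. This is a hedged sketch, not GCC code: Range, contains, guarded_range and hoisted are hypothetical names, and the guard "0 <= x && x <= 10" is an assumed example condition.

```cpp
// Hypothetical closed interval [lo, hi].
struct Range { int lo; int hi; };

bool contains(const Range& r, int v) { return r.lo <= v && v <= r.hi; }

// Range that would be valid for t = x + 1 while the statement sat under
// an assumed guard "if (0 <= x && x <= 10)".
constexpr Range guarded_range{1, 11};

// The same statement after being moved out of the conditional: it now
// executes for every x, not just those that passed the guard.
int hoisted(int x) { return x + 1; }
```

A value such as x = -5 never reached the statement while it was guarded, but after the move it produces -4, which lies outside the recorded range; keeping that range would let later passes draw wrong conclusions.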
Andrew Pinski [Sat, 12 Jun 2021 02:52:30 +0000 (19:52 -0700)]
Expand the comparison argument of fold_cond_expr_with_comparison
To make it slightly easier to convert fold_cond_expr_with_comparison
over to match.pd, the arg0 argument is expanded into 3 different
arguments. This was simple because we don't use arg0 after grabbing
the code and the two operands.
Also since we do this, we don't need to fold the comparison to
get the inverse but just use invert_tree_comparison directly.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
* fold-const.c (fold_cond_expr_with_comparison):
Expand arg0 into comp_code, arg00, and arg01.
(fold_ternary_loc): Use invert_tree_comparison
instead of fold_invert_truthvalue for the case
where we have A CMP B ? C : A.
GCC Administrator [Sat, 26 Jun 2021 00:16:39 +0000 (00:16 +0000)]
Daily bump.
Marek Polacek [Tue, 8 Jun 2021 21:44:13 +0000 (17:44 -0400)]
c++: Failure to delay noexcept parsing with ptr-operator [PR100752]
We weren't passing 'flags' to the recursive call to cp_parser_declarator
in the ptr-operator case and as a result, delayed parsing of noexcept
didn't work as advertised. The following change passes more than just
CP_PARSER_FLAGS_DELAY_NOEXCEPT but that doesn't seem to break anything.
I'm now also passing member_p and static_p; as a consequence, two tests
needed small tweaks.
PR c++/100752
gcc/cp/ChangeLog:
* parser.c (cp_parser_declarator): Pass flags down to
cp_parser_declarator. Also pass static_p/member_p.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/noexcept69.C: New test.
* g++.dg/parse/saved1.C: Adjust dg-error.
* g++.dg/template/crash50.C: Likewise.
David Malcolm [Fri, 25 Jun 2021 23:07:30 +0000 (19:07 -0400)]
jit: fix test-vector-* failures
Fix failures seen on i686 due to relying on exact floating-point
equality when testing results of vector division.
gcc/testsuite/ChangeLog:
* jit.dg/test-vector-rvalues.cc (check_div): Add specialization
for v4f, to avoid relying on exact floating-point equality.
* jit.dg/test-vector-types.cc (check_div): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
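The general technique of comparing floating-point results with a tolerance, rather than with ==, can be sketched as follows. This is not the check_div specialization from the testsuite; approx_equal and its default tolerance are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical helper: true if a and b agree to within a relative tolerance
// (scaled by the larger magnitude, with a floor of 1.0 so values near zero
// are compared with an absolute tolerance).
bool approx_equal(float a, float b,
                  float rel_tol = 4 * std::numeric_limits<float>::epsilon()) {
    float scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= rel_tol * std::max(scale, 1.0f);
}
```

A comparison of this shape tolerates the small differences that excess-precision arithmetic can introduce on targets such as i686, while still rejecting results that are clearly wrong.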
David Malcolm [Fri, 25 Jun 2021 23:07:25 +0000 (19:07 -0400)]
jit: fix test-asm failures on i?86
On i686, test_i386_basic_asm_4 has:
error: inconsistent operand constraints in an 'asm'
and test_i386_basic_asm_5 has:
/tmp/libgccjit-9FsLie/fake.s:9: Error: bad register name `%rdi'
/tmp/libgccjit-9FsLie/fake.s:10: Error: bad register name `%rsi'
This is only intended as a smoketest of asm support, so only run
it on x86_64.
gcc/testsuite/ChangeLog:
* jit.dg/test-asm.c: Remove i?86-*-* from target specifier.
* jit.dg/test-asm.cc: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Martin Sebor [Fri, 25 Jun 2021 23:01:01 +0000 (17:01 -0600)]
PR middle-end/101216 - spurious notes for function calls
PR middle-end/101216
gcc/ChangeLog:
* calls.c (maybe_warn_rdwr_sizes): Use the no_warning constant.
gcc/testsuite/ChangeLog:
* gcc.dg/Wnonnull-7.c: New test.
Jonathan Wakely [Fri, 25 Jun 2021 17:31:23 +0000 (18:31 +0100)]
libstdc++: Avoid intercepting exception in ostream::write
Currently if ostream::write fails and sets badbit and that causes an
exception, we will catch the exception, set badbit again, and rethrow
the exception.
This change delays setting badbit until after the try-catch block, so
that if it causes an exception we don't need to catch and rethrow it.
This removes the last remaining use of _M_write, so it can be made
private (or removed entirely for versioned namespace builds, where ABI
compatibility is not required). All other uses of _M_write were replaced
by calls to __ostream_insert, so make _M_write use that too.
libstdc++-v3/ChangeLog:
* include/bits/ostream.tcc (basic_ostream::write): Call sputn
directly instead of using _M_write. Do setstate(__err) all
outside the try-catch block.
* include/std/ostream (basic_ostream::_M_write): Declare
private. Use __ostream_insert. Do not define for the versioned
namespace.
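The user-visible behaviour around this change can be exercised with a stream buffer that refuses all output. The sketch below is an illustration only: RejectingBuf and write_throws are hypothetical names, and it demonstrates the observable contract (badbit set, ios_base::failure thrown when badbit is in the exception mask), not the internal try-catch structure that the patch rearranges.

```cpp
#include <ios>
#include <ostream>
#include <streambuf>

// Hypothetical stream buffer that accepts no characters, so that
// ostream::write always comes up short and must set badbit.
struct RejectingBuf : std::streambuf {
protected:
    std::streamsize xsputn(const char*, std::streamsize) override { return 0; }
    int_type overflow(int_type) override { return traits_type::eof(); }
};

// Returns true if write() reported the failure by throwing ios_base::failure.
bool write_throws(std::ostream& os) {
    try {
        os.write("abc", 3);
    } catch (const std::ios_base::failure&) {
        return true;
    }
    return false;
}
```

Constructing a std::ostream on a RejectingBuf, calling exceptions(std::ios_base::badbit) on it and then passing it to write_throws should yield true, with badbit left set on the stream afterwards.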
Jonathan Wakely [Fri, 25 Jun 2021 17:31:23 +0000 (18:31 +0100)]
libstdc++: Implement LWG 581 for std::ostream::flush()
LWG 581 changed ostream::flush() to an unformatted output function for
C++11, but it was never implemented in libstdc++.
libstdc++-v3/ChangeLog:
* doc/xml/manual/intro.xml: Document LWG 581 change.
* doc/html/manual/bugs.html: Regenerate.
* include/bits/basic_ios.tcc: Whitespace.
* include/bits/ostream.tcc (basic_ostream::flush()): Construct
sentry.
* testsuite/27_io/basic_ostream/flush/char/2.cc: Check
additional cases.
* testsuite/27_io/basic_ostream/flush/char/exceptions_badbit_throw.cc:
Likewise.
* testsuite/27_io/basic_ostream/flush/wchar_t/2.cc: Likewise.
* testsuite/27_io/basic_ostream/flush/wchar_t/exceptions_badbit_throw.cc:
Likewise.
Jonathan Wakely [Fri, 25 Jun 2021 17:31:23 +0000 (18:31 +0100)]
libstdc++: Fix exception handling in std::ostream seek functions
N3168 added the requirement that the [ostream.seeks] functions create a
sentry object. Nothing in the requirements of those functions says
anything about catching exceptions and setting badbit.
As well as not catching exceptions, this change results in another
observable behaviour change. Previously seeking on a stream with eofbit
set would work (as long as badbit and failbit weren't set). The
construction of a sentry causes failbit to be set when eofbit is set,
which causes the seek to fail. It is necessary to clear the eofbit
before seeking now.
libstdc++-v3/ChangeLog:
* include/bits/ostream.tcc (sentry): Only set failbit if badbit
is set, not if eofbit is set.
(tellp, seekp, seekp): Create sentry object. Do not set badbit
on exceptions.
* testsuite/27_io/basic_ostream/seekp/char/exceptions_badbit_throw.cc:
Adjust expected behaviour.
* testsuite/27_io/basic_ostream/seekp/wchar_t/exceptions_badbit_throw.cc:
Likewise.
* testsuite/27_io/basic_ostream/tellp/char/exceptions_badbit_throw.cc:
Likewise.
* testsuite/27_io/basic_ostream/tellp/wchar_t/exceptions_badbit_throw.cc:
Likewise.
* testsuite/27_io/basic_ostream/seekp/char/n3168.cc: New test.
* testsuite/27_io/basic_ostream/seekp/wchar_t/n3168.cc: New test.
* testsuite/27_io/basic_ostream/tellp/char/n3168.cc: New test.
* testsuite/27_io/basic_ostream/tellp/wchar_t/n3168.cc: New test.
Jonathan Wakely [Fri, 25 Jun 2021 17:31:22 +0000 (18:31 +0100)]
libstdc++: Remove noexcept from syncbuf::swap (LWG 3498)
The proposed resolution for the inconsistent noexcept-specifiers in the
spec is to remove it from both the assignment operator and swap.
libstdc++-v3/ChangeLog:
* include/std/syncstream (basic_syncbuf::swap()): Remove
noexcept, as per LWG 3498.
Jonathan Wakely [Fri, 25 Jun 2021 17:31:22 +0000 (18:31 +0100)]
libstdc++: More workarounds in 17_intro/names.cc test [PR 97088]
Conditionally #undef some more names that are used in system headers.
libstdc++-v3/ChangeLog:
PR libstdc++/97088
* testsuite/17_intro/names.cc: Undef more names for newlib and
also for arm-none-linux-gnueabi.
* testsuite/experimental/names.cc: Disable PCH.
Chung-Lin Tang [Fri, 25 Jun 2021 16:42:58 +0000 (00:42 +0800)]
testsuite/101114: Adjust libgomp.c-c++-common/struct-elem-5.c testcase
The dg-shouldfail testcase libgomp.c-c++-common/struct-elem-5.c does not
properly fail for non-shared address space offloading. Adjust testcase
to limit testing only for "target offload_device_nonshared_as".
libgomp/ChangeLog:
PR testsuite/101114
* testsuite/libgomp.c-c++-common/struct-elem-5.c:
Add "target offload_device_nonshared_as" condition for enabling test.
Matthias Kretz [Thu, 28 Jan 2021 20:04:03 +0000 (21:04 +0100)]
libstdc++: Make use of __builtin_bit_cast for simd
The __bit_cast function was a hack to achieve what __builtin_bit_cast
can do, therefore use __builtin_bit_cast if possible. However,
__builtin_bit_cast cannot be used to cast from/to fixed_size_simd, since
it isn't trivially copyable (in the language sense — in principle it
is). Therefore add __proposed::simd_bit_cast to enable the use case
required in the test framework.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h (__bit_cast): Implement via
__builtin_bit_cast #if available.
(__proposed::simd_bit_cast): Add overloads for simd and
simd_mask, which use __builtin_bit_cast (or __bit_cast #if not
available), which return an object of the requested type with
the same bits as the argument.
* include/experimental/bits/simd_math.h: Use simd_bit_cast
instead of __bit_cast to allow casts to fixed_size_simd.
(copysign): Remove branch that was only required if __bit_cast
cannot be constexpr.
* testsuite/experimental/simd/tests/bits/test_values.h: Switch
from __bit_cast to __proposed::simd_bit_cast since the former
will not cast fixed_size objects anymore.
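The "use __builtin_bit_cast if available, otherwise fall back" pattern mentioned above can be sketched on a scalar type. This is not the simd.h implementation: float_bits is a hypothetical helper, the fallback uses std::memcpy rather than the library's __bit_cast, and the example assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>

// Compilers without __has_builtin simply take the memcpy fallback.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper: returns the object representation of a float as a
// 32-bit unsigned integer.
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "sizes must match");
#if __has_builtin(__builtin_bit_cast)
    return __builtin_bit_cast(std::uint32_t, f);
#else
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);  // well-defined way to copy the bits
    return u;
#endif
}
```

Both branches require the source and destination types to be trivially copyable and of equal size, which is the restriction the commit message describes for fixed_size_simd.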
Matthias Kretz [Fri, 25 Jun 2021 12:40:26 +0000 (14:40 +0200)]
MAINTAINERS: Add myself for write after approval and DCO
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
ChangeLog:
* MAINTAINERS: Add myself for write after approval and DCO
Jeff Law [Fri, 25 Jun 2021 13:22:28 +0000 (09:22 -0400)]
Use right shifts to eliminate redundant test/compare insns on the H8
gcc/
* config/h8300/h8300.c (select_cc_mode): Handle ASHIFTRT and LSHIFTRT.
Richard Biener [Fri, 25 Jun 2021 07:20:56 +0000 (09:20 +0200)]
tree-optimization/101202 - fix ICE with failed backedge SLP nodes
This fixes an ICE with failed backedge SLP nodes still in the graph
while doing permute optimization by explicitly handling those.
2021-06-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/101202
* tree-vect-slp.c (vect_optimize_slp): Explicitly handle
failed nodes.
* gcc.dg/torture/pr101202.c: New testcase.
Richard Biener [Fri, 25 Jun 2021 06:54:14 +0000 (08:54 +0200)]
Fixup reduction info on addsub SLP pattern
gcc.dg/vect/pr96854.c shows we need to copy over reduction info
to the SLP pattern as already done for the complex patterns.
2021-06-25 Richard Biener <rguenther@suse.de>
* tree-vect-slp-patterns.c (addsub_pattern::build): Copy
STMT_VINFO_REDUC_DEF from the original representative.
Richard Biener [Tue, 22 Jun 2021 06:43:15 +0000 (08:43 +0200)]
add -ltrans-objects lto-plugin debug option
This adds a -ltrans-objects option to lto-plugin that by-passes
lto-wrapper invocation and instead feeds LD the final LTRANS objects
directly from the response file given as argument to the option.
This allows LD issues involving the linker-plugin path to be
debugged in an easier way with just the IR objects (their symtab)
and the LTRANS objects as testcase.
I've tested the path re-building stage2 build/genmatch from an
LTO bootstrap and got a bit-identical executable by adding
-plugin-opt=-ltrans-objects=y to the original collect2 invocation,
seeding y with the final objects as printed by building genmatch
with -save-temps -v.
2021-06-22 Richard Biener <rguenther@suse.de>
lto-plugin/
* lto-plugin.c (ltrans_objects): New global.
(all_symbols_read_handler): If -ltrans-objects was specified,
add the output files from the specified file directly.
(process_option): Handle -ltrans-objects.
Xi Ruoyao [Tue, 22 Jun 2021 06:57:47 +0000 (14:57 +0800)]
testsuite: avoid no-stack-protector-attr-3 fail on mips*-*-*
On MIPS a call to __stack_chk_fail needs an additional .reloc pseudo-op,
so "stack_chk_fail" will appear two times.
gcc/testsuite/
* g++.dg/no-stack-protector-attr-3.C (dg-final): Adjust for MIPS.
Martin Sebor [Fri, 25 Jun 2021 01:22:06 +0000 (19:22 -0600)]
middle-end: add support for per-location warning groups.
gcc/ChangeLog:
* builtins.c (warn_string_no_nul): Replace uses of TREE_NO_WARNING,
gimple_no_warning_p and gimple_set_no_warning with
warning_suppressed_p, and suppress_warning.
(c_strlen): Same.
(maybe_warn_for_bound): Same.
(warn_for_access): Same.
(check_access): Same.
(expand_builtin_strncmp): Same.
(fold_builtin_varargs): Same.
* calls.c (maybe_warn_nonstring_arg): Same.
(maybe_warn_rdwr_sizes): Same.
* cfgexpand.c (expand_call_stmt): Same.
* cgraphunit.c (check_global_declaration): Same.
* fold-const.c (fold_undefer_overflow_warnings): Same.
(fold_truth_not_expr): Same.
(fold_unary_loc): Same.
(fold_checksum_tree): Same.
* gimple-array-bounds.cc (array_bounds_checker::check_array_ref): Same.
(array_bounds_checker::check_mem_ref): Same.
(array_bounds_checker::check_addr_expr): Same.
(array_bounds_checker::check_array_bounds): Same.
* gimple-expr.c (copy_var_decl): Same.
* gimple-fold.c (gimple_fold_builtin_strcpy): Same.
(gimple_fold_builtin_strncat): Same.
(gimple_fold_builtin_stxcpy_chk): Same.
(gimple_fold_builtin_stpcpy): Same.
(gimple_fold_builtin_sprintf): Same.
(fold_stmt_1): Same.
* gimple-ssa-isolate-paths.c (diag_returned_locals): Same.
* gimple-ssa-nonnull-compare.c (do_warn_nonnull_compare): Same.
* gimple-ssa-sprintf.c (handle_printf_call): Same.
* gimple-ssa-store-merging.c (imm_store_chain_info::output_merged_store): Same.
* gimple-ssa-warn-restrict.c (maybe_diag_overlap): Same.
* gimple-ssa-warn-restrict.h: Adjust declarations.
(maybe_diag_access_bounds): Replace uses of TREE_NO_WARNING,
gimple_no_warning_p and gimple_set_no_warning with
warning_suppressed_p, and suppress_warning.
(check_call): Same.
(check_bounds_or_overlap): Same.
* gimple.c (gimple_build_call_from_tree): Same.
* gimplify.c (gimplify_return_expr): Same.
(gimplify_cond_expr): Same.
(gimplify_modify_expr_complex_part): Same.
(gimplify_modify_expr): Same.
(gimple_push_cleanup): Same.
(gimplify_expr): Same.
* omp-expand.c (expand_omp_for_generic): Same.
(expand_omp_taskloop_for_outer): Same.
* omp-low.c (lower_rec_input_clauses): Same.
(lower_lastprivate_clauses): Same.
(lower_send_clauses): Same.
(lower_omp_target): Same.
* tree-cfg.c (pass_warn_function_return::execute): Same.
* tree-complex.c (create_one_component_var): Same.
* tree-inline.c (remap_gimple_op_r): Same.
(copy_tree_body_r): Same.
(declare_return_variable): Same.
(expand_call_inline): Same.
* tree-nested.c (lookup_field_for_decl): Same.
* tree-sra.c (create_access_replacement): Same.
(generate_subtree_copies): Same.
* tree-ssa-ccp.c (pass_post_ipa_warn::execute): Same.
* tree-ssa-forwprop.c (combine_cond_expr_cond): Same.
* tree-ssa-loop-ch.c (ch_base::copy_headers): Same.
* tree-ssa-loop-im.c (execute_sm): Same.
* tree-ssa-phiopt.c (cond_store_replacement): Same.
* tree-ssa-strlen.c (maybe_warn_overflow): Same.
(handle_builtin_strcpy): Same.
(maybe_diag_stxncpy_trunc): Same.
(handle_builtin_stxncpy_strncat): Same.
(handle_builtin_strcat): Same.
* tree-ssa-uninit.c (get_no_uninit_warning): Same.
(set_no_uninit_warning): Same.
(uninit_undefined_value_p): Same.
(warn_uninit): Same.
(maybe_warn_operand): Same.
* tree-vrp.c (compare_values_warnv): Same.
* vr-values.c (vr_values::extract_range_for_var_from_comparison_expr): Same.
(test_for_singularity): Same.
* gimple.h (warning_suppressed_p): New function.
(suppress_warning): Same.
(copy_no_warning): Same.
(gimple_set_block): Call gimple_set_location.
(gimple_set_location): Call copy_warning.
Martin Sebor [Thu, 24 Jun 2021 23:29:34 +0000 (17:29 -0600)]
cp: add support for per-location warning groups.
gcc/cp/ChangeLog:
* call.c (build_over_call): Replace direct uses of TREE_NO_WARNING
with warning_suppressed_p, suppress_warning, and copy_no_warning, or
nothing if not necessary.
(set_up_extended_ref_temp): Same.
* class.c (layout_class_type): Same.
* constraint.cc (constraint_satisfaction_value): Same.
* coroutines.cc (finish_co_await_expr): Same.
(finish_co_yield_expr): Same.
(finish_co_return_stmt): Same.
(build_actor_fn): Same.
(coro_rewrite_function_body): Same.
(morph_fn_to_coro): Same.
* cp-gimplify.c (genericize_eh_spec_block): Same.
(gimplify_expr_stmt): Same.
(cp_genericize_r): Same.
(cp_fold): Same.
* cp-ubsan.c (cp_ubsan_instrument_vptr): Same.
* cvt.c (cp_fold_convert): Same.
(convert_to_void): Same.
* decl.c (wrapup_namespace_globals): Same.
(grokdeclarator): Same.
(finish_function): Same.
(require_deduced_type): Same.
* decl2.c (no_linkage_error): Same.
(c_parse_final_cleanups): Same.
* except.c (expand_end_catch_block): Same.
* init.c (build_new_1): Same.
(build_new): Same.
(build_vec_delete_1): Same.
(build_vec_init): Same.
(build_delete): Same.
* method.c (defaultable_fn_check): Same.
* parser.c (cp_parser_fold_expression): Same.
(cp_parser_primary_expression): Same.
* pt.c (push_tinst_level_loc): Same.
(tsubst_copy): Same.
(tsubst_omp_udr): Same.
(tsubst_copy_and_build): Same.
* rtti.c (build_if_nonnull): Same.
* semantics.c (maybe_convert_cond): Same.
(finish_return_stmt): Same.
(finish_parenthesized_expr): Same.
(cp_check_omp_declare_reduction): Same.
* tree.c (build_cplus_array_type): Same.
* typeck.c (build_ptrmemfunc_access_expr): Same.
(cp_build_indirect_ref_1): Same.
(cp_build_function_call_vec): Same.
(warn_for_null_address): Same.
(cp_build_binary_op): Same.
(unary_complex_lvalue): Same.
(cp_build_modify_expr): Same.
(build_x_modify_expr): Same.
(convert_for_assignment): Same.
Martin Sebor [Thu, 24 Jun 2021 22:09:20 +0000 (16:09 -0600)]
c-family: add support for per-location warning groups.
gcc/c-family/ChangeLog:
* c-common.c (c_wrap_maybe_const): Remove TREE_NO_WARNING.
(c_common_truthvalue_conversion): Replace direct uses of
TREE_NO_WARNING with warning_suppressed_p, suppress_warning, and
copy_no_warning.
(check_function_arguments_recurse): Same.
* c-gimplify.c (c_gimplify_expr): Same.
* c-warn.c (overflow_warning): Same.
(warn_logical_operator): Same.
(warn_if_unused_value): Same.
(do_warn_unused_parameter): Same.
Martin Sebor [Thu, 24 Jun 2021 21:35:20 +0000 (15:35 -0600)]
c: add support for per-location warning groups.
gcc/ChangeLog:
* tree.h (warning_suppressed_at, copy_warning,
warning_suppressed_p, suppress_warning): New functions.
gcc/c/ChangeLog:
* c-decl.c (pop_scope): Replace direct uses of TREE_NO_WARNING with
warning_suppressed_p, suppress_warning, and copy_no_warning.
(diagnose_mismatched_decls): Same.
(duplicate_decls): Same.
(grokdeclarator): Same.
(finish_function): Same.
(c_write_global_declarations_1): Same.
* c-fold.c (c_fully_fold_internal): Same.
* c-parser.c (c_parser_expr_no_commas): Same.
(c_parser_postfix_expression): Same.
* c-typeck.c (array_to_pointer_conversion): Same.
(function_to_pointer_conversion): Same.
(default_function_array_conversion): Same.
(convert_lvalue_to_rvalue): Same.
(default_conversion): Same.
(build_indirect_ref): Same.
(build_function_call_vec): Same.
(build_atomic_assign): Same.
(build_unary_op): Same.
(c_finish_return): Same.
(emit_side_effect_warnings): Same.
(c_finish_stmt_expr): Same.
(c_omp_clause_copy_ctor): Same.
Martin Sebor [Thu, 24 Jun 2021 17:11:00 +0000 (11:11 -0600)]
Add support for per-location warning groups.
gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add diagnostic-spec.o.
* gengtype.c (open_base_files): Add diagnostic-spec.h.
* diagnostic-spec.c: New file.
* diagnostic-spec.h: New file.
* tree.h (no_warning, all_warnings, suppress_warning_at): New
declarations.
* warning-control.cc: New file.
liuhongt [Thu, 24 Jun 2021 08:14:13 +0000 (16:14 +0800)]
Revert x86_order_regs_for_local_alloc changes in r12-1669.
Still put general regs first in the allocation order.
gcc/ChangeLog:
PR target/101185
* config/i386/i386.c (x86_order_regs_for_local_alloc):
Revert r12-1669.
gcc/testsuite/ChangeLog:
PR target/101185
* gcc.target/i386/bitwise_mask_op-3.c: Add xfail to
temporarily avoid regression, eventually xfail should be
removed.
GCC Administrator [Fri, 25 Jun 2021 00:16:53 +0000 (00:16 +0000)]
Daily bump.
Andrew MacLeod [Thu, 24 Jun 2021 17:49:51 +0000 (13:49 -0400)]
Add a testcase to confirm the equivalences are being checked by EVRP.
* gcc.dg/tree-ssa/evrp30.c: New.
Andrew MacLeod [Thu, 24 Jun 2021 17:35:21 +0000 (13:35 -0400)]
Only register relations on live edges
Register a relation on a conditional edge only if the LHS supports
this edge being taken.
gcc/
PR tree-optimization/101189
* gimple-range-fold.cc (fold_using_range::range_of_range_op): Pass
LHS range of condition to postfold routine.
(fold_using_range::postfold_gcond_edges): Only process the TRUE or
FALSE edge if the LHS range supports it being taken.
* gimple-range-fold.h (postfold_gcond_edges): Add range parameter.
gcc/testsuite/
* gcc.dg/tree-ssa/pr101189.c: New.
Andrew MacLeod [Thu, 24 Jun 2021 15:13:47 +0000 (11:13 -0400)]
Fix relation query of equivalences.
When looking for relations between equivalences, a typo was causing
the wrong bitmap to be checked. The effect was that it missed them.
Also, don't dump blocks which don't exist.
* value-relation.cc (equiv_oracle::dump): Do not dump NULL blocks.
(relation_oracle::find_relation_block): Check correct bitmap.
(relation_oracle::dump): Do not dump NULL blocks.
Andrew MacLeod [Wed, 23 Jun 2021 19:25:45 +0000 (15:25 -0400)]
Correctly unify recomputation with existing range.
When propagating the on-entry cache, new block ranges are calculated
by combining all the incoming edges and comparing to the old value.
When a recomputation was performed on an edge, it didn't take into account
that the value in the block may already be better than a potential
recomputation. Thus a worse value was sometimes propagated.
Fixed by simply calling the now correct range_on_edge the cache provides.
* gimple-range-cache.cc (ranger_cache::propagate_cache): Call
range_on_edge instead of manually calculating.
Andrew MacLeod [Wed, 23 Jun 2021 16:14:37 +0000 (12:14 -0400)]
Fix comment typo.
* range-op.cc: Fix comment.
Patrick Palka [Thu, 24 Jun 2021 17:11:44 +0000 (13:11 -0400)]
c++: alias CTAD and aggregate deduction cand [PR98832]
During alias CTAD, we're accidentally ignoring the aggregate deduction
candidate for the underlying template because this guide is added
separately via maybe_aggr_guide (which doesn't yet handle alias
templates) instead of via deduction_guides_for (which does). This patch
makes maybe_aggr_guide handle alias templates in a manner similar to
deduction_guides_for.
PR c++/98832
gcc/cp/ChangeLog:
* pt.c (maybe_aggr_guide): Handle alias templates appropriately.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/class-deduction-alias9.C: New test.
Patrick Palka [Thu, 24 Jun 2021 15:29:02 +0000 (11:29 -0400)]
c++: requires-expression folding [PR101182]
Here we're crashing because cp_fold_function walks into the (templated)
requirements of a requires-expression outside a template, but the
folding routines aren't prepared to handle templated trees. This patch
fixes this by making cp_fold use evaluate_requires_expr to fold a
requires-expression as a whole, which also means we no longer need to
explicitly do so during gimplification. (Note that we delay folding
of such requires-expressions for sake of better diagnostics when one is
used as the condition of a failed static_assert.)
PR c++/101182
gcc/cp/ChangeLog:
* constraint.cc (evaluate_requires_expr): Adjust function comment.
* cp-gimplify.c (cp_genericize_r) <case REQUIRES_EXPR>: Move to ...
(cp_fold) <case REQUIRES_EXPR>: ... here.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-requires25.C: New test.
Jakub Jelinek [Thu, 24 Jun 2021 13:58:02 +0000 (15:58 +0200)]
c: Fix up c_parser_has_attribute_expression [PR101176]
This function keeps src_range member of the result uninitialized, which at
least under valgrind can show up later when those uninitialized location_t's
can make it into the IL or location_t hash tables.
2021-06-24 Jakub Jelinek <jakub@redhat.com>
PR c/101176
* c-parser.c (c_parser_has_attribute_expression): Set source range for
the result.
Jakub Jelinek [Thu, 24 Jun 2021 13:55:28 +0000 (15:55 +0200)]
c: Fix C cast error-recovery [PR101171]
The following testcase ICEs during error-recovery, as build_c_cast calls
note_integer_operands on error_mark_node and that wraps it into
C_MAYBE_CONST_EXPR which is unexpected and causes ICE later on.
Seems most other callers of note_integer_operands check early if something
is error_mark_node and return before calling note_integer_operands on it.
The following patch fixes it by not calling on error_mark_node, another
possibility would be to handle error_mark_node in note_integer_operands and
just return it.
2021-06-24 Jakub Jelinek <jakub@redhat.com>
PR c/101171
* c-typeck.c (build_c_cast): Don't call note_integer_operands on
error_mark_node.
* gcc.dg/pr101171.c: New test.
Uros Bizjak [Thu, 24 Jun 2021 13:39:26 +0000 (15:39 +0200)]
i386: Add pack/unpack patterns for 64bit vectors [PR89021]
2021-06-24 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/89021
* config/i386/i386-expand.c (ix86_expand_sse_unpack):
Handle V8QI and V4HI modes.
* config/i386/mmx.md (sse4_1_<any_extend:code>v4qiv4hi2):
New insn pattern.
(sse4_1_<any_extend:code>v4qiv4hi2): Ditto.
(mmxpackmode): New mode attribute.
(vec_pack_trunc_<mmxpackmode:mode>): New expander.
(mmxunpackmode): New mode attribute.
(vec_unpacks_lo_<mmxunpackmode:mode>): New expander.
(vec_unpacks_hi_<mmxunpackmode:mode>): Ditto.
(vec_unpacku_lo_<mmxunpackmode:mode>): Ditto.
(vec_unpacku_hi_<mmxunpackmode:mode>): Ditto.
* config/i386/i386.md (extsuffix): Move from ...
* config/i386/sse.md: ... here.
gcc/testsuite/
PR target/89021
* gcc.dg/vect/vect-nb-iter-ub-3.c (dg-additional-options):
Add --param vect-epilogues-nomask=0.
* gcc.target/i386/pr97249-1.c (foo): Add #pragma GCC unroll
to avoid loop vectorization.
(foo1): Ditto.
(foo2): Ditto.
Matthias Kretz [Thu, 24 Jun 2021 13:20:15 +0000 (14:20 +0100)]
libstdc++: Fix internal names: add missing underscores
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd_math.h
(_GLIBCXX_SIMD_MATH_CALL2_): Rename arg2_ to __arg2.
(_GLIBCXX_SIMD_MATH_CALL3_): Rename arg2_ to __arg2 and arg3_ to
__arg3.
Matthias Kretz [Thu, 24 Jun 2021 13:20:15 +0000 (14:20 +0100)]
libstdc++: Ensure unrolled loops inline the lambda
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h (__execute_on_index_sequence)
(__execute_on_index_sequence_with_return)
(__call_with_n_evaluations, __call_with_subscripts): Add flatten
attribute.
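As a rough illustration of why the flatten attribute matters for helpers of this kind, here is a minimal sketch. The names call_for_each_index, unrolled and sum_of_first_four are hypothetical and are not the libstdc++ helpers; the only assumption taken from GCC itself is the documented flatten function attribute, which requests that calls inside the function body be inlined where possible.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a helper such as __execute_on_index_sequence:
// invokes f(0), f(1), ..., f(N-1) through a fold expression (C++17).
template <typename F, std::size_t... Is>
[[gnu::flatten]] inline void
call_for_each_index(F&& f, std::index_sequence<Is...>)
{
  // flatten asks GCC to inline the calls made in this body where possible,
  // so the lambda does not remain as N out-of-line calls.
  (f(Is), ...);
}

template <std::size_t N, typename F>
inline void unrolled(F&& f)
{
  call_for_each_index(std::forward<F>(f), std::make_index_sequence<N>{});
}

// Usage: sum the indices 0, 1, 2, 3.
inline int sum_of_first_four()
{
  int sum = 0;
  unrolled<4>([&](std::size_t i) { sum += static_cast<int>(i); });
  assert(sum == 0 + 1 + 2 + 3);
  return sum;
}
```

Whether the calls are actually inlined still depends on the compiler and optimization level; the attribute is a request, not a guarantee.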
Matthias Kretz [Thu, 24 Jun 2021 13:20:15 +0000 (14:20 +0100)]
libstdc++: Avoid raising fp exceptions in trunc, floor, and ceil
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd_x86.h (_S_trunc, _S_floor)
(_S_ceil): Set bit 8 (_MM_FROUND_NO_EXC) on AVX and SSE4.1
roundp[sd] calls.
Matthias Kretz [Thu, 24 Jun 2021 13:20:15 +0000 (14:20 +0100)]
libstdc++: Fix condition when AVX512F ldexp implementation is used
This improves codegen of ldexp if AVX512VL is available.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd_x86.h (_S_ldexp): The AVX512F
implementation doesn't require a _VecBltnBtmsk ABI tag, it
requires either a 64-Byte input (in which case AVX512F must be
available) or AVX512VL.
Matthias Kretz [Thu, 24 Jun 2021 13:20:15 +0000 (14:20 +0100)]
libstdc++: Minor simd_math cleanups
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd_math.h: Undefine internal
macros after use.
(frexp): Move #if to a more sensible position and reformat
preceding code.
(logb): Call _SimdImpl::_S_logb for fixed_size instead of
duplicating the code here.
(modf): Simplify condition.
Matthias Kretz [Thu, 24 Jun 2021 13:20:14 +0000 (14:20 +0100)]
libstdc++: Remove incorrect fabs(simd) overload
fabs(int) returns double, this one didn't. This overload is not
specified in the Parallelism TS 2. Also remove the comment about labs
and llabs: it doesn't belong here.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd_math.h (fabs): Remove
fabs(simd<integral>) overload.
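The scalar behaviour that the removed overload failed to mirror can be checked directly. This is a small sketch using only standard <cmath> and <cstdlib> facilities; the function name fabs_of_int is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <type_traits>

// For an integer argument, std::fabs converts to double and returns double.
static_assert(std::is_same<decltype(std::fabs(1)), double>::value,
              "fabs(int) yields double");
// std::abs, by contrast, keeps the integer type.
static_assert(std::is_same<decltype(std::abs(1)), int>::value,
              "abs(int) yields int");

// Hypothetical helper showing the conversion at run time.
inline double fabs_of_int(int x)
{
  double r = std::fabs(x);
  assert(r >= 0.0);
  return r;
}
```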
Matthias Kretz [Thu, 24 Jun 2021 13:20:14 +0000 (14:20 +0100)]
libstdc++: Improve simd fixed_size codegen
Sometimes fixed_size objects will get unnecessarily copied on the stack.
The simd implementation should never pass _SimdTuple by value to avoid
requiring the optimizer to see through these copies.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd_converter.h
(_SimdConverter::operator()): Pass _SimdTuple by const-ref.
* include/experimental/bits/simd_fixed_size.h
(_GLIBCXX_SIMD_FIXED_OP): Pass binary operator _SimdTuple
arguments by const-ref.
(_S_masked_unary): Pass _SimdTuple by const-ref.
Matthias Kretz [Thu, 24 Jun 2021 13:20:14 +0000 (14:20 +0100)]
libstdc++: Remove dead code in simd
This helper type became unused at some point.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd_fixed_size.h
(_AbisInSimdTuple): Removed.
Matthias Kretz [Thu, 24 Jun 2021 13:20:13 +0000 (14:20 +0100)]
libstdc++: Improve copysign(simd) codegen
This also resolves a test failure on aarch64 with -ffast-math and
fixed_size<N> with large N.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h: Add missing operator~
overload for simd<floating-point> to __float_bitwise_operators.
* include/experimental/bits/simd_builtin.h
(_SimdImplBuiltin::_S_complement): Bitcast to int (and back) to
implement complement for floating-point vectors.
* include/experimental/bits/simd_fixed_size.h
(_SimdImplFixedSize::_S_copysign): New function, forwarding to
copysign implementation of _SimdTuple members.
* include/experimental/bits/simd_math.h (copysign): Call
_SimdImpl::_S_copysign for fixed_size arguments. Simplify
generic copysign implementation using the new ~ operator.
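The idea of building copysign from bitwise operations, including a complement of the sign mask, can be sketched for a single float. This is a scalar analogue written for illustration, not the simd implementation; copysign_bits is a hypothetical name and the code assumes IEEE 754 binary32 floats.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Combine the magnitude bits of x with the sign bit of y.
inline float copysign_bits(float x, float y)
{
  std::uint32_t xb, yb;
  std::memcpy(&xb, &x, sizeof xb);   // bitcast float -> integer
  std::memcpy(&yb, &y, sizeof yb);
  const std::uint32_t sign_mask = 0x80000000u;  // assumes IEEE 754 binary32
  // ~sign_mask keeps everything except the sign; sign_mask keeps only it.
  const std::uint32_t rb = (xb & ~sign_mask) | (yb & sign_mask);
  float r;
  std::memcpy(&r, &rb, sizeof r);    // bitcast integer -> float
  return r;
}

// Usage example.
inline void copysign_demo()
{
  assert(copysign_bits(1.5f, -2.0f) == -1.5f);
  assert(copysign_bits(-1.5f, 2.0f) == 1.5f);
}
```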
Jonathan Wakely [Thu, 24 Jun 2021 12:49:19 +0000 (13:49 +0100)]
libstdc++: Fix typos and markdown errors in new simd/README.md
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/README.md: Fix typos.
Jonathan Wakely [Thu, 24 Jun 2021 11:56:20 +0000 (12:56 +0100)]
libstdc++: Implement LWG 2762 for std::unique_ptr::operator*
The LWG issue proposes to add a conditional noexcept-specifier to
std::unique_ptr's dereference operator. The issue is currently in
Tentatively Ready status, but even if it isn't voted into the draft, we
can do it as a conforming extension. This commit also adds a similar
noexcept-specifier to operator[] for the unique_ptr<T[], D> partial
specialization.
Also ensure that all dereference operators for shared_ptr are noexcept,
and add tests for the std::optional accessors modified by the issue,
which were already noexcept in our implementation.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/shared_ptr_base.h (__shared_ptr_access::operator[]):
Add noexcept.
* include/bits/unique_ptr.h (unique_ptr::operator*): Add
conditional noexcept as per LWG 2762.
* testsuite/20_util/shared_ptr/observers/array.cc: Check that
dereferencing cannot throw.
* testsuite/20_util/shared_ptr/observers/get.cc: Likewise.
* testsuite/20_util/optional/observers/lwg2762.cc: New test.
* testsuite/20_util/unique_ptr/lwg2762.cc: New test.
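A conditional noexcept-specifier of the kind LWG 2762 proposes can be sketched on a toy wrapper. The types tiny_handle and throwing_ptr below are hypothetical and are not libstdc++ code; the sketch only shows the shape of the specifier, which is noexcept exactly when dereferencing the stored pointer type is.

```cpp
#include <cassert>
#include <utility>

// Hypothetical minimal wrapper; not libstdc++'s unique_ptr.
template <typename Pointer>
struct tiny_handle
{
  Pointer ptr;

  // noexcept exactly when dereferencing the stored pointer type is noexcept.
  decltype(auto) operator*() const
    noexcept(noexcept(*std::declval<Pointer>()))
  { return *ptr; }
};

// A fancy pointer whose dereference is allowed to throw.
struct throwing_ptr
{
  int* p;
  int& operator*() const { return *p; }   // deliberately not noexcept
};

static_assert(noexcept(*std::declval<tiny_handle<int*>>()),
              "raw pointer dereference cannot throw");
static_assert(!noexcept(*std::declval<tiny_handle<throwing_ptr>>()),
              "dereference may throw for this pointer type");

// Usage example.
inline int read_through_handle()
{
  int v = 7;
  tiny_handle<int*> h{&v};
  assert(*h == 7);
  return *h;
}
```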
Eric Botcazou [Thu, 24 Jun 2021 10:55:27 +0000 (12:55 +0200)]
Emit .file 0 directive earlier in DWARF 5
When the assembler supports it, the compiler automatically passes --gdwarf-5
to it, which has an interesting side effect: any assembly instruction prior
to the first .file directive defines a new line associated with .file 0 in
the .debug_line section and of course the numbering of these implicit lines
has nothing to do with that of the source code. This can be problematic in
Ada when we do not generate .file/.loc directives for compiler-generated
functions to avoid too jumpy a debugging experience.
gcc/
* dwarf2out.c (dwarf2out_assembly_start): Emit .file 0 marker here..
(dwarf2out_finish): ...instead of here.
Eric Botcazou [Thu, 24 Jun 2021 10:53:24 +0000 (12:53 +0200)]
Fix --gdwarf-5 configure tests for Windows
The issues are that 1) they use readelf instead of objdump and 2) they use
ELF syntax in the assembly code.
gcc/
* configure.ac (--gdwarf-5 option): Use objdump instead of readelf.
(working --gdwarf-4/--gdwarf-5 for all sources): Likewise.
(--gdwarf-4 not refusing generated .debug_line): Adjust for Windows.
* configure: Regenerate.
prathamesh.kulkarni [Thu, 24 Jun 2021 11:20:19 +0000 (16:50 +0530)]
Add cscope.out to git ignore.
ChangeLog:
* .gitignore: Add entry for cscope.out.
Richard Biener [Fri, 18 Jun 2021 07:29:10 +0000 (09:29 +0200)]
Merge vec_addsub patterns
This merges the vec_addsub<mode>3 patterns using a mode attribute
for the vec_merge merge operand.
2021-06-18 Richard Biener <rguenther@suse.de>
* config/i386/sse.md (vec_addsubv4df3, vec_addsubv2df3,
vec_addsubv8sf3, vec_addsubv4sf3): Merge into ...
(vec_addsub<mode>3): ... using a new addsub_cst mode attribute.
Richard Biener [Mon, 31 May 2021 11:19:01 +0000 (13:19 +0200)]
Add x86 addsub SLP pattern
This adds SLP pattern recognition for the SSE3/AVX [v]addsubp{ds} v0, v1
instructions which compute { v0[0] - v1[0], v0[1] + v1[1], ... }
thus subtract, add alternating on lanes, starting with subtract.
It adds a corresponding optab and direct internal function,
vec_addsub$a3 and renames the existing i386 backend patterns to
the new canonical name.
The SLP pattern matches the exact alternating lane sequence rather
than trying to be clever and anticipating incoming permutes - we
could permute the two input vectors to the needed lane alternation,
do the addsub and then permute the result vector back but that's
only profitable in case the two input or the output permute will
vanish - something Tamar's refactoring of SLP pattern recog should
make possible.
2021-06-17 Richard Biener <rguenther@suse.de>
* config/i386/sse.md (avx_addsubv4df3): Rename to
vec_addsubv4df3.
(avx_addsubv8sf3): Rename to vec_addsubv8sf3.
(sse3_addsubv2df3): Rename to vec_addsubv2df3.
(sse3_addsubv4sf3): Rename to vec_addsubv4sf3.
* config/i386/i386-builtin.def: Adjust.
* internal-fn.def (VEC_ADDSUB): New internal optab fn.
* optabs.def (vec_addsub_optab): New optab.
* tree-vect-slp-patterns.c (class addsub_pattern): New.
(slp_patterns): Add addsub_pattern.
* tree-vect-slp.c (vect_optimize_slp): Disable propagation
across CFN_VEC_ADDSUB.
* tree-vectorizer.h (vect_pattern::vect_pattern): Make
m_ops optional.
* doc/md.texi (vec_addsub<mode>3): Document.
* gcc.target/i386/vect-addsubv2df.c: New testcase.
* gcc.target/i386/vect-addsubv4sf.c: Likewise.
* gcc.target/i386/vect-addsubv4df.c: Likewise.
* gcc.target/i386/vect-addsubv8sf.c: Likewise.
* gcc.target/i386/vect-addsub-2.c: Likewise.
* gcc.target/i386/vect-addsub-3.c: Likewise.
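The lane semantics described in this commit (subtract on even lanes, add on odd lanes) correspond to straight-line scalar code like the following. This is a sketch of input that has the exact alternating shape; the function name addsub4 is hypothetical, and whether a given compiler and target actually emit an addsub instruction for it is not something this example can promise.

```cpp
#include <cassert>

// Four lanes, alternating subtract/add, starting with subtract.
inline void addsub4(const float* a, const float* b, float* out)
{
  out[0] = a[0] - b[0];
  out[1] = a[1] + b[1];
  out[2] = a[2] - b[2];
  out[3] = a[3] + b[3];
}

// Usage example with exactly representable values.
inline void addsub4_demo()
{
  const float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
  const float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
  float r[4];
  addsub4(a, b, r);
  assert(r[0] == -9.0f && r[1] == 22.0f);
  assert(r[2] == -27.0f && r[3] == 44.0f);
}
```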
Jakub Jelinek [Thu, 24 Jun 2021 10:24:48 +0000 (12:24 +0200)]
df: Fix up handling of paradoxical subregs in debug insns [PR101170]
The recent addition of gcc_assert (regno < endregno); triggers during
glibc build on m68k.
The problem is that RA decisions shouldn't depend on expressions in
DEBUG_INSNs and those expressions can contain paradoxical subregs of certain
pseudos. If RA then decides to allocate the pseudo to a register
with very small hard register REGNO, we can trigger the new assert,
as (int) subreg_regno_offset may be negative on big endian and the small
REGNO + the negative offset can wrap around.
In that case the following patch records the range from REGNO 0 to
endregno. Before the addition of the assert, as both regno and endregno are
unsigned, it would silently record nothing at all.
2021-06-24 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101170
* df-scan.c (df_ref_record): For paradoxical big-endian SUBREGs
where regno + subreg_regno_offset wraps around use 0 as starting
regno.
* gcc.dg/pr101170.c: New test.
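The wrap-around described above is ordinary unsigned arithmetic and can be reproduced in isolation. The helper first_regno below is hypothetical and is not the df-scan.c code; it only illustrates how a small register number plus a negative offset wraps, and how clamping the start to 0 avoids an empty or inverted range.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: first register covered, clamped to 0 on wrap-around.
inline unsigned first_regno(unsigned regno, int offset)
{
  // The int offset is converted to unsigned, so the sum is modular.
  unsigned start = regno + static_cast<unsigned>(offset);
  if (offset < 0 && start > regno)   // result wrapped below zero
    start = 0;
  return start;
}

// Usage example.
inline void first_regno_demo()
{
  // Without clamping, 1 + (-2) wraps to the largest unsigned value.
  assert(1u + static_cast<unsigned>(-2) == UINT_MAX);
  assert(first_regno(1, -2) == 0);   // clamped
  assert(first_regno(5, -2) == 3);   // no wrap, ordinary result
}
```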
Jakub Jelinek [Thu, 24 Jun 2021 10:22:14 +0000 (12:22 +0200)]
stor-layout: Avoid DECL_BIT_FIELD_REPRESENTATIVE with NULL TREE_TYPE [PR101172]
finish_bitfield_representative has an early out if the field after a
bitfield has error_mark_node type, but that early out leads to TREE_TYPE
of the DECL_BIT_FIELD_REPRESENTATIVE being NULL, which breaks assumptions
on code that uses the DECL_BIT_FIELD_REPRESENTATIVE during error-recovery.
The following patch instead sets TREE_TYPE of the representative to
error_mark_node, something the users can deal with better. At this point
the representative can be set as DECL_BIT_FIELD_REPRESENTATIVE for multiple
bitfields, so making sure that we clear the DECL_BIT_FIELD_REPRESENTATIVE
instead would be harder (but doable, e.g. with the error_mark_node TREE_TYPE
set by this patch set some flag in the caller and if the flag is there, walk
all the fields once again and clear all DECL_BIT_FIELD_REPRESENTATIVE that
have error_mark_node TREE_TYPE).
2021-06-24 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101172
* stor-layout.c (finish_bitfield_representative): If nextf has
error_mark_node type, set repr type to error_mark_node too.
* gcc.dg/pr101172.c: New test.
Ilya Leoshkevich [Thu, 17 Jun 2021 12:18:17 +0000 (14:18 +0200)]
IBM Z: Define NO_PROFILE_COUNTERS
s390 glibc does not need counters in the .data section, since it stores
edge hits in its own data structure. Therefore counters only waste
space and confuse diffing tools (e.g. kpatch), so don't generate them.
gcc/ChangeLog:
* config/s390/s390.c (s390_function_profiler): Ignore labelno
parameter.
* config/s390/s390.h (NO_PROFILE_COUNTERS): Define.
gcc/testsuite/ChangeLog:
* gcc.target/s390/mnop-mcount-m31-mzarch.c: Adapt to the new
prologue size.
* gcc.target/s390/mnop-mcount-m64.c: Likewise.
Richard Biener [Thu, 24 Jun 2021 08:47:18 +0000 (10:47 +0200)]
Fix SLP permute propagation error
This fixes SLP permute propagation to not propagate across operations
that have different semantics on different lanes like for example
the recently added COMPLEX_ADD_ROT90.
2021-06-24 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_optimize_slp): Do not propagate
across operations that have different semantics on different
lanes.
Jakub Jelinek [Thu, 24 Jun 2021 09:25:34 +0000 (11:25 +0200)]
openmp: in_reduction clause support on target construct
This patch adds support for in_reduction clause on target construct, though
for now only for synchronous targets (without nowait clause).
The encountering thread in that case runs the target task and blocks until
the target region ends, so it is implemented by remapping it before entering
the target, initializing the private copy if not yet initialized for the
current thread and then using the remapped addresses for the mapping
addresses.
For nowait combined with in_reduction the patch contains a hack where the
nowait clause is ignored. To implement it correctly, I think we would need
to create a new private variable for the in_reduction and initialize it before
doing the async target and adjust the map addresses to that private variable
and then pass a function pointer to the library routine with code where the callback
would remap the address to the current thread's private variable and use in_reduction
combiner to combine the private variable we've created into the thread's copy.
The library would then need to make sure that the routine is called in some thread
participating in the parallel (and not in an unshackled thread).
2021-06-24 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.h (OMP_CLAUSE_MAP_IN_REDUCTION): Document meaning for OpenMP.
* gimplify.c (gimplify_scan_omp_clauses): For OpenMP map clauses
with OMP_CLAUSE_MAP_IN_REDUCTION flag partially defer gimplification
of non-decl OMP_CLAUSE_DECL. For OMP_CLAUSE_IN_REDUCTION on
OMP_TARGET use outer_ctx instead of ctx for placeholders and
initializer/combiner gimplification.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_MAP_IN_REDUCTION
on target constructs.
(lower_rec_input_clauses): Likewise.
(lower_omp_target): Likewise.
* omp-expand.c (expand_omp_target): Temporarily ignore nowait clause
on target if in_reduction is present.
gcc/c-family/
* c-common.h (enum c_omp_region_type): Add C_ORT_TARGET and
C_ORT_OMP_TARGET.
* c-omp.c (c_omp_split_clauses): For OMP_CLAUSE_IN_REDUCTION on
combined target constructs also add map (always, tofrom:) clause.
gcc/c/
* c-parser.c (omp_split_clauses): Pass C_ORT_OMP_TARGET instead of
C_ORT_OMP for clauses on target construct.
(OMP_TARGET_CLAUSE_MASK): Add in_reduction clause.
(c_parser_omp_target): For non-combined target add
map (always, tofrom:) clauses for OMP_CLAUSE_IN_REDUCTION. Pass
C_ORT_OMP_TARGET to c_finish_omp_clauses.
* c-typeck.c (handle_omp_array_sections): Adjust ort handling
for addition of C_ORT_OMP_TARGET and simplify, mapping clauses are
never present on C_ORT_*DECLARE_SIMD.
(c_finish_omp_clauses): Likewise. Handle OMP_CLAUSE_IN_REDUCTION
on C_ORT_OMP_TARGET, set OMP_CLAUSE_MAP_IN_REDUCTION on
corresponding map clauses.
gcc/cp/
* parser.c (cp_omp_split_clauses): Pass C_ORT_OMP_TARGET instead of
C_ORT_OMP for clauses on target construct.
(OMP_TARGET_CLAUSE_MASK): Add in_reduction clause.
(cp_parser_omp_target): For non-combined target add
map (always, tofrom:) clauses for OMP_CLAUSE_IN_REDUCTION. Pass
C_ORT_OMP_TARGET to finish_omp_clauses.
* semantics.c (handle_omp_array_sections_1): Adjust ort handling
for addition of C_ORT_OMP_TARGET and simplify, mapping clauses are
never present on C_ORT_*DECLARE_SIMD.
(handle_omp_array_sections): Likewise.
(finish_omp_clauses): Likewise. Handle OMP_CLAUSE_IN_REDUCTION
on C_ORT_OMP_TARGET, set OMP_CLAUSE_MAP_IN_REDUCTION on
corresponding map clauses.
* pt.c (tsubst_expr): Pass C_ORT_OMP_TARGET instead of C_ORT_OMP for
clauses on target construct.
gcc/testsuite/
* c-c++-common/gomp/target-in-reduction-1.c: New test.
* c-c++-common/gomp/clauses-1.c: Add in_reduction clauses on
target or combined target constructs.
libgomp/
* testsuite/libgomp.c-c++-common/target-in-reduction-1.c: New test.
* testsuite/libgomp.c-c++-common/target-in-reduction-2.c: New test.
* testsuite/libgomp.c++/target-in-reduction-1.C: New test.
* testsuite/libgomp.c++/target-in-reduction-2.C: New test.
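For orientation, a synchronous target region taking part in an enclosing task reduction might look like the sketch below. The function name sum_on_target is hypothetical. The sketch assumes a compiler with this patch when built with -fopenmp; without -fopenmp the pragmas are ignored and the loop simply runs sequentially on the host, giving the same result.

```cpp
#include <cassert>

// Sums 0..n-1 inside a target region that participates in the
// task reduction declared on the enclosing taskgroup.
inline long sum_on_target(int n)
{
  assert(n >= 0);
  long sum = 0;
  #pragma omp parallel
  #pragma omp single
  #pragma omp taskgroup task_reduction(+ : sum)
  {
    // No nowait clause: the encountering thread blocks until the
    // target region ends, which is the case this patch supports.
    #pragma omp target in_reduction(+ : sum)
    for (int i = 0; i < n; ++i)
      sum += i;
  }
  return sum;
}
```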
Kewen Lin [Thu, 24 Jun 2021 08:45:29 +0000 (03:45 -0500)]
predcom: Refactor more by encapsulating global states
This patch encapsulates global states into a class and
makes their accessors member functions, removes some
consequently useless clean up code, and does some clean up with
RAII.
Bootstrapped/regtested on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu, also
bootstrapped on ppc64le P9 with bootstrap-O3 config.
gcc/ChangeLog:
* tree-predcom.c (class pcom_worker): New class.
(release_chain): Renamed to...
(pcom_worker::release_chain): ...this.
(release_chains): Renamed to...
(pcom_worker::release_chains): ...this.
(aff_combination_dr_offset): Renamed to...
(pcom_worker::aff_combination_dr_offset): ...this.
(determine_offset): Renamed to...
(pcom_worker::determine_offset): ...this.
(class comp_ptrs): New class.
(split_data_refs_to_components): Renamed to...
(pcom_worker::split_data_refs_to_components): ...this,
and update with class comp_ptrs.
(suitable_component_p): Renamed to...
(pcom_worker::suitable_component_p): ...this.
(filter_suitable_components): Renamed to...
(pcom_worker::filter_suitable_components): ...this.
(valid_initializer_p): Renamed to...
(pcom_worker::valid_initializer_p): ...this.
(find_looparound_phi): Renamed to...
(pcom_worker::find_looparound_phi): ...this.
(add_looparound_copies): Renamed to...
(pcom_worker::add_looparound_copies): ...this.
(determine_roots_comp): Renamed to...
(pcom_worker::determine_roots_comp): ...this.
(determine_roots): Renamed to...
(pcom_worker::determine_roots): ...this.
(single_nonlooparound_use): Renamed to...
(pcom_worker::single_nonlooparound_use): ...this.
(remove_stmt): Renamed to...
(pcom_worker::remove_stmt): ...this.
(execute_pred_commoning_chain): Renamed to...
(pcom_worker::execute_pred_commoning_chain): ...this.
(execute_pred_commoning): Renamed to...
(pcom_worker::execute_pred_commoning): ...this.
(struct epcc_data): New member worker.
(execute_pred_commoning_cbck): Call execute_pred_commoning
with pcom_worker pointer.
(find_use_stmt): Renamed to...
(pcom_worker::find_use_stmt): ...this.
(find_associative_operation_root): Renamed to...
(pcom_worker::find_associative_operation_root): ...this.
(find_common_use_stmt): Renamed to...
(pcom_worker::find_common_use_stmt): ...this.
(combinable_refs_p): Renamed to...
(pcom_worker::combinable_refs_p): ...this.
(reassociate_to_the_same_stmt): Renamed to...
(pcom_worker::reassociate_to_the_same_stmt): ...this.
(stmt_combining_refs): Renamed to...
(pcom_worker::stmt_combining_refs): ...this.
(combine_chains): Renamed to...
(pcom_worker::combine_chains): ...this.
(try_combine_chains): Renamed to...
(pcom_worker::try_combine_chains): ...this.
(prepare_initializers_chain): Renamed to...
(pcom_worker::prepare_initializers_chain): ...this.
(prepare_initializers): Renamed to...
(pcom_worker::prepare_initializers): ...this.
(prepare_finalizers_chain): Renamed to...
(pcom_worker::prepare_finalizers_chain): ...this.
(prepare_finalizers): Renamed to...
(pcom_worker::prepare_finalizers): ...this.
(tree_predictive_commoning_loop): Renamed to...
(pcom_worker::tree_predictive_commoning_loop): ...this, adjust
some calls and remove some cleanup code.
(tree_predictive_commoning): Adjusted to use pcom_worker instance.
(static variable looparound_phis): Remove.
(static variable name_expansions): Remove.
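The general shape of this refactoring, moving file-scope state into a per-loop worker whose destructor does the cleanup, can be sketched generically. Every name below (loop_worker, record, process_loop) is hypothetical and unrelated to the real tree-predcom.c code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Before: state lived in file-scope statics that callers had to reset.
// After: each loop gets its own worker, and destruction releases the state.
class loop_worker
{
public:
  explicit loop_worker(int loop_id) : m_loop_id(loop_id) {}

  // Formerly a free function that touched global state.
  void record(const std::string& name, int value)
  {
    m_cache[name] = value;
    m_chains.push_back(value);
  }

  std::size_t chain_count() const { return m_chains.size(); }
  int loop_id() const { return m_loop_id; }

private:
  int m_loop_id;
  std::map<std::string, int> m_cache;  // formerly a static cache
  std::vector<int> m_chains;           // released automatically (RAII)
};

// Usage: fresh state per loop, no manual cleanup code at the end.
inline std::size_t process_loop(int loop_id)
{
  loop_worker worker(loop_id);
  worker.record("a", 1);
  worker.record("b", 2);
  assert(worker.loop_id() == loop_id);
  return worker.chain_count();
}
```

Because the state is owned by the worker, a second call starts from an empty cache rather than inheriting leftovers from the previous loop.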
Richard Biener [Wed, 23 Jun 2021 13:17:07 +0000 (15:17 +0200)]
refactor SLP permute propagation
This refactors SLP permute propagation to record the outgoing permute
separately from the incoming/materialized one. Instead of separate
arrays/bitmaps I've now created a struct to represent the state.
2021-06-23 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (slpg_vertex): New struct.
(vect_slp_build_vertices): Adjust.
(vect_optimize_slp): Likewise. Maintain an outgoing permute
and a materialized one.
Richard Biener [Wed, 23 Jun 2021 10:43:03 +0000 (12:43 +0200)]
tree-optimization/101105 - fix runtime alias test optimization
We were ignoring DR_STEP for VF == 1 which is OK only in case
the scalar order is preserved or both DR steps are the same.
2021-06-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/101105
* tree-vect-data-refs.c (vect_prune_runtime_alias_test_list):
Only ignore steps when they are equal or scalar order is preserved.
* gcc.dg/torture/pr101105.c: New testcase.
liuhongt [Tue, 26 Jan 2021 08:29:32 +0000 (16:29 +0800)]
i386: Add vashlm3/vashrm3/vlshrm3 to enable vectorization of vector shift vector. [PR98434]
Add expanders for vashl<VI12_AVX512BW>, vlshr<VI12_AVX512BW>,
vashr<VI1_AVX512BW> and vashr<v32hi,v16hi,v4di,v8di>.
Besides, expand_mult_const assumes that mul and add are available
at the same time, but for i386 addv8qi is restricted under
TARGET_64BIT while mulv8qi is not, which could cause an ICE.
So restrict mulv8qi and shiftv8qi under TARGET_64BIT.
gcc/ChangeLog:
PR target/98434
* config/i386/i386-expand.c (ix86_expand_vec_interleave):
Adjust comments for ix86_expand_vecop_qihi2.
(ix86_expand_vecmul_qihi): Renamed to ..
(ix86_expand_vecop_qihi2): Adjust function prototype to
support shift operation, add static to definition.
(ix86_expand_vec_shift_qihi_constant): Add static to definition.
(ix86_expand_vecop_qihi): Call ix86_expand_vecop_qihi2 and
ix86_expand_vec_shift_qihi_constant.
* config/i386/i386-protos.h (ix86_expand_vecmul_qihi): Deleted.
(ix86_expand_vec_shift_qihi_constant): Deleted.
* config/i386/sse.md (VI12_256_512_AVX512VL): New mode
iterator.
(mulv8qi3): Call ix86_expand_vecop_qihi directly, add
condition TARGET_64BIT.
(mul<mode>3): Ditto.
(<insn><mode>3): Ditto.
(vlshr<mode>3): Extend to support avx512 vlshr.
(v<insn><mode>3): New expander for
vashr/vlshr/vashl.
(v<insn>v8qi3): Ditto.
(vashrv8hi3<mask_name>): Renamed to ..
(vashr<mode>3): And extend to support V16QImode for avx512.
(vashrv16qi3): Deleted.
(vashrv2di3<mask_name>): Extend expander to support avx512
instruction.
gcc/testsuite/ChangeLog:
PR target/98434
* gcc.target/i386/pr98434-1.c: New test.
* gcc.target/i386/pr98434-2.c: New test.
* gcc.target/i386/avx512vl-pr95488-1.c: Adjust testcase.
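A "vector shift vector" loop is one where every element has its own shift count, as in the sketch below. The function name shift_left_each is hypothetical and this is not one of the new testcases; whether the loop is vectorized depends on the target options in use.

```cpp
#include <cassert>
#include <cstdint>

// Per-element variable shift: each lane has its own shift count.
// Precondition: every count is below 8, the width of the element type.
inline void shift_left_each(std::uint8_t* dst, const std::uint8_t* src,
                            const std::uint8_t* count, int n)
{
  for (int i = 0; i < n; ++i)
    {
      assert(count[i] < 8);
      // The operand is promoted to int; the cast truncates back to 8 bits.
      dst[i] = static_cast<std::uint8_t>(src[i] << count[i]);
    }
}
```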
GCC Administrator [Thu, 24 Jun 2021 00:16:30 +0000 (00:16 +0000)]
Daily bump.
Patrick Palka [Wed, 23 Jun 2021 21:23:39 +0000 (17:23 -0400)]
c++: excessive instantiation during CTAD [PR101174]
We set DECL_CONTEXT on implicitly generated deduction guides so that
their access is consistent with that of the constructor. But this
apparently leads to excessive instantiation in some cases, ultimately
because instantiation of a deduction guide should be independent of
instantiation of the resulting class specialization, but setting the
DECL_CONTEXT of the former to the latter breaks this independence.
To fix this, this patch makes push_access_scope handle artificial
deduction guides specifically rather than setting their DECL_CONTEXT
in build_deduction_guide. We could alternatively make the class
befriend the guide via DECL_BEFRIENDING_CLASSES, but that wouldn't
be a complete fix and would break class-deduction-access3.C below
since friendship isn't transitive.
PR c++/101174
gcc/cp/ChangeLog:
* pt.c (push_access_scope): For artificial deduction guides,
set the access scope to that of the constructor.
(pop_access_scope): Likewise.
(build_deduction_guide): Don't set DECL_CONTEXT on the guide.
libstdc++-v3/ChangeLog:
* testsuite/23_containers/multiset/cons/deduction.cc:
Uncomment CTAD example that was rejected by this bug.
* testsuite/23_containers/set/cons/deduction.cc: Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/class-deduction-access3.C: New test.
* g++.dg/cpp1z/class-deduction91.C: New test.
Dimitar Dimitrov [Sun, 20 Jun 2021 15:39:41 +0000 (18:39 +0300)]
doc/lto.texi: List slim object format as the default
Slim LTO object files have been the default for quite a while, since:
commit
e9f67e625c2a4225a7169d7220dcb85b6fdd7ca9
Author: Jan Hubicka <hubicka@gcc.gnu.org>
common.opt (ffat-lto-objects): Disable by default.
That commit did not update lto.texi, so do it now.
gcc/ChangeLog:
* doc/lto.texi (Design Overview): Update that slim objects are
the default.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
Tobias Burnus [Wed, 23 Jun 2021 20:10:43 +0000 (22:10 +0200)]
fortran/dump-parse-tree.c: Use proper enum type
gcc/fortran/ChangeLog:
* dump-parse-tree.c (show_omp_clauses): Fix enum type used
for dumping gfc_omp_defaultmap_category.
Aaron Sawdey [Tue, 22 Jun 2021 21:02:15 +0000 (16:02 -0500)]
Do not enable pcrel-opt by default
SPEC2017 testing on p10 shows that this optimization does not have a
positive impact on performance. So we are no longer going to enable it
by default. The test cases for it needed to be updated so they always
enable it to test it.
gcc/
* config/rs6000/rs6000-cpus.def: Take OPTION_MASK_PCREL_OPT out
of OTHER_POWER10_MASKS so it will not be enabled by default.
gcc/testsuite/
* gcc.target/powerpc/pcrel-opt-inc-di.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-df.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-di.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-hi.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-qi.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-sf.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-si.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-vector.c: Enable -mpcrel-opt to
test it.
* gcc.target/powerpc/pcrel-opt-st-df.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-st-di.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-st-hi.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-st-qi.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-st-sf.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-st-si.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-st-vector.c: Enable -mpcrel-opt to
test it.
Xi Ruoyao [Wed, 23 Jun 2021 18:45:06 +0000 (14:45 -0400)]
testsuite: add -fwrapv for 950704-1.c
gcc/testsuite
* gcc.c-torture/execute/950704-1.c: Add -fwrapv to avoid
undefined behavior.
Jonathan Wakely [Wed, 23 Jun 2021 17:50:03 +0000 (18:50 +0100)]
libstdc++: Fix comment in chrono::year::is_leap()
libstdc++-v3/ChangeLog:
* include/std/chrono (chrono::year::is_leap()): Fix incorrect
logic in comment.
Matthias Kretz [Wed, 23 Jun 2021 15:36:08 +0000 (16:36 +0100)]
libstdc++: Document simd testsuite
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/README.md: New file.
Matthias Kretz [Wed, 23 Jun 2021 15:34:30 +0000 (16:34 +0100)]
libstdc++: Improve output verbosity options and default
For most uses --quiet was too quiet while the default was too noisy. Now
the default output, if stdout is a tty, shows the last successful test
on the same line. With --percentage it adds a percentage at the start of
the line. --percentage is not default because it requires more resources
and might not be 100% compatible with all environments.
If stdout is not a tty the default is quiet output like for dejagnu.
Additionally, argument parsing now recognizes contracted short options
which is easier to use with e.g. DRIVEROPTS=-pxk.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/driver.sh: Rewrite output
verbosity logic. Add -p/--percentage option. Allow -v/--verbose
to be used twice. Add -x and -o short options. Parse long
options with = instead of separating space generically. Parse
contracted short options. Make unrecognized options an error.
If same-line output is active, trap on EXIT to increment the
progress (only with --percentage), erase the line and print the
current status.
* testsuite/experimental/simd/generate_makefile.sh: Initialize
helper files for progress account keeping. Update help target
for changes to DRIVEROPTS.
Matthias Kretz [Wed, 23 Jun 2021 15:29:30 +0000 (16:29 +0100)]
libstdc++: Remove -fno-tree-vrp after PR98834 was resolved
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* testsuite/Makefile.am (check-simd): Remove -fno-tree-vrp flag
and associated warning.
* testsuite/Makefile.in: Regenerate.
Cassio Neri [Wed, 23 Jun 2021 14:32:16 +0000 (15:32 +0100)]
libstdc++: More efficient std::chrono::year::leap
Simple change to std::chrono::year::is_leap. If a year is a multiple of 100,
then it's divisible by 400 if and only if it's divisible by 16. The latter
allows for better code generation.
The test is then either y%16 or y%4. Both divisors are powers of two,
so it can be rearranged to use simple bitmask operations.
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
Co-authored-by: Ulrich Drepper <drepper@redhat.com>
libstdc++-v3/ChangeLog:
* include/std/chrono (chrono::year::is_leap()): Optimize.
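The reasoning above can be illustrated with a minimal sketch. It is not the code committed to include/std/chrono: the names is_leap_reference, is_leap_fast, same_result_in_range and check_year_range are hypothetical, plain int stands in for the year representation, and two's complement integers are assumed so that the bitmask also works for negative years.

```cpp
#include <cassert>

// Textbook Gregorian rule, used here only as a reference.
constexpr bool is_leap_reference(int y) noexcept
{
  return y % 4 == 0 && (y % 100 != 0 || y % 400 == 0);
}

// Sketch of the idea in the commit message: a multiple of 100 is divisible
// by 400 exactly when it is divisible by 16, so both cases test divisibility
// by a power of two and can be written as a bitmask (15 or 3).
// Assumption: two's complement representation for negative years.
constexpr bool is_leap_fast(int y) noexcept
{
  return (y & (y % 100 == 0 ? 15 : 3)) == 0;
}

static_assert(is_leap_fast(2000) && !is_leap_fast(1900), "century years");
static_assert(is_leap_fast(2024) && !is_leap_fast(2023), "ordinary years");

// Compare both versions for every year in [lo, hi].
inline bool same_result_in_range(int lo, int hi)
{
  for (int y = lo; y <= hi; ++y)
    if (is_leap_fast(y) != is_leap_reference(y))
      return false;
  return true;
}

// Hypothetical self-check over the range of years chrono::year can hold.
inline void check_year_range()
{
  assert(same_result_in_range(-32767, 32767));
}
```

The only remaining non-power-of-two operation is the y % 100 test that selects the mask.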
Martin Jambor [Wed, 23 Jun 2021 16:46:04 +0000 (18:46 +0200)]
tree-inline: Fix TREE_READONLY of parameter replacements
tree-inline leaves behind VAR_DECLs which are TREE_READONLY (because
they are copies of const parameters) but are written to because they
need to be initialized. This patch resets the flag unconditionally so
that this does not happen.
There are other sources of variables which are incorrectly marked as
TREE_READONLY, but with this patch and a verifier catching them I can
at least compile the Ada run-time library.
gcc/ChangeLog:
2021-06-22 Richard Biener <rguenther@suse.de>
Martin Jambor <mjambor@suse.cz>
* tree-inline.c (setup_one_parameter): Set TREE_READONLY of the
param replacement unconditionally. Adjust comment.
Andrew MacLeod [Tue, 22 Jun 2021 15:41:30 +0000 (11:41 -0400)]
Split gimple-range into gimple-range-fold and gimple-range.
Split the fold_using_range functions from gimple-range into gimple-range-fold.
Also move the gimple_range_calc* routines into gimple-range-gori.
* Makefile.in (OBJS): Add gimple-range-fold.o
* gimple-range-fold.cc: New.
* gimple-range-fold.h: New.
* gimple-range-gori.cc (gimple_range_calc_op1): Move to here.
(gimple_range_calc_op2): Ditto.
* gimple-range-gori.h: Move prototypes to here.
* gimple-range.cc: Adjust include files.
(fur_source::fur_source): Relocate to gimple-range-fold.cc.
(fur_source::get_operand): Ditto.
(fur_source::get_phi_operand): Ditto.
(fur_source::query_relation): Ditto.
(fur_source::register_relation): Ditto.
(class fur_edge): Ditto.
(fur_edge::fur_edge): Ditto.
(fur_edge::get_operand): Ditto.
(fur_edge::get_phi_operand): Ditto.
(fur_stmt::fur_stmt): Ditto.
(fur_stmt::get_operand): Ditto.
(fur_stmt::get_phi_operand): Ditto.
(fur_stmt::query_relation): Ditto.
(class fur_depend): Relocate to gimple-range-fold.h.
(fur_depend::fur_depend): Relocate to gimple-range-fold.cc.
(fur_depend::register_relation): Ditto.
(fur_depend::register_relation): Ditto.
(class fur_list): Ditto.
(fur_list::fur_list): Ditto.
(fur_list::get_operand): Ditto.
(fur_list::get_phi_operand): Ditto.
(fold_range): Ditto.
(adjust_pointer_diff_expr): Ditto.
(gimple_range_adjustment): Ditto.
(gimple_range_base_of_assignment): Ditto.
(gimple_range_operand1): Ditto.
(gimple_range_operand2): Ditto.
(gimple_range_calc_op1): Relocate to gimple-range-gori.cc.
(gimple_range_calc_op2): Ditto.
(fold_using_range::fold_stmt): Relocate to gimple-range-fold.cc.
(fold_using_range::range_of_range_op): Ditto.
(fold_using_range::range_of_address): Ditto.
(fold_using_range::range_of_phi): Ditto.
(fold_using_range::range_of_call): Ditto.
(fold_using_range::range_of_builtin_ubsan_call): Ditto.
(fold_using_range::range_of_builtin_call): Ditto.
(fold_using_range::range_of_cond_expr): Ditto.
(fold_using_range::range_of_ssa_name_with_loop_info): Ditto.
(fold_using_range::relation_fold_and_or): Ditto.
(fold_using_range::postfold_gcond_edges): Ditto.
* gimple-range.h: Add gimple-range-fold.h to include files. Change
GIMPLE_RANGE_STMT_H to GIMPLE_RANGE_H.
(gimple_range_handler): Relocate to gimple-range-fold.h.
(gimple_range_ssa_p): Ditto.
(range_compatible_p): Ditto.
(class fur_source): Ditto.
(class fur_stmt): Ditto.
(class fold_using_range): Ditto.
(gimple_range_calc_op1): Relocate to gimple-range-gori.h
(gimple_range_calc_op2): Ditto.
Andrew MacLeod [Tue, 22 Jun 2021 21:46:05 +0000 (17:46 -0400)]
Do not continue propagating values which cannot be set properly.
If the on-entry cache cannot properly represent a range, do not continue
trying to propagate it.
PR tree-optimization/101148
PR tree-optimization/101014
* gimple-range-cache.cc (ranger_cache::ranger_cache): Adjust.
(ranger_cache::~ranger_cache): Adjust.
(ranger_cache::block_range): Check if propagation disallowed.
(ranger_cache::propagate_cache): Disallow propagation if new value
can't be stored properly.
* gimple-range-cache.h (ranger_cache::m_propfail): New member.
Andrew MacLeod [Tue, 22 Jun 2021 21:21:32 +0000 (17:21 -0400)]
Adjust on_entry cache to indicate if the value was set properly.
* gimple-range-cache.cc (class ssa_block_ranges): Adjust prototype.
(sbr_vector::set_bb_range): Return true.
(class sbr_sparse_bitmap): Adjust.
(sbr_sparse_bitmap::set_bb_range): Return value.
(block_range_cache::set_bb_range): Return value.
(ranger_cache::propagate_cache): Use return value to print msg.
* gimple-range-cache.h (class block_range_cache): Adjust.
Andrew MacLeod [Tue, 22 Jun 2021 13:20:47 +0000 (09:20 -0400)]
Dump should be read only. Do not trigger new lookups.
* gimple-range.cc (dump_bb): Use range_on_edge from the cache.
Jeff Law [Tue, 22 Jun 2021 19:25:11 +0000 (15:25 -0400)]
Use more logicals to eliminate useless test/compare instructions
gcc/
* config/h8300/logical.md (<code><mode>3<ccnz>): Use <cczn>
so this pattern can be used for test/compare removal. Pass
current insn to compute_logical_op_length and output_logical_op.
* config/h8300/h8300.c (compute_logical_op_cc): Remove.
(h8300_and_costs): Add argument to compute_logical_op_length.
(output_logical_op): Add new argument. Use it to determine if the
condition codes are used and adjust the output accordingly.
(compute_logical_op_length): Add new argument and update length
computations when condition codes are used.
* config/h8300/h8300-protos.h (compute_logical_op_length): Update
prototype.
(output_logical_op): Likewise.
Uros Bizjak [Wed, 23 Jun 2021 14:14:31 +0000 (16:14 +0200)]
i386: Add PPERM two-operand 64bit vector permutation [PR89021]
Add emulation of V8QI PPERM permutations for TARGET_XOP. Similar
to PSHUFB, the permutation is performed with the V16QI PPERM instruction,
where the selector is defined in V16QI mode with inactive elements set to 0x80.
Specific to two operand permutations is the remapping of elements from
the second operand (e.g. e[8] -> e[16]), as we have to account for the
inactive elements from the first operand.
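The selector remapping described above can be sketched in software. This is a simplified model, not the code in i386-expand.c: widen_selector and permute_model are hypothetical names, and the model treats any selector byte with bit 7 set as "produce zero", ignoring the other operations a real PPERM selector can encode.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint8_t kInactive = 0x80;  // selector byte for unused lanes

// Hypothetical helper: widen an 8-byte two-operand selector (0..7 pick from
// the first operand, 8..15 from the second) to a 16-byte selector over two
// 16-byte registers. Indices into the second operand move up by 8
// (e.g. 8 -> 16) to skip the inactive upper half of the first operand.
inline std::array<std::uint8_t, 16>
widen_selector(const std::array<std::uint8_t, 8>& sel)
{
  std::array<std::uint8_t, 16> wide{};
  for (std::size_t i = 0; i < 16; ++i)
    wide[i] = kInactive;
  for (std::size_t i = 0; i < 8; ++i)
    {
      const std::uint8_t idx = static_cast<std::uint8_t>(sel[i] & 15);
      wide[i] = idx < 8 ? idx : static_cast<std::uint8_t>(idx + 8);
    }
  return wide;
}

// Simplified byte-permute model over the 32-byte pool {op0, op1}.
inline std::array<std::uint8_t, 16>
permute_model(const std::array<std::uint8_t, 16>& op0,
              const std::array<std::uint8_t, 16>& op1,
              const std::array<std::uint8_t, 16>& sel)
{
  std::array<std::uint8_t, 16> out{};
  for (std::size_t i = 0; i < 16; ++i)
    {
      if (sel[i] & 0x80)
        continue;  // inactive lane stays zero
      const std::size_t idx = sel[i] & 31;
      out[i] = idx < 16 ? op0[idx] : op1[idx - 16];
    }
  return out;
}
```

With 64-bit values held in the low halves of op0 and op1, the low 8 bytes of the result carry the V8QI permutation and the upper 8 bytes are zero.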
2021-06-23 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/89021
* config/i386/i386-expand.c (expand_vec_perm_pshufb):
Handle 64bit modes for TARGET_XOP. Use indirect gen_* functions.
* config/i386/mmx.md (mmx_ppermv64): New insn pattern.
* config/i386/i386.md (unspec): Move UNSPEC_XOP_PERMUTE from ...
* config/i386/sse.md (unspec): ... here.
Martin Liska [Wed, 23 Jun 2021 13:30:17 +0000 (15:30 +0200)]
arm: Revert partially
ebd5e86c0f41dc1d692f9b2b68a510b1f6835a3e
PR target/98636
gcc/ChangeLog:
* optc-save-gen.awk: Put back arm_fp16_format to
checked_options.
Patrick Palka [Wed, 23 Jun 2021 12:24:34 +0000 (08:24 -0400)]
c++: CTAD and deduction guide selection [PR86439]
During CTAD, we select the best viable deduction guide using
build_new_function_call, which performs overload resolution on the set
of candidate guides and then forms a call to the guide. As the PR
points out, this latter step is unnecessary and occasionally incorrect
since a call to the selected guide may be ill-formed, or forming the
call may have side effects such as prematurely deducing the type of a {}.
So this patch introduces a specialized subroutine based on
build_new_function_call that stops short of building a call to the
selected function, and makes do_class_deduction use this subroutine
instead. And since a call is no longer built, do_class_deduction
doesn't need to set tf_decltype or cp_unevaluated_operand anymore.
This change causes us to reject some container CTAD examples in the
libstdc++ testsuite due to deduction failure for {}, which AFAICT is the
correct behavior. Previously in e.g. the first removed example
std::map{{std::pair{1, 2.0}, {2, 3.0}, {3, 4.0}}, {}},
the type of the {} would get deduced to less<int> as a side effect of
forming a call to the chosen guide
template<typename _Key, typename _Tp, typename _Compare = less<_Key>,
typename _Allocator = allocator<pair<const _Key, _Tp>>>
map(initializer_list<pair<_Key, _Tp>>,
_Compare = _Compare(), _Allocator = _Allocator())
-> map<_Key, _Tp, _Compare, _Allocator>;
which made later overload resolution for the constructor call
unambiguous. Now, the type of the {} remains undeduced until
constructor overload resolution, and we complain about ambiguity
for the two equally good constructor candidates
map(initializer_list<value_type>,
const _Compare& = _Compare(),
const allocator_type& = allocator_type())
map(initializer_list<value_type>, const allocator_type&).
This patch fixes these problematic container CTAD examples by giving
the {} an appropriate concrete type. Two of these adjusted CTAD
examples (one for std::set and one for std::multiset) end up triggering
an unrelated CTAD bug on trunk, PR101174, so these two adjusted examples
are commented out for now.
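As an illustration of giving the {} a concrete type, here is a minimal sketch of an adjusted std::map example. The exact replacements made in the testsuite files are not shown in this log, so the choice of std::less<int>{} and the helper name make_example_map are assumptions.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <type_traits>
#include <utility>

// With an explicitly typed comparator, _Compare is deduced from the argument
// during CTAD, and the later constructor overload resolution is no longer
// ambiguous: std::less<int> cannot be converted to an allocator.
static_assert(std::is_same_v<
              decltype(std::map{{std::pair{1, 2.0}, {2, 3.0}, {3, 4.0}},
                                std::less<int>{}}),
              std::map<int, double>>);

// Hypothetical helper showing the same deduction in an ordinary expression.
inline std::map<int, double> make_example_map()
{
  return std::map{{std::pair{1, 2.0}, {2, 3.0}, {3, 4.0}}, std::less<int>{}};
}
```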
PR c++/86439
gcc/cp/ChangeLog:
* call.c (print_error_for_call_failure): Constify 'args' parameter.
(perform_dguide_overload_resolution): Define.
* cp-tree.h (perform_dguide_overload_resolution): Declare.
* pt.c (do_class_deduction): Use perform_dguide_overload_resolution
instead of build_new_function_call. Don't use tf_decltype or
set cp_unevaluated_operand. Remove unnecessary NULL_TREE tests.
libstdc++-v3/ChangeLog:
* testsuite/23_containers/map/cons/deduction.cc: Replace ambiguous
CTAD examples.
* testsuite/23_containers/multimap/cons/deduction.cc: Likewise.
* testsuite/23_containers/multiset/cons/deduction.cc: Likewise.
Mention one of the replaced examples is broken due to PR101174.
* testsuite/23_containers/set/cons/deduction.cc: Likewise.
* testsuite/23_containers/unordered_map/cons/deduction.cc: Replace
ambiguous CTAD examples.
* testsuite/23_containers/unordered_multimap/cons/deduction.cc:
Likewise.
* testsuite/23_containers/unordered_multiset/cons/deduction.cc:
Likewise.
* testsuite/23_containers/unordered_set/cons/deduction.cc: Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/class-deduction88.C: New test.
* g++.dg/cpp1z/class-deduction89.C: New test.
* g++.dg/cpp1z/class-deduction90.C: New test.
Uros Bizjak [Wed, 23 Jun 2021 10:50:53 +0000 (12:50 +0200)]
i386: Prevent unwanted combine from LZCNT to BSR [PR101175]
The current RTX pattern for BSR allows the combine pass to convert an LZCNT
insn to BSR. Note that LZCNT has defined behavior for a zero operand (it
returns the operand size), whereas BSR does not.
Add a BSR specific setting of zero-flag to RTX pattern of BSR insn
in order to avoid matching unwanted combinations.
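The semantic difference can be sketched with a portable software model. It is not the RTX pattern or the hardware definition: lzcnt32_model and bsr32_model are hypothetical names, and std::nullopt stands in for BSR's undefined destination when the source is zero.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Model of 32-bit LZCNT: counts leading zero bits and is defined for a zero
// operand, where it returns the operand size (32).
constexpr int lzcnt32_model(std::uint32_t x) noexcept
{
  int n = 0;
  for (std::uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// Model of 32-bit BSR: index of the highest set bit. For a zero source the
// result is undefined, modelled here as an empty optional.
inline std::optional<int> bsr32_model(std::uint32_t x) noexcept
{
  if (x == 0)
    return std::nullopt;
  return 31 - lzcnt32_model(x);
}

static_assert(lzcnt32_model(0) == 32, "LZCNT is defined for zero");
static_assert(lzcnt32_model(1) == 31, "only bit 0 set");
```

For nonzero input the two are related by lzcnt = 31 - bsr, so a rewrite between them is only safe when the operand is known to be nonzero.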
2021-06-23 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/101175
* config/i386/i386.md (bsr_rex64): Add zero-flag setting RTX.
(bsr): Ditto.
(*bsrhi): Remove.
(clz<mode>2): Update RTX pattern for additions.
gcc/testsuite/
PR target/101175
* gcc.target/i386/pr101175.c: New test.
Jonathan Wakely [Wed, 23 Jun 2021 10:05:51 +0000 (11:05 +0100)]
libstdc++: Avoid "__lockable" name defined as macro by newlib
libstdc++-v3/ChangeLog:
* include/std/mutex (__detail::__try_lock_impl): Rename
parameter to avoid clashing with newlib's __lockable macro.
(try_lock): Add 'inline' specifier.
* testsuite/17_intro/names.cc: Add check for __lockable.
* testsuite/30_threads/try_lock/5.cc: Add options for pthreads.