Peter Bergner [Sat, 8 Aug 2020 16:54:48 +0000 (11:54 -0500)]
rs6000: MMA built-ins reject typedefs of MMA types
We do not allow conversions between the MMA types and other types.
However, we are being too strict in not matching MMA types with
typdefs of those types. Use TYPE_CANONICAL to see through the
types to their canonical types before comparing them.
2020-08-08 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/96530
* config/rs6000/rs6000.c (rs6000_invalid_conversion): Use canonical
types for type comparisons. Refactor code to simplify it.
gcc/testsuite/
PR target/96530
* gcc.target/powerpc/pr96530.c: New test.
Jakub Jelinek [Sat, 8 Aug 2020 09:10:30 +0000 (11:10 +0200)]
openmp: Handle clauses with gimple sequences in convert_nonlocal_omp_clauses properly
If the walk_body on the various sequences of reduction, lastprivate and/or linear
clauses needs to create a temporary variable, we should declare that variable
in that sequence rather than outside, where it would need to be privatized inside of
the construct.
2020-08-08 Jakub Jelinek <jakub@redhat.com>
PR fortran/93553
* tree-nested.c (convert_nonlocal_omp_clauses): For
OMP_CLAUSE_REDUCTION, OMP_CLAUSE_LASTPRIVATE and OMP_CLAUSE_LINEAR
save info->new_local_var_chain around walks of the clause gimple
sequences and declare_vars if needed into the sequence.
2020-08-08 Tobias Burnus <tobias@codesourcery.com>
PR fortran/93553
* testsuite/libgomp.fortran/pr93553.f90: New test.
Jakub Jelinek [Sat, 8 Aug 2020 09:07:09 +0000 (11:07 +0200)]
openmp: Avoid floating point comparison at the end of bb with -fnon-call-exceptions
The following testcase ICEs with -fexceptions -fnon-call-exceptions because
in that mode floating point comparisons should not be done at the end of bb
in GIMPLE_COND. Fixed by forcing it into a bool SSA_NAME and comparing that against
false.
2020-08-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96424
* omp-expand.c: Include tree-eh.h.
(expand_omp_for_init_vars): Handle -fexceptions -fnon-call-exceptions
by forcing floating point comparison into a bool temporary.
* c-c++-common/gomp/pr96424.c: New test.
Ian Lance Taylor [Fri, 7 Aug 2020 22:17:35 +0000 (15:17 -0700)]
libgo: update to Go1.15rc2 release
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/247517
GCC Administrator [Sat, 8 Aug 2020 00:16:34 +0000 (00:16 +0000)]
Daily bump.
Jonathan Wakely [Fri, 7 Aug 2020 19:29:11 +0000 (20:29 +0100)]
libstdc++: Fix ambiguous comparisons in __gnu_debug::bitset [PR 96303]
With -pedantic the debug mode bitset has an ambiguous equality
comparison operator, because it tries to compare the non-debug base to
the debug object. The base object can be converted to another debug
bitset, making the same operator== a candidate again.
The fix is to do the comparison on both base objects, so the operator
for the derived type isn't a candidate.
For the inequality operator the same change should be done, but that
operator can be removed entirely for C++20 because it can be synthesized
by the compiler.
I don't think either equality or inequality operators are really needed,
because the public _GLIBCXX_STD_C::bitset base class cam always be
compared using its own comparison operators. I'm not changing that here
though.
libstdc++-v3/ChangeLog:
PR libstdc++/96303
* include/debug/bitset (bitset::operator==): Call _M_base() on
right operand.
(bitset::operator!=): Likewise, but don't define it at all when
default comparisons are supported by the compiler.
* testsuite/23_containers/bitset/operations/96303.cc: New test.
Marc Glisse [Fri, 7 Aug 2020 16:49:04 +0000 (18:49 +0200)]
Disable some VEC_COND_EXPR transformations after vector lowering
ARM understands VEC_COND_EXPR<v == w, -1, 0> but not a plain v == w which is
fed to something other than VEC_COND_EXPR (say BIT_IOR_EXPR). This patch avoids
introducing the second kind of statement after the vector lowering pass, which
is the last chance to turn v == w back into something the target handles.
This is just a workaround to avoid ICEs, a v == w produced before vector
lowering will yield pretty bad code. Either the arm target needs to learn to
handle vector comparisons (aarch64 already does), or the middle-end needs to
fall back to vcond when plain comparisons are not supported (or ...).
2020-08-07 Marc Glisse <marc.glisse@inria.fr>
* generic-match-head.c (optimize_vectors_before_lowering_p): New
function.
* gimple-match-head.c (optimize_vectors_before_lowering_p):
Likewise.
* match.pd ((v ? w : 0) ? a : b, c1 ? c2 ? a : b : b): Use it.
Jonathan Wakely [Fri, 7 Aug 2020 16:45:42 +0000 (17:45 +0100)]
libstdc++: Replace some VERIFY tests with static_assert
libstdc++-v3/ChangeLog:
* testsuite/18_support/comparisons/algorithms/partial_order.cc:
Replace VERIFY with static_assert where the compiler now
allows it.
* testsuite/18_support/comparisons/algorithms/weak_order.cc:
Likewise.
Jonathan Wakely [Fri, 7 Aug 2020 15:38:51 +0000 (16:38 +0100)]
libstdc++: Fix linker script patterns for 32-bit targets
When making the patterns less greedy I forgot to use [jmy] for unsigned
integer parameters.
libstdc++-v3/ChangeLog:
* config/abi/pre/gnu.ver: Fix wildcards for wstring symbols.
Richard Biener [Fri, 7 Aug 2020 08:16:05 +0000 (10:16 +0200)]
tree-optimization/96514 - avoid if-converting control-altering calls
This avoids if-converting when encountering control-altering calls.
2020-08-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/96514
* tree-if-conv.c (if_convertible_bb_p): If the last stmt
is a call that is control-altering, fail.
* gcc.dg/pr96514.c: New testcase.
Jose E. Marchesi [Fri, 7 Aug 2020 09:27:55 +0000 (11:27 +0200)]
bpf: remove trailing whitespaces from source files
This patch is a little cleanup that removes trailing whitespaces from
the bpf backend source files.
2020-08-07 Jose E. Marchesi <jose.marchesi@oracle.com>
gcc/
* config/bpf/bpf.md: Remove trailing whitespaces.
* config/bpf/constraints.md: Likewise.
* config/bpf/predicates.md: Likewise.
gcc/testsuite/
* gcc.target/bpf/diag-funargs-2.c: Remove trailing whitespaces.
* gcc.target/bpf/skb-ancestor-cgroup-id.c: Likewise.
* gcc.target/bpf/helper-xdp-adjust-meta.c: Likewise.
* gcc.target/bpf/helper-xdp-adjust-head.c: Likewise.
* gcc.target/bpf/helper-tcp-check-syncookie.c: Likewise.
* gcc.target/bpf/helper-sock-ops-cb-flags-set.c
* gcc.target/bpf/helper-sysctl-set-new-value.c: Likewise.
* gcc.target/bpf/helper-sysctl-get-new-value.c: Likewise.
* gcc.target/bpf/helper-sysctl-get-name.c: Likewise.
* gcc.target/bpf/helper-sysctl-get-current-value.c: Likewise.
* gcc.target/bpf/helper-strtoul.c: Likewise.
* gcc.target/bpf/helper-strtol.c: Likewise.
* gcc.target/bpf/helper-sock-map-update.c: Likewise.
* gcc.target/bpf/helper-sk-storage-get.c: Likewise.
* gcc.target/bpf/helper-sk-storage-delete.c: Likewise.
* gcc.target/bpf/helper-sk-select-reuseport.c: Likewise.
* gcc.target/bpf/helper-sk-release.c: Likewise.
* gcc.target/bpf/helper-sk-redirect-map.c: Likewise.
* gcc.target/bpf/helper-sk-lookup-upd.c: Likewise.
* gcc.target/bpf/helper-sk-lookup-tcp.c: Likewise.
* gcc.target/bpf/helper-skb-change-head.c: Likewise.
* gcc.target/bpf/helper-skb-cgroup-id.c: Likewise.
* gcc.target/bpf/helper-skb-adjust-room.c: Likewise.
* gcc.target/bpf/helper-set-hash.c: Likewise.
* gcc.target/bpf/helper-setsockopt.c: Likewise.
* gcc.target/bpf/helper-redirect-map.c: Likewise.
* gcc.target/bpf/helper-rc-repeat.c: Likewise.
* gcc.target/bpf/helper-rc-keydown.c: Likewise.
* gcc.target/bpf/helper-probe-read-str.c: Likewise.
* gcc.target/bpf/helper-perf-prog-read-value.c: Likewise.
* gcc.target/bpf/helper-perf-event-read-value.c: Likewise.
* gcc.target/bpf/helper-override-return.c: Likewise.
* gcc.target/bpf/helper-msg-redirect-map.c: Likewise.
* gcc.target/bpf/helper-msg-pull-data.c: Likewise.
* gcc.target/bpf/helper-msg-cork-bytes.c: Likewise.
* gcc.target/bpf/helper-msg-apply-bytes.c: Likewise.
* gcc.target/bpf/helper-lwt-seg6-store-bytes.c: Likewise.
* gcc.target/bpf/helper-lwt-seg6-adjust-srh.c: Likewise.
* gcc.target/bpf/helper-lwt-seg6-action.c: Likewise.
* gcc.target/bpf/helper-lwt-push-encap.c: Likewise.
* gcc.target/bpf/helper-get-socket-uid.c: Likewise.
* gcc.target/bpf/helper-get-socket-cookie.c: Likewise.
* gcc.target/bpf/helper-get-local-storage.c: Likewise.
* gcc.target/bpf/helper-get-current-cgroup-id.c: Likewise.
* gcc.target/bpf/helper-getsockopt.c: Likewise.
* gcc.target/bpf/diag-funargs-3.c: Likewise.
Kwok Cheung Yeung [Thu, 6 Aug 2020 11:52:52 +0000 (13:52 +0200)]
[testsuite] Add gcc.dg/ia64-sync-5.c
There currently is no sync_char_short-enabled run test that tests
__sync_val_compare_and_swap.
Fix this by copying ia64-sync-3.c and modifying it for char/short.
Tested on x86_64.
2020-08-06 Kwok Cheung Yeung <kcy@codesourcery.com>
Tom de Vries <tdevries@suse.de>
gcc/testsuite/ChangeLog:
* gcc.dg/ia64-sync-5.c: New test.
Michael Meissner [Fri, 7 Aug 2020 05:03:22 +0000 (01:03 -0400)]
Power10: Add BRD, BRW, and BRH support.
This patch adds support for the ISA 3.1 (power10) instructions that does a byte
swap of values in GPR registers.
gcc/
2020-08-07 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/rs6000.md (bswaphi2_reg): Add ISA 3.1 support.
(bswapsi2_reg): Add ISA 3.1 support.
(bswapdi2): Rename bswapdi2_xxbrd to bswapdi2_brd.
(bswapdi2_brd,bswapdi2_xxbrd): Rename. Add ISA 3.1 support.
gcc/testsuite/
2020-08-07 Michael Meissner <meissner@linux.ibm.com>
* gcc.target/powerpc/bswap-brd.c: New test.
* gcc.target/powerpc/bswap-brw.c: New test.
* gcc.target/powerpc/bswap-brh.c: New test.
Alan Modra [Thu, 6 Aug 2020 04:42:21 +0000 (14:12 +0930)]
PR96493, powerpc local call linkage failure
This corrects current_file_function_operand, an operand predicate used
to determine whether a symbol_ref is safe to use with the local_call
patterns. Calls between pcrel and non-pcrel code need to go via
linker stubs. In the case of non-pcrel code to pcrel the stub saves
r2 but there needs to be a nop after the branch for the r2 restore.
So the local_call patterns can't be used there. For pcrel code to
non-pcrel the local_call patterns could still be used, but I thought
it better to not use them since the call isn't direct. Code generated
by the corresponding call_nonlocal_aix for pcrel is identical anyway.
Incidentally, without the TREE_CODE () == FUNCTION_DECL test,
gcc.c-torture/compile/pr37433.c and pr37433-1.c ICE. Also, if you
make the test more strict by disallowing an op without a
SYMBOL_REF_DECL then a bunch of go and split-stack tests fail. That's
because a prologue call to __morestack can't have a following nop.
(__morestack calls its caller at a fixed offset from the __morestack
call!)
gcc/
PR target/96493
* config/rs6000/predicates.md (current_file_function_operand): Don't
accept functions that differ in r2 usage.
gcc/testsuite/
* gcc.target/powerpc/pr96493.c: New file.
GCC Administrator [Fri, 7 Aug 2020 00:16:33 +0000 (00:16 +0000)]
Daily bump.
Hans-Peter Nilsson [Thu, 6 Aug 2020 23:57:15 +0000 (01:57 +0200)]
mmix: fix gcc.dg/loop-9.c by more accurate move insns
It looks like gcc.dg/loop-9.c kind-of works as sentinel for sane
move-instruction generation for a port.
Looking at the
FAIL: gcc.dg/loop-9.c scan-rtl-dump loop2_invariant "Decided"
FAIL: gcc.dg/loop-9.c scan-rtl-dump loop2_invariant "without introducing a new temporary register"
it seems the problem is that in the loop:
for (i = 0; i < 100; i++)
a[i] = 18.4242;
the move insn corresponding to "a[i] = 18.4242" happens to be
generated as a move of a constant to a memory address, using no
registers except for the address (edited):
(insn 9 8 10 3 (set (mem:DF (reg:DI 269 [ ivtmp::9 ]))
(const_double:DF 1.
84241999e+1)) "x/loop-9.c":9:10 6 {movdf})
To wit, at the loop2 pass there's no register-initialization to move
out of the loop! The insn above isn't accurate and has to be fixed up
at register allocation time to make constraints match. While there
are insns to set memory to constant in MMIX, that's limited to 64-bit
moves corresponding to the integer bit-patterns for 0..255, and
18.4242 isn't one of them. (Only 0.0 matches; the bit-patterns for
0..255 would IIUC be interpreted as denormal floating-point numbers
a.k.a. subnormal numbers and don't seem worthwhile to handle.)
The fault is with the port, for not requiring a register for an
operand that actually requires an intermediate register, in order to
enable pre-register-allocation passes to do their job. The movdf
pattern (actually, all MMIX movM), only required the destination to be
a non-immediate operand and the source to be a general_operand,
i.e. anything-to-anything.
Better force the source to be a register, when asked to generate such
a move insn. Also, make operands stay sane by using the matching insn
condition to require one of the operands to be a register
pre-register-allocation (for sake of combine-like passes that cook up
"simplified" insns, possibly losing the use of a register). Looking
no deeper than at the results of test-runs with different variants, I
see that the latter "safety latch" has no effect on the test-results
(at
919c9d4bd3db7da0), but it just feels like the right thing to do.
Similarly, there's no effect on test-suite results, to do the same not
just for movdf but for all moves.
gcc:
* config/mmix/mmix.md (MM): New mode_iterator.
("mov<mode>"): New expander to expand for all MM-modes.
("*movqi_expanded", "*movhi_expanded", "*movsi_expanded")
("*movsf_expanded", "*movdf_expanded"): Rename from the
corresponding mov<M> named pattern. Add to the condition that
either operand must be a register_operand.
("*movdi_expanded"): Similar, but also allow STCO in the condition.
Andrew Luo [Thu, 6 Aug 2020 18:35:43 +0000 (19:35 +0100)]
libstdc++: Implement P0966 std::string::reserve should not shrink
Remove ability for reserve(n) to reduce a string's capacity. Add a new
reserve() overload that makes a shrink-to-fit request, and make
shrink_to_fit() use that.
libstdc++-v3/ChangeLog:
2020-07-30 Andrew Luo <andrewluotechnologies@outlook.com>
Jonathan Wakely <jwakely@redhat.com>
* config/abi/pre/gnu.ver (GLIBCXX_3.4): Use less greedy
patterns for basic_string members.
(GLIBCXX_3.4.29): Export new basic_string::reserve symbols.
* doc/xml/manual/status_cxx2020.xml: Update P0966 status.
* include/bits/basic_string.h (shrink_to_fit()): Call reserve().
(reserve(size_type)): Remove default argument.
(reserve()): Declare new overload.
[!_GLIBCXX_USE_CXX11_ABI] (shrink_to_fit, reserve): Likewise.
* include/bits/basic_string.tcc (reserve(size_type)): Remove
support for shrinking capacity.
(reserve()): Perform shrink-to-fit operation.
[!_GLIBCXX_USE_CXX11_ABI] (reserve): Likewise.
* testsuite/21_strings/basic_string/capacity/1.cc: Adjust to
reflect new behavior.
* testsuite/21_strings/basic_string/capacity/char/1.cc:
Likewise.
* testsuite/21_strings/basic_string/capacity/char/18654.cc:
Likewise.
* testsuite/21_strings/basic_string/capacity/char/2.cc:
Likewise.
* testsuite/21_strings/basic_string/capacity/wchar_t/1.cc:
Likewise.
* testsuite/21_strings/basic_string/capacity/wchar_t/18654.cc:
Likewise.
* testsuite/21_strings/basic_string/capacity/wchar_t/2.cc:
Likewise.
Jonathan Wakely [Thu, 6 Aug 2020 18:23:14 +0000 (19:23 +0100)]
libstdc++: Do not set eofbit eagerly in operator>>(istream&, char(&)[N])
Similar to the bugs I fixed recently in istream::ignore, we incorrectly
set eofbit too often in operator>>(istream&, string&) and
operator>>(istream&. char(&)[N]).
We should only set eofbit if we reach EOF but would have kept going
otherwise. If we've already extracted the maximum number of characters
(whether that's because of the buffer size or the istream's width())
then we should not set eofbit.
libstdc++-v3/ChangeLog:
* include/bits/basic_string.tcc
(operator>>(basic_istream&, basic_string&)): Do not set eofbit
if extraction stopped after in.width() characters.
* src/c++98/istream-string.cc (operator>>(istream&, string&)):
Likewise.
* include/bits/istream.tcc (__istream_extract): Do not set
eofbit if extraction stopped after n-1 characters.
* src/c++98/istream.cc (__istream_extract): Likewise.
* testsuite/21_strings/basic_string/inserters_extractors/char/13.cc: New test.
* testsuite/21_strings/basic_string/inserters_extractors/wchar_t/13.cc: New test.
* testsuite/27_io/basic_istream/extractors_character/char/5.cc: New test.
* testsuite/27_io/basic_istream/extractors_character/wchar_t/5.cc: New test.
Richard Sandiford [Thu, 6 Aug 2020 18:19:41 +0000 (19:19 +0100)]
arm: Clear canary value after stack_protect_test [PR96191]
The stack_protect_test patterns were leaving the canary value in the
temporary register, meaning that it was often still in registers on
return from the function. An attacker might therefore have been
able to use it to defeat stack-smash protection for a later function.
gcc/
PR target/96191
* config/arm/arm.md (arm_stack_protect_test_insn): Zero out
operand 2 after use.
* config/arm/thumb1.md (thumb1_stack_protect_test_insn): Likewise.
gcc/testsuite/
* gcc.target/arm/stack-protector-1.c: New test.
* gcc.target/arm/stack-protector-2.c: Likewise.
Jonathan Wakely [Thu, 6 Aug 2020 17:44:50 +0000 (18:44 +0100)]
libstdc++: Fix unnecessary allocations in read_symlink [PR 96484]
libstdc++-v3/ChangeLog:
PR libstdc++/96484
* src/c++17/fs_ops.cc (fs::read_symlink): Return an error
immediately for non-symlinks.
* src/filesystem/ops.cc (fs::read_symlink): Likewise.
Jonathan Wakely [Thu, 6 Aug 2020 15:16:33 +0000 (16:16 +0100)]
libstdc++: Adjust overflow prevention to operator>>
This adjusts the overflow prevention added to operator>> so that we can
distinguish "unknown size" from "zero size", and avoid writing anything
at all in to zero sized buffers.
This also removes the incorrect comment saying extraction stops at a
null byte.
libstdc++-v3/ChangeLog:
* include/std/istream (operator>>(istream&, char*)): Add
attributes to get warnings for pointers that are null or known
to point to the end of a buffer. Request upper bound from
__builtin_object_size check and handle zero-sized buffer case.
(operator>>(istream&, signed char))
(operator>>(istream&, unsigned char*)): Add attributes.
* testsuite/27_io/basic_istream/extractors_character/char/overflow.cc:
Check extracting into the middle of a buffer.
* testsuite/27_io/basic_istream/extractors_character/wchar_t/overflow.cc: New test.
Peter Bergner [Thu, 6 Aug 2020 15:03:03 +0000 (10:03 -0500)]
rs6000: Don't ICE when spilling an MMA accumulator
When we spill an accumulator that has a known zero value, LRA will emit
a new (set (reg:PXI ...) 0) insn, but it does not use the mma_xxsetaccz
pattern to do it, leading to an unrecognized insn ICE. The solution here
is to move the xxsetaccz instruction into the movpxi pattern and have the
xxsetaccz pattern call the move pattern.
2020-08-06 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/96446
* config/rs6000/mma.md (*movpxi): Add xxsetaccz generation.
Disable split for zero constant source operand.
(mma_xxsetaccz): Change to define_expand. Call gen_movpxi.
gcc/testsuite/
PR target/96446
* gcc.target/powerpc/pr96446.c: New test.
Roger Sayle [Thu, 6 Aug 2020 14:28:14 +0000 (15:28 +0100)]
x86: Restrict new gcc.target/i386/minmax-9.c test to !ia32.
As reported by Jakub Jelinek, this test fails with -m32.
Sorry for any inconvenience.
2020-08-06 Roger Sayle <roger@nextmovesoftware.com>
gcc/testsuite/ChangeLog
* gcc.target/i386/minmax-9.c: Restrict test to !ia32.
Jakub Jelinek [Thu, 6 Aug 2020 13:47:25 +0000 (15:47 +0200)]
reassoc: Improve maybe_optimize_range_tests [PR96480]
On the following testcase, if the IL before reassoc would be:
...
<bb 4> [local count:
354334800]:
if (x_3(D) == 2)
goto <bb 7>; [34.00%]
else
goto <bb 5>; [66.00%]
<bb 5> [local count:
233860967]:
if (x_3(D) == 3)
goto <bb 7>; [34.00%]
else
goto <bb 6>; [66.00%]
<bb 6> [local count:
79512730]:
<bb 7> [local count:
1073741824]:
# prephitmp_7 = PHI <1(3), 1(4), 1(5), 1(2), 0(6)>
then we'd optimize it properly, but as bb 5-7 is instead:
<bb 5> [local count:
233860967]:
if (x_3(D) == 3)
goto <bb 6>; [34.00%]
else
goto <bb 7>; [66.00%]
<bb 6> [local count:
79512730]:
<bb 7> [local count:
1073741824]:
# prephitmp_7 = PHI <1(3), 1(4), 0(5), 1(2), 1(6)>
(i.e. the true/false edges on the last bb with condition swapped
and ditto for the phi args), we don't recognize it. If bb 6
is empty, there should be no functional difference between the two IL
representations.
This patch handles those special cases.
2020-08-06 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96480
* tree-ssa-reassoc.c (suitable_cond_bb): Add TEST_SWAPPED_P argument.
If TEST_BB ends in cond and has one edge to *OTHER_BB and another
through an empty bb to that block too, if PHI args don't match, retry
them through the other path from TEST_BB.
(maybe_optimize_range_tests): Adjust callers. Handle such LAST_BB
through inversion of the condition.
* gcc.dg/tree-ssa/pr96480.c: New test.
Jose E. Marchesi [Thu, 6 Aug 2020 12:13:59 +0000 (14:13 +0200)]
bpf: more flexible support for kernel helpers
This patch changes the existing support for BPF kernel helpers to be
more flexible, in two main ways.
First, there is no longer a hardcoded list of kernel helpers defined
in the compiler internals. This is replaced by a new target-specific
attribute `kernel_helper' that the user can use to define her own
helpers, annotating function prototypes.
Second, following feedback from the kernel hackers, the pre-defined
helpers in the distributed bpf-helpers.h are no longer available
conditionally depending on the kernel version used in -mkernel. The
command-line option stays for now, as it may be useful for other
things.
Target tests and documentation updated.
2020-08-06 Jose E. Marchesi <jose.marchesi@oracle.com>
gcc/
* config/bpf/bpf-helpers.h (KERNEL_HELPER): Define.
(KERNEL_VERSION): Remove.
* config/bpf/bpf-helpers.def: Delete.
* config/bpf/bpf.c (bpf_handle_fndecl_attribute): New function.
(bpf_attribute_table): Define.
(bpf_helper_names): Delete.
(bpf_helper_code): Likewise.
(enum bpf_builtins): Adjust to new helpers mechanism.
(bpf_output_call): Likewise.
(bpf_init_builtins): Likewise.
(bpf_init_builtins): Likewise.
* doc/extend.texi (BPF Function Attributes): New section.
(BPF Kernel Helpers): Delete section.
gcc/testsuite/
* gcc.target/bpf/helper-bind.c: Adjust to new kernel helpers
mechanism.
* gcc.target/bpf/helper-bpf-redirect.c: Likewise.
* gcc.target/bpf/helper-clone-redirect.c: Likewise.
* gcc.target/bpf/helper-csum-diff.c: Likewise.
* gcc.target/bpf/helper-csum-update.c: Likewise.
* gcc.target/bpf/helper-current-task-under-cgroup.c: Likewise.
* gcc.target/bpf/helper-fib-lookup.c: Likewise.
* gcc.target/bpf/helper-get-cgroup-classid.c: Likewise.
* gcc.target/bpf/helper-get-current-cgroup-id.c: Likewise.
* gcc.target/bpf/helper-get-current-comm.c: Likewise.
* gcc.target/bpf/helper-get-current-pid-tgid.c: Likewise.
* gcc.target/bpf/helper-get-current-task.c: Likewise.
* gcc.target/bpf/helper-get-current-uid-gid.c: Likewise.
* gcc.target/bpf/helper-get-hash-recalc.c: Likewise.
* gcc.target/bpf/helper-get-listener-sock.c: Likewise.
* gcc.target/bpf/helper-get-local-storage.c: Likewise.
* gcc.target/bpf/helper-get-numa-node-id.c: Likewise.
* gcc.target/bpf/helper-get-prandom-u32.c: Likewise.
* gcc.target/bpf/helper-get-route-realm.c: Likewise.
* gcc.target/bpf/helper-get-smp-processor-id.c: Likewise.
* gcc.target/bpf/helper-get-socket-cookie.c: Likewise.
* gcc.target/bpf/helper-get-socket-uid.c: Likewise.
* gcc.target/bpf/helper-get-stack.c: Likewise.
* gcc.target/bpf/helper-get-stackid.c: Likewise.
* gcc.target/bpf/helper-getsockopt.c: Likewise.
* gcc.target/bpf/helper-ktime-get-ns.c: Likewise.
* gcc.target/bpf/helper-l3-csum-replace.c: Likewise.
* gcc.target/bpf/helper-l4-csum-replace.c: Likewise.
* gcc.target/bpf/helper-lwt-push-encap.c: Likewise.
* gcc.target/bpf/helper-lwt-seg6-action.c: Likewise.
* gcc.target/bpf/helper-lwt-seg6-adjust-srh.c: Likewise.
* gcc.target/bpf/helper-lwt-seg6-store-bytes.c: Likewise.
* gcc.target/bpf/helper-map-delete-elem.c: Likewise.
* gcc.target/bpf/helper-map-lookup-elem.c: Likewise.
* gcc.target/bpf/helper-map-peek-elem.c: Likewise.
* gcc.target/bpf/helper-map-pop-elem.c: Likewise.
* gcc.target/bpf/helper-map-push-elem.c: Likewise.
* gcc.target/bpf/helper-map-update-elem.c: Likewise.
* gcc.target/bpf/helper-msg-apply-bytes.c: Likewise.
* gcc.target/bpf/helper-msg-cork-bytes.c: Likewise.
* gcc.target/bpf/helper-msg-pop-data.c: Likewise.
* gcc.target/bpf/helper-msg-pull-data.c: Likewise.
* gcc.target/bpf/helper-msg-push-data.c: Likewise.
* gcc.target/bpf/helper-msg-redirect-hash.c: Likewise.
* gcc.target/bpf/helper-msg-redirect-map.c: Likewise.
* gcc.target/bpf/helper-override-return.c: Likewise.
* gcc.target/bpf/helper-perf-event-output.c: Likewise.
* gcc.target/bpf/helper-perf-event-read-value.c: Likewise.
* gcc.target/bpf/helper-perf-event-read.c: Likewise.
* gcc.target/bpf/helper-perf-prog-read-value.c: Likewise.
* gcc.target/bpf/helper-probe-read-str.c: Likewise.
* gcc.target/bpf/helper-probe-read.c: Likewise.
* gcc.target/bpf/helper-probe-write-user.c: Likewise.
* gcc.target/bpf/helper-rc-keydown.c: Likewise.
* gcc.target/bpf/helper-rc-pointer-rel.c: Likewise.
* gcc.target/bpf/helper-rc-repeat.c: Likewise.
* gcc.target/bpf/helper-redirect-map.c: Likewise.
* gcc.target/bpf/helper-set-hash-invalid.c: Likewise.
* gcc.target/bpf/helper-set-hash.c: Likewise.
* gcc.target/bpf/helper-setsockopt.c: Likewise.
* gcc.target/bpf/helper-sk-fullsock.c: Likewise.
* gcc.target/bpf/helper-sk-lookup-tcp.c: Likewise.
* gcc.target/bpf/helper-sk-lookup-upd.c: Likewise.
* gcc.target/bpf/helper-sk-redirect-hash.c: Likewise.
* gcc.target/bpf/helper-sk-redirect-map.c: Likewise.
* gcc.target/bpf/helper-sk-release.c: Likewise.
* gcc.target/bpf/helper-sk-select-reuseport.c: Likewise.
* gcc.target/bpf/helper-sk-storage-delete.c: Likewise.
* gcc.target/bpf/helper-sk-storage-get.c: Likewise.
* gcc.target/bpf/helper-skb-adjust-room.c: Likewise.
* gcc.target/bpf/helper-skb-cgroup-id.c: Likewise.
* gcc.target/bpf/helper-skb-change-head.c: Likewise.
* gcc.target/bpf/helper-skb-change-proto.c: Likewise.
* gcc.target/bpf/helper-skb-change-tail.c: Likewise.
* gcc.target/bpf/helper-skb-change-type.c: Likewise.
* gcc.target/bpf/helper-skb-ecn-set-ce.c: Likewise.
* gcc.target/bpf/helper-skb-get-tunnel-key.c: Likewise.
* gcc.target/bpf/helper-skb-get-tunnel-opt.c: Likewise.
* gcc.target/bpf/helper-skb-get-xfrm-state.c: Likewise.
* gcc.target/bpf/helper-skb-load-bytes-relative.c: Likewise.
* gcc.target/bpf/helper-skb-load-bytes.c: Likewise.
* gcc.target/bpf/helper-skb-pull-data.c: Likewise.
* gcc.target/bpf/helper-skb-set-tunnel-key.c: Likewise.
* gcc.target/bpf/helper-skb-set-tunnel-opt.c: Likewise.
* gcc.target/bpf/helper-skb-store-bytes.c: Likewise.
* gcc.target/bpf/helper-skb-under-cgroup.c: Likewise.
* gcc.target/bpf/helper-skb-vlan-pop.c: Likewise.
* gcc.target/bpf/helper-skb-vlan-push.c: Likewise.
* gcc.target/bpf/helper-skc-lookup-tcp.c: Likewise.
* gcc.target/bpf/helper-sock-hash-update.c: Likewise.
* gcc.target/bpf/helper-sock-map-update.c: Likewise.
* gcc.target/bpf/helper-sock-ops-cb-flags-set.c: Likewise.
* gcc.target/bpf/helper-spin-lock.c: Likewise.
* gcc.target/bpf/helper-spin-unlock.c: Likewise.
* gcc.target/bpf/helper-strtol.c: Likewise.
* gcc.target/bpf/helper-strtoul.c: Likewise.
* gcc.target/bpf/helper-sysctl-get-current-value.c: Likewise.
* gcc.target/bpf/helper-sysctl-get-name.c: Likewise.
* gcc.target/bpf/helper-sysctl-get-new-value.c: Likewise.
* gcc.target/bpf/helper-sysctl-set-new-value.c: Likewise.
* gcc.target/bpf/helper-tail-call.c: Likewise.
* gcc.target/bpf/helper-tcp-check-syncookie.c: Likewise.
* gcc.target/bpf/helper-tcp-sock.c: Likewise.
* gcc.target/bpf/helper-trace-printk.c: Likewise.
* gcc.target/bpf/helper-xdp-adjust-head.c: Likewise.
* gcc.target/bpf/helper-xdp-adjust-meta.c: Likewise.
* gcc.target/bpf/helper-xdp-adjust-tail.c: Likewise.
* gcc.target/bpf/skb-ancestor-cgroup-id.c: Likewise.
Richard Biener [Thu, 6 Aug 2020 10:18:24 +0000 (12:18 +0200)]
tree-optimization/96491 - avoid store commoning across abnormal edges
This avoids store commoning across abnormal edges since that easily
can disrupt abnormal coalescing because it might create overlapping
lifetime of variables.
2020-08-06 Richard Biener <rguenther@suse.de>
PR tree-optimization/96491
* tree-ssa-sink.c (sink_common_stores_to_bb): Avoid
sinking across abnormal edges.
* gcc.dg/torture/pr96491.c: New testcase.
Richard Biener [Thu, 6 Aug 2020 10:16:05 +0000 (12:16 +0200)]
tree-optimization/96483 - fix ICE in PRE with POLY_INT_CST
This adds a missing case for PRE expression re-materialization.
2020-08-06 Richard Biener <rguenther@suse.de>
PR tree-optimization/96483
* tree-ssa-pre.c (create_component_ref_by_pieces_1): Handle
POLY_INT_CST.
Richard Biener [Thu, 6 Aug 2020 08:18:10 +0000 (10:18 +0200)]
Remove std::map use from graphite
This replaces the use of std::map with hash_map for mapping
ISL ids to SSA names.
2020-08-06 Richard Biener <rguenther@suse.de>
* graphite-isl-ast-to-gimple.c (ivs_params): Use hash_map instead
of std::map.
(ivs_params_clear): Adjust.
(gcc_expression_from_isl_ast_expr_id): Likewise.
(graphite_create_new_loop): Likewise.
(add_parameters_to_ivs_params): Likewise.
Roger Sayle [Thu, 6 Aug 2020 08:15:25 +0000 (09:15 +0100)]
x86_64: Integer min/max improvements.
This patch tweaks the way that min and max are expanded, so that the
semantics of these operations are visible to the early RTL optimization
passes, until split into explicit comparison and conditional move
instructions. The good news is that i386.md already contains all of
the required logic (many thanks to Richard Biener and Uros Bizjak),
but this is currently only enabled to scalar-to-vector (STV) synthesis
of min/max instructions. This change enables this functionality for
all TARGET_CMOVE architectures for SImode, HImode and DImode.
2020-08-06 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
* config/i386/i386.md (MAXMIN_IMODE): No longer needed.
(<maxmin><mode>3): Support SWI248 and general_operand for
second operand, when TARGET_CMOVE.
(<maxmin><mode>3_1 splitter): Optimize comparisons against
0, 1 and -1 to use "test" instead of "cmp".
(*<maxmin>di3_doubleword): Likewise, allow general_operand
and enable on TARGET_CMOVE.
(peephole2): Convert clearing a register after a flag setting
instruction into an xor followed by the original flag setter.
gcc/testsuite/ChangeLog
* gcc.target/i386/minmax-8.c: New test.
* gcc.target/i386/minmax-9.c: New test.
* gcc.target/i386/minmax-10.c: New test.
* gcc.target/i386/minmax-11.c: New test.
Gerald Pfeifer [Thu, 6 Aug 2020 07:02:15 +0000 (09:02 +0200)]
ipa-fnsummary: Include <vector> the proper way
This fixes a bootstrap error with clang 10 that would complain
/usr/include/c++/v1/typeinfo:346:5: error: no member named
'fancy_abort' in namespace 'std::__1'; did you mean simply
'fancy_abort'?
It mirrors how this is handled in gcov.c and indirectly includes
<vector> via system.h.
gcc/ChangeLog:
* ipa-fnsummary.c (INCLUDE_VECTOR): Define.
Remove direct inclusion of <vector>.
Kewen Lin [Wed, 29 Jul 2020 06:38:39 +0000 (14:38 +0800)]
vect/rs6000: Support vector with length cost modeling
This patch is to add the cost modeling for vector with length,
it mainly follows what we generate for vector with length in
functions vect_set_loop_controls_directly and vect_gen_len
at the worst case.
For Power, the length is expected to be in bits 0-7 (high bits),
we have to model the cost of shifting bits, which is implemented
in adjust_vect_cost_per_loop.
Bootstrapped/regtested on powerpc64le-linux-gnu (P9) with explicit
param vect-partial-vector-usage=1.
gcc/ChangeLog:
* config/rs6000/rs6000.c (rs6000_adjust_vect_cost_per_loop): New
function.
(rs6000_finish_cost): Call rs6000_adjust_vect_cost_per_loop.
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Add cost
modeling for vector with length.
(vect_rgroup_iv_might_wrap_p): New function, factored out from...
* tree-vect-loop-manip.c (vect_set_loop_controls_directly): ...this.
Update function comment.
* tree-vect-stmts.c (vect_gen_len): Update function comment.
* tree-vectorizer.h (vect_rgroup_iv_might_wrap_p): New declare.
Kewen Lin [Wed, 5 Aug 2020 07:50:49 +0000 (02:50 -0500)]
vect: Skip epilogue loops for dbgcnt check [PR96451]
As the PR shows, commit r11-2453 exposed one issue that vectorizer
wants to vectorize the epilogue loop and leaves the if-cvt body there,
but later dbgcnt check disables it, the left scalar mask_store
statement causes ICE.
As Richard pointed out in that PR, the dbgcnt is to count original
scalar loops, so this fix is to make it skip the epilogue loops.
gcc/ChangeLog:
* tree-vectorizer.c (try_vectorize_loop_1): Skip the epilogue loops
for dbgcnt check.
GCC Administrator [Thu, 6 Aug 2020 00:16:26 +0000 (00:16 +0000)]
Daily bump.
Jonathan Wakely [Wed, 5 Aug 2020 21:48:17 +0000 (22:48 +0100)]
libstdc++: Break long lines to fit in 80 columns
libstdc++-v3/ChangeLog:
* include/std/atomic (atomic<T>::store): Reformat.
Jonathan Wakely [Wed, 5 Aug 2020 21:45:04 +0000 (22:45 +0100)]
libstdc++: Change URL for PSTL again
libstdc++-v3/ChangeLog:
* doc/xml/manual/status_cxx2017.xml: Replace oneAPI DPC++ link
with LLVM repo for PSTL.
* doc/html/manual/status.html: Regenerate.
Jonathan Wakely [Wed, 5 Aug 2020 21:17:18 +0000 (22:17 +0100)]
libstdc++: Replace operator>>(istream&, char*) [LWG 2499]
P0487R1 resolved LWG 2499 for C++20 by removing the operator>> overloads
that have high risk of buffer overflows. They were replaced by
equivalents that only accept a reference to an array, and so can
guarantee not to write past the end of the array.
In order to support both the old and new functionality, this patch
introduces a new overloaded __istream_extract function which takes a
maximum length. The new operator>> overloads use the array size as the
maximum length. The old overloads now use __builtin_object_size to
determine the available buffer size if available (which requires -O2) or
use numeric_limits<streamsize>::max()/sizeof(char_type) otherwise. This
is a change in behaviour, as the old overloads previously always used
numeric_limits<streamsize>::max(), without considering sizeof(char_type)
and without attempting to prevent overflows.
Because they now do little more than call __istream_extract, the old
operator>> overloads are very small inline functions. This means there
is no advantage to explicitly instantiating them in the library (in fact
that would prevent the __builtin_object_size checks from ever working).
As a result, the explicit instantiation declarations can be removed from
the header. The explicit instantiation definitions are still needed, for
backwards compatibility with existing code that expects to link to the
definitions in the library.
While working on this change I noticed that src/c++11/istream-inst.cc
has the following explicit instantiation definition:
template istream& operator>>(istream&, char*);
This had no effect (and so should not have been present in that file),
because there was an explicit specialization declared in <istream> and
defined in src/++98/istream.cc. However, this change removes the
explicit specialization, and now the explicit instantiation definition
is necessary to ensure the symbol gets defined in the library.
libstdc++-v3/ChangeLog:
* config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Export new symbols.
* include/bits/istream.tcc (__istream_extract): New function
template implementing both of operator>>(istream&, char*) and
operator>>(istream&, char(&)[N]). Add explicit instantiation
declaration for it. Remove explicit instantiation declarations
for old function templates.
* include/std/istream (__istream_extract): Declare.
(operator>>(basic_istream<C,T>&, C*)): Define inline and simply
call __istream_extract.
(operator>>(basic_istream<char,T>&, signed char*)): Likewise.
(operator>>(basic_istream<char,T>&, unsigned char*)): Likewise.
(operator>>(basic_istream<C,T>&, C(7)[N])): Define for LWG 2499.
(operator>>(basic_istream<char,T>&, signed char(&)[N])):
Likewise.
(operator>>(basic_istream<char,T>&, unsigned char(&)[N])):
Likewise.
* include/std/streambuf (basic_streambuf): Declare char overload
of __istream_extract as a friend.
* src/c++11/istream-inst.cc: Add explicit instantiation
definition for wchar_t overload of __istream_extract. Remove
explicit instantiation definitions of old operator>> overloads
for versioned-namespace build.
* src/c++98/istream.cc (operator>>(istream&, char*)): Replace
with __istream_extract(istream&, char*, streamsize).
* testsuite/27_io/basic_istream/extractors_character/char/3.cc:
Do not use variable-length array.
* testsuite/27_io/basic_istream/extractors_character/char/4.cc:
Do not run test for C++20.
* testsuite/27_io/basic_istream/extractors_character/char/9555-ic.cc:
Do not test writing to pointers for C++20.
* testsuite/27_io/basic_istream/extractors_character/char/9826.cc:
Use array instead of pointer.
* testsuite/27_io/basic_istream/extractors_character/wchar_t/3.cc:
Do not use variable-length array.
* testsuite/27_io/basic_istream/extractors_character/wchar_t/4.cc:
Do not run test for C++20.
* testsuite/27_io/basic_istream/extractors_character/wchar_t/9555-ic.cc:
Do not test writing to pointers for C++20.
* testsuite/27_io/basic_istream/extractors_character/char/lwg2499.cc:
New test.
* testsuite/27_io/basic_istream/extractors_character/char/lwg2499_neg.cc:
New test.
* testsuite/27_io/basic_istream/extractors_character/char/overflow.cc:
New test.
* testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499.cc:
New test.
* testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499_neg.cc:
New test.
Patrick Palka [Wed, 5 Aug 2020 19:05:30 +0000 (15:05 -0400)]
c++: cxx_eval_vec_init after zero-initialization [PR96282]
In the first testcase below, expand_aggr_init_1 sets up t's default
constructor such that the ctor first zero-initializes the entire base b,
followed by calling b's default constructor, the latter of which just
default-initializes the array member b::m via a VEC_INIT_EXPR.
So upon constexpr evaluation of this latter VEC_INIT_EXPR, ctx->ctor is
nonempty due to the prior zero-initialization, and we proceed in
cxx_eval_vec_init to append new constructor_elts to the end of ctx->ctor
without first checking if a matching constructor_elt already exists.
This leads to ctx->ctor having two matching constructor_elts for each
index.
This patch fixes this issue by truncating a zero-initialized array
CONSTRUCTOR in cxx_eval_vec_init_1 before we begin appending array
elements to it. We propagate its zeroed out state during evaluation by
clearing CONSTRUCTOR_NO_CLEARING on each new appended aggregate element.
gcc/cp/ChangeLog:
PR c++/96282
* constexpr.c (cxx_eval_vec_init_1): Truncate ctx->ctor and
then clear CONSTRUCTOR_NO_CLEARING on each appended element
initializer if we're initializing a previously zero-initialized
array object.
gcc/testsuite/ChangeLog:
PR c++/96282
* g++.dg/cpp0x/constexpr-array26.C: New test.
* g++.dg/cpp0x/constexpr-array27.C: New test.
* g++.dg/cpp2a/constexpr-init18.C: New test.
Co-authored-by: Jason Merrill <jason@redhat.com>
Thomas Koenig [Wed, 5 Aug 2020 18:53:44 +0000 (20:53 +0200)]
Added test case to make sure that legal cases still pass.
gcc/testsuite/ChangeLog:
PR fortran/96469
* gfortran.dg/do_check_14.f90: New test.
Thomas Koenig [Wed, 5 Aug 2020 16:37:32 +0000 (18:37 +0200)]
Static analysis for definition of DO index variables in contained procedures.
When encountering a procedure call in a DO loop, this patch checks if
the call is to a contained procedure, and if it is, check for
changes in the index variable.
gcc/fortran/ChangeLog:
PR fortran/96469
* frontend-passes.c (doloop_contained_function_call): New
function.
(doloop_contained_procedure_code): New function.
(CHECK_INQ): Macro for inquire checks.
(doloop_code): Invoke doloop_contained_procedure_code and
doloop_contained_function_call if appropriate.
(do_intent): Likewise.
gcc/testsuite/ChangeLog:
PR fortran/96469
* gfortran.dg/do_check_4.f90: Hide change in index variable
from compile-time analysis.
* gfortran.dg/do_check_13.f90: New test.
Marc Glisse [Wed, 5 Aug 2020 14:45:33 +0000 (16:45 +0200)]
VEC_COND_EXPR optimizations
When vector comparisons were forced to use vec_cond_expr, we lost a number of optimizations (my fault for not adding enough testcases to
prevent that). This patch tries to unwrap vec_cond_expr a bit so some optimizations can still happen.
I wasn't planning to add all those transformations together, but adding one caused a regression, whose fix introduced a second regression,
etc.
Restricting to constant folding would not be sufficient, we also need at least things like X|0 or X&X. The transformations are quite
conservative with :s and folding only if everything simplifies, we may want to relax this later. And of course we are going to miss things
like a?b:c + a?c:b -> b+c.
In terms of number of operations, some transformations turning 2 VEC_COND_EXPR into VEC_COND_EXPR + BIT_IOR_EXPR + BIT_NOT_EXPR might not look
like a gain... I expect the bit_not disappears in most cases, and VEC_COND_EXPR looks more costly than a simpler BIT_IOR_EXPR.
2020-08-05 Marc Glisse <marc.glisse@inria.fr>
PR tree-optimization/95906
PR target/70314
* match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e),
(v ? w : 0) ? a : b, c1 ? c2 ? a : b : b): New transformations.
(op (c ? a : b)): Update to match the new transformations.
* gcc.dg/tree-ssa/andnot-2.c: New file.
* gcc.dg/tree-ssa/pr95906.c: Likewise.
* gcc.target/i386/pr70314.c: Likewise.
Richard Sandiford [Wed, 5 Aug 2020 14:18:36 +0000 (15:18 +0100)]
aarch64: Clear canary value after stack_protect_test [PR96191]
The stack_protect_test patterns were leaving the canary value in the
temporary register, meaning that it was often still in registers on
return from the function. An attacker might therefore have been
able to use it to defeat stack-smash protection for a later function.
gcc/
PR target/96191
* config/aarch64/aarch64.md (stack_protect_test_<mode>): Set the
CC register directly, instead of a GPR. Replace the original GPR
destination with an extra scratch register. Zero out operand 3
after use.
(stack_protect_test): Update accordingly.
gcc/testsuite/
PR target/96191
* gcc.target/aarch64/stack-protector-1.c: New test.
* gcc.target/aarch64/stack-protector-2.c: Likewise.
Richard Sandiford [Wed, 5 Aug 2020 13:49:32 +0000 (14:49 +0100)]
aarch64: Add missing %z prefixes to LDP/STP patterns
For LDP/STP Q, the memory operand might not be valid for "m",
so we need to use %z<N> instead of %<N> in the asm template.
This patch does that for all Ump LDP/STP patterns, regardless
of whether it's strictly needed.
This is needed to unbreak bootstrap.
2020-08-05 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.md (load_pair_sw_<SX:mode><SX2:mode>)
(load_pair_dw_<DX:mode><DX2:mode>, load_pair_dw_tftf)
(store_pair_sw_<SX:mode><SX2:mode>)
(store_pair_dw_<DX:mode><DX2:mode>, store_pair_dw_tftf)
(*load_pair_extendsidi2_aarch64)
(*load_pair_zero_extendsidi2_aarch64): Use %z for the memory operand.
* config/aarch64/aarch64-simd.md (load_pair<DREG:mode><DREG2:mode>)
(vec_store_pair<DREG:mode><DREG2:mode>, load_pair<VQ:mode><VQ2:mode>)
(vec_store_pair<VQ:mode><VQ2:mode>): Likewise.
Richard Biener [Wed, 5 Aug 2020 10:04:59 +0000 (12:04 +0200)]
refactor LIM a bit
This refactors LIM to eschew alloc_aux_for_edges and re-uses the RPO
order of the move_computations walk for invariantness computation as well.
It also removes one unnecessary sorting (but retaining it as checking
code because we bsearch the vector) and moves edge insert commit code
to the place where it doesn't have to scan all the functions edges.
This was all done when investigating whether LIM can be refactored
to work on a specific loop for on-demand processing (but we're not
there yet).
2020-08-05 Richard Biener <rguenther@suse.de>
* tree-ssa-loop-im.c (invariantness_dom_walker): Remove.
(invariantness_dom_walker::before_dom_children): Move to ...
(compute_invariantness): ... this function.
(move_computations): Inline ...
(tree_ssa_lim): ... here, share RPO order and avoid some
cfun references.
(analyze_memory_references): Remove sorting of location
lists, instead assert they are sorted already when checking.
(prev_flag_edges): Remove.
(execute_sm_if_changed): Pass down and adjust prev edge state.
(execute_sm_exit): Likewise.
(hoist_memory_references): Likewise. Commit edge insertions
of each processed exit.
(store_motion_loop): Do not commit edge insertions on all
edges in the function.
(tree_ssa_lim_initialize): Do not call alloc_aux_for_edges.
(tree_ssa_lim_finalize): Do not call free_aux_for_edges.
Richard Biener [Wed, 5 Aug 2020 10:00:07 +0000 (12:00 +0200)]
Make genmatch transform failure handling more consistent
Currently whether a fail during the transform stage is fatal or
whether following patterns are still considers is a bit random
depending on whether the pattern is wrapped in a for for example.
The follwing makes it consistent by replacing early returns with
gotos to the end of the pattern processing.
2020-08-05 Richard Biener <rguenther@suse.de>
* genmatch.c (fail_label): New global.
(expr::gen_transform): Branch to fail_label instead of
returning. Fix indent of call argument checking.
(dt_simplify::gen_1): Compute and emit fail_label, branch
to it instead of returning early.
Jakub Jelinek [Wed, 5 Aug 2020 08:45:16 +0000 (10:45 +0200)]
openmp: Handle even some combined non-rectangular loops
The number of loops computation and logical iteration -> actual iterator values
computations can now be done separately even on composite constructs (though
for triangular loops it would still be more efficient to propagate a few values
through, will handle that incrementally).
simd and taskloop are still unhandled.
2020-08-05 Jakub Jelinek <jakub@redhat.com>
* omp-expand.c (expand_omp_for): Don't disallow combined non-rectangular
loops.
* testsuite/libgomp.c/loop-22.c: New test.
* testsuite/libgomp.c/loop-23.c: New test.
Jakub Jelinek [Wed, 5 Aug 2020 08:40:10 +0000 (10:40 +0200)]
openmp: Handle reduction clauses on host teams construct [PR96459]
As the new testcase shows, we weren't actually performing reductions on
host teams construct. And fixing that revealed a flaw in the for-14.c testcase.
The problem is that the tests perform also initialization and checking around the
calls to the functions with the OpenMP constructs. In that testcase, all the
tests have been spawned from a teams construct but only the tested loops were
distribute, which means the initialization and checking has been performed
redundantly and racily in each team. Fixed by performing the initialization
and checking outside of host teams and only do the calls to functions with
the tested constructs inside of host teams.
2020-08-05 Jakub Jelinek <jakub@redhat.com>
PR middle-end/96459
* omp-low.c (lower_omp_taskreg): Call lower_reduction_clauses even in
for host teams.
* testsuite/libgomp.c/teams-3.c: New test.
* testsuite/libgomp.c-c++-common/for-2.h (OMPTEAMS): Define to nothing
if not defined yet.
(N(test)): Use it before all N(f*) calls.
* testsuite/libgomp.c-c++-common/for-14.c (DO_PRAGMA, OMPTEAMS): Define.
(main): Don't call all test_* functions from within
#pragma omp teams reduction(|:err), call them directly.
Jakub Jelinek [Wed, 5 Aug 2020 08:37:25 +0000 (10:37 +0200)]
openmp: Use more efficient logical -> actual computation even if # iterations is computed at runtime
For triangular loops use more efficient logical iteration number
to actual iterator values computation even for non-rectangular loops
where number of loop iterations could not be computed at compile time.
2020-08-05 Jakub Jelinek <jakub@redhat.com>
* omp-expand.c (expand_omp_for_init_counts): Remember
first_inner_iterations, factor and n1o from the number of iterations
computation in *fd.
(expand_omp_for_init_vars): Use more efficient logical iteration number
to actual iterator values computation even for non-rectangular loops
where number of loop iterations could not be computed at compile time.
Carl Love [Fri, 12 Jun 2020 15:35:31 +0000 (10:35 -0500)]
rs6000 Add vector blend, permute builtin support
GCC maintainers:
The following patch adds support for the vec_blendv and vec_permx
builtins.
The patch has been compiled and tested on
powerpc64le-unknown-linux-gnu (Power 8 LE)
powerpc64le-unknown-linux-gnu (Power 9 LE)
with no regression errors.
The test cases were compiled on a Power 9 system and then tested on
Mambo.
Carl Love
rs6000 RFC2609 vector blend, permute instructions
gcc/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.h (vec_blendv, vec_permx): Add define.
* config/rs6000/altivec.md (UNSPEC_XXBLEND, UNSPEC_XXPERMX.): New
unspecs.
(VM3): New define_mode.
(VM3_char): New define_attr.
(xxblend_<mode> mode VM3): New define_insn.
(xxpermx): New define_expand.
(xxpermx_inst): New define_insn.
* config/rs6000/rs6000-builtin.def (VXXBLEND_V16QI, VXXBLEND_V8HI,
VXXBLEND_V4SI, VXXBLEND_V2DI, VXXBLEND_V4SF, VXXBLEND_V2DF): New
BU_P10V_3 definitions.
(XXBLEND): New BU_P10_OVERLOAD_3 definition.
(XXPERMX): New BU_P10_OVERLOAD_4 definition.
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
(P10_BUILTIN_VXXPERMX): Add if statement.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VXXBLEND_V16QI,
P10_BUILTIN_VXXBLEND_V8HI, P10_BUILTIN_VXXBLEND_V4SI,
P10_BUILTIN_VXXBLEND_V2DI, P10_BUILTIN_VXXBLEND_V4SF,
P10_BUILTIN_VXXBLEND_V2DF, P10_BUILTIN_VXXPERMX): Define
overloaded arguments.
(rs6000_expand_quaternop_builtin): Add if case for CODE_FOR_xxpermx.
(builtin_quaternary_function_type): Add v16uqi_type and xxpermx_type
variables, add case statement for P10_BUILTIN_VXXPERMX.
(builtin_function_type): Add case statements for
P10_BUILTIN_VXXBLEND_V16QI, P10_BUILTIN_VXXBLEND_V8HI,
P10_BUILTIN_VXXBLEND_V4SI, P10_BUILTIN_VXXBLEND_V2DI.
* doc/extend.texi: Add documentation for vec_blendv and vec_permx.
gcc/testsuite/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/vec-blend-runnable.c: New test.
* gcc.target/powerpc/vec-permute-ext-runnable.c: New test.
Carl Love [Wed, 27 May 2020 15:07:44 +0000 (10:07 -0500)]
rs6000, Add vector splat builtin support
GCC maintainers:
The following patch adds support for the vec_splati, vec_splatid and
vec_splati_ins builtins.
This patch adds support for instructions that take a 32-bit immediate
value that represents a floating point value. This support adds new
predicates and a support function to properly handle the immediate value.
The patch has been compiled and tested on
powerpc64le-unknown-linux-gnu (Power 8 LE)
powerpc64le-unknown-linux-gnu (Power 9 LE)
with no regression errors.
The test case was compiled on a Power 9 system and then tested on
Mambo.
Please let me know if this patch is acceptable for the mainline
branch. Thanks.
Carl Love
--------------------------------------------------------
gcc/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.h (vec_splati, vec_splatid, vec_splati_ins):
Add defines.
* config/rs6000/altivec.md (UNSPEC_XXSPLTIW, UNSPEC_XXSPLTID,
UNSPEC_XXSPLTI32DX): New.
(vxxspltiw_v4si, vxxspltiw_v4sf_inst, vxxspltidp_v2df_inst,
vxxsplti32dx_v4si_inst, vxxsplti32dx_v4sf_inst): New define_insn.
(vxxspltiw_v4sf, vxxspltidp_v2df, vxxsplti32dx_v4si,
vxxsplti32dx_v4sf.): New define_expands.
* config/rs6000/predicates.md (u1bit_cint_operand,
s32bit_cint_operand, c32bit_cint_operand): New predicates.
* config/rs6000/rs6000-builtin.def (VXXSPLTIW_V4SI, VXXSPLTIW_V4SF,
VXXSPLTID): New definitions.
(VXXSPLTI32DX_V4SI, VXXSPLTI32DX_V4SF): New BU_P10V_3
definitions.
(XXSPLTIW, XXSPLTID): New definitions.
(XXSPLTI32DX): Add definitions.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_XXSPLTIW,
P10_BUILTIN_VEC_XXSPLTID, P10_BUILTIN_VEC_XXSPLTI32DX):
New definitions.
* config/rs6000/rs6000-protos.h (rs6000_constF32toI32): New extern
declaration.
* config/rs6000/rs6000.c (rs6000_constF32toI32): New function.
* doc/extend.texi: Add documentation for vec_splati,
vec_splatid, and vec_splati_ins.
gcc/testsuite/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/vec-splati-runnable.c: New test.
Carl Love [Wed, 27 May 2020 03:44:50 +0000 (22:44 -0500)]
rs6000, Add vector shift double builtin support
GCC maintainers:
The following patch adds support for the vector shift double builtins.
The patch has been compiled and tested on
powerpc64le-unknown-linux-gnu (Power 8 LE)
powerpc64le-unknown-linux-gnu (Power 9 LE)
and Mambo with no regression errors.
Please let me know if this patch is acceptable for the mainline branch.
Thanks.
Carl Love
-------------------------------------------------------
gcc/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.h (vec_sldb, vec_srdb): New defines.
* config/rs6000/altivec.md (UNSPEC_SLDB, UNSPEC_SRDB): New.
(SLDB_lr): New attribute.
(VSHIFT_DBL_LR): New iterator.
(vs<SLDB_lr>db_<mode>): New define_insn.
* config/rs6000/rs6000-builtin.def (VSLDB_V16QI, VSLDB_V8HI,
VSLDB_V4SI, VSLDB_V2DI, VSRDB_V16QI, VSRDB_V8HI, VSRDB_V4SI,
VSRDB_V2DI): New BU_P10V_3 definitions.
(SLDB, SRDB): New BU_P10_OVERLOAD_3 definitions.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_SLDB,
P10_BUILTIN_VEC_SRDB): New definitions.
(rs6000_expand_ternop_builtin) [CODE_FOR_vsldb_v16qi,
CODE_FOR_vsldb_v8hi, CODE_FOR_vsldb_v4si, CODE_FOR_vsldb_v2di,
CODE_FOR_vsrdb_v16qi, CODE_FOR_vsrdb_v8hi, CODE_FOR_vsrdb_v4si,
CODE_FOR_vsrdb_v2di]: Add clauses.
* doc/extend.texi: Add description for vec_sldb and vec_srdb.
gcc/testsuite/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/vec-shift-double-runnable.c: New test file.
Carl Love [Mon, 15 Jun 2020 22:44:19 +0000 (17:44 -0500)]
rs6000, Add vector replace builtin support GCC maintainers:
The following patch adds support for builtins vec_replace_elt and
vec_replace_unaligned.
The patch has been compiled and tested on
powerpc64le-unknown-linux-gnu (Power 8 LE)
powerpc64le-unknown-linux-gnu (Power 9 LE)
and mambo with no regression errors.
Please let me know if this patch is acceptable for the mainline
branch. Thanks.
Carl Love
-------------------------------------------------------
gcc/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.h: Add define for vec_replace_elt and
vec_replace_unaligned.
* config/rs6000/vsx.md (UNSPEC_REPLACE_ELT, UNSPEC_REPLACE_UN): New
unspecs.
(REPLACE_ELT): New mode iterator.
(REPLACE_ELT_char, REPLACE_ELT_sh, REPLACE_ELT_max): New mode attributes.
(vreplace_un_<mode>, vreplace_elt_<mode>_inst): New.
* config/rs6000/rs6000-builtin.def (VREPLACE_ELT_V4SI,
VREPLACE_ELT_UV4SI, VREPLACE_ELT_V4SF, VREPLACE_ELT_UV2DI,
VREPLACE_ELT_V2DF, VREPLACE_UN_V4SI, VREPLACE_UN_UV4SI,
VREPLACE_UN_V4SF, VREPLACE_UN_V2DI, VREPLACE_UN_UV2DI,
VREPLACE_UN_V2DF, (REPLACE_ELT, REPLACE_UN, VREPLACE_ELT_V2DI): New builtin
entries.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_REPLACE_ELT,
P10_BUILTIN_VEC_REPLACE_UN): New builtin argument definitions.
(rs6000_expand_quaternop_builtin): Add 3rd argument checks for
CODE_FOR_vreplace_elt_v4si, CODE_FOR_vreplace_elt_v4sf,
CODE_FOR_vreplace_un_v4si, CODE_FOR_vreplace_un_v4sf.
(builtin_function_type) [P10_BUILTIN_VREPLACE_ELT_UV4SI,
P10_BUILTIN_VREPLACE_ELT_UV2DI, P10_BUILTIN_VREPLACE_UN_UV4SI,
P10_BUILTIN_VREPLACE_UN_UV2DI]: New cases.
* doc/extend.texi: Add description for vec_replace_elt and
vec_replace_unaligned builtins.
gcc/testsuite/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/vec-replace-word-runnable.c: New test.
Carl Love [Tue, 26 May 2020 19:13:29 +0000 (14:13 -0500)]
rs6000 Add vector insert builtin support
GCC maintainers:
This patch adds support for vec_insertl and vec_inserth builtins.
The patch has been compiled and tested on
powerpc64le-unknown-linux-gnu (Power 8 LE)
powerpc64le-unknown-linux-gnu (Power 9 LE)
and mambo with no regression errors.
Please let me know if this patch is acceptable for the mainline branch.
Thanks.
Carl Love
--------------------------------------------------------------
gcc/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.h (vec_insertl, vec_inserth): New defines.
* config/rs6000/rs6000-builtin.def (VINSERTGPRBL, VINSERTGPRHL,
VINSERTGPRWL, VINSERTGPRDL, VINSERTVPRBL, VINSERTVPRHL, VINSERTVPRWL,
VINSERTGPRBR, VINSERTGPRHR, VINSERTGPRWR, VINSERTGPRDR, VINSERTVPRBR,
VINSERTVPRHR, VINSERTVPRWR): New builtins.
(INSERTL, INSERTH): New builtins.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_INSERTL,
P10_BUILTIN_VEC_INSERTH): New overloaded definitions.
(P10_BUILTIN_VINSERTGPRBL, P10_BUILTIN_VINSERTGPRHL,
P10_BUILTIN_VINSERTGPRWL, P10_BUILTIN_VINSERTGPRDL,
P10_BUILTIN_VINSERTVPRBL, P10_BUILTIN_VINSERTVPRHL,
P10_BUILTIN_VINSERTVPRWL): Add case entries.
* config/rs6000/vsx.md (define_c_enum): Add UNSPEC_INSERTL,
UNSPEC_INSERTR.
(define_expand): Add vinsertvl_<mode>, vinsertvr_<mode>,
vinsertgl_<mode>, vinsertgr_<mode>, mode is VI2.
(define_ins): vinsertvl_internal_<mode>, vinsertvr_internal_<mode>,
vinsertgl_internal_<mode>, vinsertgr_internal_<mode>, mode VEC_I.
* doc/extend.texi: Add documentation for vec_insertl, vec_inserth.
gcc/testsuite/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/vec-insert-word-runnable.c: New test case.
Carl Love [Thu, 2 Jul 2020 20:30:30 +0000 (15:30 -0500)]
rs6000, Update support for vec_extract
GCC maintainers:
Move the existing vector extract support in altivec.md to vsx.md
so all of the vector insert and extract support is in the same file.
The patch also updates the name of the builtins and descriptions for the
builtins in the documentation file so they match the approved builtin
names and descriptions.
The patch does not make any functional changes.
Please let me know if the changes are acceptable for mainline. Thanks.
Carl Love
------------------------------------------------------
gcc/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
(vextractl<mode>, vextractr<mode>)
(vextractl<mode>_internal, vextractr<mode>_internal for mode VI2)
(VI2): Move to ...
* config/rs6000/vsx.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
(vextractl<mode>, vextractr<mode>)
(vextractl<mode>_internal, vextractr<mode>_internal for mode VI2)
(VI2): ..here.
* doc/extend.texi: Update documentation for vec_extractl.
Replace builtin name vec_extractr with vec_extracth. Update
description of vec_extracth.
GCC Administrator [Wed, 5 Aug 2020 00:16:39 +0000 (00:16 +0000)]
Daily bump.
Jim Wilson [Sun, 12 Jul 2020 23:48:24 +0000 (16:48 -0700)]
aarch64: Delete duplicated option docs.
Noticed while reviewing the RISC-V -mstack-protector-guard docs. The
AArch64 section has two identical copies of the docs for this option.
gcc/
* doc/invoke.texi (AArch64 Options): Delete duplicate
-mstack-protector-guard docs.
Roger Sayle [Tue, 4 Aug 2020 13:31:42 +0000 (15:31 +0200)]
[PATCH] nvptx: Add support for PTX highpart multiplications (HI/SI)
This patch adds support for signed and unsigned, HImode and SImode highpart
multiplications to the nvptx backend.
This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
with a "make" and "make -k check" with no new failures with the
above patch.
2020-08-04 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog:
* config/nvptx/nvptx.md (smulhi3_highpart, smulsi3_highpart)
(umulhi3_highpart, umulsi3_highpart): New instructions.
gcc/testsuite/ChangeLog:
* gcc.target/nvptx/mul-hi.c: New test.
* gcc.target/nvptx/umul-hi.c: New test.
Marek Polacek [Tue, 4 Aug 2020 13:35:25 +0000 (09:35 -0400)]
c++: Template keyword following :: [PR96082]
In r9-4235 I tried to make sure that the template keyword follows
a nested-name-specifier. :: is a valid nested-name-specifier, so
I also have to check 'globalscope' before giving the error.
gcc/cp/ChangeLog:
PR c++/96082
* parser.c (cp_parser_elaborated_type_specifier): Allow
'template' following ::.
gcc/testsuite/ChangeLog:
PR c++/96082
* g++.dg/template/template-keyword3.C: New test.
Ian Lance Taylor [Tue, 4 Aug 2020 01:23:39 +0000 (18:23 -0700)]
compiler: delete lowered constant strings
If we lower a constant string operation in a Binary_expression,
delete the strings. This is safe because constant strings are always
newly allocated.
This is a hack to use much less memory when compiling the new
time/tzdata package, which has a file that contains the sum of over
13,000 constant strings. We don't do this for numeric expressions
because that could cause us to delete an Iota_expression.
We should have a cleaner approach to memory usage some day.
Fixes PR go/96450
Andrew Stubbs [Tue, 4 Aug 2020 16:50:38 +0000 (17:50 +0100)]
amdgcn: Remove dead defines from gcn-run
Nothing uses these since the switch to HSACOv3.
gcc/ChangeLog:
* config/gcn/gcn-run.c (R_AMDGPU_NONE): Delete.
(R_AMDGPU_ABS32_LO): Delete.
(R_AMDGPU_ABS32_HI): Delete.
(R_AMDGPU_ABS64): Delete.
(R_AMDGPU_REL32): Delete.
(R_AMDGPU_REL64): Delete.
(R_AMDGPU_ABS32): Delete.
(R_AMDGPU_GOTPCREL): Delete.
(R_AMDGPU_GOTPCREL32_LO): Delete.
(R_AMDGPU_GOTPCREL32_HI): Delete.
(R_AMDGPU_REL32_LO): Delete.
(R_AMDGPU_REL32_HI): Delete.
(reserved): Delete.
(R_AMDGPU_RELATIVE64): Delete.
Omar Tahir [Tue, 4 Aug 2020 16:35:18 +0000 (17:35 +0100)]
[Arm] Modify default tuning of armv8.1-m.main to use Cortex-M55
Previously, compiling with -march=armv8.1-m.main would tune for
Cortex-M7.
However, the Cortex-M7 only supports up to Armv7e-M. The Cortex-M55 is
the earliest CPU that supports Armv8.1-M Mainline so is more appropriate.
This also has the effect of changing the branch cost function used, which
will be necessary to correctly prioritise conditional instructions over branches
in the rest of this patch series.
Regression tested on arm-none-eabi.
gcc/ChangeLog
2020-08-04 Omar Tahir <omar.tahir@arm.com>
* config/arm/arm-cpus.in (armv8.1-m.main): Tune for Cortex-M55.
Hu Jiangping [Tue, 4 Aug 2020 16:35:12 +0000 (17:35 +0100)]
aarch64: Delete unnecessary code
gcc/
* config/aarch64/aarch64.c (aarch64_if_then_else_costs): Delete
redundant extra_cost variable.
Nathan Sidwell [Tue, 4 Aug 2020 16:24:02 +0000 (09:24 -0700)]
c++: fix template parm count leak
I noticed that we could leak parser->num_template_parameter_lists with
erroneous specializations. We'd increment, notice a problem and then
bail out. This refactors cp_parser_explicit_specialization to avoid
that code path. A couple of tests get different diagnostics because
of the fix. pr39425 then goes to unbounded template instantiation
and exceeds the implementation limit.
gcc/cp/
* parser.c (cp_parser_explicit_specialization): Refactor
to avoid leak of num_template_parameter_lists value.
gcc/testsuite/
* g++.dg/template/pr39425.C: Adjust errors, (unbounded
template recursion).
* g++.old-deja/g++.pt/spec20.C: Remove fallout diagnostics.
xiezhiheng [Tue, 4 Aug 2020 16:25:29 +0000 (17:25 +0100)]
AArch64: Use FLOAT_MODE_P macro and add FLAG_AUTO_FP [PR94442]
Since all FP intrinsics are set by FLAG_FP by default, but not all FP intrinsics
raise FP exceptions or read FPCR register. So we add a global flag FLAG_AUTO_FP
to suppress the flag FLAG_FP.
2020-08-04 Zhiheng Xie <xiezhiheng@huawei.com>
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.c (aarch64_call_properties):
Use FLOAT_MODE_P macro instead of enumerating all floating-point
modes and add global flag FLAG_AUTO_FP.
Tobias Burnus [Tue, 4 Aug 2020 16:17:04 +0000 (18:17 +0200)]
Fortran/OpenMP: Fix detecting not perfectly nested loops
gcc/fortran/ChangeLog:
* openmp.c (resolve_omp_do): Detect not perfectly
nested loop with innermost collapse.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/collapse1.f90: Add dg-error.
* gfortran.dg/gomp/collapse2.f90: New test.
Jakub Jelinek [Tue, 4 Aug 2020 16:16:23 +0000 (18:16 +0200)]
doc: Add @cindex to symver attribute
When looking at the symver attr documentation in html, I found there is no
name to refer to for it.
2020-08-04 Jakub Jelinek <jakub@redhat.com>
* doc/extend.texi (symver): Add @cindex for symver function attribute.
Roger Sayle [Tue, 4 Aug 2020 15:56:06 +0000 (16:56 +0100)]
Test case for PR rtl-optimization/60473
PR rtl-optimization/60473 is code quality regression that has
been cured by improvements to register allocation. For the function
in the test case, GCC 4.4, 4.5 and 4.6 generated very poor code
requiring two mov instructions, and GCC 4.7 and 4.8 (when the PR was
filed) produced better but still poor code with one mov instruction.
Since GCC 4.9 (including current mainline), it generates optimal
code with no mov instructions, matching what used to be generated
in GCC 4.1.
2020-08-04 Roger Sayle <roger@nextmovesoftware.com>
gcc/testsuite/ChangeLog
PR rtl-optimization/60473
* gcc.target/i386/pr60473.c: New test.
Marc Glisse [Tue, 4 Aug 2020 15:30:16 +0000 (17:30 +0200)]
Simplify X * C1 == C2 with undefined overflow
this transformation is quite straightforward, without overflow, 3*X==15 is
the same as X==5 and 3*X==5 cannot happen. Adding a single_use restriction
for the first case didn't seem necessary, although of course it can
slightly increase register pressure in some cases.
2020-08-04 Marc Glisse <marc.glisse@inria.fr>
PR tree-optimization/95433
* match.pd (X * C1 == C2): New transformation.
* gcc.c-torture/execute/pr23135.c: Add -fwrapv to avoid
undefined behavior.
* gcc.dg/tree-ssa/pr95433.c: New file.
Aldy Hernandez [Tue, 4 Aug 2020 04:48:38 +0000 (06:48 +0200)]
Adjust gimple-ssa-sprintf.c for irange API.
gcc/ChangeLog:
* gimple-ssa-sprintf.c (get_int_range): Adjust for irange API.
(format_integer): Same.
(handle_printf_call): Same.
Iain Buclaw [Wed, 22 Jul 2020 07:50:38 +0000 (09:50 +0200)]
d: Fix struct literals that have non-deterministic hash values (PR96153)
Adds code generation for generating a temporary for, and pre-filling
struct and array literals with zeroes before assigning, so that
alignment holes don't cause objects to produce a non-deterministic hash
value. A new field has been added to the expression visitor to track
whether the result is being generated for another literal, so that
memset() is only called once on the top-level literal expression, and
not for nesting struct or arrays.
gcc/d/ChangeLog:
PR d/96153
* d-tree.h (build_expr): Add literalp argument.
* expr.cc (ExprVisitor): Add literalp_ field.
(ExprVisitor::ExprVisitor): Initialize literalp_.
(ExprVisitor::visit (AssignExp *)): Call memset() on blits where RHS
is a struct literal. Elide assignment if initializer is all zeroes.
(ExprVisitor::visit (CastExp *)): Forward literalp_ to generation of
subexpression.
(ExprVisitor::visit (AddrExp *)): Likewise.
(ExprVisitor::visit (ArrayLiteralExp *)): Use memset() to pre-fill
object with zeroes. Set literalp in subexpressions.
(ExprVisitor::visit (StructLiteralExp *)): Likewise.
(ExprVisitor::visit (TupleExp *)): Set literalp in subexpressions.
(ExprVisitor::visit (VectorExp *)): Likewise.
(ExprVisitor::visit (VectorArrayExp *)): Likewise.
(build_expr): Forward literal_p to ExprVisitor.
gcc/testsuite/ChangeLog:
PR d/96153
* gdc.dg/pr96153.d: New test.
Andrew Stubbs [Fri, 31 Jul 2020 10:27:24 +0000 (11:27 +0100)]
amdgcn: TImode shifts
Implement TImode shifts in the backend.
The middle-end support that does it for other architectures doesn't work for
GCN because BITS_PER_WORD==32, meaning that TImode is quad-word, not
double-word.
gcc/ChangeLog:
* config/gcn/gcn.md ("<expander>ti3"): New.
Patrick Palka [Tue, 4 Aug 2020 14:11:35 +0000 (10:11 -0400)]
c++: Member initializer list diagnostic locations [PR94024]
This patch preserves the source locations of each node in a member
initializer list so that during processing of the list we can set
input_location appropriately for generally more accurate diagnostic
locations. Since TREE_LIST nodes are tcc_exceptional, they can't have
source locations, so we instead store the location in a dummy
tcc_expression node within the TREE_TYPE of the list node.
gcc/cp/ChangeLog:
PR c++/94024
* init.c (sort_mem_initializers): Preserve TREE_TYPE of the
member initializer list node.
(emit_mem_initializers): Set input_location when performing each
member initialization.
* parser.c (cp_parser_mem_initializer): Attach the source
location of this initializer to a dummy EMPTY_CLASS_EXPR
within the TREE_TYPE of the list node.
* pt.c (tsubst_initializer_list): Preserve TREE_TYPE of the
member initializer list node.
gcc/testsuite/ChangeLog:
PR c++/94024
* g++.dg/diagnostic/mem-init1.C: New test.
Richard Biener [Tue, 4 Aug 2020 12:10:45 +0000 (14:10 +0200)]
tree-optimization/88240 - stopgap for floating point code-hoisting issues
This adds a stopgap measure to avoid performing code-hoisting
on mixed type loads when the load we'd insert in the hoisting
position would be a floating point one. This is because certain
targets (hello x87) cannot perform floating point loads without
possibly altering the bit representation and thus cannot be used
in place of integral loads.
2020-08-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/88240
* tree-ssa-sccvn.h (vn_reference_s::punned): New flag.
* tree-ssa-sccvn.c (vn_reference_insert): Initialize punned.
(vn_reference_insert_pieces): Likewise.
(visit_reference_op_call): Likewise.
(visit_reference_op_load): Track whether a ref was punned.
* tree-ssa-pre.c (do_hoist_insertion): Refuse to perform hoist
insertion on punned floating point loads.
* gcc.target/i386/pr88240.c: New testcase.
Tobias Burnus [Tue, 4 Aug 2020 12:42:26 +0000 (14:42 +0200)]
Fortran: Fix for OpenMP's 'lastprivate(conditional:'
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_trans_omp_do): Fix 'lastprivate(conditional:'.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/lastprivate-conditional-3.f90: Enable some
previously disabled 'lastprivate(conditional:' dg-warnings.
Sudakshina Das [Tue, 4 Aug 2020 11:01:21 +0000 (12:01 +0100)]
aarch64: Use Q-reg loads/stores in movmem expansion
This is my attempt at reviving the old patch
https://gcc.gnu.org/pipermail/gcc-patches/2019-January/514632.html
I have followed on Kyrill's comment upstream on the link above and I
am using the recommended option iii that he mentioned.
"1) Adjust the copy_limit to 256 bits after checking
AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS in the tuning.
2) Adjust aarch64_copy_one_block_and_progress_pointers to handle
256-bit moves. by iii:
iii) Emit explicit V4SI (or any other 128-bit vector mode) pairs
ldp/stps. This wouldn't need any adjustments to MD patterns,
but would make aarch64_copy_one_block_and_progress_pointers
more complex as it would now have two paths, where one
handles two adjacent memory addresses in one calls."
gcc/ChangeLog:
* config/aarch64/aarch64.c (aarch64_gen_store_pair): Add case
for E_V4SImode.
(aarch64_gen_load_pair): Likewise.
(aarch64_copy_one_block_and_progress_pointers): Handle 256 bit copy.
(aarch64_expand_cpymem): Expand copy_limit to 256bits where
appropriate.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/cpymem-q-reg_1.c: New test.
* gcc.target/aarch64/large_struct_copy_2.c: Update for ldp q regs.
Andrea Corallo [Wed, 29 Jul 2020 17:04:40 +0000 (19:04 +0200)]
aarch64: Add missing clobber for fjcvtzs
gcc/ChangeLog
2020-07-30 Andrea Corallo <andrea.corallo@arm.com>
* config/aarch64/aarch64.md (aarch64_fjcvtzs): Add missing
clobber.
* doc/sourcebuild.texi (aarch64_fjcvtzs_hw) Document new
target supports option.
gcc/testsuite/ChangeLog
2020-07-30 Andrea Corallo <andrea.corallo@arm.com>
* gcc.target/aarch64/acle/jcvt_2.c: New testcase.
* lib/target-supports.exp
(check_effective_target_aarch64_fjcvtzs_hw): Add new check for
FJCVTZS hw.
Tom de Vries [Tue, 4 Aug 2020 07:53:08 +0000 (09:53 +0200)]
[nvptx] Handle V2DI/V2SI mode in nvptx_gen_shuffle
With the pr96628-part1.f90 source and -ftree-slp-vectorize, we run into an
ICE due to the fact that V2DI mode is not handled in nvptx_gen_shuffle.
Fix this by adding handling of V2DI as well as V2SI mode in
nvptx_gen_shuffle.
Build and reg-tested on x86_64 with nvptx accelerator.
gcc/ChangeLog:
PR target/96428
* config/nvptx/nvptx.c (nvptx_gen_shuffle): Handle V2SI/V2DI.
libgomp/ChangeLog:
PR target/96428
* testsuite/libgomp.oacc-fortran/pr96628-part1.f90: New test.
* testsuite/libgomp.oacc-fortran/pr96628-part2.f90: New test.
Jakub Jelinek [Tue, 4 Aug 2020 09:33:18 +0000 (11:33 +0200)]
veclower: Don't ICE on .VEC_CONVERT calls with no lhs [PR96426]
.VEC_CONVERT is a const internal call, so normally if the lhs is not used,
we'd DCE it far before getting to veclower, but with -O0 (or perhaps
-fno-tree-dce and some other -fno-* options) it can happen.
But as the internal fn needs the lhs to know the type to which the
conversion is done (and I think that is a reasonable representation, having
some magic another argument and having to create constants with that type
looks overkill to me), we just should DCE those calls ourselves.
During veclower, we can't really remove insns, as the callers would be
upset, so this just replaces it with a GIMPLE_NOP.
2020-08-04 Jakub Jelinek <jakub@redhat.com>
PR middle-end/96426
* tree-vect-generic.c (expand_vector_conversion): Replace .VEC_CONVERT
call with GIMPLE_NOP if there is no lhs.
* gcc.c-torture/compile/pr96426.c: New test.
Jakub Jelinek [Tue, 4 Aug 2020 09:31:44 +0000 (11:31 +0200)]
gimple-fold: Fix ICE in maybe_canonicalize_mem_ref_addr on debug stmt [PR96354]
In debug stmts, we are less strict about what is and what is not accepted
there, so this patch just punts on optimization of a debug stmt rather than
ICEing.
2020-08-04 Jakub Jelinek <jakub@redhat.com>
PR debug/96354
* gimple-fold.c (maybe_canonicalize_mem_ref_addr): Add IS_DEBUG
argument. Return false instead of gcc_unreachable if it is true and
get_addr_base_and_unit_offset returns NULL.
(fold_stmt_1) <case GIMPLE_DEBUG>: Adjust caller.
* g++.dg/opt/pr96354.C: New test.
Aldy Hernandez [Tue, 4 Aug 2020 09:19:39 +0000 (11:19 +0200)]
Add is_gimple_min_invariant dropped from previous patch.
gcc/ChangeLog:
* vr-values.c (simplify_using_ranges::vrp_evaluate_conditional):
Call is_gimple_min_invariant dropped from previous patch.
Jakub Jelinek [Tue, 4 Aug 2020 08:53:07 +0000 (10:53 +0200)]
openmp: Compute number of collapsed loop iterations more efficiently for some non-rectangular loops
2020-08-04 Jakub Jelinek <jakub@redhat.com>
* omp-expand.c (expand_omp_for_init_counts): For triangular loops
compute number of iterations at runtime more efficiently.
(expand_omp_for_init_vars): Adjust immediate dominators.
(extract_omp_for_update_vars): Likewise.
Iain Buclaw [Mon, 3 Aug 2020 20:35:38 +0000 (22:35 +0200)]
d: Fix PR96429: Pointer subtraction uses TRUNC_DIV_EXPR
gcc/d/ChangeLog:
PR d/96429
* expr.cc (ExprVisitor::visit (BinExp*)): Use EXACT_DIV_EXPR for
pointer diff expressions.
gcc/testsuite/ChangeLog:
PR d/96429
* gdc.dg/pr96429.d: New test.
Paul Thomas [Tue, 4 Aug 2020 06:53:50 +0000 (07:53 +0100)]
Change testcase for pr96325 from run to compile.
2020-08-04 Paul Thomas <pault@gcc.gnu.org>
gcc/testsuite/
PR fortran/96325
* gfortran.dg/pr96325.f90: Change from run to compile.
Aldy Hernandez [Tue, 4 Aug 2020 05:16:05 +0000 (07:16 +0200)]
Adjust two_valued_val_range_p for irange API.
gcc/ChangeLog:
* vr-values.c (simplify_using_ranges::two_valued_val_range_p):
Use irange API.
Aldy Hernandez [Tue, 4 Aug 2020 05:13:44 +0000 (07:13 +0200)]
Adjust simplify_conversion_using_ranges for irange API.
gcc/ChangeLog:
* vr-values.c (simplify_conversion_using_ranges): Convert to irange API.
Aldy Hernandez [Tue, 4 Aug 2020 05:09:59 +0000 (07:09 +0200)]
Use irange API in test_for_singularity.
gcc/ChangeLog:
* vr-values.c (test_for_singularity): Use irange API.
(simplify_using_ranges::simplify_cond_using_ranges_1): Do not
special case VR_RANGE.
Aldy Hernandez [Tue, 4 Aug 2020 05:01:22 +0000 (07:01 +0200)]
Adjust vrp_evaluate_conditional for irange API.
gcc/ChangeLog:
* vr-values.c (simplify_using_ranges::vrp_evaluate_conditional): Adjust
for irange API.
Aldy Hernandez [Tue, 4 Aug 2020 04:58:26 +0000 (06:58 +0200)]
Adjust op_with_boolean_value_range_p for irange API.
gcc/ChangeLog:
* vr-values.c (simplify_using_ranges::op_with_boolean_value_range_p): Adjust
for irange API.
Aldy Hernandez [Tue, 4 Aug 2020 04:55:55 +0000 (06:55 +0200)]
Adjust get_range_info to use the base irange class.
gcc/ChangeLog:
* tree-ssanames.c (get_range_info): Use irange instead of value_range.
* tree-ssanames.h (get_range_info): Same.
Aldy Hernandez [Tue, 4 Aug 2020 04:46:09 +0000 (06:46 +0200)]
Adjust expr_not_equal_to to use irange API.
gcc/ChangeLog:
* fold-const.c (expr_not_equal_to): Adjust for irange API.
Aldy Hernandez [Tue, 4 Aug 2020 04:41:03 +0000 (06:41 +0200)]
Remove ad-hoc range canonicalization from determine_block_size.
Anti ranges of ~[MIN,X] are automatically canonicalized to [X+1,MAX],
at creation time. There is no need to handle them specially.
Tested by adding a gcc_unreachable and bootstrapping/testing.
gcc/ChangeLog:
* builtins.c (determine_block_size): Remove ad-hoc range canonicalization.
Xionghu Luo [Tue, 4 Aug 2020 03:09:15 +0000 (22:09 -0500)]
dse: Remove partial load after full store for high part access[PR71309]
v5 update as comments:
1. Move const_rhs out of loop;
2. Iterate from int size for read_mode.
This patch could optimize(works for char/short/int/void*):
6: r119:TI=[r118:DI+0x10]
7: [r118:DI]=r119:TI
8: r121:DI=[r118:DI+0x8]
=>
6: r119:TI=[r118:DI+0x10]
16: r122:DI=r119:TI#8
Final ASM will be as below without partial load after full store(stxv+ld):
ld 10,16(3)
mr 9,3
ld 3,24(3)
std 10,0(9)
std 3,8(9)
blr
It could achieve ~25% performance improvement for typical cases on
Power9. Bootstrap and regression tested on Power9-LE.
For AArch64, one ldr is replaced by mov with this patch:
ldp x2, x3, [x0, 16]
stp x2, x3, [x0]
ldr x0, [x0, 8]
=>
mov x1, x0
ldp x2, x0, [x0, 16]
stp x2, x0, [x1]
gcc/ChangeLog:
2020-08-04 Xionghu Luo <luoxhu@linux.ibm.com>
PR rtl-optimization/71309
* dse.c (find_shift_sequence): Use subreg of shifted from high part
register to avoid loading from address.
gcc/testsuite/ChangeLog:
2020-08-04 Xionghu Luo <luoxhu@linux.ibm.com>
PR rtl-optimization/71309
* gcc.target/powerpc/pr71309.c: New test.
GCC Administrator [Tue, 4 Aug 2020 00:16:24 +0000 (00:16 +0000)]
Daily bump.
Marek Polacek [Mon, 3 Aug 2020 23:17:20 +0000 (19:17 -0400)]
c++: Remove unused declaration.
gcc/cp/ChangeLog:
* cp-tree.h (after_nsdmi_defaulted_late_checks): Remove.
Ian Lance Taylor [Mon, 3 Aug 2020 22:59:45 +0000 (15:59 -0700)]
libgcc: increase required stack space for x86_64 -fsplit-stack
This accomodates increased space required by use of the xsavec
instruction in the dynamic linker trampoline.
libgcc/ChangeLog:
* config/i386/morestack.S (BACKOFF) [x86_64]: Add 2048 bytes.
Segher Boessenkool [Mon, 3 Aug 2020 22:15:01 +0000 (22:15 +0000)]
rs6000: Fix vector_float.c testcase for -m32
It should be skipped then.
2020-08-03 Segher Boessenkool <segher@kernel.crashing.org>
gcc/testsuite/
* gcc.target/powerpc/vector_float.c: Skip if not lp64.
Marek Polacek [Thu, 16 Jul 2020 13:15:37 +0000 (09:15 -0400)]
c++: Variable template and template parameter pack [PR96218]
This is DR 2032 which says that the restrictions regarding template
parameter packs and default arguments apply to variable templates as
well, but we weren't detecting that.
gcc/cp/ChangeLog:
DR 2032
PR c++/96218
* pt.c (check_default_tmpl_args): Also consider variable
templates.
gcc/testsuite/ChangeLog:
DR 2032
PR c++/96218
* g++.dg/cpp1y/var-templ67.C: New test.
Jakub Jelinek [Mon, 3 Aug 2020 20:55:28 +0000 (22:55 +0200)]
aarch64: Fix up __aarch64_cas16_acq_rel fallback
As mentioned in the PR, the fallback path when LSE is unavailable writes
incorrect registers to the memory if the previous content compares equal
to x0, x1 - it writes copy of x0, x1 from the start of function, but it
should write x2, x3.
2020-08-03 Jakub Jelinek <jakub@redhat.com>
PR target/96402
* config/aarch64/lse.S (__aarch64_cas16_acq_rel): Use x2, x3 instead
of x(tmp0), x(tmp1) in STXP arguments.
* gcc.target/aarch64/pr96402.c: New test.
Jonathan Wakely [Mon, 3 Aug 2020 20:16:50 +0000 (21:16 +0100)]
cpp: Do not use @dots for ... tokens in code examples
This prevents a ... token in code examples from being turned into a
single HORIZONTAL ELLIPSIS glyph (e.g. via the HTML … entity).
gcc/ChangeLog:
* doc/cpp.texi (Variadic Macros): Use the exact ... token in
code examples.
Nathan Sidwell [Mon, 3 Aug 2020 20:06:06 +0000 (13:06 -0700)]
Refer to C++20
I noticed a bunch of references to c++2a.
gcc/
* doc/invoke.texi: Refer to c++20
Julian Brown [Mon, 27 Jul 2020 13:29:02 +0000 (06:29 -0700)]
openacc: No attach/detach present/release mappings for array descriptors
Standalone attach and detach clauses should not create present/release
mappings for Fortran array descriptors (e.g. used when we have a pointer
to an array), both because it is unnecessary and because those mappings
will be incorrectly subject to reference counting. Simply omitting the
mappings means we just use GOMP_MAP_TO_PSET and GOMP_MAP_{ATTACH,DETACH}
mappings for array descriptors.
That requires a tweak in gimplify.c, since we may now see GOMP_MAP_TO_PSET
without a preceding data-movement mapping.
2020-08-03 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses): Don't create present/release
mappings for array descriptors.
gcc/
* gimplify.c (gimplify_omp_target_update): Allow GOMP_MAP_TO_PSET
without a preceding data-movement mapping.
gcc/testsuite/
* gfortran.dg/goacc/attach-descriptor.f90: Update pattern output. Add
scanning of gimplify dump.
libgomp/
* testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: Don't run for
shared-memory devices. Extend with further checking.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>