David Malcolm [Mon, 1 Aug 2022 23:30:15 +0000 (19:30 -0400)]
docs: fix copy&paste error in -Wanalyzer-putenv-of-auto-var
gcc/ChangeLog:
* doc/invoke.texi (-Wanalyzer-putenv-of-auto-var): Fix copy&paste
error.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Roger Sayle [Mon, 1 Aug 2022 22:08:23 +0000 (23:08 +0100)]
PR target/106481: Handle CONST_WIDE_INT in REG_EQUAL during STV on x86_64.
This patch resolves PR target/106481, and is an oversight in my recent
battles with REG_EQUAL notes during TImode STV (see PR target/106278
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598416.html).
The current behaviour, introduced by the patch above, is that we check that the mode of
the REG_EQUAL note is TImode before using PUT_MODE to set it to V1TImode.
However, the new test case reveals that this doesn't consider REG_EQUAL
notes that are CONST_INT or CONST_WIDE_INT, i.e. that are VOIDmode,
and so STV produces:
(insn 85 84 86 2 (set (reg:V1TI 113)
(reg:V1TI 84)) "pr106481.c":13:3 1766 {movv1ti_internal}
(expr_list:REG_EQUAL (const_wide_int 0x0ffffffff00000004)
(nil)))
which causes problems as the const_wide_int isn't a valid immediate
constant for V1TImode. With this patch, we now generate the correct:
(insn 85 84 86 2 (set (reg:V1TI 113)
(reg:V1TI 84)) "pr106481.c":13:3 1766 {movv1ti_internal}
(expr_list:REG_EQUAL (const_vector:V1TI [
(const_wide_int 0x0ffffffff00000004)
])
(nil)))
2022-08-01 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/106481
* config/i386/i386-features.cc (timode_scalar_chain::convert_insn):
Convert a CONST_SCALAR_INT_P in a REG_EQUAL note into a V1TImode
CONST_VECTOR.
gcc/testsuite/ChangeLog
PR target/106481
* gcc.target/i386/pr106481.c: New test case.
H.J. Lu [Wed, 20 Jul 2022 23:57:32 +0000 (16:57 -0700)]
x86: Add ix86_ifunc_ref_local_ok
We can't always use the PLT entry as the function address for local IFUNC
functions. When the PIC register is needed for PLT call, indirect call
via the PLT entry will fail since the PIC register may not be set up
properly for indirect call. Add ix86_ifunc_ref_local_ok to return false
when the PLT entry can't be used as local IFUNC function pointers.
gcc/
PR target/83782
* config/i386/i386.cc (ix86_ifunc_ref_local_ok): New.
(TARGET_IFUNC_REF_LOCAL_OK): Use it.
gcc/testsuite/
PR target/83782
* gcc.target/i386/pr83782-1.c: Require non-ia32.
* gcc.target/i386/pr83782-2.c: Likewise.
* gcc.target/i386/pr83782-3.c: New test.
Jose E. Marchesi [Fri, 8 Jul 2022 16:32:02 +0000 (18:32 +0200)]
btf: emit linkage information in BTF_KIND_FUNC entries
The kernel bpftool expects BTF_KIND_FUNC entries in BTF to include an
annotation reflecting the linkage of functions (static, global). For
whatever reason they abuse the `vlen' field of the BTF_KIND_FUNC entry
instead of adding a variable-part to the record like it is done with
other entry kinds.
This patch makes GCC include this linkage info in BTF_KIND_FUNC
entries.
Tested in bpf-unknown-none target.
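As an illustration of the encoding described above, here is a minimal sketch of packing a linkage value into the bits of the `info' word where other kinds keep `vlen'. All sketch_* names are hypothetical. The bit layout (vlen in bits 0-15, kind in bits 24-28) and the numeric values are assumptions based on the Linux BTF UAPI header, not taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value, mirroring the kernel's BTF UAPI header.  */
#define SKETCH_BTF_KIND_FUNC 12u

/* Assumed linkage values, mirroring the kernel's enum btf_func_linkage.  */
enum sketch_btf_func_linkage
{
  SKETCH_BTF_FUNC_STATIC = 0,
  SKETCH_BTF_FUNC_GLOBAL = 1
};

/* Build the info word of a BTF_KIND_FUNC entry: the kind goes in bits
   24-28 and the linkage is stored where other kinds keep vlen (bits 0-15).  */
uint32_t
sketch_btf_func_info (enum sketch_btf_func_linkage linkage)
{
  assert (linkage == SKETCH_BTF_FUNC_STATIC
          || linkage == SKETCH_BTF_FUNC_GLOBAL);
  return (SKETCH_BTF_KIND_FUNC << 24) | ((uint32_t) linkage & 0xffffu);
}

/* Extract the kind field (bits 24-28) from an info word.  */
unsigned
sketch_btf_info_kind (uint32_t info)
{
  return (info >> 24) & 0x1fu;
}

/* Extract the vlen field (bits 0-15), which holds the linkage for FUNC.  */
unsigned
sketch_btf_info_vlen (uint32_t info)
{
  return info & 0xffffu;
}
```

A consumer such as bpftool would then read the linkage back through the vlen accessor rather than from a variable-length trailer.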
gcc/ChangeLog:
PR debug/106263
* ctfc.h (struct ctf_dtdef): Add field linkage.
* ctfc.cc (ctf_add_function): Set ctti_linkage.
* dwarf2ctf.cc (gen_ctf_function_type): Pass a linkage for
function types and subprograms.
* btfout.cc (btf_asm_func_type): Emit linkage information for the
function.
(btf_dtd_emit_preprocess_cb): Propagate the linkage information
for functions.
gcc/testsuite/ChangeLog:
PR debug/106263
* gcc.dg/debug/btf/btf-function-4.c: New test.
* gcc.dg/debug/btf/btf-function-5.c: Likewise.
Andrew Stubbs [Tue, 19 Jul 2022 10:16:09 +0000 (11:16 +0100)]
openmp-simd-clone: Match shift types
Ensure that both parameters to vector shifts use the same mode. This is most
important for amdgcn where the masks are DImode.
gcc/ChangeLog:
* omp-simd-clone.cc (simd_clone_adjust): Convert shift_cnt to match
the mask type.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
Sam Feifer [Fri, 29 Jul 2022 13:44:48 +0000 (09:44 -0400)]
match.pd: Add new division pattern [PR104992]
This patch fixes a missed optimization in match.pd. It takes the pattern,
x / y * y == x, and optimizes it to x % y == 0. This produces fewer
instructions. This simplification does not happen for complex types.
This patch also adds tests for the optimization rule.
Bootstrapped/regtested on x86_64-pc-linux-gnu.
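For illustration, a small sketch that checks the two forms agree on a range of integer inputs. The function names are hypothetical and this is not one of the tests added by the patch. Division by zero and INT_MIN / -1 are excluded because both forms are undefined there.

```c
#include <assert.h>
#include <limits.h>

/* The original form: divide, multiply back, compare.  */
int
divides_via_mul (int x, int y)
{
  assert (y != 0 && !(x == INT_MIN && y == -1));
  return x / y * y == x;
}

/* The simplified form the new pattern is described as producing.  */
int
divides_via_mod (int x, int y)
{
  assert (y != 0 && !(x == INT_MIN && y == -1));
  return x % y == 0;
}

/* Return 1 if both forms agree for all x, y in [-limit, limit], y != 0.  */
int
forms_agree (int limit)
{
  for (int x = -limit; x <= limit; x++)
    for (int y = -limit; y <= limit; y++)
      if (y != 0 && divides_via_mul (x, y) != divides_via_mod (x, y))
        return 0;
  return 1;
}
```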
PR tree-optimization/104992
gcc/ChangeLog:
* match.pd (x / y * y == x): New simplification.
gcc/testsuite/ChangeLog:
* g++.dg/pr104992-1.C: New test.
* gcc.dg/pr104992.c: New test.
Roger Sayle [Mon, 1 Aug 2022 10:36:23 +0000 (11:36 +0100)]
Update configure to check for a recent gnat Ada compiler.
GCC fails to bootstrap when configured with --enable-languages=all on
machines that have older versions of GNAT installed as the system Ada
compiler. In configure, it's not sufficient to check whether gnat is
available, but whether a sufficiently recent version of GNAT is
installed. This patch tweaks config/acx.m4 so that conftest.adb also
contains a reference to System.CRTL.int64 as required by the current
version of gcc/ada/osint.adb. This fixes the build when the system
Ada is GNAT v4.8.5 (on Redhat 7) by disabling ada, but continues to
work fine when the system Ada is GNAT v11.3.1.
2022-08-01 Roger Sayle <roger@nextmovesoftware.com>
Arnaud Charlet <charlet@adacore.com>
config/ChangeLog
* acx.m4 (AC_PROG_GNAT): Update conftest.adb to include
features required of the host gnat compiler.
ChangeLog
* configure: Regenerate.
Martin Liska [Mon, 1 Aug 2022 08:32:00 +0000 (10:32 +0200)]
lto: replace $target with $host in configure.ac [PR106170]
PR lto/106170
lto-plugin/ChangeLog:
* configure.ac: Replace $target with $host.
* configure: Regenerate.
Jakub Jelinek [Mon, 1 Aug 2022 06:26:03 +0000 (08:26 +0200)]
libfortran: Fix up boz_15.f90 on powerpc64le with -mabi=ieeelongdouble [PR106079]
The boz_15.f90 test FAILs on powerpc64le-linux when -mabi=ieeelongdouble
is used (either default through --with-long-double-format=ieee or
when used explicitly).
The problem is that the read/write transfer routines are called with
BT_REAL (or BT_COMPLEX) type and kind 17 which is magic we use to say
it is the IEEE quad real(kind=16) rather than the IBM double double
real(kind=16). For the floating point input/output we then handle kind
17 specially, but for B/O/Z we just treat the bytes of the floating point
value as a binary blob, and using 17 in that case results in unexpected
behavior. For write it means we don't estimate correctly how many chars
we'll need and print ******************** etc. rather than what we should,
and even with an explicit size we'd print one further byte than intended.
For read it would even mean overwriting some unrelated byte after the
floating point object.
Fixed by using 16 instead of 17 in the read_radix and write_{b,o,z} calls.
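A minimal sketch of the adjustment, not the actual libgfortran code. The enum and function names are hypothetical stand-ins, and the digit count assumes the kind is taken as the byte length of the object, which is what the description above suggests for B/O/Z editing.

```c
#include <assert.h>

/* Hypothetical stand-ins for libgfortran's type codes.  */
enum sketch_bt { SKETCH_BT_INTEGER, SKETCH_BT_REAL };

/* Kind to pass on for a B/O/Z edit descriptor: kind 17 only marks
   "IEEE quad", the object itself is still 16 bytes wide.  */
int
sketch_radix_kind (enum sketch_bt type, int kind)
{
  assert (kind > 0);
  if (type == SKETCH_BT_REAL && kind == 17)
    return 16;
  return kind;
}

/* Hex digits needed by Z editing if the kind is taken as the byte length
   (an assumption of this sketch).  */
int
sketch_z_digits (int kind)
{
  return 2 * kind;
}
```

Under that assumption, passing 17 through unchanged would ask for 34 hex digits instead of 32, which matches the "one further byte" symptom described above.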
2022-08-01 Jakub Jelinek <jakub@redhat.com>
PR libfortran/106079
* io/transfer.c (formatted_transfer_scalar_read,
formatted_transfer_scalar_write): For type BT_REAL with kind 17
change kind to 16 before calling read_radix or write_{b,o,z}.
Aldy Hernandez [Sun, 31 Jul 2022 11:43:36 +0000 (13:43 +0200)]
Cleanups to frange.
These are some assorted cleanups to the frange class to make it easier
to drop in an implementation with FP endpoints:
* frange::set() had some asserts limiting the type of arguments
passed. There's no reason why we can't handle all the variants.
If worst comes to worst, we can always return a VARYING, which is
conservative and correct.
* frange::normalize_kind() now returns a boolean that can be used in
union and intersection to indicate that the range changed.
* Implement vrp_val_max and vrp_val_min for floats. Also, move them
earlier in the header file so frange can use them.
Tested on x86-64 Linux.
gcc/ChangeLog:
* value-range.cc (tree_compare): New.
(frange::set): Make more general.
(frange::normalize_kind): Cleanup and return bool.
(frange::union_): Use normalize_kind return value.
(frange::intersect): Same.
(frange::verify_range): Remove unnecessary else.
* value-range.h (vrp_val_max): Move before frange class.
(vrp_val_min): Same.
(frange::frange): Remove set to m_type.
Aldy Hernandez [Sun, 31 Jul 2022 11:36:59 +0000 (13:36 +0200)]
const_tree conversion of vrange::supports_*
Make all vrange::supports_*_p methods const_tree as they can end up
being called from functions that are const_tree.
Tested on x86-64 Linux.
gcc/ChangeLog:
* value-range.cc (vrange::supports_type_p): Use const_tree.
(irange::supports_type_p): Same.
(frange::supports_type_p): Same.
* value-range.h (Value_Range::supports_type_p): Same.
(irange::supports_p): Same.
Aldy Hernandez [Sun, 31 Jul 2022 21:02:14 +0000 (23:02 +0200)]
Make irange dependency explicit for range_of_ssa_name_with_loop_info.
Even though ranger is type agnostic, SCEV seems to only work with
integers. This patch removes some FIXME notes making it explicit that
bounds_of_var_in_loop only works with iranges.
Tested on x86-64 Linux.
gcc/ChangeLog:
* gimple-range-fold.cc (fold_using_range::range_of_phi): Only
query SCEV for integers.
(fold_using_range::range_of_ssa_name_with_loop_info): Remove
irange check.
Dimitrije Milošević [Fri, 29 Jul 2022 06:36:06 +0000 (08:36 +0200)]
libsanitizer: Cherry-pick 2bfb0fcb51510f22723c8cdfefe from upstream
2bfb0fcb51510f22723c8cdfefe [Sanitizer][MIPS] Fix stat struct size for the O32 ABI.
Signed-off-by: Dimitrije Milosevic <dimitrije.milosevic@syrmia.com>.
GCC Administrator [Mon, 1 Aug 2022 00:16:31 +0000 (00:16 +0000)]
Daily bump.
Roger Sayle [Sun, 31 Jul 2022 20:51:44 +0000 (21:51 +0100)]
Add rotl64ti2_doubleword pattern to i386.md
This patch adds rot[lr]64ti2_doubleword patterns to the x86_64 backend,
to move splitting of 128-bit TImode rotates by 64 bits after reload,
matching what we now do for 64-bit DImode rotations by 32 bits with -m32.
In theory moving when this rotation is split should have little
influence on code generation, but in practice "reload" sometimes
decides to make use of the increased flexibility to reduce the number
of registers used, and the code size, by using xchg.
For example:
__int128 x;
__int128 y;
__int128 a;
__int128 b;
void foo()
{
unsigned __int128 t = x;
t ^= a;
t = (t<<64) | (t>>64);
t ^= b;
y = t;
}
Before:
movq x(%rip), %rsi
movq x+8(%rip), %rdi
xorq a(%rip), %rsi
xorq a+8(%rip), %rdi
movq %rdi, %rax
movq %rsi, %rdx
xorq b(%rip), %rax
xorq b+8(%rip), %rdx
movq %rax, y(%rip)
movq %rdx, y+8(%rip)
ret
After:
movq x(%rip), %rax
movq x+8(%rip), %rdx
xorq a(%rip), %rax
xorq a+8(%rip), %rdx
xchgq %rdx, %rax
xorq b(%rip), %rax
xorq b+8(%rip), %rdx
movq %rax, y(%rip)
movq %rdx, y+8(%rip)
ret
On some modern architectures this is a small win; on some older
architectures this is a small loss. The decision which code to
generate is made in "reload", and could probably be tweaked by
register preferencing. The much bigger win is that (eventually) all
TImode shifts and rotates by constants will become potential
candidates for TImode STV.
2022-07-31 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.md (define_expand <any_rotate>ti3): For
rotations by 64 bits use new rot[lr]64ti2_doubleword pattern.
(rot[lr]64ti2_doubleword): New post-reload splitter.
Roger Sayle [Sun, 31 Jul 2022 20:44:51 +0000 (21:44 +0100)]
PR target/106450: Tweak timode_remove_non_convertible_regs on x86_64.
This patch resolves PR target/106450, some more fall-out from more
aggressive TImode scalar-to-vector (STV) optimizations. I continue
to be caught out by how far TImode STV has diverged from DImode/SImode
STV, and therefore requires additional (unexpected) tweaking. Many
thanks to H.J. Lu for pointing out timode_remove_non_convertible_regs
needs to be extended to handle XOR (and other new operations).
Unhelpfully the comment above this function states that it's the TImode
version of "remove_non_convertible_regs", which doesn't exist anymore,
so I've resurrected an explanatory comment from the git history.
By refactoring the checks for hard regs and already "marked" regs
into timode_check_non_convertible_regs itself, all of its callers are
simplified. This patch then uses FOR_EACH_INSN_USE and FOR_EACH_INSN_DEF
to generically handle arbitrary (non-move) instructions (including
unary and binary operations), calling timode_check_non_convertible_regs
on each TImode register USE and DEF.
2022-07-31 Roger Sayle <roger@nextmovesoftware.com>
H.J. Lu <hjl.tools@gmail.com>
gcc/ChangeLog
PR target/106450
* config/i386/i386-features.cc (timode_check_non_convertible_regs):
Do nothing if REGNO is set in the REGS bitmap, or is a hard reg.
(timode_remove_non_convertible_regs): Update comment.
Call timode_check_non_convertible_regs on all TImode register
DEFs and USEs in each instruction.
gcc/testsuite/ChangeLog
PR target/106450
* gcc.target/i386/pr106450.c: New test case.
Harald Anlauf [Thu, 28 Jul 2022 20:07:02 +0000 (22:07 +0200)]
Fortran: detect blanks within literal constants in free-form mode [PR92805]
gcc/fortran/ChangeLog:
PR fortran/92805
* match.cc (gfc_match_small_literal_int): Make gobbling of leading
whitespace optional.
(gfc_match_name): Likewise.
(gfc_match_char): Likewise.
* match.h (gfc_match_small_literal_int): Adjust prototype.
(gfc_match_name): Likewise.
(gfc_match_char): Likewise.
* primary.cc (match_kind_param): Match small literal int or name
without gobbling whitespace.
(get_kind): Do not skip over blanks.
(match_string_constant): Likewise.
gcc/testsuite/ChangeLog:
PR fortran/92805
* gfortran.dg/literal_constants.f: New test.
* gfortran.dg/literal_constants.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
Harald Anlauf [Wed, 27 Jul 2022 19:34:22 +0000 (21:34 +0200)]
Fortran: fix invalid rank error in ASSOCIATED when rank is remapped [PR77652]
gcc/fortran/ChangeLog:
PR fortran/77652
* check.cc (gfc_check_associated): Make the rank check of POINTER
vs. TARGET match the allowed forms of pointer assignment for the
selected Fortran standard.
gcc/testsuite/ChangeLog:
PR fortran/77652
* gfortran.dg/associated_target_9a.f90: New test.
* gfortran.dg/associated_target_9b.f90: New test.
Lewis Hyatt [Tue, 12 Jul 2022 13:47:47 +0000 (09:47 -0400)]
c++: Fix location for -Wunused-macros [PR66290]
In C++, since all tokens are lexed from libcpp up front, diagnostics generated
by libcpp after lexing has completed do not get a valid location from libcpp
(rather, libcpp thinks they all pertain to the end of the file.) This has long
been addressed using the global variable "done_lexing", which the C++ frontend
sets at the appropriate time; when done_lexing is true, then c_cpp_diagnostic(),
which outputs libcpp's diagnostics, uses input_location instead of the wrong
libcpp location. The C++ frontend arranges that input_location will point to the
token it is currently processing, so this generally works fine. However, there
is one exception currently, which is -Wunused-macros. This gets generated at the
end of processing in cpp_finish (), since we need to wait until then to
determine whether a macro was eventually used or not. But the locations it
passes to c_cpp_diagnostic () were remembered from the original lexing and hence
they should not be overridden with input_location, which is now the one
incorrectly pointing to the end of the file.
Fixed by setting done_lexing=false again just prior to calling cpp_finish (). I
also renamed the variable from done_lexing to "override_libcpp_locations", since
it's now not strictly about lexing anymore.
There is no new testcase with this patch, since we already had an xfailed
testcase which is now fixed.
gcc/c-family/ChangeLog:
PR c++/66290
* c-common.h: Rename global done_lexing to
override_libcpp_locations.
* c-common.cc (c_cpp_diagnostic): Likewise.
* c-opts.cc (c_common_finish): Set override_libcpp_locations
(formerly done_lexing) immediately prior to calling cpp_finish ().
gcc/cp/ChangeLog:
PR c++/66290
* parser.cc (cp_lexer_new_main): Rename global done_lexing to
override_libcpp_locations.
gcc/testsuite/ChangeLog:
PR c++/66290
* c-c++-common/pragma-diag-15.c: Remove xfail for C++.
Roger Sayle [Sun, 31 Jul 2022 07:13:30 +0000 (08:13 +0100)]
PR bootstrap/106472: Add libgo depends on libbacktrace to Makefile.def
This patch fixes PR bootstrap/106472 by adding a missing dependency
to Makefile.def to allow make bootstrap when configured using
"--enable-languages=go" (and not using make with multiple threads).
2022-07-31 Roger Sayle <roger@nextmovesoftware.com>
ChangeLog
PR bootstrap/106472
* Makefile.def (dependencies): Make configure-target-libgo depend
upon all-target-libbacktrace.
Jason Merrill [Tue, 26 Jul 2022 15:02:21 +0000 (11:02 -0400)]
c++: constexpr, empty base after non-empty [PR106369]
Here the CONSTRUCTOR we were providing for D{} had an entry for the B base
subobject at offset 0 following the entry for the C base, causing
output_constructor_regular_field to ICE due to going backwards. It might be
nice for that function to be more tolerant of empty fields, but it also
seems reasonable for the front end to prune the useless entry.
PR c++/106369
gcc/cp/ChangeLog:
* constexpr.cc (reduced_constant_expression_p): Return false
if a CONSTRUCTOR initializes an empty field.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/constexpr-lambda27.C: New test.
GCC Administrator [Sun, 31 Jul 2022 00:16:37 +0000 (00:16 +0000)]
Daily bump.
Ian Lance Taylor [Sat, 30 Jul 2022 14:29:28 +0000 (07:29 -0700)]
libgo: use SYS_timer_settime32
Musl defines SYS_timer_settime32, not SYS_timer_settime, on 32-bit systems.
Based on patch by Sören Tempel.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/420222
Takayuki 'January June' Suwa [Fri, 29 Jul 2022 19:32:46 +0000 (04:32 +0900)]
xtensa: Fix conflicting hard regno between indirect sibcall fixups and EH_RETURN_STACKADJ_RTX
The hard register A10 was already allocated for EH_RETURN_STACKADJ_RTX.
(Exception handling and sibling calls may never apply at the same time,
but a different register is used for safety.)
gcc/ChangeLog:
* config/xtensa/xtensa.md: Change hard register number used in
the split patterns for indirect sibling call fixups from 10 to 11,
the last free one for the CALL0 ABI.
Takayuki 'January June' Suwa [Fri, 29 Jul 2022 19:31:44 +0000 (04:31 +0900)]
xtensa: Add RTX costs for if_then_else
It takes one machine instruction for both conditional branch and move.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_rtx_costs):
Add new case for IF_THEN_ELSE.
GCC Administrator [Sat, 30 Jul 2022 00:16:30 +0000 (00:16 +0000)]
Daily bump.
Andrew Stubbs [Tue, 19 Jul 2022 10:14:28 +0000 (11:14 +0100)]
amdgcn: 64-bit vector shifts
Enable 64-bit vector-vector and vector-scalar shifts.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (V_INT_noHI): New iterator.
(<expander><mode>3<exec>): Use V_INT_noHI.
(v<expander><mode>3<exec>): Likewise.
Andrew Stubbs [Fri, 15 Jul 2022 14:28:44 +0000 (15:28 +0100)]
amdgcn: 64-bit not
This makes the auto-vectorizer happier when handling masks.
gcc/ChangeLog:
* config/gcn/gcn.md (one_cmpldi2): New.
Tobias Burnus [Fri, 29 Jul 2022 10:41:08 +0000 (12:41 +0200)]
Add libgomp.c-c++-common/pr106449-2.c
This run-time test tests pointer-based iteration with collapse,
similar to the '(parallel) simd' test for PR106449 but for 'for'.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/pr106449-2.c: New test.
Tobias Burnus [Fri, 29 Jul 2022 10:36:07 +0000 (12:36 +0200)]
OpenMP/Fortran: Permit assumed-size arrays in uniform clause
gcc/fortran/ChangeLog:
* openmp.cc (resolve_omp_clauses): Permit assumed-size arrays
in uniform clause.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/declare-simd-3.f90: New test.
Richard Biener [Fri, 29 Jul 2022 08:40:34 +0000 (10:40 +0200)]
tree-optimization/105679 - disable backward threading of unlikely entry
The following makes the backward threader reject threads whose entry
edge is probably never executed according to the profile. That in
particular, for the testcase, avoids threading the irq == 1 check
on the path where irq > 31, thereby avoiding spurious -Warray-bounds
diagnostics:
if (irq_1(D) > 31)
goto <bb 3>; [0.00%]
else
goto <bb 4>; [100.00%]
;; basic block 3, loop depth 0, count 0 (precise), probably never executed
_2 = (unsigned long) irq_1(D);
__builtin___ubsan_handle_shift_out_of_bounds (&*.Lubsan_data0, 1, _2);
_3 = 1 << irq_1(D);
mask_4 = (u32) _3;
entry = instance_5(D)->array[irq_1(D)];
capture (mask_4);
if (level_6(D) != 0)
goto <bb 7>; [34.00%]
else
goto <bb 5>; [66.00%]
;; basic block 5, loop depth 0, count 708669600 (estimated locally), maybe hot
if (irq_1(D) == 1)
goto <bb 7>; [20.97%]
else
goto <bb 6>; [79.03%]
PR tree-optimization/105679
* tree-ssa-threadbackward.cc
(back_threader_profitability::profitable_path_p): Avoid threading
when the entry edge is probably never executed.
Jonathan Wakely [Thu, 28 Jul 2022 19:55:51 +0000 (20:55 +0100)]
libstdc++: Tweak common_iterator::operator-> return type [PR104443]
This adjusts the return type to match the resolution of LWG 3672. There
is no functional difference, because decltype(auto) always deduced a
value anyway, but this makes it simpler and consistent with the working
draft.
libstdc++-v3/ChangeLog:
PR libstdc++/104443
* include/bits/stl_iterator.h (common_iterator::operator->):
Change return type to just auto.
Richard Biener [Fri, 29 Jul 2022 06:24:52 +0000 (08:24 +0200)]
tree-optimization/106422 - verify block copying in forward threading
The forward threader failed to check whether it can actually duplicate
blocks. The following adds this check in a place similar to where the
backward threader performs it.
PR tree-optimization/106422
* tree-ssa-threadupdate.cc (fwd_jt_path_registry::update_cfg):
Check whether we can copy thread blocks and cancel the thread if not.
* gcc.dg/torture/pr106422.c: New testcase.
Jakub Jelinek [Fri, 29 Jul 2022 07:59:19 +0000 (09:59 +0200)]
openmp: Reject invalid forms of C++ #pragma omp atomic compare [PR106448]
The allowed syntaxes of atomic compare don't allow ()s around the condition
of ?:, but we were accepting it in one case for C++.
Fixed thusly.
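For illustration, a minimal sketch of a conditional-expression form without parentheses around the condition. The names are hypothetical and this is not the new test. Without -fopenmp the pragma is ignored and the statement is an ordinary assignment; the parenthesized variant in the comment is, per the description above, not among the allowed syntaxes.

```c
#include <assert.h>

int shared_max;

/* Condition of ?: written without parentheses.  A parenthesized
   condition, e.g.
     shared_max = (candidate > shared_max) ? candidate : shared_max;
   is not one of the allowed atomic compare syntaxes.  */
void
update_max (int candidate)
{
  #pragma omp atomic compare
  shared_max = candidate > shared_max ? candidate : shared_max;
}

/* Single-threaded smoke check of the helper above.  */
void
check_update_max (void)
{
  shared_max = 0;
  update_max (3);
  update_max (1);
  assert (shared_max == 3);
}
```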
2022-07-29 Jakub Jelinek <jakub@redhat.com>
PR c++/106448
* parser.cc (cp_parser_omp_atomic): For simple cast followed by
CPP_QUERY token, don't try cp_parser_binary_operation if compare
is true.
* c-c++-common/gomp/atomic-32.c: New test.
Jakub Jelinek [Fri, 29 Jul 2022 07:49:11 +0000 (09:49 +0200)]
openmp: Fix up handling of non-rectangular simd loops with pointer type iterators [PR106449]
Two issues were visible on this new testcase. First, we didn't have
special POINTER_TYPE_P handling in a few spots of expand_omp_simd: for
pointers we need to use POINTER_PLUS_EXPR and need to have the non-pointer
part in sizetype, while for non-rectangular loops we can rely on a
multiplication factor of 1, since pointers can't be multiplied. Without
those changes we'd ICE. Second, we put the n2 expression directly into a
comparison in a condition and regimplified that; for the &a[512] case,
with gimplification being destructive, that unfortunately meant modifying
the original fd->loops[?].n2. Fixed by unsharing the expression. This was
causing a runtime failure on the testcase.
2022-07-29 Jakub Jelinek <jakub@redhat.com>
PR middle-end/106449
* omp-expand.cc (expand_omp_simd): Fix up handling of pointer
iterators in non-rectangular simd loops. Unshare fd->loops[i].n2
or n2 before regimplifying it inside of a condition.
* testsuite/libgomp.c-c++-common/pr106449.c: New test.
Jakub Jelinek [Fri, 29 Jul 2022 07:43:34 +0000 (09:43 +0200)]
openmp: Simplify fold_build_pointer_plus callers in omp-expand
Tobias mentioned in PR106449 that fold_build_pointer_plus already
fold_converts the second argument to sizetype if it doesn't already
have an integral type gimple compatible with sizetype.
So, this patch simplifies the callers of fold_build_pointer_plus in
omp-expand so that they don't do those conversions manually.
2022-07-29 Jakub Jelinek <jakub@redhat.com>
* omp-expand.cc (expand_omp_for_init_counts, expand_omp_for_init_vars,
extract_omp_for_update_vars, expand_omp_for_ordered_loops,
expand_omp_simd): Don't fold_convert second argument to
fold_build_pointer_plus to sizetype.
Lulu Cheng [Fri, 29 Jul 2022 01:44:52 +0000 (09:44 +0800)]
LoongArch: Define the macro ASM_PREFERRED_EH_DATA_FORMAT by checking the assembler's support for eh_frame encoding.
.eh_frame DW_EH_PE_pcrel encoding format is not supported by gas <= 2.39.
Check whether the assembler supports DW_EH_PE_pcrel encoding and define the .eh_frame
encoding type.
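A sketch of how such a configure-dependent definition might look. The DW_EH_PE_* values are the standard DWARF EH pointer encodings, redefined locally so the fragment stands alone. The exact expression chosen in each branch is an assumption of this sketch, not copied from loongarch.h, and sketch_eh_format is a hypothetical wrapper.

```c
#include <assert.h>

/* Standard DWARF EH pointer-encoding values, redefined locally.  */
#define DW_EH_PE_absptr   0x00
#define DW_EH_PE_sdata4   0x0b
#define DW_EH_PE_pcrel    0x10
#define DW_EH_PE_indirect 0x80

/* HAVE_AS_EH_FRAME_PCREL_ENCODING_SUPPORT would come from configure;
   it is left undefined here, so the absolute-pointer fallback is chosen.  */
#ifdef HAVE_AS_EH_FRAME_PCREL_ENCODING_SUPPORT
#define ASM_PREFERRED_EH_DATA_FORMAT(CODE, GLOBAL) \
  (((GLOBAL) ? DW_EH_PE_indirect : 0) | DW_EH_PE_pcrel | DW_EH_PE_sdata4)
#else
#define ASM_PREFERRED_EH_DATA_FORMAT(CODE, GLOBAL) \
  (((GLOBAL) ? DW_EH_PE_indirect : 0) | DW_EH_PE_absptr)
#endif

/* Hypothetical wrapper so the selected encoding can be inspected.  */
int
sketch_eh_format (int code, int global)
{
  (void) code;  /* CODE is not used by either definition in this sketch.  */
  assert (global == 0 || global == 1);
  return ASM_PREFERRED_EH_DATA_FORMAT (code, global);
}
```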
gcc/ChangeLog:
* config.in: Regenerate.
* config/loongarch/loongarch.h (ASM_PREFERRED_EH_DATA_FORMAT):
Select the value of the macro definition according to whether
HAVE_AS_EH_FRAME_PCREL_ENCODING_SUPPORT is defined.
* configure: Regenerate.
* configure.ac: Reinstate HAVE_AS_EH_FRAME_PCREL_ENCODING_SUPPORT test.
Richard Biener [Thu, 28 Jul 2022 13:07:28 +0000 (15:07 +0200)]
Use CONVERT_EXPR_CODE_P
* gimple-ssa-warn-restrict.cc (builtin_memref::set_base_and_offset):
Use CONVERT_EXPR_CODE_P.
Richard Biener [Thu, 28 Jul 2022 13:08:23 +0000 (15:08 +0200)]
Avoid vect_get_vector_types_for_stmt
This replaces vect_get_vector_types_for_stmt with get_vectype_for_scalar_type
in vect_recog_bool_pattern.
* tree-vect-patterns.cc (vect_recog_bool_pattern): Use
get_vectype_for_scalar_type instead of
vect_get_vector_types_for_stmt.
GCC Administrator [Fri, 29 Jul 2022 00:16:21 +0000 (00:16 +0000)]
Daily bump.
David Malcolm [Thu, 28 Jul 2022 21:21:29 +0000 (17:21 -0400)]
analyzer: new warning: -Wanalyzer-putenv-of-auto-var [PR105893]
This patch implements a new -fanalyzer warning:
-Wanalyzer-putenv-of-auto-var
which complains about stack pointers passed to putenv(3) calls, as
per SEI CERT C Coding Standard rule POS34-C ("Do not call putenv() with
a pointer to an automatic variable as the argument").
For example, given:
#include <stdio.h>
#include <stdlib.h>
void test_arr (void)
{
char arr[] = "NAME=VALUE";
putenv (arr);
}
it emits:
demo.c: In function ‘test_arr’:
demo.c:7:3: warning: ‘putenv’ on a pointer to automatic variable ‘arr’ [POS34-C] [-Wanalyzer-putenv-of-auto-var]
7 | putenv (arr);
| ^~~~~~~~~~~~
‘test_arr’: event 1
|
| 7 | putenv (arr);
| | ^~~~~~~~~~~~
| | |
| | (1) ‘putenv’ on a pointer to automatic variable ‘arr’
|
demo.c:6:8: note: ‘arr’ declared on stack here
6 | char arr[] = "NAME=VALUE";
| ^~~
demo.c:7:3: note: perhaps use ‘setenv’ rather than ‘putenv’
7 | putenv (arr);
| ^~~~~~~~~~~~
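For comparison, a minimal sketch of the alternative the note suggests, assuming a POSIX environment. The function names are hypothetical. setenv copies its arguments, so passing an automatic array is harmless.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* setenv copies both strings, so no pointer into this function's stack
   frame outlives the call (unlike putenv on a local array).  */
int
set_name_with_setenv (void)
{
  char value[] = "VALUE";  /* automatic storage is fine here */
  return setenv ("NAME", value, 1);
}

/* Return nonzero if NAME is set in the environment and equals EXPECTED.  */
int
name_is (const char *expected)
{
  const char *actual = getenv ("NAME");
  assert (expected != NULL);
  return actual != NULL && strcmp (actual, expected) == 0;
}
```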
gcc/analyzer/ChangeLog:
PR analyzer/105893
* analyzer.opt (Wanalyzer-putenv-of-auto-var): New.
* region-model-impl-calls.cc (class putenv_of_auto_var): New.
(region_model::impl_call_putenv): New.
* region-model.cc (region_model::on_call_pre): Handle putenv.
* region-model.h (region_model::impl_call_putenv): New decl.
gcc/ChangeLog:
PR analyzer/105893
* doc/invoke.texi: Add -Wanalyzer-putenv-of-auto-var.
gcc/testsuite/ChangeLog:
PR analyzer/105893
* gcc.dg/analyzer/putenv-1.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Thu, 28 Jul 2022 21:21:29 +0000 (17:21 -0400)]
analyzer: add CWE identifier URLs to docs
gcc/analyzer/ChangeLog:
* sm-malloc.cc (free_of_non_heap::emit): Add comment about CWE.
* sm-taint.cc (tainted_size::emit): Likewise.
gcc/ChangeLog:
* doc/invoke.texi (-fdiagnostics-show-cwe): Use uref rather than
url.
(Static Analyzer Options): Likewise. Add urefs for all of the
warnings that have associated CWE identifiers.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Thu, 28 Jul 2022 21:21:28 +0000 (17:21 -0400)]
analyzer: expand the comment in region.h
gcc/analyzer/ChangeLog:
* region.h: Add notes to the comment describing the region
class hierarchy.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Thu, 28 Jul 2022 21:21:28 +0000 (17:21 -0400)]
jit: update docs to reflect .c to .cc renaming
gcc/jit/ChangeLog:
* docs/internals/index.rst: Remove reference to ".c" extensions
of source files.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Maciej W. Rozycki [Thu, 28 Jul 2022 13:04:33 +0000 (14:04 +0100)]
doc: Clarify FENV_ACCESS pragma semantics WRT `-ftrapping-math'
Our documentation indicates that it is the `-frounding-math' invocation
option that controls whether we respect what the FENV_ACCESS pragma
would imply, should we implement it, regarding the floating point
environment. It is only a part of the picture however, because the
`-ftrapping-math' invocation option also affects how we handle said
environment. Clarify that in the description of both options then, as
well as the FENV_ACCESS pragma itself.
gcc/
* doc/implement-c.texi (Floating point implementation): Mention
`-fno-trapping-math' in the context of FENV_ACCESS pragma.
* doc/invoke.texi (Optimize Options): Clarify FENV_ACCESS pragma
implication in the descriptions of `-fno-trapping-math' and
`-frounding-math'.
Maciej W. Rozycki [Thu, 28 Jul 2022 13:04:33 +0000 (14:04 +0100)]
RISC-V: Split unordered FP comparisons into individual RTL insns
We have unordered FP comparisons implemented as RTL insns that produce
multiple machine instructions. Such RTL insns are hard to match with a
processor pipeline description and additionally there is a redundant
SNEZ instruction produced on the result of these comparisons even though
the FLT.fmt and FLE.fmt machine instructions already produce either 0 or
1, e.g.:
long
flt (double x, double y)
{
return __builtin_isless (x, y);
}
with `-O2 -fno-finite-math-only -ftrapping-math -fno-signaling-nans'
gets compiled to:
.globl flt
.type flt, @function
flt:
frflags a5
flt.d a0,fa0,fa1
fsflags a5
snez a0,a0
ret
.size flt, .-flt
because the middle end can't see through the UNSPEC operation unordered
FP comparisons have been defined in terms of.
These instructions are only produced via an expander already, so change
the expander to emit individual RTL insns for each machine instruction
in the ultimate sequence produced rather than deferring to a
single RTL insn producing the whole sequence at once.
gcc/
* config/riscv/riscv.md (UNSPECV_FSNVSNAN): New constant.
(QUIET_PATTERN): New int attribute.
(f<quiet_pattern>_quiet<ANYF:mode><X:mode>4): Emit the intended
RTL insns entirely within the preparation statements.
(*f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_default)
(*f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_snan): Remove
insns.
(*riscv_fsnvsnan<mode>2): New insn.
gcc/testsuite/
* gcc.target/riscv/fle-ieee.c: New test.
* gcc.target/riscv/fle-snan.c: New test.
* gcc.target/riscv/fle.c: New test.
* gcc.target/riscv/flef-ieee.c: New test.
* gcc.target/riscv/flef-snan.c: New test.
* gcc.target/riscv/flef.c: New test.
* gcc.target/riscv/flt-ieee.c: New test.
* gcc.target/riscv/flt-snan.c: New test.
* gcc.target/riscv/flt.c: New test.
* gcc.target/riscv/fltf-ieee.c: New test.
* gcc.target/riscv/fltf-snan.c: New test.
* gcc.target/riscv/fltf.c: New test.
Richard Biener [Thu, 28 Jul 2022 08:07:32 +0000 (10:07 +0200)]
middle-end/106457 - improve array_at_struct_end_p for array objects
Array references to array objects are never at struct end.
PR middle-end/106457
* tree.cc (array_at_struct_end_p): Handle array objects
specially.
Jakub Jelinek [Thu, 28 Jul 2022 10:42:14 +0000 (12:42 +0200)]
gimple, internal-fn: Add IFN_TRAP and use it for __builtin_unreachable [PR106099]
__builtin_unreachable and __ubsan_handle_builtin_unreachable don't
use vops, they are marked const/leaf/noreturn/nothrow/cold.
But __builtin_trap uses vops, isn't const, just leaf/noreturn/nothrow/cold.
This is I believe so that when users explicitly use __builtin_trap in their
sources they get stores visible at the trap side.
-fsanitize=unreachable -fsanitize-undefined-trap-on-error used to transform
__builtin_unreachable to __builtin_trap even in the past, but the sanopt pass
has TODO_update_ssa, so it worked fine.
Now that gimple_build_builtin_unreachable can build a __builtin_trap call
right away, we can run into problems: wherever we use it, we would need
to ensure, either manually or through TODO_update*, that the vops are updated.
Though, as it is originally __builtin_unreachable which is just implemented
as trap, I think for this case it is fine to avoid vops. For this the
patch introduces IFN_TRAP, which has ECF_* flags like __builtin_unreachable
and is expanded as __builtin_trap.
2022-07-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/106099
* internal-fn.def (TRAP): New internal fn.
* internal-fn.h (expand_TRAP): Declare.
* internal-fn.cc (expand_TRAP): Define.
* gimple.cc (gimple_build_builtin_unreachable): For BUILT_IN_TRAP,
use internal fn rather than builtin.
* gcc.dg/ubsan/pr106099.c: New test.
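As a user-level illustration of the two builtins discussed above, here is a minimal C sketch. The function names kind_name and checked_div are hypothetical and are not taken from the patch or its test case. It only shows where each builtin typically appears in source code; it does not reproduce the internal IFN_TRAP handling.

```c
/* Hypothetical example: the default arm is marked unreachable, so the
   compiler may assume kind is always 0, 1 or 2.  When built with
   -fsanitize=unreachable -fsanitize-undefined-trap-on-error, reaching
   the default arm is turned into a trap.  */
const char *
kind_name (int kind)
{
  switch (kind)
    {
    case 0: return "zero";
    case 1: return "one";
    case 2: return "two";
    default: __builtin_unreachable ();
    }
}

/* An explicit trap written by the user.  The store to *last_seen
   happens before the trap, which is the kind of store the commit
   message says should stay visible at the trap site.  */
int
checked_div (int num, int den, int *last_seen)
{
  *last_seen = den;
  if (den == 0)
    __builtin_trap ();
  return num / den;
}
```

Calling kind_name with 0, 1 or 2 and checked_div with a non-zero divisor exercises only the well-defined paths.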
Martin Liška [Tue, 26 Jul 2022 06:42:29 +0000 (08:42 +0200)]
jit,docs: shorten assembly output
Shorten the assembly example so that there is no slider.
gcc/jit/ChangeLog:
* docs/cp/intro/tutorial02.rst:
Shorten the assembly example so that there is no slider.
* docs/cp/intro/tutorial04.rst: Likewise.
* docs/intro/tutorial02.rst: Likewise.
* docs/intro/tutorial04.rst: Likewise.
* docs/topics/contexts.rst: Likewise.
Martin Liska [Mon, 25 Jul 2022 13:57:32 +0000 (15:57 +0200)]
contrib: use sphinx-build from a venv
maintainer-scripts/ChangeLog:
* update_web_docs_git: Use sphinx-build from a venv so that
we can use a recent version.
marxin [Mon, 25 Jul 2022 12:45:01 +0000 (14:45 +0200)]
jit,docs: remove :ref:`modindex`
gcc/jit/ChangeLog:
* docs/index.rst: Remove reference to module index
as we don't emit any.
marxin [Mon, 25 Jul 2022 12:39:46 +0000 (14:39 +0200)]
jit,docs: use :expr:`type *` for pointers to a type
gcc/jit/ChangeLog:
* docs/cp/intro/tutorial02.rst: Use :expr:`type *` for pointers to a type
* docs/cp/topics/asm.rst: Likewise.
* docs/cp/topics/contexts.rst: Likewise.
* docs/cp/topics/expressions.rst: Likewise.
* docs/cp/topics/functions.rst: Likewise.
* docs/cp/topics/objects.rst: Likewise.
* docs/intro/tutorial02.rst: Likewise.
* docs/intro/tutorial03.rst: Likewise.
* docs/intro/tutorial04.rst: Likewise.
* docs/intro/tutorial05.rst: Likewise.
* docs/topics/compilation.rst: Likewise.
* docs/topics/contexts.rst: Likewise.
* docs/topics/objects.rst: Likewise.
marxin [Mon, 25 Jul 2022 10:35:26 +0000 (12:35 +0200)]
jit,docs: use list-table instead of fixed table
Use list-table instead, which is easier to maintain and does not
require wrapping lines. Moreover, it provides the :widths: attribute,
which works correctly (tested for HTML and PDF).
gcc/jit/ChangeLog:
* docs/cp/intro/tutorial04.rst: Use list-table.
* docs/intro/tutorial04.rst: Likewise.
* docs/intro/tutorial05.rst: Likewise.
* docs/topics/compilation.rst: Likewise.
* docs/topics/expressions.rst: Likewise.
* docs/topics/types.rst: Likewise.
marxin [Mon, 25 Jul 2022 09:51:51 +0000 (11:51 +0200)]
jit,docs: compact function declarations
gcc/jit/ChangeLog:
* docs/cp/topics/expressions.rst: Compact so that the generated
output is also more compact.
marxin [Mon, 25 Jul 2022 09:15:25 +0000 (11:15 +0200)]
jit,docs: various fixes
gcc/jit/ChangeLog:
* docs/cp/intro/tutorial02.rst: Use proper reference.
* docs/cp/topics/contexts.rst: Likewise.
* docs/cp/topics/functions.rst: Put `class` directive before a
function as it is not allowed declaring a class in a fn.
* docs/cp/topics/types.rst: Add template keyword.
* docs/examples/tut04-toyvm/toyvm.c (toyvm_function_compile):
Add removed comment used for code snippet ending detection.
* docs/intro/tutorial04.rst: Fix to match the real comment.
marxin [Mon, 25 Jul 2022 09:03:23 +0000 (11:03 +0200)]
jit,docs: replace c:type:`int_type` with :expr:`int_type`
Use an expression role, which works fine for basic types.
gcc/jit/ChangeLog:
* docs/cp/topics/expressions.rst: Use :expr: for basic types.
* docs/topics/compilation.rst: Likewise.
* docs/topics/expressions.rst: Likewise.
* docs/topics/function-pointers.rst: Likewise.
marxin [Mon, 25 Jul 2022 08:52:56 +0000 (10:52 +0200)]
jit,docs: use enum directive for enumeral types
gcc/jit/ChangeLog:
* docs/conf.py: Add needs_sphinx = '3.0' where c:type was added.
* docs/index.rst: Remove note about it.
* docs/topics/compilation.rst: Use enum directive and reference.
* docs/topics/contexts.rst: Likewise.
* docs/topics/expressions.rst: Likewise.
* docs/topics/functions.rst: Likewise.
GCC Administrator [Thu, 28 Jul 2022 00:16:35 +0000 (00:16 +0000)]
Daily bump.
Lewis Hyatt [Sun, 10 Jul 2022 13:30:29 +0000 (09:30 -0400)]
preprocessor: Set input_location to the most recently seen token
When preprocessing with -E and -save-temps, input_location always points to the
first character of the current file. This was previously irrelevant because
nothing was called during the token streaming process that would inspect
input_location. But since r13-1544, "#pragma GCC diagnostic" is supported in
preprocess-only mode, and that pragma relies on input_location to decide if a
given source code location is subject to a diagnostic or not. Most diagnostics
work fine anyway, because they are handled as soon as they are seen and so
everything is still seen in the expected order even though all the diagnostic
pragmas are treated as if they applied at the start of the file. One example
that doesn't work correctly is the new testcase, since here the warning is not
triggered until the end of the file and so it is necessary to track the location
properly.
Fixed by setting input_location to point to each token as it is being
streamed, similar to how C++ mode sets it.
gcc/c-family/ChangeLog:
* c-ppoutput.cc (token_streamer::stream): Update input_location
prior to streaming each token.
gcc/testsuite/ChangeLog:
* c-c++-common/pragma-diag-14.c: New test.
* c-c++-common/pragma-diag-15.c: New test.
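The situation described above can be sketched in C as follows. This is an assumption-laden illustration, not the new test case: it supposes that -Wunused-macros is a warning of the kind that is only reported at the end of the file, and the macro and function names are hypothetical.

```c
/* Intended to be preprocessed with something like:
     gcc -E -Wunused-macros file.c
   (the exact options used by the real test are not shown in the log).  */

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-macros"
/* Defined inside the ignored region, so no warning is expected here
   once locations are tracked properly.  */
#define SUPPRESSED_UNUSED 1
#pragma GCC diagnostic pop

/* Defined outside the region, so -Wunused-macros may still report it.  */
#define REPORTED_UNUSED 2

/* Hypothetical function so that the file is also a valid translation
   unit when compiled normally.  */
int
pragma_demo_value (void)
{
  return 42;
}
```

The point of the sketch is that the decision for each macro depends on the source location at which the pragma state is looked up, which is why input_location has to follow the token stream.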
David Faust [Wed, 27 Jul 2022 18:11:26 +0000 (11:11 -0700)]
MAINTAINERS: Add myself as CTF and BTF reviewer
ChangeLog:
* MAINTAINERS: Add myself as reviewer for CTF and BTF.
Andrew Carlotti [Wed, 27 Jul 2022 14:11:51 +0000 (15:11 +0100)]
docs: Fix outdated reference to LOOPS_HAVE_MARKED_SINGLE_EXITS
gcc/ChangeLog:
* doc/loop.texi: Refer to LOOPS_HAVE_RECORDED_EXITS instead.
Immad Mir [Wed, 27 Jul 2022 13:46:36 +0000 (19:16 +0530)]
analyzer: add get_meaning_for_state_change vfunc to fd_diagnostic in sm-fd.cc [PR106286]
This patch adds get_meaning_for_state_change vfunc to
fd_diagnostic in sm-fd.cc which could be used by SARIF output.
Lightly tested on x86_64 Linux.
gcc/analyzer/ChangeLog:
PR analyzer/106286
* sm-fd.cc:
(fd_diagnostic::get_meaning_for_state_change): New.
gcc/testsuite/ChangeLog:
PR analyzer/106286
* gcc.dg/analyzer/fd-meaning.c: New test.
Signed-off-by: Immad Mir <mirimmad@outlook.com>
WANG Xuerui [Wed, 27 Jul 2022 07:01:17 +0000 (15:01 +0800)]
LoongArch: document -m[no-]explicit-relocs
gcc/ChangeLog:
* doc/invoke.texi: Document -m[no-]explicit-relocs for
LoongArch.
Maciej W. Rozycki [Wed, 27 Jul 2022 10:09:43 +0000 (11:09 +0100)]
RISC-V: Remove duplicate backslashes from `stack_protect_set_<mode>'
Remove redundant duplicate backslash characters from \t sequences in the
output pattern of the `stack_protect_set_<mode>' RTL insn.
gcc/
* config/riscv/riscv.md (stack_protect_set_<mode>): Remove
duplicate backslashes.
Maciej W. Rozycki [Wed, 27 Jul 2022 10:09:42 +0000 (11:09 +0100)]
RISC-V: Add RTX costs for `if_then_else' expressions
Fix a performance regression from commit
391500af1932 ("Do not ignore
costs of jump insns in combine."), a part of the m68k series for MODE_CC
conversion (<https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01028.html>),
observed in soft-fp code in libgcc used by some of the embench-iot
benchmarks.
The immediate origin of the regression is the middle end, which in the
absence of cost information from the backend estimates the cost of an
RTL expression by assuming a single machine instruction for each of the
expression's subexpressions.
So for `if_then_else', which takes 3 operands, the estimated cost is 3
instructions (i.e. 12 units) even though a branch instruction evaluates
it in a single machine cycle (ignoring the cost of actually taking the
branch of course, which is handled elsewhere). Consequently an insn
sequence like:
(insn 595 594 596 43 (set (reg:DI 305)
(lshiftrt:DI (reg/v:DI 160 [ R_f ])
(const_int 55 [0x37]))) ".../libgcc/soft-fp/adddf3.c":46:3 216 {lshrdi3}
(nil))
(insn 596 595 597 43 (set (reg:DI 304)
(and:DI (reg:DI 305)
(const_int 1 [0x1]))) ".../libgcc/soft-fp/adddf3.c":46:3 109 {anddi3}
(expr_list:REG_DEAD (reg:DI 305)
(nil)))
(jump_insn 597 596 598 43 (set (pc)
(if_then_else (eq (reg:DI 304)
(const_int 0 [0]))
(label_ref:DI 1644)
(pc))) ".../libgcc/soft-fp/adddf3.c":46:3 237 {*branchdi}
(expr_list:REG_DEAD (reg:DI 304)
(int_list:REG_BR_PROB
536870916 (nil)))
-> 1644)
no longer (as of the commit referred to) gets combined into:
(note 595 594 596 43 NOTE_INSN_DELETED)
(note 596 595 597 43 NOTE_INSN_DELETED)
(jump_insn 597 596 598 43 (parallel [
(set (pc)
(if_then_else (eq (zero_extract:DI (reg/v:DI 160 [ R_f ])
(const_int 1 [0x1])
(const_int 55 [0x37]))
(const_int 0 [0]))
(label_ref:DI 1644)
(pc)))
(clobber (scratch:DI))
]) ".../libgcc/soft-fp/adddf3.c":46:3 243 {*branch_on_bitdi}
(int_list:REG_BR_PROB
536870916 (nil))
-> 1644)
This is because the new cost is incorrectly calculated as 28 units while
the cost of the original 3 instructions was 24:
rejecting combination of insns 595, 596 and 597
original costs 4 + 4 + 16 = 24
replacement cost 28
Before the commit referred to, the cost of a jump instruction was ignored and
considered 0 (i.e. unknown), and a sequence of instructions of a known
cost used to win:
allowing combination of insns 595, 596 and 597
original costs 4 + 4 + 0 = 0
replacement cost 28
Add the missing costs for the 3 variants of `if_then_else' expressions
we currently define in the backend.
With the fix in place the cost of this particular `if_then_else' pattern
is 2 instructions or 8 units (because of the shift operation) and
therefore the ultimate cost of the original 3 RTL insns will work out at
16 units (4 + 4 + 8), however the replacement single RTL insn will cost
8 units only.
gcc/
* config/riscv/riscv.cc (riscv_rtx_costs) <IF_THEN_ELSE>: New
case.
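The cost arithmetic quoted above can be replayed with a small C sketch. This is a simplified model of the accept/reject decision as the dumps describe it, not the code in combine; the names N_INSNS, original_cost and combination_allowed are hypothetical. It assumes 4 cost units per instruction, consistent with the "3 instructions (i.e. 12 units)" figure in the message.

```c
/* One machine instruction is modelled as 4 cost units.  */
#define COST_UNITS_PER_INSN 4
#define N_INSNS(n) ((n) * COST_UNITS_PER_INSN)

/* Sum the costs of the original insns.  A cost of 0 means "unknown"
   and makes the whole total unknown, as in the "4 + 4 + 0 = 0" dump.  */
int
original_cost (const int *costs, int n)
{
  int total = 0;
  for (int i = 0; i < n; i++)
    {
      if (costs[i] == 0)
        return 0;
      total += costs[i];
    }
  return total;
}

/* Reject only when the original cost is known and the replacement is
   more expensive; otherwise allow the combination.  */
int
combination_allowed (const int *costs, int n, int replacement)
{
  int old_cost = original_cost (costs, n);
  return !(old_cost > 0 && replacement > old_cost);
}
```

With the costs {4, 4, 16} and a replacement cost of 28 the model rejects the combination. With {4, 4, 0} it allows it because the total is unknown. With the fixed costs {4, 4, 8} and a replacement cost of 8 it allows it again.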
Jakub Jelinek [Wed, 27 Jul 2022 10:06:22 +0000 (12:06 +0200)]
cgraphunit: Don't emit asm thunks for -dx [PR106261]
When the -dx option is used (I didn't know we had it and have no idea what it is
useful for), we just expand functions to RTL and then omit all further
RTL passes, so the normal functions aren't actually emitted into assembly,
just variables.
The following testcase ICEs, because we don't emit the methods, but do
emit thunks pointing to that and those thunks have unwind info and rely on
at least some real functions to be emitted (which is normally the case,
thunks are only emitted for locally defined functions) because otherwise
there are no CIEs, only FDEs and dwarf2out is upset about it.
The following patch fixes that by not emitting assembly thunks for -dx
either.
2022-07-27 Jakub Jelinek <jakub@redhat.com>
PR debug/106261
* cgraphunit.cc (cgraph_node::assemble_thunks_and_aliases): Don't
output asm thunks for -dx.
* g++.dg/debug/pr106261.C: New test.
Jakub Jelinek [Wed, 27 Jul 2022 10:04:50 +0000 (12:04 +0200)]
opts: Add an assertion to help static analyzers [PR106332]
This function would have UB if called with an empty candidates vector
(accessing p[-1] where p is the result of malloc (0)).
As analyzed in the PR, we never call it with an empty vector, so this just
adds an assertion to make that clear.
2022-07-27 Jakub Jelinek <jakub@redhat.com>
PR middle-end/106332
* opts-common.cc (candidates_list_and_hint): Add gcc_assert
that candidates is not an empty vector.
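The hazard can be illustrated with a stand-alone C sketch. The helper join_with_spaces is hypothetical and is not the code in opts-common.cc. It only reproduces the general pattern of writing a separator after each item and then overwriting the last one through p[-1], which is valid only if at least one item was written.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join N strings with single spaces into a newly
   allocated buffer.  The caller frees the result.  */
char *
join_with_spaces (const char *const *items, size_t n)
{
  /* Without this check, n == 0 would lead to malloc (0) followed by a
     write to p[-1], i.e. before the start of the buffer.  */
  assert (n > 0);

  size_t len = 0;
  for (size_t i = 0; i < n; i++)
    len += strlen (items[i]) + 1;   /* Item plus separator or NUL.  */

  char *buf = malloc (len);
  if (!buf)
    return NULL;

  char *p = buf;
  for (size_t i = 0; i < n; i++)
    {
      size_t l = strlen (items[i]);
      memcpy (p, items[i], l);
      p += l;
      *p++ = ' ';
    }
  p[-1] = '\0';                     /* Replace the trailing space.  */
  return buf;
}
```

The assertion documents the precondition for both human readers and static analyzers, which is the stated purpose of the patch.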
Jakub Jelinek [Wed, 27 Jul 2022 10:02:12 +0000 (12:02 +0200)]
testsuite: Add -Wno-psabi to pr94920 tests [PR94920]
These tests fail on ia32, because we get -Wpsabi warnings.
Fixed by adding -Wno-psabi. The pr94920.C test still fails the
ABS_EXPR scan-tree-dump though, I think we'll need to add vect
options and use vect_int effective target or something similar.
2022-07-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94920
* g++.dg/pr94920.C: Add -Wno-psabi to dg-options.
* g++.dg/pr94920-1.C: Add dg-additional-options -Wno-psabi.
Jakub Jelinek [Wed, 27 Jul 2022 10:00:36 +0000 (12:00 +0200)]
testsuite: Add extra ia32 options so that -fprefetch-loop-arrays works [PR106397]
-fprefetch-loop-arrays isn't supported on ia32 with just -march=i386 and
similar, the following patch adds extra options similar testcases use.
2022-07-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/106397
* gcc.dg/pr106397.c: For ia32, add dg-additional-options
-march=i686 -msse.
Richard Biener [Wed, 27 Jul 2022 09:37:17 +0000 (11:37 +0200)]
Fix Roger's e-mail in MAINTAINERS
I've made the mistake of cut&pasting the bouncing address at least
twice.
* MAINTAINERS (Roger Sayle): Update e-mail address.
Xi Ruoyao [Tue, 26 Jul 2022 13:46:20 +0000 (21:46 +0800)]
LoongArch: adjust the default of -mexplicit-relocs by checking gas feature
The assembly produced with -mexplicit-relocs is not supported by gas <=
2.39. Check if the assembler supports explicit relocations and set the
default accordingly.
gcc/ChangeLog:
* configure.ac (HAVE_AS_EXPLICIT_RELOCS): Define to 1 if the
assembler supports explicit relocation for LoongArch.
* configure: Regenerate.
* config/loongarch/loongarch-opts.h (HAVE_AS_EXPLICIT_RELOCS):
Define to 0 if not defined.
* config/loongarch/genopts/loongarch.opt.in
(TARGET_EXPLICIT_RELOCS): Default to HAVE_AS_EXPLICIT_RELOCS.
* config/loongarch/loongarch.opt: Regenerate.
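The "define to 0 if not defined" fallback named in the ChangeLog can be sketched in C as below. Only the macro name HAVE_AS_EXPLICIT_RELOCS comes from the log. The function default_explicit_relocs is a hypothetical stand-in for the option's initial value, and the assumption is that the configure probe defines the macro to 1 when the assembler supports explicit relocations.

```c
/* Assumed to be provided by the configure-generated header when the
   assembler probe succeeds.  Fall back to 0 otherwise, so the macro
   can always be used in an expression.  */
#ifndef HAVE_AS_EXPLICIT_RELOCS
#define HAVE_AS_EXPLICIT_RELOCS 0
#endif

/* Hypothetical stand-in for the default of -mexplicit-relocs.  */
int
default_explicit_relocs (void)
{
  return HAVE_AS_EXPLICIT_RELOCS;
}
```

Built without the configure definition, the sketch reports 0, which corresponds to the case of an assembler that lacks the feature.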
GCC Administrator [Wed, 27 Jul 2022 00:16:58 +0000 (00:16 +0000)]
Daily bump.
Thomas Rodgers [Wed, 6 Jul 2022 00:42:42 +0000 (17:42 -0700)]
libstdc++: Minor codegen improvement for atomic wait spinloop
This patch merges the spin loops in the atomic wait implementation which is a
minor codegen improvement.
libstdc++-v3/ChangeLog:
* include/bits/atomic_wait.h (__atomic_spin): Merge spin loops.
David Malcolm [Tue, 26 Jul 2022 21:17:18 +0000 (17:17 -0400)]
analyzer: fix false +ves from -Wanalyzer-va-arg-type-mismatch on int promotion [PR106319]
gcc/analyzer/ChangeLog:
PR analyzer/106319
* store.cc (store::set_value): Don't strip away casts if the
region has NULL type.
gcc/testsuite/ChangeLog:
PR analyzer/106319
* gcc.dg/analyzer/stdarg-types-3.c: New test.
* gcc.dg/analyzer/stdarg-types-4.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Tue, 26 Jul 2022 18:43:59 +0000 (14:43 -0400)]
analyzer: fix stray get_element decls
These were copy&paste errors.
gcc/analyzer/ChangeLog:
* region.h (code_region::get_element): Remove stray decl.
(function_region::get_element): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Harald Anlauf [Mon, 25 Jul 2022 20:29:50 +0000 (22:29 +0200)]
Fortran: error recovery from calculation of storage size of a symbol [PR103504]
gcc/fortran/ChangeLog:
PR fortran/103504
* interface.cc (get_sym_storage_size): Array bounds and character
length can only be of integer type.
gcc/testsuite/ChangeLog:
PR fortran/103504
* gfortran.dg/pr103504.f90: New test.
Peter Bergner [Sat, 18 Jun 2022 04:43:23 +0000 (23:43 -0500)]
c: Handle initializations of opaque types [PR106016]
The initial commit that added opaque types thought that there couldn't
be any valid initializations for variables of these types, but the test
case in the bug report shows that isn't true. The solution is to handle
OPAQUE_TYPE initializations like the other scalar types.
2022-06-17 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR c/106016
* expr.cc (count_type_elements): Handle OPAQUE_TYPE.
gcc/testsuite/
PR c/106016
* gcc.target/powerpc/pr106016.c: New test.
Lulu Cheng [Tue, 26 Jul 2022 13:03:52 +0000 (21:03 +0800)]
LoongArch: Modify the output message string of the warning.
Fix bug for "error: spurious trailing punctuation sequence '.' in format [-Werror=format-diag]".
gcc/ChangeLog:
* config/loongarch/loongarch-opts.cc: Modify the output message string
of the warning.
Martin Liska [Tue, 26 Jul 2022 12:26:58 +0000 (14:26 +0200)]
docs: fix previous commit
gcc/ChangeLog:
* doc/tm.texi.in: Fix placement of defmac.
* doc/tm.texi: Copy.
Martin Liska [Tue, 26 Jul 2022 12:06:11 +0000 (14:06 +0200)]
docs: fix crossing declaration of @defmac and @hook.
gcc/ChangeLog:
* doc/tm.texi.in: Fix cross @defmac and @hook.
* doc/tm.texi: Copy.
Marek Polacek [Fri, 15 Jul 2022 13:51:50 +0000 (09:51 -0400)]
c++: ICE with erroneous template redeclaration [PR106311]
Here we ICE trying to get DECL_SOURCE_LOCATION of the parm that happens
to be error_mark_node in this ill-formed test. I kept running into this
while reducing code, so it'd be good to have it fixed.
PR c++/106311
gcc/cp/ChangeLog:
* pt.cc (redeclare_class_template): Check DECL_P before accessing
DECL_SOURCE_LOCATION.
gcc/testsuite/ChangeLog:
* g++.dg/template/redecl5.C: New test.
Aldy Hernandez [Tue, 26 Jul 2022 09:03:17 +0000 (11:03 +0200)]
Handle non constant ranges in irange pretty printer.
Technically iranges only exist in constant form, but we allow symbolic
ones before arriving in the ranger, so legacy VRP can work. This fixes the
ICE when attempting to print symbolic iranges in the pretty printer.
For consistency's sake, I have made sure irange::get_nonzero_bits does
not similarly ICE on a symbolic range, even though no one should be
querying nonzero bits on such a range. This should all melt away
when legacy disappears, because all these methods are slated for
removal (min, max, kind, symbolic_p, constant_p, etc).
Finally, Richi suggested using pp_wide_int in the pretty printer
instead of going through trees. I've adapted a test, since
dump_generic_node seems to work slightly differently.
PR tree-optimization/106444
gcc/ChangeLog:
* value-range-pretty-print.cc (vrange_printer::visit): Handle
legacy ranges.
(vrange_printer::print_irange_bound): Work on wide_int's.
* value-range-pretty-print.h (print_irange_bound): Same.
* value-range.cc (irange::get_nonzero_bits): Handle legacy ranges.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/evrp4.c: Adjust.
Richard Biener [Tue, 26 Jul 2022 09:02:13 +0000 (11:02 +0200)]
Improve ptr_derefs_may_alias_p for the case of &STRING_CST
When the first pointer happens to be a pointer to a STRING_CST we
give up too early since the 2nd pointer handling could still end
up with a DECL for example which can disambiguate against a STRING_CST
just fine.
* tree-ssa-alias.cc (ptr_derefs_may_alias_p): If ptr1
points to a constant continue checking ptr2.
Andrew Carlotti [Thu, 21 Jul 2022 16:22:14 +0000 (17:22 +0100)]
aarch64: Move vreinterpret definitions into the compiler
This removes a significant number of intrinsic definitions from the arm_neon.h
header file, and reduces the amount of code duplication. The new macros and
data structures are intended to also facilitate moving other intrinsic
definitions out of the header file in future.
There is a slight change in the behaviour of the bf16 vreinterpret intrinsics
when compiling without bf16 support. Expressions like:
b = vreinterpretq_s32_bf16(vreinterpretq_bf16_s64(a))
are now compiled successfully, instead of causing a 'target specific option
mismatch' during inlining.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.cc
(MODE_d_bf16, MODE_d_f16, MODE_d_f32, MODE_d_f64, MODE_d_s8)
(MODE_d_s16, MODE_d_s32, MODE_d_s64, MODE_d_u8, MODE_d_u16)
(MODE_d_u32, MODE_d_u64, MODE_d_p8, MODE_d_p16, MODE_d_p64)
(MODE_q_bf16, MODE_q_f16, MODE_q_f32, MODE_q_f64, MODE_q_s8)
(MODE_q_s16, MODE_q_s32, MODE_q_s64, MODE_q_u8, MODE_q_u16)
(MODE_q_u32, MODE_q_u64, MODE_q_p8, MODE_q_p16, MODE_q_p64)
(MODE_q_p128): Define macro to map to corresponding mode name.
(QUAL_bf16, QUAL_f16, QUAL_f32, QUAL_f64, QUAL_s8, QUAL_s16)
(QUAL_s32, QUAL_s64, QUAL_u8, QUAL_u16, QUAL_u32, QUAL_u64)
(QUAL_p8, QUAL_p16, QUAL_p64, QUAL_p128): Define macro to map to
corresponding qualifier name.
(LENGTH_d, LENGTH_q): Define macro to map to "" or "q" suffix.
(SIMD_INTR_MODE, SIMD_INTR_QUAL, SIMD_INTR_LENGTH_CHAR): Macro
functions for the above mappings
(VREINTERPRET_BUILTIN2, VREINTERPRET_BUILTINS1, VREINTERPRET_BUILTINS)
(VREINTERPRETQ_BUILTIN2, VREINTERPRETQ_BUILTINS1)
(VREINTERPRETQ_BUILTINS, VREINTERPRET_BUILTIN)
(AARCH64_SIMD_VREINTERPRET_BUILTINS): New macros to create definitions
for all vreinterpret intrinsics
(enum aarch64_builtins): Add vreinterpret function codes
(aarch64_init_simd_intrinsics): New
(handle_arm_neon_h): Improved comment.
(aarch64_general_fold_builtin): Fold vreinterpret calls
* config/aarch64/arm_neon.h
(vreinterpret_p8_f16, vreinterpret_p8_f64, vreinterpret_p8_s8)
(vreinterpret_p8_s16, vreinterpret_p8_s32, vreinterpret_p8_s64)
(vreinterpret_p8_f32, vreinterpret_p8_u8, vreinterpret_p8_u16)
(vreinterpret_p8_u32, vreinterpret_p8_u64, vreinterpret_p8_p16)
(vreinterpret_p8_p64, vreinterpretq_p8_f64, vreinterpretq_p8_s8)
(vreinterpretq_p8_s16, vreinterpretq_p8_s32, vreinterpretq_p8_s64)
(vreinterpretq_p8_f16, vreinterpretq_p8_f32, vreinterpretq_p8_u8)
(vreinterpretq_p8_u16, vreinterpretq_p8_u32, vreinterpretq_p8_u64)
(vreinterpretq_p8_p16, vreinterpretq_p8_p64, vreinterpretq_p8_p128)
(vreinterpret_p16_f16, vreinterpret_p16_f64, vreinterpret_p16_s8)
(vreinterpret_p16_s16, vreinterpret_p16_s32, vreinterpret_p16_s64)
(vreinterpret_p16_f32, vreinterpret_p16_u8, vreinterpret_p16_u16)
(vreinterpret_p16_u32, vreinterpret_p16_u64, vreinterpret_p16_p8)
(vreinterpret_p16_p64, vreinterpretq_p16_f64, vreinterpretq_p16_s8)
(vreinterpretq_p16_s16, vreinterpretq_p16_s32, vreinterpretq_p16_s64)
(vreinterpretq_p16_f16, vreinterpretq_p16_f32, vreinterpretq_p16_u8)
(vreinterpretq_p16_u16, vreinterpretq_p16_u32, vreinterpretq_p16_u64)
(vreinterpretq_p16_p8, vreinterpretq_p16_p64, vreinterpretq_p16_p128)
(vreinterpret_p64_f16, vreinterpret_p64_f64, vreinterpret_p64_s8)
(vreinterpret_p64_s16, vreinterpret_p64_s32, vreinterpret_p64_s64)
(vreinterpret_p64_f32, vreinterpret_p64_u8, vreinterpret_p64_u16)
(vreinterpret_p64_u32, vreinterpret_p64_u64, vreinterpret_p64_p8)
(vreinterpret_p64_p16, vreinterpretq_p64_f64, vreinterpretq_p64_s8)
(vreinterpretq_p64_s16, vreinterpretq_p64_s32, vreinterpretq_p64_s64)
(vreinterpretq_p64_f16, vreinterpretq_p64_f32, vreinterpretq_p64_p128)
(vreinterpretq_p64_u8, vreinterpretq_p64_u16, vreinterpretq_p64_p16)
(vreinterpretq_p64_u32, vreinterpretq_p64_u64, vreinterpretq_p64_p8)
(vreinterpretq_p128_p8, vreinterpretq_p128_p16, vreinterpretq_p128_f16)
(vreinterpretq_p128_f32, vreinterpretq_p128_p64, vreinterpretq_p128_s64)
(vreinterpretq_p128_u64, vreinterpretq_p128_s8, vreinterpretq_p128_s16)
(vreinterpretq_p128_s32, vreinterpretq_p128_u8, vreinterpretq_p128_u16)
(vreinterpretq_p128_u32, vreinterpret_f16_f64, vreinterpret_f16_s8)
(vreinterpret_f16_s16, vreinterpret_f16_s32, vreinterpret_f16_s64)
(vreinterpret_f16_f32, vreinterpret_f16_u8, vreinterpret_f16_u16)
(vreinterpret_f16_u32, vreinterpret_f16_u64, vreinterpret_f16_p8)
(vreinterpret_f16_p16, vreinterpret_f16_p64, vreinterpretq_f16_f64)
(vreinterpretq_f16_s8, vreinterpretq_f16_s16, vreinterpretq_f16_s32)
(vreinterpretq_f16_s64, vreinterpretq_f16_f32, vreinterpretq_f16_u8)
(vreinterpretq_f16_u16, vreinterpretq_f16_u32, vreinterpretq_f16_u64)
(vreinterpretq_f16_p8, vreinterpretq_f16_p128, vreinterpretq_f16_p16)
(vreinterpretq_f16_p64, vreinterpret_f32_f16, vreinterpret_f32_f64)
(vreinterpret_f32_s8, vreinterpret_f32_s16, vreinterpret_f32_s32)
(vreinterpret_f32_s64, vreinterpret_f32_u8, vreinterpret_f32_u16)
(vreinterpret_f32_u32, vreinterpret_f32_u64, vreinterpret_f32_p8)
(vreinterpret_f32_p16, vreinterpret_f32_p64, vreinterpretq_f32_f16)
(vreinterpretq_f32_f64, vreinterpretq_f32_s8, vreinterpretq_f32_s16)
(vreinterpretq_f32_s32, vreinterpretq_f32_s64, vreinterpretq_f32_u8)
(vreinterpretq_f32_u16, vreinterpretq_f32_u32, vreinterpretq_f32_u64)
(vreinterpretq_f32_p8, vreinterpretq_f32_p16, vreinterpretq_f32_p64)
(vreinterpretq_f32_p128, vreinterpret_f64_f16, vreinterpret_f64_f32)
(vreinterpret_f64_p8, vreinterpret_f64_p16, vreinterpret_f64_p64)
(vreinterpret_f64_s8, vreinterpret_f64_s16, vreinterpret_f64_s32)
(vreinterpret_f64_s64, vreinterpret_f64_u8, vreinterpret_f64_u16)
(vreinterpret_f64_u32, vreinterpret_f64_u64, vreinterpretq_f64_f16)
(vreinterpretq_f64_f32, vreinterpretq_f64_p8, vreinterpretq_f64_p16)
(vreinterpretq_f64_p64, vreinterpretq_f64_s8, vreinterpretq_f64_s16)
(vreinterpretq_f64_s32, vreinterpretq_f64_s64, vreinterpretq_f64_u8)
(vreinterpretq_f64_u16, vreinterpretq_f64_u32, vreinterpretq_f64_u64)
(vreinterpret_s64_f16, vreinterpret_s64_f64, vreinterpret_s64_s8)
(vreinterpret_s64_s16, vreinterpret_s64_s32, vreinterpret_s64_f32)
(vreinterpret_s64_u8, vreinterpret_s64_u16, vreinterpret_s64_u32)
(vreinterpret_s64_u64, vreinterpret_s64_p8, vreinterpret_s64_p16)
(vreinterpret_s64_p64, vreinterpretq_s64_f64, vreinterpretq_s64_s8)
(vreinterpretq_s64_s16, vreinterpretq_s64_s32, vreinterpretq_s64_f16)
(vreinterpretq_s64_f32, vreinterpretq_s64_u8, vreinterpretq_s64_u16)
(vreinterpretq_s64_u32, vreinterpretq_s64_u64, vreinterpretq_s64_p8)
(vreinterpretq_s64_p16, vreinterpretq_s64_p64, vreinterpretq_s64_p128)
(vreinterpret_u64_f16, vreinterpret_u64_f64, vreinterpret_u64_s8)
(vreinterpret_u64_s16, vreinterpret_u64_s32, vreinterpret_u64_s64)
(vreinterpret_u64_f32, vreinterpret_u64_u8, vreinterpret_u64_u16)
(vreinterpret_u64_u32, vreinterpret_u64_p8, vreinterpret_u64_p16)
(vreinterpret_u64_p64, vreinterpretq_u64_f64, vreinterpretq_u64_s8)
(vreinterpretq_u64_s16, vreinterpretq_u64_s32, vreinterpretq_u64_s64)
(vreinterpretq_u64_f16, vreinterpretq_u64_f32, vreinterpretq_u64_u8)
(vreinterpretq_u64_u16, vreinterpretq_u64_u32, vreinterpretq_u64_p8)
(vreinterpretq_u64_p16, vreinterpretq_u64_p64, vreinterpretq_u64_p128)
(vreinterpret_s8_f16, vreinterpret_s8_f64, vreinterpret_s8_s16)
(vreinterpret_s8_s32, vreinterpret_s8_s64, vreinterpret_s8_f32)
(vreinterpret_s8_u8, vreinterpret_s8_u16, vreinterpret_s8_u32)
(vreinterpret_s8_u64, vreinterpret_s8_p8, vreinterpret_s8_p16)
(vreinterpret_s8_p64, vreinterpretq_s8_f64, vreinterpretq_s8_s16)
(vreinterpretq_s8_s32, vreinterpretq_s8_s64, vreinterpretq_s8_f16)
(vreinterpretq_s8_f32, vreinterpretq_s8_u8, vreinterpretq_s8_u16)
(vreinterpretq_s8_u32, vreinterpretq_s8_u64, vreinterpretq_s8_p8)
(vreinterpretq_s8_p16, vreinterpretq_s8_p64, vreinterpretq_s8_p128)
(vreinterpret_s16_f16, vreinterpret_s16_f64, vreinterpret_s16_s8)
(vreinterpret_s16_s32, vreinterpret_s16_s64, vreinterpret_s16_f32)
(vreinterpret_s16_u8, vreinterpret_s16_u16, vreinterpret_s16_u32)
(vreinterpret_s16_u64, vreinterpret_s16_p8, vreinterpret_s16_p16)
(vreinterpret_s16_p64, vreinterpretq_s16_f64, vreinterpretq_s16_s8)
(vreinterpretq_s16_s32, vreinterpretq_s16_s64, vreinterpretq_s16_f16)
(vreinterpretq_s16_f32, vreinterpretq_s16_u8, vreinterpretq_s16_u16)
(vreinterpretq_s16_u32, vreinterpretq_s16_u64, vreinterpretq_s16_p8)
(vreinterpretq_s16_p16, vreinterpretq_s16_p64, vreinterpretq_s16_p128)
(vreinterpret_s32_f16, vreinterpret_s32_f64, vreinterpret_s32_s8)
(vreinterpret_s32_s16, vreinterpret_s32_s64, vreinterpret_s32_f32)
(vreinterpret_s32_u8, vreinterpret_s32_u16, vreinterpret_s32_u32)
(vreinterpret_s32_u64, vreinterpret_s32_p8, vreinterpret_s32_p16)
(vreinterpret_s32_p64, vreinterpretq_s32_f64, vreinterpretq_s32_s8)
(vreinterpretq_s32_s16, vreinterpretq_s32_s64, vreinterpretq_s32_f16)
(vreinterpretq_s32_f32, vreinterpretq_s32_u8, vreinterpretq_s32_u16)
(vreinterpretq_s32_u32, vreinterpretq_s32_u64, vreinterpretq_s32_p8)
(vreinterpretq_s32_p16, vreinterpretq_s32_p64, vreinterpretq_s32_p128)
(vreinterpret_u8_f16, vreinterpret_u8_f64, vreinterpret_u8_s8)
(vreinterpret_u8_s16, vreinterpret_u8_s32, vreinterpret_u8_s64)
(vreinterpret_u8_f32, vreinterpret_u8_u16, vreinterpret_u8_u32)
(vreinterpret_u8_u64, vreinterpret_u8_p8, vreinterpret_u8_p16)
(vreinterpret_u8_p64, vreinterpretq_u8_f64, vreinterpretq_u8_s8)
(vreinterpretq_u8_s16, vreinterpretq_u8_s32, vreinterpretq_u8_s64)
(vreinterpretq_u8_f16, vreinterpretq_u8_f32, vreinterpretq_u8_u16)
(vreinterpretq_u8_u32, vreinterpretq_u8_u64, vreinterpretq_u8_p8)
(vreinterpretq_u8_p16, vreinterpretq_u8_p64, vreinterpretq_u8_p128)
(vreinterpret_u16_f16, vreinterpret_u16_f64, vreinterpret_u16_s8)
(vreinterpret_u16_s16, vreinterpret_u16_s32, vreinterpret_u16_s64)
(vreinterpret_u16_f32, vreinterpret_u16_u8, vreinterpret_u16_u32)
(vreinterpret_u16_u64, vreinterpret_u16_p8, vreinterpret_u16_p16)
(vreinterpret_u16_p64, vreinterpretq_u16_f64, vreinterpretq_u16_s8)
(vreinterpretq_u16_s16, vreinterpretq_u16_s32, vreinterpretq_u16_s64)
(vreinterpretq_u16_f16, vreinterpretq_u16_f32, vreinterpretq_u16_u8)
(vreinterpretq_u16_u32, vreinterpretq_u16_u64, vreinterpretq_u16_p8)
(vreinterpretq_u16_p16, vreinterpretq_u16_p64, vreinterpretq_u16_p128)
(vreinterpret_u32_f16, vreinterpret_u32_f64, vreinterpret_u32_s8)
(vreinterpret_u32_s16, vreinterpret_u32_s32, vreinterpret_u32_s64)
(vreinterpret_u32_f32, vreinterpret_u32_u8, vreinterpret_u32_u16)
(vreinterpret_u32_u64, vreinterpret_u32_p8, vreinterpret_u32_p16)
(vreinterpret_u32_p64, vreinterpretq_u32_f64, vreinterpretq_u32_s8)
(vreinterpretq_u32_s16, vreinterpretq_u32_s32, vreinterpretq_u32_s64)
(vreinterpretq_u32_f16, vreinterpretq_u32_f32, vreinterpretq_u32_u8)
(vreinterpretq_u32_u16, vreinterpretq_u32_u64, vreinterpretq_u32_p8)
(vreinterpretq_u32_p16, vreinterpretq_u32_p64, vreinterpretq_u32_p128)
(vreinterpretq_f64_p128, vreinterpretq_p128_f64, vreinterpret_bf16_u8)
(vreinterpret_bf16_u16, vreinterpret_bf16_u32, vreinterpret_bf16_u64)
(vreinterpret_bf16_s8, vreinterpret_bf16_s16, vreinterpret_bf16_s32)
(vreinterpret_bf16_s64, vreinterpret_bf16_p8, vreinterpret_bf16_p16)
(vreinterpret_bf16_p64, vreinterpret_bf16_f16, vreinterpret_bf16_f32)
(vreinterpret_bf16_f64, vreinterpretq_bf16_u8, vreinterpretq_bf16_u16)
(vreinterpretq_bf16_u32, vreinterpretq_bf16_u64, vreinterpretq_bf16_s8)
(vreinterpretq_bf16_s16, vreinterpretq_bf16_s32, vreinterpretq_bf16_s64)
(vreinterpretq_bf16_p8, vreinterpretq_bf16_p16, vreinterpretq_bf16_p64)
(vreinterpretq_bf16_p128, vreinterpretq_bf16_f16)
(vreinterpretq_bf16_f32, vreinterpretq_bf16_f64, vreinterpret_s8_bf16)
(vreinterpret_s16_bf16, vreinterpret_s32_bf16, vreinterpret_s64_bf16)
(vreinterpret_u8_bf16, vreinterpret_u16_bf16, vreinterpret_u32_bf16)
(vreinterpret_u64_bf16, vreinterpret_f16_bf16, vreinterpret_f32_bf16)
(vreinterpret_f64_bf16, vreinterpret_p8_bf16, vreinterpret_p16_bf16)
(vreinterpret_p64_bf16, vreinterpretq_s8_bf16, vreinterpretq_s16_bf16)
(vreinterpretq_s32_bf16, vreinterpretq_s64_bf16, vreinterpretq_u8_bf16)
(vreinterpretq_u16_bf16, vreinterpretq_u32_bf16, vreinterpretq_u64_bf16)
(vreinterpretq_f16_bf16, vreinterpretq_f32_bf16, vreinterpretq_f64_bf16)
(vreinterpretq_p8_bf16, vreinterpretq_p16_bf16, vreinterpretq_p64_bf16)
(vreinterpretq_p128_bf16): Delete
Andrew Carlotti [Thu, 21 Jul 2022 16:18:43 +0000 (17:18 +0100)]
aarch64: Consolidate simd type lookup functions
There were several similarly-named functions, which each built or looked up an
operand type using a different subset of valid modes or qualifiers.
This change provides a single function to return operand types, which can
additionally handle const and pointer qualifiers. For clarity, the existing
functionality is kept in separate helper functions.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.cc
(aarch64_simd_builtin_std_type): Rename to...
(aarch64_int_or_fp_type): ...this, and allow irrelevant qualifiers.
(aarch64_lookup_simd_builtin_type): Rename to...
(aarch64_simd_builtin_type): ...this. Add const/pointer
support, and extract table lookup to...
(aarch64_lookup_simd_type_in_table): ...this function.
(aarch64_init_crc32_builtins): Update to use aarch64_simd_builtin_type.
(aarch64_init_fcmla_laneq_builtins): Ditto.
(aarch64_init_simd_builtin_functions): Ditto.
Andrew Carlotti [Thu, 21 Jul 2022 16:07:23 +0000 (17:07 +0100)]
aarch64: Lower vcombine to GIMPLE
This lowers vcombine intrinsics to a GIMPLE vector constructor, which enables
better optimisation during GIMPLE passes.
gcc/
* config/aarch64/aarch64-builtins.cc
(aarch64_general_gimple_fold_builtin): Add combine.
gcc/testsuite/
* gcc.target/aarch64/advsimd-intrinsics/combine.c:
New test.
Richard Biener [Mon, 25 Jul 2022 15:24:57 +0000 (17:24 +0200)]
tree-optimization/106189 - avoid division by zero exception
The diagnostic code can end up with zero sized array elements
with T[][0] and the wide-int code nicely avoids exceptions when
dividing by zero in one codepath but not in another. The following
fixes the exception by using wide-int in both paths.
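To illustrate the hazard only (this is not the code from the fix, which switches the division to offset_ints), here is a minimal C sketch. It assumes the GNU C zero-length array extension; the names zero_elt, element_index and demo_zero_elt are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* GNU C extension: an array type with zero elements has size 0, so the
   element type of T[][0] is a zero-sized object.  */
typedef int zero_elt[0];

/* Hypothetical helper: convert a byte offset into an element index.
   A plain integer division by ELT_SIZE would trap when ELT_SIZE is 0,
   so that case is rejected up front.  */
bool element_index(size_t byte_offset, size_t elt_size, size_t *index)
{
  if (elt_size == 0)
    return false;               /* zero-sized element: no meaningful index */
  *index = byte_offset / elt_size;
  return true;
}

void demo_zero_elt(void)
{
  size_t idx = 0;
  assert(sizeof(zero_elt) == 0);
  assert(!element_index(8, sizeof(zero_elt), &idx));
  assert(element_index(8, sizeof(int), &idx) && idx == 8 / sizeof(int));
}
```

The guard is a stand-in for the wide-int behaviour described above; it only shows why an unguarded native division is a problem for such types.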
PR tree-optimization/106189
* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref):
Divide using offset_ints.
* gcc.dg/pr106189.c: New testcase.
Lulu Cheng [Thu, 21 Jul 2022 03:04:08 +0000 (11:04 +0800)]
LoongArch: Support split symbol.
Add the compilation option '-mexplicit-relocs'. When '-mexplicit-relocs' is enabled,
the symbolic address load instruction 'la.*' is split into two instructions.
This compilation option is enabled by default.
gcc/ChangeLog:
* common/config/loongarch/loongarch-common.cc:
Enable '-fsection-anchors' at -O1 and higher optimization levels.
* config/loongarch/genopts/loongarch.opt.in: Add new option
'-mexplicit-relocs', and enable by default.
* config/loongarch/loongarch-protos.h (loongarch_split_move_insn_p):
Delete function declaration.
(loongarch_split_move_insn): Delete function declaration.
(loongarch_split_symbol_type): Add function declaration.
* config/loongarch/loongarch.cc (enum loongarch_address_type):
Add new address type 'ADDRESS_LO_SUM'.
(loongarch_classify_symbolic_expression): New function definitions.
Classify the base of symbolic expression X, given that X appears in
context CONTEXT.
(loongarch_symbol_insns): Add a judgment condition TARGET_EXPLICIT_RELOCS.
(loongarch_split_symbol_type): New function definitions.
Determines whether the symbol load should be split into two instructions.
(loongarch_valid_lo_sum_p): New function definitions.
Return true if a LO_SUM can address a value of mode MODE when the LO_SUM
symbol has type SYMBOL_TYPE.
(loongarch_classify_address): Add handling of 'LO_SUM'.
(loongarch_address_insns): Add handling of 'ADDRESS_LO_SUM'.
(loongarch_signed_immediate_p): Sort code.
(loongarch_12bit_offset_address_p): Return true if address type is ADDRESS_LO_SUM.
(loongarch_const_insns): Add handling of 'HIGH'.
(loongarch_split_move_insn_p): Add the static attribute to the function.
(loongarch_emit_set): New function definitions.
(loongarch_call_tls_get_addr): Add symbol handling when defining TARGET_EXPLICIT_RELOCS.
(loongarch_legitimize_tls_address): Add symbol handling when defining the
TARGET_EXPLICIT_RELOCS macro.
(loongarch_split_symbol): New function definitions. Split symbol.
(loongarch_legitimize_address): Add code to see if the address can be split into a high part
and a LO_SUM.
(loongarch_legitimize_const_move): Add code to split moves of symbolic constants into
high and low.
(loongarch_split_move_insn): Delete function definitions.
(loongarch_output_move): Add support for HIGH and LO_SUM.
(loongarch_print_operand_reloc): New function definitions.
Print symbolic operand OP, which is part of a HIGH or LO_SUM in context CONTEXT.
(loongarch_memmodel_needs_release_fence): Sort code.
(loongarch_print_operand): Rearrange alphabetical order and add H and L to support HIGH
and LOW output.
(loongarch_print_operand_address): Add handling of 'ADDRESS_LO_SUM'.
(TARGET_MIN_ANCHOR_OFFSET): Define macro to -IMM_REACH/2.
(TARGET_MAX_ANCHOR_OFFSET): Define macro to IMM_REACH/2-1.
* config/loongarch/loongarch.md (movti): Delete the template.
(*movti): Delete the template.
(movtf): Delete the template.
(*movtf): Delete the template.
(*low<mode>): New template of normal symbol low address.
(@tls_low<mode>): New template of tls symbol low address.
(@ld_from_got<mode>): New template load address from got table.
(@ori_l_lo12<mode>): New template.
* config/loongarch/loongarch.opt: Update from loongarch.opt.in.
* config/loongarch/predicates.md: Add support for symbol_type HIGH.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/func-call-1.c: Add build option '-mno-explicit-relocs'.
* gcc.target/loongarch/func-call-2.c: Add build option '-mno-explicit-relocs'.
* gcc.target/loongarch/func-call-3.c: Add build option '-mno-explicit-relocs'.
* gcc.target/loongarch/func-call-4.c: Add build option '-mno-explicit-relocs'.
* gcc.target/loongarch/func-call-5.c: New test.
* gcc.target/loongarch/func-call-6.c: New test.
* gcc.target/loongarch/func-call-7.c: New test.
* gcc.target/loongarch/func-call-8.c: New test.
* gcc.target/loongarch/relocs-symbol-noaddend.c: New test.
Lulu Cheng [Thu, 21 Jul 2022 02:32:51 +0000 (10:32 +0800)]
LoongArch: Subdivision symbol type, add SYMBOL_PCREL support.
1. Remove cModel type support other than normal.
2. The method for calling global functions changes from 'la.global + jirl' to 'bl'
when compiled with '-fplt'.
gcc/ChangeLog:
* config/loongarch/constraints.md (a): Delete the constraint.
(b): A constant call not local address.
(h): Delete the constraint.
(t): Delete the constraint.
* config/loongarch/loongarch-opts.cc (loongarch_config_target):
Remove cModel type support other than normal.
* config/loongarch/loongarch-protos.h (enum loongarch_symbol_type):
Add new symbol type 'SYMBOL_PCREL', 'SYMBOL_TLS_IE' and 'SYMBOL_TLS_LE'.
(loongarch_split_symbol): Delete useless function declarations.
(loongarch_split_symbol_type): Delete useless function declarations.
* config/loongarch/loongarch.cc (enum loongarch_address_type):
Delete unnecessary comment information.
(loongarch_symbol_binds_local_p): Modified the judgment order of label
and symbol.
(loongarch_classify_symbol): Return symbol type. If the symbol is a label
or a local symbol, return SYMBOL_PCREL. If it is a TLS symbol,
return SYMBOL_TLS. If it is a non-local symbol, return SYMBOL_GOT_DISP.
(loongarch_symbolic_constant_p): Add handling of 'SYMBOL_TLS_IE'
'SYMBOL_TLS_LE' and 'SYMBOL_PCREL'.
(loongarch_symbol_insns): Add handling of 'SYMBOL_TLS_IE' 'SYMBOL_TLS_LE'
and 'SYMBOL_PCREL'.
(loongarch_address_insns): Sort code.
(loongarch_12bit_offset_address_p): Sort code.
(loongarch_14bit_shifted_offset_address_p): Sort code.
(loongarch_call_tls_get_addr): Sort code.
(loongarch_legitimize_tls_address): Sort code.
(loongarch_output_move): Remove schema support for cmodel other than normal.
(loongarch_memmodel_needs_release_fence): Sort code.
(loongarch_print_operand): Sort code.
* config/loongarch/loongarch.h (LARCH_U12BIT_OFFSET_P):
Rename to LARCH_12BIT_OFFSET_P.
(LARCH_12BIT_OFFSET_P): New macro.
* config/loongarch/loongarch.md: Reimplement the function call. Remove schema
support for cmodel other than normal.
* config/loongarch/predicates.md (is_const_call_weak_symbol): Delete this predicate.
(is_const_call_plt_symbol): Delete this predicate.
(is_const_call_global_noplt_symbol): Delete this predicate.
(is_const_call_no_local_symbol): New predicate, determines whether it is a local
symbol or label.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/func-call-1.c: New test.
* gcc.target/loongarch/func-call-2.c: New test.
* gcc.target/loongarch/func-call-3.c: New test.
* gcc.target/loongarch/func-call-4.c: New test.
Kewen Lin [Tue, 26 Jul 2022 02:29:14 +0000 (21:29 -0500)]
rs6000: Preserve REG_EH_REGION when replacing load/store [PR106091]
As the test case in PR106091 shows, the rs6000-specific swaps pass
doesn't preserve the REG_EH_REGION reg_note when replacing
a load insn at the end of a basic block, which causes the
flow info verification to fail unexpectedly. Since a memory
reference rtx may trap, this patch ensures we copy the
REG_EH_REGION reg_note when replacing a swapped aligned load
or store.
PR target/106091
gcc/ChangeLog:
* config/rs6000/rs6000-p8swap.cc (replace_swapped_aligned_store): Copy
REG_EH_REGION when replacing one store insn having it.
(replace_swapped_aligned_load): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr106091.c: New test.
GCC Administrator [Tue, 26 Jul 2022 00:16:29 +0000 (00:16 +0000)]
Daily bump.
Jason Merrill [Mon, 25 Jul 2022 15:13:31 +0000 (11:13 -0400)]
c++: aggregate prvalue as for range [PR106230]
Since my PR94041 work on temporary lifetime in aggregate initialization, we
end up calling build_vec_init to initialize the reference-extended temporary
for the artificial __for_range variable. And build_vec_init uses
finish_for_stmt to implement its loop. That function assumes that if
__for_range is in current_binding_level, we're finishing a range-for, and we
should fix up the variable as it goes out of scope. But when called from
build_vec_init we aren't finishing a range-for, and do_poplevel doesn't
remove the variable from scope because stmts_are_full_exprs_p is false. So
let's check that here as well, and leave the DECL_NAME alone.
PR c++/106230
gcc/cp/ChangeLog:
* semantics.cc (finish_for_stmt): Check stmts_are_full_exprs_p.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/range-for38.C: New test.
Aldy Hernandez [Mon, 25 Jul 2022 14:42:00 +0000 (16:42 +0200)]
Dispatch code for floating point range ops.
This modifies the range-op dispatch code to handle floats. Also
provided are the stub routines for the floating point range-ops, as we
need something to dispatch to ;-).
I am not ecstatic about the dispatch code, but there's no getting
around having to switch on the tree code and type in some manner. All
the other alternatives I played with ended up being slower, or harder
to maintain. At least, this one is self-contained in the
range_op_handler API, and less than 0.16% slower for VRP in our
benchmarks.
Tested on x86-64 Linux.
gcc/ChangeLog:
* Makefile.in (OBJS): Add range-op-float.o.
* range-op.cc (get_float_handler): New.
(range_op_handler::range_op_handler): Save code and type for
delayed querying.
(range_op_handler::operator bool): Move from header file, and
add support for floats.
(range_op_handler::fold_range): Add support for floats.
(range_op_handler::op1_range): Same.
(range_op_handler::op2_range): Same.
(range_op_handler::lhs_op1_relation): Same.
(range_op_handler::lhs_op2_relation): Same.
(range_op_handler::op1_op2_relation): Same.
* range-op.h (class range_operator_float): New.
(class floating_op_table): New.
* value-query.cc (range_query::get_tree_range): Add case for
REAL_CST.
* range-op-float.cc: New file.
Martin Liska [Mon, 25 Jul 2022 06:38:37 +0000 (08:38 +0200)]
analyzer: convert tests with dos2unix
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/fd-2.c: Convert Windows endlines to Unix
style.
* gcc.dg/analyzer/fd-3.c: Likewise.
* gcc.dg/analyzer/fd-4.c: Likewise.
* gcc.dg/analyzer/fd-5.c: Likewise.
* c-c++-common/attr-fd.c: Likewise.
Martin Liska [Mon, 25 Jul 2022 06:10:01 +0000 (08:10 +0200)]
analyzer: fix coding style in sm-fd.cc
gcc/analyzer/ChangeLog:
* sm-fd.cc: Run dos2unix and fix coding style issues.
Roger Sayle [Mon, 25 Jul 2022 16:33:48 +0000 (17:33 +0100)]
PR target/91681: zero_extendditi2 pattern for more optimizations on x86.
Technically, PR target/91681 has already been resolved; we now recognize the
highpart multiplication at the tree-level, we no longer use the stack, and
we currently generate the same number of instructions as LLVM. However, it
is still possible to do better: the current x86_64 code generated for a double
word addition of a zero extended operand looks like:
xorl %r11d, %r11d
addq %r10, %rax
adcq %r11, %rdx
when it's possible (as LLVM does) to use an immediate constant:
addq %r10, %rax
adcq $0, %rdx
This is implemented by introducing a zero_extendditi2 pattern,
for zero extension from DImode to TImode on TARGET_64BIT that is
split after reload. With zero extension now visible to combine,
we add two new define_insn_and_split that add/subtract a zero
extended operand in double word mode. These apply to both 32-bit
and 64-bit code generation, to produce adc $0 and sbb $0.
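As a rough illustration (not taken from the patch or its test cases), the following C sketch shows the kind of source that performs a double word addition or subtraction of a zero extended operand. The names add_zext, sub_zext, make_u128 and demo_zext are hypothetical, and unsigned __int128 is assumed to be available (GCC on a 64-bit target). Whether a given compiler actually emits adc $0 or sbb $0 for it depends on the version and flags.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;

/* Build a 128-bit value from its high and low 64-bit halves.  */
static u128 make_u128(uint64_t hi, uint64_t lo)
{
  return ((u128) hi << 64) | lo;
}

/* Y is zero extended from 64 to 128 bits before the addition, so only
   a carry can reach the high word of the result.  */
u128 add_zext(u128 x, uint64_t y)
{
  return x + (u128) y;
}

/* Likewise for subtraction: only a borrow reaches the high word.  */
u128 sub_zext(u128 x, uint64_t y)
{
  return x - (u128) y;
}

void demo_zext(void)
{
  /* The carry out of the low word increments the high word.  */
  assert(add_zext(make_u128(1, UINT64_MAX), 1) == make_u128(2, 0));
  /* The borrow from the low word decrements the high word.  */
  assert(sub_zext(make_u128(2, 0), 1) == make_u128(1, UINT64_MAX));
}
```

Since the high half of the zero extended operand is known to be zero, the high word only needs the carry or borrow, which is what makes the immediate form possible.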
One consequence of this is that these new patterns interfere with
the optimization that recognizes DW:DI = (HI:SI<<32)+LO:SI as a pair
of register moves, or more accurately the combine splitter no longer
triggers as we're now converting two instructions into two instructions
(not three instructions into two instructions). This is easily
repaired (and extended to handle TImode) by changing from a pair
of define_split (that handle operand commutativity) to a set of
four define_insn_and_split (again to handle operand commutativity).
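For reference, the DW = (HI<<32)+LO idiom discussed above can be written in C roughly as follows. This is a sketch with hypothetical function names (concat_add, concat_or, demo_concat), not one of the patch's test cases.

```c
#include <assert.h>
#include <stdint.h>

/* Concatenate two 32-bit words into one 64-bit value using addition.  */
uint64_t concat_add(uint32_t hi, uint32_t lo)
{
  return ((uint64_t) hi << 32) + lo;
}

/* The same value using bitwise or: the shifted high part and the low
   part have no bits in common, so + and | give identical results.  */
uint64_t concat_or(uint32_t hi, uint32_t lo)
{
  return ((uint64_t) hi << 32) | lo;
}

void demo_concat(void)
{
  assert(concat_add(0x12345678u, 0x9abcdef0u) == 0x123456789abcdef0ull);
  assert(concat_or(0x12345678u, 0x9abcdef0u) == 0x123456789abcdef0ull);
}
```

On a 32-bit target the result occupies a register pair, so ideally nothing more than two register moves is needed to set the low and high parts.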
2022-07-25 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/91681
* config/i386/i386-expand.cc (split_double_concat): A new helper
function for setting a double word value from two word values.
* config/i386/i386-protos.h (split_double_concat): Prototype here.
* config/i386/i386.md (zero_extendditi2): New define_insn_and_split.
(*add<dwi>3_doubleword_zext): New define_insn_and_split.
(*sub<dwi>3_doubleword_zext): New define_insn_and_split.
(*concat<mode><dwi>3_1): New define_insn_and_split replacing
previous define_split for implementing DST = (HI<<32)|LO as
pair of move instructions, setting lopart and hipart.
(*concat<mode><dwi>3_2): Likewise.
(*concat<mode><dwi>3_3): Likewise, where HI is zero_extended.
(*concat<mode><dwi>3_4): Likewise, where HI is zero_extended.
gcc/testsuite/ChangeLog
PR target/91681
* g++.target/i386/pr91681.C: New test case (from the PR).
* gcc.target/i386/pr91681-1.c: New int128 test case.
* gcc.target/i386/pr91681-2.c: Likewise.
* gcc.target/i386/pr91681-3.c: Likewise, but for ia32.
Aldy Hernandez [Mon, 25 Jul 2022 13:58:04 +0000 (15:58 +0200)]
[PR middle-end/106432] Gracefully handle unsupported type in range_on_edge
A cleaner approach to fix this PR has been suggested by Andrew, which
is to just return false on range_on_edge for unsupported range types.
Tested on x86-64 Linux.
PR middle-end/106432
gcc/ChangeLog:
* gimple-range.cc (gimple_ranger::range_on_edge): Return false
when the result range type is unsupported.
Jason Merrill [Mon, 25 Jul 2022 03:26:59 +0000 (23:26 -0400)]
c++: -Woverloaded-virtual false positive [PR87729]
My attempt to shortcut unnecessary checking after finding a match was
also wrong for multiple inheritance, so let's give up on it.
PR c++/87729
gcc/cp/ChangeLog:
* class.cc (warn_hidden): Remove shortcut.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Woverloaded-virt4.C: New test.
Sebastian Huber [Fri, 22 Jul 2022 12:09:20 +0000 (14:09 +0200)]
RTEMS: Do not define _GNU_SOURCE by default
gcc/ChangeLog:
* config/rs6000/rtems.h (CPLUSPLUS_CPP_SPEC): Undef.
Richard Biener [Mon, 25 Jul 2022 10:10:48 +0000 (12:10 +0200)]
middle-end/106414 - fix mistake in ~(x ^ y) -> x == y pattern
When compares are integer typed the inversion with ~ isn't properly
preserved by the equality comparison even when converting the
result properly. The following fixes this by restricting the
input precisions accordingly.
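A small C sketch of why the precision matters (the function names are hypothetical and this is not one of the new test cases): with single-bit values the two forms agree, but with int-typed 0/1 values the bitwise inversion also flips all the upper bits.

```c
#include <assert.h>

/* Single-bit view: inputs are 0 or 1 and the result is truncated to
   one bit, which models a 1-bit precision type.  */
unsigned not_xor_1bit(unsigned x, unsigned y)
{
  return ~(x ^ y) & 1u;
}

/* The same expression evaluated at full int precision.  */
int not_xor_int(int x, int y)
{
  return ~(x ^ y);
}

void demo_not_xor(void)
{
  /* At one bit of precision, ~(x ^ y) and x == y agree everywhere.  */
  for (unsigned x = 0; x < 2; x++)
    for (unsigned y = 0; y < 2; y++)
      assert(not_xor_1bit(x, y) == (unsigned) (x == y));

  /* At int precision they differ: ~(0 ^ 0) is -1 and ~(0 ^ 1) is -2,
     while the comparisons yield 1 and 0.  */
  assert(not_xor_int(0, 0) == -1 && (0 == 0) == 1);
  assert(not_xor_int(0, 1) == -2 && (0 == 1) == 0);
}
```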
PR middle-end/106414
* match.pd (~(x ^ y) -> x == y): Restrict to single bit
precision types.
* gcc.dg/torture/pr106414-1.c: New testcase.
* gcc.dg/torture/pr106414-2.c: Likewise.