qingzhe huang [Fri, 1 Oct 2021 14:46:35 +0000 (10:46 -0400)]
c++: cv-qualified ref introduced by typedef [PR101783]
The root cause of this bug is that a reference with cv-qualifiers is
always treated as an error, which is recorded in the variable "bad_quals".
However, this is not correct when the qualifiers come from a typedef. To
quote [dcl.ref]/1:
"Cv-qualified references are ill-formed except when the cv-qualifiers
are introduced through the use of a typedef-name ([dcl.typedef],
[temp.param]) or decltype-specifier ([dcl.type.decltype]),
in which case the cv-qualifiers are ignored."
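The quoted rule can be illustrated with a short sketch. This is not the new test case (the contents of g++.dg/parse/pr101783.C are not reproduced here); the names IntRef, bind_through_typedef and typedef_ref_demo are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

typedef int& IntRef;  // hypothetical typedef-name, for illustration only

// Per [dcl.ref]/1 the const applied through the typedef-name is ignored,
// so "const IntRef" is still plain "int&".
static_assert(std::is_same<const IntRef, int&>::value,
              "cv-qualifier introduced via typedef is ignored");

// Writing "int& const r" directly would be ill-formed; through the
// typedef-name the same qualifier is accepted and dropped.
inline int bind_through_typedef(int value) {
    int local = value;
    const IntRef r = local;  // r has type int&
    r += 1;                  // so modifying through r is allowed
    return local;
}

inline void typedef_ref_demo() {
    assert(bind_through_typedef(41) == 42);
}
```

A compiler that rejects the declaration of r shows the behaviour this commit addresses.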
2021-09-30 qingzhe huang <nickhuang99@hotmail.com>
gcc/cp/ChangeLog:
PR c++/101783
* tree.c (cp_build_qualified_type_real): Exclude typedef from
error.
gcc/testsuite/ChangeLog:
PR c++/101783
* g++.dg/parse/pr101783.C: New test.
Jonathan Wakely [Fri, 1 Oct 2021 13:06:42 +0000 (14:06 +0100)]
libstdc++: Define basic_regex::multiline for non-strict modes
The regex_constants::multiline constant is defined for non-strict C++11
and C++14 modes, on the basis that the feature is a DR (even though it
was really a new feature addition to C++17 and probably shouldn't have
gone through the issues list).
This makes the basic_regex::multiline constant defined consistently with
the regex_constants::multiline one.
For strict C++11 and C++14 mode we don't define them, because multiline
is not a reserved name in those standards.
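As a rough illustration of what the constant controls, the sketch below compares a search with and without the flag. It assumes C++17, or a non-strict mode such as -std=gnu++11 with a libstdc++ that includes this change; line_starts_with_b and multiline_demo are hypothetical helper names.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if "^b" matches somewhere in text under the given flags.
inline bool line_starts_with_b(const std::string& text,
                               std::regex::flag_type flags) {
    const std::regex re("^b", flags);
    return std::regex_search(text, re);
}

inline void multiline_demo() {
    const std::string text = "a\nb";
    // Without multiline, ^ only matches at the very start of the input.
    assert(!line_starts_with_b(text, std::regex::ECMAScript));
    // With multiline, ^ also matches right after a line terminator.
    assert(line_starts_with_b(
        text, std::regex::ECMAScript | std::regex::multiline));
}
```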
libstdc++-v3/ChangeLog:
* include/bits/regex.h (basic_regex::multiline): Define for
non-strict C++11 and C++14 modes.
* include/bits/regex_constants.h (regex_constants::multiline):
Add _GLIBCXX_RESOLVE_LIB_DEFECTS comment.
Jonathan Wakely [Fri, 1 Oct 2021 11:55:53 +0000 (12:55 +0100)]
libstdc++: Add missing header to test
We need to include <iterator> (or one of the containers) to get a
definition for std::begin.
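A minimal sketch of the pattern, not the test itself: <algorithm> alone is not guaranteed to declare std::begin, so code that applies it to an array should include <iterator> explicitly. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>  // declares std::begin and std::end for arrays

// Compares two small arrays using iterators obtained via std::begin/std::end.
inline bool arrays_are_permutations() {
    const int a[] = {1, 2, 3};
    const int b[] = {3, 1, 2};
    return std::is_permutation(std::begin(a), std::end(a), std::begin(b));
}

inline void begin_demo() {
    assert(arrays_are_permutations());
}
```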
libstdc++-v3/ChangeLog:
* testsuite/25_algorithms/is_permutation/2.cc: Include <iterator>.
Jonathan Wakely [Thu, 30 Sep 2021 13:39:36 +0000 (14:39 +0100)]
libstdc++: Add noexcept to istream_iterator and ostream_iterator
libstdc++-v3/ChangeLog:
* include/bits/stream_iterator.h (istream_iterator): Add
noexcept to constructors and non-throwing member functions and
friend functions.
(ostream_iterator): Likewise.
Jonathan Wakely [Thu, 30 Sep 2021 10:25:15 +0000 (11:25 +0100)]
libstdc++: Fix _ForwardIteratorConcept for __gnu_debug::vector<bool>
The recent changes to the _GLIBCXX_CONCEPT_CHECKS checks for forward
iterators don't work for vector<bool> iterators in debug mode, because
the _Safe_iterator specializations don't match the special cases I added
for _Bit_iterator and _Bit_const_iterator.
This refactors the _ForwardIteratorReferenceConcept class template to
identify vector<bool> iterators using a new trait, which also works for
debug iterators.
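The reason vector<bool> iterators need special handling at all can be shown at compile time: their reference type is a proxy object rather than bool&. The trait below is only a sketch of the idea and is not libstdc++'s _Is_vector_bool_iterator; the name is_vector_bool_iterator_sketch is hypothetical.

```cpp
#include <iterator>
#include <type_traits>
#include <vector>

// For an ordinary container, a mutable iterator's reference type is T&.
static_assert(
    std::is_same<std::iterator_traits<std::vector<int>::iterator>::reference,
                 int&>::value,
    "vector<int> iterators yield int&");

// vector<bool> packs bits, so its iterator yields a proxy instead of bool&.
static_assert(
    !std::is_same<std::iterator_traits<std::vector<bool>::iterator>::reference,
                  bool&>::value,
    "vector<bool> iterators do not yield bool&");

// Hypothetical trait: identify the iterators through the container's own
// member typedefs, so whatever type the current mode uses is recognised.
template <typename It>
struct is_vector_bool_iterator_sketch
    : std::integral_constant<
          bool,
          std::is_same<It, std::vector<bool>::iterator>::value ||
              std::is_same<It, std::vector<bool>::const_iterator>::value> {};
```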
libstdc++-v3/ChangeLog:
* include/bits/boost_concept_check.h (_Is_vector_bool_iterator):
New trait to identify vector<bool> iterators, including debug
ones.
(_ForwardIteratorReferenceConcept): Add default template
argument using _Is_vector_bool_iterator and use it in partial
specialization for the vector<bool> cases.
(_Mutable_ForwardIteratorReferenceConcept): Likewise.
* testsuite/24_iterators/operations/prev_neg.cc: Adjust dg-error
line number.
Jonathan Wakely [Wed, 29 Sep 2021 19:46:55 +0000 (20:46 +0100)]
libstdc++: Replace try-catch in std::list::merge to avoid O(N) size
The current std::list::merge code calls size() before starting to merge
any elements, so that the _M_size members can be updated after the merge
finishes. The work is done in a try-block so that the sizes can still be
updated in an exception handler if any element comparison throws.
The _M_size members only exist for the cxx11 ABI, so the initial call to
size() and the try-catch are only needed for that ABI. For the old ABI
the size() call performs an O(N) list traversal to get a value that
isn't even used, and catching exceptions just to rethrow them isn't
needed either.
This refactors the merge functions to remove the try-catch block and use
an RAII type instead. For the cxx11 ABI that type's destructor updates
the list sizes, and for the old ABI it's a no-op.
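The scope-guard idea can be sketched independently of std::list. The code below is a simplified model, not the _Finalize_merge implementation: CountedList, FinalizeMergeSketch and merge_counts are hypothetical, and a thrown std::runtime_error stands in for a throwing element comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

struct CountedList {
    std::size_t nodes;        // nodes actually linked into the list
    std::size_t cached_size;  // cached count, playing the role of _M_size
};

// On destruction, brings both cached sizes in line with the nodes moved so
// far, whether the scope exits normally or through an exception.
class FinalizeMergeSketch {
public:
    FinalizeMergeSketch(CountedList& dest, CountedList& src)
        : dest_(dest), src_(src), moved_(0) {}
    FinalizeMergeSketch(const FinalizeMergeSketch&) = delete;
    FinalizeMergeSketch& operator=(const FinalizeMergeSketch&) = delete;
    ~FinalizeMergeSketch() {
        dest_.cached_size += moved_;
        src_.cached_size -= moved_;
    }
    void note_moved() { ++moved_; }

private:
    CountedList& dest_;
    CountedList& src_;
    std::size_t moved_;
};

// Moves nodes from src to dest one at a time. Throws before moving node
// number fail_at to simulate a comparison that throws part-way through.
inline void merge_counts(CountedList& dest, CountedList& src,
                         std::size_t fail_at) {
    FinalizeMergeSketch guard(dest, src);
    for (std::size_t i = 0; src.nodes > 0; ++i) {
        if (i == fail_at)
            throw std::runtime_error("comparison failed");
        --src.nodes;
        ++dest.nodes;
        guard.note_moved();
    }
}

inline void finalize_merge_demo() {
    CountedList dest = {2, 2};
    CountedList src = {3, 3};
    try {
        merge_counts(dest, src, 2);  // two nodes move, then the throw
    } catch (const std::runtime_error&) {
    }
    assert(dest.nodes == 4 && dest.cached_size == 4);
    assert(src.nodes == 1 && src.cached_size == 1);
}
```

No try-catch is needed inside merge_counts, and no size is computed up front; a build without cached sizes could give the guard an empty destructor.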
libstdc++-v3/ChangeLog:
* include/bits/list.tcc (list::merge): Remove call to size() and
try-catch block. Use _Finalize_merge instead.
* include/bits/stl_list.h (list::_Finalize_merge): New
scope guard type to update _M_size members after a merge.
Martin Liska [Fri, 1 Oct 2021 13:37:59 +0000 (15:37 +0200)]
options: fix concat of options.
PR target/102552
gcc/c-family/ChangeLog:
* c-common.c (parse_optimize_options): decoded_options[0] is
used for program name, so merged_decoded_options should also
respect that.
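The convention described in this entry can be sketched with plain strings. This is not the c-common.c code: merge_options is a hypothetical helper, and it assumes both inputs keep the program name in slot 0, as the entry says of decoded_options[0].

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Concatenates two option arrays whose slot 0 holds the program name.
// The merged array keeps a single program name in slot 0, followed by the
// real options of both inputs in order.
inline std::vector<std::string> merge_options(
    const std::vector<std::string>& cmdline,
    const std::vector<std::string>& attr) {
    std::vector<std::string> merged;
    merged.push_back(cmdline.empty() ? std::string() : cmdline[0]);
    for (std::size_t i = 1; i < cmdline.size(); ++i)
        merged.push_back(cmdline[i]);
    for (std::size_t i = 1; i < attr.size(); ++i)
        merged.push_back(attr[i]);
    return merged;
}

inline void merge_options_demo() {
    const std::vector<std::string> cmdline = {"cc1", "-O2"};
    const std::vector<std::string> attr = {"cc1", "-ffast-math"};
    const std::vector<std::string> merged = merge_options(cmdline, attr);
    assert(merged.size() == 3);
    assert(merged[0] == "cc1");
    assert(merged[1] == "-O2");
    assert(merged[2] == "-ffast-math");
}
```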
Przemyslaw Wirkus [Fri, 1 Oct 2021 12:49:51 +0000 (13:49 +0100)]
aarch64: fix AARCH64_FL_V9 flag value
This patch fixes the AARCH64_FL_V9 flag value, which was set incorrectly
due to a merge error.
gcc/ChangeLog:
* config/aarch64/aarch64.h (AARCH64_FL_V9): Update value.
Aldy Hernandez [Fri, 1 Oct 2021 10:27:55 +0000 (12:27 +0200)]
Remove shadowed oracle field.
The m_oracle field in the path solver was shadowing the base class.
This was causing subtle problems while calculating outgoing edges
between blocks, because the query object being passed did not have an
oracle set.
This should further improve our solving ability.
Tested on x86-64 Linux.
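The kind of problem a shadowed field causes can be reproduced in a few lines. The classes below are hypothetical stand-ins, not the range query classes from gimple-range-path.h.

```cpp
#include <cassert>

struct Oracle { int id; };

class RangeQueryBase {
public:
    RangeQueryBase() : m_oracle(nullptr) {}
    // Base-class code only ever sees the base-class field.
    bool has_oracle() const { return m_oracle != nullptr; }

protected:
    Oracle* m_oracle;
};

// Bug pattern: the derived class declares its own m_oracle, which hides the
// inherited one. The constructor sets the derived copy only.
class ShadowingSolver : public RangeQueryBase {
public:
    explicit ShadowingSolver(Oracle* o) : m_oracle(o) {}
    Oracle* own_oracle() const { return m_oracle; }

private:
    Oracle* m_oracle;  // shadows RangeQueryBase::m_oracle
};

// Fix pattern: no second field; the inherited one is set instead.
class FixedSolver : public RangeQueryBase {
public:
    explicit FixedSolver(Oracle* o) { m_oracle = o; }
};

inline void shadow_demo() {
    Oracle o = {1};
    ShadowingSolver bad(&o);
    FixedSolver good(&o);
    assert(bad.own_oracle() == &o);  // the derived copy is set...
    assert(!bad.has_oracle());       // ...but the base class sees none
    assert(good.has_oracle());
}
```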
gcc/ChangeLog:
* gimple-range-path.cc (path_range_query::compute_ranges): Use
get_path_oracle.
* gimple-range-path.h (class path_range_query): Remove shadowed
m_oracle field.
(path_range_query::get_path_oracle): New.
Jakub Jelinek [Fri, 1 Oct 2021 12:27:32 +0000 (14:27 +0200)]
ubsan: Move INT_MIN / -1 instrumentation from -fsanitize=integer-divide-by-zero to -fsanitize=signed-integer-overflow [PR102515]
As noted by Richi, in clang INT_MIN / -1 is instrumented under
-fsanitize=signed-integer-overflow rather than
-fsanitize=integer-divide-by-zero as we did and doing it in the former
makes more sense, as it is overflow during division rather than division
by zero.
I've verified on godbolt that clang has behaved that way since around 3.2,
when the sanitizers were added.
Furthermore, we've been using
-f{,no-}sanitize-recover=integer-divide-by-zero to decide on the float
-fsanitize=float-divide-by-zero instrumentation _abort suffix.
The case where INT_MIN / -1 is instrumented by one sanitizer and
x / 0 by another one when both are enabled is slightly harder if
the -f{,no-}sanitize-recover={integer-divide-by-zero,signed-integer-overflow}
flags differ, then we need to emit both __ubsan_handle_divrem_overflow
and __ubsan_handle_divrem_overflow_abort calls guarded by their respective
checks rather than one guarded by check1 || check2.
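The two conditions being separated here can be written out as a small classifier. This is only an illustration of which sanitizer owns which case after the change; it is not the instrumentation emitted by ubsan_instrument_division, and DivCheck and classify_division are hypothetical names.

```cpp
#include <climits>

enum class DivCheck { Ok, DivideByZero, SignedOverflow };

// check1: divisor is zero        -> -fsanitize=integer-divide-by-zero
// check2: INT_MIN divided by -1  -> -fsanitize=signed-integer-overflow
constexpr DivCheck classify_division(int lhs, int rhs) {
    return rhs == 0 ? DivCheck::DivideByZero
           : (lhs == INT_MIN && rhs == -1) ? DivCheck::SignedOverflow
                                           : DivCheck::Ok;
}

static_assert(classify_division(7, 2) == DivCheck::Ok, "ordinary division");
static_assert(classify_division(7, 0) == DivCheck::DivideByZero, "x / 0");
static_assert(classify_division(INT_MIN, -1) == DivCheck::SignedOverflow,
              "INT_MIN / -1 overflows");
```

When the recover settings of the two sanitizers differ, each case needs its own guarded handler call, which is why a single check1 || check2 guard is not enough.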
2021-10-01 Jakub Jelinek <jakub@redhat.com>
Richard Biener <rguenther@suse.de>
PR sanitizer/102515
gcc/
* doc/invoke.texi (-fsanitize=integer-divide-by-zero): Remove
INT_MIN / -1 division detection from here ...
(-fsanitize=signed-integer-overflow): ... and add it here.
gcc/c-family/
* c-ubsan.c (ubsan_instrument_division): Check the right
flag_sanitize_recover bit, depending on which sanitization
is done. Sanitize INT_MIN / -1 under SANITIZE_SI_OVERFLOW
rather than SANITIZE_DIVIDE. If both SANITIZE_SI_OVERFLOW
and SANITIZE_DIVIDE are enabled, neither check is known
to be false and flag_sanitize_recover bits for those two
aren't the same, emit both __ubsan_handle_divrem_overflow
and __ubsan_handle_divrem_overflow_abort calls.
gcc/c/
* c-typeck.c (build_binary_op): Call ubsan_instrument_division
for division even for SANITIZE_SI_OVERFLOW.
gcc/cp/
* typeck.c (cp_build_binary_op): Call ubsan_instrument_division
for division even for SANITIZE_SI_OVERFLOW.
gcc/testsuite/
* c-c++-common/ubsan/div-by-zero-3.c: Use
-fsanitize=signed-integer-overflow instead of
-fsanitize=integer-divide-by-zero.
* c-c++-common/ubsan/div-by-zero-5.c: Likewise.
* c-c++-common/ubsan/div-by-zero-4.c: Likewise. Add
-fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/float-div-by-zero-2.c: New test.
* c-c++-common/ubsan/overflow-div-1.c: New test.
* c-c++-common/ubsan/overflow-div-2.c: New test.
* c-c++-common/ubsan/overflow-div-3.c: New test.
Kyrylo Tkachov [Fri, 1 Oct 2021 11:19:42 +0000 (12:19 +0100)]
aarch64: Fix cpymem-size.c test for ILP32
gcc/testsuite/
* gcc.target/aarch64/cpymem-size.c: Adjust scan for ilp32.
Przemyslaw Wirkus [Fri, 1 Oct 2021 09:06:45 +0000 (10:06 +0100)]
aarch64: add armv9-a to -march
gcc/ChangeLog:
* config/aarch64/aarch64-arches.def (AARCH64_ARCH): Added
armv9-a.
* config/aarch64/aarch64.h (AARCH64_FL_V9): New.
(AARCH64_FL_FOR_ARCH9): New flags for Armv9-A.
(AARCH64_ISA_V9): New ISA flag.
* doc/invoke.texi: Update docs.
Andrew Pinski [Fri, 1 Oct 2021 09:23:47 +0000 (09:23 +0000)]
Fix bb-slp-pr97709.c after computed goto change
It looks like I tested the change for bb-slp-pr97709.c on an
older tree which did not have the error message, so I
missed one more place where the change was needed.
Committed after testing to make sure the testcase passes
now.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/bb-slp-pr97709.c: Fix for computed goto
pointers.
Martin Liska [Wed, 2 Jun 2021 06:44:37 +0000 (08:44 +0200)]
Append target/optimize attr to the current cmdline.
gcc/c-family/ChangeLog:
* c-common.c (parse_optimize_options): Combine optimize
options with what was provided on the command line.
gcc/ChangeLog:
* toplev.c (toplev::main): Save decoded optimization options.
* toplev.h (save_opt_decoded_options): New.
* doc/extend.texi: Be more clear about optimize and target
attributes.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable fast math.
* gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise.
* gcc.target/i386/attr-optimize.c: New test.
Eric Botcazou [Fri, 1 Oct 2021 08:56:45 +0000 (10:56 +0200)]
Fix ICE with stack checking emulation at -O2
On bare-metal platforms, the Ada compiler emulates stack checking (it is
required by the language and tested by ACATS) in the runtime via the
stack_check_libfunc hook of the RTL middle-end. Calls to the function
are generated as libcalls but they now require a proper function type
at -O2 or above.
gcc/
* explow.c: Include langhooks.h.
(set_stack_check_libfunc): Build a proper function type.
Eric Botcazou [Fri, 1 Oct 2021 08:49:34 +0000 (10:49 +0200)]
Fix PR c++/64697 at -O1 or above
The BFD fix eliminates the link failure and working code is generated at
-O0, but _not_ when optimization is enabled because the optimizer changes:
movq .refptr._ZTH1s(%rip), %rax
testq %rax, %rax
je .L2
call _ZTH1s
into:
leaq _ZTH1s(%rip), %rax
testq %rax, %rax
je .L2
call _ZTH1s
and the leaq now also gets the relocation overflow. So the fix is to
teach legitimate_pic_address_disp_p to reject the transformation when
the symbol is an external weak function, which yields:
cmpq $0, .refptr._ZTH1s(%rip)
je .L2
call _ZTH1s
and the cmpq keeps a relocation that does not overflow.
gcc/
PR c++/64697
* config/i386/i386.c (legitimate_pic_address_disp_p): For PE-COFF do
not return true for external weak function symbols in medium model.
Jakub Jelinek [Fri, 1 Oct 2021 08:45:48 +0000 (10:45 +0200)]
openmp: Differentiate between order(concurrent) and order(reproducible:concurrent)
While OpenMP 5.1 implies order(concurrent) is the same thing as
order(reproducible:concurrent), this is going to change in OpenMP 5.2, where
essentially order(concurrent) means nothing is stated on whether it is
reproducible or unconstrained (and is determined by other means, e.g. for/do
with schedule static or runtime with static being selected is implicitly
reproducible, distribute with dist_schedule static is implicitly reproducible,
loop is implicitly reproducible) and when the modifier is specified explicitly,
it overrides the implicit behavior either way.
And, when order(reproducible:concurrent) is used with e.g. schedule(dynamic)
or some other schedule that is by definition not reproducible, it is
implementation's duty to ensure it is reproducible, either by remembering how
it scheduled some loop and then replaying the same schedule when seeing loops
with the same directive/schedule/number of iterations, or by overriding the
schedule to some reproducible one.
This patch doesn't implement the 5.2 wording just yet, but in the FEs it
differentiates between the 3 states (no explicit modifier, explicit reproducible
or explicit unconstrained), so that the middle-end can easily switch at any time.
For now it follows the 5.1 wording, where both order(concurrent) (implicit or
explicit) and order(reproducible:concurrent) imply reproducibility.
And, it implements the easier method, when for/do should be reproducible, it
just chooses static schedule. order(concurrent) implies no OpenMP APIs in the
loop body nor threadprivate vars, so the exact scheduling isn't (easily at least)
observable.
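A loop of the kind discussed above might look like the sketch below. It is not one of the new tests; doubled is a hypothetical function. The reproducible: modifier needs a compiler with OpenMP 5.1 support for it when built with -fopenmp; without -fopenmp the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills out[i] with 2 * i. The body calls no OpenMP API routines and touches
// no threadprivate variables, as order(concurrent) requires.
inline std::vector<int> doubled(int n) {
    std::vector<int> out(static_cast<std::size_t>(n));
    int* data = out.data();
    // schedule(dynamic) is not reproducible by itself; with the explicit
    // modifier the implementation has to make the loop reproducible anyway.
    #pragma omp parallel for schedule(dynamic) order(reproducible: concurrent)
    for (int i = 0; i < n; ++i)
        data[i] = 2 * i;
    return out;
}

inline void order_demo() {
    const std::vector<int> v = doubled(4);
    assert(v.size() == 4);
    assert(v[0] == 0 && v[1] == 2 && v[2] == 4 && v[3] == 6);
}
```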
2021-10-01 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.h (OMP_CLAUSE_ORDER_REPRODUCIBLE): Define.
* tree-pretty-print.c (dump_omp_clause) <case OMP_CLAUSE_ORDER>: Print
reproducible: for OMP_CLAUSE_ORDER_REPRODUCIBLE.
* omp-general.c (omp_extract_for_data): If OMP_CLAUSE_ORDER is seen
without OMP_CLAUSE_ORDER_UNCONSTRAINED, overwrite sched_kind to
OMP_CLAUSE_SCHEDULE_STATIC.
gcc/c-family/
* c-omp.c (c_omp_split_clauses): Also copy
OMP_CLAUSE_ORDER_REPRODUCIBLE.
gcc/c/
* c-parser.c (c_parser_omp_clause_order): Set
OMP_CLAUSE_ORDER_REPRODUCIBLE for explicit reproducible: modifier.
gcc/cp/
* parser.c (cp_parser_omp_clause_order): Set
OMP_CLAUSE_ORDER_REPRODUCIBLE for explicit reproducible: modifier.
gcc/fortran/
* gfortran.h (gfc_omp_clauses): Add order_reproducible bitfield.
* dump-parse-tree.c (show_omp_clauses): Print REPRODUCIBLE: for it.
* openmp.c (gfc_match_omp_clauses): Set order_reproducible for
explicit reproducible: modifier.
* trans-openmp.c (gfc_trans_omp_clauses): Set
OMP_CLAUSE_ORDER_REPRODUCIBLE for order_reproducible.
(gfc_split_omp_clauses): Also copy order_reproducible.
gcc/testsuite/
* gfortran.dg/gomp/order-5.f90: Adjust scan-tree-dump-times regexps.
libgomp/
* testsuite/libgomp.c-c++-common/order-reproducible-1.c: New test.
* testsuite/libgomp.c-c++-common/order-reproducible-2.c: New test.
Jakub Jelinek [Fri, 1 Oct 2021 08:42:07 +0000 (10:42 +0200)]
openmp: Avoid PLT relocations for omp_* symbols in libgomp
This patch avoids the following relocations:
readelf -Wr libgomp.so.1.0.0 | grep omp_
00000000000470e0 0000020700000007 R_X86_64_JUMP_SLOT
000000000001d9d0 omp_fulfill_event@@OMP_5.0.1 + 0
0000000000047170 000000b800000007 R_X86_64_JUMP_SLOT
000000000000e760 omp_display_env@@OMP_5.1 + 0
00000000000471e0 000000e800000007 R_X86_64_JUMP_SLOT
000000000000f910 omp_get_initial_device@@OMP_4.5 + 0
0000000000047280 0000019500000007 R_X86_64_JUMP_SLOT
0000000000015940 omp_get_active_level@@OMP_3.0 + 0
00000000000472c8 0000020d00000007 R_X86_64_JUMP_SLOT
0000000000035210 omp_get_team_num@@OMP_4.0 + 0
00000000000472f0 0000014700000007 R_X86_64_JUMP_SLOT
0000000000035200 omp_get_num_teams@@OMP_4.0 + 0
by using ialias{,_call,_redirect} macros as needed.
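The underlying technique is a hidden alias for an exported function, so that calls from inside the library bind locally instead of going through the PLT. The sketch below shows that idea only; it does not reproduce the libgomp macros, all demo_* names are hypothetical, and it assumes GCC or Clang on an ELF target.

```cpp
#include <cassert>

extern "C" {

// Exported entry point; stands in for an omp_* function.
int demo_get_level(void) { return 3; }

// Hidden alias for the same code. References to it are resolved within the
// library at link time, so no JUMP_SLOT relocation is needed.
extern __typeof__(demo_get_level) demo_internal_get_level
    __attribute__((alias("demo_get_level"), visibility("hidden")));

// A library-internal caller goes through the hidden alias.
int demo_next_level(void) { return demo_internal_get_level() + 1; }

}  // extern "C"

inline void alias_demo() {
    assert(demo_internal_get_level() == demo_get_level());
    assert(demo_next_level() == 4);
}
```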
We still have many acc_* PLT relocations; could somebody please fix those?
readelf -Wr libgomp.so.1.0.0 | grep acc_
0000000000046fb8 000001ed00000006 R_X86_64_GLOB_DAT
0000000000036350 acc_prof_unregister@@OACC_2.5.1 + 0
0000000000046fd8 000000a400000006 R_X86_64_GLOB_DAT
0000000000035f30 acc_prof_register@@OACC_2.5.1 + 0
0000000000046fe0 000001d100000006 R_X86_64_GLOB_DAT
0000000000035ee0 acc_prof_lookup@@OACC_2.5.1 + 0
0000000000047058 000001dd00000007 R_X86_64_JUMP_SLOT
0000000000031f40 acc_create_async@@OACC_2.5 + 0
0000000000047068 0000011500000007 R_X86_64_JUMP_SLOT
000000000002fc60 acc_get_property@@OACC_2.6 + 0
0000000000047070 000001fb00000007 R_X86_64_JUMP_SLOT
0000000000032ce0 acc_wait_all@@OACC_2.0 + 0
0000000000047080 0000006500000007 R_X86_64_JUMP_SLOT
000000000002f990 acc_on_device@@OACC_2.0 + 0
0000000000047088 000000ae00000007 R_X86_64_JUMP_SLOT
0000000000032140 acc_attach_async@@OACC_2.6 + 0
0000000000047090 0000021900000007 R_X86_64_JUMP_SLOT
000000000002f550 acc_get_device_type@@OACC_2.0 + 0
0000000000047098 000001cb00000007 R_X86_64_JUMP_SLOT
0000000000032090 acc_copyout_finalize@@OACC_2.5 + 0
00000000000470a8 0000005200000007 R_X86_64_JUMP_SLOT
0000000000031f80 acc_copyin@@OACC_2.0 + 0
00000000000470b8 000001ad00000007 R_X86_64_JUMP_SLOT
0000000000032030 acc_delete_finalize@@OACC_2.5 + 0
00000000000470e8 0000010900000007 R_X86_64_JUMP_SLOT
0000000000031f00 acc_create@@OACC_2.0 + 0
00000000000470f8 0000005900000007 R_X86_64_JUMP_SLOT
0000000000032b70 acc_wait_async@@OACC_2.0 + 0
0000000000047110 0000013100000007 R_X86_64_JUMP_SLOT
0000000000032860 acc_async_test@@OACC_2.0 + 0
0000000000047118 000001ff00000007 R_X86_64_JUMP_SLOT
000000000002f720 acc_get_device_num@@OACC_2.0 + 0
0000000000047128 0000019100000007 R_X86_64_JUMP_SLOT
0000000000032020 acc_delete_async@@OACC_2.5 + 0
0000000000047130 000001d200000007 R_X86_64_JUMP_SLOT
000000000002efa0 acc_shutdown@@OACC_2.0 + 0
0000000000047150 000000d000000007 R_X86_64_JUMP_SLOT
0000000000031f00 acc_present_or_create@@OACC_2.0 + 0
0000000000047188 0000019200000007 R_X86_64_JUMP_SLOT
0000000000031910 acc_is_present@@OACC_2.0 + 0
0000000000047190 000001aa00000007 R_X86_64_JUMP_SLOT
000000000002fca0 acc_get_property_string@@OACC_2.6 + 0
00000000000471d0 000001bf00000007 R_X86_64_JUMP_SLOT
0000000000032120 acc_update_self_async@@OACC_2.5 + 0
0000000000047200 0000020500000007 R_X86_64_JUMP_SLOT
0000000000032e00 acc_wait_all_async@@OACC_2.0 + 0
0000000000047208 000000a600000007 R_X86_64_JUMP_SLOT
0000000000031790 acc_deviceptr@@OACC_2.0 + 0
0000000000047218 0000007500000007 R_X86_64_JUMP_SLOT
0000000000032000 acc_delete@@OACC_2.0 + 0
0000000000047238 000001e900000007 R_X86_64_JUMP_SLOT
000000000002f3a0 acc_set_device_type@@OACC_2.0 + 0
0000000000047240 000001f600000007 R_X86_64_JUMP_SLOT
000000000002ef20 acc_init@@OACC_2.0 + 0
0000000000047248 0000018800000007 R_X86_64_JUMP_SLOT
0000000000032060 acc_copyout@@OACC_2.0 + 0
0000000000047258 0000021f00000007 R_X86_64_JUMP_SLOT
0000000000032a80 acc_wait@@OACC_2.0 + 0
0000000000047270 000001bc00000007 R_X86_64_JUMP_SLOT
0000000000032100 acc_update_self@@OACC_2.0 + 0
0000000000047288 0000011400000007 R_X86_64_JUMP_SLOT
0000000000032080 acc_copyout_async@@OACC_2.5 + 0
0000000000047290 0000013d00000007 R_X86_64_JUMP_SLOT
000000000002f850 acc_set_device_num@@OACC_2.0 + 0
00000000000472a8 000000c500000007 R_X86_64_JUMP_SLOT
00000000000320e0 acc_update_device_async@@OACC_2.5 + 0
00000000000472c0 0000014600000007 R_X86_64_JUMP_SLOT
0000000000031fc0 acc_copyin_async@@OACC_2.5 + 0
00000000000472f8 0000006a00000007 R_X86_64_JUMP_SLOT
000000000002f310 acc_get_num_devices@@OACC_2.0 + 0
0000000000047350 0000021700000007 R_X86_64_JUMP_SLOT
0000000000031f80 acc_present_or_copyin@@OACC_2.0 + 0
0000000000047360 0000020900000007 R_X86_64_JUMP_SLOT
00000000000320c0 acc_update_device@@OACC_2.0 + 0
0000000000047380 0000008400000007 R_X86_64_JUMP_SLOT
0000000000032950 acc_async_test_all@@OACC_2.0 + 0
2021-10-01 Jakub Jelinek <jakub@redhat.com>
* affinity-fmt.c (omp_get_team_num, omp_get_num_teams): Add
ialias_redirect.
* env.c (handle_omp_display_env): Use ialias_call.
* icv-device.c: Move ialias right below each function.
(omp_get_device_num): Use ialias_call.
* fortran.c (omp_fulfill_event): Add ialias_redirect.
* icv.c (omp_get_active_level): Add ialias_redirect.
Jakub Jelinek [Fri, 1 Oct 2021 08:32:10 +0000 (10:32 +0200)]
openmp: Add alloc_align attribute to omp_aligned_*alloc and testcase for omp_realloc
This patch adds alloc_align attribute to omp_aligned_{,c}alloc so that if
the first argument is constant, GCC can assume requested alignment.
Additionally, it adds testsuite coverage for omp_realloc, which I didn't
manage to write for yesterday's patch.
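How the attribute is spelled on an allocator can be sketched with a stand-alone wrapper. This is not the omp.h.in declaration: demo_aligned_alloc and is_aligned are hypothetical, and the sketch assumes GCC or Clang with a C++17 library that provides std::aligned_alloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// alloc_align(1) tells the compiler that the returned pointer is aligned to
// at least the value passed as parameter 1.
__attribute__((alloc_align(1)))
inline void* demo_aligned_alloc(std::size_t alignment, std::size_t size) {
    // std::aligned_alloc expects size to be a multiple of alignment.
    const std::size_t rounded = (size + alignment - 1) / alignment * alignment;
    return std::aligned_alloc(alignment, rounded);
}

inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

inline void aligned_alloc_demo() {
    void* p = demo_aligned_alloc(64, 100);
    assert(p != nullptr);
    assert(is_aligned(p, 64));
    std::free(p);
}
```

With a constant first argument, as in the call above, the compiler may assume 64-byte alignment for p when optimizing later accesses.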
2021-10-01 Jakub Jelinek <jakub@redhat.com>
* omp.h.in (omp_aligned_alloc, omp_aligned_calloc): Add
__alloc_align__ (1) attribute.
* testsuite/libgomp.c-c++-common/alloc-9.c: New test.
Jakub Jelinek [Fri, 1 Oct 2021 08:30:16 +0000 (10:30 +0200)]
c++: Fix handling of __thread/thread_local extern vars declared at function scope [PR102496]
The introduction of push_local_extern_decl_alias in
r11-3699-g4e62aca0e0520e4ed2532f2d8153581190621c1a
broke TLS vars: while the decl they are created for has the TLS model
set properly, nothing sets it for the alias that is actually used,
so accesses to it are done as if it were a normal variable.
This is then diagnosed at link time if the definition of the extern
vars is __thread/thread_local.
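The construct in question is a block-scope extern declaration of a thread_local variable. The sketch below keeps the definition in the same translation unit for brevity and is not one of the new tests; demo_counter and bump_counter are hypothetical names.

```cpp
#include <cassert>

thread_local int demo_counter = 0;  // definition with thread storage duration

inline int bump_counter() {
    // Function-scope extern declaration referring to the definition above.
    // The access must use the TLS model of the variable it names.
    extern thread_local int demo_counter;
    return ++demo_counter;
}

inline void tls_extern_demo() {
    const int before = demo_counter;
    assert(bump_counter() == before + 1);
    assert(demo_counter == before + 1);
}
```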
2021-10-01 Jakub Jelinek <jakub@redhat.com>
PR c++/102496
* name-lookup.c (push_local_extern_decl_alias): Return early even for
tls vars with non-dependent type when processing_template_decl. For
CP_DECL_THREAD_LOCAL_P vars call set_decl_tls_model on alias.
* g++.dg/tls/pr102496-1.C: New test.
* g++.dg/tls/pr102496-2.C: New test.
Richard Biener [Thu, 30 Sep 2021 13:05:53 +0000 (15:05 +0200)]
middle-end/102518 - avoid invalid GIMPLE during inlining
When inlining we have to avoid mapping a non-lvalue parameter
value into a context that prevents the parameter from being a register.
Formerly such a parameter was TREE_ADDRESSABLE but now it can be
just DECL_NOT_GIMPLE_REG_P.
2021-09-30 Richard Biener <rguenther@suse.de>
PR middle-end/102518
* tree-inline.c (setup_one_parameter): Avoid substituting
an invariant into contexts where a GIMPLE register is not valid.
* gcc.dg/torture/pr102518.c: New testcase.
Bob Duff [Tue, 17 Aug 2021 15:02:51 +0000 (11:02 -0400)]
[Ada] Subprogram_Variant in ignored ghost code
gcc/ada/
* exp_ch6.adb (Expand_Call_Helper): Do not call
Check_Subprogram_Variant if the subprogram is an ignored ghost
entity. Otherwise the compiler crashes (in debug builds) or
gives strange error messages (in production builds).
Ghjuvan Lacambre [Mon, 16 Aug 2021 13:28:09 +0000 (15:28 +0200)]
[Ada] Empty CUDA_Global procedures when compiling for host
gcc/ada/
* gnat_cuda.adb (Empty_CUDA_Global_Subprograms): New procedure.
(Expand_CUDA_Package): Call Empty_CUDA_Global_Subprograms.
Steve Baird [Thu, 12 Aug 2021 23:55:36 +0000 (16:55 -0700)]
[Ada] Improved checking for invalid index values when accessing array elements
gcc/ada/
* checks.ads: Define a type Dimension_Set. Add an out-mode
parameter of this new type to Generate_Index_Checks so that
callers can know for which dimensions a check was generated. Add
an in-mode parameter of this new type to
Apply_Subscript_Validity_Checks so that callers can indicate
that no check is needed for certain dimensions.
* checks.adb (Generate_Index_Checks): Implement new
Checks_Generated parameter.
(Apply_Subscript_Validity_Checks): Implement new No_Check_Needed
parameter.
* exp_ch4.adb (Expand_N_Indexed_Component): Call
Apply_Subscript_Validity_Checks in more cases than before. This
includes declaring two new local functions,
(Is_Renamed_Variable_Name,
Type_Requires_Subscript_Validity_Checks_For_Reads): To help in
deciding whether to call Apply_Subscript_Validity_Checks.
Adjust to parameter profile changes in Generate_Index_Checks and
Apply_Subscript_Validity_Checks.
Eric Botcazou [Fri, 13 Aug 2021 16:32:53 +0000 (18:32 +0200)]
[Ada] Document rounding mode assumed for dynamic floating-point computations
gcc/ada/
* doc/gnat_rm/implementation_defined_characteristics.rst: Document
the rounding mode assumed for dynamic computations as per 3.5.7(16).
* gnat_rm.texi: Regenerate.
Bob Duff [Thu, 12 Aug 2021 20:49:16 +0000 (16:49 -0400)]
[Ada] More work on efficiency improvements
gcc/ada/
* table.ads (Table_Type): Remove "aliased"; no longer needed by
Atree. Besides it contradicted the comment a few lines above,
"-- Note: We do not make the table components aliased...".
* types.ads: Move type Slot to Atree.
* atree.ads: Move type Slot from Types to here. Move type
Node_Header from Seinfo to here.
* atree.adb: Avoid the need for aliased components of the Slots
table. Instead of 'Access, use a getter and setter. Misc
cleanups.
(Print_Statistics): Print statistics about node and entity kind
frequencies. Give 3 digit fractions instead of percentages.
* (Get_Original_Node_Count, Set_Original_Node_Count): Statistics
for calls to Original_Node and Set_Original_Node.
(Original_Node, Set_Original_Node): Gather statistics by calling
the above.
(Print_Field_Statistics): Print Original_Node statistics.
(Update_Kind_Statistics): Remove, and put all statistics
gathering under "if Atree_Statistics_Enabled", which is a flag
generated in Seinfo by Gen_IL.
* gen_il-gen.adb (Compute_Field_Offsets): Choose offsets of
Nkind, Ekind, and Homonym first. This causes a slight efficiency
improvement. Misc cleanups. Do not generate Node_Header; it is
now hand-written in Atree. When choosing the order in which to
assign offsets, weight by the frequency of the node type, so the
more common nodes get their field offsets assigned earlier. Add
more special cases.
(Compute_Type_Sizes): Remove this and related things.
There was a comment: "At some point we can instrument Atree to
print out accurate size statistics, and remove this code." We
have Atree statistics, so we now remove this code.
(Put_Seinfo): Generate Atree_Statistics_Enabled, which is equal
to Statistics_Enabled. This allows Atree to say "if
Atree_Statistics_Enabled then <gather statistics>" for
efficiency. When Atree_Statistics_Enabled is False, the "if ..."
will be optimized away.
* gen_il-internals.ads (Type_Frequency): New table of kind
frequencies.
* gen_il-internals.adb: Minor comment improvement.
* gen_il-fields.ads: Remove unused subtypes. Suppress style
checks in the Type_Frequency table. If we regenerate this
table (see -gnatd.A) we don't want to have to fiddle with
casing.
* impunit.adb: Minor.
* sinfo-utils.adb: Minor.
* debug.adb: Minor comment improvement.
Eric Botcazou [Thu, 12 Aug 2021 16:12:40 +0000 (18:12 +0200)]
[Ada] Add missing guard before call to Interface_Present_In_Ancestor
gcc/ada/
* sem_type.adb (Specific_Type): Check that the type is tagged
before calling Interface_Present_In_Ancestor on it.
Eric Botcazou [Thu, 12 Aug 2021 19:45:33 +0000 (21:45 +0200)]
[Ada] Add new debug switch -gnatd.8
gcc/ada/
* debug.adb (d.8): Document usage.
* fe.h (Debug_Flag_Dot_8): Declare.
Gary Dismukes [Wed, 11 Aug 2021 22:41:28 +0000 (18:41 -0400)]
[Ada] Spurious warning about hiding in generic instantiation
gcc/ada/
* sem_util.adb (Enter_Name): Suppress hiding warning when in an
instance.
Ed Schonberg [Thu, 12 Aug 2021 14:39:21 +0000 (10:39 -0400)]
[Ada] Crash on improper use of GNAT attribute Type_Key
gcc/ada/
* sem_attr.adb (Analyze_Attribute, case Type_Key): Attribute can
be applied to a formal type.
* sem_ch5.adb (Analyze_Case_Statement): If Extensions_Allowed is
not enabled, verify that the type of the expression is discrete.
Justin Squirek [Thu, 12 Aug 2021 12:54:15 +0000 (08:54 -0400)]
[Ada] Crash on renaming within declare expression
gcc/ada/
* exp_dbug.adb (Debug_Renaming_Declaration): Add check for
Entity present for Ren to prevent looking at unanalyzed nodes.
Ghjuvan Lacambre [Thu, 12 Aug 2021 13:05:23 +0000 (15:05 +0200)]
[Ada] Fix CodePeer warnings
gcc/ada/
* atree.adb (Print_Statistics): Help CodePeer see Total as
greater than zero.
* gen_il-gen.adb (One_Comp): Annotate Field_Table as Modified.
Richard Kenner [Thu, 12 Aug 2021 01:28:35 +0000 (21:28 -0400)]
[Ada] Add Evaluable_Kind and Global_Name_Kind
gcc/ada/
* gen_il-gen-gen_entities.adb (Evaluable_Kind,
Global_Name_Kind): Add.
* gen_il-types.ads (Evaluable_Kind, Global_Name_Kind): Likewise.
Ghjuvan Lacambre [Tue, 9 Feb 2021 08:31:45 +0000 (09:31 +0100)]
[Ada] Stub CUDA_Device aspect
gcc/ada/
* aspects.ads: Add CUDA_Device aspect.
* gnat_cuda.ads (Add_CUDA_Device_Entity): New subprogram.
* gnat_cuda.adb:
(Add_CUDA_Device_Entity): New subprogram.
(CUDA_Device_Entities_Table): New hashmap for CUDA_Device
entities.
(Get_CUDA_Device_Entities): New internal subprogram.
(Set_CUDA_Device_Entities): New internal subprogram.
* par-prag.adb (Prag): Handle pragma id Pragma_CUDA_Device.
* sem_prag.ads (Aspect_Specifying_Pragma): Mark CUDA_Device as
being both aspect and pragma.
* sem_prag.adb (Analyze_Pragma): Add CUDA_Device entities to
list of CUDA_Entities belonging to package N.
(Sig_Flags): Signal CUDA_Device entities as referenced.
* snames.ads-tmpl: Create CUDA_Device names and pragmas.
Gary Dismukes [Wed, 11 Aug 2021 20:49:40 +0000 (16:49 -0400)]
[Ada] Assert_Failure on derived type with inherited Default_Initial_Condition
gcc/ada/
* exp_util.adb (Build_DIC_Procedure_Body): Remove inappropriate
Assert pragma. Remove unneeded and dead code related to derived
private types.
Richard Kenner [Wed, 11 Aug 2021 17:12:55 +0000 (13:12 -0400)]
[Ada] Add more node unions
gcc/ada/
* gen_il-gen-gen_nodes.adb (N_Alternative, N_Is_Case_Choice):
Add.
(N_Is_Exception_Choice, N_Is_Range): Likewise.
* gen_il-types.ads: Add above names.
* gen_il-gen.adb (Put_Union_Membership): Write both declarations
and definitions of union functions.
Ed Schonberg [Wed, 11 Aug 2021 16:52:29 +0000 (12:52 -0400)]
[Ada] Implementation of AI12-0212: iterator specs in array aggregates (II)
gcc/ada/
* exp_aggr.adb (Expand_Array_Aggregate,
Two_Pass_Aggregate_Expansion): Increment index for element
insertion within the loop, only if upper bound has not been
reached.
Javier Miranda [Mon, 2 Aug 2021 13:16:47 +0000 (09:16 -0400)]
[Ada] Ada2022: AI12-0195 overriding class-wide pre/postconditions
gcc/ada/
* contracts.ads (Make_Class_Precondition_Subps): New subprogram.
(Merge_Class_Conditions): New subprogram.
(Process_Class_Conditions_At_Freeze_Point): New subprogram.
* contracts.adb (Check_Class_Condition): New subprogram.
(Set_Class_Condition): New subprogram.
(Analyze_Contracts): Remove code analyzing class-wide-clone
subprogram since it is no longer built.
(Process_Spec_Postconditions): Avoid processing already seen
subprograms twice.
(Process_Preconditions): Simplify its functionality to
non-class-wide preconditions.
(Process_Preconditions_For): No action needed for wrappers and
helpers.
(Make_Class_Precondition_Subps): New subprogram.
(Process_Class_Conditions_At_Freeze_Point): New subprogram.
(Merge_Class_Conditions): New subprogram.
* exp_ch6.ads (Install_Class_Preconditions_Check): New
subprogram.
* exp_ch6.adb (Expand_Call_Helper): Install class-wide
preconditions check on dispatching primitives that have or
inherit class-wide preconditions.
(Freeze_Subprogram): Remove code for null procedures with
preconditions.
(Install_Class_Preconditions_Check): New subprogram.
* exp_util.ads (Build_Class_Wide_Expression): Lower the
complexity of this subprogram; remove out-mode formal Needs_Wrapper
since this functionality is now provided by a new subprogram.
(Get_Mapped_Entity): New subprogram.
(Map_Formals): New subprogram.
* exp_util.adb (Build_Class_Wide_Expression): Lower the
complexity of this subprogram. Its previous functionality is now
provided by subprograms Needs_Wrapper and Check_Class_Condition.
(Add_Parent_DICs): Map the overridden primitive to the
overriding one.
(Get_Mapped_Entity): New subprogram.
(Map_Formals): New subprogram.
(Update_Primitives_Mapping): Adding assertion.
* freeze.ads (Check_Inherited_Conditions): Subprogram made
public with added formal to support late overriding.
* freeze.adb (Check_Inherited_Conditions): New implementation;
builds the dispatch table wrapper required for class-wide
pre/postconditions; added support for late overriding.
(Needs_Wrapper): New subprogram.
* sem.ads (Inside_Class_Condition_Preanalysis): New global
variable.
* sem_disp.ads (Covered_Interface_Primitives): New subprogram.
* sem_disp.adb (Covered_Interface_Primitives): New subprogram.
(Check_Dispatching_Context): Skip checking context of
dispatching calls during preanalysis of class-wide conditions
since at that stage the expression is not installed yet on its
definite context.
(Check_Dispatching_Call): Skip checking 6.1.1(18.2/5) by
AI12-0412 on helpers and wrappers internally built for
supporting class-wide conditions; for late-overriding
subprograms call Check_Inherited_Conditions to build the
dispatch-table wrapper (if required).
(Propagate_Tag): Adding call to
Install_Class_Preconditions_Check.
* sem_util.ads (Build_Class_Wide_Clone_Body): Removed.
(Build_Class_Wide_Clone_Call): Removed.
(Build_Class_Wide_Clone_Decl): Removed.
(Class_Condition): New subprogram.
(Nearest_Class_Condition_Subprogram): New subprogram.
* sem_util.adb (Build_Class_Wide_Clone_Body): Removed.
(Build_Class_Wide_Clone_Call): Removed.
(Build_Class_Wide_Clone_Decl): Removed.
(Class_Condition): New subprogram.
(Nearest_Class_Condition_Subprogram): New subprogram.
(Eligible_For_Conditional_Evaluation): No need to evaluate
class-wide conditions during preanalysis since the expression is
not installed on its definite context.
* einfo.ads (Class_Wide_Clone): Removed.
(Class_Postconditions): New attribute.
(Class_Preconditions): New attribute.
(Class_Preconditions_Subprogram): New attribute.
(Dynamic_Call_Helper): New attribute.
(Ignored_Class_Postconditions): New attribute.
(Ignored_Class_Preconditions): New attribute.
(Indirect_Call_Wrapper): New attribute.
(Is_Dispatch_Table_Wrapper): New attribute.
(Static_Call_Helper): New attribute.
* exp_attr.adb (Expand_N_Attribute_Reference): When the prefix
is of an access-to-subprogram type that has class-wide
preconditions and an indirect-call wrapper of such subprogram is
available, replace the prefix by the wrapper.
* exp_ch3.adb (Build_Class_Condition_Subprograms): New
subprogram.
(Register_Dispatch_Table_Wrappers): New subprogram.
* exp_disp.adb (Build_Class_Wide_Check): Removed; class-wide
precondition checks now rely on internally built helpers.
* sem_ch13.adb (Analyze_Aspect_Specifications): Set initial
value of attributes Class_Preconditions, Class_Postconditions,
Ignored_Class_Preconditions and Ignored_Class_Postconditions.
These values are later updated with the full pre/postcondition
by Merge_Class_Conditions.
(Freeze_Entity_Checks): Call
Process_Class_Conditions_At_Freeze_Point.
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): Remove code
building the body of the class-wide clone subprogram since it is
no longer required.
(Install_Entity): Adding assertion.
* sem_prag.adb (Analyze_Pre_Post_Condition_In_Decl_Part): Remove
code building and analyzing the class-wide clone subprogram; no
longer required.
(Build_Pragma_Check_Equivalent): Adjust call to
Build_Class_Wide_Expression since the formal named Needs_Wrapper
has been removed.
* sem_attr.adb (Analyze_Attribute_Old_Result): Skip processing
these attributes during preanalysis of class-wide conditions
since at that stage the expression is not installed yet on its
definite context.
* sem_res.adb (Resolve_Actuals): Skip applying RM 3.9.2(9/1) and
SPARK RM 6.1.7(3) on actuals of internal helpers and wrappers
built to support class-wide preconditions.
* sem_ch5.adb (Process_Bounds): Do not generate a constant
declaration for the bounds when we are preanalyzing a class-wide
condition.
(Analyze_Loop_Parameter_Specification): Handle preanalysis of
quantified expression placed in the outermost expression of a
class-wide condition.
* ghost.adb (Check_Ghost_Context): No check required during
preanalysis of class-wide conditions.
* gen_il-fields.ads (Opt_Field_Enum): Adding
Class_Postconditions, Class_Preconditions,
Class_Preconditions_Subprogram, Dynamic_Call_Helper,
Ignored_Class_Postconditions, Ignored_Class_Preconditions,
Indirect_Call_Wrapper, Is_Dispatch_Table_Wrapper,
Static_Call_Helper.
* gen_il-gen-gen_entities.adb (Is_Dispatch_Table_Wrapper):
Adding semantic flag Is_Dispatch_Table_Wrapper; removing
semantic field Class_Wide_Clone; adding semantic fields for
Class_Postconditions, Class_Preconditions,
Class_Preconditions_Subprogram, Dynamic_Call_Helper,
Ignored_Class_Postconditions, Indirect_Call_Wrapper,
Ignored_Class_Preconditions, and Static_Call_Helper.
Piotr Trojanek [Wed, 11 Aug 2021 15:57:55 +0000 (17:57 +0200)]
[Ada] Fix deleting CodePeer files for non-ordinary units
gcc/ada/
* comperr.adb (Delete_SCIL_Files): Handle generic subprogram
declarations and renaming just like generic package declarations
and renamings, respectively; handle
N_Subprogram_Renaming_Declaration.
Steve Baird [Tue, 10 Aug 2021 17:33:42 +0000 (10:33 -0700)]
[Ada] Improve error message for .ali file version mismatch
gcc/ada/
* bcheck.adb (Check_Versions): Add support for the case where
the .ali file contains both a primary and a secondary version
number, as in "GNAT Lib v22.20210809".
Steve Baird [Mon, 2 Aug 2021 23:18:08 +0000 (16:18 -0700)]
[Ada] Fix bug in inherited user-defined-literal aspects for tagged types
gcc/ada/
* sem_res.adb (Resolve): Two separate fixes. In the case where
Find_Aspect for a literal aspect returns the aspect for a
different (ancestor) type, call Corresponding_Primitive_Op to
get the right callee. In the case where a downward tagged type
conversion appears to be needed, generate a null extension
aggregate instead, as per Ada RM 3.4(27).
* sem_util.ads, sem_util.adb: Add new Corresponding_Primitive_Op
function. It maps a primitive op of a tagged type and a
descendant type of that tagged type to the corresponding
primitive op of the descendant type. The body of this function
was written by Javier Miranda.
Bob Duff [Mon, 9 Aug 2021 23:06:18 +0000 (19:06 -0400)]
[Ada] Info. gathering in preparation for more efficiency improvements
gcc/ada/
* atree.adb: Gather and print statistics about frequency of
getter and setter calls.
* atree.ads (Print_Statistics): New procedure for printing
statistics.
* debug.adb: Document -gnatd.A switch.
* gen_il-gen.adb: Generate code for statistics gathering.
Choose the offset of Homonym early. Misc cleanup. Put more
comments in the generated code.
* gen_il-internals.ads (Unknown_Offset): New value to indicate
that the offset has not yet been chosen.
* gnat1drv.adb: Call Print_Statistics.
* libgnat/s-imglli.ads: Minor comment fix.
* output.ads (Write_Int_64): New procedure to write a 64-bit
value. Needed for new statistics, and could come in handy
elsewhere.
* output.adb (Write_Int_64): Likewise.
* sinfo.ads: Remove obsolete comment. The xtreeprs program no
longer exists.
* types.ads: New 64-bit types needed for new statistics.
Dmitriy Anisimkov [Fri, 6 Aug 2021 11:54:28 +0000 (17:54 +0600)]
[Ada] Support gmem.out longer than 2G on 32 bit platforms
gcc/ada/
* libgnat/memtrack.adb (Putc): New routine wrapped around fputc
with error check.
(Write): New routine wrapped around fwrite with error check.
Remove bound functions fopen, fwrite, fputs, fclose, OS_Exit.
Use the similar routines from System.CRTL and System.OS_Lib.
Ed Schonberg [Sun, 8 Aug 2021 14:34:38 +0000 (10:34 -0400)]
[Ada] Spurious range checks on aggregate with non-static bounds
gcc/ada/
* exp_aggr.adb (Must_Slide): If the aggregate only contains an
others_clause, no sliding is involved. Otherwise sliding is
required if any bound of the aggregate or the context subtype is
non-static.
Richard Kenner [Sat, 7 Aug 2021 13:21:32 +0000 (09:21 -0400)]
[Ada] Add N_Is_Decl
gcc/ada/
* gen_il-gen-gen_nodes.adb (N_Is_Decl): Add.
* gen_il-types.ads (N_Is_Decl): Likewise.
Richard Kenner [Thu, 5 Aug 2021 21:05:40 +0000 (17:05 -0400)]
[Ada] Add N_Entity_Name
gcc/ada/
* gen_il-gen-gen_nodes.adb (N_Entity_Name): Add.
* gen_il-types.ads (N_Entity_Name): Likewise.
Steve Baird [Thu, 5 Aug 2021 18:18:19 +0000 (11:18 -0700)]
[Ada] Improve error message for .ali file version mismatch
gcc/ada/
* bcheck.adb (Check_Versions): In the case of an ali file
version mismatch, if distinct integer values can be extracted
from the two version strings then include those values in the
generated error message.
Steve Baird [Thu, 5 Aug 2021 03:23:31 +0000 (20:23 -0700)]
[Ada] No ABE check needed for an expression function call.
gcc/ada/
* sem_elab.adb (Is_Safe_Call): Return True in the case of a
(possibly rewritten) call to an expression function.
Ghjuvan Lacambre [Wed, 4 Aug 2021 15:46:04 +0000 (17:46 +0200)]
[Ada] Fix CodePeer warnings
gcc/ada/
* sem_aggr.adb (Resolve_Iterated_Component_Association):
Initialize Id_Typ to Any_Type by default.
Eric Botcazou [Wed, 4 Aug 2021 13:07:17 +0000 (15:07 +0200)]
[Ada] Document that gnatmem requires fixed-position executables
gcc/ada/
* doc/gnat_ugn/gnat_and_program_execution.rst (gnatmem): Document
that it works only with fixed-position executables.
Doug Rupp [Mon, 26 Jul 2021 20:07:30 +0000 (13:07 -0700)]
[Ada] Switch to SR0660
gcc/ada/
* libgnat/s-parame__vxworks.ads (time_t_bits): Change to
Long_Long_Integer'Size.
GCC Administrator [Fri, 1 Oct 2021 00:16:27 +0000 (00:16 +0000)]
Daily bump.
David Edelsohn [Thu, 30 Sep 2021 20:43:58 +0000 (16:43 -0400)]
testsuite: Fix cf-descriptor-5.f90
gcc/testsuite/ChangeLog
* gfortran.dg/c-interop/cf-descriptor-5-c.c: Include alloca.h.
Przemyslaw Wirkus [Thu, 30 Sep 2021 20:32:48 +0000 (21:32 +0100)]
arm: Enable Cortex-R52+ CPU
This patch adds Cortex-R52+ as the 'cortex-r52plus' value of the
-mcpu command-line option.
gcc/ChangeLog:
* config/arm/arm-cpus.in: Add Cortex-R52+ CPU.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* doc/invoke.texi: Update docs.
Patrick Palka [Thu, 30 Sep 2021 21:34:23 +0000 (17:34 -0400)]
c++: __is_trivially_xible and multi-arg aggr paren init [PR102535]
is_xible_helper assumes only 0- and 1-argument ctors can be trivial, but
C++20 aggregate paren init means multi-arg ctors can now be trivial too.
This patch relaxes the relevant early exit check accordingly.
PR c++/102535
gcc/cp/ChangeLog:
* method.c (is_xible_helper): Don't exit early for multi-arg
ctors in C++20.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_trivially_constructible7.C: New test.
Patrick Palka [Thu, 30 Sep 2021 21:29:18 +0000 (17:29 -0400)]
c++: argument order in a variadic type trait intrinsic
When parsing a variadic type trait intrinsic, we build up the list of
trailing arguments in reverse, but we neglect to reverse the list to
the true order afterwards. This causes us to confuse the meaning of
e.g. __is_xible(x, y, z) vs __is_xible(x, z, y).
Note that this bug doesn't affect the library traits because they pass a
pack expansion as the single trailing argument to __is_xible, which gets
expanded in the correct order by tsubst_tree_list.
gcc/cp/ChangeLog:
* parser.c (cp_parser_trait_expr): Call nreverse on the reversed
list of trailing arguments.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_constructible6.C: New test.
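The order sensitivity described in this entry can be illustrated with the library trait, which per the message above was not affected by the bug. This is only a sketch: the type Pair and the helper order_is_respected are hypothetical names invented for the example and are not taken from the new test.

```cpp
#include <type_traits>

// Hypothetical type for illustration: constructible only from
// (int, const char*), in exactly that order.
struct Pair {
    Pair(int, const char*) {}
};

// True when the trait tells the two argument orders apart.
constexpr bool order_is_respected() {
    return std::is_constructible<Pair, int, const char*>::value
        && !std::is_constructible<Pair, const char*, int>::value;
}

// Swapping the trailing arguments must change the answer.
static_assert(order_is_respected(),
              "trailing trait arguments are order-sensitive");
```

A direct call such as __is_constructible(Pair, int, const char*) is expected to give the same answers once the trailing argument list is put back in source order.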
Patrick Palka [Thu, 30 Sep 2021 21:29:05 +0000 (17:29 -0400)]
c++: defaulted comparisons and vptr fields [PR95567]
We need to explicitly skip over vptr fields when synthesizing a
defaulted comparison operator, because next_initializable_field
doesn't do so for us.
PR c++/95567
gcc/cp/ChangeLog:
* method.c (build_comparison_op): Skip DECL_VIRTUAL_P fields.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/spaceship-virtual1.C: New test.
Ian Lance Taylor [Thu, 30 Sep 2021 04:48:48 +0000 (21:48 -0700)]
compiler: avoid calling Expression::type before lowering
This is a minor cleanup to ensure that the various Expression::do_type
methods don't have to worry about the possibility that the Expression
has not been lowered.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/353140
Harald Anlauf [Thu, 30 Sep 2021 18:29:31 +0000 (20:29 +0200)]
Fortran: resolve expressions during SIZE simplification
gcc/fortran/ChangeLog:
PR fortran/102458
* simplify.c (simplify_size): Resolve expressions used in array
specifications so that SIZE can be simplified.
gcc/testsuite/ChangeLog:
PR fortran/102458
* gfortran.dg/pr102458b.f90: New test.
Harald Anlauf [Thu, 30 Sep 2021 18:28:39 +0000 (20:28 +0200)]
Fortran: fix reference to Fortran standard in comment
gcc/fortran/
* expr.c: The correct reference to Fortran standard is: F2018:10.1.12.
Uros Bizjak [Thu, 30 Sep 2021 17:33:49 +0000 (19:33 +0200)]
i386: Eliminate sign extension after logic operation [PR89954]
Convert (sign_extend:WIDE (any_logic:NARROW (memory, immediate)))
to (any_logic:WIDE (sign_extend (memory)), (sign_extend (immediate))).
This eliminates sign extension after logic operation.
2021-09-30 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/89954
* config/i386/i386.md
(sign_extend:WIDE (any_logic:NARROW (memory, immediate)) splitters):
New splitters.
gcc/testsuite/
PR target/89954
* gcc.target/i386/pr89954.c: New test.
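The identity behind these splitters can be checked exhaustively for 8-bit operands widened to 32 bits. The sketch below is not the pattern from i386.md; sext, logic_commutes_with_sext and check_all_immediates are hypothetical helper names, and the 8-bit and 32-bit widths are an assumption chosen to keep the loop small.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend a narrow (8-bit) value to a wide (32-bit) one.
constexpr std::int32_t sext(std::int8_t v) { return static_cast<std::int32_t>(v); }

// For one immediate, check every possible 8-bit memory value:
//   sext(m OP imm) == sext(m) OP sext(imm)   for OP in {&, |, ^}.
bool logic_commutes_with_sext(std::int8_t imm) {
    for (int i = -128; i <= 127; ++i) {
        const std::int8_t m = static_cast<std::int8_t>(i);
        // Narrow operation first, then widen.
        const std::int8_t narrow_and = static_cast<std::int8_t>(m & imm);
        const std::int8_t narrow_or  = static_cast<std::int8_t>(m | imm);
        const std::int8_t narrow_xor = static_cast<std::int8_t>(m ^ imm);
        // Widen first, then operate on the wide values.
        if (sext(narrow_and) != (sext(m) & sext(imm))) return false;
        if (sext(narrow_or)  != (sext(m) | sext(imm))) return false;
        if (sext(narrow_xor) != (sext(m) ^ sext(imm))) return false;
    }
    return true;
}

// Repeat the check for all 256 immediates.
void check_all_immediates() {
    for (int imm = -128; imm <= 127; ++imm)
        assert(logic_commutes_with_sext(static_cast<std::int8_t>(imm)));
}
```

Because the wide form gives the same result, the separate sign extension after the narrow operation is redundant, which is what the entry above relies on.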
Tobias Burnus [Thu, 30 Sep 2021 17:08:25 +0000 (19:08 +0200)]
Fortran: Fix same_type_as
A test for CLASS(*) + assumed rank was missing; adding a test to
unlimited_polymorphic_1.f03 showed an ICE as backend_decl wasn't
set. While gfc_get_symbol_decl would fix it, the code also assumed
that the class(*) was a variable and could not be a subobject of
a derived type.
PR fortran/71703
PR fortran/84007
gcc/fortran/ChangeLog:
* trans-intrinsic.c (gfc_conv_same_type_as): Fix handling
of UNLIMITED_POLY.
* trans.h (gfc_vtpr_hash_get): Renamed prototype to ...
(gfc_vptr_hash_get): ... this to match function name.
gcc/testsuite/ChangeLog:
* gfortran.dg/c-interop/c535b-1.f90: Remove wrong comment.
* gfortran.dg/unlimited_polymorphic_1.f03: Extend.
* gfortran.dg/unlimited_polymorphic_32.f90: New test.
Iain Buclaw [Sat, 25 Sep 2021 21:18:53 +0000 (23:18 +0200)]
libphobos: Select the appropriate exception handler in getClassInfo
This is analogous to __gdc_personality, which ignores in-flight
exceptions that we haven't collided with yet.
libphobos/ChangeLog:
* libdruntime/gcc/deh.d (ExceptionHeader.getClassInfo): Move to...
(getClassInfo): ...here as free function. Add lsda parameter.
(scanLSDA): Pass lsda to actionTableLookup.
(actionTableLookup): Add lsda parameter, pass to getClassInfo.
(__gdc_personality): Remove currentCfa variable.
Iain Buclaw [Sat, 25 Sep 2021 21:03:41 +0000 (23:03 +0200)]
libphobos: Print stacktrace before terminating program due to uncaught exception.
By default, D run-time has a top level exception handler to catch
anything that was uncaught by user code. However when the
`rt_trapExceptions' flag is cleared, this handler would not be enabled,
and this termination would occur, aborting the program, but without any
information about the exception.
libphobos/ChangeLog:
* libdruntime/gcc/deh.d (_d_print_throwable): Declare.
(_d_throw): Print stacktrace before terminating program due to
uncaught exception.
Iain Buclaw [Fri, 24 Sep 2021 08:49:13 +0000 (10:49 +0200)]
libphobos: Remove unused variables in gcc.backtrace.
The core.runtime module always overrides the default parameter value for
constructor calls. MaxAlignment is not required because a class can be
created on the stack with the `scope' keyword.
libphobos/ChangeLog:
* libdruntime/core/runtime.d (runModuleUnitTests): Use scope to new
LibBacktrace on the stack.
* libdruntime/gcc/backtrace.d (FIRSTFRAME): Remove.
(LibBacktrace.MaxAlignment): Remove.
(LibBacktrace.this): Remove default initialization of firstFrame.
(UnwindBacktrace.this): Likewise.
Iain Buclaw [Sat, 25 Sep 2021 17:50:52 +0000 (19:50 +0200)]
libphobos: Give _Unwind_Exception an alignment that best resembles __attribute__((aligned))
For interoperability with C++ EH, the alignment should match, otherwise
D may not be able to intercept exceptions thrown from C++.
libphobos/ChangeLog:
* libdruntime/gcc/unwind/generic.d (__aligned__): Define.
(_Unwind_Exception): Align struct to __aligned__.
Iain Buclaw [Fri, 24 Sep 2021 08:59:47 +0000 (10:59 +0200)]
libphobos: Define main function as extern(C) when compiling without D runtime (PR102476)
The default supplied main function as read when compiling with `-fmain'
has extern(D) linkage. However this does not work when mixing this
option together with `-fno-druntime'.
PR d/102476
gcc/testsuite/ChangeLog:
* gdc.dg/pr102476.d: New test.
libphobos/ChangeLog:
* libdruntime/__main.di: Define main function as extern(C) when
compiling without D runtime.
Tobias Burnus [Thu, 30 Sep 2021 12:44:06 +0000 (14:44 +0200)]
libgomp.fortran/alloc-*.f90: Add missing dg-prune-output
libgomp/
* testsuite/libgomp.fortran/alloc-7.f90: Add dg-prune-output
for -fintrinsic-modules-path= warning of the C compiler.
* testsuite/libgomp.fortran/alloc-9.f90: Likewise.
* testsuite/libgomp.fortran/alloc-10.f90: Likewise.
Tobias Burnus [Thu, 30 Sep 2021 12:26:46 +0000 (14:26 +0200)]
openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc for Fortran
gcc/ChangeLog:
* omp-low.c (omp_runtime_api_call): Add omp_aligned_{,c}alloc and
omp_{c,re}alloc, fix omp_alloc/omp_free.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1): Set implementation status to Y for
omp_aligned_{,c}alloc and omp_{c,re}alloc routines.
* omp_lib.f90.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc,
omp_realloc): Add.
* omp_lib.h.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc,
omp_realloc): Add.
* testsuite/libgomp.fortran/alloc-10.f90: New test.
* testsuite/libgomp.fortran/alloc-6.f90: New test.
* testsuite/libgomp.fortran/alloc-7.c: New test.
* testsuite/libgomp.fortran/alloc-7.f90: New test.
* testsuite/libgomp.fortran/alloc-8.f90: New test.
* testsuite/libgomp.fortran/alloc-9.f90: New test.
Martin Liska [Thu, 30 Sep 2021 12:12:35 +0000 (14:12 +0200)]
testsuite: Skip a test-case when LTO is used [PR102509]
PR testsuite/102509
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/attr-complex-method.c: Skip if LTO is
used.
* gcc.c-torture/compile/attr-complex-method-2.c: Likewise.
Martin Liska [Wed, 15 Sep 2021 11:52:35 +0000 (13:52 +0200)]
Do not hide asm_out_file in ASM_OUTPUT_ASCII.
gcc/ChangeLog:
* defaults.h (ASM_OUTPUT_ASCII): Do not hide global variable
asm_out_file and stream directly to MYFILE.
Richard Biener [Thu, 30 Sep 2021 11:05:45 +0000 (13:05 +0200)]
Refine alignment peeling fix
This refines the previous fix further by reverting to the original
code since the API is a bit of a mess. It also fixes the vector type
used to query the misalignment - that was what triggered the original
bogus change.
2021-09-30 Richard Biener <rguenther@suse.de>
* tree-vect-data-refs.c (vect_update_misalignment_for_peel):
Restore and fix condition under which we apply npeel to
the DRs misalignment value.
Richard Biener [Thu, 30 Sep 2021 08:21:36 +0000 (10:21 +0200)]
Fix thinko in previous alignment peeling change
I was mistaken in that npeel is -1 for variable peeling - it is 0.
2021-09-30 Richard Biener <rguenther@suse.de>
* tree-vect-data-refs.c (vect_update_misalignment_for_peel):
Fix npeel check for variable amount of peeling.
Jonathan Wakely [Thu, 30 Sep 2021 07:59:21 +0000 (08:59 +0100)]
libstdc++: Fix preprocessor check for C++17
libstdc++-v3/ChangeLog:
* include/bits/regex.h (basic_regex::multiline): Fix #if
condition.
Aldy Hernandez [Tue, 28 Sep 2021 13:54:20 +0000 (15:54 +0200)]
Plug possible snprintf overflow in lto-wrapper.
My upcoming improvements to the DOM threader triggered a warning in
this code. It looks like the format string is ".ltrans%u.ltrans", but
we're only writing a max of ".ltrans" + whatever the MAX_INT is here.
Tested on x86-64 Linux.
gcc/ChangeLog:
* lto-wrapper.c (run_gcc): Plug snprintf overflow.
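One general way to avoid this class of overflow is to let snprintf measure the output before writing it. The sketch below is not the change made to lto-wrapper.c; ltrans_suffix is a hypothetical helper that only reuses the format string quoted in the message above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Format ".ltrans<N>.ltrans" into a buffer sized from the measured length.
std::string ltrans_suffix(unsigned n) {
    // First pass: a null buffer with size 0 only reports the needed length.
    const int len = std::snprintf(nullptr, 0, ".ltrans%u.ltrans", n);
    assert(len >= 0);
    // Reserve one extra byte for the terminating NUL that snprintf writes.
    std::string out(static_cast<std::size_t>(len) + 1, '\0');
    // Second pass: the buffer is now large enough for any value of n.
    const int written = std::snprintf(&out[0], out.size(), ".ltrans%u.ltrans", n);
    assert(written == len);
    (void)written;  // silence unused-variable warnings when NDEBUG is defined
    out.resize(static_cast<std::size_t>(len));  // drop the terminating NUL
    return out;
}
```

The buffer has to hold both literal parts of the format plus the widest possible number, which is the quantity the message above describes as undercounted.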
Jakub Jelinek [Thu, 30 Sep 2021 07:30:18 +0000 (09:30 +0200)]
openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc
This patch adds new OpenMP 5.1 allocator entrypoints and in addition to that
fixes an omp_alloc bug which is hard to test for - if the first allocator
fails but has a larger alignment trait and has a fallback allocator, either
the default behavior or a user fallback, then the extra alignment will be used
even in the fallback allocation, rather than just starting with whatever
alignment has been requested (in GOMP_alloc or the minimum one in omp_alloc).
Jonathan's comment on IRC this morning made me realize that I should add
alloc_align attributes to 2 of the prototypes and I still need to add testsuite
coverage for omp_realloc, will do that in a follow-up.
2021-09-30 Jakub Jelinek <jakub@redhat.com>
* omp.h.in (omp_aligned_alloc, omp_calloc, omp_aligned_calloc,
omp_realloc): New prototypes.
(omp_alloc): Move after omp_free prototype, add __malloc__ (omp_free)
attribute.
* allocator.c: Include string.h.
(omp_aligned_alloc): No longer static, add ialias. Add new_alignment
variable and use it instead of alignment so that when retrying the old
alignment is used again. Don't retry if new alignment is the same
as old alignment, unless allocator had pool size.
(omp_alloc, GOMP_alloc, GOMP_free): Use ialias_call.
(omp_aligned_calloc, omp_calloc, omp_realloc): New functions.
* libgomp.map (OMP_5.0.2): Export omp_aligned_alloc, omp_calloc,
omp_aligned_calloc and omp_realloc.
* testsuite/libgomp.c-c++-common/alloc-4.c (main): Add
omp_aligned_alloc, omp_calloc and omp_aligned_calloc tests.
* testsuite/libgomp.c-c++-common/alloc-5.c: New test.
* testsuite/libgomp.c-c++-common/alloc-6.c: New test.
* testsuite/libgomp.c-c++-common/alloc-7.c: New test.
* testsuite/libgomp.c-c++-common/alloc-8.c: New test.
Aldy Hernandez [Wed, 29 Sep 2021 18:50:20 +0000 (20:50 +0200)]
Add gimple_ranger::debug.
I'm trying to add one debug() for each dump() to the dumping aids.
Tested on x86-64 Linux.
gcc/ChangeLog:
* gimple-range.cc (gimple_ranger::debug): New.
* gimple-range.h (class gimple_ranger): Add debug.
Aldy Hernandez [Thu, 30 Sep 2021 00:19:36 +0000 (02:19 +0200)]
Plug memory leak in hybrid_threader.
Tested on x86-64 Linux.
gcc/ChangeLog:
PR middle-end/102519
* tree-vrp.c (hybrid_threader::~hybrid_threader): Free m_query.
GCC Administrator [Thu, 30 Sep 2021 00:16:20 +0000 (00:16 +0000)]
Daily bump.
Indu Bhagat [Wed, 29 Sep 2021 20:25:39 +0000 (13:25 -0700)]
debug/102507: ICE in btf_finalize when compiling with -gbtf
Fix the freeing of the btf_var_ids hash_map in btf_finalize ().
gcc/ChangeLog:
PR debug/102507
* btfout.c (GTY): Add GTY (()), albeit for cosmetic purposes only.
(btf_finalize): Empty the hash_map btf_var_ids.
Jonathan Wakely [Wed, 29 Sep 2021 20:00:30 +0000 (21:00 +0100)]
MAINTAINERS: Add myself to DCO section
ChangeLog:
* MAINTAINERS: Add myself to DCO section.
Aldy Hernandez [Tue, 28 Sep 2021 15:53:57 +0000 (17:53 +0200)]
[PR102501] Adjust jump threading testcases for ppc64* and others.
I really don't know what to do here. This is a bit of whack-a-mole.
The IL is sufficiently different for various architectures that any
tweak can cause the number of jump threads to vary.
For the pr77445-2.c testcase, we have fewer threading candidates because 2
of them now cross loop boundaries. Interestingly, this test matches
"Jumps threaded", not threads registered, so the block copier can
drop threads at copying time adding further confusion.
For example, we can register N threads, but the old copier can cancel
N-M threads while updating the CFG for a variety of different reasons
(removed edges, threading through loop exits, etc). This makes the
"Registering jump threads" not to match the total number of threads this
test checks for with "Jumps threaded".
The pr66752-3.c test OTOH, is just a matter of thread4 eliminating the
"if". I had erroneously thought it would always be eliminated by
thread3, but we really don't care where it gets cleaned up. All we know
is that DCE can't depend on the early threaders doing this work, because
it may cross loop boundaries. I've chosen thread4 arbitrarily, but we
could just as easily pick the ".optimized" dump.
Sorry, I'm really at my wits' end here. I don't see any clean path
forward, except to rewrite these tests as gimple IL. They're close to useless
as they sit.
gcc/testsuite/ChangeLog:
PR testsuite/102501
* gcc.dg/tree-ssa/pr66752-3.c: Adjust.
* gcc.dg/tree-ssa/pr77445-2.c: Adjust.
Aldy Hernandez [Wed, 29 Sep 2021 08:02:12 +0000 (10:02 +0200)]
Avoid CFG updates in VRP threader if nothing changed.
There is no need to update the CFG or SSAs if nothing has changed in VRP
threading.
gcc/ChangeLog:
* tree-vrp.c (thread_through_all_blocks): Return bool.
(execute_vrp_threader): Return TODO_* flags.
(pass_data_vrp_threader): Set todo_flags_finish to 0.
Aldy Hernandez [Wed, 29 Sep 2021 15:16:49 +0000 (17:16 +0200)]
Use a separate TV_* timer for the VRP threader.
There seems to be a memory consumption issue on 32 bit hosts after the
hybrid threader patchset. I'm having a hard time reproducing, and in
the process I've noticed that the threader is using the TV_TREE_VRP
timer. Having a distinct one could help diagnose this and other
issues going forward.
gcc/ChangeLog:
* timevar.def (TV_TREE_VRP_THREADER): New.
* tree-vrp.c: Use TV_TREE_VRP_THREADER for VRP threader pass.
Harald Anlauf [Wed, 29 Sep 2021 18:11:53 +0000 (20:11 +0200)]
Fortran: fix error recovery for invalid constructor
gcc/fortran/ChangeLog:
PR fortran/102520
* array.c (expand_constructor): Do not dereference NULL pointer.
gcc/testsuite/ChangeLog:
PR fortran/102520
* gfortran.dg/pr102520.f90: New test.
David Faust [Tue, 28 Sep 2021 17:29:50 +0000 (10:29 -0700)]
bpf: correct extra_headers
The BPF CO-RE support (commit
8bdabb37549f12ce727800a1c8aa182c0b1dd42a)
mistakenly overwrote bpf-*-* extra_headers in config.gcc, causing
bpf-helpers.h to not be installed. The redefinition with coreout.h is
unneeded, so delete it.
gcc/ChangeLog:
* config.gcc (bpf-*-*): Do not overwrite extra_headers.
Jeff Law [Wed, 29 Sep 2021 15:21:42 +0000 (11:21 -0400)]
Fix more testsuite fallout from computed goto changes
gcc/testsuite
* gcc.c-torture/compile/920831-1.c: Fix computed goto types.
* gcc.c-torture/compile/pr27863.c: Likewise.
Jonathan Wright [Thu, 23 Sep 2021 13:27:22 +0000 (14:27 +0100)]
aarch64: Fix type qualifiers for qtbl1 and qtbx1 Neon builtins
Fix type qualifiers for qtbl1 and qtbx1 Neon builtins and remove
casts from the Neon intrinsic function bodies that use these
builtins.
gcc/ChangeLog:
2021-09-23 Jonathan Wright <jonathan.wright@arm.com>
* config/aarch64/aarch64-builtins.c (TYPES_BINOP_PPU): Define
new type qualifier enum.
(TYPES_TERNOP_SSSU): Likewise.
(TYPES_TERNOP_PPPU): Likewise.
* config/aarch64/aarch64-simd-builtins.def: Define PPU, SSU,
PPPU and SSSU builtin generator macros for qtbl1 and qtbx1
Neon builtins.
* config/aarch64/arm_neon.h (vqtbl1_p8): Use type-qualified
builtin and remove casts.
(vqtbl1_s8): Likewise.
(vqtbl1q_p8): Likewise.
(vqtbl1q_s8): Likewise.
(vqtbx1_s8): Likewise.
(vqtbx1_p8): Likewise.
(vqtbx1q_s8): Likewise.
(vqtbx1q_p8): Likewise.
(vtbl1_p8): Likewise.
(vtbl2_p8): Likewise.
(vtbx2_p8): Likewise.
Jonathan Wakely [Wed, 29 Sep 2021 12:48:19 +0000 (13:48 +0100)]
libstdc++: Implement std::regex_constants::multiline (LWG 2503)
This implements LWG 2503, which allows ^ and $ to match line terminator
characters, rather than only matching the beginning and end of the
entire input. The multiline option is only valid for ECMAScript, but
for other grammars we ignore it rather than throwing an exception.
This is related to PR libstdc++/102480, which incorrectly said that
ECMAScript should match the beginning of a line when match_prev_avail
is used. I think that's only supposed to happen when multiline is used.
The new regex_constants::multiline and basic_regex::multiline constants
are not defined for strict -std=c++11 and -std=c++14 modes, but
regex_constants::__multiline is always defined, so that the
implementation can use it internally.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/regex.h (basic_regex::multiline): Define constant
for C++17.
* include/bits/regex_constants.h (regex_constants::multiline):
Define constant for C++17.
(regex_constants::__multiline): Define duplicate constant for
internal use in C++11 and C++14.
* include/bits/regex_executor.h (_Executor::_M_match_multiline()):
New member function.
(_Executor::_M_is_line_terminator(_CharT)): New member function.
(_Executor::_M_at_begin(), _Executor::_M_at_end()): Use new
member functions to support multiline matches.
* testsuite/28_regex/algorithms/regex_match/multiline.cc: New test.
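The effect of the new flag can be sketched as follows. This assumes a standard library in which std::regex_constants::multiline is available (C++17, or a non-strict earlier mode as described in this log); line_start_matches and check_multiline are hypothetical names, and the snippet is not the new testsuite file.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Does "^b" find a match in `text`? With multiline set, ^ may also match
// immediately after a line terminator, not only at the start of the input.
bool line_start_matches(const std::string& text, bool multiline) {
    std::regex_constants::syntax_option_type flags =
        std::regex_constants::ECMAScript;
    if (multiline)
        flags = flags | std::regex_constants::multiline;
    const std::regex re("^b", flags);
    return std::regex_search(text, re);
}

void check_multiline() {
    // "b" starts the second line, so it matches only in multiline mode.
    assert(line_start_matches("a\nb", true));
    assert(!line_start_matches("a\nb", false));
    // At the very start of the input, ^ matches in either mode.
    assert(line_start_matches("b", false));
}
```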
Jonathan Wakely [Wed, 29 Sep 2021 12:48:15 +0000 (13:48 +0100)]
libstdc++: Check for invalid syntax_option_type values in <regex>
The standard says that it is invalid for more than one grammar element
to be set in a value of type regex_constants::syntax_option_type. This
adds a check in the regex compiler and throws an exception if an invalid
value is used.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/regex_compiler.h (_Compiler::_S_validate): New
function.
* include/bits/regex_compiler.tcc (_Compiler::_Compiler): Use
_S_validate to check flags.
* include/bits/regex_error.h (_S_grammar): New error code for
internal use.
* testsuite/28_regex/basic_regex/ctors/grammar.cc: New test.
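The rule being enforced can be sketched as a count of the grammar bits in a flag value. This is an illustration only and is not the body of _Compiler::_S_validate; has_single_grammar and check_grammar_flags are hypothetical names, and the sketch assumes each grammar constant is a distinct non-zero bit, as in libstdc++.

```cpp
#include <cassert>
#include <regex>

// Hypothetical helper: true when at most one grammar element is set.
bool has_single_grammar(std::regex_constants::syntax_option_type f) {
    using namespace std::regex_constants;
    const syntax_option_type grammars[] = {ECMAScript, basic, extended,
                                           awk, grep, egrep};
    int count = 0;
    for (syntax_option_type g : grammars)
        if ((f & g) != syntax_option_type())  // is this grammar bit set?
            ++count;
    return count <= 1;
}

void check_grammar_flags() {
    using namespace std::regex_constants;
    assert(has_single_grammar(ECMAScript | icase));  // one grammar plus an option
    assert(has_single_grammar(icase));   // no grammar: ECMAScript is assumed
    assert(!has_single_grammar(basic | awk));        // two grammars: invalid
}
```

Non-grammar options such as icase do not count toward the limit, so they can be combined freely with a single grammar.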
Jonathan Wakely [Wed, 29 Sep 2021 12:48:11 +0000 (13:48 +0100)]
libstdc++: std::basic_regex should treat '\0' as an ordinary char [PR84110]
When the input sequence contains a _CharT(0) character, the strchr call
in _Scanner<_CharT>::_M_scan_normal() will search for '\0' and so return
a pointer to the terminating null at the end of the string. This makes
the scanner think it's found a special character. Because it doesn't
match any of the actual special characters, we fall off the end of the
function (or assert in debug mode).
We should check for a null character explicitly and either treat it as
an ordinary character (for the ECMAScript grammar) or an error (for all
others). I'm not 100% sure that's right, but it seems consistent with
the POSIX RE rules where a '\0' means the end of the regex pattern or
the end of the sequence being matched.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/84110
* include/bits/regex_error.h (regex_constants::_S_null): New
error code for internal use.
* include/bits/regex_scanner.tcc (_Scanner::_M_scan_normal()):
Check for null character.
* testsuite/28_regex/basic_regex/84110.cc: New test.
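For the ECMAScript grammar, the behavior described above can be exercised as below. This is a sketch that assumes a library containing this fix; nul_pattern_matches_nul is a hypothetical name and the snippet is not a copy of the new test.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Build a pattern consisting of a single NUL character and match it
// against an input consisting of the same single character.
bool nul_pattern_matches_nul() {
    const std::string s(1, '\0');  // length 1, not an empty string
    const std::regex re(s);        // default grammar is ECMAScript
    return std::regex_match(s, re);
}

void check_nul_is_ordinary() {
    // With the fix, NUL is treated as an ordinary character here.
    assert(nul_pattern_matches_nul());
}
```

The pattern must be built from a std::string with an explicit length, since a const char* argument would stop at the NUL and produce an empty pattern.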
Jonathan Wakely [Wed, 29 Sep 2021 12:48:02 +0000 (13:48 +0100)]
libstdc++: Simplify std::basic_regex construction and assignment
Introduce a new _M_compile function which does the common work needed by
all constructors and assignment. Call that directly to avoid multiple
levels of constructor delegation or calls to basic_regex::assign
overloads.
For assignment, there is no need to construct a std::basic_string if we
already have a contiguous sequence of the correct character type, and no
need to construct a temporary basic_regex when assigning from an
existing basic_regex.
Also define the copy and move assignment operators as defaulted, which
does the right thing without constructing a temporary and swapping it.
Copying or moving the shared_ptr member cannot fail, so they can be
noexcept. The assign(const basic_regex&) and assign(basic_regex&&)
members can then be defined in terms of copy or move assignment.
The new _M_compile function takes pointer arguments, so the caller has
to convert arbitrary iterator ranges into a contiguous sequence of
characters. With that simplification, the __compile_nfa helpers are not
needed and can be removed.
This also fixes a bug where construction from a contiguous sequence with
the wrong character type would fail to compile, rather than converting
the elements to the regex character type.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/regex.h (__detail::__is_contiguous_iter): Move
here from <bits/regex_compiler.h>.
(basic_regex::_M_compile): New function to compile an NFA from
a regular expression string.
(basic_regex::basic_regex): Use _M_compile instead of delegating
to other constructors.
(basic_regex::operator=(const basic_regex&)): Define as
defaulted.
(basic_regex::operator=(initializer_list<C>)): Use _M_compile.
(basic_regex::assign(const basic_regex&)): Use copy assignment.
(basic_regex::assign(basic_regex&&)): Use move assignment.
(basic_regex::assign(const C*, flag_type)): Use _M_compile
instead of constructing a temporary string.
(basic_regex::assign(const C*, size_t, flag_type)): Likewise.
(basic_regex::assign(const basic_string<C,T,A>&, flag_type)):
Use _M_compile instead of constructing a temporary basic_regex.
(basic_regex::assign(InputIter, InputIter, flag_type)): Avoid
constructing a temporary string for contiguous iterators of the
right value type.
* include/bits/regex_compiler.h (__is_contiguous_iter): Move to
<bits/regex.h>.
(__enable_if_contiguous_iter, __disable_if_contiguous_iter)
(__compile_nfa): Remove.
* testsuite/28_regex/basic_regex/assign/exception_safety.cc: New
test.
* testsuite/28_regex/basic_regex/ctors/char/other.cc: New test.
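From the caller's side, the assign overloads touched by this entry should all produce an equivalent regex. The sketch below only uses the public std::basic_regex interface; check_assign_overloads is a hypothetical name, and nothing here depends on the internal _M_compile function.

```cpp
#include <cassert>
#include <list>
#include <regex>
#include <string>
#include <utility>

void check_assign_overloads() {
    const std::string pattern = "ab+";
    const std::string input = "abbb";

    std::regex from_cstr;
    from_cstr.assign("ab+");                        // null-terminated string

    std::regex from_string;
    from_string.assign(pattern);                    // std::string

    // A std::list is not contiguous, so this takes the generic iterator path.
    const std::list<char> chars(pattern.begin(), pattern.end());
    std::regex from_range;
    from_range.assign(chars.begin(), chars.end());  // iterator pair

    std::regex copied;
    copied.assign(from_cstr);                       // copy assignment

    std::regex moved;
    moved.assign(std::move(from_string));           // move assignment

    // Every regex that still holds a value matches the same input.
    assert(std::regex_match(input, from_cstr));
    assert(std::regex_match(input, from_range));
    assert(std::regex_match(input, copied));
    assert(std::regex_match(input, moved));
}
```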
Richard Biener [Wed, 29 Sep 2021 12:32:32 +0000 (14:32 +0200)]
testsuite/102517 - fix FAIL of gcc.dg/pr78408-1.c with OImode availability
This fixes the testcase, which looks for variants of memcpy after
memset folding. The match is disturbed when we expand the memcpy inline
earlier: that in fact performs the desired optimization, but the dump
file no longer matches. For ease of testing, the following adjusts the
smaller structure size to no longer be a power of two, which avoids
the inline expansion.
2021-09-29 Richard Biener <rguenther@suse.de>
PR testsuite/102517
* gcc.dg/pr78408-1.c: Make S not power-of-two size.
Richard Biener [Wed, 29 Sep 2021 09:18:23 +0000 (11:18 +0200)]
Fix peeling for alignment with negative step
The following fixes a regression causing us to no longer peel
negative step loops for alignment. With dr_misalignment now
applying the bias for negative step we have to do the reverse
when adjusting the misalignment for peeled DRs.
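For readers unfamiliar with the term, a negative step loop is one that
walks memory from high addresses to low ones. The function below is a
hypothetical example of that shape and is not taken from the new
testcases. Whether the vectorizer actually peels it for alignment
depends on the target and the options in use.

```cpp
#include <cstddef>

// Hypothetical example: the index runs from n - 1 down to 0, so the
// data references dst[i] and src[i] have a negative step.
void add_reversed(float* dst, const float* src, std::ptrdiff_t n)
{
  for (std::ptrdiff_t i = n - 1; i >= 0; --i)
    dst[i] += src[i];
}
```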
2021-09-29 Richard Biener <rguenther@suse.de>
* tree-vect-data-refs.c (vect_dr_misalign_for_aligned_access):
New helper.
(vect_update_misalignment_for_peel): Use it to update
misaligned to the value necessary for an aligned access.
(vect_get_peeling_costs_all_drs): Likewise.
(vect_enhance_data_refs_alignment): Likewise.
* gcc.target/i386/vect-alignment-peeling-1.c: New testcase.
* gcc.target/i386/vect-alignment-peeling-2.c: Likewise.
Kyrylo Tkachov [Wed, 29 Sep 2021 10:21:45 +0000 (11:21 +0100)]
aarch64: Improve size heuristic for cpymem expansion
Similar to my previous patch for setmem, this one applies the same size heuristic to the cpymem expansion.
When optimising for size, we count the number of ops emitted and compare it against the alternative of
just calling the library function.
For the code:
void
cpy_127 (char *out, char *in)
{
  __builtin_memcpy (out, in, 127);
}
void
cpy_128 (char *out, char *in)
{
  __builtin_memcpy (out, in, 128);
}
we now emit a call to memcpy (with an extra MOV-immediate instruction for the size) instead of:
cpy_127(char*, char*):
        ldp     q0, q1, [x1]
        stp     q0, q1, [x0]
        ldp     q0, q1, [x1, 32]
        stp     q0, q1, [x0, 32]
        ldp     q0, q1, [x1, 64]
        stp     q0, q1, [x0, 64]
        ldr     q0, [x1, 96]
        str     q0, [x0, 96]
        ldr     q0, [x1, 111]
        str     q0, [x0, 111]
        ret
cpy_128(char*, char*):
        ldp     q0, q1, [x1]
        stp     q0, q1, [x0]
        ldp     q0, q1, [x1, 32]
        stp     q0, q1, [x0, 32]
        ldp     q0, q1, [x1, 64]
        stp     q0, q1, [x0, 64]
        ldp     q0, q1, [x1, 96]
        stp     q0, q1, [x0, 96]
        ret
which is a clear code size win. Speed optimisation heuristics remain unchanged.
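The instruction counts in the listings above can be reproduced with a
small model. This is only a sketch under simplifying assumptions and is
not the code in aarch64_expand_cpymem: it assumes 16-byte Q registers,
sizes of at least 16 bytes, and a single overlapping load/store pair
for any tail. The function name is made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Simplified model (an assumption, not GCC's implementation): number of
// load/store instructions in an inline copy of `bytes` bytes.
unsigned inline_copy_ops(std::size_t bytes)
{
  assert(bytes >= 16);  // the overlapping tail needs one full Q register

  unsigned ops = 0;
  // One LDP plus one STP moves 32 bytes.
  for (; bytes >= 32; bytes -= 32)
    ops += 2;
  // One LDR plus one STR moves a remaining full 16-byte chunk.
  if (bytes >= 16)
    {
      ops += 2;
      bytes -= 16;
    }
  // A tail shorter than 16 bytes is covered by one LDR/STR pair that
  // overlaps bytes which were already copied.
  if (bytes > 0)
    ops += 2;
  return ops;
}
```

Under this model a 127-byte copy needs 10 loads and stores and a
128-byte copy needs 8, which agrees with the two listings.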
2021-09-29 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
* config/aarch64/aarch64.c (aarch64_expand_cpymem): Count number of
emitted operations and adjust heuristic for code size.
2021-09-29 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
* gcc.target/aarch64/cpymem-size.c: New test.
Kyrylo Tkachov [Wed, 29 Sep 2021 10:00:14 +0000 (11:00 +0100)]
aarch64: Improve size optimisation heuristic for setmem expansion
This patch adjusts the setmem expansion in the backend to track the number of ops it generates
for the DUP + STR/STP inline sequences. This way we can compare the size/complexity of the sequence
against alternatives, notably just returning "false" and thus just emitting a call to memset.
The simple heuristic change here is that if we were going to emit more than 4 operations then
we shouldn't bother and just call memset. The number 4 is chosen because in the worst case for memset
we need to emit 4 instructions: 3 to move the arguments into the right registers and 1 for the call.
The speed optimisation decisions are not affected, though I do want to extend these expansions in a later
patch and I'd like to reuse this ops counting logic there. In any case this patch should make sense on its own.
For the code:
#include <stdint.h>
void __attribute__((__noinline__))
set127byte (int64_t *src, int c)
{
  __builtin_memset (src, c, 127);
}
void __attribute__((__noinline__))
set128byte (int64_t *src, int c)
{
  __builtin_memset (src, c, 128);
}
when optimising for size we now get just an immediate move + a call to memset (2 instructions) where before we'd have generated:
set127byte(long*, int):
        dup     v0.16b, w1
        str     q0, [x0, 96]
        stp     q0, q0, [x0]
        stp     q0, q0, [x0, 32]
        stp     q0, q0, [x0, 64]
        str     q0, [x0, 111]
        ret
set128byte(long*, int):
        dup     v0.16b, w1
        stp     q0, q0, [x0]
        stp     q0, q0, [x0, 32]
        stp     q0, q0, [x0, 64]
        stp     q0, q0, [x0, 96]
        ret
which is clearly undesirable for -Os.
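The size heuristic described above can be sketched as follows. This is
a hypothetical model, not the code in aarch64_expand_setmem: it assumes
sizes of at least 16 bytes, one DUP, one STP per 32 bytes, and at most
one STR plus one overlapping STR for the remainder. The function and
constant names are made up; only the threshold of 4 operations comes
from the description.

```cpp
#include <cassert>
#include <cstddef>

// Threshold from the description: more than 4 operations means the
// inline sequence is no shorter than a worst-case memset call.
constexpr unsigned max_inline_ops_for_size = 4;

// Simplified model (an assumption): operations in the inline sequence.
unsigned inline_set_ops(std::size_t bytes)
{
  assert(bytes >= 16);  // the overlapping tail needs one full Q register

  unsigned ops = 1;     // DUP broadcasts the byte value into v0
  for (; bytes >= 32; bytes -= 32)
    ++ops;              // STP q0, q0 stores 32 bytes
  if (bytes >= 16)
    {
      ++ops;            // STR q0 stores a remaining full 16-byte chunk
      bytes -= 16;
    }
  if (bytes > 0)
    ++ops;              // overlapping STR q0 covers the tail
  return ops;
}

// True if, when optimising for size, a call to memset is preferred.
bool prefer_memset_call_for_size(std::size_t bytes)
{
  return inline_set_ops(bytes) > max_inline_ops_for_size;
}
```

The model gives 6 operations for 127 bytes and 5 for 128 bytes, as in
the listings, so both become calls. A 64-byte set needs only 3
operations and would still be expanded inline.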
I've adjusted the recently-added gcc.target/aarch64/memset-strict-align-1.c testcase to use a bigger struct
and switch to speed optimisation as with this patch we'll just call memset rather than expanding inline.
That is the right decision for size optimisation (the resulting code is indeed shorter).
With -O2 and the new struct size we still try the SIMD expansion and still trigger the path that the testcase is supposed to exercise.
2021-09-27 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
* config/aarch64/aarch64.c (aarch64_expand_setmem): Count number of
emitted operations and adjust heuristic for code size.
2021-09-27 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
* gcc.target/aarch64/memset-corner-cases-2.c: New test.
* gcc.target/aarch64/memset-strict-align-1.c: Adjust.
Jakub Jelinek [Wed, 29 Sep 2021 08:17:52 +0000 (10:17 +0200)]
openmp: Disallow reduction with var private in containing parallel even on scope [PR102504]
The standard has a restriction:
"A list item that appears in a reduction clause of a scope construct must be
shared in the parallel region to which a corresponding scope region binds."
similar to the restriction for worksharing constructs, but we were checking
it only on worksharing constructs and not on scope, and so ICEd later on
during omp expansion.
2021-09-29 Jakub Jelinek <jakub@redhat.com>
PR middle-end/102504
* gimplify.c (gimplify_scan_omp_clauses): Use omp_check_private even
in OMP_SCOPE clauses, not just on worksharing construct clauses.
* c-c++-common/gomp/scope-4.c: New test.
Andrew Pinski [Wed, 29 Sep 2021 02:01:52 +0000 (02:01 +0000)]
Fix some testcases after my computed goto patch
For some reason I did not see these failures in my testing.
Sorry about that. Anyways this fixes the testcases by
adding a cast to __INTPTR_TYPE__ and then a cast to void*.
Committed after testing them on x86_64-linux-gnu.
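The double cast has roughly the following shape. This is a hypothetical
sketch and not one of the adjusted testcases: it relies on the GNU
labels-as-values extension, and the function name is made up. The
integer value is first converted to __INTPTR_TYPE__ and then to void*
before being used as the target of the computed goto.

```cpp
// Hypothetical example using the GNU labels-as-values extension.
int jump_via_integer()
{
  void* target = &&done;  // address of the label below

  // Round-trip through an integer that is wide enough for a pointer.
  __INTPTR_TYPE__ address = reinterpret_cast<__INTPTR_TYPE__>(target);

  // The operand of the computed goto is a void*, obtained by casting
  // the integer back.
  goto *reinterpret_cast<void*>(address);

  return 0;               // skipped by the jump
done:
  return 1;
}
```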
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/920826-1.c: Fix computed goto.
* gcc.c-torture/compile/pr27863.c: Likewise.
* gcc.c-torture/compile/pr70190.c: Likewise.
* gcc.dg/torture/pr89135.c: Likewise.
* gcc.dg/torture/pr90071.c: Likewise.
* gcc.dg/vect/bb-slp-pr97709.c: Likewise.
Richard Biener [Wed, 29 Sep 2021 06:06:09 +0000 (08:06 +0200)]
Avoid memcpy inline expansion in gcc.dg/out-of-bounds-1.c
This avoids inline expansion to preserve the warning by making
the memcpy size a non-power-of-two as suggested by Martin Sebor.
2021-09-29 Richard Biener <rguenther@suse.de>
* gcc.dg/out-of-bounds-1.c: Make memcpied size not power-of-two.
GCC Administrator [Wed, 29 Sep 2021 00:16:26 +0000 (00:16 +0000)]
Daily bump.