Jonathan Wakely [Wed, 1 Dec 2021 17:56:23 +0000 (17:56 +0000)]
libstdc++: Fix non-reserved name in std::allocator base class [PR64135]
The possible base classes of std::allocator are new_allocator and
malloc_allocator, which both cause a non-reserved name to be declared in
every program that includes the definition of std::allocator. This is
non-conforming.
This change replaces __gnu_cxx::new_allocator with std::__new_allocator
which is identical except for using a reserved name. The non-standard
extension __gnu_cxx::new_allocator is preserved as a thin wrapper over
std::__new_allocator. There is no problem with the extension using a
non-reserved name now that it's not included by default in other
headers.
The same change could be done to __gnu_cxx::malloc_allocator but as it's
not the default configuration it can wait.
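The wrapper arrangement described above can be sketched as follows. This is only an illustration: user code cannot declare reserved names, so `lib_detail::base_new_allocator` and `ext_sketch::new_allocator` are hypothetical stand-ins for std::__new_allocator and __gnu_cxx::new_allocator, and the real classes have more members.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

namespace lib_detail {
// Hypothetical stand-in for the implementation class that the default
// headers pull in; in the library it would carry a reserved name.
template <typename T>
class base_new_allocator {
public:
  using value_type = T;

  T* allocate(std::size_t n) {
    assert(n != 0);
    return static_cast<T*>(::operator new(n * sizeof(T)));
  }

  void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};
}  // namespace lib_detail

namespace ext_sketch {
// Thin wrapper that keeps the old public name and adds no behaviour.
// Only programs that include the extension header would see this name.
template <typename T>
class new_allocator : public lib_detail::base_new_allocator<T> {};
}  // namespace ext_sketch
```

The point of the split is that only the wrapper's header introduces the non-reserved name, so programs that never include it are unaffected.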
libstdc++-v3/ChangeLog:
PR libstdc++/64135
* config/allocator/new_allocator_base.h: Include
<bits/new_allocator.h> instead of <ext/new_allocator.h>.
(__allocator_base): Use std::__new_allocator instead of
__gnu_cxx::new_allocator.
* doc/xml/manual/allocator.xml: Document new default base class
for std::allocator.
* doc/xml/manual/evolution.xml: Likewise.
* doc/html/*: Regenerate.
* include/Makefile.am: Add bits/new_allocator.h.
* include/Makefile.in: Regenerate.
* include/experimental/memory_resource (new_delete_resource):
Use std::__new_allocator instead of __gnu_cxx::new_allocator.
* include/ext/new_allocator.h (new_allocator): Derive from
std::__new_allocator. Move implementation to ...
* include/bits/new_allocator.h: New file.
* testsuite/20_util/allocator/64135.cc: New test.
Jan Hubicka [Thu, 9 Dec 2021 20:02:17 +0000 (21:02 +0100)]
Limit inlining functions called once
As discussed in PR ipa/103454, there are several benchmarks that regress
for -finline-functions-called-once. Runtimes:
- tramp3d with -Ofast. 31%
- exchange2 with -Ofast 11-21%
- roms O2 9%-10%
- tonto 2.5-3.5% with LTO
Build times:
- specfp2006 41% (mostly wrf that builds 71% faster)
- specint2006 1.5-3%
- specfp2017 64% (again mostly wrf)
- specint2017 2.5-3.5%
This patch adds two params to tweak the behaviour:
1) max-inline-functions-called-once-loop-depth limiting the loop depth
(this is useful primarily for exchange where the inlined function is in
loop depth 9)
2) max-inline-functions-called-once-insns
We already have large-function-insns/growth parameters, but these are
limiting also inlining small functions, so reducing them will regress
very large functions that are hot.
Because inlining functions called once is meant just as a cleanup pass
I think it makes sense to have separate limit for it.
gcc/ChangeLog:
2021-12-09 Jan Hubicka <hubicka@ucw.cz>
* doc/invoke.texi (max-inline-functions-called-once-loop-depth,
max-inline-functions-called-once-insns): New parameters.
* ipa-inline.c (check_callers): Handle
param_inline_functions_called_once_loop_depth and
param_inline_functions_called_once_insns.
(edge_badness): Fix linebreaks.
* params.opt (param=max-inline-functions-called-once-loop-depth,
param=max-inline-functions-called-once-insns): New params.
Martin Sebor [Thu, 9 Dec 2021 19:49:28 +0000 (12:49 -0700)]
Extend the offset and size of merged object references [PR103215].
Resolves:
PR tree-optimization/103215 - bogus -Warray-bounds with two pointers with different offsets each
gcc/ChangeLog:
PR tree-optimization/103215
* pointer-query.cc (access_ref::merge_ref): Extend the offset and
size of the merged object instead of using the larger.
gcc/testsuite/ChangeLog:
PR tree-optimization/103215
* gcc.dg/Wstringop-overflow-58.c: Adjust and xfail expected warnings.
* gcc.dg/Wstringop-overflow-59.c: Same.
* gcc.dg/warn-strnlen-no-nul.c: Same.
* gcc.dg/Warray-bounds-91.c: New test.
* gcc.dg/Warray-bounds-92.c: New test.
* gcc.dg/Wstringop-overflow-85.c: New test.
* gcc.dg/Wstringop-overflow-87.c: New test.
Martin Sebor [Thu, 9 Dec 2021 18:24:14 +0000 (11:24 -0700)]
Avoid expecting nonzero size for access none void* arguments [PR101751].
Resolves:
PR middle-end/101751 - attribute access none with void pointer expects nonzero size
gcc/ChangeLog:
PR middle-end/101751
* doc/extend.texi (attribute access): Adjust.
* gimple-ssa-warn-access.cc (pass_waccess::maybe_check_access_sizes):
Treat access mode none on a void* argument as expecting as few as
zero bytes.
gcc/testsuite/ChangeLog:
PR middle-end/101751
* gcc.dg/Wstringop-overflow-86.c: New test.
Frederic Konrad [Thu, 14 Jan 2021 08:08:40 +0000 (09:08 +0100)]
Fix path to t-ppc64-fp for ppc*-vxworks7* libgcc tmake_file
This fixes a basic mistake in the relative path used to reference
a rs6000 specific Makefile fragment in the libgcc configuration bits
for powerpc*-vxworks7.
2021-01-14 Fred Konrad <konrad@adacore.com>
libgcc/
* config.host (powerpc*-wrs-vxworks7*): Fix path to
rs6000/t-ppc64-fp, relative to config/ not libgcc/.
Jakub Jelinek [Thu, 9 Dec 2021 16:55:28 +0000 (17:55 +0100)]
pch: Fix aarch64 build [PR71934]
On Thu, Dec 09, 2021 at 05:42:10PM +0100, Christophe Lyon wrote:
> This also broke aarch64 I think:
> In file included from
> /tmp/6140018_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/aarch64/aarch64-sve-builtins.cc:3920:0:
> ./gt-aarch64-sve-builtins.h: In function 'void
> gt_pch_p_19registered_function(void*, void*, gt_pointer_operator, void*)':
> ./gt-aarch64-sve-builtins.h:86:44: error: no matching function for call to
> 'gt_pch_nx(aarch64_sve::function_instance*, void (*&)(void*, void*, void*),
> void*&)'
> gt_pch_nx (&((*x).instance), op, cookie);
Fixed thusly.
2021-12-09 Jakub Jelinek <jakub@redhat.com>
PR pch/71934
* config/aarch64/aarch64-sve-builtins.cc (gt_pch_nx): Change type of
second argument from function with 2 pointer arguments to function
with 3 pointer arguments.
Olivier Hainque [Fri, 9 Apr 2021 15:46:42 +0000 (15:46 +0000)]
Leverage VX_CPU_PREFIX in aarch64-vxworks.h
This change tightens the CPU macro definitions issued
for VxWorks system headers on aarch64 to incorporate
the common VX_CPU_PREFIX facility, as the powerpc port
does.
The net effect for current configurations is the addition
of an actual "_VX_" prefix to the references to architecture
representative values. The absence of this prefix is most
often compensated for in system headers, but not always (when
going through particular #include paths), and this caused
a couple of spurious test failures.
2021-12-09 Olivier Hainque <hainque@adacore.com>
gcc/
* config/aarch64/aarch64-vxworks.h (TARGET_OS_CPP_BUILTINS):
Use VX_CPU_PREFIX in CPU definitions.
Martin Sebor [Mon, 6 Dec 2021 16:52:32 +0000 (09:52 -0700)]
Add a new dump function.
gcc/ChangeLog:
* pointer-query.cc (access_ref::dump): Define new function.
(pointer_query::dump): Call it.
* pointer-query.h (access_ref::dump): Declare new function.
Martin Sebor [Mon, 6 Dec 2021 16:33:32 +0000 (09:33 -0700)]
Refactor compute_objsize_r into helpers.
gcc/ChangeLog:
* pointer-query.cc (compute_objsize_r): Add an argument.
(gimple_call_return_array): Pass a new argument to compute_objsize_r.
(access_ref::merge_ref): Same.
(access_ref::inform_access): Add an argument and use it.
(access_data::access_data): Initialize new member.
(handle_min_max_size): Pass a new argument to compute_objsize_r.
(handle_decl): New function.
(handle_array_ref): Pass a new argument to compute_objsize_r.
Avoid incrementing deref.
(set_component_ref_size): New function.
(handle_component_ref): New function.
(handle_mem_ref): Pass a new argument to compute_objsize_r.
Only increment deref after successfully computing object size.
(handle_ssa_name): New function.
(compute_objsize_r): Move code into helpers and call them.
(compute_objsize): Pass a new argument to compute_objsize_r.
* pointer-query.h (access_ref::inform_access): Add an argument.
(access_data::ostype): New member.
Martin Sebor [Mon, 6 Dec 2021 16:23:22 +0000 (09:23 -0700)]
Introduce access_ref::merge_ref.
gcc/ChangeLog:
* pointer-query.cc (access_ref::merge_ref): Define new function.
(access_ref::get_ref): Move code into merge_ref and call it.
* pointer-query.h (access_ref::merge_ref): Declare new function.
Martin Sebor [Sat, 4 Dec 2021 23:57:48 +0000 (16:57 -0700)]
Pass GIMPLE statement to compute_objsize.
gcc/ChangeLog:
* gimple-ssa-warn-restrict.c (builtin_access::builtin_access): Pass
GIMPLE statement to compute_objsize.
* pointer-query.cc (compute_objsize): Add a statement argument.
* pointer-query.h (compute_objsize): Define a new overload.
Martin Sebor [Sat, 4 Dec 2021 23:46:17 +0000 (16:46 -0700)]
Move bndrng from access_ref to access_data.
gcc/ChangeLog:
* gimple-ssa-warn-access.cc (check_access): Adjust to member name
change.
(pass_waccess::check_strncmp): Same.
* pointer-query.cc (access_ref::access_ref): Remove arguments.
Simplify.
(access_data::access_data): Define new ctors.
(access_data::set_bound): Define new member function.
(compute_objsize_r): Remove unnecessary code.
* pointer-query.h (struct access_ref): Remove ctor arguments.
(struct access_data): Declare ctor overloads.
(access_data::dst_bndrng): New member.
(access_data::src_bndrng): New member.
Martin Sebor [Sat, 4 Dec 2021 23:22:07 +0000 (16:22 -0700)]
Use the recursive form of compute_objsize [PR 103143].
gcc/ChangeLog:
PR middle-end/103143
* pointer-query.cc (gimple_call_return_array): Call compute_objsize_r.
gcc/testsuite/ChangeLog:
PR middle-end/103143
* gcc.dg/Wstringop-overflow-83.c: New test.
Marek Polacek [Thu, 25 Nov 2021 14:08:03 +0000 (09:08 -0500)]
c++: Handle auto(x) in parameter-declaration-clause [PR103401]
In C++23, auto(x) is valid, so decltype(auto(x)) should also be valid,
so
void f(decltype(auto(0)));
should be just as
void f(int);
but currently, every time we see 'auto' in a parameter-declaration-clause,
we try to synthesize_implicit_template_parm for it, creating a new template
parameter list. The code above actually has us calling s_i_t_p twice;
once from cp_parser_decltype_expr -> cp_parser_postfix_expression which
fails and then again from cp_parser_decltype_expr -> cp_parser_expression.
So it looks like we have f<auto, auto> and we accept ill-formed code.
This shows that we need to be more careful about synthesizing the
implicit template parameter. [dcl.spec.auto.general] says that "A
placeholder-type-specifier of the form type-constraintopt auto can be
used as a decl-specifier of the decl-specifier-seq of a
parameter-declaration of a function declaration or lambda-expression..."
so this patch turns off auto_is_... after we've parsed the decl-specifier-seq.
That doesn't quite cut it yet though, because we also need to handle an
auto nested in the decl-specifier:
void f(decltype(new auto{0}));
therefore the cp_parser_decltype change.
To accept "sizeof(auto{10})", the cp_parser_type_id_1 hunk only gives a
hard error when we're not parsing tentatively.
The cp_parser_parameter_declaration hunk broke lambda-generic-85713-2.C but
I think the error we issue with this patch is in fact correct, and clang++
agrees.
The r11-1913 change is OK: we need to make sure that we see '(auto)' after
decltype to go ahead with 'decltype(auto)'.
PR c++/103401
gcc/cp/ChangeLog:
* parser.c (cp_parser_decltype): Clear
auto_is_implicit_function_template_parm_p.
(cp_parser_type_id_1): Give errors only when !cp_parser_simulate_error.
(cp_parser_parameter_declaration): Clear
auto_is_implicit_function_template_parm_p after parsing the
decl-specifier-seq.
(cp_parser_sizeof_operand): Clear
auto_is_implicit_function_template_parm_p.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/lambda-generic-85713-2.C: Add dg-error.
* g++.dg/cpp1y/pr60054.C: Adjust dg-error.
* g++.dg/cpp1y/pr60332.C: Likewise.
* g++.dg/cpp2a/concepts-pr84979-2.C: Likewise.
* g++.dg/cpp2a/concepts-pr84979-3.C: Likewise.
* g++.dg/cpp2a/concepts-pr84979.C: Likewise.
* g++.dg/cpp23/auto-fncast7.C: New test.
* g++.dg/cpp23/auto-fncast8.C: New test.
* g++.dg/cpp23/auto-fncast9.C: New test.
Chung-Lin Tang [Thu, 9 Dec 2021 16:38:20 +0000 (00:38 +0800)]
openmp: Fix libgomp.c++ testsuite errors for non-offload configs
Some testcases for libgomp.c++ only work for non-shared address space offloading,
because they exercise the zero-length array section behavior for offloaded
address space, testing for NULL/non-NULL cases.
libgomp/ChangeLog:
* testsuite/libgomp.c++/target-lambda-1.C: Only run under
"target offload_device_nonshared_as"
* testsuite/libgomp.c++/target-this-3.C: Likewise.
* testsuite/libgomp.c++/target-this-4.C: Likewise.
Olivier Hainque [Sun, 28 Nov 2021 15:21:25 +0000 (15:21 +0000)]
Provide vxworks alternate stdint.h during the build
This change arranges to provide the vxworks alternate stdint.h
at build time instead of at install time, so it is used instead
of the system one while building the libraries.
This is a lot more consistent and helps the build on configurations
where the system does not come with stdint.h at all.
The change uses a similar mechanism as the one previously introduced
for glimits.h and takes the opportunity to simplify the glimits.h
command to use an automatic variable.
This introduces an indirect dependency on the VxWorks version.h
for vxcrtstuff objects, for which we then need to apply the same
tricks as for libgcc2 regarding include paths (to select the system
header instead of the gcc one).
2021-02-12 Olivier Hainque <hainque@adacore.com>
Rasmus Villemoes <rv@rasmusvillemoes.dk>
gcc/
* Makefile.in (T_STDINT_GCC_H): New variable, path to
stdint-gcc.h that a target configuration may override when
use_gcc_stdint is "provide".
(stmp-int-hdrs): Depend on it and copy that for
USE_GCC_INT=provide.
* config.gcc (vxworks): Revert to use_gcc_stdint=provide.
* config/t-vxworks (T_STDINT_GCC_H): Define, as vxw-stdint-gcc.h.
(vxw-stdint-gcc.h): New target, produced from the original
stdint-gcc.h.
(vxw-glimits.h): Use an automatic variable to designate the
first and only prerequisite.
* config/vxworks/stdint.h: Remove.
libgcc/
* config/t-vxworks: Set CRTSTUFF_T_CFLAGS to
$(LIBGCC2_INCLUDES).
* config/t-vxworks7: Likewise.
Iain Sandoe [Mon, 6 Dec 2021 07:50:08 +0000 (07:50 +0000)]
Darwin, PCH: Rework hooks for relocatable implementation [PR71934].
Now we have a relocatable PCH implementation we can revise the
hooks that find and use the mmapped memory. Specifically, this
removes the extra checking and diagnostic output for cases that
were likely to fail in a non-relocatable scenario.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
PR pch/71934
* config/host-darwin.c (SAFE_ALLOC_SIZE): Remove.
(darwin_gt_pch_get_address): Rework for relocatable PCH.
(darwin_gt_pch_use_address): Likewise.
Jakub Jelinek [Thu, 9 Dec 2021 14:54:33 +0000 (15:54 +0100)]
pch: Fix up Darwin and HPUX pch_use_address hooks [PR71934]
In the last change, I've changed the arguments from void * to void *&,
but missed the fact that these hooks will in that case update the value
the caller will see in an undesirable way.
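A minimal sketch of the pitfall, not the actual hook code: once the first argument is a reference, a manual read loop that advances it also moves the caller's pointer. The function names are hypothetical, memcpy stands in for reading from the file, and char * is used instead of void * to keep the pointer arithmetic portable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Problematic shape: the loop advances BASE itself, so the caller's
// pointer ends up past the end of the area that was just filled.
void fill_advancing_reference(char*& base, const char* src,
                              std::size_t size, std::size_t chunk) {
  assert(chunk != 0);
  while (size != 0) {
    std::size_t n = std::min(chunk, size);
    std::memcpy(base, src, n);
    base += n;
    src += n;
    size -= n;
  }
}

// Fixed shape: walk a local cursor and leave BASE pointing at the start.
void fill_with_local_cursor(char*& base, const char* src,
                            std::size_t size, std::size_t chunk) {
  assert(chunk != 0);
  char* addr = base;  // automatic variable, invisible to the caller
  while (size != 0) {
    std::size_t n = std::min(chunk, size);
    std::memcpy(addr, src, n);
    addr += n;
    src += n;
    size -= n;
  }
}
```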
2021-12-09 Jakub Jelinek <jakub@redhat.com>
PR pch/71934
* config/host-darwin.c (darwin_gt_pch_use_address): When reading
manually the file into mapped area, update mapped_addr as
an automatic variable rather than addr which is a reference parameter.
* config/host-hpux.c (hpux_gt_pch_use_address): When reading
manually the file into mapped area, update addr as
an automatic variable rather than base which is a reference parameter.
Jakub Jelinek [Thu, 9 Dec 2021 14:40:15 +0000 (15:40 +0100)]
pch: Add support for relocation of the PCH data [PR71934]
The following patch adds support for relocation of the PCH blob on PCH
restore if we don't manage to get the preferred map slot for it.
The GTY stuff knows where all the pointers are, after all it relocates
it once during PCH save from the addresses where it was initially allocated
to addresses in the preferred map slot.
But, if we were to do it solely using GTY info upon PCH restore, we'd need
another set of GTY functions, which I think would make it less maintainable
and I think it would also be more costly at PCH restore time. Those
functions would need to call something to add bias to pointers that haven't
been marked yet and make sure not to add bias to any pointer twice.
So, this patch instead builds a relocation table (sorted list of addresses
in the blob which needs relocation) at PCH save time, stores it in a very
compact form into the gch file and upon restore, adjusts pointers in GTY
roots (that is right away in the root structures) and the addresses in the
relocation table.
The cost on stdc++.gch/O2g.gch (previously 85MB large) is about 3% file size
growth, there are 2.5 million pointers that need relocation in the gch blob
and the relocation table uses uleb128 for address deltas and needs ~1.01 bytes
for one address that needs relocation, and about 20% compile time during
PCH save (I think it is mainly because of the need to qsort those 2.5
million pointers). On PCH restore, if it doesn't need relocation (the usual
case), it is just an extra fread of sizeof (size_t) data and fseek
(in my tests real time on vanilla tree for #include <bits/stdc++.h> CU
was ~0.175s and with the patch but no relocation ~0.173s), while if it needs
relocation it took ~0.193s, i.e. 11.5% slower.
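To illustrate why the table is so compact, here is a sketch of delta encoding a sorted address list with unsigned LEB128. It is not the ggc-common.c code: `encode_table` and `decode_table` are hypothetical helpers, the table is kept in a vector instead of the gch file, and only the function names `write_uleb128` and `read_uleb128` are taken from the ChangeLog (their signatures here are assumed).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append VALUE in unsigned LEB128 form: 7 payload bits per byte, with the
// high bit set on every byte except the last one.
void write_uleb128(std::vector<unsigned char>& out, std::uint64_t value) {
  do {
    unsigned char byte = value & 0x7f;
    value >>= 7;
    if (value != 0)
      byte |= 0x80;
    out.push_back(byte);
  } while (value != 0);
}

// Decode one unsigned LEB128 value starting at IN[POS] and advance POS.
std::uint64_t read_uleb128(const std::vector<unsigned char>& in,
                           std::size_t& pos) {
  std::uint64_t value = 0;
  unsigned shift = 0;
  for (;;) {
    assert(pos < in.size());
    unsigned char byte = in[pos++];
    value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0)
      break;
    shift += 7;
  }
  return value;
}

// Store each address as the difference from the previous one.
std::vector<unsigned char> encode_table(
    const std::vector<std::uint64_t>& sorted_addrs) {
  std::vector<unsigned char> out;
  std::uint64_t prev = 0;
  for (std::uint64_t addr : sorted_addrs) {
    assert(addr >= prev);  // input must be sorted
    write_uleb128(out, addr - prev);
    prev = addr;
  }
  return out;
}

// Rebuild COUNT addresses from the deltas, adding BIAS to each of them.
std::vector<std::uint64_t> decode_table(const std::vector<unsigned char>& in,
                                        std::size_t count,
                                        std::uint64_t bias) {
  std::vector<std::uint64_t> addrs;
  std::size_t pos = 0;
  std::uint64_t prev = 0;
  for (std::size_t i = 0; i < count; ++i) {
    prev += read_uleb128(in, pos);
    addrs.push_back(prev + bias);
  }
  return addrs;
}
```

Pointers in the blob tend to sit close together, so most deltas are below 128 and fit in a single byte, which is consistent with the ~1.01 bytes per address quoted above.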
Without PCH that
#include <bits/stdc++.h>
int i;
testcase compiles with -O2 -g in ~1.199s, i.e. 6.2 times slower than PCH with
relocation and 6.9 times slower than PCH without relocation.
The discovery of the pointers in the blob that need relocation is done
in the relocate_ptrs hook which does the pointer relocation during PCH save.
Unfortunately, I had to make one change to the gengtype stuff due to the
nested_ptr feature of GTY, which some libcpp headers and stringpool.c use.
The relocate_ptrs hook had 2 arguments, pointer to the pointer and a cookie.
When relocate_ptrs is done, in most cases it is called solely on the
subfields of the current object, so e.g.
if ((void *)(x) == this_obj)
op (&((*x).u.fld[0].rt_rtx), cookie);
so relocate_ptrs can assert that ptr_p is within the
state->ptrs[state->ptrs_i]->obj ..
state->ptrs[state->ptrs_i]->obj+state->ptrs[state->ptrs_i]->size-sizeof(void*)
range and compute from that the address in the blob which will need
relocation (state->ptrs[state->ptrs_i]->new_addr is the new address
given to it and ptr_p-state->ptrs[state->ptrs_i]->obj is the relative
offset). Unfortunately, for nested_ptr gengtype emits something like:
{
union tree_node * x0 =
((*x).val.node.node) ? HT_IDENT_TO_GCC_IDENT (HT_NODE (((*x).val.node.node))) : NULL;
if ((void *)(x) == this_obj)
op (&(x0), cookie);
(*x).val.node.node = (x0) ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT ((x0))) : NULL;
}
so relocate_ptrs is called with an address of some temporary variable and
so doesn't know where the pointer will finally be.
So, I've added another argument to relocate_ptrs (and to
gt_pointer_operator). For the most common case I pass NULL as the new middle
argument to that function, first one remains pointer to the pointer that
needs adjustment and last the cookie. The NULL seems to be cheap to compute
and short in the gt*.[ch] files and stands for ptr_p is an address within
the this_obj's range, remember its address. For the nested_ptr case, the
new middle argument contains actual address of the pointer that might need
to be relocated, so instead of the above
op (&(x0), &((*x).val.node.node), cookie);
in there. And finally, e.g. for the reorder case I need a way to tell
relocate_ptrs to ignore a particular address for the relocation purposes
and only treat it the old way. I've used for that the case when
the first and second arguments are equal.
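The three cases for the new middle argument can be modelled roughly like this. It is a simplified sketch, not relocate_ptrs itself: `reloc_state` and `record_ptr` are hypothetical, and the model records the pointer's own location, whereas the real hook records the corresponding new address in the blob.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the state reached through the cookie.
struct reloc_state {
  std::vector<void**> slots;  // locations of pointers needing relocation
};

// Model of a callback taking three pointers:
//   real_ptr_p == ptr_p : ignore this address for relocation purposes
//   real_ptr_p == NULL  : ptr_p itself lies within the current object
//   otherwise           : ptr_p is a temporary, real_ptr_p is the final place
void record_ptr(void* ptr_p, void* real_ptr_p, void* cookie) {
  assert(cookie != nullptr);
  reloc_state* st = static_cast<reloc_state*>(cookie);
  if (real_ptr_p == ptr_p)
    return;
  void* where = real_ptr_p ? real_ptr_p : ptr_p;
  st->slots.push_back(static_cast<void**>(where));
}
```

In the nested_ptr case the temporary x0 would be passed first and the address of the real field second, so the field's location is what ends up recorded.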
In order to enable support for mapping PCH as fallback at different
addresses than the preferred ones, a small change is needed to the
host pch_use_address hooks. One change I've done to all of them is
the change of the type of the first argument from void * to void *&,
such that the actual address can be told to the callers (or shall I
instead use void **?), but another change that still needs to be done
in them if they want the relocation is actually not fail if they couldn't
get a preferred address, but instead modify what the first argument
refers to. I've done that only for host-linux.c and Iain is testing
similar change for host-darwin.c. Didn't change hpux, netbsd, openbsd,
solaris, mingw32 or the fallbacks because I can't test those.
Tested also with the:
--- gcc/config/host-linux.c.jj 2021-12-06 22:22:42.007777367 +0100
+++ gcc/config/host-linux.c 2021-12-07 00:21:53.052674040 +0100
@@ -191,6 +191,8 @@ linux_gt_pch_use_address (void *&base, s
if (size == 0)
return -1;
+base = (char *) base + ((size + 8191) & (size_t) -4096);
+
/* Try to map the file with MAP_PRIVATE. */
addr = mmap (base, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, offset);
hack which forces all PCH restores to be relocated. An earlier version of the
patch has also been regtested with base = (char *) base + 16384; in that spot,
so both relocation to a non-overlapping spot and to an overlapping spot have
been tested.
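For reference, the offset expression in the hack can be worked through in isolation. `forced_offset` is a hypothetical helper name; the expression itself is copied from the hunk above.

```cpp
#include <cassert>
#include <cstddef>

// (size + 8191) & -4096 rounds SIZE up to a multiple of 4096 and then adds
// one more 4096-byte step, so the result is always larger than SIZE.
std::size_t forced_offset(std::size_t size) {
  assert(size != 0);  // the hook returns early for size == 0
  return (size + 8191) & static_cast<std::size_t>(-4096);
}
```

For example, sizes 1 and 4096 both give 8192, and 4097 gives 12288.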
2021-12-09 Jakub Jelinek <jakub@redhat.com>
PR pch/71934
* coretypes.h (gt_pointer_operator): Use 3 pointer arguments instead
of two.
* gengtype.c (struct walk_type_data): Add in_nested_ptr argument.
(walk_type): Temporarily set d->in_nested_ptr around nested_ptr
handling.
(write_types_local_user_process_field): Pass a new middle pointer
to gt_pointer_operator op calls, if d->in_nested_ptr pass there
address of d->prev_val[2], otherwise NULL.
(write_types_local_process_field): Likewise.
* ggc-common.c (relocate_ptrs): Add real_ptr_p argument. If equal
to ptr_p, do nothing, otherwise if NULL remember ptr_p's
or if non-NULL real_ptr_p's corresponding new address in
reloc_addrs_vec.
(reloc_addrs_vec): New variable.
(compare_ptr, read_uleb128, write_uleb128): New functions.
(gt_pch_save): When iterating over objects through relocate_ptrs,
save current i into state.ptrs_i. Sort reloc_addrs_vec and emit
it as uleb128 of differences between pointer addresses into the
PCH file.
(gt_pch_restore): Allow restoring of PCH to a different address
than the preferred one, in that case adjust global pointers by bias
and also adjust by bias addresses read from the relocation table
as uleb128 differences. Otherwise fseek over it. Perform
gt_pch_restore_stringpool only after adjusting callbacks and for
callback adjustments also take into account the bias.
(default_gt_pch_use_address): Change type of first argument from
void * to void *&.
(mmap_gt_pch_use_address): Likewise.
* ggc-tests.c (gt_pch_nx): Pass NULL as new middle argument to op.
* hash-map.h (hash_map::pch_nx_helper): Likewise.
(gt_pch_nx): Likewise.
* hash-set.h (gt_pch_nx): Likewise.
* hash-table.h (gt_pch_nx): Likewise.
* hash-traits.h (ggc_remove::pch_nx): Likewise.
* hosthooks-def.h (default_gt_pch_use_address): Change type of first
argument from void * to void *&.
(mmap_gt_pch_use_address): Likewise.
* hosthooks.h (struct host_hooks): Change type of first argument of
gt_pch_use_address hook from void * to void *&.
* machmode.h (gt_pch_nx): Expect a callback with 3 pointers instead of
two in the middle argument.
* poly-int.h (gt_pch_nx): Likewise.
* stringpool.c (gt_pch_nx): Pass NULL as new middle argument to op.
* tree-cfg.c (gt_pch_nx): Likewise, except for LOCATION_BLOCK pass
the same &(block) twice.
* value-range.h (gt_pch_nx): Pass NULL as new middle argument to op.
* vec.h (gt_pch_nx): Likewise.
* wide-int.h (gt_pch_nx): Likewise.
* config/host-darwin.c (darwin_gt_pch_use_address): Change type of
first argument from void * to void *&.
* config/host-darwin.h (darwin_gt_pch_use_address): Likewise.
* config/host-hpux.c (hpux_gt_pch_use_address): Likewise.
* config/host-linux.c (linux_gt_pch_use_address): Likewise. If
it couldn't succeed to mmap at the preferred location, set base
to the actual one. Update addr in the manual reading loop instead of
base.
* config/host-netbsd.c (netbsd_gt_pch_use_address): Change type of
first argument from void * to void *&.
* config/host-openbsd.c (openbsd_gt_pch_use_address): Likewise.
* config/host-solaris.c (sol_gt_pch_use_address): Likewise.
* config/i386/host-mingw32.c (mingw32_gt_pch_use_address): Likewise.
* config/rs6000/rs6000-gen-builtins.c (write_init_file): Pass NULL
as new middle argument to op in the generated code.
* doc/gty.texi: Adjust samples for the addition of middle pointer
to gt_pointer_operator callback.
gcc/ada/
* gcc-interface/decl.c (gt_pch_nx): Pass NULL as new middle argument
to op.
gcc/c-family/
* c-pch.c (c_common_no_more_pch): Pass a temporary void * var
with NULL value instead of NULL to host_hooks.gt_pch_use_address.
gcc/c/
* c-decl.c (resort_field_decl_cmp): Pass the same pointer twice
to resort_data.new_value.
gcc/cp/
* module.cc (nop): Add another void * argument.
* name-lookup.c (resort_member_name_cmp): Pass the same pointer twice
to resort_data.new_value.
Martin Liska [Mon, 6 Dec 2021 12:02:22 +0000 (13:02 +0100)]
D: fix UBSAN
Fixes:
gcc/d/expr.cc:2596:9: runtime error: null pointer passed as argument 2, which is declared to never be null
gcc/d/ChangeLog:
* expr.cc: Call memcpy only when length != 0.
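The guard amounts to something like the following sketch. `copy_bytes` is a hypothetical helper, not the code in expr.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Passing a null pointer to memcpy is undefined even when the length is
// zero, so the call is skipped entirely for empty copies.
void copy_bytes(void* dst, const void* src, std::size_t len) {
  if (len != 0) {
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, len);
  }
}
```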
Alexandre Oliva [Thu, 9 Dec 2021 02:37:15 +0000 (23:37 -0300)]
[PR103097] tolerate reg-stack cross-block malformed asms
The testcase shows malformed asms in one block confuse reg-stack logic
in another block. Moving the resetting of any_malformed_asm to the
end of the pass enables it to take effect throughout the affected
function.
for gcc/ChangeLog
PR target/103097
* reg-stack.c (convert_regs_1): Move any_malformed_asm
resetting...
(reg_to_stack): ... here.
for gcc/testsuite/ChangeLog
PR target/103097
* gcc.target/i386/pr103097.c: New.
Alexandre Oliva [Thu, 9 Dec 2021 02:37:14 +0000 (23:37 -0300)]
[PR103302] skip multi-word pre-move clobber during lra
If we emit clobbers before multi-word moves during lra, we get
confused if a copy ends up with input or output replaced with each
other: the clobber then kills the previous set, and it gets deleted.
This patch avoids emitting such clobbers when lra_in_progress.
for gcc/ChangeLog
PR target/103302
* expr.c (emit_move_multi_word): Skip clobber during lra.
for gcc/testsuite/ChangeLog
PR target/103302
* gcc.target/riscv/pr103302.c: New.
Alexandre Oliva [Thu, 9 Dec 2021 02:37:09 +0000 (23:37 -0300)]
[PR103024,PR103530] support throwing compares and non-boolean types
This patch adjusts the harden-compares pass to cope with compares that
end basic blocks, and to accept non-boolean integral types whose
conversion to boolean may have been discarded.
for gcc/ChangeLog
PR tree-optimization/103024
PR middle-end/103530
* gimple-harden-conditionals.cc (non_eh_succ_edge): New.
(pass_harden_compares::execute): Accept 1-bit integral types,
and cope with throwing compares.
for gcc/testsuite/ChangeLog
PR tree-optimization/103024
PR middle-end/103530
* g++.dg/pr103024.C: New.
* g++.dg/pr103530.C: New.
GCC Administrator [Thu, 9 Dec 2021 00:16:31 +0000 (00:16 +0000)]
Daily bump.
Iain Buclaw [Sun, 5 Dec 2021 16:11:12 +0000 (17:11 +0100)]
d: Merge upstream dmd 568496d5b, druntime 178c44ff, phobos 574bf883b.
D front-end changes:
- Import dmd v2.098.0
- New ImportC module for compiling preprocessed C11 code into D.
- New -ftransition=in switch.
- Improved handling of new 'noreturn' type.
Druntime changes:
- Import druntime v2.098.0
- Fix broken import in core.sys.linux.perf_event module (PR103558).
Phobos changes:
- Import phobos v2.098.0
- All sources are now compiled with -fpreview=fieldwise.
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd 568496d5b.
* Make-lang.in (D_FRONTEND_OBJS): Add d/common-file.o,
d/common-outbuffer.o, d/common-string.o, d/file_manager.o,
d/importc.o. Remove d/root-outbuffer.o.
(d/common-%.o): New recipe.
* d-builtins.cc (build_frontend_type): Update for new front-end
interface.
(d_build_d_type_nodes): Set noreturn_type_node.
* d-codegen.cc (d_build_call): Don't call function if one of the
arguments is type 'noreturn'.
(build_vthis_function): Propagate TYPE_QUAL_VOLATILE from original
function type.
* d-frontend.cc (eval_builtin): Update signature.
(getTypeInfoType): Likewise.
(toObjFile): New function.
* d-gimplify.cc (d_gimplify_call_expr): Always evaluate arguments from
left to right.
* d-lang.cc (d_handle_option): Handle OPT_ftransition_in.
(d_parse_file): Don't generate D main if it is declared in user code.
* d-tree.h (CALL_EXPR_ARGS_ORDERED): Remove.
(enum d_tree_index): Add DTI_BOTTOM_TYPE.
(noreturn_type_node): New.
* decl.cc (apply_pragma_crt): Remove.
(DeclVisitor::visit): Update for new front-end interface.
(DeclVisitor::visit (PragmaDeclaration *)): Don't handle
crt_constructor and crt_destructor pragmas.
(DeclVisitor::visit (VarDeclaration *)): Don't generate declarations
of type 'noreturn'.
(DeclVisitor::visit (FuncDeclaration *)): Stop adding parameters when
'noreturn' type has been encountered.
(get_symbol_decl): Set DECL_STATIC_CONSTRUCTOR and
DECL_STATIC_DESTRUCTOR on decl node if requested.
(aggregate_initializer_decl): Update for new front-end interface.
* expr.cc (ExprVisitor::visit (CallExp *)): Always use the 'this'
object as the result of calling any constructor function.
(ExprVisitor::visit): Update for new front-end interface.
* gdc.texi (Runtime Options): Document -fmain and -ftransition=in.
* lang.opt (ftransition=in): New option.
* modules.cc (get_internal_fn): Update for new front-end interface.
* types.cc (TypeVisitor::visit): Likewise.
(TypeVisitor::visit (TypeNoreturn *)): Return noreturn_type_node.
(TypeVisitor::visit (TypeFunction *)): Stop adding parameters when
'noreturn' type has been encountered. Qualify function types that
return 'noreturn' as TYPE_QUAL_VOLATILE.
libphobos/ChangeLog:
PR d/103558
* libdruntime/MERGE: Merge upstream druntime 178c44ff.
* libdruntime/Makefile.am (DRUNTIME_DSOURCES_LINUX): Add
core/sys/linux/syscalls.d.
(DRUNTIME_DSOURCES_OPENBSD): Add core/sys/openbsd/pthread_np.d.
* libdruntime/Makefile.in: Regenerate.
* src/MERGE: Merge upstream phobos 574bf883b.
* src/Makefile.am (D_EXTRA_DFLAGS): Add -fpreview=fieldwise.
* src/Makefile.in: Regenerate.
* testsuite/libphobos.exceptions/assert_fail.d: Update test.
* testsuite/libphobos.betterc/test22336.d: New test.
Jonathan Wakely [Wed, 8 Dec 2021 19:36:24 +0000 (19:36 +0000)]
libstdc++: Fix undefined shift when _Atomic_word is 64-bit
The check for _Atomic_word being 32-bit is just a normal runtime
condition for C++11 and C++14, because it doesn't use if-constexpr. That
means the 1LL << (CHAR_BIT * sizeof(_Atomic_word)) expression expands to
1LL << 64 on Solaris, which is ill-formed.
This adds another indirection so that the shift width is zero if the
code is unreachable.
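A sketch of the indirection, under the assumption that the surrounding logic looks roughly like this. `unique_ref_sketch` and its local names are hypothetical and this is not the shared_ptr_base.h code; it only shows how to keep the shift width valid in a branch that is compiled but never taken.

```cpp
#include <cassert>
#include <climits>

// WORD plays the role of _Atomic_word. Without if-constexpr both branches
// are compiled, so the shift expression must be valid even when unused.
template <typename Word>
long long unique_ref_sketch() {
  constexpr bool double_word = sizeof(long long) == 2 * sizeof(Word);
  // Full width of Word when two counters fit in a long long, otherwise
  // zero so that the dead branch never shifts by 64 or more.
  constexpr int shift =
      double_word ? static_cast<int>(CHAR_BIT * sizeof(Word)) : 0;
  if (double_word)
    return 1LL + (1LL << shift);  // both counters equal to one
  return 0;                       // the combined value is not used here
}
```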
libstdc++-v3/ChangeLog:
* include/bits/shared_ptr_base.h (_Sp_counted_base::_M_release()):
Make shift width conditional on __double_word condition.
Harald Anlauf [Wed, 8 Dec 2021 20:14:19 +0000 (21:14 +0100)]
Fortran: avoid NULL pointer dereference on missing or bad dummy arguments
gcc/fortran/ChangeLog:
PR fortran/103609
* symbol.c (gfc_sym_get_dummy_args): Catch NULL pointer
dereference.
gcc/testsuite/ChangeLog:
PR fortran/103609
* gfortran.dg/pr103609.f90: New test.
Iain Sandoe [Sun, 21 Nov 2021 17:19:24 +0000 (17:19 +0000)]
libgcc, Darwin: Build a libgcc_s.1 for backwards compatibility.
In order to resolve a long-standing issue with inter-operation
with libSystem, we have bumped the SO name for libgcc_s.
Distributions might wish to install this new version into a
structure where existing code is already linked with the
compiler-local libgcc_s.1 (providing symbols exported by the
now-retired libgcc_ext.10.x shims).
The replacement libgcc_s.1 forwards the symbols from the new SO.
In order to support DYLD_LIBRARY_PATH on systems (where it works)
we forward the libSystem unwinder symbols from 10.7+ and a
compiler-local version of the libgcc unwinder on earlier.
For macOS 10.4 to 10.6 this is 'bug-compatible' with existing uses.
For 10.7+ the behaviour will now actually be correct.
This should be squashed with the initial libgcc changes for PR80556
in any backport (r12-5418-gd4943ce939d)
libgcc/ChangeLog:
* config.host (*-*-darwin*): Add logic to build a shared
unwinder library for Darwin8-10.
* config/i386/t-darwin: Build legacy libgcc_s.1.
* config/rs6000/t-darwin: Likewise.
* config/t-darwin: Reorganise the EH fragments to place
them for inclusion in a shared EH lib.
* config/t-slibgcc-darwin: Build a legacy libgcc_s.1 and
the supporting pieces (all FAT libs).
* config/t-darwin-noeh: Removed.
* config/darwin-unwind.ver: New file.
* config/rs6000/t-darwin-ehs: New file.
* config/t-darwin-ehs: New file.
Iain Sandoe [Mon, 6 Dec 2021 13:17:10 +0000 (13:17 +0000)]
Darwin: Amend pie options when linking mdynamic-no-pic.
On i686 Darwin from macOS 10.7 onwards the default is to
link executables as PIE, which conflicts with code generated
using mdynamic-no-pic. Rather than warn about this and then
get the user to add -Wl,-no_pie, we can inject this in the
link specs.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config/darwin.h (DARWIN_PIE_SPEC): Add -no_pie when
linking mdynamic-no-pic code on macOS > 10.7.
Dimitar Dimitrov [Sun, 21 Nov 2021 13:55:53 +0000 (15:55 +0200)]
pru: Fixup flags for .pru_irq_map section
Assign correct flags for the .pru_irq_map section, which the
PRU remoteproc host loader introduced in Linux kernel 5.10:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c75c9fdac66efd8b54773368254ef330c276171b
gcc/ChangeLog:
* config/pru/pru.c (pru_section_type_flags): New function.
(TARGET_SECTION_TYPE_FLAGS): Wire it.
gcc/testsuite/ChangeLog:
* gcc.target/pru/pru_irq_map.c: New test.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
David Faust [Tue, 7 Dec 2021 19:45:48 +0000 (11:45 -0800)]
bpf: avoid potential NULL pointer dereference
If the result from SSA_NAME_DEF_STMT is NULL, we could try to
dereference it anyway and ICE. Avoid this.
gcc/ChangeLog:
* config/bpf/bpf.c (handle_attr_preserve): Avoid calling
is_gimple_assign with a NULL pointer.
Harald Anlauf [Tue, 7 Dec 2021 22:06:41 +0000 (23:06 +0100)]
Fortran: dimensions of an array have to be non-negative
gcc/fortran/ChangeLog:
PR fortran/103610
* array.c (spec_dimen_size): Fix simplification of SHAPE:
dimensions must be non-negative.
gcc/testsuite/ChangeLog:
PR fortran/103610
* gfortran.dg/shape_11.f90: New test.
Martin Liska [Tue, 7 Dec 2021 15:59:48 +0000 (16:59 +0100)]
Use -fopt-info in unswitch pass.
gcc/ChangeLog:
* profile-count.c (profile_count::dump): Add function
that can dump to a provided buffer.
(profile_probability::dump): Likewise.
* profile-count.h: Likewise.
* tree-ssa-loop-unswitch.c (tree_unswitch_single_loop):
Use dump_printf_loc infrastructure.
(tree_unswitch_outer_loop): Likewise.
(find_loop_guard): Likewise.
(hoist_guard): Likewise.
gcc/testsuite/ChangeLog:
* gcc.dg/loop-unswitch-1.c: Adjust test-case based on
dump_printf_loc.
* gcc.dg/loop-unswitch-2.c: Likewise.
* gcc.dg/loop-unswitch-3.c: Likewise.
* gcc.dg/loop-unswitch-4.c: Likewise.
* gcc.dg/loop-unswitch-5.c: Likewise.
François Dumont [Sun, 21 Nov 2021 10:56:40 +0000 (11:56 +0100)]
libstdc++: [_GLIBCXX_DEBUG] Enhance std::erase_if for vector/deque
libstdc++-v3/ChangeLog:
* include/std/deque (erase_if): Use _GLIBCXX_STD_C container reference and
__niter_wrap to limit _GLIBCXX_DEBUG mode impact.
* include/std/vector (erase_if): Likewise.
Hans-Peter Nilsson [Tue, 7 Dec 2021 05:18:57 +0000 (06:18 +0100)]
testsuite: Use attribute "noipa" in sibcall tests
...instead of attribute "noinline".
For cris-elf, testsuite/gcc.dg/sibcall-3.c and sibcall-4.c "XPASS",
without sibcalls being implemented. On inspection, recurser_void2 is
set to be an assembly-level alias for recurser_void1 as in
".set _recurser_void2,_recurser_void1" for both these cases.
IOW, those "__attribute__((noinline))" should be
"__attribute__((noipa))". The astute reader will notice that I also
adjust test-cases where self-recursion should occur: as mentioned in
sibcall-1.c "self-recursion tail calls are optimized for all targets,
regardless of presence of sibcall patterns". But, that optimization
happens even with "noipa", as observed by the test-cases still passing
for cris-elf after patching. Being of a small mind, I like
consistency, but not all the time, so there's hope.
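A minimal sketch of the attribute change, assuming a GCC recent enough to know "noipa" (other compilers may warn and ignore it). The function names are hypothetical and are not taken from the sibcall tests.

```cpp
#include <cassert>

// Two functions with identical bodies. With only "noinline", an
// interprocedural pass may still emit one of them as an alias of the
// other; "noipa" asks GCC to keep interprocedural optimizations away.
__attribute__((noipa)) int callee_a(int x) { return x + 1; }
__attribute__((noipa)) int callee_b(int x) { return x + 1; }

// Calls both functions so that each must remain callable on its own.
int caller(int x)
{
  return callee_a(x) + callee_b(x);
}
```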
testsuite:
* gcc.dg/sibcall-1.c, gcc.dg/sibcall-10.c,
gcc.dg/sibcall-2.c, gcc.dg/sibcall-3.c,
gcc.dg/sibcall-4.c, gcc.dg/sibcall-9.c: Replace
attribute "noinline" with "noipa".
Chung-Lin Tang [Wed, 8 Dec 2021 15:58:55 +0000 (23:58 +0800)]
OpenMP 5.0: Remove array section base-pointer mapping semantics and other front-end adjustments
This patch implements three pieces of functionality:
(1) Adjust array section mapping to have standards-conforming behavior:
mapping array sections should *NOT* also map the base-pointer:
struct S { int *ptr; ... };
struct S s;
Instead of generating this during gimplify:
map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0])
Now, adjust to:
(i.e. do not map the base-pointer together. The attach operation is still
generated, and if s.ptr is already mapped prior, attachment will happen)
The correct way of achieving the base-pointer-also-mapped behavior would be to
use:
(A small Fortran front-end patch to trans-openmp.c:gfc_trans_omp_array_section
is also included, which removes generation of a GOMP_MAP_ALWAYS_POINTER for
array types, which appears incorrect and causes a regression in
libgomp.fortran/struct-elem-map-1.f90)
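As a hedged illustration of item (1), the sketch below lists the pointer member explicitly next to the array section, which is one way to get the base pointer mapped as well under the OpenMP 5.0 reading described above. The struct layout and function name are assumptions; without -fopenmp the pragmas are simply ignored.

```cpp
#include <cassert>

struct S { int *ptr; int len; };

// Maps the pointer member s.ptr and the array section it points to,
// then releases both again. Listing only s.ptr[:s.len] would map the
// section (and attach the pointer if s.ptr is already mapped), but would
// no longer map s.ptr itself.
void stage_on_device(S &s)
{
  #pragma omp target enter data map(to: s.ptr, s.ptr[:s.len])
  #pragma omp target exit data map(release: s.ptr, s.ptr[:s.len])
}
```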
(2) Related to the first item above are fixes in libgomp/target.c to not
overwrite attached pointers when handling device<->host copies, mainly for the
"always" case.
(3) The third is a set of changes to the C/C++ front-ends to extend the allowed
component access syntax in map clauses. These changes are enabled for both
OpenACC and OpenMP.
gcc/c/ChangeLog:
* c-parser.c (struct omp_dim): New struct type for use inside
c_parser_omp_variable_list.
(c_parser_omp_variable_list): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(c_parser_omp_clause_to): Set 'allow_deref' to true in call to
c_parser_omp_var_list_parens.
(c_parser_omp_clause_from): Likewise.
* c-typeck.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(c_finish_omp_clauses): Extend allowed range of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.
gcc/cp/ChangeLog:
* parser.c (struct omp_dim): New struct type for use inside
cp_parser_omp_var_list_no_open.
(cp_parser_omp_var_list_no_open): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to
cp_parser_omp_var_list for to/from clauses.
* semantics.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(handle_omp_array_sections): Adjust pointer map generation of
references.
(finish_omp_clauses): Extend allowed range of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_trans_omp_array_section): Do not generate
GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type.
gcc/ChangeLog:
* gimplify.c (extract_base_bit_offset): Add 'tree *offsetp' parameter,
accommodate case where 'offset' return of get_inner_reference is
non-NULL.
(is_or_contains_p): Further robustify conditions.
(omp_target_reorder_clauses): In alloc/to/from sorting phase, also
move following GOMP_MAP_ALWAYS_POINTER maps along. Add new sorting
phase where we make sure pointers with an attach/detach map are ordered
correctly.
(gimplify_scan_omp_clauses): Add modifications to avoid creating
GOMP_MAP_STRUCT and associated alloc map for attach/detach maps.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/deep-copy-arrayofstruct.c: Adjust testcase.
* c-c++-common/gomp/target-enter-data-1.c: New testcase.
* c-c++-common/gomp/target-implicit-map-2.c: New testcase.
libgomp/ChangeLog:
* target.c (gomp_map_vars_existing): Make sure attached pointer is
not overwritten during cross-host/device copying.
(gomp_update): Likewise.
(gomp_exit_data): Likewise.
* testsuite/libgomp.c++/target-11.C: Adjust testcase.
* testsuite/libgomp.c++/target-12.C: Likewise.
* testsuite/libgomp.c++/target-15.C: Likewise.
* testsuite/libgomp.c++/target-16.C: Likewise.
* testsuite/libgomp.c++/target-17.C: Likewise.
* testsuite/libgomp.c++/target-21.C: Likewise.
* testsuite/libgomp.c++/target-23.C: Likewise.
* testsuite/libgomp.c/target-23.c: Likewise.
* testsuite/libgomp.c/target-29.c: Likewise.
* testsuite/libgomp.c-c++-common/target-implicit-map-2.c: New testcase.
Roger Sayle [Wed, 8 Dec 2021 11:45:38 +0000 (12:45 +0100)]
nvptx: Use cvt to perform sign-extension of truncation
This patch introduces some new define_insn rules to the nvptx backend,
to perform sign-extension of a truncation (from and to the same mode),
using a single cvt instruction. As an example, the following function
int foo(int x) { return (char)x; }
with -O2 currently generates:
mov.u32 %r24, %ar0;
mov.u32 %r26, %r24;
cvt.s32.s8 %value, %r26;
and with this patch, now generates:
mov.u32 %r24, %ar0;
cvt.s32.s8 %value, %r24;
This patch has been tested on nvptx-none hosted by x86_64-pc-linux-gnu
with a top-level "make" (including newlib) and a "make check" with no
new regressions.
gcc/ChangeLog:
* config/nvptx/nvptx.md (*extend_trunc_<mode>2_qi,
*extend_trunc_<mode>2_hi, *extend_trunc_di2_si): New insns.
Use cvt to perform sign-extension of truncation in one step.
gcc/testsuite/ChangeLog:
* gcc.target/nvptx/exttrunc-2.c: New test case.
* gcc.target/nvptx/exttrunc-3.c: New test case.
* gcc.target/nvptx/exttrunc-4.c: New test case.
* gcc.target/nvptx/exttrunc-5.c: New test case.
* gcc.target/nvptx/exttrunc-6.c: New test case.
Roger Sayle [Wed, 8 Dec 2021 13:21:49 +0000 (14:21 +0100)]
nvptx: Add test-case gcc.target/nvptx/exttrunc-1.c
Add new test-case converting short to char and back to short.
Tested on nvptx.
gcc/testsuite/ChangeLog:
* gcc.target/nvptx/exttrunc-1.c: New test case.
Chung-Lin Tang [Wed, 8 Dec 2021 14:28:03 +0000 (22:28 +0800)]
openmp: Improve OpenMP target support for C++ (PR92120)
This patch implements several C++ specific mapping capabilities introduced for
OpenMP 5.0, including implicit mapping of this[:1] for non-static member
functions, zero-length array section mapping of pointer-typed members,
lambda captured variable access in target regions, and use of lambda objects
inside target regions.
Several adjustments to the C/C++ front-ends to allow more member-access syntax
as valid are also included.
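A small sketch of the implicit this[:1] mapping, under the assumption that the compiler implements the OpenMP 5.0 behavior described above. The class is hypothetical; built without -fopenmp, or without an offload device, the loop simply runs on the host.

```cpp
#include <cassert>

struct Accumulator
{
  int data[4];

  int sum()
  {
    int total = 0;
    // 'data' really means this->data. In a non-static member function
    // the object is mapped implicitly as this[:1], so the member can be
    // read inside the target region without an explicit map clause.
    #pragma omp target map(tofrom: total)
    for (int i = 0; i < 4; ++i)
      total += data[i];
    return total;
  }
};
```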
PR middle-end/92120
gcc/cp/ChangeLog:
* cp-tree.h (finish_omp_target): New declaration.
(finish_omp_target_clauses): Likewise.
* parser.c (cp_parser_omp_clause_map): Adjust call to
cp_parser_omp_var_list_no_open to set 'allow_deref' argument to true.
(cp_parser_omp_target): Factor out code, adjust into calls to new
function finish_omp_target.
* pt.c (tsubst_expr): Add call to finish_omp_target_clauses for
OMP_TARGET case.
* semantics.c (handle_omp_array_sections_1): Add handling to create
'this->member' from 'member' FIELD_DECL. Remove case of rejecting
'this' when not in declare simd.
(handle_omp_array_sections): Likewise.
(finish_omp_clauses): Likewise. Adjust to allow 'this[]' in OpenMP
map clauses. Handle 'A->member' case in map clauses. Remove case of
rejecting 'this' when not in declare simd.
(struct omp_target_walk_data): New struct for walking over
target-directive tree body.
(finish_omp_target_clauses_r): New function for tree walk.
(finish_omp_target_clauses): New function.
(finish_omp_target): New function.
gcc/c/ChangeLog:
* c-parser.c (c_parser_omp_clause_map): Set 'allow_deref' argument in
call to c_parser_omp_variable_list to 'true'.
* c-typeck.c (handle_omp_array_sections_1): Add strip of MEM_REF in
array base handling.
(c_finish_omp_clauses): Handle 'A->member' case in map clauses.
gcc/ChangeLog:
* gimplify.c ("tree-hash-traits.h"): Add include.
(gimplify_scan_omp_clauses): Change struct_map_to_clause to type
hash_map<tree_operand, tree> *. Adjust struct map handling to handle
cases of *A and A->B expressions. Under !DECL_P case of
GOMP_CLAUSE_MAP handling, add STRIP_NOPS for indir_p case, add to
struct_deref_set for map(*ptr_to_struct) cases. Add MEM_REF case when
handling component_ref_p case. Add unshare_expr and gimplification
when created GOMP_MAP_STRUCT is not a DECL. Add code to add
firstprivate pointer for *pointer-to-struct case.
(gimplify_adjust_omp_clauses): Move GOMP_MAP_STRUCT removal code for
exit data directives code to earlier position.
* omp-low.c (lower_omp_target):
Handle GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds.
* tree-pretty-print.c (dump_omp_clause): Likewise.
gcc/testsuite/ChangeLog:
* gcc.dg/gomp/target-3.c: New testcase.
* g++.dg/gomp/target-3.C: New testcase.
* g++.dg/gomp/target-lambda-1.C: New testcase.
* g++.dg/gomp/target-lambda-2.C: New testcase.
* g++.dg/gomp/target-this-1.C: New testcase.
* g++.dg/gomp/target-this-2.C: New testcase.
* g++.dg/gomp/target-this-3.C: New testcase.
* g++.dg/gomp/target-this-4.C: New testcase.
* g++.dg/gomp/target-this-5.C: New testcase.
* g++.dg/gomp/this-2.C: Adjust testcase.
include/ChangeLog:
* gomp-constants.h (enum gomp_map_kind):
Add GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds.
(GOMP_MAP_POINTER_P):
Include GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION.
libgomp/ChangeLog:
* libgomp.h (gomp_attach_pointer): Add bool parameter.
* oacc-mem.c (acc_attach_async): Update call to gomp_attach_pointer.
(goacc_enter_data_internal): Likewise.
* target.c (gomp_map_vars_existing): Update assert condition to
include GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION.
(gomp_map_pointer): Add 'bool allow_zero_length_array_sections'
parameter, add support for mapping a pointer with NULL target.
(gomp_attach_pointer): Add 'bool allow_zero_length_array_sections'
parameter, add support for attaching a pointer with NULL target.
(gomp_map_vars_internal): Update calls to gomp_map_pointer and
gomp_attach_pointer, add handling for
GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION cases.
* testsuite/libgomp.c++/target-23.C: New testcase.
* testsuite/libgomp.c++/target-lambda-1.C: New testcase.
* testsuite/libgomp.c++/target-lambda-2.C: New testcase.
* testsuite/libgomp.c++/target-this-1.C: New testcase.
* testsuite/libgomp.c++/target-this-2.C: New testcase.
* testsuite/libgomp.c++/target-this-3.C: New testcase.
* testsuite/libgomp.c++/target-this-4.C: New testcase.
* testsuite/libgomp.c++/target-this-5.C: New testcase.
Maged Michael [Tue, 7 Dec 2021 15:20:58 +0000 (15:20 +0000)]
libstdc++: Skip atomic instructions in shared_ptr when both counts are 1
This rewrites _Sp_counted_base::_M_release to skip the two atomic
instructions that decrement each of the use count and the weak count
when both are 1.
Benefits: Save the cost of the last atomic decrements of each of the use
count and the weak count in _Sp_counted_base. Atomic instructions are
significantly slower than regular loads and stores across major
architectures.
How current code works: _M_release() atomically decrements the use
count, checks if it was 1, if so calls _M_dispose(), atomically
decrements the weak count, checks if it was 1, and if so calls
_M_destroy().
How the proposed algorithm works: _M_release() loads both use count and
weak count together atomically (assuming suitable alignment, discussed
later), checks if the value corresponds to a 0x1 value in the individual
count members, and if so calls _M_dispose() and _M_destroy().
Otherwise, it follows the original algorithm.
Why it works: When the current thread executing _M_release() finds each
of the counts is equal to 1, then no other threads could possibly hold
use or weak references to this control block. That is, no other threads
could possibly access the counts or the protected object.
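The fast path can be modelled as below. This is a simplified sketch and not the libstdc++ implementation: it packs both counts into one 64-bit atomic (use count in the low half, weak count in the high half) instead of reading two separate 4-byte members with one 8-byte load, and plain counters stand in for _M_dispose() and _M_destroy(). All names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kUseOne  = 1;
constexpr std::uint64_t kWeakOne = std::uint64_t{1} << 32;
constexpr std::uint64_t kUnique  = kUseOne | kWeakOne;  // use == 1, weak == 1

struct ControlBlock
{
  std::atomic<std::uint64_t> counts{kUnique};
  int disposed = 0, destroyed = 0, fast_paths = 0;

  void add_ref() { counts.fetch_add(kUseOne, std::memory_order_relaxed); }

  void release()
  {
    // Fast path: a single atomic load sees both counts equal to 1, so no
    // other thread can hold a reference and no read-modify-write is needed.
    if (counts.load(std::memory_order_acquire) == kUnique)
      {
        counts.store(0, std::memory_order_relaxed);
        ++fast_paths;
        ++disposed;   // stands in for _M_dispose()
        ++destroyed;  // stands in for _M_destroy()
        return;
      }
    // Slow path: the original algorithm, one atomic decrement per count.
    std::uint64_t before
      = counts.fetch_sub(kUseOne, std::memory_order_acq_rel);
    if ((before & 0xffffffffu) == 1)
      {
        ++disposed;
        before = counts.fetch_sub(kWeakOne, std::memory_order_acq_rel);
        if ((before >> 32) == 1)
          ++destroyed;
      }
  }
};
```

Releasing one of two references takes the slow path; releasing the last one takes the fast path and performs no atomic decrement.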
There are two crucial high-level issues that I'd like to point out first:
- Atomicity of access to the counts together
- Proper alignment of the counts together
The patch is intended to apply the proposed algorithm only to the case of
64-bit mode, 4-byte counts, and 8-byte aligned _Sp_counted_base.
** Atomicity **
- The proposed algorithm depends on the mutual atomicity among 8-byte
atomic operations and 4-byte atomic operations on each of the 4-byte halves
of the 8-byte aligned 8-byte block.
- The standard does not guarantee atomicity of 8-byte operations on a pair
of 8-byte aligned 4-byte objects.
- To my knowledge this works in practice on systems that guarantee native
implementation of 4-byte and 8-byte atomic operations.
- __atomic_always_lock_free is used to check for native atomic operations.
** Alignment **
- _Sp_counted_base is an internal base class, with a virtual destructor,
so it has a vptr at the beginning of the class, and will be aligned to
alignof(void*) i.e. 8 bytes.
- The first members of the class are the 4-byte use count and 4-byte
weak count, which will occupy 8 contiguous bytes immediately after the
vptr, i.e. they form an 8-byte aligned 8 byte range.
Other points:
- The proposed algorithm can interact correctly with the current algorithm.
That is, multiple threads using different versions of the code with and
without the patch operating on the same objects should always interact
correctly. The intent for the patch is to be ABI compatible with the
current implementation.
- The proposed patch involves a performance trade-off between saving the
costs of atomic instructions when the counts are both 1 vs adding the cost
of loading the 8-byte combined counts and comparison with {0x1, 0x1}.
- I noticed a big difference between the code generated by GCC vs LLVM. GCC
seems to generate noticeably more code and what seems to be redundant null
checks and branches.
- The patch has been in use (built using LLVM) in a large environment for
many months. The performance gains outweigh the losses (roughly 10 to 1)
across a large variety of workloads.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/c++config (_GLIBCXX_TSAN): Define macro
indicating that TSan is in use.
* include/bits/shared_ptr_base.h (_Sp_counted_base::_M_release):
Replace definition in primary template with explicit
specializations for _S_mutex and _S_atomic policies.
(_Sp_counted_base<_S_mutex>::_M_release): New specialization.
(_Sp_counted_base<_S_atomic>::_M_release): New specialization,
using a single atomic load to access both reference counts at
once.
(_Sp_counted_base::_M_release_last_use): New member function.
Andrew Stubbs [Thu, 11 Nov 2021 13:43:04 +0000 (13:43 +0000)]
dwarf: Multi-register CFI address support.
Add support for architectures such as AMD GCN, in which the pointer size is
larger than the register size. This allows the CFI information to include
multi-register locations for the stack pointer, frame pointer, and return
address.
This patch was originally posted by Andrew Stubbs in
https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552873.html
It has now been re-worked according to the review comments. It does not use
DW_OP_piece or DW_OP_LLVM_piece_end. Instead it uses
DW_OP_bregx/DW_OP_shl/DW_OP_bregx/DW_OP_plus to build the CFA from multiple
consecutive registers. Here is how .debug_frame looks before and after this
patch:
$ cat factorial.c
int factorial(int n) {
if (n == 0) return 1;
return n * factorial (n - 1);
}
$ amdgcn-amdhsa-gcc -g factorial.c -O0 -c -o fac.o
$ llvm-dwarfdump -debug-frame fac.o
*** without this patch (edited for brevity)***
00000000 00000014 ffffffff CIE
DW_CFA_def_cfa: reg48 +0
DW_CFA_register: reg16 reg50
00000018 0000002c 00000000 FDE cie=00000000 pc=00000000...000001ac
DW_CFA_advance_loc4: 96
DW_CFA_offset: reg46 0
DW_CFA_offset: reg47 4
DW_CFA_offset: reg50 8
DW_CFA_offset: reg51 12
DW_CFA_offset: reg16 8
DW_CFA_advance_loc4: 4
DW_CFA_def_cfa_sf: reg46 -16
*** with this patch (edited for brevity)***
00000000 00000024 ffffffff CIE
DW_CFA_def_cfa_expression: DW_OP_bregx SGPR49+0, DW_OP_const1u 0x20, DW_OP_shl, DW_OP_bregx SGPR48+0, DW_OP_plus
DW_CFA_expression: reg16 DW_OP_bregx SGPR51+0, DW_OP_const1u 0x20, DW_OP_shl, DW_OP_bregx SGPR50+0, DW_OP_plus
00000028 0000003c 00000000 FDE cie=00000000 pc=00000000...000001ac
DW_CFA_advance_loc4: 96
DW_CFA_offset: reg46 0
DW_CFA_offset: reg47 4
DW_CFA_offset: reg50 8
DW_CFA_offset: reg51 12
DW_CFA_offset: reg16 8
DW_CFA_advance_loc4: 4
DW_CFA_def_cfa_expression: DW_OP_bregx SGPR47+0, DW_OP_const1u 0x20, DW_OP_shl, DW_OP_bregx SGPR46+0, DW_OP_plus, DW_OP_lit16, DW_OP_minus
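To show how such an expression combines two 32-bit registers into one 64-bit address, here is a toy evaluator for the handful of operations used above. It is a sketch, not a DWARF consumer: the Op and Step types are hypothetical, DW_OP_bregx offsets are taken to be 0 as in the dump, and DW_OP_const1u and DW_OP_lit16 are both modelled as "push a constant".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class Op { BregX, Const, Shl, Plus, Minus };

// 'operand' is a register number for BregX, a constant for Const,
// and unused for the remaining operations.
struct Step { Op op; std::uint64_t operand; };

std::uint64_t evaluate(const std::vector<Step> &expr,
                       const std::vector<std::uint32_t> &regs)
{
  std::vector<std::uint64_t> stack;
  for (const Step &s : expr)
    {
      if (s.op == Op::BregX) { stack.push_back(regs.at(s.operand)); continue; }
      if (s.op == Op::Const) { stack.push_back(s.operand); continue; }
      // Binary operations pop the top two entries; the former second
      // entry is the left-hand operand.
      assert(stack.size() >= 2);
      std::uint64_t top = stack.back(); stack.pop_back();
      std::uint64_t second = stack.back(); stack.pop_back();
      switch (s.op)
        {
        case Op::Shl:   stack.push_back(second << top); break;
        case Op::Plus:  stack.push_back(second + top); break;
        case Op::Minus: stack.push_back(second - top); break;
        default: break;
        }
    }
  assert(stack.size() == 1);
  return stack.back();
}
```

Evaluating bregx 49, const 0x20, shl, bregx 48, plus therefore yields (reg49 << 32) + reg48.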
gcc/ChangeLog:
* dwarf2cfi.c (dw_stack_pointer_regnum): Change type to struct cfa_reg.
(dw_frame_pointer_regnum): Likewise.
(new_cfi_row): Use set_by_dwreg.
(get_cfa_from_loc_descr): Use set_by_dwreg. Support register spans.
Handle DW_OP_bregx with DW_OP_breg{0-31}. Support DW_OP_lit*,
DW_OP_const*, DW_OP_minus, DW_OP_shl and DW_OP_plus.
(lookup_cfa_1): Use set_by_dwreg.
(def_cfa_0): Update for cfa_reg and support register spans.
(reg_save): Change sreg parameter to struct cfa_reg. Support register
spans.
(dwf_cfa_reg): New function.
(dwarf2out_flush_queued_reg_saves): Use dwf_cfa_reg instead of
dwf_regno.
(dwarf2out_frame_debug_def_cfa): Likewise.
(dwarf2out_frame_debug_adjust_cfa): Likewise.
(dwarf2out_frame_debug_cfa_offset): Likewise. Update reg_save usage.
(dwarf2out_frame_debug_cfa_register): Likewise.
(dwarf2out_frame_debug_expr): Likewise.
(create_pseudo_cfg): Use set_by_dwreg.
(initial_return_save): Use set_by_dwreg and dwf_cfa_reg.
(create_cie_data): Use dwf_cfa_reg.
(execute_dwarf2_frame): Use dwf_cfa_reg.
(dump_cfi_row): Use set_by_dwreg.
* dwarf2out.c (build_span_loc, build_breg_loc): New functions.
(build_cfa_loc): Support register spans.
(build_cfa_aligned_loc): Update cfa_reg usage.
(convert_cfa_to_fb_loc_list): Use set_by_dwreg.
* dwarf2out.h (struct cfa_reg): New type.
(struct dw_cfa_location): Use struct cfa_reg.
(build_span_loc): New prototype.
co-authored-By: Hafiz Abid Qadeer <abidh@codesourcery.com>
Haochen Jiang [Thu, 2 Dec 2021 07:30:17 +0000 (15:30 +0800)]
Add combine splitter to transform vpcmpeqd/vpxor/vblendvps to vblendvps for ~op0
gcc/ChangeLog:
PR target/100738
* config/i386/sse.md
(*<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_not_ltint):
Add new define_insn_and_split.
gcc/testsuite/ChangeLog:
PR target/100738
* g++.target/i386/pr100738-1.C: New test.
Alexandre Oliva [Sat, 4 Dec 2021 03:17:18 +0000 (00:17 -0300)]
[PR103149] detach values through mem only if general regs won't do
When hardening compares or conditional branches, we perform redundant
tests, and to prevent them from being optimized out, we use asm
statements that preserve a value used in a compare, but in a way that
the compiler can no longer assume it's the same value, so it can't
optimize the redundant test away.
We used to use +g, but that requires general regs or mem. You might
think that, if a reg constraint can't be satisfied, the register
allocator will fall back to memory, but that's not so: we decide on
matching MEMs very early on, by using the same addressable operand on
both input and output, and only if the constraint does not allow
registers. If it does, we use gimple registers and then pseudos as
inputs and outputs, and then inputs can be substituted by equivalent
expressions, and then, if no register constraint fits (e.g. because
that mode won't fit in general regs, or won't fit in regs at all), the
register allocator will give up before even trying to allocate some
temporary memory to unify input and output.
This patch arranges for us to create and use the temporary stack slot
if we can tell the mode requires memory, or won't otherwise fit in
general regs, and thus to use +m for that asm.
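The two constraint choices can be illustrated with GCC-style extended asm. This is a hand-written sketch of the technique, not the code the hardening pass generates; the helper names are hypothetical and the asm syntax assumes GCC or a compatible compiler.

```cpp
#include <cassert>

// An empty asm with a read-write operand leaves the value unchanged at
// run time, but the compiler must assume it may have been modified.
// "+g" accepts a general register or memory.
inline int detach_in_reg_or_mem(int v)
{
  __asm__ ("" : "+g" (v));
  return v;
}

// "+m" forces a memory operand, which suits values whose mode does not
// fit in general registers.
template <typename T>
T detach_in_mem(T v)
{
  __asm__ ("" : "+m" (v));
  return v;
}

// Performs the comparison twice; the second one uses detached copies so
// it cannot be optimized away as redundant.
bool hardened_equal(int a, int b)
{
  bool first = (a == b);
  bool second = (detach_in_reg_or_mem(a) == detach_in_reg_or_mem(b));
  if (first != second)
    __builtin_trap();
  return first;
}
```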
for gcc/ChangeLog
PR middle-end/103149
* gimple-harden-conditionals.cc (detach_value): Use memory if
general regs won't do.
for gcc/testsuite/ChangeLog
PR middle-end/103149
* gcc.target/aarch64/pr103149.c: New.
GCC Administrator [Wed, 8 Dec 2021 00:16:23 +0000 (00:16 +0000)]
Daily bump.
Harald Anlauf [Tue, 7 Dec 2021 20:34:31 +0000 (21:34 +0100)]
Fortran: perform array subscript checks only for valid INTEGER bounds
gcc/fortran/ChangeLog:
PR fortran/103607
* frontend-passes.c (do_subscript): Ensure that array bounds are
of type INTEGER before performing checks on array subscripts.
gcc/testsuite/ChangeLog:
PR fortran/103607
* gfortran.dg/pr103607.f90: New test.
Marek Polacek [Tue, 7 Dec 2021 21:06:19 +0000 (16:06 -0500)]
c++: Fix decltype-bitfield1.C on i?86
This test was failing on i?86 because of:
warning: width of 'A::l' exceeds its type
so change the type to 'long long' and make the test run only on arches
where sizeof(long long) == 8 to avoid failing like this again.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/decltype-bitfield1.C: Change a type to unsigned
long long. Only run on longlong64 targets.
Peter Bergner [Tue, 7 Dec 2021 20:42:38 +0000 (14:42 -0600)]
testsuite: Fix check_effective_target_rop_ok [PR103556, PR103586]
The new rop_ok effective target test doesn't correctly compute its expression
result because a new line starts a new statement. Solution is to remove
the new line.
2021-12-07 Peter Bergner <bergner@linux.ibm.com>
gcc/testsuite/
PR testsuite/103556
PR testsuite/103586
* lib/target-supports.exp (check_effective_target_rop_ok): Remove '\n'.
Harald Anlauf [Tue, 7 Dec 2021 17:46:52 +0000 (18:46 +0100)]
Fortran: catch failed simplification of bad stride expression
gcc/fortran/ChangeLog:
PR fortran/103588
* array.c (gfc_ref_dimen_size): Do not generate internal error on
failed simplification of stride expression; just return failure.
gcc/testsuite/ChangeLog:
PR fortran/103588
* gfortran.dg/pr103588.f90: New test.
Harald Anlauf [Mon, 6 Dec 2021 22:15:11 +0000 (23:15 +0100)]
Fortran: add check for type of upper bound in case range
gcc/fortran/ChangeLog:
PR fortran/103591
* match.c (match_case_selector): Check type of upper bound in case
range.
gcc/testsuite/ChangeLog:
PR fortran/103591
* gfortran.dg/select_9.f90: New test.
Martin Liska [Mon, 29 Nov 2021 13:46:47 +0000 (14:46 +0100)]
Fix --help -Q output
PR middle-end/103438
gcc/ChangeLog:
* config/s390/s390.c (s390_valid_target_attribute_inner_p):
Use new enum CLVC_INTEGER.
* opt-functions.awk: Use new CLVC_INTEGER.
* opts-common.c (set_option): Likewise.
(option_enabled): Return -1,0,1 for CLVC_INTEGER.
(get_option_state): Use new CLVC_INTEGER.
(control_warning_option): Likewise.
* opts.h (enum cl_var_type): Likewise.
Marek Polacek [Sat, 4 Dec 2021 17:07:41 +0000 (12:07 -0500)]
c++: Fix for decltype and bit-fields [PR95009]
Here, decltype deduces the wrong type for certain expressions involving
bit-fields. Unlike in C, in C++ bit-field width is explicitly not part
of the type, so I think decltype should never deduce to 'int:N'. The
problem isn't that we're not calling unlowered_expr_type--we are--it's
that is_bitfield_expr_with_lowered_type only handles certain codes, but
not others. For example, += works fine but ++ does not.
This also fixes decltype-bitfield2.C where we were crashing (!), but
unfortunately it does not fix 84516 or 70733 where the problem is likely
a missing call to unlowered_expr_type. It occurs to me now that typeof
likely has had the same issue, but this patch should fix that too.
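A short sketch of the expected deductions, using hypothetical helper names. Both expressions are lvalues of type int, so decltype should give int& for each. The pre-increment result depends on the compiler containing this fix, so it is computed at run time rather than with static_assert.

```cpp
#include <cassert>
#include <type_traits>

struct A { int i : 3; };
A a;

// Compound assignment on a bit-field: the case that already worked.
bool compound_assign_is_int_ref()
{
  return std::is_same<decltype(a.i += 1), int &>::value;
}

// Pre-increment on a bit-field: one of the cases this patch addresses.
bool pre_increment_is_int_ref()
{
  return std::is_same<decltype(++a.i), int &>::value;
}
```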
PR c++/95009
gcc/cp/ChangeLog:
* typeck.c (is_bitfield_expr_with_lowered_type) <case MODIFY_EXPR>:
Handle UNARY_PLUS_EXPR, NEGATE_EXPR, NON_LVALUE_EXPR, BIT_NOT_EXPR,
P*CREMENT_EXPR too.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/decltype-bitfield1.C: New test.
* g++.dg/cpp0x/decltype-bitfield2.C: New test.
H.J. Lu [Tue, 7 Dec 2021 13:09:34 +0000 (05:09 -0800)]
x86: Check FUNCTION_DECL before calling cgraph_node::get
gcc/
PR target/103594
* config/i386/i386.c (ix86_call_use_plt_p): Check FUNCTION_DECL
before calling cgraph_node::get.
gcc/testsuite/
PR target/103594
* gcc.dg/pr103594.c: New test.
Richard Biener [Tue, 7 Dec 2021 10:13:39 +0000 (11:13 +0100)]
tree-optimization/103596 - fix missed propagation into switches
may_propagate_copy unnecessarily restricts propagating non-abnormals
into places that currently contain an abnormal SSA name but are
not the PHI argument for an abnormal edge. This causes VN to
not elide a CFG path that it assumes is elided, resulting in
released SSA names in the IL.
The fix is to enhance the may_propagate_copy API to specify the
destination is _not_ a PHI argument. I chose to not update only
the relevant caller in VN and the may_propagate_copy_into_stmt API
at this point because this is a regression and needs backporting.
2021-12-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/103596
* tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_stmt):
Note we are not propagating into a PHI argument to may_propagate_copy.
* tree-ssa-propagate.h (may_propagate_copy): Add
argument specifying whether we propagate into a PHI arg.
* tree-ssa-propagate.c (may_propagate_copy): Likewise.
When not doing so we can replace an abnormal with
something else.
(may_propagate_into_stmt): Update may_propagate_copy calls.
(replace_exp_1): Move propagation checking code to
propagate_value and rename to ...
(replace_exp): ... this and elide previous wrapper.
(propagate_value): Perform checking with adjusted
may_propagate_copy call and dispatch to replace_exp.
* gcc.dg/torture/pr103596.c: New testcase.
Matthias Kretz [Fri, 3 Dec 2021 08:37:52 +0000 (09:37 +0100)]
Fix hash_map::traverse overload
The hash_map::traverse overload taking a non-const Value pointer stops
the traversal if the callback returns false. The other overload should
behave the same.
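The intended behavior can be sketched with a small stand-in container. simple_map and its member names are hypothetical, and it is built on std::map rather than GCC's hash table; the real hash_map spells both overloads "traverse".

```cpp
#include <cassert>
#include <map>

template <typename Key, typename Value>
struct simple_map
{
  std::map<Key, Value> entries;

  // Read-only traversal: the callback receives the value by const
  // reference and returns false to stop early.
  template <typename F>
  void traverse_values(F f) const
  {
    for (const auto &kv : entries)
      if (!f(kv.first, kv.second))
        break;
  }

  // Mutating traversal: the callback receives a pointer to the value
  // and returns false to stop early, exactly like the read-only one.
  template <typename F>
  void traverse_pointers(F f)
  {
    for (auto &kv : entries)
      if (!f(kv.first, &kv.second))
        break;
  }
};
```

A callback that should always visit every entry returns true, which is what the assert_is_empty change in predict.c does.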
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
gcc/ChangeLog:
* hash-map.h (hash_map::traverse): Let both overloads behave the
same.
* predict.c (assert_is_empty): Return true, thus not changing
behavior.
Tamar Christina [Tue, 7 Dec 2021 10:37:30 +0000 (10:37 +0000)]
Revert "libstdc++: Fix ctype changed after newlib update."
Newlib has reverted the commit that caused us to require a
workaround. As such we can now revert the workaround.
This reverts commit 0e510ab53414430e93c6f5b64841e2f40031cda7.
libstdc++-v3/ChangeLog:
PR libstdc++/103305
* config/os/newlib/ctype_base.h (upper, lower, alpha, digit, xdigit,
space, print, graph, cntrl, punct, alnum, blank): Revert.
YunQiang Su [Mon, 11 Oct 2021 10:42:39 +0000 (06:42 -0400)]
MIPS: R6: load/store can process unaligned address
MIPS release 6 requires that lw/ld/sw/sd work with
unaligned addresses; this can be implemented fully in
hardware or by trap&emulate.
Since it doesn't have to be fully done by hardware, we add a
pair of options -m(no-)unaligned-access. Kernels may need them.
gcc/ChangeLog:
* config/mips/mips.h (ISA_HAS_UNALIGNED_ACCESS, STRICT_ALIGNMENT):
R6 can do unaligned access.
* config/mips/mips.md (movmisalign<mode>): Likewise.
* config/mips/mips.opt: Add -m(no-)unaligned-access.
* doc/invoke.texi: Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/mips/mips.exp: Add unaligned-access.
* gcc.target/mips/unaligned-2.c: New test.
* gcc.target/mips/unaligned-3.c: New test.
Eugene Rozenfeld [Fri, 3 Dec 2021 02:37:09 +0000 (18:37 -0800)]
Improve AutoFDO count propagation algorithm
When a basic block A has been annotated with a count and it has only one
successor (or predecessor) B, we can propagate A's count to B.
The algorithm without this change could leave B without an annotation if B had
other unannotated predecessors (or successors). For example, in the test case I added,
the loop header block was left unannotated, which prevented loop unrolling.
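The propagation rule can be sketched on a toy control-flow graph. This is an illustration of the rule as stated above, not the code in afdo_propagate_edge; the Block type and function name are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Block
{
  bool annotated = false;
  long count = 0;
  std::vector<std::size_t> succs, preds;  // indices into the graph
};

// Repeats until nothing changes: an annotated block with exactly one
// successor (or exactly one predecessor) hands its count to that
// neighbour if the neighbour has no annotation yet.
void propagate_single_neighbor(std::vector<Block> &cfg)
{
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (Block &a : cfg)
        {
          if (!a.annotated)
            continue;
          for (const std::vector<std::size_t> *edges : {&a.succs, &a.preds})
            {
              if (edges->size() != 1)
                continue;
              Block &b = cfg[edges->front()];
              if (!b.annotated)
                {
                  b.count = a.count;
                  b.annotated = true;
                  changed = true;
                }
            }
        }
    }
}
```

In the test below, block 1 plays the loop header: it has a second, unannotated predecessor (block 2), yet still receives the count of block 0.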
gcc/ChangeLog:
* auto-profile.c (afdo_propagate_edge): Improve count propagation algorithm.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-prof/init-array.c: New test for unrolling inner loops.
GCC Administrator [Tue, 7 Dec 2021 00:16:23 +0000 (00:16 +0000)]
Daily bump.
David Malcolm [Mon, 6 Dec 2021 19:04:35 +0000 (14:04 -0500)]
analyzer: fix equivalence class state purging [PR103533]
Whilst debugging state explosions seen when enabling taint detection
with -fanalyzer (PR analyzer/103533), I noticed that constraint
manager instances could contain stray, redundant constants, such
as this instance:
constraint_manager:
equiv classes:
ec0: {(int)0 == [m_constant]‘0’}
ec1: {(size_t)4 == [m_constant]‘4’}
constraints:
where there are two equivalence classes, each just containing a
constant, with no constraints using them.
This patch makes constraint_manager::canonicalize more aggressive
about purging state, handling the case of purging a redundant
EC containing just a constant.
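As a rough model of the purging criterion described above (not the analyzer's actual data structures), an equivalence class can be treated as redundant when it holds only constants and no constraint refers to it. Apart from the method name `contains_non_constant_p`, which is taken from the ChangeLog below, every type and function here is a hypothetical stand-in.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the analyzer's symbolic values and ECs.
struct Member {
  long value;
  bool is_constant;
};

struct EquivClass {
  std::vector<Member> members;

  // True if at least one member is something other than a constant.
  bool contains_non_constant_p() const {
    return std::any_of(members.begin(), members.end(),
                       [](const Member& m) { return !m.is_constant; });
  }
};

// A constraint relates two equivalence classes, identified by index.
struct Constraint {
  std::size_t lhs_ec;
  std::size_t rhs_ec;
};

// Returns the indices of ECs that contain only constants and are not
// used by any constraint, i.e. the candidates for purging in this model.
std::vector<std::size_t> redundant_ecs(
    const std::vector<EquivClass>& ecs,
    const std::vector<Constraint>& constraints) {
  std::vector<std::size_t> result;
  for (std::size_t i = 0; i < ecs.size(); ++i) {
    const bool used = std::any_of(
        constraints.begin(), constraints.end(),
        [i](const Constraint& c) { return c.lhs_ec == i || c.rhs_ec == i; });
    if (!used && !ecs[i].contains_non_constant_p())
      result.push_back(i);
  }
  return result;
}
```

With the two constant-only classes from the dump above and an empty constraint list, both classes are reported as redundant.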
gcc/analyzer/ChangeLog:
PR analyzer/103533
* constraint-manager.cc (equiv_class::contains_non_constant_p):
New.
(constraint_manager::canonicalize): Call it when determining
redundant ECs.
(selftest::test_purging): New selftest.
(selftest::run_constraint_manager_tests): Likewise.
* constraint-manager.h (equiv_class::contains_non_constant_p):
New decl.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Paul A. Clarke [Mon, 6 Dec 2021 22:16:31 +0000 (16:16 -0600)]
rs6000: Fix errant "vector" instead of "__vector"
Fixes 85289ba36c2e62de84cc0232c954d9a74bda708a.
2021-12-06 Paul A. Clarke <pc@us.ibm.com>
gcc
PR target/103545
* config/rs6000/xmmintrin.h (_mm_movemask_ps): Replace "vector" with
"__vector".
Navid Rahimi [Mon, 6 Dec 2021 21:46:13 +0000 (13:46 -0800)]
MAINTAINERS: Add myself to write after approval and DCO sections.
* MAINTAINERS: Adding myself.
Jose E. Marchesi [Mon, 6 Dec 2021 20:57:53 +0000 (21:57 +0100)]
bpf: mark/remove unused arguments and remove an unused function
This patch does a little bit of cleanup by removing some unused
arguments, or marking them as unused. It also removes the function
ctfc_debuginfo_early_finish_p and the corresponding hook macro
definition, which are not used by GCC.
gcc/
* config/bpf/bpf.c (bpf_handle_preserve_access_index_attribute):
Mark arguments `args' and `flags' as unused.
(bpf_core_newdecl): Remove unused local `newdecl'.
(bpf_core_newdecl): Remove unused argument `loc'.
(ctfc_debuginfo_early_finish_p): Remove unused function.
(TARGET_CTFC_DEBUGINFO_EARLY_FINISH_P): Remove definition.
(bpf_core_walk): Do not pass a location to bpf_core_newdecl.
Richard Sandiford [Mon, 6 Dec 2021 18:29:31 +0000 (18:29 +0000)]
ranger: Add shortcuts for single-successor blocks
When compiling an optabs.ii at -O2 with a release-checking build,
there were 6,643,575 calls to gimple_outgoing_range_stmt_p. 96.8% of
them were for blocks with a single successor, which never have a control
statement that generates new range info. This patch therefore adds a
shortcut for that case.
This gives a ~1% compile-time improvement for the test.
I tried making the function inline (in the header) so that the
single_succ_p didn't need to be repeated, but it seemed to make things
slightly worse.
gcc/
* gimple-range-edge.cc (gimple_outgoing_range::edge_range_p): Add
a shortcut for blocks with single successors.
* gimple-range-gori.cc (gori_map::calculate_gori): Likewise.
Richard Sandiford [Mon, 6 Dec 2021 18:29:30 +0000 (18:29 +0000)]
ranger: Optimise irange_union
When compiling an optabs.ii at -O2 with a release-checking build,
the hottest function in the profile was irange_union. This patch
tries to optimise it a bit. The specific changes are:
- Use quick_push rather than safe_push, since the final number
of entries is known in advance.
- Avoid assigning wi::to_wide & co. to a temporary wide_int,
such as in:
wide_int val_j = wi::to_wide (res[j]);
wi::to_wide returns a wide_int "view" of the in-place INTEGER_CST
storage. Assigning the result to wide_int forces an unnecessary
copy to temporary storage.
This is one area where "auto" helps a lot. In the end though,
it seemed more readable to inline the wi::to_*s rather than
use auto.
- Use to_widest_int rather than to_wide_int. Both are functionally
correct, but to_widest_int is more efficient, for three reasons:
- to_wide returns a wide-int representation in which the most
significant element might not be canonically sign-extended.
This is because we want to allow the storage of an INTEGER_CST
like 0x1U << 31 to be accessed directly with both a wide_int view
(where only 32 bits matter) and a widest_int view (where many more
bits matter, and where the 32 bits are zero-extended to match the
unsigned type). However, operating on uncanonicalised wide_int
forms is less efficient than operating on canonicalised forms.
- to_widest_int has a constant rather than variable precision and
there are never any redundant upper bits to worry about.
- Using widest_int avoids the need for an overflow check, since
there is enough precision to add 1 to any IL constant without
wrap-around.
This gives a ~2% compile-time speed up with the test above.
I also tried adding a path for two single-pair ranges, but it
wasn't a win.
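Two of the ideas above have close analogues in plain C++, sketched here without GCC's wide-int API. Reserving capacity up front stands in for quick_push when the final size is known, and doing the "add 1" adjacency test in a wider type stands in for widest_int removing the overflow check. The `Range` alias and `range_union` are hypothetical names, and 32-bit bounds widened to 64 bits are an assumption chosen for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Inclusive [lo, hi] range over 32-bit values (illustrative only).
using Range = std::pair<std::int32_t, std::int32_t>;

// Merges overlapping or adjacent ranges and returns them sorted.
std::vector<Range> range_union(std::vector<Range> in) {
  std::sort(in.begin(), in.end());
  std::vector<Range> out;
  // The result can never have more entries than the input, so the
  // capacity is known in advance and later pushes never reallocate.
  out.reserve(in.size());
  for (const Range& r : in) {
    if (!out.empty()) {
      // Widen before adding 1: INT32_MAX + 1 would wrap in 32 bits,
      // but always fits in 64 bits, so no overflow check is needed.
      const std::int64_t prev_hi = out.back().second;
      if (prev_hi + 1 >= r.first) {
        out.back().second = std::max(out.back().second, r.second);
        continue;
      }
    }
    out.push_back(r);
  }
  return out;
}
```

A range ending at INT32_MAX is the case where a 32-bit "hi + 1" would need an explicit guard; the widened comparison handles it with the same code path as every other range.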
gcc/
* value-range.cc (irange::irange_union): Use quick_push rather
than safe_push. Use widest_int rather than wide_int. Avoid
assigning wi::to_* results to wide*_int temporaries.
Andrew MacLeod [Fri, 3 Dec 2021 16:02:19 +0000 (11:02 -0500)]
Use dominators to reduce cache-filling.
Before walking the CFG and filling all cache entries, check if the
same information is available in a dominator.
* gimple-range-cache.cc (ranger_cache::fill_block_cache): Check for
a range from dominators before filling the cache.
(ranger_cache::range_from_dom): New.
* gimple-range-cache.h (ranger_cache::range_from_dom): Add prototype.
Andrew MacLeod [Fri, 3 Dec 2021 15:51:18 +0000 (10:51 -0500)]
Add BB option for outgoing_edge_range_p and may_recompute_p.
There are times we only need to know if any edge from a block can calculate
a range.
* gimple-range-gori.h (class gori_compute): Add prototypes.
* gimple-range-gori.cc (gori_compute::has_edge_range_p): Add alternate
API for basic block. Call for edge alternative.
(gori_compute::may_recompute_p): Ditto.
H.J. Lu [Mon, 6 Dec 2021 16:17:49 +0000 (08:17 -0800)]
libsanitizer: Update LOCAL_PATCHES
Add
commit 70b043845d7c378c6a9361a6769885897d1018c2
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Nov 30 05:31:26 2021 -0800
libsanitizer: Use SSE to save and restore XMM registers
to LOCAL_PATCHES.
* LOCAL_PATCHES: Add commit 70b043845d7.
H.J. Lu [Tue, 30 Nov 2021 13:31:26 +0000 (05:31 -0800)]
libsanitizer: Use SSE to save and restore XMM registers
Use SSE, instead of AVX, to save and restore XMM registers to support
processors without AVX. The affected code has been unused upstream since
https://github.com/llvm/llvm-project/commit/66d4ce7e26a5
and will be removed in
https://reviews.llvm.org/D112604
This fixed
FAIL: g++.dg/tsan/pthread_cond_clockwait.C -O0 execution test
FAIL: g++.dg/tsan/pthread_cond_clockwait.C -O2 execution test
on machines without AVX.
PR sanitizer/103466
* tsan/tsan_rtl_amd64.S (__tsan_trace_switch_thunk): Replace
vmovdqu with movdqu.
(__tsan_report_race_thunk): Likewise.
Richard Biener [Mon, 6 Dec 2021 14:13:49 +0000 (15:13 +0100)]
tree-optimization/103581 - fix masked gather on x86
The recent fix to PR103527 exposed an issue with how the various
special cases for AVX512 masks in vect_build_gather_load_calls
are handled. The following makes that more obvious, fixing the
miscompile of 403.gcc.
2021-12-06 Richard Biener <rguenther@suse.de>
PR tree-optimization/103581
* tree-vect-stmts.c (vect_build_gather_load_calls): Properly
guard all the AVX512 mask cases.
* gcc.dg/vect/pr103581.c: New testcase.
Martin Liska [Mon, 6 Dec 2021 13:08:53 +0000 (14:08 +0100)]
contrib: Filter out -Wreturn-type in fold-const-call.c.
contrib/ChangeLog:
* filter-clang-warnings.py: Filter out one warning.
Richard Biener [Mon, 6 Dec 2021 10:43:28 +0000 (11:43 +0100)]
tree-optimization/103544 - SLP reduction chain as SLP reduction issue
When SLP reduction chain vectorization support added handling of
an outer conversion in the chain, picking up a failed reduction chain
as an SLP reduction broke the invariant that the whole reduction
is forward reachable. The following plugs that hole, noting
a future enhancement possibility.
2021-12-06 Richard Biener <rguenther@suse.de>
PR tree-optimization/103544
* tree-vect-slp.c (vect_analyze_slp): Only add a SLP reduction
opportunity if the stmt in question is the reduction root.
(dot_slp_tree): Add missing check for NULL child.
* gcc.dg/vect/pr103544.c: New testcase.
Jakub Jelinek [Mon, 6 Dec 2021 10:18:58 +0000 (11:18 +0100)]
avr: Fix AVR build [PR71934]
On Mon, Dec 06, 2021 at 11:00:30AM +0100, Martin Liška wrote:
> Jakub, I think the patch broke avr-linux target:
>
> g++ -fno-PIE -c -g -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-erro
> /home/marxin/Programming/gcc/gcc/config/avr/avr.c: In function ‘void avr_output_data_section_asm_op(const void*)’:
> /home/marxin/Programming/gcc/gcc/config/avr/avr.c:10097:26: error: invalid conversion from ‘const void*’ to ‘const char*’ [-fpermissive]
This patch fixes that.
2021-12-06 Jakub Jelinek <jakub@redhat.com>
PR pch/71934
* config/avr/avr.c (avr_output_data_section_asm_op,
avr_output_bss_section_asm_op): Change argument type from const void *
to const char *.
Tamar Christina [Mon, 6 Dec 2021 10:15:15 +0000 (10:15 +0000)]
cse: Make sure duplicate elements are not entered into the equivalence set [PR103404]
CSE uses equivalence classes to keep track of expressions that all have the same
values at the current point in the program.
Normal equivalences through SETs only insert and perform lookups in this set but
equivalence determined from comparisons, e.g.
(insn 46 44 47 7 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 105 [ iD.2893 ])
(const_int 0 [0]))) "cse.c":18:22 7 {*cmpsi_ccno_1}
(expr_list:REG_DEAD (reg:SI 105 [ iD.2893 ])
(nil)))
creates the equivalence EQ on (reg:SI 105 [ iD.2893 ]) and (const_int 0 [0]).
This causes a merge to happen between the two equivalence sets denoted by
(const_int 0 [0]) and (reg:SI 105 [ iD.2893 ]) respectively.
The operation happens through merge_equiv_classes; however, this function has an
invariant that the classes to be merged do not contain any duplicates. This is
because it frees entries before merging.
The given testcase, when using the supplied flags, triggers an ICE due to the
equivalence set being
(rr) p dump_class (class1)
Equivalence chain for (reg:SI 105 [ iD.2893 ]):
(reg:SI 105 [ iD.2893 ])
$3 = void
(rr) p dump_class (class2)
Equivalence chain for (const_int 0 [0]):
(const_int 0 [0])
(reg:SI 97 [ _10 ])
(reg:SI 97 [ _10 ])
$4 = void
This happens because the original INSN being recorded is
(insn 18 17 24 2 (set (subreg:V1SI (reg:SI 97 [ _10 ]) 0)
(const_vector:V1SI [
(const_int 0 [0])
])) "cse.c":11:9 1363 {*movv1si_internal}
(expr_list:REG_UNUSED (reg:SI 97 [ _10 ])
(nil)))
and we end up generating two equivalences. The first one is simply that
reg:SI 97 is 0. The second one is that 0 can be extracted from the V1SI, so
subreg (subreg:V1SI (reg:SI 97) 0) 0 == 0. This nested subreg gets folded away
to just reg:SI 97 and we re-insert the same equivalence.
This patch changes it so that if the nunits of a subreg is 1, we don't generate
a vec_select from the subreg, as the subreg will be folded away and we would get
a duplicate.
gcc/ChangeLog:
PR rtl-optimization/103404
* cse.c (find_sets_in_insn): Don't select elements out of a V1 mode
subreg.
gcc/testsuite/ChangeLog:
PR rtl-optimization/103404
* gcc.target/i386/pr103404.c: New test.
liuhongt [Tue, 30 Nov 2021 05:50:11 +0000 (13:50 +0800)]
Prefer INT_SSE_REGS for SSE_FLOAT_MODE_P in preferred_reload_class.
When moves between integer and sse registers are cheap.
2021-12-06 Hongtao Liu <Hongtao.liu@intel.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/95740
* config/i386/i386.c (ix86_preferred_reload_class): Allow
integer regs when moves between register units are cheap.
* config/i386/i386.h (INT_SSE_CLASS_P): New.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr95740.c: New test.
Nelson Chu [Mon, 29 Nov 2021 12:48:20 +0000 (04:48 -0800)]
RISC-V: jal cannot refer to a default visibility symbol for shared object.
This is the original binutils bugzilla report,
https://sourceware.org/bugzilla/show_bug.cgi?id=28509
And this is the first version of the proposed binutils patch,
https://sourceware.org/pipermail/binutils/2021-November/118398.html
After applying the binutils patch, I get the following unexpected error when
building libgcc,
/scratch/nelsonc/riscv-gnu-toolchain/riscv-gcc/libgcc/config/riscv/div.S:42:
/scratch/nelsonc/build-upstream/rv64gc-linux/build-install/riscv64-unknown-linux-gnu/bin/ld: relocation R_RISCV_JAL against `__udivdi3' which may bind externally can not be used when making a shared object; recompile with -fPIC
Therefore, this patch adds an extra hidden alias symbol for __udivdi3, and
then uses HIDDEN_JUMPTARGET to target a non-preemptible symbol instead.
The solution is similar to the one in glibc, as follows,
https://sourceware.org/git/?p=glibc.git;a=commit;h=68389203832ab39dd0dbaabbc4059e7fff51c29b
libgcc/ChangeLog:
* config/riscv/div.S: Add the hidden alias symbol for __udivdi3, and
then use HIDDEN_JUMPTARGET to target it since it is non-preemptible.
* config/riscv/riscv-asm.h: Added new macros HIDDEN_JUMPTARGET and
HIDDEN_DEF.
GCC Administrator [Mon, 6 Dec 2021 00:16:21 +0000 (00:16 +0000)]
Daily bump.
Iain Sandoe [Sun, 5 Dec 2021 20:15:40 +0000 (20:15 +0000)]
Objective-C, NeXT: Reorganise meta-data declarations.
This moves the GTY declaration of the meta-data identifier
array into the header that enumerates these and provides
shorthand defines for them. This avoids a problem seen with
a relocatable PCH implementation.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/objc/ChangeLog:
* objc-next-metadata-tags.h (objc_rt_trees): Declare here.
* objc-next-runtime-abi-01.c: Remove from here.
* objc-next-runtime-abi-02.c: Likewise.
* objc-runtime-shared-support.c: Reorder headers, provide
a GTY declaration the definition of objc_rt_trees.
David Edelsohn [Sun, 5 Dec 2021 01:23:09 +0000 (20:23 -0500)]
aix: Move AIX math builtins before new builtin machinery.
The new builtin machinery has an early exit, so move the AIX-specific
builtins before the new machinery.
gcc/ChangeLog:
* config/rs6000/rs6000-call.c (rs6000_init_builtins): Move
AIX math builtin initialization before new_builtins_are_live.
GCC Administrator [Sun, 5 Dec 2021 00:16:28 +0000 (00:16 +0000)]
Daily bump.
Marek Polacek [Sat, 4 Dec 2021 20:29:18 +0000 (15:29 -0500)]
c++: Add fixed test [PR93614]
This was fixed by r11-86.
PR c++/93614
gcc/testsuite/ChangeLog:
* g++.dg/template/lookup18.C: New test.
Tobias Burnus [Sat, 4 Dec 2021 18:39:43 +0000 (19:39 +0100)]
Fortran/OpenMP: Support most of 5.1 atomic extensions
Implements most of OpenMP 5.1 atomic extensions,
except that 'compare' is parsed but rejected during
resolution. (As the trans-openmp.c handling is missing.)
gcc/fortran/ChangeLog:
* dump-parse-tree.c (show_omp_clauses): Handle
weak/compare/fail clause.
* gfortran.h (gfc_omp_clauses): Add weak, compare, fail.
* openmp.c (enum omp_mask1, gfc_match_omp_clauses,
OMP_ATOMIC_CLAUSES): Update for new clauses.
(gfc_match_omp_atomic): Update for 5.1 atomic changes.
(is_conversion): Support widening in one go.
(is_scalar_intrinsic_expr): New.
(resolve_omp_atomic): Update for 5.1 atomic changes.
* parse.c (parse_omp_oacc_atomic): Update for compare.
* resolve.c (gfc_resolve_blocks): Update asserts.
* trans-openmp.c (gfc_trans_omp_atomic): Handle new clauses.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/atomic-2.f90: Move now supported code to ...
* gfortran.dg/gomp/atomic.f90: here.
* gfortran.dg/gomp/atomic-10.f90: New test.
* gfortran.dg/gomp/atomic-12.f90: New test.
* gfortran.dg/gomp/atomic-15.f90: New test.
* gfortran.dg/gomp/atomic-16.f90: New test.
* gfortran.dg/gomp/atomic-17.f90: New test.
* gfortran.dg/gomp/atomic-18.f90: New test.
* gfortran.dg/gomp/atomic-19.f90: New test.
* gfortran.dg/gomp/atomic-20.f90: New test.
* gfortran.dg/gomp/atomic-22.f90: New test.
* gfortran.dg/gomp/atomic-24.f90: New test.
* gfortran.dg/gomp/atomic-25.f90: New test.
* gfortran.dg/gomp/atomic-26.f90: New test.
libgomp/ChangeLog
* libgomp.texi (OpenMP 5.1): Update status.
Jonathan Wakely [Sat, 4 Dec 2021 11:38:25 +0000 (11:38 +0000)]
libstdc++: Initialize member in std::match_results [PR103549]
This fixes a -Wuninitialized warning for std::cmatch m1, m2; m1=m2;
Also name the template parameters in the forward declaration, to get rid
of the <template-parameter-1-1> noise in diagnostics.
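The statement sequence quoted above, wrapped in a function so it can be compiled on its own. The function name is hypothetical, and whether a given compiler version emits the warning for this exact snippet is not something the sketch asserts; it only shows the pattern of assigning one default-constructed `std::cmatch` to another.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>

// Copy-assigns one default-constructed match_results to another.
// The assignment copies every data member of m2, including any that
// a default constructor without member-initializers leaves unset.
std::size_t assign_default_constructed() {
  std::cmatch m1, m2;
  assert(!m2.ready());  // no match has been attempted yet
  m1 = m2;
  return m1.size();     // no sub-matches are stored
}
```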
libstdc++-v3/ChangeLog:
PR libstdc++/103549
* include/bits/regex.h (match_results): Give names to template
parameters in first declaration.
(match_results::_M_begin): Add default member-initializer.
Tobias Burnus [Sat, 4 Dec 2021 12:28:03 +0000 (13:28 +0100)]
libgomp.texi: Update OMP_PLACES
libgomp/ChangeLog:
* libgomp.texi (OMP_PLACES): Extend description for OMP 5.1 changes.
Jakub Jelinek [Sat, 4 Dec 2021 10:09:33 +0000 (11:09 +0100)]
i386, ipa-modref: Comment spelling fix
This patch fixes spelling of prefer (misspelled as preffer).
2021-12-04 Jakub Jelinek <jakub@redhat.com>
* config/i386/x86-tune.def (X86_TUNE_PARTIAL_REG_DEPENDENCY): Fix
comment typo, Preffer -> prefer.
* ipa-modref-tree.c (modref_access_node::closer_pair_p): Likewise.
Jakub Jelinek [Sat, 4 Dec 2021 10:02:15 +0000 (11:02 +0100)]
c++: Allow indeterminate unsigned char or std::byte in bit_cast - P1272R4
P1272R4, the std::byteswap paper, also added a clarification for std::bit_cast
that seems to me quite unrelated.
The patch treats it as DR, applying to all languages.
We no longer diagnose if padding bits are stored into unsigned char
or std::byte result, fields or bitfields, instead arrange for that result,
those fields or bitfields to get indeterminate value (empty
CONSTRUCTOR with CONSTRUCTOR_NO_ZEROING or just leaving the member's
initializer out and setting CONSTRUCTOR_NO_ZEROING on parent).
We still have a bug in that we don't diagnose, in lots of places, lvalue-to-rvalue
conversions of indeterminate values or class objects with some indeterminate
members.
2021-12-04 Jakub Jelinek <jakub@redhat.com>
* cp-tree.h (is_byte_access_type_not_plain_char): Declare.
* tree.c (is_byte_access_type_not_plain_char): New function.
* constexpr.c (clear_uchar_or_std_byte_in_mask): New function.
(cxx_eval_bit_cast): Don't error about padding bits if target
type is unsigned char or std::byte, instead return no clearing
ctor. Use clear_uchar_or_std_byte_in_mask.
* g++.dg/cpp2a/bit-cast11.C: New test.
* g++.dg/cpp2a/bit-cast12.C: New test.
* g++.dg/cpp2a/bit-cast13.C: New test.
* g++.dg/cpp2a/bit-cast14.C: New test.
Jakub Jelinek [Sat, 4 Dec 2021 10:00:09 +0000 (11:00 +0100)]
libcpp: Fix up handling of deferred pragmas [PR102432]
The https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557903.html
change broke the following testcases. The problem is when a pragma
namespace allows expansion (i.e. p->is_nspace && p->allow_expansion),
e.g. the omp or acc namespaces do, then when parsing the second pragma
token we do it with pfile->state.in_directive set,
pfile->state.prevent_expansion clear and pfile->state.in_deferred_pragma
clear (the last one because we don't know yet if it will be a deferred
pragma or not). If the pragma line only contains a single name
and newline after it, and there exists a function-like macro with the
same name, the preprocessor needs to peek at the next token in
funlike_invocation_p to check whether it is a (, but in this case it will see a newline.
As pfile->state.in_directive is set, we don't read anything after the
newline, pfile->buffer->need_line is set and CPP_EOF is lexed, which
funlike_invocation_p doesn't push back. Because name is a function-like
macro and on the pragma line there is no ( after the name, it isn't
expanded, and control flow returns to do_pragma. If name is a valid
deferred pragma, we set pfile->state.in_deferred_pragma (and really
need it set so that e.g. end_directive later on doesn't eat all the
tokens from the pragma line).
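A small input with the shape described above: a function-like macro whose name matches the single directive name on a pragma line in an expanding namespace. This is a sketch modelled on the description, and it is an assumption that it resembles the committed pr102432.c tests; the function name is hypothetical. Without OpenMP enabled the pragmas are simply ignored, so the code behaves the same either way.

```cpp
#include <cassert>

// Function-like macro sharing its name with the OpenMP "loop" directive.
// On the pragma line below, "loop" is followed by a newline rather than
// "(", so the macro is not expanded there.
#define loop(x)

int run_loop() {
  const int iterations = 64;
#pragma omp parallel
#pragma omp loop
  for (int i = 0; i < iterations; i++)
    ;  // empty body: only the pragma handling matters here
  return iterations;
}
```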
Before Nathan's change (which unfortunately didn't contain rationale
on why it is better to do it like that), this wasn't a problem,
the next _cpp_lex_direct call, made when we want the next token, would return
CPP_PRAGMA_EOL when it saw buffer->need_line, which would turn off
pfile->state.in_deferred_pragma and following get token would already
read the next line. But Nathan's patch replaced it with an assertion
failure that now triggers and CPP_PRAGMA_EOL is done only when lexing
the '\n'. Except for this special case that works fine, but in
this case it doesn't because when peeking the token we still didn't know
that it will be a deferred pragma.
I've tried to fix that up in do_pragma by detecting this and pushing
CPP_PRAGMA_EOL as lookahead, but that doesn't work because end_directive
still needs to see pfile->state.in_deferred_pragma set.
So, this patch effectively reverts part of Nathan's change: the CPP_PRAGMA_EOL
addition isn't done only when parsing the '\n', but is now done in both
places, in the first one instead of the assertion failure.
2021-12-04 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/102432
* lex.c (_cpp_lex_direct): If buffer->need_line while
pfile->state.in_deferred_pragma, return CPP_PRAGMA_EOL token instead
of assertion failure.
* c-c++-common/gomp/pr102432.c: New test.
* c-c++-common/goacc/pr102432.c: New test.
Alexandre Oliva [Sat, 4 Dec 2021 03:17:16 +0000 (00:17 -0300)]
[PR103028] test ifcvt trap_if seq more strictly after reload
When -fif-conversion2 is enabled, we attempt to replace conditional
branches around unconditional traps with conditional traps. That
canonicalizes compares, which may change an immediate that barely fits
into one that doesn't.
The compare for the trap is first checked using the cbranch
predicates, and then compare and conditional trap insns are
emitted and recognized.
In the failing s390x testcase, i <=u 0xffff_ffff is canonicalized into
i <u 0x1_0000_0000, and the latter immediate doesn't fit. The insn
predicates (both cbranch and cmpdi_ccu) happily accept it, since the
register allocator has no trouble getting them into registers. The
problem is that ifcvt2 runs after reload, so we recognize the compare
insn successfully, but later on we barf when we find that none of the
constraints fit.
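The canonicalization in the failing case can be checked with ordinary unsigned arithmetic. The sketch below shows that the two comparison forms agree while only the first constant fits in 32 bits. Treating "fits" as "representable as a 32-bit unsigned immediate" is an assumption made for the illustration, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Original form: i <=u 0xffff_ffff
constexpr bool le_form(std::uint64_t i) { return i <= 0xffffffffULL; }

// Canonicalized form: i <u 0x1_0000_0000
constexpr bool lt_form(std::uint64_t i) { return i < 0x100000000ULL; }

// Assumed notion of "fits": representable as a 32-bit unsigned immediate.
constexpr bool fits_in_32_bits(std::uint64_t imm) {
  return imm <= 0xffffffffULL;
}

// Both forms give the same answer on either side of the boundary...
static_assert(le_form(0xffffffffULL) == lt_form(0xffffffffULL), "");
static_assert(le_form(0x100000000ULL) == lt_form(0x100000000ULL), "");
// ...but only the original immediate fits in 32 bits.
static_assert(fits_in_32_bits(0xffffffffULL), "");
static_assert(!fits_in_32_bits(0x100000000ULL), "");
```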
This patch arranges for the trap_if-issuing bits in ifcvt to validate
post-reload insns using a stricter test that also checks that operands
fit the constraints.
for gcc/ChangeLog
PR rtl-optimization/103028
* ifcvt.c (find_cond_trap): Validate new insns more strictly
after reload.
for gcc/testsuite/ChangeLog
PR rtl-optimization/103028
* gcc.dg/pr103028.c: New.
David Edelsohn [Fri, 3 Dec 2021 19:47:48 +0000 (14:47 -0500)]
testsuite: powerpc/vec_reve_1.c requires VSX.
vector long long int and vector double require VSX not just Altivec.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vec_reve_1.c: Require VSX.
GCC Administrator [Sat, 4 Dec 2021 00:16:46 +0000 (00:16 +0000)]
Daily bump.
Jonathan Wakely [Fri, 3 Dec 2021 11:16:30 +0000 (11:16 +0000)]
libstdc++: Simplify emplace member functions in _Rb_tree
This introduces a new RAII type to simplify the emplace members which
currently use try-catch blocks to deallocate a node if an exception is
thrown by the comparisons done during insertion. The new type is created
on the stack and manages the allocation of a new node and deallocates it
in the destructor if it wasn't inserted into the tree. It also provides
helper functions for doing the insertion, releasing ownership of the
node to the tree.
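The general pattern can be sketched with a toy container. This is not the `_Rb_tree::_Auto_node` implementation: `AutoNode`, `emplace_unique`, the singly linked list and the `live_nodes` counter are all hypothetical, chosen only to show a guard that frees a node unless ownership is released to the container.

```cpp
#include <cassert>
#include <stdexcept>

struct Node {
  int value;
  Node* next;
};

int live_nodes = 0;  // instrumentation for this example only

// RAII guard: allocates a node and frees it in the destructor unless
// release() has handed ownership to the container.
class AutoNode {
 public:
  explicit AutoNode(int v) : node_(new Node{v, nullptr}) { ++live_nodes; }
  ~AutoNode() {
    if (node_) {
      delete node_;
      --live_nodes;
    }
  }
  AutoNode(const AutoNode&) = delete;
  AutoNode& operator=(const AutoNode&) = delete;

  int value() const { return node_->value; }

  // Gives up ownership; the destructor then does nothing.
  Node* release() {
    Node* n = node_;
    node_ = nullptr;
    return n;
  }

 private:
  Node* node_;
};

// Inserts v at the front of the list unless an equal value is present.
// `equal` may throw; no try-catch is needed because the guard cleans up.
template <typename Equal>
bool emplace_unique(Node*& head, int v, Equal equal) {
  AutoNode guard(v);
  for (Node* p = head; p != nullptr; p = p->next)
    if (equal(p->value, guard.value()))
      return false;  // duplicate: guard frees the node
  Node* n = guard.release();
  n->next = head;
  head = n;
  return true;
}

// Frees every node in the list.
void clear(Node*& head) {
  while (head) {
    Node* n = head;
    head = head->next;
    delete n;
    --live_nodes;
  }
}
```

If the comparison throws or a duplicate is found, the guard's destructor frees the node, so the insertion function needs no explicit cleanup path.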
Also, we don't need to use long qualified names if we put the return
type after the nested-name-specifier.
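The effect of moving the return type after the nested-name-specifier can be seen in a small stand-alone template. `Tree`, `first` and `first_old_style` are hypothetical names, unrelated to the real `_Rb_tree` members.

```cpp
#include <cassert>

template <typename T>
struct Tree {
  using iterator = T*;
  T data[1];
  iterator first();
  iterator first_old_style();
};

// Leading return type: the class scope has not been entered yet, so
// the member type needs full qualification and `typename`.
template <typename T>
typename Tree<T>::iterator Tree<T>::first_old_style() {
  return data;
}

// Trailing return type: it follows `Tree<T>::`, so names are looked up
// in the class scope and plain `iterator` is enough.
template <typename T>
auto Tree<T>::first() -> iterator {
  return data;
}
```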
libstdc++-v3/ChangeLog:
* include/bits/stl_tree.h (_Rb_tree::_Auto_node): Define new
RAII helper for creating and inserting new nodes.
(_Rb_tree::_M_insert_node): Use trailing-return-type to simplify
out-of-line definition.
(_Rb_tree::_M_insert_lower_node): Likewise.
(_Rb_tree::_M_insert_equal_lower_node): Likewise.
(_Rb_tree::_M_emplace_unique): Likewise. Use _Auto_node.
(_Rb_tree::_M_emplace_equal): Likewise.
(_Rb_tree::_M_emplace_hint_unique): Likewise.
(_Rb_tree::_M_emplace_hint_equal): Likewise.
Jason Merrill [Fri, 18 Jun 2021 20:04:28 +0000 (16:04 -0400)]
c++: avoid redundant scope in diagnostics
We can make some function signatures shorter to print by omitting redundant
nested-name-specifiers in the rest of the declarator.
gcc/cp/ChangeLog:
* error.c (current_dump_scope): New variable.
(dump_scope): Check it.
(dump_function_decl): Set it.
gcc/testsuite/ChangeLog:
* g++.dg/diagnostic/scope1.C: New test.
Jonathan Wakely [Fri, 3 Dec 2021 20:50:50 +0000 (20:50 +0000)]
Fix typos in libstdc++-v3/ChangeLog
Martin Liska [Fri, 3 Dec 2021 17:58:25 +0000 (11:58 -0600)]
rs6000: Fix up flag_shrink_wrap handling in presence of -mrop-protect [PR101324]
PR101324 shows a problem in disabling shrink-wrapping when using -mrop-protect
when there is an attribute optimize/pragma. The fix involves moving the handling
of flag_shrink_wrap so it gets re-disabled when we change or add options.
2021-12-03 Martin Liska <mliska@suse.cz>
gcc/
PR target/101324
* config/rs6000/rs6000.c (rs6000_option_override_internal): Move the
disabling of shrink-wrapping when using -mrop-protect from here...
(rs6000_override_options_after_change): ...to here.
2021-12-03 Peter Bergner <bergner@linux.ibm.com>
gcc/testsuite/
PR target/101324
* gcc.target/powerpc/pr101324.c: New test.
Peter Bergner [Fri, 3 Dec 2021 17:46:22 +0000 (11:46 -0600)]
rs6000: testsuite: Add rop_ok effective-target function
This patch adds a new effective-target function that tests whether
it is safe to emit the ROP-protect instructions and updates the
ROP test cases to use it.
2021-12-03 Peter Bergner <bergner@linux.ibm.com>
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_rop_ok): New function.
* gcc.target/powerpc/rop-1.c: Use it.
* gcc.target/powerpc/rop-2.c: Likewise.
* gcc.target/powerpc/rop-3.c: Likewise.
* gcc.target/powerpc/rop-4.c: Likewise.
* gcc.target/powerpc/rop-5.c: Likewise.
Harald Anlauf [Thu, 2 Dec 2021 21:33:49 +0000 (22:33 +0100)]
Fortran: improve checking of array specifications
gcc/fortran/ChangeLog:
PR fortran/103505
* array.c (match_array_element_spec): Try to simplify array
element specifications to improve early checking.
* expr.c (gfc_try_simplify_expr): New. Try simplification of an
expression via gfc_simplify_expr. When an error occurs, roll
back.
* gfortran.h (gfc_try_simplify_expr): Declare it.
gcc/testsuite/ChangeLog:
PR fortran/103505
* gfortran.dg/pr103505.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
Marek Polacek [Wed, 1 Dec 2021 02:07:25 +0000 (21:07 -0500)]
c++: Fix for decltype(auto) and parenthesized expr [PR103403]
In r11-4758, I tried to fix this problem:
int &&i = 0;
decltype(auto) j = i; // should behave like int &&j = i; error
wherein do_auto_deduction was getting confused with a REFERENCE_REF_P
and it didn't realize its operand was a name, not an expression, and
deduced the wrong type.
Unfortunately that fix broke this:
int&& r = 1;
decltype(auto) rr = (r);
where 'rr' should be 'int &' since '(r)' is an expression, not a name. But
because I stripped the INDIRECT_REF with the r11-4758 change, we deduced
'rr's type as if decltype had gotten a name, resulting in 'int &&'.
I suspect I thought that the REF_PARENTHESIZED_P check when setting
'bool id' in do_auto_deduction would handle the (r) case, but that's not
the case; while the documentation for REF_PARENTHESIZED_P specifically says
it can be set in INDIRECT_REF, we don't actually do so.
This patch sets REF_PARENTHESIZED_P even on REFERENCE_REF_P, so that
do_auto_deduction can use it.
It also removes code in maybe_undo_parenthesized_ref that I think is
dead -- and we don't hit it while running dg.exp. To adduce more data,
it also looks dead here:
https://splichal.eu/lcov/gcc/cp/semantics.c.gcov.html
(It's dead since r9-1417.)
Also add a fixed test for c++/81176.
PR c++/103403
gcc/cp/ChangeLog:
* cp-gimplify.c (cp_fold): Don't recurse if maybe_undo_parenthesized_ref
doesn't change its argument.
* pt.c (do_auto_deduction): Don't strip REFERENCE_REF_P trees if they
are REF_PARENTHESIZED_P. Use stripped_init when checking for
id-expression.
* semantics.c (force_paren_expr): Set REF_PARENTHESIZED_P on
REFERENCE_REF_P trees too.
(maybe_undo_parenthesized_ref): Remove dead code.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/decltype-auto2.C: New test.
* g++.dg/cpp1y/decltype-auto3.C: New test.
* g++.dg/cpp1y/decltype-auto4.C: New test.
* g++.dg/cpp1z/decomp-decltype1.C: New test.
H.J. Lu [Tue, 16 Nov 2021 02:52:56 +0000 (18:52 -0800)]
x86: Add -mmove-max=bits and -mstore-max=bits
Add -mmove-max=bits and -mstore-max=bits to enable 256-bit/512-bit moves
and stores, independent of -mprefer-vector-width=bits:
1. Add X86_TUNE_AVX512_MOVE_BY_PIECES and X86_TUNE_AVX512_STORE_BY_PIECES,
which are enabled for Intel Sapphire Rapids processors.
2. Add -mmove-max=bits to set the maximum number of bits that can be moved
from memory to memory efficiently. The default value is derived from
X86_TUNE_AVX512_MOVE_BY_PIECES, X86_TUNE_AVX256_MOVE_BY_PIECES, and the
preferred vector width.
3. Add -mstore-max=bits to set the maximum number of bits that can be stored
to memory efficiently. The default value is derived from
X86_TUNE_AVX512_STORE_BY_PIECES, X86_TUNE_AVX256_STORE_BY_PIECES, and the
preferred vector width.
gcc/
PR target/103269
* config/i386/i386-expand.c (ix86_expand_builtin): Pass PVW_NONE
and PVW_NONE to ix86_target_string.
* config/i386/i386-options.c (ix86_target_string): Add arguments
for move_max and store_max.
(ix86_target_string::add_vector_width): New lambda.
(ix86_debug_options): Pass ix86_move_max and ix86_store_max to
ix86_target_string.
(ix86_function_specific_print): Pass ptr->x_ix86_move_max and
ptr->x_ix86_store_max to ix86_target_string.
(ix86_valid_target_attribute_tree): Handle x_ix86_move_max and
x_ix86_store_max.
(ix86_option_override_internal): Set the default x_ix86_move_max
and x_ix86_store_max.
* config/i386/i386-options.h (ix86_target_string): Add
prefer_vector_width and prefer_vector_width.
* config/i386/i386.h (TARGET_AVX256_MOVE_BY_PIECES): Removed.
(TARGET_AVX256_STORE_BY_PIECES): Likewise.
(MOVE_MAX): Use 64 if ix86_move_max or ix86_store_max ==
PVW_AVX512. Use 32 if ix86_move_max or ix86_store_max >=
PVW_AVX256.
(STORE_MAX_PIECES): Use 64 if ix86_store_max == PVW_AVX512.
Use 32 if ix86_store_max >= PVW_AVX256.
* config/i386/i386.opt: Add -mmove-max=bits and -mstore-max=bits.
* config/i386/x86-tune.def (X86_TUNE_AVX512_MOVE_BY_PIECES): New.
(X86_TUNE_AVX512_STORE_BY_PIECES): Likewise.
* doc/invoke.texi: Document -mmove-max=bits and -mstore-max=bits.
gcc/testsuite/
PR target/103269
* gcc.target/i386/pieces-memcpy-17.c: New test.
* gcc.target/i386/pieces-memcpy-18.c: Likewise.
* gcc.target/i386/pieces-memcpy-19.c: Likewise.
* gcc.target/i386/pieces-memcpy-20.c: Likewise.
* gcc.target/i386/pieces-memcpy-21.c: Likewise.
* gcc.target/i386/pieces-memset-45.c: Likewise.
* gcc.target/i386/pieces-memset-46.c: Likewise.
* gcc.target/i386/pieces-memset-47.c: Likewise.
* gcc.target/i386/pieces-memset-48.c: Likewise.
* gcc.target/i386/pieces-memset-49.c: Likewise.
Bill Schmidt [Thu, 2 Dec 2021 22:29:12 +0000 (16:29 -0600)]
rs6000: Fix use of wrong enum for built-in function code
I discovered this bug while working on patches to remove the old built-ins
infrastructure. I missed a spot in converting from the rs6000_builtins enum to
the rs6000_gen_builtins_enum. This fixes it. The fix is technically not right
if new_builtins_are_enabled were to be set to zero, but we're not going to do
that anymore, and the remnants of that code will be removed shortly.
2021-12-02 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000.c (rs6000_builtin_reciprocal): Fix builtin
identifiers.
H.J. Lu [Fri, 3 Dec 2021 17:00:54 +0000 (09:00 -0800)]
x86: Scan leal in PR target/83782 tests for x32
Update PR target/83782 tests to scan leal for x32 to fix:
FAIL: gcc.target/i386/pr83782-1.c scan-assembler leaq[ \\t]foo\\(%rip\\),[ \\t]%rax
FAIL: gcc.target/i386/pr83782-2.c scan-assembler leaq[ \\t]foo\\(%rip\\),[ \\t]%rax
PR target/83782
* gcc.target/i386/pr83782-1.c: Also scan leal for x32.
* gcc.target/i386/pr83782-2.c: Likewise.
SiYu Wu [Mon, 22 Nov 2021 08:19:10 +0000 (16:19 +0800)]
RISC-V: Add implied defines of Zk, Zkn and Zks
gcc/ChangeLog:
2021-11-22 SiYu Wu <siyu@isrc.iscas.ac.cn>
* common/config/riscv/riscv-common.c (riscv_implied_info):
Add K-ext related entry.
(riscv_supported_std_ext): Add 'k'.
* config/riscv/arch-canonicalize (CANONICAL_ORDER): Add 'k'.
(IMPLIED_EXT): Add K-ext related entry.