test_jj.git
2 years agox86: PR target/103611: Splitter for DST:DI = (HI:SI<<32)|LO:SI.
Roger Sayle [Sat, 18 Dec 2021 13:51:56 +0000 (13:51 +0000)]
x86: PR target/103611: Splitter for DST:DI = (HI:SI<<32)|LO:SI.

A common idiom is to create a DImode value from the "concat" of two SImode
values, using "(long long)hi << 32 | (long long)lo", where the operation
may be ior, xor or plus.  On x86, with -m32, the high and low parts of
a DImode register are actually different SImode registers (typically %edx
and %eax) so ideally this idiom should reduce to two move instructions
(or optimally, just clever register allocation).

Unfortunately, GCC currently performs the IOR operation above on -m32,
and worse allocates DImode registers (split to SImode register pairs)
for both the zero extended HI and LO values.

Hence, for test1 from the new test case below:

typedef int __v4si __attribute__ ((__vector_size__ (16)));
long long test1(__v4si v) {
  unsigned int loVal = (unsigned int)v[0];
  unsigned int hiVal = (unsigned int)v[1];
  return (long long)(loVal) | ((long long)(hiVal) << 32);
}

we currently generate (with -m32 -O2 -msse4.1):

test1: subl    $28, %esp
        pextrd  $1, %xmm0, %eax
        pmovzxdq        %xmm0, %xmm1
        movq    %xmm1, 8(%esp)
        movl    %eax, %edx
        movl    8(%esp), %eax
        orl     12(%esp), %edx
        addl    $28, %esp
        orb     $0, %ah
        ret

with this patch we now generate:

test1: pextrd  $1, %xmm0, %edx
        movd    %xmm0, %eax
        ret

The fix is to recognize and split the idiom (hi<<32)|zext(lo) prior
to register allocation on !TARGET_64BIT, simplifying this sequence to
"highpart(dst) = hi; lowpart(dst) = lo".

The one minor complication is that sse.md's define_insn for
*vec_extractv4si_0_zext_sse4 can sometimes interfere with this
optimization.  It turns out that on !TARGET_64BIT, the zero_extend:DI
following vec_select:SI isn't free, and this insn gets split back
into multiple instructions during later passes, but too late to
be optimized away by this patch/reload.  Hence the last hunk of
this patch is to restrict *vec_extractv4si_0_zext_sse4 to TARGET_64BIT.
Checking PR target/80286, where *vec_extractv4si_0_zext_sse4 was
first added, this seems reasonable.

2021-12-18  Roger Sayle  <roger@nextmovesoftware.com>
    Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
PR target/103611
* config/i386/i386.md (any_or_plus): New code iterator.
(define_split): Split (HI<<32)|zext(LO) into piece-wise
move instructions on !TARGET_64BIT.
* config/i386/sse.md (*vec_extractv4si_0_zext_sse4):
Restrict to TARGET_64BIT.

gcc/testsuite/ChangeLog
PR target/103611
* gcc.target/i386/pr103611-2.c: New test case.

2 years agoPR target/32803: Add -Oz option for improved clang compatibility.
Roger Sayle [Sat, 18 Dec 2021 13:47:52 +0000 (13:47 +0000)]
PR target/32803: Add -Oz option for improved clang compatibility.

This patch adds support for an -Oz command line option, aggressively
optimizing for size at the expense of performance.  GCC's current -Os
provides a reasonable balance of size and performance, whereas -Oz is
probably only useful for code size benchmarks such as CSiBE.  Or so I
thought until I read in https://news.ycombinator.com/item?id=25408853
that clang's -Oz sometimes outperforms -O[23s]; I suspect modern instruction
decode stages can treat "pushq $1; popq %rax" as a short uop encoding.

Instead of introducing a new global variable, this patch simply abuses
the existing optimize_size by setting its value to 2.  The only change
in behaviour is the tweak to the i386 backend implementing the suggestion
in PR target/32803 to use a short push/pop sequence for loading small
immediate values (-128..127) on x86, matching the behaviour of LLVM.

On x86_64, the simple function:
int foo() { return 25; }

currently generates with -Os:
foo:    movl    $25, %eax       // 5 bytes
        ret

With the proposed -Oz, it generates:
foo:    pushq   $25             // 2 bytes
        popq    %rax            // 1 byte
        ret

On CSiBE, this results in a 0.94% improvement (3703513 bytes total
down to 3668516 bytes).

2021-12-18  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
PR target/32803
* common.opt (Oz): New command line option.
* doc/invoke.texi: Document the new -Oz option.
* lto-wrapper.c (merge_and_complain, append_compiler_options):
Treat OPT_Oz as synonymous with OPT_Os.
* optc-save-gen.awk: Increase maximum value of optimize_size to 2.
* opts.c (default_options_optimization) [OPT_Oz]: Handle OPT_Oz
just like OPT_Os, except set opt->x_optimize_size to 2.
(common_handle_option): Skip OPT_Oz just like OPT_Os.

* config/i386/i386.md (*movdi_internal): Use a push/pop sequence
for suitable SImode TYPE_IMOV moves when optimize_size > 1.
(*movsi_internal): Likewise.

gcc/testsuite/ChangeLog
PR target/32803
* gcc.target/i386/pr32803.c: New test case.

2 years agotree-optimization/103759: Use sizetype everywhere for object sizes
Siddhesh Poyarekar [Sat, 18 Dec 2021 11:16:43 +0000 (16:46 +0530)]
tree-optimization/103759: Use sizetype everywhere for object sizes

Since all computations in tree-object-size are now done in sizetype and
not HOST_WIDE_INT, comparisons with HOST_WIDE_INT based unknown and
initval would be incorrect.  Instead, use the sizetype trees directly to
generate and evaluate initval and unknown size values.

gcc/ChangeLog:

PR tree-optimization/103759
* tree-object-size.c (unknown, initval): Remove functions.
(size_unknown, size_initval, size_unknown_p): Operate directly
on trees.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
2 years agoFortran: Cast arguments of <ctype.h> functions to unsigned char
François-Xavier Coudert [Thu, 16 Dec 2021 17:38:30 +0000 (18:38 +0100)]
Fortran: Cast arguments of <ctype.h> functions to unsigned char

Functions from <ctype.h> should only be called on values that can be
represented by unsigned char. On targets where char is a signed type,
some of libgfortran calls have undefined behaviour.

The solution is to cast the argument to unsigned char type. I’ve defined
macros in libgfortran.h to do so, to retain legibility of the library
code.

PR libfortran/95177

libgfortran/ChangeLog

* libgfortran.h: include ctype.h, provide safe macros.
* io/format.c: use safe macros.
* io/list_read.c: use safe macros.
* io/read.c: use safe macros.
* io/write.c: use safe macros.
* runtime/environ.c: use safe macros.

2 years agoDarwin: Future-proof and homogeneize detection of darwin versions
François-Xavier Coudert [Fri, 17 Dec 2021 18:30:36 +0000 (19:30 +0100)]
Darwin: Future-proof and homogeneize detection of darwin versions

The current GCC branch will become 12.1.0, which will be the stable
version of GCC when the next macOS version is released. There are some
places in GCC that don’t handle darwin22 as a version, so we need to
future-proof it (gcc/config.gcc and gcc/config/darwin-driver.c). We
align that code with what Apple clang does, i.e. accept all potential
major macOS versions until 99.

This patch also homogenises the handling of darwin version numbers,
where the majority of places use darwin2*, but some used darwin2[0-9]*.
Since there never was a darwin2.x version, the two are equivalent, and
we prefer the simpler darwin2*

gcc/ChangeLog:

* config/darwin-driver.c: Make version code more future-proof.
* config.gcc: Homogeneize darwin versions.
* configure.ac: Homogeneize darwin versions.
* configure: Regenerate.

gcc/testsuite/ChangeLog:

* gcc.dg/darwin-minversion-link.c: Test darwin21.
* obj-c++.dg/cxx-ivars-3.mm: Homogeneize darwin versions.
* obj-c++.dg/objc-gc-3.mm: Homogeneize darwin versions.
* objc.dg/objc-gc-4.m: Homogeneize darwin versions.

2 years agoDaily bump.
GCC Administrator [Sat, 18 Dec 2021 00:16:23 +0000 (00:16 +0000)]
Daily bump.

2 years agoattribs: Fix wrong error with -Wno-attribute=A::b [PR103649]
Marek Polacek [Thu, 16 Dec 2021 21:29:41 +0000 (16:29 -0500)]
attribs: Fix wrong error with -Wno-attribute=A::b [PR103649]

My patch to implement -Wno-attribute=A::b caused a bogus error when
parsing

  [[foo::bar(1, 2)]];

when -Wno-attributes=foo::bar was specified on the command line, because
when we create a fake foo::bar attribute and insert it into our attribute
table, it is created with max_length == 0 which doesn't allow any args.
That is wrong -- we know nothing about the attribute, so we shouldn't
require any specific number of arguments.  And since unknown attributes
can be rather complex (see for example omp::{directive,sequence}), we
must skip parsing their arguments.  To that end, I'm using max_length
with value -2.

Also let's not warn about things like

  [[vendor::assume(true)]];

because they may have some meaning (this is reminiscent of C++ Portable
Assumptions).

PR c/103649

gcc/ChangeLog:

* attribs.c (handle_ignored_attributes_option): Create the fake
attribute with max_length == -2.
(attribute_ignored_p): New overloads.
* attribs.h (attribute_ignored_p): Declare them.
* tree-core.h (struct attribute_spec): Document that max_length
can be -2.

gcc/c/ChangeLog:

* c-decl.c (c_warn_unused_attributes): Don't warn for
attribute_ignored_p.
* c-parser.c (c_parser_std_attribute): Skip parsing of the attribute
arguments when the attribute is ignored.

gcc/cp/ChangeLog:

* parser.c (cp_parser_declaration): Don't warn for attribute_ignored_p.
(cp_parser_std_attribute): Skip parsing of the attribute
arguments when the attribute is ignored.

gcc/testsuite/ChangeLog:

* c-c++-common/Wno-attributes-6.c: New test.

2 years agotestsuite: update expected results for ilp32.
David Edelsohn [Fri, 17 Dec 2021 22:16:19 +0000 (17:16 -0500)]
testsuite: update expected results for ilp32.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/fold-vec-insert-float-p9.c

2 years agoAdd -mdejagnu-cpu=power7 to dg-options for pr97142.c
Olivier Hainque [Sat, 9 Oct 2021 13:22:12 +0000 (13:22 +0000)]
Add -mdejagnu-cpu=power7 to dg-options for pr97142.c

To match the tests expectations for toolchains
configured to default to not so capable cpus.

2021-12-17  Olivier Hainque  <hainque@adacore.com>

gcc/testsuite/
* gcc.target/powerpc/pr97142.c: Add -mdejagnu-cpu=power7
to the dg-options.

2 years agoc++: Improve diagnostic for class tmpl/class redecl [PR103749]
Marek Polacek [Thu, 16 Dec 2021 19:57:07 +0000 (14:57 -0500)]
c++: Improve diagnostic for class tmpl/class redecl [PR103749]

For code like

  template<typename>
  struct bar;

  struct bar {
    int baz;
  };

  bar var;

we emit a fairly misleading and unwieldy diagnostic:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ g++ -c u.cc
u.cc:6:8: error: template argument required for 'struct bar'
    6 | struct bar {
      |        ^~~
u.cc:10:5: error: class template argument deduction failed:
   10 | bar var;
      |     ^~~
u.cc:10:5: error: no matching function for call to 'bar()'
u.cc:3:17: note: candidate: 'template<class> bar()-> bar< <template-parameter-1-1> >'
    3 |   friend struct bar;
      |                 ^~~
u.cc:3:17: note:   template argument deduction/substitution failed:
u.cc:10:5: note:   couldn't deduce template parameter '<template-parameter-1-1>'
   10 | bar var;
      |     ^~~
u.cc:3:17: note: candidate: 'template<class> bar(bar< <template-parameter-1-1> >)-> bar< <template-parameter-1-1> >'
    3 |   friend struct bar;
      |                 ^~~
u.cc:3:17: note:   template argument deduction/substitution failed:
u.cc:10:5: note:   candidate expects 1 argument, 0 provided
   10 | bar var;
      |     ^~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

but with this patch we get:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
z.C:4:10: error: class template 'bar' redeclared as non-template
    4 |   struct bar {
      |          ^~~
z.C:2:10: note: previous declaration here
    2 |   struct bar;
      |          ^~~
z.C:8:7: error: 'bar<...auto...> var' has incomplete type
    8 |   bar var;
      |       ^~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

which is clearer about what the problem is.

I thought it'd be nice to avoid printing the messages about failed CTAD,
too.  To that end, I'm using CLASSTYPE_ERRONEOUS to suppress CTAD.  Not
sure if that's entirely kosher.

The other direction (first a non-template class declaration followed by
a class template definition) we handle quite well:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
z.C:11:8: error: 'bar' is not a template
   11 | struct bar {};
      |        ^~~
z.C:8:8: note: previous declaration here
    8 | struct bar;
      |        ^~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

PR c++/103749

gcc/cp/ChangeLog:

* decl.c (lookup_and_check_tag): Give an error when a class was
declared as template but no template header has been provided.
* pt.c (do_class_deduction): Don't deduce CLASSTYPE_ERRONEOUS
types.

gcc/testsuite/ChangeLog:

* g++.dg/template/redecl4.C: Adjust dg-error.
* g++.dg/diagnostic/redeclaration-2.C: New test.

2 years agors6000: Update darn testcases
Segher Boessenkool [Fri, 17 Dec 2021 17:01:16 +0000 (17:01 +0000)]
rs6000: Update darn testcases

Make the darn testcases work (and be tested) in 32-bit mode as well.
They used to ICE, but they no longer do.

2021-12-17  Segher Boessenkool <segher@kernel.crashing.org>

gcc/testsuite/
PR target/103624
* gcc.target/powerpc/darn-0.c: Remove target clause.
* gcc.target/powerpc/darn-1.c: Remove target clause. Remove lp64
requirement.  Change return type to long.
* gcc.target/powerpc/darn-2.c: Ditto.
* gcc.target/powerpc/darn-3.c: Remove target clause.

2 years agors6000: Redo darn (PR103624)
Segher Boessenkool [Fri, 17 Dec 2021 16:59:55 +0000 (16:59 +0000)]
rs6000: Redo darn (PR103624)

The builtins now all return "long".  The patterns have :GPR as the
output mode, so they can be 32-bit as well (the instruction makes sense
in 32 bit just fine).  The builtins expand to the DImode version
normally, but to the SImode if {32bit} is true.

2021-12-17  Segher Boessenkool <segher@kernel.crashing.org>

PR target/103624
* config/rs6000/rs6000-builtins.def (__builtin_darn): Expand to
darn_64_di.  Add {32bit} attribute.  Return long.
(__builtin_darn_32): Expand to darn_32_di.  Add {32bit} attribute.
Return long.
(__builtin_darn_raw): Expand to darn_raw_di.  Add {32bit} attribute.
Return long.
* config/rs6000/rs6000-call.c (rs6000_expand_builtin): Expand the darn
builtins to the _si variants for -m32.
* config/rs6000/rs6000.md (UNSPECV_DARN_32, UNSPECV_DARN_RAW): Delete.
(UNSPECV_DARN): Update comment.
(darn_32, darn_raw, darn): Delete.
(darn_32_<mode>, darn_64_<mode>, darn_raw_<mode> for GPR): New.
(@darn<mode> for GPR): New.

2 years agocoroutines: Handle initial awaiters with non-void returns [PR 100127].
Iain Sandoe [Sat, 2 Oct 2021 16:20:08 +0000 (17:20 +0100)]
coroutines: Handle initial awaiters with non-void returns [PR 100127].

The way in which a C++20 coroutine is specified discards any value
that might be returned from the initial or final await expressions.

This ICE was caused by an initial await expression with an
await_resume () returning a reference, the function rewrite code
was not set up to expect this.

Fixed by looking through any indirection present and by explicitly
discarding the value, if any, returned by await_resume().

It does not seem useful to make a diagnostic for this, since
the user could define a generic awaiter that usefully returns
values when used in a different position from the initial (or
final) await expressions.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/100127

gcc/cp/ChangeLog:

* coroutines.cc (coro_rewrite_function_body): Handle initial
await expressions that try to produce a reference value.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr100127.C: New test.

2 years agocoroutines: Pass lvalues to user-defined operator new [PR 100772].
Iain Sandoe [Sun, 3 Oct 2021 18:46:09 +0000 (19:46 +0100)]
coroutines: Pass lvalues to user-defined operator new [PR 100772].

The wording of the standard has been clarified to be explicit that
the the parameters to any user-defined operator-new in the promise
class should be lvalues.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/100772

gcc/cp/ChangeLog:

* coroutines.cc (morph_fn_to_coro): Convert function parms
from reference before constructing any operator-new args
list.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr100772-a.C: New test.
* g++.dg/coroutines/pr100772-b.C: New test.

2 years agocoroutines, c++: Add test for PR 96517.
Iain Sandoe [Fri, 15 Oct 2021 08:42:25 +0000 (09:42 +0100)]
coroutines, c++: Add test for PR 96517.

This PR was fixed by r12-5255-gdaa9c6b015, this adds
the testcase.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:

PR c++/96517
* g++.dg/coroutines/pr96517.C: New test.

2 years agors6000: Fix fake vec_promote overload
Bill Schmidt [Fri, 17 Dec 2021 16:39:00 +0000 (10:39 -0600)]
rs6000: Fix fake vec_promote overload

rs6000-overload.def defines one instance of vec_promote so that it can be
registered with the front end.  Actual expansion of the vec_promote overload
is done with special-case code in rs6000-c.c.  During another cleanup, I
observed that the fake instance has the wrong number of arguments.  Fix that.

2021-12-17  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-overload.def (__builtin_vec_promote): Add second
argument.

2 years agotestsuite: pragma-optimize.c requires ifunc.
David Edelsohn [Fri, 17 Dec 2021 14:45:20 +0000 (09:45 -0500)]
testsuite: pragma-optimize.c requires ifunc.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pragma-optimize.c: Require ifunc support.

2 years agovect: Fix multi-vector SLP gather loads [PR103744]
Richard Sandiford [Fri, 17 Dec 2021 14:18:39 +0000 (14:18 +0000)]
vect: Fix multi-vector SLP gather loads [PR103744]

This PR shows that I didn't properly test the multi-vector case when
adding support for SLP gather loads.  The patch fixes that case using
the same approach as we do for non-SLP cases: keep the scalar base
the same, but iterate through the (also multi-vector) vector offsets.
“vec_num * j + i” is already used elsewhere as a way of handling both
the multi-vector SLP case and the multi-vector non-SLP case.

gcc/
PR tree-optimization/103744
* tree-vect-stmts.c (vectorizable_load): Handle multi-vector
SLP gather loads.

gcc/testsuite/
PR tree-optimization/103744
* gcc.dg/vect/pr103744-1.c: New test.
* gcc.dg/vect/pr103744-2.c: Likewise.

2 years agodocs: fix option name reference
Martin Liska [Fri, 17 Dec 2021 13:56:52 +0000 (14:56 +0100)]
docs: fix option name reference

gcc/ChangeLog:

* doc/invoke.texi: Rename to -fstack-protector.

2 years agodocs: Fix spelling issues in -fipa-strict-aliasing.
Martin Liska [Fri, 17 Dec 2021 13:33:35 +0000 (14:33 +0100)]
docs: Fix spelling issues in -fipa-strict-aliasing.

gcc/ChangeLog:

* doc/invoke.texi: Fix spelling issues.

2 years agoslp: check that the operation we're combing is a boolean operation [PR103741]
Tamar Christina [Fri, 17 Dec 2021 10:59:25 +0000 (10:59 +0000)]
slp: check that the operation we're combing is a boolean operation [PR103741]

It seems I forgot to check that the operation we're combing when masking the
predicated together are actually predicates types.

Without it we end up accidentally trying to combine a value and a mask.

gcc/ChangeLog:

PR tree-optimization/103741
* tree-vect-stmts.c (vectorizable_operation): Check for boolean.

gcc/testsuite/ChangeLog:

PR tree-optimization/103741
* gcc.target/aarch64/pr103741.c: New test.

2 years agolibgcc, Darwin: Add missing build dependencies.
Iain Sandoe [Wed, 15 Dec 2021 14:11:58 +0000 (14:11 +0000)]
libgcc, Darwin: Add missing build dependencies.

There was a race condition where the link for the new shared EH library
(only used on earlier Darwin) could fail because the new crts had not been
copied to the gcc directory.  This can cause a build failure (although
currently only seen on powerpc-darwin).

Fixed by adding specific dependency on the crts and on the multi target.
We also add the declaration header for the Darwin10 unwinder shim to the
powerpc cases, since we build that there for Rosetta use.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libgcc/ChangeLog:

* config.host: Add shim declaration header to powerpc*-darwin builds.
* config/rs6000/t-darwin-ehs: Remove dependency on the powerpc end
file.
* config/t-darwin-ehs: Add dependencies to the shared unwinder
objects.
* config/t-slibgcc-darwin: Add extra_parts to the dependencies for
the shared EH lib.  Add all-multi to the dependencies for the
libgcc_s.1.dylib redirections.

2 years agoSync config.sub: 2021-10-27
Martin Liska [Fri, 17 Dec 2021 08:56:21 +0000 (09:56 +0100)]
Sync config.sub: 2021-10-27

ChangeLog:

* config.sub: Sync from master.

2 years agoDarwin, Driver: Avoid a link line for empty commands.
Iain Sandoe [Wed, 15 Dec 2021 20:25:27 +0000 (20:25 +0000)]
Darwin, Driver: Avoid a link line for empty commands.

We were pushing a spec value for weak_reference_mismatches unconditionally
which is not needed (the value was the default) and the side-effect of
this was that we appeared to need to drive a link command; leading to
unexpected diagnostics for cases where gcc was invoked with an empty
command line.

Also we were pushing flags for sysroot, os minimum version and controls
even if the command line was empty.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:

* config/darwin-driver.c (darwin_driver_init): Exit from the
option handling early if the command line is definitely enpty.
* config/darwin.h (SUBTARGET_DRIVER_SELF_SPECS): Remove
setting for the default content of weak_reference_mismatches.

2 years agoDarwin, ppc: Additional change for r12-5974.
Iain Sandoe [Fri, 17 Dec 2021 09:29:02 +0000 (09:29 +0000)]
Darwin, ppc: Additional change for r12-5974.

This adds a missed change from r12-5974-g926d64906af.
The builin_decls array has been renamed to drop the trailing
_x that was used during the main changes to the builtins.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:

* config/rs6000/darwin.h: Drop trailing _x from the
builtin_decls array name.

2 years agoRevert "Fixed typo"
Martin Liska [Fri, 17 Dec 2021 08:54:53 +0000 (09:54 +0100)]
Revert "Fixed typo"

This reverts commit 06cd44b4387a9f6ab46f377f42ee5be9cf11bf15.

2 years agoAdd combine splitter to transform vpternlogd/vpcmpeqd/vpxor/vblendvps to vblendvps...
Haochen Jiang [Thu, 2 Dec 2021 07:30:17 +0000 (15:30 +0800)]
Add combine splitter to transform vpternlogd/vpcmpeqd/vpxor/vblendvps to vblendvps for ~op0

gcc/ChangeLog:

PR target/100738
* config/i386/sse.md (*avx_cmp<mode>3_lt, *avx_cmp<mode>3_ltint):
Remove MEM_P restriction and add force_reg for operands[2].
(*avx_cmp<mode>3_ltint_not): Add new define_insn_and_split.

gcc/testsuite/ChangeLog:

PR target/100738
* g++.target/i386/avx512vl-pr100738-1.C: New test.

2 years ago__builtin_dynamic_object_size: Recognize builtin
Siddhesh Poyarekar [Fri, 17 Dec 2021 04:04:44 +0000 (09:34 +0530)]
__builtin_dynamic_object_size: Recognize builtin

Recognize the __builtin_dynamic_object_size builtin and add paths in the
object size path to deal with it, but treat it like
__builtin_object_size for now.  Also add tests to provide the same
testing coverage for the new builtin name.

gcc/ChangeLog:

* builtins.def (BUILT_IN_DYNAMIC_OBJECT_SIZE): New builtin.
* tree-object-size.h: Move object size type bits enum from
tree-object-size.c and add new value OST_DYNAMIC.
* builtins.c (expand_builtin, fold_builtin_2): Handle it.
(fold_builtin_object_size): Handle new builtin and adjust for
change to compute_builtin_object_size.
* tree-object-size.c: Include builtins.h.
(compute_builtin_object_size): Adjust.
(early_object_sizes_execute_one,
dynamic_object_sizes_execute_one): New functions.
(object_sizes_execute): Rename insert_min_max_p argument to
early.  Handle BUILT_IN_DYNAMIC_OBJECT_SIZE and call the new
functions.
* doc/extend.texi (__builtin_dynamic_object_size): Document new
builtin.

gcc/testsuite/ChangeLog:

* g++.dg/ext/builtin-dynamic-object-size1.C: New test.
* g++.dg/ext/builtin-dynamic-object-size2.C: Likewise.
* gcc.dg/builtin-dynamic-alloc-size.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-1.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-10.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-11.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-12.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-13.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-14.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-15.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-16.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-17.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-18.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-19.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-2.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-3.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-4.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-5.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-6.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-7.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-8.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-9.c: Likewise.
* gcc.dg/builtin-object-size-16.c: Adjust to allow inclusion
from builtin-dynamic-object-size-16.c.
* gcc.dg/builtin-object-size-17.c: Likewise.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
2 years agotree-object-size: Use trees and support negative offsets
Siddhesh Poyarekar [Fri, 17 Dec 2021 01:37:18 +0000 (07:07 +0530)]
tree-object-size: Use trees and support negative offsets

Transform tree-object-size to operate on tree objects instead of host
wide integers.  This makes it easier to extend to dynamic expressions
for object sizes.

The compute_builtin_object_size interface also now returns a tree
expression instead of HOST_WIDE_INT, so callers have been adjusted to
account for that.

The trees in object_sizes are each an object_size object with members
size (the bytes from the pointer to the end of the object) and wholesize
(the size of the whole object).  This allows analysis of negative
offsets, which can now be allowed to the extent of the object bounds.
Tests have been added to verify that it actually works.

gcc/ChangeLog:

* tree-object-size.h (compute_builtin_object_size): Return tree
instead of HOST_WIDE_INT.
* builtins.c (fold_builtin_object_size): Adjust.
* gimple-fold.c (gimple_fold_builtin_strncat): Likewise.
* ubsan.c (instrument_object_size): Likewise.
* tree-object-size.c (object_size): New structure.
(object_sizes): Change type to vec<object_size>.
(initval): New function.
(unknown): Use it.
(size_unknown_p, size_initval, size_unknown): New functions.
(object_sizes_unknown_p): Use it.
(object_sizes_get): Return tree.
(object_sizes_initialize): Rename from object_sizes_set_force
and set VAL parameter type as tree.  Add new parameter WHOLEVAL.
(object_sizes_set): Set VAL parameter type as tree and adjust
implementation.  Add new parameter WHOLEVAL.
(size_for_offset): New function.
(decl_init_size): Adjust comment.
(addr_object_size): Change PSIZE parameter to tree and adjust
implementation.  Add new parameter PWHOLESIZE.
(alloc_object_size): Return tree.
(compute_builtin_object_size): Return tree in PSIZE.
(expr_object_size, call_object_size, unknown_object_size):
Adjust for object_sizes_set change.
(merge_object_sizes): Drop OFFSET parameter and adjust
implementation for tree change.
(plus_stmt_object_size): Call collect_object_sizes_for directly
instead of merge_object_size and call size_for_offset to get net
size.
(cond_expr_object_size, collect_object_sizes_for,
object_sizes_execute): Adjust for change of type from
HOST_WIDE_INT to tree.
(check_for_plus_in_loops_1): Likewise and skip non-positive
offsets.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-object-size-1.c (test9): New test.
(main): Call it.
* gcc.dg/builtin-object-size-2.c (test8): New test.
(main): Call it.
* gcc.dg/builtin-object-size-3.c (test9): New test.
(main): Call it.
* gcc.dg/builtin-object-size-4.c (test8): New test.
(main): Call it.
* gcc.dg/builtin-object-size-5.c (test5, test6, test7): New
tests.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
2 years agoc++: tweak comment
Jason Merrill [Wed, 15 Dec 2021 22:14:54 +0000 (17:14 -0500)]
c++: tweak comment

The comment documented a parameter that no longer exists.

gcc/cp/ChangeLog:

* constraint.cc (deduce_concept_introduction): Adjust comment.

2 years agoc++: layout of aggregate base with DMI [PR103681]
Jason Merrill [Tue, 14 Dec 2021 22:00:40 +0000 (17:00 -0500)]
c++: layout of aggregate base with DMI [PR103681]

C++14 changed the definition of 'aggregate' to allow default member
initializers, but such classes still need to be considered "non-POD for the
purpose of layout" for ABI compatibility with C++11 code.  It seems rare to
derive from such a class, as evidenced by how long this bug has
survived (since r216750 in 2014), but it's certainly worth fixing.

We only warn when we were failing to allocate another field into the
tail padding of the newly aggregate class; this is the only ABI impact.

This also changes end_of_class to consider all data members, not just empty
data members; that used to be an additional flag, removed in r9-5710, but I
don't see any reason not to always include them.  This makes the result of
the function correspond to the ABI nvsize term and its nameless counterpart
that does include virtual bases.

When looking closely at other users of end_of_class, I realized that we were
assuming that the latter corresponded to the ABI dsize term, but it doesn't
if the class ends with an empty virtual base (in the rare case that the
empty base can't be assigned offset 0), and this matters for layout of
[[no_unique_address]].  So I added another mode that returns the desired
value for that case.  I'm not adding a warning for this ABI fix because it's
a C++20 feature.

PR c++/103681

gcc/ChangeLog:

* common.opt (fabi-version): Add v17.

gcc/cp/ChangeLog:

* cp-tree.h (struct lang_type): Add non_pod_aggregate.
(CLASSTYPE_NON_POD_AGGREGATE): New.
* class.c (check_field_decls): Set it.
(check_bases_and_members): Check it.
(check_non_pod_aggregate): New.
(enum eoc_mode): New.
(end_of_class): Always include non-empty fields.
Add eoc_nv_or_dsize mode.
(include_empty_classes, layout_class_type): Adjust.

gcc/c-family/ChangeLog:

* c-opts.c (c_common_post_options): Update defaults.

gcc/testsuite/ChangeLog:

* g++.dg/abi/macro0.C: Update value.
* g++.dg/abi/no_unique_address6.C: New test.
* g++.dg/abi/nsdmi-aggr1.C: New test.
* g++.dg/abi/nsdmi-aggr1a.C: New test.

2 years agoDaily bump.
GCC Administrator [Fri, 17 Dec 2021 00:16:20 +0000 (00:16 +0000)]
Daily bump.

2 years agoTestsuite: Tweak gcc.dg/20021029-1.c for nios2.
Sandra Loosemore [Thu, 16 Dec 2021 23:17:14 +0000 (15:17 -0800)]
Testsuite: Tweak gcc.dg/20021029-1.c for nios2.

This test needs to be built with -G0 on nios2 to avoid treating the array
as small data.

2021-12-16  Sandra Loosemore  <sandra@codesourcery.com>

gcc/testsuite/
* gcc.dg/20021029-1.c: Build with -G0 for nios2.

2 years agoc++: delayed noexcept in member function template [PR99980]
Marek Polacek [Wed, 15 Dec 2021 22:41:53 +0000 (17:41 -0500)]
c++: delayed noexcept in member function template [PR99980]

Some time ago I noticed that we don't properly delay parsing of
noexcept for member function templates.  This patch fixes that.

It didn't work because even though we set CP_PARSER_FLAGS_DELAY_NOEXCEPT
in cp_parser_member_declaration, member template declarations take
a different path: we call cp_parser_template_declaration and return
prior to setting the flag.

PR c++/99980

gcc/cp/ChangeLog:

* parser.c (cp_parser_single_declaration): Maybe pass
CP_PARSER_FLAGS_DELAY_NOEXCEPT down to cp_parser_init_declarator.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept71.C: New test.

2 years agoCheck for class type before assuming a type is one [PR103703].
Martin Sebor [Thu, 16 Dec 2021 22:11:45 +0000 (15:11 -0700)]
Check for class type before assuming a type is one [PR103703].

Resolves:
PR c++/103703 - ICE with -Wmismatched-tags with friends and templates

gcc/cp/ChangeLog:

PR c++/103703
* parser.c (class_decl_loc_t::diag_mismatched_tags): Check for class
type before assuming a type is one.

gcc/testsuite/ChangeLog:

PR c++/103703
* g++.dg/warn/Wmismatched-tags-9.C: New test.

2 years agoFix member alignment for all targets [PR103751].
Martin Sebor [Thu, 16 Dec 2021 19:28:03 +0000 (12:28 -0700)]
Fix member alignment for all targets [PR103751].

Resolves:
PR testsuite/103751 - FAIL: gcc.dg/Warray-bounds-48.c (test for excess errors)

gcc/testsuite/ChangeLog:
PR testsuite/103751
* gcc.dg/Warray-bounds-48.c: Fix member alignment.

2 years agoc++: two-stage name lookup for overloaded operators [PR51577]
Patrick Palka [Thu, 16 Dec 2021 18:40:42 +0000 (13:40 -0500)]
c++: two-stage name lookup for overloaded operators [PR51577]

In order to properly implement two-stage name lookup for dependent
operator expressions, we need to remember the result of unqualified
lookup of the operator at template definition time, and reuse that
result rather than performing another unqualified lookup at
instantiation time.

Ideally we could just store the lookup in the expression directly, but
as pointed out in r9-6405 this isn't really possible since we use the
standard tree codes to represent most dependent operator expressions.

We could perhaps create a new tree code to represent dependent operator
expressions, with enough operands to store the lookup along with
everything else, but that'd require a lot of careful work to make sure
we handle this new tree code properly across the frontend.

But currently type-dependent operator (and call) expressions are given
an empty TREE_TYPE, which dependent_type_p treats as dependent, so this
field is effectively unused except to signal that the expression is
type-dependent.  It'd be convenient if we could store the lookup there
while preserving the dependent-ness of the expression.

To that end, this patch creates a new kind of type, called
DEPENDENT_OPERATOR_TYPE, which we give to dependent operator expressions
and into which we can store the result of operator lookup at template
definition time (DEPENDENT_OPERATOR_TYPE_SAVED_LOOKUPS).  Since this
type is always dependent (by definition), and since the frontend doesn't
seem to care much about the exact type of a type-dependent expression,
using this type in place of a NULL_TREE type seems to "just work"; only
dependent_type_p and WILDCARD_TYPE_P need to be adjusted to return true
for this new type.

The rest of the patch mostly consists of adding the necessary plumbing
to pass DEPENDENT_OPERATOR_TYPE_SAVED_LOOKUPS to add_operator_candidates,
adjusting all callers of build_x_* appropriately, and removing the now
unnecessary push_operator_bindings mechanism.

In passing, this patch simplifies finish_constraint_binary_op to avoid
using build_x_binary_op for building a binary constraint-expr; we don't
need to consider operator overloads here, as the &&/|| inside a
constraint effectively always has the built-in meaning (since atomic
constraints must have bool type).

This patch also makes FOLD_EXPR_OP yield a tree_code instead of a raw
INTEGER_CST.

Finally, this patch adds the XFAILed test operator-8.C which is about
broken two-stage name lookup for rewritten non-dependent operator
expressions, an existing bug that's otherwise only documented in
build_new_op.

PR c++/51577
PR c++/83035
PR c++/100465

gcc/cp/ChangeLog:

* call.c (add_operator_candidates): Add lookups parameter.
Use it to avoid performing a second unqualified lookup when
instantiating a dependent operator expression.
(build_new_op): Add lookups parameter and pass it appropriately.
* constraint.cc (finish_constraint_binary_op): Use
build_min_nt_loc instead of build_x_binary_op.
* coroutines.cc (build_co_await): Adjust call to build_new_op.
* cp-objcp-common.c (cp_common_init_ts): Mark
DEPENDENT_OPERATOR_TYPE appropriately.
* cp-tree.def (DEPENDENT_OPERATOR_TYPE): Define.
* cp-tree.h (WILDCARD_TYPE_P): Accept DEPENDENT_OPERATOR_TYPE.
(FOLD_EXPR_OP_RAW): New, renamed from ...
(FOLD_EXPR_OP): ... this.  Change this to return the tree_code directly.
(DEPENDENT_OPERATOR_TYPE_SAVED_LOOKUPS): Define.
(templated_operator_saved_lookups): Define.
(build_new_op): Add lookups parameter.
(build_dependent_operator_type): Declare.
(build_x_indirect_ref): Add lookups parameter.
(build_x_binary_op): Likewise.
(build_x_unary_op): Likewise.
(build_x_compound_expr): Likewise.
(build_x_modify_expr): Likewise.
* cxx-pretty-print.c (get_fold_operator): Adjust after
FOLD_EXPR_OP change.
* decl.c (start_preparsed_function): Don't call
push_operator_bindings.
* decl2.c (grok_array_decl): Adjust calls to build_new_op.
* method.c (do_one_comp): Likewise.
(build_comparison_op): Likewise.
* module.cc (trees_out::type_node): Handle DEPENDENT_OPERATOR_TYPE.
(trees_in::tree_node): Likewise.
* name-lookup.c (lookup_name): Revert r11-2876 change.
(op_unqualified_lookup): Remove.
(maybe_save_operator_binding): Remove.
(discard_operator_bindings): Remove.
(push_operator_bindings): Remove.
* name-lookup.h (maybe_save_operator_binding): Remove.
(push_operator_bindings): Remove.
(discard_operator_bindings): Remove.
* parser.c (cp_parser_unary_expression): Adjust calls to build_x_*.
(cp_parser_binary_expression): Likewise.
(cp_parser_assignment_expression): Likewise.
(cp_parser_expression): Likewise.
(do_range_for_auto_deduction): Likewise.
(cp_convert_range_for): Likewise.
(cp_parser_perform_range_for_lookup): Likewise.
(cp_parser_template_argument): Likewise.
(cp_parser_omp_for_cond): Likewise.
(cp_parser_omp_for_incr): Likewise.
(cp_parser_omp_for_loop_init): Likewise.
(cp_convert_omp_range_for): Likewise.
(cp_finish_omp_range_for): Likewise.
* pt.c (fold_expression): Adjust after FOLD_EXPR_OP change. Pass
templated_operator_saved_lookups to build_x_*.
(tsubst_omp_for_iterator): Adjust call to build_x_modify_expr.
(tsubst_expr) <case COMPOUND_EXPR>: Pass
templated_operator_saved_lookups to build_x_*.
(tsubst_copy_and_build) <case INDIRECT_REF>: Likewise.
<case tcc_unary>: Likewise.
<case tcc_binary>: Likewise.
<case MODOP_EXPR>: Likewise.
<case COMPOUND_EXPR>: Likewise.
(dependent_type_p_r): Return true for DEPENDENT_OPERATOR_TYPE.
* ptree.c (cxx_print_type): Handle DEPENDENT_OPERATOR_TYPE.
* semantics.c (finish_increment_expr): Adjust call to
build_x_unary_op.
(finish_unary_op_expr): Likewise.
(handle_omp_for_class_iterator): Adjust calls to build_x_*.
(finish_omp_cancel): Likewise.
(finish_unary_fold_expr): Use build_dependent_operator_type.
(finish_binary_fold_expr): Likewise.
* tree.c (cp_free_lang_data): Don't call discard_operator_bindings.
* typeck.c (rationalize_conditional_expr): Adjust call to
build_x_binary_op.
(op_unqualified_lookup): Define.
(build_dependent_operator_type): Define.
(build_x_indirect_ref): Add lookups parameter and use
build_dependent_operator_type.
(build_x_binary_op): Likewise.
(build_x_array_ref): Likewise.
(build_x_unary_op): Likewise.
(build_x_compound_expr_from_list): Adjust call to
build_x_compound_expr.
(build_x_compound_expr_from_vec): Likewise.
(build_x_compound_expr): Add lookups parameter and use
build_dependent_operator_type.
(cp_build_modify_expr): Adjust call to build_new_op.
(build_x_modify_expr): Add lookups parameter and use
build_dependent_operator_type.
* typeck2.c (build_x_arrow): Adjust call to build_new_op.

libcc1/ChangeLog:

* libcp1plugin.cc (plugin_build_unary_expr): Adjust call to
build_x_unary_op.
(plugin_build_binary_expr): Adjust call to build_x_binary_op.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/operator-3.C: Split out operator overload
declarations into ...
* g++.dg/lookup/operator-3-ops.h: ... here.
* g++.dg/lookup/operator-3a.C: New test.
* g++.dg/lookup/operator-4.C: New test.
* g++.dg/lookup/operator-4a.C: New test.
* g++.dg/lookup/operator-5.C: New test.
* g++.dg/lookup/operator-5a.C: New test.
* g++.dg/lookup/operator-6.C: New test.
* g++.dg/lookup/operator-7.C: New test.
* g++.dg/lookup/operator-8.C: New test.

2 years agoi386: Enable VxHF vector modes lower ABI levels [PR103571]
Uros Bizjak [Thu, 16 Dec 2021 18:34:50 +0000 (19:34 +0100)]
i386: Enable VxHF vector modes lower ABI levels [PR103571]

Enable VxHF vector modes for SSE2, AVX and AVX512F ABIs.

2021-12-16  Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/103571
* config/i386/i386.h (VALID_AVX256_REG_MODE): Add V16HFmode.
(VALID_AVX256_REG_OR_OI_VHF_MODE): Replace with ...
(VALID_AVX256_REG_OR_OI_MODE): ... this.  Remove V16HFmode.
(VALID_AVX512F_SCALAR_MODE): Remove HImode and HFmode.
(VALID_AVX512FP16_SCALAR_MODE): New.
(VALID_AVX512F_REG_MODE): Add V32HFmode.
(VALID_SSE2_REG_MODE): Add V8HFmode, V4HFmode and V2HFmode.
(VALID_SSE2_REG_VHF_MODE): Remove.
(VALID_INT_MODE_P): Add V2HFmode.
* config/i386/i386.c (function_arg_advance_64):
Remove explicit mention of V16HFmode and V32HFmode.
(ix86_hard_regno_mode_ok): Remove explicit mention of XImode
and V32HFmode, use VALID_AVX512F_REG_OR_XI_MODE instead.
Use VALID_AVX512FP_SCALAR_MODE for TARGET_aVX512FP16.
Use VALID_AVX256_REG_OR_OI_MODE instead of
VALID_AVX256_REG_OR_OI_VHF_MODE and VALID_SSE2_REG_MODE instead
of VALID_SSE2_REG_VHF_MODE.
(ix86_set_reg_reg_cost): Remove usge of VALID_AVX512FP16_REG_MODE.
(ix86_vector_mode_supported): Ditto.

gcc/testsuite/ChangeLog:

PR target/103571
* gcc.target/i386/pr102812.c (dg-final): Do not scan for movdqa.

2 years agoFixed typo
Matthias Seidel [Wed, 23 Jun 2021 18:35:24 +0000 (20:35 +0200)]
Fixed typo

ChangeLog:

* config.sub: Fix typo.

2 years agoopts: do not do sanity check when an error is seen
Martin Liska [Thu, 16 Dec 2021 12:33:00 +0000 (13:33 +0100)]
opts: do not do sanity check when an error is seen

PR target/103709

gcc/c-family/ChangeLog:

* c-pragma.c (handle_pragma_pop_options): Do not check
global options modification when an error is seen in parsing
of options (pragmas or attributes).

2 years agopragma: respect pragma in lambda functions
Martin Liska [Wed, 15 Dec 2021 16:27:56 +0000 (17:27 +0100)]
pragma: respect pragma in lambda functions

In g:01ad8c54fdca we started supporting target pragma changes
that are primarily caused by optimization option. The same can happen
in the opposite way and we need to check for changes both
in optimization_current_node and target_option_current_node.

PR c++/103696

gcc/ChangeLog:

* attribs.c (decl_attributes): Check if
target_option_current_node is changed.

gcc/testsuite/ChangeLog:

* g++.target/i386/pr103696.C: New test.

2 years agoFix FLUSH IOSTAT value
Francois-Xavier Coudert [Thu, 16 Dec 2021 14:33:17 +0000 (15:33 +0100)]
Fix FLUSH IOSTAT value

PR libfortran/101255

libgfortran/ChangeLog:

* io/file_pos.c: Fix error code.

gcc/testsuite/ChangeLog:

* gfortran.dg/iostat_5.f90: New file.

2 years agoFix timezone handling near year boundaries
Francois-Xavier Coudert [Thu, 16 Dec 2021 11:40:03 +0000 (12:40 +0100)]
Fix timezone handling near year boundaries

PR libfortran/98507

libgfortran/ChangeLog:

* intrinsics/time_1.h: Prefer clock_gettime() over
  gettimeofday().
* intrinsics/date_and_time.c: Fix timezone wrapping.

gcc/testsuite/ChangeLog:

* gfortran.dg/date_and_time_1.f90: New file.

2 years agodocs: add missing leading dash for option.
Martin Liska [Thu, 16 Dec 2021 14:21:36 +0000 (15:21 +0100)]
docs: add missing leading dash for option.

gcc/ChangeLog:

* doc/invoke.texi: Add missing dash.

2 years agors6000: Refactor altivec_build_resolved_builtin
Bill Schmidt [Thu, 16 Dec 2021 13:47:58 +0000 (07:47 -0600)]
rs6000: Refactor altivec_build_resolved_builtin

While replacing the built-in machinery, we agreed to defer some necessary
refactoring of the overload processing.  This patch cleans it up considerably.

I've put in one FIXME for an additional level of cleanup that should be done
independently.  The various helper functions (resolve_VEC_*) can be simplified
if we move the argument processing in altivec_resolve_overloaded_builtin
earlier.  But this requires making nontrivial changes to those functions that
will need careful review.  Let's do that in a later patch.

2021-12-16  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-c.c (resolution): New enum.
(resolve_vec_mul): New function.
(resolve_vec_cmpne): Likewise.
(resolve_vec_adde_sube): Likewise.
(resolve_vec_addec_subec): Likewise.
(resolve_vec_splats): Likewise.
(resolve_vec_extract): Likewise.
(resolve_vec_insert): Likewise.
(resolve_vec_step): Likewise.
(find_instance): Likewise.
(altivec_resolve_overloaded_builtin): Many cleanups.  Call factored-out
functions.  Move variable declarations closer to uses.  Add commentary.
Remove unnecessary levels of braces.  Avoid use of gotos.  Change
misleading variable names.  Use switches over if-else-if chains.

2 years agoaarch64: fix: ls64 tests fail on aarch64_be [PR103729]
Przemyslaw Wirkus [Thu, 16 Dec 2021 10:49:00 +0000 (10:49 +0000)]
aarch64: fix: ls64 tests fail on aarch64_be [PR103729]

This patch is sorting issue with LS64 intrinsics tests failing with
AArch64_be targets.

gcc/ChangeLog:

PR target/103729
* config/aarch64/aarch64-simd.md (aarch64_movv8di): Allow big endian
targets to move V8DI.

2 years agoFor -foffload= suggest also 'disable' and 'default' [PR103644]
Tobias Burnus [Thu, 16 Dec 2021 10:19:37 +0000 (11:19 +0100)]
For -foffload= suggest also 'disable' and 'default' [PR103644]

gcc/ChangeLog:

PR driver/103644
* gcc.c (check_offload_target_name): Add 'default' and 'disable'
to the candidate list.

2 years agoRevert "Sync with binutils: GCC: Pass --plugin to AR and RANLIB"
H.J. Lu [Thu, 16 Dec 2021 04:45:58 +0000 (20:45 -0800)]
Revert "Sync with binutils: GCC: Pass --plugin to AR and RANLIB"

This reverts commit bf8cdd35117dea2049abbeebcdf14de11b323ef7.

2 years agoVerbose support in analyze_brprob_spec
Xionghu Luo [Thu, 16 Dec 2021 01:58:10 +0000 (19:58 -0600)]
Verbose support in analyze_brprob_spec

Also add verbose argument support like analyze_brprob.py

contrib/ChangeLog:

* analyze_brprob_spec.py: Add verbose argument.

2 years agoDaily bump.
GCC Administrator [Thu, 16 Dec 2021 00:16:28 +0000 (00:16 +0000)]
Daily bump.

2 years agoc++: Allow constexpr decltype(auto) [PR102229]
Marek Polacek [Fri, 10 Dec 2021 20:38:35 +0000 (15:38 -0500)]
c++: Allow constexpr decltype(auto) [PR102229]

My r11-2202 was trying to enforce [dcl.type.auto.deduct]/4, which says
"If the placeholder-type-specifier is of the form type-constraint[opt]
decltype(auto), T shall be the placeholder alone."  But this made us
reject 'constexpr decltype(auto)', which, after clarification from CWG,
should be valid.  [dcl.type.auto.deduct]/4 is supposed to be a syntactic
constraint, not semantic, so it's OK that the constexpr marks the object
as const.

As a consequence, checking TYPE_QUALS in do_auto_deduction is too late,
and we have a FIXME there anyway.  So in this patch I'm attempting to
detect 'const decltype(auto)' earlier.  If I'm going to use TYPE_QUALS,
it needs to happen before we mark the object as const due to constexpr,
that is, before grokdeclarator's

  /* A `constexpr' specifier used in an object declaration declares
     the object as `const'.  */
  if (constexpr_p && innermost_code != cdk_function)
    ...

Constrained decltype(auto) was a little problem, hence the TYPENAME
check.  But in a typename context you can't use decltype(auto) anyway,
I think.

PR c++/102229

gcc/cp/ChangeLog:

* decl.c (check_decltype_auto): New.
(grokdeclarator): Call it.
* pt.c (do_auto_deduction): Don't check decltype(auto) here.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/decltype-auto5.C: New test.

2 years agotestsuite: Be more informative for ICEs
Thomas Schwinge [Fri, 10 Dec 2021 18:08:26 +0000 (19:08 +0100)]
testsuite: Be more informative for ICEs

For example, for the two (FAIL, XFAIL)
'gcc/testsuite/lib/gcc-dg.exp:gcc-dg-test-1' cases:

    -FAIL: g++.dg/modules/xtreme-header-3_a.H -std=c++17 (internal compiler error)
    +FAIL: g++.dg/modules/xtreme-header-3_a.H -std=c++17 (internal compiler error: tree check: expected var_decl or function_decl or field_decl or type_decl or concept_decl or template_decl, have namespace_decl in get_merge_kind, at cp/module.cc:10072)

    -FAIL: gfortran.dg/gomp/clauses-1.f90   -O  (internal compiler error)
    +FAIL: gfortran.dg/gomp/clauses-1.f90   -O  (internal compiler error: Segmentation fault)

    -XFAIL: c-c++-common/goacc/kernels-decompose-ice-1.c (internal compiler error)
    +XFAIL: c-c++-common/goacc/kernels-decompose-ice-1.c (internal compiler error: in lower_omp_target, at omp-low.c:13147)

    -XFAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++17 (internal compiler error)
    +XFAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++17 (internal compiler error: in cxx_eval_constant_expression, at cp/constexpr.c:6954)

That allows for more easily spotting when during development you're trading one
ICE for another.

gcc/testsuite/
* lib/fortran-torture.exp (fortran-torture-compile)
(fortran-torture-execute): Be more informative for ICEs.
* lib/gcc-defs.exp (${tool}_check_compile): Likewise.
* lib/gcc-dg.exp (gcc-dg-test-1): Likewise.
* lib/go-torture.exp (go-torture-compile, go-torture-execute):
Likewise.

2 years agoSync with binutils: Support the PGO build for binutils+gdb
H.J. Lu [Sat, 13 Nov 2021 14:11:41 +0000 (06:11 -0800)]
Sync with binutils: Support the PGO build for binutils+gdb

Sync with binutils for building binutils with LTO:

1dbde357be3 Add missing changes to Makefile.tpl
af019bfde9b Support the PGO build for binutils+gdb

Add the --enable-pgo-build[=lto] configure option.  When binutils+gdb
is not built together with GCC, --enable-pgo-build enables the PGO build:

1. First build with -fprofile-generate.
2. Use "make maybe-check-*" to generate profiling data and pass -i to make
to ignore errors when generating profiling data.
3. Use "make clean" to remove the previous build.
4. Rebuild with -fprofile-use.

With --enable-pgo-build=lto, -flto=jobserver -ffat-lto-objects are used
together with -fprofile-generate and -fprofile-use.  Add '+' to the command
line for recursive make to support -flto=jobserver -ffat-lto-objects.

NB: --enable-pgo-build=lto enables the PGO build with LTO while
--enable-lto enables LTO support in toolchain.

BZ binutils/26766
* Makefile.tpl (BUILD_CFLAGS): New.
(CFLAGS): Append $(BUILD_CFLAGS).
(CXXFLAGS): Likewise.
(PGO_BUILD_GEN_FLAGS_TO_PASS): New.
(PGO_BUILD_TRAINING_CFLAGS): Likewise.
(PGO_BUILD_TRAINING_CXXFLAGS): Likewise.
(PGO_BUILD_TRAINING_FLAGS_TO_PASS): Likewise.
(PGO_BUILD_TRAINING_MFLAGS): Likewise.
(PGO_BUILD_USE_FLAGS_TO_PASS): Likewise.
(PGO-TRAINING-TARGETS): Likewise.
(PGO_BUILD_TRAINING): Likewise.
(all): Add '+' to the command line for recursive make.  Support
the PGO build.
* configure.ac: Add --enable-pgo-build[=lto].
AC_SUBST PGO_BUILD_GEN_CFLAGS, PGO_BUILD_USE_CFLAGS and
PGO_BUILD_LTO_CFLAGS.  Enable the PGO build in Makefile.
* Makefile.in: Regenerated.
* configure: Likewise.

2 years agoSync with binutils: GCC: Pass --plugin to AR and RANLIB
H.J. Lu [Sat, 13 Nov 2021 14:11:41 +0000 (06:11 -0800)]
Sync with binutils: GCC: Pass --plugin to AR and RANLIB

Sync with binutils for building binutils with LTO:

50ad1254d50 GCC: Pass --plugin to AR and RANLIB

Detect GCC LTO plugin.  Pass --plugin to AR and RANLIB to support LTO
build.

ChangeLog:

* Makefile.tpl (AR): Add @AR_PLUGIN_OPTION@
(RANLIB): Add @RANLIB_PLUGIN_OPTION@.
* configure.ac: Include config/gcc-plugin.m4.
AC_SUBST AR_PLUGIN_OPTION and RANLIB_PLUGIN_OPTION.
* libtool.m4 (_LT_CMD_OLD_ARCHIVE): Pass --plugin to AR and
RANLIB if possible.
* Makefile.in: Regenerated.
* configure: Likewise.

config/

* gcc-plugin.m4 (GCC_PLUGIN_OPTION): New.

libiberty/

* Makefile.in (AR): Add @AR_PLUGIN_OPTION@
(RANLIB): Add @RANLIB_PLUGIN_OPTION@.
(configure_deps): Depend on ../config/gcc-plugin.m4.
* configure.ac: AC_SUBST AR_PLUGIN_OPTION and
RANLIB_PLUGIN_OPTION.
* aclocal.m4: Regenerated.
* configure: Likewise.

zlib/

* configure: Regenerated.

2 years agolibstdc++: Overload std::__to_address for __gnu_cxx::__normal_iterator.
François Dumont [Wed, 6 Oct 2021 04:55:19 +0000 (06:55 +0200)]
libstdc++: Overload std::__to_address for __gnu_cxx::__normal_iterator.

Prefer to overload __to_address to partially specialize std::pointer_traits because
std::pointer_traits would be mostly useless. Moreover partial specialization of
pointer_traits<__normal_iterator<P, C>> fails to rebind C, so you get incorrect types
like __normal_iterator<long*, vector<int>>. In the case of __gnu_debug::_Safe_iterator
the to_pointer method is impossible to implement correctly because we are missing
the parent container to associate the iterator to.

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h
(std::pointer_traits<__gnu_cxx::__normal_iterator<>>): Remove.
(std::__to_address(const __gnu_cxx::__normal_iterator<>&)): New for C++11 to C++17.
* include/debug/safe_iterator.h
(std::__to_address(const __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<>,
_Sequence>&)): New for C++11 to C++17.
* testsuite/24_iterators/normal_iterator/to_address.cc: Add check on std::vector::iterator
to validate both __gnu_cxx::__normal_iterator<> __to_address overload in normal mode and
__gnu_debug::_Safe_iterator in _GLIBCXX_DEBUG mode.

2 years agod: Merge upstream dmd 93108bb9e, druntime 6364e010, phobos 575b67a9b.
Iain Buclaw [Wed, 15 Dec 2021 18:47:02 +0000 (19:47 +0100)]
d: Merge upstream dmd 93108bb9e, druntime 6364e010, phobos 575b67a9b.

D front-end changes:

    - Import dmd v2.098.1-beta.1.
    - Default extern(C++) compatibility to C++17.

Druntime changes:

    - Import druntime v2.098.1-beta.1.
    - Fix definition of stat_t on MIPS64 (PR103604)

Phobos changes:

    - Import phobos v2.098.1-beta.1.

gcc/d/ChangeLog:

* d-lang.cc (d_init_options): Set default -fextern-std= to C++17.
* dmd/MERGE: Merge upstream dmd 93108bb9e.
* gdc.texi (Runtime Options): Document the default for -fextern-std=.

libphobos/ChangeLog:

PR d/103604
* configure: Regenerate.
* configure.ac (libtool_VERSION): Update to 3:0:0.
* libdruntime/MERGE: Merge upstream druntime 6364e010.
* src/MERGE: Merge upstream phobos 575b67a9b.
* testsuite/libphobos.traits/all_satisfy.d: New test.
* testsuite/libphobos.traits/traits.exp: New test.

2 years agoAdd new test [PR78969].
Martin Sebor [Wed, 15 Dec 2021 15:43:02 +0000 (08:43 -0700)]
Add new test [PR78969].

gcc/testsuite/ChangeLog:
PR tree-optimization/78969
* gcc.dg/tree-ssa/builtin-snprintf-warn-6.c: New test.

2 years agoconfigure: Account CXXFLAGS in gcc-plugin.m4.
Iain Sandoe [Wed, 8 Dec 2021 20:58:34 +0000 (20:58 +0000)]
configure: Account CXXFLAGS in gcc-plugin.m4.

We now use a C++ compiler so that we need to process
CXXFLAGS as well as CFLAGS in the gcc-plugin config
fragment.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
config/ChangeLog:

* gcc-plugin.m4: Save and process CXXFLAGS.

gcc/ChangeLog:

* configure: Regenerate.

libcc1/ChangeLog:

* configure: Regenerate.

2 years agonvptx: Add -misa=sm_75 and -misa=sm_80
Roger Sayle [Sun, 12 Dec 2021 17:03:03 +0000 (18:03 +0100)]
nvptx: Add -misa=sm_75 and -misa=sm_80

Add new target macros TARGET_SM75 and TARGET_SM80.  Add support for
__builtin_tanhf, HFmode exp2/tanh and also for HFmode min/max, controlled by
TARGET_SM75 and TARGET_SM80 respectively.

The following has been tested on nvptx-none, hosted on x86_64-pc-linux-gnu
with a "make" and "make -k check" with no new failures.

gcc/ChangeLog:

* config/nvptx/nvptx-opts.h (ptx_isa): PTX_ISA_SM75 and PTX_ISA_SM80
ISA levels.
* config/nvptx/nvptx.opt: Add sm_75 and sm_80 to -misa.
* config/nvptx/nvptx.h (TARGET_SM75, TARGET_SM80):
New helper macros to conditionalize functionality on target ISA.
* config/nvptx/nvptx-c.c (nvptx_cpu_cpp_builtins): Add __PTX_SM__
support for the new ISA levels.
* config/nvptx/nvptx.c (nvptx_file_start): Add support for TARGET_SM75
and TARGET_SM80.
* config/nvptx/nvptx.md (define_c_enum "unspec"): New UNSPEC_TANH.
(define_mode_iterator HSFM): New iterator for HFmode and SFmode.
(exp2hf2): New define_insn controlled by TARGET_SM75.
(tanh<mode>2): New define_insn controlled by TARGET_SM75.
(sminhf3, smaxhf3): New define_isnns controlled by TARGET_SM80.

gcc/testsuite/ChangeLog:

* gcc.target/nvptx/float16-2.c: New test case.
* gcc.target/nvptx/tanh-1.c: New test case.

2 years ago[nvptx] Add -mptx=7.0
Tom de Vries [Wed, 15 Dec 2021 13:37:58 +0000 (14:37 +0100)]
[nvptx] Add -mptx=7.0

Add support for ptx isa version 7.0, required for the addition of -misa=sm_75
and -misa=sm_80.

Tested by setting the default ptx isa version to 7.0, and doing a build and
libgomp test run.

gcc/ChangeLog:

* config/nvptx/nvptx-opts.h (enum ptx_version): Add PTX_VERSION_7_0.
* config/nvptx/nvptx.c (nvptx_file_start): Handle TARGET_PTX_7_0.
* config/nvptx/nvptx.h (TARGET_PTX_7_0): New macro.
* config/nvptx/nvptx.opt (ptx_version): Add 7.0.

2 years agoaarch64: Don't classify vector pairs as short vectors [PR103094]
Richard Sandiford [Wed, 15 Dec 2021 12:19:00 +0000 (12:19 +0000)]
aarch64: Don't classify vector pairs as short vectors [PR103094]

In this PR we were wrongly classifying a pair of 8-byte vectors
as a 16-byte “short vector” (in the AAPCS64 sense).  As the
comment in the patch says, this stems from an old condition
in aarch64_short_vector_p that is too loose, but that would
be difficult to tighten now.

We can still do the right thing for the newly-added modes though,
since there are no backwards compatibility concerns there.

Co-authored-by: Tamar Christina <tamar.christina@arm.com>
gcc/
PR target/103094
* config/aarch64/aarch64.c (aarch64_short_vector_p): Return false
for structure modes, rather than ignoring the type in that case.

gcc/testsuite/
PR target/103094
* gcc.target/aarch64/pr103094.c: New test.

2 years agoc++: Fix warning word splitting [PR103713]
Martin Liska [Wed, 15 Dec 2021 11:09:28 +0000 (12:09 +0100)]
c++: Fix warning word splitting [PR103713]

PR c++/103713

gcc/cp/ChangeLog:

* tree.c (maybe_warn_parm_abi): Fix warning word splitting.

2 years agomiddle-end: REE should always check all vector usages, even if it finds a defining...
Tamar Christina [Wed, 15 Dec 2021 10:26:10 +0000 (10:26 +0000)]
middle-end: REE should always check all vector usages, even if it finds a defining def. [PR103350]

This and the report in PR103632 are caused by a bug in REE where it generates
incorrect code.

It's trying to eliminate the following zero extension

(insn 54 90 102 2 (set (reg:V4SI 33 v1)
        (zero_extend:V4SI (reg/v:V4HI 40 v8)))
     (nil))

by folding it in the definition of `v8`:

(insn 2 5 104 2 (set (reg/v:V4HI 40 v8)
        (reg:V4HI 32 v0 [156]))
     (nil))

which is fine, except that `v8` is also used by the extracts, e.g.:

(insn 11 10 12 2 (set (reg:SI 1 x1)
        (zero_extend:SI (vec_select:HI (reg/v:V4HI 40 v8)
                (parallel [
                        (const_int 3)
                    ]))))
     (nil))

REE replaces insn 2 by folding insn 54 and placing it at the definition site of
insn 2, so before insn 11.

Trying to eliminate extension:
(insn 54 90 102 2 (set (reg:V4SI 33 v1)
        (zero_extend:V4SI (reg/v:V4HI 40 v8)))
     (nil))
Tentatively merged extension with definition (copy needed):
(insn 2 5 104 2 (set (reg:V4SI 33 v1)
        (zero_extend:V4SI (reg:V4HI 32 v0)))
     (nil))

to produce

(insn 2 5 110 2 (set (reg:V4SI 33 v1)
        (zero_extend:V4SI (reg:V4HI 32 v0)))
     (nil))
(insn 110 2 104 2 (set (reg:V4SI 40 v8)
        (reg:V4SI 33 v1))
     (nil))

The new insn 2 using v0 directly is correct, but the insn 110 it creates is
wrong, `v8` should still be V4HI.

or it also needs to eliminate the zero extension from the extracts, so instead
of

(insn 11 10 12 2 (set (reg:SI 1 x1)
        (zero_extend:SI (vec_select:HI (reg/v:V4HI 40 v8)
                (parallel [
                        (const_int 3)
                    ]))))
     (nil))

it should be

(insn 11 10 12 2 (set (reg:SI 1 x1)
        (vec_select:SI (reg/v:V4SI 40 v8)
                (parallel [
                        (const_int 3)
                    ])))
     (nil))

without doing so the indices have been remapped in the extension and so we
extract the wrong elements

At any other optimization level but -Os ree seems to abort so this doesn't
trigger:

Trying to eliminate extension:
(insn 54 90 101 2 (set (reg:V4SI 32 v0)
        (zero_extend:V4SI (reg/v:V4HI 40 v8)))
     (nil))
Elimination opportunities = 2 realized = 0

purely due to the ordering of instructions. REE doesn't check uses of `v8`
because it assumes that with a zero extended value, you still have access to the
lower bits by using the the bottom part of the register.

This is true for scalar but not for vector.  This would have been fine as well
if REE had eliminated the zero_extend on insn 11 and the rest but it doesn't do
so since REE can only handle cases where the SRC value are REG_P.

It does try to do this in add_removable_extension:

 1160      /* For vector mode extensions, ensure that all uses of the
 1161         XEXP (src, 0) register are in insn or debug insns, as unlike
 1162         integral extensions lowpart subreg of the sign/zero extended
 1163         register are not equal to the original register, so we have
 1164         to change all uses or none and the current code isn't able
 1165         to change them all at once in one transaction.  */

However this code doesn't trigger for the example because REE doesn't check the
uses if the defining instruction doesn't feed into another extension..

Which is bogus. For vectors it should always check all usages.

r12-2288-g8695bf78dad1a42636775843ca832a2f4dba4da3 simply exposed this as it now
lowers VEC_SELECT 0 into the RTL canonical form subreg 0 which causes REE to run
more often.

gcc/ChangeLog:

PR rtl-optimization/103350
* ree.c (add_removable_extension): Don't stop at first definition but
inspect all.

gcc/testsuite/ChangeLog:

PR rtl-optimization/103350
* gcc.target/aarch64/pr103350-1.c: New test.
* gcc.target/aarch64/pr103350-2.c: New test.

2 years agotestsuite: Fix up cpp23/auto-fncast11.C testcase [PR103408]
Jakub Jelinek [Wed, 15 Dec 2021 10:18:14 +0000 (11:18 +0100)]
testsuite: Fix up cpp23/auto-fncast11.C testcase [PR103408]

This test fails:
+FAIL: g++.dg/cpp23/auto-fncast11.C  -std=c++2b  (test for errors, line 19)
+FAIL: g++.dg/cpp23/auto-fncast11.C  -std=c++2b (test for excess errors)
because the regex in dg-error was missing an indefinite article.

2021-12-15  Jakub Jelinek  <jakub@redhat.com>

PR c++/103408
* g++.dg/cpp23/auto-fncast11.C: Fix expected diagnostic wording.

2 years agodwarf2cfi: Improve cfa_reg comparisons [PR103619]
Jakub Jelinek [Wed, 15 Dec 2021 09:41:02 +0000 (10:41 +0100)]
dwarf2cfi: Improve cfa_reg comparisons [PR103619]

On Tue, Dec 14, 2021 at 10:32:21AM -0700, Jeff Law wrote:
> I think the attached testcase should trigger on c6x with -mbig-endian -O2 -g

Thanks.  Finally I see what's going on.  c6x doesn't really need the CFA
with span > 1 (and I bet neither does armbe), the only reason why
dwf_cfa_reg is called is that the code in 13 cases just tries to compare
the CFA against dwf_cfa_reg (some_reg).  And that dwf_cfa_reg on some reg
that usually isn't a CFA reg results in targetm.dwarf_register_span hook
call, which on targets like c6x or armeb and others for some registers
creates a PARALLEL with various REGs in it, then the loop with the assertion
and finally operator== which just notes that the reg is different and fails.

This seems compile time memory and time inefficient.

The following so far untested patch instead adds an extra operator== and !=
for comparison of cfa_reg with rtx, which has the most common case where it
is a different register number done early without actually invoking
dwf_cfa_reg.  This means the assertion in dwf_cfa_reg can stay as is (at
least until some big endian target needs to have hard frame pointer or stack
pointer with span > 1 as well).
I've removed a different assertion there because it is redundant - dwf_regno
already has exactly that assertion in it too.

And I've included those 2 tweaks to avoid creating a REG in GC memory when
we can use {stack,hard_frame}_pointer_rtx which is already initialized to
the same REG we need by init_emit_regs.

On Tue, Dec 14, 2021 at 03:05:37PM -0700, Jeff Law wrote:
> So if someone is unfamiliar with the underlying issues here and needs to
> twiddle dwarf2cfi, how are they supposed to know if they should compare
> directly or use dwf_cfa_reg?

Comparison without dwf_cfa_reg should be used whenever possible, because
for registers which are never CFA related that won't call
targetm.dwarf_register_span uselessly.

The only comparisons with dwf_cfa_reg I've kept are the:
            regno = dwf_cfa_reg (XEXP (XEXP (dest, 0), 0));

            if (cur_cfa->reg == regno)
              offset -= cur_cfa->offset;
            else if (cur_trace->cfa_store.reg == regno)
              offset -= cur_trace->cfa_store.offset;
            else
              {
                gcc_assert (cur_trace->cfa_temp.reg == regno);
                offset -= cur_trace->cfa_temp.offset;
              }
and
            struct cfa_reg regno = dwf_cfa_reg (XEXP (dest, 0));

            if (cur_cfa->reg == regno)
              offset = -cur_cfa->offset;
            else if (cur_trace->cfa_store.reg == regno)
              offset = -cur_trace->cfa_store.offset;
            else
              {
                gcc_assert (cur_trace->cfa_temp.reg == regno);
                offset = -cur_trace->cfa_temp.offset;
              }
and there are 2 reasons for it:
1) there is an assertion, which guarantees it must compare equal to one of
those 3 cfa related struct cfa_reg structs, so it must be some CFA related
register (so, right now, targetm.dwarf_register_span shouldn't return
non-NULL in those on anything but gcn)
2) it is compared 3 times in a row, so for the GCN case doing
            if (cur_cfa->reg == XEXP (XEXP (dest, 0), 0))
              offset -= cur_cfa->offset;
            else if (cur_trace->cfa_store.reg == XEXP (XEXP (dest, 0), 0))
              offset -= cur_trace->cfa_store.offset;
            else
              {
                gcc_assert (cur_trace->cfa_temp.reg == XEXP (XEXP (dest, 0), 0));
                offset -= cur_trace->cfa_temp.offset;
              }
could actually create more GC allocated garbage than the way it is written
now.  But doing it that way would work fine.

I think for most of the comparisons even comparing with dwf_cfa_reg would
work but be less compile time/memory efficient (e.g. those assertions that
it is equal to some CFA related cfa_reg or in any spots where only the CFA
related regs may appear in the frame related patterns).

I'm aware just of a single spot where comparison with dwf_cfa_reg doesn't
work (when the assert is in dwf_cfa_reg), that is the spot that was ICEing
on your testcase, where we save arbitrary call saved register:
      if (REG_P (src)
          && REGNO (src) != STACK_POINTER_REGNUM
          && REGNO (src) != HARD_FRAME_POINTER_REGNUM
          && cur_cfa->reg == src)

2021-12-15  Jakub Jelinek  <jakub@redhat.com>

PR debug/103619
* dwarf2cfi.c (dwf_cfa_reg): Remove gcc_assert.
(operator==, operator!=): New overloaded operators.
(dwarf2out_frame_debug_adjust_cfa, dwarf2out_frame_debug_cfa_offset,
dwarf2out_frame_debug_expr): Compare vars with cfa_reg type directly
with REG rtxes rather than with dwf_cfa_reg results on those REGs.
(create_cie_data): Use stack_pointer_rtx instead of
gen_rtx_REG (Pmode, STACK_POINTER_REGNUM).
(execute_dwarf2_frame): Use hard_frame_pointer_rtx instead of
gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM).

2 years agoi386: Fix emissing of __builtin_cpu_supports.
Martin Liska [Mon, 13 Dec 2021 14:34:30 +0000 (15:34 +0100)]
i386: Fix emissing of __builtin_cpu_supports.

PR target/103661

gcc/ChangeLog:

* config/i386/i386-builtins.c (fold_builtin_cpu): Compare to 0
as API expects that non-zero values are returned (do that
it mask == 31).
For "avx512vbmi2" argument, we return now 1 << 31, which is a
negative integer value.

2 years agoopenmp: Avoid calling operand_equal_p on OMP_CLAUSEs [PR103704]
Jakub Jelinek [Wed, 15 Dec 2021 09:27:08 +0000 (10:27 +0100)]
openmp: Avoid calling operand_equal_p on OMP_CLAUSEs [PR103704]

On OMP_CLAUSEs we reuse TREE_TYPE as CP_OMP_CLAUSE_INFO in the C++ FE.
This confuses the hashing code that operand_equal_p does when checking.
There is really no reason to compare OMP_CLAUSEs against expressions
like captured this, they will never compare equal.

2021-12-15  Jakub Jelinek  <jakub@redhat.com>

PR c++/103704
* semantics.c (finish_omp_target_clauses_r): For OMP_CLAUSEs
just walk subtrees.

* g++.dg/gomp/pr103704.C: New test.

2 years agolibstdc++: Poor man's case insensitive comparisons in time_get [PR71557]
Jakub Jelinek [Wed, 15 Dec 2021 09:25:53 +0000 (10:25 +0100)]
libstdc++: Poor man's case insensitive comparisons in time_get [PR71557]

This patch uses the same not completely correct case insensitive comparisons
as used elsewhere in the same header.  Proper comparisons that would handle
even multi-byte characters would be harder, but I don't see them implemented
in __ctype's methods.

2021-12-15  Jakub Jelinek  <jakub@redhat.com>

PR libstdc++/71557
* include/bits/locale_facets_nonio.tcc (_M_extract_via_format):
Compare characters other than format specifiers and whitespace
case insensitively.
(_M_extract_name): Compare characters case insensitively.
* testsuite/22_locale/time_get/get/char/71557.cc: New test.
* testsuite/22_locale/time_get/get/wchar_t/71557.cc: New test.

2 years agoAdd combine splitter to transform vashr/vlshr/vashl_optab to ashr/lshr/ashl_optab...
Haochen Jiang [Wed, 1 Dec 2021 08:48:28 +0000 (16:48 +0800)]
Add combine splitter to transform vashr/vlshr/vashl_optab to ashr/lshr/ashl_optab for const vector duplicate operand.

gcc/ChangeLog:

PR target/101796
* config/i386/predicates.md (const_vector_operand):
Add new predicate.
* config/i386/sse.md(<insn><mode>3<mask_name>):
Add new define_split below.

gcc/testsuite/ChangeLog:

PR target/101796
* gcc.target/i386/pr101796-1.c: New test.

2 years agoGenerate XXSPLTIDP for scalars on power10.
Michael Meissner [Wed, 15 Dec 2021 07:30:20 +0000 (02:30 -0500)]
Generate XXSPLTIDP for scalars on power10.

This patch implements XXSPLTIDP support for SF, and DF scalar constants.
The previous patch added support for vector constants.  This patch adds
the support for SFmode and DFmode scalar constants.

I added 2 new tests to test loading up SF and DF scalar constants.

2021-12-15  Michael Meissner  <meissner@the-meissners.org>

gcc/

* config/rs6000/rs6000.md (UNSPEC_XXSPLTIDP_CONST): New unspec.
(UNSPEC_XXSPLTIW_CONST): New unspec.
(movsf_hardfloat): Add support for generating XXSPLTIDP.
(mov<mode>_hardfloat32): Likewise.
(mov<mode>_hardfloat64): Likewise.
(xxspltidp_<mode>_internal): New insns.
(xxspltiw_<mode>_internal): New insns.
(splitters for SF/DFmode): Add new splitters for XXSPLTIDP.

gcc/testsuite/

* gcc.target/powerpc/vec-splat-constant-df.c: New test.
* gcc.target/powerpc/vec-splat-constant-sf.c: New test.

2 years agoGenerate XXSPLTIDP for vectors on power10.
Michael Meissner [Wed, 15 Dec 2021 07:02:24 +0000 (02:02 -0500)]
Generate XXSPLTIDP for vectors on power10.

This patch implements XXSPLTIDP support for all vector constants.  The
XXSPLTIDP instruction is given a 32-bit immediate that is converted to a vector
of two DFmode constants.  The immediate is in SFmode format, so only constants
that fit as SFmode values can be loaded with XXSPLTIDP.

The constraint (eP) added in the previous patch for XXSPLTIW is also used
for XXSPLTIDP.

DImode scalar constants are not handled.  This is due to the majority of DImode
constants will be in the GPR registers.  With vector registers, you have the
problem that XXSPLTIDP splats the double word into both elements of the
vector.  However, if TImode is loaded with an integer constant, it wants a full
128-bit constant.

SFmode and DFmode scalar constants are not handled in this patch.  The
support for for those constants will be in the next patch.

I have added a temporary switch (-msplat-float-constant) to control whether or
not the XXSPLTIDP instruction is generated.

I added 2 new tests to test loading up V2DI and V2DF vector constants.

2021-12-14  Michael Meissner  <meissner@the-meissners.org>

gcc/

* config/rs6000/predicates.md (easy_fp_constant): Add support for
generating XXSPLTIDP.
(vsx_prefixed_constant): Likewise.
(easy_vector_constant): Likewise.
* config/rs6000/rs6000-protos.h (constant_generates_xxspltidp):
New declaration.
* config/rs6000/rs6000.c (output_vec_const_move): Add support for
generating XXSPLTIDP.
(prefixed_xxsplti_p): Likewise.
(constant_generates_xxspltidp): New function.
* config/rs6000/rs6000.opt (-msplat-float-constant): New debug option.

gcc/testsuite/

* gcc.target/powerpc/pr86731-fwrapv-longlong.c: Update insn
regex for power10.
* gcc.target/powerpc/vec-splat-constant-v2df.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2di.c: New test.

2 years agoGenerate XXSPLTIW on power10.
Michael Meissner [Wed, 15 Dec 2021 06:37:08 +0000 (01:37 -0500)]
Generate XXSPLTIW on power10.

This patch adds support to automatically generate the ISA 3.1 XXSPLTIW
instruction for V8HImode, V4SImode, and V4SFmode vectors.  It does this by
adding support for vector constants that can be used, and adding a
VEC_DUPLICATE pattern to generate the actual XXSPLTIW instruction.

Add the eP constraint to recognize constants that can be loaded into
vector registers with a single prefixed instruction such as xxspltiw and
xxspltidp.

I added 4 new tests to test loading up V16QI, V8HI, V4SI, and V4SF vector
constants.

2021-12-14  Michael Meissner  <meissner@linux.ibm.com>

gcc/

* config/rs6000/constraints.md (eP): Update comment.
* config/rs6000/predicates.md (easy_fp_constant): Add support for
generating XXSPLTIW.
(vsx_prefixed_constant): New predicate.
(easy_vector_constant): Add support for
generating XXSPLTIW.
* config/rs6000/rs6000-protos.h (prefixed_xxsplti_p): New
declaration.
(constant_generates_xxspltiw): Likewise.
* config/rs6000/rs6000.c (xxspltib_constant_p): Generate XXSPLTIW
if possible instead of XXSPLTIB and sign extending the constant.
(output_vec_const_move): Add support for XXSPLTIW.
(prefixed_xxsplti_p): New function.
(constant_generates_xxspltiw): New function.
* config/rs6000/rs6000.md (prefixed attribute): Add support to
mark XXSPLTI* instructions as being prefixed.
* config/rs6000/rs6000.opt (-msplat-word-constant): New debug
switch.
* config/rs6000/vsx.md (vsx_mov<mode>_64bit): Add support for
generating XXSPLTIW or XXSPLTIDP.
(vsx_mov<mode>_32bit): Likewise.
* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
eP constraint.

gcc/testsuite/

* gcc.target/powerpc/vec-splat-constant-v16qi.c: New test.
* gcc.target/powerpc/vec-splat-constant-v4sf.c: New test.
* gcc.target/powerpc/vec-splat-constant-v4si.c: New test.
* gcc.target/powerpc/vec-splat-constant-v8hi.c: New test.
* gcc.target/powerpc/vec-splati-runnable.c: Update insn count.

2 years agoAdd LXVKQ support.
Michael Meissner [Wed, 15 Dec 2021 05:57:44 +0000 (00:57 -0500)]
Add LXVKQ support.

This patch adds support to generate the LXVKQ instruction to load specific
IEEE-128 floating point constants.

Compared to the last time I submitted this patch, I modified it so that it
uses the bit pattern of the vector to see if it can generate the LXVKQ
instruction.  This means on a little endian Power<xxx> system, the
following code will generate a LXVKQ 34,16 instruction:

    vector long long foo (void)
    {
      return (vector long long) { 0x0000000000000000, 0x8000000000000000 };
    }

because that vector pattern is the same bit pattern as -0.0F128.

2021-12-14  Michael Meissner  <meissner@the-meissners.org>

gcc/

* config/rs6000/constraints.md (eQ): New constraint.
* config/rs6000/predicates.md (easy_fp_constant): Add support for
generating the LXVKQ instruction.
(easy_vector_constant_ieee128): New predicate.
(easy_vector_constant): Add support for generating the LXVKQ
instruction.
* config/rs6000/rs6000-protos.h (constant_generates_lxvkq): New
declaration.
* config/rs6000/rs6000.c (output_vec_const_move): Add support for
generating LXVKQ.
(constant_generates_lxvkq): New function.
* config/rs6000/rs6000.opt (-mieee128-constant): New debug
option.
* config/rs6000/vsx.md (vsx_mov<mode>_64bit): Add support for
generating LXVKQ.
(vsx_mov<mode>_32bit): Likewise.
* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
eQ constraint.

gcc/testsuite/

* gcc.target/powerpc/float128-constant.c: New test.

2 years agoAdd new constant data structure.
Michael Meissner [Wed, 15 Dec 2021 05:56:25 +0000 (00:56 -0500)]
Add new constant data structure.

This patch provides the data structure and function to convert a
CONST_INT, CONST_DOUBLE, CONST_VECTOR, or VEC_DUPLICATE of a constant) to
an array of bytes, half-words, words, and  double words that can be loaded
into a 128-bit vector register.

The next patches will use this data structure to generate code that
generates load of the vector/floating point registers using the XXSPLTIDP,
XXSPLTIW, and LXVKQ instructions that were added in power10.

2021-12-15  Michael Meissner  <meissner@the-meissners.org>

gcc/

* config/rs6000/rs6000-protos.h (VECTOR_128BIT_BITS): New macro.
(VECTOR_128BIT_BYTES): Likewise.
(VECTOR_128BIT_HALF_WORDS): Likewise.
(VECTOR_128BIT_WORDS): Likewise.
(VECTOR_128BIT_DOUBLE_WORDS): Likewise.
(vec_const_128bit_type): New structure type.
(vec_const_128bit_to_bytes): New declaration.
* config/rs6000/rs6000.c (constant_int_to_128bit_vector): New
helper function.
(constant_fp_to_128bit_vector): New helper function.
(vec_const_128bit_to_bytes): New function.

2 years ago[PR100518] store by mult pieces: keep addr in Pmode
Alexandre Oliva [Wed, 15 Dec 2021 05:22:34 +0000 (02:22 -0300)]
[PR100518] store by mult pieces: keep addr in Pmode

The conversion of a MEM address to ptr_mode in
try_store_by_multiple_pieces was misguided: copy_addr_to_reg expects
Pmode for addresses.

for  gcc/ChangeLog

PR target/100518
* builtins.c (try_store_by_multiple_pieces): Drop address
conversion to ptr_mode.

for  gcc/testsuite/ChangeLog

PR target/100518
* gcc.target/aarch64/pr100518.c: New.

2 years ago[PR100843] store by mult pieces: punt on max_len < min_len
Alexandre Oliva [Wed, 15 Dec 2021 05:22:33 +0000 (02:22 -0300)]
[PR100843] store by mult pieces: punt on max_len < min_len

The testcase confuses the code that detects min and max len for the
memset, so max_len ends up less than min_len.  That shouldn't be
possible, but the testcase requires us to handle this case.

The store-by-mult-pieces algorithm actually relies on min and max
lengths, so if we find them to be inconsistent, the best we can do is
punting.

for  gcc/ChangeLog

PR middle-end/100843
* builtins.c (try_store_by_multiple_pieces): Fail if min_len
is greater than max_len.

for  gcc/testsuite/ChangeLog

PR middle-end/100843
* gcc.dg/pr100843.c: New.

2 years agoDaily bump.
GCC Administrator [Wed, 15 Dec 2021 00:16:28 +0000 (00:16 +0000)]
Daily bump.

2 years agoFix ICE. [PR103682]
liuhongt [Tue, 14 Dec 2021 01:47:08 +0000 (09:47 +0800)]
Fix ICE. [PR103682]

Check is_gimple_assign before gimple_assign_rhs_code.

gcc/ChangeLog:

PR target/103682
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Check
is_gimple_assign before gimple_assign_rhs_code.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr103682.c: New test.

2 years agolibstdc++: Support old and new T_FMT for en_HK locale [PR103687]
Jonathan Wakely [Tue, 14 Dec 2021 22:14:48 +0000 (22:14 +0000)]
libstdc++: Support old and new T_FMT for en_HK locale [PR103687]

This checks whether the locale data for en_HK includes %p and adjusts
the string being tested accordingly. To account for Jakub's fix to make
%I parse "12" as 0 instead of 12, we need to change the expected value
for the case where the locale format doesn't include %p. Also change the
time from 12:00:00 to 12:02:01 so we can tell if the minutes and seconds
get mixed up.

libstdc++-v3/ChangeLog:

PR libstdc++/103687
* testsuite/22_locale/time_get/get_date/wchar_t/4.cc: Restore
original locale before returning.
* testsuite/22_locale/time_get/get_time/char/2.cc: Check for %p
in locale's T_FMT and adjust accordingly.
* testsuite/22_locale/time_get/get_time/wchar_t/2.cc: Likewise.

2 years ago[PATCH] stddef.h: add support for musl typedef macro guards
Sören Tempel [Tue, 14 Dec 2021 23:07:47 +0000 (18:07 -0500)]
[PATCH] stddef.h: add support for musl typedef macro guards

The stddef.h header checks/sets various hardcoded toolchain/os specific
macro guards to prevent redefining types such as ptrdiff_t, wchar_t, or
size_t. However, without this patch, the file does not check/set the
typedef macro guards for musl libc. This causes types such as size_t to
be defined twice for files which include both musl's stdlib.h as well as
GCC's ginclude/stddef.h. This is, for example, the case for
libgo/sysinfo.c. If libgo/sysinfo.c has multiple typedefs for size_t
this confuses -fdump-go-spec and causes size_t not to be included in the
generated type definitions thereby causing a gcc-go compilation failure
on Alpine Linux Edge (which uses musl libc) with the following error:

sysinfo.go:7765:13: error: use of undefined type '_size_t'
 7765 | type Size_t _size_t
      |             ^
libcall_posix.go:49:35: error: non-integer len argument in make
   49 |                 b := make([]byte, len)
      |

This commit fixes this issue by ensuring that ptrdiff_t, wchar_t, and size_t
are only defined once in the pre-processed libgo/sysinfo.c file by enhancing
gcc/ginclude/stddef.h with musl-specific typedef macro guards.

gcc/ChangeLog:

* ginclude/stddef.h (__DEFINED_ptrdiff_t): Add support for musl
libc typedef macro guard.
(__DEFINED_size_t): Ditto.
(__DEFINED_wchar_t): Ditto.

2 years agoregrename: Skip renaming if instruction is noop move.
JoJo R [Tue, 14 Dec 2021 21:55:57 +0000 (16:55 -0500)]
regrename: Skip renaming if instruction is noop move.

gcc/
* regrename.c (find_rename_reg): Return satisfied regno
if instruction is noop move.

2 years agolibstdc++: Fix handling of invalid ranges in std::regex [PR102447]
Jonathan Wakely [Tue, 14 Dec 2021 14:32:35 +0000 (14:32 +0000)]
libstdc++: Fix handling of invalid ranges in std::regex [PR102447]

std::regex currently allows invalid bracket ranges such as [\w-a] which
are only allowed by ECMAScript when in web browser compatibility mode.
It should be an error, because the start of the range is a character
class, not a single character. The current implementation of
_Compiler::_M_expression_term does not provide a way to reject this,
because we only remember a previous character, not whether we just
processed a character class (or collating symbol etc.)

This patch replaces the pair<bool, CharT> used to emulate
optional<CharT> with a custom class closer to pair<tribool,CharT>. That
allows us to track three states, so that we can tell when we've just
seen a character class.

With this additional state the code in _M_expression_term for processing
the _S_token_bracket_dash can be improved to correctly reject the [\w-a]
case, without regressing for valid cases such as [\w-] and [----].

libstdc++-v3/ChangeLog:

PR libstdc++/102447
* include/bits/regex_compiler.h (_Compiler::_BracketState): New
class.
(_Compiler::_BrackeyMatcher): New alias template.
(_Compiler::_M_expression_term): Change pair<bool, CharT>
parameter to _BracketState. Process first character for
ECMAScript syntax as well as POSIX.
* include/bits/regex_compiler.tcc
(_Compiler::_M_insert_bracket_matcher): Pass _BracketState.
(_Compiler::_M_expression_term): Use _BracketState to store
state between calls. Improve handling of dashes in ranges.
* testsuite/28_regex/algorithms/regex_match/cstring_bracket_01.cc:
Add more tests for ranges containing dashes. Check invalid
ranges with character class at the beginning.

2 years agolibstdc++: Simplify typedefs by using __UINTPTR_TYPE__
Jonathan Wakely [Tue, 14 Dec 2021 14:21:18 +0000 (14:21 +0000)]
libstdc++: Simplify typedefs by using __UINTPTR_TYPE__

libstdc++-v3/ChangeLog:

* include/ext/pointer.h (_Relative_pointer_impl::_UIntPtrType):
Rename to uintptr_t and define as __UINTPTR_TYPE__.

2 years agolibstdc++: Simplify definition of std::regex_constants variables
Jonathan Wakely [Tue, 14 Dec 2021 13:31:11 +0000 (13:31 +0000)]
libstdc++: Simplify definition of std::regex_constants variables

This removes the __syntax_option and __match_flag enumeration types,
which are only used to define enumerators with successive values that
are then used to initialize the std::regex_constants global variables.

By defining enumerators in the syntax_option_type and match_flag_type
enumeration types with the correct values for the globals we get rid of
two useless enumeration types that just count from 0 to N, and we
improve the debugging experience. Because the enumeration types now have
enumerators defined, GDB will print values in terms of those enumerators
e.g.

$6 = (std::regex_constants::_S_ECMAScript | std::regex_constants::_S_multiline)

Previously this would have been shown as simply 0x810 because there were
no enumerators of that type.

This changes the type and value of enumerators such as _S_grep, but
users should never be referring to them directly anyway.

libstdc++-v3/ChangeLog:

* include/bits/regex_constants.h (__syntax_option, __match_flag):
Remove.
(syntax_option_type, match_flag_type): Define enumerators.
Use to initialize globals. Add constexpr to compound assignment
operators.
* include/bits/regex_error.h (error_type): Add comment.
* testsuite/28_regex/constants/constexpr.cc: Remove comment.
* testsuite/28_regex/constants/error_type.cc: Improve comment.
* testsuite/28_regex/constants/match_flag_type.cc: Check bitmask
requirements.
* testsuite/28_regex/constants/syntax_option_type.cc: Likewise.

2 years agors6000: Rename arrays to remove temporary _x suffix
Bill Schmidt [Tue, 14 Dec 2021 19:34:11 +0000 (13:34 -0600)]
rs6000: Rename arrays to remove temporary _x suffix

While we had two sets of built-in infrastructure at once, I added _x as a
suffix to two arrays to disambiguate the old and new versions.  Time to fix
that also.

2021-12-06  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-c.c (altivec_build_resolved_builtin): Rename
rs6000_builtin_decls_x to rs6000_builtin_decls.
(altivec_resolve_overloaded_builtin): Likewise.  Also rename
rs6000_builtin_info_x to rs6000_builtin_info.
* config/rs6000/rs6000-call.c (rs6000_invalid_builtin): Rename
rs6000_builtin_info_x to rs6000_builtin_info.
(rs6000_builtin_is_supported): Likewise.
(rs6000_gimple_fold_mma_builtin): Likewise.  Also rename
rs6000_builtin_decls_x to rs6000_builtin_decls.
(rs6000_gimple_fold_builtin): Rename rs6000_builtin_info_x to
rs6000_builtin_info.
(cpu_expand_builtin): Likewise.
(rs6000_expand_builtin): Likewise.
(rs6000_init_builtins): Likewise.  Also rename rs6000_builtin_decls_x
to rs6000_builtin_decls.
(rs6000_builtin_decl): Rename rs6000_builtin_decls_x to
rs6000_builtin_decls.
* config/rs6000/rs6000-gen-builtins.c (write_decls): In generated code,
rename rs6000_builtin_decls_x to rs6000_builtin_decls, and rename
rs6000_builtin_info_x to rs6000_builtin_info.
(write_bif_static_init): In generated code, rename
rs6000_builtin_info_x to rs6000_builtin_info.
(write_init_bif_table): In generated code, rename
rs6000_builtin_decls_x to rs6000_builtin_decls, and rename
rs6000_builtin_info_x to rs6000_builtin_info.
(write_init_ovld_table): In generated code, rename
rs6000_builtin_decls_x to rs6000_builtin_decls.
(write_init_file): Likewise.
* config/rs6000/rs6000.c (rs6000_builtin_vectorized_function):
Likewise.
(rs6000_builtin_md_vectorized_function): Likewise.
(rs6000_builtin_reciprocal): Likewise.
(add_condition_to_bb): Likewise.
(rs6000_atomic_assign_expand_fenv): Likewise.

2 years agors6000: Rename functions with "new" in their names
Bill Schmidt [Tue, 14 Dec 2021 19:32:14 +0000 (13:32 -0600)]
rs6000: Rename functions with "new" in their names

While we had two sets of built-in functionality at the same time, I put "new"
in the names of quite a few functions.  Time to undo that.

2021-12-02  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-c.c (altivec_resolve_new_overloaded_builtin):
Remove forward declaration.
(rs6000_new_builtin_type_compatible): Rename to
rs6000_builtin_type_compatible.
(rs6000_builtin_type_compatible): Remove.
(altivec_resolve_overloaded_builtin): Remove.
(altivec_build_new_resolved_builtin): Rename to
altivec_build_resolved_builtin.
(altivec_resolve_new_overloaded_builtin): Rename to
altivec_resolve_overloaded_builtin.  Remove static keyword.  Adjust
called function names.
* config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Remove
forward declaration.
(rs6000_gimple_fold_new_builtin): Likewise.
(rs6000_invalid_new_builtin): Rename to rs6000_invalid_builtin.
(rs6000_gimple_fold_builtin): Remove.
(rs6000_new_builtin_valid_without_lhs): Rename to
rs6000_builtin_valid_without_lhs.
(rs6000_new_builtin_is_supported): Rename to
rs6000_builtin_is_supported.
(rs6000_gimple_fold_new_mma_builtin): Rename to
rs6000_gimple_fold_mma_builtin.
(rs6000_gimple_fold_new_builtin): Rename to
rs6000_gimple_fold_builtin.  Remove static keyword.  Adjust called
function names.
(rs6000_expand_builtin): Remove.
(new_cpu_expand_builtin): Rename to cpu_expand_builtin.
(new_mma_expand_builtin): Rename to mma_expand_builtin.
(new_htm_spr_num): Rename to htm_spr_num.
(new_htm_expand_builtin): Rename to htm_expand_builtin.  Change name
of called function.
(rs6000_expand_new_builtin): Rename to rs6000_expand_builtin.  Remove
static keyword.  Adjust called function names.
(rs6000_new_builtin_decl): Rename to rs6000_builtin_decl.  Remove
static keyword.
(rs6000_builtin_decl): Remove.
* config/rs6000/rs6000-gen-builtins.c (write_decls): In gnerated code,
rename rs6000_new_builtin_is_supported to rs6000_builtin_is_supported.
* config/rs6000/rs6000-internal.h (rs6000_invalid_new_builtin): Rename
to rs6000_invalid_builtin.
* config/rs6000/rs6000.c (rs6000_new_builtin_vectorized_function):
Rename to rs6000_builtin_vectorized_function.
(rs6000_new_builtin_md_vectorized_function): Rename to
rs6000_builtin_md_vectorized_function.
(rs6000_builtin_vectorized_function): Remove.
(rs6000_builtin_md_vectorized_function): Remove.

2 years agors6000: Remove rs6000-builtin.def and associated data and functions
Bill Schmidt [Tue, 14 Dec 2021 19:30:22 +0000 (13:30 -0600)]
rs6000: Remove rs6000-builtin.def and associated data and functions

2021-12-02  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-builtin.def: Delete.
* config/rs6000/rs6000-call.c (builtin_compatibility): Delete.
(builtin_description): Delete.
(builtin_hash_struct): Delete.
(builtin_hasher): Delete.
(builtin_hash_table): Delete.
(builtin_hasher::hash): Delete.
(builtin_hasher::equal): Delete.
(rs6000_builtin_info_type): Delete.
(rs6000_builtin_info): Delete.
(bdesc_compat): Delete.
(bdesc_3arg): Delete.
(bdesc_4arg): Delete.
(bdesc_dst): Delete.
(bdesc_2arg): Delete.
(bdesc_altivec_preds): Delete.
(bdesc_abs): Delete.
(bdesc_1arg): Delete.
(bdesc_0arg): Delete.
(bdesc_htm): Delete.
(bdesc_mma): Delete.
(rs6000_overloaded_builtin_p): Delete.
(rs6000_overloaded_builtin_name): Delete.
(htm_spr_num): Delete.
(rs6000_builtin_is_supported_p): Delete.
(rs6000_gimple_fold_mma_builtin): Delete.
(gt-rs6000-call.h): Remove include directive.
* config/rs6000/rs6000-protos.h (rs6000_overloaded_builtin_p): Delete.
(rs6000_builtin_is_supported_p): Delete.
(rs6000_overloaded_builtin_name): Delete.
* config/rs6000/rs6000.c (rs6000_builtin_decls): Delete.
(rs6000_debug_reg_global): Remove reference to RS6000_BUILTIN_COUNT.
* config/rs6000/rs6000.h (rs6000_builtins): Delete.
(altivec_builtin_types): Delete.
(rs6000_builtin_decls): Delete.
* config/rs6000/t-rs6000 (TM_H): Don't add rs6000-builtin.def.

2 years agors6000: Rename rs6000-builtin-new.def to rs6000-builtins.def
Bill Schmidt [Tue, 14 Dec 2021 19:27:58 +0000 (13:27 -0600)]
rs6000: Rename rs6000-builtin-new.def to rs6000-builtins.def

2021-12-02  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-builtin-new.def: Rename to...
* config/rs6000/rs6000-builtins.def: ...this.
* config/rs6000/rs6000-gen-builtins.c: Adjust header commentary.
* config/rs6000/t-rs6000 (EXTRA_GTYPE_DEPS): Rename
rs6000-builtin-new.def to rs6000-builtins.def.
(rs6000-builtins.c): Likewise.

2 years agors6000: Remove altivec_overloaded_builtins array and initialization
Bill Schmidt [Tue, 14 Dec 2021 19:25:12 +0000 (13:25 -0600)]
rs6000: Remove altivec_overloaded_builtins array and initialization

2021-12-06  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-call.c (altivec_overloaded_builtins): Remove.
* config/rs6000/rs6000.h (altivec_overloaded_builtins): Remove.

2 years agors6000: Do not allow combining of multiple assemble quads [PR103548]
Peter Bergner [Tue, 14 Dec 2021 20:50:41 +0000 (14:50 -0600)]
rs6000: Do not allow combining of multiple assemble quads [PR103548]

The compiler will gladly CSE the result of two __builtin_mma_build_acc
calls with the same four vector arguments, leading to illegal MMA
code being generated.  The fix here is to make the mma_assemble_acc
pattern use a unspec_volatile to stop the CSE from happening.

2021-12-14  Peter Bergner  <bergner@linux.ibm.com>

gcc/
PR target/103548
* config/rs6000/mma.md (UNSPEC_MMA_ASSEMBLE): Rename unspec from this...
(UNSPEC_VSX_ASSEMBLE): ...to this.
(UNSPECV_MMA_ASSEMBLE): New unspecv.
(vsx_assemble_pair): Use UNSPEC_VSX_ASSEMBLE.
(*vsx_assemble_pair): Likewise.
(mma_assemble_acc): Use UNSPECV_MMA_ASSEMBLE.
(*mma_assemble_acc): Likewise.
* config/rs6000/rs6000.c (rs6000_split_multireg_move): Handle
UNSPEC_VOLATILE.  Use UNSPEC_VSX_ASSEMBLE and UNSPECV_MMA_ASSEMBLE.

gcc/testsuite/
PR target/103548
* gcc.target/powerpc/mma-builtin-10-pair.c: New test.
* gcc.target/powerpc/mma-builtin-10-quad.c: New test.

2 years agoFortran: prevent NULL pointer dereference in check of passed do-loop variable
Harald Anlauf [Tue, 14 Dec 2021 20:57:04 +0000 (21:57 +0100)]
Fortran: prevent NULL pointer dereference in check of passed do-loop variable

gcc/fortran/ChangeLog:

PR fortran/103717
* frontend-passes.c (doloop_code): Prevent NULL pointer
dereference when checking for passing a do-loop variable to a
contained procedure with an interface mismatch.

gcc/testsuite/ChangeLog:

PR fortran/103717
* gfortran.dg/do_check_19.f90: New test.

2 years agoFortran: prevent NULL pointer dereferences checking do-loop contained stuff
Harald Anlauf [Tue, 14 Dec 2021 20:02:04 +0000 (21:02 +0100)]
Fortran: prevent NULL pointer dereferences checking do-loop contained stuff

gcc/fortran/ChangeLog:

PR fortran/103718
PR fortran/103719
* frontend-passes.c (doloop_contained_procedure_code): Add several
checks to prevent NULL pointer dereferences on valid and invalid
code called within do-loops.

gcc/testsuite/ChangeLog:

PR fortran/103718
PR fortran/103719
* gfortran.dg/do_check_18.f90: New test.

2 years agoi386: Implement VxHF vector set/insert/extract with lower ABI levels
Uros Bizjak [Tue, 14 Dec 2021 17:27:22 +0000 (18:27 +0100)]
i386: Implement VxHF vector set/insert/extract with lower ABI levels

This is a preparation patch that moves VxHF vector set/insert/extract
expansions from AVX512FP16 ABI to lower ABIs.  There are no functional
changes for -mavx512fp16 and a follow-up patch is needed to actually
enable VxHF vector modes for lower ABIs.

2021-12-14  Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/103571
* config/i386/i386-expand.c (ix86_expand_vector_init_duplicate)
<case E_V8HFmode>: Implement for TARGET_SSE2.
<case E_V16HFmode>: Implement for TARGET_AVX.
<case E_V32HFmode>: Implement for TARGET_AVX512F.
(ix86_expand_vector_set_var): Handle V32HFmode
without TARGET_AVX512BW.
(ix86_expand_vector_extract)
<case E_V8HFmode>: Implement for TARGET_SSE2.
<case E_V16HFmode>: Implement for TARGET_AVX.
<case E_V32HFmode>: Implement for TARGET_AVX512BW.
(expand_vec_perm_broadcast_1) <case E_V8HFmode>: New.
* config/i386/sse.md (VI12HF_AVX512VL): Remove
TARGET_AVX512FP16 condition.
(V): Ditto.
(V_256_512): Ditto.
(avx_vbroadcastf128_<mode>): Use V_256H mode iterator.

2 years agors6000: Remove new_builtins_are_live and dead code it was guarding
Bill Schmidt [Tue, 14 Dec 2021 17:23:32 +0000 (11:23 -0600)]
rs6000: Remove new_builtins_are_live and dead code it was guarding

To allow for a sane switch-over from the old built-in infrastructure to the
new, both sets of code have co-existed, with the enabled one under the control
of the boolean variable new_builtins_are_live.  As a first step in removing the
old code, remove this variable and the now-dead code it was guarding.

2021-12-06  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/darwin.h (SUBTARGET_INIT_BUILTINS): Remove
test for new_builtins_are_live and simplify.
* config/rs6000/rs6000-c.c (altivec_build_resolved_builtin): Remove
dead function.
(altivec_resolve_overloaded_builtin): Remove test for
new_builtins_are_live and simplify.
* config/rs6000/rs6000-call.c (altivec_init_builtins): Remove forward
declaration.
(builtin_function_type): Likewise.
(rs6000_common_init_builtins): Likewise.
(htm_init_builtins): Likewise.
(mma_init_builtins): Likewise.
(def_builtin): Remove dead function.
(rs6000_expand_zeroop_builtin): Likewise.
(rs6000_expand_mtfsf_builtin): Likewise.
(rs6000_expand_mtfsb_builtin): Likewise.
(rs6000_expand_set_fpscr_rn_builtin): Likewise.
(rs6000_expand_set_fpscr_drn_builtin): Likewise.
(rs6000_expand_unop_builtin): Likewise.
(altivec_expand_abs_builtin): Likewise.
(rs6000_expand_binop_builtin): Likewise.
(altivec_expand_lxvr_builtin): Likewise.
(altivec_expand_lv_builtin): Likewise.
(altivec_expand_stxvl_builtin): Likewise.
(altivec_expand_stv_builtin): Likewise.
(mma_expand_builtin): Likewise.
(htm_expand_builtin): Likewise.
(cpu_expand_builtin): Likewise.
(rs6000_expand_quaternop_builtin): Likewise.
(rs6000_expand_ternop_builtin): Likewise.
(altivec_expand_dst_builtin): Likewise.
(altivec_expand_vec_sel_builtin): Likewise.
(altivec_expand_builtin): Likewise.
(rs6000_invalid_builtin): Likewise.
(rs6000_builtin_valid_without_lhs): Likewise.
(rs6000_gimple_fold_builtin): Remove test for new_builtins_are_live and
simplify.
(rs6000_expand_builtin): Likewise.
(rs6000_init_builtins): Remove tests for new_builtins_are_live and
simplify.
(rs6000_builtin_decl): Likewise.
(altivec_init_builtins): Remove dead function.
(mma_init_builtins): Likewise.
(htm_init_builtins): Likewise.
(builtin_quaternary_function_type): Likewise.
(builtin_function_type): Likewise.
(rs6000_common_init_builtins): Likewise.
* config/rs6000/rs6000-gen-builtins.c (write_header_file): Don't
declare new_builtins_are_live.
(write_init_bif_table): In generated code, remove test for
new_builtins_are_live and simplify.
(write_init_ovld_table): Likewise.
(write_init_file): Don't initialize new_builtins_are_live.
* config/rs6000/rs6000.c (rs6000_builtin_vectorized_function): Remove
test for new_builtins_are_live and simplify.
(rs6000_builtin_md_vectorized_function): Likewise.
(rs6000_builtin_reciprocal): Likewise.
(add_condition_to_bb): Likewise.
(rs6000_atomic_assign_expand_fenv): Likewise.

2 years agors6000: Builtins for doubleword compare should be in [power8-vector] (PR103625)
Bill Schmidt [Mon, 13 Dec 2021 15:30:18 +0000 (09:30 -0600)]
rs6000: Builtins for doubleword compare should be in [power8-vector] (PR103625)

2021-12-13  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
PR target/103625
* config/rs6000/rs6000-builtin-new.def (__builtin_altivec_vcmpequd):
Move to power8-vector stanza.
(__builtin_altivec_vcmpequd_p): Likewise.
(__builtin_altivec_vcmpgtsd): Likewise.
(__builtin_altivec_vcmpgtsd_p): Likewise.
(__builtin_altivec_vcmpgtud): Likewise.
(__builtin_altivec_vcmpgtud_p): Likewise.

2 years agors6000: Some builtins require IBM-128 long double format (PR103623)
Bill Schmidt [Tue, 14 Dec 2021 16:09:06 +0000 (10:09 -0600)]
rs6000: Some builtins require IBM-128 long double format (PR103623)

2021-12-14  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
PR target/103623
* config/rs6000/rs6000-builtin-new.def (__builtin_pack_longdouble): Add
ibmld attribute.
(__builtin_unpack_longdouble): Likewise.
* config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Add special
handling for ibmld attribute.
* config/rs6000/rs6000-gen-builtins.c (attrinfo): Add isibmld.
(parse_bif_attrs): Handle ibmld.
(write_decls): Likewise.
(write_bif_static_init): Likewise.

2 years agoAdd support for global rvalue initialization and constructors
Petter Tomner [Mon, 29 Nov 2021 19:44:07 +0000 (20:44 +0100)]
Add support for global rvalue initialization and constructors

This patch adds support for initialization of global variables
with rvalues and creating constructors for array, struct and
union types which can be used as rvalues.

Signed-off-by:
2021-12-14 Petter Tomner <tomner@kth.se>

gcc/jit/
* jit-common.h: New enum
* jit-playback.c : Folding an setting intitial
(global_new_decl) : Handle const global generation
(new_global) : New flag
(global_set_init_rvalue) : New
(new_ctor) : New
(new_global_initialized) : Flag
(as_truth_value) : Fold
(new_unary_op) : Fold
(new_binary_op) : Fold
(new_comparison) : Fold
(new_array_access) : Fold
(new_dereference) : Fold
(get_address) : Fold
* jit-playback.h :
(global_set_init_rvalue) : New
(new_ctor) : New
* jit-recording.c :
* jit-recording.h :
(new_global_init_rvalue) : New
(new_ctor) : New
(ctor) : New, inherits rvalue
(global_init_rvalue) : New, inherits memento
(type::is_union) : New
* libgccjit++.h : New entrypoints, see C-header
* libgccjit.c : See .h
* libgccjit.h : New entrypoints
(gcc_jit_context_new_array_constructor) : New
(gcc_jit_context_new_struct_constructor) : New
(gcc_jit_context_new_union_constructor) : New
(gcc_jit_global_set_initializer_rvalue) : New
(LIBGCCJIT_HAVE_CTORS) : New feuture macro
* libgccjit.map : New entrypoints added to ABI 19
* docs/topics/expressions.rst : Updated docs

gcc/testsuite/
* jit.dg/all-non-failing-tests.h: Added two tests
* jit.dg/test-error-ctor-array-wrong-obj.c: New
* jit.dg/test-error-ctor-struct-too-big.c: New
* jit.dg/test-error-ctor-struct-wrong-field-obj.c: New
* jit.dg/test-error-ctor-struct-wrong-type.c: New
* jit.dg/test-error-ctor-struct-wrong-type2.c
* jit.dg/test-error-ctor-union-wrong-field-name.c: New
* jit.dg/test-error-global-already-init.c: New
* jit.dg/test-error-global-common-section.c: New
* jit.dg/test-error-global-init-too-small-array.c: New
* jit.dg/test-error-global-lvalue-init.c: New
* jit.dg/test-error-global-nonconst-init.c: New
* jit.dg/test-global-init-rvalue.c: New
* jit.dg/test-local-init-rvalue.c: New

2 years agoFortran: PACK intrinsic should not try to read from zero-sized array
Harald Anlauf [Mon, 13 Dec 2021 19:50:19 +0000 (20:50 +0100)]
Fortran: PACK intrinsic should not try to read from zero-sized array

libgfortran/ChangeLog:

PR libfortran/103634
* intrinsics/pack_generic.c (pack_internal): Handle case when the
array argument of PACK has one or more extents of size zero to
avoid invalid reads.

gcc/testsuite/ChangeLog:

PR libfortran/103634
* gfortran.dg/intrinsic_pack_6.f90: New test.

2 years agoDetermine global memory accesses in ipa-modref
Jan Hubicka [Tue, 14 Dec 2021 15:50:27 +0000 (16:50 +0100)]
Determine global memory accesses in ipa-modref

As discussed in PR103585, fatigue2 is now only benchmark from my usual testing
set (SPEC2k6, SPEC2k17, CPP benchmarks, polyhedron, Firefox, clang) which sees
important regression when inlining functions called once is limited.  This
prevents us from solving runtime issues in roms benchmarks and elsewhere.

The problem is that there is perdida function that takes many arguments and
some of them are array descriptors.  We constant propagate most of their fields
but still keep their initialization. Because perdida is quite fast, the call
overhead dominates, since we need over 100 memory stores consuing about 35%
of the overall benchmark runtime.

The memory stores would be eliminated if perdida did not call fortran I/O which
makes modref to thin that the array descriptors could be accessed. We are
quite close discovering that they can't becuase they are non-escaping from
function.  This patch makes modref to distingush between global memory access
(only things that escapes) and unkonwn accesss (that may access also
nonescaping things reaching the function).  This makes disambiguation for
functions containing error handling better.

Unfortunately the patch hits two semi-latent issues in Fortran frontned.
First is wrong code in gfortran.dg/unlimited_polymorphic_3.f03. This can be
turned into wrong code testcase on both mainline and gcc11 if the runtime
call is removed, so I filled PR 103662 for it. There is TBAA mismatch for
structure produced in FE.

Second is issue with GOMP where Fortran marks certain parameters as non-escaping
and then makes them escape via GOMP_parallel.  For this I disabled the use of
escape info in verify_arg which also disables the useful transform on perdida
but still does useful work for e.g. GCC error handling.  I will work on this
incrementally.

Bootstrapped/regtested x86_64-linux, lto-bootstrapped and also tested with
clang build.  I plan to commit this tomorrow if there are no complains
(the patch is not completely short but conceptualy simple and handles a lot
of common cases).

gcc/ChangeLog:

2021-12-12  Jan Hubicka  <hubicka@ucw.cz>

PR ipa/103585
* ipa-modref-tree.c (modref_access_node::range_info_useful_p): Handle
MODREF_GLOBAL_MEMORY_PARM.
(modref_access_node::dump): Likewise.
(modref_access_node::get_call_arg): Likewise.
* ipa-modref-tree.h (enum modref_special_parms): Add
MODREF_GLOBAL_MEMORY_PARM.
(modref_access_node::useful_for_kill): Handle
MODREF_GLOBAL_MEMORY_PARM.
(modref:tree::merge): Add promote_unknown_to_global.
* ipa-modref.c (verify_arg):New function.
(may_access_nonescaping_parm_p): New function.
(modref_access_analysis::record_global_memory_load): New member
function.
(modref_access_analysis::record_global_memory_store): Likewise.
(modref_access_analysis::process_fnspec): Distingush global and local
memory.
(modref_access_analysis::analyze_call): Likewise.
* tree-ssa-alias.c (ref_may_access_global_memory_p): New function.
(modref_may_conflict): Use it.

gcc/testsuite/ChangeLog:

2021-12-12  Jan Hubicka  <hubicka@ucw.cz>

* gcc.dg/analyzer/data-model-1.c: Disable ipa-modref.
* gcc.dg/uninit-38.c: Likewise.
* gcc.dg/uninit-pr98578.c: Liewise.

2 years agotestsuite: Silence conversion warnings for MIN1 and MAX1
Manfred Schwarb [Tue, 14 Dec 2021 15:30:27 +0000 (16:30 +0100)]
testsuite: Silence conversion warnings for MIN1 and MAX1

gcc/testsuite/ChangeLog:

PR fortran/91497
* gfortran.dg/pr91497.f90: Adjust test to use
dg-require-effective-target directive.
* gfortran.dg/pr91497_2.f90: New test to cover all targets.
Cover MAX1 and MIN1 intrinsics.