review.tizen.org Git - platform/upstream/gcc.git/log

range-op-float: Extend lhs by 0.5ulp rather than 1ulp if not -frounding-math [PR109008]

This patch, incremental to the just posted one, improves the reverse
operation ranges significantly by widening just by 0.5ulp in each
direction rather than 1ulp.  Again, REAL_VALUE_TYPE has both wider
exponent range and wider mantissa precision (160 bits) than any
supported type; this patch relies on the latter property.
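
As a rough illustration of why the half-ulp widening is tighter, the
following Python sketch widens the singleton lhs 1.0 both ways and then
applies the inverse of lhs = op1 + 1.0.  It is not GCC code: the helper
names widen and reverse_plus are made up for this example, and
fractions.Fraction merely stands in for the extra precision that
REAL_VALUE_TYPE provides.  It needs Python 3.9 or later for
math.nextafter.

```python
from fractions import Fraction
import math

def widen(x, half_ulp):
    """Widen the singleton range [x, x] towards its neighbours.

    With half_ulp=False the bounds are the adjacent doubles (1ulp steps).
    With half_ulp=True they are the exact midpoints, which are not
    representable as doubles, hence the exact Fraction arithmetic.
    """
    lo = Fraction(math.nextafter(x, -math.inf))
    hi = Fraction(math.nextafter(x, math.inf))
    if half_ulp:
        lo = (lo + Fraction(x)) / 2
        hi = (hi + Fraction(x)) / 2
    return lo, hi

def reverse_plus(lhs, op2):
    """Range of op1 in lhs = op1 + op2 for a singleton op2, computed exactly."""
    return lhs[0] - Fraction(op2), lhs[1] - Fraction(op2)

full = reverse_plus(widen(1.0, half_ulp=False), 1.0)
half = reverse_plus(widen(1.0, half_ulp=True), 1.0)
print([float(b).hex() for b in full])
print([float(b).hex() for b in half])
```

In this model the 1ulp widening gives op1 in [-2^-53, 2^-52], while
the 0.5ulp widening gives [-2^-54, 2^-53], i.e. half as wide.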

The patch doesn't do this with -frounding-math, because then the rounding
can be +-1ulp in each direction depending on the rounding mode which
we don't know, or for IBM double double, because that type is just weird
and we can't trust it to have sane properties.

I've performed testing of these 2 patches on 300000 random tests as with
yesterday's patch, exact numbers are in the PR, but I see very significant
improvement in the precision of the ranges while keeping it conservatively
correct.

2023-03-10 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/109008
* range-op-float.cc (float_widen_lhs_range): If not
-frounding-math and not IBM double double format, extend lhs
range just by 0.5ulp rather than 1ulp in each direction.

libstdc++: Fix GDB Xmethod for std::shared_ptr::use_count() [PR109064]

libstdc++-v3/ChangeLog:

PR libstdc++/109064
* python/libstdcxx/v6/xmethods.py (SharedPtrUseCountWorker):
Remove self-recursion in __init__. Add missing _supports.
* testsuite/libstdc++-xmethods/shared_ptr.cc: Check use_count()
and unique().

cygwin: Don't try to support multilibs [PR107998]

As discussed in the PR, t-cygwin-w64 file has been introduced in 2013
and has one important problem, two different multilib options -m64 and -m32,
but MULTILIB_DIRNAMES with just one word in it.
Before the genmultilib sanity checking was added, my understanding is that
this essentially resulted in effective --disable-multilib,
$ gcc -print-multi-lib
.;
;@m32
$ gcc -print-multi-directory
.
$ gcc -print-multi-directory -m64
.
$ gcc -print-multi-directory -m32

$ gcc -print-multi-os-directory
../lib
$ gcc -print-multi-os-directory -m64
../lib
$ gcc -print-multi-os-directory -m32
../lib32
and because of the way e.g. config-ml.in operates
multidirs=
for i in `${CC-gcc} --print-multi-lib 2>/dev/null`; do
  dir=`echo $i | sed -e 's/;.*$//'`
  if [ "${dir}" = "." ]; then
    true
  else
    if [ -z "${multidirs}" ]; then
      multidirs="${dir}"
    else
      multidirs="${multidirs} ${dir}"
    fi
  fi
done
dir was . first time (and so nothing was done) and empty
second time, multidirs empty too, so multidirs was set to empty
like it would be with --disable-multilib.
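
The effect of that loop can be replayed outside the shell.  The sketch
below is a Python transcription of the quoted config-ml.in fragment (the
function name multidirs_from is invented here); the second input is a
hypothetical -print-multi-lib output with a proper directory name, shown
only for contrast.

```python
def multidirs_from(print_multi_lib_output):
    """Mimic the quoted config-ml.in loop on `gcc -print-multi-lib` output."""
    multidirs = ""
    # The shell `for` splits the command output on whitespace.
    for entry in print_multi_lib_output.split():
        # sed -e 's/;.*$//' keeps everything before the first ';'.
        directory = entry.split(";", 1)[0]
        if directory == ".":
            continue
        if not multidirs:
            multidirs = directory
        else:
            multidirs = multidirs + " " + directory
    return multidirs

# Output quoted in the commit message: the second entry has an empty dir.
print(repr(multidirs_from(".;\n;@m32\n")))
# Hypothetical output if the multilib directory had been named "32".
print(repr(multidirs_from(".;\n32;@m32\n")))
```

The first call yields an empty string, matching the --disable-multilib
behavior described above.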

With the added sanity checking the build fails unless --disable-multilib
is used in configure (dunno whether people usually configure that way
on cygwin).

From what has been said in the PR, multilibs were not meant to be supported
and e.g. cygwin headers probably aren't ready for it.

So the following patch just removes the file with the (incorrect) multilib
stuff instead of fixing it (say by setting MULTILIB_DIRNAMES to 64 32).

I have no way to test this though, no Windows around, can anyone please
test this?  I just would like to get some progress on the P1s we have...

2023-02-22  Jakub Jelinek  <jakub@redhat.com>

gcc/ChangeLog:

PR target/107998
* config.gcc (x86_64-*-cygwin*): Don't add i386/t-cygwin-w64 into
$tmake_file.
* config/i386/t-cygwin-w64: Remove.

Signed-off-by: Jonathan Yong <10walls@gmail.com>

tree: Use comdat tree_code_{type,length} even for C++11/14 [PR108634]

The recent change to undo the tree_code_type/tree_code_length
excessive duplication apparently broke building the Linux kernel
plugin.  While it is certainly desirable that GCC plugins are built
with the same compiler as GCC has been built and with the same options
(at least the important ones), it might be hard to arrange that,
e.g. if gcc is built using a cross-compiler but the plugin then built
natively, or GCC isn't bootstrapped for other reasons, or just as in
the kernel case they were building the plugin with -std=gnu++11 while
the bootstrapped GCC has been built without any such option and so with
whatever the compiler defaulted to.

For C++17 and later tree_code_{type,length} are UNIQUE symbols with
those assembler names, while for C++11/14 they were
_ZL14tree_code_type and _ZL16tree_code_length.

The following patch uses a comdat var for those even for C++11/14
as suggested by Maciej Cencora.  Relying on weak attribute is not an
option because not all hosts support it and there are non-GNU system
compilers.  While we could use it unconditionally,
I think defining a template just to make it comdat is weird, and
the compiler itself is always built with the same compiler.
Plugins, being separate shared libraries, will have a separate copy of
the arrays if they are ODR-used in the plugin, so it is not a big
deal if e.g. cc1plus uses tree_code_type while the plugin uses
_ZN19tree_code_type_tmplILi0EE14tree_code_typeE or vice versa.

2023-03-10  Jakub Jelinek  <jakub@redhat.com>

PR plugins/108634
* tree-core.h (tree_code_type, tree_code_length): For C++11 or
C++14, don't declare as extern const arrays.
(tree_code_type_tmpl, tree_code_length_tmpl): New types with
static constexpr member arrays for C++11 or C++14.
* tree.h (TREE_CODE_CLASS): For C++11 or C++14 use
tree_code_type_tmpl <0>::tree_code_type instead of tree_code_type.
(TREE_CODE_LENGTH): For C++11 or C++14 use
tree_code_length_tmpl <0>::tree_code_length instead of
tree_code_length.
* tree.cc (tree_code_type, tree_code_length): Remove.

file-prefix-map: Fix up -f*-prefix-map= [PR108464]

On Tue, Nov 01, 2022 at 01:46:20PM -0600, Jeff Law via Gcc-patches wrote:
> > This does cause a change of behaviour if users were previously relying upon
> > symlinks or absolute paths not being resolved.
>
> I'm not too worried about this scenario.

As mentioned in the PR, this patch breaks e.g. ccache testsuite.

I strongly doubt most of the users want such a behavior, because it
makes all filenames absolute when -f*-prefix-map= options remap one
absolute path to another one.
Say if I'm in /tmp and /tmp is the canonical path and there is
src/test.c file, with -fdebug-prefix-map=/tmp=/blah
previously there would be DW_AT_comp_dir "/blah" and it is still there,
but DW_AT_name which was previously "src/test.c" (relative against
DW_AT_comp_dir) is now "/blah/src/test.c" instead.

Even worse, the canonicalization is only done on the remap_filename
argument, but not on the old_prefix side.  That is e.g. what breaks
ccache.  If there is
/tmp/foobar1 directory and
ln -sf foobar1 /tmp/foobar2
cd /tmp/foobar2
then -fdebug-prefix-map=`pwd`=/blah will just not work: while
src/test.c will be canonicalized to /tmp/foobar1/src/test.c,
old_prefix is still what the user provided, which is /tmp/foobar2.
Users would need to change their invocations to use -fdebug-prefix-map=`realpath $(pwd)`=/blah

I've created 3 patches for this.

The first patch just reverts the patch (and its follow-up patch).

The second introduces a new option, -f{,no}-canon-prefix-map, which affects
the behavior of -f{file,macro,debug,profile}-prefix-map=; if on, it
canonicalizes the old path of the prefix map option and compares that
against the canonicalized filename for absolute paths but not relative ones.

And the last is like the second, but does that also for relative paths except
for filenames with no / (or / or \ on DOS based fs).  So the third patch
makes the behavior that has been on the trunk lately optional, with the
difference that the old_prefix is canonicalized by the compiler.
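
To make the matching rule concrete, here is a small Python model of the
third variant.  It is a sketch under stated assumptions, not the code in
file-prefix-map.cc: os.path.realpath plays the role of lrealpath, the
old prefix is canonicalized at match time rather than when the option is
parsed, the (old, new, canonicalize) tuples and the demo helper are
invented for the example, and a POSIX file system with symlink support
is assumed.

```python
import os
import tempfile

def remap_filename(filename, maps):
    """Apply the first matching (old_prefix, new_prefix, canonicalize) entry."""
    for old, new, canonicalize in maps:
        name = filename
        if canonicalize:
            old = os.path.realpath(old)
            # Names without any directory separator are left alone.
            if os.sep in filename:
                name = os.path.realpath(filename)
        if name.startswith(old):
            return new + name[len(old):]
    return filename

def demo():
    """Recreate the foobar1/foobar2 symlink layout from the text in a temp dir."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.realpath(tmp)
        real_dir = os.path.join(base, "foobar1")
        link_dir = os.path.join(base, "foobar2")
        os.makedirs(os.path.join(real_dir, "src"))
        os.symlink("foobar1", link_dir)
        source = os.path.join(real_dir, "src", "test.c")  # already canonical
        # Old prefix given through the symlink, as `pwd` would report it.
        plain = remap_filename(source, [(link_dir, "/blah", False)])
        canon = remap_filename(source, [(link_dir, "/blah", True)])
        return plain == source, canon

print(demo())
```

Without canonicalization the symlinked prefix does not match the
canonical file name and the name is left untouched; with it, the model
maps the file to /blah/src/test.c.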

Initially I thought I'd just add some magic syntax to the OLD=NEW
argument of those options (because there are 4 of them), but as noted
in the comments, = is a valid char in OLD (just not in NEW), so it would
be hard to figure out some syntax.  So instead there is a new option, which
one can turn on and off for different -f*-prefix-map= options if needed.

-fdebug-prefix-map=/path1=/mypath1 -fcanon-prefix-map \
-fdebug-prefix-map=/path2=/mypath2 -fno-canon-prefix-map \
-fdebug-prefix-map=/path3=/mypath3

will use the old behavior for the /path1 and /path3 handling and
the new one only for /path2 handling.
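
The positional behavior can be modelled as follows.  This is only a
sketch of the idea, assuming options are processed left to right and
each map entry records the flag value in effect at that point;
parse_prefix_maps and its tuple layout are invented for the example.
Splitting at the last = reflects the remark above that = may appear in
OLD but not in NEW.

```python
def parse_prefix_maps(args):
    """Return (old, new, canonicalize) for each -fdebug-prefix-map= option."""
    canon = False  # off until -fcanon-prefix-map is seen
    maps = []
    prefix = "-fdebug-prefix-map="
    for arg in args:
        if arg == "-fcanon-prefix-map":
            canon = True
        elif arg == "-fno-canon-prefix-map":
            canon = False
        elif arg.startswith(prefix):
            # '=' is valid inside OLD, so split at the last one.
            old, sep, new = arg[len(prefix):].rpartition("=")
            if not sep:
                raise ValueError("expected OLD=NEW in " + arg)
            maps.append((old, new, canon))
    return maps

command_line = [
    "-fdebug-prefix-map=/path1=/mypath1", "-fcanon-prefix-map",
    "-fdebug-prefix-map=/path2=/mypath2", "-fno-canon-prefix-map",
    "-fdebug-prefix-map=/path3=/mypath3",
]
for entry in parse_prefix_maps(command_line):
    print(entry)
```

Only the /path2 entry ends up with canonicalization enabled.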

This commit is the third patch described above.

2023-03-10  Jakub Jelinek  <jakub@redhat.com>

PR other/108464
* common.opt (fcanon-prefix-map): New option.
* opts.cc: Include file-prefix-map.h.
(flag_canon_prefix_map): New variable.
(common_handle_option): Handle OPT_fcanon_prefix_map.
(gen_command_line_string): Ignore OPT_fcanon_prefix_map.
* file-prefix-map.h (flag_canon_prefix_map): Declare.
* file-prefix-map.cc (struct file_prefix_map): Add canonicalize
member.
(add_prefix_map): Initialize canonicalize member from
flag_canon_prefix_map, and if true canonicalize it using lrealpath.
(remap_filename): Revert 2022-11-01 and 2022-11-07 changes,
use lrealpath result only for map->canonicalize map entries.
* lto-opts.cc (lto_write_options): Ignore OPT_fcanon_prefix_map.
* opts-global.cc (handle_common_deferred_options): Clear
flag_canon_prefix_map at the start and handle OPT_fcanon_prefix_map.
* doc/invoke.texi (-fcanon-prefix-map): Document.
(-ffile-prefix-map, -fdebug-prefix-map, -fprofile-prefix-map): Add
see also for -fcanon-prefix-map.
* doc/cppopts.texi (-fmacro-prefix-map): Likewise.

c, c++, cgraphunit: Prevent duplicated -Wunused-value warnings [PR108079]

On the following testcase, we warn with -Wunused-variable twice, once
in the FEs and later on in cgraphunit again with slightly different
wording.

The following patch fixes that by registering a warning suppression in the
FEs when we warn and not warning in cgraphunit anymore if that happened.
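
The mechanism can be pictured with a toy model in Python.  None of this
is GCC code: the two functions are only loosely named after the places
touched by the patch, and the set of (decl, option) pairs is an assumed
stand-in for GCC's warning suppression bookkeeping.

```python
suppressed = set()   # (decl, option) pairs with a suppression registered
emitted = []         # diagnostics in the order they were issued

def front_end_pop_scope(decl, used):
    """Front-end side: warn about an unused variable, then record the suppression."""
    if not used:
        emitted.append(f"front end: unused variable '{decl}'")
        suppressed.add((decl, "Wunused-variable"))

def cgraph_check_global_declaration(decl, used):
    """Later check: stay silent if the front end already diagnosed this decl."""
    if not used and (decl, "Wunused-variable") not in suppressed:
        emitted.append(f"cgraphunit: '{decl}' defined but not used")

front_end_pop_scope("x", used=False)
cgraph_check_global_declaration("x", used=False)   # silenced by the suppression
cgraph_check_global_declaration("y", used=False)   # never diagnosed before
print(emitted)
```

Each declaration is diagnosed exactly once, by whichever check sees it
first.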

2023-03-10 Jakub Jelinek <jakub@redhat.com>

PR c/108079
gcc/
* cgraphunit.cc (check_global_declaration): Don't warn for unused
variables which have OPT_Wunused_variable warning suppressed.
gcc/c/
* c-decl.cc (pop_scope): Suppress OPT_Wunused_variable warning
after diagnosing it.
gcc/cp/
* decl.cc (poplevel): Suppress OPT_Wunused_variable warning
after diagnosing it.
gcc/testsuite/
* c-c++-common/Wunused-var-18.c: New test.

range-op-float: Fix up -ffinite-math-only range extension and don't extend into infinities [PR109008]

The following patch does two things (both related to range extension
around the boundaries).

The first part (in the 2 real_isfinite blocks) is to make the ranges
narrower when the old boundaries are minimum and/or maximum representable
finite number.  In that case frange_nextafter gives -Inf or +Inf,
but then the resulting computed reverse range is very far from the actually
needed range, usually extends up to infinity or could even result in NaNs.
While infinities are really the next representable numbers in the
corresponding mode, REAL_VALUE_TYPE is actually a type with wider range
for exponent and 160 bit precision, so the patch instead uses
nextafter number in a hypothetical floating point format with the same
mantissa precision but wider range of exponents.  This significantly
improves the actual ranges of the reverse operations, while still making
them conservatively correct.

The second part is a fix for miscompilation of the new testcase below.
For -ffinite-math-only, without this patch we extend the minimum and/or
maximum representable finite number to -Inf or +Inf, with the patch to
some number outside of the normal exponent range of the mode, but then
we use set which canonicalizes it and turns the boundaries back to
the minimum and/or maximum representable finite numbers, but because
in say [__DBL_MAX__, __DBL_MAX__] = op1 + [__DBL_MAX__, __DBL_MAX__]
op1 can be larger than 0, up to the largest number which rounds to even
down back to __DBL_MAX__ and there are still no infinities involved,
it needs to work even with -ffinite-math-only.  So, we really need to
widen the lhs range a little bit even in that case.  The patch does
that through temporarily clearing -ffinite-math-only, such that the
value with infinities or the outside of bounds values passes the
setting and verification (the VR_VARYING case is needed because
we get ICEs otherwise, but when lhs is VR_VARYING in -ffast-math,
i.e. minimum to maximum representable finite and both signs of NaN,
then set does all we need, we don't need to or in a NaN range).
We don't really later use the range in a way that would become a problem
that it is wider than varying, we actually just perform maths on the
two boundaries.
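
The boundary behavior is easy to check with IEEE doubles.  The Python
sketch below (3.9 or later for math.nextafter and math.ulp) is an
illustration only, assuming the default round-to-nearest mode;
fractions.Fraction is used to compute the value that a format with the
same precision but a wider exponent range would reach one ulp above
DBL_MAX.

```python
import math
import sys
from fractions import Fraction

DBL_MAX = sys.float_info.max
half_ulp = math.ulp(DBL_MAX) / 2   # 2**970

# In the double format itself, the next value above DBL_MAX is +Inf.
next_in_mode = math.nextafter(DBL_MAX, math.inf)

# With the same precision but a wider exponent range, the next value would
# be finite: DBL_MAX + 1ulp = 2**1024, i.e. 0x0.8p+1025.
next_wide = Fraction(DBL_MAX) + Fraction(math.ulp(DBL_MAX))

# op1 values strictly below half an ulp still round back to DBL_MAX, so
# [DBL_MAX, DBL_MAX] = op1 + [DBL_MAX, DBL_MAX] allows op1 > 0 without
# any infinity being involved.
largest_ok = math.nextafter(half_ulp, 0.0)
still_max = DBL_MAX + largest_ok

# Exactly half an ulp is a tie, which here rounds up and overflows.
overflowed = DBL_MAX + half_ulp

print(math.isinf(next_in_mode), next_wide == Fraction(2) ** 1024)
print(still_max == DBL_MAX, math.isinf(overflowed))
```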

As I said in the PR, this doesn't fix the !MODE_HAS_INFINITIES case,
I believe we actually need to treat the boundary values as infinities
in that case because they (probably) work like that, but it is unclear
if it is just the reverse operation lhs widening that is a problem there,
or whether it is a general problem.  I have zero experience with
floating points without infinities (PDP11, some ARM half type?,
what else?).

2023-03-10  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/109008
* range-op-float.cc (float_widen_lhs_range): If lb is
minimum representable finite number or ub is maximum
representable finite number, instead of widening it to
-inf or inf widen it to negative or positive 0x0.8p+(EMAX+1).
Temporarily clear flag_finite_math_only when canonicalizing
the widened range.

* gcc.dg/pr109008.c: New test.

RISC-V: Add fault first load C/C++ support

gcc/ChangeLog:

* config/riscv/riscv-builtins.cc (riscv_gimple_fold_builtin): New function.
* config/riscv/riscv-protos.h (riscv_gimple_fold_builtin): Ditto.
(gimple_fold_builtin): Ditto.
* config/riscv/riscv-vector-builtins-bases.cc (class read_vl): New class.
(class vleff): Ditto.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (read_vl): Ditto.
(vleff): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct read_vl_def): Ditto.
(struct fault_load_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc
(rvv_arg_type_info::get_tree_type): Add size_ptr.
(gimple_folder::gimple_folder): New class.
(gimple_folder::fold): Ditto.
(gimple_fold_builtin): New function.
(get_read_vl_instance): Ditto.
(get_read_vl_decl): Ditto.
* config/riscv/riscv-vector-builtins.def (size_ptr): Add size_ptr.
* config/riscv/riscv-vector-builtins.h (class gimple_folder): New class.
(get_read_vl_instance): New function.
(get_read_vl_decl): Ditto.
* config/riscv/riscv-vsetvl.cc (fault_first_load_p): Ditto.
(read_vl_insn_p): Ditto.
(available_occurrence_p): Ditto.
(backward_propagate_worthwhile_p): Ditto.
(gen_vsetvl_pat): Adapt for vleff support.
(get_forward_read_vl_insn): New function.
(get_backward_fault_first_load_insn): Ditto.
(source_equal_p): Adapt for vleff support.
(first_ratio_invalid_for_second_sew_p): Remove.
(first_ratio_invalid_for_second_lmul_p): Ditto.
(first_lmul_less_than_second_lmul_p): Ditto.
(first_ratio_less_than_second_ratio_p): Ditto.
(support_relaxed_compatible_p): New function.
(vector_insn_info::operator>): Remove.
(vector_insn_info::operator>=): Refine.
(vector_insn_info::parse_insn): Adapt for vleff support.
(vector_insn_info::compatible_p): Ditto.
(vector_insn_info::update_fault_first_load_avl): New function.
(pass_vsetvl::transfer_after): Adapt for vleff support.
(pass_vsetvl::demand_fusion): Ditto.
(pass_vsetvl::cleanup_insns): Ditto.
* config/riscv/riscv-vsetvl.def (DEF_INCOMPATIBLE_COND): Remove
redundant conditions.
* config/riscv/riscv-vsetvl.h (struct demands_cond): New function.
* config/riscv/riscv.cc (TARGET_GIMPLE_FOLD_BUILTIN): New target hook.
* config/riscv/riscv.md: Adapt for vleff support.
* config/riscv/t-riscv: Ditto.
* config/riscv/vector-iterators.md: New iterator.
* config/riscv/vector.md (read_vlsi): New pattern.
(read_vldi_zero_extend): Ditto.
(@pred_fault_load<mode>): Ditto.

Extend nops num in "maybe_gen_insn" for RISC-V Vector intrinsics

Hi, currently maybe_gen_insn can only expand up to 9 operands (nops).
For RVV intrinsics I need to extend that to 10, otherwise I would have to
use GEN_FCN.  This patch is a fairly obvious change.  OK for trunk?

Thanks.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc
(function_expander::use_ternop_insn): Use maybe_gen_insn instead.
(function_expander::use_widen_ternop_insn): Ditto.
* optabs.cc (maybe_gen_insn): Extend nops handling.

RISC-V: Fine tune merge operand constraint for integer/load/store

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc: Split indexed load
patterns according to RVV ISA.
* config/riscv/vector-iterators.md: New iterators.
* config/riscv/vector.md
(@pred_indexed_<order>load<VNX1_QHSD:mode><VNX1_QHSDI:mode>): Remove.
(@pred_indexed_<order>load<mode>_same_eew): New pattern.
(@pred_indexed_<order>load<mode>_x2_greater_eew): Ditto.
(@pred_indexed_<order>load<mode>_x4_greater_eew): Ditto.
(@pred_indexed_<order>load<mode>_x8_greater_eew): Ditto.
(@pred_indexed_<order>load<mode>_x2_smaller_eew): Ditto.
(@pred_indexed_<order>load<mode>_x4_smaller_eew): Ditto.
(@pred_indexed_<order>load<mode>_x8_smaller_eew): Ditto.
(@pred_indexed_<order>load<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Remove.
(@pred_indexed_<order>load<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
(@pred_indexed_<order>load<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
(@pred_indexed_<order>load<VNX16_QHS:mode><VNX16_QHSI:mode>): Ditto.
(@pred_indexed_<order>load<VNX32_QH:mode><VNX32_QHI:mode>): Ditto.
(@pred_indexed_<order>load<VNX64_Q:mode><VNX64_Q:mode>): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/merge_constraint-1.c: New test.

[PATCH v2] vect: Check that vector factor is a compile-time constant

* tree-vect-loop-manip.cc (vect_do_peeling): Use
result of constant_lower_bound instead of vf for the lower
bound of the epilog loop trip count.

c++: signed __int128_t [PR108099]

The code for handling signed + typedef was breaking on __int128_t, because
it isn't a proper typedef: it doesn't have DECL_ORIGINAL_TYPE.

PR c++/108099

gcc/cp/ChangeLog:

* decl.cc (grokdeclarator): Handle non-typedef typedef_decl.

gcc/testsuite/ChangeLog:

* g++.dg/ext/int128-7.C: New test.

c++: overloaded fn in contract [PR108542]

PR c++/108542

gcc/cp/ChangeLog:

* class.cc (instantiate_type): Strip location wrapper.

gcc/testsuite/ChangeLog:

* g++.dg/contracts/contracts-err1.C: New test.

Daily bump.

c++: allocator temps in list of arrays [PR108773]

The optimization to reuse the same allocator temporary for all string
constructor calls was breaking on this testcase, because the temps were
already in the argument to build_vec_init, and replacing them with
references to one slot got confused with calls at multiple levels (for the
initializer_list backing array, and then again for the array member of the
std::array). Fixed by reusing the whole TARGET_EXPR instead of pulling out
the slot; gimplification ensures that it's only initialized once.

I also moved the check for initializing a std:: class down into the tree
walk, and handle multiple temps within a single array element
initialization.

PR c++/108773

gcc/cp/ChangeLog:

* init.cc (find_allocator_temps_r): New.
(combine_allocator_temps): Replace find_allocator_temp.
(build_vec_init): Adjust.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-array18.C: New test.
* g++.dg/cpp0x/initlist-array19.C: New test.

testsuite: add various -Wanalyzer-null-dereference false +ve test cases

There are various -Wanalyzer-null-dereference false +ves in bugzilla
that I've been attempting to fix. Unfortunately I haven't made much
progress, but it seems worth at least capturing the reduced
reproducers as test cases, to make it easier to spot changes in
behavior.

gcc/testsuite/ChangeLog:
PR analyzer/102671
PR analyzer/105755
PR analyzer/108251
PR analyzer/108400
* gcc.dg/analyzer/null-deref-pr102671-1.c: New test, reduced
from Emacs.
* gcc.dg/analyzer/null-deref-pr102671-2.c: Likewise.
* gcc.dg/analyzer/null-deref-pr105755.c: Likewise.
* gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c:
New test, reduced from haproxy's src/ssl_sample.c.
* gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c:
Likewise.
* gcc.dg/analyzer/null-deref-pr108400-SoftEtherVPN-WebUi.c: New
test, reduced from SoftEtherVPN's src/Cedar/WebUI.c.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

middle-end: On emergency dumps finish the graph generation.

When doing an emergency dump the cfg output dumps are corrupted because the
ending "}" is missing.

Normally when the pass manager finishes it would call finish_graph_dump_file to
produce this. This is called here because each pass can dump multiple digraphs.

However during an emergency dump we only dump the current function and so after
that is done we never go back to the pass manager.

As such, we need to manually call finish_graph_dump_file in order to properly
finish off graph generation.

With this, -fdump-tree-*-graph works properly during a crash dump.
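
A toy Python model of the situation, for illustration only: the class
and function names are invented, and the output is merely dot-like text,
not what GCC writes.  It shows why the crash path has to emit the
closing brace itself.

```python
import io

class GraphDump:
    """Toy model of a per-pass .dot dump that may hold several functions."""

    def __init__(self):
        self.out = io.StringIO()
        self.out.write('digraph "pass" {\n')

    def dump_function(self, name):
        self.out.write(f'  subgraph "cluster_{name}" {{ "{name}_entry"; }}\n')

    def finish(self):
        # The closing brace is normally written once, after the whole pass.
        self.out.write("}\n")

def emergency_dump(dump, current_function, finish_graph=True):
    """Crash path: only the current function is dumped and control never
    returns to the code that would normally finish the file."""
    dump.dump_function(current_function)
    if finish_graph:
        dump.finish()
    return dump.out.getvalue()

def balanced(text):
    return text.count("{") == text.count("}")

print(balanced(emergency_dump(GraphDump(), "foo", finish_graph=False)))
print(balanced(emergency_dump(GraphDump(), "foo")))
```

The first call models the old behavior and leaves the digraph
unterminated; the second closes it.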

gcc/ChangeLog:

* passes.cc (emergency_dump_function): Finish graph generation.

AArch64: Fix codegen regressions around tbz.

We were analyzing code quality after recent changes and noticed that the
tbz support somehow managed to increase the number of branches overall rather
than decrease them.

While investigating this we figured out the problem: when an & <constant>
already exists in gimple and the instruction is generated because of the range
information obtained from the ANDed constant, we end up with a NOP AND in the
RTL expansion.

This is not a problem as CSE will take care of it normally.   The issue is when
this original AND was done in a location where PRE or FRE "lift" the AND to a
different basic block.  This triggers a problem when the resulting value is not
single use.  Instead of having an AND and tbz, we end up generating an
AND + TST + BR if the mode is HI or QI.

This CSE across BB was a problem before but this change made it worse. Our
branch patterns rely on combine being able to fold AND or zero_extends into the
instructions.

To work around this (since a proper fix is outside of the scope of stage-4) we
are limiting the new tbranch optab to only HI and QI mode values.  This isn't a
problem because these two modes are modes for which we don't have CBZ support,
so they are the problematic cases to begin with.  Additionally booleans are QI.

The second thing we're doing is limiting the only legal bitpos to pos 0, i.e.
only the bottom bit.  This is so that we prevent the double ANDs as much as
possible.

Now most other cases, i.e. where we had an explicit & in the source code are
still handled correctly by the anonymous (*tb<optab><ALLI:mode><GPI:mode>1)
pattern that was added along with tbranch support.

This means we don't expand the superfluous AND here, and while it doesn't fix the
problem that in the cross BB case we lose tbz, it also doesn't make things worse.

With these tweaks we've now reduced the number of insns uniformly, as was
originally expected.

gcc/ChangeLog:

* config/aarch64/aarch64.md (tbranch_<code><mode>3): Restrict to SHORT
and bottom bit only.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/tbz_2.c: New test.
* gcc.target/aarch64/tbz_3.c: New test.

libstdc++: Implement LWG 3820/3849 changes to cartesian_product_view

The LWG 3820 testcase revealed a bug in _M_advance, which this patch
also fixes.

libstdc++-v3/ChangeLog:

* include/std/ranges
(cartesian_product_view::_Iterator::_Iterator): Remove
constraint on default constructor as per LWG 3849.
(cartesian_product_view::_Iterator::_M_prev): Adjust position
of _Nm > 0 test as per LWG 3820.
(cartesian_product_view::_Iterator::_M_advance): Perform bounds
checking only on sized cartesian products.
* testsuite/std/ranges/cartesian_product/1.cc (test08): New test.

libstdc++: Implement LWG 3796 changes to repeat_/chunk_by_view [PR109024]

PR libstdc++/109024

libstdc++-v3/ChangeLog:

* include/std/ranges (chunk_by_view::_M_pred): Remove DMI as per
LWG 3796.
(repeat_view::_M_pred): Likewise.
* testsuite/std/ranges/adaptors/chunk_by/1.cc (test03): New test.
* testsuite/std/ranges/repeat/1.cc (test05): New test.

libstdc++: Make views::single/iota/istream SFINAE-friendly [PR108362]

PR libstdc++/108362

libstdc++-v3/ChangeLog:

* include/std/ranges (__detail::__can_single_view): New concept.
(_Single::operator()): Constrain it.  Move [[nodiscard]] to the
end of the function declarator.
(__detail::__can_iota_view): New concept.
(_Iota::operator()): Constrain it.  Move [[nodiscard]] to the
end of the function declarator.
(__detail::__can_istream_view): New concept.
(_Istream::operator()): Constrain it.  Move [[nodiscard]] to the
end of the function declarator.
* testsuite/std/ranges/iota/iota_view.cc (test07): New test.
* testsuite/std/ranges/istream_view.cc (test08): New test.
* testsuite/std/ranges/single_view.cc (test07): New test.

Fix PR 108980: note without warning due to array bounds check

The problem here is that after r13-4748-g2a27ae32fabf85, in some
cases we were calling inform without a corresponding warning.
This changes the logic so that we only emit the note if a warning
was actually issued beforehand.

Changes since
* v1: Fix formatting and dump message as suggested by Jakub.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/108980
* gimple-array-bounds.cc (array_bounds_checker::check_array_ref):
Reorganize the call to warning for not strict flexible arrays
to be before the check of warned.

libstdc++: extraneous begin in cartesian_product_view::end [PR107572]

ranges::begin() isn't guaranteed to be equality-preserving for non-forward
ranges, so in cartesian_product_view::end we need to avoid needlessly
calling begin() on the first range (which could be non-forward) in the
case where __empty_tail is false as per its specification.

Since we're already using a variadic lambda to compute __empty_tail, we
might as well use that same lambda to build up the tuple of iterators
instead of building it separately via e.g. std::apply or __tuple_transform.

PR libstdc++/107572

libstdc++-v3/ChangeLog:

* include/std/ranges (cartesian_product_view::end): When
building the tuple of iterators, avoid calling ranges::begin on
the first range if __empty_tail is false.
* testsuite/std/ranges/cartesian_product/1.cc (test07): New test.

libstdc++: Really fix symver for __gnu_cxx11_ieee128::__try_use_facet [PR108882]

libstdc++-v3/ChangeLog:

PR libstdc++/108882
* config/os/gnu-linux/ldbl-ieee128-extra.ver: Fix incorrect
patterns.

c++: CTAD for less-specialized alias template [PR102529]

The standard was unclear what happens with the transformation of a deduction
guide if the initial template argument deduction fails for a reason other
than not deducing all the arguments; my implementation assumed that the
right thing was to give up on the deduction guide. But in consideration of
CWG2664 this week I realized that we get a better result by just continuing
with an empty set of deductions, so the alias deduction guide is the same as
the original deduction guide plus the deducible constraint.

DR 2664
PR c++/102529

gcc/cp/ChangeLog:

* pt.cc (alias_ctad_tweaks): Continue after deduction failure.

gcc/testsuite/ChangeLog:

* g++.dg/DRs/dr2664.C: New test.
* g++.dg/cpp2a/class-deduction-alias15.C: New test.

c++: fix alias CTAD [PR105841]

In my initial implementation of alias CTAD, I described a couple of
differences from the specification that I thought would not have a practical
effect; this testcase demonstrates that I was wrong. One difference is
resolved by the CPTK_IS_DEDUCIBLE commit; the other (adding too many of the
alias template parameters to the new deduction guide) is fixed by this
patch.

PR c++/105841

gcc/cp/ChangeLog:

* pt.cc (corresponding_template_parameter_list): Split out...
(corresponding_template_parameter): ...from here.
(find_template_parameters): Factor out...
(find_template_parameter_info::find_in): ...this function.
(find_template_parameter_info::find_in_recursive): New.
(find_template_parameter_info::found): New.
(alias_ctad_tweaks): Only add parms used in the deduced args.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/class-deduction-alias14.C: New test.

Co-authored-by: Michael Spertus <mike@spertus.com>

c++: hide __is_deducible for GCC 13

I want to have more discussion about the interface before claiming the
__is_deducible name, so for GCC 13 make it internal-only.

gcc/ChangeLog:

* doc/extend.texi: Comment out __is_deducible docs.

gcc/cp/ChangeLog:

* cp-trait.def (IS_DEDUCIBLE): Add space to name.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_deducible1.C: Guard with
__has_builtin (__is_deducible).

c++: add __is_deducible trait [PR105841]

C++20 class template argument deduction for an alias template involves
adding a constraint that the template arguments for the alias template can
be deduced from the return type of the deduction guide for the underlying
class template. In the standard, this is modeled as defining a class
template with a partial specialization, but it's much more efficient to
implement with a trait that directly tries to perform the deduction.

The first argument to the trait is a template rather than a type, so various
places needed to be adjusted to accommodate that.

PR c++/105841

gcc/ChangeLog:

* doc/extend.texi (Type Traits): Document __is_deducible.

gcc/cp/ChangeLog:

* cp-trait.def (IS_DEDUCIBLE): New.
* cxx-pretty-print.cc (pp_cxx_trait): Handle non-type.
* parser.cc (cp_parser_trait): Likewise.
* tree.cc (cp_tree_equal): Likewise.
* pt.cc (tsubst_copy_and_build): Likewise.
(type_targs_deducible_from): New.
(alias_ctad_tweaks): Use it.
* semantics.cc (trait_expr_value): Handle CPTK_IS_DEDUCIBLE.
(finish_trait_expr): Likewise.
* constraint.cc (diagnose_trait_expr): Likewise.
* cp-tree.h (type_targs_deducible_from): Declare.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_deducible1.C: New test.

Enable UTF-8 code page on Windows 64-bit host [PR108865]

Compile a resource object that contains the utf8 manifest.

Then link that object into the driver and compiler proper.

For compiler proper the link has to be forced because the
resource object file gets into a static library (libbackend.a)
and gets eventually dropped because it has no symbols of
its own and nothing is referencing it inside the library.

Therefore, an artificial symbol is planted to force the link.

gcc/ChangeLog:

PR driver/108865
* config.host: add object for x86_64-*-mingw*.
* config/i386/sym-mingw32.cc: dummy file to attach
symbol.
* config/i386/utf8-mingw32.rc: windres resource file.
* config/i386/winnt-utf8.manifest: XML manifest to
enable UTF-8.
* config/i386/x-mingw32: reference to x-mingw32-utf8.
* config/i386/x-mingw32-utf8: Makefile fragment to
embed UTF-8 manifest.

Signed-off-by: Jonathan Yong <10walls@gmail.com>

LRA: For clobbered regs use operand mode instead of the biggest mode

LRA is too conservative in calculation of conflicts with clobbered regs by
using the biggest access mode.  This results in failure of possible reg
coalescing and worse code.  This patch solves the problem.

PR rtl-optimization/108999

gcc/ChangeLog:

* lra-constraints.cc (process_alt_operands): Use operand modes for
clobbered regs instead of the biggest access mode.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr108999.c: New.

middle-end/108995 - avoid folding when sanitizing overflow

The following plugs one place in extract_muldiv where it should avoid
folding when sanitizing overflow.
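
Why such a fold matters under an overflow sanitizer can be shown with a
small Python model.  It assumes a 32-bit int and assumes the fold in
question turns (CST * b) / CST2 into b * (CST / CST2) when CST is a
multiple of CST2; checked_mul is an invented stand-in for a sanitized
multiplication, and floor division is used because only non-negative
values appear.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def checked_mul(a, b):
    """Multiply like a 32-bit signed int under an overflow sanitizer."""
    result = a * b
    if not INT_MIN <= result <= INT_MAX:
        raise OverflowError(f"signed integer overflow: {a} * {b}")
    return result

def original(b):
    # (4 * b) / 2, evaluated as written.
    return checked_mul(4, b) // 2

def folded(b):
    # b * (4 / 2): valid only if 4 * b is assumed never to overflow.
    return checked_mul(b, 4 // 2)

b = 2**29
print(folded(b))            # no overflow is ever observed
try:
    original(b)
except OverflowError as err:
    print("reported:", err)  # the overflow the sanitizer should see
```

With the fold applied, the overflowing multiplication disappears and the
sanitizer has nothing left to report, which is why the fold has to be
skipped when sanitizing.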

PR middle-end/108995
* fold-const.cc (extract_muldiv_1): Avoid folding
(CST * b) / CST2 when sanitizing overflow and we rely on
overflow being undefined.

* gcc.dg/ubsan/pr108995.c: New testcase.

range-op-float: Fix up reverse binary operations [PR109008]

The following testcase is reduced from miscompilation of scipy package.
If we have say lhs = [1., 1.] - [1., 1.] and want to compute the range
of lhs from it, we correctly determine it is [0., 0.] (if computations
are exact, we generally don't try to round them further in
frange_arithmetic).  In the testcase it is about a reverse operation,
[1., 1.] = op1 + [1., 1.] and we want to compute range of op1 from that.
Right now we just perform the inverse operation (there are some corner
cases about NaN and infinities handling) and so arrive at the range
[0., 0.] as well, and because it is a singleton, optimize return eps;
to return 0.  That is incorrect though; for the reverse ops we need to
take into account also rounding, the right exact range is
[-0x1.0p-54, 0x1.0p-53] in this case when rounding to nearest, i.e.
all numbers which added to 1. with round to nearest still produce 1.
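
The boundaries of that range can be checked with a small sketch.  It
assumes IEEE double and the default round-to-nearest mode; the helper
name is hypothetical and the volatile temporaries are only there to keep
the compiler from folding the addition or using excess precision.

```c
#include <stdbool.h>

/* Returns true if x + 1.0 evaluates to exactly 1.0 in double.  */
static bool
adds_to_one (double x)
{
  volatile double op1 = x;
  volatile double sum = op1 + 1.0;
  return sum == 1.0;
}
```

Under those assumptions adds_to_one is true for both -0x1.0p-54 and
0x1.0p-53 (ties round to the even mantissa of 1.0) and false for the next
representable values beyond either end, so op1 is clearly not limited
to [0., 0.].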

The problem isn't solely on singleton ranges, and isn't solely on
results around zero.  We basically need to consider also values
where the result is up to 0.5ulp away from the lhs range boundaries
in each direction.

The following patch fixes it by extending the lhs range for the
reverse operations by 1ulp in each direction.  The PR contains
a pseudo-random test generator I've used to generate 300000 tests
of + and - and then used the same test with * and / instead of + and -
together with a hack to print the discovered ranges by the patch in
a form that another test could then verify the range is conservatively
correct and how far it is from a minimal range.

I believe the results are good enough for now, though I plan to look
incrementally into trying to do something better on the -XXX_MAX or
XXX_MAX boundaries (where I think frange_nextafter will use -inf or +inf)
and also try to increase the range just by 0.5ulp rather than 1ulp
if !flag_rounding_math.  But dunno if either of those will be doable
and will pass the testing, so I think it is worth committing this fix
first.

2023-03-09  Jakub Jelinek  <jakub@redhat.com>
    Richard Biener  <rguenther@suse.de>

PR tree-optimization/109008
* range-op-float.cc (float_widen_lhs_range): New function.
(foperator_plus::op1_range, foperator_minus::op1_range,
foperator_minus::op2_range, foperator_mult::op1_range,
foperator_div::op1_range, foperator_div::op2_range): Use it.

* gcc.c-torture/execute/ieee/pr109008.c: New test.

libgomp: Fix default value of GOMP_SPINCOUNT [PR 109062]

When OMP_WAIT_POLICY is not specified, the current implementation leaves
the ICV flag GOMP_ICV_WAIT_POLICY unset, so the global variable wait_policy
keeps its uninitialized value. Initialize it to -1 to make the
GOMP_SPINCOUNT behavior consistent with its description.
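
A minimal sketch of the tri-state idea, not the libgomp code itself: the
function name is hypothetical, and the default spin counts are the ones I
recall from the GOMP_SPINCOUNT description in the libgomp manual, so
treat them as an assumption and check them against your version.

```c
/* -1: OMP_WAIT_POLICY not specified, 0: PASSIVE, 1: ACTIVE.
   Initializing to -1 keeps "not specified" distinct from PASSIVE.  */
static int wait_policy = -1;

/* Default spin count when GOMP_SPINCOUNT itself is not set.
   The values are assumed from the manual, not taken from env.c.  */
static unsigned long long
default_spin_count (int policy)
{
  if (policy > 0)
    return 30000000000ULL;	/* ACTIVE: 30 billion spins.  */
  if (policy == 0)
    return 0;			/* PASSIVE: no active spinning.  */
  return 300000ULL;		/* Not specified.  */
}
```

In this sketch a zero-valued wait_policy would be indistinguishable from
PASSIVE and select 0 spins, which is why the sentinel has to be -1.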

libgomp/ChangeLog:

PR libgomp/109062
* env.c (wait_policy): Initialize to -1.
(initialize_icvs): Initialize icvs->wait_policy to -1.
* testsuite/libgomp.c-c++-common/pr109062.c: New test.

Daily bump.

libgomp.texi: Mention GCN_STACK_SIZE in Offload-Target Specifics

libgomp/ChangeLog:

* libgomp.texi (Offload-Target Specifics): Mention GCN_STACK_SIZE.

libgcc, rs6000: Fix bump size for powerpc64 elfv1 ABI [PR108727]

As PR108727 shows, when cleanup code called by the stack
unwinder calls function _Unwind_Resume, it goes via plt
stub like:

   function 00000000.plt_call._Unwind_Resume:

=> 0x0000000010003580 <+0>:     std     r2,40(r1)
   0x0000000010003584 <+4>:     ld      r12,-31760(r2)
   0x0000000010003588 <+8>:     mtctr   r12
   0x000000001000358c <+12>:    ld      r2,-31752(r2)
   0x0000000010003590 <+16>:    cmpldi  r2,0
   0x0000000010003594 <+20>:    bnectr+
   0x0000000010003598 <+24>:    b       0x100031a4
                                        <_Unwind_Resume@plt>

It wants to save TOC base (r2) to r1 + 40, but we only
bump the stack segment by 32 bytes as follows:

   stdu %r29,-32(%r3)

This means the access is outside the stack segment allocated
by __generic_morestack.  If the touched area isn't writable,
as in this failure, it causes a segmentation fault.

So fix the bump size by using a reasonable value, PARAMS.

PR libgcc/108727

libgcc/ChangeLog:

* config/rs6000/morestack.S (__morestack): Use PARAMS for new stack
bump size.

testsuite: Adjust powerpc ppc-fortran.exp to support dg-{warning,error}

According to Haochen's finding in [1], ppc-fortran.exp currently
doesn't support Fortran-specific warning or error messages well.
Looking into it, this is because gfortran uses different
warning/error prefixes, as follows:

    set gcc_warning_prefix "\[Ww\]arning:"
    set gcc_error_prefix "(Fatal )?\[Ee\]rror:"

compared to:

    set gcc_warning_prefix "warning:"
    set gcc_error_prefix "(fatal )?error:"

So override these two prefixes so that ppc-fortran.exp supports
dg-{warning,error} checks.
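
The effect of the two prefix sets can be sketched with POSIX extended
regexes (the testsuite itself uses Tcl regexps, so this is only an
approximation).  The helper name and the diagnostic lines in the note
below are made up for illustration.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the POSIX extended regex PATTERN matches anywhere
   in LINE; a pattern that fails to compile counts as no match.  */
static bool
prefix_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool ok = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return ok;
}
```

For a gfortran-style line such as "test.f90:3:10: Warning: Unused
variable", the pattern "[Ww]arning:" matches while the default
"warning:" does not, which is why the directives did not work before the
override.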

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613302.html

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/ppc-fortran/ppc-fortran.exp: Override
gcc_{warning,error}_prefix with Fortran specific one used in
gfortran_init.

testsuite: Adjust scalar-test-data-class-1[45].c with int128

Test cases scalar-test-data-class-1[45].c use type __int128,
which requires the int128 effective target; otherwise they
fail at -m32. This patch adds the int128 effective target
requirement.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/bfp/scalar-test-data-class-14.c: Adjust with
int128 effective target requirement.
* gcc.target/powerpc/bfp/scalar-test-data-class-15.c: Likewise.

testsuite: Adjust two bfp test cases with has_arch_ppc64 [PR108729]

Two test cases, scalar-test-data-class-12.c and vec-test-data-class-9.c,
fail on Power9 BE testing at -m32 because they use the built-in function
scalar_insert_exp, which requires powerpc64 support. This patch
makes them check the has_arch_ppc64 effective target.

PR testsuite/108729

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/bfp/scalar-test-data-class-12.c: Adjust with
has_arch_ppc64 effective target.
* gcc.target/powerpc/bfp/vec-test-data-class-9.c: Likewise.

testsuite: Adjust scalar-test-neg-8.c with lp64 [PR108730]

The built-in function scalar_test_neg_qp is under stanza
ieee128-hw, that is TARGET_FLOAT128_HW.  We don't have
float128 hardware support on 32-bit, as the following shows:

if (TARGET_FLOAT128_HW && !TARGET_64BIT)
  {
    if ((rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_HW) != 0)
      error ("%qs requires %qs", "%<-mfloat128-hardware%>", "-m64");
    rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
  }

So adjust the test case with the lp64 effective target accordingly.

PR testsuite/108730

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/bfp/scalar-test-neg-8.c: Adjust with lp64
effective target requirement.

testsuite: Adjust pr101384-2.c for Power9 [PR108813]

Compiled with cpu type Power9 or later, GCC generates
xxspltib rather than vspltis*, so adjust the test
case scanning content accordingly.

PR testsuite/108813

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr101384-2.c: Adjust with xxspltib.

testsuite: Adjust fold-vec-extract-double.p9.c for powerpc BE [PR108810]

On BE, the extracted index for the leftmost element is 0
rather than 1, so adjust the test case accordingly.

PR testsuite/108810

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/fold-vec-extract-double.p9.c (testd_cst): Adjust
the extracted index for BE.

Fix MIPS testsuite over-eager matching

The mips msa-ds.c test is trying to ensure that MSA branches can have their
delay slots filled.  The regexp it used looked for the function name, a nop,
then the function name again.  If it found that sequence, the test failed.

The problem is that with Vlad's recent IRA work there's simply less code in the
test (good) and as a result one of the *other* branches in the test had an
unfilled delay slot -- the delay slot for the MSA branch was still being
filled.

This patch tightens up the regexp.  In particular it looks for the MSA branch
and a nop on the next line (avoiding the over-eager .* construct).  That
indicates that the MSA branch did not have its delay slot filled.  When that
sequence is found, then the test fails.
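
The difference between the two styles of pattern can be sketched with
POSIX regexes.  Everything here is illustrative: the patterns are
simplified stand-ins rather than the ones in msa-ds.c, and the assembly
snippets are made up.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the POSIX extended regex PATTERN matches in TEXT.
   Without REG_NEWLINE, "." also matches a newline, so ".*" can run
   across several lines of assembly.  */
static bool
asm_matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool ok = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return ok;
}

/* Over-eager: an MSA branch followed by a nop anywhere later.  */
static const char loose_pattern[] = "bnz\\.v.*nop";
/* Tightened: the nop must be on the line right after the MSA branch.  */
static const char tight_pattern[] = "bnz\\.v[^\n]*\n[ \t]*nop";

/* MSA branch with a filled slot; another branch has a nop.  */
static const char msa_slot_filled[]
  = "\tbnz.v\t$w0,$L2\n\taddiu\t$2,$2,1\n\tbeq\t$4,$0,$L3\n\tnop\n";
/* MSA branch whose delay slot was not filled.  */
static const char msa_slot_unfilled[] = "\tbnz.v\t$w0,$L2\n\tnop\n";
```

The loose pattern matches msa_slot_filled even though the nop belongs to
the other branch, whereas the tight pattern only matches
msa_slot_unfilled.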

This fixes the recent regressions for mips64 and mips64el in the tester.

Installing on the trunk.

gcc/testsuite:
* gcc.target/mips/msa-ds.c: Fix over eager pattern matching.

testsuite: Fix omp-parallel-for-get-min.c and -for-1.c for non-openmp

The recently added tests missed checking for "fopenmp" (see
other tests where "-fopenmp" is passed), which makes them
fail on non-openmp systems.

* gcc.dg/analyzer/omp-parallel-for-get-min.c,
gcc.dg/analyzer/omp-parallel-for-1.c: Require effective target fopenmp.

Daily bump.

docs: Clarify LeakSanitizer in documentation [PR81649]

gcc/ChangeLog

PR sanitizer/81649
* doc/invoke.texi (Instrumentation Options): Clarify
LeakSanitizer behavior.

docs: Add link to gmplib.org.

gcc/ChangeLog
* doc/install.texi (Prerequisites): Add link to gmplib.org.

c++: static lambda tsubst [PR108526]

A missed piece of the patch for static operator(): in tsubst_function_decl,
we don't want to replace the first parameter with a new closure pointer if
operator() is static.

PR c++/108526
PR c++/106651

gcc/cp/ChangeLog:

* pt.cc (tsubst_function_decl): Don't replace the closure
parameter if DECL_STATIC_FUNCTION_P.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/static-operator-call5.C: Pass -g.

libstdc++: Some baseline_symbols.txt updates

This updates baseline_symbols.txt for the Fedora 39 arches.
Most of the added symbols are added to all 5 files, exceptions are
DF16_ rtti stuff (only added on x86 and aarch64 which supports those),
DF16b rtti stuff (only x86 right now), _M_replace_cold (m vs. j
differences), DF128_ charconv (only x86), GLIBCXX_LDBL_3.4.31
symver (s390x), _M_get_sys_info/_M_get_local_info (l vs. x).
I was using
grep ^+ | sed 's/OBJECT:[0-9]*:/OBJECT:/' | sort | uniq -c | sort -n | less
on the patch to analyze.
powerpc64le-linux not included because I'll need to regenerate it.

2023-03-07 Jakub Jelinek <jakub@redhat.com>

* config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/x86_64-linux-gnu/32/baseline_symbols.txt: Update.
* config/abi/post/i486-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Update.

libstdc++: Fix symver for __gnu_cxx11_ieee128::__try_use_facet [PR108882]

libstdc++-v3/ChangeLog:

PR libstdc++/108882
* config/abi/pre/gnu.ver (GLIBCXX_3.4.31): Adjust patterns to
not match symbols in namespace std::__gnu_cxx11_ieee128.
* config/os/gnu-linux/ldbl-ieee128-extra.ver: Add patterns for
std::__gnu_cxx11_ieee128::money_{get,put}.

libstdc++: Fix comment typo in eh_personality.cc

libstdc++-v3/ChangeLog:

* libsupc++/eh_personality.cc: Fix spelling in comment.

c++: -Wdangling-reference with reference wrapper [PR107532]

Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief.  The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:

  const Plane & meta = fm.planes().inner();

I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional<std::string>.

This patch adjusts do_warn_dangling_reference so that we look through
reference wrapper classes (meaning, has a reference member and a
constructor taking the same reference type, or is std::reference_wrapper
or std::ranges::ref_view) and don't warn for them, supposing that the
member function returns a reference to a non-temporary object.

PR c++/107532

gcc/cp/ChangeLog:

* call.cc (reference_like_class_p): New.
(do_warn_dangling_reference): Add new bool parameter.  See through
reference_like_class_p.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference8.C: New test.
* g++.dg/warn/Wdangling-reference9.C: New test.

testsuite: Fix another syntax problem in slp-3.c

This fixes another syntax error in slp-3.c. I missed a '{ ... }' in
order to properly exclude s390_vx.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/slp-3.c: Add '{ ... }'.

c++: Fix up ICE in emit_support_tinfo_1 [PR109042]

In my recent rtti.cc change I assumed when emitting the support tinfos
that the tinfos for the fundamental types haven't been created yet.
Normally (in libsupc++.a (fundamental_type_info.o)) that is the case,
but as can be seen on the testcase, one can violate it by using typeid
etc. in the same TU and do it before ~__fundamental_type_info ()
definition.

The following patch fixes that by popping from unemitted_tinfo_decls
only in the normal case when it is there, and treating non-NULL
DECL_INITIAL on a tinfo node as indication that emit_tinfo_decl has
processed it already.

2023-03-07 Jakub Jelinek <jakub@redhat.com>

PR c++/109042
* rtti.cc (emit_support_tinfo_1): Don't assert that last
unemitted_tinfo_decls element is tinfo, instead pop from it only in
that case.
* decl2.cc (c_parse_final_cleanups): Don't call emit_tinfo_decl
for unemitted_tinfo_decls which have already non-NULL DECL_INITIAL.

* g++.dg/rtti/pr109042.C: New test.

c++: noexcept and copy elision [PR109030]

When processing a noexcept, constructors aren't elided: build_over_call
has
    /* It's unsafe to elide the constructor when handling
       a noexcept-expression, it may evaluate to the wrong
       value (c++/53025).  */
    && (force_elide || cp_noexcept_operand == 0))
so the assert I added recently needs to be relaxed a little bit.

PR c++/109030

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Relax assert.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept77.C: New test.

c++: error with constexpr operator() [PR107939]

Similarly to PR107938, this also started with r11-557, whereby cp_finish_decl
can call check_initializer even in a template for a constexpr initializer.

Here we are rejecting

  extern const Q q;

  template<int>
  constexpr auto p = q(0);

even though q has a constexpr operator().  It's deemed non-const by
decl_maybe_constant_var_p because even though 'q' is const it is not
of integral/enum type.

If fun is not a function pointer, we don't know if we're using it as an
lvalue or rvalue, so with this patch we pass 'any' for want_rval.  With
that, p_c_e/VAR_DECL doesn't flat out reject the underlying VAR_DECL.

PR c++/107939

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1) <case CALL_EXPR>: Pass
'any' when recursing on a VAR_DECL and not a pointer to function.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/var-templ74.C: Remove dg-error.
* g++.dg/cpp1y/var-templ77.C: New test.

RISC-V: Bugfix for rvv bool mode precision adjustment

Fix the precision of the rvv bool modes by adjusting it.
The bit size of vbool*_t will be adjusted to
[1, 2, 4, 8, 16, 32, 64] according to the rvv spec 1.0 isa. The
adjusted mode precision of vbool*_t will help the underlying passes
make the right decisions for both correctness and optimization.
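
As a rough sketch of where those bit sizes come from: a vboolN_t holds one
mask bit per element, i.e. VLEN / N bits.  The function name is
hypothetical, and the VLEN of 64 bits used in the note below is only an
assumption chosen because it reproduces the list above; the actual
computation lives in riscv_v_adjust_precision and is not reproduced here.

```c
/* Number of mask bits held by vboolN_t for a vector register length
   of VLEN_BITS: one bit per element, VLEN_BITS / N of them.  */
static unsigned
vbool_precision_bits (unsigned vlen_bits, unsigned n)
{
  return vlen_bits / n;
}
```

With vlen_bits = 64, vbool64_t down to vbool1_t give 1, 2, 4, 8, 16, 32
and 64 bits, so vbool8_t and vbool16_t no longer look like modes of the
same precision.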

Given the sample code below:

void test_1(int8_t * restrict in, int8_t * restrict out)
{
  vbool8_t v2 = *(vbool8_t*)in;
  vbool16_t v5 = *(vbool16_t*)in;
  *(vbool16_t*)(out + 200) = v5;
  *(vbool8_t*)(out + 100) = v2;
}

Before the precision adjustment:

addi    a4,a1,100
vsetvli a5,zero,e8,m1,ta,ma
addi    a1,a1,200
vlm.v   v24,0(a0)
vsm.v   v24,0(a4)
// Need one vsetvli and vlm.v for correctness here.
vsm.v   v24,0(a1)

After the precision adjustment:

csrr    t0,vlenb
slli    t1,t0,1
csrr    a3,vlenb
sub     sp,sp,t1
slli    a4,a3,1
add     a4,a4,sp
sub     a3,a4,a3
vsetvli a5,zero,e8,m1,ta,ma
addi    a2,a1,200
vlm.v   v24,0(a0)
vsm.v   v24,0(a3)
addi    a1,a1,100
vsetvli a4,zero,e8,mf2,ta,ma
csrr    t0,vlenb
vlm.v   v25,0(a3)
vsm.v   v25,0(a2)
slli    t1,t0,1
vsetvli a5,zero,e8,m1,ta,ma
vsm.v   v24,0(a1)
add     sp,sp,t1
jr      ra

However, there may be some optimization opportunities after
the mode precision adjustment. They can be taken care of in
the RISC-V backend in separate PR(s).

gcc/ChangeLog:

PR target/108185
PR target/108654
* config/riscv/riscv-modes.def (ADJUST_PRECISION): Adjust VNx*BI
modes.
* config/riscv/riscv.cc (riscv_v_adjust_precision): New.
* config/riscv/riscv.h (riscv_v_adjust_precision): New.
* genmodes.cc (adj_precision): New.
(ADJUST_PRECISION): New.
(emit_mode_adjustments): Handle ADJUST_PRECISION.

gcc/testsuite/ChangeLog:

PR target/108185
PR target/108654
* gcc.target/riscv/rvv/base/pr108185-1.c: New test.
* gcc.target/riscv/rvv/base/pr108185-2.c: New test.
* gcc.target/riscv/rvv/base/pr108185-3.c: New test.
* gcc.target/riscv/rvv/base/pr108185-4.c: New test.
* gcc.target/riscv/rvv/base/pr108185-5.c: New test.
* gcc.target/riscv/rvv/base/pr108185-6.c: New test.
* gcc.target/riscv/rvv/base/pr108185-7.c: New test.
* gcc.target/riscv/rvv/base/pr108185-8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>

aarch64: testsuite: disable stack protector for tests relying on stack offset

Stack protector needs a guard value on the stack and changes the stack
layout. So we need to disable it for those tests, to avoid test failure
with --enable-default-ssp.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/shrink_wrap_1.c (dg-options): Add
-fno-stack-protector.
* gcc.target/aarch64/stack-check-cfa-1.c (dg-options): Add
-fno-stack-protector.
* gcc.target/aarch64/stack-check-cfa-2.c (dg-options): Add
-fno-stack-protector.
* gcc.target/aarch64/test_frame_17.c (dg-options): Add
-fno-stack-protector.

aarch64: testsuite: disable stack protector for pr104005.c

Storing the stack guard variable needs one stp instruction, breaking the
scan-assembler-not pattern in the test. Disable stack protector to
avoid a test failure with --enable-default-ssp.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr104005.c (dg-options): Add
-fno-stack-protector.

aarch64: testsuite: disable stack protector for auto-init-7.c

The test scans for "const_int 0" in the RTL dump, but stack protector
can produce more "const_int 0". To avoid a failure with
--enable-default-ssp, disable stack protector for this.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/auto-init-7.c (dg-options): Add
-fno-stack-protector.

aarch64: testsuite: disable stack protector for pr103147-10 tests

Stack protector influences code generation and causes function body checks
to fail.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr103147-10.c (dg-options): Add
-fno-stack-protector.
* g++.target/aarch64/pr103147-10.C: Likewise.

aarch64: testsuite: disable stack protector for sve-pcs tests

If GCC is configured with --enable-default-ssp, the stack protector can
make many sve-pcs tests fail.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp (sve_flags):
Add -fno-stack-protector.

aarch64: testsuite: disable PIE for fuse_adrp_add_1.c [PR70150]

In PIE, symbol "fixed_regs" is addressed via GOT. It will break the
scan-assembler pattern and cause test failure with --enable-default-pie.

gcc/testsuite/ChangeLog:

PR testsuite/70150
* gcc.target/aarch64/fuse_adrp_add_1.c (dg-options): Add
-fno-pie.

aarch64: testsuite: disable PIE for tests with large code model [PR70150]

These tests set large code model with -mcmodel=large or target pragma for
AArch64. But if GCC is configured with --enable-default-pie, it triggers
"sorry: unimplemented: code model large with -fpic". Disable PIE to
avoid the issue.

gcc/testsuite/ChangeLog:

PR testsuite/70150
* gcc.dg/tls/pr78796.c (dg-additional-options): Add -fno-pie
-no-pie for aarch64-*-*.
* gcc.target/aarch64/pr63304_1.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr70120-2.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr78733.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr79041-2.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr94530.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr94577.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/reload-valid-spoff.c (dg-options): Add
-fno-pie.

aarch64: testsuite: disable PIE for aapcs64 tests [PR70150]

If GCC is built with --enable-default-pie, a lot of aapcs64 tests fail
because relocations unsupported in PIE are used.

gcc/testsuite/ChangeLog:

PR testsuite/70150
* gcc.target/aarch64/aapcs64/aapcs64.exp (additional_flags):
Add -fno-pie -no-pie.

testsuite: Support scanning tree-dumps

No planned usage.

* lib/target-supports.exp (check_compile): Support scanning tree-dumps.

testsuite: Gate gcc.dg/plugin/must-tail-call-1.c and -2.c on tail_call

While gcc.dg/plugin/must-tail-call-2.c passes for all targets even
without this, the error message is, for a target like cris-elf that
doesn't implement sibling calls: "error: cannot tail-call: machine
description does not have a sibcall_epilogue instruction pattern"
rather than "error: cannot tail-call: callee returns a structure".
Also, it'd be confusing to exclude must-tail-call-1.c but not
must-tail-call-2.c.

* gcc.dg/plugin/must-tail-call-1.c, gcc.dg/plugin/must-tail-call-2.c:
Gate on effective target tail_call.

doc: Document testsuite check_effective_target_tail_call

Spot-checked the PDF output for sanity.

* doc/sourcebuild.texi: Document check_effective_target_tail_call.

testsuite: Add tail_call effective target

The RTL "expand" dump is the first RTL dump, and it also appears to be
the earliest trace of the target having implemented sibcalls.
Including the "," in the pattern searched for, to try and avoid
possible false matches, but there doesn't appear to be any identifiers
or target names nearby so this is just belts and suspenders. Using
"tail_call" as a shorter and more commonly used term than a derivative
of "sibling calls", and expecting only gcc folks to have heard of
"sibcalls".

* lib/target-supports.exp (check_effective_target_tail_call): New.

Daily bump.

testsuite: Fix gcc.dg/analyzer/allocation-size-multiline-3.c

For 32-bit newlib targets (such as cris-elf and pru-elf),
int32_t is "long int". See other regexps in the
testsuite matching "aka (long )?int" (with single-quotes
where needed) where the pattern in
allocation-size-multiline-3.c matches plain "int". Uses the
special syntax recently introduced for multi-line patterns.

testsuite:
* gcc.dg/analyzer/allocation-size-multiline-3.c: Handle
int32_t being "long int".

testsuite: Provide means to regexp in multiline patterns

Those multi-line-patterns are literal.  Sometimes a regexp
needs to be matched.  This is a start: just three elements
are supported: "(" ")" and the compound ")?" (and on second
thought, it can be argued that "(...)" alone is not useful).
Note that Tcl "string map" is documented to have the desired
effect: a once-over but no re-recognitions of previously
replaced mapped elements.  Also, drop a doubled "containing".

testsuite:
* lib/multiline.exp (_build_multiline_regex): Map
"{re:" to "(", similarly ")?" from ":re?}" and the
same without question mark.
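
The mapping can be sketched in C as a single left-to-right pass, which
mirrors the "once-over, no re-recognition" behavior described for Tcl's
"string map".  The function name is hypothetical, and the sketch covers
only the three markers; anything else _build_multiline_regex does to the
literal text is left out.

```c
#include <stddef.h>
#include <string.h>

/* Copy IN to OUT, replacing "{re:" with "(", ":re?}" with ")?" and
   ":re}" with ")".  Replaced text is never rescanned.  Returns 0 on
   success, -1 if OUT (of size OUTSZ) is too small.  */
static int
map_re_markers (const char *in, char *out, size_t outsz)
{
  static const struct { const char *from, *to; } map[] = {
    { "{re:", "(" },
    { ":re?}", ")?" },
    { ":re}", ")" },
  };
  size_t n = sizeof map / sizeof map[0];
  size_t o = 0;

  while (*in)
    {
      size_t k;
      /* Find the first marker that starts at the current position.  */
      for (k = 0; k < n; k++)
	if (strncmp (in, map[k].from, strlen (map[k].from)) == 0)
	  break;
      const char *src = k < n ? map[k].to : in;
      size_t len = k < n ? strlen (map[k].to) : 1;
      if (o + len >= outsz)
	return -1;
      memcpy (out + o, src, len);
      o += len;
      in += k < n ? strlen (map[k].from) : 1;
    }
  if (outsz == 0)
    return -1;
  out[o] = '\0';
  return 0;
}
```

For example, the input "aka {re:long :re?}int" becomes the regexp
fragment "aka (long )?int".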

Update gcc fr.po, sv.po

* fr.po, sv.po: Update.

PR target/107299: Fix build issue when long double is IEEE 128-bit

This patch updates the IEEE 128-bit types used in libgcc.

At the moment, we cannot build GCC when the target uses IEEE 128-bit long
doubles, such as building the compiler for a native Fedora 36 system.  The
build dies when it is trying to build the _mulkc3.c and _divkc3.c modules.

This patch changes libgcc to use long double for the IEEE 128-bit base type if
long double is IEEE 128-bit, and it uses _Float128 otherwise.  The built-in
functions are adjusted to be the correct version based on the IEEE 128-bit base
type used.

While it is desirable to ultimately have __float128 and _Float128 use the same
internal type and mode within GCC, at present if you use the option
-mabi=ieeelongdouble, the __float128 type will use the long double type and not
the _Float128 type.  We get an internal compiler error if we combine the
signbitf128 built-in with a long double type.

I've gone through several iterations of trying to fix this within GCC, and
there are various problems that have come up.  I developed this alternative
patch that changes libgcc so that it does not tickle the issue.  I hope we can
fix the compiler at some point, but right now, this is preventing people on
Fedora 36 systems from building compilers where the default long double is IEEE
128-bit.

2023-03-06   Michael Meissner  <meissner@linux.ibm.com>

libgcc/

PR target/107299
* config/rs6000/_divkc3.c (COPYSIGN): Use the correct built-in based on
whether long double is IBM or IEEE.
(INFINITY): Likewise.
(FABS): Likewise.
* config/rs6000/_mulkc3.c (COPYSIGN): Likewise.
(INFINITY): Likewise.
* config/rs6000/quad-float128.h (TF): Remove definition.
(TFtype): Define to be long double or _Float128.
(TCtype): Define to be _Complex long double or _Complex _Float128.
* libgcc2.h (TFtype): Allow machine config files to override this.
(TCtype): Likewise.
* soft-fp/quad.h (TFtype): Likewise.

amdgcn: Add instruction patterns for conditional min/max operations

gcc/ChangeLog:

* config/gcn/gcn-valu.md (<expander><mode>3_exec): Add patterns for
{s|u}{max|min} in QI, HI and DI modes.
(<expander><mode>3): Add pattern for {s|u}{max|min} in DI mode.
(cond_<fexpander><mode>): Add pattern for cond_f{max|min}.
(cond_<expander><mode>): Add pattern for cond_{s|u}{max|min}.
* config/gcn/gcn.cc (gcn_spill_class): Allow the exec register to be
saved in SGPRs.

gcc/testsuite/ChangeLog:

* gcc.target/gcn/cond_fmaxnm_1.c: New test.
* gcc.target/gcn/cond_fmaxnm_1_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_2.c: New test.
* gcc.target/gcn/cond_fmaxnm_2_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_3.c: New test.
* gcc.target/gcn/cond_fmaxnm_3_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_4.c: New test.
* gcc.target/gcn/cond_fmaxnm_4_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_5.c: New test.
* gcc.target/gcn/cond_fmaxnm_5_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_6.c: New test.
* gcc.target/gcn/cond_fmaxnm_6_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_7.c: New test.
* gcc.target/gcn/cond_fmaxnm_7_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_8.c: New test.
* gcc.target/gcn/cond_fmaxnm_8_run.c: New test.
* gcc.target/gcn/cond_fminnm_1.c: New test.
* gcc.target/gcn/cond_fminnm_1_run.c: New test.
* gcc.target/gcn/cond_fminnm_2.c: New test.
* gcc.target/gcn/cond_fminnm_2_run.c: New test.
* gcc.target/gcn/cond_fminnm_3.c: New test.
* gcc.target/gcn/cond_fminnm_3_run.c: New test.
* gcc.target/gcn/cond_fminnm_4.c: New test.
* gcc.target/gcn/cond_fminnm_4_run.c: New test.
* gcc.target/gcn/cond_fminnm_5.c: New test.
* gcc.target/gcn/cond_fminnm_5_run.c: New test.
* gcc.target/gcn/cond_fminnm_6.c: New test.
* gcc.target/gcn/cond_fminnm_6_run.c: New test.
* gcc.target/gcn/cond_fminnm_7.c: New test.
* gcc.target/gcn/cond_fminnm_7_run.c: New test.
* gcc.target/gcn/cond_fminnm_8.c: New test.
* gcc.target/gcn/cond_fminnm_8_run.c: New test.
* gcc.target/gcn/cond_smax_1.c: New test.
* gcc.target/gcn/cond_smax_1_run.c: New test.
* gcc.target/gcn/cond_smin_1.c: New test.
* gcc.target/gcn/cond_smin_1_run.c: New test.
* gcc.target/gcn/cond_umax_1.c: New test.
* gcc.target/gcn/cond_umax_1_run.c: New test.
* gcc.target/gcn/cond_umin_1.c: New test.
* gcc.target/gcn/cond_umin_1_run.c: New test.
* gcc.target/gcn/smax_1.c: New test.
* gcc.target/gcn/smax_1_run.c: New test.
* gcc.target/gcn/smin_1.c: New test.
* gcc.target/gcn/smin_1_run.c: New test.
* gcc.target/gcn/umax_1.c: New test.
* gcc.target/gcn/umax_1_run.c: New test.
* gcc.target/gcn/umin_1.c: New test.
* gcc.target/gcn/umin_1_run.c: New test.

Fix assertion failure on VSS library

gcc/ada/
PR ada/108858
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): For functions with
separate spec, if their return type was visible through a limited-
with context clause, their extra formals were not added when the
spec was analyzed. Now the full view must be available, and the
extra formals can be created and Returns_By_Ref computed.

Revert "Respect GNATMAKE Makefile variable" commit

It breaks cross native builds.

gcc/ada/
PR ada/108909
PR ada/108983
* Make-generated.in: Do not use GNATMAKE.
* gcc-interface/Makefile.in: Ditto.

tree-optimization/109025 - fixup double reduction detection

The following closes a gap in double reduction detection where we
in the outer loop analysis fail to verify the inner LC PHI use is
the latch definition of the inner loop PHI. That latch definition
is used to detect that an inner loop is part of a double reduction
when later doing the inner loop analysis.
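
For readers unfamiliar with the term, the general shape of a double
reduction is sketched below.  This is not the PR testcase; the function
name and array sizes are made up.

```c
#define ROWS 4
#define COLS 8

/* "sum" is a reduction of the inner loop, and the value leaving the
   inner loop is carried into the next outer iteration, so the outer
   loop PHI for "sum" forms a double reduction.  */
static int
sum_2d (int a[ROWS][COLS])
{
  int sum = 0;
  for (int i = 0; i < ROWS; i++)
    for (int j = 0; j < COLS; j++)
      sum += a[i][j];
  return sum;
}
```

Here the value of sum that leaves the inner loop is exactly the one
defined on the inner loop's latch, which is the property the fix now
verifies before classifying the outer PHI as a double reduction.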

PR tree-optimization/109025
* tree-vect-loop.cc (vect_is_simple_reduction): Verify
the inner LC PHI use is the inner loop PHI latch definition
before classifying an outer PHI as double reduction.

* gcc.dg/vect/pr109025.c: New testcase.

Enable scatter for generic

2023-03-06 Jan Hubicka <hubicka@ucw.cz>

PR target/108429
* config/i386/x86-tune.def (X86_TUNE_USE_SCATTER_2PARTS): Enable for
generic.
(X86_TUNE_USE_SCATTER_4PARTS): Likewise.
(X86_TUNE_USE_SCATTER): Likewise.

LoongArch: testsuite: Disable stack protector for some tests

Stack protector will affect stack layout and break the expectation of
these tests, causing test failures if GCC is configured with
--enable-default-ssp.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/prolog-opt.c (dg-options): Add
-fno-stack-protector.
* gcc.target/loongarch/stack-check-cfa-1.c (dg-options):
Likewise.
* gcc.target/loongarch/stack-check-cfa-2.c (dg-options):
Likewise.

LoongArch: Stop -mfpu from silently breaking ABI [PR109000]

In the toolchain convention, we describe -mfpu= as:

"Selects the allowed set of basic floating-point instructions and
registers. This option should not change the FP calling convention
unless it's necessary."

Though not explicitly stated, the rationale of this rule is to allow
combinations like "-mabi=lp64s -mfpu=64".  This will be useful for
running applications with LP64S/F ABI on a double-float-capable
LoongArch hardware and using a math library with LP64S/F ABI but native
double float HW instructions, for a better performance.

And now a case in Linux kernel has again proven the usefulness of this
kind of combination.  The AMDGPU DCN kernel driver needs to perform some
floating-point operation, but the entire kernel uses LP64S ABI.  So the
translation units of the AMDGPU DCN driver need to be compiled with
-mfpu=64 (the kernel lacks soft-FP routines in libgcc), but -mabi=lp64s
(or you can't link it with the other part of the kernel).

Unfortunately, currently GCC uses TARGET_{HARD,SOFT,DOUBLE}_FLOAT to
determine the floating-point calling convention.  This causes "-mfpu=64"
to silently allow using $fa* to pass parameters and return values EVEN IF
-mabi=lp64s is used.  To make things worse, the generated object file
has SOFT-FLOAT set in the eflags field so the linker will happily link
it with other LP64S ABI object files, but obviously this will lead to
bad results at runtime.  And for now all loongarch64 CPU models (-march
settings) imply -mfpu=64 by default, so the issue makes a single
"-mabi=lp64s" option basically broken (fortunately most projects, e.g.
the Linux kernel, have used -msoft-float which implies both -mabi=lp64s
and -mfpu=none as we've recommended in the toolchain convention doc).

The fix is simple: use TARGET_*_FLOAT_ABI instead.
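
The distinction can be sketched as follows.  The enums and function names
are hypothetical stand-ins for the -mabi= and -mfpu= settings, not the
macros in loongarch.h.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the -mabi= and -mfpu= selections.  */
enum float_abi { ABI_LP64S, ABI_LP64F, ABI_LP64D };
enum fpu_isa { FPU_NONE, FPU_32, FPU_64 };

/* Behavior described as the bug: the available FP instructions
   decide whether FP values are passed in FPRs.  */
static bool
fp_args_in_fprs_by_isa (enum float_abi abi, enum fpu_isa fpu)
{
  (void) abi;
  return fpu != FPU_NONE;
}

/* Behavior after the fix: only the ABI decides; the FPU setting
   merely controls which instructions may be used.  */
static bool
fp_args_in_fprs_by_abi (enum float_abi abi, enum fpu_isa fpu)
{
  (void) fpu;
  return abi != ABI_LP64S;
}
```

For "-mabi=lp64s -mfpu=64" the first function passes FP arguments in FPRs
while the second does not, which is the combination the kernel driver
case relies on.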

I consider this a bug fix: the behavior difference from the toolchain
convention doc is a bug, and generating object files with SOFT-FLOAT
flag but parameters/return values passed through FPRs is definitely a
bug.

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk and
release/gcc-12 branch?

gcc/ChangeLog:

PR target/109000
* config/loongarch/loongarch.h (FP_RETURN): Use
TARGET_*_FLOAT_ABI instead of TARGET_*_FLOAT.
(UNITS_PER_FP_ARG): Likewise.

gcc/testsuite/ChangeLog:

PR target/109000
* gcc.target/loongarch/flt-abi-isa-1.c: New test.
* gcc.target/loongarch/flt-abi-isa-2.c: New test.
* gcc.target/loongarch/flt-abi-isa-3.c: New test.
* gcc.target/loongarch/flt-abi-isa-4.c: New test.

libgo: revert incorrectly committed change

This directory should be changed upstream, not in the GCC repo.

Daily bump.

Fortran: fix CLASS attribute handling [PR106856]

gcc/fortran/ChangeLog:

PR fortran/106856
* class.cc (gfc_build_class_symbol): Handle update of attributes of
existing class container.
(gfc_find_derived_vtab): Fix several memory leaks.
(find_intrinsic_vtab): Ditto.
* decl.cc (attr_decl1): Manage update of symbol attributes from
CLASS attributes.
* primary.cc (gfc_variable_attr): OPTIONAL shall not be taken or
updated from the class container.
* symbol.cc (free_old_symbol): Adjust management of symbol versions
to not prematurely free array specs while working on the declaration
of CLASS variables.

gcc/testsuite/ChangeLog:

PR fortran/106856
* gfortran.dg/interface_41.f90: Remove dg-pattern from valid testcase.
* gfortran.dg/class_74.f90: New test.
* gfortran.dg/class_75.f90: New test.

Co-authored-by: Tobias Burnus <tobias@codesourcery.com>

testsuite: Fix up syntax error in scan-tree-dump-times target selector

On aarch64, powerpc64le and s390x-linux I'm seeing another syntax error
which didn't show up on x86_64-linux nor i686-linux:
ERROR: gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects: error executing dg-final: syntax error in target selector "target  ! vect_load_lanes  && vect_partial_vectors_usage_1 &&  ! s390_vx"
ERROR: gcc.dg/vect/slp-perm-8.c: error executing dg-final: syntax error in target selector "target  ! vect_load_lanes  && vect_partial_vectors_usage_1 &&  ! s390_vx"

The following patch fixes that.

2023-03-05  Jakub Jelinek  <jakub@redhat.com>

* gcc.dg/vect/slp-perm-8.c: Fix up syntax error in
scan-tree-dump-times target selector.

RISC-V: Fix ICE for avl_single-86/avl_single-88/avl_single-90

If prop is the demand of a vsetvl instruction and reaching doesn't demand
AVL, we don't backward propagate, since the vsetvl instruction has no
side effects.

FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-86.c  -Og -g  (internal
compiler error: Segmentation fault)
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-86.c  -Og -g  (test for
excess errors)
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-88.c  -Og -g  (internal
compiler error: Segmentation fault)
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-88.c  -Og -g  (test for
excess errors)
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-90.c  -Og -g  (internal
compiler error: Segmentation fault)
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-90.c  -Og -g  (test for
excess errors)

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (reg_available_p): Fix bug.
(pass_vsetvl::backward_demand_fusion): Ditto.

RISC-V: Implement ZKSH and ZKSED extensions

This patch adds support for the Zksh and Zksed extensions.
It includes the instructions' machine descriptions and built-in functions.

gcc/ChangeLog:

* config/riscv/crypto.md (riscv_sm3p0_<mode>): Add ZKSED's and ZKSH's
instructions.
(riscv_sm3p1_<mode>): New.
(riscv_sm4ed_<mode>): New.
(riscv_sm4ks_<mode>): New.
* config/riscv/riscv-builtins.cc (AVAIL): Add ZKSED's and ZKSH's AVAIL.
* config/riscv/riscv-scalar-crypto.def (RISCV_BUILTIN): Add ZKSED's and
ZKSH's built-in functions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zksed32.c: New test.
* gcc.target/riscv/zksed64.c: New test.
* gcc.target/riscv/zksh32.c: New test.
* gcc.target/riscv/zksh64.c: New test.

Co-Authored-By: SiYu Wu <siyu@isrc.iscas.ac.cn>

RISC-V: Implement ZKNH extension

This patch adds support for the Zknh extension.
It includes the instructions' machine descriptions and built-in functions.

gcc/ChangeLog:

* config/riscv/crypto.md (riscv_sha256sig0_<mode>): Add ZKNH's instructions.
(riscv_sha256sig1_<mode>): New.
(riscv_sha256sum0_<mode>): New.
(riscv_sha256sum1_<mode>): New.
(riscv_sha512sig0h): New.
(riscv_sha512sig0l): New.
(riscv_sha512sig1h): New.
(riscv_sha512sig1l): New.
(riscv_sha512sum0r): New.
(riscv_sha512sum1r): New.
(riscv_sha512sig0): New.
(riscv_sha512sig1): New.
(riscv_sha512sum0): New.
(riscv_sha512sum1): New.
* config/riscv/riscv-builtins.cc (AVAIL): Add ZKNH's AVAIL.
* config/riscv/riscv-scalar-crypto.def (RISCV_BUILTIN): Add ZKNH's
built-in functions.
(DIRECT_BUILTIN): Add new.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zknh-sha256.c: New test.
* gcc.target/riscv/zknh-sha512-32.c: New test.
* gcc.target/riscv/zknh-sha512-64.c: New test.

Co-Authored-By: SiYu Wu <siyu@isrc.iscas.ac.cn>

RISC-V: Implement ZKND and ZKNE extensions

This patch adds support for the Zkne and Zknd extensions.
It includes the instructions' machine descriptions and built-in functions.

gcc/ChangeLog:

* config/riscv/constraints.md (D03): Add constants of bs and rnum.
(DsA): New.
* config/riscv/crypto.md (riscv_aes32dsi): Add ZKND's and ZKNE's instructions.
(riscv_aes32dsmi): New.
(riscv_aes64ds): New.
(riscv_aes64dsm): New.
(riscv_aes64im): New.
(riscv_aes64ks1i): New.
(riscv_aes64ks2): New.
(riscv_aes32esi): New.
(riscv_aes32esmi): New.
(riscv_aes64es): New.
(riscv_aes64esm): New.
* config/riscv/riscv-builtins.cc (AVAIL): Add ZKND's and ZKNE's AVAIL.
* config/riscv/riscv-scalar-crypto.def (DIRECT_BUILTIN): Add ZKND's and
ZKNE's built-in functions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zknd32.c: New test.
* gcc.target/riscv/zknd64.c: New test.
* gcc.target/riscv/zkne32.c: New test.
* gcc.target/riscv/zkne64.c: New test.

Co-Authored-By: SiYu Wu <siyu@isrc.iscas.ac.cn>

RISC-V: Implement ZBKB, ZBKC and ZBKX extensions

This patch adds support for the Zbkb, Zbkc and Zbkx extensions.
It includes the instructions' machine descriptions and built-in functions.
It is worth mentioning that this patch only adds the instructions that are
in Zbkb but not in Zbb.
Instructions that are in both Zbb and Zbkb are generated by the code
generator instead of through built-in functions.

gcc/ChangeLog:

* config/riscv/bitmanip.md: Add ZBKB's instructions.
* config/riscv/riscv-builtins.cc (AVAIL): Add new.
* config/riscv/riscv.md: Add new type for crypto instructions.
* config/riscv/crypto.md: Add Scalar Cryptography extension's machine
description file.
* config/riscv/riscv-scalar-crypto.def: Add Scalar Cryptography
extension's built-in function file.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbkb32.c: New test.
* gcc.target/riscv/zbkb64.c: New test.
* gcc.target/riscv/zbkc32.c: New test.
* gcc.target/riscv/zbkc64.c: New test.
* gcc.target/riscv/zbkx32.c: New test.
* gcc.target/riscv/zbkx64.c: New test.

Co-Authored-By: SiYu Wu <siyu@isrc.iscas.ac.cn>

RISC-V: Add prototypes for RISC-V Crypto built-in functions

This patch adds prototypes for RISC-V Crypto built-in functions.

gcc/ChangeLog:

* config/riscv/riscv-builtins.cc (RISCV_FTYPE_NAME2): New.
(RISCV_FTYPE_NAME3): New.
(RISCV_ATYPE_QI): New.
(RISCV_ATYPE_HI): New.
(RISCV_FTYPE_ATYPES2): New.
(RISCV_FTYPE_ATYPES3): New.
* config/riscv/riscv-ftypes.def (2): New.
(3): New.

Co-Authored-By: SiYu Wu <siyu@isrc.iscas.ac.cn>

RISC-V: costs: miscomputed shiftadd_cost triggering synth_mult [PR/108987]

This showed up as dynamic icount regression in SPEC 531.deepsjeng with upstream
gcc (vs. gcc 12.2). gcc was resorting to synthetic multiply using shift+add(s)
even when multiply had clear cost benefit.

|00000000000133b8 <see(state_t*, int, int, int, int) [clone .constprop.0]+0x382>:
|   133b8: srl a3,a1,s6
|   133bc: and a3,a3,s5
|   133c0: slli a4,a3,0x9
|   133c4: add a4,a4,a3
|   133c6: slli a4,a4,0x9
|   133c8: add a4,a4,a3
|   133ca: slli a3,a4,0x1b
|   133ce: add a4,a4,a3

vs. gcc 12 doing something like below.

|00000000000131c4 <see(state_t*, int, int, int, int) [clone .constprop.0]+0x35c>:
|   131c4: ld s1,8(sp)
|   131c6: srl a3,a1,s4
|   131ca: and a3,a3,s11
|   131ce: mul a3,a3,s1
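
As a cross-check of the two listings, the shift+add sequence can be read
as straight-line arithmetic.  The sketch below is a reading of the first
listing only; the function names are hypothetical, and the constant is
derived from the shift amounts shown (9, 9 and 0x1b), not taken from the
deepsjeng source.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the six instructions after the srl/and in the first listing. */
static uint64_t shift_add_sequence(uint64_t a3)
{
    uint64_t a4;
    a4 = a3 << 9;     /* slli a4,a3,0x9   */
    a4 = a4 + a3;     /* add  a4,a4,a3    */
    a4 = a4 << 9;     /* slli a4,a4,0x9   */
    a4 = a4 + a3;     /* add  a4,a4,a3    */
    a3 = a4 << 27;    /* slli a3,a4,0x1b  */
    a4 = a4 + a3;     /* add  a4,a4,a3    */
    return a4;
}

/* The same value as a single multiply:
   (2^18 + 2^9 + 1) * (2^27 + 1) = 0x201008040201.  */
static uint64_t single_multiply(uint64_t a3)
{
    return a3 * UINT64_C(0x201008040201);
}

/* Both forms agree modulo 2^64 for any input.  */
static void check_equivalence(void)
{
    assert(shift_add_sequence(1) == UINT64_C(0x201008040201));
    assert(shift_add_sequence(0xff) == single_multiply(0xff));
}
```

Six dependent instructions thus replace one mul (plus the load of the
constant), which is the cost trade-off the message describes.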

Bisected this to f90cb39235c4 ("RISC-V: costs: support shift-and-add in
strength-reduction"). The intent was to optimize cost for
shift-add-pow2-{1,2,3} corresponding to bitmanip insns SH*ADD, but it ended
up doing that for all shift values, which seems to favor synthesizing
multiply among others.

The bug itself is trivial: IN_RANGE() was calling pow2p_hwi(), which returns
a bool, instead of exact_log2(), which returns the log2 exponent of a power of 2.
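
The pow2p_hwi() vs. exact_log2() difference can be reproduced outside the
compiler.  The following is a minimal sketch, not the code in riscv.cc:
is_pow2(), log2_exact(), IN_RANGE_SKETCH and the two predicates are
hypothetical stand-ins that only mimic the behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a range check of the form lo <= v <= hi. */
#define IN_RANGE_SKETCH(v, lo, hi) ((v) >= (lo) && (v) <= (hi))

/* Stand-in for a "is power of 2" predicate: yields only true/false. */
static bool is_pow2(unsigned long x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Stand-in for an exact log2: the exponent, or -1 if x is not a power of 2. */
static int log2_exact(unsigned long x)
{
    int n = 0;
    if (!is_pow2(x))
        return -1;
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Buggy form: the bool converts to 1, so every power of 2 is "in range". */
static bool shadd_candidate_buggy(unsigned long multiplier)
{
    return IN_RANGE_SKETCH((int)is_pow2(multiplier), 1, 3);
}

/* Fixed form: only multipliers 2, 4 and 8 (shift by 1, 2 or 3) qualify. */
static bool shadd_candidate_fixed(unsigned long multiplier)
{
    return IN_RANGE_SKETCH(log2_exact(multiplier), 1, 3);
}

static void check_shift_by_nine(void)
{
    assert(shadd_candidate_buggy(512));    /* shift by 9 wrongly accepted */
    assert(!shadd_candidate_fixed(512));   /* shift by 9 rejected */
}
```

In this model the buggy predicate treats a shift by 9 like a SH*ADD
candidate, which matches the slli-by-9 sequences seen in the regression.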

This fix also requires an update to the test introduced by the same commit,
which now generates MUL vs. synthesizing it.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_rtx_costs): Fixed IN_RANGE() to
use exact_log2().

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zba-shNadd-07.c: f2(i*783) now generates MUL vs.
5 insn sh1add+slli+add+slli+sub.
* gcc.target/riscv/pr108987.c: New test.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>

RISC-V: Add RVV misc intrinsic support

Co-authored-by: kito-cheng <kito.cheng@sifive.com>
gcc/ChangeLog:

* config/riscv/predicates.md (vector_any_register_operand): New predicate.
* config/riscv/riscv-c.cc (riscv_check_builtin_call): New function.
(riscv_register_pragmas): Add builtin function check call.
* config/riscv/riscv-protos.h (RVV_VUNDEF): Adapt macro.
(check_builtin_call): New function.
* config/riscv/riscv-vector-builtins-bases.cc (class vundefined): New class.
(class vreinterpret): Ditto.
(class vlmul_ext): Ditto.
(class vlmul_trunc): Ditto.
(class vset): Ditto.
(class vget): Ditto.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vluxei8): Change name.
(vluxei16): Ditto.
(vluxei32): Ditto.
(vluxei64): Ditto.
(vloxei8): Ditto.
(vloxei16): Ditto.
(vloxei32): Ditto.
(vloxei64): Ditto.
(vsuxei8): Ditto.
(vsuxei16): Ditto.
(vsuxei32): Ditto.
(vsuxei64): Ditto.
(vsoxei8): Ditto.
(vsoxei16): Ditto.
(vsoxei32): Ditto.
(vsoxei64): Ditto.
(vundefined): Add new intrinsic.
(vreinterpret): Ditto.
(vlmul_ext): Ditto.
(vlmul_trunc): Ditto.
(vset): Ditto.
(vget): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct return_mask_def): New class.
(struct narrow_alu_def): Ditto.
(struct reduc_alu_def): Ditto.
(struct vundefined_def): Ditto.
(struct misc_def): Ditto.
(struct vset_def): Ditto.
(struct vget_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_EEW8_INTERPRET_OPS): New def.
(DEF_RVV_EEW16_INTERPRET_OPS): Ditto.
(DEF_RVV_EEW32_INTERPRET_OPS): Ditto.
(DEF_RVV_EEW64_INTERPRET_OPS): Ditto.
(DEF_RVV_X2_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X4_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X8_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X16_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X32_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X64_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_LMUL1_OPS): Ditto.
(DEF_RVV_LMUL2_OPS): Ditto.
(DEF_RVV_LMUL4_OPS): Ditto.
(vint16mf4_t): Ditto.
(vint16mf2_t): Ditto.
(vint16m1_t): Ditto.
(vint16m2_t): Ditto.
(vint16m4_t): Ditto.
(vint16m8_t): Ditto.
(vint32mf2_t): Ditto.
(vint32m1_t): Ditto.
(vint32m2_t): Ditto.
(vint32m4_t): Ditto.
(vint32m8_t): Ditto.
(vint64m1_t): Ditto.
(vint64m2_t): Ditto.
(vint64m4_t): Ditto.
(vint64m8_t): Ditto.
(vuint16mf4_t): Ditto.
(vuint16mf2_t): Ditto.
(vuint16m1_t): Ditto.
(vuint16m2_t): Ditto.
(vuint16m4_t): Ditto.
(vuint16m8_t): Ditto.
(vuint32mf2_t): Ditto.
(vuint32m1_t): Ditto.
(vuint32m2_t): Ditto.
(vuint32m4_t): Ditto.
(vuint32m8_t): Ditto.
(vuint64m1_t): Ditto.
(vuint64m2_t): Ditto.
(vuint64m4_t): Ditto.
(vuint64m8_t): Ditto.
(vint8mf4_t): Ditto.
(vint8mf2_t): Ditto.
(vint8m1_t): Ditto.
(vint8m2_t): Ditto.
(vint8m4_t): Ditto.
(vint8m8_t): Ditto.
(vuint8mf4_t): Ditto.
(vuint8mf2_t): Ditto.
(vuint8m1_t): Ditto.
(vuint8m2_t): Ditto.
(vuint8m4_t): Ditto.
(vuint8m8_t): Ditto.
(vint8mf8_t): Ditto.
(vuint8mf8_t): Ditto.
(vfloat32mf2_t): Ditto.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat64m1_t): Ditto.
(vfloat64m2_t): Ditto.
(vfloat64m4_t): Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_TYPE): Ditto.
(DEF_RVV_EEW8_INTERPRET_OPS): Ditto.
(DEF_RVV_EEW16_INTERPRET_OPS): Ditto.
(DEF_RVV_EEW32_INTERPRET_OPS): Ditto.
(DEF_RVV_EEW64_INTERPRET_OPS): Ditto.
(DEF_RVV_X2_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X4_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X8_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X16_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X32_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X64_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_LMUL1_OPS): Ditto.
(DEF_RVV_LMUL2_OPS): Ditto.
(DEF_RVV_LMUL4_OPS): Ditto.
(DEF_RVV_TYPE_INDEX): Ditto.
(required_extensions_p): Adapt for new intrinsic support.
(get_required_extensions): New function.
(check_required_extensions): Ditto.
(unsigned_base_type_p): Remove.
(rvv_arg_type_info::get_scalar_ptr_type): New function.
(get_mode_for_bitsize): Remove.
(rvv_arg_type_info::get_scalar_const_ptr_type): New function.
(rvv_arg_type_info::get_base_vector_type): Ditto.
(rvv_arg_type_info::get_function_type_index): Ditto.
(DEF_RVV_BASE_TYPE): New def.
(function_builder::apply_predication): New class.
(function_expander::mask_mode): Ditto.
(function_checker::function_checker): Ditto.
(function_checker::report_non_ice): Ditto.
(function_checker::report_out_of_range): Ditto.
(function_checker::require_immediate): Ditto.
(function_checker::require_immediate_range): Ditto.
(function_checker::check): Ditto.
(check_builtin_call): Ditto.
* config/riscv/riscv-vector-builtins.def (DEF_RVV_TYPE): New def.
(DEF_RVV_BASE_TYPE): Ditto.
(DEF_RVV_TYPE_INDEX): Ditto.
(vbool64_t): Ditto.
(vbool32_t): Ditto.
(vbool16_t): Ditto.
(vbool8_t): Ditto.
(vbool4_t): Ditto.
(vbool2_t): Ditto.
(vbool1_t): Ditto.
(vuint8mf8_t): Ditto.
(vuint8mf4_t): Ditto.
(vuint8mf2_t): Ditto.
(vuint8m1_t): Ditto.
(vuint8m2_t): Ditto.
(vint8m4_t): Ditto.
(vuint8m4_t): Ditto.
(vint8m8_t): Ditto.
(vuint8m8_t): Ditto.
(vint16mf4_t): Ditto.
(vuint16mf2_t): Ditto.
(vuint16m1_t): Ditto.
(vuint16m2_t): Ditto.
(vuint16m4_t): Ditto.
(vuint16m8_t): Ditto.
(vint32mf2_t): Ditto.
(vuint32m1_t): Ditto.
(vuint32m2_t): Ditto.
(vuint32m4_t): Ditto.
(vuint32m8_t): Ditto.
(vuint64m1_t): Ditto.
(vuint64m2_t): Ditto.
(vuint64m4_t): Ditto.
(vuint64m8_t): Ditto.
(vfloat32mf2_t): Ditto.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat32m8_t): Ditto.
(vfloat64m1_t): Ditto.
(vfloat64m4_t): Ditto.
(vector): Move it def.
(scalar): Ditto.
(mask): Ditto.
(signed_vector): Ditto.
(unsigned_vector): Ditto.
(unsigned_scalar): Ditto.
(vector_ptr): Ditto.
(scalar_ptr): Ditto.
(scalar_const_ptr): Ditto.
(void): Ditto.
(size): Ditto.
(ptrdiff): Ditto.
(unsigned_long): Ditto.
(long): Ditto.
(eew8_index): Ditto.
(eew16_index): Ditto.
(eew32_index): Ditto.
(eew64_index): Ditto.
(shift_vector): Ditto.
(double_trunc_vector): Ditto.
(quad_trunc_vector): Ditto.
(oct_trunc_vector): Ditto.
(double_trunc_scalar): Ditto.
(double_trunc_signed_vector): Ditto.
(double_trunc_unsigned_vector): Ditto.
(double_trunc_unsigned_scalar): Ditto.
(double_trunc_float_vector): Ditto.
(float_vector): Ditto.
(lmul1_vector): Ditto.
(widen_lmul1_vector): Ditto.
(eew8_interpret): Ditto.
(eew16_interpret): Ditto.
(eew32_interpret): Ditto.
(eew64_interpret): Ditto.
(vlmul_ext_x2): Ditto.
(vlmul_ext_x4): Ditto.
(vlmul_ext_x8): Ditto.
(vlmul_ext_x16): Ditto.
(vlmul_ext_x32): Ditto.
(vlmul_ext_x64): Ditto.
* config/riscv/riscv-vector-builtins.h (DEF_RVV_BASE_TYPE): New def.
(struct function_type_info): New function.
(struct rvv_arg_type_info): Ditto.
(class function_checker): New class.
(rvv_arg_type_info::get_scalar_type): New function.
(rvv_arg_type_info::get_vector_type): Ditto.
(function_expander::ret_mode): New function.
(function_checker::arg_mode): Ditto.
(function_checker::ret_mode): Ditto.
* config/riscv/t-riscv: Add generator.
* config/riscv/vector-iterators.md: New iterators.
* config/riscv/vector.md (vundefined<mode>): New pattern.
(@vundefined<mode>): Ditto.
(@vreinterpret<mode>): Ditto.
(@vlmul_extx2<mode>): Ditto.
(@vlmul_extx4<mode>): Ditto.
(@vlmul_extx8<mode>): Ditto.
(@vlmul_extx16<mode>): Ditto.
(@vlmul_extx32<mode>): Ditto.
(@vlmul_extx64<mode>): Ditto.
(*vlmul_extx2<mode>): Ditto.
(*vlmul_extx4<mode>): Ditto.
(*vlmul_extx8<mode>): Ditto.
(*vlmul_extx16<mode>): Ditto.
(*vlmul_extx32<mode>): Ditto.
(*vlmul_extx64<mode>): Ditto.
* config/riscv/genrvv-type-indexer.cc: New file.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/vlmul_v.c: New test.

Co-authored-by: kito-cheng <kito.cheng@sifive.com>

RISC-V: Add permutation C/C++ support

gcc/ChangeLog:

* config/riscv/riscv-protos.h (enum vlen_enum): New enum.
(slide1_sew64_helper): New function.
* config/riscv/riscv-v.cc (compute_vlmax): Ditto.
(get_unknown_min_value): Ditto.
(force_vector_length_operand): Ditto.
(gen_no_side_effects_vsetvl_rtx): Ditto.
(get_vl_x2_rtx): Ditto.
(slide1_sew64_helper): Ditto.
* config/riscv/riscv-vector-builtins-bases.cc (class slideop): New class.
(class vrgather): Ditto.
(class vrgatherei16): Ditto.
(class vcompress): Ditto.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vslideup): Ditto.
(vslidedown): Ditto.
(vslide1up): Ditto.
(vslide1down): Ditto.
(vfslide1up): Ditto.
(vfslide1down): Ditto.
(vrgather): Ditto.
(vrgatherei16): Ditto.
(vcompress): Ditto.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_EI16_OPS): New macro.
(vint8mf8_t): Ditto.
(vint8mf4_t): Ditto.
(vint8mf2_t): Ditto.
(vint8m1_t): Ditto.
(vint8m2_t): Ditto.
(vint8m4_t): Ditto.
(vint16mf4_t): Ditto.
(vint16mf2_t): Ditto.
(vint16m1_t): Ditto.
(vint16m2_t): Ditto.
(vint16m4_t): Ditto.
(vint16m8_t): Ditto.
(vint32mf2_t): Ditto.
(vint32m1_t): Ditto.
(vint32m2_t): Ditto.
(vint32m4_t): Ditto.
(vint32m8_t): Ditto.
(vint64m1_t): Ditto.
(vint64m2_t): Ditto.
(vint64m4_t): Ditto.
(vint64m8_t): Ditto.
(vuint8mf8_t): Ditto.
(vuint8mf4_t): Ditto.
(vuint8mf2_t): Ditto.
(vuint8m1_t): Ditto.
(vuint8m2_t): Ditto.
(vuint8m4_t): Ditto.
(vuint16mf4_t): Ditto.
(vuint16mf2_t): Ditto.
(vuint16m1_t): Ditto.
(vuint16m2_t): Ditto.
(vuint16m4_t): Ditto.
(vuint16m8_t): Ditto.
(vuint32mf2_t): Ditto.
(vuint32m1_t): Ditto.
(vuint32m2_t): Ditto.
(vuint32m4_t): Ditto.
(vuint32m8_t): Ditto.
(vuint64m1_t): Ditto.
(vuint64m2_t): Ditto.
(vuint64m4_t): Ditto.
(vuint64m8_t): Ditto.
(vfloat32mf2_t): Ditto.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat32m8_t): Ditto.
(vfloat64m1_t): Ditto.
(vfloat64m2_t): Ditto.
(vfloat64m4_t): Ditto.
(vfloat64m8_t): Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_EI16_OPS): Ditto.
* config/riscv/riscv.md: Adjust RVV instruction types.
* config/riscv/vector-iterators.md (down): New iterator.
(=vd,vr): New attribute.
(UNSPEC_VSLIDE1UP): New unspec.
* config/riscv/vector.md (@pred_slide<ud><mode>): New pattern.
(*pred_slide<ud><mode>): Ditto.
(*pred_slide<ud><mode>_extended): Ditto.
(@pred_gather<mode>): Ditto.
(@pred_gather<mode>_scalar): Ditto.
(@pred_gatherei16<mode>): Ditto.
(@pred_compress<mode>): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/binop_vx_constraint-167.c: New test.
* gcc.target/riscv/rvv/base/binop_vx_constraint-168.c: New test.
* gcc.target/riscv/rvv/base/binop_vx_constraint-169.c: New test.
* gcc.target/riscv/rvv/base/binop_vx_constraint-170.c: New test.
* gcc.target/riscv/rvv/base/binop_vx_constraint-171.c: New test.
* gcc.target/riscv/rvv/base/binop_vx_constraint-172.c: New test.
* gcc.target/riscv/rvv/base/binop_vx_constraint-173.c: New test.
* gcc.target/riscv/rvv/base/binop_vx_constraint-174.c: New test.

RISC-V: Remove void_type_node of void_args for vsetvlmax intrinsic

This patch is to fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108927.
PR108927

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc: Remove void_type_node.

RISC-V: Add testcase for VSETVL PASS

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/scalar_move-1.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-2.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-3.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-4.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-5.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-6.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-7.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-100.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-101.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-78.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-79.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-80.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-81.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-82.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-83.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-84.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-85.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-86.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-87.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-88.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-89.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-90.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-91.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-92.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-93.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-94.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-95.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-96.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-97.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-98.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-99.c: New test.

RISC-V: Add scalar move support and fix VSETVL bugs

gcc/ChangeLog:

* config/riscv/constraints.md (Wb1): New constraint.
* config/riscv/predicates.md
(vector_least_significant_set_mask_operand): New predicate.
(vector_broadcast_mask_operand): Ditto.
* config/riscv/riscv-protos.h (enum vlmul_type): Adjust.
(gen_scalar_move_mask): New function.
* config/riscv/riscv-v.cc (gen_scalar_move_mask): Ditto.
* config/riscv/riscv-vector-builtins-bases.cc (class vmv): New class.
(class vmv_s): Ditto.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vmv_x): Ditto.
(vmv_s): Ditto.
(vfmv_f): Ditto.
(vfmv_s): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct scalar_move_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc (function_expander::mask_mode): Ditto.
(function_expander::use_exact_insn): New function.
(function_expander::use_contiguous_load_insn): New function.
(function_expander::use_contiguous_store_insn): New function.
(function_expander::use_ternop_insn): New function.
(function_expander::use_widen_ternop_insn): New function.
(function_expander::use_scalar_move_insn): New function.
* config/riscv/riscv-vector-builtins.def (s): New operand suffix.
* config/riscv/riscv-vector-builtins.h
(function_expander::add_scalar_move_mask_operand): New class.
* config/riscv/riscv-vsetvl.cc (ignore_vlmul_insn_p): New function.
(scalar_move_insn_p): Ditto.
(has_vsetvl_killed_avl_p): Ditto.
(anticipatable_occurrence_p): Ditto.
(insert_vsetvl): Ditto.
(get_vl_vtype_info): Ditto.
(calculate_sew): Ditto.
(calculate_vlmul): Ditto.
(incompatible_avl_p): Ditto.
(different_sew_p): Ditto.
(different_lmul_p): Ditto.
(different_ratio_p): Ditto.
(different_tail_policy_p): Ditto.
(different_mask_policy_p): Ditto.
(possible_zero_avl_p): Ditto.
(first_ratio_invalid_for_second_sew_p): Ditto.
(first_ratio_invalid_for_second_lmul_p): Ditto.
(second_ratio_invalid_for_first_sew_p): Ditto.
(second_ratio_invalid_for_first_lmul_p): Ditto.
(second_sew_less_than_first_sew_p): Ditto.
(first_sew_less_than_second_sew_p): Ditto.
(compare_lmul): Ditto.
(second_lmul_less_than_first_lmul_p): Ditto.
(first_lmul_less_than_second_lmul_p): Ditto.
(first_ratio_less_than_second_ratio_p): Ditto.
(second_ratio_less_than_first_ratio_p): Ditto.
(DEF_INCOMPATIBLE_COND): Ditto.
(greatest_sew): Ditto.
(first_sew): Ditto.
(second_sew): Ditto.
(first_vlmul): Ditto.
(second_vlmul): Ditto.
(first_ratio): Ditto.
(second_ratio): Ditto.
(vlmul_for_first_sew_second_ratio): Ditto.
(ratio_for_second_sew_first_vlmul): Ditto.
(DEF_SEW_LMUL_FUSE_RULE): Ditto.
(always_unavailable): Ditto.
(avl_unavailable_p): Ditto.
(sew_unavailable_p): Ditto.
(lmul_unavailable_p): Ditto.
(ge_sew_unavailable_p): Ditto.
(ge_sew_lmul_unavailable_p): Ditto.
(ge_sew_ratio_unavailable_p): Ditto.
(DEF_UNAVAILABLE_COND): Ditto.
(same_sew_lmul_demand_p): Ditto.
(propagate_avl_across_demands_p): Ditto.
(reg_available_p): Ditto.
(avl_info::has_non_zero_avl): Ditto.
(vl_vtype_info::has_non_zero_avl): Ditto.
(vector_insn_info::operator>=): Refactor.
(vector_insn_info::parse_insn): Adjust for scalar move.
(vector_insn_info::demand_vl_vtype): Remove.
(vector_insn_info::compatible_p): New function.
(vector_insn_info::compatible_avl_p): Ditto.
(vector_insn_info::compatible_vtype_p): Ditto.
(vector_insn_info::available_p): Ditto.
(vector_insn_info::merge): Ditto.
(vector_insn_info::fuse_avl): Ditto.
(vector_insn_info::fuse_sew_lmul): Ditto.
(vector_insn_info::fuse_tail_policy): Ditto.
(vector_insn_info::fuse_mask_policy): Ditto.
(vector_insn_info::dump): Ditto.
(vector_infos_manager::release): Ditto.
(pass_vsetvl::compute_local_backward_infos): Adjust for scalar move support.
(pass_vsetvl::get_backward_fusion_type): Adjust for scalar move support.
(pass_vsetvl::hard_empty_block_p): Ditto.
(pass_vsetvl::backward_demand_fusion): Ditto.
(pass_vsetvl::forward_demand_fusion): Ditto.
(pass_vsetvl::refine_vsetvls): Ditto.
(pass_vsetvl::cleanup_vsetvls): Ditto.
(pass_vsetvl::commit_vsetvls): Ditto.
(pass_vsetvl::propagate_avl): Ditto.
* config/riscv/riscv-vsetvl.h (enum demand_status): New class.
(struct demands_pair): Ditto.
(struct demands_cond): Ditto.
(struct demands_fuse_rule): Ditto.
* config/riscv/vector-iterators.md: New iterator.
* config/riscv/vector.md (@pred_broadcast<mode>): New pattern.
(*pred_broadcast<mode>): Ditto.
(*pred_broadcast<mode>_extended_scalar): Ditto.
(@pred_extract_first<mode>): Ditto.
(*pred_extract_first<mode>): Ditto.
(@pred_extract_first_trunc<mode>): Ditto.
* config/riscv/riscv-vsetvl.def: New file.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/vsetvlmax-10.c: Adjust test.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-11.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-12.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-15.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-18.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-9.c: Ditto.

RISC-V: Allow const0_rtx operand in max/min

Optimize cases that use max[u]/min[u] against a zero constant.

E.g., the case int f(int x) { return x >= 0 ? x : 0; }
the current asm output in rv64gc_zba_zbb

li rtmp,0
max a0,a0,rtmp

could be optimized into

max a0,a0,zero
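
For reference, the source-level shapes involved look like the sketch
below.  The first function is f from the message under a longer name; the
min variant is an assumed counterpart added for illustration.  Whether a
given compiler emits max/min with the zero register for them depends on
the target flags (e.g. rv64gc_zba_zbb) and is not checked here.

```c
#include <assert.h>

/* f from the message: a signed max against zero. */
static int clamp_below_at_zero(int x)
{
    return x >= 0 ? x : 0;
}

/* Assumed counterpart: a signed min against zero. */
static int clamp_above_at_zero(int x)
{
    return x <= 0 ? x : 0;
}

static void check_clamps(void)
{
    assert(clamp_below_at_zero(-5) == 0);
    assert(clamp_below_at_zero(7) == 7);
    assert(clamp_above_at_zero(-5) == -5);
    assert(clamp_above_at_zero(7) == 0);
}
```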

gcc/ChangeLog:
* config/riscv/bitmanip.md: Allow 0 constant in max/min
pattern.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbb-min-max-03.c: New test.

RISC-V: Fix wrong partial subreg check for bsetidisi

The partial subreg check should be for the subreg operand (operand 1) instead of
the immediate operand (operand 2). This change also fixes pr68648.c in zbs.

gcc/ChangeLog:

* config/riscv/bitmanip.md: Fix wrong index in the check.
Reviewed-by: <philipp.tomsich@vrull.eu>

Daily bump.