review.tizen.org Git - platform/upstream/gcc.git/log

forwprop: Canonicalize atomic fetch_op op x to op_fetch or vice versa [PR98737]

When writing the PR98737 fix, I've handled just the case where people
use __atomic_op_fetch (p, x, y) etc.
But some people actually use the other builtins, like
__atomic_fetch_op (p, x, y) op x.
The following patch canonicalizes the latter to the former and vice versa
when possible, provided the result of the builtin has a single use; if
that use is a cast with the same precision, the cast's lhs must also have
a single use.
For all ops of +, -, &, | and ^ we can do those
__atomic_fetch_op (p, x, y) op x -> __atomic_op_fetch (p, x, y)
(and __sync too) opts, but cases of INTEGER_CST and SSA_NAME x
behave differently.  For INTEGER_CST, typically - x is
canonicalized to + (-x), while for SSA_NAME we need to handle various
casts, which sometimes happen on the second argument of the builtin
(there can be even two subsequent casts for char/short due to the
promotions we do) and there can be a cast on the argument of op too.
And all ops but - are commutative.
For the other direction, i.e.
__atomic_op_fetch (p, x, y) rop x -> __atomic_fetch_op (p, x, y)
we can't handle op of & and |, those aren't reversible, for
op + rop is -, for - rop is + and for ^ rop is ^, otherwise the same
stuff as above applies.
And, there is another case, we canonicalize
x - y == 0 (or != 0) and x ^ y == 0 (or != 0) to x == y (or x != y)
and for constant y x + y == 0 (or != 0) to x == -y (or != -y),
so the patch also virtually undoes those canonicalizations, because
both for the earlier PR98737 patch and in general, it is better
to compare the result of an atomic op fetch against 0 than to do an
atomic fetch op and compare its result to some variable or non-zero constant.
As for debug info, for non-reversible operations (& and |) the patch
resets debug stmts if there are any, for -fnon-call-exceptions too
(didn't want to include debug temps right before all uses), but
otherwise it emits (on richi's request) the reverse operation from
the result as a new setter of the old lhs, so that later DCE fixes
up the debug info.
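
As a rough source-level illustration of the patterns described above (a
sketch only, not taken from the patch or its testcases; every function name
here is made up for the example), the spellings below compute the same
values, which is what allows one form to be rewritten into the other:

```cpp
#include <cassert>

// "fetch_op op x": the builtin returns the old value and the caller
// reapplies the operation to get the new one.
unsigned add_then_reapply(unsigned *p, unsigned x) {
    return __atomic_fetch_add(p, x, __ATOMIC_SEQ_CST) + x;
}

// "op_fetch": the builtin returns the updated value directly.
unsigned add_fetch_directly(unsigned *p, unsigned x) {
    return __atomic_add_fetch(p, x, __ATOMIC_SEQ_CST);
}

// Reverse direction for a reversible op: new value minus x is the old value.
unsigned old_value_from_add_fetch(unsigned *p, unsigned x) {
    return __atomic_add_fetch(p, x, __ATOMIC_SEQ_CST) - x;
}

// Comparison form: "old value == 1" after a decrement by 1 is the same
// condition as "new value == 0".
bool last_reference_dropped(unsigned *refcount) {
    return __atomic_fetch_sub(refcount, 1u, __ATOMIC_SEQ_CST) == 1u;
}

// Self-check that the spellings agree on the same starting values.
void check_atomic_spellings() {
    unsigned a = 40, b = 40, c = 40, rc = 1;
    assert(add_then_reapply(&a, 2) == 42 && a == 42);
    assert(add_fetch_directly(&b, 2) == 42 && b == 42);
    assert(old_value_from_add_fetch(&c, 2) == 40 && c == 42);
    assert(last_reference_dropped(&rc) && rc == 0);
}
```

Unsigned operands are used so that the reapplied arithmetic wraps instead of
overflowing. Which form the compiler finally emits depends on the target and
on the surrounding code.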

On the emitted assembly for the testcases which are fairly large,
I see substantial decreases of the *.s size:
-rw-rw-r--. 1 jakub jakub 116897 Jan 13 09:58 pr98737-1.svanilla
-rw-rw-r--. 1 jakub jakub  93861 Jan 13 09:57 pr98737-1.spatched
-rw-rw-r--. 1 jakub jakub  70257 Jan 13 09:57 pr98737-2.svanilla
-rw-rw-r--. 1 jakub jakub  67537 Jan 13 09:57 pr98737-2.spatched
There are some functions where due to RA we get one more instruction
than previously, but most of them are smaller even when not hitting
the PR98737 previous patch's optimizations.

2022-01-14  Jakub Jelinek  <jakub@redhat.com>

PR target/98737
* tree-ssa-forwprop.c (simplify_builtin_call): Canonicalize
__atomic_fetch_op (p, x, y) op x into __atomic_op_fetch (p, x, y)
and __atomic_op_fetch (p, x, y) iop x into
__atomic_fetch_op (p, x, y).

* gcc.dg/tree-ssa/pr98737-1.c: New test.
* gcc.dg/tree-ssa/pr98737-2.c: New test.

arc: Add DWARF2 alternate CFA column.

Add DWARF 2 CFA column which tracks the return address from a signal
handler context. This value must not correspond to a hard register
and must be out of the range of DWARF_FRAME_REGNUM().

gcc/
* config/arc/arc.h (DWARF_FRAME_REGNUM): Update definition.
(DWARF_FRAME_RETURN_COLUMN): Use RETURN_ADDR_REGNUM macro.
(INCOMING_RETURN_ADDR_RTX): Likewise.
(DWARF_ALT_FRAME_RETURN_COLUMN): Define.

gcc/testsuite/
* gcc.target/arc/cancel-1.c: New file.

libgcc/
* config/arc/linux-unwind.h (arc_fallback_frame_state): Use
DWARF_ALT_FRAME_RETURN_COLUMN macro.

Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>

arc: Update stack size computation when accumulator registers are available.

When accumulator registers are available in a processor, they need to
be saved onto the stack during interrupts. We were already doing so, but
the stack size was wrongly computed in cases other than ARC600.

gcc/

* config/arc/arc.c (arc_compute_frame_size): Remove condition when
checking accumulator regs.
(arc_expand_prologue): Update comments.
(arc_expand_epilogue): Likewise.

Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>

libstdc++: Add C++20 std::make_shared enhancements (P0674R1)

This adds the overloads of std::make_shared and std::allocate_shared for
creating arrays, added to C++20 by P0674R1.

It also adds std::make_shared_for_overwrite, added to C++20 by P1020R1
(and renamed by P1973R1). The std::make_unique_for_overwrite overloads
are already supported.

The original std::make_shared overload is changed to construct a
shared_ptr directly instead of calling std::allocate_shared. This
removes a function call at runtime, and avoids having to do overload
resolution for std::allocate_shared, now that there are five overloads
of it.

Allocating a shared array is done by a new __shared_count constructor.
An array is allocated with space for additional elements at the end and
an instance of the new _Sp_counted_array class template is constructed in
that unused capacity.

The non-array form of std::make_shared_for_overwrite uses the same
__shared_count constructor as the original std::make_shared overload,
but a new partial specialization of _Sp_counted_ptr_inplace is selected
when the allocator's value_type is the new _Sp_overwrite_tag type. That
new partial specialization default-initializes its contained object and
destroys it with a destructor call rather than using the allocator.

Despite being C++20 features, this implementation only uses concepts
conditionally, with workarounds when they are not supported. This allows
it to work with older non-GCC compilers (Clang 9 and icc 2021). At some
point we can simplify the code by removing the workarounds.
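
A minimal usage sketch of the array form, assuming a C++17 or later compiler
(the helper name make_zeroed_ints is hypothetical). It selects the new
std::make_shared overload through the __cpp_lib_shared_ptr_arrays
feature-test macro and otherwise falls back to the C++17 spelling:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: returns a shared array of n value-initialized ints.
std::shared_ptr<int[]> make_zeroed_ints(std::size_t n) {
#if defined(__cpp_lib_shared_ptr_arrays) && __cpp_lib_shared_ptr_arrays >= 201707L
    // C++20 (P0674R1): array overload of std::make_shared.
    return std::make_shared<int[]>(n);
#else
    // C++17 fallback: the array and the control block are allocated separately.
    return std::shared_ptr<int[]>(new int[n]());
#endif
}

// Self-check: elements start at zero and can be written through operator[].
void check_shared_array() {
    std::shared_ptr<int[]> a = make_zeroed_ints(4);
    assert(a[0] == 0 && a[3] == 0);
    a[2] = 7;
    assert(a[2] == 7);
}
```

Both branches give the caller the same interface; the difference is in how
the storage is obtained.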

libstdc++-v3/ChangeLog:

* include/bits/shared_ptr.h (__cpp_lib_shared_ptr_weak_type):
Correct type of macro value.
(shared_ptr): Add additional friend declarations.
(make_shared, allocate_shared): Constrain existing overloads and
remove static_assert.
* include/bits/shared_ptr_base.h (__cpp_lib_smart_ptr_for_overwrite):
New macro.
(_Sp_counted_ptr_inplace<T, Alloc, Lp>): New partial
specialization for use with make_shared_for_overwrite.
(__cpp_lib_shared_ptr_arrays): Update value for C++20.
(_Sp_counted_array_base): New class template.
(_Sp_counted_array): New class template.
(__shared_count(_Tp*&, const _Sp_counted_array_base&, _Init)):
New constructor for allocating shared arrays.
(__shared_ptr(const _Sp_counted_array_base&, _Init)): Likewise.
* include/std/version (__cpp_lib_shared_ptr_weak_type): Correct
type.
(__cpp_lib_shared_ptr_arrays): Update value for C++20.
(__cpp_lib_smart_ptr_for_overwrite): New macro.
* testsuite/20_util/shared_ptr/creation/99006.cc: Adjust
expected errors.
* testsuite/20_util/shared_ptr/creation/array.cc: New test.
* testsuite/20_util/shared_ptr/creation/overwrite.cc: New test.
* testsuite/20_util/shared_ptr/creation/version.cc: New test.
* testsuite/20_util/unique_ptr/creation/for_overwrite.cc: Check
feature test macro. Test non-trivial default-initialization.

libstdc++: Ignore cv-quals when std::allocator<void> constructs

When I added the std::allocator_traits<std::allocator<void>>
specialization it broke code like this:

std::allocate_shared<const int>(std::allocator<void>());

The problem is that allocator_traits<allocator<void>>::construct(a, p)
now uses std::_Construct(p), which only does a static_cast<void*>(p) and
so fails if the pointer has cv-quals.

This changes std::_Construct (and the related std::_Construct_novalue)
to use a C-style cast to (void*) which matches the effects of the
"voidify" helper in the C++20 standard.

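The cast in question can be sketched in standard C++ as follows. This is an
illustration under stated assumptions, not the libstdc++ code: to_void and
construct_const_int are hypothetical names, and the cast pair mirrors what
the standard's exposition-only "voidify" does.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for "voidify": converts any object pointer,
// cv-qualified or not, to plain void*.
template <typename T>
void* to_void(T* p) {
    // A lone static_cast<void*>(p) is ill-formed when T is const or volatile.
    return const_cast<void*>(static_cast<const volatile void*>(p));
}

// Constructs an int in freshly allocated storage that is only reachable
// through a pointer to const, then reads it back.
int construct_const_int(int value) {
    void* raw = ::operator new(sizeof(int));
    const int* p = static_cast<const int*>(raw);
    const int* obj = ::new (to_void(p)) int(value);
    int result = *obj;
    ::operator delete(raw);
    return result;
}
```

Dynamically allocated storage is used deliberately: creating a new object in
the storage of a const object with automatic or static storage duration
would be undefined behavior.
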
libstdc++-v3/ChangeLog:

* include/bits/stl_construct.h (_Construct, _Construct_novalue):
Also cast away cv-qualifiers when converting pointer to void.
* testsuite/20_util/allocator/void.cc: Test construct function
with cv-qualified types.

libstdc++: Use std::construct_at in std::common_iterator [PR103992]

This should have been done as part of the LWG 3574 changes.

libstdc++-v3/ChangeLog:

PR libstdc++/103992
* include/bits/stl_iterator.h (common_iterator): Use
std::construct_at instead of placement new.
* testsuite/24_iterators/common_iterator/1.cc: Check copy
construction is usable in constant expressions.

libstdc++: Document new std::random_device tokens

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2011.xml: Document new tokens
accepted by std::random_device constructor.
* doc/html/manual/status.html: Regenerate.

x86_64: Improvements to arithmetic right shifts of V1TImode values.

This patch to the i386 backend's ix86_expand_v1ti_ashiftrt provides
improved (shorter) implementations of V1TI mode arithmetic right shifts
for constant amounts between 111 and 126 bits.  The significance of
this range is that this functionality is useful for (eventually)
providing sign extension from HImode and QImode to V1TImode.

For example, x>>112 (to sign extend a 16-bit value), was previously
generated as a four operation sequence:

        movdqa  %xmm0, %xmm1 // word    7 6 5 4 3 2 1 0
        psrad   $31, %xmm0 // V8HI = [S,S,?,?,?,?,?,?]
        psrad   $16, %xmm1 // V8HI = [S,X,?,?,?,?,?,?]
        punpckhqdq      %xmm0, %xmm1 // V8HI = [S,S,?,?,S,X,?,?]
        pshufd  $253, %xmm1, %xmm0 // V8HI = [S,S,S,S,S,S,S,X]

With this patch, we now generate a three operation sequence:

        psrad   $16, %xmm0 // V8HI = [S,X,?,?,?,?,?,?]
        pshufhw $254, %xmm0, %xmm0 // V8HI = [S,S,S,X,?,?,?,?]
        pshufd  $254, %xmm0, %xmm0 // V8HI = [S,S,S,S,S,S,S,X]

The correctness of generated code is confirmed by the existing
run-time test gcc.target/i386/sse2-v1ti-ashiftrt-1.c in the testsuite.
This idiom is safe to use for shifts by 127, but that case gets handled
by a two operation sequence earlier in this function.
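
For readers who want the arithmetic rather than the instruction sequence,
here is a scalar sketch of what x>>112 computes. It uses a plain __int128
instead of a V1TImode vector, so it says nothing about the generated code.
The helper name is hypothetical, and the sketch assumes GCC's behavior for
the unsigned-to-signed conversion and for arithmetic right shifts of
negative values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: put a 16-bit pattern into the top 16 bits of a
// 128-bit value, then shift right arithmetically by 112 to sign extend it.
__int128 sign_extend_top16(std::uint16_t bits) {
    unsigned __int128 top = static_cast<unsigned __int128>(bits) << 112;
    // Assumes modular conversion and an arithmetic (sign-propagating) shift.
    return static_cast<__int128>(top) >> 112;
}

// Self-check on a negative, a positive and the most negative 16-bit value.
void check_sign_extend_top16() {
    assert(sign_extend_top16(0xFFFB) == -5);
    assert(sign_extend_top16(0x7FFF) == 32767);
    assert(sign_extend_top16(0x8000) == -32768);
}
```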

2022-01-14  Roger Sayle  <roger@nextmovesoftware.com>
    Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
* config/i386/i386-expand.c (ix86_expand_v1ti_to_ti): Use force_reg.
(ix86_expand_ti_to_v1ti): Use force_reg.
(ix86_expand_v1ti_shift): Use force_reg.
(ix86_expand_v1ti_rotate): Use force_reg.
(ix86_expand_v1ti_ashiftrt): Provide new three operation
implementations for shifts by 111..126 bits.  Use force_reg.

ARM: fix -Wformat= error

gcc/ChangeLog:

* common/config/arm/arm-common.c (arm_target_mode): Fix
warning: unterminated quoting directive [-Wformat=].

tree-optimization/104009: Conservative underflow estimate in object size

Restrict negative offset computation only to dynamic object sizes, where
size expressions are accurate and not a maximum/minimum estimate and in
cases where negative offsets definitely mean an underflow, e.g. in
MEM_REF of the whole object with negative offset in addr_object_size.

This ends up missing some cases where __builtin_object_size could have
come up with more precise results, so tests have been adjusted to
reflect that.

gcc/ChangeLog:

PR tree-optimization/104009
* tree-object-size.c (compute_builtin_object_size): Bail out on
negative offset.
(plus_stmt_object_size): Return maximum of wholesize and minimum
of 0 for negative offset.

gcc/testsuite/ChangeLog:

PR tree-optimization/104009
* gcc.dg/builtin-object-size-1.c (test10): New test.
* gcc.dg/builtin-object-size-3.c (test10): Likewise.
(test9): Expect zero size for negative offsets.
* gcc.dg/builtin-object-size-4.c (test8): Likewise.
* gcc.dg/builtin-object-size-5.c (test7): Drop test for
__builtin_object_size.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

Fix ICE of unrecognizable insn. [PR target/104001]

For define_insn_and_split "*xor2andn":

1. Refine predicate of operands[0] from nonimmediate_operand to
register_operand.
2. Remove TARGET_AVX512BW from condition to avoid kmov when TARGET_BMI
is not available.

gcc/ChangeLog:

PR target/104001
PR target/94790
PR target/104014
* config/i386/i386.md (*xor2andn): Refine predicate of
operands[0] from nonimmediate_operand to
register_operand, remove TARGET_AVX512BW from condition.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr104001.c: New test.

Add __attribute__ ((tainted_args))

This patch adds a new __attribute__ ((tainted_args)) to the C/C++ frontends.

It can be used on function decls: the analyzer will treat as tainted
all parameters to the function and all buffers pointed to by parameters
to the function.  Adding this in one place to the Linux kernel's
__SYSCALL_DEFINEx macro allows the analyzer to treat all syscalls as
having tainted inputs.  This gives some coverage of system calls without
needing to "teach" the analyzer about "__user" - an example of the use
of this can be seen in CVE-2011-2210, where given:

SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *, buffer,
                 unsigned long, nbytes, int __user *, start, void __user *, arg)

the analyzer will treat the nbytes param as under attacker control, and
can complain accordingly:

taint-CVE-2011-2210-1.c: In function 'sys_osf_getsysinfo':
taint-CVE-2011-2210-1.c:69:21: warning: use of attacker-controlled value
  'nbytes' as size without upper-bounds checking [CWE-129] [-Wanalyzer-tainted-size]
   69 |                 if (copy_to_user(buffer, hwrpb, nbytes) != 0)
      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Additionally, the patch allows the attribute to be used on field decls:
specifically function pointers.  Any function used as an initializer
for such a field gets treated as being called with tainted arguments.
An example can be seen in CVE-2020-13143, where adding
__attribute__((tainted_args)) to the "store" callback of
configfs_attribute:

  struct configfs_attribute {
    /* [...snip...] */
    ssize_t (*store)(struct config_item *, const char *, size_t)
      __attribute__((tainted_args));
    /* [...snip...] */
  };

allows the analyzer to see:

CONFIGFS_ATTR(gadget_dev_desc_, UDC);

and treat gadget_dev_desc_UDC_store as having tainted arguments, so that
it complains:

taint-CVE-2020-13143-1.c: In function 'gadget_dev_desc_UDC_store':
taint-CVE-2020-13143-1.c:33:17: warning: use of attacker-controlled value
  'len + 18446744073709551615' as offset without upper-bounds checking [CWE-823] [-Wanalyzer-tainted-offset]
   33 |         if (name[len - 1] == '\n')
      |             ~~~~^~~~~~~~~

As before this currently still needs -fanalyzer-checker=taint (in
addition to -fanalyzer).
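
A small stand-alone sketch of the function-decl form follows. The names
TAINTED_ARGS and copy_request are hypothetical and not part of the patch;
the macro expands to nothing on compilers that lack the attribute. The body
includes the kind of upper-bound check whose absence the warnings above
point at. The analyzer examples in this commit are C, so treat the C++
spelling here purely as an illustration of the syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#if defined(__has_attribute)
#  if __has_attribute(tainted_args)
#    define TAINTED_ARGS __attribute__((tainted_args))
#  endif
#endif
#ifndef TAINTED_ARGS
#  define TAINTED_ARGS  /* attribute unavailable: expands to nothing */
#endif

// Hypothetical entry point whose parameters should all be treated as
// attacker-controlled.
TAINTED_ARGS
std::size_t copy_request(char* dst, std::size_t dst_size,
                         const char* src, std::size_t nbytes) {
    if (nbytes > dst_size)  // upper-bound check on the tainted size
        return 0;
    std::memcpy(dst, src, nbytes);
    return nbytes;
}

// Self-check: an in-range copy succeeds, an oversized request is rejected.
void check_copy_request() {
    char buf[8];
    assert(copy_request(buf, sizeof buf, "abc", 4) == 4);
    assert(std::strcmp(buf, "abc") == 0);
    assert(copy_request(buf, sizeof buf, "0123456789", 11) == 0);
}
```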

gcc/analyzer/ChangeLog:
* engine.cc: Include "stringpool.h", "attribs.h", and
"tree-dfa.h".
(mark_params_as_tainted): New.
(class tainted_args_function_custom_event): New.
(class tainted_args_function_info): New.
(exploded_graph::add_function_entry): Handle functions with
"tainted_args" attribute.
(class tainted_args_field_custom_event): New.
(class tainted_args_callback_custom_event): New.
(class tainted_args_call_info): New.
(add_tainted_args_callback): New.
(add_any_callbacks): New.
(exploded_graph::build_initial_worklist): Likewise.
(exploded_graph::build_initial_worklist): Find callbacks that are
reachable from global initializers, calling add_any_callbacks on
them.

gcc/c-family/ChangeLog:
* c-attribs.c (c_common_attribute_table): Add "tainted_args".
(handle_tainted_args_attribute): New.

gcc/ChangeLog:
* doc/extend.texi (Function Attributes): Note that "tainted_args" can
be used on field decls.
(Common Function Attributes): Add entry on "tainted_args" attribute.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/attr-tainted_args-1.c: New test.
* gcc.dg/analyzer/attr-tainted_args-misuses.c: New test.
* gcc.dg/analyzer/taint-CVE-2011-2210-1.c: New test.
* gcc.dg/analyzer/taint-CVE-2020-13143-1.c: New test.
* gcc.dg/analyzer/taint-CVE-2020-13143-2.c: New test.
* gcc.dg/analyzer/taint-CVE-2020-13143.h: New test.
* gcc.dg/analyzer/taint-alloc-3.c: New test.
* gcc.dg/analyzer/taint-alloc-4.c: New test.
* gcc.dg/analyzer/test-uaccess.h: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Daily bump.

toplevel: Remove incorrectly added file

2022-01-13 Jakub Jelinek <jakub@redhat.com>

* Makefile.am: Remove.

c++: warning for dependent template members [PR70417]

Add a helpful warning message for when the user forgets to
include the "template" keyword after ., -> or :: when
accessing a member in a dependent context, where the member is a
template.
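
A minimal example of the situation the warning targets (Box and read_int
are hypothetical names used only here):

```cpp
#include <cassert>

struct Box {
    template <typename T>
    T get() const { return static_cast<T>(42); }
};

template <typename B>
int read_int(const B& b) {
    // b has a dependent type, so the member template must be introduced
    // with the "template" keyword. Without it, the '<' after "get" is
    // parsed as a less-than operator.
    return b.template get<int>();
}

// Self-check: the correctly spelled call instantiates and runs.
void check_read_int() {
    assert(read_int(Box{}) == 42);
}
```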

PR c++/70417

gcc/c-family/ChangeLog:

* c.opt: Added -Wmissing-template-keyword.

gcc/cp/ChangeLog:

* parser.c (cp_parser_id_expression): Handle
-Wmissing-template-keyword.
(struct saved_token_sentinel): Add modes to control what happens
on destruction.
(cp_parser_statement): Adjust.
(cp_parser_skip_entire_template_parameter_list): New function that
skips an entire template parameter list.
(cp_parser_require_end_of_template_parameter_list): Rename old
cp_parser_skip_to_end_of_template_parameter_list.
(cp_parser_skip_to_end_of_template_parameter_list): Refactor to be
called from one of the above two functions.
(cp_parser_lambda_declarator_opt)
(cp_parser_explicit_template_declaration)
(cp_parser_enclosed_template_argument_list): Adjust.

gcc/ChangeLog:

* doc/invoke.texi: Documentation for Wmissing-template-keyword.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/variadic-mem_fn2.C: Catch warning about missing
template keyword.
* g++.dg/template/dependent-name17.C: New test.
* g++.dg/template/dependent-name18.C: New test.

Co-authored-by: Jason Merrill <jason@redhat.com>

i386: Introduce V2QImode vectorized shifts [PR103861]

Add V2QImode shift operations and split them to synthesized
double HI/LO QImode operations with integer registers.

Also robustify arithmetic split patterns.

2022-01-13 Uroš Bizjak <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/103861
* config/i386/i386.md (*ashlqi_ext<mode>_2): New insn pattern.
(*<any_shiftrt:insn>qi_ext<mode>_2): Ditto.
* config/i386/mmx.md (<any_shift:insn>v2qi):
New insn_and_split pattern.

gcc/testsuite/ChangeLog:

PR target/103861
* gcc.target/i386/pr103861.c (shl,ashr,lshr): New tests.

vect: Add bias parameter for partial vectorization

This introduces a bias parameter for the len_load/len_store ifns as well as
optabs that is meant to distinguish between Power and s390 variants.
PowerPC's instructions require a bias of 0, while in s390's case
vll/vstl do not support lengths of zero bytes and a bias of -1 must be used.

gcc/ChangeLog:

* internal-fn.c (expand_partial_load_optab_fn): Add bias.
(expand_partial_store_optab_fn): Likewise.
(internal_len_load_store_bias): New function.
* internal-fn.h (VECT_PARTIAL_BIAS_UNSUPPORTED): New define.
(internal_len_load_store_bias): New function.
* tree-vect-loop-manip.c (vect_set_loop_controls_directly): Set bias.
(vect_set_loop_condition_partial_vectors): Add header_seq parameter.
* tree-vect-loop.c (vect_verify_loop_lens): Verify bias.
(vect_estimate_min_profitable_iters): Account for bias.
(vect_get_loop_len): Add bias-adjusted length.
* tree-vect-stmts.c (vectorizable_store): Use.
(vectorizable_load): Use.
* tree-vectorizer.h (struct rgroup_controls): Add bias-adjusted length.
(LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS): New macro.
* config/rs6000/vsx.md: Use const0 bias predicate.
* doc/md.texi: Document bias value.

Add support for allocate clause (OpenMP 5.0).

This patch adds support for OpenMP 5.0 allocate clause for Fortran. It does not
yet support the allocator-modifier as specified in OpenMP 5.1. The allocate
clause is already supported in C/C++.

gcc/fortran/ChangeLog:

* dump-parse-tree.c (show_omp_clauses): Handle OMP_LIST_ALLOCATE.
* gfortran.h (OMP_LIST_ALLOCATE): New enum value.
* openmp.c (enum omp_mask1): Add OMP_CLAUSE_ALLOCATE.
(gfc_match_omp_clauses): Handle OMP_CLAUSE_ALLOCATE
(OMP_PARALLEL_CLAUSES, OMP_DO_CLAUSES, OMP_SECTIONS_CLAUSES)
(OMP_TASK_CLAUSES, OMP_TASKLOOP_CLAUSES, OMP_TARGET_CLAUSES)
(OMP_TEAMS_CLAUSES, OMP_DISTRIBUTE_CLAUSES)
(OMP_SINGLE_CLAUSES): Add OMP_CLAUSE_ALLOCATE.
(OMP_TASKGROUP_CLAUSES): New.
(gfc_match_omp_taskgroup): Use OMP_TASKGROUP_CLAUSES instead of
OMP_CLAUSE_TASK_REDUCTION.
(resolve_omp_clauses): Handle OMP_LIST_ALLOCATE.
(resolve_omp_do): Avoid warning when loop iteration variable is
in allocate clause.
* trans-openmp.c (gfc_trans_omp_clauses): Handle translation of
allocate clause.
(gfc_split_omp_clauses): Update for OMP_LIST_ALLOCATE.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/allocate-1.f90: New test.
* gfortran.dg/gomp/allocate-2.f90: New test.
* gfortran.dg/gomp/allocate-3.f90: New test.
* gfortran.dg/gomp/collapse1.f90: Update error message.
* gfortran.dg/gomp/openmp-simd-4.f90: Likewise.
* gfortran.dg/gomp/clauses-1.f90: Uncomment allocate clause.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/allocate-1.c: New test.
* testsuite/libgomp.fortran/allocate-1.f90: New test.
* libgomp.texi: Remove string that says that allocate clause
support is for C/C++ only.

Allow more precision when querying from fold_const.

fold_const::expr_not_equal_to queries for a current range, but still uses
the old value_range class. This is causing it to miss opportunities when
ranger can provide something better.

PR tree-optimization/83072
PR tree-optimization/83073
PR tree-optimization/97909
gcc/
* fold-const.c (expr_not_equal_to): Use a multi-range class.

gcc/testsuite/
* gcc.dg/pr83072-2.c: New.
* gcc.dg/pr83073.c: New.

Add relation to unsigned right shift.

If the first operand and the shift value of a right shift operation are both
>= 0, then we know the LHS of the operation is <= the first operand.

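The relation can be checked at the source level with a short sketch (the
function names are hypothetical; this illustrates the arithmetic fact, not
the range-op.cc implementation). The shift count is assumed to be smaller
than the width of the type.

```cpp
#include <cassert>

// For x >= 0 and 0 <= s < 31, the shifted value never exceeds x.
bool shifted_not_greater(int x, int s) {
    return (x >> s) <= x;
}

// With the relation known, the "y > x" branch below is provably dead
// whenever the guard on the first line has passed.
int shift_or_flag(int x, int s) {
    if (x < 0 || s < 0 || s >= 31)
        return 0;
    int y = x >> s;
    if (y > x)
        return -1;  // unreachable for the guarded inputs
    return y;
}

// Self-check over a handful of non-negative values and all shift counts.
bool relation_holds_for_samples() {
    const int samples[] = {0, 1, 2, 3, 255, 4096, 123456789, 2147483647};
    for (int x : samples)
        for (int s = 0; s < 31; ++s)
            if (!shifted_not_greater(x, s) || shift_or_flag(x, s) < 0)
                return false;
    return true;
}
```
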
PR tree-optimization/96707
gcc/
* range-op.cc (operator_rshift::lhs_op1_relation): New.

gcc/testsuite/
* g++.dg/pr96707.C: New.

Fortran: fix error recovery on bad structure constructor in DATA statement

gcc/fortran/ChangeLog:

PR fortran/67804
* primary.c (gfc_match_structure_constructor): Recover from errors
that occurred while checking for a valid structure constructor in
a DATA statement.

gcc/testsuite/ChangeLog:

PR fortran/67804
* gfortran.dg/pr93604.f90: Adjust to changed diagnostics.
* gfortran.dg/pr67804.f90: New test.

i386: Cleanup V2QI arithmetic instructions

2022-01-13 Uroš Bizjak <ubizjak@gmail.com>

gcc/ChangeLog:

* config/i386/mmx.md (negv2qi): Disparage GPR alternative a bit.
Disable for TARGET_PARTIAL_REG_STALL unless optimizing for size.
(negv2qi splitters): Use lowpart_subreg instead of
gen_lowpart to create subreg.
(<plusminus:insn>v2qi3): Disparage GPR alternative a bit.
Disable for TARGET_PARTIAL_REG_STALL unless optimizing for size.
(<plusminus:insn>v2qi3 splitters): Use lowpart_subreg instead of
gen_lowpart to create subreg.
* config/i386/i386.md (*subqi_ext<mode>_2): Move.

libgfortran: Fix Solaris version file creation [PR104006]

I forgot to change the gfortran.map-sun goal to gfortran.ver-sun
when changing other spots for the preprocessed version file.

2022-01-13 Jakub Jelinek <jakub@redhat.com>

PR libfortran/104006
* Makefile.am (gfortran.map-sun): Rename target to ...
(gfortran.ver-sun): ... this.
* Makefile.in: Regenerated.

i386: Add 16-bit vector modes to xop_pcmov [PR104003]

2022-01-13 Uroš Bizjak <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/104003
* config/i386/mmx.md (*xop_pcmov_<mode>): Use VI_16_32 mode iterator.

gcc/testsuite/ChangeLog:

PR target/104003
* g++.target/i386/pr103861-1-sse4.C: New test.
* g++.target/i386/pr103861-1-xop.C: Ditto.

Fix -Wformat-diag for ARM target.

gcc/ChangeLog:

* common/config/arm/arm-common.c (arm_target_mode): Wrap
keywords with %<, %> and remove trailing punctuation char.
(arm_canon_arch_option_1): Likewise.
(arm_asm_auto_mfpu): Likewise.
* config/arm/arm-builtins.c (arm_expand_builtin): Likewise.
* config/arm/arm.c (arm_options_perform_arch_sanity_checks): Likewise.
(use_vfp_abi): Likewise.
(aapcs_vfp_is_call_or_return_candidate): Likewise.
(arm_handle_cmse_nonsecure_entry): Likewise.
(arm_handle_cmse_nonsecure_call): Likewise.
(thumb1_md_asm_adjust): Likewise.

rs6000: Support SSE4.1 "round" intrinsics

Suppress exceptions (when specified), by saving, manipulating, and
restoring the FPSCR.  Similarly, save, set, and restore the floating-point
rounding mode when required.
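
The save/set/restore pattern for the rounding mode can be sketched portably
with <cfenv>. This is only an analogy to what the intrinsics do with the
FPSCR: round_with_mode is a hypothetical helper, exception suppression is
not modeled, and the volatile accesses stand in for the scheduling barriers
mentioned below.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper: round value to an integer under a temporary
// rounding mode, then restore the caller's mode.
double round_with_mode(double value, int mode) {
    const int saved = std::fegetround();
    std::fesetround(mode);
    // volatile keeps the computation between the two mode changes.
    volatile double input = value;
    volatile double result = std::nearbyint(input);
    std::fesetround(saved);
    return result;
}

// Self-check: the directed modes differ and the caller's mode survives.
void check_round_with_mode() {
    const int before = std::fegetround();
    assert(round_with_mode(2.5, FE_UPWARD) == 3.0);
    assert(round_with_mode(2.5, FE_DOWNWARD) == 2.0);
    assert(round_with_mode(-2.5, FE_TOWARDZERO) == -2.0);
    assert(std::fegetround() == before);
}
```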

No attempt is made to optimize writing the FPSCR (by checking if the new
value would be the same), other than using lighter weight instructions
when possible. Note that explicit instruction scheduling "barriers" are
added to prevent floating-point computations from being moved before or
after the explicit FPSCR manipulations.  (That these are required has
been reported as an issue in GCC: PR102783.)

The scalar versions naively use the parallel versions to compute the
single scalar result and then construct the remainder of the result.

Of minor note, the values of _MM_FROUND_TO_NEG_INF and _MM_FROUND_TO_ZERO
are swapped from the corresponding values on x86 so as to match the
corresponding rounding mode values in the Power ISA.

Move implementations of _mm_ceil* and _mm_floor* into _mm_round*, and
convert _mm_ceil* and _mm_floor* into macros. This matches the current
analogous implementations in config/i386/smmintrin.h.

Function signatures match the analogous functions in config/i386/smmintrin.h.

Add tests for _mm_round_pd, _mm_round_ps, _mm_round_sd, _mm_round_ss,
modeled after the very similar "floor" and "ceil" tests.

Include basic tests, plus tests at the boundaries for floating-point
representation, positive and negative, test all of the parameterized
rounding modes as well as the C99 rounding modes and interactions
between the two.

Exceptions are not explicitly tested.

2022-01-13  Paul A. Clarke  <pc@us.ibm.com>

gcc
* config/rs6000/smmintrin.h (_mm_round_pd, _mm_round_ps,
_mm_round_sd, _mm_round_ss, _MM_FROUND_TO_NEAREST_INT,
_MM_FROUND_TO_ZERO, _MM_FROUND_TO_POS_INF, _MM_FROUND_TO_NEG_INF,
_MM_FROUND_CUR_DIRECTION, _MM_FROUND_RAISE_EXC, _MM_FROUND_NO_EXC,
_MM_FROUND_NINT, _MM_FROUND_FLOOR, _MM_FROUND_CEIL, _MM_FROUND_TRUNC,
_MM_FROUND_RINT, _MM_FROUND_NEARBYINT): New.
(_mm_ceil_pd, _mm_ceil_ps, _mm_ceil_sd, _mm_ceil_ss, _mm_floor_pd,
_mm_floor_ps, _mm_floor_sd, _mm_floor_ss): Convert from function to
macro.

gcc/testsuite
* gcc.target/powerpc/sse4_1-round3.h: New.
* gcc.target/powerpc/sse4_1-roundpd.c: New.
* gcc.target/powerpc/sse4_1-roundps.c: New.
* gcc.target/powerpc/sse4_1-roundsd.c: New.
* gcc.target/powerpc/sse4_1-roundss.c: New.

c/104002 - shufflevector variable indexing

Variable indexing of a __builtin_shufflevector result is broken because
we fail to properly mark the TARGET_EXPR decl as addressable.

2022-01-13 Richard Biener <rguenther@suse.de>

PR c/104002
gcc/c-family/
* c-common.c (c_common_mark_addressable_vec): Handle TARGET_EXPR.

gcc/testsuite/
* c-c++-common/builtin-shufflevector-3.c: Move ...
* c-c++-common/torture/builtin-shufflevector-3.c: ... here.

inliner: Don't emit copy stmts for empty type parameters [PR103989]

The following patch avoids emitting a parameter copy statement when inlining
if the parameter has empty type.  E.g. the gimplifier does something similar
(except that it needs to evaluate side-effects if any, which isn't the case
here):
  /* For empty types only gimplify the left hand side and right hand
     side as statements and throw away the assignment.  Do this after
     gimplify_modify_expr_rhs so we handle TARGET_EXPRs of addressable
     types properly.  */
  if (is_empty_type (TREE_TYPE (*from_p))
      && !want_value
      /* Don't do this for calls that return addressable types, expand_call
         relies on those having a lhs.  */
      && !(TREE_ADDRESSABLE (TREE_TYPE (*from_p))
           && TREE_CODE (*from_p) == CALL_EXPR))
    {
      gimplify_stmt (from_p, pre_p);
      gimplify_stmt (to_p, pre_p);
      *expr_p = NULL_TREE;
      return GS_ALL_DONE;
    }
Unfortunately, this patch doesn't cure the uninit warnings in that PR,
which are caused by IPA inlining happening even at -Og when the post-IPA
-Og passes don't expect the need to clean up after IPA inlining,
but I think it is desirable anyway.

2022-01-13  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/103989
* tree-inline.c (setup_one_parameter): Don't copy parms with
empty type.

Improve Intel MIC offloading XFAILing for 'omp_get_device_num'

After recent commit be661959a6b6d8f9c3c8608a746789e7b2ec3ca4
"libgomp/testsuite: Improve omp_get_device_num() tests", we're now iterating
over all OpenMP target devices.  Intel MIC (emulated) offloading still doesn't
properly implement device-side 'omp_get_device_num', and we thus regress:

    PASS: libgomp.c/../libgomp.c-c++-common/target-45.c (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/target-45.c execution test

    PASS: libgomp.c++/../libgomp.c-c++-common/target-45.c (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.c++/../libgomp.c-c++-common/target-45.c execution test

    PASS: libgomp.fortran/target10.f90   -O0  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90   -O0  execution test
    PASS: libgomp.fortran/target10.f90   -O1  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90   -O1  execution test
    PASS: libgomp.fortran/target10.f90   -O2  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90   -O2  execution test
    PASS: libgomp.fortran/target10.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
    PASS: libgomp.fortran/target10.f90   -O3 -g  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90   -O3 -g  execution test
    PASS: libgomp.fortran/target10.f90   -Os  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90   -Os  execution test

Improve the XFAILing added in commit bb75b22aba254e8ff144db27b1c8b4804bad73bb
"Allow matching Intel MIC in OpenMP 'declare variant'" for the case that *any*
Intel MIC offload device is available.

libgomp/
* testsuite/libgomp.c-c++-common/on_device_arch.h
(any_device_arch, any_device_arch_intel_mic): New.
* testsuite/lib/libgomp.exp
(check_effective_target_offload_device_any_intel_mic): New.
* testsuite/libgomp.c-c++-common/target-45.c: Use it.
* testsuite/libgomp.fortran/target10.f90: Likewise.

Merge 'c-c++-common/goacc/routine-6.c' into 'c-c++-common/goacc/routine-5.c', and document current C/C++ difference

gcc/testsuite/
* c-c++-common/goacc/routine-6.c: Merge into...
* c-c++-common/goacc/routine-5.c: ... this, and document current
C/C++ difference.

Document current '-Wuninitialized' diagnostics for 'libgomp.oacc-fortran/routine-10.f90' [PR102192]

libgomp/
PR tree-optimization/102192
* testsuite/libgomp.oacc-fortran/routine-10.f90: Document current
'-Wuninitialized' diagnostics.

Document current '-Wuninitialized'/'-Wmaybe-uninitialized' diagnostics for OpenACC test cases

... including "note: '[...]' was declared here" emitted since recent
commit 9695e1c23be5b5c55d572ced152897313ddb96ae
"Improve -Wuninitialized note location".

For those that seemed incorrect to me, I've placed XFAILed 'dg-bogus'es,
including one more instance of PR77504 etc., and several instances where
for "local variables" of reference-data-type reductions (etc.?) we emit
bogus (?) diagnostics.

For implicit data clauses (including 'firstprivate'), we seem to be missing
diagnostics, so I've placed XFAILed 'dg-warning's.

gcc/testsuite/
* c-c++-common/goacc/builtin-goacc-parlevel-id-size.c: Document
current '-Wuninitialized' diagnostics.
* c-c++-common/goacc/mdc-1.c: Likewise.
* c-c++-common/goacc/nested-reductions-1-kernels.c: Likewise.
* c-c++-common/goacc/nested-reductions-1-parallel.c: Likewise.
* c-c++-common/goacc/nested-reductions-1-routine.c: Likewise.
* c-c++-common/goacc/nested-reductions-2-kernels.c: Likewise.
* c-c++-common/goacc/nested-reductions-2-parallel.c: Likewise.
* c-c++-common/goacc/nested-reductions-2-routine.c: Likewise.
* c-c++-common/goacc/uninit-dim-clause.c: Likewise.
* c-c++-common/goacc/uninit-firstprivate-clause.c: Likewise.
* c-c++-common/goacc/uninit-if-clause.c: Likewise.
* gfortran.dg/goacc/array-with-dt-1.f90: Likewise.
* gfortran.dg/goacc/array-with-dt-2.f90: Likewise.
* gfortran.dg/goacc/array-with-dt-3.f90: Likewise.
* gfortran.dg/goacc/array-with-dt-4.f90: Likewise.
* gfortran.dg/goacc/array-with-dt-5.f90: Likewise.
* gfortran.dg/goacc/derived-chartypes-1.f90: Likewise.
* gfortran.dg/goacc/derived-chartypes-2.f90: Likewise.
* gfortran.dg/goacc/derived-chartypes-3.f90: Likewise.
* gfortran.dg/goacc/derived-chartypes-4.f90: Likewise.
* gfortran.dg/goacc/derived-classtypes-1.f95: Likewise.
* gfortran.dg/goacc/derived-types-2.f90: Likewise.
* gfortran.dg/goacc/host_data-tree.f95: Likewise.
* gfortran.dg/goacc/kernels-tree.f95: Likewise.
* gfortran.dg/goacc/modules.f95: Likewise.
* gfortran.dg/goacc/nested-reductions-1-kernels.f90: Likewise.
* gfortran.dg/goacc/nested-reductions-1-parallel.f90: Likewise.
* gfortran.dg/goacc/nested-reductions-1-routine.f90: Likewise.
* gfortran.dg/goacc/nested-reductions-2-kernels.f90: Likewise.
* gfortran.dg/goacc/nested-reductions-2-parallel.f90: Likewise.
* gfortran.dg/goacc/nested-reductions-2-routine.f90: Likewise.
* gfortran.dg/goacc/parallel-tree.f95: Likewise.
* gfortran.dg/goacc/pr93464.f90: Likewise.
* gfortran.dg/goacc/privatization-1-compute-loop.f90: Likewise.
* gfortran.dg/goacc/privatization-1-compute.f90: Likewise.
* gfortran.dg/goacc/privatization-1-routine_gang-loop.f90:
Likewise.
* gfortran.dg/goacc/privatization-1-routine_gang.f90: Likewise.
* gfortran.dg/goacc/uninit-dim-clause.f95: Likewise.
* gfortran.dg/goacc/uninit-firstprivate-clause.f95: Likewise.
* gfortran.dg/goacc/uninit-if-clause.f95: Likewise.
* gfortran.dg/goacc/uninit-use-device-clause.f95: Likewise.
* gfortran.dg/goacc/wait.f90: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Document
current '-Wuninitialized' diagnostics.
* testsuite/libgomp.oacc-fortran/data-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/gemm-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/gemm.f90: Likewise.
* testsuite/libgomp.oacc-fortran/optional-reduction.f90: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pr70643.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pr96628-part1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-7.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reference-reductions.f90:
Likewise.

Simplify git-backport.py script.

It's very unlikely that somebody is going to backport a revision
that is > 14 months old to a release branch.

contrib/ChangeLog:

* git-backport.py: Simplify the script as pre-auto-ChangeLog era
is 14 months old.

Host and offload targets have no common meaning of address spaces

gcc/
* tree-streamer-out.c (pack_ts_base_value_fields): Don't pack
'TYPE_ADDR_SPACE' for offloading.
* tree-streamer-in.c (unpack_ts_base_value_fields): Don't unpack
'TYPE_ADDR_SPACE' for offloading.
libgomp/
* testsuite/libgomp.c/address-space-1.c: Remove 'dg-xfail-run-if'
for 'offload_device_intel_mic'.

Wait at end of OpenACC asynchronous kernels regions

In OpenACC 'kernels' decomposition, we're improperly nesting synchronous and
asynchronous data and compute regions, giving rise to data races when the
asynchronicity is actually executed, as is visible in at least one test case
with GCN offloading.

The proper fix is to correctly use the asynchronous interfaces, making the
currently synchronous data regions fully asynchronous (see also
<https://gcc.gnu.org/PR97390> "[OpenACC] 'async' clause on 'data' construct",
which is to share the same implementation), but that's for later; for now add
some more synchronization.

gcc/
* omp-oacc-kernels-decompose.cc (add_wait): New function, split out
of...
(add_async_clauses_and_wait): ...here. Call new outlined function.
(decompose_kernels_region_body): Add wait at the end of
explicitly-asynchronous kernels regions.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Remove GCN
offloading execution XFAIL.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>

OpenACC 'kernels' decomposition: Mark variables used in synthesized data clauses as addressable [PR100280]

... as otherwise 'gcc/omp-low.c:lower_omp_target' has to create a temporary:

    else if (is_gimple_reg (var))
      {
        gcc_assert (offloaded);
        tree avar = create_tmp_var (TREE_TYPE (var));
        mark_addressable (avar);

..., which (a) is only implemented for actually *offloaded* regions (but not
data regions), and (b) the subsequently synthesized code for writing to and
later reading back from the temporary fundamentally conflicts with OpenACC
'async' (as used by OpenACC 'kernels' decomposition).  That's all not trivial
to make work, so let's just avoid this case.

gcc/
PR middle-end/100280
* omp-oacc-kernels-decompose.cc (maybe_build_inner_data_region):
Mark variables used in synthesized data clauses as addressable.
gcc/testsuite/
PR middle-end/100280
* c-c++-common/goacc/kernels-decompose-pr100280-1.c: New.
* c-c++-common/goacc/classify-kernels-parloops.c: Likewise.
* c-c++-common/goacc/classify-kernels-unparallelized-parloops.c:
Likewise.
* c-c++-common/goacc/classify-kernels-unparallelized.c: Test
'--param openacc-kernels=decompose'.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/kernels-decompose-2.c: Update.
* c-c++-common/goacc/kernels-decompose-ice-1.c: Remove.
* c-c++-common/goacc/kernels-decompose-ice-2.c: Likewise.
* gfortran.dg/goacc/classify-kernels-parloops.f95: New.
* gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95:
Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Test
'--param openacc-kernels=decompose'.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
libgomp/
PR middle-end/100280
* testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c:
Update.
* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
Likewise.

Suggested-by: Julian Brown <julian@codesourcery.com>

Enhance OpenACC 'kernels' decomposition testing

gcc/testsuite/
* c-c++-common/goacc/kernels-decompose-1.c: Enhance.
* c-c++-common/goacc/kernels-decompose-2.c: Likewise.
* c-c++-common/goacc/kernels-decompose-ice-1.c: Likewise.
* c-c++-common/goacc/kernels-decompose-ice-2.c: Likewise.
* gfortran.dg/goacc/kernels-decompose-1.f95: Likewise.
* gfortran.dg/goacc/kernels-decompose-2.f95: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c:
Enhance.
* testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/declare-vla.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
Likewise.
* testsuite/libgomp.oacc-fortran/asyncwait-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/asyncwait-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/asyncwait-3.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pr94358-1.f90: Likewise.

epiphany: fix -Wimplicit-fallthrough warnings in epiphany.c.

gcc/ChangeLog:

* config/epiphany/epiphany.c (epiphany_mode_priority):
Use gcc_unreachable for not handled cases.

epiphany: fix -Wformat-diag.

gcc/ChangeLog:

* config/epiphany/epiphany.c (epiphany_handle_interrupt_attribute):
Use %qs format specifier.
(epiphany_override_options): Wrap keyword in %<, %>.

Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask).

From the perspective of the pipeline, the `andn + and + ior` version takes
2 cycles (AND and ANDN do not depend on each other), but xor + and + xor
takes 3 cycles.

-       xorl    %edi, %esi
        andl    %edx, %esi
-       movl    %esi, %eax
-       xorl    %edi, %eax
+       andn    %edi, %edx, %eax
+       orl     %esi, %eax
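
The identity behind this rewrite can be checked in plain C. The following is a minimal sketch, not code from the patch; the function names are made up for illustration. Because every operation is bitwise, checking the eight single-bit combinations of a, b and mask covers every bit position.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: xor, and, xor form a chain of three dependent steps.  */
uint32_t
select_xor (uint32_t a, uint32_t b, uint32_t mask)
{
  return a ^ ((a ^ b) & mask);
}

/* Rewritten form: the two ANDs are independent of each other.  */
uint32_t
select_andn (uint32_t a, uint32_t b, uint32_t mask)
{
  return (~mask & a) | (b & mask);
}

/* All operations are bitwise, so the 8 single-bit combinations
   are enough to show both forms agree at every bit position.  */
void
check_select_forms (void)
{
  for (uint32_t a = 0; a < 2; a++)
    for (uint32_t b = 0; b < 2; b++)
      for (uint32_t m = 0; m < 2; m++)
        assert ((select_xor (a, b, m) & 1) == (select_andn (a, b, m) & 1));
}
```

Both forms pick bits of b where mask is set and bits of a elsewhere.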

gcc/ChangeLog:

PR target/94790
* config/i386/i386.md (*xor2andn): New define_insn_and_split.

gcc/testsuite/ChangeLog:

PR target/94790
* gcc.target/i386/pr94790-1.c: New test.
* gcc.target/i386/pr94790-2.c: Ditto.

rs6000: Add split pattern to replace

7: r120:V4SI=const_vector
8: r121:V4SI=unspec[r120:V4SI,r120:V4SI,0xc] 260

with r121:V4SI = r120:V4SI when r120 is a vector whose elements are all the same.

gcc/ChangeLog:

* config/rs6000/altivec.md (sldoi_to_mov<mode>): New.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/sldoi_to_mov.c: New test.

Daily bump.

testsuite: Compile gcc.target/i386/pr103861-3.c with -fno-vect-cost-model [PR103941]

2022-01-12 Uroš Bizjak <ubizjak@gmail.com>

gcc/testsuite/ChangeLog:

PR target/103941
* gcc.target/i386/pr103861-3.c (dg-options): Add -fno-vect-cost-model.

testsuite: Compile g++.dg/vect/slp-pr98855.cc only for x86 targets [PR103935]

The testcase is x86 specific, other targets have different costs defined.

2022-01-12 Uroš Bizjak <ubizjak@gmail.com>

gcc/testsuite/ChangeLog:

PR target/103935
* g++.dg/vect/slp-pr98855.cc: Compile only for x86 targets.

i386: Add CC clobber and splits for 32-bit vector mode logic insns [PR100637, PR103861]

Add CC clobber to 32-bit vector mode logic insns to allow variants with
general-purpose registers.  Also improve ix86_sse_movcc to emit insn with
CC clobber for narrow vector modes in order to re-enable conditional moves
for 16-bit and 32-bit narrow vector modes with -msse2.

2022-01-12  Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/100637
PR target/103861
* config/i386/i386-expand.c (ix86_emit_vec_binop): New static function.
(ix86_expand_sse_movcc): Use ix86_emit_vec_binop instead of gen_rtx_X
when constructing vector logic RTXes.
(expand_vec_perm_pshufb2): Ditto.
* config/i386/mmx.md (negv2qi): Disparage GPR alternative a bit.
(<plusminus:insn>v2qi3): Ditto.
(vcond<mode><mode>): Re-enable for TARGET_SSE2.
(vcondu<mode><mode>): Ditto.
(vcond_mask_<mode><mode>): Ditto.
(one_cmpl<VI_32:mode>2): Remove expander.
(one_cmpl<VI_16_32:mode>2): Rename from one_cmplv2qi.
Use VI_16_32 mode iterator.
(one_cmpl<VI_16_32:mode>2 splitters): Use VI_16_32 mode iterator.
Use lowpart_subreg instead of gen_lowpart to create subreg.
(*andnot<VI_16_32:mode>3): Merge from "*andnot<VI_32:mode>" and
"*andnotv2qi3" insn patterns using VI_16_32 mode iterator.
Disparage GPR alternative a bit.  Add CC clobber.
(*andnot<VI_16_32:mode>3 splitters): Use VI_16_32 mode iterator.
Use lowpart_subreg instead of gen_lowpart to create subreg.
(*<any_logic:code><VI_16_32:mode>3): Merge from
"*<any_logic:code><VI_32:mode>" and "*<any_logic:code>v2qi3" insn patterns
using VI_16_32 mode iterator.  Disparage GPR alternative a bit.
Add CC clobber.
(*<any_logic:code><VI_16_32:mode>3 splitters): Use VI_16_32 mode
iterator.  Use lowpart_subreg instead of gen_lowpart to create subreg.

gcc/testsuite/ChangeLog:

PR target/100637
PR target/103861
* g++.target/i386/pr100637-1b.C (dg-options):
Use -msse2 instead of -msse4.1.
* g++.target/i386/pr100637-1w.C (dg-options): Ditto.
* g++.target/i386/pr103861-1.C (dg-options): Ditto.
* gcc.target/i386/pr100637-4b.c (dg-options): Ditto.
* gcc.target/i386/pr103861-4.c (dg-options): Ditto.
* gcc.target/i386/pr100637-1b.c: Remove scan-assembler
directives for logic instructions.
* gcc.target/i386/pr100637-1w.c: Ditto.
* gcc.target/i386/warn-vect-op-2.c:
Update dg-warning for vector logic operation.

Fix pr101384-1.c code generation test.

Add support for the compiler using XXSPLTIB reg,255 to load all 1's into a
register on power9 and above instead of using VSPLTI{B,H,W} reg,-1.

gcc/testsuite/
2022-01-12 Michael Meissner <meissner@the-meissners.org>

PR testsuite/102935
* gcc.target/powerpc/pr101384-1.c: Update insn regexp for power9
and power10.

libstdc++: Add explicit dg-do directive to .../103955.cc

libstdc++-v3/ChangeLog:

* testsuite/20_util/to_chars/103955.cc: Add explicit dg-do
directive.

aix: handle 64bit inodes for include directories

On AIX, stat will store inodes in 32bit even when using LARGE_FILES.
If the inode is larger, it will return -1 in st_ino.
Thus, in incpath.c when comparing include directories, if several
of them have 64bit inodes, they will be considered as duplicated.

gcc/ChangeLog:
2022-01-12 Clément Chigot <clement.chigot@atos.net>

* configure.ac: Check sizeof ino_t and dev_t.
(HOST_STAT_FOR_64BIT_INODES): New AC_DEFINE to provide stat
syscall being able to handle 64bit inodes.
* config.in: Regenerate.
* configure: Regenerate.
* incpath.c (HOST_STAT_FOR_64BIT_INODES): New define.
(remove_duplicates): Use it.

libcpp/ChangeLog:
2022-01-12 Clément Chigot <clement.chigot@atos.net>

* configure.ac: Check sizeof ino_t and dev_t.
* config.in: Regenerate.
* configure: Regenerate.
* include/cpplib.h (INO_T_CPP): Change for AIX.
(DEV_T_CPP): New macro.
(struct cpp_dir): Use it.

Add testcase for PR 83541.

Ranger now performs this optimization.

PR tree-optimization/83541
gcc/testsuite
* g++.dg/pr83541.C: New.

Always set EDGE_EXECUTABLE in VRP2.

PR tree-optimization/103551
* tree-vrp.c (execute_ranger_vrp): Always set EDGE_EXECUTABLE.

tree-optimization/103990 - fix CFG cleanup regression from PRE change

This adjusts the CFG cleanup flow back to what it was before the
last change which fixes the observed regression of 541.leela_r with
LTO and FDO.

2022-01-12 Richard Biener <rguenther@suse.de>

PR tree-optimization/103990
* tree-pass.h (tail_merge_optimize): Drop unused argument.
* tree-ssa-tail-merge.c (tail_merge_optimize): Likewise.
* tree-ssa-pre.c (pass_pre::execute): Retain TODO_cleanup_cfg
and adjust call to tail_merge_optimize.

analyzer: complain about tainted sizes with "access" attribute [PR103940]

GCC 10 gained the "access" function and type attribute, which
optionally can take a size-index param:
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html

-fanalyzer in trunk (for GCC 12) has gained a -Wanalyzer-tainted-size to
complain about attacker-controlled size values, but this was only being
used deep inside the region-model code when handling the hardcoded known
behavior of certain functions (memset, IIRC).

This patch extends -Wanalyzer-tainted-size to also complain about
unsanitized attacker-controlled values being passed to function
parameters marked as a size via the "access" attribute.

Note that -fanalyzer-checker=taint is currently required in
addition to -fanalyzer to use this warning, due to scaling issues
(see bug 103533).
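
As an illustration, here is a sketch of the kind of code this concerns. The names count_zero_bytes, process_record and struct header are invented for the example. Whether the analyzer actually reports a given call depends on the options above and on the callee being external, so no particular diagnostic is claimed here.

```c
#include <stddef.h>
#include <stdio.h>

/* Parameter 1 is only read; parameter 2 gives its size in bytes.  */
__attribute__ ((access (read_only, 1, 2)))
size_t count_zero_bytes (const unsigned char *buf, size_t size);

/* Hypothetical on-disk header whose length field is untrusted.  */
struct header { size_t len; };

/* Read a header from F and count zero bytes in PAYLOAD.
   Returns (size_t) -1 on a short read or an out-of-range length.  */
size_t
process_record (FILE *f, const unsigned char *payload, size_t capacity)
{
  struct header h;
  if (fread (&h, sizeof h, 1, f) != 1)
    return (size_t) -1;
  /* h.len comes from the file.  Bound it before using it as a size;
     passing it on unchecked is the pattern the warning targets.  */
  if (h.len > capacity)
    return (size_t) -1;
  return count_zero_bytes (payload, h.len);
}

/* Defined here only so the sketch links on its own.  */
size_t
count_zero_bytes (const unsigned char *buf, size_t size)
{
  size_t n = 0;
  for (size_t i = 0; i < size; i++)
    n += (buf[i] == 0);
  return n;
}
```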

gcc/analyzer/ChangeLog:
PR analyzer/103940
* engine.cc (impl_sm_context::impl_sm_context): Add
"unknown_side_effects" param and use it to initialize
new m_unknown_side_effects field.
(impl_sm_context::unknown_side_effects_p): New.
(impl_sm_context::m_unknown_side_effects): New.
(exploded_node::on_stmt): Pass unknown_side_effects to sm_ctxt
ctor.
* sm-taint.cc: Include "stringpool.h" and "attribs.h".
(tainted_size::tainted_size): Drop "dir" param.
(tainted_size::get_kind): Drop "FINAL".
(tainted_size::emit): Likewise.
(tainted_size::m_dir): Drop unused field.
(class tainted_access_attrib_size): New subclass.
(taint_state_machine::on_stmt): Call check_for_tainted_size_arg on
external functions with unknown side effects.
(taint_state_machine::check_for_tainted_size_arg): New.
(region_model::check_region_for_taint): Drop "dir" param from
tainted_size ctor.
* sm.h (sm_context::unknown_side_effects_p): New.

gcc/testsuite/ChangeLog:
PR analyzer/103940
* gcc.dg/analyzer/taint-size-access-attr-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

[nvptx] Add gcc.target/nvptx/atomic-exchange-*.c test-cases

Add a few test-cases that test expansion of __atomic_exchange.

Tested on nvptx.

gcc/testsuite/ChangeLog:

2022-01-12 Tom de Vries <tdevries@suse.de>

* gcc.target/nvptx/atomic-exchange-1.c: New test.
* gcc.target/nvptx/atomic-exchange-2.c: New test.
* gcc.target/nvptx/atomic-exchange-3.c: New test.
* gcc.target/nvptx/atomic-exchange-4.c: New test.

[nvptx] Improve gcc.target/nvptx/atomic_fetch-*.c test-cases

Fix a few issues in test-cases gcc.target/nvptx/atomic_fetch-*.c:
- atomic_fetch-1.c uses scan-assembler instead of scan-assembler-times,
  which is less accurate
- atomic_fetch-2.c only contains negative testing using
  scan-assembler-not
- the test-cases use stack variables to generate generic addresses,
  while stack atomics are not natively supported
- the test-cases only test (64-bit) x (generic), instead of
  (32-bit, 64-bit) x (generic, global, shared)
- the test-cases use a hardcoded '0' instead of the clearer
  MEMMODEL_RELAXED
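
A minimal sketch of the shape these points suggest: atomics on global (not stack) objects, in both 32-bit and 64-bit widths, with a named memory model instead of a literal 0. It uses GCC's predefined __ATOMIC_RELAXED. The tests' own MEMMODEL_RELAXED definition is not shown in this log, and the function and variable names below are invented.

```c
#include <assert.h>

/* Global objects, so no stack address is involved.  */
unsigned int example_counter32;
unsigned long long example_counter64;

/* Each returns the value held before the addition.  */
unsigned int
fetch_add32 (unsigned int v)
{
  return __atomic_fetch_add (&example_counter32, v, __ATOMIC_RELAXED);
}

unsigned long long
fetch_add64 (unsigned long long v)
{
  return __atomic_fetch_add (&example_counter64, v, __ATOMIC_RELAXED);
}

void
check_fetch_add (void)
{
  example_counter32 = 1;
  assert (fetch_add32 (2) == 1);
  assert (example_counter32 == 3);
}
```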

Tested on nvptx.

gcc/testsuite/ChangeLog:

2022-01-12  Tom de Vries  <tdevries@suse.de>

* gcc.target/nvptx/atomic_fetch-1.c: Rewrite.
* gcc.target/nvptx/atomic_fetch-2.c: Rewrite.

[vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

gcc/ChangeLog:

* tree-vect-loop.c (vect_analyze_loop): Handle scenario where target
does not add autovectorize_vector_modes.

libstdc++: Avoid overflow in bounds checks [PR103955]

We currently crash when the floating-point to_chars overloads are passed
a precision value near INT_MAX, ultimately due to overflow in the bounds
checks that verify the output range is large enough.

The simplest portable fix seems to be to replace bounds checks of the form
A >= B + C (where B + C may overflow) with the otherwise equivalent check
A >= B && A - B >= C, which is the approach this patch takes.
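
The rewritten check can be sketched in C as follows. This is an illustration only: the function name and the use of int are assumptions, and it relies on all three values being non-negative, as sizes and precisions are here.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Naive form, kept as a comment because evaluating it can overflow:
     return avail >= fixed + precision;
   With precision near INT_MAX the signed addition overflows.  */

/* Overflow-free form.  Assumes all three arguments are non-negative,
   so avail - fixed is only computed when it lies in [0, avail].  */
bool
output_fits (int avail, int fixed, int precision)
{
  return avail >= fixed && avail - fixed >= precision;
}

void
check_output_fits (void)
{
  assert (output_fits (100, 10, 90));
  assert (!output_fits (100, 10, 91));
  assert (!output_fits (100, 10, INT_MAX));
  assert (!output_fits (5, 10, 0));
}
```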

Before we can do this in __floating_to_chars_hex, we first need
to track the unbounded "excess" precision (i.e. the number of trailing
fractional digits in the output that are guaranteed to be '0') separately
from the bounded "effective" precision (i.e. the number of significant
fractional digits in the output), like we do in __f_t_c_precision.

PR libstdc++/103955

libstdc++-v3/ChangeLog:

* src/c++17/floating_to_chars.cc (__floating_to_chars_hex):
Track the excess precision separately from the effective
precision. Avoid overflow in bounds check by splitting it into
two checks.
(__floating_to_chars_precision): Avoid overflow in bounds checks
similarly.
* testsuite/20_util/to_chars/103955.cc: New test.

Fix -Wformat-diag for aarch64 target.

gcc/ChangeLog:

* config/aarch64/aarch64.c (aarch64_parse_boolean_options): Use
%qs where possible.
(aarch64_parse_sve_width_string): Likewise.
(aarch64_override_options_internal): Likewise.
(aarch64_print_hint_for_extensions): Likewise.
(aarch64_validate_sls_mitigation): Likewise.
(aarch64_handle_attr_arch): Likewise.
(aarch64_handle_attr_cpu): Likewise.
(aarch64_handle_attr_tune): Likewise.
(aarch64_handle_attr_isa_flags): Likewise.

Include elfos.h before ${tm_file}.

Fixes:

In file included from ./tm.h:23,
                  from gcc/genconfig.c:25:
gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined
   209 | #define READONLY_DATA_SECTION_ASM_OP    "\t.section\t.rodata"
       |
In file included from ./tm.h:21,
                  from gcc/genconfig.c:25:
gcc/config/epiphany/epiphany.h:671: note: this is the location of the previous definition
   671 | #define READONLY_DATA_SECTION_ASM_OP    "\t.section .rodata"

gcc/ChangeLog:

* config.gcc: Include elfos.h before ${tm_file}.

opts: do not do sanity check when an error is seen

PR target/103804

gcc/c-family/ChangeLog:

* c-attribs.c (handle_optimize_attribute): Do not call
cl_optimization_compare if we have seen an error.

Fortran: fix testcase comment

gcc/testsuite/ChangeLog:

* gfortran.dg/ieee/signaling_1.f90: Fix comment.

Fortran: fix testcase compiler flags

-fsignaling-nans is already passed by ieee.exp, so it's not needed.
We must use dg-additional-options instead of dg-options, otherwise we
override flags passed from ieee.exp. And we need to use -w because
some options only make sense for the Fortran source.

gcc/testsuite/ChangeLog:

* gfortran.dg/ieee/signaling_1.f90: Adjust flags.

c++: Silence -Wuseless-cast warnings during move [PR103480]

This is maybe just a shot in the dark, but IMHO we shouldn't be diagnosing
-Wuseless-cast on casts the compiler adds on its own when calling its move
function.  We don't seem to warn when user calls std::move either.
We call move on elinit (*NON_LVALUE_EXPR <(struct C[2] &&) &D.2497->b>)[0]
so it is already an xvalue_p and try to static_cast it to struct C &&.
But we don't warn e.g. on std::move (std::move (whatever)).

Fixed by not doing the static cast and just returning expr from move
if expr is already an xvalue.

2022-01-11  Jakub Jelinek  <jakub@redhat.com>
    Jason Merrill  <jason@redhat.com>

PR c++/103480
* tree.c (move): If expr is xvalue_p, just return expr without
build_static_cast.

* g++.dg/warn/Wuseless-cast2.C: New test.

libgfortran: Fix build on non-glibc targets

When the __GLIBC_PREREQ macro isn't defined, the
#if ... && defined __GLIBC_PREREQ && __GLIBC_PREREQ (2, 32)
directive has invalid syntax - the __GLIBC_PREREQ in there evaluates
to 0 and is followed by (2, 32).
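
A minimal sketch of the nested form, assuming nothing beyond the C preprocessor. The macro EXAMPLE_HAVE_GLIBC_2_32 and the function are invented for illustration and are not what libgfortran.h defines.

```c
#include <assert.h>  /* any libc header; glibc's pull in <features.h> */

/* Single-directive form that breaks without glibc, shown as a comment:
     #if defined __GLIBC_PREREQ && __GLIBC_PREREQ (2, 32)
   The whole expression is parsed before it is evaluated, so an
   undefined __GLIBC_PREREQ leaves the invalid "0 (2, 32)".  */

/* Nested form: the inner #if sits in a group that is skipped
   entirely when __GLIBC_PREREQ is not defined.  */
#if defined __GLIBC_PREREQ
# if __GLIBC_PREREQ (2, 32)
#  define EXAMPLE_HAVE_GLIBC_2_32 1
# endif
#endif
#ifndef EXAMPLE_HAVE_GLIBC_2_32
# define EXAMPLE_HAVE_GLIBC_2_32 0
#endif

static_assert (EXAMPLE_HAVE_GLIBC_2_32 == 0 || EXAMPLE_HAVE_GLIBC_2_32 == 1,
               "flag is always defined as 0 or 1");

int
example_have_glibc_2_32 (void)
{
  return EXAMPLE_HAVE_GLIBC_2_32;
}
```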

2022-01-12 Jakub Jelinek <jakub@redhat.com>

* libgfortran.h (POWER_IEEE128): Use __GLIBC_PREREQ in a separate
#if directive inside of #if ... && defined __GLIBC_PREREQ.

testsuite: Fix up c-c++-common/builtin-shufflevector-3.c testcase [PR101530]

This fixes:
FAIL: c-c++-common/builtin-shufflevector-3.c -Wc++-compat (test for excess errors)
Excess errors:
.../gcc/testsuite/c-c++-common/builtin-shufflevector-3.c:6:1: warning: SSE vector argument without SSE enabled changes the ABI [-Wpsabi]

2022-01-12 Jakub Jelinek <jakub@redhat.com>

PR middle-end/101530
* c-c++-common/builtin-shufflevector-3.c: Add -Wno-psabi to
dg-options.

tree-optimization/76174 - testcase for fixed PR

This adds a testcase for the fixed PR, VN now gets us the transform
via IV equality plus predication.

2022-01-12 Richard Biener <rguenther@suse.de>

PR tree-optimization/76174
* gcc.dg/tree-ssa/pr76174.c: New testcase.

cris: Avoid format-string-related warnings in calls to error functions

These tweaks are installed to avoid build-warnings for
config/cris/cris.c, like:

x/gcc/config/cris/cris.c: In function 'const char* cris_op_str(rtx)':
x/gcc/config/cris/cris.c:728:23: warning: unquoted identifier or keyword \
'cris_op_str' in format [-Wformat-diag]
  728 |       internal_error ("MULT case in cris_op_str");
      |                       ^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ./tm.h:20,
                 from x/gcc/backend.h:28,
                 from x/gcc/config/cris/cris.c:26:
x/gcc/config/cris/cris.c: In function 'void cris_expand_return(bool)':
x/gcc/config/cris/cris.h:42:33: warning: unquoted operator '->' in \
format [-Wformat-diag]
   42 |  do { if (!(x)) internal_error ("CRIS-port assertion failed: " #x); \
} while (0)
x/gcc/config/cris/cris.c:1862:3: note: in expansion of macro 'CRIS_ASSERT'
1862 |   CRIS_ASSERT (cfun->machine->return_type != CRIS_RETINSN_RET \
|| !on_stack);
      |   ^~~~~~~~~~~
x/gcc/config/cris/cris.c: In function 'void cris_option_override()':
x/gcc/config/cris/cris.c:2298:9: warning: space followed by punctuation \
character ':' [-Wformat-diag]
2298 |  error ("unknown CRIS version specification in %<-march=%> or "
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2299 |         "%<-mcpu=%> : %s", cris_cpu_str);
      |         ~~~~~~~~~~~~~~~~~
x/gcc/config/cris/cris.c:2334:9: warning: space followed by punctuation \
character ':' [-Wformat-diag]
2334 |  error ("unknown CRIS cpu version specification in %<-mtune=%> : %s",
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ./tm.h:20,
                 from x/gcc/backend.h:28,
                 from x/gcc/config/cris/cris.c:26:
x/gcc/config/cris/cris.c: In function 'rtx_def* cris_split_movdx(rtx_def**)':
x/gcc/config/cris/cris.h:42:33: warning: unquoted identifier or keyword \
'GET_CODE' in format [-Wformat-diag]
   42 |  do { if (!(x)) internal_error ("CRIS-port assertion failed: " #x); \
} while (0)
x/gcc/config/cris/cris.c:2457:3: note: in expansion of macro 'CRIS_ASSERT'
2457 |   CRIS_ASSERT (GET_CODE (dest) != SUBREG && GET_CODE (src) != SUBREG);
      |   ^~~~~~~~~~~

Not that I therefore agree that operators, identifiers and keywords
should have to be dressed up like this for internal error messages;
they were more readable without these garments, if only slightly so.

2022-01-11  Hans-Peter Nilsson  <hp@axis.com>

* config/cris/cris.c: Quote identifiers in parameters to error
and internal_error, and remove extraneous spaces with punctuation.
* config/cris/cris.h (CRIS_ASSERT): When passing on stringified
expression to internal_error, pass it as a parameter instead of
appending it to the format part.
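
The CRIS_ASSERT change can be illustrated with a small stand-alone sketch. snprintf stands in for internal_error, and the macro name and message text are invented, so this shows the technique, not the port's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Old shape, shown as a comment: the stringified expression becomes
   part of the format string itself.
     snprintf (buf, size, "assertion failed: " #x)
   New shape: the format is fixed and the expression text is passed
   as a "%s" argument, so its contents are never parsed as a format.  */
#define EXAMPLE_ASSERT_MSG(buf, size, x) \
  snprintf (buf, size, "assertion failed: %s", #x)

void
check_assert_msg (void)
{
  char buf[64];
  /* The expression is only stringified here, never evaluated.  */
  EXAMPLE_ASSERT_MSG (buf, sizeof buf, ptr->field != 0);
  assert (strcmp (buf, "assertion failed: ptr->field != 0") == 0);
}
```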

cris: Parenthesize parameter to as_a.

Noted by Richard Sandiford in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103974#c7 (thanks!)

Mea culpa: I wrongly thought the default development-level value
("yes,extra") would include everything interesting to normal target
hacking (i.e. as opposed to hacking stuff like GC).  I see
rtl-checking is marked as "expensive" and presumably therefore left
out.  Maybe it could be split into rtl-static (cheap; catching type
errors including this kind of foulups) and rtl-dynamic (the expensive
parts).  I suppose that's for whomever feels a strong enough itch.

A quick (error-prone) grep-and-eyeball in config/ shows this was the
only file missing the parenthesis.  This lets cris-elf configured with
--enable-checking=yes,extra,rtl survive make all-gcc.

2022-01-11  Hans-Peter Nilsson  <hp@axis.com>

* config/cris/cris.c (cris_postdbr_cmpelim): Parenthesize
parameter to as_a.

Daily bump.

Change the 3rd parameter of function .DEFERRED_INIT from IS_VLA to decl name.

Currently, the 3rd parameter of function .DEFERRED_INIT is IS_VLA, which is
not needed at all.

In this patch, we change the 3rd parameter from IS_VLA to the name of the var
decl for the following purposes:

1. Fix (or work around) PR103720:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103720

As confirmed in PR103720, with the current definition of .DEFERRED_INIT,

Dom transformed:
  c$a$0_6 = .DEFERRED_INIT (8, 2, 0);
  _1 = .DEFERRED_INIT (8, 2, 0);

into:
  c$a$0_6 = .DEFERRED_INIT (8, 2, 0);
  _1 = c$a$0_6;

which is incorrectly done due to Dom treating the two calls to const function
.DEFERRED_INIT as the same call since all actual parameters are the same.

The same issue has been exposed in PR102608 due to a different optimization, VN;
the fix for PR102608 is to specially handle calls to .DEFERRED_INIT in VN to
exclude them from CSE.

To fix PR103720, we could do the same as the fix to PR102608 to specially
handle call to .DEFERRED_INIT in Dom to exclude it from being optimized.

However, in addition to Dom and VN, there should be other optimizations that
have the same issue as PR103720 or PR102608 (when I built the Linux kernel with
-ftrivial-auto-var-init=zero -Werror, I noticed a bunch of bogus warnings).

Rather than identifying all the optimizations and specially handling calls to
.DEFERRED_INIT in each of them, changing the 3rd parameter of the
function .DEFERRED_INIT from IS_VLA to the name string of the var decl might
be a better workaround (or a fix). After this change, since the 3rd actual
parameter is the name string of the variable, different calls for different
variables will have different name strings as the 3rd actual. As a result, the
optimizations that previously treated the different calls to .DEFERRED_INIT as
the same will be prevented from doing so.

2. Prepare for enabling -Wuninitialized + -ftrivial-auto-var-init for address
taken variables.

As discussed in the following thread:

https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577431.html

With the current implementation of -ftrivial-auto-var-init and uninitialized
warning analysis, the uninitialized warning for an address taken auto variable
might be missed since the variable is completely eliminated by optimization and
replaced with a temporary variable in all the uses.

To improve this situation, changing the 3rd parameter of the function
.DEFERRED_INIT to the name string of the variable will provide the necessary
information to uninitialized warning analysis to make the missing warning
possible.

gcc/ChangeLog:

2022-01-11  qing zhao  <qing.zhao@oracle.com>

* gimplify.c (gimple_add_init_for_auto_var): Delete the 3rd argument.
Change the 3rd argument of function .DEFERRED_INIT to the name of the
decl.
(gimplify_decl_expr): Delete the 3rd argument when calling
gimple_add_init_for_auto_var.
* internal-fn.c (expand_DEFERRED_INIT): Update comments to reflect
the 3rd argument change of function .DEFERRED_INIT.
* tree-cfg.c (verify_gimple_call): Update comments and verification
to reflect the 3rd argument change of function .DEFERRED_INIT.
* tree-sra.c (generate_subtree_deferred_init): Delete the 3rd argument.
(sra_modify_deferred_init): Change the 3rd argument of function
.DEFERRED_INIT to the name of the decl.

gcc/testsuite/ChangeLog:

2022-01-11  qing zhao  <qing.zhao@oracle.com>

* c-c++-common/auto-init-1.c: Adjust testcase to reflect the 3rd
argument change of function .DEFERRED_INIT.
* c-c++-common/auto-init-10.c: Likewise.
* c-c++-common/auto-init-11.c: Likewise.
* c-c++-common/auto-init-12.c: Likewise.
* c-c++-common/auto-init-13.c: Likewise.
* c-c++-common/auto-init-14.c: Likewise.
* c-c++-common/auto-init-15.c: Likewise.
* c-c++-common/auto-init-16.c: Likewise.
* c-c++-common/auto-init-2.c: Likewise.
* c-c++-common/auto-init-3.c: Likewise.
* c-c++-common/auto-init-4.c: Likewise.
* c-c++-common/auto-init-5.c: Likewise.
* c-c++-common/auto-init-6.c: Likewise.
* c-c++-common/auto-init-7.c: Likewise.
* c-c++-common/auto-init-8.c: Likewise.
* c-c++-common/auto-init-9.c: Likewise.
* c-c++-common/auto-init-esra.c: Likewise.
* c-c++-common/auto-init-padding-1.c: Likewise.
* gcc.target/aarch64/auto-init-2.c: Likewise.

power-ieee128: Fix up byte-swapping for IBM extended real(kind=16)

Here is a patch to fix up the ppc64be vs. ppc64le byteswapping
of IBM extended real(kind=16) and complex(kind=16).
Similarly to the BT_COMPLEX case it halves size and doubles nelems
for the bswap_array calls. Of course for r16_ibm and r16_ieee conversions
one needs to make sure it is only done when the on file data is in that
format and not in IEEE quad.
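
The "halve size, double nelems" idea can be sketched as follows. IBM extended real(kind=16) is stored as a pair of doubles, so each 16-byte value is swapped as two 8-byte values. The helpers are hypothetical, only loosely modelled on the bswap_array calls mentioned above, and are not libgfortran code.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the bytes of each of NELEMS elements of SIZE bytes.
   DEST and SRC must not overlap.  */
void
swap_elements (unsigned char *dest, const unsigned char *src,
               size_t size, size_t nelems)
{
  for (size_t i = 0; i < nelems; i++)
    for (size_t j = 0; j < size; j++)
      dest[i * size + j] = src[i * size + (size - 1 - j)];
}

/* IBM extended: treat each 16-byte value as two 8-byte values.  */
void
swap_ibm_extended (unsigned char *dest, const unsigned char *src,
                   size_t nelems)
{
  swap_elements (dest, src, 16 / 2, nelems * 2);
}

void
check_swap_ibm_extended (void)
{
  unsigned char src[16], dest[16];
  for (int i = 0; i < 16; i++)
    src[i] = (unsigned char) i;
  swap_ibm_extended (dest, src, 1);
  /* Each half is reversed in place; the halves keep their order.  */
  assert (dest[0] == 7 && dest[7] == 0);
  assert (dest[8] == 15 && dest[15] == 8);
}
```

A full 16-byte reversal would instead put byte 15 first, which also exchanges the two doubles; that is the difference from the IEEE quad case.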

2022-01-11 Jakub Jelinek <jakub@redhat.com>

* io/transfer.c (unformatted_read, unformatted_write): When
byteswapping IBM extended real(kind=16), handle it as byteswapping
two real(kind=8) values.

Handle R16 conversion for POWER in the environment variables.

This patch handles the environment variables for the REAL(KIND=16)
variables like for the little/big-endian routines, so users who have
no access to the source or are unwilling to recompile
can use this.

Syntax is, for example

GFORTRAN_CONVERT_UNIT="r16_ieee:10;little_endian:10" ./a.out

libgfortran/ChangeLog:

* runtime/environ.c (R16_IEEE): New macro.
(R16_IBM): New macro.
(next_token): Handle IBM R16 conversion cases.
(push_token): Likewise.
(mark_single): Likewise.
(do_parse): Likewise, initialize endian.

Implement CONVERT specifier for OPEN.

This patch, based on Jakub's work, implements the CONVERT
specifier for the power-ieee128 branch. It allows specifying
the conversion as r16_ieee,big_endian and the other way around,
based on a table. Setting the conversion via environment
variable and via program option does not yet work.

gcc/ChangeLog:

* flag-types.h (enum gfc_convert): Add flags for
conversion.

gcc/fortran/ChangeLog:

* libgfortran.h (unit_convert): Add flags.

libgfortran/ChangeLog:

* Makefile.in: Regenerate.
* io/file_pos.c (unformatted_backspace): Mask off
R16 parts for convert.
* io/inquire.c (inquire_via_unit): Add cases for
R16 parts.
* io/open.c (st_open): Add cases for R16 conversion.
* io/transfer.c (unformatted_read): Adjust for R16 conversions.
(unformatted_write): Likewise.
(us_read): Mask off R16 bits.
(data_transfer_init): Likewise.
(write_us_marker): Likewise.

libgfortran: Make sure glibc < 2.32 built powerpc64le-linux libgfortran doesn't use __*ieee128 APIs

I've just tried to build libgfortran on an old glibc system
(gcc112.fsffrance.org) and unfortunately we still have work to do:

[jakub@gcc2-power8 obj38]$ LD_PRELOAD=/home/jakub/gcc/obj38/powerpc64le-unknown-linux-gnu/libgfortran/.libs/libgfortran.so.5.0.0 /bin/true
[jakub@gcc2-power8 obj38]$ LD_BIND_NOW=1 LD_PRELOAD=/home/jakub/gcc/obj38/powerpc64le-unknown-linux-gnu/libgfortran/.libs/libgfortran.so.5.0.0 /bin/true
/bin/true: symbol lookup error: /home/jakub/gcc/obj38/powerpc64le-unknown-linux-gnu/libgfortran/.libs/libgfortran.so.5.0.0: undefined symbol: __atan2ieee128

While we do use some libquadmath APIs:
readelf -Wr /home/jakub/gcc/obj38/powerpc64le-unknown-linux-gnu/libgfortran/.libs/libgfortran.so.5.0.0 | grep QUADMATH
0000000000251268  000005e400000026 R_PPC64_ADDR64         0000000000000000 quadmath_snprintf@QUADMATH_1.0 + 0
0000000000251270  0000030600000026 R_PPC64_ADDR64         0000000000000000 strtoflt128@QUADMATH_1.0 + 0
00000000002502e0  0000011600000015 R_PPC64_JMP_SLOT       0000000000000000 ynq@QUADMATH_1.0 + 0
0000000000250390  0000016000000015 R_PPC64_JMP_SLOT       0000000000000000 sqrtq@QUADMATH_1.0 + 0
0000000000250508  000001fa00000015 R_PPC64_JMP_SLOT       0000000000000000 fmaq@QUADMATH_1.0 + 0
0000000000250530  0000021200000015 R_PPC64_JMP_SLOT       0000000000000000 fabsq@QUADMATH_1.0 + 0
0000000000250760  0000030600000015 R_PPC64_JMP_SLOT       0000000000000000 strtoflt128@QUADMATH_1.0 + 0
0000000000250990  000003df00000015 R_PPC64_JMP_SLOT       0000000000000000 cosq@QUADMATH_1.0 + 0
00000000002509f0  0000040a00000015 R_PPC64_JMP_SLOT       0000000000000000 expq@QUADMATH_1.0 + 0
0000000000250a88  0000045100000015 R_PPC64_JMP_SLOT       0000000000000000 erfcq@QUADMATH_1.0 + 0
0000000000250a98  0000045e00000015 R_PPC64_JMP_SLOT       0000000000000000 jnq@QUADMATH_1.0 + 0
0000000000250ac8  0000047e00000015 R_PPC64_JMP_SLOT       0000000000000000 sinq@QUADMATH_1.0 + 0
0000000000250e38  000005db00000015 R_PPC64_JMP_SLOT       0000000000000000 fmodq@QUADMATH_1.0 + 0
0000000000250e48  000005e000000015 R_PPC64_JMP_SLOT       0000000000000000 tanq@QUADMATH_1.0 + 0
0000000000250e58  000005e400000015 R_PPC64_JMP_SLOT       0000000000000000 quadmath_snprintf@QUADMATH_1.0 + 0
0000000000250f20  0000062900000015 R_PPC64_JMP_SLOT       0000000000000000 copysignq@QUADMATH_1.0 + 0
we don't do it consistently:
readelf -Wr /home/jakub/gcc/obj38/powerpc64le-unknown-linux-gnu/libgfortran/.libs/libgfortran.so.5.0.0 | grep ieee128
0000000000250310  0000012800000015 R_PPC64_JMP_SLOT       0000000000000000 __atan2ieee128 + 0
0000000000250340  0000014200000015 R_PPC64_JMP_SLOT       0000000000000000 __clogieee128 + 0
0000000000250438  000001a300000015 R_PPC64_JMP_SLOT       0000000000000000 __acoshieee128 + 0
00000000002504b8  000001cc00000015 R_PPC64_JMP_SLOT       0000000000000000 __csinieee128 + 0
0000000000250500  000001f300000015 R_PPC64_JMP_SLOT       0000000000000000 __sinhieee128 + 0
0000000000250570  0000022a00000015 R_PPC64_JMP_SLOT       0000000000000000 __asinieee128 + 0
0000000000250580  0000022d00000015 R_PPC64_JMP_SLOT       0000000000000000 __roundieee128 + 0
00000000002505a0  0000023e00000015 R_PPC64_JMP_SLOT       0000000000000000 __logieee128 + 0
00000000002505c8  0000024900000015 R_PPC64_JMP_SLOT       0000000000000000 __tanieee128 + 0
0000000000250630  0000027500000015 R_PPC64_JMP_SLOT       0000000000000000 __ccosieee128 + 0
0000000000250670  0000028a00000015 R_PPC64_JMP_SLOT       0000000000000000 __log10ieee128 + 0
00000000002506c8  000002bd00000015 R_PPC64_JMP_SLOT       0000000000000000 __cexpieee128 + 0
00000000002506d8  000002c800000015 R_PPC64_JMP_SLOT       0000000000000000 __coshieee128 + 0
00000000002509b0  000003ef00000015 R_PPC64_JMP_SLOT       0000000000000000 __truncieee128 + 0
0000000000250af8  000004a600000015 R_PPC64_JMP_SLOT       0000000000000000 __expieee128 + 0
0000000000250b50  000004c600000015 R_PPC64_JMP_SLOT       0000000000000000 __fmodieee128 + 0
0000000000250bb0  000004e700000015 R_PPC64_JMP_SLOT       0000000000000000 __tanhieee128 + 0
0000000000250c38  0000051300000015 R_PPC64_JMP_SLOT       0000000000000000 __acosieee128 + 0
0000000000250ce0  0000055400000015 R_PPC64_JMP_SLOT       0000000000000000 __sinieee128 + 0
0000000000250d60  0000057e00000015 R_PPC64_JMP_SLOT       0000000000000000 __atanieee128 + 0
0000000000250dd8  000005b100000015 R_PPC64_JMP_SLOT       0000000000000000 __sqrtieee128 + 0
0000000000250e98  0000060200000015 R_PPC64_JMP_SLOT       0000000000000000 __cosieee128 + 0
0000000000250eb0  0000060a00000015 R_PPC64_JMP_SLOT       0000000000000000 __atanhieee128 + 0
0000000000250ef0  0000062000000015 R_PPC64_JMP_SLOT       0000000000000000 __asinhieee128 + 0
0000000000250fd8  0000067f00000015 R_PPC64_JMP_SLOT       0000000000000000 __csqrtieee128 + 0
0000000000251038  000006ad00000015 R_PPC64_JMP_SLOT       0000000000000000 __cabsieee128 + 0
All these should for POWER_IEEE128 use atan2q@QUADMATH_1.0 etc.

It seems all these come from f951 compiled sources.
For user code, I think the agreement was if you want to use successfully
-mabi=ieeelongdouble, you need glibc 2.32 or later, which is why the Fortran
FE doesn't conditionalize on whether glibc 2.32 is available or not and just
emits __WHATEVERieee128 entrypoints.
But for Fortran compiled sources in libgfortran, we need to use
__WHATEVERieee128 only if glibc 2.32 or later and WHATEVERq (from
libquadmath) otherwise.

The following patch implements that, adds -fbuilding-libgfortran option
similar to e.g. -fbuilding-libgcc used when building libgcc and if
that option is set and the TARGET_GLIBC_{MAJOR,MINOR} macros indicate
no glibc or glibc older than 2.32, it will use the libquadmath APIs
rather than glibc 2.32 APIs.
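
The selection rule can be sketched in C as below. This is a simplified model, not the code in trans-types.c: glibc_has_ieee128 and ieee128_symbol are hypothetical helpers, and a major version of 0 is used here to stand for "no glibc".

```c
#include <assert.h>
#include <stdio.h>

/* Return nonzero if glibc MAJOR.MINOR is 2.32 or later, i.e. provides
   the __*ieee128 entry points.  MAJOR == 0 stands for "no glibc"
   (an assumption of this sketch).  */
int
glibc_has_ieee128 (int major, int minor)
{
  return major > 2 || (major == 2 && minor >= 32);
}

/* Write into BUF the symbol that Fortran sources inside libgfortran
   would call for the math function BASE: the glibc __BASEieee128 name
   if available, the libquadmath BASEq name otherwise.  */
void
ieee128_symbol (char *buf, size_t size, const char *base,
                int major, int minor)
{
  assert (buf != NULL && base != NULL && size > 0);
  if (glibc_has_ieee128 (major, minor))
    snprintf (buf, size, "__%sieee128", base);
  else
    snprintf (buf, size, "%sq", base);
}
```

For base "atan2" this gives atan2q with glibc 2.31 and __atan2ieee128 with glibc 2.32.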

2022-01-07  Jakub Jelinek  <jakub@redhat.com>

gcc/fortran/
* trans-types.c (gfc_init_kinds): When setting abi_kind to 17, if not
targeting glibc 2.32 or later and -fbuilding-libgfortran, set
gfc_real16_is_float128 and c_float128 in gfc_real_kinds.
(gfc_build_real_type): Don't set c_long_double if c_float128 is
already set.
* trans-intrinsic.c (builtin_decl_for_precision): Don't use
long_double_built_in if gfc_real16_is_float128 and
long_double_type_node == gfc_float128_type_node.
* lang.opt (fbuilding-libgfortran): New undocumented option.
libgfortran/
* Makefile.am (AM_FCFLAGS): Add -fbuilding-libgfortran after
-fallow-leading-underscore.
* Makefile.in: Regenerated.

libgfortran: Avoid using libquadmath APIs on powerpc64le on glibc 2.32+

On a glibc 2.32+ build, we still use some libquadmath APIs
when we shouldn't:
readelf -Wr /home/jakub/gcc/obj/powerpc64le-unknown-linux-gnu/libgfortran/.libs/libgfortran.so.5 | grep QUADMATH
00000000002502c8  0000002600000015 R_PPC64_JMP_SLOT       0000000000000000 fmaq@QUADMATH_1.0 + 0
00000000002505f8  0000006700000015 R_PPC64_JMP_SLOT       0000000000000000 tanq@QUADMATH_1.0 + 0
0000000000250930  0000009b00000015 R_PPC64_JMP_SLOT       0000000000000000 fabsq@QUADMATH_1.0 + 0
0000000000250940  0000009d00000015 R_PPC64_JMP_SLOT       0000000000000000 sinq@QUADMATH_1.0 + 0
0000000000250c98  000000cf00000015 R_PPC64_JMP_SLOT       0000000000000000 copysignq@QUADMATH_1.0 + 0
0000000000251038  0000010700000015 R_PPC64_JMP_SLOT       0000000000000000 cosq@QUADMATH_1.0 + 0
0000000000251068  0000010a00000015 R_PPC64_JMP_SLOT       0000000000000000 fmodq@QUADMATH_1.0 + 0
These should use __fmaieee128, __tanieee128 etc. instead.

2022-01-07  Jakub Jelinek  <jakub@redhat.com>

* libgfortran.h (__copysignieee128, __fmaieee128, __fmodieee128):
Declare.
* intrinsics/trigd.c (COPYSIGN, FMOD, FABS, FMA, SIN, COS, TAN): If
POWER_IEEE128 is defined, define these for kind 17 include.
* intrinsics/trigd_lib.inc (COPYSIGN, FMOD, FABS, FMA, SIN, COS, TAN):
Don't define if COPYSIGN is already defined.

Allow other languages to change long double format.

With Fortran adding support for changing the long double format, this
patch removes the code that only allowed C/C++ to change the long double
format for GLIBC 2.32 and later without a warning.

gcc/
2022-01-05 Michael Meissner <meissner@the-meissners.org>

* config/rs6000/rs6000.c (rs6000_option_override_internal): Remove
checks for only C/C++ front ends before allowing the long double
format to change without a warning.

testsuite: Fix pr47614.f test

This test FAILs because
f951: Error: '-mabi=ieeelongdouble' requires full ISA 2.06 support
compiler exited with status 1
FAIL: gfortran.dg/pr47614.f -O0 (test for excess errors)
As powerpc64le* only supports -mcpu=power8 and newer, I think we shouldn't
be testing with that option.

2022-01-04 Jakub Jelinek <jakub@redhat.com>

* gfortran.dg/pr47614.f: Don't use -mcpu=power4 for
powerpc64le*-*-linux*.

fortran, libgfortran: Add remaining missing *_r17 symbols

Following patch adds remaining missing *_r17 entrypoints, so that
we have 91 *_r16 and 91 *_r17 entrypoints (and 24 *_c16 and 24 *_c17).

This fixes:
FAIL: gfortran.dg/dec_math.f90   -O0  execution test
FAIL: gfortran.dg/dec_math.f90   -O1  execution test
FAIL: gfortran.dg/dec_math.f90   -O2  execution test
FAIL: gfortran.dg/dec_math.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/dec_math.f90   -O3 -g  execution test
FAIL: gfortran.dg/dec_math.f90   -Os  execution test
FAIL: gfortran.dg/ieee/dec_math_1.f90   -O0  execution test
FAIL: gfortran.dg/ieee/dec_math_1.f90   -O1  execution test
FAIL: gfortran.dg/ieee/dec_math_1.f90   -O2  execution test
FAIL: gfortran.dg/ieee/dec_math_1.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/ieee/dec_math_1.f90   -O3 -g  execution test
FAIL: gfortran.dg/ieee/dec_math_1.f90   -Os  execution test

2022-01-04  Jakub Jelinek  <jakub@redhat.com>

gcc/fortran/
* trans-intrinsic.c (gfc_get_intrinsic_lib_fndecl): Use
gfc_type_abi_kind.
libgfortran/
* libgfortran.h (GFC_REAL_17_INFINITY, GFC_REAL_17_QUIET_NAN): Define.
(__erfcieee128): Declare.
* intrinsics/trigd.c (_gfortran_sind_r17, _gfortran_cosd_r17,
_gfortran_tand_r17): Define for HAVE_GFC_REAL_17.
* intrinsics/random.c (random_r17, arandom_r17, rnumber_17): Define.
* intrinsics/erfc_scaled.c (ERFC_SCALED): Define.
(erfc_scaled_r16): Use ERFC_SCALED macro.
(erfc_scaled_r17): Define.

fortran, libgfortran: Assorted -mabi=ieeelongdouble I/O fixes

Another patch, this fixes:
FAIL: gfortran.dg/intrinsic_spread_2.f90   -O0  execution test
FAIL: gfortran.dg/intrinsic_spread_2.f90   -O1  execution test
FAIL: gfortran.dg/intrinsic_spread_2.f90   -O2  execution test
FAIL: gfortran.dg/intrinsic_spread_2.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/intrinsic_spread_2.f90   -O3 -g  execution test
FAIL: gfortran.dg/intrinsic_spread_2.f90   -Os  execution test
FAIL: gfortran.dg/intrinsic_unpack_2.f90   -O0  execution test
FAIL: gfortran.dg/intrinsic_unpack_2.f90   -O1  execution test
FAIL: gfortran.dg/intrinsic_unpack_2.f90   -O2  execution test
FAIL: gfortran.dg/intrinsic_unpack_2.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/intrinsic_unpack_2.f90   -O3 -g  execution test
FAIL: gfortran.dg/intrinsic_unpack_2.f90   -Os  execution test
FAIL: gfortran.dg/large_real_kind_form_io_1.f90   -O0  execution test
FAIL: gfortran.dg/large_real_kind_form_io_1.f90   -O1  execution test
FAIL: gfortran.dg/large_real_kind_form_io_1.f90   -O2  execution test
FAIL: gfortran.dg/large_real_kind_form_io_1.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/large_real_kind_form_io_1.f90   -O3 -g  execution test
FAIL: gfortran.dg/large_real_kind_form_io_1.f90   -Os  execution test
FAIL: gfortran.dg/quad_2.f90   -O0  execution test
FAIL: gfortran.dg/quad_2.f90   -O1  execution test
FAIL: gfortran.dg/quad_2.f90   -O2  execution test
FAIL: gfortran.dg/quad_2.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/quad_2.f90   -O3 -g  execution test
FAIL: gfortran.dg/quad_2.f90   -Os  execution test

2022-01-04  Jakub Jelinek  <jakub@redhat.com>

gcc/fortran/
* trans-io.c (transfer_array_desc): Pass abi kind instead of kind
to libgfortran.
libgfortran/
* io/read.c (convert_real): Add missing break; for the
HAVE_GFC_REAL_17 case.

libgfortran: -mabi=ieeelongdouble I/O fix

The following patch fixes:
FAIL: gfortran.dg/fmt_en.f90   -O0  output pattern test
FAIL: gfortran.dg/fmt_en.f90   -O1  output pattern test
FAIL: gfortran.dg/fmt_en.f90   -O2  output pattern test
FAIL: gfortran.dg/fmt_en.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  output pattern test
FAIL: gfortran.dg/fmt_en.f90   -O3 -g  output pattern test
FAIL: gfortran.dg/fmt_en.f90   -Os  output pattern test
FAIL: gfortran.dg/fmt_en_rd.f90   -O0  output pattern test
FAIL: gfortran.dg/fmt_en_rd.f90   -O1  output pattern test
FAIL: gfortran.dg/fmt_en_rd.f90   -O2  output pattern test
FAIL: gfortran.dg/fmt_en_rd.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  output pattern test
FAIL: gfortran.dg/fmt_en_rd.f90   -O3 -g  output pattern test
FAIL: gfortran.dg/fmt_en_rd.f90   -Os  output pattern test
FAIL: gfortran.dg/fmt_en_rn.f90   -O0  output pattern test
FAIL: gfortran.dg/fmt_en_rn.f90   -O1  output pattern test
FAIL: gfortran.dg/fmt_en_rn.f90   -O2  output pattern test
FAIL: gfortran.dg/fmt_en_rn.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  output pattern test
FAIL: gfortran.dg/fmt_en_rn.f90   -O3 -g  output pattern test
FAIL: gfortran.dg/fmt_en_rn.f90   -Os  output pattern test
FAIL: gfortran.dg/fmt_en_ru.f90   -O0  output pattern test
FAIL: gfortran.dg/fmt_en_ru.f90   -O1  output pattern test
FAIL: gfortran.dg/fmt_en_ru.f90   -O2  output pattern test
FAIL: gfortran.dg/fmt_en_ru.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  output pattern test
FAIL: gfortran.dg/fmt_en_ru.f90   -O3 -g  output pattern test
FAIL: gfortran.dg/fmt_en_ru.f90   -Os  output pattern test
FAIL: gfortran.dg/fmt_en_rz.f90   -O0  output pattern test
FAIL: gfortran.dg/fmt_en_rz.f90   -O1  output pattern test
FAIL: gfortran.dg/fmt_en_rz.f90   -O2  output pattern test
FAIL: gfortran.dg/fmt_en_rz.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  output pattern test
FAIL: gfortran.dg/fmt_en_rz.f90   -O3 -g  output pattern test
FAIL: gfortran.dg/fmt_en_rz.f90   -Os  output pattern test
FAIL: gfortran.dg/fmt_g0_7.f08   -O0  execution test
FAIL: gfortran.dg/fmt_g0_7.f08   -O1  execution test
FAIL: gfortran.dg/fmt_g0_7.f08   -O2  execution test
FAIL: gfortran.dg/fmt_g0_7.f08   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/fmt_g0_7.f08   -O3 -g  execution test
FAIL: gfortran.dg/fmt_g0_7.f08   -Os  execution test
FAIL: gfortran.dg/fmt_pf.f90   -O0  output pattern test
FAIL: gfortran.dg/fmt_pf.f90   -O1  output pattern test
FAIL: gfortran.dg/fmt_pf.f90   -O2  output pattern test
FAIL: gfortran.dg/fmt_pf.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  output pattern test
FAIL: gfortran.dg/fmt_pf.f90   -O3 -g  output pattern test
FAIL: gfortran.dg/fmt_pf.f90   -Os  output pattern test
FAIL: gfortran.dg/large_real_kind_1.f90   -O0  execution test
FAIL: gfortran.dg/large_real_kind_1.f90   -O1  execution test
FAIL: gfortran.dg/large_real_kind_1.f90   -O2  execution test
FAIL: gfortran.dg/large_real_kind_1.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/large_real_kind_1.f90   -O3 -g  execution test
FAIL: gfortran.dg/large_real_kind_1.f90   -Os  execution test

2022-01-04  Jakub Jelinek  <jakub@redhat.com>

* io/write_float.def (CALCULATE_EXP): If HAVE_GFC_REAL_17, also use
CALCULATE_EXP(17).
(determine_en_precision): Use 17 instead of 16 as first EN_PREC
argument for kind 17.
(get_float_string): Use 17 instead of 16 as first FORMAT_FLOAT
argument for kind 17.

fortran, libgfortran: -mabi=ieeelongdouble I/O

The following patch adds the compiler and library side of -mabi=ieeelongdouble
I/O support.

2022-01-04 Jakub Jelinek <jakub@redhat.com>

gcc/fortran/
* trans-io.c (transfer_namelist_element): Use gfc_type_abi_kind,
formatting fixes.
(transfer_expr): Use gfc_type_abi_kind, use *REAL128* APIs even
for abi_kind == 17.
libgfortran/
* libgfortran.h (__acoshieee128, __acosieee128, __asinhieee128,
__asinieee128, __atan2ieee128, __atanhieee128, __atanieee128,
__coshieee128, __cosieee128, __erfieee128, __expieee128,
__fabsieee128, __jnieee128, __log10ieee128, __logieee128,
__powieee128, __sinhieee128, __sinieee128, __sqrtieee128,
__tanhieee128, __tanieee128, __ynieee128): Formatting fixes.
(__strtoieee128, __snprintfieee128): Declare.
* io/io.h (default_width_for_float, default_precision_for_float):
Handle kind == 17.
* io/size_from_kind.c (size_from_real_kind, size_from_complex_kind):
Likewise.
* io/read.c (set_integer, si_max, convert_real, convert_infnan,
read_f): Likewise.
* io/write.c (extract_uint, size_from_kind, set_fnode_default):
Likewise.
* io/write_float.def (DTOA2Q, FDTOA2Q): Define for HAVE_GFC_REAL_17.
(determine_en_precision, get_float_string): Handle kind == 17.
* io/transfer128.c: Use also for HAVE_GFC_REAL_17, but don't drag in
libquadmath if POWER_IEEE128.
* Makefile.am (comma, PREPROCESS): New variables.
(gfortran.ver): New goal.
(version_arg, version_dep): Use gfortran.ver instead of
$(srcdir)/gfortran.map.
(gfortran.map-sun): Depend on and use gfortran.ver instead of
$(srcdir)/gfortran.map.
(BUILT_SOURCES): Add $(version_dep).
* Makefile.in: Regenerated.
* gfortran.map (GFORTRAN_8): Don't export
_gfortran_transfer_complex128, _gfortran_transfer_complex128_write,
_gfortran_transfer_real128 and _gfortran_transfer_real128_write if
HAVE_GFC_REAL_17 is defined.
(GFORTRAN_12): Export those here instead.

libquadmath: Use -mno-gnu-attribute in libquadmath

Testing found that we also need libquadmath to be built with
-mno-gnu-attribute, otherwise -mabi=ieeelongdouble programs don't link.

2022-01-03 Jakub Jelinek <jakub@redhat.com>

* configure.ac: Set XCFLAGS to -mno-gnu-attribute on
powerpc64le*-linux*.
* configure: Regenerated.

Make sure the Fortran specifics have real(kind=16).

This makes the library compile with all specific functions.
It also corrects the patsubst patterns so the right files
get the flags.

It was necessary to manually add -D__powerpc64__ because apparently
this is not set for Fortran.

libgfortran/ChangeLog:

* Makefile.am: Correct files for compilation flags. Add
-D__powerpc64__ for Fortran sources. Get kinds.inc from
grep of kinds.h and kinds-override.h.
* Makefile.in: Regenerate.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Add -mno-gnu-attribute to compile flags.
* generated/_abs_c17.F90: Regenerate.
* generated/_abs_r17.F90: Regenerate.
* generated/_acos_r17.F90: Regenerate.
* generated/_acosh_r17.F90: Regenerate.
* generated/_aimag_c17.F90: Regenerate.
* generated/_aint_r17.F90: Regenerate.
* generated/_anint_r17.F90: Regenerate.
* generated/_asin_r17.F90: Regenerate.
* generated/_asinh_r17.F90: Regenerate.
* generated/_atan2_r17.F90: Regenerate.
* generated/_atan_r17.F90: Regenerate.
* generated/_atanh_r17.F90: Regenerate.
* generated/_conjg_c17.F90: Regenerate.
* generated/_cos_c17.F90: Regenerate.
* generated/_cos_r17.F90: Regenerate.
* generated/_cosh_r17.F90: Regenerate.
* generated/_dim_r17.F90: Regenerate.
* generated/_exp_c17.F90: Regenerate.
* generated/_exp_r17.F90: Regenerate.
* generated/_log10_r17.F90: Regenerate.
* generated/_log_c17.F90: Regenerate.
* generated/_log_r17.F90: Regenerate.
* generated/_mod_r17.F90: Regenerate.
* generated/_sign_r17.F90: Regenerate.
* generated/_sin_c17.F90: Regenerate.
* generated/_sin_r17.F90: Regenerate.
* generated/_sinh_r17.F90: Regenerate.
* generated/_sqrt_c17.F90: Regenerate.
* generated/_sqrt_r17.F90: Regenerate.
* generated/_tan_r17.F90: Regenerate.
* generated/_tanh_r17.F90: Regenerate.
* kinds-override.h: Adjust to trunk.
Change condition to single line so it can be grepped.
* m4/specific.m4: Make sure that real(kind=16) is used
for _r17.F90 and _c17.F90 files.
* m4/specific2.m4: Likewise.

gfortran: Introduce gfc_type_abi_kind

The following patch detects the powerpc64le-linux kind == 16 cases
and for the -mabi=ieeelongdouble case (no matter whether it is the
configured-in default or just an option used on the command line) uses
_r17 or _c17 instead of _r16 or _c16 in the library API names.
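
The naming rule can be modelled in C roughly as follows. This is a simplification for illustration only: abi_kind and library_name are hypothetical helpers, not the gfc_type_abi_kind implementation, and the ieee_long_double flag stands for "kind 16 is IEEE quad on powerpc64le-linux".

```c
#include <assert.h>
#include <stdio.h>

/* Return the kind number used in library names.  Only kind 16 with
   IEEE long double is remapped; everything else is unchanged.  */
int
abi_kind (int kind, int ieee_long_double)
{
  return (kind == 16 && ieee_long_double) ? 17 : kind;
}

/* Build a library entry point name such as "_gfortran_sind_r17".
   TYPE_LETTER is 'r' for real or 'c' for complex.  */
void
library_name (char *buf, size_t size, const char *base, char type_letter,
              int kind, int ieee_long_double)
{
  assert (buf != NULL && base != NULL && size > 0);
  assert (type_letter == 'r' || type_letter == 'c');
  snprintf (buf, size, "_gfortran_%s_%c%d", base, type_letter,
            abi_kind (kind, ieee_long_double));
}
```

The Fortran kind stays 16 in both cases; only the suffix in the library API name changes.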

From what I can see, e.g. calls to sin on real(kind = 16) work fine
with or without this patch (we call __builtin_sinl and the backend
uses rs6000_mangle_decl_assembler_name which ensures __sinieee128
is called).

What is clearly still broken is IO, where for
  real(kind=16) a
  a = 1.0
  print *, a
end
we call
  _gfortran_transfer_real_write (&dt_parm.0, &a, 16);
for both -mabi=ibmlongdouble and -mabi=ieeelongdouble.
I don't remember what was the agreement, do we want
  _gfortran_transfer_real_write (&dt_parm.0, &a, 17);
for the ieeelongdouble case, or some new entrypoint for
the abi_kind == 17 real/complex IO?
Also, what about kind stored in array descriptors?  Shall we use
there the abi_kind or kind?

I guess at least before the IO case is solved there is no point
in checking the testsuite, too many things will be majorly broken...

2021-12-31  Jakub Jelinek  <jakub@redhat.com>

* gfortran.h (gfc_real_info): Add abi_kind member.
(gfc_type_abi_kind): Declare.
* trans-types.c (gfc_init_kinds): Initialize abi_kind.
* intrinsic.c (gfc_type_abi_kind): New function.
(conv_name): Use it.
* iresolve.c (resolve_transformational, gfc_resolve_abs,
gfc_resolve_char_achar, gfc_resolve_acos, gfc_resolve_acosh,
gfc_resolve_aimag, gfc_resolve_and, gfc_resolve_aint, gfc_resolve_all,
gfc_resolve_anint, gfc_resolve_any, gfc_resolve_asin,
gfc_resolve_asinh, gfc_resolve_atan, gfc_resolve_atanh,
gfc_resolve_atan2, gfc_resolve_bessel_n2, gfc_resolve_ceiling,
gfc_resolve_cmplx, gfc_resolve_complex, gfc_resolve_cos,
gfc_resolve_cosh, gfc_resolve_count, gfc_resolve_dble,
gfc_resolve_dim, gfc_resolve_dot_product, gfc_resolve_dprod,
gfc_resolve_exp, gfc_resolve_floor, gfc_resolve_hypot,
gfc_resolve_int, gfc_resolve_int2, gfc_resolve_int8, gfc_resolve_long,
gfc_resolve_log, gfc_resolve_log10, gfc_resolve_logical,
gfc_resolve_matmul, gfc_resolve_minmax, gfc_resolve_maxloc,
gfc_resolve_findloc, gfc_resolve_maxval, gfc_resolve_merge,
gfc_resolve_minloc, gfc_resolve_minval, gfc_resolve_mod,
gfc_resolve_modulo, gfc_resolve_nearest, gfc_resolve_or,
gfc_resolve_real, gfc_resolve_realpart, gfc_resolve_reshape,
gfc_resolve_sign, gfc_resolve_sin, gfc_resolve_sinh, gfc_resolve_sqrt,
gfc_resolve_tan, gfc_resolve_tanh, gfc_resolve_transpose,
gfc_resolve_trigd, gfc_resolve_xor, gfc_resolve_random_number):
Likewise.
* trans-decl.c (gfc_build_intrinsic_function_decls): Likewise.

libgfortran: Small progress on the library side

The following patch quiets
../../../libgfortran/generated/in_pack_r17.c:35:1: warning: no previous prototype for ‘internal_pack_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/in_pack_c17.c:35:1: warning: no previous prototype for ‘internal_pack_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/in_unpack_r17.c:33:1: warning: no previous prototype for ‘internal_unpack_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/in_unpack_c17.c:33:1: warning: no previous prototype for ‘internal_unpack_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/pack_r17.c:73:1: warning: no previous prototype for ‘pack_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/pack_c17.c:73:1: warning: no previous prototype for ‘pack_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/unpack_r17.c:34:1: warning: no previous prototype for ‘unpack0_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/unpack_r17.c:178:1: warning: no previous prototype for ‘unpack1_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/unpack_c17.c:34:1: warning: no previous prototype for ‘unpack0_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/unpack_c17.c:178:1: warning: no previous prototype for ‘unpack1_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/spread_r17.c:34:1: warning: no previous prototype for ‘spread_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/spread_r17.c:230:1: warning: no previous prototype for ‘spread_scalar_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/spread_c17.c:34:1: warning: no previous prototype for ‘spread_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/spread_c17.c:230:1: warning: no previous prototype for ‘spread_scalar_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift0_r17.c:33:1: warning: no previous prototype for ‘cshift0_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift0_c17.c:33:1: warning: no previous prototype for ‘cshift0_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift1_4_r17.c:32:1: warning: no previous prototype for ‘cshift1_4_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift1_4_c17.c:32:1: warning: no previous prototype for ‘cshift1_4_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift1_8_r17.c:32:1: warning: no previous prototype for ‘cshift1_8_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift1_8_c17.c:32:1: warning: no previous prototype for ‘cshift1_8_c17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift1_16_r17.c:32:1: warning: no previous prototype for ‘cshift1_16_r17’ [-Wmissing-prototypes]
../../../libgfortran/generated/cshift1_16_c17.c:32:1: warning: no previous prototype for ‘cshift1_16_c17’ [-Wmissing-prototypes]
warnings during libgfortran build and exports the new entrypoints.
Note, not all of them, clearly e.g. there are fewer *_r17* entrypoints than
*_r16* entrypoints, so more work is needed.

2021-12-31 Jakub Jelinek <jakub@redhat.com>

* libgfortran.h (internal_pack_r17, internal_pack_c17,
internal_unpack_r17, internal_unpack_c17, pack_r17, pack_c17,
unpack0_r17, unpack0_c17, unpack1_r17, unpack1_c17, spread_r17,
spread_c17, spread_scalar_r17, spread_scalar_c17, cshift0_r17,
cshift0_c17, cshift1_4_r17, cshift1_8_r17, cshift1_16_r17,
cshift1_4_c17, cshift1_8_c17, cshift1_16_c17): Declare.
* gfortran.map (GFORTRAN_12): Export *_r17 and *_c17.

Generate config.h macros for IEEE128 math functions.

libgfortran/ChangeLog:

* acinclude.m4 (LIBGFOR_CHECK_MATH_IEEE128): New macro.
* configure.ac: Use it.
* config.h.in: Regenerate.
* configure: Regenerate.

Fix pattern substitution for _r17 and _c17.

libgfortran/ChangeLog:

* Makefile.am: Fix pattern substitution for _r17 and _c17.
* Makefile.in: Regenerate.

Prepare library for REAL(KIND=17).

This prepares the library side for REAL(KIND=17).  It is
not yet tested, but at least compiles cleanly on POWER 9
and x86_64.

2021-10-19  Thomas Koenig  <tkoenig@gcc.gnu.org>

* Makefile.am: Add _r17 and _c17 files.  Build them
with -mabi=ieeelongdouble on POWER.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: New flag HAVE_REAL_17.
* kinds-override.h (HAVE_GFC_REAL_17): New macro.
(HAVE_GFC_COMPLEX_17): New macro.
(GFC_REAL_17_HUGE): New macro.
(GFC_REAL_17_LITERAL_SUFFIX): New macro.
(GFC_REAL_17_LITERAL): New macro.
(GFC_REAL_17_DIGITS): New macro.
(GFC_REAL_17_RADIX): New macro.
* libgfortran.h (POWER_IEEE128): New macro.
(gfc_array_r17): Typedef.
(GFC_DTYPE_REAL_17): New macro.
(GFC_DTYPE_COMPLEX_17): New macro.
(__acoshieee128): Prototype.
(__acosieee128): Prototype.
(__asinhieee128): Prototype.
(__asinieee128): Prototype.
(__atan2ieee128): Prototype.
(__atanhieee128): Prototype.
(__atanieee128): Prototype.
(__coshieee128): Prototype.
(__cosieee128): Prototype.
(__erfieee128): Prototype.
(__expieee128): Prototype.
(__fabsieee128): Prototype.
(__jnieee128): Prototype.
(__log10ieee128): Prototype.
(__logieee128): Prototype.
(__powieee128): Prototype.
(__sinhieee128): Prototype.
(__sinieee128): Prototype.
(__sqrtieee128): Prototype.
(__tanhieee128): Prototype.
(__tanieee128): Prototype.
(__ynieee128): Prototype.
* m4/mtype.m4: Make a bit more readable. Add KIND=17.
* generated/_abs_c17.F90: New file.
* generated/_abs_r17.F90: New file.
* generated/_acos_r17.F90: New file.
* generated/_acosh_r17.F90: New file.
* generated/_aimag_c17.F90: New file.
* generated/_aint_r17.F90: New file.
* generated/_anint_r17.F90: New file.
* generated/_asin_r17.F90: New file.
* generated/_asinh_r17.F90: New file.
* generated/_atan2_r17.F90: New file.
* generated/_atan_r17.F90: New file.
* generated/_atanh_r17.F90: New file.
* generated/_conjg_c17.F90: New file.
* generated/_cos_c17.F90: New file.
* generated/_cos_r17.F90: New file.
* generated/_cosh_r17.F90: New file.
* generated/_dim_r17.F90: New file.
* generated/_exp_c17.F90: New file.
* generated/_exp_r17.F90: New file.
* generated/_log10_r17.F90: New file.
* generated/_log_c17.F90: New file.
* generated/_log_r17.F90: New file.
* generated/_mod_r17.F90: New file.
* generated/_sign_r17.F90: New file.
* generated/_sin_c17.F90: New file.
* generated/_sin_r17.F90: New file.
* generated/_sinh_r17.F90: New file.
* generated/_sqrt_c17.F90: New file.
* generated/_sqrt_r17.F90: New file.
* generated/_tan_r17.F90: New file.
* generated/_tanh_r17.F90: New file.
* generated/bessel_r17.c: New file.
* generated/cshift0_c17.c: New file.
* generated/cshift0_r17.c: New file.
* generated/cshift1_16_c17.c: New file.
* generated/cshift1_16_r17.c: New file.
* generated/cshift1_4_c17.c: New file.
* generated/cshift1_4_r17.c: New file.
* generated/cshift1_8_c17.c: New file.
* generated/cshift1_8_r17.c: New file.
* generated/findloc0_c17.c: New file.
* generated/findloc0_r17.c: New file.
* generated/findloc1_c17.c: New file.
* generated/findloc1_r17.c: New file.
* generated/in_pack_c17.c: New file.
* generated/in_pack_r17.c: New file.
* generated/in_unpack_c17.c: New file.
* generated/in_unpack_r17.c: New file.
* generated/matmul_c17.c: New file.
* generated/matmul_r17.c: New file.
* generated/matmulavx128_c17.c: New file.
* generated/matmulavx128_r17.c: New file.
* generated/maxloc0_16_r17.c: New file.
* generated/maxloc0_4_r17.c: New file.
* generated/maxloc0_8_r17.c: New file.
* generated/maxloc1_16_r17.c: New file.
* generated/maxloc1_4_r17.c: New file.
* generated/maxloc1_8_r17.c: New file.
* generated/maxval_r17.c: New file.
* generated/minloc0_16_r17.c: New file.
* generated/minloc0_4_r17.c: New file.
* generated/minloc0_8_r17.c: New file.
* generated/minloc1_16_r17.c: New file.
* generated/minloc1_4_r17.c: New file.
* generated/minloc1_8_r17.c: New file.
* generated/minval_r17.c: New file.
* generated/norm2_r17.c: New file.
* generated/pack_c17.c: New file.
* generated/pack_r17.c: New file.
* generated/pow_c17_i16.c: New file.
* generated/pow_c17_i4.c: New file.
* generated/pow_c17_i8.c: New file.
* generated/pow_r17_i16.c: New file.
* generated/pow_r17_i4.c: New file.
* generated/pow_r17_i8.c: New file.
* generated/product_c17.c: New file.
* generated/product_r17.c: New file.
* generated/reshape_c17.c: New file.
* generated/reshape_r17.c: New file.
* generated/spread_c17.c: New file.
* generated/spread_r17.c: New file.
* generated/sum_c17.c: New file.
* generated/sum_r17.c: New file.
* generated/unpack_c17.c: New file.
* generated/unpack_r17.c: New file.

ira: Fix old-reload targets [PR103974]

The new IRA heuristics would need more work on old-reload targets,
since flattening needs to be able to undo the cost propagation.
It's doable, but hardly seems worth it.

This patch therefore makes all the new calls to
ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p.
The color_pass code that predated the new function (and that was
the source of ira_subloop_allocnos_can_differ_p) continues to
behave as before.

It's a hack, but at least it has the advantage that the new parameter
would become obviously unused if reload and (!)ira_use_lra_p were
removed. The hack should therefore disappear alongside reload.

gcc/
PR rtl-optimization/103974
* ira-int.h (ira_subloop_allocnos_can_differ_p): Take an
extra argument, default true, that says whether old-reload
targets should be excluded.
* ira-color.c (color_pass): Pass false.

libstdc++: Install <source_location> header for freestanding [PR103726]

This C++20 header is also supposed to be present for freestanding.

libstdc++-v3/ChangeLog:

PR libstdc++/103726
* include/Makefile.am: Install <source_location> for
freestanding.
* include/Makefile.in: Regenerate.
* include/std/version (__cpp_lib_source_location): Define for
freestanding.

i386: Introduce V2QImode vector cmove for -msse4.1 [PR103861]

This patch also moves V2HI and V4QImode vector conditional moves
to SSE4.1 targets.  Vector cmoves are implemented with SSE logic functions
without -msse4.1, and they are hardly worthwhile for narrow vector modes.
More importantly, we would like to keep vector logic functions for GPR
registers, and the current RTX description of 32-bit vector modes logic
insns does not include the necessary CC reg clobber.  Solve these issues by
restricting vector cmove insns for these modes to -msse4.1, where logic
instructions are avoided, and pblend insn is used instead.
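
As a rough illustration of what a narrow vector conditional move computes,
here is a minimal sketch using the GNU C vector extension.  The type name
v2qi and the function select_lt are hypothetical and not taken from the
patch; which instructions the compiler actually emits (logic operations or
a blend) depends on the target flags and compiler version.

```c
/* Illustrative only: a vector of two 8-bit lanes (GNU vector extension). */
typedef signed char v2qi __attribute__((vector_size(2)));

/* Lane-wise "a < b ? x : y" spelled out with logic operations.
   The comparison yields -1 (all bits set) or 0 in each lane, so the
   mask picks x where the condition holds and y elsewhere. */
v2qi select_lt(v2qi a, v2qi b, v2qi x, v2qi y)
{
    v2qi mask = (v2qi) (a < b);
    return (mask & x) | (~mask & y);
}
```

With inputs {1, 5} and {2, 3}, only the first lane satisfies a < b, so the
result takes its first lane from x and its second lane from y.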

A follow-up patch will add clobbers and necessary splits to 32-bit
vector mode logic insns, and in a future patch, ix86_sse_movcc will be
improved to use expand_simple_{unop,binop} to emit logic insns, allowing
us to re-enable 16-bit and 32-bit narrow vector cmoves for -msse2.

2022-01-11  Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/103861
* config/i386/mmx.md (vcond<mode><mode>):
Use VI_16_32 mode iterator.  Enable for TARGET_SSE4_1.
(vcondu<mode><mode>): Ditto.
(vcond_mask_<mode><mode>): Ditto.
(mmx_pblendvb_v8qi): Rename from mmx_pblendvb64.
(mmx_pblendvb_<mode>): Rename from mmx_pblendvb32.
Use VI_16_32 mode iterator.
* config/i386/i386-expand.c (ix86_expand_sse_movcc):
Update for rename.  Handle V2QImode.
(expand_vec_perm_blend): Update for rename.

gcc/testsuite/ChangeLog:

PR target/103861
* g++.target/i386/pr100637-1b.C (dg-options):
Use -msse4 instead of -msse2.
* g++.target/i386/pr100637-1w.C (dg-options): Ditto.
* g++.target/i386/pr103861-1.C: New test.
* gcc.target/i386/pr100637-4b.c (dg-options):
Use -msse4 instead of -msse2.
* gcc.target/i386/pr103861-4.c: New test.

c++: Fix ICEs with OBJ_TYPE_REF pretty printing [PR101597]

The following testcase ICEs, because middle-end uses the C++ FE pretty
printing code through langhooks in the diagnostics.
The FE expects OBJ_TYPE_REF_OBJECT's type to be useful (pointer to the
class type it is called on), but in the middle-end conversions between
pointer types are useless, so the actual type can be some random
unrelated pointer type (in the testcase void * pointer).  The pretty
printing code then ICEs on it.

The following patch fixes that by also using the original
OBJ_TYPE_REF_OBJECT's type as the type of the OBJ_TYPE_REF_TOKEN operand.
That one must be an INTEGER_CST, all the current uses of
OBJ_TYPE_REF_TOKEN just use tree_to_uhwi or tree_to_shwi on it,
and because it is constant, there is no risk of the middle-end propagating
some other pointer type into it.  So the approach is similar to how MEM_REF
treats its second operand, or how a couple of internal functions (e.g.
IFN_VA_ARG) treat some of their parameters.

2022-01-11  Jakub Jelinek  <jakub@redhat.com>

PR c++/101597
gcc/
* tree.def (OBJ_TYPE_REF): Document type of OBJ_TYPE_REF_TOKEN.
gcc/cp/
* class.c (build_vfn_ref): Build OBJ_TYPE_REF with INTEGER_CST
OBJ_TYPE_REF_TOKEN with type equal to OBJ_TYPE_REF_OBJECT type.
* error.c (resolve_virtual_fun_from_obj_type_ref): Use type of
OBJ_TYPE_REF_TOKEN rather than type of OBJ_TYPE_REF_OBJECT as
obj_type.
gcc/objc/
* objc-act.c (objc_rewrite_function_call): Build OBJ_TYPE_REF
with INTEGER_CST OBJ_TYPE_REF_TOKEN with type equal to
OBJ_TYPE_REF_OBJECT type.
* objc-next-runtime-abi-01.c (build_objc_method_call): Likewise.
* objc-gnu-runtime-abi-01.c (build_objc_method_call): Likewise.
* objc-next-runtime-abi-02.c (build_v2_objc_method_fixup_call,
build_v2_build_objc_method_call): Likewise.
gcc/testsuite/
* g++.dg/opt/pr101597.C: New test.

c-family: Fix up -W*conversion on bitwise &/|/^ [PR101537]

The following testcases emit a bogus -Wconversion warning.  This is because
the conversion_warning function doesn't handle BIT_*_EXPR (only
unsafe_conversion_p, which is called in the default: case, does), and that
one doesn't handle the SAVE_EXPRs that are added when the unsigned char
& or | operands promoted to int have side-effects and |= or &= is used.

The patch handles BIT_IOR_EXPR/BIT_XOR_EXPR like the last 2 operands of
COND_EXPR by recursing on the two operands; if either of them doesn't fit
into the narrower type, it complains.  BIT_AND_EXPR too, but first it needs
to handle some special cases that unsafe_conversion_p does, namely when one
of the two operands is a constant.
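
A minimal sketch of the kind of code involved, assuming hypothetical helper
names (next_bit, set_bits, set_bits_wide) that do not come from the actual
testcases.  Whether a warning is printed depends on the compiler version
and on -Wconversion being enabled.

```c
/* Hypothetical helper with a side effect, so the promoted operand of |
   cannot simply be evaluated twice by the front end. */
static unsigned int calls;

unsigned char next_bit(void)
{
    return (unsigned char) (1u << (calls++ & 7));
}

/* Both operands of | are unsigned char values promoted to int, so the
   result always fits back into unsigned char; a -Wconversion warning
   here would be bogus. */
unsigned char set_bits(unsigned char flags)
{
    flags |= next_bit();
    return flags;
}

/* A genuinely narrowing case for contrast: `wide` may not fit into
   unsigned char, so a -Wconversion warning is reasonable here. */
unsigned char set_bits_wide(unsigned char flags, int wide)
{
    flags |= wide;
    return flags;
}
```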

This completely fixes the pr101537.c test and, for C, also pr103881.c,
and doesn't regress anything in the testsuite; for C++, pr103881.c still
emits the bogus warnings.
This is because while the C FE emits in that case a SAVE_EXPR that
conversion_warning can handle already, C++ FE emits
TARGET_EXPR <D.whatever, ...>, something | D.whatever
etc. and conversion_warning handles COMPOUND_EXPR by "recursing" on the
rhs.  To handle that case, we'd need, for a TARGET_EXPR on the lhs, to
remember in some hash map the mapping from D.whatever to the TARGET_EXPR
and, when we see D.whatever, use the corresponding TARGET_EXPR initializer
instead.

2022-01-11  Jakub Jelinek  <jakub@redhat.com>

PR c/101537
PR c/103881
gcc/c-family/
* c-warn.c (conversion_warning): Handle BIT_AND_EXPR, BIT_IOR_EXPR
and BIT_XOR_EXPR.
gcc/testsuite/
* c-c++-common/pr101537.c: New test.
* c-c++-common/pr103881.c: New test.

c++: dependent bases and 'this' availability [PR103831]

Here during satisfaction of B's constraints we're failing to reject the
object-less call to the non-static member function A::size ultimately
because satisfaction is performed in the (access) context of the class
template B, which has a dependent base, and so the any_dependent_bases_p
check within build_new_method_call causes us to not reject the call.
(Subsequent constexpr evaluation of the call succeeds since the function
is effectively static.)

This patch fixes this by refining the any_dependent_bases_p check within
build_new_method_call: if we're in a context where 'this' is unavailable,
then we cannot resolve the implicit object regardless of the presence of
a dependent base. So let's also check current_class_ptr alongside a_d_b_p.

PR c++/103831

gcc/cp/ChangeLog:

* call.c (build_new_method_call): Consider dependent bases only
if 'this' is available.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-class3.C: New test.
* g++.dg/template/non-dependent18.C: New test.

libstdc++: Add missing noexcept to lazy_split_view iterator (LWG 3593)

This was approved at the October 2021 plenary. We already have noexcept
in the other places the issue adds it in the spec.

libstdc++-v3/ChangeLog:

* include/std/ranges (ranges::lazy_split_view::_InnerIter::end()):
Add noexcept (LWG 3593).

libstdc++: Make copyable-box completely constexpr (LWG 3572)

This LWG issue was approved at the October 2021 plenary and can be
implemented now that std::optional is fully constexpr.

libstdc++-v3/ChangeLog:

* include/std/ranges (ranges::__detail::__box): Add constexpr to
assignment operators (LWG 3572).
* testsuite/std/ranges/adaptors/filter.cc: Check assignment of a
view that uses copyable-box.

tree-object-size: Dynamic sizes for ADDR_EXPR

Allow returning dynamic expressions from ADDR_EXPR for
__builtin_dynamic_object_size and also allow offsets to be dynamic.
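
A minimal sketch of an address expression with a run-time offset.  The
OBJSZ macro and the function name remaining_after are hypothetical; the
macro falls back to __builtin_object_size when the dynamic builtin is not
available, so the value returned depends on the compiler and optimization
level.

```c
#include <stddef.h>

/* Use the dynamic builtin when the compiler provides it; otherwise fall
   back to the static one so the sketch still compiles. */
#if defined(__has_builtin)
# if __has_builtin(__builtin_dynamic_object_size)
#  define OBJSZ(p) __builtin_dynamic_object_size((p), 0)
# endif
#endif
#ifndef OBJSZ
# define OBJSZ(p) __builtin_object_size((p), 0)
#endif

/* Bytes remaining in a 32-byte buffer past a run-time offset (off <= 32).
   A dynamic answer would be 32 - off; a static maximum estimate is 32;
   (size_t) -1 means "unknown". */
size_t remaining_after(size_t off)
{
    char buf[32];
    return OBJSZ(&buf[off]);
}
```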

gcc/ChangeLog:

PR middle-end/70090
* tree-object-size.c (size_valid_p): New function.
(size_for_offset): Remove OFFSET constness assertion.
(addr_object_size): Build dynamic expressions for object
sizes and use size_valid_p to decide if it is valid for the
given OBJECT_SIZE_TYPE.
(compute_builtin_object_size): Allow dynamic offsets when
computing size at O0.
(call_object_size): Call size_valid_p.
(plus_stmt_object_size): Allow non-constant offset and use
size_valid_p to decide if it is valid for the given
OBJECT_SIZE_TYPE.

gcc/testsuite/ChangeLog:

PR middle-end/70090
* gcc.dg/builtin-dynamic-object-size-0.c: Add new tests.
* gcc.dg/builtin-object-size-1.c (test1)
[__builtin_object_size]: Adjust expected output for dynamic
object sizes.
* gcc.dg/builtin-object-size-2.c (test1)
[__builtin_object_size]: Likewise.
* gcc.dg/builtin-object-size-3.c (test1)
[__builtin_object_size]: Likewise.
* gcc.dg/builtin-object-size-4.c (test1)
[__builtin_object_size]: Likewise.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

tree-object-size: Handle GIMPLE_CALL

Handle non-constant expressions in GIMPLE_CALL arguments. Also handle
alloca.
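
A minimal sketch of the allocation case, assuming a hypothetical OBJSZ
macro and function name alloc_size_seen.  With dynamic object sizes the
builtin may return the run-time allocation size; otherwise it reports
(size_t) -1 for "unknown".

```c
#include <stddef.h>
#include <stdlib.h>

/* Use the dynamic builtin when the compiler provides it; otherwise fall
   back to the static one so the sketch still compiles. */
#if defined(__has_builtin)
# if __has_builtin(__builtin_dynamic_object_size)
#  define OBJSZ(p) __builtin_dynamic_object_size((p), 0)
# endif
#endif
#ifndef OBJSZ
# define OBJSZ(p) __builtin_object_size((p), 0)
#endif

/* Object size the compiler associates with a malloc'ed block whose
   size argument is not a compile-time constant. */
size_t alloc_size_seen(size_t n)
{
    char *p = malloc(n);
    size_t sz = OBJSZ(p);   /* queried before the block is released */
    free(p);
    return sz;
}
```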

gcc/ChangeLog:

PR middle-end/70090
* tree-object-size.c (alloc_object_size): Make and return
non-constant size expression.
(call_object_size): Return expression or unknown based on
whether dynamic object size is requested.

gcc/testsuite/ChangeLog:

PR middle-end/70090
* gcc.dg/builtin-dynamic-object-size-0.c: Add new tests.
* gcc.dg/builtin-object-size-1.c (test1)
[__builtin_object_size]: Alter expected result for dynamic
object size.
* gcc.dg/builtin-object-size-2.c (test1)
[__builtin_object_size]: Likewise.
* gcc.dg/builtin-object-size-3.c (test1)
[__builtin_object_size]: Likewise.
* gcc.dg/builtin-object-size-4.c (test1)
[__builtin_object_size]: Likewise.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

tree-object-size: Handle function parameters

Handle hints provided by __attribute__ ((access (...))) to compute
dynamic sizes for objects.
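
A minimal sketch of a parameter annotated with the access attribute.  The
macros OBJSZ and ACCESS_RO and the function name parm_size are hypothetical;
both macros degrade to fallbacks on compilers that lack the feature, in
which case the size is typically reported as unknown.

```c
#include <stddef.h>

/* Use the dynamic builtin when the compiler provides it; otherwise fall
   back to the static one so the sketch still compiles. */
#if defined(__has_builtin)
# if __has_builtin(__builtin_dynamic_object_size)
#  define OBJSZ(p) __builtin_dynamic_object_size((p), 0)
# endif
#endif
#ifndef OBJSZ
# define OBJSZ(p) __builtin_object_size((p), 0)
#endif

/* Apply the access attribute only where the compiler knows it. */
#if defined(__has_attribute)
# if __has_attribute(access)
#  define ACCESS_RO(ptr, size) __attribute__((access(read_only, ptr, size)))
# endif
#endif
#ifndef ACCESS_RO
# define ACCESS_RO(ptr, size)
#endif

/* The attribute says p points to n elements; with dynamic object sizes
   that hint may be scaled to n * sizeof (int) bytes. */
ACCESS_RO(1, 2)
size_t parm_size(const int *p, size_t n)
{
    (void) n;   /* only used through the attribute */
    return OBJSZ(p);
}
```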

gcc/ChangeLog:

PR middle-end/70090
* tree-object-size.c: Include tree-dfa.h.
(parm_object_size): New function.
(collect_object_sizes_for): Call it.

gcc/testsuite/ChangeLog:

PR middle-end/70090
* gcc.dg/builtin-dynamic-object-size-0.c (test_parmsz_simple,
test_parmsz_scaled, test_parmsz_unknown): New functions.
(main): Call them. Add new arguments argc and argv.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

tree-object-size: Support dynamic sizes in conditions

Handle GIMPLE_PHI and conditionals specially for dynamic objects,
returning PHI/conditional expressions instead of just a MIN/MAX
estimate.
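
A minimal sketch of a pointer selected by a condition, assuming a
hypothetical OBJSZ macro and function name cond_size.  A dynamic result
would be the exact size of the chosen buffer, while a static maximum
estimate would be the larger of the two; the actual value depends on the
compiler and optimization level.

```c
#include <stddef.h>

/* Use the dynamic builtin when the compiler provides it; otherwise fall
   back to the static one so the sketch still compiles. */
#if defined(__has_builtin)
# if __has_builtin(__builtin_dynamic_object_size)
#  define OBJSZ(p) __builtin_dynamic_object_size((p), 0)
# endif
#endif
#ifndef OBJSZ
# define OBJSZ(p) __builtin_object_size((p), 0)
#endif

/* The pointer refers to an 8-byte or a 64-byte buffer depending on a
   run-time condition. */
size_t cond_size(int use_big)
{
    char small_buf[8], big_buf[64];
    char *p = use_big ? big_buf : small_buf;
    return OBJSZ(p);
}
```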

This makes the returned object size variable for loops and conditionals,
so tests need to be adjusted to look for precise size in some cases.
builtin-dynamic-object-size-5.c had to be modified to only look for
success in maximum object size case and skip over the minimum object
size tests because the result is no longer a compile time constant.

I also added some simple tests to exercise conditionals with dynamic
object sizes.

gcc/ChangeLog:

PR middle-end/70090
* builtins.c (fold_builtin_object_size): Adjust for dynamic size
expressions.
* tree-object-size.c: Include gimplify-me.h.
(struct object_size_info): New member UNKNOWNS.
(size_initval_p, size_usable_p, object_sizes_get_raw): New
functions.
(object_sizes_get): Return suitable gimple variable for
object size.
(bundle_sizes): New function.
(object_sizes_set): Use it and handle dynamic object size
expressions.
(object_sizes_set_temp): New function.
(size_for_offset): Adjust for dynamic size expressions.
(emit_phi_nodes, propagate_unknowns, gimplify_size_expressions):
New functions.
(compute_builtin_object_size): Call gimplify_size_expressions
for OST_DYNAMIC.
(dynamic_object_size): New function.
(cond_expr_object_size): Use it.
(phi_dynamic_object_size): New function.
(collect_object_sizes_for): Call it for OST_DYNAMIC. Adjust to
accommodate dynamic object sizes.

gcc/testsuite/ChangeLog:

PR middle-end/70090
* gcc.dg/builtin-dynamic-object-size-0.c: New tests.
* gcc.dg/builtin-dynamic-object-size-10.c: Add comment.
* gcc.dg/builtin-dynamic-object-size-5-main.c: New file.
* gcc.dg/builtin-dynamic-object-size-5.c: Use it and change test
to dg-do run.
* gcc.dg/builtin-object-size-5.c [!N]: Define N.
(test1, test2, test3, test4) [__builtin_object_size]: Expect
exact result for __builtin_dynamic_object_size.
* gcc.dg/builtin-object-size-1.c [__builtin_object_size]: Expect
exact size expressions for __builtin_dynamic_object_size.
* gcc.dg/builtin-object-size-2.c [__builtin_object_size]:
Likewise.
* gcc.dg/builtin-object-size-3.c [__builtin_object_size]:
Likewise.
* gcc.dg/builtin-object-size-4.c [__builtin_object_size]:
Likewise.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

tree-optimization/103961: Never compute offset for -1 size

Never try to compute size for offset when the object size is -1, which
is either unknown maximum or uninitialized minimum irrespective of the
osi->pass number.
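
The idea can be modelled with a simplified sketch.  This is not the code
in tree-object-size.c; SIZE_UNKNOWN and size_after_offset are hypothetical
names used only to show why the -1 sentinel must be left alone.

```c
#include <stddef.h>

/* Hypothetical name for the -1 sentinel described above. */
#define SIZE_UNKNOWN ((size_t) -1)

/* Bytes left in an object of `size` bytes after skipping `offset` bytes.
   Subtracting from the sentinel would produce a large value that looks
   like a real size, so the sentinel is passed through unchanged. */
size_t size_after_offset(size_t size, size_t offset)
{
    if (size == SIZE_UNKNOWN)
        return SIZE_UNKNOWN;
    return offset < size ? size - offset : 0;
}
```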

gcc/ChangeLog:

PR tree-optimization/103961
* tree-object-size.c (plus_stmt_object_size): Always avoid
computing offset for -1 size.

gcc/testsuite/ChangeLog:

PR tree-optimization/103961
* gcc.dg/pr103961.c: New test case.

Co-authored-by: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>