review.tizen.org Git - platform/upstream/gcc.git/log

AVX512FP16: Add ABI test for ymm.

gcc/testsuite/ChangeLog:

* gcc.target/x86_64/abi/avx512fp16/m256h/abi-avx512fp16-ymm.exp:
New exp file.
* gcc.target/x86_64/abi/avx512fp16/m256h/args.h: New header.
* gcc.target/x86_64/abi/avx512fp16/m256h/avx512fp16-ymm-check.h:
Likewise.
* gcc.target/x86_64/abi/avx512fp16/m256h/asm-support.S: New.
* gcc.target/x86_64/abi/avx512fp16/m256h/test_m256_returning.c:
New test.
* gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_m256.c: Likewise.
* gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_structs.c:
Likewise.
* gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_unions.c:
Likewise.
* gcc.target/x86_64/abi/avx512fp16/m256h/test_varargs-m256.c: Likewise.

AVX512FP16: Add ABI tests for xmm.

Copied from regular XMM ABI tests. Only run AVX512FP16 ABI tests for ELF
targets.

gcc/testsuite/ChangeLog:

* gcc.target/x86_64/abi/avx512fp16/abi-avx512fp16-xmm.exp: New exp
file for abi test.
* gcc.target/x86_64/abi/avx512fp16/args.h: New header file for abi test.
* gcc.target/x86_64/abi/avx512fp16/avx512fp16-check.h: Likewise.
* gcc.target/x86_64/abi/avx512fp16/avx512fp16-xmm-check.h: Likewise.
* gcc.target/x86_64/abi/avx512fp16/defines.h: Likewise.
* gcc.target/x86_64/abi/avx512fp16/macros.h: Likewise.
* gcc.target/x86_64/abi/avx512fp16/asm-support.S: New asm for abi check.
* gcc.target/x86_64/abi/avx512fp16/test_3_element_struct_and_unions.c:
New test.
* gcc.target/x86_64/abi/avx512fp16/test_basic_alignment.c: Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_basic_array_size_and_align.c:
Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_basic_returning.c: Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_basic_sizes.c: Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_basic_struct_size_and_align.c:
Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_basic_union_size_and_align.c:
Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_complex_returning.c: Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_m64m128_returning.c: Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_passing_floats.c: Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_passing_m64m128.c: Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_passing_structs.c: Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_passing_unions.c: Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_struct_returning.c: Likewise.
* gcc.target/x86_64/abi/avx512fp16/test_varargs-m128.c: Likewise.

AVX512FP16: Add tests for vector passing in variable arguments.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512fp16-vararg-1.c: New test.
* gcc.target/i386/avx512fp16-vararg-2.c: Ditto.
* gcc.target/i386/avx512fp16-vararg-3.c: Ditto.
* gcc.target/i386/avx512fp16-vararg-4.c: Ditto.

AVX512FP16: Add testcase for vector init and broadcast intrinsics.

gcc/testsuite/ChangeLog:

* gcc.target/i386/m512-check.h: Add union128h, union256h, union512h.
* gcc.target/i386/avx512fp16-10a.c: New test.
* gcc.target/i386/avx512fp16-10b.c: Ditto.
* gcc.target/i386/avx512fp16-1a.c: Ditto.
* gcc.target/i386/avx512fp16-1b.c: Ditto.
* gcc.target/i386/avx512fp16-1c.c: Ditto.
* gcc.target/i386/avx512fp16-1d.c: Ditto.
* gcc.target/i386/avx512fp16-1e.c: Ditto.
* gcc.target/i386/avx512fp16-2a.c: Ditto.
* gcc.target/i386/avx512fp16-2b.c: Ditto.
* gcc.target/i386/avx512fp16-2c.c: Ditto.
* gcc.target/i386/avx512fp16-3a.c: Ditto.
* gcc.target/i386/avx512fp16-3b.c: Ditto.
* gcc.target/i386/avx512fp16-3c.c: Ditto.
* gcc.target/i386/avx512fp16-4.c: Ditto.
* gcc.target/i386/avx512fp16-5.c: Ditto.
* gcc.target/i386/avx512fp16-6.c: Ditto.
* gcc.target/i386/avx512fp16-7.c: Ditto.
* gcc.target/i386/avx512fp16-8.c: Ditto.
* gcc.target/i386/avx512fp16-9a.c: Ditto.
* gcc.target/i386/avx512fp16-9b.c: Ditto.
* gcc.target/i386/pr54855-13.c: Ditto.
* gcc.target/i386/avx512fp16-vec_set_var.c: Ditto.

AVX512FP16: Support vector init/broadcast/set/extract for FP16.

gcc/ChangeLog:

* config/i386/avx512fp16intrin.h (_mm_set_ph): New intrinsic.
(_mm256_set_ph): Likewise.
(_mm512_set_ph): Likewise.
(_mm_setr_ph): Likewise.
(_mm256_setr_ph): Likewise.
(_mm512_setr_ph): Likewise.
(_mm_set1_ph): Likewise.
(_mm256_set1_ph): Likewise.
(_mm512_set1_ph): Likewise.
(_mm_setzero_ph): Likewise.
(_mm256_setzero_ph): Likewise.
(_mm512_setzero_ph): Likewise.
(_mm_set_sh): Likewise.
(_mm_load_sh): Likewise.
(_mm_store_sh): Likewise.
* config/i386/i386-builtin-types.def (V8HF): New type.
(DEF_FUNCTION_TYPE (V8HF, V8HI)): New builtin function type.
* config/i386/i386-expand.c (ix86_expand_vector_init_duplicate):
Support vector HFmodes.
(ix86_expand_vector_init_one_nonzero): Likewise.
(ix86_expand_vector_init_one_var): Likewise.
(ix86_expand_vector_init_interleave): Likewise.
(ix86_expand_vector_init_general): Likewise.
(ix86_expand_vector_set): Likewise.
(ix86_expand_vector_extract): Likewise.
(ix86_expand_vector_init_concat): Likewise.
(ix86_expand_sse_movcc): Handle vector HFmodes.
(ix86_expand_vector_set_var): Ditto.
* config/i386/i386-modes.def: Add HF vector modes in comment.
* config/i386/i386.c (classify_argument): Add HF vector modes.
(ix86_hard_regno_mode_ok): Allow HF vector modes for AVX512FP16.
(ix86_vector_mode_supported_p): Likewise.
(ix86_set_reg_reg_cost): Handle vector HFmode.
(ix86_get_ssemov): Handle vector HFmode.
(function_arg_advance_64): Pass unnamed V16HFmode and V32HFmode
by stack.
(function_arg_advance_32): Pass V8HF/V16HF/V32HF by sse reg for 32bit
mode.
(function_arg_advance_32): Ditto.
* config/i386/i386.h (VALID_AVX512FP16_REG_MODE): New.
(VALID_AVX256_REG_OR_OI_MODE): Rename to ..
(VALID_AVX256_REG_OR_OI_VHF_MODE): .. this, and add V16HF.
(VALID_SSE2_REG_VHF_MODE): New.
(VALID_AVX512VL_128_REG_MODE): Add V8HF and TImode.
(SSE_REG_MODE_P): Add vector HFmode.
* config/i386/i386.md (mode): Add HF vector modes.
(MODE_SIZE): Likewise.
(ssemodesuffix): Add ph suffix for HF vector modes.
* config/i386/sse.md (VFH_128): New mode iterator.
(VMOVE): Adjust for HF vector modes.
(V): Likewise.
(V_256_512): Likewise.
(avx512): Likewise.
(avx512fmaskmode): Likewise.
(shuffletype): Likewise.
(sseinsnmode): Likewise.
(ssedoublevecmode): Likewise.
(ssehalfvecmode): Likewise.
(ssehalfvecmodelower): Likewise.
(ssePScmode): Likewise.
(ssescalarmode): Likewise.
(ssescalarmodelower): Likewise.
(sseintprefix): Likewise.
(i128): Likewise.
(bcstscalarsuff): Likewise.
(xtg_mode): Likewise.
(VI12HF_AVX512VL): New mode_iterator.
(VF_AVX512FP16): Likewise.
(VIHF): Likewise.
(VIHF_256): Likewise.
(VIHF_AVX512BW): Likewise.
(V16_256): Likewise.
(V32_512): Likewise.
(sseintmodesuffix): New mode_attr.
(sse): Add scalar and vector HFmodes.
(ssescalarmode): Add vector HFmode mapping.
(ssescalarmodesuffix): Add sh suffix for HFmode.
(*<sse>_vm<insn><mode>3): Use VFH_128.
(*<sse>_vm<multdiv_mnemonic><mode>3): Likewise.
(*ieee_<ieee_maxmin><mode>3): Likewise.
(<avx512>_blendm<mode>): New define_insn.
(vec_setv8hf): New define_expand.
(vec_set<mode>_0): New define_insn for HF vector set.
(*avx512fp16_movsh): Likewise.
(avx512fp16_movsh): Likewise.
(vec_extract_lo_v32hi): Rename to ...
(vec_extract_lo_<mode>): ... this, and adjust to allow HF
vector modes.
(vec_extract_hi_v32hi): Likewise.
(vec_extract_hi_<mode>): Likewise.
(vec_extract_lo_v16hi): Likewise.
(vec_extract_lo_<mode>): Likewise.
(vec_extract_hi_v16hi): Likewise.
(vec_extract_hi_<mode>): Likewise.
(vec_set_hi_v16hi): Likewise.
(vec_set_hi_<mode>): Likewise.
(vec_set_lo_v16hi): Likewise.
(vec_set_lo_<mode>): Likewise.
(*vec_extract<mode>_0): New define_insn_and_split for HF
vector extract.
(*vec_extracthf): New define_insn.
(VEC_EXTRACT_MODE): Add HF vector modes.
(PINSR_MODE): Add V8HF.
(sse2p4_1): Likewise.
(pinsr_evex_isa): Likewise.
(<sse2p4_1>_pinsr<ssemodesuffix>): Adjust to support
insert for V8HFmode.
(pbroadcast_evex_isa): Add HF vector modes.
(AVX2_VEC_DUP_MODE): Likewise.
(VEC_INIT_MODE): Likewise.
(VEC_INIT_HALF_MODE): Likewise.
(avx2_pbroadcast<mode>): Adjust to support HF vector mode
broadcast.
(avx2_pbroadcast<mode>_1): Likewise.
(<avx512>_vec_dup<mode>_1): Likewise.
(<avx512>_vec_dup<mode><mask_name>): Likewise.
(<mask_codefor><avx512>_vec_dup_gpr<mode><mask_name>):
Likewise.

AVX512FP16: Initial support for AVX512FP16 feature and scalar _Float16 instructions.

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_available_features):
Detect FEATURE_AVX512FP16.
* common/config/i386/i386-common.c
(OPTION_MASK_ISA_AVX512FP16_SET,
OPTION_MASK_ISA_AVX512FP16_UNSET,
OPTION_MASK_ISA2_AVX512FP16_SET,
OPTION_MASK_ISA2_AVX512FP16_UNSET): New.
(OPTION_MASK_ISA2_AVX512BW_UNSET,
OPTION_MASK_ISA2_AVX512BF16_UNSET): Add AVX512FP16.
(ix86_handle_option): Handle -mavx512fp16.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_AVX512FP16.
* common/config/i386/i386-isas.h: Add entry for AVX512FP16.
* config.gcc: Add avx512fp16intrin.h.
* config/i386/avx512fp16intrin.h: New intrinsic header.
* config/i386/cpuid.h: Add bit_AVX512FP16.
* config/i386/i386-builtin-types.def: (FLOAT16): New primitive type.
* config/i386/i386-builtins.c: Support _Float16 type for i386
backend.
(ix86_register_float16_builtin_type): New function.
(ix86_float16_type_node): New.
* config/i386/i386-c.c (ix86_target_macros_internal): Define
__AVX512FP16__.
* config/i386/i386-expand.c (ix86_expand_branch): Support
HFmode.
(ix86_prepare_fp_compare_args): Adjust TARGET_SSE_MATH &&
SSE_FLOAT_MODE_P to SSE_FLOAT_MODE_SSEMATH_OR_HF_P.
(ix86_expand_fp_movcc): Ditto.
* config/i386/i386-isa.def: Add PTA define for AVX512FP16.
* config/i386/i386-options.c (isa2_opts): Add -mavx512fp16.
(ix86_valid_target_attribute_inner_p): Add avx512fp16 attribute.
* config/i386/i386.c (ix86_get_ssemov): Use
vmovdqu16/vmovw/vmovsh for HFmode/HImode scalar or vector.
(ix86_get_excess_precision): Use
FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when TARGET_AVX512FP16
is enabled.
(sse_store_index): Use SFmode cost for HFmode cost.
(inline_memory_move_cost): Add HFmode, and prefer SSE cost over
GPR cost for HFmode.
(ix86_hard_regno_mode_ok): Allow HImode in sse register.
(ix86_mangle_type): Add mangling for _Float16 type.
(inline_secondary_memory_needed): No memory is needed for
16bit movement between gpr and sse reg under
TARGET_AVX512FP16.
(ix86_multiplication_cost): Adjust TARGET_SSE_MATH &&
SSE_FLOAT_MODE_P to SSE_FLOAT_MODE_SSEMATH_OR_HF_P.
(ix86_division_cost): Ditto.
(ix86_rtx_costs): Ditto.
(ix86_add_stmt_cost): Ditto.
(ix86_optab_supported_p): Ditto.
* config/i386/i386.h (VALID_AVX512F_SCALAR_MODE): Add HFmode.
(SSE_FLOAT_MODE_SSEMATH_OR_HF_P): Add HFmode.
(PTA_SAPPHIRERAPIDS): Add PTA_AVX512FP16.
* config/i386/i386.md (mode): Add HFmode.
(MODE_SIZE): Add HFmode.
(isa): Add avx512fp16.
(enabled): Handle avx512fp16.
(ssemodesuffix): Add sh suffix for HFmode.
(comm): Add mult, div.
(plusminusmultdiv): New code iterator.
(insn): Add mult, div.
(*movhf_internal): Adjust for avx512fp16 instruction.
(*movhi_internal): Ditto.
(*cmpi<unord>hf): New define_insn for HFmode.
(*ieee_s<ieee_maxmin>hf3): Likewise.
(extendhf<mode>2): Likewise.
(trunc<mode>hf2): Likewise.
(float<floatunssuffix><mode>hf2): Likewise.
(*<insn>hf): Likewise.
(cbranchhf4): New expander.
(movhfcc): Likewise.
(<insn>hf3): Likewise.
(mulhf3): Likewise.
(divhf3): Likewise.
* config/i386/i386.opt: Add mavx512fp16.
* config/i386/immintrin.h: Include avx512fp16intrin.h.
* doc/invoke.texi: Add mavx512fp16.
* doc/extend.texi: Add avx512fp16 Usage Notes.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx-1.c: Add -mavx512fp16 in dg-options.
* gcc.target/i386/avx-2.c: Ditto.
* gcc.target/i386/avx512-check.h: Check cpuid for AVX512FP16.
* gcc.target/i386/funcspec-56.inc: Add new target attribute check.
* gcc.target/i386/sse-13.c: Add -mavx512fp16.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* lib/target-supports.exp: (check_effective_target_avx512fp16): New.
* g++.target/i386/float16-1.C: New test.
* g++.target/i386/float16-2.C: Ditto.
* g++.target/i386/float16-3.C: Ditto.
* gcc.target/i386/avx512fp16-12a.c: Ditto.
* gcc.target/i386/avx512fp16-12b.c: Ditto.
* gcc.target/i386/float16-3a.c: Ditto.
* gcc.target/i386/float16-3b.c: Ditto.
* gcc.target/i386/float16-4a.c: Ditto.
* gcc.target/i386/float16-4b.c: Ditto.
* gcc.target/i386/pr54855-12.c: Ditto.
* g++.dg/other/i386-2.C: Ditto.
* g++.dg/other/i386-3.C: Ditto.

Co-Authored-By: H.J. Lu <hongjiu.lu@intel.com>
Co-Authored-By: Liu Hongtao <hongtao.liu@intel.com>
Co-Authored-By: Wang Hongyu <hongyu.wang@intel.com>
Co-Authored-By: Xu Dianhong <dianhong.xu@intel.com>

Support -fexcess-precision=16 which will enable FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when backend supports _Float16.

gcc/ada/ChangeLog:

* gcc-interface/misc.c (gnat_post_options): Issue an error for
-fexcess-precision=16.

gcc/c-family/ChangeLog:

* c-common.c (excess_precision_mode_join): Update below comments.
(c_ts18661_flt_eval_method): Set excess_precision_type to
EXCESS_PRECISION_TYPE_FLOAT16 when -fexcess-precision=16.
* c-cppbuiltin.c (cpp_atomic_builtins): Update below comments.
(c_cpp_flt_eval_method_iec_559): Set excess_precision_type to
EXCESS_PRECISION_TYPE_FLOAT16 when -fexcess-precision=16.

gcc/ChangeLog:

* common.opt: Support -fexcess-precision=16.
* config/aarch64/aarch64.c (aarch64_excess_precision): Return
FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when
EXCESS_PRECISION_TYPE_FLOAT16.
* config/arm/arm.c (arm_excess_precision): Ditto.
* config/i386/i386.c (ix86_get_excess_precision): Ditto.
* config/m68k/m68k.c (m68k_excess_precision): Issue an error
when EXCESS_PRECISION_TYPE_FLOAT16.
* config/s390/s390.c (s390_excess_precision): Ditto.
* coretypes.h (enum excess_precision_type): Add
EXCESS_PRECISION_TYPE_FLOAT16.
* doc/tm.texi (TARGET_C_EXCESS_PRECISION): Update documents.
* doc/tm.texi.in (TARGET_C_EXCESS_PRECISION): Ditto.
* doc/extend.texi (Half-Precision): Document
-fexcess-precision=16.
* flag-types.h (enum excess_precision): Add
EXCESS_PRECISION_FLOAT16.
* target.def (excess_precision): Update document.
* tree.c (excess_precision_type): Set excess_precision_type to
EXCESS_PRECISION_FLOAT16 when -fexcess-precision=16.

gcc/fortran/ChangeLog:

* options.c (gfc_post_options): Issue an error for
-fexcess-precision=16.

gcc/testsuite/ChangeLog:

* gcc.target/i386/float16-6.c: New test.
* gcc.target/i386/float16-7.c: New test.

Adjust the wording for x86 _Float16 type.

gcc/ChangeLog:

* doc/extend.texi: (@node Floating Types): Adjust the wording.
(@node Half-Precision): Ditto.

Daily bump.

gcc: xtensa: fix PR target/102115

2021-09-07 Takayuki 'January June' Suwa <jjsuwa_sys3175@yahoo.co.jp>
gcc/
PR target/102115
* config/xtensa/xtensa.c (xtensa_emit_move_sequence): Add
'CONST_INT_P (src)' to the condition of the block that tries to
eliminate literal when loading integer constant.

runtime: use hash32, not hash64, for amd64p32, mips64p32, mips64p32le

Fixes PR go/102102

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/348015

doc: BPF CO-RE documentation

Document the new command line options (-mco-re and -mno-co-re), the new
BPF target builtin (__builtin_preserve_access_index), and the new BPF
target attribute (preserve_access_index) introduced with BPF CO-RE.

gcc/ChangeLog:

* doc/extend.texi (BPF Type Attributes): New node.
Document new preserve_access_index attribute.
Document new preserve_access_index builtin.
* doc/invoke.texi: Document -mco-re and -mno-co-re options.

bpf testsuite: Add BPF CO-RE tests

This commit adds several tests for the new BPF CO-RE functionality to
the BPF target testsuite.

gcc/testsuite/ChangeLog:

* gcc.target/bpf/core-attr-1.c: New test.
* gcc.target/bpf/core-attr-2.c: Likewise.
* gcc.target/bpf/core-attr-3.c: Likewise.
* gcc.target/bpf/core-attr-4.c: Likewise.
* gcc.target/bpf/core-builtin-1.c: Likewise.
* gcc.target/bpf/core-builtin-2.c: Likewise.
* gcc.target/bpf/core-builtin-3.c: Likewise.
* gcc.target/bpf/core-section-1.c: Likewise.

bpf: BPF CO-RE support

This commit introduces support for BPF Compile Once - Run
Everywhere (CO-RE) in GCC.

gcc/ChangeLog:

* config/bpf/bpf.c: Adjust includes.
(bpf_handle_preserve_access_index_attribute): New function.
(bpf_attribute_table): Use it here.
(bpf_builtins): Add BPF_BUILTIN_PRESERVE_ACCESS_INDEX.
(bpf_option_override): Handle "-mco-re" option.
(bpf_asm_init_sections): New.
(TARGET_ASM_INIT_SECTIONS): Redefine.
(bpf_file_end): New.
(TARGET_ASM_FILE_END): Redefine.
(bpf_init_builtins): Add "__builtin_preserve_access_index".
(bpf_core_compute, bpf_core_get_index): New.
(is_attr_preserve_access): New.
(bpf_expand_builtin): Handle new builtins.
(bpf_core_newdecl, bpf_core_is_maybe_aggregate_access): New.
(bpf_core_walk): New.
(bpf_resolve_overloaded_builtin): New.
(TARGET_RESOLVE_OVERLOADED_BUILTIN): Redefine.
(handle_attr): New.
(pass_bpf_core_attr): New RTL pass.
* config/bpf/bpf-passes.def: New file.
* config/bpf/bpf-protos.h (make_pass_bpf_core_attr): New.
* config/bpf/coreout.c: New file.
* config/bpf/coreout.h: Likewise.
* config/bpf/t-bpf (TM_H): Add $(srcdir)/config/bpf/coreout.h.
(coreout.o): New rule.
(PASSES_EXTRA): Add $(srcdir)/config/bpf/bpf-passes.def.
* config.gcc (bpf): Add coreout.h to extra_headers.
Add coreout.o to extra_objs.
Add $(srcdir)/config/bpf/coreout.c to target_gtfiles.

btf: expose get_btf_id

Expose the function get_btf_id, so that it may be used by the BPF
backend. This enables the BPF CO-RE machinery in the BPF backend to
lookup BTF type IDs, in order to create CO-RE relocation records.

A prototype is added in ctfc.h

gcc/ChangeLog:

* btfout.c (get_btf_id): Function is no longer static.
* ctfc.h: Expose it here.

ctfc: add function to lookup CTF ID of a TREE type

Add a new function, ctf_lookup_tree_type, to return the CTF type ID
associated with a type via its TREE node. The function is exposed via
a prototype in ctfc.h.

gcc/ChangeLog:

* ctfc.c (ctf_lookup_tree_type): New function.
* ctfc.h: Likewise.

ctfc: externalize ctf_dtd_lookup

Expose the function ctf_dtd_lookup, so that it can be used by the BPF
CO-RE machinery. The function is no longer static, and an extern
prototype is added in ctfc.h.

gcc/ChangeLog:

* ctfc.c (ctf_dtd_lookup): Function is no longer static.
* ctfc.h: Analogous change.

dwarf: externalize lookup_type_die

Expose the function lookup_type_die in dwarf2out, so that it can be used
by CTF/BTF when adding BPF CO-RE information. The function is now
non-static, and an extern prototype is added in dwarf2out.h.

gcc/ChangeLog:

* dwarf2out.c (lookup_type_die): Function is no longer static.
* dwarf2out.h: Expose it here.

Fix fatal typo in gcc.dg/no_profile_instrument_function-attr-2.c

Dejagnu is unfortunately brittle: a syntax error in a
directive can abort the test-run for the current "tool"
(gcc, g++, gfortran), and if you don't check for this
condition or actually read the stdout log yourself, your
tools may make you believe the test was successful without
regressions.  At the very least, always grep for ^ERROR: in
the stdout log!

With r12-3379, the testsuite got such a fatal syntax error,
causing the gcc test-run to abort at (e.g.):

...
FAIL: gcc.dg/memchr.c (test for excess errors)
FAIL: gcc.dg/memcmp-3.c (test for excess errors)
ERROR: (DejaGnu) proc "scan-tree-dump-not\" = foo {"} optimized" does not exist.
The error code is TCL LOOKUP COMMAND scan-tree-dump-not\"
The info on the error is:
invalid command name "scan-tree-dump-not""
    while executing
"::tcl_unknown scan-tree-dump-not\" = foo {"} optimized"
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 ::tcl_unknown $args"

=== gcc Summary ===

# of expected passes 63740
# of unexpected failures 38
# of unexpected successes 2
# of expected failures 351
# of unresolved testcases 3
# of unsupported tests 662
x/cris-elf/gccobj/gcc/xgcc  version 12.0.0 20210907 (experimental)\
[master r12-3391-g849d5f5929fc] (GCC)

testsuite:
* gcc.dg/no_profile_instrument_function-attr-2.c: Fix
typo in last change.

Fortran - improve error recovery determining array element from constructor

gcc/fortran/ChangeLog:

PR fortran/101327
* expr.c (find_array_element): When bounds cannot be determined as
constant, return error instead of aborting.

gcc/testsuite/ChangeLog:

PR fortran/101327
* gfortran.dg/pr101327.f90: New test.

dwarf2out: Emit BTF in dwarf2out_finish for BPF CO-RE usecase

DWARF generation is split between early and late phases when LTO is in effect.
This poses challenges for CTF/BTF generation especially if late debug info
generation is desirable, as turns out to be the case for BPF CO-RE.

The approach taken here in this patch is:

1. LTO is disabled for BPF CO-RE
The reason to disable LTO for BPF CO-RE is that if LTO is in effect, BPF CO-RE
relocations need to be generated in the LTO link phase _after_ the optimizations
are done. This means we need to devise a way to combine early and late BTF. At
this time, in absence of linker support for BTF sections, it makes sense to
steer clear of LTO for BPF CO-RE and bypass the issue.

2. The BPF backend updates the write_symbols with BTF_WITH_CORE_DEBUG to convey
the case that BTF with CO-RE support needs to be generated.  This information
is used by the debug info emission routines to defer the emission of BTF/CO-RE
until dwarf2out_finish.

So, in other words,

dwarf2out_early_finish
  - Always emit CTF here.
  - if (BTF && !BTF_WITH_CORE), emit BTF now.

dwarf2out_finish
  - if (BTF_WITH_CORE) emit BTF now.
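
The split above can be sketched as three small predicates. This is only an illustration: the flag names and bit values below are hypothetical stand-ins, not GCC's actual write_symbols encoding, and the sketch assumes CTF is emitted early whenever it is requested.

```c
#include <stdbool.h>

/* Hypothetical debug-format bits, invented for this sketch. */
enum {
    SKETCH_CTF           = 1 << 0,
    SKETCH_BTF           = 1 << 1,
    SKETCH_BTF_WITH_CORE = 1 << 2
};

/* dwarf2out_early_finish: CTF is always emitted here when requested. */
static bool emit_ctf_early(unsigned fmt)
{
    return (fmt & SKETCH_CTF) != 0;
}

/* dwarf2out_early_finish: plain BTF is emitted early, BTF with CO-RE is not. */
static bool emit_btf_early(unsigned fmt)
{
    return (fmt & SKETCH_BTF) != 0 && (fmt & SKETCH_BTF_WITH_CORE) == 0;
}

/* dwarf2out_finish: BTF with CO-RE is deferred until this point. */
static bool emit_btf_late(unsigned fmt)
{
    return (fmt & SKETCH_BTF_WITH_CORE) != 0;
}
```

For any format mask, at most one of emit_btf_early and emit_btf_late holds, so BTF is never written twice.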

gcc/ChangeLog:

* dwarf2ctf.c (ctf_debug_finalize): Make it static.
(ctf_debug_early_finish): New definition.
(ctf_debug_finish): Likewise.
* dwarf2ctf.h (ctf_debug_finalize): Remove declaration.
(ctf_debug_early_finish): New declaration.
(ctf_debug_finish): Likewise.
* dwarf2out.c (dwarf2out_finish): Invoke ctf_debug_finish.
(dwarf2out_early_finish): Invoke ctf_debug_early_finish.

bpf: Add new -mco-re option for BPF CO-RE

-mco-re in the BPF backend enables code generation for the CO-RE usecase. LTO is
disabled for CO-RE compilations.

gcc/ChangeLog:

* config/bpf/bpf.c (bpf_option_override): For BPF backend, disable LTO
support when compiling for CO-RE.
* config/bpf/bpf.opt: Add new command line option -mco-re.

gcc/testsuite/ChangeLog:

* gcc.target/bpf/core-lto-1.c: New test.

debug: Add BTF_WITH_CORE_DEBUG debug format

To best handle BTF/CO-RE in GCC, a distinct BTF_WITH_CORE_DEBUG debug format is
being added. This helps the compiler detect whether BTF with CO-RE relocations
needs to be emitted.

gcc/ChangeLog:

* flag-types.h (enum debug_info_type): Add new enum
DINFO_TYPE_BTF_WITH_CORE.
(BTF_WITH_CORE_DEBUG): New bitmask.
* flags.h (btf_with_core_debuginfo_p): New declaration.
* opts.c (btf_with_core_debuginfo_p): New definition.

tree: Change error_operand_p to an inline function

I've thought for a while that many of the macros in tree.h and such should
become inline functions. This one in particular was confusing Coverity; the
null check in the macro made it think that all code guarded by
error_operand_p would also need null checks.
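
A minimal sketch of the macro-to-inline-function change, using a heavily simplified, hypothetical node type rather than GCC's real tree. The struct, the error_mark object and the sample nodes are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node: only a type pointer. */
struct node { struct node *type; };

/* Stand-in for error_mark_node; its type is itself. */
struct node error_mark = { &error_mark };

/* Macro form: the embedded null test is expanded at every use site,
   which is what can mislead a static analyzer about the callers. */
#define ERROR_OPERAND_P_MACRO(n) \
    ((n) == &error_mark || ((n) != NULL && (n)->type == &error_mark))

/* Inline-function form: same logic, but the null test stays inside
   the function body. */
static inline bool error_operand_p(const struct node *n)
{
    return n == &error_mark || (n != NULL && n->type == &error_mark);
}

/* Sample nodes for trying the predicate. */
struct node int_type = { NULL };
struct node ok_expr  = { &int_type };
struct node bad_expr = { &error_mark };
```

Both forms give the same answer for every node; the difference is only where the null check is visible.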

gcc/ChangeLog:

* tree.h (error_operand_p): Change to inline function.

c++: Fix up constexpr evaluation of deleting dtors [PR100495]

We do not save bodies of constexpr clones and instead evaluate the bodies
of the constexpr functions they were cloned from.
I believe that is just fine for constructors because complete vs. base
ctors differ only in classes that have virtual bases and such constructors
aren't constexpr, similarly complete/base destructors.
But as the testcase below shows, for deleting destructors it is not fine,
deleting dtors while marked as clones in fact are just artificial functions
with synthetized body which calls the user destructor and deallocation.

So, either we'd need to evaluate the destructor and afterwards synthetize
and evaluate the deallocation, or we can just save and use the deleting
dtors bodies. The latter seems much easier to me.

2021-09-07 Jakub Jelinek <jakub@redhat.com>

PR c++/100495
* constexpr.c (maybe_save_constexpr_fundef): Save body even for
constexpr deleting dtors.
(cxx_eval_call_expression): Don't use DECL_CLONED_FUNCTION for
deleting dtors.

* g++.dg/cpp2a/constexpr-new21.C: New test.

libgomp.texi: Extend OpenMP 5.0 Implementation Status

libgomp/
* libgomp.texi (OpenMP Implementation Status): Extend
OpenMP 5.0 section.
(OpenACC Profiling Interface): Fix typo.

Rename forwarder_block_p in threading code to empty_block_with_phis_p.

gcc/ChangeLog:

* tree-ssa-threadedge.c (forwarder_block_p): Rename to...
(empty_block_with_phis_p): ...this.
(potentially_threadable_block): Same.
(jump_threader::thread_through_normal_block): Same.

libgfortran: Makefile fix for ISO_Fortran_binding.h

libgfortran/ChangeLog:

* Makefile.am (gfor_built_src): Depend on
include/ISO_Fortran_binding.h not on ISO_Fortran_binding.h.
(ISO_Fortran_binding.h): Rename make target to ...
(include/ISO_Fortran_binding.h): ... this.
* Makefile.in: Regenerate.

Fix PR debug/101947

This is the recent LTO bootstrap failure with Ada enabled. The compiler now
generates DW_OP_deref_type for a unit of the Ada front-end, which means that
the offset of base types in the CU must be computed during early DWARF too.

gcc/
PR debug/101947
* dwarf2out.c (mark_base_types): New overloaded function.
(dwarf2out_early_finish): Invoke it on the COMDAT type list as well
as the compilation unit, and call move_marked_base_types afterward.

x86: Enable FMA in unsigned SI to SF expanders

Enable FMA in scalar/vector unsigned SI to SF expanders. Don't check
TARGET_AVX512F which has vcvtusi2ss and vcvtudq2ps instructions.

gcc/

PR target/85819
* config/i386/i386-expand.c (ix86_expand_convert_uns_sisf_sse):
Enable FMA.
(ix86_expand_vector_convert_uns_vsivsf): Likewise.

gcc/testsuite/

PR target/85819
* gcc.target/i386/pr85819-1a.c: New test.
* gcc.target/i386/pr85819-1b.c: Likewise.
* gcc.target/i386/pr85819-2a.c: Likewise.
* gcc.target/i386/pr85819-2b.c: Likewise.
* gcc.target/i386/pr85819-2c.c: Likewise.
* gcc.target/i386/pr85819-3.c: Likewise.

tree-optimization/102226 - fix epilogue vector re-use

This fixes re-use of the reduction value in epilogue vectorization
when a conversion from/to variable length vectors is required.

2021-09-07 Richard Biener <rguenther@suse.de>

PR tree-optimization/102226
* tree-vect-loop.c (vect_transform_cycle_phi): Record
the converted value for the epilogue PHI use.

* g++.dg/vect/pr102226.cc: New testcase.

C, C++, Fortran, OpenMP: Add support for 'flush seq_cst' construct.

This patch adds support for the 'seq_cst' memory order clause on the 'flush'
directive which was introduced in OpenMP 5.1.
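
As a rough illustration of the syntax being added, the sketch below places the directive in two helper functions. The variables and function names are hypothetical, and the flag-based handoff is shown only to give the directive some context, not as a recommended synchronization pattern. Without -fopenmp the pragma is ignored and the code runs serially.

```c
/* Hypothetical shared state for the sketch. */
int payload;
int ready;

void publish(int value)
{
    payload = value;
    /* OpenMP 5.1 form: a memory-order clause and no variable list. */
    #pragma omp flush seq_cst
    ready = 1;
}

int try_consume(int *out)
{
    #pragma omp flush seq_cst
    if (!ready)
        return 0;
    *out = payload;
    return 1;
}
```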

gcc/c-family/ChangeLog:

* c-omp.c (c_finish_omp_flush): Handle MEMMODEL_SEQ_CST.

gcc/c/ChangeLog:

* c-parser.c (c_parser_omp_flush): Parse 'seq_cst' clause on 'flush'
directive.

gcc/cp/ChangeLog:

* parser.c (cp_parser_omp_flush): Parse 'seq_cst' clause on 'flush'
directive.
* semantics.c (finish_omp_flush): Handle MEMMODEL_SEQ_CST.

gcc/fortran/ChangeLog:

* openmp.c (gfc_match_omp_flush): Parse 'seq_cst' clause on 'flush'
directive.
* trans-openmp.c (gfc_trans_omp_flush): Handle OMP_MEMORDER_SEQ_CST.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/flush-1.c: Add test case for 'seq_cst'.
* c-c++-common/gomp/flush-2.c: Add test case for 'seq_cst'.
* g++.dg/gomp/attrs-1.C: Adapt test to handle all flush clauses.
* g++.dg/gomp/attrs-2.C: Adapt test to handle all flush clauses.
* gfortran.dg/gomp/flush-1.f90: Add test case for 'seq_cst'.
* gfortran.dg/gomp/flush-2.f90: Add test case for 'seq_cst'.

inline: do not einline when no_profile_instrument_function is different

PR gcov-profile/80223

gcc/ChangeLog:

* ipa-inline.c (can_inline_edge_p): Similarly to sanitizer
options, do not inline when no_profile_instrument_function
attributes are different in early inliner. It's fine to inline
it after PGO instrumentation.

gcc/testsuite/ChangeLog:

* gcc.dg/no_profile_instrument_function-attr-2.c: New test.

tree-optimization/101555 - avoid redundant alias queries in PRE

This avoids doing redundant work during PHI translation to invalidate
mems when translating their corresponding VUSE through the blocks
virtual PHI node. All the invalidation work is already done by
prune_clobbered_mems.

This speeds up the compile of the testcase from 275s with PRE
taking 91% of the compile-time down to 43s with PRE taking 16%
of the compile-time.

2021-09-07 Richard Biener <rguenther@suse.de>

PR tree-optimization/101555
* tree-ssa-pre.c (translate_vuse_through_block): Do not
perform an alias walk to determine the validity of the
mem at the start of the block which is already guaranteed
by means of prune_clobbered_mems.
(phi_translate_1): Pass edge to translate_vuse_through_block.

libgomp.texi: Add OpenMP Implementation Status

libgomp/
* libgomp.texi (Enabling OpenMP): Refer to OMP spec in general
not to 4.5; link to new section.
(OpenMP Implementation Status): New.

Fortran: Revert to non-multilib-specific ISO_Fortran_binding.h

Commit fef67987cf502fe322e92ddce22eea7ac46b4d75 changed the
libgfortran build process to generate multilib-specific versions of
ISO_Fortran_binding.h from a template, by running gfortran to identify
the values of the Fortran kind constants C_LONG_DOUBLE, C_FLOAT128,
and C_INT128_T. This caused multiple problems with search paths, both
for build-tree testing and installed-tree use, not all of which have
been fixed.

This patch reverts to a non-multilib-specific .h file that uses GCC's
predefined preprocessor symbols to detect the supported types and map
them to kind values in the same way as the Fortran front end.

2021-09-06 Sandra Loosemore <sandra@codesourcery.com>

libgfortran/
* ISO_Fortran_binding-1-tmpl.h: Deleted.
* ISO_Fortran_binding-2-tmpl.h: Deleted.
* ISO_Fortran_binding-3-tmpl.h: Deleted.
* ISO_Fortran_binding.h: New file to replace the above.
* Makefile.am (gfor_cdir): Remove MULTISUBDIR.
(ISO_Fortran_binding.h): Simplify to just copy the file.
* Makefile.in: Regenerated.
* mk-kinds-h.sh: Revert pieces no longer needed for
ISO_Fortran_binding.h.

rs6000: Expand fmod and remainder when built with fast-math [PR97142]

fmod/fmodf and remainder/remainderf can be expanded inline instead of emitting a
library call when building with fast-math, which is much faster.

fmodf:
     fdivs   f0,f1,f2
     friz    f0,f0
     fnmsubs f1,f2,f0,f1

remainderf:
     fdivs   f0,f1,f2
     frin    f0,f0
     fnmsubs f1,f2,f0,f1
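
The two sequences above compute x - y * q, where q is x / y rounded to an integer. A rough C equivalent follows, under stated assumptions: the function names are hypothetical, the cast through long stands in for friz/frin and is only valid while the quotient fits in a long, frin is assumed to round halfway cases away from zero, and the separate multiply and subtract replace the fused fnmsubs.

```c
/* fmodf-style expansion: quotient rounded toward zero (friz). */
float fmodf_sketch(float x, float y)
{
    float q = (float)(long)(x / y);
    return x - y * q;              /* fnmsubs computes this as one fused op */
}

/* remainderf-style expansion: quotient rounded to nearest (frin). */
float remainderf_sketch(float x, float y)
{
    float d = x / y;
    float q = (float)(long)(d < 0.0f ? d - 0.5f : d + 0.5f);
    return x - y * q;
}
```

Like the expansion itself, this ignores infinities, NaNs and very large quotients, which is why it is only suitable under fast-math assumptions.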

SPEC2017 Ofast P8LE: 511.povray_r +1.14%,  526.blender_r +1.72%

gcc/ChangeLog:

2021-09-07  Xionghu Luo  <luoxhu@linux.ibm.com>

PR target/97142
* config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
(remainder<mode>3): Likewise.

gcc/testsuite/ChangeLog:

2021-09-07  Xionghu Luo  <luoxhu@linux.ibm.com>

PR target/97142
* gcc.target/powerpc/pr97142.c: New test.

MIPS: add .module arch and ase to all output asm

Currently, the asm output file for MIPS carries no ISA revision info.
This can cause trouble, for example when:

  the assembler defaults to mips1,
  gcc defaults to fpxx.

To assemble the output of gcc -S, we then have to pass -mips2
to the assembler.

The same applies to CPUs that have extension instructions;
Octeon is an example.
So we can just add ".set arch=octeon".

If an ASE is enabled, .module ase will also be used.

gcc/ChangeLog:
* config/mips/mips.c (mips_file_start): add .module for
  arch and ase.

Daily bump.

Correct implementation of wi::clz

As diagnosed with Jakub and Richard in the analysis of PR 102134, the
current implementation of wi::clz has incorrect/inconsistent behaviour.
As mentioned by Richard in comment #7, clz should (always) return zero
for negative values, but the current implementation can only return 0
when precision is a multiple of HOST_BITS_PER_WIDE_INT. The fix is
simply to reorder/shuffle the existing tests.
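
The intended behaviour can be sketched on a single 64-bit word. This is a simplified, hypothetical model and not the wide-int.cc code: the value is assumed to be stored sign-extended from bit precision-1, and the function name is invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros of VALUE viewed as a PRECISION-bit number. */
int clz_sketch(int64_t value, unsigned precision)
{
    assert(precision >= 1 && precision <= 64);

    /* Test the sign first: a negative value has its top bit set at any
       precision, so it has no leading zeros. */
    if (value < 0)
        return 0;

    /* Only then count zero bits downward from bit PRECISION-1. */
    uint64_t bits = (uint64_t) value;
    int count = 0;
    for (unsigned i = precision; i-- > 0; ) {
        if ((bits >> i) & 1)
            break;
        count++;
    }
    return count;
}
```

Because the sign test comes before any precision arithmetic, the result for negative values is zero whether or not the precision is a multiple of the word size.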

2021-09-06 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
* wide-int.cc (wi::clz): Reorder tests to ensure the result
is zero for all negative values.

invoke.texi: Fix @opindex for -foffload-options

gcc/
* doc/invoke.texi (-foffload-options): Fix @opindex.

gcc_update: use human readable name for revision string in gcc/REVISION

contrib/Changelog:

* gcc_update: Derive human readable name for HEAD using git describe
like "git gcc-descr" with short commit hash. Drop "revision" from
gcc/REVISION.

x86: Add non-destructive source to @xorsign<mode>3_1

Add non-destructive source alternative to @xorsign<mode>3_1 for AVX.
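
As a reminder of what the pattern computes: xorsign (x, y) is
x * copysign (1.0, y), i.e. the sign bit of y XORed into x.  A scalar
sketch follows, assuming a 32-bit IEEE float; xorsign_sketch is a name
invented for this note.  It writes the result to a fresh variable and
leaves both inputs untouched, which is the property a non-destructive
source alternative provides at the register level.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

inline float xorsign_sketch(float x, float y)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    std::uint32_t xb, yb;
    std::memcpy(&xb, &x, sizeof xb);     // bit pattern of x
    std::memcpy(&yb, &y, sizeof yb);     // bit pattern of y
    std::uint32_t rb = xb ^ (yb & 0x80000000u);  // flip x's sign iff y is negative
    float r;
    std::memcpy(&r, &rb, sizeof r);
    return r;
}
```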

gcc/

PR target/89984
* config/i386/i386-expand.c (ix86_split_xorsign): Use operands[2].
* config/i386/i386.md (@xorsign<mode>3_1): Add non-destructive
source alternative for AVX.

gcc/testsuite/

PR target/89984
* gcc.target/i386/pr89984-1.c: New test.
* gcc.target/i386/pr89984-2.c: Likewise.
* gcc.target/i386/xorsign-avx.c: Likewise.

Avoid FROM being overwritten in expand_fix.

For the conversion from _Float16 to int, if the corresponding optab
does not exist, the compiler will try the wider mode (SFmode here),
but when floatsfsi exists but FAILs, FROM is overwritten, which
leads to the runtime error reported in the PR.

gcc/ChangeLog:

PR middle-end/102182
* optabs.c (expand_fix): Add from1 to avoid from being
overwritten.

gcc/testsuite/ChangeLog:

PR middle-end/102182
* gcc.target/i386/pr101282.c: New test.

'libgomp.c/target-43.c': '-latomic' for nvptx offloading

... to avoid a regression with recent
commit 090f0d78f194e3cda23fe904016db77ea36c38fa
"openmp: Improve expand_omp_atomic_pipeline":

    unresolved symbol __atomic_compare_exchange_1
    collect2: error: ld returned 1 exit status
    mkoffload: fatal error: [...]/gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status

libgomp/
* testsuite/libgomp.c/target-43.c: '-latomic' for nvptx offloading.

Fix debug info for packed array types in Ada

Packed array types are sometimes represented with integer types under the
hood in Ada, but we nevertheless need to emit them as array types in the
debug info so we have the types.get_array_descr_info langhook for this
purpose; but it is not invoked from modified_type_die, which causes:

FAIL: gdb.ada/arrayptr.exp: scenario=minimal: print pa_ptr.all
FAIL: gdb.ada/arrayptr.exp: scenario=minimal: print pa_ptr.all(3)

in the GDB testsuite.

gcc/
* dwarf2out.c (modified_type_die): Deal with all array types earlier
and use local variable consistently throughout the function.

match.pd: Fix up __builtin_*_overflow arg demotion [PR102207]

My earlier patch to demote arguments of __builtin_*_overflow unfortunately
caused a wrong-code regression. The builtins operate on infinite precision
arguments, outer_prec > inner_prec signed -> signed, unsigned -> unsigned
promotions there are just repeating the sign or 0s and can be demoted,
similarly unsigned -> signed which also is repeating 0s, but as the
testcase shows, signed -> unsigned promotions need to be preserved (unless
we'd know the inner arguments can't be negative), because for negative
numbers such promotion sets the outer_prec -> inner_prec bits to 1 but the
bits above that to 0 in the infinite precision.

So, the following patch avoids the demotions for the signed -> unsigned
promotions.
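
A small sketch of the problem case, using the GCC/Clang
__builtin_add_overflow builtin.  The two function names are hypothetical
and this is not the committed testcase; it only contrasts the promoted
call with what a demotion to int would compute.

```cpp
#include <cassert>

static_assert(sizeof(unsigned long long) > sizeof(int),
              "sketch assumes unsigned long long is wider than int");

// Arguments promoted signed -> wider unsigned before the builtin sees them.
inline bool add_overflows_promoted(int a, int b)
{
    unsigned long long ua = a;   // -1 becomes ULLONG_MAX: upper bits all 1
    unsigned long long ub = b;
    int result;
    // Infinite-precision sum of ua and ub, checked against the range of int.
    return __builtin_add_overflow(ua, ub, &result);
}

// What demoting the operands back to int would compute instead.
inline bool add_overflows_demoted(int a, int b)
{
    int result;
    return __builtin_add_overflow(a, b, &result);
}
```

For a = -1, b = 1 the promoted form adds ULLONG_MAX and 1, which does not
fit in int, while the demoted form adds -1 and 1 and reports no overflow.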

2021-09-06 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/102207
* match.pd: Don't demote operands of IFN_{ADD,SUB,MUL}_OVERFLOW if they
were promoted from signed to wider unsigned type.

* gcc.dg/pr102207.c: New test.

Fix PR tree-optimization/63184: add simplification of (& + A) != (& + B)

These two testcases have been failing since GCC 5 but things
have improved such that adding a simplification to match.pd
for this case is easier than before.
In the end we have the following IR:
....
  _5 = &a[1] + _4;
  _7 = &a + _13;
  if (_5 != _7)

So we can fold the _5 != _7 into:
(&a[1] - &a) + _4 != _13

The subtraction is folded into constant by ptr_difference_const.
In this case, the full expression gets folded into a constant
and we are able to remove the if statement.
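
The equivalence behind the fold can be sketched at source level as
follows.  The array and function names are made up, and offsets are in
elements here rather than in bytes as in the IR above.

```cpp
#include <cassert>
#include <cstddef>

static int a[16];

// The comparison as written: (&a[1] + i) != (&a[0] + j).
inline bool differs_by_pointers(std::size_t i, std::size_t j)
{
    assert(i + 1 <= 16 && j <= 16);  // stay within a, or one past its end
    return (&a[1] + i) != (&a[0] + j);
}

// The folded form: &a[1] - &a[0] is the constant 1, so only an integer
// comparison of the offsets remains.
inline bool differs_by_offsets(std::size_t i, std::size_t j)
{
    return (1 + i) != j;
}
```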

OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/63184
* match.pd: Add simplification of pointer_diff of two pointer_plus
with addr_expr in the first operand of each pointer_plus.
Add simplification of ne/eq of two pointer_plus with addr_expr
in the first operand of each pointer_plus.

gcc/testsuite/ChangeLog:

PR tree-optimization/63184
* c-c++-common/pr19807-2.c: Enable for all targets and remove the xfail.
* c-c++-common/pr19807-3.c: Likewise.

Explicitly add -msse2 to compile HF related libgcc source file.

For a 32-bit libgcc configured w/o sse2, there would be an error since
GCC only supports _Float16 under sse2. Explicitly add -msse2 for those
HF related libgcc functions, so users can still link them w/ the
above configuration.

libgcc/ChangeLog:

* Makefile.in: Adjust to support specific CFLAGS for each
libgcc source file.
* config/i386/64/t-softfp: Explicitly add -msse2 for HF
related libgcc source files.
* config/i386/t-softfp: Ditto.
* config/i386/_divhc3.c: New file.
* config/i386/_mulhc3.c: New file.

tree-optimization/102176 - locally compute participating SLP stmts

This performs local re-computation of participating scalar stmts
in BB vectorization subgraphs to allow precise computation of
liveness of scalar stmts after vectorization and thus precise
costing. This treats all extern defs as live but continues
to optimistically handle scalar defs that we think we can handle
by lane-extraction even though that can still fail late during
code-generation.

2021-09-02 Richard Biener <rguenther@suse.de>

PR tree-optimization/102176
* tree-vect-slp.c (vect_slp_gather_vectorized_scalar_stmts):
New function.
(vect_bb_slp_scalar_cost): Use the computed set of
vectorized scalar stmts instead of relying on the out-of-date
and not accurate PURE_SLP_STMT.
(vect_bb_vectorization_profitable_p): Compute the set
of vectorized scalar stmts.

Daily bump.

libgo: update to final Go 1.17 release

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/343729

Make the path solver's range_of_stmt() handle all statements.

The path solver's range_of_stmt() was handcuffed to only fold
GIMPLE_COND statements, since those were the only statements the
backward threader needed to resolve. However, there is no need for this
restriction, as the folding code is perfectly capable of folding any
statement.

This can be the case when trying to fold other statements in the final
block of a path (for instance, in the forward threader as it tries to
fold candidate statements along a path).

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::range_of_stmt): Remove
GIMPLE_COND special casing.
(path_range_query::range_defined_in_block): Use range_of_stmt
instead of calling fold_range directly.

Add an unreachable_path_p method to path_range_query.

Keeping track of unreachable calculations while traversing a path is
useful to determine edge reachability, among other things.  We've been
doing this ad-hoc in the backwards threader, so this provides a cleaner
way of accessing the information.

This patch also makes it easier to compare different threading
implementations, in some upcoming work.  For example, it's currently
difficult to gauge how well we're doing compared to the forward threader,
because it can thread paths that are obviously unreachable.  This
provides a way of discarding those paths.

Note that I've opted to keep unreachable_path_p() out-of-line, because I
have local changes that will enhance this method.

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::range_of_expr): Set
m_undefined_path when appropriate.
(path_range_query::internal_range_of_expr): Copy from range_of_expr.
(path_range_query::unreachable_path_p): New.
(path_range_query::precompute_ranges): Set m_undefined_path.
* gimple-range-path.h (path_range_query::unreachable_path_p): New.
(path_range_query::internal_range_of_expr): New.
* tree-ssa-threadbackward.c (back_threader::find_taken_edge_cond):
Use unreachable_path_p.

Clean up registering of paths in backwards threader.

All callers to maybe_register_path() call find_taken_edge() beforehand
and pass the edge as an argument. There's no reason to repeat this
at each call site.

This is a clean-up in preparation for some other enhancements to the
backwards threader.

Tested on x86-64 Linux.

gcc/ChangeLog:

* tree-ssa-threadbackward.c (back_threader::maybe_register_path):
Remove argument and call find_taken_edge.
(back_threader::resolve_phi): Do not calculate taken edge before
calling maybe_register_path.
(back_threader::find_paths_to_names): Same.

Improve handling of C bit for setcc insns

gcc/
* config/h8300/h8300.md (QHSI2 mode iterator): New mode iterator.
* config/h8300/testcompare.md (store_c): Update name, use new
QHSI2 iterator.
(store_neg_c, store_shifted_c): New patterns.

Daily bump.

rs6000: Don't use r12 for CR save on ELFv2 (PR102107)

CR is saved and/or restored on some paths where GPR12 is already live
since it has a meaning in the calling convention in the ELFv2 ABI.

It is not completely clear to me that we can always use r11 here, but
it does seem safe, there is checking code (to detect conflicts here),
and it is stage 1. So here goes.

2021-09-03 Segher Boessenkool <segher@kernel.crashing.org>

PR target/102107
* config/rs6000/rs6000-logue.c (rs6000_emit_prologue): On ELFv2 use r11
instead of r12 for CR save, in all cases.

coroutines: Support for debugging implementation state.

Some of the state that is associated with the implementation
is of interest to a user debugging a coroutine.  In particular
items such as the suspend point, promise object, and current
suspend point.

These variables live in the coroutine frame, but we can inject
proxies for them into the outermost bind expression of the
coroutine.  Such variables are automatically moved into the
coroutine frame (if they need to persist across a suspend
expression).  Placing the proxies thus allows the user to
inspect them by name in the debugger.

To implement this, we ensure that (at the outermost scope) the
frame entries are not mangled (coroutine frame variables are
usually mangled with scope nesting information so that they do
not clash).  We can safely avoid doing this for the outermost
scope so that we can map frame entries directly to the variables.

This is a partial contribution to debug support (PR 99215).

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (register_local_var_uses): Do not mangle
frame entries for the outermost scope.  Record the outer
scope as nesting depth 0.

coroutines: Add a helper for creating local vars.

This is primarily code factoring, but we take this opportunity
to rename some of the implementation variables (which we intend
to expose to debugging) so that they are in the implementation
namespace.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (coro_build_artificial_var): New.
(build_actor_fn): Use var builder, rename vars to use
implementation namespace.
(coro_rewrite_function_body): Likewise.
(morph_fn_to_coro): Likewise.

coroutines: Use DECL_VALUE_EXPR instead of rewriting vars.

Variables that need to persist over suspension expressions
must be preserved by being copied into the coroutine frame.

The initial implementations do this manually in the transform
code. However, that has various disadvantages - including
that the debug connections are lost between the original var
and the frame copy.

The revised implementation makes use of DECL_VALUE_EXPRs to
contain the frame offset expressions, so that the original
var names are preserved in the code.

This process is also applied to the function parms which are
always copied to the frame. In this case the decls need to be
copied since they are used in two different contexts during
the re-write (in the building of the ramp function, and in
the actor function itself).

This will assist in improvement of debugging (PR 99215).

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (transform_local_var_uses): Record
frame offset expressions as DECL_VALUE_EXPRs instead of
rewriting them.

Fix target/102173 ICE after error recovery

After the recent r12-3278-823685221de986a change, the testcase
gcc.target/aarch64/sve/acle/general-c/type_redef_1.c started
to ICE as the code was not ready for error_mark_node in the
type. This fixes that and the testcase now passes.

gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins.cc (register_vector_type):
Handle error_mark_node as the type of the type_decl.

Fix some GC issues in the aarch64 back-end.

I got some ICEs in my latest testing while running the libstdc++ testsuite.
I noticed the problem was connected to types; I had just touched the
builtins code, but nothing that could have caused this, so I looked for
some types/variables that were not being marked with GTY.

OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.c (struct aarch64_simd_type_info):
Mark with GTY.
(aarch64_simd_types): Likewise.
(aarch64_simd_intOI_type_node): Likewise.
(aarch64_simd_intCI_type_node): Likewise.
(aarch64_simd_intXI_type_node): Likewise.
* config/aarch64/aarch64.h (aarch64_fp16_type_node): Likewise.
(aarch64_fp16_ptr_type_node): Likewise.
(aarch64_bf16_type_node): Likewise.
(aarch64_bf16_ptr_type_node): Likewise.

Implement POINTER_DIFF_EXPR entry in range-op.

I've seen cases in the upcoming jump threader enhancements where we see
a difference of two pointers that are known to be equivalent, and yet we
fail to return 0 for the range. This is because we have no working
range-op entry for POINTER_DIFF_EXPR. The entry we currently have is
a mere placeholder to avoid ignoring POINTER_DIFF_EXPR's so
adjust_pointer_diff_expr() could get a whack at it here:

// def = __builtin_memchr (arg, 0, sz)
// n = def - arg
//
// The range for N can be narrowed to [0, PTRDIFF_MAX - 1].

This patch adds the relational magic to range-op, which we can just
steal from the minus_expr code.
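
At source level, the memchr idiom quoted in the comment above looks
roughly like this.  offset_of_first_nul is a hypothetical helper; the
point is only that when memchr finds the byte, def >= arg, so n cannot
be negative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns n = def - arg when a zero byte is found in the first SZ bytes,
// or -1 when it is not.
inline std::ptrdiff_t offset_of_first_nul(const char *arg, std::size_t sz)
{
    assert(arg != nullptr);
    const char *def = static_cast<const char *>(std::memchr(arg, 0, sz));
    if (def == nullptr)
        return -1;
    return def - arg;    // non-negative: def points into [arg, arg + sz)
}
```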

gcc/ChangeLog:

* range-op.cc (operator_minus::op1_op2_relation_effect): Abstract
out to...
(minus_op1_op2_relation_effect): ...here.
(class operator_pointer_diff): New.
(operator_pointer_diff::op1_op2_relation_effect): Call
minus_op1_op2_relation_effect.
(integral_table::integral_table): Add entry for POINTER_DIFF_EXPR.

c++: shortcut bad convs during overload resolution [PR101904]

In the context of overload resolution we have the notion of a "bad"
argument conversion, which is a conversion that "would be a permitted
with a bending of the language standards", and we handle such bad
conversions specially.  In particular, we rank a bad conversion as
better than no conversion but worse than a good conversion, and a bad
conversion doesn't necessarily make a candidate unviable.  With the
flag -fpermissive, we permit the situation where overload resolution
selects a candidate that contains a bad conversion (which we call a
non-strictly viable candidate).  And without the flag, the caller
of overload resolution usually issues a distinct permerror in this
situation instead.

One consequence of this de facto behavior is that in order to distinguish
a non-strictly viable candidate from an unviable candidate, if we
encounter a bad argument conversion during overload resolution we must
keep converting subsequent arguments because a subsequent conversion
could render the candidate unviable instead of just non-strictly viable.
But checking subsequent arguments can force template instantiations and
result in otherwise avoidable hard errors.  And in particular, all
'this' conversions are at worst bad, so this means the const/ref-qualifiers
of a member function can't be used to prune a candidate quickly, which
is the subject of the mentioned PR.

This patch tries to improve the situation without changing the de facto
output of add_candidates.  Specifically, when considering a candidate
during overload resolution this patch makes us shortcut argument
conversion checking upon encountering the first bad conversion
(tentatively marking the candidate as non-strictly viable, though it
could ultimately be unviable) under the assumption that we'll eventually
find a strictly viable candidate anyway (which renders moot the
distinction between non-strictly viable and unviable, since both are
worse than a strictly viable candidate).  If this assumption turns out
to be false, we'll fully reconsider the candidate under the de facto
behavior (without the shortcutting) so that all its conversions are
computed.

So in the best case (there's a strictly viable candidate), we avoid
some argument conversions and/or template argument deduction that may
cause a hard error.  In the worst case (there's no such candidate), we
have to redundantly consider some candidates twice.  (In a previous
version of the patch, to avoid this redundant checking I created a new
"deferred" conversion type that represents a conversion that is yet to
be computed, and instead of reconsidering a candidate I just realized
its deferred conversions.  But it doesn't seem this redundancy is a
significant performance issue to justify the added complexity of this
other approach.)

PR c++/101904

gcc/cp/ChangeLog:

* call.c (build_this_conversion): New function, split out from
add_function_candidate.
(add_function_candidate): New parameter shortcut_bad_convs.
Document it.  Use build_this_conversion.  Stop at the first bad
argument conversion when shortcut_bad_convs is true.
(add_template_candidate_real): New parameter shortcut_bad_convs.
Use build_this_conversion to check the 'this' conversion before
attempting deduction.  When the rejection reason code is
rr_bad_arg_conversion, pass -1 instead of 0 as the viable
parameter to add_candidate.  Pass 'convs' to add_candidate.
(add_template_candidate): New parameter shortcut_bad_convs.
(add_template_conv_candidate): Pass false as shortcut_bad_convs
to add_template_candidate_real.
(add_candidates): Prefer to shortcut bad conversions during
overload resolution under the assumption that we'll eventually
see a strictly viable candidate.  If this assumption turns out
to be false, re-process the non-strictly viable candidates
without shortcutting those bad conversions.

gcc/testsuite/ChangeLog:

* g++.dg/template/conv17.C: New test.

libgcc, soft-float: Fix strong_alias macro use for Darwin.

Darwin does not support strong symbol aliases and a work-
around is provided in sfp-machine.h where a second function
is created that simply calls the original. However this
needs the arguments to the synthesized function to track
the mode of the original function.

So the fix here is to match known floating point modes from
the incoming function and apply the one found to the new
function args.

The matching is highly specific to the current set of modes
and will need adjusting should more cases be added.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libgcc/ChangeLog:

* config/i386/sfp-machine.h (alias_HFtype, alias_SFtype,
alias_DFtype, alias_TFtype): New.
(ALIAS_SELECTOR): New.
(strong_alias): Use __typeof and a _Generic selector to
provide the type to the synthesized function.

Do not assume loop header threading in backward threader.

The registry's thread_through_all_blocks() has a may_peel_loop_headers
argument.  When refactoring the backward threader code, I removed this
argument for the local passthru method because it was always TRUE.  This
may not necessarily be true in the future, if the backward threader is
called from another context.  This patch removes the default definition,
in favor of an argument that is exactly the same as the identically
named function in tree-ssa-threadupdate.c.  I think this also makes it
less confusing when looking at both methods across the source base.

Tested on x86-64 Linux.

gcc/ChangeLog:

* tree-ssa-threadbackward.c (back_threader::thread_through_all_blocks):
Add may_peel_loop_headers.
(back_threader_registry::thread_through_all_blocks): Same.
(try_thread_blocks): Pass may_peel_loop_headers argument.
(pass_early_thread_jumps::execute): Same.

Abstract PHI and forwarder block checks in jump threader.

This patch abstracts out a couple common idioms in the forward
threader that I found useful while navigating the code base.

Tested on x86-64 Linux.

gcc/ChangeLog:

* tree-ssa-threadedge.c (has_phis_p): New.
(forwarder_block_p): New.
(potentially_threadable_block): Call forwarder_block_p.
(jump_threader::thread_around_empty_blocks): Call has_phis_p.
(jump_threader::thread_through_normal_block): Call
forwarder_block_p.

Improve backwards threader debugging dumps.

This patch adds debugging helpers to the backwards threader.  I have
also noticed that profitable_path_p() can bail early on paths that
cross loops and leave the dump of blocks incomplete.  Fixed as
well.

Unfortunately the new methods cannot be marked const, because we call
the solver's dump which is not const.  I believe this was because the
ranger dump calls m_cache.block_range().  This could probably use a
cleanup at a later time.

Tested on x86-64 Linux.

gcc/ChangeLog:

* tree-ssa-threadbackward.c (back_threader::dump): New.
(back_threader::debug): New.
(back_threader_profitability::profitable_path_p): Dump blocks
even if we are bailing early.

Dump reason why threads are being cancelled and abstract code.

We are inconsistent on dumping out reasons why a thread was canceled.
This makes debugging jump threading problems harder because paths can be
canceled with no reason given. This patch abstracts out the thread
canceling code and adds a reason for every cancellation.

Tested on x86-64 Linux.

gcc/ChangeLog:

* tree-ssa-threadupdate.c (cancel_thread): New.
(jump_thread_path_registry::thread_block_1): Use cancel_thread.
(jump_thread_path_registry::mark_threaded_blocks): Same.
(jump_thread_path_registry::register_jump_thread): Same.

c++: Avoid bogus -Wunused with recent change

My change to make limit_bad_template_recursion avoid instantiating members
of erroneous classes produced a bogus "used but not defined" warning for
23_containers/unordered_set/instantiation_neg.cc; it's not defined because
we decided not to instantiate it. So we need to suppress that warning.

gcc/cp/ChangeLog:

* pt.c (limit_bad_template_recursion): Suppress -Wunused for decls
we decide not to instantiate.

Fortran: Fix Bind(C) char-len check, add ptr-contiguous check

Add F2018, 18.3.6 (5), pointer + contiguous is not permitted
check for dummies in BIND(C) procs.

Fix misreading of F2018, 18.3.4/18.3.5 + 18.3.6 (5) regarding
character dummies passed as byte stream to a bind(C) dummy arg:
Per F2018, 18.3.1 only len=1 is interoperable (since F2003).
F2008 added 'constant expression' for vars (F2018, 18.3.4/18.3.5),
applicable to dummy args per F2018, C1554. I misread this such
that len > 1 is permitted if len is a constant expr.

While the latter would work as character len=1 a(10) and len=2 a(5)
have the same storage sequence and len is fixed, it is still invalid.
Hence, it is now rejected again.

gcc/fortran/ChangeLog:

* decl.c (gfc_verify_c_interop_param): Reject pointer with
CONTIGUOUS attributes as dummy arg. Reject character len > 1
when passed as byte stream.

gcc/testsuite/ChangeLog:

* gfortran.dg/bind_c_char_6.f90: Update dg-error.
* gfortran.dg/bind_c_char_7.f90: Likewise.
* gfortran.dg/bind_c_char_8.f90: Likewise.
* gfortran.dg/iso_c_binding_char_1.f90: Likewise.
* gfortran.dg/pr32599.f03: Likewise.
* gfortran.dg/bind_c_char_9.f90: Comment testcase bits which are
implementable but not valid F2018.
* gfortran.dg/bind_c_contiguous.f90: New test.

Avoid using unavailable objects in jt_state.

The jump threading state is about to get more interesting, and it may
be used with a ranger or with the const_copies/etc helpers. This patch
makes sure we have an object before we attempt to call push_marker or
pop_to_marker.

Tested on x86-64 Linux.

gcc/ChangeLog:

* tree-ssa-threadedge.c (jt_state::push): Only call methods for
which objects are available.
(jt_state::pop): Same.
(jt_state::register_equiv): Same.
(jt_state::register_equivs_on_edge): Same.

Do not release state location until after path registry.

We are popping state and then calling the registry code.  This causes
the registry to have incorrect information.  This isn't visible in
current trunk, but will be an issue when I submit further enhancements
to the threading code.  However, it is a cleanup on its own so I am
pushing it now.

Tested on x86-64 Linux.

gcc/ChangeLog:

* tree-ssa-threadedge.c (jump_threader::thread_across_edge):
Move pop until after a thread is registered.

Add debug helper for jump thread paths.

Tested on x86-64 Linux.

gcc/ChangeLog:

* tree-ssa-threadupdate.c (debug): New.

RAII class to change dump_file.

The function dump_ranger() shows everything the ranger knows at the
current time.  To do this, we tickle all the statements to force ranger
to provide as much information as possible.  During this process, the
relation code will dump status out to the dump_file, whereas in
dump_ranger, we want to dump it out to a specific file (most likely
stderr).  This patch changes the dump_file through the life of
dump_ranger() and resets it when it's done.
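
The save-and-restore idiom can be sketched as below.  This is not the
code added to gimple-range-trace.cc: dump_file_sketch stands in for the
real dump_file global, and scoped_dump_file and the helper function are
names invented for this note.

```cpp
#include <cassert>
#include <cstdio>

static std::FILE *dump_file_sketch = nullptr;   // stand-in for dump_file

// Swap the dump stream for the lifetime of the object, restore on exit.
class scoped_dump_file
{
public:
    explicit scoped_dump_file(std::FILE *f) : m_saved(dump_file_sketch)
    {
        dump_file_sketch = f;
    }
    ~scoped_dump_file() { dump_file_sketch = m_saved; }
    scoped_dump_file(const scoped_dump_file &) = delete;
    scoped_dump_file &operator=(const scoped_dump_file &) = delete;

private:
    std::FILE *m_saved;   // the stream that was active before the swap
};

// True if the stream was redirected inside the scope and restored after it.
inline bool dump_to_stderr_then_restore()
{
    std::FILE *before = dump_file_sketch;
    bool redirected;
    {
        scoped_dump_file guard(stderr);
        redirected = (dump_file_sketch == stderr);
    }   // guard's destructor restores the previous stream here
    return redirected && dump_file_sketch == before;
}
```

Because the restore happens in the destructor, every exit path from the
scope puts the original stream back.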

This patch only affects dump/debugging code.

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-trace.cc (push_dump_file::push_dump_file): New.
(push_dump_file::~push_dump_file): New.
(dump_ranger): Change dump_file temporarily while dumping
ranger.
* gimple-range-trace.h (class push_dump_file): New.

Add function name when dumping ranger contents.

These are minor cleanups to the dumping code.

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-trace.cc (debug_seed_ranger): Remove static.
(dump_ranger): Dump function name.

Use non-null knowledge in path_range_query.

This patch improves ranges for pointers we are interested in along a path, by
using the non-null class from the ranger. This allows us to thread more
paths with minimal effort.

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::range_defined_in_block):
Adjust for non-null.
(path_range_query::adjust_for_non_null_uses): New.
(path_range_query::precompute_ranges): Call
adjust_for_non_null_uses.
* gimple-range-path.h: Add m_non_null and
adjust_for_non_null_uses.

Improve path_range_query dumps.

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::dump): Dump path
length.
(path_range_query::precompute_ranges): Dump entire path.

Implement relation_oracle::debug.

Tested on x86-64 Linux.

gcc/ChangeLog:

* value-relation.cc (relation_oracle::debug): New.
* value-relation.h (relation_oracle::debug): New.

Remove unnecessary include from tree-ssa-loop-ch.c

Tested on x86-64 Linux.

gcc/ChangeLog:

* tree-ssa-loop-ch.c: Remove unnecessary include file.

Skip statements with no BB in ranger.

The function postfold_gcond_edges() registers relations coming out of a
GIMPLE_COND.  With upcoming changes, we may be called with statements
not in the IL (for example, dummy statements created by the
forward threader).  This patch avoids breakage by exiting if the
statement does not have a defining basic block.  There is a similar
change to the path solver.

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-fold.cc (fold_using_range::postfold_gcond_edges):
Skip statements with no defining BB.
* gimple-range-path.cc (path_range_query::range_defined_in_block):
Do not get confused by statements with no defining BB.

Improve support for IMAGPART_EXPR and REALPART_EXPR in ranger.

Currently we adjust statements containing an IMAGPART_EXPR if the
defining statement was one of a few built-ins known to return boolean
types. We can also adjust statements for both IMAGPART_EXPR and
REALPART_EXPR where the defining statement is a constant.

This patch adds such support, and cleans up the code a bit.

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-fold.cc (adjust_imagpart_expr): Move from
gimple_range_adjustment. Add support for constants.
(adjust_realpart_expr): New.
(gimple_range_adjustment): Move IMAGPART_EXPR code to
adjust_imagpart_expr.
* range-op.cc (integral_table::integral_table): Add entry for
REALPART_CST.

libgomp.*/error-1.{c,f90}: Fix dg-output newline pattern

libgomp/ChangeLog:

* testsuite/libgomp.c-c++-common/error-1.c: Use \r\n not \n\r in
dg-output.
* testsuite/libgomp.fortran/error-1.f90: Likewise.

Improve compatibility of -fdump-ada-spec with warnings

This makes sure that the style and warning settings used in the
C/C++ bindings generated by -fdump-ada-spec do not leak into the
units that use them.

gcc/c-family/
* c-ada-spec.c (dump_ads): Generate pragmas to disable style checks
and -gnatwu warning for the package specification.

openmp: Improve expand_omp_atomic_pipeline

When __atomic_* builtins were introduced, omp-expand.c (omp-low.c
at that point) has been adjusted in several spots so that it uses
the atomic builtins instead of sync builtins, but
expand_omp_atomic_pipeline has not because the __atomic_compare_exchange_*
APIs take address of the argument, so it kept using __sync_val_compare_swap_*.
That means it always uses seq_cst though.
This patch changes it to use the ATOMIC_COMPARE_EXCHANGE ifn which gimple-fold
folds __atomic_compare_exchange_* into - that ifn also passes expected
directly.
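
The difference between the two builtin families can be seen in a
user-level compare-and-swap loop.  The sketch below uses the documented
GCC builtins; the function names and the doubling operation are made up,
and it is not what omp-expand.c emits.

```cpp
#include <cassert>

// __sync style: expected value passed by value, always sequentially
// consistent.
inline int atomic_double_sync(int *p)
{
    assert(p != nullptr);
    int old_val = __atomic_load_n(p, __ATOMIC_RELAXED);
    for (;;) {
        int seen = __sync_val_compare_and_swap(p, old_val, old_val * 2);
        if (seen == old_val)
            return old_val * 2;      // swap succeeded
        old_val = seen;              // retry with the value actually seen
    }
}

// __atomic style: expected value passed by address, memory order chosen
// by the caller (relaxed here).
inline int atomic_double_relaxed(int *p)
{
    assert(p != nullptr);
    int expected = __atomic_load_n(p, __ATOMIC_RELAXED);
    while (!__atomic_compare_exchange_n(p, &expected, expected * 2,
                                        /*weak=*/false,
                                        __ATOMIC_RELAXED, __ATOMIC_RELAXED)) {
        // On failure the builtin stores the current value into expected.
    }
    return expected * 2;
}
```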

2021-09-03 Jakub Jelinek <jakub@redhat.com>

* omp-expand.c (expand_omp_atomic_pipeline): Use
IFN_ATOMIC_COMPARE_EXCHANGE instead of
BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_? so that memory order
can be provided.

c++, abi: Set DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD on C++ zero width bitfields [PR102024]

The removal of remove_zero_width_bitfields function and its call from
C++ FE layout_class_type (which I've done in the P0466R5
layout-compatible helper intrinsics patch, so that the FE can actually
determine what is and isn't layout-compatible according to the spec)
unfortunately changed the ABI on various platforms.
The C FE has been keeping zero-width bitfields in the types, while
the C++ FE has been removing them after structure layout, so in various
cases when passing such structures in registers we had different ABI
between C and C++.

While both the C and C++ FE had some code to remove zero width bitfields
after structure layout, in both FEs it was buggy and didn't really remove
any.  In the C FE that code has been removed later on, while in the C++ FE
for GCC 4.5 in PR42217 it has been actually fixed, so the C++ FE started
to remove those bitfields.

The following patch doesn't change anything ABI-wise, but allows the
targets to decide what to do, emit -Wpsabi warnings etc.
Non-C zero width bitfields will be seen by the backends as normal
zero width bitfields, C++ zero width bitfields that used to be previously
removed will have DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD flag set.
I've reused the DECL_FIELD_ABI_IGNORED flag which is only used on non-bitfield
FIELD_DECLs right now, but the macros now check DECL_BIT_FIELD flag.

Each backend can then decide what it wants, whether it wants to keep
different ABI between C and C++ as in GCC 11 and older (i.e. incompatible
with G++ <= 4.4, compatible with G++ 4.5 .. 11; for that it would
ignore for the aggregate passing/returning decisions all
DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD FIELD_DECLs), whether it wants to never
ignore zero width bitfields (no changes needed for that case, except perhaps
-Wpsabi warning should be added and for that DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD
can be tested), or whether it wants to always ignore zero width bitfields
(I think e.g. riscv in GCC 10+ does that).

All this patch does is set the flag which the backends can then use.
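
For readers unfamiliar with the construct, here is a sketch of the kind
of type involved.  The struct and function names are made up, and the
check assumes a target where int and float share size and alignment.
There the zero-width bitfield changes neither size nor field offsets;
what can differ is how a backend classifies the aggregate for passing in
registers, which this sketch does not attempt to show.

```cpp
#include <cassert>
#include <cstddef>

struct plain_pair      { float a; float b; };
struct with_zero_width { float a; int : 0; float b; };  // zero-width bitfield

// Compare the in-memory layout of the two structs.
inline bool same_size_and_offsets()
{
    return sizeof(plain_pair) == sizeof(with_zero_width)
        && offsetof(plain_pair, b) == offsetof(with_zero_width, b);
}
```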

2021-09-03  Jakub Jelinek  <jakub@redhat.com>

PR target/102024
gcc/
* tree.h (DECL_FIELD_ABI_IGNORED): Changed into rvalue only macro
that is false if DECL_BIT_FIELD.
(SET_DECL_FIELD_ABI_IGNORED, DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD,
SET_DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD): Define.
* tree-streamer-out.c (pack_ts_decl_common_value_fields): For
DECL_BIT_FIELD stream DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD instead
of DECL_FIELD_ABI_IGNORED.
* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Use
SET_DECL_FIELD_ABI_IGNORED instead of writing to
DECL_FIELD_ABI_IGNORED and for DECL_BIT_FIELD use
SET_DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD instead.
* lto-streamer-out.c (hash_tree): For DECL_BIT_FIELD hash
DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD instead of DECL_FIELD_ABI_IGNORED.
gcc/cp/
* class.c (build_base_field): Use SET_DECL_FIELD_ABI_IGNORED
instead of writing to DECL_FIELD_ABI_IGNORED.
(layout_class_type): Likewise.  In the place where zero-width
bitfields used to be removed, use
SET_DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD on those fields instead.
gcc/lto/
* lto-common.c (compare_tree_sccs_1): Also compare
DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD values.

Remove macro check for __AMX_BF16/INT8/TILE__ in header file.

gcc/ChangeLog:

PR target/102166
* config/i386/amxbf16intrin.h : Remove macro check for __AMX_BF16__.
* config/i386/amxint8intrin.h : Remove macro check for __AMX_INT8__.
* config/i386/amxtileintrin.h : Remove macro check for __AMX_TILE__.

gcc/testsuite/ChangeLog:

PR target/102166
* g++.target/i386/pr102166.C: New test.

Daily bump.

libgfortran: Further fixes for GFC/CFI descriptor conversions.

This patch is for:
PR100907 - Bind(c): failure handling wide character
PR100911 - Bind(c): failure handling C_PTR
PR100914 - Bind(c): errors handling complex
PR100915 - Bind(c): failure handling C_FUNPTR
PR100917 - Bind(c): errors handling long double real

All of these problems are related to the GFC descriptors constructed
by the Fortran front end containing ambiguous or incomplete
information.  This patch does not attempt to change the GFC data
structure or the front end, and only makes the runtime interpret it in
more reasonable ways.  It's not a complete fix for any of the listed
issues.

The Fortran front end does not distinguish between C_PTR and
C_FUNPTR, mapping both onto BT_VOID.  That is what this patch does also.

The other bugs are related to GFC descriptors only containing elem_len
and not kind.  For complex types, the elem_len needs to be divided by
2 and then mapped onto a real kind.  On x86 targets, the kind
corresponding to C long double differs from its elem_len; since
we cannot reliably distinguish a 16-byte kind 10 long double from
__float128, this patch arbitrarily prefers to interpret that as
the standard long double type rather than the GNU extension.
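
The mapping described in this paragraph can be sketched roughly as below.
This is not the libgfortran code: the function name, its signature and the
-1 return value are assumptions made for illustration, and only the sizes
mentioned above for an x86-64 target are handled.

```cpp
#include <cstddef>

// Hypothetical helper: map a descriptor's elem_len to a Fortran real kind
// under the x86-64 assumptions stated in the text.
int real_kind_from_elem_len (std::size_t elem_len, bool is_complex)
{
  if (is_complex)
    elem_len /= 2;       // a complex value stores two reals of the same kind
  switch (elem_len)
    {
    case 4:  return 4;
    case 8:  return 8;
    case 16: return 10;  // ambiguous with __float128; prefer long double
    default: return -1;  // not handled in this sketch
    }
}
```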

Similarly, for character types, the GFC descriptor cannot distinguish
between character(kind=c_char, len=4) and character(kind=ucs4, len=1).
But since the front end currently rejects anything other than len=1
(PR92482) this patch uses the latter interpretation.

2021-09-01  Sandra Loosemore  <sandra@codesourcery.com>
    José Rui Faustino de Sousa  <jrfsousa@gmail.com>

gcc/testsuite/
PR fortran/100911
PR fortran/100915
PR fortran/100916
* gfortran.dg/PR100911.c: New file.
* gfortran.dg/PR100911.f90: New file.
* gfortran.dg/PR100914.c: New file.
* gfortran.dg/PR100914.f90: New file.
* gfortran.dg/PR100915.c: New file.
* gfortran.dg/PR100915.f90: New file.

libgfortran/
PR fortran/100907
PR fortran/100911
PR fortran/100914
PR fortran/100915
PR fortran/100917
* ISO_Fortran_binding-1-tmpl.h (CFI_type_cfunptr): Make equivalent
to CFI_type_cptr.
* runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc): Fix
handling of CFI_type_cptr and CFI_type_cfunptr.  Additional error
checking and code cleanup.
(gfc_desc_to_cfi_desc): Likewise.  Also correct kind mapping
for character, complex, and long double types.

Fortran: TS 29113 testsuite

Add tests to exercise features added to Fortran via TS 29113, "Further
Interoperability of Fortran with C":

https://wg5-fortran.org/N1901-N1950/N1942.pdf

2021-09-01 Sandra Loosemore <sandra@codesourcery.com>

gcc/testsuite/
* gfortran.dg/c-interop/allocatable-dummy-c.c: New file.
* gfortran.dg/c-interop/allocatable-dummy.f90: New file.
* gfortran.dg/c-interop/allocatable-optional-pointer.f90: New file.
* gfortran.dg/c-interop/allocate-c.c: New file.
* gfortran.dg/c-interop/allocate-errors-c.c: New file.
* gfortran.dg/c-interop/allocate-errors.f90: New file.
* gfortran.dg/c-interop/allocate.f90: New file.
* gfortran.dg/c-interop/argument-association-assumed-rank-1.f90:
New file.
* gfortran.dg/c-interop/argument-association-assumed-rank-2.f90:
New file.
* gfortran.dg/c-interop/argument-association-assumed-rank-3.f90:
New file.
* gfortran.dg/c-interop/argument-association-assumed-rank-4.f90:
New file.
* gfortran.dg/c-interop/argument-association-assumed-rank-5.f90:
New file.
* gfortran.dg/c-interop/argument-association-assumed-rank-6.f90:
New file.
* gfortran.dg/c-interop/argument-association-assumed-rank-7.f90:
New file.
* gfortran.dg/c-interop/argument-association-assumed-rank-8.f90:
New file.
* gfortran.dg/c-interop/assumed-type-dummy.f90: New file.
* gfortran.dg/c-interop/c-interop.exp: New file.
* gfortran.dg/c-interop/c1255-1.f90: New file.
* gfortran.dg/c-interop/c1255-2.f90: New file.
* gfortran.dg/c-interop/c1255a.f90: New file.
* gfortran.dg/c-interop/c407a-1.f90: New file.
* gfortran.dg/c-interop/c407a-2.f90: New file.
* gfortran.dg/c-interop/c407b-1.f90: New file.
* gfortran.dg/c-interop/c407b-2.f90: New file.
* gfortran.dg/c-interop/c407c-1.f90: New file.
* gfortran.dg/c-interop/c516.f90: New file.
* gfortran.dg/c-interop/c524a.f90: New file.
* gfortran.dg/c-interop/c535a-1.f90: New file.
* gfortran.dg/c-interop/c535a-2.f90: New file.
* gfortran.dg/c-interop/c535b-1.f90: New file.
* gfortran.dg/c-interop/c535b-2.f90: New file.
* gfortran.dg/c-interop/c535b-3.f90: New file.
* gfortran.dg/c-interop/c535c-1.f90: New file.
* gfortran.dg/c-interop/c535c-2.f90: New file.
* gfortran.dg/c-interop/c535c-3.f90: New file.
* gfortran.dg/c-interop/c535c-4.f90: New file.
* gfortran.dg/c-interop/cf-descriptor-1-c.c: New file.
* gfortran.dg/c-interop/cf-descriptor-1.f90: New file.
* gfortran.dg/c-interop/cf-descriptor-2-c.c: New file.
* gfortran.dg/c-interop/cf-descriptor-2.f90: New file.
* gfortran.dg/c-interop/cf-descriptor-3-c.c: New file.
* gfortran.dg/c-interop/cf-descriptor-3.f90: New file.
* gfortran.dg/c-interop/cf-descriptor-4-c.c: New file.
* gfortran.dg/c-interop/cf-descriptor-4.f90: New file.
* gfortran.dg/c-interop/cf-descriptor-5-c.c: New file.
* gfortran.dg/c-interop/cf-descriptor-5.f90: New file.
* gfortran.dg/c-interop/cf-descriptor-6-c.c: New file.
* gfortran.dg/c-interop/cf-descriptor-6.f90: New file.
* gfortran.dg/c-interop/cf-descriptor-7-c.c: New file.
* gfortran.dg/c-interop/cf-descriptor-7.f90: New file.
* gfortran.dg/c-interop/cf-descriptor-8-c.c: New file.
* gfortran.dg/c-interop/cf-descriptor-8.f90: New file.
* gfortran.dg/c-interop/cf-out-descriptor-1-c.c: New file.
* gfortran.dg/c-interop/cf-out-descriptor-1.f90: New file.
* gfortran.dg/c-interop/cf-out-descriptor-2-c.c: New file.
* gfortran.dg/c-interop/cf-out-descriptor-2.f90: New file.
* gfortran.dg/c-interop/cf-out-descriptor-3-c.c: New file.
* gfortran.dg/c-interop/cf-out-descriptor-3.f90: New file.
* gfortran.dg/c-interop/cf-out-descriptor-4-c.c: New file.
* gfortran.dg/c-interop/cf-out-descriptor-4.f90: New file.
* gfortran.dg/c-interop/cf-out-descriptor-5-c.c: New file.
* gfortran.dg/c-interop/cf-out-descriptor-5.f90: New file.
* gfortran.dg/c-interop/cf-out-descriptor-6-c.c: New file.
* gfortran.dg/c-interop/cf-out-descriptor-6.f90: New file.
* gfortran.dg/c-interop/contiguous-1-c.c: New file.
* gfortran.dg/c-interop/contiguous-1.f90: New file.
* gfortran.dg/c-interop/contiguous-2-c.c: New file.
* gfortran.dg/c-interop/contiguous-2.f90: New file.
* gfortran.dg/c-interop/contiguous-3-c.c: New file.
* gfortran.dg/c-interop/contiguous-3.f90: New file.
* gfortran.dg/c-interop/deferred-character-1.f90: New file.
* gfortran.dg/c-interop/deferred-character-2.f90: New file.
* gfortran.dg/c-interop/dump-descriptors.c: New file.
* gfortran.dg/c-interop/dump-descriptors.h: New file.
* gfortran.dg/c-interop/establish-c.c: New file.
* gfortran.dg/c-interop/establish-errors-c.c: New file.
* gfortran.dg/c-interop/establish-errors.f90: New file.
* gfortran.dg/c-interop/establish.f90: New file.
* gfortran.dg/c-interop/explicit-interface.f90: New file.
* gfortran.dg/c-interop/fc-descriptor-1-c.c: New file.
* gfortran.dg/c-interop/fc-descriptor-1.f90: New file.
* gfortran.dg/c-interop/fc-descriptor-2-c.c: New file.
* gfortran.dg/c-interop/fc-descriptor-2.f90: New file.
* gfortran.dg/c-interop/fc-descriptor-3-c.c: New file.
* gfortran.dg/c-interop/fc-descriptor-3.f90: New file.
* gfortran.dg/c-interop/fc-descriptor-4-c.c: New file.
* gfortran.dg/c-interop/fc-descriptor-4.f90: New file.
* gfortran.dg/c-interop/fc-descriptor-5-c.c: New file.
* gfortran.dg/c-interop/fc-descriptor-5.f90: New file.
* gfortran.dg/c-interop/fc-descriptor-6-c.c: New file.
* gfortran.dg/c-interop/fc-descriptor-6.f90: New file.
* gfortran.dg/c-interop/fc-descriptor-7-c.c: New file.
* gfortran.dg/c-interop/fc-descriptor-7.f90: New file.
* gfortran.dg/c-interop/fc-descriptor-8-c.c: New file.
* gfortran.dg/c-interop/fc-descriptor-8.f90: New file.
* gfortran.dg/c-interop/fc-descriptor-9-c.c: New file.
* gfortran.dg/c-interop/fc-descriptor-9.f90: New file.
* gfortran.dg/c-interop/fc-out-descriptor-1-c.c: New file.
* gfortran.dg/c-interop/fc-out-descriptor-1.f90: New file.
* gfortran.dg/c-interop/fc-out-descriptor-2-c.c: New file.
* gfortran.dg/c-interop/fc-out-descriptor-2.f90: New file.
* gfortran.dg/c-interop/fc-out-descriptor-3-c.c: New file.
* gfortran.dg/c-interop/fc-out-descriptor-3.f90: New file.
* gfortran.dg/c-interop/fc-out-descriptor-4-c.c: New file.
* gfortran.dg/c-interop/fc-out-descriptor-4.f90: New file.
* gfortran.dg/c-interop/fc-out-descriptor-5-c.c: New file.
* gfortran.dg/c-interop/fc-out-descriptor-5.f90: New file.
* gfortran.dg/c-interop/fc-out-descriptor-6-c.c: New file.
* gfortran.dg/c-interop/fc-out-descriptor-6.f90: New file.
* gfortran.dg/c-interop/fc-out-descriptor-7-c.c: New file.
* gfortran.dg/c-interop/fc-out-descriptor-7.f90: New file.
* gfortran.dg/c-interop/ff-descriptor-1.f90: New file.
* gfortran.dg/c-interop/ff-descriptor-2.f90: New file.
* gfortran.dg/c-interop/ff-descriptor-3.f90: New file.
* gfortran.dg/c-interop/ff-descriptor-4.f90: New file.
* gfortran.dg/c-interop/ff-descriptor-5.f90: New file.
* gfortran.dg/c-interop/ff-descriptor-6.f90: New file.
* gfortran.dg/c-interop/ff-descriptor-7.f90: New file.
* gfortran.dg/c-interop/note-5-3.f90: New file.
* gfortran.dg/c-interop/note-5-4-c.c: New file.
* gfortran.dg/c-interop/note-5-4.f90: New file.
* gfortran.dg/c-interop/optional-c.c: New file.
* gfortran.dg/c-interop/optional.f90: New file.
* gfortran.dg/c-interop/rank-class.f90: New file.
* gfortran.dg/c-interop/rank.f90: New file.
* gfortran.dg/c-interop/removed-restrictions-1.f90: New file.
* gfortran.dg/c-interop/removed-restrictions-2.f90: New file.
* gfortran.dg/c-interop/removed-restrictions-3.f90: New file.
* gfortran.dg/c-interop/removed-restrictions-4.f90: New file.
* gfortran.dg/c-interop/section-1-c.c: New file.
* gfortran.dg/c-interop/section-1.f90: New file.
* gfortran.dg/c-interop/section-1p.f90: New file.
* gfortran.dg/c-interop/section-2-c.c: New file.
* gfortran.dg/c-interop/section-2.f90: New file.
* gfortran.dg/c-interop/section-2p.f90: New file.
* gfortran.dg/c-interop/section-3-c.c: New file.
* gfortran.dg/c-interop/section-3.f90: New file.
* gfortran.dg/c-interop/section-3p.f90: New file.
* gfortran.dg/c-interop/section-4-c.c: New file.
* gfortran.dg/c-interop/section-4.f90: New file.
* gfortran.dg/c-interop/section-errors-c.c: New file.
* gfortran.dg/c-interop/section-errors.f90: New file.
* gfortran.dg/c-interop/select-c.c: New file.
* gfortran.dg/c-interop/select-errors-c.c: New file.
* gfortran.dg/c-interop/select-errors.f90: New file.
* gfortran.dg/c-interop/select.f90: New file.
* gfortran.dg/c-interop/setpointer-c.c: New file.
* gfortran.dg/c-interop/setpointer-errors-c.c: New file.
* gfortran.dg/c-interop/setpointer-errors.f90: New file.
* gfortran.dg/c-interop/setpointer.f90: New file.
* gfortran.dg/c-interop/shape.f90: New file.
* gfortran.dg/c-interop/size.f90: New file.
* gfortran.dg/c-interop/tkr.f90: New file.
* gfortran.dg/c-interop/typecodes-array-basic-c.c: New file.
* gfortran.dg/c-interop/typecodes-array-basic.f90: New file.
* gfortran.dg/c-interop/typecodes-array-char-c.c: New file.
* gfortran.dg/c-interop/typecodes-array-char.f90: New file.
* gfortran.dg/c-interop/typecodes-array-float128-c.c: New file.
* gfortran.dg/c-interop/typecodes-array-float128.f90: New file.
* gfortran.dg/c-interop/typecodes-array-int128-c.c: New file.
* gfortran.dg/c-interop/typecodes-array-int128.f90: New file.
* gfortran.dg/c-interop/typecodes-array-longdouble-c.c: New file.
* gfortran.dg/c-interop/typecodes-array-longdouble.f90: New file.
* gfortran.dg/c-interop/typecodes-sanity-c.c: New file.
* gfortran.dg/c-interop/typecodes-sanity.f90: New file.
* gfortran.dg/c-interop/typecodes-scalar-basic-c.c: New file.
* gfortran.dg/c-interop/typecodes-scalar-basic.f90: New file.
* gfortran.dg/c-interop/typecodes-scalar-float128-c.c: New file.
* gfortran.dg/c-interop/typecodes-scalar-float128.f90: New file.
* gfortran.dg/c-interop/typecodes-scalar-int128-c.c: New file.
* gfortran.dg/c-interop/typecodes-scalar-int128.f90: New file.
* gfortran.dg/c-interop/typecodes-scalar-longdouble-c.c: New file.
* gfortran.dg/c-interop/typecodes-scalar-longdouble.f90: New file.
* gfortran.dg/c-interop/ubound.f90: New file.
* lib/target-supports.exp
(check_effective_target_fortran_real_c_float128): New function.

libstdc++: Implement std::atomic<T*>::compare_exchange_weak

For some reason r170217 didn't add compare_exchange_weak to the
__atomic_base<T*> partial specialization, and so weak compare exchange
operations on pointers use compare_exchange_strong instead.

This adds __atomic_base<T*>::compare_exchange_weak and then uses it in
std::atomic<T*>::compare_exchange_weak.
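
A typical user of the pointer overload is a retry loop, where spurious
failure of the weak form is harmless.  The following is only a sketch;
the node type, the head variable and push are hypothetical names and do
not come from the libstdc++ sources or tests.

```cpp
#include <atomic>

struct node
{
  int value;
  node *next;
};

std::atomic<node *> head{nullptr};

// Push onto a singly linked list; retry if the CAS fails, including spuriously.
void push (node *n)
{
  n->next = head.load (std::memory_order_relaxed);
  while (!head.compare_exchange_weak (n->next, n,
                                      std::memory_order_release,
                                      std::memory_order_relaxed))
    ;  // on failure n->next has been updated to the current head
}
```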

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h (__atomic_base<P*>::compare_exchange_weak):
Add new functions.
* include/std/atomic (atomic<T*>::compare_exchange_weak): Use
it.

libstdc++: Tweak whitespace in <atomic>

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/std/atomic: Tweak whitespace.

libstdc++: Remove "no stronger" assertion in compare exchange [PR102177]

P0418R2 removed some preconditions from std::atomic::compare_exchange_*
but we still enforce them via __glibcxx_assert. This removes those
assertions.
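
For illustration, a call of the following shape passes a failure order
that is stronger than the success order, which P0418R2 allows.  The
variable and function names are hypothetical, and a compiler may still
choose to diagnose or strengthen the orders when lowering the operation.

```cpp
#include <atomic>

std::atomic<int> counter{0};

// Try to replace `expected` with `desired`.  The failure order (acquire) is
// stronger than the success order (relaxed).
bool try_update (int expected, int desired)
{
  return counter.compare_exchange_strong (expected, desired,
                                          std::memory_order_relaxed,
                                          std::memory_order_acquire);
}
```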

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR c++/102177
* include/bits/atomic_base.h (__is_valid_cmpexch_failure_order):
New function to check if a memory order is valid for the failure
case of compare exchange operations.
(__atomic_base<I>::compare_exchange_weak): Simplify assertions
by using __is_valid_cmpexch_failure_order.
(__atomic_base<I>::compare_exchange_strong): Likewise.
(__atomic_base<P*>::compare_exchange_weak): Likewise.
(__atomic_base<P*>::compare_exchange_strong): Likewise.
(__atomic_impl::compare_exchange_weak): Add assertion.
(__atomic_impl::compare_exchange_strong): Likewise.
* include/std/atomic (atomic::compare_exchange_weak): Likewise.
(atomic::compare_exchange_strong): Likewise.

libstdc++: Define std::invoke_r for C++23 (P2136R3)

We already supported this feature as std::__invoke<R>, for internal use.
This just adds a public version of it to <functional>.

Internal uses should continue to include <bits/invoke.h> and use
std::__invoke<R> so that they don't need to include all of <functional>.
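
A minimal usage sketch for user code follows.  The functions twice and
call_as_long are hypothetical; the fallback branch is only there so the
example also builds with a library or language mode that does not define
__cpp_lib_invoke_r.

```cpp
#include <functional>

int twice (int x) { return 2 * x; }

// Call twice() but force the result type to long.
long call_as_long (int x)
{
#if defined(__cpp_lib_invoke_r)
  return std::invoke_r<long> (twice, x);
#else
  // Fallback when std::invoke_r is not available.
  return static_cast<long> (std::invoke (twice, x));
#endif
}
```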

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/std/functional (invoke_r): Define.
* include/std/version (__cpp_lib_invoke_r): Define.
* testsuite/20_util/function_objects/invoke/version.cc: Check
for __cpp_lib_invoke_r as well as __cpp_lib_invoke.
* testsuite/20_util/function_objects/invoke/4.cc: New test.

Improve -Wuninitialized note location.

Related:
PR tree-optimization/17506 - warning about uninitialized variable points to wrong location
PR testsuite/37182 - Revision 139286 caused gcc.dg/pr17506.c and gcc.dg/uninit-15.c

gcc/ChangeLog:

PR tree-optimization/17506
PR testsuite/37182
* tree-ssa-uninit.c (warn_uninit): Remove conditional guarding note.

gcc/testsuite/ChangeLog:

PR tree-optimization/17506
PR testsuite/37182
* gcc.dg/diagnostic-tree-expr-ranges-2.c: Add expected output.
* gcc.dg/uninit-15-O0.c: Remove xfail.
* gcc.dg/uninit-15.c: Same.

Add support for device-modifiers for 'omp target device'.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/target-device-ancestor-4.f90: Comment out dg-final to avoid
UNRESOLVED.

Refine fix for PR78185, improve LIM for code after inner loops

This refines the fix for PR78185 after understanding that the code
associated with the comment 'In a loop that is always entered we may
proceed anyway.  But record that we entered it and stop once we leave
it.' was supposed to protect us from leaving possibly infinite inner
loops.  The simpler fix of moving the misplaced stopping code
can then be refined to continue processing when the exited inner
loop is finite, improving invariant motion for cases like the one in
the added testcase.
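
The shape of code this is about can be sketched as follows.  This is a
hypothetical example, not the added testcase: the load of *valp sits after
an inner loop with a known finite trip count, so it is reached on every
iteration of the outer loop and is a candidate for invariant motion.

```cpp
// Hypothetical example: invariant load placed after a finite inner loop.
double sum_after_inner (const double *valp, int *scratch, int n)
{
  double sum = 0.0;
  for (int i = 0; i < n; ++i)
    {
      // Inner loop with a known, finite trip count.
      for (int j = 0; j < 8; ++j)
        scratch[j] = i + j;
      // Loop-invariant load, executed after the inner loop exits.
      sum += *valp;
    }
  return sum;
}
```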

2021-09-02  Richard Biener  <rguenther@suse.de>

* tree-ssa-loop-im.c (fill_always_executed_in_1): Refine
fix for PR78185 and continue processing when leaving
finite inner loops.

* gcc.dg/tree-ssa/ssa-lim-16.c: New testcase.

match.pd: Demote IFN_{ADD,SUB,MUL}_OVERFLOW operands [PR99591]

The overflow builtins work on infinite precision integers and then convert
to the result type's precision, so any argument promotions are useless.
The expand_arith_overflow expansion is able to demote the arguments itself
through get_range_pos_neg and get_min_precision calls and if needed promote
to whatever mode it decides to perform the operations in, but if there are
any promotions it demoted, those are already expanded.  Normally combine
would remove the useless sign or zero extensions when it sees the result
of those is only used in a lowpart subreg, but typically those lowpart
subregs appear multiple times in the pattern so that they describe properly
the overflow behavior and combine gives up, so we end up with e.g.
        movswl  %si, %esi
        movswl  %di, %edi
        imulw   %si, %di
        seto    %al
where both movswl insns are useless.

The following patch fixes it by demoting operands of the ifns (only gets
rid of integral to integral conversions that increase precision).
While IFN_{ADD,MUL}_OVERFLOW are commutative and just one simplify would be
enough, IFN_SUB_OVERFLOW is not, therefore two simplifications.
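
A source-level example of the kind of code involved, using the GCC/Clang
__builtin_mul_overflow builtin on 16-bit operands, is sketched below.  The
function name is made up, and whether a particular compiler version emits
the redundant extensions shown above depends on the version and options
used.

```cpp
// Returns true if a * b does not fit in short; *res receives the wrapped result.
bool mul_overflows (short a, short b, short *res)
{
  return __builtin_mul_overflow (a, b, res);
}
```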

2021-09-02  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/99591
* match.pd: Demote operands of IFN_{ADD,SUB,MUL}_OVERFLOW if they
were promoted.

* gcc.target/i386/pr99591.c: New test.
* gcc.target/i386/pr97950.c: Match or reject setb or jn?b instructions
together with seta or jn?a.