From c600df9a4060da3c6121ff4d0b93f179eafd69d1 Mon Sep 17 00:00:00 2001
From: Richard Sandiford
Date: Tue, 29 Oct 2019 09:08:47 +0000
Subject: [PATCH] [AArch64] Add support for the SVE PCS

The AAPCS64 specifies that if a function takes arguments in SVE
registers or returns them in SVE registers, it must preserve all of
Z8-Z23 and all of P4-P11.  (Normal functions only preserve the low
64 bits of Z8-Z15 and clobber all of the predicate registers.)
This variation is known informally as the "SVE PCS" and functions
that use it are known informally as "SVE functions".

SVE functions are interoperable with functions that follow the
standard AAPCS64 rules and with those that use the aarch64_vector_pcs
attribute.  (Note that it's an error to use the attribute for SVE
functions.)

One complication -- although it's not really that complicated -- is
that SVE registers need to be saved at a VL-dependent offset while
other registers need to be saved at a constant offset.  The easiest
way of handling this seemed to be to group the SVE registers together
below the hard frame pointer.  In common cases, the frame pointer is
then usually an easy-to-compute VL multiple above the stack pointer
and a constant amount below the incoming stack pointer.

A bigger complication is that, because the base AAPCS64 specifies that
only the low 64 bits of V8-V15 are preserved by calls, the associated
DWARF frame registers are also treated as 64 bits by the unwinder.
The 64 bits must also have the same layout as they would for a base
AAPCS64 function, otherwise unwinding won't work correctly.  (This is
actually a problem for the existing aarch64_vector_pcs support too,
but I'll fix that separately.)

This falls out naturally for little-endian targets but not for
big-endian targets.  The easiest way of meeting the requirement for
them was to use ST1D and LD1D to save and restore Z8-Z15, which also
has the nice property of storing the 64 bits at the start of the slot.
However, using ST1D and LD1D requires a spare predicate register, and
since all of P0-P7 are either argument registers or call-preserved, we
may need to spill P4 in order to save the vector registers, even if P4
wouldn't need to be saved otherwise.

Since Z16-Z23 are fully clobbered by base AAPCS64 functions, we don't
need to emit frame information for them at all.  This avoids having to
decide whether the registers should be treated as having 64 bits (as
for Z8-Z15), 128 bits (for Advanced SIMD) or the full SVE width.

There are two ways of dealing with stack-clash protection when saving
SVE registers:

(1) If the area between the hard frame pointer and the incoming stack
    pointer is allocated via a store with writeback (callee_adjust != 0),
    the SVE save area is allocated separately and becomes the "initial"
    allocation as far as stack-clash protection goes.  In this case the
    store with writeback acts as a probe at the hard frame pointer
    position.

(2) If the area between the hard frame pointer and the incoming stack
    pointer is allocated via aarch64_allocate_and_probe_stack_space,
    the SVE save area is added to this initial allocation, so that the
    SP ends up pointing at the SVE register saves.  It's then necessary
    to use a temporary base register to save the non-SVE registers.
    Setting up this temporary register requires only a single
    instruction and so should be more efficient than doing two
    allocations and probes.
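To make the ABI split concrete, here is a minimal sketch (not part of
the patch) of the kind of function that triggers the SVE PCS.  It
assumes a compiler with SVE support and <arm_sve.h>; the function
names are invented for the example:

  /* Compile with e.g. -march=armv8.2-a+sve.  */
  #include <arm_sve.h>

  /* Takes and returns values in SVE registers, so it is an "SVE
     function": callers may assume that Z8-Z23 and P4-P11 survive
     the call.  The compiler marks it with .variant_pcs in the
     assembly output.  */
  svint32_t
  sve_fn (svbool_t pg, svint32_t x)
  {
    return svadd_s32_m (pg, x, x);
  }

  /* Base AAPCS64 function: callers may only assume that the low
     64 bits of V8-V15 are preserved, and must assume that all
     predicate registers are clobbered.  */
  double
  base_fn (double x)
  {
    return x + 1.0;
  }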
When SVE registers need to be saved, saving them below the frame
pointer makes it harder to rely on the LR save as a stack probe, since
the LR register's offset won't usually be a compile-time constant.
The patch copes with that by using the lowest SVE register save as a
stack probe too, which prevents that save from being shrink-wrapped
when stack-clash protection is enabled.

The changelog describes the low-level details.

2019-10-29  Richard Sandiford

gcc/
	* calls.c (pass_by_reference): Leave the target to decide whether
	POLY_INT_CST-sized arguments should be passed by value or reference,
	rather than forcing them to be passed by reference.
	(must_pass_in_stack_var_size): Likewise.
	* config/aarch64/aarch64.md (LAST_SAVED_REGNUM): Redefine from
	V31_REGNUM to P15_REGNUM.
	* config/aarch64/aarch64-protos.h (aarch64_init_cumulative_args):
	Take an extra "silent_p" parameter, defaulting to false.
	(aarch64_sve::svbool_type_p): Declare.
	(aarch64_sve::nvectors_if_data_type): Likewise.
	* config/aarch64/aarch64.h (NUM_PR_ARG_REGS): New macro.
	(aarch64_frame::reg_offset): Turn into poly_int64s.
	(aarch64_frame::saved_regs_size): Likewise.
	(aarch64_frame::below_hard_fp_saved_regs_size): New field.
	(aarch64_frame::sve_callee_adjust): Likewise.
	(aarch64_frame::spare_pred_reg): Likewise.
	(ARM_PCS_SVE): New arm_pcs value.
	(CUMULATIVE_ARGS::aapcs_nprn): New field.
	(CUMULATIVE_ARGS::aapcs_nextnprn): Likewise.
	(CUMULATIVE_ARGS::silent_p): Likewise.
	(BITS_PER_SVE_PRED): New macro.
	* config/aarch64/aarch64.c (handle_aarch64_vector_pcs_attribute): New
	function.  Reject aarch64_vector_pcs attributes on SVE functions.
	(aarch64_attribute_table): Use the above handler.
	(aarch64_sve_abi): New function.
	(aarch64_sve_argument_p): Likewise.
	(aarch64_returns_value_in_sve_regs_p): Likewise.
	(aarch64_takes_arguments_in_sve_regs_p): Likewise.
	(aarch64_fntype_abi): Check for SVE functions and return the SVE PCS
	descriptor for them.
	(aarch64_simd_decl_p): Delete.
	(aarch64_emit_cfi_for_reg_p): New function.
	(aarch64_reg_save_mode): Remove the fndecl argument and instead use
	crtl->abi to choose the mode for FP registers.  Handle the SVE PCS.
	(aarch64_hard_regno_call_part_clobbered): Do not treat FP registers
	as partly clobbered for the SVE PCS.
	(aarch64_function_ok_for_sibcall): Check whether the two functions
	use the same ABI, rather than checking specifically for whether
	they're aarch64_vector_pcs functions.
	(aarch64_pass_by_reference): Raise an error for attempts to pass
	SVE arguments when SVE is disabled.  Pass SVE arguments by reference
	if there are not enough free registers left, or if the argument is
	variadic.
	(aarch64_function_value): Handle SVE predicates, vectors and tuples.
	(aarch64_return_in_memory): Do not return SVE predicates, vectors
	and tuples in memory.
	(aarch64_layout_arg): Take a function_arg_info rather than
	individual properties.  Handle SVE predicates, vectors and tuples.
	Raise an error if they are passed to unprototyped functions.
	(aarch64_function_arg): If the silent_p flag is set, suppress the
	usual error about using float registers without TARGET_FLOAT.
	(aarch64_init_cumulative_args): Take a silent_p parameter and store
	it in the cumulative_args structure.  Initialize aapcs_nprn and
	aapcs_nextnprn.  If the silent_p flag is set, suppress the usual
	error about using float registers without TARGET_FLOAT.  If the
	silent_p flag is not set, also raise an error about using SVE
	functions when SVE is disabled.
	(aarch64_function_arg_advance): Update the call to
	aarch64_layout_arg, and call it for SVE functions too.  Update
	aapcs_nprn similarly to the other register counts.
	(aarch64_layout_frame): If a big-endian function needs to save
	and restore Z8-Z15, search for a spare predicate that it can use.
	Store SVE predicates at the bottom of the register save area,
	followed by SVE vectors, then followed by the normal slots.
	Keep pointing the hard frame pointer at the base of the normal
	slots, above the SVE vectors.  Update the various frame creation
	and tear-down strategies for the new layout, initializing the new
	sve_callee_adjust field.  Add an additional layout for frames
	whose saved registers are all SVE registers.
	(aarch64_register_saved_on_entry): Cope with poly_int64 reg_offsets.
	(aarch64_return_address_signing_enabled): Likewise.
	(aarch64_push_regs, aarch64_pop_regs): Update calls to
	aarch64_reg_save_mode.
	(aarch64_adjust_sve_callee_save_base): New function.
	(aarch64_add_cfa_expression): Move earlier in file.  Take the
	saved register as an rtx rather than a register number and use
	its mode for the MEM slot.
	(aarch64_save_callee_saves): Remove the mode argument and instead
	use aarch64_reg_save_mode to get the mode of each save slot.
	Add a hard_fp_valid_p parameter.  Cope with poly_int64 register
	offsets.  Allow GP registers to be saved at a VL-based offset from
	the stack, handling this case using the frame pointer if available
	or a temporary register otherwise.  Use ST1D to save Z8-Z15 for
	big-endian SVE functions; use normal moves for other SVE saves.
	Only mark the save as frame-related if aarch64_emit_cfi_for_reg_p
	returns true.  Add explicit CFA notes when not storing via the
	stack pointer.  Do not try to pair SVE saves.
	(aarch64_restore_callee_saves): Cope with poly_int64 register
	offsets.  Use LD1D to restore Z8-Z15 for big-endian SVE functions;
	use normal moves for other SVE restores.  Only add CFA restore
	notes if aarch64_emit_cfi_for_reg_p returns true.  Do not try to
	pair SVE restores.
	(aarch64_get_separate_components): Always keep the first SVE save
	in the prologue if we need to use it as a stack probe.  Don't allow
	Z8-Z15 saves and loads to be shrink-wrapped for big-endian targets.
	Likewise the spare predicate register that they need.  Update the
	offset calculation to account for the SVE save area.  Use the
	appropriate range check for SVE LDR and STR instructions.
	(aarch64_components_for_bb): Cope with poly_int64 reg_offsets.
	(aarch64_process_components): Likewise.  Update the offset
	calculation to account for the SVE save area.  Only mark the
	save as frame-related if aarch64_emit_cfi_for_reg_p returns true.
	Do not try to pair SVE saves.
	(aarch64_allocate_and_probe_stack_space): Cope with poly_int64
	reg_offsets.  When handling the final allocation, expect the
	first SVE register save to be part of the initial allocation
	and for it to act as a probe at SP.  Account for the SVE callee
	save area in the dump information.
	(aarch64_expand_prologue): Update the frame diagram.  Fold the
	SVE callee allocation into the initial allocation if stack-clash
	protection is enabled.  Use new variables to track the offset
	of the frame chain (and hard frame pointer) from the current
	stack pointer, and likewise the offset of the bottom of the
	register save area.  Update calls to aarch64_save_callee_saves
	and aarch64_add_cfa_expression.  Apply sve_callee_adjust before
	saving the FP&SIMD registers.  Save the predicate registers.
	(aarch64_expand_epilogue): Take below_hard_fp_saved_regs_size
	into account when setting the stack pointer from the frame pointer,
	and when deciding whether we can inherit the initial adjustment
	amount from the prologue.  Restore the predicate registers after
	the vector registers, then apply sve_callee_adjust, then restore
	the general registers.
	(aarch64_secondary_reload): Don't use secondary SVE reloads
	for VNx16BImode.
	(aapcs_vfp_sub_candidate): Assert that the type is not an SVE type.
	(aarch64_short_vector_p): Return false for SVE types.
	(aarch64_vfp_is_call_or_return_candidate): Initialize *is_ha
	at the start of the function.  Return false for SVE types.
	(aarch64_asm_output_variant_pcs): Output .variant_pcs for SVE
	functions too.
	(TARGET_STRICT_ARGUMENT_NAMING): Redefine to request strict naming.
	* config/aarch64/aarch64-sve.md (*aarch64_sve_mov<mode>_le): Extend
	to big-endian targets for bytewise moves.
	(*aarch64_sve_mov<mode>_be): Exclude the bytewise case.

gcc/testsuite/
	* gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp: New file.
	* gcc.target/aarch64/sve/pcs/annotate_1.c: New test.
	* gcc.target/aarch64/sve/pcs/annotate_2.c: Likewise.
	* gcc.target/aarch64/sve/pcs/annotate_3.c: Likewise.
	* gcc.target/aarch64/sve/pcs/annotate_4.c: Likewise.
	* gcc.target/aarch64/sve/pcs/annotate_5.c: Likewise.
	* gcc.target/aarch64/sve/pcs/annotate_6.c: Likewise.
	* gcc.target/aarch64/sve/pcs/annotate_7.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_1.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_10.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_11_nosc.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_11_sc.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_2.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_3.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_4.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_f16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_f32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_f64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_s16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_s32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_s64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_s8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_u16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_u32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_u64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_u8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_f16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_f32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_f64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_s16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_s32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_s64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_s8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_u16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_u32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_u64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_u8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_f16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_f32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_f64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_s16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_s32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_s64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_s8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_u16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_u32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_u64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_u8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_f16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_f32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_f64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_s16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_s32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_s64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_s8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_u16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_u32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_u64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_u8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_7.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_9.c: Likewise.
	* gcc.target/aarch64/sve/pcs/nosve_1.c: Likewise.
	* gcc.target/aarch64/sve/pcs/nosve_2.c: Likewise.
	* gcc.target/aarch64/sve/pcs/nosve_3.c: Likewise.
	* gcc.target/aarch64/sve/pcs/nosve_4.c: Likewise.
	* gcc.target/aarch64/sve/pcs/nosve_5.c: Likewise.
	* gcc.target/aarch64/sve/pcs/nosve_6.c: Likewise.
	* gcc.target/aarch64/sve/pcs/nosve_7.c: Likewise.
	* gcc.target/aarch64/sve/pcs/nosve_8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_1.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_1_1024.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_1_2048.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_1_256.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_1_512.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_2.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_3.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_4.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_4_1024.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_4_2048.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_4_256.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_4_512.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_5.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_5_1024.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_5_2048.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_5_256.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_5_512.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_6.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_6_1024.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_6_2048.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_6_256.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_6_512.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_7.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/return_9.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_1_be_nowrap.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_1_be_wrap.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_1_le_nowrap.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_1_le_wrap.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_2_be_nowrap.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_2_be_wrap.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_2_le_nowrap.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_2_le_wrap.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_3.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_4_be.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_4_le.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_5_be.c: Likewise.
	* gcc.target/aarch64/sve/pcs/saves_5_le.c: Likewise.
	* gcc.target/aarch64/sve/pcs/stack_clash_1.c: Likewise.
	* gcc.target/aarch64/sve/pcs/stack_clash_1_256.c: Likewise.
	* gcc.target/aarch64/sve/pcs/stack_clash_1_512.c: Likewise.
	* gcc.target/aarch64/sve/pcs/stack_clash_1_1024.c: Likewise.
	* gcc.target/aarch64/sve/pcs/stack_clash_1_2048.c: Likewise.
	* gcc.target/aarch64/sve/pcs/stack_clash_2.c: Likewise.
	* gcc.target/aarch64/sve/pcs/stack_clash_2_256.c: Likewise.
	* gcc.target/aarch64/sve/pcs/stack_clash_2_512.c: Likewise.
	* gcc.target/aarch64/sve/pcs/stack_clash_2_1024.c: Likewise.
	* gcc.target/aarch64/sve/pcs/stack_clash_2_2048.c: Likewise.
	* gcc.target/aarch64/sve/pcs/stack_clash_3.c: Likewise.
	* gcc.target/aarch64/sve/pcs/unprototyped_1.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_1.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_f16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_f32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_f64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_s16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_s32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_s64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_s8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_u16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_u32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_u64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_u8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_3_nosc.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_3_sc.c: Likewise.
	* gcc.target/aarch64/sve/pcs/vpcs_1.c: Likewise.
	* g++.target/aarch64/sve/catch_7.C: Likewise.

From-SVN: r277564
---
 gcc/ChangeLog | 137 +++
 gcc/calls.c | 4 +-
 gcc/config/aarch64/aarch64-protos.h | 4 +-
 gcc/config/aarch64/aarch64-sve.md | 12 +-
 gcc/config/aarch64/aarch64.c | 994 ++++++++++++++++-----
 gcc/config/aarch64/aarch64.h | 30 +-
 gcc/config/aarch64/aarch64.md | 2 +-
 gcc/testsuite/ChangeLog | 139 +++
 gcc/testsuite/g++.target/aarch64/sve/catch_7.C | 38 +
 .../gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp | 52 ++
 .../gcc.target/aarch64/sve/pcs/annotate_1.c | 104 +++
 .../gcc.target/aarch64/sve/pcs/annotate_2.c | 103 +++
 .../gcc.target/aarch64/sve/pcs/annotate_3.c | 99 ++
 .../gcc.target/aarch64/sve/pcs/annotate_4.c | 143 +++
 .../gcc.target/aarch64/sve/pcs/annotate_5.c | 143 +++
 .../gcc.target/aarch64/sve/pcs/annotate_6.c | 143 +++
 .../gcc.target/aarch64/sve/pcs/annotate_7.c | 97 ++
 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_1.c | 49 +
 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_10.c | 33 +
 .../gcc.target/aarch64/sve/pcs/args_11_nosc.c | 61 ++
 .../gcc.target/aarch64/sve/pcs/args_11_sc.c | 61 ++
 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_2.c | 70 ++
 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_3.c | 70 ++
 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_4.c | 70 ++
 .../gcc.target/aarch64/sve/pcs/args_5_be_f16.c | 63 ++
 .../gcc.target/aarch64/sve/pcs/args_5_be_f32.c | 63 ++
 .../gcc.target/aarch64/sve/pcs/args_5_be_f64.c | 63 ++
 .../gcc.target/aarch64/sve/pcs/args_5_be_s16.c | 63 ++
 .../gcc.target/aarch64/sve/pcs/args_5_be_s32.c | 63 ++
 .../gcc.target/aarch64/sve/pcs/args_5_be_s64.c | 63 ++
 .../gcc.target/aarch64/sve/pcs/args_5_be_s8.c | 63 ++
 .../gcc.target/aarch64/sve/pcs/args_5_be_u16.c | 63 ++
 .../gcc.target/aarch64/sve/pcs/args_5_be_u32.c | 63 ++
 .../gcc.target/aarch64/sve/pcs/args_5_be_u64.c | 63 ++
 .../gcc.target/aarch64/sve/pcs/args_5_be_u8.c | 63 ++
 .../gcc.target/aarch64/sve/pcs/args_5_le_f16.c | 58 ++
 .../gcc.target/aarch64/sve/pcs/args_5_le_f32.c | 58 ++
 .../gcc.target/aarch64/sve/pcs/args_5_le_f64.c | 58 ++
 .../gcc.target/aarch64/sve/pcs/args_5_le_s16.c | 58 ++
 .../gcc.target/aarch64/sve/pcs/args_5_le_s32.c | 58 ++
 .../gcc.target/aarch64/sve/pcs/args_5_le_s64.c | 58 ++
 .../gcc.target/aarch64/sve/pcs/args_5_le_s8.c | 58 ++
 .../gcc.target/aarch64/sve/pcs/args_5_le_u16.c | 58 ++
 .../gcc.target/aarch64/sve/pcs/args_5_le_u32.c | 58 ++
 .../gcc.target/aarch64/sve/pcs/args_5_le_u64.c | 58 ++
 .../gcc.target/aarch64/sve/pcs/args_5_le_u8.c | 58 ++
 .../gcc.target/aarch64/sve/pcs/args_6_be_f16.c | 71 ++
 .../gcc.target/aarch64/sve/pcs/args_6_be_f32.c | 71 ++
 .../gcc.target/aarch64/sve/pcs/args_6_be_f64.c | 71 ++
 .../gcc.target/aarch64/sve/pcs/args_6_be_s16.c | 71 ++
 .../gcc.target/aarch64/sve/pcs/args_6_be_s32.c | 71 ++
 .../gcc.target/aarch64/sve/pcs/args_6_be_s64.c | 71 ++
 .../gcc.target/aarch64/sve/pcs/args_6_be_s8.c | 71 ++
 .../gcc.target/aarch64/sve/pcs/args_6_be_u16.c | 71 ++
 .../gcc.target/aarch64/sve/pcs/args_6_be_u32.c | 71 ++
 .../gcc.target/aarch64/sve/pcs/args_6_be_u64.c | 71 ++
 .../gcc.target/aarch64/sve/pcs/args_6_be_u8.c | 71 ++
 .../gcc.target/aarch64/sve/pcs/args_6_le_f16.c | 70 ++
 .../gcc.target/aarch64/sve/pcs/args_6_le_f32.c | 70 ++
 .../gcc.target/aarch64/sve/pcs/args_6_le_f64.c | 70 ++
 .../gcc.target/aarch64/sve/pcs/args_6_le_s16.c | 70 ++
 .../gcc.target/aarch64/sve/pcs/args_6_le_s32.c | 70 ++
 .../gcc.target/aarch64/sve/pcs/args_6_le_s64.c | 70 ++
 .../gcc.target/aarch64/sve/pcs/args_6_le_s8.c | 70 ++
 .../gcc.target/aarch64/sve/pcs/args_6_le_u16.c | 70 ++
 .../gcc.target/aarch64/sve/pcs/args_6_le_u32.c | 70 ++
 .../gcc.target/aarch64/sve/pcs/args_6_le_u64.c | 70 ++
 .../gcc.target/aarch64/sve/pcs/args_6_le_u8.c | 70 ++
 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_7.c | 30 +
 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_8.c | 28 +
 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_9.c | 49 +
 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_1.c | 14 +
 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_2.c | 14 +
 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_3.c | 14 +
 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_4.c | 14 +
 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_5.c | 15 +
 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_6.c | 14 +
 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_7.c | 8 +
 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_8.c | 11 +
 .../gcc.target/aarch64/sve/pcs/return_1.c | 32 +
 .../gcc.target/aarch64/sve/pcs/return_1_1024.c | 31 +
 .../gcc.target/aarch64/sve/pcs/return_1_2048.c | 31 +
 .../gcc.target/aarch64/sve/pcs/return_1_256.c | 31 +
 .../gcc.target/aarch64/sve/pcs/return_1_512.c | 31 +
 .../gcc.target/aarch64/sve/pcs/return_2.c | 32 +
 .../gcc.target/aarch64/sve/pcs/return_3.c | 34 +
 .../gcc.target/aarch64/sve/pcs/return_4.c | 237 +++++
 .../gcc.target/aarch64/sve/pcs/return_4_1024.c | 237 +++++
 .../gcc.target/aarch64/sve/pcs/return_4_2048.c | 237 +++++
 .../gcc.target/aarch64/sve/pcs/return_4_256.c | 237 +++++
 .../gcc.target/aarch64/sve/pcs/return_4_512.c | 237 +++++
 .../gcc.target/aarch64/sve/pcs/return_5.c | 237 +++++
 .../gcc.target/aarch64/sve/pcs/return_5_1024.c | 237 +++++
 .../gcc.target/aarch64/sve/pcs/return_5_2048.c | 237 +++++
 .../gcc.target/aarch64/sve/pcs/return_5_256.c | 237 +++++
 .../gcc.target/aarch64/sve/pcs/return_5_512.c | 237 +++++
 .../gcc.target/aarch64/sve/pcs/return_6.c | 258 ++++++
 .../gcc.target/aarch64/sve/pcs/return_6_1024.c | 265 ++++++
 .../gcc.target/aarch64/sve/pcs/return_6_2048.c | 265 ++++++
 .../gcc.target/aarch64/sve/pcs/return_6_256.c | 265 ++++++
 .../gcc.target/aarch64/sve/pcs/return_6_512.c | 265 ++++++
 .../gcc.target/aarch64/sve/pcs/return_7.c | 313 +++++++
 .../gcc.target/aarch64/sve/pcs/return_8.c | 346 +++++++
 .../gcc.target/aarch64/sve/pcs/return_9.c | 405 +++++++++
 .../gcc.target/aarch64/sve/pcs/saves_1_be_nowrap.c | 196 ++++
 .../gcc.target/aarch64/sve/pcs/saves_1_be_wrap.c | 196 ++++
 .../gcc.target/aarch64/sve/pcs/saves_1_le_nowrap.c | 184 ++++
 .../gcc.target/aarch64/sve/pcs/saves_1_le_wrap.c | 184 ++++
 .../gcc.target/aarch64/sve/pcs/saves_2_be_nowrap.c | 271 ++++++
 .../gcc.target/aarch64/sve/pcs/saves_2_be_wrap.c | 271 ++++++
 .../gcc.target/aarch64/sve/pcs/saves_2_le_nowrap.c | 255 ++++++
 .../gcc.target/aarch64/sve/pcs/saves_2_le_wrap.c | 255 ++++++
 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_3.c | 92 ++
 .../gcc.target/aarch64/sve/pcs/saves_4_be.c | 84 ++
 .../gcc.target/aarch64/sve/pcs/saves_4_le.c | 80 ++
 .../gcc.target/aarch64/sve/pcs/saves_5_be.c | 78 ++
 .../gcc.target/aarch64/sve/pcs/saves_5_le.c | 74 ++
 .../gcc.target/aarch64/sve/pcs/stack_clash_1.c | 204 +++++
 .../aarch64/sve/pcs/stack_clash_1_1024.c | 184 ++++
 .../aarch64/sve/pcs/stack_clash_1_2048.c | 185 ++++
 .../gcc.target/aarch64/sve/pcs/stack_clash_1_256.c | 184 ++++
 .../gcc.target/aarch64/sve/pcs/stack_clash_1_512.c | 184 ++++
 .../gcc.target/aarch64/sve/pcs/stack_clash_2.c | 336 +++++++
 .../aarch64/sve/pcs/stack_clash_2_1024.c | 285 ++++++
 .../aarch64/sve/pcs/stack_clash_2_2048.c | 285 ++++++
 .../gcc.target/aarch64/sve/pcs/stack_clash_2_256.c | 284 ++++++
 .../gcc.target/aarch64/sve/pcs/stack_clash_2_512.c | 285 ++++++
 .../gcc.target/aarch64/sve/pcs/stack_clash_3.c | 63 ++
 .../gcc.target/aarch64/sve/pcs/unprototyped_1.c | 11 +
 .../gcc.target/aarch64/sve/pcs/varargs_1.c | 170 ++++
 .../gcc.target/aarch64/sve/pcs/varargs_2_f16.c | 170 ++++
 .../gcc.target/aarch64/sve/pcs/varargs_2_f32.c | 170 ++++
 .../gcc.target/aarch64/sve/pcs/varargs_2_f64.c | 170 ++++
 .../gcc.target/aarch64/sve/pcs/varargs_2_s16.c | 170 ++++
 .../gcc.target/aarch64/sve/pcs/varargs_2_s32.c | 170 ++++
 .../gcc.target/aarch64/sve/pcs/varargs_2_s64.c | 170 ++++
 .../gcc.target/aarch64/sve/pcs/varargs_2_s8.c | 170 ++++
 .../gcc.target/aarch64/sve/pcs/varargs_2_u16.c | 170 ++++
 .../gcc.target/aarch64/sve/pcs/varargs_2_u32.c | 170 ++++
 .../gcc.target/aarch64/sve/pcs/varargs_2_u64.c | 170 ++++
 .../gcc.target/aarch64/sve/pcs/varargs_2_u8.c | 170 ++++
 .../gcc.target/aarch64/sve/pcs/varargs_3_nosc.c | 75 ++
 .../gcc.target/aarch64/sve/pcs/varargs_3_sc.c | 75 ++
 gcc/testsuite/gcc.target/aarch64/sve/pcs/vpcs_1.c | 6 +
 144 files changed, 17406 insertions(+), 225 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/sve/catch_7.C
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_7.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_10.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_11_nosc.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_11_sc.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_f16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_f32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_f64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_f16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_f32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_f64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_f16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_f32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_f64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_f16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_f32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_f64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_7.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_9.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_7.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_1024.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_2048.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_256.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_512.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_1024.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_2048.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_256.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_512.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_1024.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_2048.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_256.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_512.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_1024.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_2048.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_256.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_512.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_7.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/return_9.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_be_nowrap.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_be_wrap.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_le_nowrap.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_le_wrap.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_be_nowrap.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_be_wrap.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_le_nowrap.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_le_wrap.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_4_be.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_4_le.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_5_be.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_5_le.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_1024.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_2048.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_256.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_512.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_1024.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_2048.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_256.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_512.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/unprototyped_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_f16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_f32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_f64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_3_nosc.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_3_sc.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/vpcs_1.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 937a179..a9a2293 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,4 +1,141 @@
 2019-10-29  Richard Sandiford
+
+	* calls.c (pass_by_reference): Leave the target to decide whether
+	POLY_INT_CST-sized arguments should be passed by value or reference,
+	rather than forcing them to be passed by reference.
+	(must_pass_in_stack_var_size): Likewise.
+	* config/aarch64/aarch64.md (LAST_SAVED_REGNUM): Redefine from
+	V31_REGNUM to P15_REGNUM.
+	* config/aarch64/aarch64-protos.h (aarch64_init_cumulative_args):
+	Take an extra "silent_p" parameter, defaulting to false.
+	(aarch64_sve::svbool_type_p): Declare.
+	(aarch64_sve::nvectors_if_data_type): Likewise.
+	* config/aarch64/aarch64.h (NUM_PR_ARG_REGS): New macro.
+	(aarch64_frame::reg_offset): Turn into poly_int64s.
+	(aarch64_frame::saved_regs_size): Likewise.
+	(aarch64_frame::below_hard_fp_saved_regs_size): New field.
+	(aarch64_frame::sve_callee_adjust): Likewise.
+	(aarch64_frame::spare_pred_reg): Likewise.
+	(ARM_PCS_SVE): New arm_pcs value.
+	(CUMULATIVE_ARGS::aapcs_nprn): New field.
+	(CUMULATIVE_ARGS::aapcs_nextnprn): Likewise.
+	(CUMULATIVE_ARGS::silent_p): Likewise.
+	(BITS_PER_SVE_PRED): New macro.
+	* config/aarch64/aarch64.c (handle_aarch64_vector_pcs_attribute): New
+	function.  Reject aarch64_vector_pcs attributes on SVE functions.
+	(aarch64_attribute_table): Use the above handler.
+	(aarch64_sve_abi): New function.
+	(aarch64_sve_argument_p): Likewise.
+	(aarch64_returns_value_in_sve_regs_p): Likewise.
+	(aarch64_takes_arguments_in_sve_regs_p): Likewise.
+	(aarch64_fntype_abi): Check for SVE functions and return the SVE PCS
+	descriptor for them.
+	(aarch64_simd_decl_p): Delete.
+ (aarch64_emit_cfi_for_reg_p): New function. + (aarch64_reg_save_mode): Remove the fndecl argument and instead use + crtl->abi to choose the mode for FP registers. Handle the SVE PCS. + (aarch64_hard_regno_call_part_clobbered): Do not treat FP registers + as partly clobbered for the SVE PCS. + (aarch64_function_ok_for_sibcall): Check whether the two functions + use the same ABI, rather than checking specifically for whether + they're aarch64_vector_pcs functions. + (aarch64_pass_by_reference): Raise an error for attempts to pass + SVE arguments when SVE is disabled. Pass SVE arguments by reference + if there are not enough free registers left, or if the argument is + variadic. + (aarch64_function_value): Handle SVE predicates, vectors and tuples. + (aarch64_return_in_memory): Do not return SVE predicates, vectors and + tuples in memory. + (aarch64_layout_arg): Take a function_arg_info rather than + individual properties. Handle SVE predicates, vectors and tuples. + Raise an error if they are passed to unprototyped functions. + (aarch64_function_arg): If the silent_p flag is set, suppress the + usual error about using float registers without TARGET_FLOAT. + (aarch64_init_cumulative_args): Take a silent_p parameter and store + it in the cumulative_args structure. Initialize aapcs_nprn and + aapcs_nextnprn. If the silent_p flag is set, suppress the usual + error about using float registers without TARGET_FLOAT. + If the silent_p flag is not set, also raise an error about + using SVE functions when SVE is disabled. + (aarch64_function_arg_advance): Update the call to aarch64_layout_arg, + and call it for SVE functions too. Update aapcs_nprn similarly + to the other register counts. + (aarch64_layout_frame): If a big-endian function needs to save + and restore Z8-Z15, search for a spare predicate that it can use. + Store SVE predicates at the bottom of the register save area, + followed by SVE vectors, then followed by the normal slots. + Keep pointing the hard frame pointer at the base of the normal slots, + above the SVE vectors. Update the various frame creation and + tear-down strategies for the new layout, initializing the new + sve_callee_adjust field. Add an additional layout for frames + whose saved registers are all SVE registers. + (aarch64_register_saved_on_entry): Cope with poly_int64 reg_offsets. + (aarch64_return_address_signing_enabled): Likewise. + (aarch64_push_regs, aarch64_pop_regs): Update calls to + aarch64_reg_save_mode. + (aarch64_adjust_sve_callee_save_base): New function. + (aarch64_add_cfa_expression): Move earlier in file. Take the + saved register as an rtx rather than a register number and use + its mode for the MEM slot. + (aarch64_save_callee_saves): Remove the mode argument and instead + use aarch64_reg_save_mode to get the mode of each save slot. + Add a hard_fp_valid_p parameter. Cope with poly_int64 register + offsets. Allow GP offsets to be saved at a VL-based offset from + the stack, handling this case using the frame pointer if available + or a temporary register otherwise. Use ST1D to save Z8-Z15 for + big-endian SVE functions; use normal moves for other SVE saves. + Only mark the save as frame-related if aarch64_emit_cfi_for_reg_p + returns true. Add explicit CFA notes when not storing via the + stack pointer. Do not try to pair SVE saves. + (aarch64_restore_callee_saves): Cope with poly_int64 register + offsets. Use LD1D to restore Z8-Z15 for big-endian SVE functions; + use normal moves for other SVE restores. 
Only add CFA restore notes + if aarch64_emit_cfi_for_reg_p returns true. Do not try to pair + SVE restores. + (aarch64_get_separate_components): Always keep the first SVE save + in the prologue if we need to use it as a stack probe. Don't allow + Z8-Z15 saves and loads to be shrink-wrapped for big-endian targets. + Likewise the spare predicate register that they need. Update the + offset calculation to account for the SVE save area. Use the + appropriate range check for SVE LDR and STR instructions. + (aarch64_components_for_bb): Cope with poly_int64 reg_offsets. + (aarch64_process_components): Likewise. Update the offset + calculation to account for the SVE save area. Only mark the + save as frame-related if aarch64_emit_cfi_for_reg_p returns true. + Do not try to pair SVE saves. + (aarch64_allocate_and_probe_stack_space): Cope with poly_int64 + reg_offsets. When handling the final allocation, expect the + first SVE register save to be part of the initial allocation + and for it to act as a probe at SP. Account for the SVE callee + save area in the dump information. + (aarch64_expand_prologue): Update the frame diagram. Fold the + SVE callee allocation into the initial allocation if stack clash + protection is enabled. Use new variables to track the offset + of the frame chain (and hard frame pointer) from the current + stack pointer, and likewise the offset of the bottom of the + register save area. Update calls to aarch64_save_callee_saves + and aarch64_add_cfa_expression. Apply sve_callee_adjust before + saving the FP&SIMD registers. Save the predicate registers. + (aarch64_expand_epilogue): Take below_hard_fp_saved_regs_size + into account when setting the stack pointer from the frame pointer, + and when deciding whether we can inherit the initial adjustment + amount from the prologue. Restore the predicate registers after + the vector registers, then apply sve_callee_adjust, then restore + the general registers. + (aarch64_secondary_reload): Don't use secondary SVE reloads + for VNx16BImode. + (aapcs_vfp_sub_candidate): Assert that the type is not an SVE type. + (aarch64_short_vector_p): Return false for SVE types. + (aarch64_vfp_is_call_or_return_candidate): Initialize *is_ha + at the start of the function. Return false for SVE types. + (aarch64_asm_output_variant_pcs): Output .variant_pcs for SVE + functions too. + (TARGET_STRICT_ARGUMENT_NAMING): Redefine to request strict naming. + * config/aarch64/aarch64-sve.md (*aarch64_sve_mov_le): Extend + to big-endian targets for bytewise moves. + (*aarch64_sve_mov_be): Exclude the bytewise case. + +2019-10-29 Richard Sandiford Kugan Vivekanandarajah Prathamesh Kulkarni diff --git a/gcc/calls.c b/gcc/calls.c index ae90447..e2b770f 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -911,7 +911,7 @@ pass_by_reference (CUMULATIVE_ARGS *ca, function_arg_info arg) return true; /* GCC post 3.4 passes *all* variable sized types by reference. */ - if (!TYPE_SIZE (type) || TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST) + if (!TYPE_SIZE (type) || !poly_int_tree_p (TYPE_SIZE (type))) return true; /* If a record type should be passed the same as its first (and only) @@ -5878,7 +5878,7 @@ must_pass_in_stack_var_size (const function_arg_info &arg) return false; /* If the type has variable size... 
*/ - if (TREE_CODE (TYPE_SIZE (arg.type)) != INTEGER_CST) + if (!poly_int_tree_p (TYPE_SIZE (arg.type))) return true; /* If the type is marked as addressable (it is required diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 15f288b..1d4f4fd 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -617,7 +617,7 @@ void aarch64_expand_prologue (void); void aarch64_expand_vector_init (rtx, rtx); void aarch64_sve_expand_vector_init (rtx, rtx); void aarch64_init_cumulative_args (CUMULATIVE_ARGS *, const_tree, rtx, - const_tree, unsigned); + const_tree, unsigned, bool = false); void aarch64_init_expanders (void); void aarch64_init_simd_builtins (void); void aarch64_emit_call_insn (rtx); @@ -705,6 +705,8 @@ namespace aarch64_sve { void handle_arm_sve_h (); tree builtin_decl (unsigned, bool); bool builtin_type_p (const_tree); + bool svbool_type_p (const_tree); + unsigned int nvectors_if_data_type (const_tree); const char *mangle_builtin_type (const_tree); tree resolve_overloaded_builtin (location_t, unsigned int, vec *); diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index 57db06a..0cda882 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -586,14 +586,14 @@ } ) -;; Unpredicated moves (little-endian). Only allow memory operations -;; during and after RA; before RA we want the predicated load and -;; store patterns to be used instead. +;; Unpredicated moves (bytes or little-endian). Only allow memory operations +;; during and after RA; before RA we want the predicated load and store +;; patterns to be used instead. (define_insn "*aarch64_sve_mov_le" [(set (match_operand:SVE_ALL 0 "aarch64_sve_nonimmediate_operand" "=w, Utr, w, w") (match_operand:SVE_ALL 1 "aarch64_sve_general_operand" "Utr, w, w, Dn"))] "TARGET_SVE - && !BYTES_BIG_ENDIAN + && (mode == VNx16QImode || !BYTES_BIG_ENDIAN) && ((lra_in_progress || reload_completed) || (register_operand (operands[0], mode) && nonmemory_operand (operands[1], mode)))" @@ -604,12 +604,12 @@ * return aarch64_output_sve_mov_immediate (operands[1]);" ) -;; Unpredicated moves (big-endian). Memory accesses require secondary +;; Unpredicated moves (non-byte big-endian). Memory accesses require secondary ;; reloads. (define_insn "*aarch64_sve_mov_be" [(set (match_operand:SVE_ALL 0 "register_operand" "=w, w") (match_operand:SVE_ALL 1 "aarch64_nonmemory_operand" "w, Dn"))] - "TARGET_SVE && BYTES_BIG_ENDIAN" + "TARGET_SVE && BYTES_BIG_ENDIAN && mode != VNx16QImode" "@ mov\t%0.d, %1.d * return aarch64_output_sve_mov_immediate (operands[1]);" diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 9cafef4..599d07a 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -1212,12 +1212,41 @@ enum aarch64_key_type aarch64_ra_sign_key = AARCH64_KEY_A; /* The current tuning set. */ struct tune_params aarch64_tune_params = generic_tunings; +/* Check whether an 'aarch64_vector_pcs' attribute is valid. */ + +static tree +handle_aarch64_vector_pcs_attribute (tree *node, tree name, tree, + int, bool *no_add_attrs) +{ + /* Since we set fn_type_req to true, the caller should have checked + this for us. 
*/ + gcc_assert (FUNC_OR_METHOD_TYPE_P (*node)); + switch ((arm_pcs) fntype_abi (*node).id ()) + { + case ARM_PCS_AAPCS64: + case ARM_PCS_SIMD: + return NULL_TREE; + + case ARM_PCS_SVE: + error ("the %qE attribute cannot be applied to an SVE function type", + name); + *no_add_attrs = true; + return NULL_TREE; + + case ARM_PCS_TLSDESC: + case ARM_PCS_UNKNOWN: + break; + } + gcc_unreachable (); +} + /* Table of machine attributes. */ static const struct attribute_spec aarch64_attribute_table[] = { /* { name, min_len, max_len, decl_req, type_req, fn_type_req, affects_type_identity, handler, exclude } */ - { "aarch64_vector_pcs", 0, 0, false, true, true, true, NULL, NULL }, + { "aarch64_vector_pcs", 0, 0, false, true, true, true, + handle_aarch64_vector_pcs_attribute, NULL }, { NULL, 0, 0, false, false, false, false, NULL, NULL } }; @@ -1384,6 +1413,25 @@ aarch64_simd_abi (void) return simd_abi; } +/* Return the descriptor of the SVE PCS. */ + +static const predefined_function_abi & +aarch64_sve_abi (void) +{ + predefined_function_abi &sve_abi = function_abis[ARM_PCS_SVE]; + if (!sve_abi.initialized_p ()) + { + HARD_REG_SET full_reg_clobbers + = default_function_abi.full_reg_clobbers (); + for (int regno = V8_REGNUM; regno <= V23_REGNUM; ++regno) + CLEAR_HARD_REG_BIT (full_reg_clobbers, regno); + for (int regno = P4_REGNUM; regno <= P11_REGNUM; ++regno) + CLEAR_HARD_REG_BIT (full_reg_clobbers, regno); + sve_abi.initialize (ARM_PCS_SVE, full_reg_clobbers); + } + return sve_abi; +} + /* Generate code to enable conditional branches in functions over 1 MiB. */ const char * aarch64_gen_far_branch (rtx * operands, int pos_label, const char * dest, @@ -1878,6 +1926,74 @@ aarch64_hard_regno_mode_ok (unsigned regno, machine_mode mode) return false; } +/* Return true if TYPE is a type that should be passed or returned in + SVE registers, assuming enough registers are available. When returning + true, set *NUM_ZR and *NUM_PR to the number of required Z and P registers + respectively. */ + +static bool +aarch64_sve_argument_p (const_tree type, unsigned int *num_zr, + unsigned int *num_pr) +{ + if (aarch64_sve::svbool_type_p (type)) + { + *num_pr = 1; + *num_zr = 0; + return true; + } + + if (unsigned int nvectors = aarch64_sve::nvectors_if_data_type (type)) + { + *num_pr = 0; + *num_zr = nvectors; + return true; + } + + return false; +} + +/* Return true if a function with type FNTYPE returns its value in + SVE vector or predicate registers. */ + +static bool +aarch64_returns_value_in_sve_regs_p (const_tree fntype) +{ + unsigned int num_zr, num_pr; + tree return_type = TREE_TYPE (fntype); + return (return_type != error_mark_node + && aarch64_sve_argument_p (return_type, &num_zr, &num_pr)); +} + +/* Return true if a function with type FNTYPE takes arguments in + SVE vector or predicate registers. 
*/ + +static bool +aarch64_takes_arguments_in_sve_regs_p (const_tree fntype) +{ + CUMULATIVE_ARGS args_so_far_v; + aarch64_init_cumulative_args (&args_so_far_v, NULL_TREE, NULL_RTX, + NULL_TREE, 0, true); + cumulative_args_t args_so_far = pack_cumulative_args (&args_so_far_v); + + for (tree chain = TYPE_ARG_TYPES (fntype); + chain && chain != void_list_node; + chain = TREE_CHAIN (chain)) + { + tree arg_type = TREE_VALUE (chain); + if (arg_type == error_mark_node) + return false; + + function_arg_info arg (arg_type, /*named=*/true); + apply_pass_by_reference_rules (&args_so_far_v, arg); + unsigned int num_zr, num_pr; + if (aarch64_sve_argument_p (arg.type, &num_zr, &num_pr)) + return true; + + targetm.calls.function_arg_advance (args_so_far, arg); + } + return false; +} + /* Implement TARGET_FNTYPE_ABI. */ static const predefined_function_abi & @@ -1885,40 +2001,65 @@ aarch64_fntype_abi (const_tree fntype) { if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype))) return aarch64_simd_abi (); + + if (aarch64_returns_value_in_sve_regs_p (fntype) + || aarch64_takes_arguments_in_sve_regs_p (fntype)) + return aarch64_sve_abi (); + return default_function_abi; } -/* Return true if this is a definition of a vectorized simd function. */ +/* Return true if we should emit CFI for register REGNO. */ static bool -aarch64_simd_decl_p (tree fndecl) +aarch64_emit_cfi_for_reg_p (unsigned int regno) { - tree fntype; - - if (fndecl == NULL) - return false; - fntype = TREE_TYPE (fndecl); - if (fntype == NULL) - return false; - - /* Functions with the aarch64_vector_pcs attribute use the simd ABI. */ - if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype)) != NULL) - return true; - - return false; + return (GP_REGNUM_P (regno) + || !default_function_abi.clobbers_full_reg_p (regno)); } -/* Return the mode a register save/restore should use. DImode for integer - registers, DFmode for FP registers in non-SIMD functions (they only save - the bottom half of a 128 bit register), or TFmode for FP registers in - SIMD functions. */ +/* Return the mode we should use to save and restore register REGNO. */ static machine_mode -aarch64_reg_save_mode (tree fndecl, unsigned regno) +aarch64_reg_save_mode (unsigned int regno) { - return GP_REGNUM_P (regno) - ? E_DImode - : (aarch64_simd_decl_p (fndecl) ? E_TFmode : E_DFmode); + if (GP_REGNUM_P (regno)) + return DImode; + + if (FP_REGNUM_P (regno)) + switch (crtl->abi->id ()) + { + case ARM_PCS_AAPCS64: + /* Only the low 64 bits are saved by the base PCS. */ + return DFmode; + + case ARM_PCS_SIMD: + /* The vector PCS saves the low 128 bits (which is the full + register on non-SVE targets). */ + return TFmode; + + case ARM_PCS_SVE: + /* Use vectors of DImode for registers that need frame + information, so that the first 64 bytes of the save slot + are always the equivalent of what storing D would give. */ + if (aarch64_emit_cfi_for_reg_p (regno)) + return VNx2DImode; + + /* Use vectors of bytes otherwise, so that the layout is + endian-agnostic, and so that we can use LDR and STR for + big-endian targets. */ + return VNx16QImode; + + case ARM_PCS_TLSDESC: + case ARM_PCS_UNKNOWN: + break; + } + + if (PR_REGNUM_P (regno)) + /* Save the full predicate register. */ + return VNx16BImode; + + gcc_unreachable (); } /* Implement TARGET_INSN_CALLEE_ABI. 
*/ @@ -1943,7 +2084,7 @@ aarch64_hard_regno_call_part_clobbered (unsigned int abi_id, unsigned int regno, machine_mode mode) { - if (FP_REGNUM_P (regno)) + if (FP_REGNUM_P (regno) && abi_id != ARM_PCS_SVE) { poly_int64 per_register_size = GET_MODE_SIZE (mode); unsigned int nregs = hard_regno_nregs (regno, mode); @@ -4582,10 +4723,9 @@ aarch64_split_sve_subreg_move (rtx dest, rtx ptrue, rtx src) } static bool -aarch64_function_ok_for_sibcall (tree decl ATTRIBUTE_UNUSED, - tree exp ATTRIBUTE_UNUSED) +aarch64_function_ok_for_sibcall (tree, tree exp) { - if (aarch64_simd_decl_p (cfun->decl) != aarch64_simd_decl_p (decl)) + if (crtl->abi->id () != expr_callee_abi (exp).id ()) return false; return true; @@ -4594,12 +4734,30 @@ aarch64_function_ok_for_sibcall (tree decl ATTRIBUTE_UNUSED, /* Implement TARGET_PASS_BY_REFERENCE. */ static bool -aarch64_pass_by_reference (cumulative_args_t, const function_arg_info &arg) +aarch64_pass_by_reference (cumulative_args_t pcum_v, + const function_arg_info &arg) { + CUMULATIVE_ARGS *pcum = get_cumulative_args (pcum_v); HOST_WIDE_INT size; machine_mode dummymode; int nregs; + unsigned int num_zr, num_pr; + if (arg.type && aarch64_sve_argument_p (arg.type, &num_zr, &num_pr)) + { + if (pcum && !pcum->silent_p && !TARGET_SVE) + /* We can't gracefully recover at this point, so make this a + fatal error. */ + fatal_error (input_location, "arguments of type %qT require" + " the SVE ISA extension", arg.type); + + /* Variadic SVE types are passed by reference. Normal non-variadic + arguments are too if we've run out of registers. */ + return (!arg.named + || pcum->aapcs_nvrn + num_zr > NUM_FP_ARG_REGS + || pcum->aapcs_nprn + num_pr > NUM_PR_ARG_REGS); + } + /* GET_MODE_SIZE (BLKmode) is useless since it is 0. */ if (arg.mode == BLKmode && arg.type) size = int_size_in_bytes (arg.type); @@ -4673,6 +4831,29 @@ aarch64_function_value (const_tree type, const_tree func, if (INTEGRAL_TYPE_P (type)) mode = promote_function_mode (type, mode, &unsignedp, func, 1); + unsigned int num_zr, num_pr; + if (type && aarch64_sve_argument_p (type, &num_zr, &num_pr)) + { + /* Don't raise an error here if we're called when SVE is disabled, + since this is really just a query function. Other code must + do that where appropriate. */ + mode = TYPE_MODE_RAW (type); + gcc_assert (VECTOR_MODE_P (mode) + && (!TARGET_SVE || aarch64_sve_mode_p (mode))); + + if (num_zr > 0 && num_pr == 0) + return gen_rtx_REG (mode, V0_REGNUM); + + if (num_zr == 0 && num_pr == 1) + return gen_rtx_REG (mode, P0_REGNUM); + + gcc_unreachable (); + } + + /* Generic vectors that map to SVE modes with -msve-vector-bits=N are + returned in memory, not by value. */ + gcc_assert (!aarch64_sve_mode_p (mode)); + if (aarch64_return_in_msb (type)) { HOST_WIDE_INT size = int_size_in_bytes (type); @@ -4755,6 +4936,16 @@ aarch64_return_in_memory (const_tree type, const_tree fndecl ATTRIBUTE_UNUSED) /* Simple scalar types always returned in registers. */ return false; + unsigned int num_zr, num_pr; + if (type && aarch64_sve_argument_p (type, &num_zr, &num_pr)) + { + /* All SVE types we support fit in registers. For example, it isn't + yet possible to define an aggregate of 9+ SVE vectors or 5+ SVE + predicates. */ + gcc_assert (num_zr <= NUM_FP_ARG_REGS && num_pr <= NUM_PR_ARG_REGS); + return false; + } + if (aarch64_vfp_is_call_or_return_candidate (TYPE_MODE (type), type, &ag_mode, @@ -4830,11 +5021,11 @@ aarch64_function_arg_alignment (machine_mode mode, const_tree type, numbers refer to the rule numbers in the AAPCS64. 
*/ static void -aarch64_layout_arg (cumulative_args_t pcum_v, machine_mode mode, - const_tree type, - bool named ATTRIBUTE_UNUSED) +aarch64_layout_arg (cumulative_args_t pcum_v, const function_arg_info &arg) { CUMULATIVE_ARGS *pcum = get_cumulative_args (pcum_v); + tree type = arg.type; + machine_mode mode = arg.mode; int ncrn, nvrn, nregs; bool allocate_ncrn, allocate_nvrn; HOST_WIDE_INT size; @@ -4846,6 +5037,46 @@ aarch64_layout_arg (cumulative_args_t pcum_v, machine_mode mode, pcum->aapcs_arg_processed = true; + unsigned int num_zr, num_pr; + if (type && aarch64_sve_argument_p (type, &num_zr, &num_pr)) + { + /* The PCS says that it is invalid to pass an SVE value to an + unprototyped function. There is no ABI-defined location we + can return in this case, so we have no real choice but to raise + an error immediately, even though this is only a query function. */ + if (arg.named && pcum->pcs_variant != ARM_PCS_SVE) + { + gcc_assert (!pcum->silent_p); + error ("SVE type %qT cannot be passed to an unprototyped function", + arg.type); + /* Avoid repeating the message, and avoid tripping the assert + below. */ + pcum->pcs_variant = ARM_PCS_SVE; + } + + /* We would have converted the argument into pass-by-reference + form if it didn't fit in registers. */ + pcum->aapcs_nextnvrn = pcum->aapcs_nvrn + num_zr; + pcum->aapcs_nextnprn = pcum->aapcs_nprn + num_pr; + gcc_assert (arg.named + && pcum->pcs_variant == ARM_PCS_SVE + && aarch64_sve_mode_p (mode) + && pcum->aapcs_nextnvrn <= NUM_FP_ARG_REGS + && pcum->aapcs_nextnprn <= NUM_PR_ARG_REGS); + + if (num_zr > 0 && num_pr == 0) + pcum->aapcs_reg = gen_rtx_REG (mode, V0_REGNUM + pcum->aapcs_nvrn); + else if (num_zr == 0 && num_pr == 1) + pcum->aapcs_reg = gen_rtx_REG (mode, P0_REGNUM + pcum->aapcs_nprn); + else + gcc_unreachable (); + return; + } + + /* Generic vectors that map to SVE modes with -msve-vector-bits=N are + passed by reference, not by value. */ + gcc_assert (!aarch64_sve_mode_p (mode)); + /* Size in bytes, rounded to the nearest multiple of 8 bytes. */ if (type) size = int_size_in_bytes (type); @@ -4870,7 +5101,7 @@ aarch64_layout_arg (cumulative_args_t pcum_v, machine_mode mode, and homogenous short-vector aggregates (HVA). 
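For example, struct { float x, y; } is an HFA that occupies two consecutive SIMD/FP registers when enough are free, and a struct of two 128-bit short vectors is an HVA that occupies two consecutive Q registers; both are counted against NUM_FP_ARG_REGS by the nvrn + nregs check below.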
*/ if (allocate_nvrn) { - if (!TARGET_FLOAT) + if (!pcum->silent_p && !TARGET_FLOAT) aarch64_err_no_fpadvsimd (mode); if (nvrn + nregs <= NUM_FP_ARG_REGS) @@ -4990,12 +5221,13 @@ aarch64_function_arg (cumulative_args_t pcum_v, const function_arg_info &arg) { CUMULATIVE_ARGS *pcum = get_cumulative_args (pcum_v); gcc_assert (pcum->pcs_variant == ARM_PCS_AAPCS64 - || pcum->pcs_variant == ARM_PCS_SIMD); + || pcum->pcs_variant == ARM_PCS_SIMD + || pcum->pcs_variant == ARM_PCS_SVE); if (arg.end_marker_p ()) return gen_int_mode (pcum->pcs_variant, DImode); - aarch64_layout_arg (pcum_v, arg.mode, arg.type, arg.named); + aarch64_layout_arg (pcum_v, arg); return pcum->aapcs_reg; } @@ -5004,12 +5236,15 @@ aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum, const_tree fntype, rtx libname ATTRIBUTE_UNUSED, const_tree fndecl ATTRIBUTE_UNUSED, - unsigned n_named ATTRIBUTE_UNUSED) + unsigned n_named ATTRIBUTE_UNUSED, + bool silent_p) { pcum->aapcs_ncrn = 0; pcum->aapcs_nvrn = 0; + pcum->aapcs_nprn = 0; pcum->aapcs_nextncrn = 0; pcum->aapcs_nextnvrn = 0; + pcum->aapcs_nextnprn = 0; if (fntype) pcum->pcs_variant = (arm_pcs) fntype_abi (fntype).id (); else @@ -5018,8 +5253,10 @@ aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum, pcum->aapcs_arg_processed = false; pcum->aapcs_stack_words = 0; pcum->aapcs_stack_size = 0; + pcum->silent_p = silent_p; - if (!TARGET_FLOAT + if (!silent_p + && !TARGET_FLOAT && fndecl && TREE_PUBLIC (fndecl) && fntype && fntype != error_mark_node) { @@ -5030,7 +5267,20 @@ aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum, &mode, &nregs, NULL)) aarch64_err_no_fpadvsimd (TYPE_MODE (type)); } - return; + + if (!silent_p + && !TARGET_SVE + && pcum->pcs_variant == ARM_PCS_SVE) + { + /* We can't gracefully recover at this point, so make this a + fatal error. */ + if (fndecl) + fatal_error (input_location, "%qE requires the SVE ISA extension", + fndecl); + else + fatal_error (input_location, "calls to functions of type %qT require" + " the SVE ISA extension", fntype); + } } static void @@ -5039,14 +5289,16 @@ aarch64_function_arg_advance (cumulative_args_t pcum_v, { CUMULATIVE_ARGS *pcum = get_cumulative_args (pcum_v); if (pcum->pcs_variant == ARM_PCS_AAPCS64 - || pcum->pcs_variant == ARM_PCS_SIMD) + || pcum->pcs_variant == ARM_PCS_SIMD + || pcum->pcs_variant == ARM_PCS_SVE) { - aarch64_layout_arg (pcum_v, arg.mode, arg.type, arg.named); + aarch64_layout_arg (pcum_v, arg); gcc_assert ((pcum->aapcs_reg != NULL_RTX) != (pcum->aapcs_stack_words != 0)); pcum->aapcs_arg_processed = false; pcum->aapcs_ncrn = pcum->aapcs_nextncrn; pcum->aapcs_nvrn = pcum->aapcs_nextnvrn; + pcum->aapcs_nprn = pcum->aapcs_nextnprn; pcum->aapcs_stack_size += pcum->aapcs_stack_words; pcum->aapcs_stack_words = 0; pcum->aapcs_reg = NULL_RTX; @@ -5479,9 +5731,11 @@ aarch64_needs_frame_chain (void) static void aarch64_layout_frame (void) { - HOST_WIDE_INT offset = 0; + poly_int64 offset = 0; int regno, last_fp_reg = INVALID_REGNUM; - bool simd_function = (crtl->abi->id () == ARM_PCS_SIMD); + machine_mode vector_save_mode = aarch64_reg_save_mode (V8_REGNUM); + poly_int64 vector_save_size = GET_MODE_SIZE (vector_save_mode); + bool frame_related_fp_reg_p = false; aarch64_frame &frame = cfun->machine->frame; frame.emit_frame_chain = aarch64_needs_frame_chain (); @@ -5495,12 +5749,10 @@ aarch64_layout_frame (void) frame.wb_candidate1 = INVALID_REGNUM; frame.wb_candidate2 = INVALID_REGNUM; + frame.spare_pred_reg = INVALID_REGNUM; /* First mark all the registers that really need to be saved... 
*/ - for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++) - frame.reg_offset[regno] = SLOT_NOT_REQUIRED; - - for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++) + for (regno = 0; regno <= LAST_SAVED_REGNUM; regno++) frame.reg_offset[regno] = SLOT_NOT_REQUIRED; /* ... that includes the eh data registers (if needed)... */ @@ -5523,25 +5775,83 @@ aarch64_layout_frame (void) { frame.reg_offset[regno] = SLOT_REQUIRED; last_fp_reg = regno; + if (aarch64_emit_cfi_for_reg_p (regno)) + frame_related_fp_reg_p = true; } + /* Big-endian SVE frames need a spare predicate register in order + to save Z8-Z15. Decide which register they should use. Prefer + an unused argument register if possible, so that we don't force P4 + to be saved unnecessarily. */ + if (frame_related_fp_reg_p + && crtl->abi->id () == ARM_PCS_SVE + && BYTES_BIG_ENDIAN) + { + bitmap live1 = df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + bitmap live2 = df_get_live_in (EXIT_BLOCK_PTR_FOR_FN (cfun)); + for (regno = P0_REGNUM; regno <= P7_REGNUM; regno++) + if (!bitmap_bit_p (live1, regno) && !bitmap_bit_p (live2, regno)) + break; + gcc_assert (regno <= P7_REGNUM); + frame.spare_pred_reg = regno; + df_set_regs_ever_live (regno, true); + } + + for (regno = P0_REGNUM; regno <= P15_REGNUM; regno++) + if (df_regs_ever_live_p (regno) + && !fixed_regs[regno] + && !crtl->abi->clobbers_full_reg_p (regno)) + frame.reg_offset[regno] = SLOT_REQUIRED; + + /* With stack-clash, LR must be saved in non-leaf functions. */ + gcc_assert (crtl->is_leaf + || maybe_ne (frame.reg_offset[R30_REGNUM], SLOT_NOT_REQUIRED)); + + /* Now assign stack slots for the registers. Start with the predicate + registers, since predicate LDR and STR have a relatively small + offset range. These saves happen below the hard frame pointer. */ + for (regno = P0_REGNUM; regno <= P15_REGNUM; regno++) + if (known_eq (frame.reg_offset[regno], SLOT_REQUIRED)) + { + frame.reg_offset[regno] = offset; + offset += BYTES_PER_SVE_PRED; + } + + /* We save a maximum of 8 predicate registers, and since vector + registers are 8 times the size of a predicate register, all the + saved predicates fit within a single vector. Doing this also + rounds the offset to a 128-bit boundary. */ + if (maybe_ne (offset, 0)) + { + gcc_assert (known_le (offset, vector_save_size)); + offset = vector_save_size; + } + + /* If we need to save any SVE vector registers, add them next. */ + if (last_fp_reg != (int) INVALID_REGNUM && crtl->abi->id () == ARM_PCS_SVE) + for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++) + if (known_eq (frame.reg_offset[regno], SLOT_REQUIRED)) + { + frame.reg_offset[regno] = offset; + offset += vector_save_size; + } + + /* OFFSET is now the offset of the hard frame pointer from the bottom + of the callee save area. */ + bool saves_below_hard_fp_p = maybe_ne (offset, 0); + frame.below_hard_fp_saved_regs_size = offset; if (frame.emit_frame_chain) { /* FP and LR are placed in the linkage record. */ - frame.reg_offset[R29_REGNUM] = 0; + frame.reg_offset[R29_REGNUM] = offset; frame.wb_candidate1 = R29_REGNUM; - frame.reg_offset[R30_REGNUM] = UNITS_PER_WORD; + frame.reg_offset[R30_REGNUM] = offset + UNITS_PER_WORD; frame.wb_candidate2 = R30_REGNUM; - offset = 2 * UNITS_PER_WORD; + offset += 2 * UNITS_PER_WORD; } - /* With stack-clash, LR must be saved in non-leaf functions. */ - gcc_assert (crtl->is_leaf - || frame.reg_offset[R30_REGNUM] != SLOT_NOT_REQUIRED); - - /* Now assign stack slots for them. 
*/ for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++) - if (frame.reg_offset[regno] == SLOT_REQUIRED) + if (known_eq (frame.reg_offset[regno], SLOT_REQUIRED)) { frame.reg_offset[regno] = offset; if (frame.wb_candidate1 == INVALID_REGNUM) @@ -5551,19 +5861,19 @@ aarch64_layout_frame (void) offset += UNITS_PER_WORD; } - HOST_WIDE_INT max_int_offset = offset; - offset = ROUND_UP (offset, STACK_BOUNDARY / BITS_PER_UNIT); - bool has_align_gap = offset != max_int_offset; + poly_int64 max_int_offset = offset; + offset = aligned_upper_bound (offset, STACK_BOUNDARY / BITS_PER_UNIT); + bool has_align_gap = maybe_ne (offset, max_int_offset); for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++) - if (frame.reg_offset[regno] == SLOT_REQUIRED) + if (known_eq (frame.reg_offset[regno], SLOT_REQUIRED)) { /* If there is an alignment gap between integer and fp callee-saves, allocate the last fp register to it if possible. */ if (regno == last_fp_reg && has_align_gap - && !simd_function - && (offset & 8) == 0) + && known_eq (vector_save_size, 8) + && multiple_p (offset, 16)) { frame.reg_offset[regno] = max_int_offset; break; @@ -5575,31 +5885,34 @@ aarch64_layout_frame (void) else if (frame.wb_candidate2 == INVALID_REGNUM && frame.wb_candidate1 >= V0_REGNUM) frame.wb_candidate2 = regno; - offset += simd_function ? UNITS_PER_VREG : UNITS_PER_WORD; + offset += vector_save_size; } - offset = ROUND_UP (offset, STACK_BOUNDARY / BITS_PER_UNIT); + offset = aligned_upper_bound (offset, STACK_BOUNDARY / BITS_PER_UNIT); frame.saved_regs_size = offset; - HOST_WIDE_INT varargs_and_saved_regs_size - = offset + frame.saved_varargs_size; + poly_int64 varargs_and_saved_regs_size = offset + frame.saved_varargs_size; - frame.hard_fp_offset + poly_int64 above_outgoing_args = aligned_upper_bound (varargs_and_saved_regs_size + get_frame_size (), STACK_BOUNDARY / BITS_PER_UNIT); + frame.hard_fp_offset + = above_outgoing_args - frame.below_hard_fp_saved_regs_size; + /* Both these values are already aligned. */ gcc_assert (multiple_p (crtl->outgoing_args_size, STACK_BOUNDARY / BITS_PER_UNIT)); - frame.frame_size = frame.hard_fp_offset + crtl->outgoing_args_size; + frame.frame_size = above_outgoing_args + crtl->outgoing_args_size; frame.locals_offset = frame.saved_varargs_size; frame.initial_adjust = 0; frame.final_adjust = 0; frame.callee_adjust = 0; + frame.sve_callee_adjust = 0; frame.callee_offset = 0; HOST_WIDE_INT max_push_offset = 0; @@ -5609,53 +5922,86 @@ aarch64_layout_frame (void) max_push_offset = 256; HOST_WIDE_INT const_size, const_outgoing_args_size, const_fp_offset; + HOST_WIDE_INT const_saved_regs_size; if (frame.frame_size.is_constant (&const_size) && const_size < max_push_offset - && known_eq (crtl->outgoing_args_size, 0)) + && known_eq (frame.hard_fp_offset, const_size)) { /* Simple, small frame with no outgoing arguments: + stp reg1, reg2, [sp, -frame_size]! stp reg3, reg4, [sp, 16] */ frame.callee_adjust = const_size; } else if (crtl->outgoing_args_size.is_constant (&const_outgoing_args_size) - && const_outgoing_args_size + frame.saved_regs_size < 512 + && frame.saved_regs_size.is_constant (&const_saved_regs_size) + && const_outgoing_args_size + const_saved_regs_size < 512 + /* We could handle this case even with outgoing args, provided + that the number of args left us with valid offsets for all + predicate and vector save slots. It's such a rare case that + it hardly seems worth the effort though. 
*/ + && (!saves_below_hard_fp_p || const_outgoing_args_size == 0) && !(cfun->calls_alloca && frame.hard_fp_offset.is_constant (&const_fp_offset) && const_fp_offset < max_push_offset)) { /* Frame with small outgoing arguments: + sub sp, sp, frame_size stp reg1, reg2, [sp, outgoing_args_size] stp reg3, reg4, [sp, outgoing_args_size + 16] */ frame.initial_adjust = frame.frame_size; frame.callee_offset = const_outgoing_args_size; } + else if (saves_below_hard_fp_p + && known_eq (frame.saved_regs_size, + frame.below_hard_fp_saved_regs_size)) + { + /* Frame in which all saves are SVE saves: + + sub sp, sp, hard_fp_offset + below_hard_fp_saved_regs_size + save SVE registers relative to SP + sub sp, sp, outgoing_args_size */ + frame.initial_adjust = (frame.hard_fp_offset + + frame.below_hard_fp_saved_regs_size); + frame.final_adjust = crtl->outgoing_args_size; + } else if (frame.hard_fp_offset.is_constant (&const_fp_offset) && const_fp_offset < max_push_offset) { - /* Frame with large outgoing arguments but a small local area: + /* Frame with large outgoing arguments or SVE saves, but with + a small local area: + stp reg1, reg2, [sp, -hard_fp_offset]! stp reg3, reg4, [sp, 16] + [sub sp, sp, below_hard_fp_saved_regs_size] + [save SVE registers relative to SP] sub sp, sp, outgoing_args_size */ frame.callee_adjust = const_fp_offset; + frame.sve_callee_adjust = frame.below_hard_fp_saved_regs_size; frame.final_adjust = crtl->outgoing_args_size; } else { - /* Frame with large local area and outgoing arguments using frame pointer: + /* Frame with large local area and outgoing arguments or SVE saves, + using frame pointer: + sub sp, sp, hard_fp_offset stp x29, x30, [sp, 0] add x29, sp, 0 stp reg3, reg4, [sp, 16] + [sub sp, sp, below_hard_fp_saved_regs_size] + [save SVE registers relative to SP] sub sp, sp, outgoing_args_size */ frame.initial_adjust = frame.hard_fp_offset; + frame.sve_callee_adjust = frame.below_hard_fp_saved_regs_size; frame.final_adjust = crtl->outgoing_args_size; } /* Make sure the individual adjustments add up to the full frame size. */ gcc_assert (known_eq (frame.initial_adjust + frame.callee_adjust + + frame.sve_callee_adjust + frame.final_adjust, frame.frame_size)); frame.laid_out = true; @@ -5667,7 +6013,7 @@ aarch64_layout_frame (void) static bool aarch64_register_saved_on_entry (int regno) { - return cfun->machine->frame.reg_offset[regno] >= 0; + return known_ge (cfun->machine->frame.reg_offset[regno], 0); } /* Return the next register up from REGNO up to LIMIT for the callee @@ -5734,7 +6080,7 @@ static void aarch64_push_regs (unsigned regno1, unsigned regno2, HOST_WIDE_INT adjustment) { rtx_insn *insn; - machine_mode mode = aarch64_reg_save_mode (cfun->decl, regno1); + machine_mode mode = aarch64_reg_save_mode (regno1); if (regno2 == INVALID_REGNUM) return aarch64_pushwb_single_reg (mode, regno1, adjustment); @@ -5780,7 +6126,7 @@ static void aarch64_pop_regs (unsigned regno1, unsigned regno2, HOST_WIDE_INT adjustment, rtx *cfi_ops) { - machine_mode mode = aarch64_reg_save_mode (cfun->decl, regno1); + machine_mode mode = aarch64_reg_save_mode (regno1); rtx reg1 = gen_rtx_REG (mode, regno1); *cfi_ops = alloc_reg_note (REG_CFA_RESTORE, reg1, *cfi_ops); @@ -5859,7 +6205,7 @@ aarch64_return_address_signing_enabled (void) if its LR is pushed onto stack. 
*/ return (aarch64_ra_sign_scope == AARCH64_FUNCTION_ALL || (aarch64_ra_sign_scope == AARCH64_FUNCTION_NON_LEAF - && cfun->machine->frame.reg_offset[LR_REGNUM] >= 0)); + && known_ge (cfun->machine->frame.reg_offset[LR_REGNUM], 0))); } /* Return TRUE if Branch Target Identification Mechanism is enabled. */ @@ -5869,17 +6215,75 @@ aarch64_bti_enabled (void) { return (aarch64_enable_bti == 1); } +/* The caller is going to use ST1D or LD1D to save or restore an SVE + register in mode MODE at BASE_RTX + OFFSET, where OFFSET is in + the range [1, 16] * GET_MODE_SIZE (MODE). Prepare for this by: + + (1) updating BASE_RTX + OFFSET so that it is a legitimate ST1D + or LD1D address + + (2) setting PTRUE to a valid predicate register for the ST1D or LD1D, + if PTRUE is not already set + + (1) is needed when OFFSET is in the range [8, 16] * GET_MODE_SIZE (MODE). + Handle this case using a temporary base register that is suitable for + all offsets in that range. Use ANCHOR_REG as this base register if it + is nonnull, otherwise create a new register and store it in ANCHOR_REG. */ + +static inline void +aarch64_adjust_sve_callee_save_base (machine_mode mode, rtx &base_rtx, + rtx &anchor_reg, poly_int64 &offset, + rtx &ptrue) +{ + if (maybe_ge (offset, 8 * GET_MODE_SIZE (mode))) + { + /* This is the maximum valid offset of the anchor from the base. + Lower values would be valid too. */ + poly_int64 anchor_offset = 16 * GET_MODE_SIZE (mode); + if (!anchor_reg) + { + anchor_reg = gen_rtx_REG (Pmode, STACK_CLASH_SVE_CFA_REGNUM); + emit_insn (gen_add3_insn (anchor_reg, base_rtx, + gen_int_mode (anchor_offset, Pmode))); + } + base_rtx = anchor_reg; + offset -= anchor_offset; + } + if (!ptrue) + { + int pred_reg = cfun->machine->frame.spare_pred_reg; + emit_move_insn (gen_rtx_REG (VNx16BImode, pred_reg), + CONSTM1_RTX (VNx16BImode)); + ptrue = gen_rtx_REG (VNx2BImode, pred_reg); + } +} + +/* Add a REG_CFA_EXPRESSION note to INSN to say that register REG + is saved at BASE + OFFSET. */ + +static void +aarch64_add_cfa_expression (rtx_insn *insn, rtx reg, + rtx base, poly_int64 offset) +{ + rtx mem = gen_frame_mem (GET_MODE (reg), + plus_constant (Pmode, base, offset)); + add_reg_note (insn, REG_CFA_EXPRESSION, gen_rtx_SET (mem, reg)); +} + /* Emit code to save the callee-saved registers from register number START to LIMIT to the stack at the location starting at offset START_OFFSET, - skipping any write-back candidates if SKIP_WB is true. */ + skipping any write-back candidates if SKIP_WB is true. HARD_FP_VALID_P + is true if the hard frame pointer has been set up.
*/ static void -aarch64_save_callee_saves (machine_mode mode, poly_int64 start_offset, - unsigned start, unsigned limit, bool skip_wb) +aarch64_save_callee_saves (poly_int64 start_offset, + unsigned start, unsigned limit, bool skip_wb, + bool hard_fp_valid_p) { rtx_insn *insn; unsigned regno; unsigned regno2; + rtx anchor_reg = NULL_RTX, ptrue = NULL_RTX; for (regno = aarch64_next_callee_save (start, limit); regno <= limit; @@ -5887,7 +6291,7 @@ aarch64_save_callee_saves (machine_mode mode, poly_int64 start_offset, { rtx reg, mem; poly_int64 offset; - int offset_diff; + bool frame_related_p = aarch64_emit_cfi_for_reg_p (regno); if (skip_wb && (regno == cfun->machine->frame.wb_candidate1 @@ -5895,27 +6299,53 @@ aarch64_save_callee_saves (machine_mode mode, poly_int64 start_offset, continue; if (cfun->machine->reg_is_wrapped_separately[regno]) - continue; + continue; + machine_mode mode = aarch64_reg_save_mode (regno); reg = gen_rtx_REG (mode, regno); offset = start_offset + cfun->machine->frame.reg_offset[regno]; - mem = gen_frame_mem (mode, plus_constant (Pmode, stack_pointer_rtx, - offset)); + rtx base_rtx = stack_pointer_rtx; + poly_int64 sp_offset = offset; - regno2 = aarch64_next_callee_save (regno + 1, limit); - offset_diff = cfun->machine->frame.reg_offset[regno2] - - cfun->machine->frame.reg_offset[regno]; + HOST_WIDE_INT const_offset; + if (mode == VNx2DImode && BYTES_BIG_ENDIAN) + aarch64_adjust_sve_callee_save_base (mode, base_rtx, anchor_reg, + offset, ptrue); + else if (GP_REGNUM_P (regno) + && (!offset.is_constant (&const_offset) || const_offset >= 512)) + { + gcc_assert (known_eq (start_offset, 0)); + poly_int64 fp_offset + = cfun->machine->frame.below_hard_fp_saved_regs_size; + if (hard_fp_valid_p) + base_rtx = hard_frame_pointer_rtx; + else + { + if (!anchor_reg) + { + anchor_reg = gen_rtx_REG (Pmode, STACK_CLASH_SVE_CFA_REGNUM); + emit_insn (gen_add3_insn (anchor_reg, base_rtx, + gen_int_mode (fp_offset, Pmode))); + } + base_rtx = anchor_reg; + } + offset -= fp_offset; + } + mem = gen_frame_mem (mode, plus_constant (Pmode, base_rtx, offset)); + bool need_cfa_note_p = (base_rtx != stack_pointer_rtx); - if (regno2 <= limit + if (!aarch64_sve_mode_p (mode) + && (regno2 = aarch64_next_callee_save (regno + 1, limit)) <= limit && !cfun->machine->reg_is_wrapped_separately[regno2] - && known_eq (GET_MODE_SIZE (mode), offset_diff)) + && known_eq (GET_MODE_SIZE (mode), + cfun->machine->frame.reg_offset[regno2] + - cfun->machine->frame.reg_offset[regno])) { rtx reg2 = gen_rtx_REG (mode, regno2); rtx mem2; - offset = start_offset + cfun->machine->frame.reg_offset[regno2]; - mem2 = gen_frame_mem (mode, plus_constant (Pmode, stack_pointer_rtx, - offset)); + offset += GET_MODE_SIZE (mode); + mem2 = gen_frame_mem (mode, plus_constant (Pmode, base_rtx, offset)); insn = emit_insn (aarch64_gen_store_pair (mode, mem, reg, mem2, reg2)); @@ -5923,71 +6353,96 @@ aarch64_save_callee_saves (machine_mode mode, poly_int64 start_offset, always assumed to be relevant to the frame calculations; subsequent parts, are only frame-related if explicitly marked. 
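For example, when an STP saves two registers as a single PARALLEL, only the first SET is implicitly frame-related; the second save only produces CFI once RTX_FRAME_RELATED_P is set on element 1 of the PARALLEL, which is what the code below does for REGNO2 when it needs CFI.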
*/ - RTX_FRAME_RELATED_P (XVECEXP (PATTERN (insn), 0, 1)) = 1; + if (aarch64_emit_cfi_for_reg_p (regno2)) + { + if (need_cfa_note_p) + aarch64_add_cfa_expression (insn, reg2, stack_pointer_rtx, + sp_offset + GET_MODE_SIZE (mode)); + else + RTX_FRAME_RELATED_P (XVECEXP (PATTERN (insn), 0, 1)) = 1; + } + regno = regno2; } + else if (mode == VNx2DImode && BYTES_BIG_ENDIAN) + { + insn = emit_insn (gen_aarch64_pred_mov (mode, mem, ptrue, reg)); + need_cfa_note_p = true; + } + else if (aarch64_sve_mode_p (mode)) + insn = emit_insn (gen_rtx_SET (mem, reg)); else insn = emit_move_insn (mem, reg); - RTX_FRAME_RELATED_P (insn) = 1; + RTX_FRAME_RELATED_P (insn) = frame_related_p; + if (frame_related_p && need_cfa_note_p) + aarch64_add_cfa_expression (insn, reg, stack_pointer_rtx, sp_offset); } } -/* Emit code to restore the callee registers of mode MODE from register - number START up to and including LIMIT. Restore from the stack offset - START_OFFSET, skipping any write-back candidates if SKIP_WB is true. - Write the appropriate REG_CFA_RESTORE notes into CFI_OPS. */ +/* Emit code to restore the callee registers from register number START + up to and including LIMIT. Restore from the stack offset START_OFFSET, + skipping any write-back candidates if SKIP_WB is true. Write the + appropriate REG_CFA_RESTORE notes into CFI_OPS. */ static void -aarch64_restore_callee_saves (machine_mode mode, - poly_int64 start_offset, unsigned start, +aarch64_restore_callee_saves (poly_int64 start_offset, unsigned start, unsigned limit, bool skip_wb, rtx *cfi_ops) { - rtx base_rtx = stack_pointer_rtx; unsigned regno; unsigned regno2; poly_int64 offset; + rtx anchor_reg = NULL_RTX, ptrue = NULL_RTX; for (regno = aarch64_next_callee_save (start, limit); regno <= limit; regno = aarch64_next_callee_save (regno + 1, limit)) { + bool frame_related_p = aarch64_emit_cfi_for_reg_p (regno); if (cfun->machine->reg_is_wrapped_separately[regno]) - continue; + continue; rtx reg, mem; - int offset_diff; if (skip_wb && (regno == cfun->machine->frame.wb_candidate1 || regno == cfun->machine->frame.wb_candidate2)) continue; + machine_mode mode = aarch64_reg_save_mode (regno); reg = gen_rtx_REG (mode, regno); offset = start_offset + cfun->machine->frame.reg_offset[regno]; + rtx base_rtx = stack_pointer_rtx; + if (mode == VNx2DImode && BYTES_BIG_ENDIAN) + aarch64_adjust_sve_callee_save_base (mode, base_rtx, anchor_reg, + offset, ptrue); mem = gen_frame_mem (mode, plus_constant (Pmode, base_rtx, offset)); - regno2 = aarch64_next_callee_save (regno + 1, limit); - offset_diff = cfun->machine->frame.reg_offset[regno2] - - cfun->machine->frame.reg_offset[regno]; - - if (regno2 <= limit + if (!aarch64_sve_mode_p (mode) + && (regno2 = aarch64_next_callee_save (regno + 1, limit)) <= limit && !cfun->machine->reg_is_wrapped_separately[regno2] - && known_eq (GET_MODE_SIZE (mode), offset_diff)) + && known_eq (GET_MODE_SIZE (mode), + cfun->machine->frame.reg_offset[regno2] + - cfun->machine->frame.reg_offset[regno])) { rtx reg2 = gen_rtx_REG (mode, regno2); rtx mem2; - offset = start_offset + cfun->machine->frame.reg_offset[regno2]; + offset += GET_MODE_SIZE (mode); mem2 = gen_frame_mem (mode, plus_constant (Pmode, base_rtx, offset)); emit_insn (aarch64_gen_load_pair (mode, reg, mem, reg2, mem2)); *cfi_ops = alloc_reg_note (REG_CFA_RESTORE, reg2, *cfi_ops); regno = regno2; } + else if (mode == VNx2DImode && BYTES_BIG_ENDIAN) + emit_insn (gen_aarch64_pred_mov (mode, reg, ptrue, mem)); + else if (aarch64_sve_mode_p (mode)) + emit_insn (gen_rtx_SET (reg, 
mem)); else emit_move_insn (reg, mem); - *cfi_ops = alloc_reg_note (REG_CFA_RESTORE, reg, *cfi_ops); + if (frame_related_p) + *cfi_ops = alloc_reg_note (REG_CFA_RESTORE, reg, *cfi_ops); } } @@ -6069,13 +6524,35 @@ aarch64_get_separate_components (void) for (unsigned regno = 0; regno <= LAST_SAVED_REGNUM; regno++) if (aarch64_register_saved_on_entry (regno)) { + /* Punt on saves and restores that use ST1D and LD1D. We could + try to be smarter, but it would involve making sure that the + spare predicate register itself is safe to use at the save + and restore points. Also, when a frame pointer is being used, + the slots are often out of reach of ST1D and LD1D anyway. */ + machine_mode mode = aarch64_reg_save_mode (regno); + if (mode == VNx2DImode && BYTES_BIG_ENDIAN) + continue; + poly_int64 offset = cfun->machine->frame.reg_offset[regno]; - if (!frame_pointer_needed) - offset += cfun->machine->frame.frame_size - - cfun->machine->frame.hard_fp_offset; + + /* If the register is saved in the first SVE save slot, we use + it as a stack probe for -fstack-clash-protection. */ + if (flag_stack_clash_protection + && maybe_ne (cfun->machine->frame.below_hard_fp_saved_regs_size, 0) + && known_eq (offset, 0)) + continue; + + /* Get the offset relative to the register we'll use. */ + if (frame_pointer_needed) + offset -= cfun->machine->frame.below_hard_fp_saved_regs_size; + else + offset += crtl->outgoing_args_size; + /* Check that we can access the stack slot of the register with one direct load with no adjustments needed. */ - if (offset_12bit_unsigned_scaled_p (DImode, offset)) + if (aarch64_sve_mode_p (mode) + ? offset_9bit_signed_scaled_p (mode, offset) + : offset_12bit_unsigned_scaled_p (mode, offset)) bitmap_set_bit (components, regno); } @@ -6083,6 +6560,12 @@ aarch64_get_separate_components (void) if (frame_pointer_needed) bitmap_clear_bit (components, HARD_FRAME_POINTER_REGNUM); + /* If the spare predicate register used by big-endian SVE code + is call-preserved, it must be saved in the main prologue + before any saves that use it. */ + if (cfun->machine->frame.spare_pred_reg != INVALID_REGNUM) + bitmap_clear_bit (components, cfun->machine->frame.spare_pred_reg); + unsigned reg1 = cfun->machine->frame.wb_candidate1; unsigned reg2 = cfun->machine->frame.wb_candidate2; /* If registers have been chosen to be stored/restored with @@ -6136,18 +6619,19 @@ aarch64_components_for_bb (basic_block bb) || bitmap_bit_p (gen, regno) || bitmap_bit_p (kill, regno))) { - unsigned regno2, offset, offset2; bitmap_set_bit (components, regno); /* If there is a callee-save at an adjacent offset, add it too to increase the use of LDP/STP. */ - offset = cfun->machine->frame.reg_offset[regno]; - regno2 = ((offset & 8) == 0) ? regno + 1 : regno - 1; + poly_int64 offset = cfun->machine->frame.reg_offset[regno]; + unsigned regno2 = multiple_p (offset, 16) ? regno + 1 : regno - 1; if (regno2 <= LAST_SAVED_REGNUM) { - offset2 = cfun->machine->frame.reg_offset[regno2]; - if ((offset & ~8) == (offset2 & ~8)) + poly_int64 offset2 = cfun->machine->frame.reg_offset[regno2]; + if (regno < regno2 + ? known_eq (offset + 8, offset2) + : multiple_p (offset2, 16) && known_eq (offset2 + 8, offset)) bitmap_set_bit (components, regno2); } } @@ -6202,16 +6686,16 @@ aarch64_process_components (sbitmap components, bool prologue_p) while (regno != last_regno) { - /* AAPCS64 section 5.1.2 requires only the low 64 bits to be saved - so DFmode for the vector registers is enough. For simd functions - we want to save the low 128 bits. 
*/ - machine_mode mode = aarch64_reg_save_mode (cfun->decl, regno); + bool frame_related_p = aarch64_emit_cfi_for_reg_p (regno); + machine_mode mode = aarch64_reg_save_mode (regno); rtx reg = gen_rtx_REG (mode, regno); poly_int64 offset = cfun->machine->frame.reg_offset[regno]; - if (!frame_pointer_needed) - offset += cfun->machine->frame.frame_size - - cfun->machine->frame.hard_fp_offset; + if (frame_pointer_needed) + offset -= cfun->machine->frame.below_hard_fp_saved_regs_size; + else + offset += crtl->outgoing_args_size; + rtx addr = plus_constant (Pmode, ptr_reg, offset); rtx mem = gen_frame_mem (mode, addr); @@ -6222,39 +6706,49 @@ aarch64_process_components (sbitmap components, bool prologue_p) if (regno2 == last_regno) { insn = emit_insn (set); - RTX_FRAME_RELATED_P (insn) = 1; - if (prologue_p) - add_reg_note (insn, REG_CFA_OFFSET, copy_rtx (set)); - else - add_reg_note (insn, REG_CFA_RESTORE, reg); + if (frame_related_p) + { + RTX_FRAME_RELATED_P (insn) = 1; + if (prologue_p) + add_reg_note (insn, REG_CFA_OFFSET, copy_rtx (set)); + else + add_reg_note (insn, REG_CFA_RESTORE, reg); + } break; } poly_int64 offset2 = cfun->machine->frame.reg_offset[regno2]; /* The next register is not of the same class or its offset is not mergeable with the current one into a pair. */ - if (!satisfies_constraint_Ump (mem) + if (aarch64_sve_mode_p (mode) + || !satisfies_constraint_Ump (mem) || GP_REGNUM_P (regno) != GP_REGNUM_P (regno2) || (crtl->abi->id () == ARM_PCS_SIMD && FP_REGNUM_P (regno)) || maybe_ne ((offset2 - cfun->machine->frame.reg_offset[regno]), GET_MODE_SIZE (mode))) { insn = emit_insn (set); - RTX_FRAME_RELATED_P (insn) = 1; - if (prologue_p) - add_reg_note (insn, REG_CFA_OFFSET, copy_rtx (set)); - else - add_reg_note (insn, REG_CFA_RESTORE, reg); + if (frame_related_p) + { + RTX_FRAME_RELATED_P (insn) = 1; + if (prologue_p) + add_reg_note (insn, REG_CFA_OFFSET, copy_rtx (set)); + else + add_reg_note (insn, REG_CFA_RESTORE, reg); + } regno = regno2; continue; } + bool frame_related2_p = aarch64_emit_cfi_for_reg_p (regno2); + /* REGNO2 can be saved/restored in a pair with REGNO. */ rtx reg2 = gen_rtx_REG (mode, regno2); - if (!frame_pointer_needed) - offset2 += cfun->machine->frame.frame_size - - cfun->machine->frame.hard_fp_offset; + if (frame_pointer_needed) + offset2 -= cfun->machine->frame.below_hard_fp_saved_regs_size; + else + offset2 += crtl->outgoing_args_size; rtx addr2 = plus_constant (Pmode, ptr_reg, offset2); rtx mem2 = gen_frame_mem (mode, addr2); rtx set2 = prologue_p ? 
gen_rtx_SET (mem2, reg2) @@ -6265,16 +6759,23 @@ aarch64_process_components (sbitmap components, bool prologue_p) else insn = emit_insn (aarch64_gen_load_pair (mode, reg, mem, reg2, mem2)); - RTX_FRAME_RELATED_P (insn) = 1; - if (prologue_p) + if (frame_related_p || frame_related2_p) { - add_reg_note (insn, REG_CFA_OFFSET, set); - add_reg_note (insn, REG_CFA_OFFSET, set2); - } - else - { - add_reg_note (insn, REG_CFA_RESTORE, reg); - add_reg_note (insn, REG_CFA_RESTORE, reg2); + RTX_FRAME_RELATED_P (insn) = 1; + if (prologue_p) + { + if (frame_related_p) + add_reg_note (insn, REG_CFA_OFFSET, set); + if (frame_related2_p) + add_reg_note (insn, REG_CFA_OFFSET, set2); + } + else + { + if (frame_related_p) + add_reg_note (insn, REG_CFA_RESTORE, reg); + if (frame_related2_p) + add_reg_note (insn, REG_CFA_RESTORE, reg2); + } } regno = aarch64_get_next_set_bit (components, regno2 + 1); @@ -6343,15 +6844,31 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, HOST_WIDE_INT guard_size = 1 << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE); HOST_WIDE_INT guard_used_by_caller = STACK_CLASH_CALLER_GUARD; - /* When doing the final adjustment for the outgoing argument size we can't - assume that LR was saved at position 0. So subtract it's offset from the - ABI safe buffer so that we don't accidentally allow an adjustment that - would result in an allocation larger than the ABI buffer without - probing. */ HOST_WIDE_INT min_probe_threshold - = final_adjustment_p - ? guard_used_by_caller - cfun->machine->frame.reg_offset[LR_REGNUM] - : guard_size - guard_used_by_caller; + = (final_adjustment_p + ? guard_used_by_caller + : guard_size - guard_used_by_caller); + /* When doing the final adjustment for the outgoing arguments, take into + account any unprobed space there is above the current SP. There are + two cases: + + - When saving SVE registers below the hard frame pointer, we force + the lowest save to take place in the prologue before doing the final + adjustment (i.e. we don't allow the save to be shrink-wrapped). + This acts as a probe at SP, so there is no unprobed space. + + - When there are no SVE register saves, we use the store of the link + register as a probe. We can't assume that LR was saved at position 0 + though, so treat any space below it as unprobed. 
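+ As a worked example (using the default parameters rather than anything guaranteed): with a 64KB guard and a 1KB STACK_CLASH_CALLER_GUARD, an LR save 16 bytes above the final SP leaves min_probe_threshold at 1024 - 16 = 1008 bytes for the final adjustment.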
*/ + if (final_adjustment_p + && known_eq (cfun->machine->frame.below_hard_fp_saved_regs_size, 0)) + { + poly_int64 lr_offset = cfun->machine->frame.reg_offset[LR_REGNUM]; + if (known_ge (lr_offset, 0)) + min_probe_threshold -= lr_offset.to_constant (); + else + gcc_assert (!flag_stack_clash_protection || known_eq (poly_size, 0)); + } poly_int64 frame_size = cfun->machine->frame.frame_size; @@ -6361,13 +6878,15 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, if (flag_stack_clash_protection && !final_adjustment_p) { poly_int64 initial_adjust = cfun->machine->frame.initial_adjust; + poly_int64 sve_callee_adjust = cfun->machine->frame.sve_callee_adjust; poly_int64 final_adjust = cfun->machine->frame.final_adjust; if (known_eq (frame_size, 0)) { dump_stack_clash_frame_info (NO_PROBE_NO_FRAME, false); } - else if (known_lt (initial_adjust, guard_size - guard_used_by_caller) + else if (known_lt (initial_adjust + sve_callee_adjust, + guard_size - guard_used_by_caller) && known_lt (final_adjust, guard_used_by_caller)) { dump_stack_clash_frame_info (NO_PROBE_SMALL_FRAME, true); @@ -6571,18 +7090,6 @@ aarch64_epilogue_uses (int regno) return 0; } -/* Add a REG_CFA_EXPRESSION note to INSN to say that register REG - is saved at BASE + OFFSET. */ - -static void -aarch64_add_cfa_expression (rtx_insn *insn, unsigned int reg, - rtx base, poly_int64 offset) -{ - rtx mem = gen_frame_mem (DImode, plus_constant (Pmode, base, offset)); - add_reg_note (insn, REG_CFA_EXPRESSION, - gen_rtx_SET (mem, regno_reg_rtx[reg])); -} - /* AArch64 stack frames generated by this compiler look like: +-------------------------------+ @@ -6604,8 +7111,12 @@ aarch64_add_cfa_expression (rtx_insn *insn, unsigned int reg, +-------------------------------+ | | LR' | | +-------------------------------+ | - | FP' | / <- hard_frame_pointer_rtx (aligned) - +-------------------------------+ + | FP' | | + +-------------------------------+ |<- hard_frame_pointer_rtx (aligned) + | SVE vector registers | | \ + +-------------------------------+ | | below_hard_fp_saved_regs_size + | SVE predicate registers | / / + +-------------------------------+ | dynamic allocation | +-------------------------------+ | padding | @@ -6638,7 +7149,8 @@ aarch64_add_cfa_expression (rtx_insn *insn, unsigned int reg, The following registers are reserved during frame layout and should not be used for any other purpose: - - r11: Used by stack clash protection when SVE is enabled. + - r11: Used by stack clash protection when SVE is enabled, and also + as an anchor register when saving and restoring registers - r12(EP0) and r13(EP1): Used as temporaries for stack adjustment. - r14 and r15: Used for speculation tracking. - r16(IP0), r17(IP1): Used by indirect tailcalls. @@ -6661,11 +7173,23 @@ aarch64_expand_prologue (void) HOST_WIDE_INT callee_adjust = cfun->machine->frame.callee_adjust; poly_int64 final_adjust = cfun->machine->frame.final_adjust; poly_int64 callee_offset = cfun->machine->frame.callee_offset; + poly_int64 sve_callee_adjust = cfun->machine->frame.sve_callee_adjust; + poly_int64 below_hard_fp_saved_regs_size + = cfun->machine->frame.below_hard_fp_saved_regs_size; unsigned reg1 = cfun->machine->frame.wb_candidate1; unsigned reg2 = cfun->machine->frame.wb_candidate2; bool emit_frame_chain = cfun->machine->frame.emit_frame_chain; rtx_insn *insn; + if (flag_stack_clash_protection && known_eq (callee_adjust, 0)) + { + /* Fold the SVE allocation into the initial allocation. 
+ We don't do this in aarch64_layout_frame to avoid pessimizing + the epilogue code. */ + initial_adjust += sve_callee_adjust; + sve_callee_adjust = 0; + } + /* Sign return address for functions. */ if (aarch64_return_address_signing_enabled ()) { @@ -6718,18 +7242,27 @@ if (callee_adjust != 0) aarch64_push_regs (reg1, reg2, callee_adjust); + /* The offset of the frame chain record (if any) from the current SP. */ + poly_int64 chain_offset = (initial_adjust + callee_adjust + - cfun->machine->frame.hard_fp_offset); + gcc_assert (known_ge (chain_offset, 0)); + + /* The offset of the bottom of the save area from the current SP. */ + poly_int64 saved_regs_offset = chain_offset - below_hard_fp_saved_regs_size; + if (emit_frame_chain) { - poly_int64 reg_offset = callee_adjust; if (callee_adjust == 0) { reg1 = R29_REGNUM; reg2 = R30_REGNUM; - reg_offset = callee_offset; - aarch64_save_callee_saves (DImode, reg_offset, reg1, reg2, false); + aarch64_save_callee_saves (saved_regs_offset, reg1, reg2, + false, false); } + else + gcc_assert (known_eq (chain_offset, 0)); aarch64_add_offset (Pmode, hard_frame_pointer_rtx, - stack_pointer_rtx, callee_offset, + stack_pointer_rtx, chain_offset, tmp1_rtx, tmp0_rtx, frame_pointer_needed); if (frame_pointer_needed && !frame_size.is_constant ()) { @@ -6756,23 +7289,31 @@ /* Change the save slot expressions for the registers that we've already saved. */ - reg_offset -= callee_offset; - aarch64_add_cfa_expression (insn, reg2, hard_frame_pointer_rtx, - reg_offset + UNITS_PER_WORD); - aarch64_add_cfa_expression (insn, reg1, hard_frame_pointer_rtx, - reg_offset); + aarch64_add_cfa_expression (insn, regno_reg_rtx[reg2], + hard_frame_pointer_rtx, UNITS_PER_WORD); + aarch64_add_cfa_expression (insn, regno_reg_rtx[reg1], + hard_frame_pointer_rtx, 0); } emit_insn (gen_stack_tie (stack_pointer_rtx, hard_frame_pointer_rtx)); } - aarch64_save_callee_saves (DImode, callee_offset, R0_REGNUM, R30_REGNUM, - callee_adjust != 0 || emit_frame_chain); - if (crtl->abi->id () == ARM_PCS_SIMD) - aarch64_save_callee_saves (TFmode, callee_offset, V0_REGNUM, V31_REGNUM, - callee_adjust != 0 || emit_frame_chain); - else - aarch64_save_callee_saves (DFmode, callee_offset, V0_REGNUM, V31_REGNUM, - callee_adjust != 0 || emit_frame_chain); + aarch64_save_callee_saves (saved_regs_offset, R0_REGNUM, R30_REGNUM, + callee_adjust != 0 || emit_frame_chain, + emit_frame_chain); + if (maybe_ne (sve_callee_adjust, 0)) + { + gcc_assert (!flag_stack_clash_protection + || known_eq (initial_adjust, 0)); + aarch64_allocate_and_probe_stack_space (tmp1_rtx, tmp0_rtx, + sve_callee_adjust, + !frame_pointer_needed, false); + saved_regs_offset += sve_callee_adjust; + } + aarch64_save_callee_saves (saved_regs_offset, P0_REGNUM, P15_REGNUM, + false, emit_frame_chain); + aarch64_save_callee_saves (saved_regs_offset, V0_REGNUM, V31_REGNUM, + callee_adjust != 0 || emit_frame_chain, + emit_frame_chain); /* We may need to probe the final adjustment if it is larger than the guard that is assumed by the callee.
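For example, with the default 1KB caller guard, a 2KB outgoing argument area forces the final adjustment to be probed, whereas a 512-byte area can be allocated without a probe.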
*/ @@ -6810,6 +7351,9 @@ aarch64_expand_epilogue (bool for_sibcall) HOST_WIDE_INT callee_adjust = cfun->machine->frame.callee_adjust; poly_int64 final_adjust = cfun->machine->frame.final_adjust; poly_int64 callee_offset = cfun->machine->frame.callee_offset; + poly_int64 sve_callee_adjust = cfun->machine->frame.sve_callee_adjust; + poly_int64 below_hard_fp_saved_regs_size + = cfun->machine->frame.below_hard_fp_saved_regs_size; unsigned reg1 = cfun->machine->frame.wb_candidate1; unsigned reg2 = cfun->machine->frame.wb_candidate2; rtx cfi_ops = NULL; @@ -6823,15 +7367,23 @@ aarch64_expand_epilogue (bool for_sibcall) = 1 << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE); HOST_WIDE_INT guard_used_by_caller = STACK_CLASH_CALLER_GUARD; - /* We can re-use the registers when the allocation amount is smaller than - guard_size - guard_used_by_caller because we won't be doing any probes - then. In such situations the register should remain live with the correct + /* We can re-use the registers when: + + (a) the deallocation amount is the same as the corresponding + allocation amount (which is false if we combine the initial + and SVE callee save allocations in the prologue); and + + (b) the allocation amount doesn't need a probe (which is false + if the amount is guard_size - guard_used_by_caller or greater). + + In such situations the register should remain live with the correct value. */ bool can_inherit_p = (initial_adjust.is_constant () - && final_adjust.is_constant ()) + && final_adjust.is_constant () && (!flag_stack_clash_protection - || known_lt (initial_adjust, - guard_size - guard_used_by_caller)); + || (known_lt (initial_adjust, + guard_size - guard_used_by_caller) + && known_eq (sve_callee_adjust, 0)))); /* We need to add memory barrier to prevent read from deallocated stack. */ bool need_barrier_p @@ -6856,7 +7408,8 @@ aarch64_expand_epilogue (bool for_sibcall) /* If writeback is used when restoring callee-saves, the CFA is restored on the instruction doing the writeback. */ aarch64_add_offset (Pmode, stack_pointer_rtx, - hard_frame_pointer_rtx, -callee_offset, + hard_frame_pointer_rtx, + -callee_offset - below_hard_fp_saved_regs_size, tmp1_rtx, tmp0_rtx, callee_adjust == 0); else /* The case where we need to re-use the register here is very rare, so @@ -6864,14 +7417,17 @@ aarch64_expand_epilogue (bool for_sibcall) immediate doesn't fit. */ aarch64_add_sp (tmp1_rtx, tmp0_rtx, final_adjust, true); - aarch64_restore_callee_saves (DImode, callee_offset, R0_REGNUM, R30_REGNUM, + /* Restore the vector registers before the predicate registers, + so that we can use P4 as a temporary for big-endian SVE frames. 
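+ An illustrative big-endian restore sequence (register numbers and offsets are examples only, not fixed by the ABI) is: ptrue p4.b; ld1d { z8.d }, p4/z, [sp]; ...; ldr p4, [sp, #4, mul vl]; the LDR that restores P4 itself must come after every LD1D that uses P4 as the governing predicate.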
*/ + aarch64_restore_callee_saves (callee_offset, V0_REGNUM, V31_REGNUM, + callee_adjust != 0, &cfi_ops); + aarch64_restore_callee_saves (callee_offset, P0_REGNUM, P15_REGNUM, + false, &cfi_ops); + if (maybe_ne (sve_callee_adjust, 0)) + aarch64_add_sp (NULL_RTX, NULL_RTX, sve_callee_adjust, true); + aarch64_restore_callee_saves (callee_offset - sve_callee_adjust, + R0_REGNUM, R30_REGNUM, callee_adjust != 0, &cfi_ops); - if (crtl->abi->id () == ARM_PCS_SIMD) - aarch64_restore_callee_saves (TFmode, callee_offset, V0_REGNUM, V31_REGNUM, - callee_adjust != 0, &cfi_ops); - else - aarch64_restore_callee_saves (DFmode, callee_offset, V0_REGNUM, V31_REGNUM, - callee_adjust != 0, &cfi_ops); if (need_barrier_p) emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx)); @@ -9397,13 +9953,14 @@ aarch64_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x, secondary_reload_info *sri) { /* Use aarch64_sve_reload_be for SVE reloads that cannot be handled - directly by the *aarch64_sve_mov_be move pattern. See the + directly by the *aarch64_sve_mov_[lb]e move patterns. See the comment at the head of aarch64-sve.md for more details about the big-endian handling. */ if (BYTES_BIG_ENDIAN && reg_class_subset_p (rclass, FP_REGS) && !((REG_P (x) && HARD_REGISTER_P (x)) || aarch64_simd_valid_immediate (x, NULL)) + && mode != VNx16QImode && aarch64_sve_data_mode_p (mode)) { sri->icode = CODE_FOR_aarch64_sve_reload_be; @@ -14983,6 +15540,10 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep) machine_mode mode; HOST_WIDE_INT size; + /* SVE types (and types containing SVE types) must be handled + before calling this function. */ + gcc_assert (!aarch64_sve::builtin_type_p (type)); + switch (TREE_CODE (type)) { case REAL_TYPE: @@ -15154,6 +15715,9 @@ aarch64_short_vector_p (const_tree type, { poly_int64 size = -1; + if (type && aarch64_sve::builtin_type_p (type)) + return false; + if (type && TREE_CODE (type) == VECTOR_TYPE) size = int_size_in_bytes (type); else if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT @@ -15214,11 +15778,14 @@ aarch64_vfp_is_call_or_return_candidate (machine_mode mode, int *count, bool *is_ha) { + if (is_ha != NULL) *is_ha = false; + + if (type && aarch64_sve::builtin_type_p (type)) + return false; + machine_mode new_mode = VOIDmode; bool composite_p = aarch64_composite_type_p (type, mode); - if (is_ha != NULL) *is_ha = false; - if ((!composite_p && GET_MODE_CLASS (mode) == MODE_FLOAT) || aarch64_short_vector_p (type, mode)) { @@ -17121,11 +17688,15 @@ aarch64_asm_preferred_eh_data_format (int code ATTRIBUTE_UNUSED, int global) static void aarch64_asm_output_variant_pcs (FILE *stream, const tree decl, const char* name) { - if (aarch64_simd_decl_p (decl)) + if (TREE_CODE (decl) == FUNCTION_DECL) { - fprintf (stream, "\t.variant_pcs\t"); - assemble_name (stream, name); - fprintf (stream, "\n"); + arm_pcs pcs = (arm_pcs) fndecl_abi (decl).id (); + if (pcs == ARM_PCS_SIMD || pcs == ARM_PCS_SVE) + { + fprintf (stream, "\t.variant_pcs\t"); + assemble_name (stream, name); + fprintf (stream, "\n"); + } } } @@ -21373,6 +21944,9 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_ASM_POST_CFI_STARTPROC #define TARGET_ASM_POST_CFI_STARTPROC aarch64_post_cfi_startproc +#undef TARGET_STRICT_ARGUMENT_NAMING +#define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-aarch64.h" diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 070fdc9..425a363 100644 --- a/gcc/config/aarch64/aarch64.h 
+++ b/gcc/config/aarch64/aarch64.h @@ -479,9 +479,10 @@ extern unsigned aarch64_architecture_version; #define ARG_POINTER_REGNUM AP_REGNUM #define FIRST_PSEUDO_REGISTER (FFRT_REGNUM + 1) -/* The number of (integer) argument register available. */ +/* The number of argument registers available for each class. */ #define NUM_ARG_REGS 8 #define NUM_FP_ARG_REGS 8 +#define NUM_PR_ARG_REGS 4 /* A Homogeneous Floating-Point or Short-Vector Aggregate may have at most four members. */ @@ -725,7 +726,7 @@ extern enum aarch64_processor aarch64_tune; #ifdef HAVE_POLY_INT_H struct GTY (()) aarch64_frame { - HOST_WIDE_INT reg_offset[FIRST_PSEUDO_REGISTER]; + poly_int64 reg_offset[LAST_SAVED_REGNUM + 1]; /* The number of extra stack bytes taken up by register varargs. This area is allocated by the callee at the very top of the @@ -733,9 +734,12 @@ struct GTY (()) aarch64_frame STACK_BOUNDARY. */ HOST_WIDE_INT saved_varargs_size; - /* The size of the saved callee-save int/FP registers. */ + /* The size of the callee-save registers with a slot in REG_OFFSET. */ + poly_int64 saved_regs_size; - HOST_WIDE_INT saved_regs_size; + /* The size of the callee-save registers with a slot in REG_OFFSET that + are saved below the hard frame pointer. */ + poly_int64 below_hard_fp_saved_regs_size; /* Offset from the base of the frame (incoming SP) to the top of the locals area. This value is always a multiple of @@ -763,6 +767,10 @@ struct GTY (()) aarch64_frame It may be non-zero if no push is used (ie. callee_adjust == 0). */ poly_int64 callee_offset; + /* The size of the stack adjustment before saving or after restoring + SVE registers. */ + poly_int64 sve_callee_adjust; + /* The size of the stack adjustment after saving callee-saves. */ poly_int64 final_adjust; @@ -772,6 +780,11 @@ struct GTY (()) aarch64_frame unsigned wb_candidate1; unsigned wb_candidate2; + /* Big-endian SVE frames need a spare predicate register in order + to save vector registers in the correct layout for unwinding. + This is the register they should use. */ + unsigned spare_pred_reg; + bool laid_out; }; @@ -800,6 +813,8 @@ enum arm_pcs { ARM_PCS_AAPCS64, /* Base standard AAPCS for 64 bit. */ ARM_PCS_SIMD, /* For aarch64_vector_pcs functions. */ + ARM_PCS_SVE, /* For functions that pass or return + values in SVE registers. */ ARM_PCS_TLSDESC, /* For targets of tlsdesc calls. */ ARM_PCS_UNKNOWN }; @@ -827,6 +842,8 @@ typedef struct int aapcs_nextncrn; /* Next next core register number. */ int aapcs_nvrn; /* Next Vector register number. */ int aapcs_nextnvrn; /* Next Next Vector register number. */ + int aapcs_nprn; /* Next Predicate register number. */ + int aapcs_nextnprn; /* Next Next Predicate register number. */ rtx aapcs_reg; /* Register assigned to this argument. This is NULL_RTX if this parameter goes on the stack. */ @@ -837,6 +854,8 @@ typedef struct aapcs_reg == NULL_RTX. */ int aapcs_stack_size; /* The total size (in words, per 8 byte) of the stack arg area so far. */ + bool silent_p; /* True if we should act silently, rather than + raise an error for invalid calls. */ } CUMULATIVE_ARGS; #endif @@ -1144,7 +1163,8 @@ extern poly_uint16 aarch64_sve_vg; #define BITS_PER_SVE_VECTOR (poly_uint16 (aarch64_sve_vg * 64)) #define BYTES_PER_SVE_VECTOR (poly_uint16 (aarch64_sve_vg * 8)) -/* The number of bytes in an SVE predicate. */ +/* The number of bits and bytes in an SVE predicate. */ +#define BITS_PER_SVE_PRED BYTES_PER_SVE_VECTOR #define BYTES_PER_SVE_PRED aarch64_sve_vg /* The SVE mode for a vector of bytes.
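As a concrete sizing example (for one possible VL, not a fixed property): at VL = 256 bits, aarch64_sve_vg is 4, so BYTES_PER_SVE_PRED above is 4 and BITS_PER_SVE_PRED is 32, while the byte-vector mode defined below holds 32 bytes.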
*/ diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index d1fe173..f19e227 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -85,7 +85,6 @@ (V29_REGNUM 61) (V30_REGNUM 62) (V31_REGNUM 63) - (LAST_SAVED_REGNUM 63) (SFP_REGNUM 64) (AP_REGNUM 65) (CC_REGNUM 66) @@ -107,6 +106,7 @@ (P13_REGNUM 81) (P14_REGNUM 82) (P15_REGNUM 83) + (LAST_SAVED_REGNUM 83) (FFR_REGNUM 84) ;; "FFR token": a fake register used for representing the scheduling ;; restrictions on FFR-related operations. diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 0507477..e2656d5 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,4 +1,143 @@ 2019-10-29 Richard Sandiford + + * gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp: New file. + * gcc.target/aarch64/sve/pcs/annotate_1.c: New test. + * gcc.target/aarch64/sve/pcs/annotate_2.c: Likewise. + * gcc.target/aarch64/sve/pcs/annotate_3.c: Likewise. + * gcc.target/aarch64/sve/pcs/annotate_4.c: Likewise. + * gcc.target/aarch64/sve/pcs/annotate_5.c: Likewise. + * gcc.target/aarch64/sve/pcs/annotate_6.c: Likewise. + * gcc.target/aarch64/sve/pcs/annotate_7.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_1.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_10.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_11_nosc.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_11_sc.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_2.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_3.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_4.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_be_f16.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_be_f32.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_be_f64.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_be_s16.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_be_s32.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_be_s64.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_be_s8.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_be_u16.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_be_u32.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_be_u64.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_be_u8.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_le_f16.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_le_f32.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_le_f64.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_le_s16.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_le_s32.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_le_s64.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_le_s8.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_le_u16.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_le_u32.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_le_u64.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_5_le_u8.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_be_f16.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_be_f32.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_be_f64.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_be_s16.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_be_s32.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_be_s64.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_be_s8.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_be_u16.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_be_u32.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_be_u64.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_be_u8.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_le_f16.c: Likewise. 
+ * gcc.target/aarch64/sve/pcs/args_6_le_f32.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_le_f64.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_le_s16.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_le_s32.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_le_s64.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_le_s8.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_le_u16.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_le_u32.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_le_u64.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_6_le_u8.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_7.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_8.c: Likewise. + * gcc.target/aarch64/sve/pcs/args_9.c: Likewise. + * gcc.target/aarch64/sve/pcs/nosve_1.c: Likewise. + * gcc.target/aarch64/sve/pcs/nosve_2.c: Likewise. + * gcc.target/aarch64/sve/pcs/nosve_3.c: Likewise. + * gcc.target/aarch64/sve/pcs/nosve_4.c: Likewise. + * gcc.target/aarch64/sve/pcs/nosve_5.c: Likewise. + * gcc.target/aarch64/sve/pcs/nosve_6.c: Likewise. + * gcc.target/aarch64/sve/pcs/nosve_7.c: Likewise. + * gcc.target/aarch64/sve/pcs/nosve_8.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_1.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_1_1024.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_1_2048.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_1_256.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_1_512.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_2.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_3.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_4.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_4_1024.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_4_2048.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_4_256.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_4_512.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_5.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_5_1024.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_5_2048.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_5_256.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_5_512.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_6.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_6_1024.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_6_2048.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_6_256.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_6_512.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_7.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_8.c: Likewise. + * gcc.target/aarch64/sve/pcs/return_9.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_1_be_nowrap.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_1_be_wrap.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_1_le_nowrap.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_1_le_wrap.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_2_be_nowrap.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_2_be_wrap.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_2_le_nowrap.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_2_le_wrap.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_3.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_4_be.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_4_le.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_5_be.c: Likewise. + * gcc.target/aarch64/sve/pcs/saves_5_le.c: Likewise. + * gcc.target/aarch64/sve/pcs/stack_clash_1.c: Likewise. + * gcc.target/aarch64/sve/pcs/stack_clash_1_256.c: Likewise. + * gcc.target/aarch64/sve/pcs/stack_clash_1_512.c: Likewise. + * gcc.target/aarch64/sve/pcs/stack_clash_1_1024.c: Likewise. 
+	* gcc.target/aarch64/sve/pcs/stack_clash_1_2048.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/stack_clash_2.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/stack_clash_2_256.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/stack_clash_2_512.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/stack_clash_2_1024.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/stack_clash_2_2048.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/stack_clash_3.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/unprototyped_1.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_1.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_2_f16.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_2_f32.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_2_f64.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_2_s16.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_2_s32.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_2_s64.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_2_s8.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_2_u16.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_2_u32.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_2_u64.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_2_u8.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_3_nosc.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/varargs_3_sc.c: Likewise.
+	* gcc.target/aarch64/sve/pcs/vpcs_1.c: Likewise.
+	* g++.target/aarch64/sve/catch_7.C: Likewise.
+
+2019-10-29  Richard Sandiford
 	    Kugan Vivekanandarajah
 	    Prathamesh Kulkarni
 
diff --git a/gcc/testsuite/g++.target/aarch64/sve/catch_7.C b/gcc/testsuite/g++.target/aarch64/sve/catch_7.C
new file mode 100644
index 0000000..ac10b69
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve/catch_7.C
@@ -0,0 +1,38 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O" } */
+
+#include <arm_sve.h>
+
+void __attribute__ ((noipa))
+f1 (void)
+{
+  throw 1;
+}
+
+void __attribute__ ((noipa))
+f2 (svbool_t)
+{
+  register svint8_t z8 asm ("z8") = svindex_s8 (11, 1);
+  asm volatile ("" :: "w" (z8));
+  f1 ();
+}
+
+void __attribute__ ((noipa))
+f3 (int n)
+{
+  register double d8 asm ("v8") = 42.0;
+  for (int i = 0; i < n; ++i)
+    {
+      asm volatile ("" : "=w" (d8) : "w" (d8));
+      try { f2 (svptrue_b8 ()); } catch (int) { break; }
+    }
+  if (d8 != 42.0)
+    __builtin_abort ();
+}
+
+int
+main (void)
+{
+  f3 (100);
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp b/gcc/testsuite/gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp
new file mode 100644
index 0000000..7458875
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp
@@ -0,0 +1,52 @@
+# Specific regression driver for AArch64 SVE.
+# Copyright (C) 2009-2019 Free Software Foundation, Inc.
+# Contributed by ARM Ltd.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.  */
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't an AArch64 target.
+if {![istarget aarch64*-*-*] } then {
+  return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# If a testcase doesn't have special options, use these.
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+    set DEFAULT_CFLAGS " -ansi -pedantic-errors"
+}
+
+# Initialize `dg'.
+dg-init
+
+# Force SVE if we're not testing it already.
+if { [check_effective_target_aarch64_sve] } {
+    set sve_flags ""
+} else {
+    set sve_flags "-march=armv8.2-a+sve"
+}
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] \
+	$sve_flags $DEFAULT_CFLAGS
+
+# All done.
+dg-finish
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_1.c
new file mode 100644
index 0000000..1172be5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_1.c
@@ -0,0 +1,104 @@
+/* { dg-do compile } */
+
+#include <arm_sve.h>
+
+svbool_t ret_b (void) { return svptrue_b8 (); }
+
+svint8_t ret_s8 (void) { return svdup_s8 (0); }
+svint16_t ret_s16 (void) { return svdup_s16 (0); }
+svint32_t ret_s32 (void) { return svdup_s32 (0); }
+svint64_t ret_s64 (void) { return svdup_s64 (0); }
+svuint8_t ret_u8 (void) { return svdup_u8 (0); }
+svuint16_t ret_u16 (void) { return svdup_u16 (0); }
+svuint32_t ret_u32 (void) { return svdup_u32 (0); }
+svuint64_t ret_u64 (void) { return svdup_u64 (0); }
+svfloat16_t ret_f16 (void) { return svdup_f16 (0); }
+svfloat32_t ret_f32 (void) { return svdup_f32 (0); }
+svfloat64_t ret_f64 (void) { return svdup_f64 (0); }
+
+svint8x2_t ret_s8x2 (void) { return svundef2_s8 (); }
+svint16x2_t ret_s16x2 (void) { return svundef2_s16 (); }
+svint32x2_t ret_s32x2 (void) { return svundef2_s32 (); }
+svint64x2_t ret_s64x2 (void) { return svundef2_s64 (); }
+svuint8x2_t ret_u8x2 (void) { return svundef2_u8 (); }
+svuint16x2_t ret_u16x2 (void) { return svundef2_u16 (); }
+svuint32x2_t ret_u32x2 (void) { return svundef2_u32 (); }
+svuint64x2_t ret_u64x2 (void) { return svundef2_u64 (); }
+svfloat16x2_t ret_f16x2 (void) { return svundef2_f16 (); }
+svfloat32x2_t ret_f32x2 (void) { return svundef2_f32 (); }
+svfloat64x2_t ret_f64x2 (void) { return svundef2_f64 (); }
+
+svint8x3_t ret_s8x3 (void) { return svundef3_s8 (); }
+svint16x3_t ret_s16x3 (void) { return svundef3_s16 (); }
+svint32x3_t ret_s32x3 (void) { return svundef3_s32 (); }
+svint64x3_t ret_s64x3 (void) { return svundef3_s64 (); }
+svuint8x3_t ret_u8x3 (void) { return svundef3_u8 (); }
+svuint16x3_t ret_u16x3 (void) { return svundef3_u16 (); }
+svuint32x3_t ret_u32x3 (void) { return svundef3_u32 (); }
+svuint64x3_t ret_u64x3 (void) { return svundef3_u64 (); }
+svfloat16x3_t ret_f16x3 (void) { return svundef3_f16 (); }
+svfloat32x3_t ret_f32x3 (void) { return svundef3_f32 (); }
+svfloat64x3_t ret_f64x3 (void) { return svundef3_f64 (); }
+
+svint8x4_t ret_s8x4 (void) { return svundef4_s8 (); }
+svint16x4_t ret_s16x4 (void) { return svundef4_s16 (); }
+svint32x4_t ret_s32x4 (void) { return svundef4_s32 (); }
+svint64x4_t ret_s64x4 (void) { return svundef4_s64 (); }
+svuint8x4_t ret_u8x4 (void) { return svundef4_u8 (); }
+svuint16x4_t ret_u16x4 (void) { return svundef4_u16 (); }
+svuint32x4_t ret_u32x4 (void) { return svundef4_u32 (); }
+svuint64x4_t ret_u64x4 (void) { return svundef4_u64 (); }
+svfloat16x4_t ret_f16x4 (void) { return svundef4_f16 (); }
+svfloat32x4_t ret_f32x4 (void) { return svundef4_f32 (); }
+svfloat64x4_t ret_f64x4 (void) { return svundef4_f64 (); }
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_b\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s8\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u8\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_f16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_f32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_f64\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s8x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u8x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_f16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_f32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_f64x2\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s8x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s32x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s64x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u8x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u32x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u64x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_f16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_f32x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_f64x3\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s8x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s16x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s32x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_s64x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u8x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u16x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u32x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_u64x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_f16x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_f32x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_f64x4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_2.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_2.c
new file mode 100644
index 0000000..6f10f90
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_2.c
@@ -0,0 +1,103 @@
+/* { dg-do compile } */
+
+#include <arm_sve.h>
+
+void fn_b (svbool_t x) {}
+
+void fn_s8 (svint8_t x) {}
+void fn_s16 (svint16_t x) {}
+void fn_s32 (svint32_t x) {}
+void fn_s64 (svint64_t x) {}
+void fn_u8 (svuint8_t x) {}
+void fn_u16 (svuint16_t x) {}
+void fn_u32 (svuint32_t x) {}
+void fn_u64 (svuint64_t x) {}
+void fn_f16 (svfloat16_t x) {}
+void fn_f32 (svfloat32_t x) {}
+void fn_f64 (svfloat64_t x) {}
+
+void fn_s8x2 (svint8x2_t x) {}
+void fn_s16x2 (svint16x2_t x) {}
+void fn_s32x2 (svint32x2_t x) {}
+void fn_s64x2 (svint64x2_t x) {}
+void fn_u8x2 (svuint8x2_t x) {}
+void fn_u16x2 (svuint16x2_t x) {}
+void fn_u32x2 (svuint32x2_t x) {}
+void fn_u64x2 (svuint64x2_t x) {}
+void fn_f16x2 (svfloat16x2_t x) {}
+void fn_f32x2 (svfloat32x2_t x) {}
+void fn_f64x2 (svfloat64x2_t x) {}
+
+void fn_s8x3 (svint8x3_t x) {}
+void fn_s16x3 (svint16x3_t x) {}
+void fn_s32x3 (svint32x3_t x) {}
+void fn_s64x3 (svint64x3_t x) {}
+void fn_u8x3 (svuint8x3_t x) {}
+void fn_u16x3 (svuint16x3_t x) {}
+void fn_u32x3 (svuint32x3_t x) {}
+void fn_u64x3 (svuint64x3_t x) {}
+void fn_f16x3 (svfloat16x3_t x) {}
+void fn_f32x3 (svfloat32x3_t x) {}
+void fn_f64x3 (svfloat64x3_t x) {}
+
+void fn_s8x4 (svint8x4_t x) {}
+void fn_s16x4 (svint16x4_t x) {}
+void fn_s32x4 (svint32x4_t x) {}
+void fn_s64x4 (svint64x4_t x) {}
+void fn_u8x4 (svuint8x4_t x) {}
+void fn_u16x4 (svuint16x4_t x) {}
+void fn_u32x4 (svuint32x4_t x) {}
+void fn_u64x4 (svuint64x4_t x) {}
+void fn_f16x4 (svfloat16x4_t x) {}
+void fn_f32x4 (svfloat32x4_t x) {}
+void fn_f64x4 (svfloat64x4_t x) {}
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_b\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64x2\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64x3\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64x4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_3.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_3.c
new file mode 100644
index 0000000..d922a8a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_3.c
@@ -0,0 +1,99 @@
+/* { dg-do compile } */
+
+#include <arm_sve.h>
+
+void fn_s8 (float d0, float d1, float d2, float d3, svint8_t x) {}
+void fn_s16 (float d0, float d1, float d2, float d3, svint16_t x) {}
+void fn_s32 (float d0, float d1, float d2, float d3, svint32_t x) {}
+void fn_s64 (float d0, float d1, float d2, float d3, svint64_t x) {}
+void fn_u8 (float d0, float d1, float d2, float d3, svuint8_t x) {}
+void fn_u16 (float d0, float d1, float d2, float d3, svuint16_t x) {}
+void fn_u32 (float d0, float d1, float d2, float d3, svuint32_t x) {}
+void fn_u64 (float d0, float d1, float d2, float d3, svuint64_t x) {}
+void fn_f16 (float d0, float d1, float d2, float d3, svfloat16_t x) {}
+void fn_f32 (float d0, float d1, float d2, float d3, svfloat32_t x) {}
+void fn_f64 (float d0, float d1, float d2, float d3, svfloat64_t x) {}
+
+void fn_s8x2 (float d0, float d1, float d2, float d3, svint8x2_t x) {}
+void fn_s16x2 (float d0, float d1, float d2, float d3, svint16x2_t x) {}
+void fn_s32x2 (float d0, float d1, float d2, float d3, svint32x2_t x) {}
+void fn_s64x2 (float d0, float d1, float d2, float d3, svint64x2_t x) {}
+void fn_u8x2 (float d0, float d1, float d2, float d3, svuint8x2_t x) {}
+void fn_u16x2 (float d0, float d1, float d2, float d3, svuint16x2_t x) {}
+void fn_u32x2 (float d0, float d1, float d2, float d3, svuint32x2_t x) {}
+void fn_u64x2 (float d0, float d1, float d2, float d3, svuint64x2_t x) {}
+void fn_f16x2 (float d0, float d1, float d2, float d3, svfloat16x2_t x) {}
+void fn_f32x2 (float d0, float d1, float d2, float d3, svfloat32x2_t x) {}
+void fn_f64x2 (float d0, float d1, float d2, float d3, svfloat64x2_t x) {}
+
+void fn_s8x3 (float d0, float d1, float d2, float d3, svint8x3_t x) {}
+void fn_s16x3 (float d0, float d1, float d2, float d3, svint16x3_t x) {}
+void fn_s32x3 (float d0, float d1, float d2, float d3, svint32x3_t x) {}
+void fn_s64x3 (float d0, float d1, float d2, float d3, svint64x3_t x) {}
+void fn_u8x3 (float d0, float d1, float d2, float d3, svuint8x3_t x) {}
+void fn_u16x3 (float d0, float d1, float d2, float d3, svuint16x3_t x) {}
+void fn_u32x3 (float d0, float d1, float d2, float d3, svuint32x3_t x) {}
+void fn_u64x3 (float d0, float d1, float d2, float d3, svuint64x3_t x) {}
+void fn_f16x3 (float d0, float d1, float d2, float d3, svfloat16x3_t x) {}
+void fn_f32x3 (float d0, float d1, float d2, float d3, svfloat32x3_t x) {}
+void fn_f64x3 (float d0, float d1, float d2, float d3, svfloat64x3_t x) {}
+
+void fn_s8x4 (float d0, float d1, float d2, float d3, svint8x4_t x) {}
+void fn_s16x4 (float d0, float d1, float d2, float d3, svint16x4_t x) {}
+void fn_s32x4 (float d0, float d1, float d2, float d3, svint32x4_t x) {}
+void fn_s64x4 (float d0, float d1, float d2, float d3, svint64x4_t x) {}
+void fn_u8x4 (float d0, float d1, float d2, float d3, svuint8x4_t x) {}
+void fn_u16x4 (float d0, float d1, float d2, float d3, svuint16x4_t x) {}
+void fn_u32x4 (float d0, float d1, float d2, float d3, svuint32x4_t x) {}
+void fn_u64x4 (float d0, float d1, float d2, float d3, svuint64x4_t x) {}
+void fn_f16x4 (float d0, float d1, float d2, float d3, svfloat16x4_t x) {}
+void fn_f32x4 (float d0, float d1, float d2, float d3, svfloat32x4_t x) {}
+void fn_f64x4 (float d0, float d1, float d2, float d3, svfloat64x4_t x) {}
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64x2\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64x3\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64x4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_4.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_4.c
new file mode 100644
index 0000000..d057158
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_4.c
@@ -0,0 +1,143 @@
+/* { dg-do compile } */
+
+#include <arm_sve.h>
+
+void fn_s8 (float d0, float d1, float d2, float d3,
+            float d4, svint8_t x) {}
+void fn_s16 (float d0, float d1, float d2, float d3,
+             float d4, svint16_t x) {}
+void fn_s32 (float d0, float d1, float d2, float d3,
+             float d4, svint32_t x) {}
+void fn_s64 (float d0, float d1, float d2, float d3,
+             float d4, svint64_t x) {}
+void fn_u8 (float d0, float d1, float d2, float d3,
+            float d4, svuint8_t x) {}
+void fn_u16 (float d0, float d1, float d2, float d3,
+             float d4, svuint16_t x) {}
+void fn_u32 (float d0, float d1, float d2, float d3,
+             float d4, svuint32_t x) {}
+void fn_u64 (float d0, float d1, float d2, float d3,
+             float d4, svuint64_t x) {}
+void fn_f16 (float d0, float d1, float d2, float d3,
+             float d4, svfloat16_t x) {}
+void fn_f32 (float d0, float d1, float d2, float d3,
+             float d4, svfloat32_t x) {}
+void fn_f64 (float d0, float d1, float d2, float d3,
+             float d4, svfloat64_t x) {}
+
+void fn_s8x2 (float d0, float d1, float d2, float d3,
+              float d4, svint8x2_t x) {}
+void fn_s16x2 (float d0, float d1, float d2, float d3,
+               float d4, svint16x2_t x) {}
+void fn_s32x2 (float d0, float d1, float d2, float d3,
+               float d4, svint32x2_t x) {}
+void fn_s64x2 (float d0, float d1, float d2, float d3,
+               float d4, svint64x2_t x) {}
+void fn_u8x2 (float d0, float d1, float d2, float d3,
+              float d4, svuint8x2_t x) {}
+void fn_u16x2 (float d0, float d1, float d2, float d3,
+               float d4, svuint16x2_t x) {}
+void fn_u32x2 (float d0, float d1, float d2, float d3,
+               float d4, svuint32x2_t x) {}
+void fn_u64x2 (float d0, float d1, float d2, float d3,
+               float d4, svuint64x2_t x) {}
+void fn_f16x2 (float d0, float d1, float d2, float d3,
+               float d4, svfloat16x2_t x) {}
+void fn_f32x2 (float d0, float d1, float d2, float d3,
+               float d4, svfloat32x2_t x) {}
+void fn_f64x2 (float d0, float d1, float d2, float d3,
+               float d4, svfloat64x2_t x) {}
+
+void fn_s8x3 (float d0, float d1, float d2, float d3,
+              float d4, svint8x3_t x) {}
+void fn_s16x3 (float d0, float d1, float d2, float d3,
+               float d4, svint16x3_t x) {}
+void fn_s32x3 (float d0, float d1, float d2, float d3,
+               float d4, svint32x3_t x) {}
+void fn_s64x3 (float d0, float d1, float d2, float d3,
+               float d4, svint64x3_t x) {}
+void fn_u8x3 (float d0, float d1, float d2, float d3,
+              float d4, svuint8x3_t x) {}
+void fn_u16x3 (float d0, float d1, float d2, float d3,
+               float d4, svuint16x3_t x) {}
+void fn_u32x3 (float d0, float d1, float d2, float d3,
+               float d4, svuint32x3_t x) {}
+void fn_u64x3 (float d0, float d1, float d2, float d3,
+               float d4, svuint64x3_t x) {}
+void fn_f16x3 (float d0, float d1, float d2, float d3,
+               float d4, svfloat16x3_t x) {}
+void fn_f32x3 (float d0, float d1, float d2, float d3,
+               float d4, svfloat32x3_t x) {}
+void fn_f64x3 (float d0, float d1, float d2, float d3,
+               float d4, svfloat64x3_t x) {}
+
+void fn_s8x4 (float d0, float d1, float d2, float d3,
+              float d4, svint8x4_t x) {}
+void fn_s16x4 (float d0, float d1, float d2, float d3,
+               float d4, svint16x4_t x) {}
+void fn_s32x4 (float d0, float d1, float d2, float d3,
+               float d4, svint32x4_t x) {}
+void fn_s64x4 (float d0, float d1, float d2, float d3,
+               float d4, svint64x4_t x) {}
+void fn_u8x4 (float d0, float d1, float d2, float d3,
+              float d4, svuint8x4_t x) {}
+void fn_u16x4 (float d0, float d1, float d2, float d3,
+               float d4, svuint16x4_t x) {}
+void fn_u32x4 (float d0, float d1, float d2, float d3,
+               float d4, svuint32x4_t x) {}
+void fn_u64x4 (float d0, float d1, float d2, float d3,
+               float d4, svuint64x4_t x) {}
+void fn_f16x4 (float d0, float d1, float d2, float d3,
+               float d4, svfloat16x4_t x) {}
+void fn_f32x4 (float d0, float d1, float d2, float d3,
+               float d4, svfloat32x4_t x) {}
+void fn_f64x4 (float d0, float d1, float d2, float d3,
+               float d4, svfloat64x4_t x) {}
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64x2\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64x3\n} } } */
+
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s8x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s16x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s32x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s64x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u8x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u16x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u32x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u64x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f16x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f32x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f64x4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_5.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_5.c
new file mode 100644
index 0000000..3523528
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_5.c
@@ -0,0 +1,143 @@
+/* { dg-do compile } */
+
+#include <arm_sve.h>
+
+void fn_s8 (float d0, float d1, float d2, float d3,
+            float d4, float d5, svint8_t x) {}
+void fn_s16 (float d0, float d1, float d2, float d3,
+             float d4, float d5, svint16_t x) {}
+void fn_s32 (float d0, float d1, float d2, float d3,
+             float d4, float d5, svint32_t x) {}
+void fn_s64 (float d0, float d1, float d2, float d3,
+             float d4, float d5, svint64_t x) {}
+void fn_u8 (float d0, float d1, float d2, float d3,
+            float d4, float d5, svuint8_t x) {}
+void fn_u16 (float d0, float d1, float d2, float d3,
+             float d4, float d5, svuint16_t x) {}
+void fn_u32 (float d0, float d1, float d2, float d3,
+             float d4, float d5, svuint32_t x) {}
+void fn_u64 (float d0, float d1, float d2, float d3,
+             float d4, float d5, svuint64_t x) {}
+void fn_f16 (float d0, float d1, float d2, float d3,
+             float d4, float d5, svfloat16_t x) {}
+void fn_f32 (float d0, float d1, float d2, float d3,
+             float d4, float d5, svfloat32_t x) {}
+void fn_f64 (float d0, float d1, float d2, float d3,
+             float d4, float d5, svfloat64_t x) {}
+
+void fn_s8x2 (float d0, float d1, float d2, float d3,
+              float d4, float d5, svint8x2_t x) {}
+void fn_s16x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svint16x2_t x) {}
+void fn_s32x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svint32x2_t x) {}
+void fn_s64x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svint64x2_t x) {}
+void fn_u8x2 (float d0, float d1, float d2, float d3,
+              float d4, float d5, svuint8x2_t x) {}
+void fn_u16x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svuint16x2_t x) {}
+void fn_u32x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svuint32x2_t x) {}
+void fn_u64x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svuint64x2_t x) {}
+void fn_f16x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svfloat16x2_t x) {}
+void fn_f32x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svfloat32x2_t x) {}
+void fn_f64x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svfloat64x2_t x) {}
+
+void fn_s8x3 (float d0, float d1, float d2, float d3,
+              float d4, float d5, svint8x3_t x) {}
+void fn_s16x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svint16x3_t x) {}
+void fn_s32x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svint32x3_t x) {}
+void fn_s64x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svint64x3_t x) {}
+void fn_u8x3 (float d0, float d1, float d2, float d3,
+              float d4, float d5, svuint8x3_t x) {}
+void fn_u16x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svuint16x3_t x) {}
+void fn_u32x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svuint32x3_t x) {}
+void fn_u64x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svuint64x3_t x) {}
+void fn_f16x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svfloat16x3_t x) {}
+void fn_f32x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svfloat32x3_t x) {}
+void fn_f64x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svfloat64x3_t x) {}
+
+void fn_s8x4 (float d0, float d1, float d2, float d3,
+              float d4, float d5, svint8x4_t x) {}
+void fn_s16x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svint16x4_t x) {}
+void fn_s32x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svint32x4_t x) {}
+void fn_s64x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svint64x4_t x) {}
+void fn_u8x4 (float d0, float d1, float d2, float d3,
+              float d4, float d5, svuint8x4_t x) {}
+void fn_u16x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svuint16x4_t x) {}
+void fn_u32x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svuint32x4_t x) {}
+void fn_u64x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svuint64x4_t x) {}
+void fn_f16x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svfloat16x4_t x) {}
+void fn_f32x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svfloat32x4_t x) {}
+void fn_f64x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, svfloat64x4_t x) {}
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64\n} } } */
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64x2\n} } } */
+
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s8x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s16x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s32x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s64x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u8x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u16x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u32x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u64x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f16x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f32x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f64x3\n} } } */
+
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s8x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s16x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s32x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s64x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u8x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u16x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u32x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u64x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f16x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f32x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f64x4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_6.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_6.c
new file mode 100644
index 0000000..1f89dce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_6.c
@@ -0,0 +1,143 @@
+/* { dg-do compile } */
+
+#include <arm_sve.h>
+
+void fn_s8 (float d0, float d1, float d2, float d3,
+            float d4, float d5, float d6, svint8_t x) {}
+void fn_s16 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, svint16_t x) {}
+void fn_s32 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, svint32_t x) {}
+void fn_s64 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, svint64_t x) {}
+void fn_u8 (float d0, float d1, float d2, float d3,
+            float d4, float d5, float d6, svuint8_t x) {}
+void fn_u16 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, svuint16_t x) {}
+void fn_u32 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, svuint32_t x) {}
+void fn_u64 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, svuint64_t x) {}
+void fn_f16 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, svfloat16_t x) {}
+void fn_f32 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, svfloat32_t x) {}
+void fn_f64 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, svfloat64_t x) {}
+
+void fn_s8x2 (float d0, float d1, float d2, float d3,
+              float d4, float d5, float d6, svint8x2_t x) {}
+void fn_s16x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svint16x2_t x) {}
+void fn_s32x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svint32x2_t x) {}
+void fn_s64x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svint64x2_t x) {}
+void fn_u8x2 (float d0, float d1, float d2, float d3,
+              float d4, float d5, float d6, svuint8x2_t x) {}
+void fn_u16x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svuint16x2_t x) {}
+void fn_u32x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svuint32x2_t x) {}
+void fn_u64x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svuint64x2_t x) {}
+void fn_f16x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svfloat16x2_t x) {}
+void fn_f32x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svfloat32x2_t x) {}
+void fn_f64x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svfloat64x2_t x) {}
+
+void fn_s8x3 (float d0, float d1, float d2, float d3,
+              float d4, float d5, float d6, svint8x3_t x) {}
+void fn_s16x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svint16x3_t x) {}
+void fn_s32x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svint32x3_t x) {}
+void fn_s64x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svint64x3_t x) {}
+void fn_u8x3 (float d0, float d1, float d2, float d3,
+              float d4, float d5, float d6, svuint8x3_t x) {}
+void fn_u16x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svuint16x3_t x) {}
+void fn_u32x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svuint32x3_t x) {}
+void fn_u64x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svuint64x3_t x) {}
+void fn_f16x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svfloat16x3_t x) {}
+void fn_f32x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svfloat32x3_t x) {}
+void fn_f64x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svfloat64x3_t x) {}
+
+void fn_s8x4 (float d0, float d1, float d2, float d3,
+              float d4, float d5, float d6, svint8x4_t x) {}
+void fn_s16x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svint16x4_t x) {}
+void fn_s32x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svint32x4_t x) {}
+void fn_s64x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svint64x4_t x) {}
+void fn_u8x4 (float d0, float d1, float d2, float d3,
+              float d4, float d5, float d6, svuint8x4_t x) {}
+void fn_u16x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svuint16x4_t x) {}
+void fn_u32x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svuint32x4_t x) {}
+void fn_u64x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svuint64x4_t x) {}
+void fn_f16x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svfloat16x4_t x) {}
+void fn_f32x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svfloat32x4_t x) {}
+void fn_f64x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, svfloat64x4_t x) {}
+
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s8\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_s64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u8\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f64\n} } } */
+
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s8x2\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s16x2\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s32x2\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s64x2\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u8x2\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u16x2\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u32x2\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u64x2\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f16x2\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f32x2\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f64x2\n} } } */
+
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s8x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s16x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s32x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s64x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u8x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u16x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u32x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u64x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f16x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f32x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f64x3\n} } } */
+
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s8x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s16x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s32x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_s64x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u8x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u16x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u32x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u64x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f16x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f32x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f64x4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_7.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_7.c
new file mode 100644
index 0000000..e67d180
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_7.c
@@ -0,0 +1,97 @@
+/* { dg-do compile } */
+
+#include <arm_sve.h>
+
+void fn_s8 (float d0, float d1, float d2, float d3,
+            float d4, float d5, float d6, float d7, svint8_t x) {}
+void fn_s16 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, float d7, svint16_t x) {}
+void fn_s32 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, float d7, svint32_t x) {}
+void fn_s64 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, float d7, svint64_t x) {}
+void fn_u8 (float d0, float d1, float d2, float d3,
+            float d4, float d5, float d6, float d7, svuint8_t x) {}
+void fn_u16 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, float d7, svuint16_t x) {}
+void fn_u32 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, float d7, svuint32_t x) {}
+void fn_u64 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, float d7, svuint64_t x) {}
+void fn_f16 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, float d7, svfloat16_t x) {}
+void fn_f32 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, float d7, svfloat32_t x) {}
+void fn_f64 (float d0, float d1, float d2, float d3,
+             float d4, float d5, float d6, float d7, svfloat64_t x) {}
+
+void fn_s8x2 (float d0, float d1, float d2, float d3,
+              float d4, float d5, float d6, float d7, svint8x2_t x) {}
+void fn_s16x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svint16x2_t x) {}
+void fn_s32x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svint32x2_t x) {}
+void fn_s64x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svint64x2_t x) {}
+void fn_u8x2 (float d0, float d1, float d2, float d3,
+              float d4, float d5, float d6, float d7, svuint8x2_t x) {}
+void fn_u16x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svuint16x2_t x) {}
+void fn_u32x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svuint32x2_t x) {}
+void fn_u64x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svuint64x2_t x) {}
+void fn_f16x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svfloat16x2_t x) {}
+void fn_f32x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svfloat32x2_t x) {}
+void fn_f64x2 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svfloat64x2_t x) {}
+
+void fn_s8x3 (float d0, float d1, float d2, float d3,
+              float d4, float d5, float d6, float d7, svint8x3_t x) {}
+void fn_s16x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svint16x3_t x) {}
+void fn_s32x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svint32x3_t x) {}
+void fn_s64x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svint64x3_t x) {}
+void fn_u8x3 (float d0, float d1, float d2, float d3,
+              float d4, float d5, float d6, float d7, svuint8x3_t x) {}
+void fn_u16x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svuint16x3_t x) {}
+void fn_u32x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svuint32x3_t x) {}
+void fn_u64x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svuint64x3_t x) {}
+void fn_f16x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svfloat16x3_t x) {}
+void fn_f32x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svfloat32x3_t x) {}
+void fn_f64x3 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svfloat64x3_t x) {}
+
+void fn_s8x4 (float d0, float d1, float d2, float d3,
+              float d4, float d5, float d6, float d7, svint8x4_t x) {}
+void fn_s16x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svint16x4_t x) {}
+void fn_s32x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svint32x4_t x) {}
+void fn_s64x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svint64x4_t x) {}
+void fn_u8x4 (float d0, float d1, float d2, float d3,
+              float d4, float d5, float d6, float d7, svuint8x4_t x) {}
+void fn_u16x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svuint16x4_t x) {}
+void fn_u32x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svuint32x4_t x) {}
+void fn_u64x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svuint64x4_t x) {}
+void fn_f16x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svfloat16x4_t x) {}
+void fn_f32x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svfloat32x4_t x) {}
+void fn_f64x4 (float d0, float d1, float d2, float d3,
+               float d4, float d5, float d6, float d7, svfloat64x4_t x) {}
+
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\t\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_1.c
new file mode 100644
index 0000000..d0c3e5a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_1.c
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-O -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <arm_sve.h>
+
+/*
+** callee_pred:
+**	ldr	(p[0-9]+), \[x0\]
+**	ldr	(p[0-9]+), \[x1\]
+**	brkpa	(p[0-7])\.b, p0/z, p1\.b, p2\.b
+**	brkpb	(p[0-7])\.b, \3/z, p3\.b, \1\.b
+**	brka	p0\.b, \4/z, \2\.b
+**	ret
+*/
+__SVBool_t __attribute__((noipa))
+callee_pred (__SVBool_t p0, __SVBool_t p1, __SVBool_t p2, __SVBool_t p3,
+             __SVBool_t mem0, __SVBool_t mem1)
+{
+  p0 = svbrkpa_z (p0, p1, p2);
+  p0 = svbrkpb_z (p0, p3, mem0);
+  return svbrka_z (p0, mem1);
+}
+
+/*
+** caller_pred:
+**	...
+**	ptrue	(p[0-9]+)\.b, vl5
+**	str	\1, \[x0\]
+**	...
+**	ptrue	(p[0-9]+)\.h, vl6
+**	str	\2, \[x1\]
+**	ptrue	p3\.d, vl4
+**	ptrue	p2\.s, vl3
+**	ptrue	p1\.h, vl2
+**	ptrue	p0\.b, vl1
+**	bl	callee_pred
+**	...
+*/
+__SVBool_t __attribute__((noipa))
+caller_pred (void)
+{
+  return callee_pred (svptrue_pat_b8 (SV_VL1),
+                      svptrue_pat_b16 (SV_VL2),
+                      svptrue_pat_b32 (SV_VL3),
+                      svptrue_pat_b64 (SV_VL4),
+                      svptrue_pat_b8 (SV_VL5),
+                      svptrue_pat_b16 (SV_VL6));
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_10.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_10.c
new file mode 100644
index 0000000..1bbcb77
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_10.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <arm_sve.h>
+
+/*
+** callee:
+**	fadd	s0, (s0, s6|s6, s0)
+**	ret
+*/
+float __attribute__((noipa))
+callee (float s0, double d1, svfloat32x4_t z2, svfloat64x4_t stack1,
+        float s6, double d7)
+{
+  return s0 + s6;
+}
+
+float __attribute__((noipa))
+caller (float32_t *x0, float64_t *x1)
+{
+  return callee (0.0f, 1.0,
+                 svld4 (svptrue_b8 (), x0),
+                 svld4 (svptrue_b8 (), x1),
+                 6.0f, 7.0);
+}
+
+/* { dg-final { scan-assembler {\tld4w\t{z2\.s - z5\.s}, p[0-7]/z, \[x0\]\n} } } */
+/* { dg-final { scan-assembler {\tld4d\t{z[0-9]+\.d - z[0-9]+\.d}, p[0-7]/z, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tmovi\tv0\.[24]s, #0\n} } } */
+/* { dg-final { scan-assembler {\tfmov\td1, #?1\.0} } } */
+/* { dg-final { scan-assembler {\tfmov\ts6, #?6\.0} } } */
+/* { dg-final { scan-assembler {\tfmov\td7, #?7\.0} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_11_nosc.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_11_nosc.c
new file mode 100644
index 0000000..0f62e0b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_11_nosc.c
@@ -0,0 +1,61 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O0 -g" } */
+
+#include <arm_sve.h>
+
+void __attribute__((noipa))
+callee (svbool_t p, svint8_t s8, svuint16x4_t u16, svfloat32x3_t f32,
+        svint64x2_t s64)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+
+  if (svptest_any (pg, sveor_z (pg, p, svptrue_pat_b8 (SV_VL7))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, s8, svindex_s8 (1, 2))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 0), svindex_u16 (2, 3))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 1), svindex_u16 (3, 4))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 2), svindex_u16 (4, 5))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 3), svindex_u16 (5, 6))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget3 (f32, 0), svdup_f32 (1.0))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget3 (f32, 1), svdup_f32 (2.0))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget3 (f32, 2), svdup_f32 (3.0))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget2 (s64, 0), svindex_s64 (6, 7))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget2 (s64, 1), svindex_s64 (7, 8))))
+    __builtin_abort ();
+}
+
+int __attribute__((noipa))
+main (void)
+{
+  callee (svptrue_pat_b8 (SV_VL7),
+          svindex_s8 (1, 2),
+          svcreate4 (svindex_u16 (2, 3),
+                     svindex_u16 (3, 4),
+                     svindex_u16 (4, 5),
+                     svindex_u16 (5, 6)),
+          svcreate3 (svdup_f32 (1.0),
+                     svdup_f32 (2.0),
+                     svdup_f32 (3.0)),
+          svcreate2 (svindex_s64 (6, 7),
+                     svindex_s64 (7, 8)));
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_11_sc.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_11_sc.c
new file mode 100644
index 0000000..8a98d58
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_11_sc.c
@@ -0,0 +1,61 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O0 -fstack-clash-protection -g" } */
+
+#include <arm_sve.h>
+
+void __attribute__((noipa))
+callee (svbool_t p, svint8_t s8, svuint16x4_t u16, svfloat32x3_t f32,
+        svint64x2_t s64)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+
+  if (svptest_any (pg, sveor_z (pg, p, svptrue_pat_b8 (SV_VL7))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, s8, svindex_s8 (1, 2))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 0), svindex_u16 (2, 3))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 1), svindex_u16 (3, 4))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 2), svindex_u16 (4, 5))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 3), svindex_u16 (5, 6))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget3 (f32, 0), svdup_f32 (1.0))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget3 (f32, 1), svdup_f32 (2.0))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget3 (f32, 2), svdup_f32 (3.0))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget2 (s64, 0), svindex_s64 (6, 7))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget2 (s64, 1), svindex_s64 (7, 8))))
+    __builtin_abort ();
+}
+
+int __attribute__((noipa))
+main (void)
+{
+  callee (svptrue_pat_b8 (SV_VL7),
+          svindex_s8 (1, 2),
+          svcreate4 (svindex_u16 (2, 3),
+                     svindex_u16 (3, 4),
+                     svindex_u16 (4, 5),
+                     svindex_u16 (5, 6)),
+          svcreate3 (svdup_f32 (1.0),
+                     svdup_f32 (2.0),
+                     svdup_f32 (3.0)),
+          svcreate2 (svindex_s64 (6, 7),
+                     svindex_s64 (7, 8)));
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_2.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_2.c
new file mode 100644
index 0000000..a5dd73b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_2.c
@@ -0,0 +1,70 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <arm_sve.h>
+
+/*
+** callee_int:
+**	ptrue	p3\.b, all
+**	ld1b	(z(?:2[4-9]|3[0-1]).b), p3/z, \[x4\]
+**	st1b	\1, p2, \[x0\]
+**	st1b	z4\.b, p1, \[x0\]
+**	st1h	z5\.h, p1, \[x1\]
+**	st1w	z6\.s, p1, \[x2\]
+**	st1d	z7\.d, p1, \[x3\]
+** st1b z0\.b, p0, \[x0\] +** st1h z1\.h, p0, \[x1\] +** st1w z2\.s, p0, \[x2\] +** st1d z3\.d, p0, \[x3\] +** ret +*/ +void __attribute__((noipa)) +callee_int (int8_t *x0, int16_t *x1, int32_t *x2, int64_t *x3, + svint8_t z0, svint16_t z1, svint32_t z2, svint64_t z3, + svint8_t z4, svint16_t z5, svint32_t z6, svint64_t z7, + svint8_t z8, + svbool_t p0, svbool_t p1, svbool_t p2) +{ + svst1 (p2, x0, z8); + svst1 (p1, x0, z4); + svst1 (p1, x1, z5); + svst1 (p1, x2, z6); + svst1 (p1, x3, z7); + svst1 (p0, x0, z0); + svst1 (p0, x1, z1); + svst1 (p0, x2, z2); + svst1 (p0, x3, z3); +} + +void __attribute__((noipa)) +caller_int (int8_t *x0, int16_t *x1, int32_t *x2, int64_t *x3) +{ + callee_int (x0, x1, x2, x3, + svdup_s8 (0), + svdup_s16 (1), + svdup_s32 (2), + svdup_s64 (3), + svdup_s8 (4), + svdup_s16 (5), + svdup_s32 (6), + svdup_s64 (7), + svdup_s8 (8), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tmov\tz0\.b, #0\n} } } */ +/* { dg-final { scan-assembler {\tmov\tz1\.h, #1\n} } } */ +/* { dg-final { scan-assembler {\tmov\tz2\.s, #2\n} } } */ +/* { dg-final { scan-assembler {\tmov\tz3\.d, #3\n} } } */ +/* { dg-final { scan-assembler {\tmov\tz4\.b, #4\n} } } */ +/* { dg-final { scan-assembler {\tmov\tz5\.h, #5\n} } } */ +/* { dg-final { scan-assembler {\tmov\tz6\.s, #6\n} } } */ +/* { dg-final { scan-assembler {\tmov\tz7\.d, #7\n} } } */ +/* { dg-final { scan-assembler {\tmov\tx4, sp\n} } } */ +/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.b), #8\n.*\tst1b\t\1, p[0-7], \[x4\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_3.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_3.c new file mode 100644 index 0000000..b44243a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_3.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +/* +** callee_uint: +** ptrue p3\.b, all +** ld1b (z(?:2[4-9]|3[0-1]).b), p3/z, \[x4\] +** st1b \1, p2, \[x0\] +** st1b z4\.b, p1, \[x0\] +** st1h z5\.h, p1, \[x1\] +** st1w z6\.s, p1, \[x2\] +** st1d z7\.d, p1, \[x3\] +** st1b z0\.b, p0, \[x0\] +** st1h z1\.h, p0, \[x1\] +** st1w z2\.s, p0, \[x2\] +** st1d z3\.d, p0, \[x3\] +** ret +*/ +void __attribute__((noipa)) +callee_uint (uint8_t *x0, uint16_t *x1, uint32_t *x2, uint64_t *x3, + svuint8_t z0, svuint16_t z1, svuint32_t z2, svuint64_t z3, + svuint8_t z4, svuint16_t z5, svuint32_t z6, svuint64_t z7, + svuint8_t z8, + svbool_t p0, svbool_t p1, svbool_t p2) +{ + svst1 (p2, x0, z8); + svst1 (p1, x0, z4); + svst1 (p1, x1, z5); + svst1 (p1, x2, z6); + svst1 (p1, x3, z7); + svst1 (p0, x0, z0); + svst1 (p0, x1, z1); + svst1 (p0, x2, z2); + svst1 (p0, x3, z3); +} + +void __attribute__((noipa)) +caller_uint (uint8_t *x0, uint16_t *x1, uint32_t *x2, uint64_t *x3) +{ + callee_uint (x0, x1, x2, x3, + svdup_u8 (0), + svdup_u16 (1), + svdup_u32 (2), + svdup_u64 (3), + svdup_u8 (4), + svdup_u16 (5), + svdup_u32 (6), + svdup_u64 (7), + svdup_u8 (8), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tmov\tz0\.b, #0\n} } } */ +/* { dg-final { scan-assembler {\tmov\tz1\.h, #1\n} } } */ +/* { dg-final { scan-assembler {\tmov\tz2\.s, #2\n} } } */ +/* { 
+/* { dg-final { scan-assembler {\tmov\tz4\.b, #4\n} } } */
+/* { dg-final { scan-assembler {\tmov\tz5\.h, #5\n} } } */
+/* { dg-final { scan-assembler {\tmov\tz6\.s, #6\n} } } */
+/* { dg-final { scan-assembler {\tmov\tz7\.d, #7\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx4, sp\n} } } */
+/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.b), #8\n.*\tst1b\t\1, p[0-7], \[x4\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_4.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_4.c
new file mode 100644
index 0000000..0f99663
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_4.c
@@ -0,0 +1,70 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <arm_sve.h>
+
+/*
+** callee_float:
+** ptrue p3\.b, all
+** ld1h (z(?:2[4-9]|3[0-1])\.h), p3/z, \[x4\]
+** st1h \1, p2, \[x0\]
+** st1h z4\.h, p1, \[x0\]
+** st1h z5\.h, p1, \[x1\]
+** st1w z6\.s, p1, \[x2\]
+** st1d z7\.d, p1, \[x3\]
+** st1h z0\.h, p0, \[x0\]
+** st1h z1\.h, p0, \[x1\]
+** st1w z2\.s, p0, \[x2\]
+** st1d z3\.d, p0, \[x3\]
+** ret
+*/
+void __attribute__((noipa))
+callee_float (float16_t *x0, float16_t *x1, float32_t *x2, float64_t *x3,
+              svfloat16_t z0, svfloat16_t z1, svfloat32_t z2, svfloat64_t z3,
+              svfloat16_t z4, svfloat16_t z5, svfloat32_t z6, svfloat64_t z7,
+              svfloat16_t z8,
+              svbool_t p0, svbool_t p1, svbool_t p2)
+{
+  svst1 (p2, x0, z8);
+  svst1 (p1, x0, z4);
+  svst1 (p1, x1, z5);
+  svst1 (p1, x2, z6);
+  svst1 (p1, x3, z7);
+  svst1 (p0, x0, z0);
+  svst1 (p0, x1, z1);
+  svst1 (p0, x2, z2);
+  svst1 (p0, x3, z3);
+}
+
+void __attribute__((noipa))
+caller_float (float16_t *x0, float16_t *x1, float32_t *x2, float64_t *x3)
+{
+  callee_float (x0, x1, x2, x3,
+                svdup_f16 (0),
+                svdup_f16 (1),
+                svdup_f32 (2),
+                svdup_f64 (3),
+                svdup_f16 (4),
+                svdup_f16 (5),
+                svdup_f32 (6),
+                svdup_f64 (7),
+                svdup_f16 (8),
+                svptrue_pat_b8 (SV_VL1),
+                svptrue_pat_b16 (SV_VL2),
+                svptrue_pat_b32 (SV_VL3));
+}
+
+/* { dg-final { scan-assembler {\tmov\tz0\.[bhsd], #0\n} } } */
+/* { dg-final { scan-assembler {\tfmov\tz1\.h, #1\.0} } } */
+/* { dg-final { scan-assembler {\tfmov\tz2\.s, #2\.0} } } */
+/* { dg-final { scan-assembler {\tfmov\tz3\.d, #3\.0} } } */
+/* { dg-final { scan-assembler {\tfmov\tz4\.h, #4\.0} } } */
+/* { dg-final { scan-assembler {\tfmov\tz5\.h, #5\.0} } } */
+/* { dg-final { scan-assembler {\tfmov\tz6\.s, #6\.0} } } */
+/* { dg-final { scan-assembler {\tfmov\tz7\.d, #7\.0} } } */
+/* { dg-final { scan-assembler {\tmov\tx4, sp\n} } } */
+/* { dg-final { scan-assembler {\tfmov\t(z[0-9]+\.h), #8\.0.*\tst1h\t\1, p[0-7], \[x4\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_f16.c
new file mode 100644
index 0000000..6a9157b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_f16.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** addvl sp, sp, #-1
+** str p4, \[sp\]
+** ptrue p4\.b, all
+** (
+** ld1h (z[0-9]+\.h), p4/z, \[x1, #1, mul vl\]
+** ld1h (z[0-9]+\.h), p4/z, \[x1\]
+** st2h {\2 - \1}, p0, \[x0\]
+** |
+** ld1h (z[0-9]+\.h), p4/z, \[x1\]
+** ld1h (z[0-9]+\.h), p4/z, \[x1, #1, mul vl\]
+** st2h {\3 - \4}, p0, \[x0\]
+** )
+** st4h {z0\.h - z3\.h}, p1, \[x0\]
+** st3h {z4\.h - z6\.h}, p2, \[x0\]
+** st1h z7\.h, p3, \[x0\]
+** ldr p4, \[sp\]
+** addvl sp, sp, #1
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svfloat16x4_t z0, svfloat16x3_t z4, svfloat16x2_t stack,
+        svfloat16_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_f16 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_f16 (pg, x0, -8),
+          svld3_vnum_f16 (pg, x0, -3),
+          svld2_vnum_f16 (pg, x0, 0),
+          svld1_vnum_f16 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4h\t{z0\.h - z3\.h}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3h\t{z4\.h - z6\.h}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1h\tz7\.h, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2h\t{(z[0-9]+\.h) - z[0-9]+\.h}.*\tst1h\t\1, p[0-7], \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2h\t{z[0-9]+\.h - (z[0-9]+\.h)}.*\tst1h\t\1, p[0-7], \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_f32.c
new file mode 100644
index 0000000..85dff59
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_f32.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** addvl sp, sp, #-1
+** str p4, \[sp\]
+** ptrue p4\.b, all
+** (
+** ld1w (z[0-9]+\.s), p4/z, \[x1, #1, mul vl\]
+** ld1w (z[0-9]+\.s), p4/z, \[x1\]
+** st2w {\2 - \1}, p0, \[x0\]
+** |
+** ld1w (z[0-9]+\.s), p4/z, \[x1\]
+** ld1w (z[0-9]+\.s), p4/z, \[x1, #1, mul vl\]
+** st2w {\3 - \4}, p0, \[x0\]
+** )
+** st4w {z0\.s - z3\.s}, p1, \[x0\]
+** st3w {z4\.s - z6\.s}, p2, \[x0\]
+** st1w z7\.s, p3, \[x0\]
+** ldr p4, \[sp\]
+** addvl sp, sp, #1
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svfloat32x4_t z0, svfloat32x3_t z4, svfloat32x2_t stack,
+        svfloat32_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_f32 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_f32 (pg, x0, -8),
+          svld3_vnum_f32 (pg, x0, -3),
+          svld2_vnum_f32 (pg, x0, 0),
+          svld1_vnum_f32 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4w\t{z0\.s - z3\.s}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3w\t{z4\.s - z6\.s}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1w\tz7\.s, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{(z[0-9]+\.s) - z[0-9]+\.s}.*\tst1w\t\1, p[0-7], \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{z[0-9]+\.s - (z[0-9]+\.s)}.*\tst1w\t\1, p[0-7], \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_f64.c
new file mode 100644
index 0000000..8cedd99
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_f64.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** addvl sp, sp, #-1
+** str p4, \[sp\]
+** ptrue p4\.b, all
+** (
+** ld1d (z[0-9]+\.d), p4/z, \[x1, #1, mul vl\]
+** ld1d (z[0-9]+\.d), p4/z, \[x1\]
+** st2d {\2 - \1}, p0, \[x0\]
+** |
+** ld1d (z[0-9]+\.d), p4/z, \[x1\]
+** ld1d (z[0-9]+\.d), p4/z, \[x1, #1, mul vl\]
+** st2d {\3 - \4}, p0, \[x0\]
+** )
+** st4d {z0\.d - z3\.d}, p1, \[x0\]
+** st3d {z4\.d - z6\.d}, p2, \[x0\]
+** st1d z7\.d, p3, \[x0\]
+** ldr p4, \[sp\]
+** addvl sp, sp, #1
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svfloat64x4_t z0, svfloat64x3_t z4, svfloat64x2_t stack,
+        svfloat64_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_f64 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_f64 (pg, x0, -8),
+          svld3_vnum_f64 (pg, x0, -3),
+          svld2_vnum_f64 (pg, x0, 0),
+          svld1_vnum_f64 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4d\t{z0\.d - z3\.d}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3d\t{z4\.d - z6\.d}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1d\tz7\.d, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2d\t{(z[0-9]+\.d) - z[0-9]+\.d}.*\tst1d\t\1, p[0-7], \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2d\t{z[0-9]+\.d - (z[0-9]+\.d)}.*\tst1d\t\1, p[0-7], \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s16.c
new file mode 100644
index 0000000..9486b30
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s16.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
"arm_sve.h" + +/* +** callee: +** addvl sp, sp, #-1 +** str p4, \[sp\] +** ptrue p4\.b, all +** ( +** ld1h (z[0-9]+\.h), p4/z, \[x1, #1, mul vl\] +** ld1h (z[0-9]+\.h), p4/z, \[x1\] +** st2h {\2 - \1}, p0, \[x0\] +** | +** ld1h (z[0-9]+\.h), p4/z, \[x1\] +** ld1h (z[0-9]+\.h), p4/z, \[x1, #1, mul vl\] +** st2h {\3 - \4}, p0, \[x0\] +** ) +** st4h {z0\.h - z3\.h}, p1, \[x0\] +** st3h {z4\.h - z6\.h}, p2, \[x0\] +** st1h z7\.h, p3, \[x0\] +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +void __attribute__((noipa)) +callee (void *x0, svint16x4_t z0, svint16x3_t z4, svint16x2_t stack, + svint16_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + svst2 (p0, x0, stack); + svst4 (p1, x0, z0); + svst3 (p2, x0, z4); + svst1_s16 (p3, x0, z7); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee (x0, + svld4_vnum_s16 (pg, x0, -8), + svld3_vnum_s16 (pg, x0, -3), + svld2_vnum_s16 (pg, x0, 0), + svld1_vnum_s16 (pg, x0, 2), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3), + svptrue_pat_b64 (SV_VL4)); +} + +/* { dg-final { scan-assembler {\tld4h\t{z0\.h - z3\.h}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3h\t{z4\.h - z6\.h}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1h\tz7\.h, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */ +/* { dg-final { scan-assembler {\tld2h\t{(z[0-9]+\.h) - z[0-9]+\.h}.*\tst1h\t\1, p[0-7], \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld2h\t{z[0-9]+\.h - (z[0-9]+\.h)}.*\tst1h\t\1, p[0-7], \[x1, #1, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s32.c new file mode 100644 index 0000000..6643c3a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s32.c @@ -0,0 +1,63 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee: +** addvl sp, sp, #-1 +** str p4, \[sp\] +** ptrue p4\.b, all +** ( +** ld1w (z[0-9]+\.s), p4/z, \[x1, #1, mul vl\] +** ld1w (z[0-9]+\.s), p4/z, \[x1\] +** st2w {\2 - \1}, p0, \[x0\] +** | +** ld1w (z[0-9]+\.s), p4/z, \[x1\] +** ld1w (z[0-9]+\.s), p4/z, \[x1, #1, mul vl\] +** st2w {\3 - \4}, p0, \[x0\] +** ) +** st4w {z0\.s - z3\.s}, p1, \[x0\] +** st3w {z4\.s - z6\.s}, p2, \[x0\] +** st1w z7\.s, p3, \[x0\] +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +void __attribute__((noipa)) +callee (void *x0, svint32x4_t z0, svint32x3_t z4, svint32x2_t stack, + svint32_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + svst2 (p0, x0, stack); + svst4 (p1, x0, z0); + svst3 (p2, x0, z4); + svst1_s32 (p3, x0, z7); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee (x0, + svld4_vnum_s32 (pg, x0, -8), + svld3_vnum_s32 (pg, x0, -3), + svld2_vnum_s32 (pg, x0, 0), + svld1_vnum_s32 (pg, x0, 2), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3), + svptrue_pat_b64 (SV_VL4)); +} + +/* { dg-final { scan-assembler {\tld4w\t{z0\.s - z3\.s}, p[0-7]/z, \[x0, #-8, mul 
+/* { dg-final { scan-assembler {\tld3w\t{z4\.s - z6\.s}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1w\tz7\.s, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{(z[0-9]+\.s) - z[0-9]+\.s}.*\tst1w\t\1, p[0-7], \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{z[0-9]+\.s - (z[0-9]+\.s)}.*\tst1w\t\1, p[0-7], \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s64.c
new file mode 100644
index 0000000..f9c8b134
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s64.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** addvl sp, sp, #-1
+** str p4, \[sp\]
+** ptrue p4\.b, all
+** (
+** ld1d (z[0-9]+\.d), p4/z, \[x1, #1, mul vl\]
+** ld1d (z[0-9]+\.d), p4/z, \[x1\]
+** st2d {\2 - \1}, p0, \[x0\]
+** |
+** ld1d (z[0-9]+\.d), p4/z, \[x1\]
+** ld1d (z[0-9]+\.d), p4/z, \[x1, #1, mul vl\]
+** st2d {\3 - \4}, p0, \[x0\]
+** )
+** st4d {z0\.d - z3\.d}, p1, \[x0\]
+** st3d {z4\.d - z6\.d}, p2, \[x0\]
+** st1d z7\.d, p3, \[x0\]
+** ldr p4, \[sp\]
+** addvl sp, sp, #1
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svint64x4_t z0, svint64x3_t z4, svint64x2_t stack,
+        svint64_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_s64 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_s64 (pg, x0, -8),
+          svld3_vnum_s64 (pg, x0, -3),
+          svld2_vnum_s64 (pg, x0, 0),
+          svld1_vnum_s64 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4d\t{z0\.d - z3\.d}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3d\t{z4\.d - z6\.d}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1d\tz7\.d, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2d\t{(z[0-9]+\.d) - z[0-9]+\.d}.*\tst1d\t\1, p[0-7], \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2d\t{z[0-9]+\.d - (z[0-9]+\.d)}.*\tst1d\t\1, p[0-7], \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s8.c
new file mode 100644
index 0000000..63118f5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_s8.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** addvl sp, sp, #-1
+** str p4, \[sp\]
+** ptrue p4\.b, all
+** (
+** ld1b (z[0-9]+\.b), p4/z, \[x1, #1, mul vl\]
+** ld1b (z[0-9]+\.b), p4/z, \[x1\]
+** st2b {\2 - \1}, p0, \[x0\]
+** |
+** ld1b (z[0-9]+\.b), p4/z, \[x1\]
+** ld1b (z[0-9]+\.b), p4/z, \[x1, #1, mul vl\]
+** st2b {\3 - \4}, p0, \[x0\]
+** )
+** st4b {z0\.b - z3\.b}, p1, \[x0\]
+** st3b {z4\.b - z6\.b}, p2, \[x0\]
+** st1b z7\.b, p3, \[x0\]
+** ldr p4, \[sp\]
+** addvl sp, sp, #1
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svint8x4_t z0, svint8x3_t z4, svint8x2_t stack,
+        svint8_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_s8 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_s8 (pg, x0, -8),
+          svld3_vnum_s8 (pg, x0, -3),
+          svld2_vnum_s8 (pg, x0, 0),
+          svld1_vnum_s8 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4b\t{z0\.b - z3\.b}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3b\t{z4\.b - z6\.b}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1b\tz7\.b, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2b\t{(z[0-9]+\.b) - z[0-9]+\.b}.*\tst1b\t\1, p[0-7], \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2b\t{z[0-9]+\.b - (z[0-9]+\.b)}.*\tst1b\t\1, p[0-7], \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u16.c
new file mode 100644
index 0000000..29af146
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u16.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** addvl sp, sp, #-1
+** str p4, \[sp\]
+** ptrue p4\.b, all
+** (
+** ld1h (z[0-9]+\.h), p4/z, \[x1, #1, mul vl\]
+** ld1h (z[0-9]+\.h), p4/z, \[x1\]
+** st2h {\2 - \1}, p0, \[x0\]
+** |
+** ld1h (z[0-9]+\.h), p4/z, \[x1\]
+** ld1h (z[0-9]+\.h), p4/z, \[x1, #1, mul vl\]
+** st2h {\3 - \4}, p0, \[x0\]
+** )
+** st4h {z0\.h - z3\.h}, p1, \[x0\]
+** st3h {z4\.h - z6\.h}, p2, \[x0\]
+** st1h z7\.h, p3, \[x0\]
+** ldr p4, \[sp\]
+** addvl sp, sp, #1
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svuint16x4_t z0, svuint16x3_t z4, svuint16x2_t stack,
+        svuint16_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_u16 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_u16 (pg, x0, -8),
+          svld3_vnum_u16 (pg, x0, -3),
+          svld2_vnum_u16 (pg, x0, 0),
+          svld1_vnum_u16 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4h\t{z0\.h - z3\.h}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3h\t{z4\.h - z6\.h}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1h\tz7\.h, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2h\t{(z[0-9]+\.h) - z[0-9]+\.h}.*\tst1h\t\1, p[0-7], \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2h\t{z[0-9]+\.h - (z[0-9]+\.h)}.*\tst1h\t\1, p[0-7], \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u32.c
new file mode 100644
index 0000000..0a9ca9d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u32.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** addvl sp, sp, #-1
+** str p4, \[sp\]
+** ptrue p4\.b, all
+** (
+** ld1w (z[0-9]+\.s), p4/z, \[x1, #1, mul vl\]
+** ld1w (z[0-9]+\.s), p4/z, \[x1\]
+** st2w {\2 - \1}, p0, \[x0\]
+** |
+** ld1w (z[0-9]+\.s), p4/z, \[x1\]
+** ld1w (z[0-9]+\.s), p4/z, \[x1, #1, mul vl\]
+** st2w {\3 - \4}, p0, \[x0\]
+** )
+** st4w {z0\.s - z3\.s}, p1, \[x0\]
+** st3w {z4\.s - z6\.s}, p2, \[x0\]
+** st1w z7\.s, p3, \[x0\]
+** ldr p4, \[sp\]
+** addvl sp, sp, #1
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svuint32x4_t z0, svuint32x3_t z4, svuint32x2_t stack,
+        svuint32_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_u32 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_u32 (pg, x0, -8),
+          svld3_vnum_u32 (pg, x0, -3),
+          svld2_vnum_u32 (pg, x0, 0),
+          svld1_vnum_u32 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4w\t{z0\.s - z3\.s}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3w\t{z4\.s - z6\.s}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1w\tz7\.s, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{(z[0-9]+\.s) - z[0-9]+\.s}.*\tst1w\t\1, p[0-7], \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{z[0-9]+\.s - (z[0-9]+\.s)}.*\tst1w\t\1, p[0-7], \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u64.c
new file mode 100644
index 0000000..50b71ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u64.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** addvl sp, sp, #-1
+** str p4, \[sp\]
+** ptrue p4\.b, all
+** (
+** ld1d (z[0-9]+\.d), p4/z, \[x1, #1, mul vl\]
+** ld1d (z[0-9]+\.d), p4/z, \[x1\]
+** st2d {\2 - \1}, p0, \[x0\]
+** |
+** ld1d (z[0-9]+\.d), p4/z, \[x1\]
+** ld1d (z[0-9]+\.d), p4/z, \[x1, #1, mul vl\]
+** st2d {\3 - \4}, p0, \[x0\]
+** )
+** st4d {z0\.d - z3\.d}, p1, \[x0\]
+** st3d {z4\.d - z6\.d}, p2, \[x0\]
+** st1d z7\.d, p3, \[x0\]
+** ldr p4, \[sp\]
+** addvl sp, sp, #1
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svuint64x4_t z0, svuint64x3_t z4, svuint64x2_t stack,
+        svuint64_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_u64 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_u64 (pg, x0, -8),
+          svld3_vnum_u64 (pg, x0, -3),
+          svld2_vnum_u64 (pg, x0, 0),
+          svld1_vnum_u64 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4d\t{z0\.d - z3\.d}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3d\t{z4\.d - z6\.d}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1d\tz7\.d, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2d\t{(z[0-9]+\.d) - z[0-9]+\.d}.*\tst1d\t\1, p[0-7], \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2d\t{z[0-9]+\.d - (z[0-9]+\.d)}.*\tst1d\t\1, p[0-7], \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u8.c
new file mode 100644
index 0000000..bb6de3f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_u8.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** addvl sp, sp, #-1
+** str p4, \[sp\]
+** ptrue p4\.b, all
+** (
+** ld1b (z[0-9]+\.b), p4/z, \[x1, #1, mul vl\]
+** ld1b (z[0-9]+\.b), p4/z, \[x1\]
+** st2b {\2 - \1}, p0, \[x0\]
+** |
+** ld1b (z[0-9]+\.b), p4/z, \[x1\]
+** ld1b (z[0-9]+\.b), p4/z, \[x1, #1, mul vl\]
+** st2b {\3 - \4}, p0, \[x0\]
+** )
+** st4b {z0\.b - z3\.b}, p1, \[x0\]
+** st3b {z4\.b - z6\.b}, p2, \[x0\]
+** st1b z7\.b, p3, \[x0\]
+** ldr p4, \[sp\]
+** addvl sp, sp, #1
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svuint8x4_t z0, svuint8x3_t z4, svuint8x2_t stack,
+        svuint8_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_u8 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_u8 (pg, x0, -8),
+          svld3_vnum_u8 (pg, x0, -3),
+          svld2_vnum_u8 (pg, x0, 0),
+          svld1_vnum_u8 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4b\t{z0\.b - z3\.b}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3b\t{z4\.b - z6\.b}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1b\tz7\.b, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2b\t{(z[0-9]+\.b) - z[0-9]+\.b}.*\tst1b\t\1, p[0-7], \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2b\t{z[0-9]+\.b - (z[0-9]+\.b)}.*\tst1b\t\1, p[0-7], \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_f16.c
new file mode 100644
index 0000000..bd57fe4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_f16.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** (
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** ldr (z[0-9]+), \[x1\]
+** st2h {\2\.h - \1\.h}, p0, \[x0\]
+** |
+** ldr (z[0-9]+), \[x1\]
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** st2h {\3\.h - \4\.h}, p0, \[x0\]
+** )
+** st4h {z0\.h - z3\.h}, p1, \[x0\]
+** st3h {z4\.h - z6\.h}, p2, \[x0\]
+** st1h z7\.h, p3, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svfloat16x4_t z0, svfloat16x3_t z4, svfloat16x2_t stack,
+        svfloat16_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_f16 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_f16 (pg, x0, -8),
+          svld3_vnum_f16 (pg, x0, -3),
+          svld2_vnum_f16 (pg, x0, 0),
+          svld1_vnum_f16 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4h\t{z0\.h - z3\.h}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3h\t{z4\.h - z6\.h}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1h\tz7\.h, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2h\t{(z[0-9]+)\.h - z[0-9]+\.h}.*\tstr\t\1, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2h\t{z[0-9]+\.h - (z[0-9]+)\.h}.*\tstr\t\1, \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_f32.c
new file mode 100644
index 0000000..7263cfc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_f32.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** (
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** ldr (z[0-9]+), \[x1\]
+** st2w {\2\.s - \1\.s}, p0, \[x0\]
+** |
+** ldr (z[0-9]+), \[x1\]
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** st2w {\3\.s - \4\.s}, p0, \[x0\]
+** )
+** st4w {z0\.s - z3\.s}, p1, \[x0\]
+** st3w {z4\.s - z6\.s}, p2, \[x0\]
+** st1w z7\.s, p3, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svfloat32x4_t z0, svfloat32x3_t z4, svfloat32x2_t stack,
+        svfloat32_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_f32 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_f32 (pg, x0, -8),
+          svld3_vnum_f32 (pg, x0, -3),
+          svld2_vnum_f32 (pg, x0, 0),
+          svld1_vnum_f32 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4w\t{z0\.s - z3\.s}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3w\t{z4\.s - z6\.s}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1w\tz7\.s, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{(z[0-9]+)\.s - z[0-9]+\.s}.*\tstr\t\1, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{z[0-9]+\.s - (z[0-9]+)\.s}.*\tstr\t\1, \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_f64.c
new file mode 100644
index 0000000..5e24791
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_f64.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** (
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** ldr (z[0-9]+), \[x1\]
+** st2d {\2\.d - \1\.d}, p0, \[x0\]
+** |
+** ldr (z[0-9]+), \[x1\]
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** st2d {\3\.d - \4\.d}, p0, \[x0\]
+** )
+** st4d {z0\.d - z3\.d}, p1, \[x0\]
+** st3d {z4\.d - z6\.d}, p2, \[x0\]
+** st1d z7\.d, p3, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svfloat64x4_t z0, svfloat64x3_t z4, svfloat64x2_t stack,
+        svfloat64_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_f64 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_f64 (pg, x0, -8),
+          svld3_vnum_f64 (pg, x0, -3),
+          svld2_vnum_f64 (pg, x0, 0),
+          svld1_vnum_f64 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4d\t{z0\.d - z3\.d}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3d\t{z4\.d - z6\.d}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1d\tz7\.d, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2d\t{(z[0-9]+)\.d - z[0-9]+\.d}.*\tstr\t\1, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2d\t{z[0-9]+\.d - (z[0-9]+)\.d}.*\tstr\t\1, \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s16.c
new file mode 100644
index 0000000..82500f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s16.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** (
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** ldr (z[0-9]+), \[x1\]
+** st2h {\2\.h - \1\.h}, p0, \[x0\]
+** |
+** ldr (z[0-9]+), \[x1\]
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** st2h {\3\.h - \4\.h}, p0, \[x0\]
+** )
+** st4h {z0\.h - z3\.h}, p1, \[x0\]
+** st3h {z4\.h - z6\.h}, p2, \[x0\]
+** st1h z7\.h, p3, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svint16x4_t z0, svint16x3_t z4, svint16x2_t stack,
+        svint16_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_s16 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_s16 (pg, x0, -8),
+          svld3_vnum_s16 (pg, x0, -3),
+          svld2_vnum_s16 (pg, x0, 0),
+          svld1_vnum_s16 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4h\t{z0\.h - z3\.h}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3h\t{z4\.h - z6\.h}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1h\tz7\.h, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2h\t{(z[0-9]+)\.h - z[0-9]+\.h}.*\tstr\t\1, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2h\t{z[0-9]+\.h - (z[0-9]+)\.h}.*\tstr\t\1, \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s32.c
new file mode 100644
index 0000000..70ed319
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s32.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** (
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** ldr (z[0-9]+), \[x1\]
+** st2w {\2\.s - \1\.s}, p0, \[x0\]
+** |
+** ldr (z[0-9]+), \[x1\]
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** st2w {\3\.s - \4\.s}, p0, \[x0\]
+** )
+** st4w {z0\.s - z3\.s}, p1, \[x0\]
+** st3w {z4\.s - z6\.s}, p2, \[x0\]
+** st1w z7\.s, p3, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svint32x4_t z0, svint32x3_t z4, svint32x2_t stack,
+        svint32_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_s32 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_s32 (pg, x0, -8),
+          svld3_vnum_s32 (pg, x0, -3),
+          svld2_vnum_s32 (pg, x0, 0),
+          svld1_vnum_s32 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4w\t{z0\.s - z3\.s}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3w\t{z4\.s - z6\.s}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1w\tz7\.s, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{(z[0-9]+)\.s - z[0-9]+\.s}.*\tstr\t\1, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{z[0-9]+\.s - (z[0-9]+)\.s}.*\tstr\t\1, \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s64.c
new file mode 100644
index 0000000..80cb1fb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s64.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** (
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** ldr (z[0-9]+), \[x1\]
+** st2d {\2\.d - \1\.d}, p0, \[x0\]
+** |
+** ldr (z[0-9]+), \[x1\]
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** st2d {\3\.d - \4\.d}, p0, \[x0\]
+** )
+** st4d {z0\.d - z3\.d}, p1, \[x0\]
+** st3d {z4\.d - z6\.d}, p2, \[x0\]
+** st1d z7\.d, p3, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svint64x4_t z0, svint64x3_t z4, svint64x2_t stack,
+        svint64_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_s64 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_s64 (pg, x0, -8),
+          svld3_vnum_s64 (pg, x0, -3),
+          svld2_vnum_s64 (pg, x0, 0),
+          svld1_vnum_s64 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4d\t{z0\.d - z3\.d}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3d\t{z4\.d - z6\.d}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1d\tz7\.d, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2d\t{(z[0-9]+)\.d - z[0-9]+\.d}.*\tstr\t\1, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2d\t{z[0-9]+\.d - (z[0-9]+)\.d}.*\tstr\t\1, \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s8.c
new file mode 100644
index 0000000..12d5d4f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_s8.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** (
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** ldr (z[0-9]+), \[x1\]
+** st2b {\2\.b - \1\.b}, p0, \[x0\]
+** |
+** ldr (z[0-9]+), \[x1\]
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** st2b {\3\.b - \4\.b}, p0, \[x0\]
+** )
+** st4b {z0\.b - z3\.b}, p1, \[x0\]
+** st3b {z4\.b - z6\.b}, p2, \[x0\]
+** st1b z7\.b, p3, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svint8x4_t z0, svint8x3_t z4, svint8x2_t stack,
+        svint8_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_s8 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_s8 (pg, x0, -8),
+          svld3_vnum_s8 (pg, x0, -3),
+          svld2_vnum_s8 (pg, x0, 0),
+          svld1_vnum_s8 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4b\t{z0\.b - z3\.b}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3b\t{z4\.b - z6\.b}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1b\tz7\.b, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2b\t{(z[0-9]+)\.b - z[0-9]+\.b}.*\tstr\t\1, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2b\t{z[0-9]+\.b - (z[0-9]+)\.b}.*\tstr\t\1, \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u16.c
new file mode 100644
index 0000000..5d3ed92
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u16.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** (
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** ldr (z[0-9]+), \[x1\]
+** st2h {\2\.h - \1\.h}, p0, \[x0\]
+** |
+** ldr (z[0-9]+), \[x1\]
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** st2h {\3\.h - \4\.h}, p0, \[x0\]
+** )
+** st4h {z0\.h - z3\.h}, p1, \[x0\]
+** st3h {z4\.h - z6\.h}, p2, \[x0\]
+** st1h z7\.h, p3, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svuint16x4_t z0, svuint16x3_t z4, svuint16x2_t stack,
+        svuint16_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_u16 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_u16 (pg, x0, -8),
+          svld3_vnum_u16 (pg, x0, -3),
+          svld2_vnum_u16 (pg, x0, 0),
+          svld1_vnum_u16 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4h\t{z0\.h - z3\.h}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3h\t{z4\.h - z6\.h}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1h\tz7\.h, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2h\t{(z[0-9]+)\.h - z[0-9]+\.h}.*\tstr\t\1, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2h\t{z[0-9]+\.h - (z[0-9]+)\.h}.*\tstr\t\1, \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u32.c
new file mode 100644
index 0000000..d08a7a8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u32.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** (
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** ldr (z[0-9]+), \[x1\]
+** st2w {\2\.s - \1\.s}, p0, \[x0\]
+** |
+** ldr (z[0-9]+), \[x1\]
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** st2w {\3\.s - \4\.s}, p0, \[x0\]
+** )
+** st4w {z0\.s - z3\.s}, p1, \[x0\]
+** st3w {z4\.s - z6\.s}, p2, \[x0\]
+** st1w z7\.s, p3, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svuint32x4_t z0, svuint32x3_t z4, svuint32x2_t stack,
+        svuint32_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_u32 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_u32 (pg, x0, -8),
+          svld3_vnum_u32 (pg, x0, -3),
+          svld2_vnum_u32 (pg, x0, 0),
+          svld1_vnum_u32 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4w\t{z0\.s - z3\.s}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3w\t{z4\.s - z6\.s}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1w\tz7\.s, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{(z[0-9]+)\.s - z[0-9]+\.s}.*\tstr\t\1, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{z[0-9]+\.s - (z[0-9]+)\.s}.*\tstr\t\1, \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u64.c
new file mode 100644
index 0000000..84c27d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u64.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** (
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** ldr (z[0-9]+), \[x1\]
+** st2d {\2\.d - \1\.d}, p0, \[x0\]
+** |
+** ldr (z[0-9]+), \[x1\]
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** st2d {\3\.d - \4\.d}, p0, \[x0\]
+** )
+** st4d {z0\.d - z3\.d}, p1, \[x0\]
+** st3d {z4\.d - z6\.d}, p2, \[x0\]
+** st1d z7\.d, p3, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svuint64x4_t z0, svuint64x3_t z4, svuint64x2_t stack,
+        svuint64_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_u64 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_u64 (pg, x0, -8),
+          svld3_vnum_u64 (pg, x0, -3),
+          svld2_vnum_u64 (pg, x0, 0),
+          svld1_vnum_u64 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4d\t{z0\.d - z3\.d}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3d\t{z4\.d - z6\.d}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1d\tz7\.d, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2d\t{(z[0-9]+)\.d - z[0-9]+\.d}.*\tstr\t\1, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2d\t{z[0-9]+\.d - (z[0-9]+)\.d}.*\tstr\t\1, \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u8.c
new file mode 100644
index 0000000..e8b599c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_u8.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+** (
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** ldr (z[0-9]+), \[x1\]
+** st2b {\2\.b - \1\.b}, p0, \[x0\]
+** |
+** ldr (z[0-9]+), \[x1\]
+** ldr (z[0-9]+), \[x1, #1, mul vl\]
+** st2b {\3\.b - \4\.b}, p0, \[x0\]
+** )
+** st4b {z0\.b - z3\.b}, p1, \[x0\]
+** st3b {z4\.b - z6\.b}, p2, \[x0\]
+** st1b z7\.b, p3, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svuint8x4_t z0, svuint8x3_t z4, svuint8x2_t stack,
+        svuint8_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_u8 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+          svld4_vnum_u8 (pg, x0, -8),
+          svld3_vnum_u8 (pg, x0, -3),
+          svld2_vnum_u8 (pg, x0, 0),
+          svld1_vnum_u8 (pg, x0, 2),
+          svptrue_pat_b8 (SV_VL1),
+          svptrue_pat_b16 (SV_VL2),
+          svptrue_pat_b32 (SV_VL3),
+          svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4b\t{z0\.b - z3\.b}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3b\t{z4\.b - z6\.b}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1b\tz7\.b, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2b\t{(z[0-9]+)\.b - z[0-9]+\.b}.*\tstr\t\1, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld2b\t{z[0-9]+\.b - (z[0-9]+)\.b}.*\tstr\t\1, \[x1, #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_f16.c
new file mode 100644
index 0000000..f898cad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_f16.c
@@ -0,0 +1,71 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee1:
+** ptrue p3\.b, all
+** ...
+** ld1h (z[0-9]+\.h), p3/z, \[x1, #3, mul vl\]
+** ...
+** st4h {z[0-9]+\.h - \1}, p0, \[x0\]
+** st2h {z3\.h - z4\.h}, p1, \[x0\]
+** st3h {z5\.h - z7\.h}, p2, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee1 (void *x0, svfloat16x3_t z0, svfloat16x2_t z3, svfloat16x3_t z5,
+         svfloat16x4_t stack1, svfloat16_t stack2, svbool_t p0,
+         svbool_t p1, svbool_t p2)
+{
+  svst4_f16 (p0, x0, stack1);
+  svst2_f16 (p1, x0, z3);
+  svst3_f16 (p2, x0, z5);
+}
+
+/*
+** callee2:
+** ptrue p3\.b, all
+** ld1h (z[0-9]+\.h), p3/z, \[x2\]
+** st1h \1, p0, \[x0\]
+** st2h {z3\.h - z4\.h}, p1, \[x0\]
+** st3h {z0\.h - z2\.h}, p2, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee2 (void *x0, svfloat16x3_t z0, svfloat16x2_t z3, svfloat16x3_t z5,
+         svfloat16x4_t stack1, svfloat16_t stack2, svbool_t p0,
+         svbool_t p1, svbool_t p2)
+{
+  svst1_f16 (p0, x0, stack2);
+  svst2_f16 (p1, x0, z3);
+  svst3_f16 (p2, x0, z0);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee1 (x0,
+           svld3_vnum_f16 (pg, x0, -9),
+           svld2_vnum_f16 (pg, x0, -2),
+           svld3_vnum_f16 (pg, x0, 0),
+           svld4_vnum_f16 (pg, x0, 8),
+           svld1_vnum_f16 (pg, x0, 5),
+           svptrue_pat_b8 (SV_VL1),
+           svptrue_pat_b16 (SV_VL2),
+           svptrue_pat_b32 (SV_VL3));
+}
+
+/* { dg-final { scan-assembler {\tld3h\t{z0\.h - z2\.h}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld2h\t{z3\.h - z4\.h}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3h\t{z5\.h - z7\.h}, p[0-7]/z, \[x0\]\n} } } */
+/* { dg-final { scan-assembler {\tld4h\t{(z[0-9]+\.h) - z[0-9]+\.h}.*\tst1h\t\1, p[0-7], \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld4h\t{z[0-9]+\.h - (z[0-9]+\.h)}.*\tst1h\t\1, p[0-7], \[x1, #3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1h\t(z[0-9]+\.h), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1h\t\1, p[0-7], \[x2\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_f32.c
new file mode 100644
index 0000000..dd23dbb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_f32.c
@@ -0,0 +1,71 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee1:
+** ptrue p3\.b, all
+** ...
+** ld1w (z[0-9]+\.s), p3/z, \[x1, #3, mul vl\]
+** ...
+** st4w {z[0-9]+\.s - \1}, p0, \[x0\]
+** st2w {z3\.s - z4\.s}, p1, \[x0\]
+** st3w {z5\.s - z7\.s}, p2, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee1 (void *x0, svfloat32x3_t z0, svfloat32x2_t z3, svfloat32x3_t z5,
+         svfloat32x4_t stack1, svfloat32_t stack2, svbool_t p0,
+         svbool_t p1, svbool_t p2)
+{
+  svst4_f32 (p0, x0, stack1);
+  svst2_f32 (p1, x0, z3);
+  svst3_f32 (p2, x0, z5);
+}
+
+/*
+** callee2:
+** ptrue p3\.b, all
+** ld1w (z[0-9]+\.s), p3/z, \[x2\]
+** st1w \1, p0, \[x0\]
+** st2w {z3\.s - z4\.s}, p1, \[x0\]
+** st3w {z0\.s - z2\.s}, p2, \[x0\]
+** ret
+*/
+void __attribute__((noipa))
+callee2 (void *x0, svfloat32x3_t z0, svfloat32x2_t z3, svfloat32x3_t z5,
+         svfloat32x4_t stack1, svfloat32_t stack2, svbool_t p0,
+         svbool_t p1, svbool_t p2)
+{
+  svst1_f32 (p0, x0, stack2);
+  svst2_f32 (p1, x0, z3);
+  svst3_f32 (p2, x0, z0);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee1 (x0,
+           svld3_vnum_f32 (pg, x0, -9),
+           svld2_vnum_f32 (pg, x0, -2),
+           svld3_vnum_f32 (pg, x0, 0),
+           svld4_vnum_f32 (pg, x0, 8),
+           svld1_vnum_f32 (pg, x0, 5),
+           svptrue_pat_b8 (SV_VL1),
+           svptrue_pat_b16 (SV_VL2),
+           svptrue_pat_b32 (SV_VL3));
+}
+
+/* { dg-final { scan-assembler {\tld3w\t{z0\.s - z2\.s}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld2w\t{z3\.s - z4\.s}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3w\t{z5\.s - z7\.s}, p[0-7]/z, \[x0\]\n} } } */
+/* { dg-final { scan-assembler {\tld4w\t{(z[0-9]+\.s) - z[0-9]+\.s}.*\tst1w\t\1, p[0-7], \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld4w\t{z[0-9]+\.s - (z[0-9]+\.s)}.*\tst1w\t\1, p[0-7], \[x1, #3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1w\t(z[0-9]+\.s), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1w\t\1, p[0-7], \[x2\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_f64.c
new file mode 100644
index 0000000..090a91d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_f64.c
@@ -0,0 +1,71 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee1:
+** ptrue p3\.b, all
+** ...
+** ld1d (z[0-9]+\.d), p3/z, \[x1, #3, mul vl\]
+** ...
+** st4d {z[0-9]+\.d - \1}, p0, \[x0\] +** st2d {z3\.d - z4\.d}, p1, \[x0\] +** st3d {z5\.d - z7\.d}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svfloat64x3_t z0, svfloat64x2_t z3, svfloat64x3_t z5, + svfloat64x4_t stack1, svfloat64_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_f64 (p0, x0, stack1); + svst2_f64 (p1, x0, z3); + svst3_f64 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1d (z[0-9]+\.d), p3/z, \[x2\] +** st1d \1, p0, \[x0\] +** st2d {z3\.d - z4\.d}, p1, \[x0\] +** st3d {z0\.d - z2\.d}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svfloat64x3_t z0, svfloat64x2_t z3, svfloat64x3_t z5, + svfloat64x4_t stack1, svfloat64_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_f64 (p0, x0, stack2); + svst2_f64 (p1, x0, z3); + svst3_f64 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_f64 (pg, x0, -9), + svld2_vnum_f64 (pg, x0, -2), + svld3_vnum_f64 (pg, x0, 0), + svld4_vnum_f64 (pg, x0, 8), + svld1_vnum_f64 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3d\t{z0\.d - z2\.d}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2d\t{z3\.d - z4\.d}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3d\t{z5\.d - z7\.d}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4d\t{(z[0-9]+\.d) - z[0-9]+\.d}.*\tst1d\t\1, p[0-7], \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4d\t{z[0-9]+\.d - (z[0-9]+\.d)}.*\tst1d\t\1, p[0-7], \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1d\t(z[0-9]+\.d), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1d\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s16.c new file mode 100644 index 0000000..f28ac71 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s16.c @@ -0,0 +1,71 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ptrue p3\.b, all +** ... +** ld1h (z[0-9]+\.h), p3/z, \[x1, #3, mul vl\] +** ... 
+** st4h {z[0-9]+\.h - \1}, p0, \[x0\] +** st2h {z3\.h - z4\.h}, p1, \[x0\] +** st3h {z5\.h - z7\.h}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svint16x3_t z0, svint16x2_t z3, svint16x3_t z5, + svint16x4_t stack1, svint16_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_s16 (p0, x0, stack1); + svst2_s16 (p1, x0, z3); + svst3_s16 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1h (z[0-9]+\.h), p3/z, \[x2\] +** st1h \1, p0, \[x0\] +** st2h {z3\.h - z4\.h}, p1, \[x0\] +** st3h {z0\.h - z2\.h}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svint16x3_t z0, svint16x2_t z3, svint16x3_t z5, + svint16x4_t stack1, svint16_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_s16 (p0, x0, stack2); + svst2_s16 (p1, x0, z3); + svst3_s16 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_s16 (pg, x0, -9), + svld2_vnum_s16 (pg, x0, -2), + svld3_vnum_s16 (pg, x0, 0), + svld4_vnum_s16 (pg, x0, 8), + svld1_vnum_s16 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3h\t{z0\.h - z2\.h}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2h\t{z3\.h - z4\.h}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3h\t{z5\.h - z7\.h}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4h\t{(z[0-9]+\.h) - z[0-9]+\.h}.*\tst1h\t\1, p[0-7], \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4h\t{z[0-9]+\.h - (z[0-9]+\.h)}.*\tst1h\t\1, p[0-7], \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1h\t(z[0-9]+\.h), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1h\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s32.c new file mode 100644 index 0000000..701c8a9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s32.c @@ -0,0 +1,71 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ptrue p3\.b, all +** ... +** ld1w (z[0-9]+\.s), p3/z, \[x1, #3, mul vl\] +** ... 
+** st4w {z[0-9]+\.s - \1}, p0, \[x0\] +** st2w {z3\.s - z4\.s}, p1, \[x0\] +** st3w {z5\.s - z7\.s}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svint32x3_t z0, svint32x2_t z3, svint32x3_t z5, + svint32x4_t stack1, svint32_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_s32 (p0, x0, stack1); + svst2_s32 (p1, x0, z3); + svst3_s32 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1w (z[0-9]+\.s), p3/z, \[x2\] +** st1w \1, p0, \[x0\] +** st2w {z3\.s - z4\.s}, p1, \[x0\] +** st3w {z0\.s - z2\.s}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svint32x3_t z0, svint32x2_t z3, svint32x3_t z5, + svint32x4_t stack1, svint32_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_s32 (p0, x0, stack2); + svst2_s32 (p1, x0, z3); + svst3_s32 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_s32 (pg, x0, -9), + svld2_vnum_s32 (pg, x0, -2), + svld3_vnum_s32 (pg, x0, 0), + svld4_vnum_s32 (pg, x0, 8), + svld1_vnum_s32 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3w\t{z0\.s - z2\.s}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2w\t{z3\.s - z4\.s}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3w\t{z5\.s - z7\.s}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4w\t{(z[0-9]+\.s) - z[0-9]+\.s}.*\tst1w\t\1, p[0-7], \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4w\t{z[0-9]+\.s - (z[0-9]+\.s)}.*\tst1w\t\1, p[0-7], \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1w\t(z[0-9]+\.s), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1w\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s64.c new file mode 100644 index 0000000..7aad40f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s64.c @@ -0,0 +1,71 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ptrue p3\.b, all +** ... +** ld1d (z[0-9]+\.d), p3/z, \[x1, #3, mul vl\] +** ...
+** st4d {z[0-9]+\.d - \1}, p0, \[x0\] +** st2d {z3\.d - z4\.d}, p1, \[x0\] +** st3d {z5\.d - z7\.d}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svint64x3_t z0, svint64x2_t z3, svint64x3_t z5, + svint64x4_t stack1, svint64_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_s64 (p0, x0, stack1); + svst2_s64 (p1, x0, z3); + svst3_s64 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1d (z[0-9]+\.d), p3/z, \[x2\] +** st1d \1, p0, \[x0\] +** st2d {z3\.d - z4\.d}, p1, \[x0\] +** st3d {z0\.d - z2\.d}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svint64x3_t z0, svint64x2_t z3, svint64x3_t z5, + svint64x4_t stack1, svint64_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_s64 (p0, x0, stack2); + svst2_s64 (p1, x0, z3); + svst3_s64 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_s64 (pg, x0, -9), + svld2_vnum_s64 (pg, x0, -2), + svld3_vnum_s64 (pg, x0, 0), + svld4_vnum_s64 (pg, x0, 8), + svld1_vnum_s64 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3d\t{z0\.d - z2\.d}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2d\t{z3\.d - z4\.d}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3d\t{z5\.d - z7\.d}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4d\t{(z[0-9]+\.d) - z[0-9]+\.d}.*\tst1d\t\1, p[0-7], \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4d\t{z[0-9]+\.d - (z[0-9]+\.d)}.*\tst1d\t\1, p[0-7], \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1d\t(z[0-9]+\.d), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1d\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s8.c new file mode 100644 index 0000000..66ee82e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_s8.c @@ -0,0 +1,71 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ptrue p3\.b, all +** ... +** ld1b (z[0-9]+\.b), p3/z, \[x1, #3, mul vl\] +** ... 
+** st4b {z[0-9]+\.b - \1}, p0, \[x0\] +** st2b {z3\.b - z4\.b}, p1, \[x0\] +** st3b {z5\.b - z7\.b}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svint8x3_t z0, svint8x2_t z3, svint8x3_t z5, + svint8x4_t stack1, svint8_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_s8 (p0, x0, stack1); + svst2_s8 (p1, x0, z3); + svst3_s8 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1b (z[0-9]+\.b), p3/z, \[x2\] +** st1b \1, p0, \[x0\] +** st2b {z3\.b - z4\.b}, p1, \[x0\] +** st3b {z0\.b - z2\.b}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svint8x3_t z0, svint8x2_t z3, svint8x3_t z5, + svint8x4_t stack1, svint8_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_s8 (p0, x0, stack2); + svst2_s8 (p1, x0, z3); + svst3_s8 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_s8 (pg, x0, -9), + svld2_vnum_s8 (pg, x0, -2), + svld3_vnum_s8 (pg, x0, 0), + svld4_vnum_s8 (pg, x0, 8), + svld1_vnum_s8 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3b\t{z0\.b - z2\.b}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2b\t{z3\.b - z4\.b}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3b\t{z5\.b - z7\.b}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4b\t{(z[0-9]+\.b) - z[0-9]+\.b}.*\tst1b\t\1, p[0-7], \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4b\t{z[0-9]+\.b - (z[0-9]+\.b)}.*\tst1b\t\1, p[0-7], \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1b\t(z[0-9]+\.b), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1b\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u16.c new file mode 100644 index 0000000..b9370e1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u16.c @@ -0,0 +1,71 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ptrue p3\.b, all +** ... +** ld1h (z[0-9]+\.h), p3/z, \[x1, #3, mul vl\] +** ... 
+** st4h {z[0-9]+\.h - \1}, p0, \[x0\] +** st2h {z3\.h - z4\.h}, p1, \[x0\] +** st3h {z5\.h - z7\.h}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svuint16x3_t z0, svuint16x2_t z3, svuint16x3_t z5, + svuint16x4_t stack1, svuint16_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_u16 (p0, x0, stack1); + svst2_u16 (p1, x0, z3); + svst3_u16 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1h (z[0-9]+\.h), p3/z, \[x2\] +** st1h \1, p0, \[x0\] +** st2h {z3\.h - z4\.h}, p1, \[x0\] +** st3h {z0\.h - z2\.h}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svuint16x3_t z0, svuint16x2_t z3, svuint16x3_t z5, + svuint16x4_t stack1, svuint16_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_u16 (p0, x0, stack2); + svst2_u16 (p1, x0, z3); + svst3_u16 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_u16 (pg, x0, -9), + svld2_vnum_u16 (pg, x0, -2), + svld3_vnum_u16 (pg, x0, 0), + svld4_vnum_u16 (pg, x0, 8), + svld1_vnum_u16 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3h\t{z0\.h - z2\.h}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2h\t{z3\.h - z4\.h}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3h\t{z5\.h - z7\.h}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4h\t{(z[0-9]+\.h) - z[0-9]+\.h}.*\tst1h\t\1, p[0-7], \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4h\t{z[0-9]+\.h - (z[0-9]+\.h)}.*\tst1h\t\1, p[0-7], \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1h\t(z[0-9]+\.h), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1h\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u32.c new file mode 100644 index 0000000..983c26c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u32.c @@ -0,0 +1,71 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ptrue p3\.b, all +** ... +** ld1w (z[0-9]+\.s), p3/z, \[x1, #3, mul vl\] +** ... 
+** st4w {z[0-9]+\.s - \1}, p0, \[x0\] +** st2w {z3\.s - z4\.s}, p1, \[x0\] +** st3w {z5\.s - z7\.s}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svuint32x3_t z0, svuint32x2_t z3, svuint32x3_t z5, + svuint32x4_t stack1, svuint32_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_u32 (p0, x0, stack1); + svst2_u32 (p1, x0, z3); + svst3_u32 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1w (z[0-9]+\.s), p3/z, \[x2\] +** st1w \1, p0, \[x0\] +** st2w {z3\.s - z4\.s}, p1, \[x0\] +** st3w {z0\.s - z2\.s}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svuint32x3_t z0, svuint32x2_t z3, svuint32x3_t z5, + svuint32x4_t stack1, svuint32_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_u32 (p0, x0, stack2); + svst2_u32 (p1, x0, z3); + svst3_u32 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_u32 (pg, x0, -9), + svld2_vnum_u32 (pg, x0, -2), + svld3_vnum_u32 (pg, x0, 0), + svld4_vnum_u32 (pg, x0, 8), + svld1_vnum_u32 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3w\t{z0\.s - z2\.s}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2w\t{z3\.s - z4\.s}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3w\t{z5\.s - z7\.s}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4w\t{(z[0-9]+\.s) - z[0-9]+\.s}.*\tst1w\t\1, p[0-7], \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4w\t{z[0-9]+\.s - (z[0-9]+\.s)}.*\tst1w\t\1, p[0-7], \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1w\t(z[0-9]+\.s), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1w\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u64.c new file mode 100644 index 0000000..89755d6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u64.c @@ -0,0 +1,71 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ptrue p3\.b, all +** ... +** ld1d (z[0-9]+\.d), p3/z, \[x1, #3, mul vl\] +** ... 
+** st4d {z[0-9]+\.d - \1}, p0, \[x0\] +** st2d {z3\.d - z4\.d}, p1, \[x0\] +** st3d {z5\.d - z7\.d}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svuint64x3_t z0, svuint64x2_t z3, svuint64x3_t z5, + svuint64x4_t stack1, svuint64_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_u64 (p0, x0, stack1); + svst2_u64 (p1, x0, z3); + svst3_u64 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1d (z[0-9]+\.d), p3/z, \[x2\] +** st1d \1, p0, \[x0\] +** st2d {z3\.d - z4\.d}, p1, \[x0\] +** st3d {z0\.d - z2\.d}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svuint64x3_t z0, svuint64x2_t z3, svuint64x3_t z5, + svuint64x4_t stack1, svuint64_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_u64 (p0, x0, stack2); + svst2_u64 (p1, x0, z3); + svst3_u64 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_u64 (pg, x0, -9), + svld2_vnum_u64 (pg, x0, -2), + svld3_vnum_u64 (pg, x0, 0), + svld4_vnum_u64 (pg, x0, 8), + svld1_vnum_u64 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3d\t{z0\.d - z2\.d}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2d\t{z3\.d - z4\.d}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3d\t{z5\.d - z7\.d}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4d\t{(z[0-9]+\.d) - z[0-9]+\.d}.*\tst1d\t\1, p[0-7], \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4d\t{z[0-9]+\.d - (z[0-9]+\.d)}.*\tst1d\t\1, p[0-7], \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1d\t(z[0-9]+\.d), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1d\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u8.c new file mode 100644 index 0000000..7324bd6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_u8.c @@ -0,0 +1,71 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ptrue p3\.b, all +** ... +** ld1b (z[0-9]+\.b), p3/z, \[x1, #3, mul vl\] +** ... 
+** st4b {z[0-9]+\.b - \1}, p0, \[x0\] +** st2b {z3\.b - z4\.b}, p1, \[x0\] +** st3b {z5\.b - z7\.b}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svuint8x3_t z0, svuint8x2_t z3, svuint8x3_t z5, + svuint8x4_t stack1, svuint8_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_u8 (p0, x0, stack1); + svst2_u8 (p1, x0, z3); + svst3_u8 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1b (z[0-9]+\.b), p3/z, \[x2\] +** st1b \1, p0, \[x0\] +** st2b {z3\.b - z4\.b}, p1, \[x0\] +** st3b {z0\.b - z2\.b}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svuint8x3_t z0, svuint8x2_t z3, svuint8x3_t z5, + svuint8x4_t stack1, svuint8_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_u8 (p0, x0, stack2); + svst2_u8 (p1, x0, z3); + svst3_u8 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_u8 (pg, x0, -9), + svld2_vnum_u8 (pg, x0, -2), + svld3_vnum_u8 (pg, x0, 0), + svld4_vnum_u8 (pg, x0, 8), + svld1_vnum_u8 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3b\t{z0\.b - z2\.b}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2b\t{z3\.b - z4\.b}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3b\t{z5\.b - z7\.b}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4b\t{(z[0-9]+\.b) - z[0-9]+\.b}.*\tst1b\t\1, p[0-7], \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4b\t{z[0-9]+\.b - (z[0-9]+\.b)}.*\tst1b\t\1, p[0-7], \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1b\t(z[0-9]+\.b), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1b\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_f16.c new file mode 100644 index 0000000..9392c67 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_f16.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ... +** ldr (z[0-9]+), \[x1, #3, mul vl\] +** ... 
+** st4h {z[0-9]+\.h - \1\.h}, p0, \[x0\] +** st2h {z3\.h - z4\.h}, p1, \[x0\] +** st3h {z5\.h - z7\.h}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svfloat16x3_t z0, svfloat16x2_t z3, svfloat16x3_t z5, + svfloat16x4_t stack1, svfloat16_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_f16 (p0, x0, stack1); + svst2_f16 (p1, x0, z3); + svst3_f16 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1h (z[0-9]+\.h), p3/z, \[x2\] +** st1h \1, p0, \[x0\] +** st2h {z3\.h - z4\.h}, p1, \[x0\] +** st3h {z0\.h - z2\.h}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svfloat16x3_t z0, svfloat16x2_t z3, svfloat16x3_t z5, + svfloat16x4_t stack1, svfloat16_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_f16 (p0, x0, stack2); + svst2_f16 (p1, x0, z3); + svst3_f16 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_f16 (pg, x0, -9), + svld2_vnum_f16 (pg, x0, -2), + svld3_vnum_f16 (pg, x0, 0), + svld4_vnum_f16 (pg, x0, 8), + svld1_vnum_f16 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3h\t{z0\.h - z2\.h}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2h\t{z3\.h - z4\.h}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3h\t{z5\.h - z7\.h}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4h\t{(z[0-9]+)\.h - z[0-9]+\.h}.*\tstr\t\1, \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4h\t{z[0-9]+\.h - (z[0-9]+)\.h}.*\tstr\t\1, \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1h\t(z[0-9]+\.h), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1h\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_f32.c new file mode 100644 index 0000000..8b22cf3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_f32.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ... +** ldr (z[0-9]+), \[x1, #3, mul vl\] +** ... 
+** st4w {z[0-9]+\.s - \1\.s}, p0, \[x0\] +** st2w {z3\.s - z4\.s}, p1, \[x0\] +** st3w {z5\.s - z7\.s}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svfloat32x3_t z0, svfloat32x2_t z3, svfloat32x3_t z5, + svfloat32x4_t stack1, svfloat32_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_f32 (p0, x0, stack1); + svst2_f32 (p1, x0, z3); + svst3_f32 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1w (z[0-9]+\.s), p3/z, \[x2\] +** st1w \1, p0, \[x0\] +** st2w {z3\.s - z4\.s}, p1, \[x0\] +** st3w {z0\.s - z2\.s}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svfloat32x3_t z0, svfloat32x2_t z3, svfloat32x3_t z5, + svfloat32x4_t stack1, svfloat32_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_f32 (p0, x0, stack2); + svst2_f32 (p1, x0, z3); + svst3_f32 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_f32 (pg, x0, -9), + svld2_vnum_f32 (pg, x0, -2), + svld3_vnum_f32 (pg, x0, 0), + svld4_vnum_f32 (pg, x0, 8), + svld1_vnum_f32 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3w\t{z0\.s - z2\.s}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2w\t{z3\.s - z4\.s}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3w\t{z5\.s - z7\.s}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4w\t{(z[0-9]+)\.s - z[0-9]+\.s}.*\tstr\t\1, \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4w\t{z[0-9]+\.s - (z[0-9]+)\.s}.*\tstr\t\1, \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1w\t(z[0-9]+\.s), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1w\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_f64.c new file mode 100644 index 0000000..94a1d40 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_f64.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ... +** ldr (z[0-9]+), \[x1, #3, mul vl\] +** ... 
+** st4d {z[0-9]+\.d - \1\.d}, p0, \[x0\] +** st2d {z3\.d - z4\.d}, p1, \[x0\] +** st3d {z5\.d - z7\.d}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svfloat64x3_t z0, svfloat64x2_t z3, svfloat64x3_t z5, + svfloat64x4_t stack1, svfloat64_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_f64 (p0, x0, stack1); + svst2_f64 (p1, x0, z3); + svst3_f64 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1d (z[0-9]+\.d), p3/z, \[x2\] +** st1d \1, p0, \[x0\] +** st2d {z3\.d - z4\.d}, p1, \[x0\] +** st3d {z0\.d - z2\.d}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svfloat64x3_t z0, svfloat64x2_t z3, svfloat64x3_t z5, + svfloat64x4_t stack1, svfloat64_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_f64 (p0, x0, stack2); + svst2_f64 (p1, x0, z3); + svst3_f64 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_f64 (pg, x0, -9), + svld2_vnum_f64 (pg, x0, -2), + svld3_vnum_f64 (pg, x0, 0), + svld4_vnum_f64 (pg, x0, 8), + svld1_vnum_f64 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3d\t{z0\.d - z2\.d}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2d\t{z3\.d - z4\.d}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3d\t{z5\.d - z7\.d}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4d\t{(z[0-9]+)\.d - z[0-9]+\.d}.*\tstr\t\1, \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4d\t{z[0-9]+\.d - (z[0-9]+)\.d}.*\tstr\t\1, \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1d\t(z[0-9]+\.d), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1d\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s16.c new file mode 100644 index 0000000..992ab18 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s16.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ... +** ldr (z[0-9]+), \[x1, #3, mul vl\] +** ... 
+** st4h {z[0-9]+\.h - \1\.h}, p0, \[x0\] +** st2h {z3\.h - z4\.h}, p1, \[x0\] +** st3h {z5\.h - z7\.h}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svint16x3_t z0, svint16x2_t z3, svint16x3_t z5, + svint16x4_t stack1, svint16_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_s16 (p0, x0, stack1); + svst2_s16 (p1, x0, z3); + svst3_s16 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1h (z[0-9]+\.h), p3/z, \[x2\] +** st1h \1, p0, \[x0\] +** st2h {z3\.h - z4\.h}, p1, \[x0\] +** st3h {z0\.h - z2\.h}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svint16x3_t z0, svint16x2_t z3, svint16x3_t z5, + svint16x4_t stack1, svint16_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_s16 (p0, x0, stack2); + svst2_s16 (p1, x0, z3); + svst3_s16 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_s16 (pg, x0, -9), + svld2_vnum_s16 (pg, x0, -2), + svld3_vnum_s16 (pg, x0, 0), + svld4_vnum_s16 (pg, x0, 8), + svld1_vnum_s16 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3h\t{z0\.h - z2\.h}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2h\t{z3\.h - z4\.h}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3h\t{z5\.h - z7\.h}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4h\t{(z[0-9]+)\.h - z[0-9]+\.h}.*\tstr\t\1, \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4h\t{z[0-9]+\.h - (z[0-9]+)\.h}.*\tstr\t\1, \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1h\t(z[0-9]+\.h), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1h\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s32.c new file mode 100644 index 0000000..6a497e9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s32.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ... +** ldr (z[0-9]+), \[x1, #3, mul vl\] +** ... 
+** st4w {z[0-9]+\.s - \1\.s}, p0, \[x0\] +** st2w {z3\.s - z4\.s}, p1, \[x0\] +** st3w {z5\.s - z7\.s}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svint32x3_t z0, svint32x2_t z3, svint32x3_t z5, + svint32x4_t stack1, svint32_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_s32 (p0, x0, stack1); + svst2_s32 (p1, x0, z3); + svst3_s32 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1w (z[0-9]+\.s), p3/z, \[x2\] +** st1w \1, p0, \[x0\] +** st2w {z3\.s - z4\.s}, p1, \[x0\] +** st3w {z0\.s - z2\.s}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svint32x3_t z0, svint32x2_t z3, svint32x3_t z5, + svint32x4_t stack1, svint32_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_s32 (p0, x0, stack2); + svst2_s32 (p1, x0, z3); + svst3_s32 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_s32 (pg, x0, -9), + svld2_vnum_s32 (pg, x0, -2), + svld3_vnum_s32 (pg, x0, 0), + svld4_vnum_s32 (pg, x0, 8), + svld1_vnum_s32 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3w\t{z0\.s - z2\.s}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2w\t{z3\.s - z4\.s}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3w\t{z5\.s - z7\.s}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4w\t{(z[0-9]+)\.s - z[0-9]+\.s}.*\tstr\t\1, \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4w\t{z[0-9]+\.s - (z[0-9]+)\.s}.*\tstr\t\1, \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1w\t(z[0-9]+\.s), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1w\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s64.c new file mode 100644 index 0000000..d2e4c44 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s64.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ... +** ldr (z[0-9]+), \[x1, #3, mul vl\] +** ... 
+** st4d {z[0-9]+\.d - \1\.d}, p0, \[x0\] +** st2d {z3\.d - z4\.d}, p1, \[x0\] +** st3d {z5\.d - z7\.d}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svint64x3_t z0, svint64x2_t z3, svint64x3_t z5, + svint64x4_t stack1, svint64_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_s64 (p0, x0, stack1); + svst2_s64 (p1, x0, z3); + svst3_s64 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1d (z[0-9]+\.d), p3/z, \[x2\] +** st1d \1, p0, \[x0\] +** st2d {z3\.d - z4\.d}, p1, \[x0\] +** st3d {z0\.d - z2\.d}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svint64x3_t z0, svint64x2_t z3, svint64x3_t z5, + svint64x4_t stack1, svint64_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_s64 (p0, x0, stack2); + svst2_s64 (p1, x0, z3); + svst3_s64 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_s64 (pg, x0, -9), + svld2_vnum_s64 (pg, x0, -2), + svld3_vnum_s64 (pg, x0, 0), + svld4_vnum_s64 (pg, x0, 8), + svld1_vnum_s64 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3d\t{z0\.d - z2\.d}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2d\t{z3\.d - z4\.d}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3d\t{z5\.d - z7\.d}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4d\t{(z[0-9]+)\.d - z[0-9]+\.d}.*\tstr\t\1, \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4d\t{z[0-9]+\.d - (z[0-9]+)\.d}.*\tstr\t\1, \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1d\t(z[0-9]+\.d), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1d\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s8.c new file mode 100644 index 0000000..eb7a374 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_s8.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ... +** ldr (z[0-9]+), \[x1, #3, mul vl\] +** ... 
+** st4b {z[0-9]+\.b - \1\.b}, p0, \[x0\] +** st2b {z3\.b - z4\.b}, p1, \[x0\] +** st3b {z5\.b - z7\.b}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svint8x3_t z0, svint8x2_t z3, svint8x3_t z5, + svint8x4_t stack1, svint8_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_s8 (p0, x0, stack1); + svst2_s8 (p1, x0, z3); + svst3_s8 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1b (z[0-9]+\.b), p3/z, \[x2\] +** st1b \1, p0, \[x0\] +** st2b {z3\.b - z4\.b}, p1, \[x0\] +** st3b {z0\.b - z2\.b}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svint8x3_t z0, svint8x2_t z3, svint8x3_t z5, + svint8x4_t stack1, svint8_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_s8 (p0, x0, stack2); + svst2_s8 (p1, x0, z3); + svst3_s8 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_s8 (pg, x0, -9), + svld2_vnum_s8 (pg, x0, -2), + svld3_vnum_s8 (pg, x0, 0), + svld4_vnum_s8 (pg, x0, 8), + svld1_vnum_s8 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3b\t{z0\.b - z2\.b}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2b\t{z3\.b - z4\.b}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3b\t{z5\.b - z7\.b}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4b\t{(z[0-9]+)\.b - z[0-9]+\.b}.*\tstr\t\1, \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4b\t{z[0-9]+\.b - (z[0-9]+)\.b}.*\tstr\t\1, \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1b\t(z[0-9]+\.b), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1b\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u16.c new file mode 100644 index 0000000..0d5b304 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u16.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ... +** ldr (z[0-9]+), \[x1, #3, mul vl\] +** ... 
+** st4h {z[0-9]+\.h - \1\.h}, p0, \[x0\] +** st2h {z3\.h - z4\.h}, p1, \[x0\] +** st3h {z5\.h - z7\.h}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svuint16x3_t z0, svuint16x2_t z3, svuint16x3_t z5, + svuint16x4_t stack1, svuint16_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_u16 (p0, x0, stack1); + svst2_u16 (p1, x0, z3); + svst3_u16 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1h (z[0-9]+\.h), p3/z, \[x2\] +** st1h \1, p0, \[x0\] +** st2h {z3\.h - z4\.h}, p1, \[x0\] +** st3h {z0\.h - z2\.h}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svuint16x3_t z0, svuint16x2_t z3, svuint16x3_t z5, + svuint16x4_t stack1, svuint16_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_u16 (p0, x0, stack2); + svst2_u16 (p1, x0, z3); + svst3_u16 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_u16 (pg, x0, -9), + svld2_vnum_u16 (pg, x0, -2), + svld3_vnum_u16 (pg, x0, 0), + svld4_vnum_u16 (pg, x0, 8), + svld1_vnum_u16 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3h\t{z0\.h - z2\.h}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2h\t{z3\.h - z4\.h}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3h\t{z5\.h - z7\.h}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4h\t{(z[0-9]+)\.h - z[0-9]+\.h}.*\tstr\t\1, \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4h\t{z[0-9]+\.h - (z[0-9]+)\.h}.*\tstr\t\1, \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1h\t(z[0-9]+\.h), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1h\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u32.c new file mode 100644 index 0000000..962ccc9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u32.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ... +** ldr (z[0-9]+), \[x1, #3, mul vl\] +** ... 
+** st4w {z[0-9]+\.s - \1\.s}, p0, \[x0\] +** st2w {z3\.s - z4\.s}, p1, \[x0\] +** st3w {z5\.s - z7\.s}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svuint32x3_t z0, svuint32x2_t z3, svuint32x3_t z5, + svuint32x4_t stack1, svuint32_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_u32 (p0, x0, stack1); + svst2_u32 (p1, x0, z3); + svst3_u32 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1w (z[0-9]+\.s), p3/z, \[x2\] +** st1w \1, p0, \[x0\] +** st2w {z3\.s - z4\.s}, p1, \[x0\] +** st3w {z0\.s - z2\.s}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svuint32x3_t z0, svuint32x2_t z3, svuint32x3_t z5, + svuint32x4_t stack1, svuint32_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_u32 (p0, x0, stack2); + svst2_u32 (p1, x0, z3); + svst3_u32 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_u32 (pg, x0, -9), + svld2_vnum_u32 (pg, x0, -2), + svld3_vnum_u32 (pg, x0, 0), + svld4_vnum_u32 (pg, x0, 8), + svld1_vnum_u32 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3w\t{z0\.s - z2\.s}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2w\t{z3\.s - z4\.s}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3w\t{z5\.s - z7\.s}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4w\t{(z[0-9]+)\.s - z[0-9]+\.s}.*\tstr\t\1, \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4w\t{z[0-9]+\.s - (z[0-9]+)\.s}.*\tstr\t\1, \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1w\t(z[0-9]+\.s), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1w\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u64.c new file mode 100644 index 0000000..930ed96 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u64.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ... +** ldr (z[0-9]+), \[x1, #3, mul vl\] +** ... 
+** st4d {z[0-9]+\.d - \1\.d}, p0, \[x0\] +** st2d {z3\.d - z4\.d}, p1, \[x0\] +** st3d {z5\.d - z7\.d}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svuint64x3_t z0, svuint64x2_t z3, svuint64x3_t z5, + svuint64x4_t stack1, svuint64_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_u64 (p0, x0, stack1); + svst2_u64 (p1, x0, z3); + svst3_u64 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1d (z[0-9]+\.d), p3/z, \[x2\] +** st1d \1, p0, \[x0\] +** st2d {z3\.d - z4\.d}, p1, \[x0\] +** st3d {z0\.d - z2\.d}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svuint64x3_t z0, svuint64x2_t z3, svuint64x3_t z5, + svuint64x4_t stack1, svuint64_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_u64 (p0, x0, stack2); + svst2_u64 (p1, x0, z3); + svst3_u64 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_u64 (pg, x0, -9), + svld2_vnum_u64 (pg, x0, -2), + svld3_vnum_u64 (pg, x0, 0), + svld4_vnum_u64 (pg, x0, 8), + svld1_vnum_u64 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3d\t{z0\.d - z2\.d}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2d\t{z3\.d - z4\.d}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3d\t{z5\.d - z7\.d}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4d\t{(z[0-9]+)\.d - z[0-9]+\.d}.*\tstr\t\1, \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4d\t{z[0-9]+\.d - (z[0-9]+)\.d}.*\tstr\t\1, \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1d\t(z[0-9]+\.d), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1d\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u8.c new file mode 100644 index 0000000..8320843 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_u8.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ... +** ldr (z[0-9]+), \[x1, #3, mul vl\] +** ... 
+** st4b {z[0-9]+\.b - \1\.b}, p0, \[x0\] +** st2b {z3\.b - z4\.b}, p1, \[x0\] +** st3b {z5\.b - z7\.b}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svuint8x3_t z0, svuint8x2_t z3, svuint8x3_t z5, + svuint8x4_t stack1, svuint8_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_u8 (p0, x0, stack1); + svst2_u8 (p1, x0, z3); + svst3_u8 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1b (z[0-9]+\.b), p3/z, \[x2\] +** st1b \1, p0, \[x0\] +** st2b {z3\.b - z4\.b}, p1, \[x0\] +** st3b {z0\.b - z2\.b}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svuint8x3_t z0, svuint8x2_t z3, svuint8x3_t z5, + svuint8x4_t stack1, svuint8_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_u8 (p0, x0, stack2); + svst2_u8 (p1, x0, z3); + svst3_u8 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_u8 (pg, x0, -9), + svld2_vnum_u8 (pg, x0, -2), + svld3_vnum_u8 (pg, x0, 0), + svld4_vnum_u8 (pg, x0, 8), + svld1_vnum_u8 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3b\t{z0\.b - z2\.b}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2b\t{z3\.b - z4\.b}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3b\t{z5\.b - z7\.b}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4b\t{(z[0-9]+)\.b - z[0-9]+\.b}.*\tstr\t\1, \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4b\t{z[0-9]+\.b - (z[0-9]+)\.b}.*\tstr\t\1, \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1b\t(z[0-9]+\.b), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1b\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_7.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_7.c new file mode 100644 index 0000000..99ba248 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_7.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-options "-O -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <arm_sve.h> + +/* +** callee: +** ... +** ldr (x[0-9]+), \[sp\] +** ...
+** ld1b (z[0-9]+\.b), p[1-3]/z, \[\1\] +** st1b \2, p0, \[x0, x7\] +** ret +*/ +void __attribute__((noipa)) +callee (int8_t *x0, int x1, int x2, int x3, + int x4, int x5, svbool_t p0, int x6, int64_t x7, + svint32x4_t z0, svint32x4_t z4, svint8_t stack) +{ + svst1 (p0, x0 + x7, stack); +} + +void __attribute__((noipa)) +caller (int8_t *x0, svbool_t p0, svint32x4_t z0, svint32x4_t z4) +{ + callee (x0, 1, 2, 3, 4, 5, p0, 6, 7, z0, z4, svdup_s8 (42)); +} + +/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.b), #42\n.*\tst1b\t\1, p[0-7], \[(x[0-9]+)\]\n.*\tstr\t\2, \[sp\]\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_8.c new file mode 100644 index 0000000..53aa4cd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_8.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-options "-O -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <arm_sve.h> + +/* +** callee: +** ptrue (p[1-3])\.b, all +** ld1b (z[0-9]+\.b), \1/z, \[x4\] +** st1b \2, p0, \[x0, x7\] +** ret +*/ +void __attribute__((noipa)) +callee (int8_t *x0, int x1, int x2, int x3, + svint32x4_t z0, svint32x4_t z4, svint8_t stack, + int x5, svbool_t p0, int x6, int64_t x7) +{ + svst1 (p0, x0 + x7, stack); +} + +void __attribute__((noipa)) +caller (int8_t *x0, svbool_t p0, svint32x4_t z0, svint32x4_t z4) +{ + callee (x0, 1, 2, 3, z0, z4, svdup_s8 (42), 5, p0, 6, 7); +} + +/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.b), #42\n.*\tst1b\t\1, p[0-7], \[x4\]\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_9.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_9.c new file mode 100644 index 0000000..921ee392 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_9.c @@ -0,0 +1,49 @@ +/* { dg-do compile } */ +/* { dg-options "-O -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <arm_sve.h> + +/* +** callee: +** ldr (x[0-9]+), \[sp, 8\] +** ldr p0, \[\1\] +** ret +*/ +svbool_t __attribute__((noipa)) +callee (svint64x4_t z0, svint16x4_t z4, + svint64_t stack1, svint32_t stack2, + svint16_t stack3, svint8_t stack4, + svuint64_t stack5, svuint32_t stack6, + svuint16_t stack7, svuint8_t stack8, + svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3, + svbool_t stack9, svbool_t stack10) +{ + return stack10; +} + +uint64_t __attribute__((noipa)) +caller (int64_t *x0, int16_t *x1, svbool_t p0) +{ + svbool_t res; + res = callee (svld4 (p0, x0), + svld4 (p0, x1), + svdup_s64 (1), + svdup_s32 (2), + svdup_s16 (3), + svdup_s8 (4), + svdup_u64 (5), + svdup_u32 (6), + svdup_u16 (7), + svdup_u8 (8), + svptrue_pat_b8 (SV_VL5), + svptrue_pat_b16 (SV_VL6), + svptrue_pat_b32 (SV_VL7), + svptrue_pat_b64 (SV_VL8), + svptrue_pat_b8 (SV_MUL3), + svptrue_pat_b16 (SV_MUL3)); + return svcntp_b8 (res, res); +} + +/* { dg-final { scan-assembler {\tptrue\t(p[0-9]+)\.b, mul3\n\tstr\t\1, \[(x[0-9]+)\]\n.*\tstr\t\2, \[sp\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\t(p[0-9]+)\.h, mul3\n\tstr\t\1, \[(x[0-9]+)\]\n.*\tstr\t\2, \[sp, 8\]\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_1.c new file mode 100644 index 0000000..26802c8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_1.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-prune-output "compilation terminated" } */ + +#include <arm_sve.h> + +#pragma GCC target "+nosve" + +svbool_t return_bool (); + +void +f (void) +{ + return_bool (); /* { dg-error {'return_bool' requires the SVE ISA extension}
} */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_2.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_2.c new file mode 100644 index 0000000..663165f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_2.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-prune-output "compilation terminated" } */ + +#include <arm_sve.h> + +#pragma GCC target "+nosve" + +svbool_t return_bool (); + +void +f (svbool_t *ptr) +{ + *ptr = return_bool (); /* { dg-error {'return_bool' requires the SVE ISA extension} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_3.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_3.c new file mode 100644 index 0000000..6d5823c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_3.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-prune-output "compilation terminated" } */ + +#include <arm_sve.h> + +#pragma GCC target "+nosve" + +svbool_t (*return_bool) (); + +void +f (svbool_t *ptr) +{ + *ptr = return_bool (); /* { dg-error {calls to functions of type 'svbool_t\(\)' require the SVE ISA extension} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_4.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_4.c new file mode 100644 index 0000000..81e31cf --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_4.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-prune-output "compilation terminated" } */ + +#include <arm_sve.h> + +#pragma GCC target "+nosve" + +void take_svuint8 (svuint8_t); + +void +f (svuint8_t *ptr) +{ + take_svuint8 (*ptr); /* { dg-error {'take_svuint8' requires the SVE ISA extension} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_5.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_5.c new file mode 100644 index 0000000..300ed00 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_5.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-prune-output "compilation terminated" } */ + +#include <arm_sve.h> + +#pragma GCC target "+nosve" + +void take_svuint8_eventually (float, float, float, float, + float, float, float, float, svuint8_t); + +void +f (svuint8_t *ptr) +{ + take_svuint8_eventually (0, 0, 0, 0, 0, 0, 0, 0, *ptr); /* { dg-error {arguments of type '(svuint8_t|__SVUint8_t)' require the SVE ISA extension} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_6.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_6.c new file mode 100644 index 0000000..4bddf76 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_6.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-prune-output "compilation terminated" } */ + +#include <arm_sve.h> + +#pragma GCC target "+nosve" + +void unprototyped (); + +void +f (svuint8_t *ptr) +{ + unprototyped (*ptr); /* { dg-error {arguments of type '(svuint8_t|__SVUint8_t)' require the SVE ISA extension} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_7.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_7.c new file mode 100644 index 0000000..ef74271 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_7.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-prune-output "compilation terminated" } */ + +#include <arm_sve.h> + +#pragma GCC target "+nosve" + +void f (svuint8_t x) {} /* { dg-error {'f' requires the SVE ISA extension} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_8.c new file mode 100644 index 0000000..45b549f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_8.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* {
dg-prune-output "compilation terminated" } */ + +#include + +#pragma GCC target "+nosve" + +void +f (float a, float b, float c, float d, float e, float f, float g, float h, svuint8_t x) /* { dg-error {arguments of type '(svuint8_t|__SVUint8_t)' require the SVE ISA extension} } */ +{ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1.c new file mode 100644 index 0000000..1210539 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +/* +** callee_pred: +** ldr p0, \[x0\] +** ret +*/ +__SVBool_t __attribute__((noipa)) +callee_pred (__SVBool_t *ptr) +{ + return *ptr; +} + +#include + +/* +** caller_pred: +** ... +** bl callee_pred +** cntp x0, p0, p0.b +** ldp x29, x30, \[sp\], 16 +** ret +*/ +uint64_t __attribute__((noipa)) +caller_pred (__SVBool_t *ptr1) +{ + __SVBool_t p; + p = callee_pred (ptr1); + return svcntp_b8 (p, p); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_1024.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_1024.c new file mode 100644 index 0000000..a44a988 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_1024.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=1024 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +/* +** callee_pred: +** ldr p0, \[x0\] +** ret +*/ +__SVBool_t __attribute__((noipa)) +callee_pred (__SVBool_t *ptr) +{ + return *ptr; +} + +#include + +/* +** caller_pred: +** ... +** bl callee_pred +** cntp x0, p0, p0.b +** ldp x29, x30, \[sp\], 16 +** ret +*/ +uint64_t __attribute__((noipa)) +caller_pred (__SVBool_t *ptr1) +{ + __SVBool_t p = callee_pred (ptr1); + return svcntp_b8 (p, p); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_2048.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_2048.c new file mode 100644 index 0000000..d5030ce --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_2048.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=2048 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +/* +** callee_pred: +** ldr p0, \[x0\] +** ret +*/ +__SVBool_t __attribute__((noipa)) +callee_pred (__SVBool_t *ptr) +{ + return *ptr; +} + +#include + +/* +** caller_pred: +** ... +** bl callee_pred +** cntp x0, p0, p0.b +** ldp x29, x30, \[sp\], 16 +** ret +*/ +uint64_t __attribute__((noipa)) +caller_pred (__SVBool_t *ptr1) +{ + __SVBool_t p = callee_pred (ptr1); + return svcntp_b8 (p, p); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_256.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_256.c new file mode 100644 index 0000000..a59af19 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_256.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=256 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +/* +** callee_pred: +** ldr p0, \[x0\] +** ret +*/ +__SVBool_t __attribute__((noipa)) +callee_pred (__SVBool_t *ptr) +{ + return *ptr; +} + +#include + +/* +** caller_pred: +** ... 
+** bl callee_pred +** cntp x0, p0, p0.b +** ldp x29, x30, \[sp\], 16 +** ret +*/ +uint64_t __attribute__((noipa)) +caller_pred (__SVBool_t *ptr1) +{ + __SVBool_t p = callee_pred (ptr1); + return svcntp_b8 (p, p); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_512.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_512.c new file mode 100644 index 0000000..774c308 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_1_512.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=512 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +/* +** callee_pred: +** ldr p0, \[x0\] +** ret +*/ +__SVBool_t __attribute__((noipa)) +callee_pred (__SVBool_t *ptr) +{ + return *ptr; +} + +#include + +/* +** caller_pred: +** ... +** bl callee_pred +** cntp x0, p0, p0.b +** ldp x29, x30, \[sp\], 16 +** ret +*/ +uint64_t __attribute__((noipa)) +caller_pred (__SVBool_t *ptr1) +{ + __SVBool_t p = callee_pred (ptr1); + return svcntp_b8 (p, p); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_2.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_2.c new file mode 100644 index 0000000..4c0f598 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_2.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +/* +** callee_pred: +** ldr p0, \[x0\] +** ret +*/ +svbool_t __attribute__((noipa)) +callee_pred (svbool_t *ptr) +{ + return *ptr; +} + +/* +** caller_pred: +** ... +** bl callee_pred +** cntp x0, p0, p0.b +** ldp x29, x30, \[sp\], 16 +** ret +*/ +uint64_t __attribute__((noipa)) +caller_pred (svbool_t *ptr1) +{ + svbool_t p; + p = callee_pred (ptr1); + return svcntp_b8 (p, p); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_3.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_3.c new file mode 100644 index 0000000..e9c50b6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_3.c @@ -0,0 +1,34 @@ +/* { dg-do compile } */ +/* { dg-options "-O -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +typedef svbool_t my_pred; + +/* +** callee_pred: +** ldr p0, \[x0\] +** ret +*/ +my_pred __attribute__((noipa)) +callee_pred (my_pred *ptr) +{ + return *ptr; +} + +/* +** caller_pred: +** ... 
+** bl callee_pred +** cntp x0, p0, p0.b +** ldp x29, x30, \[sp\], 16 +** ret +*/ +uint64_t __attribute__((noipa)) +caller_pred (my_pred *ptr1) +{ + my_pred p; + p = callee_pred (ptr1); + return svcntp_b8 (p, p); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4.c new file mode 100644 index 0000000..8167305 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4.c @@ -0,0 +1,237 @@ +/* { dg-do compile } */ +/* { dg-options "-O -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#define CALLEE(SUFFIX, TYPE) \ + TYPE __attribute__((noipa)) \ + callee_##SUFFIX (TYPE *ptr) \ + { \ + return *ptr; \ + } + +/* +** callee_s8: +** ptrue (p[0-7])\.b, all +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (s8, __SVInt8_t) + +/* +** callee_u8: +** ptrue (p[0-7])\.b, all +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (u8, __SVUint8_t) + +/* +** callee_s16: +** ptrue (p[0-7])\.b, all +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (s16, __SVInt16_t) + +/* +** callee_u16: +** ptrue (p[0-7])\.b, all +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (u16, __SVUint16_t) + +/* +** callee_f16: +** ptrue (p[0-7])\.b, all +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (f16, __SVFloat16_t) + +/* +** callee_s32: +** ptrue (p[0-7])\.b, all +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (s32, __SVInt32_t) + +/* +** callee_u32: +** ptrue (p[0-7])\.b, all +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (u32, __SVUint32_t) + +/* +** callee_f32: +** ptrue (p[0-7])\.b, all +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (f32, __SVFloat32_t) + +/* +** callee_s64: +** ptrue (p[0-7])\.b, all +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (s64, __SVInt64_t) + +/* +** callee_u64: +** ptrue (p[0-7])\.b, all +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (u64, __SVUint64_t) + +/* +** callee_f64: +** ptrue (p[0-7])\.b, all +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (f64, __SVFloat64_t) + +#include + +#define CALLER(SUFFIX, TYPE) \ + typeof (svaddv (svptrue_b8 (), *(TYPE *) 0)) \ + __attribute__((noipa)) \ + caller_##SUFFIX (TYPE *ptr1) \ + { \ + return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ + } + +/* +** caller_s8: +** ... +** bl callee_s8 +** ptrue (p[0-7])\.b, all +** saddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s8, __SVInt8_t) + +/* +** caller_u8: +** ... +** bl callee_u8 +** ptrue (p[0-7])\.b, all +** uaddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u8, __SVUint8_t) + +/* +** caller_s16: +** ... +** bl callee_s16 +** ptrue (p[0-7])\.b, all +** saddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s16, __SVInt16_t) + +/* +** caller_u16: +** ... +** bl callee_u16 +** ptrue (p[0-7])\.b, all +** uaddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u16, __SVUint16_t) + +/* +** caller_f16: +** ... +** bl callee_f16 +** ptrue (p[0-7])\.b, all +** faddv h0, \1, z0\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f16, __SVFloat16_t) + +/* +** caller_s32: +** ... +** bl callee_s32 +** ptrue (p[0-7])\.b, all +** saddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s32, __SVInt32_t) + +/* +** caller_u32: +** ... +** bl callee_u32 +** ptrue (p[0-7])\.b, all +** uaddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u32, __SVUint32_t) + +/* +** caller_f32: +** ... 
+** bl callee_f32 +** ptrue (p[0-7])\.b, all +** faddv s0, \1, z0\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f32, __SVFloat32_t) + +/* +** caller_s64: +** ... +** bl callee_s64 +** ptrue (p[0-7])\.b, all +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s64, __SVInt64_t) + +/* +** caller_u64: +** ... +** bl callee_u64 +** ptrue (p[0-7])\.b, all +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u64, __SVUint64_t) + +/* +** caller_f64: +** ... +** bl callee_f64 +** ptrue (p[0-7])\.b, all +** faddv d0, \1, z0\.d +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f64, __SVFloat64_t) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_1024.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_1024.c new file mode 100644 index 0000000..bfbb911 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_1024.c @@ -0,0 +1,237 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=1024 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#define CALLEE(SUFFIX, TYPE) \ + TYPE __attribute__((noipa)) \ + callee_##SUFFIX (TYPE *ptr) \ + { \ + return *ptr; \ + } + +/* +** callee_s8: +** ptrue (p[0-7])\.b, vl128 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (s8, __SVInt8_t) + +/* +** callee_u8: +** ptrue (p[0-7])\.b, vl128 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (u8, __SVUint8_t) + +/* +** callee_s16: +** ptrue (p[0-7])\.b, vl128 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (s16, __SVInt16_t) + +/* +** callee_u16: +** ptrue (p[0-7])\.b, vl128 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (u16, __SVUint16_t) + +/* +** callee_f16: +** ptrue (p[0-7])\.b, vl128 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (f16, __SVFloat16_t) + +/* +** callee_s32: +** ptrue (p[0-7])\.b, vl128 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (s32, __SVInt32_t) + +/* +** callee_u32: +** ptrue (p[0-7])\.b, vl128 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (u32, __SVUint32_t) + +/* +** callee_f32: +** ptrue (p[0-7])\.b, vl128 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (f32, __SVFloat32_t) + +/* +** callee_s64: +** ptrue (p[0-7])\.b, vl128 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (s64, __SVInt64_t) + +/* +** callee_u64: +** ptrue (p[0-7])\.b, vl128 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (u64, __SVUint64_t) + +/* +** callee_f64: +** ptrue (p[0-7])\.b, vl128 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (f64, __SVFloat64_t) + +#include + +#define CALLER(SUFFIX, TYPE) \ + typeof (svaddv (svptrue_b8 (), *(TYPE *) 0)) \ + __attribute__((noipa)) \ + caller_##SUFFIX (TYPE *ptr1) \ + { \ + return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ + } + +/* +** caller_s8: +** ... +** bl callee_s8 +** ptrue (p[0-7])\.b, vl128 +** saddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s8, __SVInt8_t) + +/* +** caller_u8: +** ... +** bl callee_u8 +** ptrue (p[0-7])\.b, vl128 +** uaddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u8, __SVUint8_t) + +/* +** caller_s16: +** ... +** bl callee_s16 +** ptrue (p[0-7])\.b, vl128 +** saddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s16, __SVInt16_t) + +/* +** caller_u16: +** ... +** bl callee_u16 +** ptrue (p[0-7])\.b, vl128 +** uaddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u16, __SVUint16_t) + +/* +** caller_f16: +** ... 
+** bl callee_f16 +** ptrue (p[0-7])\.b, vl128 +** faddv h0, \1, z0\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f16, __SVFloat16_t) + +/* +** caller_s32: +** ... +** bl callee_s32 +** ptrue (p[0-7])\.b, vl128 +** saddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s32, __SVInt32_t) + +/* +** caller_u32: +** ... +** bl callee_u32 +** ptrue (p[0-7])\.b, vl128 +** uaddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u32, __SVUint32_t) + +/* +** caller_f32: +** ... +** bl callee_f32 +** ptrue (p[0-7])\.b, vl128 +** faddv s0, \1, z0\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f32, __SVFloat32_t) + +/* +** caller_s64: +** ... +** bl callee_s64 +** ptrue (p[0-7])\.b, vl128 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s64, __SVInt64_t) + +/* +** caller_u64: +** ... +** bl callee_u64 +** ptrue (p[0-7])\.b, vl128 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u64, __SVUint64_t) + +/* +** caller_f64: +** ... +** bl callee_f64 +** ptrue (p[0-7])\.b, vl128 +** faddv d0, \1, z0\.d +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f64, __SVFloat64_t) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_2048.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_2048.c new file mode 100644 index 0000000..751b1f5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_2048.c @@ -0,0 +1,237 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=2048 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#define CALLEE(SUFFIX, TYPE) \ + TYPE __attribute__((noipa)) \ + callee_##SUFFIX (TYPE *ptr) \ + { \ + return *ptr; \ + } + +/* +** callee_s8: +** ptrue (p[0-7])\.b, vl256 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (s8, __SVInt8_t) + +/* +** callee_u8: +** ptrue (p[0-7])\.b, vl256 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (u8, __SVUint8_t) + +/* +** callee_s16: +** ptrue (p[0-7])\.b, vl256 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (s16, __SVInt16_t) + +/* +** callee_u16: +** ptrue (p[0-7])\.b, vl256 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (u16, __SVUint16_t) + +/* +** callee_f16: +** ptrue (p[0-7])\.b, vl256 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (f16, __SVFloat16_t) + +/* +** callee_s32: +** ptrue (p[0-7])\.b, vl256 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (s32, __SVInt32_t) + +/* +** callee_u32: +** ptrue (p[0-7])\.b, vl256 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (u32, __SVUint32_t) + +/* +** callee_f32: +** ptrue (p[0-7])\.b, vl256 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (f32, __SVFloat32_t) + +/* +** callee_s64: +** ptrue (p[0-7])\.b, vl256 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (s64, __SVInt64_t) + +/* +** callee_u64: +** ptrue (p[0-7])\.b, vl256 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (u64, __SVUint64_t) + +/* +** callee_f64: +** ptrue (p[0-7])\.b, vl256 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (f64, __SVFloat64_t) + +#include + +#define CALLER(SUFFIX, TYPE) \ + typeof (svaddv (svptrue_b8 (), *(TYPE *) 0)) \ + __attribute__((noipa)) \ + caller_##SUFFIX (TYPE *ptr1) \ + { \ + return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ + } + +/* +** caller_s8: +** ... +** bl callee_s8 +** ptrue (p[0-7])\.b, vl256 +** saddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s8, __SVInt8_t) + +/* +** caller_u8: +** ... 
+** bl callee_u8 +** ptrue (p[0-7])\.b, vl256 +** uaddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u8, __SVUint8_t) + +/* +** caller_s16: +** ... +** bl callee_s16 +** ptrue (p[0-7])\.b, vl256 +** saddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s16, __SVInt16_t) + +/* +** caller_u16: +** ... +** bl callee_u16 +** ptrue (p[0-7])\.b, vl256 +** uaddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u16, __SVUint16_t) + +/* +** caller_f16: +** ... +** bl callee_f16 +** ptrue (p[0-7])\.b, vl256 +** faddv h0, \1, z0\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f16, __SVFloat16_t) + +/* +** caller_s32: +** ... +** bl callee_s32 +** ptrue (p[0-7])\.b, vl256 +** saddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s32, __SVInt32_t) + +/* +** caller_u32: +** ... +** bl callee_u32 +** ptrue (p[0-7])\.b, vl256 +** uaddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u32, __SVUint32_t) + +/* +** caller_f32: +** ... +** bl callee_f32 +** ptrue (p[0-7])\.b, vl256 +** faddv s0, \1, z0\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f32, __SVFloat32_t) + +/* +** caller_s64: +** ... +** bl callee_s64 +** ptrue (p[0-7])\.b, vl256 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s64, __SVInt64_t) + +/* +** caller_u64: +** ... +** bl callee_u64 +** ptrue (p[0-7])\.b, vl256 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u64, __SVUint64_t) + +/* +** caller_f64: +** ... +** bl callee_f64 +** ptrue (p[0-7])\.b, vl256 +** faddv d0, \1, z0\.d +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f64, __SVFloat64_t) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_256.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_256.c new file mode 100644 index 0000000..5bc467b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_256.c @@ -0,0 +1,237 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=256 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#define CALLEE(SUFFIX, TYPE) \ + TYPE __attribute__((noipa)) \ + callee_##SUFFIX (TYPE *ptr) \ + { \ + return *ptr; \ + } + +/* +** callee_s8: +** ptrue (p[0-7])\.b, vl32 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (s8, __SVInt8_t) + +/* +** callee_u8: +** ptrue (p[0-7])\.b, vl32 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (u8, __SVUint8_t) + +/* +** callee_s16: +** ptrue (p[0-7])\.b, vl32 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (s16, __SVInt16_t) + +/* +** callee_u16: +** ptrue (p[0-7])\.b, vl32 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (u16, __SVUint16_t) + +/* +** callee_f16: +** ptrue (p[0-7])\.b, vl32 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (f16, __SVFloat16_t) + +/* +** callee_s32: +** ptrue (p[0-7])\.b, vl32 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (s32, __SVInt32_t) + +/* +** callee_u32: +** ptrue (p[0-7])\.b, vl32 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (u32, __SVUint32_t) + +/* +** callee_f32: +** ptrue (p[0-7])\.b, vl32 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (f32, __SVFloat32_t) + +/* +** callee_s64: +** ptrue (p[0-7])\.b, vl32 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (s64, __SVInt64_t) + +/* +** callee_u64: +** ptrue (p[0-7])\.b, vl32 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (u64, __SVUint64_t) + +/* +** callee_f64: 
+** ptrue (p[0-7])\.b, vl32 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (f64, __SVFloat64_t) + +#include + +#define CALLER(SUFFIX, TYPE) \ + typeof (svaddv (svptrue_b8 (), *(TYPE *) 0)) \ + __attribute__((noipa)) \ + caller_##SUFFIX (TYPE *ptr1) \ + { \ + return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ + } + +/* +** caller_s8: +** ... +** bl callee_s8 +** ptrue (p[0-7])\.b, vl32 +** saddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s8, __SVInt8_t) + +/* +** caller_u8: +** ... +** bl callee_u8 +** ptrue (p[0-7])\.b, vl32 +** uaddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u8, __SVUint8_t) + +/* +** caller_s16: +** ... +** bl callee_s16 +** ptrue (p[0-7])\.b, vl32 +** saddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s16, __SVInt16_t) + +/* +** caller_u16: +** ... +** bl callee_u16 +** ptrue (p[0-7])\.b, vl32 +** uaddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u16, __SVUint16_t) + +/* +** caller_f16: +** ... +** bl callee_f16 +** ptrue (p[0-7])\.b, vl32 +** faddv h0, \1, z0\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f16, __SVFloat16_t) + +/* +** caller_s32: +** ... +** bl callee_s32 +** ptrue (p[0-7])\.b, vl32 +** saddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s32, __SVInt32_t) + +/* +** caller_u32: +** ... +** bl callee_u32 +** ptrue (p[0-7])\.b, vl32 +** uaddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u32, __SVUint32_t) + +/* +** caller_f32: +** ... +** bl callee_f32 +** ptrue (p[0-7])\.b, vl32 +** faddv s0, \1, z0\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f32, __SVFloat32_t) + +/* +** caller_s64: +** ... +** bl callee_s64 +** ptrue (p[0-7])\.b, vl32 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s64, __SVInt64_t) + +/* +** caller_u64: +** ... +** bl callee_u64 +** ptrue (p[0-7])\.b, vl32 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u64, __SVUint64_t) + +/* +** caller_f64: +** ... 
+** bl callee_f64 +** ptrue (p[0-7])\.b, vl32 +** faddv d0, \1, z0\.d +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f64, __SVFloat64_t) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_512.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_512.c new file mode 100644 index 0000000..46b38ac --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_512.c @@ -0,0 +1,237 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=512 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#define CALLEE(SUFFIX, TYPE) \ + TYPE __attribute__((noipa)) \ + callee_##SUFFIX (TYPE *ptr) \ + { \ + return *ptr; \ + } + +/* +** callee_s8: +** ptrue (p[0-7])\.b, vl64 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (s8, __SVInt8_t) + +/* +** callee_u8: +** ptrue (p[0-7])\.b, vl64 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (u8, __SVUint8_t) + +/* +** callee_s16: +** ptrue (p[0-7])\.b, vl64 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (s16, __SVInt16_t) + +/* +** callee_u16: +** ptrue (p[0-7])\.b, vl64 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (u16, __SVUint16_t) + +/* +** callee_f16: +** ptrue (p[0-7])\.b, vl64 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (f16, __SVFloat16_t) + +/* +** callee_s32: +** ptrue (p[0-7])\.b, vl64 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (s32, __SVInt32_t) + +/* +** callee_u32: +** ptrue (p[0-7])\.b, vl64 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (u32, __SVUint32_t) + +/* +** callee_f32: +** ptrue (p[0-7])\.b, vl64 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (f32, __SVFloat32_t) + +/* +** callee_s64: +** ptrue (p[0-7])\.b, vl64 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (s64, __SVInt64_t) + +/* +** callee_u64: +** ptrue (p[0-7])\.b, vl64 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (u64, __SVUint64_t) + +/* +** callee_f64: +** ptrue (p[0-7])\.b, vl64 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (f64, __SVFloat64_t) + +#include + +#define CALLER(SUFFIX, TYPE) \ + typeof (svaddv (svptrue_b8 (), *(TYPE *) 0)) \ + __attribute__((noipa)) \ + caller_##SUFFIX (TYPE *ptr1) \ + { \ + return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ + } + +/* +** caller_s8: +** ... +** bl callee_s8 +** ptrue (p[0-7])\.b, vl64 +** saddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s8, __SVInt8_t) + +/* +** caller_u8: +** ... +** bl callee_u8 +** ptrue (p[0-7])\.b, vl64 +** uaddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u8, __SVUint8_t) + +/* +** caller_s16: +** ... +** bl callee_s16 +** ptrue (p[0-7])\.b, vl64 +** saddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s16, __SVInt16_t) + +/* +** caller_u16: +** ... +** bl callee_u16 +** ptrue (p[0-7])\.b, vl64 +** uaddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u16, __SVUint16_t) + +/* +** caller_f16: +** ... +** bl callee_f16 +** ptrue (p[0-7])\.b, vl64 +** faddv h0, \1, z0\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f16, __SVFloat16_t) + +/* +** caller_s32: +** ... +** bl callee_s32 +** ptrue (p[0-7])\.b, vl64 +** saddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s32, __SVInt32_t) + +/* +** caller_u32: +** ... +** bl callee_u32 +** ptrue (p[0-7])\.b, vl64 +** uaddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u32, __SVUint32_t) + +/* +** caller_f32: +** ... 
+** bl callee_f32 +** ptrue (p[0-7])\.b, vl64 +** faddv s0, \1, z0\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f32, __SVFloat32_t) + +/* +** caller_s64: +** ... +** bl callee_s64 +** ptrue (p[0-7])\.b, vl64 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s64, __SVInt64_t) + +/* +** caller_u64: +** ... +** bl callee_u64 +** ptrue (p[0-7])\.b, vl64 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u64, __SVUint64_t) + +/* +** caller_f64: +** ... +** bl callee_f64 +** ptrue (p[0-7])\.b, vl64 +** faddv d0, \1, z0\.d +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f64, __SVFloat64_t) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5.c new file mode 100644 index 0000000..becabd9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5.c @@ -0,0 +1,237 @@ +/* { dg-do compile } */ +/* { dg-options "-O -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +#define CALLEE(SUFFIX, TYPE) \ + TYPE __attribute__((noipa)) \ + callee_##SUFFIX (TYPE *ptr) \ + { \ + return *ptr; \ + } + +/* +** callee_s8: +** ptrue (p[0-7])\.b, all +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (s8, svint8_t) + +/* +** callee_u8: +** ptrue (p[0-7])\.b, all +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (u8, svuint8_t) + +/* +** callee_s16: +** ptrue (p[0-7])\.b, all +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (s16, svint16_t) + +/* +** callee_u16: +** ptrue (p[0-7])\.b, all +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (u16, svuint16_t) + +/* +** callee_f16: +** ptrue (p[0-7])\.b, all +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (f16, svfloat16_t) + +/* +** callee_s32: +** ptrue (p[0-7])\.b, all +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (s32, svint32_t) + +/* +** callee_u32: +** ptrue (p[0-7])\.b, all +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (u32, svuint32_t) + +/* +** callee_f32: +** ptrue (p[0-7])\.b, all +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (f32, svfloat32_t) + +/* +** callee_s64: +** ptrue (p[0-7])\.b, all +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (s64, svint64_t) + +/* +** callee_u64: +** ptrue (p[0-7])\.b, all +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (u64, svuint64_t) + +/* +** callee_f64: +** ptrue (p[0-7])\.b, all +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (f64, svfloat64_t) + +#define CALLER(SUFFIX, TYPE) \ + typeof (svaddv (svptrue_b8 (), *(TYPE *) 0)) \ + __attribute__((noipa)) \ + caller_##SUFFIX (TYPE *ptr1) \ + { \ + return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ + } + +/* +** caller_s8: +** ... +** bl callee_s8 +** ptrue (p[0-7])\.b, all +** saddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s8, svint8_t) + +/* +** caller_u8: +** ... +** bl callee_u8 +** ptrue (p[0-7])\.b, all +** uaddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u8, svuint8_t) + +/* +** caller_s16: +** ... +** bl callee_s16 +** ptrue (p[0-7])\.b, all +** saddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s16, svint16_t) + +/* +** caller_u16: +** ... +** bl callee_u16 +** ptrue (p[0-7])\.b, all +** uaddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u16, svuint16_t) + +/* +** caller_f16: +** ... 
+** bl callee_f16 +** ptrue (p[0-7])\.b, all +** faddv h0, \1, z0\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f16, svfloat16_t) + +/* +** caller_s32: +** ... +** bl callee_s32 +** ptrue (p[0-7])\.b, all +** saddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s32, svint32_t) + +/* +** caller_u32: +** ... +** bl callee_u32 +** ptrue (p[0-7])\.b, all +** uaddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u32, svuint32_t) + +/* +** caller_f32: +** ... +** bl callee_f32 +** ptrue (p[0-7])\.b, all +** faddv s0, \1, z0\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f32, svfloat32_t) + +/* +** caller_s64: +** ... +** bl callee_s64 +** ptrue (p[0-7])\.b, all +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s64, svint64_t) + +/* +** caller_u64: +** ... +** bl callee_u64 +** ptrue (p[0-7])\.b, all +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u64, svuint64_t) + +/* +** caller_f64: +** ... +** bl callee_f64 +** ptrue (p[0-7])\.b, all +** faddv d0, \1, z0\.d +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f64, svfloat64_t) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_1024.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_1024.c new file mode 100644 index 0000000..f2a3fd5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_1024.c @@ -0,0 +1,237 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=1024 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +#define CALLEE(SUFFIX, TYPE) \ + TYPE __attribute__((noipa)) \ + callee_##SUFFIX (TYPE *ptr) \ + { \ + return *ptr; \ + } + +/* +** callee_s8: +** ptrue (p[0-7])\.b, vl128 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (s8, svint8_t) + +/* +** callee_u8: +** ptrue (p[0-7])\.b, vl128 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (u8, svuint8_t) + +/* +** callee_s16: +** ptrue (p[0-7])\.b, vl128 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (s16, svint16_t) + +/* +** callee_u16: +** ptrue (p[0-7])\.b, vl128 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (u16, svuint16_t) + +/* +** callee_f16: +** ptrue (p[0-7])\.b, vl128 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (f16, svfloat16_t) + +/* +** callee_s32: +** ptrue (p[0-7])\.b, vl128 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (s32, svint32_t) + +/* +** callee_u32: +** ptrue (p[0-7])\.b, vl128 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (u32, svuint32_t) + +/* +** callee_f32: +** ptrue (p[0-7])\.b, vl128 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (f32, svfloat32_t) + +/* +** callee_s64: +** ptrue (p[0-7])\.b, vl128 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (s64, svint64_t) + +/* +** callee_u64: +** ptrue (p[0-7])\.b, vl128 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (u64, svuint64_t) + +/* +** callee_f64: +** ptrue (p[0-7])\.b, vl128 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (f64, svfloat64_t) + +#define CALLER(SUFFIX, TYPE) \ + typeof (svaddv (svptrue_b8 (), *(TYPE *) 0)) \ + __attribute__((noipa)) \ + caller_##SUFFIX (TYPE *ptr1) \ + { \ + return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ + } + +/* +** caller_s8: +** ... +** bl callee_s8 +** ptrue (p[0-7])\.b, vl128 +** saddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s8, svint8_t) + +/* +** caller_u8: +** ... 
+** bl callee_u8 +** ptrue (p[0-7])\.b, vl128 +** uaddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u8, svuint8_t) + +/* +** caller_s16: +** ... +** bl callee_s16 +** ptrue (p[0-7])\.b, vl128 +** saddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s16, svint16_t) + +/* +** caller_u16: +** ... +** bl callee_u16 +** ptrue (p[0-7])\.b, vl128 +** uaddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u16, svuint16_t) + +/* +** caller_f16: +** ... +** bl callee_f16 +** ptrue (p[0-7])\.b, vl128 +** faddv h0, \1, z0\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f16, svfloat16_t) + +/* +** caller_s32: +** ... +** bl callee_s32 +** ptrue (p[0-7])\.b, vl128 +** saddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s32, svint32_t) + +/* +** caller_u32: +** ... +** bl callee_u32 +** ptrue (p[0-7])\.b, vl128 +** uaddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u32, svuint32_t) + +/* +** caller_f32: +** ... +** bl callee_f32 +** ptrue (p[0-7])\.b, vl128 +** faddv s0, \1, z0\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f32, svfloat32_t) + +/* +** caller_s64: +** ... +** bl callee_s64 +** ptrue (p[0-7])\.b, vl128 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s64, svint64_t) + +/* +** caller_u64: +** ... +** bl callee_u64 +** ptrue (p[0-7])\.b, vl128 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u64, svuint64_t) + +/* +** caller_f64: +** ... +** bl callee_f64 +** ptrue (p[0-7])\.b, vl128 +** faddv d0, \1, z0\.d +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f64, svfloat64_t) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_2048.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_2048.c new file mode 100644 index 0000000..0875acc --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_2048.c @@ -0,0 +1,237 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=2048 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +#define CALLEE(SUFFIX, TYPE) \ + TYPE __attribute__((noipa)) \ + callee_##SUFFIX (TYPE *ptr) \ + { \ + return *ptr; \ + } + +/* +** callee_s8: +** ptrue (p[0-7])\.b, vl256 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (s8, svint8_t) + +/* +** callee_u8: +** ptrue (p[0-7])\.b, vl256 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (u8, svuint8_t) + +/* +** callee_s16: +** ptrue (p[0-7])\.b, vl256 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (s16, svint16_t) + +/* +** callee_u16: +** ptrue (p[0-7])\.b, vl256 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (u16, svuint16_t) + +/* +** callee_f16: +** ptrue (p[0-7])\.b, vl256 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (f16, svfloat16_t) + +/* +** callee_s32: +** ptrue (p[0-7])\.b, vl256 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (s32, svint32_t) + +/* +** callee_u32: +** ptrue (p[0-7])\.b, vl256 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (u32, svuint32_t) + +/* +** callee_f32: +** ptrue (p[0-7])\.b, vl256 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (f32, svfloat32_t) + +/* +** callee_s64: +** ptrue (p[0-7])\.b, vl256 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (s64, svint64_t) + +/* +** callee_u64: +** ptrue (p[0-7])\.b, vl256 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (u64, svuint64_t) + +/* +** callee_f64: +** ptrue 
(p[0-7])\.b, vl256 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (f64, svfloat64_t) + +#define CALLER(SUFFIX, TYPE) \ + typeof (svaddv (svptrue_b8 (), *(TYPE *) 0)) \ + __attribute__((noipa)) \ + caller_##SUFFIX (TYPE *ptr1) \ + { \ + return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ + } + +/* +** caller_s8: +** ... +** bl callee_s8 +** ptrue (p[0-7])\.b, vl256 +** saddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s8, svint8_t) + +/* +** caller_u8: +** ... +** bl callee_u8 +** ptrue (p[0-7])\.b, vl256 +** uaddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u8, svuint8_t) + +/* +** caller_s16: +** ... +** bl callee_s16 +** ptrue (p[0-7])\.b, vl256 +** saddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s16, svint16_t) + +/* +** caller_u16: +** ... +** bl callee_u16 +** ptrue (p[0-7])\.b, vl256 +** uaddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u16, svuint16_t) + +/* +** caller_f16: +** ... +** bl callee_f16 +** ptrue (p[0-7])\.b, vl256 +** faddv h0, \1, z0\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f16, svfloat16_t) + +/* +** caller_s32: +** ... +** bl callee_s32 +** ptrue (p[0-7])\.b, vl256 +** saddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s32, svint32_t) + +/* +** caller_u32: +** ... +** bl callee_u32 +** ptrue (p[0-7])\.b, vl256 +** uaddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u32, svuint32_t) + +/* +** caller_f32: +** ... +** bl callee_f32 +** ptrue (p[0-7])\.b, vl256 +** faddv s0, \1, z0\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f32, svfloat32_t) + +/* +** caller_s64: +** ... +** bl callee_s64 +** ptrue (p[0-7])\.b, vl256 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s64, svint64_t) + +/* +** caller_u64: +** ... +** bl callee_u64 +** ptrue (p[0-7])\.b, vl256 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u64, svuint64_t) + +/* +** caller_f64: +** ... 
+** bl callee_f64 +** ptrue (p[0-7])\.b, vl256 +** faddv d0, \1, z0\.d +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f64, svfloat64_t) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_256.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_256.c new file mode 100644 index 0000000..bcd052b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_256.c @@ -0,0 +1,237 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=256 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +#define CALLEE(SUFFIX, TYPE) \ + TYPE __attribute__((noipa)) \ + callee_##SUFFIX (TYPE *ptr) \ + { \ + return *ptr; \ + } + +/* +** callee_s8: +** ptrue (p[0-7])\.b, vl32 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (s8, svint8_t) + +/* +** callee_u8: +** ptrue (p[0-7])\.b, vl32 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (u8, svuint8_t) + +/* +** callee_s16: +** ptrue (p[0-7])\.b, vl32 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (s16, svint16_t) + +/* +** callee_u16: +** ptrue (p[0-7])\.b, vl32 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (u16, svuint16_t) + +/* +** callee_f16: +** ptrue (p[0-7])\.b, vl32 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (f16, svfloat16_t) + +/* +** callee_s32: +** ptrue (p[0-7])\.b, vl32 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (s32, svint32_t) + +/* +** callee_u32: +** ptrue (p[0-7])\.b, vl32 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (u32, svuint32_t) + +/* +** callee_f32: +** ptrue (p[0-7])\.b, vl32 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (f32, svfloat32_t) + +/* +** callee_s64: +** ptrue (p[0-7])\.b, vl32 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (s64, svint64_t) + +/* +** callee_u64: +** ptrue (p[0-7])\.b, vl32 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (u64, svuint64_t) + +/* +** callee_f64: +** ptrue (p[0-7])\.b, vl32 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (f64, svfloat64_t) + +#define CALLER(SUFFIX, TYPE) \ + typeof (svaddv (svptrue_b8 (), *(TYPE *) 0)) \ + __attribute__((noipa)) \ + caller_##SUFFIX (TYPE *ptr1) \ + { \ + return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ + } + +/* +** caller_s8: +** ... +** bl callee_s8 +** ptrue (p[0-7])\.b, vl32 +** saddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s8, svint8_t) + +/* +** caller_u8: +** ... +** bl callee_u8 +** ptrue (p[0-7])\.b, vl32 +** uaddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u8, svuint8_t) + +/* +** caller_s16: +** ... +** bl callee_s16 +** ptrue (p[0-7])\.b, vl32 +** saddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s16, svint16_t) + +/* +** caller_u16: +** ... +** bl callee_u16 +** ptrue (p[0-7])\.b, vl32 +** uaddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u16, svuint16_t) + +/* +** caller_f16: +** ... +** bl callee_f16 +** ptrue (p[0-7])\.b, vl32 +** faddv h0, \1, z0\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f16, svfloat16_t) + +/* +** caller_s32: +** ... +** bl callee_s32 +** ptrue (p[0-7])\.b, vl32 +** saddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s32, svint32_t) + +/* +** caller_u32: +** ... +** bl callee_u32 +** ptrue (p[0-7])\.b, vl32 +** uaddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u32, svuint32_t) + +/* +** caller_f32: +** ... 
+** bl callee_f32 +** ptrue (p[0-7])\.b, vl32 +** faddv s0, \1, z0\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f32, svfloat32_t) + +/* +** caller_s64: +** ... +** bl callee_s64 +** ptrue (p[0-7])\.b, vl32 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s64, svint64_t) + +/* +** caller_u64: +** ... +** bl callee_u64 +** ptrue (p[0-7])\.b, vl32 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u64, svuint64_t) + +/* +** caller_f64: +** ... +** bl callee_f64 +** ptrue (p[0-7])\.b, vl32 +** faddv d0, \1, z0\.d +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f64, svfloat64_t) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_512.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_512.c new file mode 100644 index 0000000..2122c32 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_512.c @@ -0,0 +1,237 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=512 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +#define CALLEE(SUFFIX, TYPE) \ + TYPE __attribute__((noipa)) \ + callee_##SUFFIX (TYPE *ptr) \ + { \ + return *ptr; \ + } + +/* +** callee_s8: +** ptrue (p[0-7])\.b, vl64 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (s8, svint8_t) + +/* +** callee_u8: +** ptrue (p[0-7])\.b, vl64 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (u8, svuint8_t) + +/* +** callee_s16: +** ptrue (p[0-7])\.b, vl64 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (s16, svint16_t) + +/* +** callee_u16: +** ptrue (p[0-7])\.b, vl64 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (u16, svuint16_t) + +/* +** callee_f16: +** ptrue (p[0-7])\.b, vl64 +** ld1h z0\.h, \1/z, \[x0\] +** ret +*/ +CALLEE (f16, svfloat16_t) + +/* +** callee_s32: +** ptrue (p[0-7])\.b, vl64 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (s32, svint32_t) + +/* +** callee_u32: +** ptrue (p[0-7])\.b, vl64 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (u32, svuint32_t) + +/* +** callee_f32: +** ptrue (p[0-7])\.b, vl64 +** ld1w z0\.s, \1/z, \[x0\] +** ret +*/ +CALLEE (f32, svfloat32_t) + +/* +** callee_s64: +** ptrue (p[0-7])\.b, vl64 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (s64, svint64_t) + +/* +** callee_u64: +** ptrue (p[0-7])\.b, vl64 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (u64, svuint64_t) + +/* +** callee_f64: +** ptrue (p[0-7])\.b, vl64 +** ld1d z0\.d, \1/z, \[x0\] +** ret +*/ +CALLEE (f64, svfloat64_t) + +#define CALLER(SUFFIX, TYPE) \ + typeof (svaddv (svptrue_b8 (), *(TYPE *) 0)) \ + __attribute__((noipa)) \ + caller_##SUFFIX (TYPE *ptr1) \ + { \ + return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ + } + +/* +** caller_s8: +** ... +** bl callee_s8 +** ptrue (p[0-7])\.b, vl64 +** saddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s8, svint8_t) + +/* +** caller_u8: +** ... +** bl callee_u8 +** ptrue (p[0-7])\.b, vl64 +** uaddv (d[0-9]+), \1, z0\.b +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u8, svuint8_t) + +/* +** caller_s16: +** ... +** bl callee_s16 +** ptrue (p[0-7])\.b, vl64 +** saddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s16, svint16_t) + +/* +** caller_u16: +** ... +** bl callee_u16 +** ptrue (p[0-7])\.b, vl64 +** uaddv (d[0-9]+), \1, z0\.h +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u16, svuint16_t) + +/* +** caller_f16: +** ... 
+** bl callee_f16 +** ptrue (p[0-7])\.b, vl64 +** faddv h0, \1, z0\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f16, svfloat16_t) + +/* +** caller_s32: +** ... +** bl callee_s32 +** ptrue (p[0-7])\.b, vl64 +** saddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s32, svint32_t) + +/* +** caller_u32: +** ... +** bl callee_u32 +** ptrue (p[0-7])\.b, vl64 +** uaddv (d[0-9]+), \1, z0\.s +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u32, svuint32_t) + +/* +** caller_f32: +** ... +** bl callee_f32 +** ptrue (p[0-7])\.b, vl64 +** faddv s0, \1, z0\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f32, svfloat32_t) + +/* +** caller_s64: +** ... +** bl callee_s64 +** ptrue (p[0-7])\.b, vl64 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (s64, svint64_t) + +/* +** caller_u64: +** ... +** bl callee_u64 +** ptrue (p[0-7])\.b, vl64 +** uaddv (d[0-9]+), \1, z0\.d +** fmov x0, \2 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (u64, svuint64_t) + +/* +** caller_f64: +** ... +** bl callee_f64 +** ptrue (p[0-7])\.b, vl64 +** faddv d0, \1, z0\.d +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER (f64, svfloat64_t) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6.c new file mode 100644 index 0000000..33bb2d9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6.c @@ -0,0 +1,258 @@ +/* { dg-do compile } */ +/* { dg-options "-O -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +typedef int8_t svint8_t __attribute__ ((vector_size (32))); +typedef uint8_t svuint8_t __attribute__ ((vector_size (32))); + +typedef int16_t svint16_t __attribute__ ((vector_size (32))); +typedef uint16_t svuint16_t __attribute__ ((vector_size (32))); +typedef __fp16 svfloat16_t __attribute__ ((vector_size (32))); + +typedef int32_t svint32_t __attribute__ ((vector_size (32))); +typedef uint32_t svuint32_t __attribute__ ((vector_size (32))); +typedef float svfloat32_t __attribute__ ((vector_size (32))); + +typedef int64_t svint64_t __attribute__ ((vector_size (32))); +typedef uint64_t svuint64_t __attribute__ ((vector_size (32))); +typedef double svfloat64_t __attribute__ ((vector_size (32))); + +#define CALLEE(SUFFIX, TYPE) \ + TYPE __attribute__((noipa)) \ + callee_##SUFFIX (TYPE *ptr) \ + { \ + return *ptr; \ + } + +/* +** callee_s8: +** ( +** ld1 ({v.*}), \[x0\] +** st1 \1, \[x8\] +** | +** ldp (q[0-9]+, q[0-9]+), \[x0\] +** stp \2, \[x8\] +** ) +** ret +*/ +CALLEE (s8, svint8_t) + +/* +** callee_u8: +** ( +** ld1 ({v.*}), \[x0\] +** st1 \1, \[x8\] +** | +** ldp (q[0-9]+, q[0-9]+), \[x0\] +** stp \2, \[x8\] +** ) +** ret +*/ +CALLEE (u8, svuint8_t) + +/* +** callee_s16: +** ( +** ld1 ({v.*}), \[x0\] +** st1 \1, \[x8\] +** | +** ldp (q[0-9]+, q[0-9]+), \[x0\] +** stp \2, \[x8\] +** ) +** ret +*/ +CALLEE (s16, svint16_t) + +/* +** callee_u16: +** ( +** ld1 ({v.*}), \[x0\] +** st1 \1, \[x8\] +** | +** ldp (q[0-9]+, q[0-9]+), \[x0\] +** stp \2, \[x8\] +** ) +** ret +*/ +CALLEE (u16, svuint16_t) + +/* Currently we scalarize this. 
*/ +CALLEE (f16, svfloat16_t) + +/* +** callee_s32: +** ( +** ld1 ({v.*}), \[x0\] +** st1 \1, \[x8\] +** | +** ldp (q[0-9]+, q[0-9]+), \[x0\] +** stp \2, \[x8\] +** ) +** ret +*/ +CALLEE (s32, svint32_t) + +/* +** callee_u32: +** ( +** ld1 ({v.*}), \[x0\] +** st1 \1, \[x8\] +** | +** ldp (q[0-9]+, q[0-9]+), \[x0\] +** stp \2, \[x8\] +** ) +** ret +*/ +CALLEE (u32, svuint32_t) + +/* Currently we scalarize this. */ +CALLEE (f32, svfloat32_t) + +/* +** callee_s64: +** ( +** ld1 ({v.*}), \[x0\] +** st1 \1, \[x8\] +** | +** ldp (q[0-9]+, q[0-9]+), \[x0\] +** stp \2, \[x8\] +** ) +** ret +*/ +CALLEE (s64, svint64_t) + +/* +** callee_u64: +** ( +** ld1 ({v.*}), \[x0\] +** st1 \1, \[x8\] +** | +** ldp (q[0-9]+, q[0-9]+), \[x0\] +** stp \2, \[x8\] +** ) +** ret +*/ +CALLEE (u64, svuint64_t) + +/* Currently we scalarize this. */ +CALLEE (f64, svfloat64_t) + +#define CALLER(SUFFIX, TYPE) \ + typeof ((*(TYPE *) 0)[0]) \ + __attribute__((noipa)) \ + caller_##SUFFIX (TYPE *ptr1) \ + { \ + return callee_##SUFFIX (ptr1)[0]; \ + } + +/* +** caller_s8: +** ... +** bl callee_s8 +** ldrb w0, \[sp, 16\] +** ldp x29, x30, \[sp\], 48 +** ret +*/ +CALLER (s8, svint8_t) + +/* +** caller_u8: +** ... +** bl callee_u8 +** ldrb w0, \[sp, 16\] +** ldp x29, x30, \[sp\], 48 +** ret +*/ +CALLER (u8, svuint8_t) + +/* +** caller_s16: +** ... +** bl callee_s16 +** ldrh w0, \[sp, 16\] +** ldp x29, x30, \[sp\], 48 +** ret +*/ +CALLER (s16, svint16_t) + +/* +** caller_u16: +** ... +** bl callee_u16 +** ldrh w0, \[sp, 16\] +** ldp x29, x30, \[sp\], 48 +** ret +*/ +CALLER (u16, svuint16_t) + +/* +** caller_f16: +** ... +** bl callee_f16 +** ldr h0, \[sp, 16\] +** ldp x29, x30, \[sp\], 48 +** ret +*/ +CALLER (f16, svfloat16_t) + +/* +** caller_s32: +** ... +** bl callee_s32 +** ldr w0, \[sp, 16\] +** ldp x29, x30, \[sp\], 48 +** ret +*/ +CALLER (s32, svint32_t) + +/* +** caller_u32: +** ... +** bl callee_u32 +** ldr w0, \[sp, 16\] +** ldp x29, x30, \[sp\], 48 +** ret +*/ +CALLER (u32, svuint32_t) + +/* +** caller_f32: +** ... +** bl callee_f32 +** ldr s0, \[sp, 16\] +** ldp x29, x30, \[sp\], 48 +** ret +*/ +CALLER (f32, svfloat32_t) + +/* +** caller_s64: +** ... +** bl callee_s64 +** ldr x0, \[sp, 16\] +** ldp x29, x30, \[sp\], 48 +** ret +*/ +CALLER (s64, svint64_t) + +/* +** caller_u64: +** ... +** bl callee_u64 +** ldr x0, \[sp, 16\] +** ldp x29, x30, \[sp\], 48 +** ret +*/ +CALLER (u64, svuint64_t) + +/* +** caller_f64: +** ... 
+** bl callee_f64 +** ldr d0, \[sp, 16\] +** ldp x29, x30, \[sp\], 48 +** ret +*/ +CALLER (f64, svfloat64_t) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_1024.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_1024.c new file mode 100644 index 0000000..696f26a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_1024.c @@ -0,0 +1,265 @@ +/* { dg-do compile } */ +/* { dg-options "-O -msve-vector-bits=1024 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +typedef int8_t svint8_t __attribute__ ((vector_size (128))); +typedef uint8_t svuint8_t __attribute__ ((vector_size (128))); + +typedef int16_t svint16_t __attribute__ ((vector_size (128))); +typedef uint16_t svuint16_t __attribute__ ((vector_size (128))); +typedef __fp16 svfloat16_t __attribute__ ((vector_size (128))); + +typedef int32_t svint32_t __attribute__ ((vector_size (128))); +typedef uint32_t svuint32_t __attribute__ ((vector_size (128))); +typedef float svfloat32_t __attribute__ ((vector_size (128))); + +typedef int64_t svint64_t __attribute__ ((vector_size (128))); +typedef uint64_t svuint64_t __attribute__ ((vector_size (128))); +typedef double svfloat64_t __attribute__ ((vector_size (128))); + +#define CALLEE(SUFFIX, TYPE) \ + TYPE __attribute__((noipa)) \ + callee_##SUFFIX (TYPE *ptr) \ + { \ + return *ptr; \ + } + +/* +** callee_s8: +** ptrue (p[0-7])\.b, vl128 +** ld1b z0\.b, \1/z, \[x0\] +** st1b z0\.b, \1, \[x8\] +** ret +*/ +CALLEE (s8, svint8_t) + +/* +** callee_u8: +** ptrue (p[0-7])\.b, vl128 +** ld1b z0\.b, \1/z, \[x0\] +** st1b z0\.b, \1, \[x8\] +** ret +*/ +CALLEE (u8, svuint8_t) + +/* +** callee_s16: +** ptrue (p[0-7])\.b, vl128 +** ld1h z0\.h, \1/z, \[x0\] +** st1h z0\.h, \1, \[x8\] +** ret +*/ +CALLEE (s16, svint16_t) + +/* +** callee_u16: +** ptrue (p[0-7])\.b, vl128 +** ld1h z0\.h, \1/z, \[x0\] +** st1h z0\.h, \1, \[x8\] +** ret +*/ +CALLEE (u16, svuint16_t) + +/* +** callee_f16: +** ptrue (p[0-7])\.b, vl128 +** ld1h z0\.h, \1/z, \[x0\] +** st1h z0\.h, \1, \[x8\] +** ret +*/ +CALLEE (f16, svfloat16_t) + +/* +** callee_s32: +** ptrue (p[0-7])\.b, vl128 +** ld1w z0\.s, \1/z, \[x0\] +** st1w z0\.s, \1, \[x8\] +** ret +*/ +CALLEE (s32, svint32_t) + +/* +** callee_u32: +** ptrue (p[0-7])\.b, vl128 +** ld1w z0\.s, \1/z, \[x0\] +** st1w z0\.s, \1, \[x8\] +** ret +*/ +CALLEE (u32, svuint32_t) + +/* +** callee_f32: +** ptrue (p[0-7])\.b, vl128 +** ld1w z0\.s, \1/z, \[x0\] +** st1w z0\.s, \1, \[x8\] +** ret +*/ +CALLEE (f32, svfloat32_t) + +/* +** callee_s64: +** ptrue (p[0-7])\.b, vl128 +** ld1d z0\.d, \1/z, \[x0\] +** st1d z0\.d, \1, \[x8\] +** ret +*/ +CALLEE (s64, svint64_t) + +/* +** callee_u64: +** ptrue (p[0-7])\.b, vl128 +** ld1d z0\.d, \1/z, \[x0\] +** st1d z0\.d, \1, \[x8\] +** ret +*/ +CALLEE (u64, svuint64_t) + +/* +** callee_f64: +** ptrue (p[0-7])\.b, vl128 +** ld1d z0\.d, \1/z, \[x0\] +** st1d z0\.d, \1, \[x8\] +** ret +*/ +CALLEE (f64, svfloat64_t) + +#define CALLER(SUFFIX, TYPE) \ + void __attribute__((noipa)) \ + caller_##SUFFIX (TYPE *ptr1, TYPE *ptr2) \ + { \ + *ptr2 = callee_##SUFFIX (ptr1); \ + } + +/* +** caller_s8: +** ... +** bl callee_s8 +** ... +** ld1b (z[0-9]+\.b), (p[0-7])/z, \[[^]]*\] +** st1b \1, \2, \[[^]]*\] +** ... +** ret +*/ +CALLER (s8, svint8_t) + +/* +** caller_u8: +** ... +** bl callee_u8 +** ... +** ld1b (z[0-9]+\.b), (p[0-7])/z, \[[^]]*\] +** st1b \1, \2, \[[^]]*\] +** ... +** ret +*/ +CALLER (u8, svuint8_t) + +/* +** caller_s16: +** ... +** bl callee_s16 +** ... 
+** ld1h (z[0-9]+\.h), (p[0-7])/z, \[[^]]*\]
+** st1h \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s16, svint16_t)
+
+/*
+** caller_u16:
+** ...
+** bl callee_u16
+** ...
+** ld1h (z[0-9]+\.h), (p[0-7])/z, \[[^]]*\]
+** st1h \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u16, svuint16_t)
+
+/*
+** caller_f16:
+** ...
+** bl callee_f16
+** ...
+** ld1h (z[0-9]+\.h), (p[0-7])/z, \[[^]]*\]
+** st1h \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (f16, svfloat16_t)
+
+/*
+** caller_s32:
+** ...
+** bl callee_s32
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[[^]]*\]
+** st1w \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s32, svint32_t)
+
+/*
+** caller_u32:
+** ...
+** bl callee_u32
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[[^]]*\]
+** st1w \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u32, svuint32_t)
+
+/*
+** caller_f32:
+** ...
+** bl callee_f32
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[[^]]*\]
+** st1w \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (f32, svfloat32_t)
+
+/*
+** caller_s64:
+** ...
+** bl callee_s64
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[[^]]*\]
+** st1d \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s64, svint64_t)
+
+/*
+** caller_u64:
+** ...
+** bl callee_u64
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[[^]]*\]
+** st1d \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u64, svuint64_t)
+
+/*
+** caller_f64:
+** ...
+** bl callee_f64
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[[^]]*\]
+** st1d \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (f64, svfloat64_t)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_2048.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_2048.c
new file mode 100644
index 0000000..254a36b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_2048.c
@@ -0,0 +1,265 @@
+/* { dg-do compile } */
+/* { dg-options "-O -msve-vector-bits=2048 -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <stdint.h>
+
+typedef int8_t svint8_t __attribute__ ((vector_size (256)));
+typedef uint8_t svuint8_t __attribute__ ((vector_size (256)));
+
+typedef int16_t svint16_t __attribute__ ((vector_size (256)));
+typedef uint16_t svuint16_t __attribute__ ((vector_size (256)));
+typedef __fp16 svfloat16_t __attribute__ ((vector_size (256)));
+
+typedef int32_t svint32_t __attribute__ ((vector_size (256)));
+typedef uint32_t svuint32_t __attribute__ ((vector_size (256)));
+typedef float svfloat32_t __attribute__ ((vector_size (256)));
+
+typedef int64_t svint64_t __attribute__ ((vector_size (256)));
+typedef uint64_t svuint64_t __attribute__ ((vector_size (256)));
+typedef double svfloat64_t __attribute__ ((vector_size (256)));
+
+#define CALLEE(SUFFIX, TYPE) \
+  TYPE __attribute__((noipa)) \
+  callee_##SUFFIX (TYPE *ptr) \
+  { \
+    return *ptr; \
+  }
+
+/*
+** callee_s8:
+** ptrue (p[0-7])\.b, vl256
+** ld1b z0\.b, \1/z, \[x0\]
+** st1b z0\.b, \1, \[x8\]
+** ret
+*/
+CALLEE (s8, svint8_t)
+
+/*
+** callee_u8:
+** ptrue (p[0-7])\.b, vl256
+** ld1b z0\.b, \1/z, \[x0\]
+** st1b z0\.b, \1, \[x8\]
+** ret
+*/
+CALLEE (u8, svuint8_t)
+
+/*
+** callee_s16:
+** ptrue (p[0-7])\.b, vl256
+** ld1h z0\.h, \1/z, \[x0\]
+** st1h z0\.h, \1, \[x8\]
+** ret
+*/
+CALLEE (s16, svint16_t)
+
+/*
+** callee_u16:
+** ptrue (p[0-7])\.b, vl256
+** ld1h z0\.h, \1/z, \[x0\]
+** st1h z0\.h, \1, \[x8\]
+** ret
+*/
+CALLEE (u16, svuint16_t)
+
+/*
+** callee_f16:
+** ptrue (p[0-7])\.b, vl256
+** ld1h z0\.h, \1/z, \[x0\]
+** st1h z0\.h, \1, \[x8\]
+** ret
+*/
+CALLEE (f16, svfloat16_t)
+
+/*
+** callee_s32:
+** ptrue (p[0-7])\.b, vl256
+** ld1w z0\.s, \1/z, \[x0\]
+** st1w z0\.s, \1, \[x8\]
+** ret
+*/
+CALLEE (s32, svint32_t)
+
+/*
+** callee_u32:
+** ptrue (p[0-7])\.b, vl256
+** ld1w z0\.s, \1/z, \[x0\]
+** st1w z0\.s, \1, \[x8\]
+** ret
+*/
+CALLEE (u32, svuint32_t)
+
+/*
+** callee_f32:
+** ptrue (p[0-7])\.b, vl256
+** ld1w z0\.s, \1/z, \[x0\]
+** st1w z0\.s, \1, \[x8\]
+** ret
+*/
+CALLEE (f32, svfloat32_t)
+
+/*
+** callee_s64:
+** ptrue (p[0-7])\.b, vl256
+** ld1d z0\.d, \1/z, \[x0\]
+** st1d z0\.d, \1, \[x8\]
+** ret
+*/
+CALLEE (s64, svint64_t)
+
+/*
+** callee_u64:
+** ptrue (p[0-7])\.b, vl256
+** ld1d z0\.d, \1/z, \[x0\]
+** st1d z0\.d, \1, \[x8\]
+** ret
+*/
+CALLEE (u64, svuint64_t)
+
+/*
+** callee_f64:
+** ptrue (p[0-7])\.b, vl256
+** ld1d z0\.d, \1/z, \[x0\]
+** st1d z0\.d, \1, \[x8\]
+** ret
+*/
+CALLEE (f64, svfloat64_t)
+
+#define CALLER(SUFFIX, TYPE) \
+  void __attribute__((noipa)) \
+  caller_##SUFFIX (TYPE *ptr1, TYPE *ptr2) \
+  { \
+    *ptr2 = callee_##SUFFIX (ptr1); \
+  }
+
+/*
+** caller_s8:
+** ...
+** bl callee_s8
+** ...
+** ld1b (z[0-9]+\.b), (p[0-7])/z, \[[^]]*\]
+** st1b \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s8, svint8_t)
+
+/*
+** caller_u8:
+** ...
+** bl callee_u8
+** ...
+** ld1b (z[0-9]+\.b), (p[0-7])/z, \[[^]]*\]
+** st1b \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u8, svuint8_t)
+
+/*
+** caller_s16:
+** ...
+** bl callee_s16
+** ...
+** ld1h (z[0-9]+\.h), (p[0-7])/z, \[[^]]*\]
+** st1h \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s16, svint16_t)
+
+/*
+** caller_u16:
+** ...
+** bl callee_u16
+** ...
+** ld1h (z[0-9]+\.h), (p[0-7])/z, \[[^]]*\]
+** st1h \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u16, svuint16_t)
+
+/*
+** caller_f16:
+** ...
+** bl callee_f16
+** ...
+** ld1h (z[0-9]+\.h), (p[0-7])/z, \[[^]]*\]
+** st1h \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (f16, svfloat16_t)
+
+/*
+** caller_s32:
+** ...
+** bl callee_s32
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[[^]]*\]
+** st1w \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s32, svint32_t)
+
+/*
+** caller_u32:
+** ...
+** bl callee_u32
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[[^]]*\]
+** st1w \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u32, svuint32_t)
+
+/*
+** caller_f32:
+** ...
+** bl callee_f32
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[[^]]*\]
+** st1w \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (f32, svfloat32_t)
+
+/*
+** caller_s64:
+** ...
+** bl callee_s64
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[[^]]*\]
+** st1d \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s64, svint64_t)
+
+/*
+** caller_u64:
+** ...
+** bl callee_u64
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[[^]]*\]
+** st1d \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u64, svuint64_t)
+
+/*
+** caller_f64:
+** ...
+** bl callee_f64
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[[^]]*\]
+** st1d \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (f64, svfloat64_t)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_256.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_256.c
new file mode 100644
index 0000000..414f66f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_256.c
@@ -0,0 +1,265 @@
+/* { dg-do compile } */
+/* { dg-options "-O -msve-vector-bits=256 -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <stdint.h>
+
+typedef int8_t svint8_t __attribute__ ((vector_size (32)));
+typedef uint8_t svuint8_t __attribute__ ((vector_size (32)));
+
+typedef int16_t svint16_t __attribute__ ((vector_size (32)));
+typedef uint16_t svuint16_t __attribute__ ((vector_size (32)));
+typedef __fp16 svfloat16_t __attribute__ ((vector_size (32)));
+
+typedef int32_t svint32_t __attribute__ ((vector_size (32)));
+typedef uint32_t svuint32_t __attribute__ ((vector_size (32)));
+typedef float svfloat32_t __attribute__ ((vector_size (32)));
+
+typedef int64_t svint64_t __attribute__ ((vector_size (32)));
+typedef uint64_t svuint64_t __attribute__ ((vector_size (32)));
+typedef double svfloat64_t __attribute__ ((vector_size (32)));
+
+#define CALLEE(SUFFIX, TYPE) \
+  TYPE __attribute__((noipa)) \
+  callee_##SUFFIX (TYPE *ptr) \
+  { \
+    return *ptr; \
+  }
+
+/*
+** callee_s8:
+** ptrue (p[0-7])\.b, vl32
+** ld1b z0\.b, \1/z, \[x0\]
+** st1b z0\.b, \1, \[x8\]
+** ret
+*/
+CALLEE (s8, svint8_t)
+
+/*
+** callee_u8:
+** ptrue (p[0-7])\.b, vl32
+** ld1b z0\.b, \1/z, \[x0\]
+** st1b z0\.b, \1, \[x8\]
+** ret
+*/
+CALLEE (u8, svuint8_t)
+
+/*
+** callee_s16:
+** ptrue (p[0-7])\.b, vl32
+** ld1h z0\.h, \1/z, \[x0\]
+** st1h z0\.h, \1, \[x8\]
+** ret
+*/
+CALLEE (s16, svint16_t)
+
+/*
+** callee_u16:
+** ptrue (p[0-7])\.b, vl32
+** ld1h z0\.h, \1/z, \[x0\]
+** st1h z0\.h, \1, \[x8\]
+** ret
+*/
+CALLEE (u16, svuint16_t)
+
+/*
+** callee_f16:
+** ptrue (p[0-7])\.b, vl32
+** ld1h z0\.h, \1/z, \[x0\]
+** st1h z0\.h, \1, \[x8\]
+** ret
+*/
+CALLEE (f16, svfloat16_t)
+
+/*
+** callee_s32:
+** ptrue (p[0-7])\.b, vl32
+** ld1w z0\.s, \1/z, \[x0\]
+** st1w z0\.s, \1, \[x8\]
+** ret
+*/
+CALLEE (s32, svint32_t)
+
+/*
+** callee_u32:
+** ptrue (p[0-7])\.b, vl32
+** ld1w z0\.s, \1/z, \[x0\]
+** st1w z0\.s, \1, \[x8\]
+** ret
+*/
+CALLEE (u32, svuint32_t)
+
+/*
+** callee_f32:
+** ptrue (p[0-7])\.b, vl32
+** ld1w z0\.s, \1/z, \[x0\]
+** st1w z0\.s, \1, \[x8\]
+** ret
+*/
+CALLEE (f32, svfloat32_t)
+
+/*
+** callee_s64:
+** ptrue (p[0-7])\.b, vl32
+** ld1d z0\.d, \1/z, \[x0\]
+** st1d z0\.d, \1, \[x8\]
+** ret
+*/
+CALLEE (s64, svint64_t)
+
+/*
+** callee_u64:
+** ptrue (p[0-7])\.b, vl32
+** ld1d z0\.d, \1/z, \[x0\]
+** st1d z0\.d, \1, \[x8\]
+** ret
+*/
+CALLEE (u64, svuint64_t)
+
+/*
+** callee_f64:
+** ptrue (p[0-7])\.b, vl32
+** ld1d z0\.d, \1/z, \[x0\]
+** st1d z0\.d, \1, \[x8\]
+** ret
+*/
+CALLEE (f64, svfloat64_t)
+
+#define CALLER(SUFFIX, TYPE) \
+  void __attribute__((noipa)) \
+  caller_##SUFFIX (TYPE *ptr1, TYPE *ptr2) \
+  { \
+    *ptr2 = callee_##SUFFIX (ptr1); \
+  }
+
+/*
+** caller_s8:
+** ...
+** bl callee_s8
+** ...
+** ld1b (z[0-9]+\.b), (p[0-7])/z, \[[^]]*\]
+** st1b \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s8, svint8_t)
+
+/*
+** caller_u8:
+** ...
+** bl callee_u8
+** ...
+** ld1b (z[0-9]+\.b), (p[0-7])/z, \[[^]]*\]
+** st1b \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u8, svuint8_t)
+
+/*
+** caller_s16:
+** ...
+** bl callee_s16
+** ...
+** ld1h (z[0-9]+\.h), (p[0-7])/z, \[[^]]*\]
+** st1h \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s16, svint16_t)
+
+/*
+** caller_u16:
+** ...
+** bl callee_u16
+** ...
+** ld1h (z[0-9]+\.h), (p[0-7])/z, \[[^]]*\]
+** st1h \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u16, svuint16_t)
+
+/*
+** caller_f16:
+** ...
+** bl callee_f16
+** ...
+** ld1h (z[0-9]+\.h), (p[0-7])/z, \[[^]]*\]
+** st1h \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (f16, svfloat16_t)
+
+/*
+** caller_s32:
+** ...
+** bl callee_s32
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[[^]]*\]
+** st1w \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s32, svint32_t)
+
+/*
+** caller_u32:
+** ...
+** bl callee_u32
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[[^]]*\]
+** st1w \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u32, svuint32_t)
+
+/*
+** caller_f32:
+** ...
+** bl callee_f32
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[[^]]*\]
+** st1w \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (f32, svfloat32_t)
+
+/*
+** caller_s64:
+** ...
+** bl callee_s64
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[[^]]*\]
+** st1d \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s64, svint64_t)
+
+/*
+** caller_u64:
+** ...
+** bl callee_u64
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[[^]]*\]
+** st1d \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u64, svuint64_t)
+
+/*
+** caller_f64:
+** ...
+** bl callee_f64
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[[^]]*\]
+** st1d \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (f64, svfloat64_t)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_512.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_512.c
new file mode 100644
index 0000000..7673ea2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_512.c
@@ -0,0 +1,265 @@
+/* { dg-do compile } */
+/* { dg-options "-O -msve-vector-bits=512 -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <stdint.h>
+
+typedef int8_t svint8_t __attribute__ ((vector_size (64)));
+typedef uint8_t svuint8_t __attribute__ ((vector_size (64)));
+
+typedef int16_t svint16_t __attribute__ ((vector_size (64)));
+typedef uint16_t svuint16_t __attribute__ ((vector_size (64)));
+typedef __fp16 svfloat16_t __attribute__ ((vector_size (64)));
+
+typedef int32_t svint32_t __attribute__ ((vector_size (64)));
+typedef uint32_t svuint32_t __attribute__ ((vector_size (64)));
+typedef float svfloat32_t __attribute__ ((vector_size (64)));
+
+typedef int64_t svint64_t __attribute__ ((vector_size (64)));
+typedef uint64_t svuint64_t __attribute__ ((vector_size (64)));
+typedef double svfloat64_t __attribute__ ((vector_size (64)));
+
+#define CALLEE(SUFFIX, TYPE) \
+  TYPE __attribute__((noipa)) \
+  callee_##SUFFIX (TYPE *ptr) \
+  { \
+    return *ptr; \
+  }
+
+/*
+** callee_s8:
+** ptrue (p[0-7])\.b, vl64
+** ld1b z0\.b, \1/z, \[x0\]
+** st1b z0\.b, \1, \[x8\]
+** ret
+*/
+CALLEE (s8, svint8_t)
+
+/*
+** callee_u8:
+** ptrue (p[0-7])\.b, vl64
+** ld1b z0\.b, \1/z, \[x0\]
+** st1b z0\.b, \1, \[x8\]
+** ret
+*/
+CALLEE (u8, svuint8_t)
+
+/*
+** callee_s16:
+** ptrue (p[0-7])\.b, vl64
+** ld1h z0\.h, \1/z, \[x0\]
+** st1h z0\.h, \1, \[x8\]
+** ret
+*/
+CALLEE (s16, svint16_t)
+
+/*
+** callee_u16:
+** ptrue (p[0-7])\.b, vl64
+** ld1h z0\.h, \1/z, \[x0\]
+** st1h z0\.h, \1, \[x8\]
+** ret
+*/
+CALLEE (u16, svuint16_t)
+
+/*
+** callee_f16:
+** ptrue (p[0-7])\.b, vl64
+** ld1h z0\.h, \1/z, \[x0\]
+** st1h z0\.h, \1, \[x8\]
+** ret
+*/
+CALLEE (f16, svfloat16_t)
+
+/*
+** callee_s32:
+** ptrue (p[0-7])\.b, vl64
+** ld1w z0\.s, \1/z, \[x0\]
+** st1w z0\.s, \1, \[x8\]
+** ret
+*/
+CALLEE (s32, svint32_t)
+
+/*
+** callee_u32:
+** ptrue (p[0-7])\.b, vl64
+** ld1w z0\.s, \1/z, \[x0\]
+** st1w z0\.s, \1, \[x8\]
+** ret
+*/
+CALLEE (u32, svuint32_t)
+
+/*
+** callee_f32:
+** ptrue (p[0-7])\.b, vl64
+** ld1w z0\.s, \1/z, \[x0\]
+** st1w z0\.s, \1, \[x8\]
+** ret
+*/
+CALLEE (f32, svfloat32_t)
+
+/*
+** callee_s64:
+** ptrue (p[0-7])\.b, vl64
+** ld1d z0\.d, \1/z, \[x0\]
+** st1d z0\.d, \1, \[x8\]
+** ret
+*/
+CALLEE (s64, svint64_t)
+
+/*
+** callee_u64:
+** ptrue (p[0-7])\.b, vl64
+** ld1d z0\.d, \1/z, \[x0\]
+** st1d z0\.d, \1, \[x8\]
+** ret
+*/
+CALLEE (u64, svuint64_t)
+
+/*
+** callee_f64:
+** ptrue (p[0-7])\.b, vl64
+** ld1d z0\.d, \1/z, \[x0\]
+** st1d z0\.d, \1, \[x8\]
+** ret
+*/
+CALLEE (f64, svfloat64_t)
+
+#define CALLER(SUFFIX, TYPE) \
+  void __attribute__((noipa)) \
+  caller_##SUFFIX (TYPE *ptr1, TYPE *ptr2) \
+  { \
+    *ptr2 = callee_##SUFFIX (ptr1); \
+  }
+
+/*
+** caller_s8:
+** ...
+** bl callee_s8
+** ...
+** ld1b (z[0-9]+\.b), (p[0-7])/z, \[[^]]*\]
+** st1b \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s8, svint8_t)
+
+/*
+** caller_u8:
+** ...
+** bl callee_u8
+** ...
+** ld1b (z[0-9]+\.b), (p[0-7])/z, \[[^]]*\]
+** st1b \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u8, svuint8_t)
+
+/*
+** caller_s16:
+** ...
+** bl callee_s16
+** ...
+** ld1h (z[0-9]+\.h), (p[0-7])/z, \[[^]]*\]
+** st1h \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s16, svint16_t)
+
+/*
+** caller_u16:
+** ...
+** bl callee_u16
+** ...
+** ld1h (z[0-9]+\.h), (p[0-7])/z, \[[^]]*\]
+** st1h \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u16, svuint16_t)
+
+/*
+** caller_f16:
+** ...
+** bl callee_f16
+** ...
+** ld1h (z[0-9]+\.h), (p[0-7])/z, \[[^]]*\]
+** st1h \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (f16, svfloat16_t)
+
+/*
+** caller_s32:
+** ...
+** bl callee_s32
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[[^]]*\]
+** st1w \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s32, svint32_t)
+
+/*
+** caller_u32:
+** ...
+** bl callee_u32
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[[^]]*\]
+** st1w \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u32, svuint32_t)
+
+/*
+** caller_f32:
+** ...
+** bl callee_f32
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[[^]]*\]
+** st1w \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (f32, svfloat32_t)
+
+/*
+** caller_s64:
+** ...
+** bl callee_s64
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[[^]]*\]
+** st1d \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (s64, svint64_t)
+
+/*
+** caller_u64:
+** ...
+** bl callee_u64
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[[^]]*\]
+** st1d \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (u64, svuint64_t)
+
+/*
+** caller_f64:
+** ...
+** bl callee_f64
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[[^]]*\]
+** st1d \1, \2, \[[^]]*\]
+** ...
+** ret
+*/
+CALLER (f64, svfloat64_t)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_7.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_7.c
new file mode 100644
index 0000000..d03ef69
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_7.c
@@ -0,0 +1,313 @@
+/* { dg-do compile } */
+/* { dg-options "-O -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <arm_sve.h>
+
+/*
+** callee_s8:
+** mov z0\.b, #1
+** mov z1\.b, #2
+** ret
+*/
+svint8x2_t __attribute__((noipa))
+callee_s8 (void)
+{
+  return svcreate2 (svdup_s8 (1), svdup_s8 (2));
+}
+
+/*
+** caller_s8:
+** ...
+** bl callee_s8 +** trn1 z0\.b, z0\.b, z1\.b +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svint8_t __attribute__((noipa)) +caller_s8 (void) +{ + svint8x2_t res; + res = callee_s8 (); + return svtrn1 (svget2 (res, 0), svget2 (res, 1)); +} + +/* +** callee_u8: +** mov z0\.b, #3 +** mov z1\.b, #4 +** ret +*/ +svuint8x2_t __attribute__((noipa)) +callee_u8 (void) +{ + return svcreate2 (svdup_u8 (3), svdup_u8 (4)); +} + +/* +** caller_u8: +** ... +** bl callee_u8 +** trn2 z0\.b, z1\.b, z0\.b +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svuint8_t __attribute__((noipa)) +caller_u8 (void) +{ + svuint8x2_t res; + res = callee_u8 (); + return svtrn2 (svget2 (res, 1), svget2 (res, 0)); +} + +/* +** callee_s16: +** mov z0\.h, #1 +** mov z1\.h, #2 +** ret +*/ +svint16x2_t __attribute__((noipa)) +callee_s16 (void) +{ + return svcreate2 (svdup_s16 (1), svdup_s16 (2)); +} + +/* +** caller_s16: +** ... +** bl callee_s16 +** trn1 z0\.h, z0\.h, z1\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svint16_t __attribute__((noipa)) +caller_s16 (void) +{ + svint16x2_t res; + res = callee_s16 (); + return svtrn1 (svget2 (res, 0), svget2 (res, 1)); +} + +/* +** callee_u16: +** mov z0\.h, #3 +** mov z1\.h, #4 +** ret +*/ +svuint16x2_t __attribute__((noipa)) +callee_u16 (void) +{ + return svcreate2 (svdup_u16 (3), svdup_u16 (4)); +} + +/* +** caller_u16: +** ... +** bl callee_u16 +** trn2 z0\.h, z1\.h, z0\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svuint16_t __attribute__((noipa)) +caller_u16 (void) +{ + svuint16x2_t res; + res = callee_u16 (); + return svtrn2 (svget2 (res, 1), svget2 (res, 0)); +} + +/* +** callee_f16: +** fmov z0\.h, #5\.0(?:e\+0)? +** fmov z1\.h, #6\.0(?:e\+0)? +** ret +*/ +svfloat16x2_t __attribute__((noipa)) +callee_f16 (void) +{ + return svcreate2 (svdup_f16 (5), svdup_f16 (6)); +} + +/* +** caller_f16: +** ... +** bl callee_f16 +** zip1 z0\.h, z1\.h, z0\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svfloat16_t __attribute__((noipa)) +caller_f16 (void) +{ + svfloat16x2_t res; + res = callee_f16 (); + return svzip1 (svget2 (res, 1), svget2 (res, 0)); +} + +/* +** callee_s32: +** mov z0\.s, #1 +** mov z1\.s, #2 +** ret +*/ +svint32x2_t __attribute__((noipa)) +callee_s32 (void) +{ + return svcreate2 (svdup_s32 (1), svdup_s32 (2)); +} + +/* +** caller_s32: +** ... +** bl callee_s32 +** trn1 z0\.s, z0\.s, z1\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svint32_t __attribute__((noipa)) +caller_s32 (void) +{ + svint32x2_t res; + res = callee_s32 (); + return svtrn1 (svget2 (res, 0), svget2 (res, 1)); +} + +/* +** callee_u32: +** mov z0\.s, #3 +** mov z1\.s, #4 +** ret +*/ +svuint32x2_t __attribute__((noipa)) +callee_u32 (void) +{ + return svcreate2 (svdup_u32 (3), svdup_u32 (4)); +} + +/* +** caller_u32: +** ... +** bl callee_u32 +** trn2 z0\.s, z1\.s, z0\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svuint32_t __attribute__((noipa)) +caller_u32 (void) +{ + svuint32x2_t res; + res = callee_u32 (); + return svtrn2 (svget2 (res, 1), svget2 (res, 0)); +} + +/* +** callee_f32: +** fmov z0\.s, #5\.0(?:e\+0)? +** fmov z1\.s, #6\.0(?:e\+0)? +** ret +*/ +svfloat32x2_t __attribute__((noipa)) +callee_f32 (void) +{ + return svcreate2 (svdup_f32 (5), svdup_f32 (6)); +} + +/* +** caller_f32: +** ... 
+** bl callee_f32
+** zip1 z0\.s, z1\.s, z0\.s
+** ldp x29, x30, \[sp\], 16
+** ret
+*/
+svfloat32_t __attribute__((noipa))
+caller_f32 (void)
+{
+  svfloat32x2_t res;
+  res = callee_f32 ();
+  return svzip1 (svget2 (res, 1), svget2 (res, 0));
+}
+
+/*
+** callee_s64:
+** mov z0\.d, #1
+** mov z1\.d, #2
+** ret
+*/
+svint64x2_t __attribute__((noipa))
+callee_s64 (void)
+{
+  return svcreate2 (svdup_s64 (1), svdup_s64 (2));
+}
+
+/*
+** caller_s64:
+** ...
+** bl callee_s64
+** trn1 z0\.d, z0\.d, z1\.d
+** ldp x29, x30, \[sp\], 16
+** ret
+*/
+svint64_t __attribute__((noipa))
+caller_s64 (void)
+{
+  svint64x2_t res;
+  res = callee_s64 ();
+  return svtrn1 (svget2 (res, 0), svget2 (res, 1));
+}
+
+/*
+** callee_u64:
+** mov z0\.d, #3
+** mov z1\.d, #4
+** ret
+*/
+svuint64x2_t __attribute__((noipa))
+callee_u64 (void)
+{
+  return svcreate2 (svdup_u64 (3), svdup_u64 (4));
+}
+
+/*
+** caller_u64:
+** ...
+** bl callee_u64
+** trn2 z0\.d, z1\.d, z0\.d
+** ldp x29, x30, \[sp\], 16
+** ret
+*/
+svuint64_t __attribute__((noipa))
+caller_u64 (void)
+{
+  svuint64x2_t res;
+  res = callee_u64 ();
+  return svtrn2 (svget2 (res, 1), svget2 (res, 0));
+}
+
+/*
+** callee_f64:
+** fmov z0\.d, #5\.0(?:e\+0)?
+** fmov z1\.d, #6\.0(?:e\+0)?
+** ret
+*/
+svfloat64x2_t __attribute__((noipa))
+callee_f64 (void)
+{
+  return svcreate2 (svdup_f64 (5), svdup_f64 (6));
+}
+
+/*
+** caller_f64:
+** ...
+** bl callee_f64
+** zip1 z0\.d, z1\.d, z0\.d
+** ldp x29, x30, \[sp\], 16
+** ret
+*/
+svfloat64_t __attribute__((noipa))
+caller_f64 (void)
+{
+  svfloat64x2_t res;
+  res = callee_f64 ();
+  return svzip1 (svget2 (res, 1), svget2 (res, 0));
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_8.c
new file mode 100644
index 0000000..6a094bb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_8.c
@@ -0,0 +1,346 @@
+/* { dg-do compile } */
+/* { dg-options "-O -frename-registers -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <arm_sve.h>
+
+/*
+** callee_s8:
+** mov z0\.b, #1
+** mov z1\.b, #2
+** mov z2\.b, #3
+** ret
+*/
+svint8x3_t __attribute__((noipa))
+callee_s8 (void)
+{
+  return svcreate3 (svdup_s8 (1), svdup_s8 (2), svdup_s8 (3));
+}
+
+/*
+** caller_s8:
+** ...
+** bl callee_s8
+** ptrue (p[0-7])\.b, all
+** mad z0\.b, \1/m, z1\.b, z2\.b
+** ldp x29, x30, \[sp\], 16
+** ret
+*/
+svint8_t __attribute__((noipa))
+caller_s8 (void)
+{
+  svint8x3_t res;
+  res = callee_s8 ();
+  return svmad_x (svptrue_b8 (),
+                  svget3 (res, 0), svget3 (res, 1), svget3 (res, 2));
+}
+
+/*
+** callee_u8:
+** mov z0\.b, #4
+** mov z1\.b, #5
+** mov z2\.b, #6
+** ret
+*/
+svuint8x3_t __attribute__((noipa))
+callee_u8 (void)
+{
+  return svcreate3 (svdup_u8 (4), svdup_u8 (5), svdup_u8 (6));
+}
+
+/*
+** caller_u8:
+** ...
+** bl callee_u8
+** ptrue (p[0-7])\.b, all
+** msb z0\.b, \1/m, z1\.b, z2\.b
+** ldp x29, x30, \[sp\], 16
+** ret
+*/
+svuint8_t __attribute__((noipa))
+caller_u8 (void)
+{
+  svuint8x3_t res;
+  res = callee_u8 ();
+  return svmsb_x (svptrue_b8 (),
+                  svget3 (res, 0), svget3 (res, 1), svget3 (res, 2));
+}
+
+/*
+** callee_s16:
+** mov z0\.h, #1
+** mov z1\.h, #2
+** mov z2\.h, #3
+** ret
+*/
+svint16x3_t __attribute__((noipa))
+callee_s16 (void)
+{
+  return svcreate3 (svdup_s16 (1), svdup_s16 (2), svdup_s16 (3));
+}
+
+/*
+** caller_s16:
+** ...
+** bl callee_s16 +** ptrue (p[0-7])\.b, all +** mls z0\.h, \1/m, z1\.h, z2\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svint16_t __attribute__((noipa)) +caller_s16 (void) +{ + svint16x3_t res; + res = callee_s16 (); + return svmls_x (svptrue_b16 (), + svget3 (res, 0), svget3 (res, 1), svget3 (res, 2)); +} + +/* +** callee_u16: +** mov z0\.h, #4 +** mov z1\.h, #5 +** mov z2\.h, #6 +** ret +*/ +svuint16x3_t __attribute__((noipa)) +callee_u16 (void) +{ + return svcreate3 (svdup_u16 (4), svdup_u16 (5), svdup_u16 (6)); +} + +/* +** caller_u16: +** ... +** bl callee_u16 +** ptrue (p[0-7])\.b, all +** mla z0\.h, \1/m, z1\.h, z2\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svuint16_t __attribute__((noipa)) +caller_u16 (void) +{ + svuint16x3_t res; + res = callee_u16 (); + return svmla_x (svptrue_b16 (), + svget3 (res, 0), svget3 (res, 1), svget3 (res, 2)); +} + +/* +** callee_f16: +** fmov z0\.h, #1\.0(?:e\+0)? +** fmov z1\.h, #2\.0(?:e\+0)? +** fmov z2\.h, #3\.0(?:e\+0)? +** ret +*/ +svfloat16x3_t __attribute__((noipa)) +callee_f16 (void) +{ + return svcreate3 (svdup_f16 (1), svdup_f16 (2), svdup_f16 (3)); +} + +/* +** caller_f16: +** ... +** bl callee_f16 +** ptrue (p[0-7])\.b, all +** fmla z0\.h, \1/m, z1\.h, z2\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svfloat16_t __attribute__((noipa)) +caller_f16 (void) +{ + svfloat16x3_t res; + res = callee_f16 (); + return svmla_x (svptrue_b16 (), + svget3 (res, 0), svget3 (res, 1), svget3 (res, 2)); +} + +/* +** callee_s32: +** mov z0\.s, #1 +** mov z1\.s, #2 +** mov z2\.s, #3 +** ret +*/ +svint32x3_t __attribute__((noipa)) +callee_s32 (void) +{ + return svcreate3 (svdup_s32 (1), svdup_s32 (2), svdup_s32 (3)); +} + +/* +** caller_s32: +** ... +** bl callee_s32 +** ptrue (p[0-7])\.b, all +** mad z0\.s, \1/m, z1\.s, z2\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svint32_t __attribute__((noipa)) +caller_s32 (void) +{ + svint32x3_t res; + res = callee_s32 (); + return svmad_x (svptrue_b32 (), + svget3 (res, 0), svget3 (res, 1), svget3 (res, 2)); +} + +/* +** callee_u32: +** mov z0\.s, #4 +** mov z1\.s, #5 +** mov z2\.s, #6 +** ret +*/ +svuint32x3_t __attribute__((noipa)) +callee_u32 (void) +{ + return svcreate3 (svdup_u32 (4), svdup_u32 (5), svdup_u32 (6)); +} + +/* +** caller_u32: +** ... +** bl callee_u32 +** ptrue (p[0-7])\.b, all +** msb z0\.s, \1/m, z1\.s, z2\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svuint32_t __attribute__((noipa)) +caller_u32 (void) +{ + svuint32x3_t res; + res = callee_u32 (); + return svmsb_x (svptrue_b32 (), + svget3 (res, 0), svget3 (res, 1), svget3 (res, 2)); +} + +/* +** callee_f32: +** fmov z0\.s, #1\.0(?:e\+0)? +** fmov z1\.s, #2\.0(?:e\+0)? +** fmov z2\.s, #3\.0(?:e\+0)? +** ret +*/ +svfloat32x3_t __attribute__((noipa)) +callee_f32 (void) +{ + return svcreate3 (svdup_f32 (1), svdup_f32 (2), svdup_f32 (3)); +} + +/* +** caller_f32: +** ... +** bl callee_f32 +** ptrue (p[0-7])\.b, all +** fmla z0\.s, \1/m, z1\.s, z2\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svfloat32_t __attribute__((noipa)) +caller_f32 (void) +{ + svfloat32x3_t res; + res = callee_f32 (); + return svmla_x (svptrue_b32 (), + svget3 (res, 0), svget3 (res, 1), svget3 (res, 2)); +} + +/* +** callee_s64: +** mov z0\.d, #1 +** mov z1\.d, #2 +** mov z2\.d, #3 +** ret +*/ +svint64x3_t __attribute__((noipa)) +callee_s64 (void) +{ + return svcreate3 (svdup_s64 (1), svdup_s64 (2), svdup_s64 (3)); +} + +/* +** caller_s64: +** ... 
+** bl callee_s64
+** ptrue (p[0-7])\.b, all
+** mls z0\.d, \1/m, z1\.d, z2\.d
+** ldp x29, x30, \[sp\], 16
+** ret
+*/
+svint64_t __attribute__((noipa))
+caller_s64 (void)
+{
+  svint64x3_t res;
+  res = callee_s64 ();
+  return svmls_x (svptrue_b64 (),
+                  svget3 (res, 0), svget3 (res, 1), svget3 (res, 2));
+}
+
+/*
+** callee_u64:
+** mov z0\.d, #4
+** mov z1\.d, #5
+** mov z2\.d, #6
+** ret
+*/
+svuint64x3_t __attribute__((noipa))
+callee_u64 (void)
+{
+  return svcreate3 (svdup_u64 (4), svdup_u64 (5), svdup_u64 (6));
+}
+
+/*
+** caller_u64:
+** ...
+** bl callee_u64
+** ptrue (p[0-7])\.b, all
+** mla z0\.d, \1/m, z1\.d, z2\.d
+** ldp x29, x30, \[sp\], 16
+** ret
+*/
+svuint64_t __attribute__((noipa))
+caller_u64 (void)
+{
+  svuint64x3_t res;
+  res = callee_u64 ();
+  return svmla_x (svptrue_b64 (),
+                  svget3 (res, 0), svget3 (res, 1), svget3 (res, 2));
+}
+
+/*
+** callee_f64:
+** fmov z0\.d, #1\.0(?:e\+0)?
+** fmov z1\.d, #2\.0(?:e\+0)?
+** fmov z2\.d, #3\.0(?:e\+0)?
+** ret
+*/
+svfloat64x3_t __attribute__((noipa))
+callee_f64 (void)
+{
+  return svcreate3 (svdup_f64 (1), svdup_f64 (2), svdup_f64 (3));
+}
+
+/*
+** caller_f64:
+** ...
+** bl callee_f64
+** ptrue (p[0-7])\.b, all
+** fmla z0\.d, \1/m, z1\.d, z2\.d
+** ldp x29, x30, \[sp\], 16
+** ret
+*/
+svfloat64_t __attribute__((noipa))
+caller_f64 (void)
+{
+  svfloat64x3_t res;
+  res = callee_f64 ();
+  return svmla_x (svptrue_b64 (),
+                  svget3 (res, 0), svget3 (res, 1), svget3 (res, 2));
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_9.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_9.c
new file mode 100644
index 0000000..caadbb9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_9.c
@@ -0,0 +1,405 @@
+/* { dg-do compile } */
+/* { dg-options "-O -frename-registers -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <arm_sve.h>
+
+/*
+** callee_s8:
+** mov z0\.b, #1
+** mov z1\.b, #2
+** mov z2\.b, #3
+** mov z3\.b, #4
+** ret
+*/
+svint8x4_t __attribute__((noipa))
+callee_s8 (void)
+{
+  return svcreate4 (svdup_s8 (1), svdup_s8 (2), svdup_s8 (3), svdup_s8 (4));
+}
+
+/*
+** caller_s8:
+** ...
+** bl callee_s8
+** add (z[2-7]\.b), z2\.b, z3\.b
+** ptrue (p[0-7])\.b, all
+** mla z0\.b, \2/m, (z1\.b, \1|\1, z1\.b)
+** ldp x29, x30, \[sp\], 16
+** ret
+*/
+svint8_t __attribute__((noipa))
+caller_s8 (void)
+{
+  svint8x4_t res;
+  res = callee_s8 ();
+  return svmla_x (svptrue_b8 (), svget4 (res, 0), svget4 (res, 1),
+                  svadd_x (svptrue_b8 (),
+                           svget4 (res, 2),
+                           svget4 (res, 3)));
+}
+
+/*
+** callee_u8:
+** mov z0\.b, #4
+** mov z1\.b, #5
+** mov z2\.b, #6
+** mov z3\.b, #7
+** ret
+*/
+svuint8x4_t __attribute__((noipa))
+callee_u8 (void)
+{
+  return svcreate4 (svdup_u8 (4), svdup_u8 (5), svdup_u8 (6), svdup_u8 (7));
+}
+
+/*
+** caller_u8:
+** ...
+** bl callee_u8
+** sub (z[2-7]\.b), z2\.b, z3\.b
+** ptrue (p[0-7])\.b, all
+** mla z0\.b, \2/m, (z1\.b, \1|\1, z1\.b)
+** ldp x29, x30, \[sp\], 16
+** ret
+*/
+svuint8_t __attribute__((noipa))
+caller_u8 (void)
+{
+  svuint8x4_t res;
+  res = callee_u8 ();
+  return svmla_x (svptrue_b8 (), svget4 (res, 0), svget4 (res, 1),
+                  svsub_x (svptrue_b8 (),
+                           svget4 (res, 2),
+                           svget4 (res, 3)));
+}
+
+/*
+** callee_s16:
+** mov z0\.h, #1
+** mov z1\.h, #2
+** mov z2\.h, #3
+** mov z3\.h, #4
+** ret
+*/
+svint16x4_t __attribute__((noipa))
+callee_s16 (void)
+{
+  return svcreate4 (svdup_s16 (1), svdup_s16 (2),
+                    svdup_s16 (3), svdup_s16 (4));
+}
+
+/*
+** caller_s16:
+** ...
+** bl callee_s16 +** add (z[2-7]\.h), z2\.h, z3\.h +** ptrue (p[0-7])\.b, all +** mad z0\.h, \2/m, (z1\.h, \1|\1, z1\.h) +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svint16_t __attribute__((noipa)) +caller_s16 (void) +{ + svint16x4_t res; + res = callee_s16 (); + return svmad_x (svptrue_b16 (), svget4 (res, 0), svget4 (res, 1), + svadd_x (svptrue_b16 (), + svget4 (res, 2), + svget4 (res, 3))); +} + +/* +** callee_u16: +** mov z0\.h, #4 +** mov z1\.h, #5 +** mov z2\.h, #6 +** mov z3\.h, #7 +** ret +*/ +svuint16x4_t __attribute__((noipa)) +callee_u16 (void) +{ + return svcreate4 (svdup_u16 (4), svdup_u16 (5), + svdup_u16 (6), svdup_u16 (7)); +} + +/* +** caller_u16: +** ... +** bl callee_u16 +** sub (z[2-7]\.h), z2\.h, z3\.h +** ptrue (p[0-7])\.b, all +** mad z0\.h, \2/m, (z1\.h, \1|\1, z1\.h) +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svuint16_t __attribute__((noipa)) +caller_u16 (void) +{ + svuint16x4_t res; + res = callee_u16 (); + return svmad_x (svptrue_b16 (), svget4 (res, 0), svget4 (res, 1), + svsub_x (svptrue_b16 (), + svget4 (res, 2), + svget4 (res, 3))); +} + +/* +** callee_f16: +** fmov z0\.h, #1\.0(?:e\+0)? +** fmov z1\.h, #2\.0(?:e\+0)? +** fmov z2\.h, #3\.0(?:e\+0)? +** fmov z3\.h, #4\.0(?:e\+0)? +** ret +*/ +svfloat16x4_t __attribute__((noipa)) +callee_f16 (void) +{ + return svcreate4 (svdup_f16 (1), svdup_f16 (2), + svdup_f16 (3), svdup_f16 (4)); +} + +/* +** caller_f16: +** ... +** bl callee_f16 +** fadd (z[0-9]+\.h), z0\.h, z1\.h +** fmul (z[0-9]+\.h), \1, z2\.h +** fadd z0\.h, \2, z3\.h +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svfloat16_t __attribute__((noipa)) +caller_f16 (void) +{ + svfloat16x4_t res; + res = callee_f16 (); + return svadd_x (svptrue_b16 (), + svmul_x (svptrue_b16 (), + svadd_x (svptrue_b16 (), svget4 (res, 0), + svget4 (res, 1)), + svget4 (res, 2)), + svget4 (res, 3)); +} + +/* +** callee_s32: +** mov z0\.s, #1 +** mov z1\.s, #2 +** mov z2\.s, #3 +** mov z3\.s, #4 +** ret +*/ +svint32x4_t __attribute__((noipa)) +callee_s32 (void) +{ + return svcreate4 (svdup_s32 (1), svdup_s32 (2), + svdup_s32 (3), svdup_s32 (4)); +} + +/* +** caller_s32: +** ... +** bl callee_s32 +** add (z[2-7]\.s), z2\.s, z3\.s +** ptrue (p[0-7])\.b, all +** msb z0\.s, \2/m, (z1\.s, \1|\1, z1\.s) +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svint32_t __attribute__((noipa)) +caller_s32 (void) +{ + svint32x4_t res; + res = callee_s32 (); + return svmsb_x (svptrue_b32 (), svget4 (res, 0), svget4 (res, 1), + svadd_x (svptrue_b32 (), + svget4 (res, 2), + svget4 (res, 3))); +} + +/* +** callee_u32: +** mov z0\.s, #4 +** mov z1\.s, #5 +** mov z2\.s, #6 +** mov z3\.s, #7 +** ret +*/ +svuint32x4_t __attribute__((noipa)) +callee_u32 (void) +{ + return svcreate4 (svdup_u32 (4), svdup_u32 (5), + svdup_u32 (6), svdup_u32 (7)); +} + +/* +** caller_u32: +** ... +** bl callee_u32 +** sub (z[2-7]\.s), z2\.s, z3\.s +** ptrue (p[0-7])\.b, all +** msb z0\.s, \2/m, (z1\.s, \1|\1, z1\.s) +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svuint32_t __attribute__((noipa)) +caller_u32 (void) +{ + svuint32x4_t res; + res = callee_u32 (); + return svmsb_x (svptrue_b32 (), svget4 (res, 0), svget4 (res, 1), + svsub_x (svptrue_b32 (), + svget4 (res, 2), + svget4 (res, 3))); +} + +/* +** callee_f32: +** fmov z0\.s, #1\.0(?:e\+0)? +** fmov z1\.s, #2\.0(?:e\+0)? +** fmov z2\.s, #3\.0(?:e\+0)? +** fmov z3\.s, #4\.0(?:e\+0)? +** ret +*/ +svfloat32x4_t __attribute__((noipa)) +callee_f32 (void) +{ + return svcreate4 (svdup_f32 (1), svdup_f32 (2), + svdup_f32 (3), svdup_f32 (4)); +} + +/* +** caller_f32: +** ... 
+** bl callee_f32 +** fadd (z[0-9]+\.s), z0\.s, z1\.s +** fmul (z[0-9]+\.s), \1, z2\.s +** fadd z0\.s, \2, z3\.s +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svfloat32_t __attribute__((noipa)) +caller_f32 (void) +{ + svfloat32x4_t res; + res = callee_f32 (); + return svadd_x (svptrue_b32 (), + svmul_x (svptrue_b32 (), + svadd_x (svptrue_b32 (), svget4 (res, 0), + svget4 (res, 1)), + svget4 (res, 2)), + svget4 (res, 3)); +} + +/* +** callee_s64: +** mov z0\.d, #1 +** mov z1\.d, #2 +** mov z2\.d, #3 +** mov z3\.d, #4 +** ret +*/ +svint64x4_t __attribute__((noipa)) +callee_s64 (void) +{ + return svcreate4 (svdup_s64 (1), svdup_s64 (2), + svdup_s64 (3), svdup_s64 (4)); +} + +/* +** caller_s64: +** ... +** bl callee_s64 +** add (z[2-7]\.d), z2\.d, z3\.d +** ptrue (p[0-7])\.b, all +** mls z0\.d, \2/m, (z1\.d, \1|\1, z1\.d) +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svint64_t __attribute__((noipa)) +caller_s64 (void) +{ + svint64x4_t res; + res = callee_s64 (); + return svmls_x (svptrue_b64 (), svget4 (res, 0), svget4 (res, 1), + svadd_x (svptrue_b64 (), + svget4 (res, 2), + svget4 (res, 3))); +} + +/* +** callee_u64: +** mov z0\.d, #4 +** mov z1\.d, #5 +** mov z2\.d, #6 +** mov z3\.d, #7 +** ret +*/ +svuint64x4_t __attribute__((noipa)) +callee_u64 (void) +{ + return svcreate4 (svdup_u64 (4), svdup_u64 (5), + svdup_u64 (6), svdup_u64 (7)); +} + +/* +** caller_u64: +** ... +** bl callee_u64 +** sub (z[2-7]\.d), z2\.d, z3\.d +** ptrue (p[0-7])\.b, all +** mls z0\.d, \2/m, (z1\.d, \1|\1, z1\.d) +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svuint64_t __attribute__((noipa)) +caller_u64 (void) +{ + svuint64x4_t res; + res = callee_u64 (); + return svmls_x (svptrue_b64 (), svget4 (res, 0), svget4 (res, 1), + svsub_x (svptrue_b64 (), + svget4 (res, 2), + svget4 (res, 3))); +} + +/* +** callee_f64: +** fmov z0\.d, #1\.0(?:e\+0)? +** fmov z1\.d, #2\.0(?:e\+0)? +** fmov z2\.d, #3\.0(?:e\+0)? +** fmov z3\.d, #4\.0(?:e\+0)? +** ret +*/ +svfloat64x4_t __attribute__((noipa)) +callee_f64 (void) +{ + return svcreate4 (svdup_f64 (1), svdup_f64 (2), + svdup_f64 (3), svdup_f64 (4)); +} + +/* +** caller_f64: +** ... 
+** bl callee_f64 +** fadd (z[0-9]+\.d), z0\.d, z1\.d +** fmul (z[0-9]+\.d), \1, z2\.d +** fadd z0\.d, \2, z3\.d +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svfloat64_t __attribute__((noipa)) +caller_f64 (void) +{ + svfloat64x4_t res; + res = callee_f64 (); + return svadd_x (svptrue_b64 (), + svmul_x (svptrue_b64 (), + svadd_x (svptrue_b64 (), svget4 (res, 0), + svget4 (res, 1)), + svget4 (res, 2)), + svget4 (res, 3)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_be_nowrap.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_be_nowrap.c new file mode 100644 index 0000000..4eee042 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_be_nowrap.c @@ -0,0 +1,196 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-shrink-wrap -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** test_1: +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** ptrue p1\.b, all +** st1d z8\.d, p1, \[sp, #1, mul vl\] +** st1d z9\.d, p1, \[sp, #2, mul vl\] +** st1d z10\.d, p1, \[sp, #3, mul vl\] +** st1d z11\.d, p1, \[sp, #4, mul vl\] +** st1d z12\.d, p1, \[sp, #5, mul vl\] +** st1d z13\.d, p1, \[sp, #6, mul vl\] +** st1d z14\.d, p1, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** st1d z15\.d, p1, \[x11, #-8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** ptrue p0\.b, all +** ptrue p1\.b, all +** ld1d z8\.d, p1/z, \[sp, #1, mul vl\] +** ld1d z9\.d, p1/z, \[sp, #2, mul vl\] +** ld1d z10\.d, p1/z, \[sp, #3, mul vl\] +** ld1d z11\.d, p1/z, \[sp, #4, mul vl\] +** ld1d z12\.d, p1/z, \[sp, #5, mul vl\] +** ld1d z13\.d, p1/z, \[sp, #6, mul vl\] +** ld1d z14\.d, p1/z, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** ld1d z15\.d, p1/z, \[x11, #-8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ret +*/ +svbool_t +test_1 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z8", "z9", "z10", "z11", "z12", "z13", "z14", "z15", + "z16", "z17", "z18", "z19", "z20", "z21", "z22", "z23", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", + "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_2: +** ptrue p0\.b, all +** ret +*/ +svbool_t +test_2 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_3: +** addvl sp, sp, #-6 +** str p5, \[sp\] +** str p6, \[sp, #1, mul vl\] +** str p11, 
\[sp, #2, mul vl\] +** ptrue p1\.b, all +** st1d z8\.d, p1, \[sp, #1, mul vl\] +** st1d z13\.d, p1, \[sp, #2, mul vl\] +** str z19, \[sp, #3, mul vl\] +** str z20, \[sp, #4, mul vl\] +** str z22, \[sp, #5, mul vl\] +** ptrue p0\.b, all +** ptrue p1\.b, all +** ld1d z8\.d, p1/z, \[sp, #1, mul vl\] +** ld1d z13\.d, p1/z, \[sp, #2, mul vl\] +** ldr z19, \[sp, #3, mul vl\] +** ldr z20, \[sp, #4, mul vl\] +** ldr z22, \[sp, #5, mul vl\] +** ldr p5, \[sp\] +** ldr p6, \[sp, #1, mul vl\] +** ldr p11, \[sp, #2, mul vl\] +** addvl sp, sp, #6 +** ret +*/ +svbool_t +test_3 (void) +{ + asm volatile ("" ::: + "z8", "z13", "z19", "z20", "z22", + "p5", "p6", "p11"); + return svptrue_b8 (); +} + +/* +** test_4: +** addvl sp, sp, #-1 +** str p4, \[sp\] +** ptrue p0\.b, all +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_4 (void) +{ + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_5: +** addvl sp, sp, #-1 +** ptrue p1\.b, all +** st1d z15\.d, p1, \[sp\] +** ptrue p0\.b, all +** ptrue p1\.b, all +** ld1d z15\.d, p1/z, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_5 (void) +{ + asm volatile ("" ::: "z15"); + return svptrue_b8 (); +} + +/* +** test_6: +** addvl sp, sp, #-2 +** str p4, \[sp\] +** ptrue p4\.b, all +** st1d z15\.d, p4, \[sp, #1, mul vl\] +** mov z0\.b, #1 +** ptrue p4\.b, all +** ld1d z15\.d, p4/z, \[sp, #1, mul vl\] +** ldr p4, \[sp\] +** addvl sp, sp, #2 +** ret +*/ +svint8_t +test_6 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + asm volatile ("" :: "Upa" (p0), "Upa" (p1), "Upa" (p2), "Upa" (p3) : "z15"); + return svdup_s8 (1); +} + +/* +** test_7: +** addvl sp, sp, #-1 +** str z16, \[sp\] +** ptrue p0\.b, all +** ldr z16, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_7 (void) +{ + asm volatile ("" ::: "z16"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_be_wrap.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_be_wrap.c new file mode 100644 index 0000000..e88a3dd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_be_wrap.c @@ -0,0 +1,196 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fshrink-wrap -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** test_1: +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** ptrue p1\.b, all +** st1d z8\.d, p1, \[sp, #1, mul vl\] +** st1d z9\.d, p1, \[sp, #2, mul vl\] +** st1d z10\.d, p1, \[sp, #3, mul vl\] +** st1d z11\.d, p1, \[sp, #4, mul vl\] +** st1d z12\.d, p1, \[sp, #5, mul vl\] +** st1d z13\.d, p1, \[sp, #6, mul vl\] +** st1d z14\.d, p1, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** st1d z15\.d, p1, \[x11, #-8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** ptrue p0\.b, all +** ptrue p1\.b, all +** ld1d z8\.d, p1/z, \[sp, #1, mul vl\] +** ld1d z9\.d, p1/z, \[sp, #2, mul vl\] +** ld1d z10\.d, p1/z, \[sp, #3, mul vl\] +** ld1d z11\.d, p1/z, \[sp, #4, mul vl\] +** ld1d z12\.d, p1/z, \[sp, #5, mul vl\] +** ld1d z13\.d, p1/z, \[sp, #6, mul vl\] +** ld1d z14\.d, p1/z, \[sp, #7, mul vl\] +** 
addvl x11, sp, #16 +** ld1d z15\.d, p1/z, \[x11, #-8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ret +*/ +svbool_t +test_1 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z8", "z9", "z10", "z11", "z12", "z13", "z14", "z15", + "z16", "z17", "z18", "z19", "z20", "z21", "z22", "z23", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", + "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_2: +** ptrue p0\.b, all +** ret +*/ +svbool_t +test_2 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_3: +** addvl sp, sp, #-6 +** str p5, \[sp\] +** str p6, \[sp, #1, mul vl\] +** str p11, \[sp, #2, mul vl\] +** ptrue p1\.b, all +** st1d z8\.d, p1, \[sp, #1, mul vl\] +** st1d z13\.d, p1, \[sp, #2, mul vl\] +** str z19, \[sp, #3, mul vl\] +** str z20, \[sp, #4, mul vl\] +** str z22, \[sp, #5, mul vl\] +** ptrue p0\.b, all +** ptrue p1\.b, all +** ld1d z8\.d, p1/z, \[sp, #1, mul vl\] +** ld1d z13\.d, p1/z, \[sp, #2, mul vl\] +** ldr z19, \[sp, #3, mul vl\] +** ldr z20, \[sp, #4, mul vl\] +** ldr z22, \[sp, #5, mul vl\] +** ldr p5, \[sp\] +** ldr p6, \[sp, #1, mul vl\] +** ldr p11, \[sp, #2, mul vl\] +** addvl sp, sp, #6 +** ret +*/ +svbool_t +test_3 (void) +{ + asm volatile ("" ::: + "z8", "z13", "z19", "z20", "z22", + "p5", "p6", "p11"); + return svptrue_b8 (); +} + +/* +** test_4: +** addvl sp, sp, #-1 +** str p4, \[sp\] +** ptrue p0\.b, all +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_4 (void) +{ + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_5: +** addvl sp, sp, #-1 +** ptrue p1\.b, all +** st1d z15\.d, p1, \[sp\] +** ptrue p0\.b, all +** ptrue p1\.b, all +** ld1d z15\.d, p1/z, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_5 (void) +{ + asm volatile ("" ::: "z15"); + return svptrue_b8 (); +} + +/* +** test_6: +** addvl sp, sp, #-2 +** str p4, \[sp\] +** ptrue p4\.b, all +** st1d z15\.d, p4, \[sp, #1, mul vl\] +** mov z0\.b, #1 +** ptrue p4\.b, all +** ld1d z15\.d, p4/z, \[sp, #1, mul vl\] +** ldr p4, \[sp\] +** addvl sp, sp, #2 +** ret +*/ +svint8_t +test_6 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + asm volatile ("" :: "Upa" (p0), "Upa" (p1), "Upa" (p2), "Upa" (p3) : "z15"); + return svdup_s8 (1); +} + +/* +** test_7: +** addvl sp, sp, #-1 +** str z16, \[sp\] +** ptrue p0\.b, all +** ldr z16, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_7 (void) +{ + asm volatile ("" ::: "z16"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_le_nowrap.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_le_nowrap.c new file mode 100644 index 0000000..d14cd79 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_le_nowrap.c @@ -0,0 +1,184 @@ +/* { dg-do compile } */ +/* { 
dg-options "-O -mlittle-endian -fno-shrink-wrap -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** test_1: +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** ptrue p0\.b, all +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ret +*/ +svbool_t +test_1 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z8", "z9", "z10", "z11", "z12", "z13", "z14", "z15", + "z16", "z17", "z18", "z19", "z20", "z21", "z22", "z23", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", + "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_2: +** ptrue p0\.b, all +** ret +*/ +svbool_t +test_2 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_3: +** addvl sp, sp, #-6 +** str p5, \[sp\] +** str p6, \[sp, #1, mul vl\] +** str p11, \[sp, #2, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z13, \[sp, #2, mul vl\] +** str z19, \[sp, #3, mul vl\] +** str z20, \[sp, #4, mul vl\] +** str z22, \[sp, #5, mul vl\] +** ptrue p0\.b, all +** ldr z8, \[sp, #1, mul vl\] +** ldr z13, \[sp, #2, mul vl\] +** ldr z19, \[sp, #3, mul vl\] +** ldr z20, \[sp, #4, mul vl\] +** ldr z22, \[sp, #5, mul vl\] +** ldr p5, \[sp\] +** ldr p6, \[sp, #1, mul vl\] +** ldr p11, \[sp, #2, mul vl\] +** addvl sp, sp, #6 +** ret +*/ +svbool_t +test_3 (void) +{ + asm volatile ("" ::: + "z8", "z13", "z19", "z20", "z22", + "p5", "p6", "p11"); + return svptrue_b8 (); +} + +/* +** test_4: +** addvl sp, sp, #-1 +** str p4, \[sp\] +** ptrue p0\.b, all +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_4 (void) +{ + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_5: +** addvl sp, sp, #-1 +** str z15, \[sp\] +** ptrue p0\.b, all +** ldr z15, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t 
+test_5 (void) +{ + asm volatile ("" ::: "z15"); + return svptrue_b8 (); +} + +/* +** test_6: +** addvl sp, sp, #-1 +** str z15, \[sp\] +** mov z0\.b, #1 +** ldr z15, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svint8_t +test_6 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + asm volatile ("" :: "Upa" (p0), "Upa" (p1), "Upa" (p2), "Upa" (p3) : "z15"); + return svdup_s8 (1); +} + +/* +** test_7: +** addvl sp, sp, #-1 +** str z16, \[sp\] +** ptrue p0\.b, all +** ldr z16, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_7 (void) +{ + asm volatile ("" ::: "z16"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_le_wrap.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_le_wrap.c new file mode 100644 index 0000000..d81dd8e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_1_le_wrap.c @@ -0,0 +1,184 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fshrink-wrap -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** test_1: +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** ptrue p0\.b, all +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ret +*/ +svbool_t +test_1 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z8", "z9", "z10", "z11", "z12", "z13", "z14", "z15", + "z16", "z17", "z18", "z19", "z20", "z21", "z22", "z23", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", + "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_2: +** ptrue p0\.b, all +** ret +*/ +svbool_t +test_2 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_3: +** addvl sp, sp, #-6 +** str p5, \[sp\] +** str p6, \[sp, #1, mul vl\] +** str p11, \[sp, #2, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z13, 
\[sp, #2, mul vl\] +** str z19, \[sp, #3, mul vl\] +** str z20, \[sp, #4, mul vl\] +** str z22, \[sp, #5, mul vl\] +** ptrue p0\.b, all +** ldr z8, \[sp, #1, mul vl\] +** ldr z13, \[sp, #2, mul vl\] +** ldr z19, \[sp, #3, mul vl\] +** ldr z20, \[sp, #4, mul vl\] +** ldr z22, \[sp, #5, mul vl\] +** ldr p5, \[sp\] +** ldr p6, \[sp, #1, mul vl\] +** ldr p11, \[sp, #2, mul vl\] +** addvl sp, sp, #6 +** ret +*/ +svbool_t +test_3 (void) +{ + asm volatile ("" ::: + "z8", "z13", "z19", "z20", "z22", + "p5", "p6", "p11"); + return svptrue_b8 (); +} + +/* +** test_4: +** addvl sp, sp, #-1 +** str p4, \[sp\] +** ptrue p0\.b, all +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_4 (void) +{ + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_5: +** addvl sp, sp, #-1 +** str z15, \[sp\] +** ptrue p0\.b, all +** ldr z15, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_5 (void) +{ + asm volatile ("" ::: "z15"); + return svptrue_b8 (); +} + +/* +** test_6: +** addvl sp, sp, #-1 +** str z15, \[sp\] +** mov z0\.b, #1 +** ldr z15, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svint8_t +test_6 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + asm volatile ("" :: "Upa" (p0), "Upa" (p1), "Upa" (p2), "Upa" (p3) : "z15"); + return svdup_s8 (1); +} + +/* +** test_7: +** addvl sp, sp, #-1 +** str z16, \[sp\] +** ptrue p0\.b, all +** ldr z16, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_7 (void) +{ + asm volatile ("" ::: "z16"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_be_nowrap.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_be_nowrap.c new file mode 100644 index 0000000..d72601b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_be_nowrap.c @@ -0,0 +1,271 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-shrink-wrap -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void standard_callee (void); +__attribute__((aarch64_vector_pcs)) void vpcs_callee (void); + +/* +** calls_standard: +** stp x29, x30, \[sp, -16\]! 
+** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** ptrue p0\.b, all +** st1d z8\.d, p0, \[sp, #1, mul vl\] +** st1d z9\.d, p0, \[sp, #2, mul vl\] +** st1d z10\.d, p0, \[sp, #3, mul vl\] +** st1d z11\.d, p0, \[sp, #4, mul vl\] +** st1d z12\.d, p0, \[sp, #5, mul vl\] +** st1d z13\.d, p0, \[sp, #6, mul vl\] +** st1d z14\.d, p0, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** st1d z15\.d, p0, \[x11, #-8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** bl standard_callee +** ptrue p0\.b, all +** ld1d z8\.d, p0/z, \[sp, #1, mul vl\] +** ld1d z9\.d, p0/z, \[sp, #2, mul vl\] +** ld1d z10\.d, p0/z, \[sp, #3, mul vl\] +** ld1d z11\.d, p0/z, \[sp, #4, mul vl\] +** ld1d z12\.d, p0/z, \[sp, #5, mul vl\] +** ld1d z13\.d, p0/z, \[sp, #6, mul vl\] +** ld1d z14\.d, p0/z, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** ld1d z15\.d, p0/z, \[x11, #-8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void calls_standard (__SVInt8_t x) { standard_callee (); } + +/* +** calls_vpcs: +** stp x29, x30, \[sp, -16\]! 
+** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** ptrue p0\.b, all +** st1d z8\.d, p0, \[sp, #1, mul vl\] +** st1d z9\.d, p0, \[sp, #2, mul vl\] +** st1d z10\.d, p0, \[sp, #3, mul vl\] +** st1d z11\.d, p0, \[sp, #4, mul vl\] +** st1d z12\.d, p0, \[sp, #5, mul vl\] +** st1d z13\.d, p0, \[sp, #6, mul vl\] +** st1d z14\.d, p0, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** st1d z15\.d, p0, \[x11, #-8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** bl vpcs_callee +** ptrue p0\.b, all +** ld1d z8\.d, p0/z, \[sp, #1, mul vl\] +** ld1d z9\.d, p0/z, \[sp, #2, mul vl\] +** ld1d z10\.d, p0/z, \[sp, #3, mul vl\] +** ld1d z11\.d, p0/z, \[sp, #4, mul vl\] +** ld1d z12\.d, p0/z, \[sp, #5, mul vl\] +** ld1d z13\.d, p0/z, \[sp, #6, mul vl\] +** ld1d z14\.d, p0/z, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** ld1d z15\.d, p0/z, \[x11, #-8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void calls_vpcs (__SVInt8_t x) { vpcs_callee (); } + +/* +** calls_standard_ptr: +** stp x29, x30, \[sp, -16\]! 
+** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** ptrue p0\.b, all +** st1d z8\.d, p0, \[sp, #1, mul vl\] +** st1d z9\.d, p0, \[sp, #2, mul vl\] +** st1d z10\.d, p0, \[sp, #3, mul vl\] +** st1d z11\.d, p0, \[sp, #4, mul vl\] +** st1d z12\.d, p0, \[sp, #5, mul vl\] +** st1d z13\.d, p0, \[sp, #6, mul vl\] +** st1d z14\.d, p0, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** st1d z15\.d, p0, \[x11, #-8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** blr x0 +** ptrue p0\.b, all +** ld1d z8\.d, p0/z, \[sp, #1, mul vl\] +** ld1d z9\.d, p0/z, \[sp, #2, mul vl\] +** ld1d z10\.d, p0/z, \[sp, #3, mul vl\] +** ld1d z11\.d, p0/z, \[sp, #4, mul vl\] +** ld1d z12\.d, p0/z, \[sp, #5, mul vl\] +** ld1d z13\.d, p0/z, \[sp, #6, mul vl\] +** ld1d z14\.d, p0/z, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** ld1d z15\.d, p0/z, \[x11, #-8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void +calls_standard_ptr (__SVInt8_t x, void (*fn) (void)) +{ + fn (); +} + +/* +** calls_vpcs_ptr: +** stp x29, x30, \[sp, -16\]! 
+** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** ptrue p0\.b, all +** st1d z8\.d, p0, \[sp, #1, mul vl\] +** st1d z9\.d, p0, \[sp, #2, mul vl\] +** st1d z10\.d, p0, \[sp, #3, mul vl\] +** st1d z11\.d, p0, \[sp, #4, mul vl\] +** st1d z12\.d, p0, \[sp, #5, mul vl\] +** st1d z13\.d, p0, \[sp, #6, mul vl\] +** st1d z14\.d, p0, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** st1d z15\.d, p0, \[x11, #-8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** blr x0 +** ptrue p0\.b, all +** ld1d z8\.d, p0/z, \[sp, #1, mul vl\] +** ld1d z9\.d, p0/z, \[sp, #2, mul vl\] +** ld1d z10\.d, p0/z, \[sp, #3, mul vl\] +** ld1d z11\.d, p0/z, \[sp, #4, mul vl\] +** ld1d z12\.d, p0/z, \[sp, #5, mul vl\] +** ld1d z13\.d, p0/z, \[sp, #6, mul vl\] +** ld1d z14\.d, p0/z, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** ld1d z15\.d, p0/z, \[x11, #-8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void +calls_vpcs_ptr (__SVInt8_t x, + void (*__attribute__((aarch64_vector_pcs)) fn) (void)) +{ + fn (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_be_wrap.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_be_wrap.c new file mode 100644 index 0000000..f715f01 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_be_wrap.c @@ -0,0 +1,271 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fshrink-wrap -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void standard_callee (void); +__attribute__((aarch64_vector_pcs)) void vpcs_callee (void); + +/* +** calls_standard: +** stp x29, x30, \[sp, -16\]! 
+** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** ptrue p0\.b, all +** st1d z8\.d, p0, \[sp, #1, mul vl\] +** st1d z9\.d, p0, \[sp, #2, mul vl\] +** st1d z10\.d, p0, \[sp, #3, mul vl\] +** st1d z11\.d, p0, \[sp, #4, mul vl\] +** st1d z12\.d, p0, \[sp, #5, mul vl\] +** st1d z13\.d, p0, \[sp, #6, mul vl\] +** st1d z14\.d, p0, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** st1d z15\.d, p0, \[x11, #-8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** bl standard_callee +** ptrue p0\.b, all +** ld1d z8\.d, p0/z, \[sp, #1, mul vl\] +** ld1d z9\.d, p0/z, \[sp, #2, mul vl\] +** ld1d z10\.d, p0/z, \[sp, #3, mul vl\] +** ld1d z11\.d, p0/z, \[sp, #4, mul vl\] +** ld1d z12\.d, p0/z, \[sp, #5, mul vl\] +** ld1d z13\.d, p0/z, \[sp, #6, mul vl\] +** ld1d z14\.d, p0/z, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** ld1d z15\.d, p0/z, \[x11, #-8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void calls_standard (__SVInt8_t x) { standard_callee (); } + +/* +** calls_vpcs: +** stp x29, x30, \[sp, -16\]! 
+** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** ptrue p0\.b, all +** st1d z8\.d, p0, \[sp, #1, mul vl\] +** st1d z9\.d, p0, \[sp, #2, mul vl\] +** st1d z10\.d, p0, \[sp, #3, mul vl\] +** st1d z11\.d, p0, \[sp, #4, mul vl\] +** st1d z12\.d, p0, \[sp, #5, mul vl\] +** st1d z13\.d, p0, \[sp, #6, mul vl\] +** st1d z14\.d, p0, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** st1d z15\.d, p0, \[x11, #-8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** bl vpcs_callee +** ptrue p0\.b, all +** ld1d z8\.d, p0/z, \[sp, #1, mul vl\] +** ld1d z9\.d, p0/z, \[sp, #2, mul vl\] +** ld1d z10\.d, p0/z, \[sp, #3, mul vl\] +** ld1d z11\.d, p0/z, \[sp, #4, mul vl\] +** ld1d z12\.d, p0/z, \[sp, #5, mul vl\] +** ld1d z13\.d, p0/z, \[sp, #6, mul vl\] +** ld1d z14\.d, p0/z, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** ld1d z15\.d, p0/z, \[x11, #-8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void calls_vpcs (__SVInt8_t x) { vpcs_callee (); } + +/* +** calls_standard_ptr: +** stp x29, x30, \[sp, -16\]! 
+** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** ptrue p0\.b, all +** st1d z8\.d, p0, \[sp, #1, mul vl\] +** st1d z9\.d, p0, \[sp, #2, mul vl\] +** st1d z10\.d, p0, \[sp, #3, mul vl\] +** st1d z11\.d, p0, \[sp, #4, mul vl\] +** st1d z12\.d, p0, \[sp, #5, mul vl\] +** st1d z13\.d, p0, \[sp, #6, mul vl\] +** st1d z14\.d, p0, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** st1d z15\.d, p0, \[x11, #-8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** blr x0 +** ptrue p0\.b, all +** ld1d z8\.d, p0/z, \[sp, #1, mul vl\] +** ld1d z9\.d, p0/z, \[sp, #2, mul vl\] +** ld1d z10\.d, p0/z, \[sp, #3, mul vl\] +** ld1d z11\.d, p0/z, \[sp, #4, mul vl\] +** ld1d z12\.d, p0/z, \[sp, #5, mul vl\] +** ld1d z13\.d, p0/z, \[sp, #6, mul vl\] +** ld1d z14\.d, p0/z, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** ld1d z15\.d, p0/z, \[x11, #-8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void +calls_standard_ptr (__SVInt8_t x, void (*fn) (void)) +{ + fn (); +} + +/* +** calls_vpcs_ptr: +** stp x29, x30, \[sp, -16\]! 
+** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** ptrue p0\.b, all +** st1d z8\.d, p0, \[sp, #1, mul vl\] +** st1d z9\.d, p0, \[sp, #2, mul vl\] +** st1d z10\.d, p0, \[sp, #3, mul vl\] +** st1d z11\.d, p0, \[sp, #4, mul vl\] +** st1d z12\.d, p0, \[sp, #5, mul vl\] +** st1d z13\.d, p0, \[sp, #6, mul vl\] +** st1d z14\.d, p0, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** st1d z15\.d, p0, \[x11, #-8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** blr x0 +** ptrue p0\.b, all +** ld1d z8\.d, p0/z, \[sp, #1, mul vl\] +** ld1d z9\.d, p0/z, \[sp, #2, mul vl\] +** ld1d z10\.d, p0/z, \[sp, #3, mul vl\] +** ld1d z11\.d, p0/z, \[sp, #4, mul vl\] +** ld1d z12\.d, p0/z, \[sp, #5, mul vl\] +** ld1d z13\.d, p0/z, \[sp, #6, mul vl\] +** ld1d z14\.d, p0/z, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** ld1d z15\.d, p0/z, \[x11, #-8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void +calls_vpcs_ptr (__SVInt8_t x, + void (*__attribute__((aarch64_vector_pcs)) fn) (void)) +{ + fn (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_le_nowrap.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_le_nowrap.c new file mode 100644 index 0000000..cb709e7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_le_nowrap.c @@ -0,0 +1,255 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-shrink-wrap -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void standard_callee (void); +__attribute__((aarch64_vector_pcs)) void vpcs_callee (void); + +/* +** calls_standard: +** stp x29, x30, \[sp, -16\]! 
+** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** bl standard_callee +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void calls_standard (__SVInt8_t x) { standard_callee (); } + +/* +** calls_vpcs: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** bl vpcs_callee +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void calls_vpcs (__SVInt8_t x) { vpcs_callee (); } + +/* +** calls_standard_ptr: +** stp x29, x30, \[sp, -16\]! 
+** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** blr x0 +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void +calls_standard_ptr (__SVInt8_t x, void (*fn) (void)) +{ + fn (); +} + +/* +** calls_vpcs_ptr: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** blr x0 +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void +calls_vpcs_ptr (__SVInt8_t x, + void (*__attribute__((aarch64_vector_pcs)) fn) (void)) +{ + fn (); +} diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_le_wrap.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_le_wrap.c new file mode 100644 index 0000000..ef24c7a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_2_le_wrap.c @@ -0,0 +1,255 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fshrink-wrap -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void standard_callee (void); +__attribute__((aarch64_vector_pcs)) void vpcs_callee (void); + +/* +** calls_standard: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** bl standard_callee +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void calls_standard (__SVInt8_t x) { standard_callee (); } + +/* +** calls_vpcs: +** stp x29, x30, \[sp, -16\]! 
+** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** bl vpcs_callee +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void calls_vpcs (__SVInt8_t x) { vpcs_callee (); } + +/* +** calls_standard_ptr: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** blr x0 +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void +calls_standard_ptr (__SVInt8_t x, void (*fn) (void)) +{ + fn (); +} + +/* +** calls_vpcs_ptr: +** stp x29, x30, \[sp, -16\]! 
+** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** blr x0 +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +void +calls_vpcs_ptr (__SVInt8_t x, + void (*__attribute__((aarch64_vector_pcs)) fn) (void)) +{ + fn (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_3.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_3.c new file mode 100644 index 0000000..283c5bb --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_3.c @@ -0,0 +1,92 @@ +/* { dg-do compile } */ +/* { dg-options "-O -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <arm_sve.h> + +int sve_callee (svint8_t); + +/* +** standard_caller: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** mov z0\.b, #1 +** bl sve_callee +** add w0, w0, #?1 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +int standard_caller (void) { return sve_callee (svdup_s8 (1)) + 1; } + +/* +** vpcs_caller: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** mov z0\.b, #1 +** bl sve_callee +** add w0, w0, #?1 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +__attribute__((aarch64_vector_pcs)) +int vpcs_caller (void) { return sve_callee (svdup_s8 (1)) + 1; } + +/* +** sve_caller: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** mov z0\.b, #1 +** bl sve_callee +** add w0, w0, #?1 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +int sve_caller (svbool_t p0) { return sve_callee (svdup_s8 (1)) + 1; } + +/* +** standard_caller_ptr: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** mov z0\.h, #1 +** blr x0 +** add w0, w0, #?1 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +int +standard_caller_ptr (int (*fn) (__SVInt16_t)) +{ + return fn (svdup_s16 (1)) + 1; +} + +/* +** vpcs_caller_ptr: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** mov z0\.h, #1 +** blr x0 +** add w0, w0, #?1 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +int __attribute__((aarch64_vector_pcs)) +vpcs_caller_ptr (int (*fn) (__SVInt16_t)) +{ + return fn (svdup_s16 (1)) + 1; +} + +/* +** sve_caller_ptr: +** stp x29, x30, \[sp, -16\]!
+** mov x29, sp +** mov z0\.h, #1 +** blr x0 +** add w0, w0, #?1 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +int +sve_caller_ptr (svbool_t pg, int (*fn) (svint16_t)) +{ + return fn (svdup_s16 (1)) + 1; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_4_be.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_4_be.c new file mode 100644 index 0000000..aaf8abd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_4_be.c @@ -0,0 +1,84 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void standard_callee (__SVInt8_t *); + +/* +** calls_standard: +** addvl sp, sp, #-1 +** ( +** stp x29, x30, \[sp, -16\]! +** | +** sub sp, sp, #?16 +** stp x29, x30, \[sp\] +** ) +** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** ptrue p0\.b, all +** st1d z8\.d, p0, \[sp, #1, mul vl\] +** st1d z9\.d, p0, \[sp, #2, mul vl\] +** st1d z10\.d, p0, \[sp, #3, mul vl\] +** st1d z11\.d, p0, \[sp, #4, mul vl\] +** st1d z12\.d, p0, \[sp, #5, mul vl\] +** st1d z13\.d, p0, \[sp, #6, mul vl\] +** st1d z14\.d, p0, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** st1d z15\.d, p0, \[x11, #-8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** addvl x0, sp, #17 +** add x0, x0, #?16 +** bl standard_callee +** ptrue p0\.b, all +** ld1d z8\.d, p0/z, \[sp, #1, mul vl\] +** ld1d z9\.d, p0/z, \[sp, #2, mul vl\] +** ld1d z10\.d, p0/z, \[sp, #3, mul vl\] +** ld1d z11\.d, p0/z, \[sp, #4, mul vl\] +** ld1d z12\.d, p0/z, \[sp, #5, mul vl\] +** ld1d z13\.d, p0/z, \[sp, #6, mul vl\] +** ld1d z14\.d, p0/z, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** ld1d z15\.d, p0/z, \[x11, #-8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ( +** ldp x29, x30, \[sp\], 16 +** addvl sp, sp, #1 +** | +** ldp x29, x30, \[sp\] +** addvl sp, sp, #1 +** add sp, sp, #?16 +** ) +** ret +*/ +void calls_standard (__SVInt8_t x) { __SVInt8_t tmp; standard_callee (&tmp); } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_4_le.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_4_le.c new file mode 100644 index 0000000..648f8a0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_4_le.c @@ -0,0 +1,80 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void standard_callee (__SVInt8_t *); + +/* +** calls_standard: +** addvl sp, sp, #-1 +** ( +** stp x29, x30, \[sp, -16\]! 
+** | +** sub sp, sp, #?16 +** stp x29, x30, \[sp\] +** ) +** mov x29, sp +** addvl sp, sp, #-17 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** addvl x0, sp, #17 +** add x0, x0, #?16 +** bl standard_callee +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ( +** ldp x29, x30, \[sp\], 16 +** addvl sp, sp, #1 +** | +** ldp x29, x30, \[sp\] +** addvl sp, sp, #1 +** add sp, sp, #?16 +** ) +** ret +*/ +void calls_standard (__SVInt8_t x) { __SVInt8_t tmp; standard_callee (&tmp); } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_5_be.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_5_be.c new file mode 100644 index 0000000..dc3282e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_5_be.c @@ -0,0 +1,78 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void standard_callee (void); + +/* +** calls_standard: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** addvl sp, sp, #-17 +** ptrue p0\.b, all +** st1d z8\.d, p0, \[sp, #1, mul vl\] +** st1d z9\.d, p0, \[sp, #2, mul vl\] +** st1d z10\.d, p0, \[sp, #3, mul vl\] +** st1d z11\.d, p0, \[sp, #4, mul vl\] +** st1d z12\.d, p0, \[sp, #5, mul vl\] +** st1d z13\.d, p0, \[sp, #6, mul vl\] +** st1d z14\.d, p0, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** st1d z15\.d, p0, \[x11, #-8, mul vl\] +** cbnz w0, \.L[0-9]+ +** ptrue p0\.b, all +** ld1d z8\.d, p0/z, \[sp, #1, mul vl\] +** ld1d z9\.d, p0/z, \[sp, #2, mul vl\] +** ld1d z10\.d, p0/z, \[sp, #3, mul vl\] +** ld1d z11\.d, p0/z, \[sp, #4, mul vl\] +** ld1d z12\.d, p0/z, \[sp, #5, mul vl\] +** ld1d z13\.d, p0/z, \[sp, #6, mul vl\] +** ld1d z14\.d, p0/z, \[sp, #7, mul vl\] +** addvl x11, sp, #16 +** ld1d z15\.d, p0/z, \[x11, #-8, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +** ... 
+** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** bl standard_callee +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** b \.L[0-9]+ +*/ +void +calls_standard (__SVInt8_t x, int y) +{ + asm volatile ("" ::: "z8"); + if (__builtin_expect (y, 0)) + standard_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_5_le.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_5_le.c new file mode 100644 index 0000000..0d29ff2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/saves_5_le.c @@ -0,0 +1,74 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void standard_callee (void); + +/* +** calls_standard: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** addvl sp, sp, #-17 +** str z8, \[sp, #1, mul vl\] +** cbnz w0, \.L[0-9]+ +** ldr z8, \[sp, #1, mul vl\] +** addvl sp, sp, #17 +** ldp x29, x30, \[sp\], 16 +** ret +** ... 
+** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** bl standard_callee +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** b \.L[0-9]+ +*/ +void +calls_standard (__SVInt8_t x, int y) +{ + asm volatile ("" ::: "z8"); + if (__builtin_expect (y, 0)) + standard_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1.c new file mode 100644 index 0000000..485d018 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1.c @@ -0,0 +1,204 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fshrink-wrap -fstack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** test_1: +** cntb x12 +** mov x13, #?17 +** mul x12, x12, x13 +** mov x11, sp +** ... 
+** sub sp, sp, x12 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** ptrue p0\.b, all +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** addvl sp, sp, #17 +** ret +*/ +svbool_t +test_1 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z8", "z9", "z10", "z11", "z12", "z13", "z14", "z15", + "z16", "z17", "z18", "z19", "z20", "z21", "z22", "z23", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", + "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_2: +** ptrue p0\.b, all +** ret +*/ +svbool_t +test_2 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_3: +** cntb x12, all, mul #6 +** mov x11, sp +** ... +** sub sp, sp, x12 +** str p5, \[sp\] +** str p6, \[sp, #1, mul vl\] +** str p11, \[sp, #2, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z13, \[sp, #2, mul vl\] +** str z19, \[sp, #3, mul vl\] +** str z20, \[sp, #4, mul vl\] +** str z22, \[sp, #5, mul vl\] +** ptrue p0\.b, all +** ldr z8, \[sp, #1, mul vl\] +** ldr z13, \[sp, #2, mul vl\] +** ldr z19, \[sp, #3, mul vl\] +** ldr z20, \[sp, #4, mul vl\] +** ldr z22, \[sp, #5, mul vl\] +** ldr p5, \[sp\] +** ldr p6, \[sp, #1, mul vl\] +** ldr p11, \[sp, #2, mul vl\] +** addvl sp, sp, #6 +** ret +*/ +svbool_t +test_3 (void) +{ + asm volatile ("" ::: + "z8", "z13", "z19", "z20", "z22", + "p5", "p6", "p11"); + return svptrue_b8 (); +} + +/* +** test_4: +** cntb x12 +** mov x11, sp +** ... +** sub sp, sp, x12 +** str p4, \[sp\] +** ptrue p0\.b, all +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_4 (void) +{ + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_5: +** cntb x12 +** mov x11, sp +** ... 
+** sub sp, sp, x12 +** str z15, \[sp\] +** ptrue p0\.b, all +** ldr z15, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_5 (void) +{ + asm volatile ("" ::: "z15"); + return svptrue_b8 (); +} + +/* +** test_6: +** cntb x12 +** mov x11, sp +** ... +** sub sp, sp, x12 +** str z15, \[sp\] +** mov z0\.b, #1 +** ldr z15, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svint8_t +test_6 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + asm volatile ("" :: "Upa" (p0), "Upa" (p1), "Upa" (p2), "Upa" (p3) : "z15"); + return svdup_s8 (1); +} + +/* +** test_7: +** cntb x12 +** mov x11, sp +** ... +** sub sp, sp, x12 +** str z16, \[sp\] +** ptrue p0\.b, all +** ldr z16, \[sp\] +** addvl sp, sp, #1 +** ret +*/ +svbool_t +test_7 (void) +{ + asm volatile ("" ::: "z16"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_1024.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_1024.c new file mode 100644 index 0000000..087e8db --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_1024.c @@ -0,0 +1,184 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fshrink-wrap -fstack-clash-protection -msve-vector-bits=1024 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** test_1: +** sub sp, sp, #2176 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** ptrue p0\.b, vl128 +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** add sp, sp, #?2176 +** ret +*/ +svbool_t +test_1 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z8", "z9", "z10", "z11", "z12", "z13", "z14", "z15", + "z16", "z17", "z18", "z19", "z20", "z21", "z22", "z23", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", + "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_2: +** ptrue p0\.b, vl128 +** ret +*/ +svbool_t +test_2 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", 
"p3", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_3: +** sub sp, sp, #768 +** str p5, \[sp\] +** str p6, \[sp, #1, mul vl\] +** str p11, \[sp, #2, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z13, \[sp, #2, mul vl\] +** str z19, \[sp, #3, mul vl\] +** str z20, \[sp, #4, mul vl\] +** str z22, \[sp, #5, mul vl\] +** ptrue p0\.b, vl128 +** ldr z8, \[sp, #1, mul vl\] +** ldr z13, \[sp, #2, mul vl\] +** ldr z19, \[sp, #3, mul vl\] +** ldr z20, \[sp, #4, mul vl\] +** ldr z22, \[sp, #5, mul vl\] +** ldr p5, \[sp\] +** ldr p6, \[sp, #1, mul vl\] +** ldr p11, \[sp, #2, mul vl\] +** add sp, sp, #?768 +** ret +*/ +svbool_t +test_3 (void) +{ + asm volatile ("" ::: + "z8", "z13", "z19", "z20", "z22", + "p5", "p6", "p11"); + return svptrue_b8 (); +} + +/* +** test_4: +** sub sp, sp, #128 +** str p4, \[sp\] +** ptrue p0\.b, vl128 +** ldr p4, \[sp\] +** add sp, sp, #?128 +** ret +*/ +svbool_t +test_4 (void) +{ + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_5: +** sub sp, sp, #128 +** str z15, \[sp\] +** ptrue p0\.b, vl128 +** ldr z15, \[sp\] +** add sp, sp, #?128 +** ret +*/ +svbool_t +test_5 (void) +{ + asm volatile ("" ::: "z15"); + return svptrue_b8 (); +} + +/* +** test_6: +** sub sp, sp, #128 +** str z15, \[sp\] +** mov z0\.b, #1 +** ldr z15, \[sp\] +** add sp, sp, #?128 +** ret +*/ +svint8_t +test_6 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + asm volatile ("" :: "Upa" (p0), "Upa" (p1), "Upa" (p2), "Upa" (p3) : "z15"); + return svdup_s8 (1); +} + +/* +** test_7: +** sub sp, sp, #128 +** str z16, \[sp\] +** ptrue p0\.b, vl128 +** ldr z16, \[sp\] +** add sp, sp, #?128 +** ret +*/ +svbool_t +test_7 (void) +{ + asm volatile ("" ::: "z16"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_2048.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_2048.c new file mode 100644 index 0000000..e8dc5d5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_2048.c @@ -0,0 +1,185 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fshrink-wrap -fstack-clash-protection -msve-vector-bits=2048 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** test_1: +** mov x12, #?4352 +** sub sp, sp, x12 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** ptrue p0\.b, vl256 +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, 
\[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** add sp, sp, x12 +** ret +*/ +svbool_t +test_1 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z8", "z9", "z10", "z11", "z12", "z13", "z14", "z15", + "z16", "z17", "z18", "z19", "z20", "z21", "z22", "z23", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", + "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_2: +** ptrue p0\.b, vl256 +** ret +*/ +svbool_t +test_2 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_3: +** sub sp, sp, #1536 +** str p5, \[sp\] +** str p6, \[sp, #1, mul vl\] +** str p11, \[sp, #2, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z13, \[sp, #2, mul vl\] +** str z19, \[sp, #3, mul vl\] +** str z20, \[sp, #4, mul vl\] +** str z22, \[sp, #5, mul vl\] +** ptrue p0\.b, vl256 +** ldr z8, \[sp, #1, mul vl\] +** ldr z13, \[sp, #2, mul vl\] +** ldr z19, \[sp, #3, mul vl\] +** ldr z20, \[sp, #4, mul vl\] +** ldr z22, \[sp, #5, mul vl\] +** ldr p5, \[sp\] +** ldr p6, \[sp, #1, mul vl\] +** ldr p11, \[sp, #2, mul vl\] +** add sp, sp, #?1536 +** ret +*/ +svbool_t +test_3 (void) +{ + asm volatile ("" ::: + "z8", "z13", "z19", "z20", "z22", + "p5", "p6", "p11"); + return svptrue_b8 (); +} + +/* +** test_4: +** sub sp, sp, #256 +** str p4, \[sp\] +** ptrue p0\.b, vl256 +** ldr p4, \[sp\] +** add sp, sp, #?256 +** ret +*/ +svbool_t +test_4 (void) +{ + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_5: +** sub sp, sp, #256 +** str z15, \[sp\] +** ptrue p0\.b, vl256 +** ldr z15, \[sp\] +** add sp, sp, #?256 +** ret +*/ +svbool_t +test_5 (void) +{ + asm volatile ("" ::: "z15"); + return svptrue_b8 (); +} + +/* +** test_6: +** sub sp, sp, #256 +** str z15, \[sp\] +** mov z0\.b, #1 +** ldr z15, \[sp\] +** add sp, sp, #?256 +** ret +*/ +svint8_t +test_6 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + asm volatile ("" :: "Upa" (p0), "Upa" (p1), "Upa" (p2), "Upa" (p3) : "z15"); + return svdup_s8 (1); +} + +/* +** test_7: +** sub sp, sp, #256 +** str z16, \[sp\] +** ptrue p0\.b, vl256 +** ldr z16, \[sp\] +** add sp, sp, #?256 +** ret +*/ +svbool_t +test_7 (void) +{ + asm volatile ("" ::: "z16"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_256.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_256.c new file mode 100644 index 0000000..73c49e4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_256.c @@ -0,0 +1,184 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fshrink-wrap -fstack-clash-protection -msve-vector-bits=256 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** test_1: +** sub sp, sp, #544 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] 
+** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** ptrue p0\.b, vl32 +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** add sp, sp, #?544 +** ret +*/ +svbool_t +test_1 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z8", "z9", "z10", "z11", "z12", "z13", "z14", "z15", + "z16", "z17", "z18", "z19", "z20", "z21", "z22", "z23", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", + "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_2: +** ptrue p0\.b, vl32 +** ret +*/ +svbool_t +test_2 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_3: +** sub sp, sp, #192 +** str p5, \[sp\] +** str p6, \[sp, #1, mul vl\] +** str p11, \[sp, #2, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z13, \[sp, #2, mul vl\] +** str z19, \[sp, #3, mul vl\] +** str z20, \[sp, #4, mul vl\] +** str z22, \[sp, #5, mul vl\] +** ptrue p0\.b, vl32 +** ldr z8, \[sp, #1, mul vl\] +** ldr z13, \[sp, #2, mul vl\] +** ldr z19, \[sp, #3, mul vl\] +** ldr z20, \[sp, #4, mul vl\] +** ldr z22, \[sp, #5, mul vl\] +** ldr p5, \[sp\] +** ldr p6, \[sp, #1, mul vl\] +** ldr p11, \[sp, #2, mul vl\] +** add sp, sp, #?192 +** ret +*/ +svbool_t +test_3 (void) +{ + asm volatile ("" ::: + "z8", "z13", "z19", "z20", "z22", + "p5", "p6", "p11"); + return svptrue_b8 (); +} + +/* +** test_4: +** sub sp, sp, #32 +** str p4, \[sp\] +** ptrue p0\.b, vl32 +** ldr p4, \[sp\] +** add sp, sp, #?32 +** ret +*/ +svbool_t +test_4 (void) +{ + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_5: +** sub sp, sp, #32 +** str z15, \[sp\] +** ptrue p0\.b, vl32 +** ldr z15, \[sp\] +** add sp, sp, #?32 +** ret +*/ +svbool_t +test_5 (void) +{ + asm volatile ("" ::: "z15"); + return svptrue_b8 (); +} + +/* +** test_6: +** sub sp, sp, #32 +** str z15, \[sp\] +** mov z0\.b, #1 +** ldr z15, \[sp\] +** add sp, sp, #?32 +** ret +*/ +svint8_t +test_6 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + asm volatile ("" :: "Upa" (p0), "Upa" (p1), "Upa" (p2), "Upa" (p3) : "z15"); + return svdup_s8 (1); +} + +/* +** test_7: +** sub sp, sp, #32 +** str z16, \[sp\] +** ptrue p0\.b, vl32 +** ldr z16, \[sp\] +** add sp, sp, #?32 +** ret +*/ 
+svbool_t +test_7 (void) +{ + asm volatile ("" ::: "z16"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_512.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_512.c new file mode 100644 index 0000000..d4b5241 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_512.c @@ -0,0 +1,184 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mlittle-endian -fshrink-wrap -fstack-clash-protection -msve-vector-bits=512 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** test_1: +** sub sp, sp, #1088 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z9, \[sp, #2, mul vl\] +** str z10, \[sp, #3, mul vl\] +** str z11, \[sp, #4, mul vl\] +** str z12, \[sp, #5, mul vl\] +** str z13, \[sp, #6, mul vl\] +** str z14, \[sp, #7, mul vl\] +** str z15, \[sp, #8, mul vl\] +** str z16, \[sp, #9, mul vl\] +** str z17, \[sp, #10, mul vl\] +** str z18, \[sp, #11, mul vl\] +** str z19, \[sp, #12, mul vl\] +** str z20, \[sp, #13, mul vl\] +** str z21, \[sp, #14, mul vl\] +** str z22, \[sp, #15, mul vl\] +** str z23, \[sp, #16, mul vl\] +** ptrue p0\.b, vl64 +** ldr z8, \[sp, #1, mul vl\] +** ldr z9, \[sp, #2, mul vl\] +** ldr z10, \[sp, #3, mul vl\] +** ldr z11, \[sp, #4, mul vl\] +** ldr z12, \[sp, #5, mul vl\] +** ldr z13, \[sp, #6, mul vl\] +** ldr z14, \[sp, #7, mul vl\] +** ldr z15, \[sp, #8, mul vl\] +** ldr z16, \[sp, #9, mul vl\] +** ldr z17, \[sp, #10, mul vl\] +** ldr z18, \[sp, #11, mul vl\] +** ldr z19, \[sp, #12, mul vl\] +** ldr z20, \[sp, #13, mul vl\] +** ldr z21, \[sp, #14, mul vl\] +** ldr z22, \[sp, #15, mul vl\] +** ldr z23, \[sp, #16, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** add sp, sp, #?1088 +** ret +*/ +svbool_t +test_1 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z8", "z9", "z10", "z11", "z12", "z13", "z14", "z15", + "z16", "z17", "z18", "z19", "z20", "z21", "z22", "z23", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", + "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_2: +** ptrue p0\.b, vl64 +** ret +*/ +svbool_t +test_2 (void) +{ + asm volatile ("" ::: + "z0", "z1", "z2", "z3", "z4", "z5", "z6", "z7", + "z24", "z25", "z26", "z27", "z28", "z29", "z30", "z31", + "p0", "p1", "p2", "p3", "p12", "p13", "p14", "p15"); + return svptrue_b8 (); +} + +/* +** test_3: +** sub sp, sp, #384 +** str p5, \[sp\] +** str p6, \[sp, #1, mul vl\] +** str p11, \[sp, #2, mul vl\] +** str z8, \[sp, #1, mul vl\] +** str z13, \[sp, #2, mul vl\] +** str z19, \[sp, #3, mul vl\] +** str z20, \[sp, #4, mul vl\] +** str z22, \[sp, #5, mul vl\] +** ptrue p0\.b, vl64 +** ldr z8, \[sp, #1, mul vl\] +** ldr z13, \[sp, #2, mul vl\] +** ldr z19, \[sp, #3, mul vl\] +** ldr z20, \[sp, #4, mul vl\] +** ldr z22, \[sp, #5, mul vl\] +** ldr p5, \[sp\] +** ldr p6, \[sp, #1, mul vl\] +** ldr p11, \[sp, #2, mul vl\] +** add sp, sp, #?384 +** ret +*/ +svbool_t +test_3 (void) +{ + asm volatile ("" ::: + "z8", "z13", "z19", "z20", 
"z22", + "p5", "p6", "p11"); + return svptrue_b8 (); +} + +/* +** test_4: +** sub sp, sp, #64 +** str p4, \[sp\] +** ptrue p0\.b, vl64 +** ldr p4, \[sp\] +** add sp, sp, #?64 +** ret +*/ +svbool_t +test_4 (void) +{ + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_5: +** sub sp, sp, #64 +** str z15, \[sp\] +** ptrue p0\.b, vl64 +** ldr z15, \[sp\] +** add sp, sp, #?64 +** ret +*/ +svbool_t +test_5 (void) +{ + asm volatile ("" ::: "z15"); + return svptrue_b8 (); +} + +/* +** test_6: +** sub sp, sp, #64 +** str z15, \[sp\] +** mov z0\.b, #1 +** ldr z15, \[sp\] +** add sp, sp, #?64 +** ret +*/ +svint8_t +test_6 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + asm volatile ("" :: "Upa" (p0), "Upa" (p1), "Upa" (p2), "Upa" (p3) : "z15"); + return svdup_s8 (1); +} + +/* +** test_7: +** sub sp, sp, #64 +** str z16, \[sp\] +** ptrue p0\.b, vl64 +** ldr z16, \[sp\] +** add sp, sp, #?64 +** ret +*/ +svbool_t +test_7 (void) +{ + asm volatile ("" ::: "z16"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2.c new file mode 100644 index 0000000..4622a1e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2.c @@ -0,0 +1,336 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fshrink-wrap -fstack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +svbool_t take_stack_args (volatile void *, void *, int, int, int, + int, int, int, int); + +/* +** test_1: +** cntb x12 +** add x12, x12, #?16 +** mov x11, sp +** ... +** sub sp, sp, x12 +** str p4, \[sp\] +** ... +** ptrue p0\.b, all +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** add sp, sp, #?16 +** ret +*/ +svbool_t +test_1 (void) +{ + volatile int x = 1; + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_2: +** stp x24, x25, \[sp, -48\]! +** str x26, \[sp, 16\] +** cntb x13 +** mov x11, sp +** ... +** sub sp, sp, x13 +** str p4, \[sp\] +** ... +** ptrue p0\.b, all +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ldr x26, \[sp, 16\] +** ldp x24, x25, \[sp\], 48 +** ret +*/ +svbool_t +test_2 (void) +{ + volatile int x = 1; + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_3: +** cntb x12 +** mov x13, #?4128 +** add x12, x12, x13 +** mov x11, sp +** ... +** sub sp, sp, x12 +** addvl x11, sp, #1 +** stp x24, x25, \[x11\] +** str x26, \[x11, 16\] +** str p4, \[sp\] +** ... +** ptrue p0\.b, all +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ldp x24, x25, \[sp\] +** ldr x26, \[sp, 16\] +** mov x12, #?4128 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_3 (void) +{ + volatile int x[1024]; + asm volatile ("" :: "r" (x) : "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_4: +** cntb x12, all, mul #2 +** mov x11, sp +** ... +** sub sp, sp, x12 +** str p4, \[sp\] +** ... +** ptrue p0\.h, all +** ldr p4, \[sp\] +** addvl sp, sp, #2 +** ret +*/ +svbool_t +test_4 (void) +{ + volatile svint32_t b; + b = svdup_s32 (1); + asm volatile ("" ::: "p4"); + return svptrue_b16 (); +} + +/* +** test_5: +** cntb x12, all, mul #2 +** add x12, x12, #?32 +** mov x11, sp +** ... +** sub sp, sp, x12 +** addvl x11, sp, #1 +** stp x24, x25, \[x11\] +** str x26, \[x11, 16\] +** str p4, \[sp\] +** ... 
+** ptrue p0\.h, all +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ldp x24, x25, \[sp\] +** ldr x26, \[sp, 16\] +** addvl sp, sp, #1 +** add sp, sp, #?32 +** ret +*/ +svbool_t +test_5 (void) +{ + volatile svint32_t b; + b = svdup_s32 (1); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b16 (); +} + +/* +** test_6: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** cntb x13 +** mov x11, sp +** ... +** sub sp, sp, x13 +** str p4, \[sp\] +** sub sp, sp, #?16 +** ... +** ptrue p0\.b, all +** add sp, sp, #?16 +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svbool_t +test_6 (void) +{ + take_stack_args (0, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_7: +** cntb x12 +** mov x13, #?4112 +** add x12, x12, x13 +** mov x11, sp +** ... +** sub sp, sp, x12 +** addvl x11, sp, #1 +** stp x29, x30, \[x11\] +** addvl x29, sp, #1 +** str p4, \[sp\] +** sub sp, sp, #?16 +** ... +** ptrue p0\.b, all +** add sp, sp, #?16 +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ldp x29, x30, \[sp\] +** mov x12, #?4112 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_7 (void) +{ + volatile int x[1024]; + take_stack_args (x, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_8: +** cntb x12 +** mov x13, #?4144 +** add x12, x12, x13 +** mov x11, sp +** ... +** sub sp, sp, x12 +** addvl x11, sp, #1 +** stp x29, x30, \[x11\] +** addvl x29, sp, #1 +** stp x24, x25, \[x29, 16\] +** str x26, \[x29, 32\] +** str p4, \[sp\] +** sub sp, sp, #?16 +** ... +** ptrue p0\.b, all +** add sp, sp, #?16 +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** mov x12, #?4144 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_8 (void) +{ + volatile int x[1024]; + take_stack_args (x, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_9: +** cntb x12 +** mov x13, #?4112 +** add x12, x12, x13 +** mov x11, sp +** ... +** sub sp, sp, x12 +** addvl x11, sp, #1 +** stp x29, x30, \[x11\] +** addvl x29, sp, #1 +** str p4, \[sp\] +** sub sp, sp, #?16 +** ... +** ptrue p0\.b, all +** addvl sp, x29, #-1 +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ldp x29, x30, \[sp\] +** mov x12, #?4112 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_9 (int n) +{ + volatile int x[1024]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_10: +** cntb x12 +** mov x13, #?4144 +** add x12, x12, x13 +** mov x11, sp +** ... +** sub sp, sp, x12 +** addvl x11, sp, #1 +** stp x29, x30, \[x11\] +** addvl x29, sp, #1 +** stp x24, x25, \[x29, 16\] +** str x26, \[x29, 32\] +** str p4, \[sp\] +** sub sp, sp, #?16 +** ... +** ptrue p0\.b, all +** addvl sp, x29, #-1 +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** mov x12, #?4144 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_10 (int n) +{ + volatile int x[1024]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_11: +** cntb x12 +** add x12, x12, #?3008 +** add x12, x12, #?126976 +** mov x11, sp +** ... +** sub sp, sp, x12 +** addvl x11, sp, #1 +** stp x29, x30, \[x11\] +** addvl x29, sp, #1 +** stp x24, x25, \[x29, 16\] +** str x26, \[x29, 32\] +** str p4, \[sp\] +** sub sp, sp, #?16 +** ... 
+** ptrue p0\.b, all +** addvl sp, x29, #-1 +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** add sp, sp, #?3008 +** add sp, sp, #?126976 +** ret +*/ +svbool_t +test_11 (int n) +{ + volatile int x[0x7ee4]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_1024.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_1024.c new file mode 100644 index 0000000..d5a9d44 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_1024.c @@ -0,0 +1,285 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fshrink-wrap -fstack-clash-protection -msve-vector-bits=1024 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +svbool_t take_stack_args (volatile void *, void *, int, int, int, + int, int, int, int); + +/* +** test_1: +** sub sp, sp, #144 +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl128 +** ldr p4, \[sp\] +** add sp, sp, #?144 +** ret +*/ +svbool_t +test_1 (void) +{ + volatile int x = 1; + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_2: +** sub sp, sp, #176 +** stp x24, x25, \[sp, 128\] +** str x26, \[sp, 144\] +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl128 +** ldr p4, \[sp\] +** ldp x24, x25, \[sp, 128\] +** ldr x26, \[sp, 144\] +** add sp, sp, #?176 +** ret +*/ +svbool_t +test_2 (void) +{ + volatile int x = 1; + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_3: +** mov x12, #?4256 +** sub sp, sp, x12 +** stp x24, x25, \[sp, 128\] +** str x26, \[sp, 144\] +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl128 +** ldr p4, \[sp\] +** ldp x24, x25, \[sp, 128\] +** ldr x26, \[sp, 144\] +** add sp, sp, x12 +** ret +*/ +svbool_t +test_3 (void) +{ + volatile int x[1024]; + asm volatile ("" :: "r" (x) : "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_4: +** sub sp, sp, #256 +** str p4, \[sp\] +** ... +** ptrue p0\.h, vl64 +** ldr p4, \[sp\] +** add sp, sp, #?256 +** ret +*/ +svbool_t +test_4 (void) +{ + volatile svint32_t b; + b = svdup_s32 (1); + asm volatile ("" ::: "p4"); + return svptrue_b16 (); +} + +/* +** test_5: +** sub sp, sp, #288 +** stp x24, x25, \[sp, 128\] +** str x26, \[sp, 144\] +** str p4, \[sp\] +** ... +** ptrue p0\.h, vl64 +** ldr p4, \[sp\] +** ldp x24, x25, \[sp, 128\] +** ldr x26, \[sp, 144\] +** add sp, sp, #?288 +** ret +*/ +svbool_t +test_5 (void) +{ + volatile svint32_t b; + b = svdup_s32 (1); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b16 (); +} + +/* +** test_6: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** sub sp, sp, #128 +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl128 +** add sp, sp, #?16 +** ldr p4, \[sp\] +** add sp, sp, #?128 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svbool_t +test_6 (void) +{ + take_stack_args (0, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_7: +** mov x12, #?4240 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 128\] +** add x29, sp, #?128 +** str p4, \[sp\] +** sub sp, sp, #16 +** ... 
+** ptrue p0\.b, vl128 +** add sp, sp, #?16 +** ldr p4, \[sp\] +** add sp, sp, #?128 +** ldp x29, x30, \[sp\] +** mov x12, #?4112 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_7 (void) +{ + volatile int x[1024]; + take_stack_args (x, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_8: +** mov x12, #?4272 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 128\] +** add x29, sp, #?128 +** stp x24, x25, \[sp, 144\] +** str x26, \[sp, 160\] +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl128 +** add sp, sp, #?16 +** ldr p4, \[sp\] +** add sp, sp, #?128 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** mov x12, #?4144 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_8 (void) +{ + volatile int x[1024]; + take_stack_args (x, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_9: +** mov x12, #?4240 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 128\] +** add x29, sp, #?128 +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl128 +** sub sp, x29, #128 +** ldr p4, \[sp\] +** add sp, sp, #?128 +** ldp x29, x30, \[sp\] +** mov x12, #?4112 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_9 (int n) +{ + volatile int x[1024]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_10: +** mov x12, #?4272 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 128\] +** add x29, sp, #?128 +** stp x24, x25, \[sp, 144\] +** str x26, \[sp, 160\] +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl128 +** sub sp, x29, #128 +** ldr p4, \[sp\] +** add sp, sp, #?128 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** mov x12, #?4144 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_10 (int n) +{ + volatile int x[1024]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_11: +** sub sp, sp, #65536 +** str xzr, \[sp, 1024\] +** mov x12, #?64576 +** sub sp, sp, x12 +** str xzr, \[sp, 1024\] +** stp x29, x30, \[sp, 128\] +** add x29, sp, #?128 +** stp x24, x25, \[sp, 144\] +** str x26, \[sp, 160\] +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl128 +** sub sp, x29, #128 +** ldr p4, \[sp\] +** add sp, sp, #?128 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** add sp, sp, #?3008 +** add sp, sp, #?126976 +** ret +*/ +svbool_t +test_11 (int n) +{ + volatile int x[0x7ee4]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_2048.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_2048.c new file mode 100644 index 0000000..c185e2e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_2048.c @@ -0,0 +1,285 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fshrink-wrap -fstack-clash-protection -msve-vector-bits=2048 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +svbool_t take_stack_args (volatile void *, void *, int, int, int, + int, int, int, int); + +/* +** test_1: +** sub sp, sp, #272 +** str p4, \[sp\] +** ... 
+** ptrue p0\.b, vl256 +** ldr p4, \[sp\] +** add sp, sp, #?272 +** ret +*/ +svbool_t +test_1 (void) +{ + volatile int x = 1; + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_2: +** sub sp, sp, #304 +** stp x24, x25, \[sp, 256\] +** str x26, \[sp, 272\] +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl256 +** ldr p4, \[sp\] +** ldp x24, x25, \[sp, 256\] +** ldr x26, \[sp, 272\] +** add sp, sp, #?304 +** ret +*/ +svbool_t +test_2 (void) +{ + volatile int x = 1; + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_3: +** mov x12, #?4384 +** sub sp, sp, x12 +** stp x24, x25, \[sp, 256\] +** str x26, \[sp, 272\] +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl256 +** ldr p4, \[sp\] +** ldp x24, x25, \[sp, 256\] +** ldr x26, \[sp, 272\] +** add sp, sp, x12 +** ret +*/ +svbool_t +test_3 (void) +{ + volatile int x[1024]; + asm volatile ("" :: "r" (x) : "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_4: +** sub sp, sp, #512 +** str p4, \[sp\] +** ... +** ptrue p0\.h, vl128 +** ldr p4, \[sp\] +** add sp, sp, #?512 +** ret +*/ +svbool_t +test_4 (void) +{ + volatile svint32_t b; + b = svdup_s32 (1); + asm volatile ("" ::: "p4"); + return svptrue_b16 (); +} + +/* +** test_5: +** sub sp, sp, #544 +** stp x24, x25, \[sp, 256\] +** str x26, \[sp, 272\] +** str p4, \[sp\] +** ... +** ptrue p0\.h, vl128 +** ldr p4, \[sp\] +** ldp x24, x25, \[sp, 256\] +** ldr x26, \[sp, 272\] +** add sp, sp, #?544 +** ret +*/ +svbool_t +test_5 (void) +{ + volatile svint32_t b; + b = svdup_s32 (1); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b16 (); +} + +/* +** test_6: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** sub sp, sp, #256 +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl256 +** add sp, sp, #?16 +** ldr p4, \[sp\] +** add sp, sp, #?256 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svbool_t +test_6 (void) +{ + take_stack_args (0, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_7: +** mov x12, #?4368 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 256\] +** add x29, sp, #?256 +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl256 +** add sp, sp, #?16 +** ldr p4, \[sp\] +** add sp, sp, #?256 +** ldp x29, x30, \[sp\] +** mov x12, #?4112 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_7 (void) +{ + volatile int x[1024]; + take_stack_args (x, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_8: +** mov x12, #?4400 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 256\] +** add x29, sp, #?256 +** stp x24, x25, \[sp, 272\] +** str x26, \[sp, 288\] +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl256 +** add sp, sp, #?16 +** ldr p4, \[sp\] +** add sp, sp, #?256 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** mov x12, #?4144 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_8 (void) +{ + volatile int x[1024]; + take_stack_args (x, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_9: +** mov x12, #?4368 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 256\] +** add x29, sp, #?256 +** str p4, \[sp\] +** sub sp, sp, #16 +** ... 
+** ptrue p0\.b, vl256 +** sub sp, x29, #256 +** ldr p4, \[sp\] +** add sp, sp, #?256 +** ldp x29, x30, \[sp\] +** mov x12, #?4112 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_9 (int n) +{ + volatile int x[1024]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_10: +** mov x12, #?4400 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 256\] +** add x29, sp, #?256 +** stp x24, x25, \[sp, 272\] +** str x26, \[sp, 288\] +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl256 +** sub sp, x29, #256 +** ldr p4, \[sp\] +** add sp, sp, #?256 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** mov x12, #?4144 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_10 (int n) +{ + volatile int x[1024]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_11: +** sub sp, sp, #65536 +** str xzr, \[sp, 1024\] +** mov x12, #?64704 +** sub sp, sp, x12 +** str xzr, \[sp, 1024\] +** stp x29, x30, \[sp, 256\] +** add x29, sp, #?256 +** stp x24, x25, \[sp, 272\] +** str x26, \[sp, 288\] +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl256 +** sub sp, x29, #256 +** ldr p4, \[sp\] +** add sp, sp, #?256 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** add sp, sp, #?3008 +** add sp, sp, #?126976 +** ret +*/ +svbool_t +test_11 (int n) +{ + volatile int x[0x7ee4]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_256.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_256.c new file mode 100644 index 0000000..f8318b3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_256.c @@ -0,0 +1,284 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fshrink-wrap -fstack-clash-protection -msve-vector-bits=256 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +svbool_t take_stack_args (volatile void *, void *, int, int, int, + int, int, int, int); + +/* +** test_1: +** sub sp, sp, #48 +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl32 +** ldr p4, \[sp\] +** add sp, sp, #?48 +** ret +*/ +svbool_t +test_1 (void) +{ + volatile int x = 1; + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_2: +** sub sp, sp, #80 +** stp x24, x25, \[sp, 32\] +** str x26, \[sp, 48\] +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl32 +** ldr p4, \[sp\] +** ldp x24, x25, \[sp, 32\] +** ldr x26, \[sp, 48\] +** add sp, sp, #?80 +** ret +*/ +svbool_t +test_2 (void) +{ + volatile int x = 1; + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_3: +** mov x12, #?4160 +** sub sp, sp, x12 +** stp x24, x25, \[sp, 32\] +** str x26, \[sp, 48\] +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl32 +** ldr p4, \[sp\] +** ldp x24, x25, \[sp, 32\] +** ldr x26, \[sp, 48\] +** add sp, sp, x12 +** ret +*/ +svbool_t +test_3 (void) +{ + volatile int x[1024]; + asm volatile ("" :: "r" (x) : "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_4: +** sub sp, sp, #64 +** str p4, \[sp\] +** ... 
+** ptrue p0\.h, vl16 +** ldr p4, \[sp\] +** add sp, sp, #?64 +** ret +*/ +svbool_t +test_4 (void) +{ + volatile svint32_t b; + b = svdup_s32 (1); + asm volatile ("" ::: "p4"); + return svptrue_b16 (); +} + +/* +** test_5: +** sub sp, sp, #96 +** stp x24, x25, \[sp, 32\] +** str x26, \[sp, 48\] +** str p4, \[sp\] +** ... +** ptrue p0\.h, vl16 +** ldr p4, \[sp\] +** ldp x24, x25, \[sp, 32\] +** ldr x26, \[sp, 48\] +** add sp, sp, #?96 +** ret +*/ +svbool_t +test_5 (void) +{ + volatile svint32_t b; + b = svdup_s32 (1); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b16 (); +} + +/* +** test_6: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** sub sp, sp, #32 +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl32 +** add sp, sp, #?16 +** ldr p4, \[sp\] +** add sp, sp, #?32 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svbool_t +test_6 (void) +{ + take_stack_args (0, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_7: +** mov x12, #?4144 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 32\] +** add x29, sp, #?32 +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl32 +** add sp, sp, #?16 +** ldr p4, \[sp\] +** add sp, sp, #?32 +** ldp x29, x30, \[sp\] +** mov x12, #?4112 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_7 (void) +{ + volatile int x[1024]; + take_stack_args (x, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_8: +** mov x12, #?4176 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 32\] +** add x29, sp, #?32 +** stp x24, x25, \[sp, 48\] +** str x26, \[sp, 64\] +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl32 +** add sp, sp, #?16 +** ldr p4, \[sp\] +** add sp, sp, #?32 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** mov x12, #?4144 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_8 (void) +{ + volatile int x[1024]; + take_stack_args (x, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_9: +** mov x12, #?4144 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 32\] +** add x29, sp, #?32 +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl32 +** sub sp, x29, #32 +** ldr p4, \[sp\] +** add sp, sp, #?32 +** ldp x29, x30, \[sp\] +** mov x12, #?4112 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_9 (int n) +{ + volatile int x[1024]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_10: +** mov x12, #?4176 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 32\] +** add x29, sp, #?32 +** stp x24, x25, \[sp, 48\] +** str x26, \[sp, 64\] +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl32 +** sub sp, x29, #32 +** ldr p4, \[sp\] +** add sp, sp, #?32 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** mov x12, #?4144 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_10 (int n) +{ + volatile int x[1024]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_11: +** sub sp, sp, #65536 +** str xzr, \[sp, 1024\] +** mov x12, #?64480 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 32\] +** add x29, sp, #?32 +** stp x24, x25, \[sp, 48\] +** str x26, \[sp, 64\] +** str p4, \[sp\] +** sub sp, sp, #16 +** ... 
+** ptrue p0\.b, vl32 +** sub sp, x29, #32 +** ldr p4, \[sp\] +** add sp, sp, #?32 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** add sp, sp, #?3008 +** add sp, sp, #?126976 +** ret +*/ +svbool_t +test_11 (int n) +{ + volatile int x[0x7ee4]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_512.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_512.c new file mode 100644 index 0000000..45a23ad --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_2_512.c @@ -0,0 +1,285 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fshrink-wrap -fstack-clash-protection -msve-vector-bits=512 -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +svbool_t take_stack_args (volatile void *, void *, int, int, int, + int, int, int, int); + +/* +** test_1: +** sub sp, sp, #80 +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl64 +** ldr p4, \[sp\] +** add sp, sp, #?80 +** ret +*/ +svbool_t +test_1 (void) +{ + volatile int x = 1; + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_2: +** sub sp, sp, #112 +** stp x24, x25, \[sp, 64\] +** str x26, \[sp, 80\] +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl64 +** ldr p4, \[sp\] +** ldp x24, x25, \[sp, 64\] +** ldr x26, \[sp, 80\] +** add sp, sp, #?112 +** ret +*/ +svbool_t +test_2 (void) +{ + volatile int x = 1; + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_3: +** mov x12, #?4192 +** sub sp, sp, x12 +** stp x24, x25, \[sp, 64\] +** str x26, \[sp, 80\] +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl64 +** ldr p4, \[sp\] +** ldp x24, x25, \[sp, 64\] +** ldr x26, \[sp, 80\] +** add sp, sp, x12 +** ret +*/ +svbool_t +test_3 (void) +{ + volatile int x[1024]; + asm volatile ("" :: "r" (x) : "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_4: +** sub sp, sp, #128 +** str p4, \[sp\] +** ... +** ptrue p0\.h, vl32 +** ldr p4, \[sp\] +** add sp, sp, #?128 +** ret +*/ +svbool_t +test_4 (void) +{ + volatile svint32_t b; + b = svdup_s32 (1); + asm volatile ("" ::: "p4"); + return svptrue_b16 (); +} + +/* +** test_5: +** sub sp, sp, #160 +** stp x24, x25, \[sp, 64\] +** str x26, \[sp, 80\] +** str p4, \[sp\] +** ... +** ptrue p0\.h, vl32 +** ldr p4, \[sp\] +** ldp x24, x25, \[sp, 64\] +** ldr x26, \[sp, 80\] +** add sp, sp, #?160 +** ret +*/ +svbool_t +test_5 (void) +{ + volatile svint32_t b; + b = svdup_s32 (1); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b16 (); +} + +/* +** test_6: +** stp x29, x30, \[sp, -16\]! +** mov x29, sp +** sub sp, sp, #64 +** str p4, \[sp\] +** ... +** ptrue p0\.b, vl64 +** add sp, sp, #?16 +** ldr p4, \[sp\] +** add sp, sp, #?64 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +svbool_t +test_6 (void) +{ + take_stack_args (0, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_7: +** mov x12, #?4176 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 64\] +** add x29, sp, #?64 +** str p4, \[sp\] +** sub sp, sp, #16 +** ... 
+** ptrue p0\.b, vl64 +** add sp, sp, #?16 +** ldr p4, \[sp\] +** add sp, sp, #?64 +** ldp x29, x30, \[sp\] +** mov x12, #?4112 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_7 (void) +{ + volatile int x[1024]; + take_stack_args (x, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_8: +** mov x12, #?4208 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 64\] +** add x29, sp, #?64 +** stp x24, x25, \[sp, 80\] +** str x26, \[sp, 96\] +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl64 +** add sp, sp, #?16 +** ldr p4, \[sp\] +** add sp, sp, #?64 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** mov x12, #?4144 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_8 (void) +{ + volatile int x[1024]; + take_stack_args (x, 0, 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_9: +** mov x12, #?4176 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 64\] +** add x29, sp, #?64 +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl64 +** sub sp, x29, #64 +** ldr p4, \[sp\] +** add sp, sp, #?64 +** ldp x29, x30, \[sp\] +** mov x12, #?4112 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_9 (int n) +{ + volatile int x[1024]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4"); + return svptrue_b8 (); +} + +/* +** test_10: +** mov x12, #?4208 +** sub sp, sp, x12 +** stp x29, x30, \[sp, 64\] +** add x29, sp, #?64 +** stp x24, x25, \[sp, 80\] +** str x26, \[sp, 96\] +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl64 +** sub sp, x29, #64 +** ldr p4, \[sp\] +** add sp, sp, #?64 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** mov x12, #?4144 +** add sp, sp, x12 +** ret +*/ +svbool_t +test_10 (int n) +{ + volatile int x[1024]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} + +/* +** test_11: +** sub sp, sp, #65536 +** str xzr, \[sp, 1024\] +** mov x12, #?64512 +** sub sp, sp, x12 +** str xzr, \[sp, 1024\] +** stp x29, x30, \[sp, 64\] +** add x29, sp, #?64 +** stp x24, x25, \[sp, 80\] +** str x26, \[sp, 96\] +** str p4, \[sp\] +** sub sp, sp, #16 +** ... +** ptrue p0\.b, vl64 +** sub sp, x29, #64 +** ldr p4, \[sp\] +** add sp, sp, #?64 +** ldp x24, x25, \[sp, 16\] +** ldr x26, \[sp, 32\] +** ldp x29, x30, \[sp\] +** add sp, sp, #?3008 +** add sp, sp, #?126976 +** ret +*/ +svbool_t +test_11 (int n) +{ + volatile int x[0x7ee4]; + take_stack_args (x, __builtin_alloca (n), 1, 2, 3, 4, 5, 6, 7); + asm volatile ("" ::: "p4", "x24", "x25", "x26"); + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_3.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_3.c new file mode 100644 index 0000000..3e01ec3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_3.c @@ -0,0 +1,63 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fshrink-wrap -fstack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** test_1: +** str x24, \[sp, -32\]! +** cntb x13 +** mov x11, sp +** ... +** sub sp, sp, x13 +** str p4, \[sp\] +** cbz w0, [^\n]* +** ... 
+** ptrue p0\.b, all +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ldr x24, \[sp\], 32 +** ret +*/ +svbool_t +test_1 (int n) +{ + asm volatile ("" ::: "x24"); + if (n) + { + volatile int x = 1; + asm volatile ("" ::: "p4"); + } + return svptrue_b8 (); +} + +/* +** test_2: +** str x24, \[sp, -32\]! +** cntb x13 +** mov x11, sp +** ... +** sub sp, sp, x13 +** str p4, \[sp\] +** cbz w0, [^\n]* +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** ... +** ptrue p0\.b, all +** ldr p4, \[sp\] +** addvl sp, sp, #1 +** ldr x24, \[sp\], 32 +** ret +*/ +svbool_t +test_2 (int n) +{ + asm volatile ("" ::: "x24"); + if (n) + { + volatile int x = 1; + asm volatile ("" ::: "p4", "p5", "p6"); + } + return svptrue_b8 (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/unprototyped_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/unprototyped_1.c new file mode 100644 index 0000000..5c7ed51 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/unprototyped_1.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ + +#include <arm_sve.h> + +void unprototyped (); + +void +f (svuint8_t *ptr) +{ + unprototyped (*ptr); /* { dg-error {SVE type '(svuint8_t|__SVUint8_t)' cannot be passed to an unprototyped function} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_1.c new file mode 100644 index 0000000..305a35f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_1.c @@ -0,0 +1,170 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <stdarg.h> +#include <arm_sve.h> + +/* +** callee_0: +** ... +** ldr (p[0-7]), \[x1\] +** ... +** cntp x0, \1, \1\.b +** ... +** ret +*/ +uint64_t __attribute__((noipa)) +callee_0 (int64_t *ptr, ...) +{ + va_list va; + svbool_t pg; + + va_start (va, ptr); + pg = va_arg (va, svbool_t); + va_end (va); + return svcntp_b8 (pg, pg); +} + +/* +** caller_0: +** ... +** ptrue (p[0-7])\.d, vl7 +** ... +** str \1, \[x1\] +** ... +** ret +*/ +uint64_t __attribute__((noipa)) +caller_0 (int64_t *ptr) +{ + return callee_0 (ptr, svptrue_pat_b64 (SV_VL7)); +} + +/* +** callee_1: +** ... +** ldr (p[0-7]), \[x2\] +** ... +** cntp x0, \1, \1\.b +** ... +** ret +*/ +uint64_t __attribute__((noipa)) +callee_1 (int64_t *ptr, ...) +{ + va_list va; + svbool_t pg; + + va_start (va, ptr); + va_arg (va, int); + pg = va_arg (va, svbool_t); + va_end (va); + return svcntp_b8 (pg, pg); +} + +/* +** caller_1: +** ... +** ptrue (p[0-7])\.d, vl7 +** ... +** str \1, \[x2\] +** ... +** ret +*/ +uint64_t __attribute__((noipa)) +caller_1 (int64_t *ptr) +{ + return callee_1 (ptr, 1, svptrue_pat_b64 (SV_VL7)); +} + +/* +** callee_7: +** ... +** ldr (p[0-7]), \[x7\] +** ... +** cntp x0, \1, \1\.b +** ... +** ret +*/ +uint64_t __attribute__((noipa)) +callee_7 (int64_t *ptr, ...) +{ + va_list va; + svbool_t pg; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + pg = va_arg (va, svbool_t); + va_end (va); + return svcntp_b8 (pg, pg); +} + +/* +** caller_7: +** ... +** ptrue (p[0-7])\.d, vl7 +** ... +** str \1, \[x7\] +** ... +** ret +*/ +uint64_t __attribute__((noipa)) +caller_7 (int64_t *ptr) +{ + return callee_7 (ptr, 1, 2, 3, 4, 5, 6, svptrue_pat_b64 (SV_VL7)); +} + +/* FIXME: We should be able to get rid of the va_list object. */ +/* +** callee_8: +** sub sp, sp, #([0-9]+) +** ... +** ldr (x[0-9]+), \[sp, \1\] +** ... +** ldr (p[0-7]), \[\2\] +** ...
+** cntp x0, \3, \3\.b +** ... +** ret +*/ +uint64_t __attribute__((noipa)) +callee_8 (int64_t *ptr, ...) +{ + va_list va; + svbool_t pg; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + pg = va_arg (va, svbool_t); + va_end (va); + return svcntp_b8 (pg, pg); +} + +/* +** caller_8: +** ... +** ptrue (p[0-7])\.d, vl7 +** ... +** str \1, \[(x[0-9]+)\] +** ... +** str \2, \[sp\] +** ... +** ret +*/ +uint64_t __attribute__((noipa)) +caller_8 (int64_t *ptr) +{ + return callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svptrue_pat_b64 (SV_VL7)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_f16.c new file mode 100644 index 0000000..3d7d6b6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_f16.c @@ -0,0 +1,170 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <stdarg.h> +#include <arm_sve.h> + +/* +** callee_0: +** ... +** ld1h (z[0-9]+\.h), (p[0-7])/z, \[x1\] +** ... +** st1h \1, \2, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_0 (int16_t *ptr, ...) +{ + va_list va; + svint16_t vec; + + va_start (va, ptr); + vec = va_arg (va, svint16_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_0: +** ... +** fmov (z[0-9]+\.h), #9\.0[^\n]* +** ... +** st1h \1, p[0-7], \[x1\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_0 (int16_t *ptr) +{ + callee_0 (ptr, svdup_f16 (9)); +} + +/* +** callee_1: +** ... +** ld1h (z[0-9]+\.h), (p[0-7])/z, \[x2\] +** ... +** st1h \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_1 (int16_t *ptr, ...) +{ + va_list va; + svint16_t vec; + + va_start (va, ptr); + va_arg (va, int); + vec = va_arg (va, svint16_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_1: +** ... +** fmov (z[0-9]+\.h), #9\.0[^\n]* +** ... +** st1h \1, p[0-7], \[x2\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_1 (int16_t *ptr) +{ + callee_1 (ptr, 1, svdup_f16 (9)); +} + +/* +** callee_7: +** ... +** ld1h (z[0-9]+\.h), (p[0-7])/z, \[x7\] +** ... +** st1h \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_7 (int16_t *ptr, ...) +{ + va_list va; + svint16_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint16_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_7: +** ... +** fmov (z[0-9]+\.h), #9\.0[^\n]* +** ... +** st1h \1, p[0-7], \[x7\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_7 (int16_t *ptr) +{ + callee_7 (ptr, 1, 2, 3, 4, 5, 6, svdup_f16 (9)); +} + +/* FIXME: We should be able to get rid of the va_list object. */ +/* +** callee_8: +** sub sp, sp, #([0-9]+) +** ... +** ldr (x[0-9]+), \[sp, \1\] +** ... +** ld1h (z[0-9]+\.h), (p[0-7])/z, \[\2\] +** ... +** st1h \3, \4, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_8 (int16_t *ptr, ...) +{ + va_list va; + svint16_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint16_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_8: +** ... +** fmov (z[0-9]+\.h), #9\.0[^\n]* +** ...
+** st1h \1, p[0-7], \[(x[0-9]+)\] +** ... +** str \2, \[sp\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_8 (int16_t *ptr) +{ + callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svdup_f16 (9)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_f32.c new file mode 100644 index 0000000..769b764 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_f32.c @@ -0,0 +1,170 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <stdarg.h> +#include <arm_sve.h> + +/* +** callee_0: +** ... +** ld1w (z[0-9]+\.s), (p[0-7])/z, \[x1\] +** ... +** st1w \1, \2, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_0 (int32_t *ptr, ...) +{ + va_list va; + svint32_t vec; + + va_start (va, ptr); + vec = va_arg (va, svint32_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_0: +** ... +** fmov (z[0-9]+\.s), #9\.0[^\n]* +** ... +** st1w \1, p[0-7], \[x1\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_0 (int32_t *ptr) +{ + callee_0 (ptr, svdup_f32 (9)); +} + +/* +** callee_1: +** ... +** ld1w (z[0-9]+\.s), (p[0-7])/z, \[x2\] +** ... +** st1w \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_1 (int32_t *ptr, ...) +{ + va_list va; + svint32_t vec; + + va_start (va, ptr); + va_arg (va, int); + vec = va_arg (va, svint32_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_1: +** ... +** fmov (z[0-9]+\.s), #9\.0[^\n]* +** ... +** st1w \1, p[0-7], \[x2\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_1 (int32_t *ptr) +{ + callee_1 (ptr, 1, svdup_f32 (9)); +} + +/* +** callee_7: +** ... +** ld1w (z[0-9]+\.s), (p[0-7])/z, \[x7\] +** ... +** st1w \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_7 (int32_t *ptr, ...) +{ + va_list va; + svint32_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint32_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_7: +** ... +** fmov (z[0-9]+\.s), #9\.0[^\n]* +** ... +** st1w \1, p[0-7], \[x7\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_7 (int32_t *ptr) +{ + callee_7 (ptr, 1, 2, 3, 4, 5, 6, svdup_f32 (9)); +} + +/* FIXME: We should be able to get rid of the va_list object. */ +/* +** callee_8: +** sub sp, sp, #([0-9]+) +** ... +** ldr (x[0-9]+), \[sp, \1\] +** ... +** ld1w (z[0-9]+\.s), (p[0-7])/z, \[\2\] +** ... +** st1w \3, \4, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_8 (int32_t *ptr, ...) +{ + va_list va; + svint32_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint32_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_8: +** ... +** fmov (z[0-9]+\.s), #9\.0[^\n]* +** ... +** st1w \1, p[0-7], \[(x[0-9]+)\] +** ... +** str \2, \[sp\] +** ...
+** ret +*/ +void __attribute__((noipa)) +caller_8 (int32_t *ptr) +{ + callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svdup_f32 (9)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_f64.c new file mode 100644 index 0000000..8067eee --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_f64.c @@ -0,0 +1,170 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <stdarg.h> +#include <arm_sve.h> + +/* +** callee_0: +** ... +** ld1d (z[0-9]+\.d), (p[0-7])/z, \[x1\] +** ... +** st1d \1, \2, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_0 (int64_t *ptr, ...) +{ + va_list va; + svint64_t vec; + + va_start (va, ptr); + vec = va_arg (va, svint64_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_0: +** ... +** fmov (z[0-9]+\.d), #9\.0[^\n]* +** ... +** st1d \1, p[0-7], \[x1\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_0 (int64_t *ptr) +{ + callee_0 (ptr, svdup_f64 (9)); +} + +/* +** callee_1: +** ... +** ld1d (z[0-9]+\.d), (p[0-7])/z, \[x2\] +** ... +** st1d \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_1 (int64_t *ptr, ...) +{ + va_list va; + svint64_t vec; + + va_start (va, ptr); + va_arg (va, int); + vec = va_arg (va, svint64_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_1: +** ... +** fmov (z[0-9]+\.d), #9\.0[^\n]* +** ... +** st1d \1, p[0-7], \[x2\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_1 (int64_t *ptr) +{ + callee_1 (ptr, 1, svdup_f64 (9)); +} + +/* +** callee_7: +** ... +** ld1d (z[0-9]+\.d), (p[0-7])/z, \[x7\] +** ... +** st1d \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_7 (int64_t *ptr, ...) +{ + va_list va; + svint64_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint64_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_7: +** ... +** fmov (z[0-9]+\.d), #9\.0[^\n]* +** ... +** st1d \1, p[0-7], \[x7\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_7 (int64_t *ptr) +{ + callee_7 (ptr, 1, 2, 3, 4, 5, 6, svdup_f64 (9)); +} + +/* FIXME: We should be able to get rid of the va_list object. */ +/* +** callee_8: +** sub sp, sp, #([0-9]+) +** ... +** ldr (x[0-9]+), \[sp, \1\] +** ... +** ld1d (z[0-9]+\.d), (p[0-7])/z, \[\2\] +** ... +** st1d \3, \4, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_8 (int64_t *ptr, ...) +{ + va_list va; + svint64_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint64_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_8: +** ... +** fmov (z[0-9]+\.d), #9\.0[^\n]* +** ... +** st1d \1, p[0-7], \[(x[0-9]+)\] +** ... +** str \2, \[sp\] +** ...
+** ret +*/ +void __attribute__((noipa)) +caller_8 (int64_t *ptr) +{ + callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svdup_f64 (9)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s16.c new file mode 100644 index 0000000..d695518 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s16.c @@ -0,0 +1,170 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <stdarg.h> +#include <arm_sve.h> + +/* +** callee_0: +** ... +** ld1h (z[0-9]+\.h), (p[0-7])/z, \[x1\] +** ... +** st1h \1, \2, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_0 (int16_t *ptr, ...) +{ + va_list va; + svint16_t vec; + + va_start (va, ptr); + vec = va_arg (va, svint16_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_0: +** ... +** mov (z[0-9]+\.h), #42 +** ... +** st1h \1, p[0-7], \[x1\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_0 (int16_t *ptr) +{ + callee_0 (ptr, svdup_s16 (42)); +} + +/* +** callee_1: +** ... +** ld1h (z[0-9]+\.h), (p[0-7])/z, \[x2\] +** ... +** st1h \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_1 (int16_t *ptr, ...) +{ + va_list va; + svint16_t vec; + + va_start (va, ptr); + va_arg (va, int); + vec = va_arg (va, svint16_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_1: +** ... +** mov (z[0-9]+\.h), #42 +** ... +** st1h \1, p[0-7], \[x2\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_1 (int16_t *ptr) +{ + callee_1 (ptr, 1, svdup_s16 (42)); +} + +/* +** callee_7: +** ... +** ld1h (z[0-9]+\.h), (p[0-7])/z, \[x7\] +** ... +** st1h \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_7 (int16_t *ptr, ...) +{ + va_list va; + svint16_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint16_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_7: +** ... +** mov (z[0-9]+\.h), #42 +** ... +** st1h \1, p[0-7], \[x7\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_7 (int16_t *ptr) +{ + callee_7 (ptr, 1, 2, 3, 4, 5, 6, svdup_s16 (42)); +} + +/* FIXME: We should be able to get rid of the va_list object. */ +/* +** callee_8: +** sub sp, sp, #([0-9]+) +** ... +** ldr (x[0-9]+), \[sp, \1\] +** ... +** ld1h (z[0-9]+\.h), (p[0-7])/z, \[\2\] +** ... +** st1h \3, \4, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_8 (int16_t *ptr, ...) +{ + va_list va; + svint16_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint16_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_8: +** ... +** mov (z[0-9]+\.h), #42 +** ... +** st1h \1, p[0-7], \[(x[0-9]+)\] +** ... +** str \2, \[sp\] +** ...
+** ret +*/ +void __attribute__((noipa)) +caller_8 (int16_t *ptr) +{ + callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svdup_s16 (42)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s32.c new file mode 100644 index 0000000..fddc0b8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s32.c @@ -0,0 +1,170 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <stdarg.h> +#include <arm_sve.h> + +/* +** callee_0: +** ... +** ld1w (z[0-9]+\.s), (p[0-7])/z, \[x1\] +** ... +** st1w \1, \2, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_0 (int32_t *ptr, ...) +{ + va_list va; + svint32_t vec; + + va_start (va, ptr); + vec = va_arg (va, svint32_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_0: +** ... +** mov (z[0-9]+\.s), #42 +** ... +** st1w \1, p[0-7], \[x1\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_0 (int32_t *ptr) +{ + callee_0 (ptr, svdup_s32 (42)); +} + +/* +** callee_1: +** ... +** ld1w (z[0-9]+\.s), (p[0-7])/z, \[x2\] +** ... +** st1w \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_1 (int32_t *ptr, ...) +{ + va_list va; + svint32_t vec; + + va_start (va, ptr); + va_arg (va, int); + vec = va_arg (va, svint32_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_1: +** ... +** mov (z[0-9]+\.s), #42 +** ... +** st1w \1, p[0-7], \[x2\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_1 (int32_t *ptr) +{ + callee_1 (ptr, 1, svdup_s32 (42)); +} + +/* +** callee_7: +** ... +** ld1w (z[0-9]+\.s), (p[0-7])/z, \[x7\] +** ... +** st1w \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_7 (int32_t *ptr, ...) +{ + va_list va; + svint32_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint32_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_7: +** ... +** mov (z[0-9]+\.s), #42 +** ... +** st1w \1, p[0-7], \[x7\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_7 (int32_t *ptr) +{ + callee_7 (ptr, 1, 2, 3, 4, 5, 6, svdup_s32 (42)); +} + +/* FIXME: We should be able to get rid of the va_list object. */ +/* +** callee_8: +** sub sp, sp, #([0-9]+) +** ... +** ldr (x[0-9]+), \[sp, \1\] +** ... +** ld1w (z[0-9]+\.s), (p[0-7])/z, \[\2\] +** ... +** st1w \3, \4, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_8 (int32_t *ptr, ...) +{ + va_list va; + svint32_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint32_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_8: +** ... +** mov (z[0-9]+\.s), #42 +** ... +** st1w \1, p[0-7], \[(x[0-9]+)\] +** ... +** str \2, \[sp\] +** ...
+** ret +*/ +void __attribute__((noipa)) +caller_8 (int32_t *ptr) +{ + callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svdup_s32 (42)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s64.c new file mode 100644 index 0000000..e6c4447 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s64.c @@ -0,0 +1,170 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <stdarg.h> +#include <arm_sve.h> + +/* +** callee_0: +** ... +** ld1d (z[0-9]+\.d), (p[0-7])/z, \[x1\] +** ... +** st1d \1, \2, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_0 (int64_t *ptr, ...) +{ + va_list va; + svint64_t vec; + + va_start (va, ptr); + vec = va_arg (va, svint64_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_0: +** ... +** mov (z[0-9]+\.d), #42 +** ... +** st1d \1, p[0-7], \[x1\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_0 (int64_t *ptr) +{ + callee_0 (ptr, svdup_s64 (42)); +} + +/* +** callee_1: +** ... +** ld1d (z[0-9]+\.d), (p[0-7])/z, \[x2\] +** ... +** st1d \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_1 (int64_t *ptr, ...) +{ + va_list va; + svint64_t vec; + + va_start (va, ptr); + va_arg (va, int); + vec = va_arg (va, svint64_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_1: +** ... +** mov (z[0-9]+\.d), #42 +** ... +** st1d \1, p[0-7], \[x2\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_1 (int64_t *ptr) +{ + callee_1 (ptr, 1, svdup_s64 (42)); +} + +/* +** callee_7: +** ... +** ld1d (z[0-9]+\.d), (p[0-7])/z, \[x7\] +** ... +** st1d \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_7 (int64_t *ptr, ...) +{ + va_list va; + svint64_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint64_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_7: +** ... +** mov (z[0-9]+\.d), #42 +** ... +** st1d \1, p[0-7], \[x7\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_7 (int64_t *ptr) +{ + callee_7 (ptr, 1, 2, 3, 4, 5, 6, svdup_s64 (42)); +} + +/* FIXME: We should be able to get rid of the va_list object. */ +/* +** callee_8: +** sub sp, sp, #([0-9]+) +** ... +** ldr (x[0-9]+), \[sp, \1\] +** ... +** ld1d (z[0-9]+\.d), (p[0-7])/z, \[\2\] +** ... +** st1d \3, \4, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_8 (int64_t *ptr, ...) +{ + va_list va; + svint64_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint64_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_8: +** ... +** mov (z[0-9]+\.d), #42 +** ... +** st1d \1, p[0-7], \[(x[0-9]+)\] +** ... +** str \2, \[sp\] +** ...
+** ret +*/ +void __attribute__((noipa)) +caller_8 (int64_t *ptr) +{ + callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svdup_s64 (42)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s8.c new file mode 100644 index 0000000..3f1d5f1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_s8.c @@ -0,0 +1,170 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <stdarg.h> +#include <arm_sve.h> + +/* +** callee_0: +** ... +** ld1b (z[0-9]+\.b), (p[0-7])/z, \[x1\] +** ... +** st1b \1, \2, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_0 (int8_t *ptr, ...) +{ + va_list va; + svint8_t vec; + + va_start (va, ptr); + vec = va_arg (va, svint8_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_0: +** ... +** mov (z[0-9]+\.b), #42 +** ... +** st1b \1, p[0-7], \[x1\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_0 (int8_t *ptr) +{ + callee_0 (ptr, svdup_s8 (42)); +} + +/* +** callee_1: +** ... +** ld1b (z[0-9]+\.b), (p[0-7])/z, \[x2\] +** ... +** st1b \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_1 (int8_t *ptr, ...) +{ + va_list va; + svint8_t vec; + + va_start (va, ptr); + va_arg (va, int); + vec = va_arg (va, svint8_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_1: +** ... +** mov (z[0-9]+\.b), #42 +** ... +** st1b \1, p[0-7], \[x2\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_1 (int8_t *ptr) +{ + callee_1 (ptr, 1, svdup_s8 (42)); +} + +/* +** callee_7: +** ... +** ld1b (z[0-9]+\.b), (p[0-7])/z, \[x7\] +** ... +** st1b \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_7 (int8_t *ptr, ...) +{ + va_list va; + svint8_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint8_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_7: +** ... +** mov (z[0-9]+\.b), #42 +** ... +** st1b \1, p[0-7], \[x7\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_7 (int8_t *ptr) +{ + callee_7 (ptr, 1, 2, 3, 4, 5, 6, svdup_s8 (42)); +} + +/* FIXME: We should be able to get rid of the va_list object. */ +/* +** callee_8: +** sub sp, sp, #([0-9]+) +** ... +** ldr (x[0-9]+), \[sp, \1\] +** ... +** ld1b (z[0-9]+\.b), (p[0-7])/z, \[\2\] +** ... +** st1b \3, \4, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_8 (int8_t *ptr, ...) +{ + va_list va; + svint8_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint8_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_8: +** ... +** mov (z[0-9]+\.b), #42 +** ... +** st1b \1, p[0-7], \[(x[0-9]+)\] +** ... +** str \2, \[sp\] +** ...
+** ret +*/ +void __attribute__((noipa)) +caller_8 (int8_t *ptr) +{ + callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svdup_s8 (42)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u16.c new file mode 100644 index 0000000..658aadc --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u16.c @@ -0,0 +1,170 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include <stdarg.h> +#include <arm_sve.h> + +/* +** callee_0: +** ... +** ld1h (z[0-9]+\.h), (p[0-7])/z, \[x1\] +** ... +** st1h \1, \2, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_0 (int16_t *ptr, ...) +{ + va_list va; + svint16_t vec; + + va_start (va, ptr); + vec = va_arg (va, svint16_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_0: +** ... +** mov (z[0-9]+\.h), #42 +** ... +** st1h \1, p[0-7], \[x1\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_0 (int16_t *ptr) +{ + callee_0 (ptr, svdup_u16 (42)); +} + +/* +** callee_1: +** ... +** ld1h (z[0-9]+\.h), (p[0-7])/z, \[x2\] +** ... +** st1h \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_1 (int16_t *ptr, ...) +{ + va_list va; + svint16_t vec; + + va_start (va, ptr); + va_arg (va, int); + vec = va_arg (va, svint16_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_1: +** ... +** mov (z[0-9]+\.h), #42 +** ... +** st1h \1, p[0-7], \[x2\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_1 (int16_t *ptr) +{ + callee_1 (ptr, 1, svdup_u16 (42)); +} + +/* +** callee_7: +** ... +** ld1h (z[0-9]+\.h), (p[0-7])/z, \[x7\] +** ... +** st1h \1, p[0-7], \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_7 (int16_t *ptr, ...) +{ + va_list va; + svint16_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint16_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_7: +** ... +** mov (z[0-9]+\.h), #42 +** ... +** st1h \1, p[0-7], \[x7\] +** ... +** ret +*/ +void __attribute__((noipa)) +caller_7 (int16_t *ptr) +{ + callee_7 (ptr, 1, 2, 3, 4, 5, 6, svdup_u16 (42)); +} + +/* FIXME: We should be able to get rid of the va_list object. */ +/* +** callee_8: +** sub sp, sp, #([0-9]+) +** ... +** ldr (x[0-9]+), \[sp, \1\] +** ... +** ld1h (z[0-9]+\.h), (p[0-7])/z, \[\2\] +** ... +** st1h \3, \4, \[x0\] +** ... +** ret +*/ +void __attribute__((noipa)) +callee_8 (int16_t *ptr, ...) +{ + va_list va; + svint16_t vec; + + va_start (va, ptr); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + va_arg (va, int); + vec = va_arg (va, svint16_t); + va_end (va); + svst1 (svptrue_b8 (), ptr, vec); +} + +/* +** caller_8: +** ... +** mov (z[0-9]+\.h), #42 +** ... +** st1h \1, p[0-7], \[(x[0-9]+)\] +** ... +** str \2, \[sp\] +** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_8 (uint16_t *ptr)
+{
+  callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svdup_u16 (42));
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u32.c
new file mode 100644
index 0000000..2ab320a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u32.c
@@ -0,0 +1,170 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <stdarg.h>
+#include <arm_sve.h>
+
+/*
+** callee_0:
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[x1\]
+** ...
+** st1w \1, \2, \[x0\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+callee_0 (uint32_t *ptr, ...)
+{
+  va_list va;
+  svuint32_t vec;
+
+  va_start (va, ptr);
+  vec = va_arg (va, svuint32_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/*
+** caller_0:
+** ...
+** mov (z[0-9]+\.s), #42
+** ...
+** st1w \1, p[0-7], \[x1\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_0 (uint32_t *ptr)
+{
+  callee_0 (ptr, svdup_u32 (42));
+}
+
+/*
+** callee_1:
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[x2\]
+** ...
+** st1w \1, p[0-7], \[x0\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+callee_1 (uint32_t *ptr, ...)
+{
+  va_list va;
+  svuint32_t vec;
+
+  va_start (va, ptr);
+  va_arg (va, int);
+  vec = va_arg (va, svuint32_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/*
+** caller_1:
+** ...
+** mov (z[0-9]+\.s), #42
+** ...
+** st1w \1, p[0-7], \[x2\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_1 (uint32_t *ptr)
+{
+  callee_1 (ptr, 1, svdup_u32 (42));
+}
+
+/*
+** callee_7:
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[x7\]
+** ...
+** st1w \1, p[0-7], \[x0\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+callee_7 (uint32_t *ptr, ...)
+{
+  va_list va;
+  svuint32_t vec;
+
+  va_start (va, ptr);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  vec = va_arg (va, svuint32_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/*
+** caller_7:
+** ...
+** mov (z[0-9]+\.s), #42
+** ...
+** st1w \1, p[0-7], \[x7\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_7 (uint32_t *ptr)
+{
+  callee_7 (ptr, 1, 2, 3, 4, 5, 6, svdup_u32 (42));
+}
+
+/* FIXME: We should be able to get rid of the va_list object.  */
+/*
+** callee_8:
+** sub sp, sp, #([0-9]+)
+** ...
+** ldr (x[0-9]+), \[sp, \1\]
+** ...
+** ld1w (z[0-9]+\.s), (p[0-7])/z, \[\2\]
+** ...
+** st1w \3, \4, \[x0\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+callee_8 (uint32_t *ptr, ...)
+{
+  va_list va;
+  svuint32_t vec;
+
+  va_start (va, ptr);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  vec = va_arg (va, svuint32_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/*
+** caller_8:
+** ...
+** mov (z[0-9]+\.s), #42
+** ...
+** st1w \1, p[0-7], \[(x[0-9]+)\]
+** ...
+** str \2, \[sp\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_8 (uint32_t *ptr)
+{
+  callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svdup_u32 (42));
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u64.c
new file mode 100644
index 0000000..1326af5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u64.c
@@ -0,0 +1,170 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <stdarg.h>
+#include <arm_sve.h>
+
+/*
+** callee_0:
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[x1\]
+** ...
+** st1d \1, \2, \[x0\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+callee_0 (uint64_t *ptr, ...)
+{
+  va_list va;
+  svuint64_t vec;
+
+  va_start (va, ptr);
+  vec = va_arg (va, svuint64_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/*
+** caller_0:
+** ...
+** mov (z[0-9]+\.d), #42
+** ...
+** st1d \1, p[0-7], \[x1\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_0 (uint64_t *ptr)
+{
+  callee_0 (ptr, svdup_u64 (42));
+}
+
+/*
+** callee_1:
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[x2\]
+** ...
+** st1d \1, p[0-7], \[x0\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+callee_1 (uint64_t *ptr, ...)
+{
+  va_list va;
+  svuint64_t vec;
+
+  va_start (va, ptr);
+  va_arg (va, int);
+  vec = va_arg (va, svuint64_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/*
+** caller_1:
+** ...
+** mov (z[0-9]+\.d), #42
+** ...
+** st1d \1, p[0-7], \[x2\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_1 (uint64_t *ptr)
+{
+  callee_1 (ptr, 1, svdup_u64 (42));
+}
+
+/*
+** callee_7:
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[x7\]
+** ...
+** st1d \1, p[0-7], \[x0\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+callee_7 (uint64_t *ptr, ...)
+{
+  va_list va;
+  svuint64_t vec;
+
+  va_start (va, ptr);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  vec = va_arg (va, svuint64_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/*
+** caller_7:
+** ...
+** mov (z[0-9]+\.d), #42
+** ...
+** st1d \1, p[0-7], \[x7\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_7 (uint64_t *ptr)
+{
+  callee_7 (ptr, 1, 2, 3, 4, 5, 6, svdup_u64 (42));
+}
+
+/* FIXME: We should be able to get rid of the va_list object.  */
+/*
+** callee_8:
+** sub sp, sp, #([0-9]+)
+** ...
+** ldr (x[0-9]+), \[sp, \1\]
+** ...
+** ld1d (z[0-9]+\.d), (p[0-7])/z, \[\2\]
+** ...
+** st1d \3, \4, \[x0\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+callee_8 (uint64_t *ptr, ...)
+{
+  va_list va;
+  svuint64_t vec;
+
+  va_start (va, ptr);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  vec = va_arg (va, svuint64_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/*
+** caller_8:
+** ...
+** mov (z[0-9]+\.d), #42
+** ...
+** st1d \1, p[0-7], \[(x[0-9]+)\]
+** ...
+** str \2, \[sp\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_8 (uint64_t *ptr)
+{
+  callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svdup_u64 (42));
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u8.c
new file mode 100644
index 0000000..a2b812d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_u8.c
@@ -0,0 +1,170 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <stdarg.h>
+#include <arm_sve.h>
+
+/*
+** callee_0:
+** ...
+** ld1b (z[0-9]+\.b), (p[0-7])/z, \[x1\]
+** ...
+** st1b \1, \2, \[x0\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+callee_0 (uint8_t *ptr, ...)
+{
+  va_list va;
+  svuint8_t vec;
+
+  va_start (va, ptr);
+  vec = va_arg (va, svuint8_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/*
+** caller_0:
+** ...
+** mov (z[0-9]+\.b), #42
+** ...
+** st1b \1, p[0-7], \[x1\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_0 (uint8_t *ptr)
+{
+  callee_0 (ptr, svdup_u8 (42));
+}
+
+/*
+** callee_1:
+** ...
+** ld1b (z[0-9]+\.b), (p[0-7])/z, \[x2\]
+** ...
+** st1b \1, p[0-7], \[x0\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+callee_1 (uint8_t *ptr, ...)
+{
+  va_list va;
+  svuint8_t vec;
+
+  va_start (va, ptr);
+  va_arg (va, int);
+  vec = va_arg (va, svuint8_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/*
+** caller_1:
+** ...
+** mov (z[0-9]+\.b), #42
+** ...
+** st1b \1, p[0-7], \[x2\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_1 (uint8_t *ptr)
+{
+  callee_1 (ptr, 1, svdup_u8 (42));
+}
+
+/*
+** callee_7:
+** ...
+** ld1b (z[0-9]+\.b), (p[0-7])/z, \[x7\]
+** ...
+** st1b \1, p[0-7], \[x0\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+callee_7 (uint8_t *ptr, ...)
+{
+  va_list va;
+  svuint8_t vec;
+
+  va_start (va, ptr);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  vec = va_arg (va, svuint8_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/*
+** caller_7:
+** ...
+** mov (z[0-9]+\.b), #42
+** ...
+** st1b \1, p[0-7], \[x7\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_7 (uint8_t *ptr)
+{
+  callee_7 (ptr, 1, 2, 3, 4, 5, 6, svdup_u8 (42));
+}
+
+/* FIXME: We should be able to get rid of the va_list object.  */
+/*
+** callee_8:
+** sub sp, sp, #([0-9]+)
+** ...
+** ldr (x[0-9]+), \[sp, \1\]
+** ...
+** ld1b (z[0-9]+\.b), (p[0-7])/z, \[\2\]
+** ...
+** st1b \3, \4, \[x0\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+callee_8 (uint8_t *ptr, ...)
+{
+  va_list va;
+  svuint8_t vec;
+
+  va_start (va, ptr);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  vec = va_arg (va, svuint8_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/*
+** caller_8:
+** ...
+** mov (z[0-9]+\.b), #42
+** ...
+** st1b \1, p[0-7], \[(x[0-9]+)\]
+** ...
+** str \2, \[sp\]
+** ...
+** ret
+*/
+void __attribute__((noipa))
+caller_8 (uint8_t *ptr)
+{
+  callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svdup_u8 (42));
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_3_nosc.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_3_nosc.c
new file mode 100644
index 0000000..cea69cc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_3_nosc.c
@@ -0,0 +1,75 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O0 -g" } */
+
+#include <stdarg.h>
+#include <arm_sve.h>
+
+void __attribute__((noipa))
+callee (int foo, ...)
+{
+  va_list va;
+  svbool_t pg, p;
+  svint8_t s8;
+  svuint16x4_t u16;
+  svfloat32x3_t f32;
+  svint64x2_t s64;
+
+  va_start (va, foo);
+  p = va_arg (va, svbool_t);
+  s8 = va_arg (va, svint8_t);
+  u16 = va_arg (va, svuint16x4_t);
+  f32 = va_arg (va, svfloat32x3_t);
+  s64 = va_arg (va, svint64x2_t);
+
+  pg = svptrue_b8 ();
+
+  if (svptest_any (pg, sveor_z (pg, p, svptrue_pat_b8 (SV_VL7))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, s8, svindex_s8 (1, 2))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 0), svindex_u16 (2, 3))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 1), svindex_u16 (3, 4))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 2), svindex_u16 (4, 5))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 3), svindex_u16 (5, 6))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget3 (f32, 0), svdup_f32 (1.0))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget3 (f32, 1), svdup_f32 (2.0))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget3 (f32, 2), svdup_f32 (3.0))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget2 (s64, 0), svindex_s64 (6, 7))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget2 (s64, 1), svindex_s64 (7, 8))))
+    __builtin_abort ();
+}
+
+int __attribute__((noipa))
+main (void)
+{
+  callee (100,
+          svptrue_pat_b8 (SV_VL7),
+          svindex_s8 (1, 2),
+          svcreate4 (svindex_u16 (2, 3),
+                     svindex_u16 (3, 4),
+                     svindex_u16 (4, 5),
+                     svindex_u16 (5, 6)),
+          svcreate3 (svdup_f32 (1.0),
+                     svdup_f32 (2.0),
+                     svdup_f32 (3.0)),
+          svcreate2 (svindex_s64 (6, 7),
+                     svindex_s64 (7, 8)));
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_3_sc.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_3_sc.c
new file mode 100644
index 0000000..b939aa5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_3_sc.c
@@ -0,0 +1,75 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O0 -fstack-clash-protection -g" } */
+
+#include <stdarg.h>
+#include <arm_sve.h>
+
+void __attribute__((noipa))
+callee (int foo, ...)
+{
+  va_list va;
+  svbool_t pg, p;
+  svint8_t s8;
+  svuint16x4_t u16;
+  svfloat32x3_t f32;
+  svint64x2_t s64;
+
+  va_start (va, foo);
+  p = va_arg (va, svbool_t);
+  s8 = va_arg (va, svint8_t);
+  u16 = va_arg (va, svuint16x4_t);
+  f32 = va_arg (va, svfloat32x3_t);
+  s64 = va_arg (va, svint64x2_t);
+
+  pg = svptrue_b8 ();
+
+  if (svptest_any (pg, sveor_z (pg, p, svptrue_pat_b8 (SV_VL7))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, s8, svindex_s8 (1, 2))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 0), svindex_u16 (2, 3))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 1), svindex_u16 (3, 4))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 2), svindex_u16 (4, 5))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget4 (u16, 3), svindex_u16 (5, 6))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget3 (f32, 0), svdup_f32 (1.0))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget3 (f32, 1), svdup_f32 (2.0))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget3 (f32, 2), svdup_f32 (3.0))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget2 (s64, 0), svindex_s64 (6, 7))))
+    __builtin_abort ();
+
+  if (svptest_any (pg, svcmpne (pg, svget2 (s64, 1), svindex_s64 (7, 8))))
+    __builtin_abort ();
+}
+
+int __attribute__((noipa))
+main (void)
+{
+  callee (100,
+          svptrue_pat_b8 (SV_VL7),
+          svindex_s8 (1, 2),
+          svcreate4 (svindex_u16 (2, 3),
+                     svindex_u16 (3, 4),
+                     svindex_u16 (4, 5),
+                     svindex_u16 (5, 6)),
+          svcreate3 (svdup_f32 (1.0),
+                     svdup_f32 (2.0),
+                     svdup_f32 (3.0)),
+          svcreate2 (svindex_s64 (6, 7),
+                     svindex_s64 (7, 8)));
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/vpcs_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/vpcs_1.c
new file mode 100644
index 0000000..d9f4e6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/vpcs_1.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+
+__attribute__ ((aarch64_vector_pcs)) void f1 (__SVBool_t); /* { dg-error {the 'aarch64_vector_pcs' attribute cannot be applied to an SVE function type} } */
+__attribute__ ((aarch64_vector_pcs)) void f2 (__SVInt8_t s8) {} /* { dg-error {the 'aarch64_vector_pcs' attribute cannot be applied to an SVE function type} } */
+__attribute__ ((aarch64_vector_pcs)) void (*f3) (__SVInt16_t); /* { dg-error {the 'aarch64_vector_pcs' attribute cannot be applied to an SVE function type} } */
+typedef __attribute__ ((aarch64_vector_pcs)) void (*f4) (__SVInt32_t); /* { dg-error {the 'aarch64_vector_pcs' attribute cannot be applied to an SVE function type} } */
-- 
2.7.4