From 787c7a65f6ec2876337e6c50a26a1da0fadcb5bf Mon Sep 17 00:00:00 2001 From: Michael Meissner Date: Thu, 27 Oct 2016 20:52:07 +0000 Subject: [PATCH] constraints.md (wH constraint): Add new constraints for allowing 32-bit integers (and eventually 8/16-bit... [gcc] 2016-10-27 Michael Meissner * config/rs6000/constraints.md (wH constraint): Add new constraints for allowing 32-bit integers (and eventually 8/16-bit integers) into the vector registers. (wI constraint): Likewise. (wJ constraint): Likewise. (wK constraint): Likewise. * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Add -mvsx-small-integer as a default option for ISA 2.07 (i.e. power8). (POWERPC_MASKS): Likewise. * config/rs6000/rs6000.opt (-mvsx-small-integer): Add new debug switch to turn off small integer support in vector registers. * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Eliminate test for -mupper-regs-di, since it is already done with the reg_addr[mode].scalar_in_vmx_p. Add support for the switch -mvsx-small-integer. (rs6000_debug_reg_global): Add support for wH, wI, wJ, and wK constraints. (rs6000_setup_reg_addr_masks): Likewise. (rs6000_init_hard_regno_mode_ok): Likewise. (rs6000_option_override_internal): Add consistency checks for -mvsx-small-integer. (rs6000_secondary_reload_simple_move): SImode is a simple move if -mvsx-small-integer. (rs6000_secondary_reload): Use std::swap. (rs6000_preferred_reload_class): Don't prefer FLOAT_REGS over VSX_REGS for small integers in vector registers, since there is no D-FORM address mode for such types. (rs6000_register_move_cost): Use FIRST_FPR_REGNO instead of 32. (rs6000_opt_masks): Add -mvsx-small-integer. * config/rs6000/vsx.md (VSINT_84): Add SImode for small integer support. (VSX_EXTRACT_I2): Clone VSX_EXTRACT_I, but drop V4SI since SImode extracts can be done on ISA 2.07. (vsx_extract_): Add support for small integers in vsx registers. (vsx_extract__p9): Use 'v' instead of VSX_EX, since we no longer support V4SImode in this pattern. 
(vsx_extract_si): New insn to support extraction of SImode in ISA 2.07 using either xxextractuw or vspltw. (vsx_extract__p8): Use 'v' instead of VSX_EX, since we no longer support V4SImode in this pattern. * config/rs6000/rs6000.h (enum rs6000_reg_class_enum): Add wH, wI, wJ, and wK constraints. * config/rs6000/rs6000.md (f32_sv): Use correct instruction for storing SDmode with VSX instructions. (zero_extendsi2): Reorder pattern, so RLDICL comes after the GPR load and before the FPR and VSX loads. Remove ??, ! from the constraints. Add MFVSRWZ and XXEXTRACTUW instructions to support small integers in vector registers. (extendsi2): Reorder pattern, so EXTSW comes after the GPR load and before the FPR and VSX loads. Remove ??, ! from the constraints. Add VEXTSW2D support for small integers in vector registers. (lfiwax): Remove ! constraint. Add VEXTSW2D support for small integers in vector registers. (floatsi2_lfiwax): If -mvsx-small-integer issue a normal move instead of using an UNSPEC. (lfiwzx): Remove ! constraint. Add XXEXTRACTUW support for small integers in vector registers. (floatunssi2_lfiwzx): If -mvsx-small-integer issue a normal move instead of using an UNSPEC. (movsi_internal1): Add support for -mvsx-small-integer. Align columns so that it is more readable. (SImode splitter for ISA 3.0 constants): Add splitter for -128..127 constants that can easily be constructed on ISA 3.0. * doc/md.texi (PowerPC Constraints): Document wH, wI, wJ, and wK constraints. [gcc/testsuite] 2016-10-27 Michael Meissner * gcc.target/powerpc/vsx-simode.c: New test. * gcc.target/powerpc/vsx-simode2.c: Likewise. * gcc.target/powerpc/vsx-simode3.c: Likewise. 
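For illustration, the consistency check added to rs6000_option_override_internal can be sketched as the stand-alone C fragment below. This is only a sketch under stated assumptions: the MASK_* bit values and the helper name check_vsx_small_integer are hypothetical and do not exist in GCC; the real code tests TARGET_DIRECT_MOVE, TARGET_P8_VECTOR and TARGET_UPPER_REGS_DI and calls error () itself.

```c
#include <stdbool.h>

/* Hypothetical bit values.  The real OPTION_MASK_* constants are generated
   from rs6000.opt and are not reproduced here.  */
enum
{
  MASK_DIRECT_MOVE       = 1 << 0,
  MASK_P8_VECTOR         = 1 << 1,
  MASK_UPPER_REGS_DI     = 1 << 2,
  MASK_VSX_SMALL_INTEGER = 1 << 3
};

/* Hypothetical helper.  If small integer support is requested but one of
   its prerequisites is missing, clear MASK_VSX_SMALL_INTEGER in *FLAGS.
   Return true when the option was also given explicitly, which is the
   case where the patch reports an error before clearing the bit.  */
static bool
check_vsx_small_integer (unsigned *flags, unsigned explicit_flags)
{
  const unsigned required
    = MASK_DIRECT_MOVE | MASK_P8_VECTOR | MASK_UPPER_REGS_DI;

  /* Nothing to do if the option is off or all prerequisites are on.  */
  if ((*flags & MASK_VSX_SMALL_INTEGER) == 0
      || (*flags & required) == required)
    return false;

  *flags &= ~(unsigned) MASK_VSX_SMALL_INTEGER;
  return (explicit_flags & MASK_VSX_SMALL_INTEGER) != 0;
}
```

In this sketch the option is dropped silently when it was only enabled by default (for example through the CPU masks), and the caller is told to diagnose it only when the user asked for it on the command line.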
From-SVN: r241631 --- gcc/ChangeLog | 71 +++++++++++++++ gcc/config/rs6000/constraints.md | 12 +++ gcc/config/rs6000/rs6000-cpus.def | 4 +- gcc/config/rs6000/rs6000.c | 120 +++++++++++++++++++------ gcc/config/rs6000/rs6000.h | 4 + gcc/config/rs6000/rs6000.md | 119 ++++++++++++++++++------ gcc/config/rs6000/rs6000.opt | 4 + gcc/config/rs6000/vsx.md | 88 ++++++++++++++---- gcc/doc/md.texi | 12 +++ gcc/testsuite/ChangeLog | 6 ++ gcc/testsuite/gcc.target/powerpc/vsx-simode.c | 22 +++++ gcc/testsuite/gcc.target/powerpc/vsx-simode2.c | 15 ++++ gcc/testsuite/gcc.target/powerpc/vsx-simode3.c | 22 +++++ 13 files changed, 429 insertions(+), 70 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx-simode.c create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx-simode2.c create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx-simode3.c diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 4d0b993..1e05d45 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,74 @@ +2016-10-27 Michael Meissner + + * config/rs6000/constraints.md (wH constraint): Add new + constraints for allowing 32-bit integers (and eventually 8/16-bit + integers) into the vector registers. + (wI constraint): Likewise. + (wJ constraint): Likewise. + (wK constraint): Likewise. + * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Add + -mvsx-small-integer as a default option for ISA 2.07 + (i.e. power8). + (POWERPC_MASKS): Likewise. + * config/rs6000/rs6000.opt (-mvsx-small-integer): Add new debug + switch to turn off small integer support in vector registers. + * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Eliminate + test for -mupper-regs-di, since it is already done with the + reg_addr[mode].scalar_in_vmx_p. Add support for the switch + -mvsx-small-integer. + (rs6000_debug_reg_global): Add support for wH, wI, wJ, and wK + constraints. + (rs6000_setup_reg_addr_masks): Likewise. + (rs6000_init_hard_regno_mode_ok): Likewise. 
+ (rs6000_option_override_internal): Add consistency checks for + -mvsx-small-integer. + (rs6000_secondary_reload_simple_move): SImode is a simple move if + -mvsx-small-integer. + (rs6000_secondary_reload): Use std::swap. + (rs6000_preferred_reload_class): Don't prefer FLOAT_REGS over + VSX_REGS for small integers in vector registers, since there is no + D-FORM address mode for such types. + (rs6000_register_move_cost): Use FIRST_FPR_REGNO instead of 32. + (rs6000_opt_masks): Add -mvsx-small-integer. + * config/rs6000/vsx.md (VSINT_84): Add SImode for small integer + support. + (VSX_EXTRACT_I2): Clone VSX_EXTRACT_I, but drop V4SI since SImode + extracts can be done on ISA 2.07. + (vsx_extract_): Add support for small integers in vsx + registers. + (vsx_extract__p9): Use 'v' instead of VSX_EX, since we no + longer support V4SImode in this pattern. + (vsx_extract_si): New insn to support extraction of SImode in ISA + 2.07 using either xxextractuw or vspltw. + (vsx_extract__p8): Use 'v' instead of VSX_EX, since we no + longer support V4SImode in this pattern. + * config/rs6000/rs6000.h (enum rs6000_reg_class_enum): Add wH, wI, + wJ, and wK constraints. + * config/rs6000/rs6000.md (f32_sv): Use correct instruction for + storing SDmode with VSX instructions. + (zero_extendsi2): Reorder pattern, so RLDICL comes after the + GPR load and before the FPR and VSX loads. Remove ??, ! from the + constraints. Add MFVSRWZ and XXEXTRACTUW instructions to support + small integers in vector registers. + (extendsi2): Reorder pattern, so EXTSW comes after the GPR + load and before the FPR and VSX loads. Remove ??, ! from the + constraints. Add VEXTSW2D support for small integers in vector + registers. + (lfiwax): Remove ! constraint. Add VEXTSW2D support for small + integers in vector registers. + (floatsi2_lfiwax): If -mvsx-small-integer issue a normal + move instead of using an UNSPEC. + (lfiwzx): Remove ! constraint. Add XXEXTRACTUW support for small + integers in vector registers. 
+ (floatunssi2_lfiwzx): If -mvsx-small-integer issue a normal + move instead of using an UNSPEC. + (movsi_internal1): Add support for -mvsx-small-integer. Align + columns so that it is more readable. + (SImode splitter for ISA 3.0 constants): Add splitter for + -128..127 constants that can easily be constructed on ISA 3.0. + * doc/md.texi (PowerPC Constraints): Document wH, wI, wJ, and wK + constraints. + 2016-10-27 Jakub Jelinek PR middle-end/78025 diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md index 7535c35..0463c0d 100644 --- a/gcc/config/rs6000/constraints.md +++ b/gcc/config/rs6000/constraints.md @@ -159,6 +159,18 @@ "Memory operand suitable for TOC fusion memory references" (match_operand 0 "toc_fusion_mem_wrapped")) +(define_register_constraint "wH" "rs6000_constraints[RS6000_CONSTRAINT_wH]" + "Altivec register to hold 32-bit integers or NO_REGS.") + +(define_register_constraint "wI" "rs6000_constraints[RS6000_CONSTRAINT_wI]" + "FPR register to hold 32-bit integers or NO_REGS.") + +(define_register_constraint "wJ" "rs6000_constraints[RS6000_CONSTRAINT_wJ]" + "FPR register to hold 8/16-bit integers or NO_REGS.") + +(define_register_constraint "wK" "rs6000_constraints[RS6000_CONSTRAINT_wK]" + "Altivec register to hold 8/16-bit integers or NO_REGS.") + (define_constraint "wL" "Int constant that is the element number mfvsrld accesses in a vector." (and (match_code "const_int") diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def index e1786b2..c86da7a 100644 --- a/gcc/config/rs6000/rs6000-cpus.def +++ b/gcc/config/rs6000/rs6000-cpus.def @@ -58,7 +58,8 @@ | OPTION_MASK_HTM \ | OPTION_MASK_QUAD_MEMORY \ | OPTION_MASK_QUAD_MEMORY_ATOMIC \ - | OPTION_MASK_UPPER_REGS_SF) + | OPTION_MASK_UPPER_REGS_SF \ + | OPTION_MASK_VSX_SMALL_INTEGER) /* Add ISEL back into ISA 3.0, since it is supposed to be a win. Do not add P9_MINMAX until the hardware that supports it is available. 
Do not add @@ -138,6 +139,7 @@ | OPTION_MASK_UPPER_REGS_DF \ | OPTION_MASK_UPPER_REGS_SF \ | OPTION_MASK_VSX \ + | OPTION_MASK_VSX_SMALL_INTEGER \ | OPTION_MASK_VSX_TIMODE) #endif diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 5e35e33..f9e4739 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -1980,8 +1980,7 @@ rs6000_hard_regno_mode_ok (int regno, machine_mode mode) || FLOAT128_VECTOR_P (mode) || reg_addr[mode].scalar_in_vmx_p || (TARGET_VSX_TIMODE && mode == TImode) - || (TARGET_VADDUQM && mode == V1TImode) - || (TARGET_UPPER_REGS_DI && mode == DImode))) + || (TARGET_VADDUQM && mode == V1TImode))) { if (FP_REGNO_P (regno)) return FP_REGNO_P (last_regno); @@ -2012,9 +2011,14 @@ rs6000_hard_regno_mode_ok (int regno, machine_mode mode) && FP_REGNO_P (last_regno)) return 1; - if (GET_MODE_CLASS (mode) == MODE_INT - && GET_MODE_SIZE (mode) == UNITS_PER_FP_WORD) - return 1; + if (GET_MODE_CLASS (mode) == MODE_INT) + { + if (GET_MODE_SIZE (mode) == UNITS_PER_FP_WORD) + return 1; + + if (TARGET_VSX_SMALL_INTEGER && mode == SImode) + return 1; + } if (PAIRED_SIMD_REGNO_P (regno) && TARGET_PAIRED_FLOAT && PAIRED_VECTOR_MODE (mode)) @@ -2447,6 +2451,10 @@ rs6000_debug_reg_global (void) "wx reg_class = %s\n" "wy reg_class = %s\n" "wz reg_class = %s\n" + "wH reg_class = %s\n" + "wI reg_class = %s\n" + "wJ reg_class = %s\n" + "wK reg_class = %s\n" "\n", reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_d]], reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_f]], @@ -2474,7 +2482,11 @@ rs6000_debug_reg_global (void) reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ww]], reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wx]], reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wy]], - reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wz]]); + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wz]], + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wH]], + 
reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wI]], + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wJ]], + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wK]]); nl = "\n"; for (m = 0; m < NUM_MACHINE_MODES; ++m) @@ -2770,6 +2782,7 @@ rs6000_setup_reg_addr_masks (void) { machine_mode m2 = (machine_mode) m; bool complex_p = false; + bool small_int_p = (m2 == QImode || m2 == HImode || m2 == SImode); size_t msize; if (COMPLEX_MODE_P (m2)) @@ -2794,13 +2807,20 @@ rs6000_setup_reg_addr_masks (void) /* Can mode values go in the GPR/FPR/Altivec registers? */ if (reg >= 0 && rs6000_hard_regno_mode_ok_p[m][reg]) { + bool small_int_vsx_p = (small_int_p + && (rc == RELOAD_REG_FPR + || rc == RELOAD_REG_VMX)); + nregs = rs6000_hard_regno_nregs[m][reg]; addr_mask |= RELOAD_REG_VALID; /* Indicate if the mode takes more than 1 physical register. If it takes a single register, indicate it can do REG+REG - addressing. */ - if (nregs > 1 || m == BLKmode || complex_p) + addressing. Small integers in VSX registers can only do + REG+REG addressing. */ + if (small_int_vsx_p) + addr_mask |= RELOAD_REG_INDEXED; + else if (nregs > 1 || m == BLKmode || complex_p) addr_mask |= RELOAD_REG_MULTIPLE; else addr_mask |= RELOAD_REG_INDEXED; @@ -2817,6 +2837,7 @@ rs6000_setup_reg_addr_masks (void) && !VECTOR_MODE_P (m2) && !FLOAT128_VECTOR_P (m2) && !complex_p + && !small_int_vsx_p && (m2 != DFmode || !TARGET_UPPER_REGS_DF) && (m2 != SFmode || !TARGET_UPPER_REGS_SF) && !(TARGET_E500_DOUBLE && msize == 8)) @@ -3115,7 +3136,11 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p) ww - Register class to do SF conversions in with VSX operations. wx - Float register if we can do 32-bit int stores. wy - Register class to do ISA 2.07 SF operations. - wz - Float register if we can do 32-bit unsigned int loads. */ + wz - Float register if we can do 32-bit unsigned int loads. + wH - Altivec register if SImode is allowed in VSX registers. 
+ wI - VSX register if SImode is allowed in VSX registers. + wJ - VSX register if QImode/HImode are allowed in VSX registers. + wK - Altivec register if QImode/HImode are allowed in VSX registers. */ if (TARGET_HARD_FLOAT && TARGET_FPRS) rs6000_constraints[RS6000_CONSTRAINT_f] = FLOAT_REGS; /* SFmode */ @@ -3209,6 +3234,18 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p) if (TARGET_DIRECT_MOVE_128) rs6000_constraints[RS6000_CONSTRAINT_we] = VSX_REGS; + /* Support small integers in VSX registers. */ + if (TARGET_VSX_SMALL_INTEGER) + { + rs6000_constraints[RS6000_CONSTRAINT_wH] = ALTIVEC_REGS; + rs6000_constraints[RS6000_CONSTRAINT_wI] = FLOAT_REGS; + if (TARGET_P9_VECTOR) + { + rs6000_constraints[RS6000_CONSTRAINT_wJ] = FLOAT_REGS; + rs6000_constraints[RS6000_CONSTRAINT_wK] = ALTIVEC_REGS; + } + } + /* Set up the reload helper and direct move functions. */ if (TARGET_VSX || TARGET_ALTIVEC) { @@ -3361,6 +3398,9 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p) if (TARGET_UPPER_REGS_SF) reg_addr[SFmode].scalar_in_vmx_p = true; + + if (TARGET_VSX_SMALL_INTEGER) + reg_addr[SImode].scalar_in_vmx_p = true; } /* Setup the fusion operations. */ @@ -4433,6 +4473,20 @@ rs6000_option_override_internal (bool global_init_p) } } + /* Check whether we should allow small integers into VSX registers. We + require direct move to prevent the register allocator from having to move + variables through memory to do moves. SImode can be used on ISA 2.07, + while HImode and QImode require ISA 3.0. */ + if (TARGET_VSX_SMALL_INTEGER + && (!TARGET_DIRECT_MOVE || !TARGET_P8_VECTOR || !TARGET_UPPER_REGS_DI)) + { + if (rs6000_isa_flags_explicit & OPTION_MASK_VSX_SMALL_INTEGER) + error ("-mvsx-small-integer requires -mpower8-vector, " + "-mupper-regs-di, and -mdirect-move"); + + rs6000_isa_flags &= ~OPTION_MASK_VSX_SMALL_INTEGER; + } + /* Set long double size before the IEEE 128-bit tests. 
*/ if (!global_options_set.x_rs6000_long_double_type_size) { @@ -20485,32 +20539,46 @@ rs6000_secondary_reload_simple_move (enum rs6000_reg_type to_type, enum rs6000_reg_type from_type, machine_mode mode) { - int size; + int size = GET_MODE_SIZE (mode); /* Add support for various direct moves available. In this function, we only look at cases where we don't need any extra registers, and one or more - simple move insns are issued. At present, 32-bit integers are not allowed + simple move insns are issued. Originally small integers are not allowed in FPR/VSX registers. Single precision binary floating is not a simple move because we need to convert to the single precision memory layout. The 4-byte SDmode can be moved. TDmode values are disallowed since they need special direct move handling, which we do not support yet. */ - size = GET_MODE_SIZE (mode); if (TARGET_DIRECT_MOVE - && ((mode == SDmode) || (TARGET_POWERPC64 && size == 8)) && ((to_type == GPR_REG_TYPE && from_type == VSX_REG_TYPE) || (to_type == VSX_REG_TYPE && from_type == GPR_REG_TYPE))) - return true; + { + if (TARGET_POWERPC64) + { + /* ISA 2.07: MTVSRD or MFVSRD. */ + if (size == 8) + return true; - else if (TARGET_DIRECT_MOVE_128 && size == 16 && mode != TDmode - && ((to_type == VSX_REG_TYPE && from_type == GPR_REG_TYPE) - || (to_type == GPR_REG_TYPE && from_type == VSX_REG_TYPE))) - return true; + /* ISA 3.0: MTVSRDD or MFVSRD + MFVSRLD. */ + if (size == 16 && TARGET_P9_VECTOR && mode != TDmode) + return true; + } + + /* ISA 2.07: MTVSRWZ or MFVSRWZ. */ + if (TARGET_VSX_SMALL_INTEGER && mode == SImode) + return true; + + /* ISA 2.07: MTVSRWZ or MFVSRWZ. */ + if (mode == SDmode) + return true; + } + /* Power6+: MFTGPR or MFFGPR. 
*/ else if (TARGET_MFPGPR && TARGET_POWERPC64 && size == 8 - && ((to_type == GPR_REG_TYPE && from_type == FPR_REG_TYPE) - || (to_type == FPR_REG_TYPE && from_type == GPR_REG_TYPE))) + && ((to_type == GPR_REG_TYPE && from_type == FPR_REG_TYPE) + || (to_type == FPR_REG_TYPE && from_type == GPR_REG_TYPE))) return true; + /* Move to/from SPR. */ else if ((size == 4 || (TARGET_POWERPC64 && size == 8)) && ((to_type == GPR_REG_TYPE && from_type == SPR_REG_TYPE) || (to_type == SPR_REG_TYPE && from_type == GPR_REG_TYPE))) @@ -20686,11 +20754,7 @@ rs6000_secondary_reload (bool in_p, enum rs6000_reg_type from_type = register_to_reg_type (x, &altivec_p); if (!in_p) - { - enum rs6000_reg_type exchange = to_type; - to_type = from_type; - from_type = exchange; - } + std::swap (to_type, from_type); /* Can we do a direct move of some sort? */ if (rs6000_secondary_reload_move (to_type, from_type, mode, sri, @@ -21318,7 +21382,8 @@ rs6000_preferred_reload_class (rtx x, enum reg_class rclass) /* If this is a scalar floating point value and we don't have D-form addressing, prefer the traditional floating point registers so that we can use D-form (register+offset) addressing. */ - if (GET_MODE_SIZE (mode) < 16 && rclass == VSX_REGS) + if (rclass == VSX_REGS + && (mode == SFmode || GET_MODE_SIZE (mode) == 8)) return FLOAT_REGS; /* Prefer the Altivec registers if Altivec is handling the vector @@ -35898,7 +35963,7 @@ rs6000_register_move_cost (machine_mode mode, else if (VECTOR_MEM_VSX_P (mode) && reg_classes_intersect_p (to, VSX_REGS) && reg_classes_intersect_p (from, VSX_REGS)) - ret = 2 * hard_regno_nregs[32][mode]; + ret = 2 * hard_regno_nregs[FIRST_FPR_REGNO][mode]; /* Moving between two similar registers is just one instruction. 
*/ else if (reg_classes_intersect_p (to, from)) @@ -37504,6 +37569,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] = { "upper-regs-df", OPTION_MASK_UPPER_REGS_DF, false, true }, { "upper-regs-sf", OPTION_MASK_UPPER_REGS_SF, false, true }, { "vsx", OPTION_MASK_VSX, false, true }, + { "vsx-small-integer", OPTION_MASK_VSX_SMALL_INTEGER, false, true }, { "vsx-timode", OPTION_MASK_VSX_TIMODE, false, true }, #ifdef OPTION_MASK_64BIT #if TARGET_AIX_OS diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h index ee0f105..4b83abd 100644 --- a/gcc/config/rs6000/rs6000.h +++ b/gcc/config/rs6000/rs6000.h @@ -1602,6 +1602,10 @@ enum r6000_reg_class_enum { RS6000_CONSTRAINT_wx, /* FPR register for STFIWX */ RS6000_CONSTRAINT_wy, /* VSX register for SF */ RS6000_CONSTRAINT_wz, /* FPR register for LFIWZX */ + RS6000_CONSTRAINT_wH, /* Altivec register for 32-bit integers. */ + RS6000_CONSTRAINT_wI, /* VSX register for 32-bit integers. */ + RS6000_CONSTRAINT_wJ, /* VSX register for 8/16-bit integers. */ + RS6000_CONSTRAINT_wK, /* Altivec register for 8/16-bit integers. 
*/ RS6000_CONSTRAINT_MAX }; diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index e432a5a..bc8e52d 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -458,7 +458,7 @@ (define_mode_attr f32_sm2 [(SF "wY") (SD "wn")]) (define_mode_attr f32_si [(SF "stfs%U0%X0 %1,%0") (SD "stfiwx %1,%y0")]) (define_mode_attr f32_si2 [(SF "stxssp %1,%0") (SD "stfiwx %1,%y0")]) -(define_mode_attr f32_sv [(SF "stxsspx %x1,%y0") (SD "stxsiwzx %x1,%y0")]) +(define_mode_attr f32_sv [(SF "stxsspx %x1,%y0") (SD "stxsiwx %x1,%y0")]) ; Definitions for 32-bit fpr direct move ; At present, the decimal modes are not allowed in the traditional altivec @@ -837,16 +837,18 @@ (define_insn "zero_extendsi2" - [(set (match_operand:EXTSI 0 "gpc_reg_operand" "=r,r,??wj,!wz,!wu") - (zero_extend:EXTSI (match_operand:SI 1 "reg_or_mem_operand" "m,r,r,Z,Z")))] + [(set (match_operand:EXTSI 0 "gpc_reg_operand" "=r,r,wz,wu,wj,r,wJwK") + (zero_extend:EXTSI (match_operand:SI 1 "reg_or_mem_operand" "m,r,Z,Z,r,wIwH,wJwK")))] "" "@ lwz%U1%X1 %0,%1 rldicl %0,%1,0,32 - mtvsrwz %x0,%1 lfiwzx %0,%y1 - lxsiwzx %x0,%y1" - [(set_attr "type" "load,shift,mffgpr,fpload,fpload")]) + lxsiwzx %x0,%y1 + mtvsrwz %x0,%1 + mfvsrwz %0,%x1 + xxextractuw %x0,%x1,1" + [(set_attr "type" "load,shift,fpload,fpload,mffgpr,mftgpr,vecexts")]) (define_insn_and_split "*zero_extendsi2_dot" [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y") @@ -1005,16 +1007,17 @@ (define_insn "extendsi2" - [(set (match_operand:EXTSI 0 "gpc_reg_operand" "=r,r,??wj,!wl,!wu") - (sign_extend:EXTSI (match_operand:SI 1 "lwa_operand" "Y,r,r,Z,Z")))] + [(set (match_operand:EXTSI 0 "gpc_reg_operand" "=r,r,wl,wu,wj,wK") + (sign_extend:EXTSI (match_operand:SI 1 "lwa_operand" "Y,r,Z,Z,r,wK")))] "" "@ lwa%U1%X1 %0,%1 extsw %0,%1 - mtvsrwa %x0,%1 lfiwax %0,%y1 - lxsiwax %x0,%y1" - [(set_attr "type" "load,exts,mffgpr,fpload,fpload") + lxsiwax %x0,%y1 + mtvsrwa %x0,%1 + vextsw2d %0,%1" + [(set_attr "type" 
"load,exts,fpload,fpload,mffgpr,vecexts") (set_attr "sign_extend" "yes")]) (define_insn_and_split "*extendsi2_dot" @@ -4947,15 +4950,16 @@ ; We don't define lfiwax/lfiwzx with the normal definition, because we ; don't want to support putting SImode in FPR registers. (define_insn "lfiwax" - [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wj,!wj") - (unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r")] + [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wj,wj,wK") + (unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r,wK")] UNSPEC_LFIWAX))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX" "@ lfiwax %0,%y1 lxsiwax %x0,%y1 - mtvsrwa %x0,%1" - [(set_attr "type" "fpload,fpload,mffgpr")]) + mtvsrwa %x0,%1 + vextsw2d %0,%1" + [(set_attr "type" "fpload,fpload,mffgpr,vecexts")]) ; This split must be run before register allocation because it allocates the ; memory slot that is needed to move values to/from the FPR. We don't allocate @@ -5019,7 +5023,10 @@ operands[1] = rs6000_address_for_fpconvert (operands[1]); if (GET_CODE (operands[2]) == SCRATCH) operands[2] = gen_reg_rtx (DImode); - emit_insn (gen_lfiwax (operands[2], operands[1])); + if (TARGET_VSX_SMALL_INTEGER) + emit_insn (gen_extendsidi2 (operands[2], operands[1])); + else + emit_insn (gen_lfiwax (operands[2], operands[1])); emit_insn (gen_floatdi2 (operands[0], operands[2])); DONE; }" @@ -5027,15 +5034,16 @@ (set_attr "type" "fpload")]) (define_insn "lfiwzx" - [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wj,!wj") - (unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r")] + [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wj,wj,wJwK") + (unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r,wJwK")] UNSPEC_LFIWZX))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX" "@ lfiwzx %0,%y1 lxsiwzx %x0,%y1 - mtvsrwz %x0,%1" - [(set_attr "type" "fpload,fpload,mftgpr")]) + mtvsrwz %x0,%1 + xxextractuw %x0,%x1,1" + [(set_attr "type" 
"fpload,fpload,mftgpr,vecexts")]) (define_insn_and_split "floatunssi2_lfiwzx" [(set (match_operand:SFDF 0 "gpc_reg_operand" "=") @@ -5094,7 +5102,10 @@ operands[1] = rs6000_address_for_fpconvert (operands[1]); if (GET_CODE (operands[2]) == SCRATCH) operands[2] = gen_reg_rtx (DImode); - emit_insn (gen_lfiwzx (operands[2], operands[1])); + if (TARGET_VSX_SMALL_INTEGER) + emit_insn (gen_zero_extendsidi2 (operands[2], operands[1])); + else + emit_insn (gen_lfiwzx (operands[2], operands[1])); emit_insn (gen_floatdi2 (operands[0], operands[2])); DONE; }" @@ -6518,25 +6529,66 @@ [(set_attr "type" "load") (set_attr "length" "4")]) +;; MR LA LWZ LFIWZX LXSIWZX +;; STW STFIWX STXSIWX LI LIS +;; # XXLOR XXSPLTIB 0 XXSPLTIB -1 VSPLTISW +;; XXLXOR 0 XXLORC -1 P9 const MTVSRWZ MFVSRWZ +;; MF%1 MT%0 MT%0 NOP (define_insn "*movsi_internal1" - [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "=r,r,r,m,r,r,r,r,*c*l,*h,*h") - (match_operand:SI 1 "input_operand" "r,U,m,r,I,L,n,*h,r,r,0"))] + [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" + "=r, r, r, ?*wI, ?*wH, + m, ?Z, ?Z, r, r, + r, ?*wIwH, ?*wJwK, ?*wK, ?*wJwK, + ?*wJwK, ?*wH, ?*wK, ?*wIwH, ?r, + r, *c*l, *h, *h") + + (match_operand:SI 1 "input_operand" + "r, U, m, Z, Z, + r, wI, wH, I, L, + n, wIwH, O, wM, wB, + O, wM, wS, r, wIwH, + *h, r, r, 0"))] + "!TARGET_SINGLE_FPU && (gpc_reg_operand (operands[0], SImode) || gpc_reg_operand (operands[1], SImode))" "@ mr %0,%1 la %0,%a1 lwz%U1%X1 %0,%1 + lfiwzx %0,%y1 + lxsiwzx %x0,%y1 stw%U0%X0 %1,%0 + stfiwx %1,%y0 + stxsiwx %x1,%y0 li %0,%1 lis %0,%v1 # + xxlor %x0,%x1,%x1 + xxspltib %x0,0 + xxspltib %x0,255 + vspltisw %0,%1 + xxlxor %x0,%x0,%x0 + xxlorc %x0,%x0,%x0 + # + mtvsrwz %x0,%1 + mfvsrwz %0,%x1 mf%1 %0 mt%0 %1 mt%0 %1 nop" - [(set_attr "type" "*,*,load,store,*,*,*,mfjmpr,mtjmpr,*,*") - (set_attr "length" "4,4,4,4,4,4,8,4,4,4,4")]) + [(set_attr "type" + "*, *, load, fpload, fpload, + store, fpstore, fpstore, *, *, + *, veclogical, vecsimple, vecsimple, vecsimple, 
+ veclogical, veclogical, vecsimple, mffgpr, mftgpr, + *, *, *, *") + + (set_attr "length" + "4, 4, 4, 4, 4, + 4, 4, 4, 4, 4, + 8, 4, 4, 4, 4, + 4, 4, 8, 4, 4, + 4, 4, 4, 4")]) (define_insn "*movsi_internal1_single" [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "=r,r,r,m,r,r,r,r,*c*l,*h,*h,m,*f") @@ -6581,6 +6633,23 @@ FAIL; }") +;; Split loading -128..127 to use XXSPLTIB and VEXTSW2D +(define_split + [(set (match_operand:DI 0 "altivec_register_operand" "") + (match_operand:DI 1 "xxspltib_constant_split" ""))] + "TARGET_VSX_SMALL_INTEGER && TARGET_P9_VECTOR && reload_completed" + [(const_int 0)] +{ + rtx op0 = operands[0]; + rtx op1 = operands[1]; + int r = REGNO (op0); + rtx op0_v16qi = gen_rtx_REG (V16QImode, r); + + emit_insn (gen_xxspltib_v16qi (op0_v16qi, op1)); + emit_insn (gen_vsx_sign_extend_qi_si (operands[0], op0_v16qi)); + DONE; +}) + (define_insn "*mov_internal2" [(set (match_operand:CC 2 "cc_reg_operand" "=y,x,?y") (compare:CC (match_operand:P 1 "gpc_reg_operand" "0,r,r") diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt index 3e0717d..367be2d 100644 --- a/gcc/config/rs6000/rs6000.opt +++ b/gcc/config/rs6000/rs6000.opt @@ -664,3 +664,7 @@ Enable using IEEE 128-bit floating point instructions. mfloat128-convert Target Undocumented Mask(FLOAT128_CVT) Var(rs6000_isa_flags) Enable default conversions between __float128 & long double. + +mvsx-small-integer +Target Report Mask(VSX_SMALL_INTEGER) Var(rs6000_isa_flags) +Enable small integers to be in VSX registers. 
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index 36567e4..18f3e86 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -263,11 +263,14 @@ (V2DI "wi")]) ;; Iterators for loading constants with xxspltib -(define_mode_iterator VSINT_84 [V4SI V2DI DI]) +(define_mode_iterator VSINT_84 [V4SI V2DI DI SI]) (define_mode_iterator VSINT_842 [V8HI V4SI V2DI]) -;; Iterator for ISA 3.0 vector extract/insert of integer vectors -(define_mode_iterator VSX_EXTRACT_I [V16QI V8HI V4SI]) +;; Iterator for ISA 3.0 vector extract/insert of small integer vectors. +;; VSX_EXTRACT_I2 doesn't include V4SImode because SI extracts can be +;; done on ISA 2.07 and not just ISA 3.0. +(define_mode_iterator VSX_EXTRACT_I [V16QI V8HI V4SI]) +(define_mode_iterator VSX_EXTRACT_I2 [V16QI V8HI]) (define_mode_attr VSX_EXTRACT_WIDTH [(V16QI "b") (V8HI "h") @@ -2496,7 +2499,9 @@ (clobber (match_dup 3))])] "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT" { - operands[3] = gen_rtx_SCRATCH ((TARGET_VEXTRACTUB) ? DImode : mode); + machine_mode smode = ((mode != V4SImode && TARGET_VEXTRACTUB) + ? 
DImode : mode); + operands[3] = gen_rtx_SCRATCH (smode); }) ;; Under ISA 3.0, we can use the byte/half-word/word integer stores if we are @@ -2505,9 +2510,9 @@ (define_insn_and_split "*vsx_extract__p9" [(set (match_operand: 0 "nonimmediate_operand" "=r,Z") (vec_select: - (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" ",") + (match_operand:VSX_EXTRACT_I2 1 "gpc_reg_operand" "v,v") (parallel [(match_operand:QI 2 "" "n,n")]))) - (clobber (match_scratch:DI 3 "=,"))] + (clobber (match_scratch:DI 3 "=v,v"))] "VECTOR_MEM_VSX_P (mode) && TARGET_VEXTRACTUB" "#" "&& (reload_completed || MEM_P (operands[0]))" @@ -2536,8 +2541,6 @@ emit_insn (gen_p9_stxsibx (dest, di_tmp)); else if (mode == V8HImode) emit_insn (gen_p9_stxsihx (dest, di_tmp)); - else if (mode == V4SImode) - emit_insn (gen_stfiwx (dest, di_tmp)); else gcc_unreachable (); } @@ -2570,12 +2573,70 @@ } [(set_attr "type" "vecsimple")]) +(define_insn_and_split "*vsx_extract_si" + [(set (match_operand:SI 0 "nonimmediate_operand" "=r,Z,Z,wJwK") + (vec_select:SI + (match_operand:V4SI 1 "gpc_reg_operand" "v,wJwK,v,v") + (parallel [(match_operand:QI 2 "const_0_to_3_operand" "n,n,n,n")]))) + (clobber (match_scratch:V4SI 3 "=v,wJwK,v,v"))] + "VECTOR_MEM_VSX_P (V4SImode) && TARGET_DIRECT_MOVE_64BIT" + "#" + "&& reload_completed" + [(const_int 0)] +{ + rtx dest = operands[0]; + rtx src = operands[1]; + rtx element = operands[2]; + rtx vec_tmp = operands[3]; + int value; + + if (!VECTOR_ELT_ORDER_BIG) + element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element)); + + /* If the value is in the correct position, we can avoid doing the VSPLT + instruction. 
*/ + value = INTVAL (element); + if (value != 1) + { + if (TARGET_VEXTRACTUB) + { + rtx di_tmp = gen_rtx_REG (DImode, REGNO (vec_tmp)); + emit_insn (gen_vsx_extract_v4si_di (di_tmp, src, element)); + } + else + emit_insn (gen_altivec_vspltw_direct (vec_tmp, src, element)); + } + else + vec_tmp = src; + + if (MEM_P (operands[0])) + { + if (can_create_pseudo_p ()) + dest = rs6000_address_for_fpconvert (dest); + + if (TARGET_VSX_SMALL_INTEGER) + emit_move_insn (dest, gen_rtx_REG (SImode, REGNO (vec_tmp))); + else + emit_insn (gen_stfiwx (dest, gen_rtx_REG (DImode, REGNO (vec_tmp)))); + } + + else if (TARGET_VSX_SMALL_INTEGER) + emit_move_insn (dest, gen_rtx_REG (SImode, REGNO (vec_tmp))); + else + emit_move_insn (gen_rtx_REG (DImode, REGNO (dest)), + gen_rtx_REG (DImode, REGNO (vec_tmp))); + + DONE; +} + [(set_attr "type" "mftgpr,fpstore,fpstore,vecsimple") + (set_attr "length" "8")]) + (define_insn_and_split "*vsx_extract__p8" [(set (match_operand: 0 "nonimmediate_operand" "=r") (vec_select: - (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "v") + (match_operand:VSX_EXTRACT_I2 1 "gpc_reg_operand" "v") (parallel [(match_operand:QI 2 "" "n")]))) - (clobber (match_scratch:VSX_EXTRACT_I 3 "=v"))] + (clobber (match_scratch:VSX_EXTRACT_I2 3 "=v"))] "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT" "#" "&& reload_completed" @@ -2607,13 +2668,6 @@ else vec_tmp = src; } - else if (mode == V4SImode) - { - if (value != 1) - emit_insn (gen_altivec_vspltw_direct (vec_tmp, src, element)); - else - vec_tmp = src; - } else gcc_unreachable (); diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 335dc61..9f19314 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -3125,6 +3125,18 @@ Memory operand suitable for power9 fusion load/stores. @item wG Memory operand suitable for TOC fusion memory references. +@item wH +Altivec register if @option{-mvsx-small-integer}. + +@item wI +Floating point register if @option{-mvsx-small-integer}. 
+ +@item wJ +FP register if @option{-mvsx-small-integer} and @option{-mpower9-vector}. + +@item wK +Altivec register if @option{-mvsx-small-integer} and @option{-mpower9-vector}. + @item wL Int constant that is the element number that the MFVSRLD instruction. targets. diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index a8d187c..f2a0a2f 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,9 @@ +2016-10-27 Michael Meissner + + * gcc.target/powerpc/vsx-simode.c: New test. + * gcc.target/powerpc/vsx-simode2.c: Likewise. + * gcc.target/powerpc/vsx-simode3.c: Likewise. + 2016-10-27 Jakub Jelinek PR fortran/78026 diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-simode.c b/gcc/testsuite/gcc.target/powerpc/vsx-simode.c new file mode 100644 index 0000000..e4b2113 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vsx-simode.c @@ -0,0 +1,22 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_p8vector_ok } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */ +/* { dg-options "-mcpu=power8 -O2 -mvsx-small-integer" } */ + +double load_asm_d_constraint (int *p) +{ + double ret; + __asm__ ("xxlor %x0,%x1,%x1\t# load d constraint" : "=d" (ret) : "d" (*p)); + return ret; +} + +void store_asm_d_constraint (int *p, double x) +{ + int i; + __asm__ ("xxlor %x0,%x1,%x1\t# store d constraint" : "=d" (i) : "d" (x)); + *p = i; +} + +/* { dg-final { scan-assembler "lfiwzx" } } */ +/* { dg-final { scan-assembler "stfiwx" } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-simode2.c b/gcc/testsuite/gcc.target/powerpc/vsx-simode2.c new file mode 100644 index 0000000..92553b9 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vsx-simode2.c @@ -0,0 +1,15 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { 
dg-require-effective-target powerpc_p8vector_ok } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */ +/* { dg-options "-mcpu=power8 -O2 -mvsx-small-integer" } */ + +unsigned int foo (unsigned int u) +{ + unsigned int ret; + __asm__ ("xxlor %x0,%x1,%x1\t# v, v constraints" : "=v" (ret) : "v" (u)); + return ret; +} + +/* { dg-final { scan-assembler "mtvsrwz" } } */ +/* { dg-final { scan-assembler "mfvsrwz" } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-simode3.c b/gcc/testsuite/gcc.target/powerpc/vsx-simode3.c new file mode 100644 index 0000000..fd15931 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vsx-simode3.c @@ -0,0 +1,22 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_p8vector_ok } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */ +/* { dg-options "-mcpu=power8 -O2 -mvsx-small-integer" } */ + +double load_asm_v_constraint (int *p) +{ + double ret; + __asm__ ("xxlor %x0,%x1,%x1\t# load v constraint" : "=d" (ret) : "v" (*p)); + return ret; +} + +void store_asm_v_constraint (int *p, double x) +{ + int i; + __asm__ ("xxlor %x0,%x1,%x1\t# store v constraint" : "=v" (i) : "d" (x)); + *p = i; +} + +/* { dg-final { scan-assembler "lxsiwzx" } } */ +/* { dg-final { scan-assembler "stxsiwx" } } */ -- 2.7.4
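The mtvsrwz/mfvsrwz instructions that vsx-simode2.c scans for are what a GPR to VSX move of an SImode value becomes once rs6000_secondary_reload_simple_move accepts it. The fragment below is a minimal stand-alone sketch of the GPR/VSX branch of that function only, not the GCC source: the enum names, struct target, mode_size and gpr_vsx_simple_move_p are hypothetical stand-ins for GCC's register types, machine modes and TARGET_* macros, and the Power6 MFTGPR/MFFGPR and SPR branches are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's register types and machine modes.  */
enum reg_type { GPR_REG, VSX_REG };
enum mode { MODE_SI, MODE_SD, MODE_DI, MODE_TI, MODE_TD };

/* Hypothetical snapshot of the target flags the function consults.  */
struct target
{
  bool direct_move;        /* stands in for TARGET_DIRECT_MOVE */
  bool powerpc64;          /* stands in for TARGET_POWERPC64 */
  bool p9_vector;          /* stands in for TARGET_P9_VECTOR */
  bool vsx_small_integer;  /* stands in for TARGET_VSX_SMALL_INTEGER */
};

/* Size in bytes of each mode modelled here.  */
static int
mode_size (enum mode m)
{
  switch (m)
    {
    case MODE_SI:
    case MODE_SD:
      return 4;
    case MODE_DI:
      return 8;
    default:
      return 16;
    }
}

/* Return true if a move between a GPR and a VSX register needs no
   scratch register, following the order of tests in the patch.  */
static bool
gpr_vsx_simple_move_p (const struct target *t, enum reg_type to,
                       enum reg_type from, enum mode m)
{
  bool gpr_vsx = ((to == GPR_REG && from == VSX_REG)
                  || (to == VSX_REG && from == GPR_REG));

  if (!t->direct_move || !gpr_vsx)
    return false;

  if (t->powerpc64)
    {
      /* ISA 2.07: MTVSRD or MFVSRD.  */
      if (mode_size (m) == 8)
        return true;

      /* ISA 3.0: MTVSRDD or MFVSRD + MFVSRLD.  */
      if (mode_size (m) == 16 && t->p9_vector && m != MODE_TD)
        return true;
    }

  /* ISA 2.07: MTVSRWZ or MFVSRWZ, new with -mvsx-small-integer.  */
  if (t->vsx_small_integer && m == MODE_SI)
    return true;

  /* ISA 2.07: MTVSRWZ or MFVSRWZ for the 4-byte decimal mode.  */
  return m == MODE_SD;
}
```

With a power8-like configuration (direct move and 64-bit on, ISA 3.0 vector off), the sketch accepts SImode only when the small integer flag is set, which matches the behaviour the new tests rely on.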