From 73a4d10bbb438ec06e0ffee7ee11b9ed02891326 Mon Sep 17 00:00:00 2001 From: "J\"orn Rennecke" Date: Mon, 9 May 2005 17:42:55 +0000 Subject: [PATCH] re PR target/20695 (sh64-*-* port does not handle 32 / 64 bit conversions properly) gcc: 2005-05-09 J"orn Rennecke * config/sh/sh.h (OVERRIDE_OPTIONS): Don't set flag_finite_math_only if flag_signaling_nans is set. For TARGET_SH2E, if flag_finite_math_only is not set, set IEEE_BIT. * doc/invoke.texi (SH -mieee): Document relation to -ffinite-math-only. 2005-05-06 J"orn Rennecke Merge of sh-elf specific patches from sh-elf-4_1-branch: 2005-05-05 Kaz Kojima * config/sh/sh.h (ASM_OUTPUT_REG_PUSH): Provide SHMEDIA version. (ASM_OUTPUT_REG_POP): Likewise. 2005-05-05 J"orn Rennecke Kaz Kojima * config/sh/sh.c (sh_builtin_saveregs): Use copy_to_mode_reg and plus_constant. 2005-05-04 Kaz Kojima * config/sh/sh.c (sh_div_strategy): Initialize with SH_DIV_STRATEGY_DEFAULT. * config/sh/sh.c (SH_DIV_STR_FOR_SIZE): Define. (SH_DIV_STRATEGY_DEFAULT): Likewise. (OPTIMIZATION_OPTIONS): Set sh_div_str to SH_DIV_STR_FOR_SIZE when optimized for size. * config/sh/linux.h (SH_DIV_STRATEGY_DEFAULT): Redefine. (SH_DIV_STR_FOR_SIZE): Likewise. * config/sh/netbsd-elf.h (SH_DIV_STRATEGY_DEFAULT): Likewise. (SH_DIV_STR_FOR_SIZE): Likewise. 2005-05-04 J"orn Rennecke * config/sh/sh-modes.def (PDImode): Add. * config/sh/sh-protos.h (shmedia_prepare_call_address): Declare. * config/sh/sh.c (print_operand): Handle IF_THEN_ELSE. (target_reg_operand): Allow PDImode. (sh_register_move_cost): If neither sh_gettrcost_str nor TARGET_PT_FIXED is set, assume gettr costs 100. (shmedia_prepare_call_address): New function. (sh_gettrcost_str): Initialize to empty string. (sh_divsi3_libfunc): New variable. * config/sh/sh.h (PT_FIXED_BIT, TARGET_INVALID_SYMBOLS): Define. (TARGET_SWITCH_SH5_32_ANY_EXTRA): Likewise. (TARGET_SWITCH_SH5_MEDIA_ANY_EXTRA): Likewise. (TARGET_SWITCHES): Use TARGET_SWITCH_SH5_32_ANY_EXTRA and TARGET_SWITCH_SH5_MEDIA_ANY_EXTRA. (TARGET_OPTIONS): Add -mdivsi3_libfunc. (OVERRIDE_OPTIONS): Set sh_divsi3_libfunc if it hasn't been set by the user. Also set flag_no_function_cse for (TARGET_SHMEDIA && !TARGET_PT_FIXED). (HARD_REGNO_MODE_OK): Allow TARGET_REGS in PDImode. (CONSTRAINT_LEN): Remove debug version. (SECONDARY_INOUT_RELOAD_CLASS): Break out of (SECONDARY_OUTPUT_RELOAD_CLASS). Use EXTRA_CONSTRAINT_Csy to check if a target register needs a secondary reload through GENERAL_REGS. (SECONDARY_INPUT_RELOAD_CLASS): Use SECONDARY_INOUT_RELOAD_CLASS. (sh_divsi3_libfunc): Declare. (FUNCTION_PROFILER): Provide SHMEDIA version. * config/sh/predicates.md: New file. * config/sh/sh.md (predicates.md): Include. (divsi_inv_call_combine, divsi3): Use sh_divsi3_libfunc. (reload_insi): Fix predicates and constraints. (ptabs): New expander. (*extendsipdi_media, *truncdipdi_media): New insns. (call, call_value, sibcall): Use shmedia_prepare_call_address. * doc/invoke.texi (-multcost, -mdiv): Document new SH options. (-mdivsi3_libfunc, -madjust-unroll, -mindexed-addressing): Likewise. (-mgettrcost, -mpt-fixed, -minvalid-symbols): Likewise. 2005-04-11 J"orn Rennecke * sh.c (print_operand): Remove sh_rep_vec extraction. (sh_output_mi_thunk): Make i unsigned. * sh.c (TARGET_ADJUST_UNROLL_MAX): Only redefine if already defined. (sh_adjust_unroll_max): Only define if TARGET_ADJUST_UNROLL_MAX is defined. Update label detection code and iteration lookup, enable basic functionality, but without IV analysis.
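The 32/64-bit conversion problem named in the subject line is what the TRULY_NOOP_TRUNCATION and PROMOTE_MODE changes further down address: on SHmedia, SImode values live sign-extended in 64-bit registers, so truncating a DImode value to SImode must re-extend the low 32 bits rather than being treated as a no-op. A minimal illustration of the class of code affected (hypothetical, not the PR's actual testcase):

/* Illustrative only; not part of this patch.  The low 32 bits of
   `wide' are negative when reinterpreted as SImode, so both the
   truncation and the widening back must sign-extend correctly.  */
#include <stdio.h>

long long wide = 0x1F0000000LL;

int
main (void)
{
  int narrow = (int) wide;            /* DImode -> SImode truncation */
  long long back = narrow;            /* SImode -> DImode sign extension */
  printf ("%d %lld\n", narrow, back); /* expect -268435456 -268435456 */
  return 0;
}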
2005-04-11 J"orn Rennecke * sh.h (OPTIMIZATION_OPTIONS): Don't make setting of flag_branch_target_load_optimize dependent on TARGET_SHMEDIA. Set flag_finite_math_only to 2. If flag_finite_math_only is set to 2, set it to 1 iff we use SH2E..SH4 arithmetic without full IEEE support. 2005-04-09 Kaz Kojima * config/sh/lib1funcs.asm (ic_invalidate): Fix typos. * config/sh/t-linux (LIB1ASMFUNCS_CACHE): Add _ic_invalidate_array. 2005-04-06 J"orn Rennecke Merge of SuperH / STM SH specific patches, including fix for PR target/20695: * config.gcc (sh*-superh-elf, sh*elf (newlib)): Use newlib.h when building with libgloss. (sh*elf): Implement --without-fp option. (sh64-superh-linux*): Don't multilib. (sh*-*-linux): Use sh3 as basic multilib. * config/sh/crt1.asm (SHmedia start): Add code to enable the MMU, and to set up vbr. Enable FPU before calling set_fpscr. Load atexit address just before use. Use __SH_FPU_ANY__. (SH3*/SH4* start): Add code to set up vbr. Use __SH_FPU_ANY__. Set DN bit in fpscr. * config/sh/elf.h (SUBTARGET_ASM_ISA_SPEC): Merge into: config/sh/sh.h (SH_ASM_SPEC, SUBTARGET_ASM_ISA_SPEC): Here. * config/sh/lib1funcs.asm (HIDDEN_FUNC, HIDDEN_ALIAS): Define. (FMOVD_WORKS): Don't define for __SH5__. (ashiftrt_r4_0, ashiftrt_r4_1, ashiftrt_r4_2, ashiftrt_r4_3): Hide. (ashiftrt_r4_4, ashiftrt_r4_5, ashiftrt_r4_6, ashiftrt_r4_7): Hide. (ashiftrt_r4_8, ashiftrt_r4_9, ashiftrt_r4_10, ashiftrt_r4_11): Hide. (ashiftrt_r4_12, ashiftrt_r4_13, ashiftrt_r4_14, ashiftrt_r4_15): Hide. (ashiftrt_r4_16, ashiftrt_r4_17, ashiftrt_r4_18, ashiftrt_r4_19): Hide. (ashiftrt_r4_20, ashiftrt_r4_21, ashiftrt_r4_22, ashiftrt_r4_23): Hide. (ashiftrt_r4_24, ashiftrt_r4_25, ashiftrt_r4_26, ashiftrt_r4_27): Hide. (ashiftrt_r4_28, ashiftrt_r4_29, ashiftrt_r4_30, ashiftrt_r4_31): Hide. (ashiftrt_r4_32, ashrsi3, ashlsi3, lshrsi3, movmem, movstr): Hide. (movstrSI64, movmemSI64, movstrSI60, movmemSI60): Hide. (movstrSI56, movmemSI56, movstrSI52, movmemSI52): Hide. (movstrSI48, movmemSI48, movstrSI44, movmemSI44): Hide. (movstrSI40, movmemSI40, movstrSI36, movmemSI36): Hide. (movstrSI32, movmemSI32, movstrSI28, movmemSI28): Hide. (movstrSI24, movmemSI24, movstrSI20, movmemSI20): Hide. (movstrSI16, movmemSI16, movstrSI12, movmemSI12): Hide. (movstrSI8, movmemSI8, movstrSI4, movmemSI4): Hide. (movmemSI0, movstrSI0): Remove. (movmemSI4): Schedule last store into rts delay slot. (movmem): Shorten code. Provide ENDFUNC. (movmem_i4_even, movmem_i4_odd, movmemSI12_i4, mulsi3): Hide. (mulsi3): Provide ENDFUNC. (sdivsi3_i4, sdivsi3_i4, udivsi3_i4, udivsi3, set_fpscr): Hide. (SH5 sdivsi3): Reimplement, using: (div_table): New, linear approximation table lookup for division seed. (sdivsi3_2): New SH5 entry point. (divdi3): Use hidden alias for udivdi3. (moddi3): Use hidden alias for umoddi3. (init_trampoline): Hide. Provide exact ENDFUNC. (ic_invalidate): Hide. Re-implement SH4 version, using (ic_invalidate_array): New global. (GCC_shcompact_return_trampoline, GCC_nested_trampoline): Hide. (GCC_push_shmedia_regs_nofpu): Only provide for __SH4_NOFPU__. (GCC_pop_shmedia_regs_nofpu): Likewise. * config/sh/libgcc-excl.ver (__mulsi3): Add. * config/sh/linux.h (TARGET_DEFAULT): Include TARGET_OPT_DEFAULT. * config/sh/sh-protos.h (sh_function_kind): New enum. (sh_gen_truncate, replace_n_hard_rtx): Declare. (function_symbol): Update declaration. (shmedia_cleanup_truncate, sh_contains_memref_p): Declare. * sh.c (cfgloop.h): Include. (TARGET_ADJUST_UNROLL_MAX): Redefine. (print_operand): Add '>' and 'U' support.
Handle TRUNCATE and SIGN_EXTEND. (function_symbol): Add arguments for target and kind of symbol. If not an ordinary function symbol, make sure the string becomes unique. For PIC, load appropriately depending on kind of symbol. Changed all callers. (prepare_move_operands): Don't copy R0 to a pseudo for SHmedia. (multcosts): Check sh_multcost_str. If not set, return 2 for SHMEDIA TARGET_SMALLCODE. (sh_rtx_costs): Lower some costs when outer_code is SET. Add code for CONST_VECTOR, MINUS and PARALLEL. (gen_shifty_op): Don't emit nop. (expand_ashiftrt): While expanding to rtl, do shift by 31 using a register set to zero. (gen_datalabel_ref): Make sure that the string is shared. (MAX_POOL_SIZE): Define as 372. (find_barrier): Remove spurious adjustment. (sh_media_register_for_return): Return -1 for interrupt handlers. (sh_pch_valid_p): Use a copy of TARGET_OPTIONS. (general_movsrc_operand): Accept vectors that match sh_rep_vec. (general_movdst_operand): For SHmedia, reject paradoxical DImode subregs before high_life / reload. (arith_reg_operand): Allow no-op sign extensions. (logical_reg_operand, fp_arith_reg_dest, xor_operand): New functions. (cmp_operand, shift_operator, logical_operator): Likewise. (minuend_operand, ua_address_operand, cache_address_operand): Likewise. (ua_offset, shift_count_reg_operand, shift_count_operand): Likewise. (sh_adjust_unroll_max, replace_n_hard_rtx, sh_gen_truncate): Likewise. (shmedia_cleanup_truncate, sh_contains_memref_p_1): Likewise. (sh_contains_memref_p): Likewise. (shmedia_6bit_operand): Remove. (arith_operand): Allow some TRUNCATEs. (logical_operand): Disallow subregs <= SImode of >= DImode. (greater_comparison_operator): Fix mode comparison. (less_comparison_operator): Likewise. (target_reg_operand, target_operand): Compare modes with Pmode. (sh_adjust_cost): Consider the dependency between a target register load and its use in a subsequent block. Implement mac_media latency exception. Before reload, anticipate floating point latencies to be at least four. Give preference to the ptabs feeding a casesi_jump_media. Handle UNSPEC in a CALL address. (sh_optimize_target_register_callee_saved): Improve handling of borderline cases. (sh_function_ok_for_sibcall): Allow for non-pic, and also when we will use the symbol with @GOTOFF addressing. (SH_BLTIN_UDI): Remove. (SH_BLTIN_LDUA_L64, SH_BLTIN_LDUA_Q64, SH_BLTIN_STUA_L64): New. (SH_BLTIN_STUA_Q64): Likewise. (signature_args, SH_BLTIN_NUM_SHARED_SIGNATURES): Update. (SH_BLTIN_2, SH_BLTIN_SU, SH_BLTIN_3, SH_BLTIN_SUS): Renumber. (SH_BLTIN_PSSV, SH_BLTIN_XXUU, SH_BLTIN_UUUU, SH_BLTIN_PV): Likewise. (bdesc): Add entries for alloco, mac_media, sqrtdf2, sqrtsf2, fsrra_s, {ld,st}{hi,lo}.[lq] and prefetch. Change mextr entries to use SH_BLTIN_V8QI3. (sh_media_init_builtins): Implement specific TARGET_SHMEDIA32 / TARGET_SHMEDIA64 checks for pointer arguments. (sh_expand_builtin): For pointer types, use ptr_mode / ptr_type_mode. (sh_register_move_cost): Check sh_gettrcost_str. (cmpsi_operand): T_REG is only allowed for TARGET_SH1. (sh_output_mi_thunk): Make static. Check that needed registers are actually available. Make sure that the sibcall won't go via the PLT. (sh_multcost_str, sh_gettrcost_str, sh_div_str): New variables. (cut2_workaround_str, sh_div_strategy, boardtype, osruntime): Likewise. (arith_reg_dest): Allow paradoxical DImode subreg for ! TARGET_SHMEDIA. * sh.h (TARGET_CPU_CPP_BUILTINS): Define __SH_FPU_ANY__ and __SH_FPU_DOUBLE__. (INDEXED_ADDRESS_BIT, ADJUST_UNROLL_BIT, TARGET_DIVIDE_INV): Define.
(TARGET_HARVARD): Also true for TARGET_SH5. (TARGET_DIVIDE_FP, TARGET_DIVIDE_INV_FP, TARGET_DIVIDE_CALL2): Define. (TARGET_DIVIDE_INV_MINLAT, TARGET_DIVIDE_INV20U): Define. (TARGET_DIVIDE_INV20L, TARGET_DIVIDE_INV_CALL): Define. (TARGET_DIVIDE_INV_CALL2, TARGET_ALLOW_INDEXED_ADDRESS): Define. (TARGET_ADJUST_UNROLL, TARGET_OPT_DEFAULT, SUBTARGET_OPTIONS): Define. (TARGET_SWITCHES): Remove excessive whitespace. Add options indexed-addressing, no-indexed-addressing, adjust-unroll and no-adjust-unroll. (TARGET_DEFAULT): Add TARGET_OPT_DEFAULT. (TARGET_OPTIONS): Define. (EXTRA_SPECS): Add subtarget_asm_spec. (SH_ASM_SPEC): Pass cut2-workaround option. (SUBTARGET_ASM_ISA_SPEC): Enforce STRICT_NOFPU for SH4 --without-fp. (LINK_EMUL_PREFIX): If target defaults to little endian, default to shl. (OPTIMIZATION_OPTIONS): Set sh_div_str. If not using -mieee, set flag_finite_math_only. (sh_divide_strategy_e): New enum. (sh_div_strategy): Declare. (OVERRIDE_OPTIONS): Don't set FMOVD_BIT for TARGET_SHCOMPACT. Clear flag_if_conversion2 for SHMEDIA. Set sh_div_strategy. Leave profile_flag and profile_arc_flag alone. (LOOP_ALIGN): Replace TARGET_HARVARD test with TARGET_HARD_SH4 test. (HARD_REGNO_MODE_OK): Allow TImode in aligned FP registers. (MODES_TIEABLE_P): For TARGET_SHMEDIA, allow tying of integral modes of the same size. (CONST_OK_FOR_I): Fix detection of I06 constraint. (PREFERRED_RELOAD_CLASS): Also choose GENERAL_REGS for PIC_DIRECT_ADDR_P. (SECONDARY_INPUT_RELOAD_CLASS): Fix parentheses. For TARGET_SHMEDIA, check for inqhi_operand, LABEL_REF and PIC_DIRECT_ADDR_P. (FUNCTION_VALUE, PROMOTE_MODE): Don't promote from SImode. For TARGET_SHMEDIA32, promote to SImode. (EXTRA_CONSTRAINT_C16): Allow SIGN_EXTEND to SImode. (DATALABEL_REF_NO_CONST_P): Don't allow SYMBOL_REF. (DATALABEL_REF_P): Don't define. (NON_PIC_REFERENCE_P): Allow LABEL_REF and SYMBOL_REF directly inside a CONST. Don't allow DATALABEL_REF_NO_CONST_P outside of a CONST. Allow a LABEL_REF in a sum. (BASE_REGISTER_RTX_P): Check TRULY_NOOP_TRUNCATION. (INDEX_REGISTER_RTX_P): Likewise. (GO_IF_LEGITIMATE_INDEX): Check if passed the address of an unaligned load / store. (ALLOW_INDEXED_ADDRESS): Define. (GO_IF_LEGITIMATE_ADDRESS): Use it. (TRULY_NOOP_TRUNCATION): Don't allow no-op truncation from 64 bit or beyond to less than 64 bit. (PRINT_OPERAND_PUNCT_VALID_P): Allow '>'. (rtx_equal_function_value_matters): Don't declare. (arith_reg_operand): Allow sign_extend. (PREDICATE_CODES): Allow SIGN_EXTEND in arith_reg_operand. Add any_arith_reg_dest, cache_address_operand, cmp_operand, fp_arith_reg_dest, logical_operator, logical_reg_operand, minuend_operand, shift_count_operand, shift_count_reg_operand, shift_operator, ua_address_operand, ua_offset, unary_float_operator, xor_operand. Don't allow PARALLEL in sh_1el_vec and sh_rep_vec. Remove shmedia_6bit_operand. (SPECIAL_MODE_PREDICATES): Add any_arith_reg_dest, target_operand and target_reg_operand. (SIDI_OFF, SIMULTANEOUS_PREFETCHES, high_life_started): Define. (sh_multcost_str, sh_gettrcost_str, sh_div_str): Declare. (cut2_workaround_str): Declare. (INDEX_REG_CLASS): Is NO_REGS if ALLOW_INDEXED_ADDRESS is zero. (LEGITIMIZE_RELOAD_ADDRESS): Check ALLOW_INDEXED_ADDRESS. Substitute INDEX_REG_CLASS with R0_REGS. * sh.md (UNSPEC_DIV_INV_M0, UNSPEC_DIV_INV_M1): New constants. (UNSPEC_DIV_INV_M2, UNSPEC_DIV_INV_M3, UNSPEC_DIV_INV20): Likewise. (UNSPEC_ASHIFTRT, UNSPEC_THUNK): Likewise. (Attribute "length"): jump_media has length 8 if TARGET_SH5_CUT2_WORKAROUND is true.
("highpart"): New attribute. (cmpsi): Allow TARGET_SHMEDIA. (cmpeqsi_media, cmpgtsi_media, cmpgtusi_media): New patterns. (cmpsieqsi_media, cmpsieqdi_media, cmpsigtsi_media): Likewise. (cmpsigtdi_media, cmpsigtusi_media, cmpsigtudi_media): Likewise. (*cmpne0si_media, *cmpne0sisi_media, movdicc_true+1): Likewise. (movdicc_true+2, movsicc_false, movsicc_true): Likewise. (movsicc_true+1, movsicc_true+2, movsicc_true+3): Likewise. (*movsicc_umin, movsicc, movqicc, *adddisi3_media): Likewise. (addsidi3_media, subdisi3_media, mov_neg_si_t): Likewise. (*subsi3_media+1, *subsi3_media+2, divsi3_media_2): Likewise. (divsi_inv_call, *divsi_inv_call_combine, divsi_inv_m0): Likewise. (divsi_inv_m1, divsi_inv_m2, divsi_inv_m3, divsi_inv_m1_3): Likewise. (divsi_inv20, divsi_inv_fp, *divsi_inv_fp_combine, muldi3): Likewise. (*andsi3_media, andcsi3): Likewise. (cmpeqdi_media): Use cmp_operand operand predicate. (*adddi3_media, adddi3z_media): Use arith_reg_dest operand predicate. (adddi3_compact, adddi3_compact+1, addc, addc1): Likewise. (addsi3_media, *addsi3_compact, *subdi3_media): Likewise. (subdi3_compact, subdi3_compact+1, subc, subc1): Likewise. (*subsi3_internal, *subsi3_media, udivsi3_sh2a, divsi3_sh2a): Likewise. (mul_r, mulsidi3_media, mulsidi3_compact): Likewise. (mulsidi3_compact+1, umulsidi3_media, umulsidi3_compact): Likewise. (umulsidi3_compact+1, *andsi3_compact, anddi3, andcdi3): Likewise. (*subsi3_media): Make define_insn_and_split. Use minuend_operand operand predicate. (subsi3): Don't force operand 1 into a register if it is a SUBREG. (udivsi3_i1_media, udivsi3): Use Pmode for function/target address. (divsi3_i1_media, beq_media, *beq_media_i, bne_media): Likewise. (bgt_media, bge_media, bgtu_media, bgeu_media, *bgt_media_i): Likewise. (*blt_media_i, bunordered, jump_media, jump, call_media): Likewise. (call_value_media, call, call_value, sibcall_media, sibcall): Likewise. (indirect_jump, casesi_jump_media, GOTaddr2picreg, *ptb): Likewise. (symGOT_load, casesi, casesi_shift_media, casesi_load_media): Likewise. (return_media_i, return_media): Likewise. (udivsi3_i1_media): Enable also for ! TARGET_DIVIDE_FP. (divsi3_i1_media): Likewise. Don't clobber R2 / R3 / TR1 / TR2. (divsi3): Add support for division by multiplying with inverse. (andsi3): Use logical_reg_operand predicate. Add SHmedia support. (iorsi3): Rename to: (*iorsi3_compact). (xorsi3): Rename to: (*xorsi3_compact). (iorsi3, *iorsi3_media, *logical_sidi3, xorsi3): New patterns. (*logical_sidisi3, *logical_sidi3_2, rotrdi3_mextr+1): Likewise. (ashrsi2_31+2, *ashlsi_c_void, *ashldisi3_media): Likewise. (*lshrdisi3_media, *ashrdisi3_media, ashrdisi3_media_high): Likewise. (ashrdisi3_media_opaque, one_cmpldi2+1, cneg, movsi_const): Likewise. (movsi_const_16bit, *movdi_media_I16, *shori_media_si): Likewise. (*beq_media_i32, *bgt_media_i32, *blt_media_i32): Likewise. (bunordered+1, sibcalli_thunk, ptrel_si, cmpsieqsf_media): Likewise. (cmpsieqdf_media, addv2hi3, ashlv2si3+1, subv2hi3, ldhi_l): Likewise. (ldhi_q, *ldhi_q_comb0, *ldhi_q_comb1, ldlo_l, ldlo_q): Likewise. (*ldlo_q_comb0, *ldlo_q_comb1, sthi_l, sthi_q): Likewise. (*sthi_q_comb0, *sthi_q_comb1, stlo_l, stlo_q): Likewise. (*stlo_q_comb0, *stlo_q_comb1, ldhi_l64, ldhi_q64, ldlo_l64): Likewise. (ldlo_q64, sthi_l64, sthi_q64, stlo_l64, stlo_q64, alloco_i): Likewise. (alloca_i+1): Likewise. (prefetch_media): Inhibit generator function generation. (prefetch_i4): Likewise. Also enable for TARGET_SHCOMPACT. (*iorsi3_compact, iordi3): Use arith_reg_dest operand predicate. 
(*xorsi3_compact, xordi3, xordi3+1, rotlsi3_1, rotlsi3_31): Likewise. (rotlsi3_16, rotlsi3, *rotlhi3_8, ashlsi3_sh2a, ashlsi3_std): Likewise. (ashlhi3_k, ashlsi3_n, ashlsi3_n+1, ashlsi3_media): Likewise. (*ashlhi3_n, ashlhi3+1, ashrsi3_sh2a, ashrsi3_k, ashrsi2_16): Likewise. (ashrsi2_16+1, ashrsi2_31, ashrsi2_31+1, ashlsi_c): Likewise. (ashrsi3_d, ashrsi3_media, lshrsi3_sh2a, lshrsi3_d): Likewise. (lshrsi3_m, lshrsi3_k, lshrsi3_n, lshrsi3_n, lshrsi3_media): Likewise. (lshrsi3, ashldi3_k, ashldi3_mediai, lshrdi3_k): Likewise. (ashrdi3_k, xtrct_left, xtrct_right, negc, *negdi_media): Likewise. (negsi2, one_cmplsi2, one_cmpldi2, zero_extendsidi2): Likewise. (*zero_extendhisi2_compact, *zero_extendqisi2_compact): Likewise. (zero_extendqihi2, extendhisi2, *extendhisi2_compact): Likewise. (extendqisi2, *extendqisi2_compact, extendqihi2): Likewise. (movsi_const_16bit+1, *movdi_media_I16+1): Likewise. (movdf_media_nofpu+1, movsf_media_nofpu+1, dect, movt, seq): Likewise. (movnegt+1, divsf3_i): Likewise. (xordi3): Use xor_operand operand predicate. (ashlsi3_media): Use shift_count_operand operand predicate. (ashrsi3_media, lshrsi3_media, ashldi3_media, lshrdi3_media): Likewise. (ashrdi3_media): Likewise. (ashrsi2_31+1): Use mov_neg_si_t. (lshrdi3_media, ashrdi3_media): Use ext_dest_operand predicate. Make sure that either the destination is not a subreg, or that the shift generates a sufficient number of sign bit copies. (*loaddi_trunc): Use any_register_operand predicate. (ic_invalidate_line_sh4a): Likewise. (*zero_extendhisi2_media+1): Use simplify_gen_subreg. (*extendhisi2_media+1i, *extendqisi2_media+1): Likewise. (extendsidi2): Add fmov.sl alternative. (load_ra): Add mode for operand 1. (*movsi_media): Discourage the use of floating point registers. Allow TRUNCATE. (*movsi_media_nofpu): Ignore target register alternative for register preferencing. Allow TRUNCATE. (movsi_const_16bit+1): Use gen_movsi_const, and add an REG_EQUAL note. (*movqi_media): Use extend_reg_or_0_operand predicate. (*movdi_media): Ignore target register alternative for register preferencing. Discourage the use of floating point registers. (*movdi_media_nofpu): Ignore target register alternative for register preferencing. (movdi_const_16bit+1): If the source is subregged from SImode, sign-extend highpart. Use ext_dest_operand predicate. (movdi_const_16bit+2, shori_media): Use ext_dest_operand predicate. (reload_outdf+7, reload_outdf+8): Check ALLOW_INDEXED_ADDRESS. (stuff_delay_slot): Add modes for operands 0 and 1. (*beq_media_i, *bgt_media_i): Add '>' to output templates. (*blt_media_i, jump_media): Likewise. (beq, bne): Pass through SImode inputs, and I06 constants. (bgt, blt, ble, bge, bgtu): Pass through SImode inputs, the constant 0. (bltu, bgeu, bleu): Likewise. (GOTaddr2picreg): Don't call gen_datalabel_ref. (ptrel): Rename to: (ptrel_di). (tls_global_dynamic, tls_local_dynamic): Add mode for call. (seq): Properly support input modes other than DImode. (slt, sle, sgt, sge,sne): Properly support SImode. (addsf3_i, negdf2_i, sqrtdf2_i, absdf2_i): Use fp_arith_reg_operand. (mac_media) Enable generator function generation. (fix_truncsfdi2): Use fp_arith_reg_dest operand predicate. (fix_truncdfdi2): Likewise. (movv8qi_i+3): Enable for CONST0_RTX too. (movv2hi_i): Use add.l, not addz.l. (ashlv2si3, ashlv4hi3, lshrv2si3): Use shift_count_reg_operand. (lshrv4hi3): Likewise. (ussubv8qi3): Allow zero for operand 1. (prefetch): Allow any mode for operand 0. Enable for SHCOMPACT. Use force_reg. 
* config/sh/shmedia.md: (shmedia): Remove automaton declaration. (sh5inst_pipe, sh5fpu_pipe): New automatons. (sh5issue): Use sh5inst_pipe. (sh5fds): Use sh5fpu_pipe. (shmedia_fdiv, shmedia_dfdiv): Also use sh5issue. * config/sh/sshmedia.h (sh_media_GETCON, sh_media_PUTCON): Declare with always_inline Attribute. * t-sh64 (LIB1ASMFUNCS): Add _div_table. * config/sh/ushmedia.h (sh_media_MABS_L): Use builtin function. (sh_media_MABS_W, sh_media_MADD_L, sh_media_MADD_W): Likewise. (sh_media_MADDS_L, sh_media_MADDS_UB, sh_media_MADDS_W): Likewise. (sh_media_MCMPEQ_B, sh_media_MCMPEQ_L, sh_media_MCMPEQ_W): Likewise. (sh_media_MCMPGT_UB, sh_media_MCMPGT_L, sh_media_MCMPGT_W): Likewise. (sh_media_MCMV, sh_media_MCNVS_LW, sh_media_MCNVS_WB): Likewise. (sh_media_MCNVS_WUB, sh_media_MEXTR1, sh_media_MEXTR2): Likewise. (sh_media_MEXTR3, sh_media_MEXTR4, sh_media_MEXTR5): Likewise. (sh_media_MEXTR6, sh_media_MEXTR7, sh_media_MMACFX_WL): Likewise. (sh_media_MMACNFX_WL, sh_media_MMUL_L, sh_media_MMUL_W): Likewise. (sh_media_MMULFX_L, sh_media_MMULFX_W, sh_media_MMULFXRP_W): Likewise. (sh_media_MMULHI_WL, sh_media_MMULLO_WL): Likewise. (sh_media_MMULSUM_WQ, sh_media_MPERM_W, sh_media_MSAD_UBQ): Likewise. (sh_media_MSHALDS_L, sh_media_MSHALDS_W, sh_media_MSHARD_L): Likewise. (sh_media_MSHARD_W, sh_media_MSHARDS_Q, sh_media_MSHFHI_B): Likewise. (sh_media_MSHFHI_L, sh_media_MSHFHI_W, sh_media_MSHFLO_B): Likewise. (sh_media_MSHFLO_L, sh_media_MSHFLO_W, sh_media_MSHLLD_L): Likewise. (sh_media_MSHLLD_W, sh_media_MSHLRD_L, sh_media_MSHLRD_W): Likewise. (sh_media_MSUB_L, sh_media_MSUB_W, sh_media_MSUBS_L): Likewise. (sh_media_MSUBS_UB, sh_media_MSUBS_W, sh_media_FABS_D): Likewise. (sh_media_FABS_S, sh_media_FCMPUN_D, sh_media_FCMPUN_S): Likewise. (sh_media_FIPR_S, sh_media_FMAC_S, sh_media_FSQRT_D): Likewise. (sh_media_FSQRT_S, sh_media_FTRV_S, sh_media_LDHI_L): Likewise. (sh_media_LDHI_Q, sh_media_LDLO_L, sh_media_LDLO_Q): Likewise. (sh_media_STHI_L, sh_media_STHI_Q, sh_media_STLO_L): Likewise. (sh_media_STLO_Q, sh_media_NSB, sh_media_BYTEREV): Likewise. (sh_media_PREFO, sh_media_ALLOCO): Likewise. (sh_media_FCOSA_S, sh_media_FSINA_S): New function. (sh_media_FMOV_DQ, sh_media_FMOV_LS): Use union assignment. (sh_media_FMOV_QD, sh_media_FMOV_SL): Likewise. (sh_media_CMVEQ): Use C code. Add attribute always_inline. (sh_media_CMVNE): Likewise. (sh_media_ADDZ_L): Use C code. (sh_media_unaligned_LD_L): Use intrinsics directly. (sh_media_unaligned_LD_Q, sh_media_unaligned_ST_L): Likewise. (sh_media_unaligned_ST_Q): Likewise. * config/sh/divtab.c: New file. 2005-04-06 Andrew Stubbs J"orn Rennecke * config/sh/superh64.h, config/sh/superh.h: New files. * config/sh/newlib.h, config/sh/t-superh: Likewise. * config.gcc: Add support for sh*-superh-elf* and sh64-superh-linux*. gcc/testsuite: 2005-05-06 J"orn Rennecke * gcc.dg/pr15784-3.c: Add -fno-finite-math-only option. * gcc.dg/20021029-1.c: For sh64*-*-*, add -mpt-fixed. 
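Several entries above add an SHmedia software-division path that multiplies by an approximate inverse: lib1funcs.asm gains div_table, a linear-approximation lookup used as a reciprocal seed, the new divtab.c generates that table, and sh.md's divsi3 learns to expand division this way. The sketch below only illustrates the principle (coarse table-style seed, Newton-Raphson refinement, exact fix-up of the quotient); the fixed-point scaling, table layout and entry points of the actual libgcc code differ.

/* Hypothetical, self-checking sketch of division via a seeded reciprocal;
   names and scaling are illustrative, not those used by lib1funcs.asm.  */
#include <stdio.h>

/* Reciprocal of the divisor with only its top 5 bits kept, loosely
   analogous to the 32-entry table that divtab.c emits.  */
static double
seed (unsigned d)
{
  int shift = 0;
  while ((d >> shift) >= 32)
    shift++;
  return 1.0 / (double) ((d >> shift) << shift);
}

/* Two Newton-Raphson steps y' = y * (2 - d*y), a multiply, and an exact
   correction of the resulting quotient.  */
static unsigned
approx_udiv (unsigned n, unsigned d)
{
  double y = seed (d);
  unsigned q;

  y = y * (2.0 - d * y);
  y = y * (2.0 - d * y);
  q = (unsigned) ((double) n * y);
  while ((unsigned long long) (q + 1) * d <= n)
    q++;
  while ((unsigned long long) q * d > n)
    q--;
  return q;
}

int
main (void)
{
  unsigned n, d;

  for (d = 1; d < 5000; d += 13)
    for (n = 0; n < 200000; n += 997)
      if (approx_udiv (n, d) != n / d)
        {
          printf ("mismatch at %u / %u\n", n, d);
          return 1;
        }
  puts ("ok");
  return 0;
}

Built natively with any C compiler, the program verifies the approximation against the hardware divide.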
From-SVN: r99460 --- gcc/config/sh/crt1.asm | 1092 +++++++++++- gcc/config/sh/divtab.c | 204 +++ gcc/config/sh/elf.h | 8 - gcc/config/sh/lib1funcs.asm | 751 ++++++--- gcc/config/sh/libgcc-excl.ver | 1 + gcc/config/sh/linux.h | 10 +- gcc/config/sh/netbsd-elf.h | 9 +- gcc/config/sh/newlib.h | 26 + gcc/config/sh/predicates.md | 38 + gcc/config/sh/sh-modes.def | 2 + gcc/config/sh/sh-protos.h | 22 +- gcc/config/sh/sh.c | 1336 +++++++++++++-- gcc/config/sh/sh.h | 495 ++++-- gcc/config/sh/sh.md | 3318 ++++++++++++++++++++++++++++++------- gcc/config/sh/shmedia.md | 10 +- gcc/config/sh/sshmedia.h | 6 + gcc/config/sh/superh.h | 151 ++ gcc/config/sh/superh64.h | 50 + gcc/config/sh/t-linux | 2 +- gcc/config/sh/t-sh64 | 4 +- gcc/config/sh/t-superh | 6 + gcc/config/sh/ushmedia.h | 884 +++++----- gcc/doc/invoke.texi | 111 +- gcc/testsuite/ChangeLog | 6 + gcc/testsuite/gcc.dg/20021029-1.c | 1 + gcc/testsuite/gcc.dg/pr15784-3.c | 3 +- 26 files changed, 6952 insertions(+), 1594 deletions(-) create mode 100644 gcc/config/sh/divtab.c create mode 100644 gcc/config/sh/newlib.h create mode 100644 gcc/config/sh/predicates.md create mode 100644 gcc/config/sh/superh.h create mode 100644 gcc/config/sh/superh64.h create mode 100644 gcc/config/sh/t-superh diff --git a/gcc/config/sh/crt1.asm b/gcc/config/sh/crt1.asm index f2c4e0d..2fab84d 100644 --- a/gcc/config/sh/crt1.asm +++ b/gcc/config/sh/crt1.asm @@ -1,4 +1,4 @@ -/* Copyright (C) 2000, 2001, 2003 Free Software Foundation, Inc. +/* Copyright (C) 2000, 2001, 2003, 2004 Free Software Foundation, Inc. This file was pretty much copied from newlib. This file is part of GCC. @@ -27,6 +27,19 @@ along with this program; see the file COPYING. If not, write to the Free Software Foundation, 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. 
*/ +#ifdef MMU_SUPPORT + /* Section used for exception/timer interrupt stack area */ + .section .data.vbr.stack,"aw" + .align 4 + .global __ST_VBR +__ST_VBR: + .zero 1024 * 2 /* ; 2k for VBR handlers */ +/* Label at the highest stack address where the stack grows from */ +__timer_stack: +#endif /* MMU_SUPPORT */ + + /* ;---------------------------------------- + Normal newlib crt1.asm */ #ifdef __SH5__ .section .data,"aw" @@ -37,6 +50,89 @@ ___data: .global ___rodata ___rodata: +#define ICCR_BASE 0x01600000 +#define OCCR_BASE 0x01e00000 +#define MMUIR_BASE 0x00000000 +#define MMUDR_BASE 0x00800000 + +#define PTE_ENABLED 1 +#define PTE_DISABLED 0 + +#define PTE_SHARED (1 << 1) +#define PTE_NOT_SHARED 0 + +#define PTE_CB_UNCACHEABLE 0 +#define PTE_CB_DEVICE 1 +#define PTE_CB_CACHEABLE_WB 2 +#define PTE_CB_CACHEABLE_WT 3 + +#define PTE_SZ_4KB (0 << 3) +#define PTE_SZ_64KB (1 << 3) +#define PTE_SZ_1MB (2 << 3) +#define PTE_SZ_512MB (3 << 3) + +#define PTE_PRR (1 << 6) +#define PTE_PRX (1 << 7) +#define PTE_PRW (1 << 8) +#define PTE_PRU (1 << 9) + +#define SR_MMU_BIT 31 +#define SR_BL_BIT 28 + +#define ALIGN_4KB (0xfff) +#define ALIGN_1MB (0xfffff) +#define ALIGN_512MB (0x1fffffff) + +#define DYNACON_BASE 0x0f000000 +#define DM_CB_DLINK_BASE 0x0c000000 +#define DM_DB_DLINK_BASE 0x0b000000 + +#define FEMI_AREA_0 0x00000000 +#define FEMI_AREA_1 0x04000000 +#define FEMI_AREA_2 0x05000000 +#define FEMI_AREA_3 0x06000000 +#define FEMI_AREA_4 0x07000000 +#define FEMI_CB 0x08000000 + +#define EMI_BASE 0X80000000 + +#define DMA_BASE 0X0e000000 + +#define CPU_BASE 0X0d000000 + +#define PERIPH_BASE 0X09000000 +#define DMAC_BASE 0x0e000000 +#define INTC_BASE 0x0a000000 +#define CPRC_BASE 0x0a010000 +#define TMU_BASE 0x0a020000 +#define SCIF_BASE 0x0a030000 +#define RTC_BASE 0x0a040000 + + + +#define LOAD_CONST32(val, reg) \ + movi ((val) >> 16) & 65535, reg; \ + shori (val) & 65535, reg + +#define LOAD_PTEH_VAL(sym, align, bits, scratch_reg, reg) \ + LOAD_ADDR (sym, reg); \ + LOAD_CONST32 ((align), scratch_reg); \ + andc reg, scratch_reg, reg; \ + LOAD_CONST32 ((bits), scratch_reg); \ + or reg, scratch_reg, reg + +#define LOAD_PTEL_VAL(sym, align, bits, scratch_reg, reg) \ + LOAD_ADDR (sym, reg); \ + LOAD_CONST32 ((align), scratch_reg); \ + andc reg, scratch_reg, reg; \ + LOAD_CONST32 ((bits), scratch_reg); \ + or reg, scratch_reg, reg + +#define SET_PTE(pte_addr_reg, pteh_val_reg, ptel_val_reg) \ + putcfg pte_addr_reg, 0, r63; \ + putcfg pte_addr_reg, 1, ptel_val_reg; \ + putcfg pte_addr_reg, 0, pteh_val_reg + #if __SH5__ == 64 .section .text,"ax" #define LOAD_ADDR(sym, reg) \ @@ -55,8 +151,279 @@ ___rodata: start: LOAD_ADDR (_stack, r15) +#ifdef MMU_SUPPORT + ! Set up the VM using the MMU and caches + + ! .vm_ep is first instruction to execute + ! after VM initialization + pt/l .vm_ep, tr1 + + ! Configure instruction cache (ICCR) + movi 3, r2 + movi 0, r3 + LOAD_ADDR (ICCR_BASE, r1) + putcfg r1, 0, r2 + putcfg r1, 1, r3 + + ! movi 7, r2 ! write through + ! Configure operand cache (OCCR) + LOAD_ADDR (OCCR_BASE, r1) + putcfg r1, 0, r2 + putcfg r1, 1, r3 + + ! Disable all PTE translations + LOAD_ADDR (MMUIR_BASE, r1) + LOAD_ADDR (MMUDR_BASE, r2) + movi 64, r3 + pt/l .disable_ptes_loop, tr0 +.disable_ptes_loop: + putcfg r1, 0, r63 + putcfg r2, 0, r63 + addi r1, 16, r1 + addi r2, 16, r2 + addi r3, -1, r3 + bgt r3, r63, tr0 + + LOAD_ADDR (MMUIR_BASE, r1) + + ! FEMI instruction mappings + ! Area 0 - 1Mb cacheable at 0x00000000 + ! Area 1 - None + ! Area 2 - 1Mb cacheable at 0x05000000 + ! 
- 1Mb cacheable at 0x05100000 + ! Area 3 - None + ! Area 4 - None + + ! Map a 1Mb page for instructions at 0x00000000 + LOAD_PTEH_VAL (FEMI_AREA_0, ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (FEMI_AREA_0, ALIGN_1MB, PTE_CB_CACHEABLE_WB | PTE_SZ_1MB | PTE_PRX | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 1Mb page for instructions at 0x05000000 + addi r1, 16, r1 + LOAD_PTEH_VAL (FEMI_AREA_2, ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (FEMI_AREA_2, ALIGN_1MB, PTE_CB_CACHEABLE_WB | PTE_SZ_1MB | PTE_PRX | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 1Mb page for instructions at 0x05100000 + addi r1, 16, r1 + LOAD_PTEH_VAL ((FEMI_AREA_2+0x100000), ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL ((FEMI_AREA_2+0x100000), ALIGN_1MB, PTE_CB_CACHEABLE_WB | PTE_SZ_1MB | PTE_PRX | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 512M page for instructions at EMI base + addi r1, 16, r1 + LOAD_PTEH_VAL (EMI_BASE, ALIGN_512MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (EMI_BASE, ALIGN_512MB, PTE_CB_CACHEABLE_WB | PTE_SZ_512MB | PTE_PRX | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 4K page for instructions at DM_DB_DLINK_BASE + addi r1, 16, r1 + LOAD_PTEH_VAL (DM_DB_DLINK_BASE, ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (DM_DB_DLINK_BASE, ALIGN_4KB, PTE_CB_CACHEABLE_WB | PTE_SZ_4KB | PTE_PRX | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + LOAD_ADDR (MMUDR_BASE, r1) + + ! FEMI data mappings + ! Area 0 - 1Mb cacheable at 0x00000000 + ! Area 1 - 1Mb device at 0x04000000 + ! Area 2 - 1Mb cacheable at 0x05000000 + ! - 1Mb cacheable at 0x05100000 + ! Area 3 - None + ! Area 4 - None + ! CB - 1Mb device at 0x08000000 + + ! Map a 1Mb page for data at 0x00000000 + LOAD_PTEH_VAL (FEMI_AREA_0, ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (FEMI_AREA_0, ALIGN_1MB, PTE_CB_CACHEABLE_WB | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 1Mb page for data at 0x04000000 + addi r1, 16, r1 + LOAD_PTEH_VAL (FEMI_AREA_1, ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (FEMI_AREA_1, ALIGN_1MB, PTE_CB_DEVICE | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 1Mb page for data at 0x05000000 + addi r1, 16, r1 + LOAD_PTEH_VAL (FEMI_AREA_2, ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (FEMI_AREA_2, ALIGN_1MB, PTE_CB_CACHEABLE_WB | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 1Mb page for data at 0x05100000 + addi r1, 16, r1 + LOAD_PTEH_VAL ((FEMI_AREA_2+0x100000), ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL ((FEMI_AREA_2+0x100000), ALIGN_1MB, PTE_CB_CACHEABLE_WB | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 4K page for registers at 0x08000000 + addi r1, 16, r1 + LOAD_PTEH_VAL (FEMI_CB, ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (FEMI_CB, ALIGN_4KB, PTE_CB_DEVICE | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 512M page for data at EMI + addi r1, 16, r1 + LOAD_PTEH_VAL (EMI_BASE, ALIGN_512MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (EMI_BASE, ALIGN_512MB, PTE_CB_CACHEABLE_WB | PTE_SZ_512MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! 
Map a 4K page for DYNACON at DYNACON_BASE + addi r1, 16, r1 + LOAD_PTEH_VAL (DYNACON_BASE, ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (DYNACON_BASE, ALIGN_4KB, PTE_CB_DEVICE | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 4K page for instructions at DM_DB_DLINK_BASE + addi r1, 16, r1 + LOAD_PTEH_VAL (DM_DB_DLINK_BASE, ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (DM_DB_DLINK_BASE, ALIGN_4KB, PTE_CB_CACHEABLE_WB | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 4K page for data at DM_DB_DLINK_BASE+0x1000 + addi r1, 16, r1 + LOAD_PTEH_VAL ((DM_DB_DLINK_BASE+0x1000), ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL ((DM_DB_DLINK_BASE+0x1000), ALIGN_4KB, PTE_CB_UNCACHEABLE | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 4K page for stack DM_DB_DLINK_BASE+0x2000 + addi r1, 16, r1 + LOAD_PTEH_VAL ((DM_DB_DLINK_BASE+0x2000), ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL ((DM_DB_DLINK_BASE+0x2000), ALIGN_4KB, PTE_CB_CACHEABLE_WB | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 1M page for DM_CB_BASE2 at DM_CB_DLINK + ! 0x0c000000 - 0x0c0fffff + addi r1, 16, r1 + LOAD_PTEH_VAL (DM_CB_DLINK_BASE, ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (DM_CB_DLINK_BASE, ALIGN_1MB, PTE_CB_DEVICE | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 1M page for DM_CB_BASE2 at DM_CB_DLINK + ! 0x0c100000 - 0x0c1fffff + addi r1, 16, r1 + LOAD_PTEH_VAL ((DM_CB_DLINK_BASE+0x100000), ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL ((DM_CB_DLINK_BASE+0x100000), ALIGN_1MB, PTE_CB_DEVICE | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 1M page for DM_CB_BASE2 at DM_CB_DLINK + ! 0x0c200000 - 0x0c2fffff + addi r1, 16, r1 + LOAD_PTEH_VAL ((DM_CB_DLINK_BASE+0x200000), ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL ((DM_CB_DLINK_BASE+0x200000), ALIGN_1MB, PTE_CB_DEVICE | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 1M page for DM_CB_BASE2 at DM_CB_DLINK + ! 0x0c400000 - 0x0c4fffff + addi r1, 16, r1 + LOAD_PTEH_VAL ((DM_CB_DLINK_BASE+0x400000), ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL ((DM_CB_DLINK_BASE+0x400000), ALIGN_1MB, PTE_CB_DEVICE | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 1M page for DM_CB_BASE2 at DM_CB_DLINK + ! 0x0c800000 - 0x0c8fffff + addi r1, 16, r1 + LOAD_PTEH_VAL ((DM_CB_DLINK_BASE+0x800000), ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL ((DM_CB_DLINK_BASE+0x800000), ALIGN_1MB, PTE_CB_DEVICE | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map a 4K page for DMA control registers + addi r1, 16, r1 + LOAD_PTEH_VAL (DMA_BASE, ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (DMA_BASE, ALIGN_4KB, PTE_CB_DEVICE | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map lots of 4K pages for peripherals + + ! /* peripheral */ + addi r1, 16, r1 + LOAD_PTEH_VAL (PERIPH_BASE, ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (PERIPH_BASE, ALIGN_4KB, PTE_CB_DEVICE | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + ! 
/* dmac */ + addi r1, 16, r1 + LOAD_PTEH_VAL (DMAC_BASE, ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (DMAC_BASE, ALIGN_4KB, PTE_CB_DEVICE | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + ! /* intc */ + addi r1, 16, r1 + LOAD_PTEH_VAL (INTC_BASE, ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (INTC_BASE, ALIGN_4KB, PTE_CB_DEVICE | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + ! /* rtc */ + addi r1, 16, r1 + LOAD_PTEH_VAL (RTC_BASE, ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (RTC_BASE, ALIGN_4KB, PTE_CB_DEVICE | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + ! /* dmac */ + addi r1, 16, r1 + LOAD_PTEH_VAL (TMU_BASE, ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (TMU_BASE, ALIGN_4KB, PTE_CB_DEVICE | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + ! /* scif */ + addi r1, 16, r1 + LOAD_PTEH_VAL (SCIF_BASE, ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (SCIF_BASE, ALIGN_4KB, PTE_CB_DEVICE | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + ! /* cprc */ + addi r1, 16, r1 + LOAD_PTEH_VAL (CPRC_BASE, ALIGN_4KB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (CPRC_BASE, ALIGN_4KB, PTE_CB_DEVICE | PTE_SZ_4KB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Map CPU WPC registers + addi r1, 16, r1 + LOAD_PTEH_VAL (CPU_BASE, ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL (CPU_BASE, ALIGN_1MB, PTE_CB_DEVICE | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + addi r1, 16, r1 + + LOAD_PTEH_VAL ((CPU_BASE+0x100000), ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL ((CPU_BASE+0x100000), ALIGN_1MB, PTE_CB_DEVICE | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + addi r1, 16, r1 + LOAD_PTEH_VAL ((CPU_BASE+0x200000), ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL ((CPU_BASE+0x200000), ALIGN_1MB, PTE_CB_DEVICE | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + addi r1, 16, r1 + LOAD_PTEH_VAL ((CPU_BASE+0x400000), ALIGN_1MB, PTE_ENABLED | PTE_NOT_SHARED, r25, r2) + LOAD_PTEL_VAL ((CPU_BASE+0x400000), ALIGN_1MB, PTE_CB_DEVICE | PTE_SZ_1MB | PTE_PRR | PTE_PRW | PTE_PRU, r25, r3) + SET_PTE (r1, r2, r3) + + ! Switch over to virtual addressing and enabled cache + getcon sr, r1 + movi 1, r2 + shlli r2, SR_BL_BIT, r2 + or r1, r2, r1 + putcon r1, ssr + getcon sr, r1 + movi 1, r2 + shlli r2, SR_MMU_BIT, r2 + or r1, r2, r1 + putcon r1, ssr + gettr tr1, r1 + putcon r1, spc + synco + rte + + ! VM entry point. From now on, we are in VM mode. +.vm_ep: + + ! Install the trap handler, by seeding vbr with the + ! correct value, and by assigning sr.bl = 0. + + LOAD_ADDR (vbr_start, r1) + putcon r1, vbr + movi ~(1<<28), r1 + getcon sr, r2 + and r1, r2, r2 + putcon r2, sr +#endif /* MMU_SUPPORT */ + pt/l .Lzero_bss_loop, tr0 - pt/l _atexit, tr1 pt/l _init, tr5 pt/l ___setup_argv_and_call_main, tr6 pt/l _exit, tr7 @@ -72,12 +439,7 @@ start: LOAD_ADDR (___data, r26) LOAD_ADDR (___rodata, r27) -#if ! __SH4_NOFPU__ && ! __SH2A_NOFPU__ -#if __SH5__ == 32 - pt/l ___set_fpscr, tr0 - movi 0, r4 - blink tr0, r18 -#endif +#ifdef __SH_FPU_ANY__ getcon sr, r0 ! enable the FP unit, by resetting SR.FD ! 
also zero out SR.FR, SR.SZ and SR.PR, as mandated by the ABI @@ -85,9 +447,15 @@ start: shori 0xf000, r1 andc r0, r1, r0 putcon r0, sr +#if __SH5__ == 32 + pt/l ___set_fpscr, tr0 + movi 0, r4 + blink tr0, r18 +#endif #endif ! arrange for exit to call fini + pt/l _atexit, tr1 LOAD_ADDR (_fini, r2) blink tr1, r18 @@ -99,13 +467,253 @@ start: ! call exit blink tr7, r18 + ! We should never return from _exit but in case we do we would enter the + ! the following tight loop. This avoids executing any data that might follow. +limbo: + pt/l limbo, tr0 + blink tr0, r63 -#else +#ifdef MMU_SUPPORT + ! All these traps are handled in the same place. + .balign 256 +vbr_start: + pt/l handler, tr0 ! tr0 trashed. + blink tr0, r63 + .balign 256 +vbr_100: + pt/l handler, tr0 ! tr0 trashed. + blink tr0, r63 +vbr_100_end: + .balign 256 +vbr_200: + pt/l handler, tr0 ! tr0 trashed. + blink tr0, r63 + .balign 256 +vbr_300: + pt/l handler, tr0 ! tr0 trashed. + blink tr0, r63 + .balign 256 +vbr_400: ! Should be at vbr+0x400 +handler: + /* If the trap handler is there call it */ + LOAD_ADDR (__superh_trap_handler, r2) + pta chandler,tr2 + beq r2, r63, tr2 /* If zero, ie not present branch around to chandler */ + /* Now call the trap handler with as much of the context unchanged as possible. + Move trapping address into R18 to make it look like the trap point */ + getcon spc, r18 + pt/l __superh_trap_handler, tr0 + blink tr0, r7 +chandler: + getcon spc, r62 + getcon expevt, r2 + pt/l _exit, tr0 + blink tr0, r63 + + /* Simulated trap handler */ + .section .text..SHmedia32,"ax" +gcc2_compiled.: + .section .debug_abbrev +.Ldebug_abbrev0: + .section .text..SHmedia32 +.Ltext0: + .section .debug_info +.Ldebug_info0: + .section .debug_line +.Ldebug_line0: + .section .text..SHmedia32,"ax" + .align 5 + .global __superh_trap_handler + .type __superh_trap_handler,@function +__superh_trap_handler: +.LFB1: + ptabs r18, tr0 + addi.l r15, -8, r15 + st.l r15, 4, r14 + addi.l r15, -8, r15 + add.l r15, r63, r14 + st.l r14, 0, r2 + ptabs r7, tr0 + addi.l r14, 8, r14 + add.l r14, r63, r15 + ld.l r15, 4, r14 + addi.l r15, 8, r15 + blink tr0, r63 +.LFE1: +.Lfe1: + .size __superh_trap_handler,.Lfe1-__superh_trap_handler + + .section .text..SHmedia32 +.Letext0: + + .section .debug_info + .ualong 0xa7 + .uaword 0x2 + .ualong .Ldebug_abbrev0 + .byte 0x4 + .byte 0x1 + .ualong .Ldebug_line0 + .ualong .Letext0 + .ualong .Ltext0 + .string "trap_handler.c" + + .string "xxxxxxxxxxxxxxxxxxxxxxxxxxxx" + + .string "GNU C 2.97-sh5-010522" + + .byte 0x1 + .byte 0x2 + .ualong 0x9a + .byte 0x1 + .string "_superh_trap_handler" + + .byte 0x1 + .byte 0x2 + .byte 0x1 + .ualong .LFB1 + .ualong .LFE1 + .byte 0x1 + .byte 0x5e + .byte 0x3 + .string "trap_reason" + + .byte 0x1 + .byte 0x1 + .ualong 0x9a + .byte 0x2 + .byte 0x91 + .byte 0x0 + .byte 0x0 + .byte 0x4 + .string "unsigned int" + + .byte 0x4 + .byte 0x7 + .byte 0x0 + + .section .debug_abbrev + .byte 0x1 + .byte 0x11 + .byte 0x1 + .byte 0x10 + .byte 0x6 + .byte 0x12 + .byte 0x1 + .byte 0x11 + .byte 0x1 + .byte 0x3 + .byte 0x8 + .byte 0x1b + .byte 0x8 + .byte 0x25 + .byte 0x8 + .byte 0x13 + .byte 0xb + .byte 0,0 + .byte 0x2 + .byte 0x2e + .byte 0x1 + .byte 0x1 + .byte 0x13 + .byte 0x3f + .byte 0xc + .byte 0x3 + .byte 0x8 + .byte 0x3a + .byte 0xb + .byte 0x3b + .byte 0xb + .byte 0x27 + .byte 0xc + .byte 0x11 + .byte 0x1 + .byte 0x12 + .byte 0x1 + .byte 0x40 + .byte 0xa + .byte 0,0 + .byte 0x3 + .byte 0x5 + .byte 0x0 + .byte 0x3 + .byte 0x8 + .byte 0x3a + .byte 0xb + .byte 0x3b + .byte 0xb + .byte 0x49 + 
.byte 0x13 + .byte 0x2 + .byte 0xa + .byte 0,0 + .byte 0x4 + .byte 0x24 + .byte 0x0 + .byte 0x3 + .byte 0x8 + .byte 0xb + .byte 0xb + .byte 0x3e + .byte 0xb + .byte 0,0 + .byte 0 + + .section .debug_pubnames + .ualong 0x27 + .uaword 0x2 + .ualong .Ldebug_info0 + .ualong 0xab + .ualong 0x5b + .string "_superh_trap_handler" + + .ualong 0x0 + + .section .debug_aranges + .ualong 0x1c + .uaword 0x2 + .ualong .Ldebug_info0 + .byte 0x4 + .byte 0x0 + .uaword 0x0,0 + .ualong .Ltext0 + .ualong .Letext0-.Ltext0 + .ualong 0x0 + .ualong 0x0 + .ident "GCC: (GNU) 2.97-sh5-010522" +#endif /* MMU_SUPPORT */ +#else /* ! __SH5__ */ + + ! make a place to keep any previous value of the vbr register + ! this will only have a value if it has been set by redboot (for example) + .section .bss +old_vbr: + .long 0 + .section .text .global start + .import ___rtos_profiler_start_timer + .weak ___rtos_profiler_start_timer start: mov.l stack_k,r15 +#if defined (__SH3__) || (defined (__SH_FPU_ANY__) && ! defined (__SH2A__)) || defined (__SH4_NOFPU__) +#define VBR_SETUP + ! before zeroing the bss ... + ! if the vbr is already set to vbr_start then the program has been restarted + ! (i.e. it is not the first time the program has been run since reset) + ! reset the vbr to its old value before old_vbr (in bss) is wiped + ! this ensures that the later code does not create a circular vbr chain + stc vbr, r1 + mov.l vbr_start_k, r2 + cmp/eq r1, r2 + bf 0f + ! reset the old vbr value + mov.l old_vbr_k, r1 + mov.l @r1, r2 + ldc r2, vbr +0: +#endif /* VBR_SETUP */ + ! zero out bss mov.l edata_k,r0 mov.l end_k,r1 @@ -116,14 +724,39 @@ start_l: cmp/ge r0,r1 bt start_l -#if ! __SH2A_NOFPU__ -#if defined (__SH2E__) || defined (__SH2A__) || defined (__SH3E__) || defined(__SH4_SINGLE__) || defined(__SH4__) || defined(__SH4_SINGLE_ONLY__) +#if defined (__SH_FPU_ANY__) mov.l set_fpscr_k, r1 + mov #4,r4 jsr @r1 - mov #0,r4 - lds r3,fpscr -#endif /* defined (__SH2E__) || defined (__SH2A__) || defined (__SH3E__) || defined(__SH4_SINGLE__) || defined(__SH4__) || defined(__SH4_SINGLE_ONLY__) */ -#endif /* ! __SH2A_NOFPU__ */ + shll16 r4 ! Set DN bit (flush denormal inputs to zero) + lds r3,fpscr ! Switch to default precision +#endif /* defined (__SH_FPU_ANY__) */ + +#ifdef VBR_SETUP + ! save the existing contents of the vbr + ! there will only be a prior value when using something like redboot + ! otherwise it will be zero + stc vbr, r1 + mov.l old_vbr_k, r2 + mov.l r1, @r2 + ! setup vbr + mov.l vbr_start_k, r1 + ldc r1,vbr +#endif /* VBR_SETUP */ + + ! if an rtos is exporting a timer start fn, + ! then pick up an SR which does not enable ints + ! (the rtos will take care of this) + mov.l rtos_start_fn, r0 + mov.l sr_initial_bare, r1 + tst r0, r0 + bt set_sr + + mov.l sr_initial_rtos, r1 + +set_sr: + ! Set status register (sr) + ldc r1, sr ! arrange for exit to call fini mov.l atexit_k,r0 @@ -146,14 +779,12 @@ start_l: mov.l exit_k,r0 jsr @r0 nop - + .align 2 -#if ! __SH2A_NOFPU__ -#if defined (__SH2E__) || defined (__SH2A__) || defined (__SH3E__) || defined(__SH4_SINGLE__) || defined(__SH4__) || defined(__SH4_SINGLE_ONLY__) +#if defined (__SH_FPU_ANY__) set_fpscr_k: .long ___set_fpscr -#endif /* defined (__SH2E__) || defined (__SH2A__) || defined (__SH3E__) || defined(__SH4_SINGLE__) || defined(__SH4__) || defined(__SH4_SINGLE_ONLY__) */ -#endif /* ! 
__SH2A_NOFPU__ */ +#endif /* defined (__SH_FPU_ANY__) */ stack_k: .long _stack @@ -171,11 +802,428 @@ init_k: .long _init fini_k: .long _fini +#ifdef VBR_SETUP +old_vbr_k: + .long old_vbr +vbr_start_k: + .long vbr_start +#endif /* VBR_SETUP */ + +sr_initial_rtos: + ! Privileged mode RB 1 BL 0. Keep BL 0 to allow default trap handlers to work. + ! Whether profiling or not, keep interrupts masked, + ! the RTOS will enable these if required. + .long 0x600000f1 + +rtos_start_fn: + .long ___rtos_profiler_start_timer + +sr_initial_bare: + ! Privileged mode RB 1 BL 0. Keep BL 0 to allow default trap handlers to work. + ! Keep interrupts disabled - the application will enable as required. + .long 0x600000f1 ! supplied for backward compatibility only, in case of linking ! code whose main() was compiled with an older version of GCC. - .global ___main + .global ___main ___main: rts nop -#endif +#ifdef VBR_SETUP +! Exception handlers + .balign 256 +vbr_start: + mov.l 2f, r0 ! load the old vbr setting (if any) + mov.l @r0, r0 + cmp/eq #0, r0 + bf 1f + ! no previous vbr - jump to own generic handler + bra handler + nop +1: ! there was a previous handler - chain them + jmp @r0 + nop + .balign 4 +2: + .long old_vbr + + .balign 256 +vbr_100: + ! Non profiling case. +handler_100: + mov.l 2f, r0 ! load the old vbr setting (if any) + mov.l @r0, r0 + cmp/eq #0, r0 + bf 1f + ! no previous vbr - jump to own generic handler + bra handler + nop +1: ! there was a previous handler - chain them + add #0x7f, r0 ! 0x7f + add #0x7f, r0 ! 0xfe + add #0x2, r0 ! add 0x100 without corrupting another register + jmp @r0 + nop + .balign 4 +2: + .long old_vbr + + .balign 256 +vbr_200: + mov.l 2f, r0 ! load the old vbr setting (if any) + mov.l @r0, r0 + cmp/eq #0, r0 + bf 1f + ! no previous vbr - jump to own generic handler + bra handler + nop +1: ! there was a previous handler - chain them + add #0x7f, r0 ! 0x7f + add #0x7f, r0 ! 0xfe + add #0x7f, r0 ! 0x17d + add #0x7f, r0 ! 0x1fc + add #0x4, r0 ! add 0x200 without corrupting another register + jmp @r0 + nop + .balign 4 +2: + .long old_vbr + + .balign 256 +vbr_300: + mov.l 2f, r0 ! load the old vbr setting (if any) + mov.l @r0, r0 + cmp/eq #0, r0 + bf 1f + ! no previous vbr - jump to own generic handler + bra handler + nop +1: ! there was a previous handler - chain them + add #0x7f, r0 ! 0x7f + add #0x7f, r0 ! 0xfe + add #0x7f, r0 ! 0x17d + add #0x7f, r0 ! 0x1fc + add #0x7f, r0 ! 0x27b + add #0x7f, r0 ! 0x2fa + add #0x6, r0 ! add 0x300 without corrupting another register + jmp @r0 + nop + .balign 4 +2: + .long old_vbr + + .balign 256 +vbr_400: ! Should be at vbr+0x400 + mov.l 2f, r0 ! load the old vbr setting (if any) + mov.l @r0, r0 + cmp/eq #0, r0 + ! no previous vbr - jump to own generic handler + bt handler + ! there was a previous handler - chain them + add #0x7f, r0 ! 0x7f + add #0x7f, r0 ! 0xfe + add #0x7f, r0 ! 0x17d + add #0x7f, r0 ! 0x1fc + add #0x7f, r0 ! 0x27b + add #0x7f, r0 ! 0x2fa + add #0x7f, r0 ! 0x379 + add #0x7f, r0 ! 0x3f8 + add #0x8, r0 ! add 0x400 without corrupting another register + jmp @r0 + nop + .balign 4 +2: + .long old_vbr +handler: + /* If the trap handler is there call it */ + mov.l superh_trap_handler_k, r0 + cmp/eq #0, r0 ! True if zero. + bf 3f + bra chandler + nop +3: + ! Here handler available, call it. + /* Now call the trap handler with as much of the context unchanged as possible. + Move trapping address into PR to make it look like the trap point */ + stc spc, r1 + lds r1, pr + mov.l expevt_k, r4 + mov.l @r4, r4 ! 
r4 is value of expevt, first parameter. + mov r1, r5 ! Remember trapping pc. + mov r1, r6 ! Remember trapping pc. + mov.l chandler_k, r1 + mov.l superh_trap_handler_k, r2 + ! jmp to trap handler to avoid disturbing pr. + jmp @r2 + nop + + .balign 256 +vbr_500: + mov.l 2f, r0 ! load the old vbr setting (if any) + mov.l @r0, r0 + cmp/eq #0, r0 + ! no previous vbr - jump to own generic handler + bt handler + ! there was a previous handler - chain them + add #0x7f, r0 ! 0x7f + add #0x7f, r0 ! 0xfe + add #0x7f, r0 ! 0x17d + add #0x7f, r0 ! 0x1fc + add #0x7f, r0 ! 0x27b + add #0x7f, r0 ! 0x2fa + add #0x7f, r0 ! 0x379 + add #0x7f, r0 ! 0x3f8 + add #0x7f, r0 ! 0x477 + add #0x7f, r0 ! 0x4f6 + add #0xa, r0 ! add 0x500 without corrupting another register + jmp @r0 + nop + .balign 4 +2: + .long old_vbr + + .balign 256 +vbr_600: + mov.l 2f, r0 ! load the old vbr setting (if any) + mov.l @r0, r0 + cmp/eq #0, r0 + ! no previous vbr - jump to own handler + bt chandler + ! there was a previous handler - chain them + add #0x7f, r0 ! 0x7f + add #0x7f, r0 ! 0xfe + add #0x7f, r0 ! 0x17d + add #0x7f, r0 ! 0x1fc + add #0x7f, r0 ! 0x27b + add #0x7f, r0 ! 0x2fa + add #0x7f, r0 ! 0x379 + add #0x7f, r0 ! 0x3f8 + add #0x7f, r0 ! 0x477 + add #0x7f, r0 ! 0x4f6 + add #0x7f, r0 ! 0x575 + add #0x7f, r0 ! 0x5f4 + add #0xc, r0 ! add 0x600 without corrupting another register + jmp @r0 + nop + .balign 4 +2: + .long old_vbr +chandler: + mov.l expevt_k, r4 + mov.l @r4, r4 ! r4 is value of expevt hence making this the return code + mov.l handler_exit_k,r0 + jsr @r0 + nop + ! We should never return from _exit but in case we do we would enter the + ! the following tight loop +limbo: + bra limbo + nop + .balign 4 +expevt_k: + .long 0xff000024 ! Address of expevt +chandler_k: + .long chandler +superh_trap_handler_k: + .long __superh_trap_handler +handler_exit_k: + .long _exit + .align 2 +! Simulated compile of trap handler. 
+ .section .debug_abbrev,"",@progbits +.Ldebug_abbrev0: + .section .debug_info,"",@progbits +.Ldebug_info0: + .section .debug_line,"",@progbits +.Ldebug_line0: + .text +.Ltext0: + .align 5 + .type __superh_trap_handler,@function +__superh_trap_handler: +.LFB1: + mov.l r14,@-r15 +.LCFI0: + add #-4,r15 +.LCFI1: + mov r15,r14 +.LCFI2: + mov.l r4,@r14 + lds r1, pr + add #4,r14 + mov r14,r15 + mov.l @r15+,r14 + rts + nop +.LFE1: +.Lfe1: + .size __superh_trap_handler,.Lfe1-__superh_trap_handler + .section .debug_frame,"",@progbits +.Lframe0: + .ualong .LECIE0-.LSCIE0 +.LSCIE0: + .ualong 0xffffffff + .byte 0x1 + .string "" + .uleb128 0x1 + .sleb128 -4 + .byte 0x11 + .byte 0xc + .uleb128 0xf + .uleb128 0x0 + .align 2 +.LECIE0: +.LSFDE0: + .ualong .LEFDE0-.LASFDE0 +.LASFDE0: + .ualong .Lframe0 + .ualong .LFB1 + .ualong .LFE1-.LFB1 + .byte 0x4 + .ualong .LCFI0-.LFB1 + .byte 0xe + .uleb128 0x4 + .byte 0x4 + .ualong .LCFI1-.LCFI0 + .byte 0xe + .uleb128 0x8 + .byte 0x8e + .uleb128 0x1 + .byte 0x4 + .ualong .LCFI2-.LCFI1 + .byte 0xd + .uleb128 0xe + .align 2 +.LEFDE0: + .text +.Letext0: + .section .debug_info + .ualong 0xb3 + .uaword 0x2 + .ualong .Ldebug_abbrev0 + .byte 0x4 + .uleb128 0x1 + .ualong .Ldebug_line0 + .ualong .Letext0 + .ualong .Ltext0 + .string "trap_handler.c" + .string "xxxxxxxxxxxxxxxxxxxxxxxxxxxx" + .string "GNU C 3.2 20020529 (experimental)" + .byte 0x1 + .uleb128 0x2 + .ualong 0xa6 + .byte 0x1 + .string "_superh_trap_handler" + .byte 0x1 + .byte 0x2 + .byte 0x1 + .ualong .LFB1 + .ualong .LFE1 + .byte 0x1 + .byte 0x5e + .uleb128 0x3 + .string "trap_reason" + .byte 0x1 + .byte 0x1 + .ualong 0xa6 + .byte 0x2 + .byte 0x91 + .sleb128 0 + .byte 0x0 + .uleb128 0x4 + .string "unsigned int" + .byte 0x4 + .byte 0x7 + .byte 0x0 + .section .debug_abbrev + .uleb128 0x1 + .uleb128 0x11 + .byte 0x1 + .uleb128 0x10 + .uleb128 0x6 + .uleb128 0x12 + .uleb128 0x1 + .uleb128 0x11 + .uleb128 0x1 + .uleb128 0x3 + .uleb128 0x8 + .uleb128 0x1b + .uleb128 0x8 + .uleb128 0x25 + .uleb128 0x8 + .uleb128 0x13 + .uleb128 0xb + .byte 0x0 + .byte 0x0 + .uleb128 0x2 + .uleb128 0x2e + .byte 0x1 + .uleb128 0x1 + .uleb128 0x13 + .uleb128 0x3f + .uleb128 0xc + .uleb128 0x3 + .uleb128 0x8 + .uleb128 0x3a + .uleb128 0xb + .uleb128 0x3b + .uleb128 0xb + .uleb128 0x27 + .uleb128 0xc + .uleb128 0x11 + .uleb128 0x1 + .uleb128 0x12 + .uleb128 0x1 + .uleb128 0x40 + .uleb128 0xa + .byte 0x0 + .byte 0x0 + .uleb128 0x3 + .uleb128 0x5 + .byte 0x0 + .uleb128 0x3 + .uleb128 0x8 + .uleb128 0x3a + .uleb128 0xb + .uleb128 0x3b + .uleb128 0xb + .uleb128 0x49 + .uleb128 0x13 + .uleb128 0x2 + .uleb128 0xa + .byte 0x0 + .byte 0x0 + .uleb128 0x4 + .uleb128 0x24 + .byte 0x0 + .uleb128 0x3 + .uleb128 0x8 + .uleb128 0xb + .uleb128 0xb + .uleb128 0x3e + .uleb128 0xb + .byte 0x0 + .byte 0x0 + .byte 0x0 + .section .debug_pubnames,"",@progbits + .ualong 0x27 + .uaword 0x2 + .ualong .Ldebug_info0 + .ualong 0xb7 + .ualong 0x67 + .string "_superh_trap_handler" + .ualong 0x0 + .section .debug_aranges,"",@progbits + .ualong 0x1c + .uaword 0x2 + .ualong .Ldebug_info0 + .byte 0x4 + .byte 0x0 + .uaword 0x0 + .uaword 0x0 + .ualong .Ltext0 + .ualong .Letext0-.Ltext0 + .ualong 0x0 + .ualong 0x0 +#endif /* VBR_SETUP */ +#endif /* ! __SH5__ */ diff --git a/gcc/config/sh/divtab.c b/gcc/config/sh/divtab.c new file mode 100644 index 0000000..31b4463 --- /dev/null +++ b/gcc/config/sh/divtab.c @@ -0,0 +1,204 @@ +/* Copyright (C) 2003 Free Software Foundation, Inc. 
+ +This file is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the +Free Software Foundation; either version 2, or (at your option) any +later version. + +In addition to the permissions in the GNU General Public License, the +Free Software Foundation gives you unlimited permission to link the +compiled version of this file into combinations with other programs, +and to distribute those combinations without any restriction coming +from the use of this file. (The General Public License restrictions +do apply in other respects; for example, they cover modification of +the file, and distribution when not linked into a combine +executable.) + +This file is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; see the file COPYING. If not, write to +the Free Software Foundation, 59 Temple Place - Suite 330, +Boston, MA 02111-1307, USA. */ + +/* Calculate division table for SH5Media integer division + Contributed by Joern Rennecke + joern.rennecke@superh.com */ + +#include <stdio.h> +#include <math.h> + +#define BITS 5 +#define N_ENTRIES (1 << BITS) +#define CUTOFF_BITS 20 + +#define BIAS (-330) + +double max_defect = 0.; +double max_defect_x; + +double min_defect = 1e9; +double min_defect_x; + +double max_defect2 = 0.; +double max_defect2_x; + +double min_defect2 = 0.; +double min_defect2_x; + +double min_defect3 = 01e9; +double min_defect3_x; +int min_defect3_val; + +double max_defect3 = 0.; +double max_defect3_x; +int max_defect3_val; + +static double note_defect3 (int val, double d2, double y2d, double x) +{ + int cutoff_val = val >> CUTOFF_BITS; + double cutoff; + double defect; + + if (val < 0) + cutoff_val++; + cutoff = (cutoff_val * (1<<CUTOFF_BITS) - val) * y2d; + defect = cutoff + val * d2; + if (val < 0) + defect = - defect; + if (defect > max_defect3) + { + max_defect3 = defect; + max_defect3_x = x; + max_defect3_val = val; + } + if (defect < min_defect3) + { + min_defect3 = defect; + min_defect3_x = x; + min_defect3_val = val; + } +} + +/* This function assumes 32 bit integers. */ +static double +calc_defect (double x, int constant, int factor) +{ + double y0 = (constant - (int) floor ((x * factor * 64.))) / 16384.; + double y1 = 2 * y0 -y0 * y0 * (x + BIAS / (1.*(1LL<<30))); + double y2d0, y2d; + int y2d1; + double d, d2; + + y1 = floor (y1 * (1024 * 1024 * 1024)) / (1024 * 1024 * 1024); + d = y1 - 1 / x; + if (d > max_defect) + { + max_defect = d; + max_defect_x = x; + } + if (d < min_defect) + { + min_defect = d; + min_defect_x = x; + } + y2d0 = floor (y1 * x * (1LL << 60-16)); + y2d1 = (int) (long long) y2d0; + y2d = - floor ((y1 - y0 / (1<<30-14)) * y2d1) / (1LL<<44); + d2 = y1 + y2d - 1/x; + if (d2 > max_defect2) + { + max_defect2 = d2; + max_defect2_x = x; + } + if (d2 < min_defect2) + { + min_defect2 = d2; + min_defect2_x = x; + } + /* zero times anything is trivially zero.
*/ + note_defect3 ((1 << CUTOFF_BITS) - 1, d2, y2d, x); + note_defect3 (1 << CUTOFF_BITS, d2, y2d, x); + note_defect3 ((1U << 31) - (1 << CUTOFF_BITS), d2, y2d, x); + note_defect3 ((1U << 31) - 1, d2, y2d, x); + note_defect3 (-1, d2, y2d, x); + note_defect3 (-(1 << CUTOFF_BITS), d2, y2d, x); + note_defect3 ((1U << 31) - (1 << CUTOFF_BITS) + 1, d2, y2d, x); + note_defect3 (-(1U << 31), d2, y2d, x); + return d; +} + +int +main () +{ + int i; + unsigned char factors[N_ENTRIES]; + short constants[N_ENTRIES]; + int steps = N_ENTRIES / 2; + double step = 1. / steps; + double eps30 = 1. / (1024 * 1024 * 1024); + + for (i = 0; i < N_ENTRIES; i++) + { + double x_low = (i < steps ? 1. : -3.) + i * step; + double x_high = x_low + step - eps30; + double x_med; + int factor, constant; + double low_defect, med_defect, high_defect, max_defect; + + factor = (1./x_low- 1./x_high) / step * 256. + 0.5; + if (factor == 256) + factor = 255; + factors[i] = factor; + /* Use minimum of error function for x_med. */ + x_med = sqrt (256./factor); + if (x_low < 0) + x_med = - x_med; + low_defect = 1. / x_low + x_low * factor / 256.; + high_defect = 1. / x_high + x_high * factor / 256.; + med_defect = 1. / x_med + x_med * factor / 256.; + max_defect + = ((low_defect > high_defect) ^ (x_med < 0)) ? low_defect : high_defect; + constant = (med_defect + max_defect) * 0.5 * 16384. + 0.5; + if (constant < -32768 || constant > 32767) + abort (); + constants[i] = constant; + calc_defect (x_low, constant, factor); + calc_defect (x_med, constant, factor); + calc_defect (x_high, constant, factor); + } + printf ("/* This table has been generated by divtab.c .\n"); + printf ("Defects for bias %d:\n", BIAS); + printf (" Max defect: %e at %e\n", max_defect, max_defect_x); + printf (" Min defect: %e at %e\n", min_defect, min_defect_x); + printf (" Max 2nd step defect: %e at %e\n", max_defect2, max_defect2_x); + printf (" Min 2nd step defect: %e at %e\n", min_defect2, min_defect2_x); + printf (" Max div defect: %e at %d:%e\n", max_defect3, max_defect3_val, max_defect3_x); + printf (" Min div defect: %e at %d:%e\n", min_defect3, min_defect3_val, min_defect3_x); + printf (" Defect at 1: %e\n", + calc_defect (1., constants[0], factors[0])); + printf (" Defect at -2: %e */\n", + calc_defect (-2., constants[steps], factors[steps])); + printf ("\t.section\t.rodata\n"); + printf ("\t.balign 2\n"); + printf ("/* negative division constants */\n"); + for (i = steps; i < 2 * steps; i++) + printf ("\t.word\t%d\n", constants[i]); + printf ("/* negative division factors */\n"); + for (i = steps; i < 2*steps; i++) + printf ("\t.byte\t%d\n", factors[i]); + printf ("\t.skip %d\n", steps); + printf ("\t.global GLOBAL(div_table):\n"); + printf ("GLOBAL(div_table):\n"); + printf ("\t.skip %d\n", steps); + printf ("/* positive division factors */\n"); + for (i = 0; i < steps; i++) + printf ("\t.byte\t%d\n", factors[i]); + printf ("/* positive division constants */\n"); + for (i = 0; i < steps; i++) + printf ("\t.word\t%d\n", constants[i]); + exit (0); +} diff --git a/gcc/config/sh/elf.h b/gcc/config/sh/elf.h index b069308..0dab7d8 100644 --- a/gcc/config/sh/elf.h +++ b/gcc/config/sh/elf.h @@ -57,14 +57,6 @@ Boston, MA 02111-1307, USA. */ /* Pass -ml and -mrelax to the assembler and linker. 
*/ #undef ASM_SPEC #define ASM_SPEC SH_ASM_SPEC -#undef SUBTARGET_ASM_ISA_SPEC -#define SUBTARGET_ASM_ISA_SPEC "\ -%{m2a:--isa=sh2a} \ -%{m2a-single:--isa=sh2a} \ -%{m2a-single-only:--isa=sh2a} \ -%{m2a-nofpu:--isa=sh2a-nofpu} \ -%{m5-compact*:--isa=SHcompact} %{m5-32media*:--isa=SHmedia --abi=32} \ -%{m5-64media*:--isa=SHmedia --abi=64}" ASM_ISA_DEFAULT_SPEC #undef LINK_SPEC #define LINK_SPEC SH_LINK_SPEC diff --git a/gcc/config/sh/lib1funcs.asm b/gcc/config/sh/lib1funcs.asm index 30f10a9..546a908 100644 --- a/gcc/config/sh/lib1funcs.asm +++ b/gcc/config/sh/lib1funcs.asm @@ -40,11 +40,15 @@ Boston, MA 02111-1307, USA. */ #ifdef __ELF__ #define LOCAL(X) .L_##X #define FUNC(X) .type X,@function +#define HIDDEN_FUNC(X) FUNC(X); .hidden X +#define HIDDEN_ALIAS(X,Y) ALIAS (X,Y); .hidden GLOBAL(X) #define ENDFUNC0(X) .Lfe_##X: .size X,.Lfe_##X-X #define ENDFUNC(X) ENDFUNC0(X) #else #define LOCAL(X) L_##X #define FUNC(X) +#define HIDDEN_FUNC(X) +#define HIDDEN_ALIAS(X,Y) ALIAS (X,Y) #define ENDFUNC(X) #endif @@ -54,10 +58,6 @@ Boston, MA 02111-1307, USA. */ #define ALIAS(X,Y) .global GLOBAL(X); .set GLOBAL(X),GLOBAL(Y) -#if defined __SH5__ && ! defined __SH4_NOFPU__ && ! defined (__LITTLE_ENDIAN__) -#define FMOVD_WORKS -#endif - #ifdef __SH2A__ #undef FMOVD_WORKS #define FMOVD_WORKS @@ -99,39 +99,39 @@ Boston, MA 02111-1307, USA. */ .global GLOBAL(ashiftrt_r4_31) .global GLOBAL(ashiftrt_r4_32) - FUNC(GLOBAL(ashiftrt_r4_0)) - FUNC(GLOBAL(ashiftrt_r4_1)) - FUNC(GLOBAL(ashiftrt_r4_2)) - FUNC(GLOBAL(ashiftrt_r4_3)) - FUNC(GLOBAL(ashiftrt_r4_4)) - FUNC(GLOBAL(ashiftrt_r4_5)) - FUNC(GLOBAL(ashiftrt_r4_6)) - FUNC(GLOBAL(ashiftrt_r4_7)) - FUNC(GLOBAL(ashiftrt_r4_8)) - FUNC(GLOBAL(ashiftrt_r4_9)) - FUNC(GLOBAL(ashiftrt_r4_10)) - FUNC(GLOBAL(ashiftrt_r4_11)) - FUNC(GLOBAL(ashiftrt_r4_12)) - FUNC(GLOBAL(ashiftrt_r4_13)) - FUNC(GLOBAL(ashiftrt_r4_14)) - FUNC(GLOBAL(ashiftrt_r4_15)) - FUNC(GLOBAL(ashiftrt_r4_16)) - FUNC(GLOBAL(ashiftrt_r4_17)) - FUNC(GLOBAL(ashiftrt_r4_18)) - FUNC(GLOBAL(ashiftrt_r4_19)) - FUNC(GLOBAL(ashiftrt_r4_20)) - FUNC(GLOBAL(ashiftrt_r4_21)) - FUNC(GLOBAL(ashiftrt_r4_22)) - FUNC(GLOBAL(ashiftrt_r4_23)) - FUNC(GLOBAL(ashiftrt_r4_24)) - FUNC(GLOBAL(ashiftrt_r4_25)) - FUNC(GLOBAL(ashiftrt_r4_26)) - FUNC(GLOBAL(ashiftrt_r4_27)) - FUNC(GLOBAL(ashiftrt_r4_28)) - FUNC(GLOBAL(ashiftrt_r4_29)) - FUNC(GLOBAL(ashiftrt_r4_30)) - FUNC(GLOBAL(ashiftrt_r4_31)) - FUNC(GLOBAL(ashiftrt_r4_32)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_0)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_1)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_2)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_3)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_4)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_5)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_6)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_7)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_8)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_9)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_10)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_11)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_12)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_13)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_14)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_15)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_16)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_17)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_18)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_19)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_20)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_21)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_22)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_23)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_24)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_25)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_26)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_27)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_28)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_29)) 
+ HIDDEN_FUNC(GLOBAL(ashiftrt_r4_30)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_31)) + HIDDEN_FUNC(GLOBAL(ashiftrt_r4_32)) .align 1 GLOBAL(ashiftrt_r4_32): @@ -268,7 +268,7 @@ GLOBAL(ashiftrt_r4_0): ! .global GLOBAL(ashrsi3) - FUNC(GLOBAL(ashrsi3)) + HIDDEN_FUNC(GLOBAL(ashrsi3)) .align 2 GLOBAL(ashrsi3): mov #31,r0 @@ -418,7 +418,7 @@ LOCAL(ashrsi3_0): ! (none) ! .global GLOBAL(ashlsi3) - FUNC(GLOBAL(ashlsi3)) + HIDDEN_FUNC(GLOBAL(ashlsi3)) .align 2 GLOBAL(ashlsi3): mov #31,r0 @@ -577,7 +577,7 @@ LOCAL(ashlsi3_0): ! (none) ! .global GLOBAL(lshrsi3) - FUNC(GLOBAL(lshrsi3)) + HIDDEN_FUNC(GLOBAL(lshrsi3)) .align 2 GLOBAL(lshrsi3): mov #31,r0 @@ -719,121 +719,143 @@ LOCAL(lshrsi3_0): #ifdef L_movmem .text + .balign 4 + .global GLOBAL(movmem) + HIDDEN_FUNC(GLOBAL(movmem)) + HIDDEN_ALIAS(movstr,movmem) + /* This would be a lot simpler if r6 contained the byte count + minus 64, and we wouldn't be called here for a byte count of 64. */ +GLOBAL(movmem): + sts.l pr,@-r15 + shll2 r6 + bsr GLOBAL(movmemSI52+2) + mov.l @(48,r5),r0 + .balign 4 +LOCAL(movmem_loop): /* Reached with rts */ + mov.l @(60,r5),r0 + add #-64,r6 + mov.l r0,@(60,r4) + tst r6,r6 + mov.l @(56,r5),r0 + bt LOCAL(movmem_done) + mov.l r0,@(56,r4) + cmp/pl r6 + mov.l @(52,r5),r0 + add #64,r5 + mov.l r0,@(52,r4) + add #64,r4 + bt GLOBAL(movmemSI52) ! done all the large groups, do the remainder - ! jump to movmem+ -done: - add #64,r5 - mova GLOBAL(movmemSI0),r0 - shll2 r6 + mova GLOBAL(movmemSI4)+4,r0 add r6,r0 jmp @r0 - add #64,r4 - .align 4 +LOCAL(movmem_done): ! share slot insn, works out aligned. + lds.l @r15+,pr + mov.l r0,@(56,r4) + mov.l @(52,r5),r0 + rts + mov.l r0,@(52,r4) + .balign 4 ! ??? We need aliases movstr* for movmem* for the older libraries. These ! aliases will be removed at the some point in the future. 
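The dispatch above works because the movmemSI* entry points fall straight through into one another: each one stores a single word and then continues with the next smaller entry, so the sub-64-byte remainder of a block move is handled by jumping into the chain at an offset derived from the residual count (the mova / add r6,r0 / jmp @r0 sequence). A minimal C sketch of that fall-through idea, with a hypothetical helper name and the chain shortened to four entries:

    /* Sketch only: models the fall-through chain movmemSI16 -> movmemSI12
       -> movmemSI8 -> movmemSI4.  Each "entry point" copies one word and
       falls through to the next one; the assembly reaches the right entry
       with a computed jump instead of this switch.  */
    #include <stddef.h>

    static void
    movmem_tail_model (int *dst, const int *src, size_t words)
    {
      switch (words)
        {
        case 4: dst[3] = src[3];   /* movmemSI16 */
        case 3: dst[2] = src[2];   /* movmemSI12 */
        case 2: dst[1] = src[1];   /* movmemSI8 */
        case 1: dst[0] = src[0];   /* movmemSI4 */
        case 0: break;
        }
    }

The real routine additionally peels off 64-byte groups in LOCAL(movmem_loop) first and only uses the chain for the tail.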
.global GLOBAL(movmemSI64) - FUNC(GLOBAL(movmemSI64)) - ALIAS(movstrSI64,movmemSI64) + HIDDEN_FUNC(GLOBAL(movmemSI64)) + HIDDEN_ALIAS(movstrSI64,movmemSI64) GLOBAL(movmemSI64): mov.l @(60,r5),r0 mov.l r0,@(60,r4) .global GLOBAL(movmemSI60) - FUNC(GLOBAL(movmemSI60)) - ALIAS(movstrSI60,movmemSI60) + HIDDEN_FUNC(GLOBAL(movmemSI60)) + HIDDEN_ALIAS(movstrSI60,movmemSI60) GLOBAL(movmemSI60): mov.l @(56,r5),r0 mov.l r0,@(56,r4) .global GLOBAL(movmemSI56) - FUNC(GLOBAL(movmemSI56)) - ALIAS(movstrSI56,movmemSI56) + HIDDEN_FUNC(GLOBAL(movmemSI56)) + HIDDEN_ALIAS(movstrSI56,movmemSI56) GLOBAL(movmemSI56): mov.l @(52,r5),r0 mov.l r0,@(52,r4) .global GLOBAL(movmemSI52) - FUNC(GLOBAL(movmemSI52)) - ALIAS(movstrSI52,movmemSI52) + HIDDEN_FUNC(GLOBAL(movmemSI52)) + HIDDEN_ALIAS(movstrSI52,movmemSI52) GLOBAL(movmemSI52): mov.l @(48,r5),r0 mov.l r0,@(48,r4) .global GLOBAL(movmemSI48) - FUNC(GLOBAL(movmemSI48)) - ALIAS(movstrSI48,movmemSI48) + HIDDEN_FUNC(GLOBAL(movmemSI48)) + HIDDEN_ALIAS(movstrSI48,movmemSI48) GLOBAL(movmemSI48): mov.l @(44,r5),r0 mov.l r0,@(44,r4) .global GLOBAL(movmemSI44) - FUNC(GLOBAL(movmemSI44)) - ALIAS(movstrSI44,movmemSI44) + HIDDEN_FUNC(GLOBAL(movmemSI44)) + HIDDEN_ALIAS(movstrSI44,movmemSI44) GLOBAL(movmemSI44): mov.l @(40,r5),r0 mov.l r0,@(40,r4) .global GLOBAL(movmemSI40) - FUNC(GLOBAL(movmemSI40)) - ALIAS(movstrSI40,movmemSI40) + HIDDEN_FUNC(GLOBAL(movmemSI40)) + HIDDEN_ALIAS(movstrSI40,movmemSI40) GLOBAL(movmemSI40): mov.l @(36,r5),r0 mov.l r0,@(36,r4) .global GLOBAL(movmemSI36) - FUNC(GLOBAL(movmemSI36)) - ALIAS(movstrSI36,movmemSI36) + HIDDEN_FUNC(GLOBAL(movmemSI36)) + HIDDEN_ALIAS(movstrSI36,movmemSI36) GLOBAL(movmemSI36): mov.l @(32,r5),r0 mov.l r0,@(32,r4) .global GLOBAL(movmemSI32) - FUNC(GLOBAL(movmemSI32)) - ALIAS(movstrSI32,movmemSI32) + HIDDEN_FUNC(GLOBAL(movmemSI32)) + HIDDEN_ALIAS(movstrSI32,movmemSI32) GLOBAL(movmemSI32): mov.l @(28,r5),r0 mov.l r0,@(28,r4) .global GLOBAL(movmemSI28) - FUNC(GLOBAL(movmemSI28)) - ALIAS(movstrSI28,movmemSI28) + HIDDEN_FUNC(GLOBAL(movmemSI28)) + HIDDEN_ALIAS(movstrSI28,movmemSI28) GLOBAL(movmemSI28): mov.l @(24,r5),r0 mov.l r0,@(24,r4) .global GLOBAL(movmemSI24) - FUNC(GLOBAL(movmemSI24)) - ALIAS(movstrSI24,movmemSI24) + HIDDEN_FUNC(GLOBAL(movmemSI24)) + HIDDEN_ALIAS(movstrSI24,movmemSI24) GLOBAL(movmemSI24): mov.l @(20,r5),r0 mov.l r0,@(20,r4) .global GLOBAL(movmemSI20) - FUNC(GLOBAL(movmemSI20)) - ALIAS(movstrSI20,movmemSI20) + HIDDEN_FUNC(GLOBAL(movmemSI20)) + HIDDEN_ALIAS(movstrSI20,movmemSI20) GLOBAL(movmemSI20): mov.l @(16,r5),r0 mov.l r0,@(16,r4) .global GLOBAL(movmemSI16) - FUNC(GLOBAL(movmemSI16)) - ALIAS(movstrSI16,movmemSI16) + HIDDEN_FUNC(GLOBAL(movmemSI16)) + HIDDEN_ALIAS(movstrSI16,movmemSI16) GLOBAL(movmemSI16): mov.l @(12,r5),r0 mov.l r0,@(12,r4) .global GLOBAL(movmemSI12) - FUNC(GLOBAL(movmemSI12)) - ALIAS(movstrSI12,movmemSI12) + HIDDEN_FUNC(GLOBAL(movmemSI12)) + HIDDEN_ALIAS(movstrSI12,movmemSI12) GLOBAL(movmemSI12): mov.l @(8,r5),r0 mov.l r0,@(8,r4) .global GLOBAL(movmemSI8) - FUNC(GLOBAL(movmemSI8)) - ALIAS(movstrSI8,movmemSI8) + HIDDEN_FUNC(GLOBAL(movmemSI8)) + HIDDEN_ALIAS(movstrSI8,movmemSI8) GLOBAL(movmemSI8): mov.l @(4,r5),r0 mov.l r0,@(4,r4) .global GLOBAL(movmemSI4) - FUNC(GLOBAL(movmemSI4)) - ALIAS(movstrSI4,movmemSI4) + HIDDEN_FUNC(GLOBAL(movmemSI4)) + HIDDEN_ALIAS(movstrSI4,movmemSI4) GLOBAL(movmemSI4): mov.l @(0,r5),r0 - mov.l r0,@(0,r4) - .global GLOBAL(movmemSI0) - FUNC(GLOBAL(movmemSI0)) - ALIAS(movstrSI0,movmemSI0) -GLOBAL(movmemSI0): rts - nop + mov.l r0,@(0,r4) ENDFUNC(GLOBAL(movmemSI64)) 
ENDFUNC(GLOBAL(movmemSI60)) @@ -851,71 +873,7 @@ GLOBAL(movmemSI0): ENDFUNC(GLOBAL(movmemSI12)) ENDFUNC(GLOBAL(movmemSI8)) ENDFUNC(GLOBAL(movmemSI4)) - ENDFUNC(GLOBAL(movmemSI0)) - - .align 4 - - .global GLOBAL(movmem) - FUNC(GLOBAL(movmem)) - ALIAS(movstr,movmem) -GLOBAL(movmem): - mov.l @(60,r5),r0 - mov.l r0,@(60,r4) - - mov.l @(56,r5),r0 - mov.l r0,@(56,r4) - - mov.l @(52,r5),r0 - mov.l r0,@(52,r4) - - mov.l @(48,r5),r0 - mov.l r0,@(48,r4) - - mov.l @(44,r5),r0 - mov.l r0,@(44,r4) - - mov.l @(40,r5),r0 - mov.l r0,@(40,r4) - - mov.l @(36,r5),r0 - mov.l r0,@(36,r4) - - mov.l @(32,r5),r0 - mov.l r0,@(32,r4) - - mov.l @(28,r5),r0 - mov.l r0,@(28,r4) - - mov.l @(24,r5),r0 - mov.l r0,@(24,r4) - - mov.l @(20,r5),r0 - mov.l r0,@(20,r4) - - mov.l @(16,r5),r0 - mov.l r0,@(16,r4) - - mov.l @(12,r5),r0 - mov.l r0,@(12,r4) - - mov.l @(8,r5),r0 - mov.l r0,@(8,r4) - - mov.l @(4,r5),r0 - mov.l r0,@(4,r4) - - mov.l @(0,r5),r0 - mov.l r0,@(0,r4) - - add #-16,r6 - cmp/pl r6 - bf done - - add #64,r5 - bra GLOBAL(movmem) - add #64,r4 - - FUNC(GLOBAL(movmem)) + ENDFUNC(GLOBAL(movmem)) #endif #ifdef L_movmem_i4 @@ -924,13 +882,13 @@ GLOBAL(movmem): .global GLOBAL(movmem_i4_odd) .global GLOBAL(movmemSI12_i4) - FUNC(GLOBAL(movmem_i4_even)) - FUNC(GLOBAL(movmem_i4_odd)) - FUNC(GLOBAL(movmemSI12_i4)) + HIDDEN_FUNC(GLOBAL(movmem_i4_even)) + HIDDEN_FUNC(GLOBAL(movmem_i4_odd)) + HIDDEN_FUNC(GLOBAL(movmemSI12_i4)) - ALIAS(movstr_i4_even,movmem_i4_even) - ALIAS(movstr_i4_odd,movmem_i4_odd) - ALIAS(movstrSI12_i4,movmemSI12_i4) + HIDDEN_ALIAS(movstr_i4_even,movmem_i4_even) + HIDDEN_ALIAS(movstr_i4_odd,movmem_i4_odd) + HIDDEN_ALIAS(movstrSI12_i4,movmemSI12_i4) .p2align 5 L_movmem_2mod4_end: @@ -991,7 +949,7 @@ GLOBAL(movmemSI12_i4): .global GLOBAL(mulsi3) - FUNC(GLOBAL(mulsi3)) + HIDDEN_FUNC(GLOBAL(mulsi3)) ! r4 = aabb ! r5 = ccdd @@ -1024,7 +982,7 @@ hiset: sts macl,r0 ! r0 = bb*dd rts add r2,r0 - FUNC(GLOBAL(mulsi3)) + ENDFUNC(GLOBAL(mulsi3)) #endif #endif /* ! __SH5__ */ #ifdef L_sdivsi3_i4 @@ -1034,7 +992,7 @@ hiset: sts macl,r0 ! r0 = bb*dd !! args in r4 and r5, result in fpul, clobber dr0, dr2 .global GLOBAL(sdivsi3_i4) - FUNC(GLOBAL(sdivsi3_i4)) + HIDDEN_FUNC(GLOBAL(sdivsi3_i4)) GLOBAL(sdivsi3_i4): lds r4,fpul float fpul,dr0 @@ -1053,7 +1011,7 @@ GLOBAL(sdivsi3_i4): .mode SHcompact #endif .global GLOBAL(sdivsi3_i4) - FUNC(GLOBAL(sdivsi3_i4)) + HIDDEN_FUNC(GLOBAL(sdivsi3_i4)) GLOBAL(sdivsi3_i4): sts.l fpscr,@-r15 mov #8,r2 @@ -1086,7 +1044,6 @@ GLOBAL(sdivsi3_i4): !! args in r4 and r5, result in r0 clobber r1, r2, r3, and t bit .global GLOBAL(sdivsi3) - FUNC(GLOBAL(sdivsi3)) #if __SHMEDIA__ #if __SH5__ == 32 .section .text..SHmedia32,"ax" @@ -1152,7 +1109,7 @@ LOCAL(sdivsi3_dontadd): muls.l r0, r2, r0 add.l r0, r63, r0 blink tr0, r63 -#else /* ! 0 */ +#elif 0 /* ! 0 */ // inputs: r4,r5 // clobbered: r1,r2,r3,r18,r19,r20,r21,r25,tr0 // result in r0 @@ -1214,6 +1171,65 @@ GLOBAL(sdivsi3): add.l r19,r25,r0 xor r0,r18,r0 blink tr0,r63 +#else /* ! 0 && ! 0 */ + + // inputs: r4,r5 + // clobbered: r1,r18,r19,r20,r21,r25,tr0 + // result in r0 + HIDDEN_FUNC(GLOBAL(sdivsi3_2)) +#ifndef __pic__ + FUNC(GLOBAL(sdivsi3)) +GLOBAL(sdivsi3): /* this is the shcompact entry point */ + // The special SHmedia entry point sdivsi3_1 prevents accidental linking + // with the SHcompact implementation, which clobbers tr1 / tr2. 
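The table-driven sequence that follows is easier to read against a floating-point model of the approximation: the divisor is normalized, a few top bits of the normalized value index div_table to fetch a byte-sized factor and a 16-bit constant, those give a first reciprocal estimate, and a Newton-Raphson style step refines it before the multiply by the dividend. A rough model under those assumptions (function name hypothetical; the real code works in fixed point and applies a further correction, so this is illustrative only):

    /* Floating-point model of the div_table lookup plus one refinement,
       mirroring divtab.c's calc_defect.  constant and factor are the table
       entries selected by the normalized divisor x; this is not the actual
       fixed-point libgcc sequence.  */
    static double
    recip_estimate_model (double x, int constant, int factor)
    {
      double y0 = (constant - x * factor * 64.) / 16384.;  /* table estimate */
      double y1 = y0 * (2. - x * y0);       /* Newton step for 1/x */
      return y1;   /* quotient ~= dividend * y1, plus sign/shift fixup */
    }

divtab.c chooses factor and constant per interval so that the residual error after this refinement stays within the defect bounds it prints into the generated table's header comment.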
+ .global GLOBAL(sdivsi3_1) +GLOBAL(sdivsi3_1): + .global GLOBAL(div_table_internal) + movi (GLOBAL(div_table_internal) >> 16) & 65535, r20 + shori GLOBAL(div_table_internal) & 65535, r20 +#endif + .global GLOBAL(sdivsi3_2) + // div_table in r20 + // clobbered: r1,r18,r19,r21,r25,tr0 +GLOBAL(sdivsi3_2): + nsb r5, r1 + shlld r5, r1, r25 // normalize; [-2 ..1, 1..2) in s2.62 + shari r25, 58, r21 // extract 5(6) bit index (s2.4 with hole -1..1) + ldx.ub r20, r21, r19 // u0.8 + shari r25, 32, r25 // normalize to s2.30 + shlli r21, 1, r21 + muls.l r25, r19, r19 // s2.38 + ldx.w r20, r21, r21 // s2.14 + ptabs r18, tr0 + shari r19, 24, r19 // truncate to s2.14 + sub r21, r19, r19 // some 11 bit inverse in s1.14 + muls.l r19, r19, r21 // u0.28 + sub r63, r1, r1 + addi r1, 92, r1 + muls.l r25, r21, r18 // s2.58 + shlli r19, 45, r19 // multiply by two and convert to s2.58 + /* bubble */ + sub r19, r18, r18 + shari r18, 28, r18 // some 22 bit inverse in s1.30 + muls.l r18, r25, r0 // s2.60 + muls.l r18, r4, r25 // s32.30 + /* bubble */ + shari r0, 16, r19 // s-16.44 + muls.l r19, r18, r19 // s-16.74 + shari r25, 63, r0 + shari r4, 14, r18 // s19.-14 + shari r19, 30, r19 // s-16.44 + muls.l r19, r18, r19 // s15.30 + xor r21, r0, r21 // You could also use the constant 1 << 27. + add r21, r25, r21 + sub r21, r19, r21 + shard r21, r1, r21 + sub r21, r0, r0 + blink tr0, r63 +#ifndef __pic__ + ENDFUNC(GLOBAL(sdivsi3)) +#endif + ENDFUNC(GLOBAL(sdivsi3_2)) #endif #elif defined __SHMEDIA__ /* m5compact-nofpu */ @@ -1221,6 +1237,7 @@ GLOBAL(sdivsi3): .mode SHmedia .section .text..SHmedia32,"ax" .align 2 + FUNC(GLOBAL(sdivsi3)) GLOBAL(sdivsi3): pt/l LOCAL(sdivsi3_dontsub), tr0 pt/l LOCAL(sdivsi3_loop), tr1 @@ -1245,7 +1262,9 @@ LOCAL(sdivsi3_dontsub): xor r20,r19,r20 sub.l r20,r19,r0 blink tr2,r63 + ENDFUNC(GLOBAL(sdivsi3)) #else /* ! __SHMEDIA__ */ + FUNC(GLOBAL(sdivsi3)) GLOBAL(sdivsi3): mov r4,r1 mov r5,r0 @@ -1343,7 +1362,7 @@ div0: rts !! and t bit .global GLOBAL(udivsi3_i4) - FUNC(GLOBAL(udivsi3_i4)) + HIDDEN_FUNC(GLOBAL(udivsi3_i4)) GLOBAL(udivsi3_i4): mov #1,r1 cmp/hi r1,r5 @@ -1390,7 +1409,7 @@ L1: !! args in r4 and r5, result in fpul, clobber r20, r21, dr0, fr33 .mode SHmedia .global GLOBAL(udivsi3_i4) - FUNC(GLOBAL(udivsi3_i4)) + HIDDEN_FUNC(GLOBAL(udivsi3_i4)) GLOBAL(udivsi3_i4): addz.l r4,r63,r20 addz.l r5,r63,r21 @@ -1410,6 +1429,7 @@ GLOBAL(udivsi3_i4): !! args in r4 and r5, result in fpul, clobber r0, r1, r4, r5, dr0, dr2, dr4 .global GLOBAL(udivsi3_i4) + HIDDEN_FUNC(GLOBAL(udivsi3_i4)) GLOBAL(udivsi3_i4): mov #1,r1 cmp/hi r1,r5 @@ -1469,7 +1489,7 @@ L1: !! 
args in r4 and r5, result in r0, clobbers r4, pr, and t bit .global GLOBAL(udivsi3) - FUNC(GLOBAL(udivsi3)) + HIDDEN_FUNC(GLOBAL(udivsi3)) #if __SHMEDIA__ #if __SH5__ == 32 @@ -1671,6 +1691,7 @@ LOCAL(large_divisor): .global GLOBAL(udivdi3) FUNC(GLOBAL(udivdi3)) GLOBAL(udivdi3): + HIDDEN_ALIAS(udivdi3_internal,udivdi3) shlri r3,1,r4 nsb r4,r22 shlld r3,r22,r6 @@ -1798,7 +1819,7 @@ LOCAL(no_lo_adj): .global GLOBAL(divdi3) FUNC(GLOBAL(divdi3)) GLOBAL(divdi3): - pta GLOBAL(udivdi3),tr0 + pta GLOBAL(udivdi3_internal),tr0 shari r2,63,r22 shari r3,63,r23 xor r2,r22,r2 @@ -1822,6 +1843,7 @@ GLOBAL(divdi3): .global GLOBAL(umoddi3) FUNC(GLOBAL(umoddi3)) GLOBAL(umoddi3): + HIDDEN_ALIAS(umoddi3_internal,umoddi3) shlri r3,1,r4 nsb r4,r22 shlld r3,r22,r6 @@ -1950,7 +1972,7 @@ LOCAL(no_lo_adj): .global GLOBAL(moddi3) FUNC(GLOBAL(moddi3)) GLOBAL(moddi3): - pta GLOBAL(umoddi3),tr0 + pta GLOBAL(umoddi3_internal),tr0 shari r2,63,r22 shari r3,63,r23 xor r2,r22,r2 @@ -1973,7 +1995,7 @@ GLOBAL(moddi3): .mode SHcompact #endif .global GLOBAL(set_fpscr) - FUNC(GLOBAL(set_fpscr)) + HIDDEN_FUNC(GLOBAL(set_fpscr)) GLOBAL(set_fpscr): lds r4,fpscr #ifdef __PIC__ @@ -2041,7 +2063,7 @@ LOCAL(set_fpscr_L1): .section .text..SHmedia32,"ax" .align 2 .global GLOBAL(init_trampoline) - FUNC(GLOBAL(init_trampoline)) + HIDDEN_FUNC(GLOBAL(init_trampoline)) GLOBAL(init_trampoline): st.l r0,8,r2 #ifdef __LITTLE_ENDIAN__ @@ -2057,8 +2079,9 @@ GLOBAL(init_trampoline): #endif st.q r0,0,r20 st.l r0,12,r3 + ENDFUNC(GLOBAL(init_trampoline)) .global GLOBAL(ic_invalidate) - FUNC(GLOBAL(ic_invalidate)) + HIDDEN_FUNC(GLOBAL(ic_invalidate)) GLOBAL(ic_invalidate): ocbwb r0,0 synco @@ -2066,65 +2089,123 @@ GLOBAL(ic_invalidate): ptabs r18, tr0 synci blink tr0, r63 - ENDFUNC(GLOBAL(ic_invalidate)) - ENDFUNC(GLOBAL(init_trampoline)) #elif defined(__SH4A__) .global GLOBAL(ic_invalidate) - FUNC(GLOBAL(ic_invalidate)) + HIDDEN_FUNC(GLOBAL(ic_invalidate)) GLOBAL(ic_invalidate): ocbwb @r4 synco rts icbi @r4 ENDFUNC(GLOBAL(ic_invalidate)) -#elif defined(__SH4_SINGLE__) || defined(__SH4__) || defined(__SH4_SINGLE_ONLY__) - /* This assumes a direct-mapped cache, which is the case for - the first SH4, but not for the second version of SH4, that - uses a 2-way set-associative cache, nor SH4a, that is 4-way. - SH4a fortunately offers an instruction to invalidate the - instruction cache, and we use it above, but SH4 doesn't. - However, since the libraries don't contain any nested - functions (the only case in which GCC would emit this pattern) - and we actually emit the ic_invalidate_line_i pattern for - cache invalidation on all SH4 multilibs (even 4-nofpu, that - isn't even corevered here), and pre-SH4 cores don't have - caches, it seems like this code is pointless, unless it's - meant for backward binary compatibility or for userland-only - cache invalidation for say sh4-*-linux-gnu. Such a feature - should probably be moved into a system call, such that the - kernel could do whatever it takes to invalidate a cache line - on the core it's actually running on. I.e., this hideous :-) - piece of code should go away at some point. */ - +#elif defined(__SH4_SINGLE__) || defined(__SH4__) || defined(__SH4_SINGLE_ONLY__) || (defined(__SH4_NOFPU__) && !defined(__SH5__)) + /* For system code, we use ic_invalidate_line_i, but user code + needs a different mechanism. A kernel call is generally not + available, and it would also be slow. Different SH4 variants use + different sizes and associativities of the Icache. 
We use a small + bit of dispatch code that can be put hidden in every shared object, + which calls the actual processor-specific invalidation code in a + separate module. + Or if you have operating system support, the OS could mmap the + procesor-specific code from a single page, since it is highly + repetitive. */ .global GLOBAL(ic_invalidate) - FUNC(GLOBAL(ic_invalidate)) + HIDDEN_FUNC(GLOBAL(ic_invalidate)) GLOBAL(ic_invalidate): - ocbwb @r4 + mov.l 0f,r1 +#ifdef __pic__ mova 0f,r0 - mov.w 1f,r1 -/* Compute how many cache lines 0f is away from r4. */ - sub r0,r4 - and r1,r4 -/* Prepare to branch to 0f plus the cache-line offset. */ - add # 0f - 1f,r4 - braf r4 - nop -1: - .short 0x1fe0 + mov.l 1f,r2 + add r1,r0 + mov.l @(r0,r2),r1 +#endif + ocbwb @r4 + mov.l @(8,r1),r0 + sub r1,r4 + and r4,r0 + add r1,r0 + jmp @r0 + mov.l @(4,r1),r0 +#ifndef __pic__ +0: .long GLOBAL(ic_invalidate_array) +#else /* __pic__ */ + .global GLOBAL(ic_invalidate_array) + /* ??? Why won't the assembler allow to add these two constants? */ +0: .long _GLOBAL_OFFSET_TABLE_ +1: .long GLOBAL(ic_invalidate_array)@GOT + ENDFUNC(GLOBAL(ic_invalidate)) +#endif /* __pic__ */ +#endif /* SH4 */ +#endif /* L_ic_invalidate */ + +#ifdef L_ic_invalidate_array +#if defined(__SH4A__) + /* This is needed when an SH4 dso with trampolines is used on SH4A. */ + .global GLOBAL(ic_invalidate_array) + FUNC(GLOBAL(ic_invalidate_array)) +GLOBAL(ic_invalidate_array): + add r1,r4 + synco + rts + icbi @r4 + .long 0 + ENDFUNC(GLOBAL(ic_invalidate_array)) +#elif defined(__SH4_SINGLE__) || defined(__SH4__) || defined(__SH4_SINGLE_ONLY__) || (defined(__SH4_NOFPU__) && !defined(__SH5__)) + .global GLOBAL(ic_invalidate_array) .p2align 5 + FUNC(GLOBAL(ic_invalidate_array)) /* This must be aligned to the beginning of a cache line. */ -0: - .rept 256 /* There are 256 cache lines of 32 bytes. */ +GLOBAL(ic_invalidate_array): +#ifndef WAYS +#define WAYS 4 +#define WAY_SIZE 0x4000 +#endif +#if WAYS == 1 + .rept WAY_SIZE * WAYS / 32 + rts + nop + .rept 7 + .long WAY_SIZE - 32 + .endr + .endr +#elif WAYS <= 6 + .rept WAY_SIZE * WAYS / 32 + braf r0 + add #-8,r0 + .long WAY_SIZE + 8 + .long WAY_SIZE - 32 + .rept WAYS-2 + braf r0 + nop + .endr + .rept 7 - WAYS + rts + nop + .endr + .endr +#else /* WAYS > 6 */ + /* This variant needs two different pages for mmap-ing. */ + .rept WAYS-1 + .rept WAY_SIZE / 32 + braf r0 + nop + .long WAY_SIZE + .rept 6 + .long WAY_SIZE - 32 + .endr + .endr + .endr + .rept WAY_SIZE / 32 rts .rept 15 nop .endr .endr - - ENDFUNC(GLOBAL(ic_invalidate)) +#endif /* WAYS */ + ENDFUNC(GLOBAL(ic_invalidate_array)) #endif /* SH4 */ -#endif /* L_ic_invalidate */ +#endif /* L_ic_invalidate_array */ #if defined (__SH5__) && __SH5__ == 32 #ifdef L_shcompact_call_trampoline @@ -2546,7 +2627,7 @@ LOCAL(ct_ret_wide): /* Call the function, so that we can unpack its .section .text..SHmedia32, "ax" .align 2 .global GLOBAL(GCC_shcompact_return_trampoline) - FUNC(GLOBAL(GCC_shcompact_return_trampoline)) + HIDDEN_FUNC(GLOBAL(GCC_shcompact_return_trampoline)) GLOBAL(GCC_shcompact_return_trampoline): ptabs/l r18, tr0 #if __LITTLE_ENDIAN__ @@ -2614,7 +2695,7 @@ LOCAL(ia_main_table): actual bit pattern. */ .global GLOBAL(GCC_shcompact_incoming_args) - FUNC(GLOBAL(GCC_shcompact_incoming_args)) + FUNC(GLOBAL(GCC_shcompact_incoming_args)) GLOBAL(GCC_shcompact_incoming_args): ptabs/l r18, tr0 /* Prepare to return. */ shlri r17, 32, r0 /* Load the cookie. */ @@ -2779,7 +2860,7 @@ LOCAL(ia_end_of_push_seq): /* Label used to compute the first push instruction. 
#endif .align 3 /* It is copied in units of 8 bytes in SHmedia mode. */ .global GLOBAL(GCC_nested_trampoline) - FUNC(GLOBAL(GCC_nested_trampoline)) + HIDDEN_FUNC(GLOBAL(GCC_nested_trampoline)) GLOBAL(GCC_nested_trampoline): .mode SHmedia ptrel/u r63, tr0 @@ -2824,10 +2905,11 @@ GLOBAL(GCC_push_shmedia_regs): fst.d r15, 2*8, dr40 fst.d r15, 1*8, dr38 fst.d r15, 0*8, dr36 -#endif +#else /* ! __SH4_NOFPU__ */ .global GLOBAL(GCC_push_shmedia_regs_nofpu) FUNC(GLOBAL(GCC_push_shmedia_regs_nofpu)) GLOBAL(GCC_push_shmedia_regs_nofpu): +#endif /* ! __SH4_NOFPU__ */ ptabs/l r18, tr0 addi.l r15, -27*8, r15 gettr tr7, r62 @@ -2861,12 +2943,12 @@ GLOBAL(GCC_push_shmedia_regs_nofpu): st.q r15, 1*8, r29 st.q r15, 0*8, r28 blink tr0, r63 - #ifndef __SH4_NOFPU__ ENDFUNC(GLOBAL(GCC_push_shmedia_regs)) -#endif +#else ENDFUNC(GLOBAL(GCC_push_shmedia_regs_nofpu)) -#ifndef __SH4_NOFPU__ +#endif +#ifndef __SH4_NOFPU__ .global GLOBAL(GCC_pop_shmedia_regs) FUNC(GLOBAL(GCC_pop_shmedia_regs)) GLOBAL(GCC_pop_shmedia_regs): @@ -2887,10 +2969,11 @@ GLOBAL(GCC_pop_shmedia_regs): fld.d r15, 28*8, dr38 fld.d r15, 27*8, dr36 blink tr1, r63 -#endif +#else /* ! __SH4_NOFPU__ */ .global GLOBAL(GCC_pop_shmedia_regs_nofpu) FUNC(GLOBAL(GCC_pop_shmedia_regs_nofpu)) GLOBAL(GCC_pop_shmedia_regs_nofpu): +#endif /* ! __SH4_NOFPU__ */ movi 27*8, r0 .L0: ptabs r18, tr0 @@ -2929,7 +3012,239 @@ GLOBAL(GCC_pop_shmedia_regs_nofpu): #ifndef __SH4_NOFPU__ ENDFUNC(GLOBAL(GCC_pop_shmedia_regs)) -#endif +#else ENDFUNC(GLOBAL(GCC_pop_shmedia_regs_nofpu)) +#endif #endif /* __SH5__ == 32 */ #endif /* L_push_pop_shmedia_regs */ + +#if __SH5__ +#ifdef L_div_table +#if defined(__pic__) && defined(__SHMEDIA__) + .global GLOBAL(sdivsi3) + FUNC(GLOBAL(sdivsi3)) +#if __SH5__ == 32 + .section .text..SHmedia32,"ax" +#else + .text +#endif +#if 0 +/* ??? FIXME: Presumably due to a linker bug, exporting data symbols + in a text section does not work (at least for shared libraries): + the linker sets the LSB of the address as if this was SHmedia code. */ +#define TEXT_DATA_BUG +#endif + .align 2 + // inputs: r4,r5 + // clobbered: r1,r18,r19,r20,r21,r25,tr0 + // result in r0 + .global GLOBAL(sdivsi3) +GLOBAL(sdivsi3): +#ifdef TEXT_DATA_BUG + ptb datalabel Local_div_table,tr0 +#else + ptb GLOBAL(div_table_internal),tr0 +#endif + nsb r5, r1 + shlld r5, r1, r25 // normalize; [-2 ..1, 1..2) in s2.62 + shari r25, 58, r21 // extract 5(6) bit index (s2.4 with hole -1..1) + /* bubble */ + gettr tr0,r20 + ldx.ub r20, r21, r19 // u0.8 + shari r25, 32, r25 // normalize to s2.30 + shlli r21, 1, r21 + muls.l r25, r19, r19 // s2.38 + ldx.w r20, r21, r21 // s2.14 + ptabs r18, tr0 + shari r19, 24, r19 // truncate to s2.14 + sub r21, r19, r19 // some 11 bit inverse in s1.14 + muls.l r19, r19, r21 // u0.28 + sub r63, r1, r1 + addi r1, 92, r1 + muls.l r25, r21, r18 // s2.58 + shlli r19, 45, r19 // multiply by two and convert to s2.58 + /* bubble */ + sub r19, r18, r18 + shari r18, 28, r18 // some 22 bit inverse in s1.30 + muls.l r18, r25, r0 // s2.60 + muls.l r18, r4, r25 // s32.30 + /* bubble */ + shari r0, 16, r19 // s-16.44 + muls.l r19, r18, r19 // s-16.74 + shari r25, 63, r0 + shari r4, 14, r18 // s19.-14 + shari r19, 30, r19 // s-16.44 + muls.l r19, r18, r19 // s15.30 + xor r21, r0, r21 // You could also use the constant 1 << 27. + add r21, r25, r21 + sub r21, r19, r21 + shard r21, r1, r21 + sub r21, r0, r0 + blink tr0, r63 + ENDFUNC(GLOBAL(sdivsi3)) +/* This table has been generated by divtab.c . 
+Defects for bias -330: + Max defect: 6.081536e-07 at -1.000000e+00 + Min defect: 2.849516e-08 at 1.030651e+00 + Max 2nd step defect: 9.606539e-12 at -1.000000e+00 + Min 2nd step defect: 0.000000e+00 at 0.000000e+00 + Defect at 1: 1.238659e-07 + Defect at -2: 1.061708e-07 */ +#else /* ! __pic__ || ! __SHMEDIA__ */ + .section .rodata +#endif /* __pic__ */ +#if defined(TEXT_DATA_BUG) && defined(__pic__) && defined(__SHMEDIA__) + .balign 2 + .type Local_div_table,@object + .size Local_div_table,128 +/* negative division constants */ + .word -16638 + .word -17135 + .word -17737 + .word -18433 + .word -19103 + .word -19751 + .word -20583 + .word -21383 + .word -22343 + .word -23353 + .word -24407 + .word -25582 + .word -26863 + .word -28382 + .word -29965 + .word -31800 +/* negative division factors */ + .byte 66 + .byte 70 + .byte 75 + .byte 81 + .byte 87 + .byte 93 + .byte 101 + .byte 109 + .byte 119 + .byte 130 + .byte 142 + .byte 156 + .byte 172 + .byte 192 + .byte 214 + .byte 241 + .skip 16 +Local_div_table: + .skip 16 +/* positive division factors */ + .byte 241 + .byte 214 + .byte 192 + .byte 172 + .byte 156 + .byte 142 + .byte 130 + .byte 119 + .byte 109 + .byte 101 + .byte 93 + .byte 87 + .byte 81 + .byte 75 + .byte 70 + .byte 66 +/* positive division constants */ + .word 31801 + .word 29966 + .word 28383 + .word 26864 + .word 25583 + .word 24408 + .word 23354 + .word 22344 + .word 21384 + .word 20584 + .word 19752 + .word 19104 + .word 18434 + .word 17738 + .word 17136 + .word 16639 + .section .rodata +#endif /* TEXT_DATA_BUG */ + .balign 2 + .type GLOBAL(div_table),@object + .size GLOBAL(div_table),128 +/* negative division constants */ + .word -16638 + .word -17135 + .word -17737 + .word -18433 + .word -19103 + .word -19751 + .word -20583 + .word -21383 + .word -22343 + .word -23353 + .word -24407 + .word -25582 + .word -26863 + .word -28382 + .word -29965 + .word -31800 +/* negative division factors */ + .byte 66 + .byte 70 + .byte 75 + .byte 81 + .byte 87 + .byte 93 + .byte 101 + .byte 109 + .byte 119 + .byte 130 + .byte 142 + .byte 156 + .byte 172 + .byte 192 + .byte 214 + .byte 241 + .skip 16 + .global GLOBAL(div_table) +GLOBAL(div_table): + HIDDEN_ALIAS(div_table_internal,div_table) + .skip 16 +/* positive division factors */ + .byte 241 + .byte 214 + .byte 192 + .byte 172 + .byte 156 + .byte 142 + .byte 130 + .byte 119 + .byte 109 + .byte 101 + .byte 93 + .byte 87 + .byte 81 + .byte 75 + .byte 70 + .byte 66 +/* positive division constants */ + .word 31801 + .word 29966 + .word 28383 + .word 26864 + .word 25583 + .word 24408 + .word 23354 + .word 22344 + .word 21384 + .word 20584 + .word 19752 + .word 19104 + .word 18434 + .word 17738 + .word 17136 + .word 16639 +#endif /* L_div_table */ +#endif /* __SH5__ */ diff --git a/gcc/config/sh/libgcc-excl.ver b/gcc/config/sh/libgcc-excl.ver index 1083ba2..325c740 100644 --- a/gcc/config/sh/libgcc-excl.ver +++ b/gcc/config/sh/libgcc-excl.ver @@ -3,5 +3,6 @@ __ashlsi3 __ashrsi3 __lshrsi3 + __mulsi3 # this is an SH1-only symbol. __udivsi3 } diff --git a/gcc/config/sh/linux.h b/gcc/config/sh/linux.h index 412ce46..013bf49 100644 --- a/gcc/config/sh/linux.h +++ b/gcc/config/sh/linux.h @@ -48,7 +48,8 @@ Boston, MA 02111-1307, USA. */ #undef TARGET_DEFAULT #define TARGET_DEFAULT \ - (TARGET_CPU_DEFAULT | USERMODE_BIT | TARGET_ENDIAN_DEFAULT) + (TARGET_CPU_DEFAULT | USERMODE_BIT | TARGET_ENDIAN_DEFAULT \ + | TARGET_OPT_DEFAULT) #define TARGET_ASM_FILE_END file_end_indicate_exec_stack @@ -104,3 +105,10 @@ Boston, MA 02111-1307, USA. 
*/ #undef DBX_REGISTER_NUMBER #define DBX_REGISTER_NUMBER(REGNO) \ ((! TARGET_SH5 && (REGNO) == 16) ? 16 : SH_DBX_REGISTER_NUMBER (REGNO)) + +/* Since libgcc is compiled with -fpic for this target, we can't use + __sdivsi3_1 as the division strategy for -O0 and -Os. */ +#undef SH_DIV_STRATEGY_DEFAULT +#define SH_DIV_STRATEGY_DEFAULT SH_DIV_CALL2 +#undef SH_DIV_STR_FOR_SIZE +#define SH_DIV_STR_FOR_SIZE "call2" diff --git a/gcc/config/sh/netbsd-elf.h b/gcc/config/sh/netbsd-elf.h index a638546..c640ba0 100644 --- a/gcc/config/sh/netbsd-elf.h +++ b/gcc/config/sh/netbsd-elf.h @@ -1,5 +1,5 @@ /* Definitions for SH running NetBSD using ELF - Copyright (C) 2002, 2003, 2004 Free Software Foundation, Inc. + Copyright (C) 2002, 2003, 2004, 2005 Free Software Foundation, Inc. Contributed by Wasabi Systems, Inc. This file is part of GCC. @@ -109,3 +109,10 @@ do \ } \ } \ while (0) + +/* Since libgcc is compiled with -fpic for this target, we can't use + __sdivsi3_1 as the division strategy for -O0 and -Os. */ +#undef SH_DIV_STRATEGY_DEFAULT +#define SH_DIV_STRATEGY_DEFAULT SH_DIV_CALL2 +#undef SH_DIV_STR_FOR_SIZE +#define SH_DIV_STR_FOR_SIZE "call2" diff --git a/gcc/config/sh/newlib.h b/gcc/config/sh/newlib.h new file mode 100644 index 0000000..062cc7e --- /dev/null +++ b/gcc/config/sh/newlib.h @@ -0,0 +1,26 @@ +/* Definitions of target machine for gcc for Super-H using sh-superh-elf. + Copyright (C) 2001 Free Software Foundation, Inc. + +This file is part of GNU CC. + +GNU CC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +GNU CC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GNU CC; see the file COPYING. If not, write to +the Free Software Foundation, 59 Temple Place - Suite 330, +Boston, MA 02111-1307, USA. */ + + +/* This header file is used when with_libgloss is enabled during gcc + configuration. */ + +#undef LIB_SPEC +#define LIB_SPEC "-lc -lgloss" diff --git a/gcc/config/sh/predicates.md b/gcc/config/sh/predicates.md new file mode 100644 index 0000000..981cc8f --- /dev/null +++ b/gcc/config/sh/predicates.md @@ -0,0 +1,38 @@ +(define_predicate "trapping_target_operand" + (match_code "if_then_else") +{ + rtx cond, mem, res, tar, and; + + if (GET_MODE (op) != PDImode) + return 0; + cond = XEXP (op, 0); + mem = XEXP (op, 1); + res = XEXP (op, 2); + if (GET_CODE (mem) != MEM + || (GET_CODE (res) != SIGN_EXTEND && GET_CODE (res) != TRUNCATE)) + return 0; + tar = XEXP (res, 0); + if (!rtx_equal_p (XEXP (mem, 0), tar) + || GET_MODE (tar) != Pmode) + return 0; + if (GET_CODE (cond) == CONST) + { + cond = XEXP (cond, 0); + if (!EXTRA_CONSTRAINT_Csy (tar)) + return 0; + if (GET_CODE (tar) == CONST) + tar = XEXP (tar, 0); + } + else if (!arith_reg_operand (tar, VOIDmode) + && ! 
EXTRA_CONSTRAINT_Csy (tar)) + return 0; + if (GET_CODE (cond) != EQ) + return 0; + and = XEXP (cond, 0); + return (GET_CODE (and) == AND + && rtx_equal_p (XEXP (and, 0), tar) + && GET_CODE (XEXP (and, 1)) == CONST_INT + && GET_CODE (XEXP (cond, 1)) == CONST_INT + && INTVAL (XEXP (and, 1)) == 3 + && INTVAL (XEXP (cond, 1)) == 3); +}) diff --git a/gcc/config/sh/sh-modes.def b/gcc/config/sh/sh-modes.def index c152068..917708a 100644 --- a/gcc/config/sh/sh-modes.def +++ b/gcc/config/sh/sh-modes.def @@ -20,6 +20,8 @@ Boston, MA 02111-1307, USA. */ /* The SH uses a partial integer mode to represent the FPSCR register. */ PARTIAL_INT_MODE (SI); +/* PDI mode is used to represent a function address in a target register. */ +PARTIAL_INT_MODE (DI); /* Vector modes. */ VECTOR_MODE (INT, QI, 2); /* V2QI */ diff --git a/gcc/config/sh/sh-protos.h b/gcc/config/sh/sh-protos.h index 039f8cb..4882ee3 100644 --- a/gcc/config/sh/sh-protos.h +++ b/gcc/config/sh/sh-protos.h @@ -24,6 +24,19 @@ Boston, MA 02111-1307, USA. */ #ifndef GCC_SH_PROTOS_H #define GCC_SH_PROTOS_H +enum sh_function_kind { + /* A function with normal C ABI */ + FUNCTION_ORDINARY, + /* A special function that guarantees that some otherwise call-clobbered + registers are not clobbered. These can't go through the SH5 resolver, + because it only saves argument passing registers. */ + SFUNC_GOT, + /* A special function that should be linked statically. These are typically + smaller or not much larger than a PLT entry. + Some also have a non-standard ABI which precludes dynamic linking. */ + SFUNC_STATIC +}; + #ifdef RTX_CODE extern rtx sh_fsca_sf2int (void); extern rtx sh_fsca_df2int (void); @@ -101,6 +114,7 @@ extern int sh_can_redirect_branch (rtx, rtx); extern void sh_expand_unop_v2sf (enum rtx_code, rtx, rtx); extern void sh_expand_binop_v2sf (enum rtx_code, rtx, rtx, rtx); extern int sh_expand_t_scc (enum rtx_code code, rtx target); +extern rtx sh_gen_truncate (enum machine_mode, rtx, int); extern bool sh_vector_mode_supported_p (enum machine_mode); #ifdef TREE_CODE extern void sh_va_start (tree, rtx); @@ -137,7 +151,7 @@ extern void fpscr_set_from_mem (int, HARD_REG_SET); extern void sh_pr_interrupt (struct cpp_reader *); extern void sh_pr_trapa (struct cpp_reader *); extern void sh_pr_nosave_low_regs (struct cpp_reader *); -extern rtx function_symbol (const char *); +extern rtx function_symbol (rtx, const char *, enum sh_function_kind); extern rtx sh_get_pr_initial_val (void); extern rtx sh_function_arg (CUMULATIVE_ARGS *, enum machine_mode, tree, int); @@ -147,6 +161,12 @@ extern void sh_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, signed extern const char *sh_pch_valid_p (const void *data_p, size_t sz); extern bool sh_promote_prototypes (tree); +extern rtx replace_n_hard_rtx (rtx, rtx *, int , int); +extern int shmedia_cleanup_truncate (rtx *, void *); + +extern int sh_contains_memref_p (rtx); +extern rtx shmedia_prepare_call_address (rtx fnaddr, int is_sibcall); + #endif /* ! GCC_SH_PROTOS_H */ #ifdef SYMBIAN diff --git a/gcc/config/sh/sh.c b/gcc/config/sh/sh.c index 8a6e1a6..65cd944 100644 --- a/gcc/config/sh/sh.c +++ b/gcc/config/sh/sh.c @@ -52,6 +52,7 @@ Boston, MA 02111-1307, USA. 
*/ #include "sched-int.h" #include "ggc.h" #include "tree-gimple.h" +#include "cfgloop.h" int code_for_indirect_jump_scratch = CODE_FOR_indirect_jump_scratch; @@ -265,6 +266,7 @@ static bool unspec_caller_rtx_p (rtx); static bool sh_cannot_copy_insn_p (rtx); static bool sh_rtx_costs (rtx, int, int, int *); static int sh_address_cost (rtx); +static int sh_adjust_unroll_max (struct loop *, int, int, int, int); static int shmedia_target_regs_stack_space (HARD_REG_SET *); static int shmedia_reserve_space_for_target_registers_p (int, HARD_REG_SET *); static int shmedia_target_regs_stack_adjust (HARD_REG_SET *); @@ -480,6 +482,11 @@ static int hard_regs_intersect_p (HARD_REG_SET *, HARD_REG_SET *); #endif /* SYMBIAN */ +#ifdef TARGET_ADJUST_UNROLL_MAX +#undef TARGET_ADJUST_UNROLL_MAX +#define TARGET_ADJUST_UNROLL_MAX sh_adjust_unroll_max +#endif + struct gcc_target targetm = TARGET_INITIALIZER; /* Print the operand address in x to the stream. */ @@ -546,6 +553,7 @@ print_operand_address (FILE *stream, rtx x) '@' print trap, rte or rts depending upon pragma interruptness '#' output a nop if there is nothing to put in the delay slot ''' print likelihood suffix (/u for unlikely). + '>' print branch target if -fverbose-asm 'O' print a constant without the # 'R' print the LSW of a dp value - changes if in little endian 'S' print the MSW of a dp value - changes if in little endian @@ -554,12 +562,16 @@ print_operand_address (FILE *stream, rtx x) 'N' print 'r63' if the operand is (const_int 0). 'd' print a V2SF reg as dN instead of fpN. 'm' print a pair `base,offset' or `base,index', for LD and ST. + 'U' Likewise for {LD,ST}{HI,LO}. 'u' prints the lowest 16 bits of CONST_INT, as an unsigned value. 'o' output an operator. */ void print_operand (FILE *stream, rtx x, int code) { + int regno; + enum machine_mode mode; + switch (code) { case '.': @@ -592,6 +604,13 @@ print_operand (FILE *stream, rtx x, int code) fputs ("/u", stream); break; } + case '>': + if (flag_verbose_asm && JUMP_LABEL (current_output_insn)) + { + fputs ("\t! target: ", stream); + output_addr_const (stream, JUMP_LABEL (current_output_insn)); + } + break; case 'O': x = mark_constant_pool_use (x); output_addr_const (stream, x); @@ -647,6 +666,8 @@ print_operand (FILE *stream, rtx x, int code) case 'm': gcc_assert (GET_CODE (x) == MEM); x = XEXP (x, 0); + /* Fall through. */ + case 'U': switch (GET_CODE (x)) { case REG: @@ -689,13 +710,63 @@ print_operand (FILE *stream, rtx x, int code) default_output: default: + regno = 0; + mode = GET_MODE (x); + switch (GET_CODE (x)) { + case TRUNCATE: + { + rtx inner = XEXP (x, 0); + int offset = 0; + enum machine_mode inner_mode; + + /* We might see SUBREGs with vector mode registers inside. 
*/ + if (GET_CODE (inner) == SUBREG + && (GET_MODE_SIZE (GET_MODE (inner)) + == GET_MODE_SIZE (GET_MODE (SUBREG_REG (inner)))) + && subreg_lowpart_p (inner)) + inner = SUBREG_REG (inner); + if (GET_CODE (inner) == CONST_INT) + { + x = GEN_INT (trunc_int_for_mode (INTVAL (inner), GET_MODE (x))); + goto default_output; + } + inner_mode = GET_MODE (inner); + if (GET_CODE (inner) == SUBREG + && (GET_MODE_SIZE (GET_MODE (inner)) + < GET_MODE_SIZE (GET_MODE (SUBREG_REG (inner)))) + && GET_CODE (SUBREG_REG (inner)) == REG) + { + offset = subreg_regno_offset (REGNO (SUBREG_REG (inner)), + GET_MODE (SUBREG_REG (inner)), + SUBREG_BYTE (inner), + GET_MODE (inner)); + inner = SUBREG_REG (inner); + } + if (GET_CODE (inner) != REG || GET_MODE_SIZE (inner_mode) > 8) + abort (); + /* Floating point register pairs are always big endian; + general purpose registes are 64 bit wide. */ + regno = REGNO (inner); + regno = (HARD_REGNO_NREGS (regno, inner_mode) + - HARD_REGNO_NREGS (regno, mode)) + + offset; + x = inner; + goto reg; + } + case SIGN_EXTEND: + x = XEXP (x, 0); + goto reg; /* FIXME: We need this on SHmedia32 because reload generates some sign-extended HI or QI loads into DImode registers but, because Pmode is SImode, the address ends up with a subreg:SI of the DImode register. Maybe reload should be fixed so as to apply alter_subreg to such loads? */ + case IF_THEN_ELSE: + gcc_assert (trapping_target_operand (x, VOIDmode)); + x = XEXP (XEXP (x, 2), 0); + goto default_output; case SUBREG: gcc_assert (SUBREG_BYTE (x) == 0 && GET_CODE (SUBREG_REG (x)) == REG); @@ -703,21 +774,23 @@ print_operand (FILE *stream, rtx x, int code) x = SUBREG_REG (x); /* Fall through. */ + reg: case REG: - if (FP_REGISTER_P (REGNO (x)) - && GET_MODE (x) == V16SFmode) - fprintf ((stream), "mtrx%s", reg_names[REGNO (x)] + 2); + regno += REGNO (x); + if (FP_REGISTER_P (regno) + && mode == V16SFmode) + fprintf ((stream), "mtrx%s", reg_names[regno] + 2); else if (FP_REGISTER_P (REGNO (x)) - && GET_MODE (x) == V4SFmode) - fprintf ((stream), "fv%s", reg_names[REGNO (x)] + 2); + && mode == V4SFmode) + fprintf ((stream), "fv%s", reg_names[regno] + 2); else if (GET_CODE (x) == REG - && GET_MODE (x) == V2SFmode) - fprintf ((stream), "fp%s", reg_names[REGNO (x)] + 2); + && mode == V2SFmode) + fprintf ((stream), "fp%s", reg_names[regno] + 2); else if (FP_REGISTER_P (REGNO (x)) - && GET_MODE_SIZE (GET_MODE (x)) > 4) - fprintf ((stream), "d%s", reg_names[REGNO (x)] + 1); + && GET_MODE_SIZE (mode) > 4) + fprintf ((stream), "d%s", reg_names[regno] + 1); else - fputs (reg_names[REGNO (x)], (stream)); + fputs (reg_names[regno], (stream)); break; case MEM: @@ -727,7 +800,8 @@ print_operand (FILE *stream, rtx x, int code) case CONST: if (TARGET_SHMEDIA && GET_CODE (XEXP (x, 0)) == SIGN_EXTEND - && GET_MODE (XEXP (x, 0)) == DImode + && (GET_MODE (XEXP (x, 0)) == DImode + || GET_MODE (XEXP (x, 0)) == SImode) && GET_CODE (XEXP (XEXP (x, 0), 0)) == TRUNCATE && GET_MODE (XEXP (XEXP (x, 0), 0)) == HImode) { @@ -842,16 +916,11 @@ expand_block_move (rtx *operands) return 0; else if (bytes == 12) { - tree entry_name; - rtx sym; - rtx func_addr_rtx; + rtx func_addr_rtx = gen_reg_rtx (Pmode); rtx r4 = gen_rtx_REG (SImode, 4); rtx r5 = gen_rtx_REG (SImode, 5); - entry_name = get_identifier ("__movmemSI12_i4"); - - sym = function_symbol (IDENTIFIER_POINTER (entry_name)); - func_addr_rtx = copy_to_mode_reg (Pmode, sym); + function_symbol (func_addr_rtx, "__movmemSI12_i4", SFUNC_STATIC); force_into (XEXP (operands[0], 0), r4); force_into (XEXP (operands[1], 
0), r5); emit_insn (gen_block_move_real_i4 (func_addr_rtx)); @@ -859,19 +928,15 @@ expand_block_move (rtx *operands) } else if (! TARGET_SMALLCODE) { - tree entry_name; - rtx sym; - rtx func_addr_rtx; + const char *entry_name; + rtx func_addr_rtx = gen_reg_rtx (Pmode); int dwords; rtx r4 = gen_rtx_REG (SImode, 4); rtx r5 = gen_rtx_REG (SImode, 5); rtx r6 = gen_rtx_REG (SImode, 6); - entry_name = get_identifier (bytes & 4 - ? "__movmem_i4_odd" - : "__movmem_i4_even"); - sym = function_symbol (IDENTIFIER_POINTER (entry_name)); - func_addr_rtx = copy_to_mode_reg (Pmode, sym); + entry_name = (bytes & 4 ? "__movmem_i4_odd" : "__movmem_i4_even"); + function_symbol (func_addr_rtx, entry_name, SFUNC_STATIC); force_into (XEXP (operands[0], 0), r4); force_into (XEXP (operands[1], 0), r5); @@ -886,16 +951,12 @@ expand_block_move (rtx *operands) if (bytes < 64) { char entry[30]; - tree entry_name; - rtx sym; - rtx func_addr_rtx; + rtx func_addr_rtx = gen_reg_rtx (Pmode); rtx r4 = gen_rtx_REG (SImode, 4); rtx r5 = gen_rtx_REG (SImode, 5); sprintf (entry, "__movmemSI%d", bytes); - entry_name = get_identifier (entry); - sym = function_symbol (IDENTIFIER_POINTER (entry_name)); - func_addr_rtx = copy_to_mode_reg (Pmode, sym); + function_symbol (func_addr_rtx, entry, SFUNC_STATIC); force_into (XEXP (operands[0], 0), r4); force_into (XEXP (operands[1], 0), r5); emit_insn (gen_block_move_real (func_addr_rtx)); @@ -906,17 +967,13 @@ expand_block_move (rtx *operands) less common function name, so this will occasionally use more space. */ if (! TARGET_SMALLCODE) { - tree entry_name; - rtx sym; - rtx func_addr_rtx; + rtx func_addr_rtx = gen_reg_rtx (Pmode); int final_switch, while_loop; rtx r4 = gen_rtx_REG (SImode, 4); rtx r5 = gen_rtx_REG (SImode, 5); rtx r6 = gen_rtx_REG (SImode, 6); - entry_name = get_identifier ("__movmem"); - sym = function_symbol (IDENTIFIER_POINTER (entry_name)); - func_addr_rtx = copy_to_mode_reg (Pmode, sym); + function_symbol (func_addr_rtx, "__movmem", SFUNC_STATIC); force_into (XEXP (operands[0], 0), r4); force_into (XEXP (operands[1], 0), r5); @@ -997,7 +1054,8 @@ prepare_move_operands (rtx operands[], enum machine_mode mode) of a library call to the target. Reject `st r0,@(rX,rY)' because reload will fail to find a spill register for rX, since r0 is already being used for the source. */ - else if (refers_to_regno_p (R0_REG, R0_REG + 1, operands[1], (rtx *)0) + else if (TARGET_SH1 + && refers_to_regno_p (R0_REG, R0_REG + 1, operands[1], (rtx *)0) && GET_CODE (operands[0]) == MEM && GET_CODE (XEXP (operands[0], 0)) == PLUS && GET_CODE (XEXP (XEXP (operands[0], 0), 1)) == REG) @@ -1792,8 +1850,17 @@ addsubcosts (rtx x) static inline int multcosts (rtx x ATTRIBUTE_UNUSED) { + if (*sh_multcost_str) + return atoi (sh_multcost_str); if (TARGET_SHMEDIA) - return 3; + /* ??? We have a mul insn, but it has a latency of three, and doesn't + accept constants. Ideally, we would use a cost of one or two and + add the cost of the operand, but disregard the latter when inside loops + and loop invariant code motion is still to follow. + Using a multiply first and splitting it later if it's a loss + doesn't work because of different sign / zero extension semantics + of multiplies vs. shifts. */ + return TARGET_SMALLCODE ? 
2 : 3; if (TARGET_SH2) { @@ -1837,7 +1904,7 @@ sh_rtx_costs (rtx x, int code, int outer_code, int *total) else if (CONST_OK_FOR_I16 (INTVAL (x))) *total = COSTS_N_INSNS (outer_code != SET); else if (CONST_OK_FOR_I16 (INTVAL (x) >> 16)) - *total = COSTS_N_INSNS (2); + *total = COSTS_N_INSNS ((outer_code != SET) + 1); else if (CONST_OK_FOR_I16 ((INTVAL (x) >> 16) >> 16)) *total = COSTS_N_INSNS (3); else @@ -1870,8 +1937,19 @@ sh_rtx_costs (rtx x, int code, int outer_code, int *total) else *total = 10; return true; + case CONST_VECTOR: + if (x == CONST0_RTX (GET_MODE (x))) + *total = 0; + else if (sh_1el_vec (x, VOIDmode)) + *total = outer_code != SET; + if (sh_rep_vec (x, VOIDmode)) + *total = ((GET_MODE_UNIT_SIZE (GET_MODE (x)) + 3) / 4 + + (outer_code != SET)); + *total = COSTS_N_INSNS (3) + (outer_code != SET); + return true; case PLUS: + case MINUS: *total = COSTS_N_INSNS (addsubcosts (x)); return true; @@ -1896,6 +1974,15 @@ sh_rtx_costs (rtx x, int code, int outer_code, int *total) *total = COSTS_N_INSNS (20); return true; + case PARALLEL: + if (sh_1el_vec (x, VOIDmode)) + *total = outer_code != SET; + if (sh_rep_vec (x, VOIDmode)) + *total = ((GET_MODE_UNIT_SIZE (GET_MODE (x)) + 3) / 4 + + (outer_code != SET)); + *total = COSTS_N_INSNS (3) + (outer_code != SET); + return true; + case FLOAT: case FIX: *total = 100; @@ -2024,10 +2111,10 @@ gen_shifty_op (int code, rtx *operands) } else if (value == 0) { - /* This can happen when not optimizing. We must output something here - to prevent the compiler from dying in final.c after the try_split - call. */ - emit_insn (gen_nop ()); + /* This can happen even when optimizing, if there were subregs before + reload. Don't output a nop here, as this is never optimized away; + use a no-op move instead. */ + emit_insn (gen_rtx_SET (VOIDmode, operands[0], operands[0])); return; } @@ -2076,10 +2163,8 @@ gen_shifty_hi_op (int code, rtx *operands) int expand_ashiftrt (rtx *operands) { - rtx sym; rtx wrk; char func[18]; - tree func_name; int value; if (TARGET_SH3) @@ -2107,6 +2192,16 @@ expand_ashiftrt (rtx *operands) if (value == 31) { + /* If we are called from abs expansion, arrange things so that we + we can use a single MT instruction that doesn't clobber the source, + if LICM can hoist out the load of the constant zero. */ + if (currently_expanding_to_rtl) + { + emit_insn (gen_cmpgtsi_t (force_reg (SImode, CONST0_RTX (SImode)), + operands[1])); + emit_insn (gen_mov_neg_si_t (operands[0])); + return 1; + } emit_insn (gen_ashrsi2_31 (operands[0], operands[1])); return 1; } @@ -2136,9 +2231,7 @@ expand_ashiftrt (rtx *operands) /* Load the value into an arg reg and call a helper. */ emit_move_insn (gen_rtx_REG (SImode, 4), operands[1]); sprintf (func, "__ashiftrt_r4_%d", value); - func_name = get_identifier (func); - sym = function_symbol (IDENTIFIER_POINTER (func_name)); - emit_move_insn (wrk, sym); + function_symbol (wrk, func, SFUNC_STATIC); emit_insn (gen_ashrsi3_n (GEN_INT (value), wrk)); emit_move_insn (operands[0], gen_rtx_REG (SImode, 4)); return 1; @@ -2680,6 +2773,8 @@ gen_shl_sext (rtx dest, rtx left_rtx, rtx size_rtx, rtx source) rtx gen_datalabel_ref (rtx sym) { + const char *str; + if (GET_CODE (sym) == LABEL_REF) return gen_rtx_CONST (GET_MODE (sym), gen_rtx_UNSPEC (GET_MODE (sym), @@ -2688,6 +2783,12 @@ gen_datalabel_ref (rtx sym) gcc_assert (GET_CODE (sym) == SYMBOL_REF); + str = XSTR (sym, 0); + /* Share all SYMBOL_REF strings with the same value - that is important + for cse. 
*/ + str = IDENTIFIER_POINTER (get_identifier (str)); + XSTR (sym, 0) = str; + return sym; } @@ -2758,10 +2859,10 @@ typedef struct } pool_node; /* The maximum number of constants that can fit into one pool, since - the pc relative range is 0...1020 bytes and constants are at least 4 - bytes long. */ + constants in the range 0..510 are at least 2 bytes long, and in the + range from there to 1018 at least 4 bytes. */ -#define MAX_POOL_SIZE (1020/4) +#define MAX_POOL_SIZE 372 static pool_node pool_vector[MAX_POOL_SIZE]; static int pool_size; static rtx pool_window_label; @@ -3261,11 +3362,6 @@ find_barrier (int num_mova, rtx mova, rtx from) if (num_mova) si_limit -= GET_MODE_SIZE (mode); } - - /* See the code in machine_dependent_reorg, which has a similar if - statement that generates a new mova insn in many cases. */ - if (GET_CODE (dst) == REG && FP_ANY_REGISTER_P (REGNO (dst))) - inc += 2; } if (mova_p (from)) @@ -5292,6 +5388,8 @@ sh_media_register_for_return (void) if (lookup_attribute ("interrupt_handler", DECL_ATTRIBUTES (current_function_decl))) return -1; + if (sh_cfun_interrupt_handler_p ()) + return -1; tr0_used = flag_pic && regs_ever_live[PIC_OFFSET_TABLE_REGNUM]; @@ -5760,12 +5858,12 @@ sh_expand_prologue (void) if (SHMEDIA_REGS_STACK_ADJUST ()) { - emit_move_insn (gen_rtx_REG (Pmode, R0_REG), - function_symbol (TARGET_FPU_ANY - ? "__GCC_push_shmedia_regs" - : "__GCC_push_shmedia_regs_nofpu")); /* This must NOT go through the PLT, otherwise mach and macl may be clobbered. */ + function_symbol (gen_rtx_REG (Pmode, R0_REG), + (TARGET_FPU_ANY + ? "__GCC_push_shmedia_regs" + : "__GCC_push_shmedia_regs_nofpu"), SFUNC_GOT); emit_insn (gen_shmedia_save_restore_regs_compact (GEN_INT (-SHMEDIA_REGS_STACK_ADJUST ()))); } @@ -5795,8 +5893,8 @@ sh_expand_prologue (void) { /* This must NOT go through the PLT, otherwise mach and macl may be clobbered. */ - emit_move_insn (gen_rtx_REG (Pmode, R0_REG), - function_symbol ("__GCC_shcompact_incoming_args")); + function_symbol (gen_rtx_REG (Pmode, R0_REG), + "__GCC_shcompact_incoming_args", SFUNC_GOT); emit_insn (gen_shcompact_incoming_args ()); } } @@ -5871,10 +5969,10 @@ sh_expand_epilogue (bool sibcall_p) if (SHMEDIA_REGS_STACK_ADJUST ()) { - emit_move_insn (gen_rtx_REG (Pmode, R0_REG), - function_symbol (TARGET_FPU_ANY - ? "__GCC_pop_shmedia_regs" - : "__GCC_pop_shmedia_regs_nofpu")); + function_symbol (gen_rtx_REG (Pmode, R0_REG), + (TARGET_FPU_ANY + ? "__GCC_pop_shmedia_regs" + : "__GCC_pop_shmedia_regs_nofpu"), SFUNC_GOT); /* This must NOT go through the PLT, otherwise mach and macl may be clobbered. */ emit_insn (gen_shmedia_save_restore_regs_compact @@ -7317,6 +7415,18 @@ sh_target_switches[] = TARGET_SWITCHES; const char * sh_pch_valid_p (const void *data_p, size_t len) { +#ifdef TARGET_OPTIONS + /* ??? We have a copy of this in toplev.c, but it is static. 
*/ + static const struct + { + const char *const prefix; + const char **const variable; + const char *const description; + const char *const value; + } + target_options[] = TARGET_OPTIONS; +#endif + const char *data = (const char *)data_p; const char *flag_that_differs = NULL; size_t i; @@ -7437,6 +7547,14 @@ general_movsrc_operand (rtx op, enum machine_mode mode) && system_reg_operand (XEXP (op, 0), mode))) return 0; + if (TARGET_SHMEDIA + && (GET_CODE (op) == PARALLEL || GET_CODE (op) == CONST_VECTOR) + && sh_rep_vec (op, mode)) + return 1; + if (TARGET_SHMEDIA && 1 + && GET_CODE (op) == SUBREG && GET_MODE (op) == mode + && SUBREG_REG (op) == const0_rtx && subreg_lowpart_p (op)) + /* FIXME */ abort (); // return 1; return general_operand (op, mode); } @@ -7449,6 +7567,10 @@ general_movdst_operand (rtx op, enum machine_mode mode) /* Only pre dec allowed. */ if (GET_CODE (op) == MEM && GET_CODE (XEXP (op, 0)) == POST_INC) return 0; + if (mode == DImode && TARGET_SHMEDIA && GET_CODE (op) == SUBREG + && GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))) < 8 + && ! (high_life_started || reload_completed)) + return 0; return general_operand (op, mode); } @@ -7474,6 +7596,28 @@ arith_reg_operand (rtx op, enum machine_mode mode) && (regno != FPUL_REG || TARGET_SH4) && regno != MACH_REG && regno != MACL_REG); } + /* Allow a no-op sign extension - compare LOAD_EXTEND_OP. + We allow SImode here, as not using an FP register is just a matter of + proper register allocation. */ + if (TARGET_SHMEDIA + && GET_MODE (op) == DImode && GET_CODE (op) == SIGN_EXTEND + && GET_MODE (XEXP (op, 0)) == SImode + && GET_CODE (XEXP (op, 0)) != SUBREG) + return register_operand (XEXP (op, 0), VOIDmode); +#if 0 /* Can't do this because of PROMOTE_MODE for unsigned vars. */ + if (GET_MODE (op) == SImode && GET_CODE (op) == SIGN_EXTEND + && GET_MODE (XEXP (op, 0)) == HImode + && GET_CODE (XEXP (op, 0)) == REG + && REGNO (XEXP (op, 0)) <= LAST_GENERAL_REG) + return register_operand (XEXP (op, 0), VOIDmode); +#endif + if (GET_MODE_CLASS (GET_MODE (op)) == MODE_VECTOR_INT + && GET_CODE (op) == SUBREG + && GET_MODE (SUBREG_REG (op)) == DImode + && GET_CODE (SUBREG_REG (op)) == SIGN_EXTEND + && GET_MODE (XEXP (SUBREG_REG (op), 0)) == SImode + && GET_CODE (XEXP (SUBREG_REG (op), 0)) != SUBREG) + return register_operand (XEXP (SUBREG_REG (op), 0), VOIDmode); return 0; } @@ -7484,7 +7628,21 @@ int arith_reg_dest (rtx op, enum machine_mode mode) { if (mode == DImode && GET_CODE (op) == SUBREG - && GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))) < 8) + && GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))) < 8 + && TARGET_SHMEDIA) + return 0; + return arith_reg_operand (op, mode); +} + +/* Like arith_reg_operand, but for register source operands of narrow + logical SHMEDIA operations: forbid subregs of DImode / TImode regs. */ +int +logical_reg_operand (rtx op, enum machine_mode mode) +{ + if (TARGET_SHMEDIA + && GET_CODE (op) == SUBREG + && GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))) > 4 + && mode != DImode) return 0; return arith_reg_operand (op, mode); } @@ -7522,6 +7680,15 @@ fp_arith_reg_operand (rtx op, enum machine_mode mode) return 0; } +int +fp_arith_reg_dest (rtx op, enum machine_mode mode) +{ + if (mode == DImode && GET_CODE (op) == SUBREG + && GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))) < 8) + return 0; + return fp_arith_reg_operand (op, mode); +} + /* Returns 1 if OP is a valid source operand for an arithmetic insn. 
*/ int @@ -7540,6 +7707,14 @@ arith_operand (rtx op, enum machine_mode mode) if (GET_CODE (op) == CONST_INT || EXTRA_CONSTRAINT_C16 (op)) return 1; + else if (GET_CODE (op) == TRUNCATE + && ! system_reg_operand (XEXP (op, 0), VOIDmode) + && (mode == VOIDmode || mode == GET_MODE (op)) + && (GET_MODE_SIZE (GET_MODE (op)) + < GET_MODE_SIZE (GET_MODE (XEXP (op, 0)))) + && (! FP_REGISTER_P (REGNO (XEXP (op, 0))) + || GET_MODE_SIZE (GET_MODE (op)) == 4)) + return register_operand (XEXP (op, 0), VOIDmode); else return 0; } @@ -7563,14 +7738,34 @@ arith_reg_or_0_operand (rtx op, enum machine_mode mode) return 0; } -/* Return 1 if OP is a valid source operand for an SHmedia operation - that takes either a register or a 6-bit immediate. */ +/* Return 1 if OP is a valid source operand for xor. */ int -shmedia_6bit_operand (rtx op, enum machine_mode mode) +xor_operand (rtx op, enum machine_mode mode) { - return (arith_reg_operand (op, mode) - || (GET_CODE (op) == CONST_INT && CONST_OK_FOR_I06 (INTVAL (op)))); + if (GET_CODE (op) == CONST_INT) + return (TARGET_SHMEDIA + ? (CONST_OK_FOR_I06 (INTVAL (op)) + || (no_new_pseudos && INTVAL (op) == 0xff)) + : CONST_OK_FOR_K08 (INTVAL (op))); + if (TARGET_SHMEDIA + && mode != DImode && GET_CODE (op) == SUBREG + && GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))) > 4) + return 0; + return arith_reg_operand (op, mode); +} + +/* Return 1 if OP is a valid source operand for shmedia cmpgt / cmpgtu. */ +int +cmp_operand (rtx op, enum machine_mode mode) +{ + if (GET_CODE (op) == CONST_INT && CONST_OK_FOR_N (INTVAL (op))) + return 1; + if (TARGET_SHMEDIA + && mode != DImode && GET_CODE (op) == SUBREG + && GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))) > 4) + return 0; + return arith_reg_operand (op, mode); } /* Returns 1 if OP is a valid source operand for a logical operation. */ @@ -7578,6 +7773,11 @@ shmedia_6bit_operand (rtx op, enum machine_mode mode) int logical_operand (rtx op, enum machine_mode mode) { + if (TARGET_SHMEDIA + && mode != DImode && GET_CODE (op) == SUBREG + && GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))) > 4) + return 0; + if (arith_reg_operand (op, mode)) return 1; @@ -7788,7 +7988,7 @@ equality_comparison_operator (rtx op, enum machine_mode mode) int greater_comparison_operator (rtx op, enum machine_mode mode) { - if (mode != VOIDmode && GET_MODE (op) == mode) + if (mode != VOIDmode && GET_MODE (op) != mode) return 0; switch (GET_CODE (op)) { @@ -7805,7 +8005,7 @@ greater_comparison_operator (rtx op, enum machine_mode mode) int less_comparison_operator (rtx op, enum machine_mode mode) { - if (mode != VOIDmode && GET_MODE (op) == mode) + if (mode != VOIDmode && GET_MODE (op) != mode) return 0; switch (GET_CODE (op)) { @@ -7819,12 +8019,45 @@ less_comparison_operator (rtx op, enum machine_mode mode) } } +int +shift_operator (rtx op, enum machine_mode mode) +{ + if (mode != VOIDmode && GET_MODE (op) != mode) + return 0; + switch (GET_CODE (op)) + { + case ASHIFT: + case ASHIFTRT: + case LSHIFTRT: + return 1; + default: + return 0; + } +} + +int +logical_operator (rtx op, enum machine_mode mode) +{ + if (mode != VOIDmode && GET_MODE (op) != mode) + return 0; + switch (GET_CODE (op)) + { + case AND: + case IOR: + case XOR: + return 1; + default: + return 0; + } +} + /* Accept pseudos and branch target registers. */ int target_reg_operand (rtx op, enum machine_mode mode) { - if (mode != DImode - || GET_MODE (op) != DImode) + if (mode == VOIDmode + ? 
GET_MODE (op) != Pmode && GET_MODE (op) != PDImode + : mode != GET_MODE (op)) return 0; if (GET_CODE (op) == SUBREG) @@ -7848,10 +8081,10 @@ target_reg_operand (rtx op, enum machine_mode mode) int target_operand (rtx op, enum machine_mode mode) { - if (mode != DImode) + if (mode != VOIDmode && mode != Pmode) return 0; - if ((GET_MODE (op) == DImode || GET_MODE (op) == VOIDmode) + if ((GET_MODE (op) == Pmode || GET_MODE (op) == VOIDmode) && EXTRA_CONSTRAINT_Csy (op)) return ! reload_completed; @@ -7897,6 +8130,12 @@ extend_reg_or_0_operand (rtx op, enum machine_mode mode) } int +minuend_operand (rtx op, enum machine_mode mode) +{ + return op == constm1_rtx || extend_reg_or_0_operand (op, mode); +} + +int general_extend_operand (rtx op, enum machine_mode mode) { return (GET_CODE (op) == TRUNCATE @@ -7905,6 +8144,32 @@ general_extend_operand (rtx op, enum machine_mode mode) } int +ua_address_operand (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED) +{ + if (GET_CODE (op) == PLUS + && (GET_CODE (XEXP (op, 1)) != CONST_INT + || ! CONST_OK_FOR_I06 (INTVAL (XEXP (op, 1))))) + return 0; + return address_operand (op, QImode); +} + +int +cache_address_operand (rtx op, enum machine_mode mode) +{ + if (GET_CODE (op) == PLUS) + { + if (GET_CODE (XEXP (op, 0)) != REG) + return 0; + if (GET_CODE (XEXP (op, 1)) != CONST_INT + || (INTVAL (XEXP (op, 1)) & 31)) + return 0; + } + else if (GET_CODE (op) != REG) + return 0; + return address_operand (op, mode); +} + +int inqhi_operand (rtx op, enum machine_mode mode) { if (GET_CODE (op) != TRUNCATE || mode != GET_MODE (op)) @@ -8499,6 +8764,14 @@ mark_constant_pool_use (rtx x) return lab; } + +int +ua_offset (c, mode) + rtx c; + enum machine_mode mode ATTRIBUTE_UNUSED; +{ + return GET_CODE (c) == CONST_INT && CONST_OK_FOR_I06 (INTVAL (c)); +} /* Return true if it's possible to redirect BRANCH1 to the destination of an unconditional jump BRANCH2. We only want to do this if the @@ -8566,11 +8839,58 @@ sh_adjust_cost (rtx insn, rtx link ATTRIBUTE_UNUSED, rtx dep_insn, int cost) /* On SHmedia, if the dependence is an anti-dependence or output-dependence, there is no cost. */ if (REG_NOTE_KIND (link) != 0) - cost = 0; + { + /* However, dependencies between target register loads and + uses of the register in a subsequent block that are separated + by a conditional branch are not modelled - we have to do with + the anti-dependency between the target register load and the + conditional branch that ends the current block. */ + if (REG_NOTE_KIND (link) == REG_DEP_ANTI + && GET_CODE (PATTERN (dep_insn)) == SET + && (get_attr_type (dep_insn) == TYPE_PT_MEDIA + || get_attr_type (dep_insn) == TYPE_PTABS_MEDIA) + && get_attr_type (insn) == TYPE_CBRANCH_MEDIA) + { + int orig_cost = cost; + rtx note = find_reg_note (insn, REG_BR_PROB, 0); + rtx target = ((! note + || INTVAL (XEXP (note, 0)) * 2 < REG_BR_PROB_BASE) + ? insn : JUMP_LABEL (insn)); + /* On the likely path, the branch costs 1, on the unlikely path, + it costs 3. */ + cost--; + do + target = next_active_insn (target); + while (target && ! flow_dependent_p (target, dep_insn) + && --cost > 0); + /* If two branches are executed in immediate succession, with the + first branch properly predicted, this causes a stall at the + second branch, hence we won't need the target for the + second branch for two cycles after the launch of the first + branch. 
*/ + if (cost > orig_cost - 2) + cost = orig_cost - 2; + } + else + cost = 0; + } - if (get_attr_is_mac_media (insn) - && get_attr_is_mac_media (dep_insn)) - cost = 1; + else if (get_attr_is_mac_media (insn) + && get_attr_is_mac_media (dep_insn)) + cost = 1; + + else if (! reload_completed + && GET_CODE (PATTERN (insn)) == SET + && GET_CODE (SET_SRC (PATTERN (insn))) == FLOAT + && GET_CODE (PATTERN (dep_insn)) == SET + && fp_arith_reg_operand (SET_SRC (PATTERN (dep_insn)), VOIDmode) + && cost < 4) + cost = 4; + /* Schedule the ptabs for a casesi_jump_media in preference to stuff + that is needed at the target. */ + else if (get_attr_type (insn) == TYPE_JUMP_MEDIA + && ! flow_dependent_p (insn, dep_insn)) + cost--; } else if (REG_NOTE_KIND (link) == 0) { @@ -8599,7 +8919,9 @@ sh_adjust_cost (rtx insn, rtx link ATTRIBUTE_UNUSED, rtx dep_insn, int cost) if (GET_CODE (call) == SET) call = SET_SRC (call); if (GET_CODE (call) == CALL && GET_CODE (XEXP (call, 0)) == MEM - && ! reg_set_p (XEXP (XEXP (call, 0), 0), dep_insn)) + /* sibcalli_thunk uses a symbol_ref in an unspec. */ + && (GET_CODE (XEXP (XEXP (call, 0), 0)) == UNSPEC + || ! reg_set_p (XEXP (XEXP (call, 0), 0), dep_insn))) cost = 0; } /* Likewise, the most timing critical input for an sfuncs call @@ -9018,8 +9340,38 @@ sh_target_reg_class (void) static bool sh_optimize_target_register_callee_saved (bool after_prologue_epilogue_gen) { - return (shmedia_space_reserved_for_target_registers - && (! after_prologue_epilogue_gen || TARGET_SAVE_ALL_TARGET_REGS)); + HARD_REG_SET dummy; + rtx insn; + + if (! shmedia_space_reserved_for_target_registers) + return 0; + if (after_prologue_epilogue_gen && ! TARGET_SAVE_ALL_TARGET_REGS) + return 0; + if (calc_live_regs (&dummy) >= 6 * 8) + return 1; + /* This is a borderline case. See if we got a nested loop, or a loop + with a call, or with more than 4 labels inside. */ + for (insn = get_insns(); insn; insn = NEXT_INSN (insn)) + { + if (GET_CODE (insn) == NOTE + && NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_BEG) + { + int labels = 0; + + do + { + insn = NEXT_INSN (insn); + if ((GET_CODE (insn) == NOTE + && NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_BEG) + || GET_CODE (insn) == CALL_INSN + || (GET_CODE (insn) == CODE_LABEL && ++labels > 4)) + return 1; + } + while (GET_CODE (insn) != NOTE + || NOTE_LINE_NUMBER (insn) != NOTE_INSN_LOOP_END); + } + } + return 0; } static bool @@ -9190,7 +9542,8 @@ sh_initialize_trampoline (rtx tramp, rtx fnaddr, rtx cxt) if (TARGET_HARVARD) { if (TARGET_USERMODE) - emit_library_call (function_symbol ("__ic_invalidate"), + emit_library_call (function_symbol (NULL, "__ic_invalidate", + FUNCTION_ORDINARY), 0, VOIDmode, 1, tramp, SImode); else emit_insn (gen_ic_invalidate_line (tramp)); @@ -9201,13 +9554,18 @@ sh_initialize_trampoline (rtx tramp, rtx fnaddr, rtx cxt) receives arguments ``by reference'' will have them stored in its own stack frame, so it must not pass pointers or references to these arguments to other functions by means of sibling calls. */ +/* If PIC, we cannot make sibling calls to global functions + because the PLT requires r12 to be live. */ static bool sh_function_ok_for_sibcall (tree decl, tree exp ATTRIBUTE_UNUSED) { - return (decl + return (1 && (! TARGET_SHCOMPACT || current_function_args_info.stack_regs == 0) - && ! sh_cfun_interrupt_handler_p ()); + && ! sh_cfun_interrupt_handler_p () + && (! flag_pic + || (decl && ! TREE_PUBLIC (decl)) + || (decl && DECL_VISIBILITY (decl) != VISIBILITY_DEFAULT))); } /* Machine specific built-in functions. 
*/ @@ -9221,6 +9579,7 @@ struct builtin_description /* describe number and signedness of arguments; arg[0] == result (1: unsigned, 2: signed, 4: don't care, 8: pointer 0: no argument */ +/* 9: 64 bit pointer, 10: 32 bit pointer */ static const char signature_args[][4] = { #define SH_BLTIN_V2SI2 0 @@ -9246,28 +9605,34 @@ static const char signature_args[][4] = #define SH_BLTIN_SISF 10 { 4, 2 }, #define SH_BLTIN_LDUA_L 11 - { 2, 8 }, + { 2, 10 }, #define SH_BLTIN_LDUA_Q 12 - { 1, 8 }, + { 1, 10 }, #define SH_BLTIN_STUA_L 13 - { 0, 8, 2 }, + { 0, 10, 2 }, #define SH_BLTIN_STUA_Q 14 - { 0, 8, 1 }, -#define SH_BLTIN_UDI 15 - { 0, 8, 1 }, -#define SH_BLTIN_NUM_SHARED_SIGNATURES 16 -#define SH_BLTIN_2 16 -#define SH_BLTIN_SU 16 + { 0, 10, 1 }, +#define SH_BLTIN_LDUA_L64 15 + { 2, 9 }, +#define SH_BLTIN_LDUA_Q64 16 + { 1, 9 }, +#define SH_BLTIN_STUA_L64 17 + { 0, 9, 2 }, +#define SH_BLTIN_STUA_Q64 18 + { 0, 9, 1 }, +#define SH_BLTIN_NUM_SHARED_SIGNATURES 19 +#define SH_BLTIN_2 19 +#define SH_BLTIN_SU 19 { 1, 2 }, -#define SH_BLTIN_3 17 -#define SH_BLTIN_SUS 17 +#define SH_BLTIN_3 20 +#define SH_BLTIN_SUS 20 { 2, 2, 1 }, -#define SH_BLTIN_PSSV 18 +#define SH_BLTIN_PSSV 21 { 0, 8, 2, 2 }, -#define SH_BLTIN_XXUU 19 -#define SH_BLTIN_UUUU 19 +#define SH_BLTIN_XXUU 22 +#define SH_BLTIN_UUUU 22 { 1, 1, 1, 1 }, -#define SH_BLTIN_PV 20 +#define SH_BLTIN_PV 23 { 0, 8 }, }; /* mcmv: operands considered unsigned. */ @@ -9285,10 +9650,7 @@ static const struct builtin_description bdesc[] = { CODE_FOR_ssaddv2si3,"__builtin_ssaddv2si3", SH_BLTIN_V2SI3 }, { CODE_FOR_usaddv8qi3,"__builtin_usaddv8qi3", SH_BLTIN_V8QI3 }, { CODE_FOR_ssaddv4hi3,"__builtin_ssaddv4hi3", SH_BLTIN_V4HI3 }, -#if 0 - { CODE_FOR_alloco32, "__builtin_sh_media_ALLOCO", SH_BLTIN_PV }, - { CODE_FOR_alloco64, "__builtin_sh_media_ALLOCO", SH_BLTIN_PV }, -#endif + { CODE_FOR_alloco_i, "__builtin_sh_media_ALLOCO", SH_BLTIN_PV }, { CODE_FOR_negcmpeqv8qi,"__builtin_sh_media_MCMPEQ_B", SH_BLTIN_V8QI3 }, { CODE_FOR_negcmpeqv2si,"__builtin_sh_media_MCMPEQ_L", SH_BLTIN_V2SI3 }, { CODE_FOR_negcmpeqv4hi,"__builtin_sh_media_MCMPEQ_W", SH_BLTIN_V4HI3 }, @@ -9299,13 +9661,13 @@ static const struct builtin_description bdesc[] = { CODE_FOR_mcnvs_lw, "__builtin_sh_media_MCNVS_LW", SH_BLTIN_3 }, { CODE_FOR_mcnvs_wb, "__builtin_sh_media_MCNVS_WB", SH_BLTIN_V4HI2V8QI }, { CODE_FOR_mcnvs_wub, "__builtin_sh_media_MCNVS_WUB", SH_BLTIN_V4HI2V8QI }, - { CODE_FOR_mextr1, "__builtin_sh_media_MEXTR1", SH_BLTIN_UDI }, - { CODE_FOR_mextr2, "__builtin_sh_media_MEXTR2", SH_BLTIN_UDI }, - { CODE_FOR_mextr3, "__builtin_sh_media_MEXTR3", SH_BLTIN_UDI }, - { CODE_FOR_mextr4, "__builtin_sh_media_MEXTR4", SH_BLTIN_UDI }, - { CODE_FOR_mextr5, "__builtin_sh_media_MEXTR5", SH_BLTIN_UDI }, - { CODE_FOR_mextr6, "__builtin_sh_media_MEXTR6", SH_BLTIN_UDI }, - { CODE_FOR_mextr7, "__builtin_sh_media_MEXTR7", SH_BLTIN_UDI }, + { CODE_FOR_mextr1, "__builtin_sh_media_MEXTR1", SH_BLTIN_V8QI3 }, + { CODE_FOR_mextr2, "__builtin_sh_media_MEXTR2", SH_BLTIN_V8QI3 }, + { CODE_FOR_mextr3, "__builtin_sh_media_MEXTR3", SH_BLTIN_V8QI3 }, + { CODE_FOR_mextr4, "__builtin_sh_media_MEXTR4", SH_BLTIN_V8QI3 }, + { CODE_FOR_mextr5, "__builtin_sh_media_MEXTR5", SH_BLTIN_V8QI3 }, + { CODE_FOR_mextr6, "__builtin_sh_media_MEXTR6", SH_BLTIN_V8QI3 }, + { CODE_FOR_mextr7, "__builtin_sh_media_MEXTR7", SH_BLTIN_V8QI3 }, { CODE_FOR_mmacfx_wl, "__builtin_sh_media_MMACFX_WL", SH_BLTIN_MAC_HISI }, { CODE_FOR_mmacnfx_wl,"__builtin_sh_media_MMACNFX_WL", SH_BLTIN_MAC_HISI }, { CODE_FOR_mulv2si3, "__builtin_mulv2si3", 
SH_BLTIN_V2SI3, }, @@ -9342,8 +9704,10 @@ static const struct builtin_description bdesc[] = { CODE_FOR_fsina_s, "__builtin_sh_media_FSINA_S", SH_BLTIN_SISF }, { CODE_FOR_fipr, "__builtin_sh_media_FIPR_S", SH_BLTIN_3 }, { CODE_FOR_ftrv, "__builtin_sh_media_FTRV_S", SH_BLTIN_3 }, + { CODE_FOR_mac_media, "__builtin_sh_media_FMAC_S", SH_BLTIN_3 }, + { CODE_FOR_sqrtdf2, "__builtin_sh_media_FSQRT_D", SH_BLTIN_2 }, + { CODE_FOR_sqrtsf2, "__builtin_sh_media_FSQRT_S", SH_BLTIN_2 }, { CODE_FOR_fsrra_s, "__builtin_sh_media_FSRRA_S", SH_BLTIN_2 }, -#if 0 { CODE_FOR_ldhi_l, "__builtin_sh_media_LDHI_L", SH_BLTIN_LDUA_L }, { CODE_FOR_ldhi_q, "__builtin_sh_media_LDHI_Q", SH_BLTIN_LDUA_Q }, { CODE_FOR_ldlo_l, "__builtin_sh_media_LDLO_L", SH_BLTIN_LDUA_L }, @@ -9352,21 +9716,17 @@ static const struct builtin_description bdesc[] = { CODE_FOR_sthi_q, "__builtin_sh_media_STHI_Q", SH_BLTIN_STUA_Q }, { CODE_FOR_stlo_l, "__builtin_sh_media_STLO_L", SH_BLTIN_STUA_L }, { CODE_FOR_stlo_q, "__builtin_sh_media_STLO_Q", SH_BLTIN_STUA_Q }, - { CODE_FOR_ldhi_l64, "__builtin_sh_media_LDHI_L", SH_BLTIN_LDUA_L }, - { CODE_FOR_ldhi_q64, "__builtin_sh_media_LDHI_Q", SH_BLTIN_LDUA_Q }, - { CODE_FOR_ldlo_l64, "__builtin_sh_media_LDLO_L", SH_BLTIN_LDUA_L }, - { CODE_FOR_ldlo_q64, "__builtin_sh_media_LDLO_Q", SH_BLTIN_LDUA_Q }, - { CODE_FOR_sthi_l64, "__builtin_sh_media_STHI_L", SH_BLTIN_STUA_L }, - { CODE_FOR_sthi_q64, "__builtin_sh_media_STHI_Q", SH_BLTIN_STUA_Q }, - { CODE_FOR_stlo_l64, "__builtin_sh_media_STLO_L", SH_BLTIN_STUA_L }, - { CODE_FOR_stlo_q64, "__builtin_sh_media_STLO_Q", SH_BLTIN_STUA_Q }, -#endif + { CODE_FOR_ldhi_l64, "__builtin_sh_media_LDHI_L", SH_BLTIN_LDUA_L64 }, + { CODE_FOR_ldhi_q64, "__builtin_sh_media_LDHI_Q", SH_BLTIN_LDUA_Q64 }, + { CODE_FOR_ldlo_l64, "__builtin_sh_media_LDLO_L", SH_BLTIN_LDUA_L64 }, + { CODE_FOR_ldlo_q64, "__builtin_sh_media_LDLO_Q", SH_BLTIN_LDUA_Q64 }, + { CODE_FOR_sthi_l64, "__builtin_sh_media_STHI_L", SH_BLTIN_STUA_L64 }, + { CODE_FOR_sthi_q64, "__builtin_sh_media_STHI_Q", SH_BLTIN_STUA_Q64 }, + { CODE_FOR_stlo_l64, "__builtin_sh_media_STLO_L", SH_BLTIN_STUA_L64 }, + { CODE_FOR_stlo_q64, "__builtin_sh_media_STLO_Q", SH_BLTIN_STUA_Q64 }, { CODE_FOR_nsb, "__builtin_sh_media_NSB", SH_BLTIN_SU }, { CODE_FOR_byterev, "__builtin_sh_media_BYTEREV", SH_BLTIN_2 }, -#if 0 - { CODE_FOR_prefetch32,"__builtin_sh_media_PREFO", SH_BLTIN_PSSV }, - { CODE_FOR_prefetch64,"__builtin_sh_media_PREFO", SH_BLTIN_PSSV } -#endif + { CODE_FOR_prefetch, "__builtin_sh_media_PREFO", SH_BLTIN_PSSV }, }; static void @@ -9378,7 +9738,7 @@ sh_media_init_builtins (void) memset (shared, 0, sizeof shared); for (d = bdesc; d - bdesc < (int) ARRAY_SIZE (bdesc); d++) { - tree type, arg_type; + tree type, arg_type = 0; int signature = d->signature; int i; @@ -9388,8 +9748,9 @@ sh_media_init_builtins (void) { int has_result = signature_args[signature][0] != 0; - if (signature_args[signature][1] == 8 - && (insn_data[d->icode].operand[has_result].mode != Pmode)) + if ((signature_args[signature][1] & 8) + && (((signature_args[signature][1] & 1) && TARGET_SHMEDIA32) + || ((signature_args[signature][1] & 2) && TARGET_SHMEDIA64))) continue; if (! 
TARGET_FPU_ANY && FLOAT_MODE_P (insn_data[d->icode].operand[0].mode)) @@ -9400,12 +9761,12 @@ sh_media_init_builtins (void) int arg = signature_args[signature][i]; int opno = i - 1 + has_result; - if (arg == 8) + if (arg & 8) arg_type = ptr_type_node; else if (arg) - arg_type = ((*lang_hooks.types.type_for_mode) - (insn_data[d->icode].operand[opno].mode, - (arg & 1))); + arg_type = (*lang_hooks.types.type_for_mode) + (insn_data[d->icode].operand[opno].mode, + (arg & 1)); else if (i) continue; else @@ -9480,7 +9841,7 @@ sh_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED, enum machine_mode tmode = VOIDmode; int nop = 0, i; rtx op[4]; - rtx pat; + rtx pat = 0; if (signature_args[signature][0]) { @@ -9501,6 +9862,7 @@ sh_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED, { tree arg; enum machine_mode opmode, argmode; + tree optype; if (! signature_args[signature][i]) break; @@ -9508,11 +9870,19 @@ sh_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED, if (arg == error_mark_node) return const0_rtx; arglist = TREE_CHAIN (arglist); - opmode = insn_data[icode].operand[nop].mode; + if (signature_args[signature][i] & 8) + { + opmode = ptr_mode; + optype = ptr_type_node; + } + else + { + opmode = insn_data[icode].operand[nop].mode; + optype = (*lang_hooks.types.type_for_mode) (opmode, 0); + } argmode = TYPE_MODE (TREE_TYPE (arg)); if (argmode != opmode) - arg = build1 (NOP_EXPR, - (*lang_hooks.types.type_for_mode) (opmode, 0), arg); + arg = build1 (NOP_EXPR, optype, arg); op[nop] = expand_expr (arg, NULL_RTX, opmode, 0); if (! (*insn_data[icode].operand[nop].predicate) (op[nop], opmode)) op[nop] = copy_to_mode_reg (opmode, op[nop]); @@ -9662,6 +10032,16 @@ sh_register_move_cost (enum machine_mode mode, || ((dstclass) == TARGET_REGS && ! REGCLASS_HAS_GENERAL_REG (srcclass))) return 20; + /* ??? ptabs faults on (value & 0x3) == 0x3 */ + if (TARGET_SHMEDIA + && ((srcclass) == TARGET_REGS || (srcclass) == SIBCALL_REGS)) + { + if (*sh_gettrcost_str) + return atoi (sh_gettrcost_str); + else if (!TARGET_PT_FIXED) + return 100; + } + if ((srcclass == FPSCR_REGS && ! REGCLASS_HAS_GENERAL_REG (dstclass)) || (dstclass == FPSCR_REGS && ! REGCLASS_HAS_GENERAL_REG (srcclass))) return 4; @@ -9689,11 +10069,43 @@ int cmpsi_operand (rtx op, enum machine_mode mode) { if (GET_CODE (op) == REG && REGNO (op) == T_REG - && GET_MODE (op) == SImode) + && GET_MODE (op) == SImode + && TARGET_SH1) return 1; return arith_operand (op, mode); } +int +shift_count_reg_operand (rtx op, enum machine_mode mode) +{ + if ((GET_CODE (op) == ZERO_EXTEND || GET_CODE (op) == SIGN_EXTEND + || (GET_CODE (op) == SUBREG && SUBREG_BYTE (op) == 0)) + && (mode == VOIDmode || mode == GET_MODE (op)) + && GET_MODE_BITSIZE (GET_MODE (XEXP (op, 0))) >= 6 + && GET_MODE_CLASS (GET_MODE (XEXP (op, 0))) == MODE_INT) + { + mode = VOIDmode; + do + op = XEXP (op, 0); + while ((GET_CODE (op) == ZERO_EXTEND || GET_CODE (op) == SIGN_EXTEND + || GET_CODE (op) == TRUNCATE) + && GET_MODE_BITSIZE (GET_MODE (XEXP (op, 0))) >= 6 + && GET_MODE_CLASS (GET_MODE (XEXP (op, 0))) == MODE_INT); + + } + return arith_reg_operand (op, mode); +} + +int +shift_count_operand (rtx op, enum machine_mode mode) +{ + return (CONSTANT_P (op) + ? (GET_CODE (op) == CONST_INT + ? 
(unsigned) INTVAL (op) < GET_MODE_BITSIZE (mode) + : nonmemory_operand (op, mode)) + : shift_count_reg_operand (op, mode)); +} + static rtx emit_load_ptr (rtx, rtx); static rtx @@ -9706,7 +10118,7 @@ emit_load_ptr (rtx reg, rtx addr) return emit_move_insn (reg, mem); } -void +static void sh_output_mi_thunk (FILE *file, tree thunk_fndecl ATTRIBUTE_UNUSED, HOST_WIDE_INT delta, HOST_WIDE_INT vcall_offset, tree function) @@ -9718,6 +10130,7 @@ sh_output_mi_thunk (FILE *file, tree thunk_fndecl ATTRIBUTE_UNUSED, int simple_add = CONST_OK_FOR_ADD (delta); int did_load = 0; rtx scratch0, scratch1, scratch2; + unsigned i; reload_completed = 1; epilogue_completed = 1; @@ -9748,18 +10161,39 @@ sh_output_mi_thunk (FILE *file, tree thunk_fndecl ATTRIBUTE_UNUSED, static chain pointer (even if you can't have nested virtual functions right now, someone might implement them sometime), and the rest of the registers are used for argument passing, are callee-saved, or reserved. */ + /* We need to check call_used_regs / fixed_regs in case -fcall_saved-reg / + -ffixed-reg has been used. */ + if (! call_used_regs[0] || fixed_regs[0]) + error ("r0 needs to be available as a call-clobbered register"); scratch0 = scratch1 = scratch2 = gen_rtx_REG (Pmode, 0); if (! TARGET_SH5) { - scratch1 = gen_rtx_REG (ptr_mode, 1); + if (call_used_regs[1] && ! fixed_regs[1]) + scratch1 = gen_rtx_REG (ptr_mode, 1); /* N.B., if not TARGET_HITACHI, register 2 is used to pass the pointer pointing where to return struct values. */ - scratch2 = gen_rtx_REG (Pmode, 3); + if (call_used_regs[3] && ! fixed_regs[3]) + scratch2 = gen_rtx_REG (Pmode, 3); } else if (TARGET_SHMEDIA) { - scratch1 = gen_rtx_REG (ptr_mode, 21); - scratch2 = gen_rtx_REG (Pmode, TR0_REG); + for (i = FIRST_GENERAL_REG; i <= LAST_GENERAL_REG; i++) + if (i != REGNO (scratch0) && + call_used_regs[i] && ! fixed_regs[i] && ! FUNCTION_ARG_REGNO_P (i)) + { + scratch1 = gen_rtx_REG (ptr_mode, i); + break; + } + if (scratch1 == scratch0) + error ("Need a second call-clobbered general purpose register"); + for (i = FIRST_TARGET_REG; i <= LAST_TARGET_REG; i++) + if (call_used_regs[i] && ! fixed_regs[i]) + { + scratch2 = gen_rtx_REG (Pmode, i); + break; + } + if (scratch2 == scratch0) + error ("Need a call-clobbered target register"); } this_value = plus_constant (this, delta); @@ -9791,7 +10225,7 @@ sh_output_mi_thunk (FILE *file, tree thunk_fndecl ATTRIBUTE_UNUSED, offset_addr = plus_constant (scratch0, vcall_offset); if (strict_memory_address_p (ptr_mode, offset_addr)) ; /* Do nothing. */ - else if (! TARGET_SH5) + else if (! TARGET_SH5 && scratch0 != scratch1) { /* scratch0 != scratch1, and we have indexed loads. Get better schedule by loading the offset into r1 and using an indexed @@ -9827,9 +10261,30 @@ sh_output_mi_thunk (FILE *file, tree thunk_fndecl ATTRIBUTE_UNUSED, TREE_USED (function) = 1; } funexp = XEXP (DECL_RTL (function), 0); - emit_move_insn (scratch2, funexp); - funexp = gen_rtx_MEM (FUNCTION_MODE, scratch2); - sibcall = emit_call_insn (gen_sibcall (funexp, const0_rtx, NULL_RTX)); + /* If the function is overridden, so is the thunk, hence we don't + need GOT addressing even if this is a public symbol. */ +#if 0 + if (TARGET_SH1 && ! 
flag_weak) + sibcall = gen_sibcalli_thunk (funexp, const0_rtx); + else +#endif + if (TARGET_SH2 && flag_pic) + { + sibcall = gen_sibcall_pcrel (funexp, const0_rtx); + XEXP (XVECEXP (sibcall, 0, 2), 0) = scratch2; + } + else + { + if (TARGET_SHMEDIA && flag_pic) + { + funexp = gen_sym2PIC (funexp); + PUT_MODE (funexp, Pmode); + } + emit_move_insn (scratch2, funexp); + funexp = gen_rtx_MEM (FUNCTION_MODE, scratch2); + sibcall = gen_sibcall (funexp, const0_rtx, NULL_RTX); + } + sibcall = emit_call_insn (sibcall); SIBLING_CALL_P (sibcall) = 1; use_reg (&CALL_INSN_FUNCTION_USAGE (sibcall), this); emit_barrier (); @@ -9882,10 +10337,49 @@ sh_output_mi_thunk (FILE *file, tree thunk_fndecl ATTRIBUTE_UNUSED, } rtx -function_symbol (const char *name) +function_symbol (rtx target, const char *name, enum sh_function_kind kind) { - rtx sym = gen_rtx_SYMBOL_REF (Pmode, name); + rtx sym; + + /* If this is not an ordinary function, the name usually comes from a + string literal or an sprintf buffer. Make sure we use the same + string consistently, so that cse will be able to unify address loads. */ + if (kind != FUNCTION_ORDINARY) + name = IDENTIFIER_POINTER (get_identifier (name)); + sym = gen_rtx_SYMBOL_REF (Pmode, name); SYMBOL_REF_FLAGS (sym) = SYMBOL_FLAG_FUNCTION; + if (flag_pic) + switch (kind) + { + case FUNCTION_ORDINARY: + break; + case SFUNC_GOT: + { + rtx reg = target ? target : gen_reg_rtx (Pmode); + + emit_insn (gen_symGOT2reg (reg, sym)); + sym = reg; + break; + } + case SFUNC_STATIC: + { + /* ??? To allow cse to work, we use GOTOFF relocations. + we could add combiner patterns to transform this into + straight pc-relative calls with sym2PIC / bsrf when + label load and function call are still 1:1 and in the + same basic block during combine. */ + rtx reg = target ? target : gen_reg_rtx (Pmode); + + emit_insn (gen_symGOTOFF2reg (reg, sym)); + sym = reg; + break; + } + } + if (target && sym != target) + { + emit_move_insn (target, sym); + return target; + } return sym; } @@ -10177,4 +10671,540 @@ lose: return 0; } +#ifdef TARGET_ADJUST_UNROLL_MAX +static int +sh_adjust_unroll_max (struct loop * loop, int insn_count, + int max_unrolled_insns, int strength_reduce_p, + int unroll_type) +{ +/* This doesn't work in 4.0 because the old unroller & loop.h is gone. */ + if (TARGET_ADJUST_UNROLL && TARGET_SHMEDIA) + { + /* Throttle back loop unrolling so that the costs of using more + targets than the eight target register we have don't outweigh + the benefits of unrolling. */ + rtx insn; + int n_labels = 0, n_calls = 0, n_exit_dest = 0, n_inner_loops = -1; + int n_barriers = 0; + rtx dest; + int i; + rtx exit_dest[8]; + int threshold; + int unroll_benefit = 0, mem_latency = 0; + int base_cost, best_cost, cost; + int factor, best_factor; + int n_dest; + unsigned max_iterations = 32767; + int n_iterations; + int need_precond = 0, precond = 0; + basic_block * bbs = get_loop_body (loop); + struct niter_desc *desc; + + /* Assume that all labels inside the loop are used from inside the + loop. If the loop has multiple entry points, it is unlikely to + be unrolled anyways. + Also assume that all calls are to different functions. That is + somewhat pessimistic, but if you have lots of calls, unrolling the + loop is not likely to gain you much in the first place. 
*/ + i = loop->num_nodes - 1; + for (insn = BB_HEAD (bbs[i]); ; ) + { + if (GET_CODE (insn) == CODE_LABEL) + n_labels++; + else if (GET_CODE (insn) == CALL_INSN) + n_calls++; + else if (GET_CODE (insn) == NOTE + && NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_BEG) + n_inner_loops++; + else if (GET_CODE (insn) == BARRIER) + n_barriers++; + if (insn != BB_END (bbs[i])) + insn = NEXT_INSN (insn); + else if (--i >= 0) + insn = BB_HEAD (bbs[i]); + else + break; + } + free (bbs); + /* One label for the loop top is normal, and it won't be duplicated by + unrolling. */ + if (n_labels <= 1) + return max_unrolled_insns; + if (n_inner_loops > 0) + return 0; + for (dest = loop->exit_labels; dest && n_exit_dest < 8; + dest = LABEL_NEXTREF (dest)) + { + for (i = n_exit_dest - 1; + i >= 0 && XEXP (dest, 0) != XEXP (exit_dest[i], 0); i--); + if (i < 0) + exit_dest[n_exit_dest++] = dest; + } + /* If the loop top and call and exit destinations are enough to fill up + the target registers, we're unlikely to do any more damage by + unrolling. */ + if (n_calls + n_exit_dest >= 7) + return max_unrolled_insns; + + /* ??? In the new loop unroller, there is no longer any strength + reduction information available. Thus, when it comes to unrolling, + we know the cost of everything, but we know the value of nothing. */ +#if 0 + if (strength_reduce_p + && (unroll_type == LPT_UNROLL_RUNTIME + || unroll_type == LPT_UNROLL_CONSTANT + || unroll_type == LPT_PEEL_COMPLETELY)) + { + struct loop_ivs *ivs = LOOP_IVS (loop); + struct iv_class *bl; + + /* We'll save one compare-and-branch in each loop body copy + but the last one. */ + unroll_benefit = 1; + /* Assess the benefit of removing biv & giv updates. */ + for (bl = ivs->list; bl; bl = bl->next) + { + rtx increment = biv_total_increment (bl); + struct induction *v; + + if (increment && GET_CODE (increment) == CONST_INT) + { + unroll_benefit++; + for (v = bl->giv; v; v = v->next_iv) + { + if (! v->ignore && v->same == 0 + && GET_CODE (v->mult_val) == CONST_INT) + unroll_benefit++; + /* If this giv uses an array, try to determine + a maximum iteration count from the size of the + array. This need not be correct all the time, + but should not be too far off the mark too often. */ + while (v->giv_type == DEST_ADDR) + { + rtx mem = PATTERN (v->insn); + tree mem_expr, type, size_tree; + + if (GET_CODE (SET_SRC (mem)) == MEM) + mem = SET_SRC (mem); + else if (GET_CODE (SET_DEST (mem)) == MEM) + mem = SET_DEST (mem); + else + break; + mem_expr = MEM_EXPR (mem); + if (! mem_expr) + break; + type = TREE_TYPE (mem_expr); + if (TREE_CODE (type) != ARRAY_TYPE + || ! TYPE_SIZE (type) || ! TYPE_SIZE_UNIT (type)) + break; + size_tree = fold (build (TRUNC_DIV_EXPR, + bitsizetype, + TYPE_SIZE (type), + TYPE_SIZE_UNIT (type))); + if (TREE_CODE (size_tree) == INTEGER_CST + && ! TREE_INT_CST_HIGH (size_tree) + && TREE_INT_CST_LOW (size_tree) < max_iterations) + max_iterations = TREE_INT_CST_LOW (size_tree); + break; + } + } + } + } + } +#else /* 0 */ + /* Assume there is at least some benefit. */ + unroll_benefit = 1; +#endif /* 0 */ + + desc = get_simple_loop_desc (loop); + n_iterations = desc->const_iter ? desc->niter : 0; + max_iterations + = max_iterations < desc->niter_max ? max_iterations : desc->niter_max; + + if (! strength_reduce_p || ! n_iterations) + need_precond = 1; + if (! n_iterations) + { + n_iterations + = max_iterations < 3 ? max_iterations : max_iterations * 3 / 4; + if (! n_iterations) + return 0; + } +#if 0 /* ??? See above - missing induction variable information. 
*/ + while (unroll_benefit > 1) /* no loop */ + { + /* We include the benefit of biv/ giv updates. Check if some or + all of these updates are likely to fit into a scheduling + bubble of a load. + We check for the following case: + - All the insns leading to the first JUMP_INSN are in a strict + dependency chain. + - there is at least one memory reference in them. + + When we find such a pattern, we assume that we can hide as many + updates as the total of the load latency is, if we have an + unroll factor of at least two. We might or might not also do + this without unrolling, so rather than considering this as an + extra unroll benefit, discount it in the unroll benefits of unroll + factors higher than two. */ + + rtx set, last_set; + + insn = next_active_insn (loop->start); + last_set = single_set (insn); + if (! last_set) + break; + if (GET_CODE (SET_SRC (last_set)) == MEM) + mem_latency += 2; + for (insn = NEXT_INSN (insn); insn != end; insn = NEXT_INSN (insn)) + { + if (! INSN_P (insn)) + continue; + if (GET_CODE (insn) == JUMP_INSN) + break; + if (! reg_referenced_p (SET_DEST (last_set), PATTERN (insn))) + { + /* Check if this is a to-be-reduced giv insn. */ + struct loop_ivs *ivs = LOOP_IVS (loop); + struct iv_class *bl; + struct induction *v; + for (bl = ivs->list; bl; bl = bl->next) + { + if (bl->biv->insn == insn) + goto is_biv; + for (v = bl->giv; v; v = v->next_iv) + if (v->insn == insn) + goto is_giv; + } + mem_latency--; + is_biv: + is_giv: + continue; + } + set = single_set (insn); + if (! set) + continue; + if (GET_CODE (SET_SRC (set)) == MEM) + mem_latency += 2; + last_set = set; + } + if (mem_latency < 0) + mem_latency = 0; + else if (mem_latency > unroll_benefit - 1) + mem_latency = unroll_benefit - 1; + break; + } +#endif /* 0 */ + if (n_labels + (unroll_benefit + n_labels * 8) / n_iterations + <= unroll_benefit) + return max_unrolled_insns; + + n_dest = n_labels + n_calls + n_exit_dest; + base_cost = n_dest <= 8 ? 0 : n_dest - 7; + best_cost = 0; + best_factor = 1; + if (n_barriers * 2 > n_labels - 1) + n_barriers = (n_labels - 1) / 2; + for (factor = 2; factor <= 8; factor++) + { + /* Bump up preconditioning cost for each power of two. */ + if (! (factor & (factor-1))) + precond += 4; + /* When preconditioning, only powers of two will be considered. */ + else if (need_precond) + continue; + n_dest = ((unroll_type != LPT_PEEL_COMPLETELY) + + (n_labels - 1) * factor + n_calls + n_exit_dest + - (n_barriers * factor >> 1) + + need_precond); + cost + = ((n_dest <= 8 ? 0 : n_dest - 7) + - base_cost * factor + - ((factor > 2 ? unroll_benefit - mem_latency : unroll_benefit) + * (factor - (unroll_type != LPT_PEEL_COMPLETELY))) + + ((unroll_benefit + 1 + (n_labels - 1) * factor) + / n_iterations)); + if (need_precond) + cost += (precond + unroll_benefit * factor / 2) / n_iterations; + if (cost < best_cost) + { + best_cost = cost; + best_factor = factor; + } + } + threshold = best_factor * insn_count; + if (max_unrolled_insns > threshold) + max_unrolled_insns = threshold; + } + return max_unrolled_insns; +} +#endif /* TARGET_ADJUST_UNROLL_MAX */ + +/* Replace any occurrence of FROM(n) in X with TO(n). The function does + not enter into CONST_DOUBLE for the replace. + + Note that copying is not done so X must not be shared unless all copies + are to be modified. + + This is like replace_rtx, except that we operate on N_REPLACEMENTS + replacements simultaneously - FROM(n) is replacements[n*2] and TO(n) is + replacements[n*2+1] - and that we take mode changes into account.
+ + If a replacement is ambiguous, return NULL_RTX. + + If MODIFY is zero, don't modify any rtl in place, + just return zero or nonzero for failure / success. */ + +rtx +replace_n_hard_rtx (rtx x, rtx *replacements, int n_replacements, int modify) +{ + int i, j; + const char *fmt; + + /* The following prevents loops from occurring when we change MEM in + CONST_DOUBLE onto the same CONST_DOUBLE. */ + if (x != 0 && GET_CODE (x) == CONST_DOUBLE) + return x; + + for (i = n_replacements - 1; i >= 0 ; i--) + if (x == replacements[i*2] && GET_MODE (x) == GET_MODE (replacements[i*2+1])) + return replacements[i*2+1]; + + /* Allow this function to make replacements in EXPR_LISTs. */ + if (x == 0) + return 0; + + if (GET_CODE (x) == SUBREG) + { + rtx new = replace_n_hard_rtx (SUBREG_REG (x), replacements, + n_replacements, modify); + + if (GET_CODE (new) == CONST_INT) + { + x = simplify_subreg (GET_MODE (x), new, + GET_MODE (SUBREG_REG (x)), + SUBREG_BYTE (x)); + if (! x) + abort (); + } + else if (modify) + SUBREG_REG (x) = new; + + return x; + } + else if (GET_CODE (x) == REG) + { + unsigned regno = REGNO (x); + unsigned nregs = (regno < FIRST_PSEUDO_REGISTER + ? HARD_REGNO_NREGS (regno, GET_MODE (x)) : 1); + rtx result = NULL_RTX; + + for (i = n_replacements - 1; i >= 0; i--) + { + rtx from = replacements[i*2]; + rtx to = replacements[i*2+1]; + unsigned from_regno, from_nregs, to_regno, new_regno; + + if (GET_CODE (from) != REG) + continue; + from_regno = REGNO (from); + from_nregs = (from_regno < FIRST_PSEUDO_REGISTER + ? HARD_REGNO_NREGS (from_regno, GET_MODE (from)) : 1); + if (regno < from_regno + from_nregs && regno + nregs > from_regno) + { + if (regno < from_regno + || regno + nregs > from_regno + nregs + || GET_CODE (to) != REG + || result) + return NULL_RTX; + to_regno = REGNO (to); + if (to_regno < FIRST_PSEUDO_REGISTER) + { + new_regno = regno + to_regno - from_regno; + if ((unsigned) HARD_REGNO_NREGS (new_regno, GET_MODE (x)) + != nregs) + return NULL_RTX; + result = gen_rtx_REG (GET_MODE (x), new_regno); + } + else if (GET_MODE (x) <= GET_MODE (to)) + result = gen_lowpart_common (GET_MODE (x), to); + else + result = gen_lowpart_SUBREG (GET_MODE (x), to); + } + } + return result ? result : x; + } + else if (GET_CODE (x) == ZERO_EXTEND) + { + rtx new = replace_n_hard_rtx (XEXP (x, 0), replacements, + n_replacements, modify); + + if (GET_CODE (new) == CONST_INT) + { + x = simplify_unary_operation (ZERO_EXTEND, GET_MODE (x), + new, GET_MODE (XEXP (x, 0))); + if (! 
x) + abort (); + } + else if (modify) + XEXP (x, 0) = new; + + return x; + } + + fmt = GET_RTX_FORMAT (GET_CODE (x)); + for (i = GET_RTX_LENGTH (GET_CODE (x)) - 1; i >= 0; i--) + { + rtx new; + + if (fmt[i] == 'e') + { + new = replace_n_hard_rtx (XEXP (x, i), replacements, + n_replacements, modify); + if (!new) + return NULL_RTX; + if (modify) + XEXP (x, i) = new; + } + else if (fmt[i] == 'E') + for (j = XVECLEN (x, i) - 1; j >= 0; j--) + { + new = replace_n_hard_rtx (XVECEXP (x, i, j), replacements, + n_replacements, modify); + if (!new) + return NULL_RTX; + if (modify) + XVECEXP (x, i, j) = new; + } + } + + return x; +} + +rtx +sh_gen_truncate (enum machine_mode mode, rtx x, int need_sign_ext) +{ + enum rtx_code code = TRUNCATE; + + if (GET_CODE (x) == ZERO_EXTEND || GET_CODE (x) == SIGN_EXTEND) + { + rtx inner = XEXP (x, 0); + enum machine_mode inner_mode = GET_MODE (inner); + + if (inner_mode == mode) + return inner; + else if (GET_MODE_SIZE (inner_mode) >= GET_MODE_SIZE (mode)) + x = inner; + else if (GET_MODE_SIZE (inner_mode) < GET_MODE_SIZE (mode) + && (! need_sign_ext || GET_CODE (x) == SIGN_EXTEND)) + { + code = GET_CODE (x); + x = inner; + } + } + return gen_rtx_fmt_e (code, mode, x); +} + +/* Called via for_each_rtx after reload, to clean up truncates of + registers that span multiple actual hard registers. */ +int +shmedia_cleanup_truncate (rtx *p, void *n_changes) +{ + rtx x = *p, reg; + + if (GET_CODE (x) != TRUNCATE) + return 0; + reg = XEXP (x, 0); + if (GET_MODE_SIZE (GET_MODE (reg)) > 8 && GET_CODE (reg) == REG) + { + enum machine_mode reg_mode = GET_MODE (reg); + XEXP (x, 0) = simplify_subreg (DImode, reg, reg_mode, + subreg_lowpart_offset (DImode, reg_mode)); + *(int*) n_changes += 1; + return -1; + } + return 0; +} + +/* Load and store depend on the highpart of the address. However, + set_attr_alternative does not give well-defined results before reload, + so we must look at the rtl ourselves to see if any of the feeding + registers is used in a memref. */ + +/* Called by sh_contains_memref_p via for_each_rtx. */ +static int +sh_contains_memref_p_1 (rtx *loc, void *data ATTRIBUTE_UNUSED) +{ + return (GET_CODE (*loc) == MEM); +} + +/* Return non-zero iff INSN contains a MEM. */ +int +sh_contains_memref_p (rtx insn) +{ + return for_each_rtx (&PATTERN (insn), &sh_contains_memref_p_1, NULL); +} + +/* FNADDR is the MEM expression from a call expander. Return an address + to use in an SHmedia insn pattern. */ +rtx +shmedia_prepare_call_address (rtx fnaddr, int is_sibcall) +{ + int is_sym; + + fnaddr = XEXP (fnaddr, 0); + is_sym = GET_CODE (fnaddr) == SYMBOL_REF; + if (flag_pic && is_sym) + { + if (! SYMBOL_REF_LOCAL_P (fnaddr)) + { + rtx reg = gen_reg_rtx (Pmode); + + /* We must not use GOTPLT for sibcalls, because PIC_REG + must be restored before the PLT code gets to run. */ + if (is_sibcall) + emit_insn (gen_symGOT2reg (reg, fnaddr)); + else + emit_insn (gen_symGOTPLT2reg (reg, fnaddr)); + fnaddr = reg; + } + else + { + fnaddr = gen_sym2PIC (fnaddr); + PUT_MODE (fnaddr, Pmode); + } + } + /* If ptabs might trap, make this visible to the rest of the compiler. + We generally assume that symbols pertain to valid locations, but + it is possible to generate invalid symbols with asm or linker tricks. + In a list of functions where each returns its successor, an invalid + symbol might denote an empty list. */ + if (!TARGET_PT_FIXED + && (!is_sym || TARGET_INVALID_SYMBOLS) + && (!REG_P (fnaddr) || ! 
TARGET_REGISTER_P (REGNO (fnaddr)))) + { + rtx tr = gen_reg_rtx (PDImode); + + emit_insn (gen_ptabs (tr, fnaddr)); + fnaddr = tr; + } + else if (! target_reg_operand (fnaddr, Pmode)) + fnaddr = copy_to_mode_reg (Pmode, fnaddr); + return fnaddr; +} + +const char *sh_multcost_str = ""; +const char *sh_gettrcost_str = ""; +const char *sh_div_str = ""; +const char *sh_divsi3_libfunc = ""; +const char *cut2_workaround_str = ""; +enum sh_divide_strategy_e sh_div_strategy = SH_DIV_STRATEGY_DEFAULT; + +/* This defines the storage for the variable part of a -mboard= option. + It is only required when using the sh-superh-elf target */ +#ifdef _SUPERH_H +const char * boardtype = "7750p2"; +const char * osruntime = "bare"; +#endif + #include "gt-sh.h" diff --git a/gcc/config/sh/sh.h b/gcc/config/sh/sh.h index f550d6b..32ad39c 100644 --- a/gcc/config/sh/sh.h +++ b/gcc/config/sh/sh.h @@ -84,6 +84,10 @@ do { \ builtin_define ("__SH4_NOFPU__"); \ } \ } \ + if (TARGET_FPU_ANY) \ + builtin_define ("__SH_FPU_ANY__"); \ + if (TARGET_FPU_DOUBLE) \ + builtin_define ("__SH_FPU_DOUBLE__"); \ if (TARGET_HITACHI) \ builtin_define ("__HITACHI__"); \ builtin_define (TARGET_LITTLE_ENDIAN \ @@ -175,6 +179,10 @@ extern int target_flags; #define SAVE_ALL_TR_BIT (1<<2) #define HARD_SH2A_BIT (1<<17) #define HARD_SH2A_DOUBLE_BIT (1<<18) +#define INDEXED_ADDRESS_BIT (1<<19) +#define PT_FIXED_BIT (1<<21) +#define INVALID_SYMBOLS_BIT (1<<25) +#define ADJUST_UNROLL_BIT (1<<20) /* Nonzero if this is an ELF target - compile time only */ #define TARGET_ELF 0 @@ -214,7 +222,7 @@ extern int target_flags; #define TARGET_SUPERSCALAR (target_flags & HARD_SH4_BIT) /* Nonzero if the target has separate instruction and data caches. */ -#define TARGET_HARVARD (target_flags & HARD_SH4_BIT) +#define TARGET_HARVARD (target_flags & HARD_SH4_BIT || TARGET_SH5) /* Nonzero if compiling for SH4 hardware (to be used for insn costs etc.) */ #define TARGET_HARD_SH4 (target_flags & HARD_SH4_BIT) @@ -317,6 +325,27 @@ extern int target_flags; #define SUPPORT_SH2A_SINGLE #endif +#define TARGET_DIVIDE_INV \ + (sh_div_strategy == SH_DIV_INV || sh_div_strategy == SH_DIV_INV_MINLAT \ + || sh_div_strategy == SH_DIV_INV20U || sh_div_strategy == SH_DIV_INV20L \ + || sh_div_strategy == SH_DIV_INV_CALL \ + || sh_div_strategy == SH_DIV_INV_CALL2 || sh_div_strategy == SH_DIV_INV_FP) +#define TARGET_DIVIDE_FP (sh_div_strategy == SH_DIV_FP) +#define TARGET_DIVIDE_INV_FP (sh_div_strategy == SH_DIV_INV_FP) +#define TARGET_DIVIDE_CALL2 (sh_div_strategy == SH_DIV_CALL2) +#define TARGET_DIVIDE_INV_MINLAT (sh_div_strategy == SH_DIV_INV_MINLAT) +#define TARGET_DIVIDE_INV20U (sh_div_strategy == SH_DIV_INV20U) +#define TARGET_DIVIDE_INV20L (sh_div_strategy == SH_DIV_INV20L) +#define TARGET_DIVIDE_INV_CALL (sh_div_strategy == SH_DIV_INV_CALL) +#define TARGET_DIVIDE_INV_CALL2 (sh_div_strategy == SH_DIV_INV_CALL2) + +/* Target macros pertaining to SHmedia architecture bugs. 
*/ +#define TARGET_ALLOW_INDEXED_ADDRESS (target_flags & INDEXED_ADDRESS_BIT) +#define TARGET_PT_FIXED (target_flags & PT_FIXED_BIT) +#define TARGET_INVALID_SYMBOLS (target_flags & INVALID_SYMBOLS_BIT) + +#define TARGET_ADJUST_UNROLL (target_flags & ADJUST_UNROLL_BIT) + #define SELECT_SH1 (SH1_BIT) #define SELECT_SH2 (SH2_BIT | SELECT_SH1) #define SELECT_SH2E (SH_E_BIT | SH2_BIT | SH1_BIT | FPU_SINGLE_BIT) @@ -419,6 +448,14 @@ extern int target_flags; #define TARGET_SWITCHES_SH5_32MEDIA_NOFPU #endif +#if defined(TARGET_SWITCHES_SH5_32MEDIA) && defined(TARGET_SWITCHES_SH5_32MEDIA_NOFPU) +#define TARGET_SWITCH_SH5_32_ANY_EXTRA +#endif + +#if defined(TARGET_SWITCH_SH5_32_ANY_EXTRA) && !defined(SUPPORT_SH5_64MEDIA) && !defined(SUPPORT_SH5_64MEDIA_NOFPU) +#define TARGET_SWITCH_SH5_MEDIA_ANY_EXTRA +#endif + /* Reset all target-selection flags. */ #define TARGET_NONE -(SH1_BIT | SH2_BIT | SH3_BIT | SH_E_BIT | SH4_BIT \ | HARD_SH2A_BIT | HARD_SH2A_DOUBLE_BIT \ @@ -539,6 +576,22 @@ extern int target_flags; {"5-compact-nofpu", SELECT_SH5_COMPACT_NOFPU, "Generate FPU-less SHcompact code" }, #endif +#ifndef TARGET_SWITCH_SH5_32_ANY_EXTRA +#define TARGET_SWITCH_SH5_32_ANY_EXTRA \ + {"indexed-addressing", INDEXED_ADDRESS_BIT, "Enable the use of the indexed addressing mode for SHmedia32/SHcompact"}, \ + {"no-indexed-addressing", -INDEXED_ADDRESS_BIT, "Disable the use of the indexed addressing mode for SHmedia32/SHcompact"}, +#endif + +#ifndef TARGET_SWITCH_SH5_MEDIA_ANY_EXTRA +#define TARGET_SWITCH_SH5_MEDIA_ANY_EXTRA \ + {"pt-fixed", PT_FIXED_BIT, "Assume pt* instructions won't trap"}, \ + {"no-pt-fixed", -PT_FIXED_BIT, "Assume pt* instructions may trap"}, \ + {"invalid-symbols",INVALID_SYMBOLS_BIT, "Assume symbols might be invalid"}, \ + {"no-invalid-symbols",-INVALID_SYMBOLS_BIT, "Assume symbols won't be invalid"}, \ + {"adjust-unroll", ADJUST_UNROLL_BIT, "Throttle unrolling to avoid thrashing target registers unless the unroll benefit outweighs this"}, \ + {"no-adjust-unroll", -ADJUST_UNROLL_BIT, "Don't throttle unrolling"}, +#endif + #define TARGET_SWITCHES \ { TARGET_SWITCH_SH1 \ TARGET_SWITCH_SH2 \ @@ -562,25 +615,27 @@ extern int target_flags; TARGET_SWITCH_SH5_64MEDIA_NOFPU \ TARGET_SWITCHES_SH5_32MEDIA \ TARGET_SWITCHES_SH5_32MEDIA_NOFPU \ - {"b", -LITTLE_ENDIAN_BIT, "Generate code in big endian mode" }, \ - {"bigtable", BIGTABLE_BIT, "Generate 32-bit offsets in switch tables" }, \ - {"dalign", DALIGN_BIT, "Aligns doubles at 64-bit boundaries" }, \ - {"fmovd", FMOVD_BIT, "" }, \ - {"hitachi", HITACHI_BIT, "Follow Renesas (formerly Hitachi) / SuperH calling conventions" }, \ - {"renesas", HITACHI_BIT, "Follow Renesas (formerly Hitachi) / SuperH calling conventions" }, \ + {"b", -LITTLE_ENDIAN_BIT, "Generate code in big endian mode" }, \ + {"bigtable", BIGTABLE_BIT, "Generate 32-bit offsets in switch tables" }, \ + {"dalign", DALIGN_BIT, "Aligns doubles at 64-bit boundaries" }, \ + {"fmovd", FMOVD_BIT, "" }, \ + {"hitachi", HITACHI_BIT, "Follow Renesas (formerly Hitachi) / SuperH calling conventions" }, \ + {"renesas", HITACHI_BIT, "Follow Renesas (formerly Hitachi) / SuperH calling conventions" }, \ {"no-renesas",-HITACHI_BIT,"Follow the GCC calling conventions" }, \ - {"nomacsave", NOMACSAVE_BIT, "Mark MAC register as call-clobbered" }, \ - {"ieee", IEEE_BIT, "Increase the IEEE compliance for floating-point code" }, \ - {"isize", ISIZE_BIT, "" }, \ - {"l", LITTLE_ENDIAN_BIT, "Generate code in little endian mode" }, \ - {"no-ieee", -IEEE_BIT, "" }, \ - {"padstruct", PADSTRUCT_BIT, "" }, \ - 
{"prefergot", PREFERGOT_BIT, "Emit function-calls using global offset table when generating PIC" }, \ - {"relax", RELAX_BIT, "Shorten address references during linking" }, \ + {"nomacsave", NOMACSAVE_BIT, "Mark MAC register as call-clobbered" }, \ + {"ieee", IEEE_BIT, "Increase the IEEE compliance for floating-point code" }, \ + {"isize", ISIZE_BIT, "Annotate assembler instructions with estimated addresses" }, \ + {"l", LITTLE_ENDIAN_BIT, "Generate code in little endian mode" }, \ + {"no-ieee", -IEEE_BIT, "Opposite of -mieee" }, \ + {"padstruct", PADSTRUCT_BIT, "Make structs a multiple of 4 bytes (warning: ABI altered)" }, \ + {"prefergot", PREFERGOT_BIT, "Emit function-calls using global offset table when generating PIC" }, \ + {"relax", RELAX_BIT, "Shorten address references during linking" }, \ {"space", SPACE_BIT, "Deprecated. Use -Os instead" }, \ - {"usermode", USERMODE_BIT, "Generate library function call to invalidate instruction cache entries after fixing trampoline" }, \ - SUBTARGET_SWITCHES \ - {"", TARGET_DEFAULT, "" } \ + {"usermode", USERMODE_BIT, "Generate library function call to invalidate instruction cache entries after fixing trampoline" }, \ + TARGET_SWITCH_SH5_32_ANY_EXTRA \ + TARGET_SWITCH_SH5_MEDIA_ANY_EXTRA \ + SUBTARGET_SWITCHES \ + {"", TARGET_DEFAULT, "" } \ } /* This are meant to be redefined in the host dependent files */ @@ -591,7 +646,32 @@ extern int target_flags; #define TARGET_ENDIAN_DEFAULT 0 #endif -#define TARGET_DEFAULT (TARGET_CPU_DEFAULT|TARGET_ENDIAN_DEFAULT) +#ifndef TARGET_OPT_DEFAULT +#define TARGET_OPT_DEFAULT ADJUST_UNROLL_BIT +#endif + +#define TARGET_DEFAULT \ + (TARGET_CPU_DEFAULT | TARGET_ENDIAN_DEFAULT | TARGET_OPT_DEFAULT) + +#ifndef SUBTARGET_OPTIONS +#define SUBTARGET_OPTIONS +#endif + +#define TARGET_OPTIONS \ +{ { "ultcost=", &sh_multcost_str, \ + N_("Cost to assume for a multiply insn"), 0 }, \ + { "gettrcost=", &sh_gettrcost_str, \ + N_("Cost to assume for gettr insn"), 0 }, \ + { "div=", &sh_div_str, \ + N_("division strategy, one of: call, call2, fp, inv, inv:minlat, inv20u, inv20l, inv:call, inv:call2, inv:fp"), 0 }, \ + { "divsi3_libfunc=", &sh_divsi3_libfunc, \ + N_("Specify name for 32 bit signed division function"), 0 }, \ + { "cut2-workaround", &cut2_workaround_str, \ + N_("Enable SH5 cut2 workaround"), "\1" }, \ + SUBTARGET_OPTIONS \ +} + +#define TARGET_SH5_CUT2_WORKAROUND (*cut2_workaround_str) #ifndef SH_MULTILIB_CPU_DEFAULT #define SH_MULTILIB_CPU_DEFAULT "m1" @@ -621,7 +701,8 @@ extern int target_flags; { "subtarget_link_spec", SUBTARGET_LINK_SPEC }, \ { "subtarget_asm_endian_spec", SUBTARGET_ASM_ENDIAN_SPEC }, \ { "subtarget_asm_relax_spec", SUBTARGET_ASM_RELAX_SPEC }, \ - { "subtarget_asm_isa_spec", SUBTARGET_ASM_ISA_SPEC }, \ + { "subtarget_asm_isa_spec", SUBTARGET_ASM_ISA_SPEC }, \ + { "subtarget_asm_spec", SUBTARGET_ASM_SPEC }, \ SUBTARGET_EXTRA_SPECS #if TARGET_CPU_DEFAULT & HARD_SH4_BIT @@ -632,7 +713,15 @@ extern int target_flags; #define SH_ASM_SPEC \ "%(subtarget_asm_endian_spec) %{mrelax:-relax %(subtarget_asm_relax_spec)}\ -%(subtarget_asm_isa_spec) %{m4al:-dsp}" +%(subtarget_asm_isa_spec) %(subtarget_asm_spec)\ +%{m2a:--isa=sh2a} \ +%{m2a-single:--isa=sh2a} \ +%{m2a-single-only:--isa=sh2a} \ +%{m2a-nofpu:--isa=sh2a-nofpu} \ +%{m5-compact*:--isa=SHcompact} \ +%{m5-32media*:--isa=SHmedia --abi=32} \ +%{m5-64media*:--isa=SHmedia --abi=64} \ +%{m4al:-dsp} %{mcut2-workaround:-cut2-workaround}" #define ASM_SPEC SH_ASM_SPEC @@ -644,9 +733,29 @@ extern int target_flags; #endif #endif -#define 
SUBTARGET_ASM_ISA_SPEC "" +#if STRICT_NOFPU == 1 +/* Strict nofpu means that the compiler should tell the assembler + to reject FPU instructions. E.g. from ASM inserts. */ +#if TARGET_CPU_DEFAULT & HARD_SH4_BIT && !(TARGET_CPU_DEFAULT & SH_E_BIT) +#define SUBTARGET_ASM_ISA_SPEC "%{!m1:%{!m2:%{!m3*:%{m4-nofpu|!m4*:%{!m5:-isa=sh4-nofpu}}}}}" +#else +/* If there were an -isa option for sh5-nofpu then it would also go here. */ +#define SUBTARGET_ASM_ISA_SPEC \ + "%{m4-nofpu:-isa=sh4-nofpu} " ASM_ISA_DEFAULT_SPEC +#endif +#else /* ! STRICT_NOFPU */ +#define SUBTARGET_ASM_ISA_SPEC ASM_ISA_DEFAULT_SPEC +#endif +#ifndef SUBTARGET_ASM_SPEC +#define SUBTARGET_ASM_SPEC "" +#endif + +#if TARGET_ENDIAN_DEFAULT == LITTLE_ENDIAN_BIT +#define LINK_EMUL_PREFIX "sh%{!mb:l}" +#else #define LINK_EMUL_PREFIX "sh%{ml:l}" +#endif #if TARGET_CPU_DEFAULT & SH5_BIT #if TARGET_CPU_DEFAULT & SH_E_BIT @@ -682,29 +791,73 @@ extern int target_flags; %(subtarget_link_emul_suffix) \ %{mrelax:-relax} %(subtarget_link_spec)" +#ifndef SH_DIV_STR_FOR_SIZE +#define SH_DIV_STR_FOR_SIZE "call" +#endif + #define DRIVER_SELF_SPECS "%{m2a:%{ml:%eSH2a does not support little-endian}}" #define OPTIMIZATION_OPTIONS(LEVEL,SIZE) \ do { \ if (LEVEL) \ - flag_omit_frame_pointer = -1; \ + { \ + flag_omit_frame_pointer = -1; \ + if (! SIZE) \ + sh_div_str = "inv:minlat"; \ + } \ if (SIZE) \ - target_flags |= SPACE_BIT; \ - if (TARGET_SHMEDIA && LEVEL > 1) \ + { \ + target_flags |= SPACE_BIT; \ + sh_div_str = SH_DIV_STR_FOR_SIZE ; \ + } \ + /* We can't meaningfully test TARGET_SHMEDIA here, because -m options \ + haven't been parsed yet, hence we';d read only the default. \ + sh_target_reg_class will return NO_REGS if this is not SHMEDIA, so \ + it's OK to always set flag_branch_target_load_optimize. */ \ + if (LEVEL > 1) \ { \ flag_branch_target_load_optimize = 1; \ if (! (SIZE)) \ target_flags |= SAVE_ALL_TR_BIT; \ } \ + /* Likewise, we can't meaningfully test TARGET_SH2E / TARGET_IEEE \ + here, so leave it to OVERRIDE_OPTIONS to set \ + flag_finite_math_only. We set it to 2 here so we know if the user \ + explicitly requested this to be on or off. */ \ + flag_finite_math_only = 2; \ } while (0) #define ASSEMBLER_DIALECT assembler_dialect extern int assembler_dialect; +enum sh_divide_strategy_e { + SH_DIV_CALL, + SH_DIV_CALL2, + SH_DIV_FP, + SH_DIV_INV, + SH_DIV_INV_MINLAT, + SH_DIV_INV20U, + SH_DIV_INV20L, + SH_DIV_INV_CALL, + SH_DIV_INV_CALL2, + SH_DIV_INV_FP +}; + +extern enum sh_divide_strategy_e sh_div_strategy; + +#ifndef SH_DIV_STRATEGY_DEFAULT +#define SH_DIV_STRATEGY_DEFAULT SH_DIV_CALL +#endif + #define OVERRIDE_OPTIONS \ do { \ int regno; \ \ + if (flag_finite_math_only == 2) \ + flag_finite_math_only \ + = !flag_signaling_nans && TARGET_SH2E && ! TARGET_IEEE; \ + if (TARGET_SH2E && !flag_finite_math_only) \ + target_flags |= IEEE_BIT; \ sh_cpu = CPU_SH1; \ assembler_dialect = 0; \ if (TARGET_SH2) \ @@ -735,8 +888,7 @@ do { \ { \ sh_cpu = CPU_SH5; \ target_flags |= DALIGN_BIT; \ - if (TARGET_FPU_ANY \ - && ! (TARGET_SHCOMPACT && TARGET_LITTLE_ENDIAN)) \ + if (TARGET_SHMEDIA_FPU) \ target_flags |= FMOVD_BIT; \ if (TARGET_SHMEDIA) \ { \ @@ -744,16 +896,53 @@ do { \ flag_delayed_branch = 0; \ /* Relaxation isn't yet supported for SHmedia */ \ target_flags &= ~RELAX_BIT; \ + /* After reload, if conversion does little good but can cause \ + ICEs: \ + - find_if_block doesn't do anything for SH because we don't\ + have conditional execution patterns. 
(We use conditional\ + move patterns, which are handled differently, and only \ + before reload). \ + - find_cond_trap doesn't do anything for the SH because we \ + don't have conditional traps. \ + - find_if_case_1 uses redirect_edge_and_branch_force in \ + the only path that does an optimization, and this causes \ + an ICE when branch targets are in registers. \ + - find_if_case_2 doesn't do anything for the SHmedia after \ + reload except when it can redirect a tablejump - and \ + that's rather rare. */ \ + flag_if_conversion2 = 0; \ + if (! strcmp (sh_div_str, "call")) \ + sh_div_strategy = SH_DIV_CALL; \ + else if (! strcmp (sh_div_str, "call2")) \ + sh_div_strategy = SH_DIV_CALL2; \ + if (! strcmp (sh_div_str, "fp") && TARGET_FPU_ANY) \ + sh_div_strategy = SH_DIV_FP; \ + else if (! strcmp (sh_div_str, "inv")) \ + sh_div_strategy = SH_DIV_INV; \ + else if (! strcmp (sh_div_str, "inv:minlat")) \ + sh_div_strategy = SH_DIV_INV_MINLAT; \ + else if (! strcmp (sh_div_str, "inv20u")) \ + sh_div_strategy = SH_DIV_INV20U; \ + else if (! strcmp (sh_div_str, "inv20l")) \ + sh_div_strategy = SH_DIV_INV20L; \ + else if (! strcmp (sh_div_str, "inv:call2")) \ + sh_div_strategy = SH_DIV_INV_CALL2; \ + else if (! strcmp (sh_div_str, "inv:call")) \ + sh_div_strategy = SH_DIV_INV_CALL; \ + else if (! strcmp (sh_div_str, "inv:fp")) \ + { \ + if (TARGET_FPU_ANY) \ + sh_div_strategy = SH_DIV_INV_FP; \ + else \ + sh_div_strategy = SH_DIV_INV; \ + } \ } \ /* -fprofile-arcs needs a working libgcov . In unified tree \ configurations with newlib, this requires to configure with \ --with-newlib --with-headers. But there is no way to check \ here we have a working libgcov, so just assume that we have. */\ if (profile_flag) \ - { \ - warning (0, "Profiling is not supported on this target."); \ - profile_flag = profile_arc_flag = 0; \ - } \ + warning (0, "Profiling is still experimental for this target.");\ } \ else \ { \ @@ -761,6 +950,19 @@ do { \ targetm.asm_out.aligned_op.di = NULL; \ targetm.asm_out.unaligned_op.di = NULL; \ } \ + if (sh_divsi3_libfunc[0]) \ + ; /* User supplied - leave it alone. */ \ + else if (TARGET_HARD_SH4 && TARGET_SH2E) \ + sh_divsi3_libfunc = "__sdivsi3_i4"; \ + else if (TARGET_SH5) \ + { \ + if (TARGET_FPU_ANY && TARGET_SH1) \ + sh_divsi3_libfunc = "__sdivsi3_i4"; \ + else \ + sh_divsi3_libfunc = "__sdivsi3_1"; \ + } \ + else \ + sh_divsi3_libfunc = "__sdivsi3"; \ if (TARGET_FMOVD) \ reg_class_from_letter['e' - 'a'] = NO_REGS; \ \ @@ -783,7 +985,8 @@ do { \ flag_omit_frame_pointer = 0; \ } \ \ - if (flag_pic && ! TARGET_PREFERGOT) \ + if ((flag_pic && ! TARGET_PREFERGOT) \ + || (TARGET_SHMEDIA && !TARGET_PT_FIXED)) \ flag_no_function_cse = 1; \ \ if (SMALL_REGISTER_CLASSES) \ @@ -947,7 +1150,7 @@ do { \ barrier_align (LABEL_AFTER_BARRIER) #define LOOP_ALIGN(A_LABEL) \ - ((! optimize || TARGET_HARVARD || TARGET_SMALLCODE) \ + ((! optimize || TARGET_HARD_SH4 || TARGET_SMALLCODE) \ ? 0 : sh_loop_align (A_LABEL)) #define LABEL_ALIGN(A_LABEL) \ @@ -1293,11 +1496,14 @@ extern char sh_additional_register_names[ADDREGNAMES_SIZE] \ || ((((TARGET_SH4 || TARGET_SH2A_DOUBLE) && (MODE) == DFmode) || (MODE) == DCmode \ || (TARGET_SHMEDIA && ((MODE) == DFmode || (MODE) == DImode \ || (MODE) == V2SFmode || (MODE) == TImode))) \ - && (((REGNO) - FIRST_FP_REG) & 1) == 0)) \ + && (((REGNO) - FIRST_FP_REG) & 1) == 0) \ + || ((TARGET_SH4 || TARGET_SHMEDIA) \ + && (MODE) == TImode \ + && (((REGNO) - FIRST_FP_REG) & 3) == 0)) \ : XD_REGISTER_P (REGNO) \ ? 
(MODE) == DFmode \ : TARGET_REGISTER_P (REGNO) \ - ? ((MODE) == DImode || (MODE) == SImode) \ + ? ((MODE) == DImode || (MODE) == SImode || (MODE) == PDImode) \ : (REGNO) == PR_REG ? (MODE) == SImode \ : (REGNO) == FPSCR_REG ? (MODE) == PSImode \ : 1) @@ -1312,6 +1518,9 @@ extern char sh_additional_register_names[ADDREGNAMES_SIZE] \ #define MODES_TIEABLE_P(MODE1, MODE2) \ ((MODE1) == (MODE2) \ + || (TARGET_SHMEDIA \ + && GET_MODE_SIZE (MODE1) == GET_MODE_SIZE (MODE2) \ + && INTEGRAL_MODE_P (MODE1) && INTEGRAL_MODE_P (MODE2)) \ || (GET_MODE_CLASS (MODE1) == GET_MODE_CLASS (MODE2) \ && (TARGET_SHMEDIA ? ((GET_MODE_SIZE (MODE1) <= 4) \ && (GET_MODE_SIZE (MODE2) <= 4)) \ @@ -1578,7 +1787,8 @@ extern enum reg_class regno_reg_class[FIRST_PSEUDO_REGISTER]; 145,146,147,148,149,152 } /* The class value for index registers, and the one for base regs. */ -#define INDEX_REG_CLASS (TARGET_SHMEDIA ? GENERAL_REGS : R0_REGS) +#define INDEX_REG_CLASS \ + (!ALLOW_INDEXED_ADDRESS ? NO_REGS : TARGET_SHMEDIA ? GENERAL_REGS : R0_REGS) #define BASE_REG_CLASS GENERAL_REGS /* Get reg_class from a letter such as appears in the machine @@ -1619,30 +1829,11 @@ extern enum reg_class reg_class_from_letter[]; unused CONST_INT constraint letters: LO unused EXTRA_CONSTRAINT letters: D T U Y */ -#if 1 /* check that the transition went well. */ -#define CONSTRAINT_LEN(C,STR) \ - (((C) == 'L' || (C) == 'O' || (C) == 'D' || (C) == 'T' || (C) == 'U' \ - || (C) == 'Y' \ - || ((C) == 'I' \ - && (((STR)[1] != '0' && (STR)[1] != '1' && (STR)[1] != '2') \ - || (STR)[2] < '0' || (STR)[2] > '9')) \ - || ((C) == 'B' && ((STR)[1] != 's' || (STR)[2] != 'c')) \ - || ((C) == 'J' && ((STR)[1] != '1' || (STR)[2] != '6')) \ - || ((C) == 'K' && ((STR)[1] != '0' || (STR)[2] != '8')) \ - || ((C) == 'P' && ((STR)[1] != '2' || (STR)[2] != '7'))) \ - ? -1 \ - : ((C) == 'A' || (C) == 'B' || (C) == 'C' \ - || (C) == 'I' || (C) == 'J' || (C) == 'K' || (C) == 'P' \ - || (C) == 'R' || (C) == 'S') \ - ? 3 \ - : DEFAULT_CONSTRAINT_LEN ((C), (STR))) -#else #define CONSTRAINT_LEN(C,STR) \ (((C) == 'A' || (C) == 'B' || (C) == 'C' \ || (C) == 'I' || (C) == 'J' || (C) == 'K' || (C) == 'P' \ || (C) == 'R' || (C) == 'S') \ ? 3 : DEFAULT_CONSTRAINT_LEN ((C), (STR))) -#endif /* The letters I, J, K, L and M in a register constraint string can be used to stand for particular ranges of immediate operands. @@ -1671,7 +1862,7 @@ extern enum reg_class reg_class_from_letter[]; && ((HOST_WIDE_INT)(VALUE)) <= 524287 \ && TARGET_SH2A) #define CONST_OK_FOR_I(VALUE, STR) \ - ((STR)[1] == '0' && (STR)[2] == 6 ? CONST_OK_FOR_I06 (VALUE) \ + ((STR)[1] == '0' && (STR)[2] == '6' ? CONST_OK_FOR_I06 (VALUE) \ : (STR)[1] == '0' && (STR)[2] == '8' ? CONST_OK_FOR_I08 (VALUE) \ : (STR)[1] == '1' && (STR)[2] == '0' ? CONST_OK_FOR_I10 (VALUE) \ : (STR)[1] == '1' && (STR)[2] == '6' ? CONST_OK_FOR_I16 (VALUE) \ @@ -1722,11 +1913,12 @@ extern enum reg_class reg_class_from_letter[]; #define PREFERRED_RELOAD_CLASS(X, CLASS) \ ((CLASS) == NO_REGS && TARGET_SHMEDIA \ && (GET_CODE (X) == CONST_DOUBLE \ - || GET_CODE (X) == SYMBOL_REF) \ + || GET_CODE (X) == SYMBOL_REF \ + || PIC_DIRECT_ADDR_P (X)) \ ? GENERAL_REGS \ : (CLASS)) \ -#define SECONDARY_OUTPUT_RELOAD_CLASS(CLASS,MODE,X) \ +#define SECONDARY_INOUT_RELOAD_CLASS(CLASS,MODE,X,ELSE) \ ((((REGCLASS_HAS_FP_REG (CLASS) \ && (GET_CODE (X) == REG \ && (GENERAL_OR_AP_REGISTER_P (REGNO (X)) \ @@ -1747,18 +1939,21 @@ extern enum reg_class reg_class_from_letter[]; || REGNO (X) == T_REG \ || system_reg_operand (X, VOIDmode))))) \ ? 
GENERAL_REGS \ - : ((CLASS) == TARGET_REGS \ - || (TARGET_SHMEDIA && (CLASS) == SIBCALL_REGS)) \ - ? ((target_operand ((X), (MODE)) \ - && ! target_reg_operand ((X), (MODE))) \ - ? NO_REGS : GENERAL_REGS) \ + : (((CLASS) == TARGET_REGS \ + || (TARGET_SHMEDIA && (CLASS) == SIBCALL_REGS)) \ + && !EXTRA_CONSTRAINT_Csy (X) \ + && (GET_CODE (X) != REG || ! GENERAL_REGISTER_P (REGNO (X)))) \ + ? GENERAL_REGS \ : (((CLASS) == MAC_REGS || (CLASS) == PR_REGS) \ && GET_CODE (X) == REG && ! GENERAL_REGISTER_P (REGNO (X)) \ && (CLASS) != REGNO_REG_CLASS (REGNO (X))) \ ? GENERAL_REGS \ : ((CLASS) != GENERAL_REGS && GET_CODE (X) == REG \ && TARGET_REGISTER_P (REGNO (X))) \ - ? GENERAL_REGS : NO_REGS) + ? GENERAL_REGS : (ELSE)) + +#define SECONDARY_OUTPUT_RELOAD_CLASS(CLASS,MODE,X) \ + SECONDARY_INOUT_RELOAD_CLASS(CLASS,MODE,X,NO_REGS) #define SECONDARY_INPUT_RELOAD_CLASS(CLASS,MODE,X) \ ((REGCLASS_HAS_FP_REG (CLASS) \ @@ -1767,17 +1962,17 @@ extern enum reg_class reg_class_from_letter[]; && ! ((fp_zero_operand (X) || fp_one_operand (X)) \ && (MODE) == SFmode && fldi_ok ())) \ ? R0_REGS \ - : (CLASS == FPUL_REGS \ + : ((CLASS) == FPUL_REGS \ && ((GET_CODE (X) == REG \ && (REGNO (X) == MACL_REG || REGNO (X) == MACH_REG \ || REGNO (X) == T_REG)) \ || GET_CODE (X) == PLUS)) \ ? GENERAL_REGS \ - : CLASS == FPUL_REGS && immediate_operand ((X), (MODE)) \ + : (CLASS) == FPUL_REGS && immediate_operand ((X), (MODE)) \ ? (GET_CODE (X) == CONST_INT && CONST_OK_FOR_I08 (INTVAL (X)) \ ? GENERAL_REGS \ : R0_REGS) \ - : (CLASS == FPSCR_REGS \ + : ((CLASS) == FPSCR_REGS \ && ((GET_CODE (X) == REG && REGNO (X) >= FIRST_PSEUDO_REGISTER) \ || (GET_CODE (X) == MEM && GET_CODE (XEXP ((X), 0)) == PLUS)))\ ? GENERAL_REGS \ @@ -1787,7 +1982,13 @@ extern enum reg_class reg_class_from_letter[]; && (X) != CONST0_RTX (GET_MODE (X)) \ && GET_MODE (X) != V4SFmode) \ ? GENERAL_REGS \ - : SECONDARY_OUTPUT_RELOAD_CLASS((CLASS),(MODE),(X))) + : (((MODE) == QImode || (MODE) == HImode) \ + && TARGET_SHMEDIA && inqhi_operand ((X), (MODE))) \ + ? GENERAL_REGS \ + : (TARGET_SHMEDIA && (CLASS) == GENERAL_REGS \ + && (GET_CODE (X) == LABEL_REF || PIC_DIRECT_ADDR_P (X))) \ + ? TARGET_REGS \ + : SECONDARY_INOUT_RELOAD_CLASS((CLASS),(MODE),(X), NO_REGS)) /* Return the maximum number of consecutive registers needed to represent mode MODE in a register of class CLASS. @@ -1904,7 +2105,7 @@ extern enum reg_class reg_class_from_letter[]; #define FUNCTION_VALUE(VALTYPE, FUNC) \ gen_rtx_REG ( \ ((GET_MODE_CLASS (TYPE_MODE (VALTYPE)) == MODE_INT \ - && GET_MODE_SIZE (TYPE_MODE (VALTYPE)) < UNITS_PER_WORD \ + && GET_MODE_SIZE (TYPE_MODE (VALTYPE)) < 4 \ && (TREE_CODE (VALTYPE) == INTEGER_TYPE \ || TREE_CODE (VALTYPE) == ENUMERAL_TYPE \ || TREE_CODE (VALTYPE) == BOOLEAN_TYPE \ @@ -1912,7 +2113,7 @@ extern enum reg_class reg_class_from_letter[]; || TREE_CODE (VALTYPE) == REAL_TYPE \ || TREE_CODE (VALTYPE) == OFFSET_TYPE)) \ && sh_promote_prototypes (VALTYPE) \ - ? (TARGET_SHMEDIA ? DImode : SImode) : TYPE_MODE (VALTYPE)), \ + ? (TARGET_SHMEDIA64 ? 
DImode : SImode) : TYPE_MODE (VALTYPE)), \ BASE_RETURN_VALUE_REG (TYPE_MODE (VALTYPE))) /* Define how to find the value returned by a library function @@ -2225,10 +2426,19 @@ struct sh_args { #define FUNCTION_PROFILER(STREAM,LABELNO) \ { \ - fprintf((STREAM), "\t.align\t2\n"); \ - fprintf((STREAM), "\ttrapa\t#33\n"); \ - fprintf((STREAM), "\t.align\t2\n"); \ - asm_fprintf((STREAM), "\t.long\t%LLP%d\n", (LABELNO)); \ + if (TARGET_SHMEDIA) \ + { \ + fprintf((STREAM), "\tmovi\t33,r0\n"); \ + fprintf((STREAM), "\ttrapa\tr0\n"); \ + asm_fprintf((STREAM), "\t.long\t%LLP%d\n", (LABELNO)); \ + } \ + else \ + { \ + fprintf((STREAM), "\t.align\t2\n"); \ + fprintf((STREAM), "\ttrapa\t#33\n"); \ + fprintf((STREAM), "\t.align\t2\n"); \ + asm_fprintf((STREAM), "\t.long\t%LLP%d\n", (LABELNO)); \ + } \ } /* Define this macro if the code for function profiling should come @@ -2418,7 +2628,8 @@ struct sh_args { #define EXTRA_CONSTRAINT_C16(OP) \ (GET_CODE (OP) == CONST \ && GET_CODE (XEXP ((OP), 0)) == SIGN_EXTEND \ - && GET_MODE (XEXP ((OP), 0)) == DImode \ + && (GET_MODE (XEXP ((OP), 0)) == DImode \ + || GET_MODE (XEXP ((OP), 0)) == SImode) \ && GET_CODE (XEXP (XEXP ((OP), 0), 0)) == TRUNCATE \ && GET_MODE (XEXP (XEXP ((OP), 0), 0)) == HImode \ && (MOVI_SHORI_BASE_OPERAND_P (XEXP (XEXP (XEXP ((OP), 0), 0), 0)) \ @@ -2433,14 +2644,7 @@ struct sh_args { (GET_CODE (OP) == UNSPEC \ && XINT ((OP), 1) == UNSPEC_DATALABEL \ && XVECLEN ((OP), 0) == 1 \ - && (GET_CODE (XVECEXP ((OP), 0, 0)) == SYMBOL_REF \ - || GET_CODE (XVECEXP ((OP), 0, 0)) == LABEL_REF)) - -/* Check whether OP is a datalabel unspec, possibly enclosed within a - CONST. */ -#define DATALABEL_REF_P(OP) \ - ((GET_CODE (OP) == CONST && DATALABEL_REF_NO_CONST_P (XEXP ((OP), 0))) \ - || DATALABEL_REF_NO_CONST_P (OP)) + && GET_CODE (XVECEXP ((OP), 0, 0)) == LABEL_REF) #define GOT_ENTRY_P(OP) \ (GET_CODE (OP) == CONST && GET_CODE (XEXP ((OP), 0)) == UNSPEC \ @@ -2474,10 +2678,14 @@ struct sh_args { #define NON_PIC_REFERENCE_P(OP) \ (GET_CODE (OP) == LABEL_REF || GET_CODE (OP) == SYMBOL_REF \ - || DATALABEL_REF_P (OP) \ + || (GET_CODE (OP) == CONST \ + && (GET_CODE (XEXP ((OP), 0)) == LABEL_REF \ + || GET_CODE (XEXP ((OP), 0)) == SYMBOL_REF \ + || DATALABEL_REF_NO_CONST_P (XEXP ((OP), 0)))) \ || (GET_CODE (OP) == CONST && GET_CODE (XEXP ((OP), 0)) == PLUS \ && (GET_CODE (XEXP (XEXP ((OP), 0), 0)) == SYMBOL_REF \ - || DATALABEL_REF_P (XEXP (XEXP ((OP), 0), 0))) \ + || GET_CODE (XEXP (XEXP ((OP), 0), 0)) == LABEL_REF \ + || DATALABEL_REF_NO_CONST_P (XEXP (XEXP ((OP), 0), 0))) \ && GET_CODE (XEXP (XEXP ((OP), 0), 1)) == CONST_INT)) #define PIC_REFERENCE_P(OP) \ @@ -2574,6 +2782,8 @@ struct sh_args { #define BASE_REGISTER_RTX_P(X) \ ((GET_CODE (X) == REG && REG_OK_FOR_BASE_P (X)) \ || (GET_CODE (X) == SUBREG \ + && TRULY_NOOP_TRUNCATION (GET_MODE_BITSIZE (GET_MODE ((X))), \ + GET_MODE_BITSIZE (GET_MODE (SUBREG_REG (X)))) \ && GET_CODE (SUBREG_REG (X)) == REG \ && REG_OK_FOR_BASE_P (SUBREG_REG (X)))) @@ -2583,6 +2793,8 @@ struct sh_args { #define INDEX_REGISTER_RTX_P(X) \ ((GET_CODE (X) == REG && REG_OK_FOR_INDEX_P (X)) \ || (GET_CODE (X) == SUBREG \ + && TRULY_NOOP_TRUNCATION (GET_MODE_BITSIZE (GET_MODE ((X))), \ + GET_MODE_BITSIZE (GET_MODE (SUBREG_REG (X)))) \ && GET_CODE (SUBREG_REG (X)) == REG \ && SUBREG_OK_FOR_INDEX_P (SUBREG_REG (X), SUBREG_BYTE (X)))) @@ -2614,7 +2826,15 @@ struct sh_args { { \ if (TARGET_SHMEDIA) \ { \ - int MODE_SIZE = GET_MODE_SIZE (MODE); \ + int MODE_SIZE; \ + /* Check if this the address of an unaligned load / store. 
*/\ + if ((MODE) == VOIDmode) \ + { \ + if (CONST_OK_FOR_I06 (INTVAL (OP))) \ + goto LABEL; \ + break; \ + } \ + MODE_SIZE = GET_MODE_SIZE (MODE); \ if (! (INTVAL (OP) & (MODE_SIZE - 1)) \ && INTVAL (OP) >= -512 * MODE_SIZE \ && INTVAL (OP) < 512 * MODE_SIZE) \ @@ -2627,6 +2847,9 @@ struct sh_args { } \ } while(0) +#define ALLOW_INDEXED_ADDRESS \ + ((!TARGET_SHMEDIA32 && !TARGET_SHCOMPACT) || TARGET_ALLOW_INDEXED_ADDRESS) + #define GO_IF_LEGITIMATE_ADDRESS(MODE, X, LABEL) \ { \ if (BASE_REGISTER_RTX_P (X)) \ @@ -2642,9 +2865,15 @@ struct sh_args { rtx xop1 = XEXP ((X), 1); \ if (GET_MODE_SIZE (MODE) <= 8 && BASE_REGISTER_RTX_P (xop0)) \ GO_IF_LEGITIMATE_INDEX ((MODE), xop1, LABEL); \ - if (GET_MODE_SIZE (MODE) <= 4 \ - || (TARGET_SHMEDIA && GET_MODE_SIZE (MODE) <= 8) \ - || ((TARGET_SH4 || TARGET_SH2A_DOUBLE) && TARGET_FMOVD && MODE == DFmode)) \ + if ((ALLOW_INDEXED_ADDRESS || GET_MODE (X) == DImode \ + || ((xop0 == stack_pointer_rtx || xop0 == frame_pointer_rtx) \ + && REG_P (xop1) && REGNO (xop1) == R0_REG) \ + || ((xop1 == stack_pointer_rtx || xop1 == frame_pointer_rtx) \ + && REG_P (xop0) && REGNO (xop0) == R0_REG)) \ + && ((!TARGET_SHMEDIA && GET_MODE_SIZE (MODE) <= 4) \ + || (TARGET_SHMEDIA && GET_MODE_SIZE (MODE) <= 8) \ + || ((TARGET_SH4 || TARGET_SH2A_DOUBLE) \ + && TARGET_FMOVD && MODE == DFmode))) \ { \ if (BASE_REGISTER_RTX_P (xop1) && INDEX_REGISTER_RTX_P (xop0))\ goto LABEL; \ @@ -2731,7 +2960,10 @@ struct sh_args { && BASE_REGISTER_RTX_P (XEXP (X, 0)) \ && ! TARGET_SHMEDIA \ && ! (TARGET_SH4 && (MODE) == DFmode) \ - && ! ((MODE) == PSImode && (TYPE) == RELOAD_FOR_INPUT_ADDRESS)) \ + && ! ((MODE) == PSImode && (TYPE) == RELOAD_FOR_INPUT_ADDRESS) \ + && (ALLOW_INDEXED_ADDRESS \ + || XEXP ((X), 0) == stack_pointer_rtx \ + || XEXP ((X), 0) == frame_pointer_rtx)) \ { \ rtx index_rtx = XEXP (X, 1); \ HOST_WIDE_INT offset = INTVAL (index_rtx), offset_base; \ @@ -2748,7 +2980,7 @@ struct sh_args { { \ X = copy_rtx (X); \ push_reload (index_rtx, NULL_RTX, &XEXP (X, 1), NULL, \ - INDEX_REG_CLASS, Pmode, VOIDmode, 0, 0, (OPNUM), \ + R0_REGS, Pmode, VOIDmode, 0, 0, (OPNUM), \ (TYPE)); \ goto WIN; \ } \ @@ -2892,7 +3124,9 @@ struct sh_args { #define SHIFT_COUNT_TRUNCATED (! TARGET_SH3 && ! TARGET_SH2A) /* All integers have the same format so truncation is easy. */ -#define TRULY_NOOP_TRUNCATION(OUTPREC,INPREC) 1 +/* But SHmedia must sign-extend DImode when truncating to SImode. */ +#define TRULY_NOOP_TRUNCATION(OUTPREC,INPREC) \ + (!TARGET_SHMEDIA || (INPREC) < 64 || (OUTPREC) >= 64) /* Define this if addresses of constant functions shouldn't be put through pseudo regs where they can be cse'd. @@ -3060,10 +3294,26 @@ struct sh_args { } #define ASM_OUTPUT_REG_PUSH(file, v) \ - fprintf ((file), "\tmov.l\tr%d,@-r15\n", (v)); +{ \ + if (TARGET_SHMEDIA) \ + { \ + fprintf ((file), "\taddi.l\tr15,-8,r15\n"); \ + fprintf ((file), "\tst.q\tr15,0,r%d\n", (v)); \ + } \ + else \ + fprintf ((file), "\tmov.l\tr%d,@-r15\n", (v)); \ +} #define ASM_OUTPUT_REG_POP(file, v) \ - fprintf ((file), "\tmov.l\t@r15+,r%d\n", (v)); +{ \ + if (TARGET_SHMEDIA) \ + { \ + fprintf ((file), "\tld.q\tr15,0,r%d\n", (v)); \ + fprintf ((file), "\taddi.l\tr15,8,r15\n"); \ + } \ + else \ + fprintf ((file), "\tmov.l\t@r15+,r%d\n", (v)); \ +} /* DBX register number for a given compiler register number. */ /* GDB has FPUL at 23 and FP0 at 25, so we must add one to all FP registers @@ -3207,7 +3457,7 @@ struct sh_args { #define PRINT_OPERAND_PUNCT_VALID_P(CHAR) \ ((CHAR) == '.' 
|| (CHAR) == '#' || (CHAR) == '@' || (CHAR) == ',' \ - || (CHAR) == '$'|| (CHAR) == '\'') + || (CHAR) == '$' || (CHAR) == '\'' || (CHAR) == '>') /* Recognize machine-specific patterns that may appear within constants. Used for PIC-specific UNSPECs. */ @@ -3326,8 +3576,6 @@ extern int current_function_interrupt; for interrupt functions. */ extern struct rtx_def *sp_switch; -extern int rtx_equal_function_value_matters; - /* Instructions with unfilled delay slots take up an extra two bytes for the nop in the delay slot. @@ -3339,18 +3587,23 @@ extern int rtx_equal_function_value_matters; /* Define the codes that are matched by predicates in sh.c. */ #define PREDICATE_CODES \ {"and_operand", {SUBREG, REG, CONST_INT}}, \ + {"any_arith_reg_dest", {SUBREG, REG}}, \ {"any_register_operand", {SUBREG, REG}}, \ {"arith_operand", {SUBREG, REG, CONST_INT}}, \ {"arith_reg_dest", {SUBREG, REG}}, \ - {"arith_reg_operand", {SUBREG, REG}}, \ + {"arith_reg_operand", {SUBREG, REG, SIGN_EXTEND}}, \ {"arith_reg_or_0_operand", {SUBREG, REG, CONST_INT, CONST_VECTOR}}, \ {"binary_float_operator", {PLUS, MINUS, MULT, DIV}}, \ {"binary_logical_operator", {AND, IOR, XOR}}, \ + {"cache_address_operand", {PLUS, REG}}, \ + {"cmp_operand", {SUBREG, REG, CONST_INT}}, \ {"cmpsi_operand", {SUBREG, REG, CONST_INT}}, \ {"commutative_float_operator", {PLUS, MULT}}, \ {"equality_comparison_operator", {EQ,NE}}, \ {"extend_reg_operand", {SUBREG, REG, TRUNCATE}}, \ {"extend_reg_or_0_operand", {SUBREG, REG, TRUNCATE, CONST_INT}}, \ + {"ext_dest_operand", {SUBREG, REG}}, \ + {"fp_arith_reg_dest", {SUBREG, REG}}, \ {"fp_arith_reg_operand", {SUBREG, REG}}, \ {"fpscr_operand", {REG}}, \ {"fpul_operand", {REG}}, \ @@ -3359,30 +3612,44 @@ extern int rtx_equal_function_value_matters; {"general_movdst_operand", {SUBREG, REG, MEM}}, \ {"unaligned_load_operand", {MEM}}, \ {"greater_comparison_operator", {GT,GE,GTU,GEU}}, \ - {"int_gpr_dest", {SUBREG, REG}}, \ {"inqhi_operand", {TRUNCATE}}, \ + {"int_gpr_dest", {SUBREG, REG}}, \ {"less_comparison_operator", {LT,LE,LTU,LEU}}, \ {"logical_operand", {SUBREG, REG, CONST_INT}}, \ + {"logical_operator", {AND,IOR,XOR}}, \ + {"logical_reg_operand", {SUBREG, REG}}, \ {"mextr_bit_offset", {CONST_INT}}, \ + {"minuend_operand", {SUBREG, REG, TRUNCATE, CONST_INT}}, \ {"noncommutative_float_operator", {MINUS, DIV}}, \ - {"shmedia_6bit_operand", {SUBREG, REG, CONST_INT}}, \ + {"sh_const_vec", {CONST_VECTOR}}, \ + {"sh_1el_vec", {CONST_VECTOR}}, \ {"sh_register_operand", {REG, SUBREG, CONST_INT}}, \ - {"target_reg_operand", {SUBREG, REG}}, \ + {"sh_rep_vec", {CONST_VECTOR}}, \ + {"shift_count_operand", {CONST_INT, CONST_DOUBLE, CONST, SYMBOL_REF, \ + LABEL_REF, SUBREG, REG, ZERO_EXTEND, SIGN_EXTEND}},\ + {"shift_count_reg_operand", {SUBREG, REG, ZERO_EXTEND, SIGN_EXTEND}}, \ + {"shift_operator", {ASHIFT, ASHIFTRT, LSHIFTRT}}, \ + {"symbol_ref_operand", {SYMBOL_REF}}, \ {"target_operand", {SUBREG, REG, LABEL_REF, SYMBOL_REF, CONST, UNSPEC}},\ + {"target_reg_operand", {SUBREG, REG}}, \ {"trunc_hi_operand", {SUBREG, REG, TRUNCATE}}, \ - {"sh_const_vec", {CONST_VECTOR}}, \ - {"sh_1el_vec", {CONST_VECTOR, PARALLEL}}, \ - {"sh_rep_vec", {CONST_VECTOR, PARALLEL}}, \ - {"symbol_ref_operand", {SYMBOL_REF}}, \ + {"ua_address_operand", {SUBREG, REG, PLUS}}, \ + {"ua_offset", {CONST_INT}}, \ {"unary_float_operator", {ABS, NEG, SQRT}}, \ + {"xor_operand", {SUBREG, REG, CONST_INT}}, \ #define SPECIAL_MODE_PREDICATES \ + "any_arith_reg_dest", \ "any_register_operand", \ "int_gpr_dest", \ + "target_operand", \ + 
"target_reg_operand", \ "trunc_hi_operand", \ /* This line intentionally left blank. */ #define any_register_operand register_operand +#define any_arith_reg_dest arith_reg_dest +#define ext_dest_operand arith_reg_operand /* Define this macro if it is advisable to hold scalars in registers in a wider mode than that declared by the program. In such cases, @@ -3395,12 +3662,15 @@ extern int rtx_equal_function_value_matters; load instructions. */ #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE) \ if (GET_MODE_CLASS (MODE) == MODE_INT \ - && GET_MODE_SIZE (MODE) < UNITS_PER_WORD) \ + && GET_MODE_SIZE (MODE) < 4/* ! UNITS_PER_WORD */)\ (UNSIGNEDP) = ((MODE) == SImode ? 0 : (UNSIGNEDP)), \ - (MODE) = (TARGET_SH1 ? SImode : DImode); + (MODE) = (TARGET_SH1 ? SImode \ + : TARGET_SHMEDIA32 ? SImode : DImode); #define MAX_FIXED_MODE_SIZE (TARGET_SH5 ? 128 : 64) +#define SIDI_OFF (TARGET_LITTLE_ENDIAN ? 0 : 4) + /* ??? Define ACCUMULATE_OUTGOING_ARGS? This is more efficient than pushing and popping arguments. However, we do have push/pop instructions, and rather limited offsets (4 bits) in load/store instructions, so it isn't @@ -3507,4 +3777,15 @@ extern int rtx_equal_function_value_matters; : gen_rtx_MEM (Pmode, return_address_pointer_rtx)) \ : NULL_RTX) +#define SIMULTANEOUS_PREFETCHES 2 + +extern const char *sh_multcost_str; +extern const char *sh_gettrcost_str; +extern const char *sh_div_str; +extern const char *sh_divsi3_libfunc; +extern const char *cut2_workaround_str; + +/* FIXME: middle-end support for highpart optimizations is missing. */ +#define high_life_started reload_in_progress + #endif /* ! GCC_SH_H */ diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md index a10774a..db00ec3 100644 --- a/gcc/config/sh/sh.md +++ b/gcc/config/sh/sh.md @@ -143,6 +143,13 @@ (UNSPEC_GOTTPOFF 24) (UNSPEC_TPOFF 25) (UNSPEC_RA 26) + (UNSPEC_DIV_INV_M0 30) + (UNSPEC_DIV_INV_M1 31) + (UNSPEC_DIV_INV_M2 32) + (UNSPEC_DIV_INV_M3 33) + (UNSPEC_DIV_INV20 34) + (UNSPEC_ASHIFTRT 35) + (UNSPEC_THUNK 36) ;; These are used with unspec_volatile. (UNSPECV_BLOCKAGE 0) @@ -247,7 +254,7 @@ ;; mcmp_media SHmedia multimedia compare, absolute, saturating ops ;; mac_media SHmedia mac-style fixed point operations ;; d2mpy_media SHmedia: two 32 bit integer multiplies -;; atrans SHmedia approximate transcendental functions +;; atrans_media SHmedia approximate transcendental functions ;; ustore_media SHmedia unaligned stores ;; nil no-op move, will be deleted. @@ -424,6 +431,9 @@ (eq_attr "type" "pt_media") (if_then_else (ne (symbol_ref "TARGET_SHMEDIA64") (const_int 0)) (const_int 20) (const_int 12)) + (and (eq_attr "type" "jump_media") + (ne (symbol_ref "TARGET_SH5_CUT2_WORKAROUND") (const_int 0))) + (const_int 8) ] (if_then_else (ne (symbol_ref "TARGET_SHMEDIA") (const_int 0)) (const_int 4) (const_int 2)))) @@ -434,6 +444,8 @@ (include "shmedia.md") (include "sh4.md") +(include "predicates.md") + ;; Definitions for filling delay slots (define_attr "needs_delay_slot" "yes,no" (const_string "no")) @@ -494,6 +506,9 @@ (const_string "yes")] (const_string "no"))) +(define_attr "highpart" "user, ignore, extend, depend, must_split" + (const_string "user")) + (define_delay (eq_attr "needs_delay_slot" "yes") [(eq_attr "in_delay_slot" "yes") (nil) (nil)]) @@ -537,7 +552,7 @@ ;; SH2e has a hardware bug that pretty much prohibits the use of ;; annuled delay slots. 
[(eq_attr "cond_delay_slot" "yes") (and (eq_attr "cond_delay_slot" "yes") - (not (eq_attr "cpu" "sh2e"))) (nil)]) + (not (eq_attr "cpu" "sh2e"))) (nil)]) ;; ------------------------------------------------------------------------- ;; SImode signed integer comparisons @@ -616,7 +631,7 @@ [(set (reg:SI T_REG) (compare (match_operand:SI 0 "cmpsi_operand" "") (match_operand:SI 1 "arith_operand" "")))] - "TARGET_SH1" + "TARGET_SH1 || TARGET_SHMEDIA" " { if (GET_CODE (operands[0]) == REG && REGNO (operands[0]) == T_REG @@ -731,14 +746,30 @@ [(set_attr "length" "8") (set_attr "type" "arith3")]) +(define_insn "cmpeqsi_media" + [(set (match_operand:DI 0 "register_operand" "=r") + (eq:DI (match_operand:SI 1 "logical_operand" "%r") + (match_operand:SI 2 "cmp_operand" "Nr")))] + "TARGET_SHMEDIA" + "cmpeq %1, %N2, %0" + [(set_attr "type" "cmp_media")]) + (define_insn "cmpeqdi_media" [(set (match_operand:DI 0 "register_operand" "=r") (eq:DI (match_operand:DI 1 "register_operand" "%r") - (match_operand:DI 2 "arith_reg_or_0_operand" "Nr")))] + (match_operand:DI 2 "cmp_operand" "Nr")))] "TARGET_SHMEDIA" "cmpeq %1, %N2, %0" [(set_attr "type" "cmp_media")]) +(define_insn "cmpgtsi_media" + [(set (match_operand:DI 0 "register_operand" "=r") + (gt:DI (match_operand:SI 1 "cmp_operand" "Nr") + (match_operand:SI 2 "cmp_operand" "rN")))] + "TARGET_SHMEDIA" + "cmpgt %N1, %N2, %0" + [(set_attr "type" "cmp_media")]) + (define_insn "cmpgtdi_media" [(set (match_operand:DI 0 "register_operand" "=r") (gt:DI (match_operand:DI 1 "arith_reg_or_0_operand" "Nr") @@ -747,6 +778,14 @@ "cmpgt %N1, %N2, %0" [(set_attr "type" "cmp_media")]) +(define_insn "cmpgtusi_media" + [(set (match_operand:DI 0 "register_operand" "=r") + (gtu:DI (match_operand:SI 1 "cmp_operand" "Nr") + (match_operand:SI 2 "cmp_operand" "rN")))] + "TARGET_SHMEDIA" + "cmpgtu %N1, %N2, %0" + [(set_attr "type" "cmp_media")]) + (define_insn "cmpgtudi_media" [(set (match_operand:DI 0 "register_operand" "=r") (gtu:DI (match_operand:DI 1 "arith_reg_or_0_operand" "Nr") @@ -755,6 +794,69 @@ "cmpgtu %N1, %N2, %0" [(set_attr "type" "cmp_media")]) +(define_insn "cmpsieqsi_media" + [(set (match_operand:SI 0 "register_operand" "=r") + (eq:SI (match_operand:SI 1 "logical_operand" "%r") + (match_operand:SI 2 "cmp_operand" "Nr")))] + "TARGET_SHMEDIA" + "cmpeq %1, %N2, %0" + [(set_attr "type" "cmp_media")]) + +(define_insn "cmpsieqdi_media" + [(set (match_operand:SI 0 "register_operand" "=r") + (eq:SI (match_operand:DI 1 "register_operand" "%r") + (match_operand:DI 2 "cmp_operand" "Nr")))] + "TARGET_SHMEDIA" + "cmpeq %1, %N2, %0" + [(set_attr "type" "cmp_media")]) + +(define_insn "cmpsigtsi_media" + [(set (match_operand:SI 0 "register_operand" "=r") + (gt:SI (match_operand:SI 1 "cmp_operand" "Nr") + (match_operand:SI 2 "cmp_operand" "rN")))] + "TARGET_SHMEDIA" + "cmpgt %N1, %N2, %0" + [(set_attr "type" "cmp_media")]) + +(define_insn "cmpsigtdi_media" + [(set (match_operand:SI 0 "register_operand" "=r") + (gt:SI (match_operand:DI 1 "arith_reg_or_0_operand" "Nr") + (match_operand:DI 2 "arith_reg_or_0_operand" "rN")))] + "TARGET_SHMEDIA" + "cmpgt %N1, %N2, %0" + [(set_attr "type" "cmp_media")]) + +(define_insn "cmpsigtusi_media" + [(set (match_operand:SI 0 "register_operand" "=r") + (gtu:SI (match_operand:SI 1 "cmp_operand" "Nr") + (match_operand:SI 2 "cmp_operand" "rN")))] + "TARGET_SHMEDIA" + "cmpgtu %N1, %N2, %0" + [(set_attr "type" "cmp_media")]) + +(define_insn "cmpsigtudi_media" + [(set (match_operand:SI 0 "register_operand" "=r") + (gtu:SI (match_operand:DI 1 
"arith_reg_or_0_operand" "Nr") + (match_operand:DI 2 "arith_reg_or_0_operand" "rN")))] + "TARGET_SHMEDIA" + "cmpgtu %N1, %N2, %0" + [(set_attr "type" "cmp_media")]) + +; These two patterns are for combine. +(define_insn "*cmpne0si_media" + [(set (match_operand:DI 0 "register_operand" "=r") + (ne:DI (match_operand:SI 1 "arith_reg_operand" "r") (const_int 0)))] + "TARGET_SHMEDIA" + "cmpgtu %1,r63,%0" + [(set_attr "type" "cmp_media")]) + +(define_insn "*cmpne0sisi_media" + [(set (match_operand:SI 0 "register_operand" "=r") + (ne:SI (match_operand:SI 1 "arith_reg_operand" "r") (const_int 0)))] + "TARGET_SHMEDIA" + "cmpgtu %1,r63,%0" + [(set_attr "type" "cmp_media")]) + ;; We save the compare operands in the cmpxx patterns and use them when ;; we generate the branch. @@ -796,6 +898,37 @@ "cmvne %1, %N2, %0" [(set_attr "type" "arith_media")]) +(define_peephole2 + [(set (match_operand:DI 0 "arith_reg_dest" "") + (if_then_else:DI (match_operator 3 "equality_comparison_operator" + [(match_operand:DI 1 "arith_reg_operand" "") + (const_int 0)]) + (match_operand:DI 2 "arith_reg_dest" "") + (match_dup 0))) + (set (match_dup 2) (match_dup 0))] + "TARGET_SHMEDIA && peep2_reg_dead_p (2, operands[0])" + [(set (match_dup 2) + (if_then_else:DI (match_dup 3) (match_dup 0) (match_dup 2)))] + " +{ + operands[3] = gen_rtx_fmt_ee (reverse_condition (GET_CODE (operands[3])), + VOIDmode, operands[1], CONST0_RTX (DImode)); +}") + +(define_peephole2 + [(set (match_operand:DI 0 "general_movdst_operand" "") + (match_operand:DI 1 "arith_reg_or_0_operand" "")) + (set (match_operand:DI 2 "arith_reg_dest" "") + (if_then_else:DI (match_operator 4 "equality_comparison_operator" + [(match_operand:DI 3 "arith_reg_operand" "") + (const_int 0)]) + (match_dup 0) + (match_dup 2)))] + "TARGET_SHMEDIA && peep2_reg_dead_p (2, operands[0])" + [(set (match_dup 2) + (if_then_else:DI (match_dup 4) (match_dup 1) (match_dup 2)))] + "") + (define_expand "movdicc" [(set (match_operand:DI 0 "register_operand" "") (if_then_else:DI (match_operand 1 "comparison_operator" "") @@ -893,6 +1026,270 @@ } } }") + +;; Add SImode variants for cmveq / cmvne to compensate for not promoting +;; SImode to DImode. 
+(define_insn "movsicc_false" + [(set (match_operand:SI 0 "arith_reg_dest" "=r") + (if_then_else:SI (eq (match_operand:SI 1 "arith_reg_operand" "r") + (const_int 0)) + (match_operand:SI 2 "arith_reg_or_0_operand" "rN") + (match_operand:SI 3 "arith_reg_operand" "0")))] + "TARGET_SHMEDIA" + "cmveq %1, %N2, %0" + [(set_attr "type" "arith_media")]) + +(define_insn "movsicc_true" + [(set (match_operand:SI 0 "arith_reg_dest" "=r") + (if_then_else:SI (ne (match_operand:SI 1 "arith_reg_operand" "r") + (const_int 0)) + (match_operand:SI 2 "arith_reg_or_0_operand" "rN") + (match_operand:SI 3 "arith_reg_operand" "0")))] + "TARGET_SHMEDIA" + "cmvne %1, %N2, %0" + [(set_attr "type" "arith_media")]) + +(define_peephole2 + [(set (match_operand:SI 0 "arith_reg_dest" "") + (if_then_else:SI (match_operator 3 "equality_comparison_operator" + [(match_operand:SI 1 "arith_reg_operand" "") + (const_int 0)]) + (match_operand:SI 2 "arith_reg_dest" "") + (match_dup 0))) + (set (match_dup 2) (match_dup 0))] + "TARGET_SHMEDIA && peep2_reg_dead_p (2, operands[0])" + [(set (match_dup 2) + (if_then_else:SI (match_dup 3) (match_dup 0) (match_dup 2)))] + " +{ + operands[3] = gen_rtx_fmt_ee (reverse_condition (GET_CODE (operands[3])), + VOIDmode, operands[1], CONST0_RTX (SImode)); +}") + +(define_peephole2 + [(set (match_operand:SI 0 "general_movdst_operand" "") + (match_operand:SI 1 "arith_reg_or_0_operand" "")) + (set (match_operand:SI 2 "arith_reg_dest" "") + (if_then_else:SI (match_operator 4 "equality_comparison_operator" + [(match_operand:SI 3 "arith_reg_operand" "") + (const_int 0)]) + (match_dup 0) + (match_dup 2)))] + "TARGET_SHMEDIA && peep2_reg_dead_p (2, operands[0]) + && (GET_CODE (operands[1]) != REG || GENERAL_REGISTER_P (REGNO (operands[1])))" + [(set (match_dup 2) + (if_then_else:SI (match_dup 4) (match_dup 1) (match_dup 2)))] + " +{ + replace_rtx (operands[4], operands[0], operands[1]); +}") + +(define_peephole2 + [(set (match_operand 0 "any_register_operand" "") + (match_operand 1 "any_register_operand" "")) + (set (match_operand 2 "any_register_operand" "") (match_operand 3 "" "")) + (set (match_operand 4 "" "") (match_operand 5 "" ""))] + "(HARD_REGNO_NREGS (REGNO (operands[0]), GET_MODE (operands[2])) + <= HARD_REGNO_NREGS (REGNO (operands[0]), GET_MODE (operands[0]))) + && peep2_reg_dead_p (3, operands[0]) && peep2_reg_dead_p (3, operands[2]) + && ! reg_overlap_mentioned_p (operands[0], operands[3]) + && ! reg_overlap_mentioned_p (operands[2], operands[0]) + && ! reg_overlap_mentioned_p (operands[0], operands[1]) + && (REGNO_REG_CLASS (REGNO (operands[0])) + == REGNO_REG_CLASS (REGNO (operands[2]))) + && (REGNO_REG_CLASS (REGNO (operands[1])) + == REGNO_REG_CLASS (REGNO (operands[0])))" + [(set (match_dup 0) (match_dup 3)) + (set (match_dup 4) (match_dup 5))] + " +{ + rtx set1, set2; + rtx replacements[4]; + + /* We want to replace occurences of operands[0] with operands[1] and + operands[2] with operands[0] in operands[4]/operands[5]. + Doing just two replace_rtx calls naiively would result in the second + replacement undoing all that the first did if operands[1] and operands[2] + are identical, so we must do this simultaneously. 
*/ + replacements[0] = operands[0]; + replacements[1] = operands[1]; + replacements[2] = operands[2]; + replacements[3] = operands[0]; + if (!replace_n_hard_rtx (operands[5], replacements, 2, 0) + || !replace_n_hard_rtx (operands[4], replacements, 2, 0) + || !replace_n_hard_rtx (operands[2], replacements, 2, 0)) + FAIL; + + operands[5] = replace_n_hard_rtx (operands[5], replacements, 2, 1); + replace_n_hard_rtx (operands[4], replacements, 2, 1); + operands[2] = replace_n_hard_rtx (operands[2], replacements, 2, 1); + /* The operands array is aliased to recog_data.operand, which gets + clobbered by extract_insn, so finish with it now. */ + set1 = gen_rtx_SET (VOIDmode, operands[2], operands[3]); + set2 = gen_rtx_SET (VOIDmode, operands[4], operands[5]); + /* ??? The last insn might be a jump insn, but the generic peephole2 code + always uses emit_insn. */ + /* Check that we don't violate matching constraints or earlyclobbers. */ + extract_insn (emit_insn (set1)); + if (! constrain_operands (1)) + goto failure; + extract_insn (emit (set2)); + if (! constrain_operands (1)) + { + rtx tmp; + failure: + tmp = replacements[0]; + replacements[0] = replacements[1]; + replacements[1] = tmp; + tmp = replacements[2]; + replacements[2] = replacements[3]; + replacements[3] = tmp; + replace_n_hard_rtx (SET_DEST (set1), replacements, 2, 1); + replace_n_hard_rtx (SET_DEST (set2), replacements, 2, 1); + replace_n_hard_rtx (SET_SRC (set2), replacements, 2, 1); + FAIL; + } + DONE; +}") + +;; The register allocator is rather clumsy in handling multi-way conditional +;; moves, so allow the combiner to make them, and we split them up after +;; reload. */ +(define_insn_and_split "*movsicc_umin" + [(set (match_operand:SI 0 "arith_reg_dest" "=&r") + (umin:SI (if_then_else:SI + (eq (match_operand:SI 1 "arith_reg_operand" "r") + (const_int 0)) + (match_operand:SI 2 "arith_reg_or_0_operand" "rN") + (match_operand:SI 3 "register_operand" "0")) + (match_operand:SI 4 "arith_reg_or_0_operand" "r"))) + (clobber (match_scratch:SI 5 "=&r"))] + "TARGET_SHMEDIA && no_new_pseudos" + "#" + "TARGET_SHMEDIA && reload_completed" + [(pc)] + " +{ + emit_insn (gen_movsicc_false (operands[0], operands[1], operands[2], + operands[3])); + emit_insn (gen_cmpsigtusi_media (operands[5], operands[4], operands[0])); + emit_insn (gen_movsicc_false (operands[0], operands[5], operands[4], + operands[0])); + DONE; +}") + +(define_expand "movsicc" + [(set (match_operand:SI 0 "register_operand" "") + (if_then_else:SI (match_operand 1 "comparison_operator" "") + (match_operand:SI 2 "register_operand" "") + (match_operand:SI 3 "register_operand" "")))] + "TARGET_SHMEDIA" + " +{ + if ((GET_CODE (operands[1]) == EQ || GET_CODE (operands[1]) == NE) + && GET_MODE (sh_compare_op0) == SImode + && sh_compare_op1 == const0_rtx) + operands[1] = gen_rtx_fmt_ee (GET_CODE (operands[1]), VOIDmode, + sh_compare_op0, sh_compare_op1); + else + { + rtx tmp; + + if (no_new_pseudos) + FAIL; + + tmp = gen_reg_rtx (SImode); + + switch (GET_CODE (operands[1])) + { + case EQ: + emit_insn (gen_seq (tmp)); + operands[1] = gen_rtx_NE (VOIDmode, tmp, const0_rtx); + break; + + case NE: + emit_insn (gen_seq (tmp)); + operands[1] = gen_rtx_EQ (VOIDmode, tmp, const0_rtx); + break; + + case GT: + emit_insn (gen_sgt (tmp)); + operands[1] = gen_rtx_NE (VOIDmode, tmp, const0_rtx); + break; + + case LT: + emit_insn (gen_slt (tmp)); + operands[1] = gen_rtx_NE (VOIDmode, tmp, const0_rtx); + break; + + case GE: + emit_insn (gen_slt (tmp)); + operands[1] = gen_rtx_EQ (VOIDmode, tmp, 
const0_rtx); + break; + + case LE: + emit_insn (gen_sgt (tmp)); + operands[1] = gen_rtx_EQ (VOIDmode, tmp, const0_rtx); + break; + + case GTU: + emit_insn (gen_sgtu (tmp)); + operands[1] = gen_rtx_NE (VOIDmode, tmp, const0_rtx); + break; + + case LTU: + emit_insn (gen_sltu (tmp)); + operands[1] = gen_rtx_NE (VOIDmode, tmp, const0_rtx); + break; + + case GEU: + emit_insn (gen_sltu (tmp)); + operands[1] = gen_rtx_EQ (VOIDmode, tmp, const0_rtx); + break; + + case LEU: + emit_insn (gen_sgtu (tmp)); + operands[1] = gen_rtx_EQ (VOIDmode, tmp, const0_rtx); + break; + + case UNORDERED: + emit_insn (gen_sunordered (tmp)); + operands[1] = gen_rtx_NE (VOIDmode, tmp, const0_rtx); + break; + + case ORDERED: + emit_insn (gen_sunordered (tmp)); + operands[1] = gen_rtx_EQ (VOIDmode, tmp, const0_rtx); + break; + + case UNEQ: + case UNGE: + case UNGT: + case UNLE: + case UNLT: + case LTGT: + FAIL; + + default: + abort (); + } + } +}") + +(define_expand "movqicc" + [(set (match_operand:QI 0 "register_operand" "") + (if_then_else:QI (match_operand 1 "comparison_operator" "") + (match_operand:QI 2 "register_operand" "") + (match_operand:QI 3 "register_operand" "")))] + "TARGET_SHMEDIA" + " +{ + operands[0] = simplify_gen_subreg (SImode, operands[0], QImode, 0); + operands[2] = simplify_gen_subreg (SImode, operands[2], QImode, 0); + operands[3] = simplify_gen_subreg (SImode, operands[3], QImode, 0); + emit (gen_movsicc (operands[0], operands[1], operands[2], operands[3])); + DONE; +}") ;; ------------------------------------------------------------------------- ;; Addition instructions @@ -916,7 +1313,7 @@ }") (define_insn "*adddi3_media" - [(set (match_operand:DI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r,r") (plus:DI (match_operand:DI 1 "arith_reg_operand" "%r,r") (match_operand:DI 2 "arith_operand" "r,I10")))] "TARGET_SHMEDIA" @@ -925,17 +1322,29 @@ addi %1, %2, %0" [(set_attr "type" "arith_media")]) +(define_insn "*adddisi3_media" + [(set (subreg:DI (match_operand:SI 0 "arith_reg_operand" "=r,r") 0) + (plus:DI (match_operand:DI 1 "arith_reg_operand" "%r,r") + (match_operand:DI 2 "arith_operand" "r,I10")))] + "TARGET_SHMEDIA" + "@ + add.l %1, %2, %0 + addi.l %1, %2, %0" + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) + (define_insn "adddi3z_media" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (zero_extend:DI (plus:SI (match_operand:SI 1 "extend_reg_operand" "r") (match_operand:SI 2 "extend_reg_or_0_operand" "rN"))))] "TARGET_SHMEDIA" "addz.l %1, %N2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) (define_insn "adddi3_compact" - [(set (match_operand:DI 0 "arith_reg_operand" "=&r") + [(set (match_operand:DI 0 "arith_reg_dest" "=&r") (plus:DI (match_operand:DI 1 "arith_reg_operand" "%0") (match_operand:DI 2 "arith_reg_operand" "r"))) (clobber (reg:SI T_REG))] @@ -944,7 +1353,7 @@ [(set_attr "length" "6")]) (define_split - [(set (match_operand:DI 0 "arith_reg_operand" "") + [(set (match_operand:DI 0 "arith_reg_dest" "") (plus:DI (match_operand:DI 1 "arith_reg_operand" "") (match_operand:DI 2 "arith_reg_operand" ""))) (clobber (reg:SI T_REG))] @@ -966,7 +1375,7 @@ }") (define_insn "addc" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (plus:SI (plus:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "arith_reg_operand" "r")) (reg:SI T_REG))) @@ -977,7 +1386,7 @@ 
[(set_attr "type" "arith")]) (define_insn "addc1" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (plus:SI (plus:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "arith_reg_operand" "r")) (reg:SI T_REG))) @@ -998,17 +1407,31 @@ }") (define_insn "addsi3_media" - [(set (match_operand:SI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r") (plus:SI (match_operand:SI 1 "extend_reg_operand" "%r,r") (match_operand:SI 2 "arith_operand" "r,I10")))] "TARGET_SHMEDIA" "@ add.l %1, %2, %0 addi.l %1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) + +(define_insn "addsidi3_media" + [(set (match_operand:DI 0 "arith_reg_dest" "=r,r") + (sign_extend:DI (plus:SI (match_operand:SI 1 "extend_reg_operand" + "%r,r") + (match_operand:SI 2 "arith_operand" + "r,I10"))))] + "TARGET_SHMEDIA" + "@ + add.l %1, %2, %0 + addi.l %1, %2, %0" + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) (define_insn "*addsi3_compact" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (plus:SI (match_operand:SI 1 "arith_operand" "%0") (match_operand:SI 2 "arith_operand" "rI08")))] "TARGET_SH1" @@ -1035,15 +1458,24 @@ }") (define_insn "*subdi3_media" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (minus:DI (match_operand:DI 1 "arith_reg_or_0_operand" "rN") (match_operand:DI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "sub %N1, %2, %0" [(set_attr "type" "arith_media")]) + +(define_insn "subdisi3_media" + [(set (subreg:DI (match_operand:SI 0 "arith_reg_operand" "=r") 0) + (minus:DI (match_operand:DI 1 "arith_reg_or_0_operand" "rN") + (match_operand:DI 2 "arith_reg_operand" "r")))] + "TARGET_SHMEDIA" + "sub.l %N1, %2, %0" + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) (define_insn "subdi3_compact" - [(set (match_operand:DI 0 "arith_reg_operand" "=&r") + [(set (match_operand:DI 0 "arith_reg_dest" "=&r") (minus:DI (match_operand:DI 1 "arith_reg_operand" "0") (match_operand:DI 2 "arith_reg_operand" "r"))) (clobber (reg:SI T_REG))] @@ -1052,7 +1484,7 @@ [(set_attr "length" "6")]) (define_split - [(set (match_operand:DI 0 "arith_reg_operand" "") + [(set (match_operand:DI 0 "arith_reg_dest" "") (minus:DI (match_operand:DI 1 "arith_reg_operand" "") (match_operand:DI 2 "arith_reg_operand" ""))) (clobber (reg:SI T_REG))] @@ -1074,7 +1506,7 @@ }") (define_insn "subc" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (minus:SI (minus:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "arith_reg_operand" "r")) (reg:SI T_REG))) @@ -1087,7 +1519,7 @@ [(set_attr "type" "arith")]) (define_insn "subc1" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (minus:SI (minus:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "arith_reg_operand" "r")) (reg:SI T_REG))) @@ -1096,22 +1528,57 @@ "subc %2,%0" [(set_attr "type" "arith")]) +;; life_analysis thinks rn is live before subc rn,rn, so make a special +;; pattern for this case. This helps multimedia applications that compute +;; the sum of absolute differences. 
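;; Illustrative aside (not part of the patch): "subc rn,rn" leaves -T in rn,
;; i.e. 0 or an all-ones mask, which is the building block of the branch-free
;; absolute-difference idiom mentioned above.  A C sketch of that idiom, with
;; a made-up helper name, assuming 32-bit unsigned arithmetic:

static unsigned int
abs_diff (unsigned int a, unsigned int b)
{
  unsigned int d = a - b;                       /* wraps around when a < b */
  unsigned int mask = -(unsigned int) (a < b);  /* 0 or 0xffffffff, like -T */
  return (d ^ mask) - mask;                     /* conditional negate, no branch */
}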
+(define_insn "mov_neg_si_t" + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (neg:SI (reg:SI T_REG)))] + "TARGET_SH1" + "subc %0,%0" + [(set_attr "type" "arith")]) + (define_insn "*subsi3_internal" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (minus:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "arith_reg_operand" "r")))] "TARGET_SH1" "sub %2,%0" [(set_attr "type" "arith")]) -(define_insn "*subsi3_media" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") - (minus:SI (match_operand:SI 1 "extend_reg_or_0_operand" "rN") +(define_insn_and_split "*subsi3_media" + [(set (match_operand:SI 0 "arith_reg_dest" "=r") + (minus:SI (match_operand:SI 1 "minuend_operand" "rN") (match_operand:SI 2 "extend_reg_operand" "r")))] - "TARGET_SHMEDIA" + "TARGET_SHMEDIA + && (operands[1] != constm1_rtx + || (GET_CODE (operands[2]) != TRUNCATE + && GET_CODE (operands[2]) != SUBREG))" "sub.l %N1, %2, %0" - [(set_attr "type" "arith_media")]) + "operands[1] == constm1_rtx" + [(set (match_dup 0) (xor:SI (match_dup 2) (match_dup 1)))] + "" + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) +(define_split + [(set (match_operand:SI 0 "arith_reg_dest" "") + (zero_extend:SI (subreg:QI (not:SI (subreg:SI (match_operand:QI 1 + "general_extend_operand" + "") 0)) 0)))] + "TARGET_SHMEDIA && TARGET_LITTLE_ENDIAN" + [(set (match_dup 0) (zero_extend:SI (match_dup 1))) + (set (match_dup 0) (xor:SI (match_dup 0) (const_int 255)))] + "") + +(define_split + [(set (match_operand:SI 0 "arith_reg_dest" "") + (zero_extend:SI (subreg:QI (not:SI (subreg:SI (match_operand:QI 1 + "general_extend_operand" + "") 0)) 3)))] + "TARGET_SHMEDIA && ! TARGET_LITTLE_ENDIAN" + [(set (match_dup 0) (zero_extend:SI (match_dup 1))) + (set (match_dup 0) (xor:SI (match_dup 0) (const_int 255)))] + "") ;; Convert `constant - reg' to `neg rX; add rX, #const' since this ;; will sometimes save one instruction. Otherwise we might get ;; `mov #const, rY; sub rY,rX; mov rX, rY' if the source and dest regs @@ -1134,7 +1601,7 @@ { if (no_new_pseudos && ! arith_reg_or_0_operand (operands[1], SImode)) FAIL; - if (operands[1] != const0_rtx) + if (operands[1] != const0_rtx && GET_CODE (operands[1]) != SUBREG) operands[1] = force_reg (SImode, operands[1]); } }") @@ -1159,7 +1626,7 @@ [(set_attr "length" "0")]) (define_insn "udivsi3_sh2a" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (udiv:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "arith_reg_operand" "z")))] "TARGET_SH2A" @@ -1204,8 +1671,8 @@ (clobber (reg:DI TR0_REG)) (clobber (reg:DI TR1_REG)) (clobber (reg:DI TR2_REG)) - (use (match_operand:DI 1 "target_operand" "b"))] - "TARGET_SHMEDIA && ! TARGET_SHMEDIA_FPU" + (use (match_operand 1 "target_operand" "b"))] + "TARGET_SHMEDIA && (! TARGET_SHMEDIA_FPU || ! TARGET_DIVIDE_FP)" "blink %1, r18" [(set_attr "type" "sfunc") (set_attr "needs_delay_slot" "yes")]) @@ -1290,7 +1757,7 @@ /* Emit the move of the address to a pseudo outside of the libcall. */ if (TARGET_HARD_SH4 && TARGET_SH2E) { - emit_move_insn (operands[3], function_symbol (\"__udivsi3_i4\")); + function_symbol (operands[3], \"__udivsi3_i4\", SFUNC_STATIC); if (TARGET_FPU_SINGLE) last = gen_udivsi3_i4_single (operands[0], operands[3]); else @@ -1312,17 +1779,12 @@ } else if (TARGET_SH5) { - emit_move_insn (operands[3], - function_symbol (TARGET_FPU_ANY - ? 
\"__udivsi3_i4\" - : \"__udivsi3\")); + function_symbol (operands[3], + TARGET_FPU_ANY ? \"__udivsi3_i4\" : \"__udivsi3\", + SFUNC_STATIC); if (TARGET_SHMEDIA) - last = gen_udivsi3_i1_media (operands[0], - Pmode == DImode - ? operands[3] - : gen_rtx_SUBREG (DImode, operands[3], - 0)); + last = gen_udivsi3_i1_media (operands[0], operands[3]); else if (TARGET_FPU_ANY) last = gen_udivsi3_i4_single (operands[0], operands[3]); else @@ -1330,7 +1792,7 @@ } else { - emit_move_insn (operands[3], function_symbol (\"__udivsi3\")); + function_symbol (operands[3], \"__udivsi3\", SFUNC_STATIC); last = gen_udivsi3_i1 (operands[0], operands[3]); } first = emit_move_insn (gen_rtx_REG (SImode, 4), operands[1]); @@ -1344,7 +1806,7 @@ }") (define_insn "divsi3_sh2a" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (div:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "arith_reg_operand" "z")))] "TARGET_SH2A" @@ -1366,30 +1828,110 @@ [(set_attr "type" "sfunc") (set_attr "needs_delay_slot" "yes")]) -; Since shmedia-nofpu code could be linked against shcompact code, and -; the sdivsi3 libcall has the same name, we must consider all registers -; clobbered that are in the union of the registers clobbered by the -; shmedia and the shcompact implementation. Note, if the shcompact -; implementation actually used shcompact code, we'd need to clobber -; also r22, r23 and fr23. (define_insn "divsi3_i1_media" [(set (match_operand:SI 0 "register_operand" "=z") (div:SI (reg:SI R4_REG) (reg:SI R5_REG))) (clobber (reg:SI T_MEDIA_REG)) (clobber (reg:SI PR_MEDIA_REG)) (clobber (reg:SI R1_REG)) - (clobber (reg:SI R2_REG)) - (clobber (reg:SI R3_REG)) (clobber (reg:SI R20_REG)) (clobber (reg:SI R21_REG)) - (clobber (reg:DI TR0_REG)) - (clobber (reg:DI TR1_REG)) - (clobber (reg:DI TR2_REG)) - (use (match_operand:DI 1 "target_operand" "b"))] - "TARGET_SHMEDIA && ! TARGET_SHMEDIA_FPU" + (clobber (reg:SI TR0_REG)) + (use (match_operand 1 "target_operand" "b"))] + "TARGET_SHMEDIA && (! TARGET_SHMEDIA_FPU || ! TARGET_DIVIDE_FP)" "blink %1, r18" [(set_attr "type" "sfunc")]) +(define_insn "divsi3_media_2" + [(set (match_operand:SI 0 "register_operand" "=z") + (div:SI (reg:SI R4_REG) (reg:SI R5_REG))) + (clobber (reg:SI T_MEDIA_REG)) + (clobber (reg:SI PR_MEDIA_REG)) + (clobber (reg:SI R1_REG)) + (clobber (reg:SI R21_REG)) + (clobber (reg:SI TR0_REG)) + (use (reg:SI R20_REG)) + (use (match_operand 1 "target_operand" "b"))] + "TARGET_SHMEDIA && (! TARGET_SHMEDIA_FPU || ! TARGET_DIVIDE_FP)" + "blink %1, r18" + [(set_attr "type" "sfunc")]) + +;; This pattern acts as a placeholder for -mdiv=inv:call to carry +;; hard reg clobbers and data dependencies that we need when we want +;; to rematerialize the division into a call. +(define_insn_and_split "divsi_inv_call" + [(set (match_operand:SI 0 "register_operand" "=r") + (div:SI (match_operand:SI 1 "register_operand" "r") + (match_operand:SI 2 "register_operand" "r"))) + (clobber (reg:SI R4_REG)) + (clobber (reg:SI R5_REG)) + (clobber (reg:SI T_MEDIA_REG)) + (clobber (reg:SI PR_MEDIA_REG)) + (clobber (reg:SI R1_REG)) + (clobber (reg:SI R21_REG)) + (clobber (reg:SI TR0_REG)) + (clobber (reg:SI R20_REG)) + (use (match_operand:SI 3 "register_operand" "r"))] + "TARGET_SHMEDIA" + "#" + "&& (high_life_started || reload_completed)" + [(set (match_dup 0) (match_dup 3))] + "" + [(set_attr "highpart" "must_split")]) + +;; This is the combiner pattern for -mdiv=inv:call . 
+(define_insn_and_split "*divsi_inv_call_combine" + [(set (match_operand:SI 0 "register_operand" "=z") + (div:SI (match_operand:SI 1 "register_operand" "r") + (match_operand:SI 2 "register_operand" "r"))) + (clobber (reg:SI R4_REG)) + (clobber (reg:SI R5_REG)) + (clobber (reg:SI T_MEDIA_REG)) + (clobber (reg:SI PR_MEDIA_REG)) + (clobber (reg:SI R1_REG)) + (clobber (reg:SI R21_REG)) + (clobber (reg:SI TR0_REG)) + (clobber (reg:SI R20_REG)) + (use (unspec:SI [(match_dup 1) + (match_operand:SI 3 "" "") + (unspec:SI [(match_operand:SI 4 "" "") + (match_dup 3) + (match_operand:DI 5 "" "")] + UNSPEC_DIV_INV_M2) + (match_operand:DI 6 "" "") + (const_int 0) + (const_int 0)] + UNSPEC_DIV_INV_M3))] + "TARGET_SHMEDIA" + "#" + "&& (high_life_started || reload_completed)" + [(pc)] + " +{ + const char *name = sh_divsi3_libfunc; + enum sh_function_kind kind = SFUNC_GOT; + rtx sym; + + emit_move_insn (gen_rtx_REG (SImode, R4_REG), operands[1]); + emit_move_insn (gen_rtx_REG (SImode, R5_REG), operands[2]); + while (TARGET_DIVIDE_INV_CALL2) + { + rtx x = operands[3]; + + if (GET_CODE (x) != UNSPEC || XINT (x, 1) != UNSPEC_DIV_INV_M1) + break; + x = XVECEXP (x, 0, 0); + name = \"__sdivsi3_2\"; + kind = SFUNC_STATIC; + emit_move_insn (gen_rtx_REG (DImode, R20_REG), x); + break; + } + sym = function_symbol (NULL, name, kind); + emit_insn (gen_divsi3_media_2 (operands[0], sym)); + DONE; +}" + [(set_attr "highpart" "must_split")]) + (define_expand "divsi3_i4_media" [(set (match_dup 3) (float:DF (match_operand:SI 1 "register_operand" "r"))) (set (match_dup 4) (float:DF (match_operand:SI 2 "register_operand" "r"))) @@ -1453,7 +1995,7 @@ /* Emit the move of the address to a pseudo outside of the libcall. */ if (TARGET_HARD_SH4 && TARGET_SH2E) { - emit_move_insn (operands[3], function_symbol (\"__sdivsi3_i4\")); + function_symbol (operands[3], sh_divsi3_libfunc, SFUNC_STATIC); if (TARGET_FPU_SINGLE) last = gen_divsi3_i4_single (operands[0], operands[3]); else @@ -1466,7 +2008,87 @@ emit_insn (gen_divsi3_sh2a (operands[0], operands[1], operands[2])); DONE; } - else if (TARGET_SHMEDIA_FPU) + else if (TARGET_DIVIDE_INV) + { + rtx dividend = operands[1]; + rtx divisor = operands[2]; + rtx tab_base; + rtx nsb_res = gen_reg_rtx (DImode); + rtx norm64 = gen_reg_rtx (DImode); + rtx tab_ix = gen_reg_rtx (DImode); + rtx norm32 = gen_reg_rtx (SImode); + rtx i92 = force_reg (DImode, GEN_INT (92)); + rtx scratch0a = gen_reg_rtx (DImode); + rtx scratch0b = gen_reg_rtx (DImode); + rtx inv0 = gen_reg_rtx (SImode); + rtx scratch1a = gen_reg_rtx (DImode); + rtx scratch1b = gen_reg_rtx (DImode); + rtx shift = gen_reg_rtx (DImode); + rtx i2p27, i43; + rtx inv1 = gen_reg_rtx (SImode); + rtx scratch2a = gen_reg_rtx (DImode); + rtx scratch2b = gen_reg_rtx (SImode); + rtx inv2 = gen_reg_rtx (SImode); + rtx scratch3a = gen_reg_rtx (DImode); + rtx scratch3b = gen_reg_rtx (DImode); + rtx scratch3c = gen_reg_rtx (DImode); + rtx scratch3d = gen_reg_rtx (SImode); + rtx scratch3e = gen_reg_rtx (DImode); + rtx result = gen_reg_rtx (SImode); + + if (! arith_reg_or_0_operand (dividend, SImode)) + dividend = force_reg (SImode, dividend); + if (! 
arith_reg_operand (divisor, SImode)) + divisor = force_reg (SImode, divisor); + if (flag_pic && Pmode != DImode) + { + tab_base = gen_rtx_SYMBOL_REF (Pmode, \"__div_table\"); + tab_base = gen_datalabel_ref (tab_base); + tab_base = force_reg (DImode, gen_rtx_SIGN_EXTEND (DImode, tab_base)); + } + else + { + tab_base = gen_rtx_SYMBOL_REF (DImode, \"__div_table\"); + tab_base = gen_datalabel_ref (tab_base); + tab_base = force_reg (DImode, tab_base); + } + if (TARGET_DIVIDE_INV20U) + i2p27 = force_reg (DImode, GEN_INT (-2 << 27)); + else + i2p27 = GEN_INT (0); + if (TARGET_DIVIDE_INV20U || TARGET_DIVIDE_INV20L) + i43 = force_reg (DImode, GEN_INT (43)); + else + i43 = GEN_INT (0); + emit_insn (gen_nsbdi (nsb_res, + simplify_gen_subreg (DImode, divisor, SImode, 0))); + emit_insn (gen_ashldi3_media (norm64, + gen_rtx_SUBREG (DImode, divisor, 0), + nsb_res)); + emit_insn (gen_ashrdi3_media (tab_ix, norm64, GEN_INT (58))); + emit_insn (gen_ashrdisi3_media_high (norm32, norm64, GEN_INT (32))); + emit_insn (gen_divsi_inv_m1 (inv1, tab_base, tab_ix, norm32, + inv0, scratch0a, scratch0b, + scratch1a, scratch1b)); + emit_insn (gen_subdi3 (shift, i92, nsb_res)); + emit_insn (gen_divsi_inv_m2 (inv2, norm32, inv1, i92, + scratch2a)); + emit_insn (gen_divsi_inv_m3 (result, dividend, inv1, inv2, shift, + i2p27, i43, + scratch3a, scratch3b, scratch3c, + scratch2a, scratch2b, scratch3d, scratch3e)); + if (TARGET_DIVIDE_INV_CALL || TARGET_DIVIDE_INV_CALL2) + emit_insn (gen_divsi_inv_call (operands[0], dividend, divisor, result)); + else if (TARGET_DIVIDE_INV_FP) + emit_insn (gen_divsi_inv_fp (operands[0], dividend, divisor, result, + gen_reg_rtx (SImode), gen_reg_rtx (SImode), + gen_reg_rtx (DFmode), gen_reg_rtx (DFmode), + gen_reg_rtx (DFmode))); + else + emit_move_insn (operands[0], result); + DONE; + } + else if (TARGET_SHMEDIA_FPU && TARGET_DIVIDE_FP) { operands[1] = force_reg (SImode, operands[1]); operands[2] = force_reg (SImode, operands[2]); @@ -1475,17 +2097,22 @@ } else if (TARGET_SH5) { - emit_move_insn (operands[3], - function_symbol (TARGET_FPU_ANY - ? \"__sdivsi3_i4\" - : \"__sdivsi3\")); + if (TARGET_DIVIDE_CALL2) + { + rtx tab_base = gen_rtx_SYMBOL_REF (Pmode, \"__div_table\"); + tab_base = gen_datalabel_ref (tab_base); + emit_move_insn (gen_rtx_REG (Pmode, R20_REG), tab_base); + } + if (TARGET_FPU_ANY && TARGET_SH1) + function_symbol (operands[3], sh_divsi3_libfunc, SFUNC_STATIC); + else if (TARGET_DIVIDE_CALL2) + function_symbol (operands[3], \"__sdivsi3_2\", SFUNC_STATIC); + else + function_symbol (operands[3], sh_divsi3_libfunc, SFUNC_GOT); if (TARGET_SHMEDIA) - last = gen_divsi3_i1_media (operands[0], - Pmode == DImode - ? operands[3] - : gen_rtx_SUBREG (DImode, operands[3], - 0)); + last = ((TARGET_DIVIDE_CALL2 ? 
gen_divsi3_media_2 : gen_divsi3_i1_media) + (operands[0], operands[3])); else if (TARGET_FPU_ANY) last = gen_divsi3_i4_single (operands[0], operands[3]); else @@ -1493,7 +2120,7 @@ } else { - emit_move_insn (operands[3], function_symbol (\"__sdivsi3\")); + function_symbol (operands[3], sh_divsi3_libfunc, SFUNC_GOT); last = gen_divsi3_i1 (operands[0], operands[3]); } first = emit_move_insn (gen_rtx_REG (SImode, 4), operands[1]); @@ -1505,6 +2132,411 @@ REG_NOTES (last) = gen_rtx_INSN_LIST (REG_RETVAL, first, REG_NOTES (last)); DONE; }") + +;; operands: inv0, tab_base, tab_ix, norm32 +;; scratch equiv in sdivsi3_2: r19, r21 +(define_expand "divsi_inv_m0" + [(set (match_operand:SI 0 "register_operand" "=r") + (unspec:SI [(match_operand:DI 1 "register_operand" "r") + (match_operand:DI 2 "register_operand" "r") + (match_operand:SI 3 "register_operand" "r")] + UNSPEC_DIV_INV_M0)) + (clobber (match_operand:DI 4 "register_operand" "=r")) + (clobber (match_operand:DI 5 "register_operand" "=r"))] + "TARGET_SHMEDIA" + " +{ +/* +tab_base: r20 +tab_ix: r21 +norm32: r25 + ldx.ub r20, r21, r19 // u0.8 + shlli r21, 1, r21 + muls.l r25, r19, r19 // s2.38 + ldx.w r20, r21, r21 // s2.14 + shari r19, 24, r19 // truncate to s2.14 + sub r21, r19, r19 // some 11 bit inverse in s1.14 +*/ + + rtx inv0 = operands[0]; + rtx tab_base = operands[1]; + rtx tab_ix = operands[2]; + rtx norm32 = operands[3]; + rtx scratch0 = operands[4]; + rtx scratch0_si = simplify_gen_subreg (SImode, scratch0, DImode, SIDI_OFF); + rtx scratch1 = operands[5]; + rtx mem; + + mem = gen_rtx_MEM (QImode, gen_rtx_PLUS (DImode, tab_base, tab_ix)); + emit_insn (gen_zero_extendqidi2 (scratch0, mem)); + emit_insn (gen_ashldi3_media (scratch1, tab_ix, GEN_INT (1))); + emit_insn (gen_mulsidi3_media (scratch0, norm32, scratch0_si)); + mem = gen_rtx_MEM (HImode, gen_rtx_PLUS (DImode, tab_base, scratch1)); + emit_insn (gen_extendhidi2 (scratch1, mem)); + emit_insn (gen_ashrdi3_media (scratch0, scratch0, GEN_INT (24))); + emit_insn (gen_subdisi3_media (inv0, scratch1, scratch0)); + DONE; +}") + +;; operands: inv1, tab_base, tab_ix, norm32 +(define_insn_and_split "divsi_inv_m1" + [(set (match_operand:SI 0 "register_operand" "=r") + (unspec:SI [(match_operand:DI 1 "register_operand" "r") + (match_operand:DI 2 "register_operand" "r") + (match_operand:SI 3 "register_operand" "r")] + UNSPEC_DIV_INV_M1)) + (clobber (match_operand:SI 4 "register_operand" "=r")) + (clobber (match_operand:DI 5 "register_operand" "=r")) + (clobber (match_operand:DI 6 "register_operand" "=r")) + (clobber (match_operand:DI 7 "register_operand" "=r")) + (clobber (match_operand:DI 8 "register_operand" "=r"))] + "TARGET_SHMEDIA" + "#" + "&& no_new_pseudos" + [(pc)] + " +{ +/* inv0: r19 + muls.l r19, r19, r18 // u0.28 + muls.l r25, r18, r18 // s2.58 + shlli r19, 45, r0 // multiply by two and convert to s2.58 + sub r0, r18, r18 + shari r18, 28, r18 // some 18 bit inverse in s1.30 +*/ + + rtx inv1 = operands[0]; + rtx tab_base = operands[1]; + rtx tab_ix = operands[2]; + rtx norm32 = operands[3]; + rtx inv0 = operands[4]; + rtx inv0_di = simplify_gen_subreg (DImode, inv0, SImode, 0); + rtx scratch0a = operands[5]; + rtx scratch0b = operands[6]; + rtx scratch0 = operands[7]; + rtx scratch1 = operands[8]; + rtx scratch1_si = simplify_gen_subreg (SImode, scratch1, DImode, SIDI_OFF); + + emit_insn (gen_divsi_inv_m0 (inv0, tab_base, tab_ix, norm32, + scratch0a, scratch0b)); + emit_insn (gen_mulsidi3_media (scratch1, inv0, inv0)); + emit_insn (gen_mulsidi3_media (scratch1, norm32, 
scratch1_si)); + emit_insn (gen_ashldi3_media (scratch0, inv0_di, GEN_INT (45))); + emit_insn (gen_subdi3 (scratch1, scratch0, scratch1)); + emit_insn (gen_ashrdisi3_media_opaque (inv1, scratch1, GEN_INT (28))); + DONE; +}") + +;; operands: inv2, norm32, inv1, i92 +(define_insn_and_split "divsi_inv_m2" + [(set (match_operand:SI 0 "register_operand" "=r") + (unspec:SI [(match_operand:SI 1 "register_operand" "r") + (match_operand:SI 2 "register_operand" "r") + (match_operand:DI 3 "register_operand" "r")] + UNSPEC_DIV_INV_M2)) + (clobber (match_operand:DI 4 "register_operand" "=r"))] + "TARGET_SHMEDIA" + "#" + "&& no_new_pseudos" + [(pc)] + " +{ +/* + muls.l r18, r25, r0 // s2.60 + shari r0, 16, r0 // s-16.44 + sub + muls.l r0, r18, r19 // s-16.74 + shari r19, 30, r19 // s-16.44 +*/ + rtx inv2 = operands[0]; + rtx norm32 = operands[1]; + rtx inv1 = operands[2]; + rtx i92 = operands[3]; + rtx scratch0 = operands[4]; + rtx scratch0_si = simplify_gen_subreg (SImode, scratch0, DImode, SIDI_OFF); + + emit_insn (gen_mulsidi3_media (scratch0, inv1, norm32)); + emit_insn (gen_ashrdi3_media (scratch0, scratch0, GEN_INT (16))); + emit_insn (gen_subdi3 (scratch0, i92, scratch0)); + emit_insn (gen_mulsidi3_media (scratch0, scratch0_si, inv1)); + emit_insn (gen_ashrdisi3_media_opaque (inv2, scratch0, GEN_INT (30))); + DONE; +}") + +(define_insn_and_split "divsi_inv_m3" + [(set (match_operand:SI 0 "register_operand" "=r") + (unspec:SI [(match_operand:SI 1 "arith_reg_or_0_operand" "rN") + (match_operand:SI 2 "register_operand" "r") + (match_operand:SI 3 "register_operand" "r") + (match_operand:DI 4 "register_operand" "r") + (match_operand:DI 5 "arith_reg_or_0_operand" "rN") + (match_operand:DI 6 "arith_reg_or_0_operand" "rN")] + UNSPEC_DIV_INV_M3)) + (clobber (match_operand:DI 7 "register_operand" "=r")) + (clobber (match_operand:DI 8 "register_operand" "=r")) + (clobber (match_operand:DI 9 "register_operand" "=r")) + (clobber (match_operand:DI 10 "register_operand" "=r")) + (clobber (match_operand:SI 11 "register_operand" "=r")) + (clobber (match_operand:SI 12 "register_operand" "=r")) + (clobber (match_operand:DI 13 "register_operand" "=r"))] + "TARGET_SHMEDIA" + "#" + "&& no_new_pseudos" + [(pc)] + " +{ +/* + r0: result r1: shift r4: dividend r18: inv1 r19: inv2 + r0: scratch0 r19: scratch1 r21: scratch2 + + muls.l r18, r4, r25 // s32.30 + muls.l r19, r4, r19 // s15.30 + shari r25, 63, r21 + shari r19, 14, r19 // s18.-14 + sub r25, r19, r0 + shard r0, r1, r0 + sub r0, r21, r0 +*/ + + rtx result = operands[0]; + rtx dividend = operands[1]; + rtx inv1 = operands[2]; + rtx inv2 = operands[3]; + rtx shift = operands[4]; + rtx scratch0 = operands[7]; + rtx scratch1 = operands[8]; + rtx scratch2 = operands[9]; + + emit_insn (gen_mulsidi3_media (scratch0, inv1, dividend)); + emit_insn (gen_mulsidi3_media (scratch1, inv2, dividend)); + emit_insn (gen_ashrdi3_media (scratch2, scratch0, GEN_INT (63))); + emit_insn (gen_ashrdi3_media (scratch1, scratch1, GEN_INT (14))); + emit_insn (gen_adddi3 (scratch0, scratch0, scratch1)); + emit_insn (gen_ashrdi3_media (scratch0, scratch0, shift)); + emit_insn (gen_subdisi3_media (result, scratch0, scratch2)); + DONE; +}") + +;; operands: quotient, dividend, inv1, inv2, shift, i2p27, i43 +;; inv1: tab_base, tab_ix, norm32 +;; inv2: norm32, inv1, i92 +(define_insn_and_split "divsi_inv_m1_3" + [(set (match_operand:SI 0 "register_operand" "=r") + (unspec:SI [(match_operand:SI 1 "arith_reg_or_0_operand" "rN") + (unspec:SI [(match_operand:DI 2 "register_operand" "r") + 
(match_operand:DI 3 "register_operand" "r") + (match_operand:SI 4 "register_operand" "r")] + UNSPEC_DIV_INV_M1) + (unspec:SI [(match_dup 4) + (unspec:SI [(match_dup 2) + (match_dup 3) + (match_dup 4)] UNSPEC_DIV_INV_M1) + (match_operand:SI 5 "" "")] + UNSPEC_DIV_INV_M2) + (match_operand:DI 6 "register_operand" "r") + (match_operand:DI 7 "arith_reg_or_0_operand" "rN") + (match_operand:DI 8 "arith_reg_or_0_operand" "rN")] + UNSPEC_DIV_INV_M3)) + (clobber (match_operand:DI 9 "register_operand" "=r")) + (clobber (match_operand:DI 10 "register_operand" "=r")) + (clobber (match_operand:DI 11 "register_operand" "=r")) + (clobber (match_operand:DI 12 "register_operand" "=r")) + (clobber (match_operand:SI 13 "register_operand" "=r")) + (clobber (match_operand:SI 14 "register_operand" "=r")) + (clobber (match_operand:DI 15 "register_operand" "=r"))] + "TARGET_SHMEDIA + && (TARGET_DIVIDE_INV_MINLAT + || TARGET_DIVIDE_INV20U || TARGET_DIVIDE_INV20L)" + "#" + "&& no_new_pseudos" + [(pc)] + " +{ + rtx result = operands[0]; + rtx dividend = operands[1]; + rtx tab_base = operands[2]; + rtx tab_ix = operands[3]; + rtx norm32 = operands[4]; + /* rtx i92 = operands[5]; */ + rtx shift = operands[6]; + rtx i2p27 = operands[7]; + rtx i43 = operands[8]; + rtx scratch0 = operands[9]; + rtx scratch0_si = simplify_gen_subreg (SImode, scratch0, DImode, SIDI_OFF); + rtx scratch1 = operands[10]; + rtx scratch1_si = simplify_gen_subreg (SImode, scratch1, DImode, SIDI_OFF); + rtx scratch2 = operands[11]; + rtx scratch3 = operands[12]; + rtx scratch4 = operands[13]; + rtx scratch4_di = simplify_gen_subreg (DImode, scratch4, SImode, 0); + rtx scratch5 = operands[14]; + rtx scratch5_di = simplify_gen_subreg (DImode, scratch5, SImode, 0); + rtx scratch6 = operands[15]; + + emit_insn (gen_divsi_inv_m0 (scratch4, tab_base, tab_ix, norm32, + scratch0, scratch1)); + /* inv0 == scratch4 */ + if (! TARGET_DIVIDE_INV20U) + { + emit_insn (gen_mulsidi3_media (scratch0, scratch4, scratch4)); + i2p27 = scratch0; + emit_insn (gen_mulsidi3_media (scratch1, norm32, scratch0_si)); + } + else + { + emit_insn (gen_mulsidi3_media (scratch1, scratch4, scratch4)); + emit_insn (gen_mulsidi3_media (scratch1, norm32, scratch1_si)); + } + emit_insn (gen_ashldi3_media (scratch2, scratch4_di, GEN_INT (45))); + emit_insn (gen_subdi3 (scratch1, scratch2, scratch1)); + emit_insn (gen_ashrdisi3_media_opaque (scratch4, scratch1, GEN_INT (28))); + /* inv1 == scratch4 */ + + if (TARGET_DIVIDE_INV_MINLAT) + { + emit_insn (gen_mulsidi3_media (scratch1, scratch4, norm32)); + emit_insn (gen_mulsidi3_media (scratch2, dividend, scratch4)); + emit_insn (gen_ashrdi3_media (scratch1, scratch1, GEN_INT (16))); + emit_insn (gen_mulsidi3_media (scratch1, scratch1_si, scratch4)); + emit_insn (gen_ashrdi3_media (scratch3, scratch2, GEN_INT (63))); + emit_insn (gen_ashrsi3_media (scratch5, dividend, GEN_INT (14))); + emit_insn (gen_ashrdi3_media (scratch1, scratch1, GEN_INT (30))); + emit_insn (gen_mulsidi3_media (scratch1, scratch1_si, scratch5)); + emit_insn (gen_xordi3 (scratch0, scratch3, i2p27)); + emit_insn (gen_adddi3 (scratch2, scratch2, scratch0)); + emit_insn (gen_subdi3 (scratch2, scratch2, scratch1)); + } + else + { + rtx label = gen_rtx_LABEL_REF (Pmode, gen_label_rtx ()); + /* Use separate scratch regs for nsb and sign to allow scheduling. 
*/ + emit_insn (gen_nsbdi (scratch6, + simplify_gen_subreg (DImode, dividend, SImode, 0))); + emit_insn (gen_xorsi3 (scratch5, dividend, norm32)); + emit_insn (gen_ashrdi3_media (scratch3, scratch5_di, GEN_INT (63))); + emit_insn (gen_divsi_inv20 (scratch2, + norm32, scratch4, dividend, + scratch6, scratch3, i43, + /* scratch0 may be shared with i2p27. */ + scratch0, scratch1, scratch5, + label, label, i2p27)); + } + emit_insn (gen_ashrdi3_media (scratch2, scratch2, shift)); + emit_insn (gen_subdisi3_media (result, scratch2, scratch3)); + DONE; +}") + +(define_insn "divsi_inv20" + [(set (match_operand:DI 0 "register_operand" "=&r") + (unspec:DI [(match_operand:SI 1 "register_operand" "r") + (match_operand:SI 2 "register_operand" "r") + (match_operand:SI 3 "register_operand" "r") + (match_operand:DI 4 "register_operand" "r") + (match_operand:DI 5 "register_operand" "r") + (match_operand:DI 6 "register_operand" "r") + (match_operand:DI 12 "register_operand" "r") + (match_operand 10 "target_operand" "b") + (match_operand 11 "immediate_operand" "i")] + UNSPEC_DIV_INV20)) + (clobber (match_operand:DI 7 "register_operand" "=&r")) + (clobber (match_operand:DI 8 "register_operand" "=&r")) + (clobber (match_operand:SI 9 "register_operand" "=r"))] + "TARGET_SHMEDIA + && (TARGET_DIVIDE_INV20U || TARGET_DIVIDE_INV20L)" + "* +{ +/* operands: %0 div_result, %1 norm32, %2 inv1, %3 dividend, + %4 dividend_nsb, %5 result_sign, %6 i43, %12 i2p27, + %7 round_scratch, %8 scratch0 (di), %9 scratch1 (si) + %10 label (tr), %11 label (imm) + + muls.l inv1, norm32, scratch0 // s2.60 + muls.l inv1, dividend, result // s32.30 + xor i2p27, result_sign, round_scratch + bge/u dividend_nsb, i43, tr.. (label) + shari scratch0, 16, scratch0 // s-16.44 + muls.l sratch0_si, inv1, scratch0 // s-16.74 + sub result, round_scratch, result + shari dividend, 14, scratch1 // s19.-14 + shari scratch0, 30, scratch0 // s-16.44 + muls.l scratch0, scratch1, round_scratch // s15.30 +label: + sub result, round_scratch, result */ + + int likely = TARGET_DIVIDE_INV20L; + + if (! likely) output_asm_insn (\"muls.l\t%2, %1 , %8\", operands); + output_asm_insn (\"muls.l\t%2, %3, %0\;xor\t%12, %5, %7\", operands); + output_asm_insn (likely + ? \"bge/l\t%4, %6, %10\;muls.l\t%2, %1 , %8\" + : \"bge/u\t%4, %6, %10\", operands); + output_asm_insn (\"shari\t%8, 16, %8\;muls.l\t%8, %2, %8\", operands); + if (! likely) output_asm_insn (\"sub\t%0, %7, %0\", operands); + output_asm_insn (\"shari\t%3, 14, %9\;shari\t%8, 30, %8\", operands); + return (likely + ? 
\"muls.l\t%8, %9, %8\;sub\t%0, %8, %0\n%11:\tadd\t%0, %7, %0\" + : \"muls.l\t%8, %9, %7\n%11:\tsub\t%0, %7, %0\"); +}") + +(define_insn_and_split "divsi_inv_fp" + [(set (match_operand:SI 0 "general_movdst_operand" "=rf") + (div:SI (match_operand:SI 1 "general_movsrc_operand" "rf") + (match_operand:SI 2 "register_operand" "rf"))) + (use (match_operand:SI 3 "general_movsrc_operand" "r")) + (clobber (match_operand:SI 4 "register_operand" "=r")) + (clobber (match_operand:SI 5 "register_operand" "=r")) + (clobber (match_operand:DF 6 "register_operand" "=r")) + (clobber (match_operand:DF 7 "register_operand" "=r")) + (clobber (match_operand:DF 8 "register_operand" "=r"))] + "TARGET_SHMEDIA_FPU" + "#" + "&& (high_life_started || reload_completed)" + [(set (match_dup 0) (match_dup 3))] + "" + [(set_attr "highpart" "must_split")]) + +;; If a matching group of divide-by-inverse instructions is in the same +;; basic block after gcse & loop optimizations, we want to transform them +;; to a straight division using floating point for TARGET_DIVIDE_INV_FP. +(define_insn_and_split "*divsi_inv_fp_combine" + [(set (match_operand:SI 0 "register_operand" "=f") + (div:SI (match_operand:SI 1 "register_operand" "f") + (match_operand:SI 2 "register_operand" "f"))) + (use (unspec:SI [(match_dup 1) + (match_operand:SI 3 "" "") + (unspec:SI [(match_operand:SI 4 "" "") + (match_dup 3) + (match_operand:DI 5 "" "")] UNSPEC_DIV_INV_M2) + (match_operand:DI 6 "" "") + (const_int 0) + (const_int 0)] UNSPEC_DIV_INV_M3)) + (clobber (match_operand:SI 7 "fp_arith_reg_operand" "")) + (clobber (match_operand:SI 8 "fp_arith_reg_operand" "")) + (clobber (match_operand:DF 9 "fp_arith_reg_operand" "")) + (clobber (match_operand:DF 10 "fp_arith_reg_operand" "")) + (clobber (match_operand:DF 11 "fp_arith_reg_operand" ""))] + "TARGET_SHMEDIA_FPU && TARGET_DIVIDE_INV_FP && no_new_pseudos" + "#" + "&& 1" + [(set (match_dup 9) (float:DF (match_dup 1))) + (set (match_dup 10) (float:DF (match_dup 2))) + (set (match_dup 11) (div:DF (match_dup 9) (match_dup 10))) + (set (match_dup 8) + (fix:SI (match_dup 11))) + (set (match_dup 0) (match_dup 8))] + " +{ + if (! fp_arith_reg_operand (operands[1], SImode)) + { + emit_move_insn (operands[7], operands[1]); + operands[1] = operands[7]; + } + if (! fp_arith_reg_operand (operands[2], SImode)) + { + emit_move_insn (operands[8], operands[2]); + operands[2] = operands[8]; + } +}" + [(set_attr "highpart" "must_split")]) ;; ------------------------------------------------------------------------- ;; Multiplication instructions @@ -1625,7 +2657,7 @@ "") (define_insn "mul_r" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (mult:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "arith_reg_operand" "z")))] "TARGET_SH2A" @@ -1655,7 +2687,7 @@ { /* The address must be set outside the libcall, since it goes into a pseudo. 
*/ - rtx sym = function_symbol (\"__mulsi3\"); + rtx sym = function_symbol (NULL, \"__mulsi3\", SFUNC_STATIC); rtx addr = force_reg (SImode, sym); rtx insns = gen_mulsi3_call (operands[0], operands[1], operands[2], addr); @@ -1710,15 +2742,16 @@ }") (define_insn "mulsidi3_media" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (mult:DI (sign_extend:DI (match_operand:SI 1 "extend_reg_operand" "%r")) (sign_extend:DI (match_operand:SI 2 "extend_reg_operand" "r"))))] "TARGET_SHMEDIA" "muls.l %1, %2, %0" - [(set_attr "type" "dmpy_media")]) + [(set_attr "type" "dmpy_media") + (set_attr "highpart" "ignore")]) (define_insn "mulsidi3_compact" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (mult:DI (sign_extend:DI (match_operand:SI 1 "arith_reg_operand" "r")) (sign_extend:DI (match_operand:SI 2 "arith_reg_operand" "r")))) @@ -1728,7 +2761,7 @@ "#") (define_split - [(set (match_operand:DI 0 "arith_reg_operand" "") + [(set (match_operand:DI 0 "arith_reg_dest" "") (mult:DI (sign_extend:DI (match_operand:SI 1 "arith_reg_operand" "")) (sign_extend:DI (match_operand:SI 2 "arith_reg_operand" "")))) @@ -1781,15 +2814,16 @@ }") (define_insn "umulsidi3_media" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (mult:DI (zero_extend:DI (match_operand:SI 1 "extend_reg_operand" "%r")) (zero_extend:DI (match_operand:SI 2 "extend_reg_operand" "r"))))] "TARGET_SHMEDIA" "mulu.l %1, %2, %0" - [(set_attr "type" "dmpy_media")]) + [(set_attr "type" "dmpy_media") + (set_attr "highpart" "ignore")]) (define_insn "umulsidi3_compact" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (mult:DI (zero_extend:DI (match_operand:SI 1 "arith_reg_operand" "r")) (zero_extend:DI (match_operand:SI 2 "arith_reg_operand" "r")))) @@ -1799,7 +2833,7 @@ "#") (define_split - [(set (match_operand:DI 0 "arith_reg_operand" "") + [(set (match_operand:DI 0 "arith_reg_dest" "") (mult:DI (zero_extend:DI (match_operand:SI 1 "arith_reg_operand" "")) (zero_extend:DI (match_operand:SI 2 "arith_reg_operand" "")))) (clobber (reg:SI MACH_REG)) @@ -1905,30 +2939,82 @@ REG_NOTES (last) = gen_rtx_INSN_LIST (REG_RETVAL, first, REG_NOTES (last)); DONE; }") + +(define_insn_and_split "muldi3" + [(set (match_operand:DI 0 "arith_reg_dest" "=r") + (mult:DI (match_operand:DI 1 "arith_reg_operand" "r") + (match_operand:DI 2 "arith_reg_operand" "r"))) + (clobber (match_scratch:DI 3 "=&r")) + (clobber (match_scratch:DI 4 "=r"))] + "TARGET_SHMEDIA" + "#" + "reload_completed" + [(const_int 0)] + " +{ + rtx op3_v2si, op2_v2si; + + op3_v2si = operands[3]; + if (GET_CODE (op3_v2si) == SIGN_EXTEND) + { + op3_v2si = XEXP (op3_v2si, 0); + op3_v2si = simplify_gen_subreg (DImode, op3_v2si, GET_MODE (op3_v2si), 0); + } + op3_v2si = simplify_gen_subreg (V2SImode, op3_v2si, DImode, 0); + op2_v2si = operands[2]; + if (GET_CODE (op2_v2si) == SIGN_EXTEND) + { + op2_v2si = XEXP (op2_v2si, 0); + op2_v2si = simplify_gen_subreg (DImode, op2_v2si, GET_MODE (op2_v2si), 0); + } + op2_v2si = simplify_gen_subreg (V2SImode, op2_v2si, DImode, 0); + emit_insn (gen_rotldi3 (operands[3], operands[1], GEN_INT (32))); + emit_insn (gen_mulv2si3 (op3_v2si, op3_v2si, op2_v2si)); + emit_insn (gen_umulsidi3_media (operands[4], + sh_gen_truncate (SImode, operands[1], 0), + sh_gen_truncate (SImode, operands[2], 0))); + emit_insn (gen_anddi3 (operands[0], operands[3], GEN_INT 
(0xffffffff00000000LL))); + emit_insn (gen_ashldi3_media (operands[3], operands[3], GEN_INT (32))); + emit_insn (gen_adddi3 (operands[0], operands[3], operands[0])); + emit_insn (gen_adddi3 (operands[0], operands[4], operands[0])); + DONE; +}") + ;; ------------------------------------------------------------------------- ;; Logical operations ;; ------------------------------------------------------------------------- (define_insn "*andsi3_compact" - [(set (match_operand:SI 0 "arith_reg_operand" "=r,z") + [(set (match_operand:SI 0 "arith_reg_dest" "=r,z") (and:SI (match_operand:SI 1 "arith_reg_operand" "%0,0") (match_operand:SI 2 "logical_operand" "r,K08")))] "TARGET_SH1" "and %2,%0" [(set_attr "type" "arith")]) +(define_insn "*andsi3_media" + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r") + (and:SI (match_operand:SI 1 "logical_reg_operand" "%r,r") + (match_operand:SI 2 "logical_operand" "r,I10")))] + "TARGET_SHMEDIA" + "@ + and %1, %2, %0 + andi %1, %2, %0" + [(set_attr "type" "arith_media")]) + ;; If the constant is 255, then emit an extu.b instruction instead of an ;; and, since that will give better code. (define_expand "andsi3" [(set (match_operand:SI 0 "arith_reg_operand" "") - (and:SI (match_operand:SI 1 "arith_reg_operand" "") + (and:SI (match_operand:SI 1 "logical_reg_operand" "") (match_operand:SI 2 "logical_operand" "")))] - "TARGET_SH1" + "" " { - if (GET_CODE (operands[2]) == CONST_INT && INTVAL (operands[2]) == 255) + if (TARGET_SH1 + && GET_CODE (operands[2]) == CONST_INT && INTVAL (operands[2]) == 255) { emit_insn (gen_zero_extendqisi2 (operands[0], gen_lowpart (QImode, operands[1]))); @@ -1937,7 +3023,7 @@ }") (define_insn_and_split "anddi3" - [(set (match_operand:DI 0 "arith_reg_operand" "=r,r,r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r,r,r") (and:DI (match_operand:DI 1 "arith_reg_operand" "%r,r,r") (match_operand:DI 2 "and_operand" "r,I10,J16")))] "TARGET_SHMEDIA" @@ -1958,24 +3044,49 @@ }" [(set_attr "type" "arith_media")]) +(define_insn "andcsi3" + [(set (match_operand:SI 0 "arith_reg_dest" "=r") + (and:SI (match_operand:SI 1 "arith_reg_operand" "r") + (not:SI (match_operand:SI 2 "arith_reg_operand" "r"))))] + "TARGET_SHMEDIA" + "andc %1,%2,%0" + [(set_attr "type" "arith_media")]) + (define_insn "andcdi3" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (and:DI (match_operand:DI 1 "arith_reg_operand" "r") (not:DI (match_operand:DI 2 "arith_reg_operand" "r"))))] "TARGET_SHMEDIA" "andc %1,%2,%0" [(set_attr "type" "arith_media")]) -(define_insn "iorsi3" - [(set (match_operand:SI 0 "arith_reg_operand" "=r,z") +(define_expand "iorsi3" + [(set (match_operand:SI 0 "arith_reg_operand" "") + (ior:SI (match_operand:SI 1 "logical_reg_operand" "") + (match_operand:SI 2 "logical_operand" "")))] + "" + "") + +(define_insn "*iorsi3_compact" + [(set (match_operand:SI 0 "arith_reg_dest" "=r,z") (ior:SI (match_operand:SI 1 "arith_reg_operand" "%0,0") (match_operand:SI 2 "logical_operand" "r,K08")))] "TARGET_SH1" "or %2,%0" [(set_attr "type" "arith")]) +(define_insn "*iorsi3_media" + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r") + (ior:SI (match_operand:SI 1 "logical_reg_operand" "%r,r") + (match_operand:SI 2 "logical_operand" "r,I10")))] + "TARGET_SHMEDIA" + "@ + or %1, %2, %0 + ori %1, %2, %0" + [(set_attr "type" "arith_media")]) + (define_insn "iordi3" - [(set (match_operand:DI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r,r") (ior:DI (match_operand:DI 1 "arith_reg_operand" 
"%r,r") (match_operand:DI 2 "logical_operand" "r,I10")))] "TARGET_SHMEDIA" @@ -1984,18 +3095,74 @@ ori %1, %2, %0" [(set_attr "type" "arith_media")]) -(define_insn "xorsi3" - [(set (match_operand:SI 0 "arith_reg_operand" "=z,r") +(define_insn_and_split "*logical_sidi3" + [(set (match_operand:DI 0 "arith_reg_dest" "=r,r") + (sign_extend:DI (match_operator:SI 3 "logical_operator" + [(match_operand:SI 1 "arith_reg_operand" "%r,r") + (match_operand:SI 2 "logical_operand" "r,I10")])))] + "TARGET_SHMEDIA" + "#" + "&& reload_completed" + [(set (match_dup 0) (match_dup 3))] + " +{ + operands[3] + = gen_rtx_fmt_ee (GET_CODE (operands[3]), DImode, + simplify_gen_subreg (DImode, operands[1], SImode, 0), + simplify_gen_subreg (DImode, operands[2], SImode, 0)); +}") + +(define_insn_and_split "*logical_sidisi3" + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r") + (truncate:SI (sign_extend:DI + (match_operator:SI 3 "logical_operator" + [(match_operand:SI 1 "arith_reg_operand" "%r,r") + (match_operand:SI 2 "logical_operand" "r,I10")]))))] + "TARGET_SHMEDIA" + "#" + "&& 1" + [(set (match_dup 0) (match_dup 3))]) + +(define_insn_and_split "*logical_sidi3_2" + [(set (match_operand:DI 0 "arith_reg_dest" "=r,r") + (sign_extend:DI (truncate:SI (sign_extend:DI + (match_operator:SI 3 "logical_operator" + [(match_operand:SI 1 "arith_reg_operand" "%r,r") + (match_operand:SI 2 "logical_operand" "r,I10")])))))] + "TARGET_SHMEDIA" + "#" + "&& 1" + [(set (match_dup 0) (sign_extend:DI (match_dup 3)))]) + +(define_expand "xorsi3" + [(set (match_operand:SI 0 "arith_reg_operand" "") + (xor:SI (match_operand:SI 1 "logical_reg_operand" "") + (match_operand:SI 2 "xor_operand" "")))] + "" + "") + +(define_insn "*xorsi3_compact" + [(set (match_operand:SI 0 "arith_reg_dest" "=z,r") (xor:SI (match_operand:SI 1 "arith_reg_operand" "%0,0") (match_operand:SI 2 "logical_operand" "K08,r")))] "TARGET_SH1" "xor %2,%0" [(set_attr "type" "arith")]) +(define_insn "*xorsi3_media" + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r") + (xor:SI (match_operand:SI 1 "logical_reg_operand" "%r,r") + (match_operand:SI 2 "xor_operand" "r,I06")))] + "TARGET_SHMEDIA" + "@ + xor %1, %2, %0 + xori %1, %2, %0" + [(set_attr "type" "arith_media")]) + (define_insn "xordi3" - [(set (match_operand:DI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r,r") (xor:DI (match_operand:DI 1 "arith_reg_operand" "%r,r") - (match_operand:DI 2 "shmedia_6bit_operand" "r,I06")))] + (match_operand:DI 2 "xor_operand" "r,I06")))] "TARGET_SHMEDIA" "@ xor %1, %2, %0 @@ -2005,7 +3172,7 @@ ;; Combiner bridge pattern for 2 * sign extend -> logical op -> truncate. ;; converts 2 * sign extend -> logical op into logical op -> sign extend (define_split - [(set (match_operand:DI 0 "arith_reg_operand" "") + [(set (match_operand:DI 0 "arith_reg_dest" "") (sign_extend:DI (match_operator 4 "binary_logical_operator" [(match_operand 1 "any_register_operand" "") (match_operand 2 "any_register_operand" "")])))] @@ -2075,8 +3242,25 @@ }" [(set_attr "type" "arith_media")]) +(define_split + [(set (match_operand:DI 0 "arith_reg_dest" "") + (ior:DI (zero_extend:DI (mem:QI (match_operand 1 + "ua_address_operand" ""))) + (ashift:DI (match_operand:DI 2 "arith_reg_operand" "") + (const_int 8)))) + (clobber (match_operand:DI 3 "register_operand" ""))] + "TARGET_SHMEDIA" + [(match_dup 4) (match_dup 5)] + " +{ + operands[4] = ((TARGET_LITTLE_ENDIAN ? 
gen_ldhi_q : gen_ldlo_q) + (operands[3], operands[1])); + operands[5] = gen_mextr_rl (operands[0], operands[3], operands[2], + GEN_INT (56), GEN_INT (8)); +}") + (define_insn "rotlsi3_1" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (rotate:SI (match_operand:SI 1 "arith_reg_operand" "0") (const_int 1))) (set (reg:SI T_REG) @@ -2086,7 +3270,7 @@ [(set_attr "type" "arith")]) (define_insn "rotlsi3_31" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (rotate:SI (match_operand:SI 1 "arith_reg_operand" "0") (const_int 31))) (clobber (reg:SI T_REG))] @@ -2095,7 +3279,7 @@ [(set_attr "type" "arith")]) (define_insn "rotlsi3_16" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (rotate:SI (match_operand:SI 1 "arith_reg_operand" "r") (const_int 16)))] "TARGET_SH1" @@ -2103,7 +3287,7 @@ [(set_attr "type" "arith")]) (define_expand "rotlsi3" - [(set (match_operand:SI 0 "arith_reg_operand" "") + [(set (match_operand:SI 0 "arith_reg_dest" "") (rotate:SI (match_operand:SI 1 "arith_reg_operand" "") (match_operand:SI 2 "immediate_operand" "")))] "TARGET_SH1" @@ -2159,7 +3343,7 @@ }") (define_insn "*rotlhi3_8" - [(set (match_operand:HI 0 "arith_reg_operand" "=r") + [(set (match_operand:HI 0 "arith_reg_dest" "=r") (rotate:HI (match_operand:HI 1 "arith_reg_operand" "r") (const_int 8)))] "TARGET_SH1" @@ -2181,7 +3365,7 @@ ;; shift left (define_insn "ashlsi3_sh2a" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (ashift:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "arith_reg_operand" "r")))] "TARGET_SH2A" @@ -2193,7 +3377,7 @@ ;; insns. 
(define_insn_and_split "ashlsi3_std" - [(set (match_operand:SI 0 "arith_reg_operand" "=r,r,r,r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r,r,r") (ashift:SI (match_operand:SI 1 "arith_reg_operand" "0,0,0,0") (match_operand:SI 2 "nonmemory_operand" "r,M,P27,?ri"))) (clobber (match_scratch:SI 3 "=X,X,X,&r"))] @@ -2218,7 +3402,7 @@ (set_attr "type" "dyn_shift,arith,arith,arith")]) (define_insn "ashlhi3_k" - [(set (match_operand:HI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:HI 0 "arith_reg_dest" "=r,r") (ashift:HI (match_operand:HI 1 "arith_reg_operand" "0,0") (match_operand:HI 2 "const_int_operand" "M,P27")))] "TARGET_SH1 && CONST_OK_FOR_P27 (INTVAL (operands[2]))" @@ -2228,7 +3412,7 @@ [(set_attr "type" "arith")]) (define_insn "ashlsi3_n" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (ashift:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "const_int_operand" "n"))) (clobber (reg:SI T_REG))] @@ -2245,7 +3429,7 @@ (set_attr "type" "arith")]) (define_split - [(set (match_operand:SI 0 "arith_reg_operand" "") + [(set (match_operand:SI 0 "arith_reg_dest" "") (ashift:SI (match_operand:SI 1 "arith_reg_operand" "") (match_operand:SI 2 "const_int_operand" ""))) (clobber (reg:SI T_REG))] @@ -2258,14 +3442,15 @@ }") (define_insn "ashlsi3_media" - [(set (match_operand:SI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r") (ashift:SI (match_operand:SI 1 "extend_reg_operand" "r,r") - (match_operand:SI 2 "nonmemory_operand" "r,n")))] + (match_operand:SI 2 "shift_count_operand" "r,n")))] "TARGET_SHMEDIA" "@ shlld.l %1, %2, %0 shlli.l %1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) (define_expand "ashlsi3" [(parallel [(set (match_operand:SI 0 "arith_reg_operand" "") @@ -2293,7 +3478,7 @@ }") (define_insn "*ashlhi3_n" - [(set (match_operand:HI 0 "arith_reg_operand" "=r") + [(set (match_operand:HI 0 "arith_reg_dest" "=r") (ashift:HI (match_operand:HI 1 "arith_reg_operand" "0") (match_operand:HI 2 "const_int_operand" "n"))) (clobber (reg:SI T_REG))] @@ -2324,7 +3509,7 @@ }") (define_split - [(set (match_operand:HI 0 "arith_reg_operand" "") + [(set (match_operand:HI 0 "arith_reg_dest" "") (ashift:HI (match_operand:HI 1 "arith_reg_operand" "") (match_operand:HI 2 "const_int_operand" ""))) (clobber (reg:SI T_REG))] @@ -2341,7 +3526,7 @@ ; (define_insn "ashrsi3_sh2a" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0") (neg:SI (match_operand:SI 2 "arith_reg_operand" "r"))))] "TARGET_SH2A" @@ -2350,7 +3535,7 @@ (set_attr "length" "4")]) (define_insn "ashrsi3_k" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "const_int_operand" "M"))) (clobber (reg:SI T_REG))] @@ -2367,7 +3552,7 @@ ;; ??? This should be a define expand. 
(define_insn "ashrsi2_16" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "r") (const_int 16)))] "TARGET_SH1" @@ -2375,7 +3560,7 @@ [(set_attr "length" "4")]) (define_split - [(set (match_operand:SI 0 "arith_reg_operand" "") + [(set (match_operand:SI 0 "arith_reg_dest" "") (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "") (const_int 16)))] "TARGET_SH1" @@ -2386,7 +3571,7 @@ ;; ??? This should be a define expand. (define_insn "ashrsi2_31" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0") (const_int 31))) (clobber (reg:SI T_REG))] @@ -2395,7 +3580,7 @@ [(set_attr "length" "4")]) (define_split - [(set (match_operand:SI 0 "arith_reg_operand" "") + [(set (match_operand:SI 0 "arith_reg_dest" "") (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "") (const_int 31))) (clobber (reg:SI T_REG))] @@ -2404,12 +3589,26 @@ " { emit_insn (gen_ashlsi_c (operands[0], operands[1])); - emit_insn (gen_subc1 (operands[0], operands[0], operands[0])); + emit_insn (gen_mov_neg_si_t (operands[0])); + DONE; +}") + +(define_peephole2 + [(set (match_operand:SI 0 "arith_reg_dest" "") (const_int 0)) + (set (reg:SI T_REG) + (gt:SI (match_dup 0) (match_operand:SI 1 "arith_reg_operand" "")))] + "TARGET_SH1 + && peep2_reg_dead_p (2, operands[0]) + && peep2_reg_dead_p (2, operands[1])" + [(const_int 0)] + " +{ + emit_insn (gen_ashlsi_c (operands[1], operands[1])); DONE; }") (define_insn "ashlsi_c" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (ashift:SI (match_operand:SI 1 "arith_reg_operand" "0") (const_int 1))) (set (reg:SI T_REG) (lt:SI (match_dup 1) (const_int 0)))] @@ -2417,8 +3616,16 @@ "shll %0" [(set_attr "type" "arith")]) +(define_insn "*ashlsi_c_void" + [(set (reg:SI T_REG) + (lt:SI (match_operand:SI 0 "arith_reg_operand" "r") (const_int 0))) + (clobber (match_scratch:SI 1 "=0"))] + "TARGET_SH1 && cse_not_expected" + "shll %0" + [(set_attr "type" "arith")]) + (define_insn "ashrsi3_d" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0") (neg:SI (match_operand:SI 2 "arith_reg_operand" "r"))))] "TARGET_SH3" @@ -2438,14 +3645,15 @@ (set_attr "needs_delay_slot" "yes")]) (define_insn "ashrsi3_media" - [(set (match_operand:SI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r") (ashiftrt:SI (match_operand:SI 1 "extend_reg_operand" "r,r") - (match_operand:SI 2 "nonmemory_operand" "r,n")))] + (match_operand:SI 2 "shift_count_operand" "r,n")))] "TARGET_SHMEDIA" "@ shard.l %1, %2, %0 shari.l %1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) (define_expand "ashrsi3" [(parallel [(set (match_operand:SI 0 "arith_reg_operand" "") @@ -2469,7 +3677,7 @@ ;; logical shift right (define_insn "lshrsi3_sh2a" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0") (neg:SI (match_operand:SI 2 "arith_reg_operand" "r"))))] "TARGET_SH2A" @@ -2478,7 +3686,7 @@ (set_attr "length" "4")]) (define_insn "lshrsi3_d" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (lshiftrt:SI (match_operand:SI 1 
"arith_reg_operand" "0") (neg:SI (match_operand:SI 2 "arith_reg_operand" "r"))))] "TARGET_SH3" @@ -2488,7 +3696,7 @@ ;; Only the single bit shift clobbers the T bit. (define_insn "lshrsi3_m" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "const_int_operand" "M"))) (clobber (reg:SI T_REG))] @@ -2497,7 +3705,7 @@ [(set_attr "type" "arith")]) (define_insn "lshrsi3_k" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "const_int_operand" "P27")))] "TARGET_SH1 && CONST_OK_FOR_P27 (INTVAL (operands[2])) @@ -2506,7 +3714,7 @@ [(set_attr "type" "arith")]) (define_insn "lshrsi3_n" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "const_int_operand" "n"))) (clobber (reg:SI T_REG))] @@ -2523,7 +3731,7 @@ (set_attr "type" "arith")]) (define_split - [(set (match_operand:SI 0 "arith_reg_operand" "") + [(set (match_operand:SI 0 "arith_reg_dest" "") (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand" "") (match_operand:SI 2 "const_int_operand" ""))) (clobber (reg:SI T_REG))] @@ -2536,17 +3744,18 @@ }") (define_insn "lshrsi3_media" - [(set (match_operand:SI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r") (lshiftrt:SI (match_operand:SI 1 "extend_reg_operand" "r,r") - (match_operand:SI 2 "nonmemory_operand" "r,n")))] + (match_operand:SI 2 "shift_count_operand" "r,n")))] "TARGET_SHMEDIA" "@ shlrd.l %1, %2, %0 shlri.l %1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) (define_expand "lshrsi3" - [(parallel [(set (match_operand:SI 0 "arith_reg_operand" "") + [(parallel [(set (match_operand:SI 0 "arith_reg_dest" "") (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand" "") (match_operand:SI 2 "nonmemory_operand" ""))) (clobber (reg:SI T_REG))])] @@ -2575,7 +3784,7 @@ ;; ??? This should be a define expand. (define_insn "ashldi3_k" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (ashift:DI (match_operand:DI 1 "arith_reg_operand" "0") (const_int 1))) (clobber (reg:SI T_REG))] @@ -2585,15 +3794,24 @@ (set_attr "type" "arith")]) (define_insn "ashldi3_media" - [(set (match_operand:DI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r,r") (ashift:DI (match_operand:DI 1 "arith_reg_operand" "r,r") - (match_operand:DI 2 "nonmemory_operand" "r,n")))] + (match_operand:DI 2 "shift_count_operand" "r,n")))] "TARGET_SHMEDIA" "@ shlld %1, %2, %0 shlli %1, %2, %0" [(set_attr "type" "arith_media")]) +(define_insn "*ashldisi3_media" + [(set (subreg:DI (match_operand:SI 0 "arith_reg_operand" "=r") 0) + (ashift:DI (match_operand:DI 1 "arith_reg_operand" "r") + (match_operand:DI 2 "const_int_operand" "n")))] + "TARGET_SHMEDIA && INTVAL (operands[2]) < 32" + "shlli.l %1, %2, %0" + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) + (define_expand "ashldi3" [(parallel [(set (match_operand:DI 0 "arith_reg_operand" "") (ashift:DI (match_operand:DI 1 "arith_reg_operand" "") @@ -2615,7 +3833,7 @@ ;; ??? This should be a define expand. 
(define_insn "lshrdi3_k" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (lshiftrt:DI (match_operand:DI 1 "arith_reg_operand" "0") (const_int 1))) (clobber (reg:SI T_REG))] @@ -2625,15 +3843,26 @@ (set_attr "type" "arith")]) (define_insn "lshrdi3_media" - [(set (match_operand:DI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:DI 0 "ext_dest_operand" "=r,r") (lshiftrt:DI (match_operand:DI 1 "arith_reg_operand" "r,r") - (match_operand:DI 2 "nonmemory_operand" "r,n")))] - "TARGET_SHMEDIA" + (match_operand:DI 2 "shift_count_operand" "r,n")))] + "TARGET_SHMEDIA + && (arith_reg_dest (operands[0], DImode) + || (GET_CODE (operands[2]) == CONST_INT && INTVAL (operands[2]) > 32))" "@ shlrd %1, %2, %0 shlri %1, %2, %0" [(set_attr "type" "arith_media")]) +(define_insn "*lshrdisi3_media" + [(set (subreg:DI (match_operand:SI 0 "arith_reg_operand" "=r") 0) + (lshiftrt:DI (match_operand:DI 1 "arith_reg_operand" "r") + (match_operand:DI 2 "const_int_operand" "n")))] + "TARGET_SHMEDIA && INTVAL (operands[2]) < 32" + "shlri.l %1, %2, %0" + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) + (define_expand "lshrdi3" [(parallel [(set (match_operand:DI 0 "arith_reg_operand" "") (lshiftrt:DI (match_operand:DI 1 "arith_reg_operand" "") @@ -2655,7 +3884,7 @@ ;; ??? This should be a define expand. (define_insn "ashrdi3_k" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (ashiftrt:DI (match_operand:DI 1 "arith_reg_operand" "0") (const_int 1))) (clobber (reg:SI T_REG))] @@ -2665,15 +3894,44 @@ (set_attr "type" "arith")]) (define_insn "ashrdi3_media" - [(set (match_operand:DI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:DI 0 "ext_dest_operand" "=r,r") (ashiftrt:DI (match_operand:DI 1 "arith_reg_operand" "r,r") - (match_operand:DI 2 "nonmemory_operand" "r,n")))] - "TARGET_SHMEDIA" + (match_operand:DI 2 "shift_count_operand" "r,n")))] + "TARGET_SHMEDIA + && (arith_reg_dest (operands[0], DImode) + || (GET_CODE (operands[2]) == CONST_INT && INTVAL (operands[2]) >= 32))" "@ shard %1, %2, %0 shari %1, %2, %0" [(set_attr "type" "arith_media")]) +(define_insn "*ashrdisi3_media" + [(set (subreg:DI (match_operand:SI 0 "arith_reg_operand" "=r") 0) + (ashiftrt:DI (match_operand:DI 1 "arith_reg_operand" "r") + (match_operand:DI 2 "const_int_operand" "n")))] + "TARGET_SHMEDIA && INTVAL (operands[2]) < 32" + "shari.l %1, %2, %0" + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) + +(define_insn "ashrdisi3_media_high" + [(set (match_operand:SI 0 "arith_reg_dest" "=r") + (truncate:SI + (ashiftrt:DI (match_operand:DI 1 "arith_reg_operand" "r") + (match_operand:DI 2 "const_int_operand" "n"))))] + "TARGET_SHMEDIA && INTVAL (operands[2]) >= 32" + "shari %1, %2, %0" + [(set_attr "type" "arith_media")]) + +(define_insn "ashrdisi3_media_opaque" + [(set (match_operand:SI 0 "arith_reg_dest" "=r") + (unspec:SI [(match_operand:DI 1 "arith_reg_operand" "r") + (match_operand:DI 2 "const_int_operand" "n")] + UNSPEC_ASHIFTRT))] + "TARGET_SHMEDIA" + "shari %1, %2, %0" + [(set_attr "type" "arith_media")]) + (define_expand "ashrdi3" [(parallel [(set (match_operand:DI 0 "arith_reg_operand" "") (ashiftrt:DI (match_operand:DI 1 "arith_reg_operand" "") @@ -2890,7 +4148,7 @@ ;; allow the xtrct instruction to be generated from C source. 
(define_insn "xtrct_left" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (ior:SI (ashift:SI (match_operand:SI 1 "arith_reg_operand" "r") (const_int 16)) (lshiftrt:SI (match_operand:SI 2 "arith_reg_operand" "0") @@ -2900,7 +4158,7 @@ [(set_attr "type" "arith")]) (define_insn "xtrct_right" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (ior:SI (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0") (const_int 16)) (ashift:SI (match_operand:SI 2 "arith_reg_operand" "r") @@ -2914,7 +4172,7 @@ ;; ------------------------------------------------------------------------- (define_insn "negc" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (neg:SI (plus:SI (reg:SI T_REG) (match_operand:SI 1 "arith_reg_operand" "r")))) (set (reg:SI T_REG) @@ -2925,7 +4183,7 @@ [(set_attr "type" "arith")]) (define_insn "*negdi_media" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (neg:DI (match_operand:DI 1 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "sub r63, %1, %0" @@ -2956,35 +4214,61 @@ }") (define_insn "negsi2" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (neg:SI (match_operand:SI 1 "arith_reg_operand" "r")))] "TARGET_SH1" "neg %1,%0" [(set_attr "type" "arith")]) (define_insn "one_cmplsi2" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (not:SI (match_operand:SI 1 "arith_reg_operand" "r")))] "TARGET_SH1" "not %1,%0" [(set_attr "type" "arith")]) -(define_expand "one_cmpldi2" - [(set (match_operand:DI 0 "arith_reg_operand" "") - (xor:DI (match_operand:DI 1 "arith_reg_operand" "") - (const_int -1)))] - "TARGET_SHMEDIA" "") +(define_expand "one_cmpldi2" + [(set (match_operand:DI 0 "arith_reg_dest" "") + (xor:DI (match_operand:DI 1 "arith_reg_operand" "") + (const_int -1)))] + "TARGET_SHMEDIA" "") + +/* The SH4 202 can do zero-offset branches without pipeline stalls. + This can be used as some kind of conditional execution, which is useful + for abs. 
*/ +(define_split + [(set (match_operand:SI 0 "arith_reg_dest" "") + (plus:SI (xor:SI (neg:SI (reg:SI T_REG)) + (match_operand:SI 1 "arith_reg_operand" "")) + (reg:SI T_REG)))] + "TARGET_HARD_SH4" + [(const_int 0)] + "emit_insn (gen_movsi_i (operands[0], operands[1])); + emit_insn (gen_cneg (operands[0], operands[0], operands[0])); + DONE;") + +(define_insn "cneg" + [(set (match_operand:SI 0 "arith_reg_dest" "=r") + (if_then_else:SI (eq:SI (reg:SI T_REG) (const_int 0)) + (match_operand:SI 1 "arith_reg_operand" "0") + (neg:SI (match_operand:SI 2 "arith_reg_operand" "r"))))] + "TARGET_HARD_SH4" + "bf 0f\;neg %2,%0\\n0:" + [(set_attr "type" "arith") ;; poor approximation + (set_attr "length" "4")]) + ;; ------------------------------------------------------------------------- ;; Zero extension instructions ;; ------------------------------------------------------------------------- (define_insn "zero_extendsidi2" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (zero_extend:DI (match_operand:SI 1 "extend_reg_operand" "r")))] "TARGET_SHMEDIA" "addz.l %1, r63, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "extend")]) (define_insn "zero_extendhidi2" [(set (match_operand:DI 0 "register_operand" "=r,r") @@ -2993,7 +4277,11 @@ "@ # ld%M1.uw %m1, %0" - [(set_attr "type" "*,load_media")]) + [(set_attr "type" "*,load_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_split [(set (match_operand:DI 0 "register_operand" "") @@ -3010,7 +4298,7 @@ ;; ??? when a truncated input to a zero_extend is reloaded, reload will ;; reload the entire truncate expression. 
(define_insn_and_split "*loaddi_trunc" - [(set (match_operand 0 "int_gpr_dest" "=r") + [(set (match_operand 0 "any_register_operand" "=r") (truncate (match_operand:DI 1 "memory_operand" "m")))] "TARGET_SHMEDIA && reload_completed" "#" @@ -3025,7 +4313,11 @@ "@ andi %1, 255, %0 ld%M1.ub %m1, %0" - [(set_attr "type" "arith_media,load_media")]) + [(set_attr "type" "arith_media,load_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_expand "zero_extendhisi2" [(set (match_operand:SI 0 "arith_reg_operand" "") @@ -3038,7 +4330,7 @@ }") (define_insn "*zero_extendhisi2_compact" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (zero_extend:SI (match_operand:HI 1 "arith_reg_operand" "r")))] "TARGET_SH1" "extu.w %1,%0" @@ -3051,18 +4343,27 @@ "@ # ld%M1.uw %m1, %0" - [(set_attr "type" "arith_media,load_media")]) + [(set_attr "type" "arith_media,load_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_split [(set (match_operand:SI 0 "register_operand" "") (zero_extend:SI (match_operand:HI 1 "extend_reg_operand" "")))] "TARGET_SHMEDIA && reload_completed" - [(set (match_dup 0) (ashift:SI (subreg:SI (match_dup 1) 0) (const_int 16))) + [(set (match_dup 0) (ashift:SI (match_dup 2) (const_int 16))) (set (match_dup 0) (lshiftrt:SI (match_dup 0) (const_int 16)))] " { - if (GET_CODE (operands[1]) == TRUNCATE) - operands[1] = XEXP (operands[1], 0); + rtx op1 = operands[1]; + + if (GET_CODE (op1) == TRUNCATE) + op1 = XEXP (op1, 0); + operands[2] + = simplify_gen_subreg (SImode, op1, GET_MODE (op1), + subreg_lowpart_offset (SImode, GET_MODE (op1))); }") (define_expand "zero_extendqisi2" @@ -3076,7 +4377,7 @@ }") (define_insn "*zero_extendqisi2_compact" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (zero_extend:SI (match_operand:QI 1 "arith_reg_operand" "r")))] "TARGET_SH1" "extu.b %1,%0" @@ -3089,10 +4390,14 @@ "@ andi %1, 255, %0 ld%M1.ub %m1, %0" - [(set_attr "type" "arith_media,load_media")]) + [(set_attr "type" "arith_media,load_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_insn "zero_extendqihi2" - [(set (match_operand:HI 0 "arith_reg_operand" "=r") + [(set (match_operand:HI 0 "arith_reg_dest" "=r") (zero_extend:HI (match_operand:QI 1 "arith_reg_operand" "r")))] "TARGET_SH1" "extu.b %1,%0" @@ -3107,13 +4412,18 @@ ;; convert_move generates good code for SH[1-4]. 
(define_insn "extendsidi2" - [(set (match_operand:DI 0 "register_operand" "=r,r") - (sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m")))] + [(set (match_operand:DI 0 "register_operand" "=r,r,r") + (sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m,?f")))] "TARGET_SHMEDIA" "@ add.l %1, r63, %0 - ld%M1.l %m1, %0" - [(set_attr "type" "arith_media,load_media")]) + ld%M1.l %m1, %0 + fmov.sl %1, %0" + [(set_attr "type" "arith_media,load_media,fpconv_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "extend")))]) (define_insn "extendhidi2" [(set (match_operand:DI 0 "register_operand" "=r,r") @@ -3122,7 +4432,11 @@ "@ # ld%M1.w %m1, %0" - [(set_attr "type" "*,load_media")]) + [(set_attr "type" "*,load_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_split [(set (match_operand:DI 0 "register_operand" "") @@ -3143,7 +4457,11 @@ "@ # ld%M1.b %m1, %0" - [(set_attr "type" "*,load_media")]) + [(set_attr "type" "*,load_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_split [(set (match_operand:DI 0 "register_operand" "") @@ -3158,13 +4476,13 @@ }") (define_expand "extendhisi2" - [(set (match_operand:SI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r") (sign_extend:SI (match_operand:HI 1 "general_extend_operand" "r,m")))] "" "") (define_insn "*extendhisi2_compact" - [(set (match_operand:SI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r") (sign_extend:SI (match_operand:HI 1 "general_movsrc_operand" "r,m")))] "TARGET_SH1" "@ @@ -3179,28 +4497,36 @@ "@ # ld%M1.w %m1, %0" - [(set_attr "type" "arith_media,load_media")]) + [(set_attr "type" "arith_media,load_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_split [(set (match_operand:SI 0 "register_operand" "") (sign_extend:SI (match_operand:HI 1 "extend_reg_operand" "")))] "TARGET_SHMEDIA && reload_completed" - [(set (match_dup 0) (ashift:SI (subreg:SI (match_dup 1) 0) (const_int 16))) + [(set (match_dup 0) (ashift:SI (match_dup 2) (const_int 16))) (set (match_dup 0) (ashiftrt:SI (match_dup 0) (const_int 16)))] " { - if (GET_CODE (operands[1]) == TRUNCATE) - operands[1] = XEXP (operands[1], 0); + rtx op1 = operands[1]; + if (GET_CODE (op1) == TRUNCATE) + op1 = XEXP (op1, 0); + operands[2] + = simplify_gen_subreg (SImode, op1, GET_MODE (op1), + subreg_lowpart_offset (SImode, GET_MODE (op1))); }") (define_expand "extendqisi2" - [(set (match_operand:SI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r") (sign_extend:SI (match_operand:QI 1 "general_extend_operand" "r,m")))] "" "") (define_insn "*extendqisi2_compact" - [(set (match_operand:SI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r,r") (sign_extend:SI (match_operand:QI 1 "general_movsrc_operand" "r,m")))] "TARGET_SH1" "@ @@ -3215,22 +4541,30 @@ "@ # ld%M1.b %m1, %0" - [(set_attr "type" "arith_media,load_media")]) + [(set_attr "type" "arith_media,load_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) 
(define_split [(set (match_operand:SI 0 "register_operand" "") (sign_extend:SI (match_operand:QI 1 "extend_reg_operand" "")))] "TARGET_SHMEDIA && reload_completed" - [(set (match_dup 0) (ashift:SI (subreg:SI (match_dup 1) 0) (const_int 24))) + [(set (match_dup 0) (ashift:SI (match_dup 2) (const_int 24))) (set (match_dup 0) (ashiftrt:SI (match_dup 0) (const_int 24)))] " { - if (GET_CODE (operands[1]) == TRUNCATE) - operands[1] = XEXP (operands[1], 0); + rtx op1 = operands[1]; + if (GET_CODE (op1) == TRUNCATE) + op1 = XEXP (op1, 0); + operands[2] + = simplify_gen_subreg (SImode, op1, GET_MODE (op1), + subreg_lowpart_offset (SImode, GET_MODE (op1))); }") (define_insn "extendqihi2" - [(set (match_operand:HI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:HI 0 "arith_reg_dest" "=r,r") (sign_extend:HI (match_operand:QI 1 "general_movsrc_operand" "r,m")))] "TARGET_SH1" "@ @@ -3252,8 +4586,11 @@ fmov.ls %1, %0 fmov.sl %T1, %0 fmov.s %T1, %0" - [(set_attr "type" "arith_media,store_media,fstore_media,fload_media,fpconv_media,fmove_media")]) - + [(set_attr "type" "arith_media,store_media,fstore_media,fload_media,fpconv_media,fmove_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "extend")))]) (define_insn "truncdihi2" [(set (match_operand:HI 0 "general_movdst_operand" "=?r,m") @@ -3263,7 +4600,11 @@ shlli\\t%1,48,%0\;shlri\\t%0,48,%0 st%M0.w %m0, %1" [(set_attr "type" "arith_media,store_media") - (set_attr "length" "8,4")]) + (set_attr "length" "8,4") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "extend")))]) ; N.B. This should agree with LOAD_EXTEND_OP and movqi. ; Because we use zero extension, we can't provide signed QImode compares @@ -3275,8 +4616,11 @@ "@ andi %1, 255, %0 st%M0.b %m0, %1" - [(set_attr "type" "arith_media,store")]) - + [(set_attr "type" "arith_media,store") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "extend")))]) ;; ------------------------------------------------------------------------- ;; Move instructions ;; ------------------------------------------------------------------------- @@ -3480,7 +4824,7 @@ (define_insn_and_split "load_ra" [(set (match_operand:SI 0 "general_movdst_operand" "") - (unspec:SI [(match_operand 1 "register_operand" "")] UNSPEC_RA))] + (unspec:SI [(match_operand:SI 1 "register_operand" "")] UNSPEC_RA))] "TARGET_SH1" "#" "&& ! currently_expanding_to_rtl" @@ -3491,14 +4835,18 @@ operands[1] = gen_rtx_MEM (SImode, return_address_pointer_rtx); }") +;; The '?'s in the following constraints may not reflect the time taken +;; to perform the move. They are there to discourage the use of floating- +;; point registers for storing integer values. 
(define_insn "*movsi_media" [(set (match_operand:SI 0 "general_movdst_operand" - "=r,r,r,r,m,f,m,f,r,f,*b,r,b") + "=r,r,r,r,m,f?,m,f?,r,f?,*b,r,b") (match_operand:SI 1 "general_movsrc_operand" - "r,I16C16,nCpg,m,rZ,m,f,rZ,f,f,r,*b,Csy"))] + "r,I16C16,nCpg,m,rZ,m,f?,rZ,f?,f?,r,*b,Csy"))] "TARGET_SHMEDIA_FPU && (register_operand (operands[0], SImode) - || sh_register_operand (operands[1], SImode))" + || sh_register_operand (operands[1], SImode) + || GET_CODE (operands[1]) == TRUNCATE)" "@ add.l %1, r63, %0 movi %1, %0 @@ -3514,16 +4862,21 @@ gettr %1, %0 pt %1, %0" [(set_attr "type" "arith_media,arith_media,*,load_media,store_media,fload_media,fstore_media,fload_media,fpconv_media,fmove_media,ptabs_media,gettr_media,pt_media") - (set_attr "length" "4,4,8,4,4,4,4,4,4,4,4,4,12")]) + (set_attr "length" "4,4,8,4,4,4,4,4,4,4,4,4,12") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_insn "*movsi_media_nofpu" [(set (match_operand:SI 0 "general_movdst_operand" - "=r,r,r,r,m,*b,r,b") + "=r,r,r,r,m,*b,r,*b") (match_operand:SI 1 "general_movsrc_operand" "r,I16C16,nCpg,m,rZ,r,*b,Csy"))] "TARGET_SHMEDIA && (register_operand (operands[0], SImode) - || sh_register_operand (operands[1], SImode))" + || sh_register_operand (operands[1], SImode) + || GET_CODE (operands[1]) == TRUNCATE)" "@ add.l %1, r63, %0 movi %1, %0 @@ -3534,18 +4887,72 @@ gettr %1, %0 pt %1, %0" [(set_attr "type" "arith_media,arith_media,*,load_media,store_media,ptabs_media,gettr_media,pt_media") - (set_attr "length" "4,4,8,4,4,4,4,12")]) + (set_attr "length" "4,4,8,4,4,4,4,12") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) + +(define_expand "movsi_const" + [(set (match_operand:SI 0 "arith_reg_operand" "=r") + (const:SI (sign_extend:SI + (truncate:HI + (ashiftrt:SI + (match_operand:DI 1 "immediate_operand" "s") + (const_int 16)))))) + (set (match_dup 0) + (ior:SI (ashift:SI (match_dup 0) (const_int 16)) + (zero_extend:SI + (truncate:HI + (const:SI + (sign_extend:SI + (truncate:HI (match_dup 1))))))))] + "TARGET_SHMEDIA && reload_completed + && MOVI_SHORI_BASE_OPERAND_P (operands[1])" + " +{ + if (GET_CODE (operands[1]) == LABEL_REF + && GET_CODE (XEXP (operands[1], 0)) == CODE_LABEL) + LABEL_NUSES (XEXP (operands[1], 0)) += 2; + else if (GOTOFF_P (operands[1])) + { + rtx unspec = XEXP (operands[1], 0); + + if (! UNSPEC_GOTOFF_P (unspec)) + { + unspec = XEXP (unspec, 0); + if (! 
UNSPEC_GOTOFF_P (unspec)) + abort (); + } + if (GET_CODE (XVECEXP (unspec , 0, 0)) == LABEL_REF + && (GET_CODE (XEXP (XVECEXP (unspec, 0, 0), 0)) == CODE_LABEL)) + LABEL_NUSES (XEXP (XVECEXP (unspec, 0, 0), 0)) += 2; + } +}") + +(define_expand "movsi_const_16bit" + [(set (match_operand:SI 0 "arith_reg_operand" "=r") + (const:SI (sign_extend:SI + (truncate:HI + (match_operand:DI 1 "immediate_operand" "s")))))] + "TARGET_SHMEDIA && flag_pic && reload_completed + && GET_CODE (operands[1]) == SYMBOL_REF" + "") (define_split - [(set (match_operand:SI 0 "arith_reg_operand" "") + [(set (match_operand:SI 0 "arith_reg_dest" "") (match_operand:SI 1 "immediate_operand" ""))] "TARGET_SHMEDIA && reload_completed && MOVI_SHORI_BASE_OPERAND_P (operands[1])" - [(set (subreg:DI (match_dup 0) 0) (match_dup 2))] + [(const_int 0)] " { - operands[2] = shallow_copy_rtx (operands[1]); - PUT_MODE (operands[2], DImode); + rtx insn = emit_insn (gen_movsi_const (operands[0], operands[1])); + + REG_NOTES (insn) = gen_rtx_EXPR_LIST (REG_EQUAL, operands[1], + REG_NOTES (insn)); + + DONE; }") (define_split @@ -3577,7 +4984,7 @@ } else if (TARGET_SHCOMPACT) { - operands[1] = function_symbol (\"__ic_invalidate\"); + operands[1] = function_symbol (NULL, \"__ic_invalidate\", SFUNC_STATIC); operands[1] = force_reg (Pmode, operands[1]); emit_insn (gen_ic_invalidate_line_compact (operands[0], operands[1])); DONE; @@ -3618,7 +5025,7 @@ ;; ??? could make arg 0 an offsettable memory operand to allow to save ;; an add in the code that calculates the address. (define_insn "ic_invalidate_line_media" - [(unspec_volatile [(match_operand 0 "register_operand" "r")] + [(unspec_volatile [(match_operand 0 "any_register_operand" "r")] UNSPEC_ICACHE)] "TARGET_SHMEDIA" "ocbwb %0,0\;synco\;icbi %0, 0\;synci" @@ -3645,7 +5052,8 @@ rtx sfun, tramp; tramp = force_reg (Pmode, operands[0]); - sfun = force_reg (Pmode, function_symbol (\"__init_trampoline\")); + sfun = force_reg (Pmode, function_symbol (NULL, \"__init_trampoline\", + SFUNC_STATIC)); emit_move_insn (gen_rtx_REG (SImode, R2_REG), operands[1]); emit_move_insn (gen_rtx_REG (SImode, R3_REG), operands[2]); @@ -3685,13 +5093,17 @@ (match_operand:QI 1 "general_movsrc_operand" "r,I16C16,m,rZ"))] "TARGET_SHMEDIA && (arith_reg_operand (operands[0], QImode) - || arith_reg_or_0_operand (operands[1], QImode))" + || extend_reg_or_0_operand (operands[1], QImode))" "@ add.l %1, r63, %0 movi %1, %0 ld%M1.ub %m1, %0 st%M0.b %m0, %N1" - [(set_attr "type" "arith_media,arith_media,load_media,store_media")]) + [(set_attr "type" "arith_media,arith_media,load_media,store_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_expand "movqi" [(set (match_operand:QI 0 "general_operand" "") @@ -3749,7 +5161,11 @@ # ld%M1.w %m1, %0 st%M0.w %m0, %N1" - [(set_attr "type" "arith_media,arith_media,*,load_media,store_media")]) + [(set_attr "type" "arith_media,arith_media,*,load_media,store_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_split [(set (match_operand:HI 0 "register_operand" "") @@ -3847,11 +5263,14 @@ FAIL; }") +;; The '?'s in the following constraints may not reflect the time taken +;; to perform the move. They are there to discourage the use of floating- +;; point registers for storing integer values. 
(define_insn "*movdi_media" [(set (match_operand:DI 0 "general_movdst_operand" - "=r,r,r,rl,m,f,m,f,r,f,*b,r,b") + "=r,r,r,rl,m,f?,m,f?,r,f?,*b,r,*b") (match_operand:DI 1 "general_movsrc_operand" - "r,I16C16,nCpgF,m,rlZ,m,f,rZ,f,f,r,*b,Csy"))] + "r,I16C16,nCpgF,m,rlZ,m,f?,rZ,f?,f?,r,*b,Csy"))] "TARGET_SHMEDIA_FPU && (register_operand (operands[0], DImode) || sh_register_operand (operands[1], DImode))" @@ -3873,7 +5292,7 @@ (set_attr "length" "4,4,16,4,4,4,4,4,4,4,4,4,*")]) (define_insn "*movdi_media_nofpu" - [(set (match_operand:DI 0 "general_movdst_operand" "=r,r,r,rl,m,*b,r,b") + [(set (match_operand:DI 0 "general_movdst_operand" "=r,r,r,rl,m,*b,r,*b"); (match_operand:DI 1 "general_movsrc_operand" "r,I16C16,nCpgF,m,rlZ,r,*b,Csy"))] "TARGET_SHMEDIA && (register_operand (operands[0], DImode) @@ -3890,8 +5309,16 @@ [(set_attr "type" "arith_media,arith_media,*,load_media,store_media,ptabs_media,gettr_media,pt_media") (set_attr "length" "4,4,16,4,4,4,4,*")]) +(define_insn "*movdi_media_I16" + [(set (match_operand:DI 0 "ext_dest_operand" "=r") + (match_operand:DI 1 "const_int_operand" "I16"))] + "TARGET_SHMEDIA && reload_completed" + "movi %1, %0" + [(set_attr "type" "arith_media") + (set_attr "length" "4")]) + (define_split - [(set (match_operand:DI 0 "arith_reg_operand" "") + [(set (match_operand:DI 0 "arith_reg_dest" "") (match_operand:DI 1 "immediate_operand" ""))] "TARGET_SHMEDIA && reload_completed && MOVI_SHORI_BASE_OPERAND_P (operands[1])" @@ -3985,7 +5412,7 @@ "") (define_split - [(set (match_operand:DI 0 "arith_reg_operand" "") + [(set (match_operand:DI 0 "ext_dest_operand" "") (match_operand:DI 1 "immediate_operand" ""))] "TARGET_SHMEDIA && reload_completed && GET_CODE (operands[1]) == CONST_INT @@ -4007,10 +5434,20 @@ /* Arithmetic shift right the word by 16 bits. 
*/ high >>= 16; - sign = 1; - sign <<= (HOST_BITS_PER_WIDE_INT - 16 - 1); - high ^= sign; - high -= sign; + if (GET_CODE (operands[0]) == SUBREG + && GET_MODE (SUBREG_REG (operands[0])) == SImode) + { + high &= 0xffff; + high ^= 0x8000; + high -= 0x8000; + } + else + { + sign = 1; + sign <<= (HOST_BITS_PER_WIDE_INT - 16 - 1); + high ^= sign; + high -= sign; + } do { /* If we can't generate the constant with a two-insn movi / shori @@ -4084,7 +5521,7 @@ }") (define_split - [(set (match_operand:DI 0 "arith_reg_operand" "") + [(set (match_operand:DI 0 "ext_dest_operand" "") (match_operand:DI 1 "immediate_operand" ""))] "TARGET_SHMEDIA && reload_completed && GET_CODE (operands[1]) == CONST_DOUBLE" @@ -4124,18 +5561,28 @@ }") (define_insn "shori_media" - [(set (match_operand:DI 0 "arith_reg_operand" "=r,r") + [(set (match_operand:DI 0 "ext_dest_operand" "=r,r") (ior:DI (ashift:DI (match_operand:DI 1 "arith_reg_operand" "0,0") (const_int 16)) (zero_extend:DI (truncate:HI (match_operand:DI 2 "immediate_operand" "I16C16,nF")))))] - "TARGET_SHMEDIA" + "TARGET_SHMEDIA && (reload_completed || arith_reg_dest (operands[0], DImode))" "@ shori %u2, %0 #" [(set_attr "type" "arith_media,*")]) +(define_insn "*shori_media_si" + [(set (match_operand:SI 0 "arith_reg_dest" "=r") + (ior:SI (ashift:SI (match_operand:SI 1 "arith_reg_operand" "0") + (const_int 16)) + (zero_extend:SI + (truncate:HI + (match_operand:SI 2 "immediate_operand" "I16C16")))))] + "TARGET_SHMEDIA" + "shori %u2, %0") + (define_expand "movdi" [(set (match_operand:DI 0 "general_movdst_operand" "") (match_operand:DI 1 "general_movsrc_operand" ""))] @@ -4174,7 +5621,7 @@ [(set_attr "type" "arith_media,*,load_media,store_media")]) (define_split - [(set (match_operand:DF 0 "arith_reg_operand" "") + [(set (match_operand:DF 0 "arith_reg_dest" "") (match_operand:DF 1 "immediate_operand" ""))] "TARGET_SHMEDIA && reload_completed" [(set (match_dup 3) (match_dup 2))] @@ -4607,7 +6054,8 @@ [(set (match_operand:SI 0 "register_operand" "") (match_operand:SI 1 "" "")) (clobber (match_operand 2 "register_operand" ""))] - "TARGET_SH1 && ! reload_in_progress && ! reload_completed" + "TARGET_SH1 && ! reload_in_progress && ! reload_completed + && ALLOW_INDEXED_ADDRESS" [(use (reg:SI R0_REG))] " { @@ -4634,7 +6082,8 @@ [(set (match_operand:SI 1 "" "") (match_operand:SI 0 "register_operand" "")) (clobber (match_operand 2 "register_operand" ""))] - "TARGET_SH1 && ! reload_in_progress && ! reload_completed" + "TARGET_SH1 && ! reload_in_progress && ! 
reload_completed + && ALLOW_INDEXED_ADDRESS" [(use (reg:SI R0_REG))] " { @@ -4873,7 +6322,11 @@ fst%M0.s %m0, %1 ld%M1.l %m1, %0 st%M0.l %m0, %N1" - [(set_attr "type" "fmove_media,fload_media,fpconv_media,arith_media,*,fload_media,fstore_media,load_media,store_media")]) + [(set_attr "type" "fmove_media,fload_media,fpconv_media,arith_media,*,fload_media,fstore_media,load_media,store_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_insn "movsf_media_nofpu" [(set (match_operand:SF 0 "general_movdst_operand" "=r,r,r,m") @@ -4886,10 +6339,14 @@ # ld%M1.l %m1, %0 st%M0.l %m0, %N1" - [(set_attr "type" "arith_media,*,load_media,store_media")]) + [(set_attr "type" "arith_media,*,load_media,store_media") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_split - [(set (match_operand:SF 0 "arith_reg_operand" "") + [(set (match_operand:SF 0 "arith_reg_dest" "") (match_operand:SF 1 "immediate_operand" ""))] "TARGET_SHMEDIA && reload_completed && ! FP_REGISTER_P (true_regnum (operands[0]))" @@ -5027,12 +6484,70 @@ "") (define_expand "reload_insi" - [(parallel [(set (match_operand:SF 0 "register_operand" "=y") - (match_operand:SF 1 "immediate_operand" "FQ")) + [(parallel [(set (match_operand:SI 0 "fpul_operand" "=y") + (match_operand:SI 1 "immediate_operand" "i")) (clobber (match_operand:SI 2 "register_operand" "=&z"))])] "TARGET_SH1" "") +(define_expand "ptabs" + [(set (match_operand 0 "" "=b") (match_operand 1 "" "r"))] + "TARGET_SHMEDIA" + " +{ + if (!TARGET_PT_FIXED) + { + rtx eq = operands[1]; + + /* ??? For canonical RTL we really should remove any CONST from EQ + before wrapping it in the AND, and finally wrap the EQ into a + const if is constant. However, for reload we must expose the + input register or symbolic constant, and we can't have + different insn structures outside of the operands for different + alternatives of the same pattern. */ + eq = gen_rtx_EQ (SImode, gen_rtx_AND (Pmode, eq, GEN_INT (3)), + GEN_INT (3)); + operands[1] + = (gen_rtx_IF_THEN_ELSE + (PDImode, + eq, + gen_rtx_MEM (PDImode, operands[1]), + gen_rtx_fmt_e (TARGET_SHMEDIA32 ? SIGN_EXTEND : TRUNCATE, + PDImode, operands[1]))); + } +}") + +;; expanded by ptabs expander. +(define_insn "*extendsipdi_media" + [(set (match_operand:PDI 0 "target_reg_operand" "=b,b"); + (if_then_else:PDI (eq (and:SI (match_operand:SI 1 "target_operand" + "r,Csy") + (const_int 3)) + (const_int 3)) + (mem:PDI (match_dup 1)) + (sign_extend:PDI (match_dup 1))))] + "TARGET_SHMEDIA && !TARGET_PT_FIXED" + "@ + ptabs %1, %0 + pt %1, %0" + [(set_attr "type" "ptabs_media,pt_media") + (set_attr "length" "4,*")]) + +(define_insn "*truncdipdi_media" + [(set (match_operand:PDI 0 "target_reg_operand" "=b,b"); + (if_then_else:PDI (eq (and:DI (match_operand:DI 1 "target_operand" + "r,Csy") + (const_int 3)) + (const_int 3)) + (mem:PDI (match_dup 1)) + (truncate:PDI (match_dup 1))))] + "TARGET_SHMEDIA && !TARGET_PT_FIXED" + "@ + ptabs %1, %0 + pt %1, %0" + [(set_attr "type" "ptabs_media,pt_media") + (set_attr "length" "4,*")]) + (define_insn "*movsi_y" [(set (match_operand:SI 0 "register_operand" "=y,y") (match_operand:SI 1 "immediate_operand" "Qi,I08")) @@ -5109,8 +6624,8 @@ ;; jump around the unconditional jump because it was out of range. 
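As a side note on the movdi constant splits above: the following is a minimal standalone sketch, not part of the patch, of how a 32-bit constant is synthesised with a movi/shori pair, including the branch-free xor/subtract idiom the SImode-subreg path uses to sign-extend the leading 16-bit chunk.  The function name synth_si_constant is made up for illustration and long long stands in for HOST_WIDE_INT; the printed text only approximates the emitted insns.

    #include <stdio.h>

    /* Illustrative sketch of the movi/shori constant synthesis performed
       by the splits above (SImode-subreg path).  Not part of the patch.  */
    static void
    synth_si_constant (long long value)
    {
      long long high = (value >> 16) & 0xffff;
      long long low  = value & 0xffff;

      /* Branch-free sign extension of a 16-bit chunk: the xor/subtract
         pair cancels out when bit 15 is clear and subtracts 0x10000
         otherwise, e.g. 0xfffe -> -2 while 0x7fff stays 32767.  */
      high ^= 0x8000;
      high -= 0x8000;

      printf ("movi %lld, r0\n", high);   /* sign-extending 16-bit immediate */
      printf ("shori %lld, r0\n", low);   /* r0 = (r0 << 16) | low */
    }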
(define_insn "stuff_delay_slot" [(set (pc) - (unspec [(match_operand 0 "const_int_operand" "") (pc)] UNSPEC_BBR)) - (set (reg:SI T_REG) (match_operand 1 "const_int_operand" ""))] + (unspec [(match_operand:SI 0 "const_int_operand" "") (pc)] UNSPEC_BBR)) + (set (reg:SI T_REG) (match_operand:SI 1 "const_int_operand" ""))] "TARGET_SH1" "" [(set_attr "length" "0") @@ -5122,90 +6637,126 @@ [(set (pc) (if_then_else (eq (match_operand:DI 1 "arith_reg_operand" "r,r") (match_operand:DI 2 "arith_operand" "r,I06")) - (label_ref:DI (match_operand 0 "" "")) + (match_operand 0 "" "") (pc)))] "TARGET_SHMEDIA" - "") + "operands[0] = gen_rtx_LABEL_REF (Pmode, operands[0]);") (define_insn "*beq_media_i" [(set (pc) (if_then_else (match_operator 3 "equality_comparison_operator" [(match_operand:DI 1 "arith_reg_operand" "r,r") (match_operand:DI 2 "arith_operand" "r,I06")]) - (match_operand:DI 0 "target_operand" "b,b") + (match_operand 0 "target_operand" "b,b") + (pc)))] + "TARGET_SHMEDIA" + "@ + b%o3%' %1, %2, %0%> + b%o3i%' %1, %2, %0%>" + [(set_attr "type" "cbranch_media")]) + +(define_insn "*beq_media_i32" + [(set (pc) + (if_then_else (match_operator 3 "equality_comparison_operator" + [(match_operand:SI 1 "arith_reg_operand" "r,r") + (match_operand:SI 2 "arith_operand" "r,I06")]) + (match_operand 0 "target_operand" "b,b") (pc)))] "TARGET_SHMEDIA" "@ - b%o3%' %1, %2, %0 - b%o3i%' %1, %2, %0" + b%o3%' %1, %2, %0%> + b%o3i%' %1, %2, %0%>" [(set_attr "type" "cbranch_media")]) (define_expand "bne_media" [(set (pc) (if_then_else (ne (match_operand:DI 1 "arith_reg_operand" "r,r") (match_operand:DI 2 "arith_operand" "r,I06")) - (label_ref:DI (match_operand 0 "" "")) + (match_operand 0 "" "") (pc)))] "TARGET_SHMEDIA" - "") + "operands[0] = gen_rtx_LABEL_REF (Pmode, operands[0]);") (define_expand "bgt_media" [(set (pc) - (if_then_else (gt (match_operand:DI 1 "arith_reg_or_0_operand" "r") - (match_operand:DI 2 "arith_reg_or_0_operand" "r")) - (label_ref:DI (match_operand 0 "" "")) + (if_then_else (gt (match_operand:DI 1 "arith_reg_or_0_operand" "") + (match_operand:DI 2 "arith_reg_or_0_operand" "")) + (match_operand 0 "" "") (pc)))] "TARGET_SHMEDIA" - "") + "operands[0] = gen_rtx_LABEL_REF (Pmode, operands[0]);") (define_expand "bge_media" [(set (pc) - (if_then_else (ge (match_operand:DI 1 "arith_reg_or_0_operand" "r") - (match_operand:DI 2 "arith_reg_or_0_operand" "r")) - (label_ref:DI (match_operand 0 "" "")) + (if_then_else (ge (match_operand:DI 1 "arith_reg_or_0_operand" "") + (match_operand:DI 2 "arith_reg_or_0_operand" "")) + (match_operand 0 "" "") (pc)))] "TARGET_SHMEDIA" - "") + "operands[0] = gen_rtx_LABEL_REF (Pmode, operands[0]);") (define_expand "bgtu_media" [(set (pc) - (if_then_else (gtu (match_operand:DI 1 "arith_reg_or_0_operand" "r") - (match_operand:DI 2 "arith_reg_or_0_operand" "r")) - (label_ref:DI (match_operand 0 "" "")) + (if_then_else (gtu (match_operand:DI 1 "arith_reg_or_0_operand" "") + (match_operand:DI 2 "arith_reg_or_0_operand" "")) + (match_operand 0 "" "") (pc)))] "TARGET_SHMEDIA" - "") + "operands[0] = gen_rtx_LABEL_REF (Pmode, operands[0]);") (define_expand "bgeu_media" [(set (pc) - (if_then_else (geu (match_operand:DI 1 "arith_reg_or_0_operand" "r") - (match_operand:DI 2 "arith_reg_or_0_operand" "r")) - (label_ref:DI (match_operand 0 "" "")) + (if_then_else (geu (match_operand:DI 1 "arith_reg_or_0_operand" "") + (match_operand:DI 2 "arith_reg_or_0_operand" "")) + (match_operand 0 "" "") (pc)))] "TARGET_SHMEDIA" - "") + "operands[0] = gen_rtx_LABEL_REF (Pmode, operands[0]);") 
(define_insn "*bgt_media_i" [(set (pc) (if_then_else (match_operator 3 "greater_comparison_operator" [(match_operand:DI 1 "arith_reg_or_0_operand" "rN") (match_operand:DI 2 "arith_reg_or_0_operand" "rN")]) - (match_operand:DI 0 "target_operand" "b") + (match_operand 0 "target_operand" "b") + (pc)))] + "TARGET_SHMEDIA" + "b%o3%' %N1, %N2, %0%>" + [(set_attr "type" "cbranch_media")]) + +(define_insn "*bgt_media_i32" + [(set (pc) + (if_then_else (match_operator 3 "greater_comparison_operator" + [(match_operand:SI 1 "arith_reg_or_0_operand" "rN") + (match_operand:SI 2 "arith_reg_or_0_operand" "rN")]) + (match_operand 0 "target_operand" "b") (pc)))] "TARGET_SHMEDIA" - "b%o3%' %N1, %N2, %0" + "b%o3%' %N1, %N2, %0%>" [(set_attr "type" "cbranch_media")]) -;; These are only needed to make invert_jump() happy. +;; These are only needed to make invert_jump() happy - otherwise, jump +;; optimization will be silently disabled. (define_insn "*blt_media_i" [(set (pc) (if_then_else (match_operator 3 "less_comparison_operator" [(match_operand:DI 1 "arith_reg_or_0_operand" "rN") (match_operand:DI 2 "arith_reg_or_0_operand" "rN")]) - (match_operand:DI 0 "target_operand" "b") + (match_operand 0 "target_operand" "b") (pc)))] "TARGET_SHMEDIA" - "b%o3%' %N2, %N1, %0" + "b%o3%' %N2, %N1, %0%>" + [(set_attr "type" "cbranch_media")]) + +(define_insn "*blt_media_i32" + [(set (pc) + (if_then_else (match_operator 3 "less_comparison_operator" + [(match_operand:SI 1 "arith_reg_or_0_operand" "rN") + (match_operand:SI 2 "arith_reg_or_0_operand" "rN")]) + (match_operand 0 "target_operand" "b") + (pc)))] + "TARGET_SHMEDIA" + "b%o3%' %N2, %N1, %0%>" [(set_attr "type" "cbranch_media")]) (define_expand "beq" @@ -5218,7 +6769,9 @@ { if (TARGET_SHMEDIA) { - if (GET_MODE (sh_compare_op0) != DImode) + enum machine_mode mode = GET_MODE (sh_compare_op0); + + if (mode != DImode && mode != SImode) { rtx tmp = gen_reg_rtx (DImode); @@ -5227,7 +6780,11 @@ DONE; } - sh_compare_op0 = force_reg (DImode, sh_compare_op0); + sh_compare_op0 = force_reg (mode, sh_compare_op0); + if (CONSTANT_P (sh_compare_op1) + && (GET_CODE (sh_compare_op1) != CONST_INT + || ! CONST_OK_FOR_I06 (INTVAL (sh_compare_op1)))) + sh_compare_op1 = force_reg (mode, sh_compare_op1); emit_jump_insn (gen_beq_media (operands[0], sh_compare_op0, sh_compare_op1)); DONE; @@ -5246,7 +6803,9 @@ { if (TARGET_SHMEDIA) { - if (GET_MODE (sh_compare_op0) != DImode) + enum machine_mode mode = GET_MODE (sh_compare_op0); + + if (mode != DImode && mode != SImode) { rtx tmp = gen_reg_rtx (DImode); @@ -5255,7 +6814,11 @@ DONE; } - sh_compare_op0 = force_reg (DImode, sh_compare_op0); + sh_compare_op0 = force_reg (mode, sh_compare_op0); + if (CONSTANT_P (sh_compare_op1) + && (GET_CODE (sh_compare_op1) != CONST_INT + || ! 
CONST_OK_FOR_I06 (INTVAL (sh_compare_op1)))) + sh_compare_op1 = force_reg (mode, sh_compare_op1); emit_jump_insn (gen_bne_media (operands[0], sh_compare_op0, sh_compare_op1)); DONE; @@ -5274,7 +6837,9 @@ { if (TARGET_SHMEDIA) { - if (GET_MODE (sh_compare_op0) != DImode) + enum machine_mode mode = GET_MODE (sh_compare_op0); + + if (mode != DImode && mode != SImode) { rtx tmp = gen_reg_rtx (DImode); @@ -5284,9 +6849,9 @@ } if (sh_compare_op0 != const0_rtx) - sh_compare_op0 = force_reg (DImode, sh_compare_op0); + sh_compare_op0 = force_reg (mode, sh_compare_op0); if (sh_compare_op1 != const0_rtx) - sh_compare_op1 = force_reg (DImode, sh_compare_op1); + sh_compare_op1 = force_reg (mode, sh_compare_op1); emit_jump_insn (gen_bgt_media (operands[0], sh_compare_op0, sh_compare_op1)); DONE; @@ -5305,7 +6870,9 @@ { if (TARGET_SHMEDIA) { - if (GET_MODE (sh_compare_op0) != DImode) + enum machine_mode mode = GET_MODE (sh_compare_op0); + + if (mode != DImode && mode != SImode) { rtx tmp = gen_reg_rtx (DImode); @@ -5315,9 +6882,9 @@ } if (sh_compare_op0 != const0_rtx) - sh_compare_op0 = force_reg (DImode, sh_compare_op0); + sh_compare_op0 = force_reg (mode, sh_compare_op0); if (sh_compare_op1 != const0_rtx) - sh_compare_op1 = force_reg (DImode, sh_compare_op1); + sh_compare_op1 = force_reg (mode, sh_compare_op1); emit_jump_insn (gen_bgt_media (operands[0], sh_compare_op1, sh_compare_op0)); DONE; @@ -5344,7 +6911,9 @@ { if (TARGET_SHMEDIA) { - if (GET_MODE (sh_compare_op0) != DImode) + enum machine_mode mode = GET_MODE (sh_compare_op0); + + if (mode != DImode && mode != SImode) { rtx tmp = gen_reg_rtx (DImode); @@ -5354,9 +6923,9 @@ } if (sh_compare_op0 != const0_rtx) - sh_compare_op0 = force_reg (DImode, sh_compare_op0); + sh_compare_op0 = force_reg (mode, sh_compare_op0); if (sh_compare_op1 != const0_rtx) - sh_compare_op1 = force_reg (DImode, sh_compare_op1); + sh_compare_op1 = force_reg (mode, sh_compare_op1); emit_jump_insn (gen_bge_media (operands[0], sh_compare_op1, sh_compare_op0)); DONE; @@ -5385,7 +6954,9 @@ { if (TARGET_SHMEDIA) { - if (GET_MODE (sh_compare_op0) != DImode) + enum machine_mode mode = GET_MODE (sh_compare_op0); + + if (mode != DImode && mode != SImode) { rtx tmp = gen_reg_rtx (DImode); @@ -5395,9 +6966,9 @@ } if (sh_compare_op0 != const0_rtx) - sh_compare_op0 = force_reg (DImode, sh_compare_op0); + sh_compare_op0 = force_reg (mode, sh_compare_op0); if (sh_compare_op1 != const0_rtx) - sh_compare_op1 = force_reg (DImode, sh_compare_op1); + sh_compare_op1 = force_reg (mode, sh_compare_op1); emit_jump_insn (gen_bge_media (operands[0], sh_compare_op0, sh_compare_op1)); DONE; @@ -5426,10 +6997,12 @@ { if (TARGET_SHMEDIA) { + enum machine_mode mode = GET_MODE (sh_compare_op0); + if (sh_compare_op0 != const0_rtx) - sh_compare_op0 = force_reg (DImode, sh_compare_op0); + sh_compare_op0 = force_reg (mode, sh_compare_op0); if (sh_compare_op1 != const0_rtx) - sh_compare_op1 = force_reg (DImode, sh_compare_op1); + sh_compare_op1 = force_reg (mode, sh_compare_op1); emit_jump_insn (gen_bgtu_media (operands[0], sh_compare_op0, sh_compare_op1)); DONE; @@ -5448,10 +7021,12 @@ { if (TARGET_SHMEDIA) { + enum machine_mode mode = GET_MODE (sh_compare_op0); + if (sh_compare_op0 != const0_rtx) - sh_compare_op0 = force_reg (DImode, sh_compare_op0); + sh_compare_op0 = force_reg (mode, sh_compare_op0); if (sh_compare_op1 != const0_rtx) - sh_compare_op1 = force_reg (DImode, sh_compare_op1); + sh_compare_op1 = force_reg (mode, sh_compare_op1); emit_jump_insn (gen_bgtu_media (operands[0], sh_compare_op1, 
sh_compare_op0)); DONE; @@ -5470,10 +7045,12 @@ { if (TARGET_SHMEDIA) { + enum machine_mode mode = GET_MODE (sh_compare_op0); + if (sh_compare_op0 != const0_rtx) - sh_compare_op0 = force_reg (DImode, sh_compare_op0); + sh_compare_op0 = force_reg (mode, sh_compare_op0); if (sh_compare_op1 != const0_rtx) - sh_compare_op1 = force_reg (DImode, sh_compare_op1); + sh_compare_op1 = force_reg (mode, sh_compare_op1); emit_jump_insn (gen_bgeu_media (operands[0], sh_compare_op0, sh_compare_op1)); DONE; @@ -5492,10 +7069,12 @@ { if (TARGET_SHMEDIA) { + enum machine_mode mode = GET_MODE (sh_compare_op0); + if (sh_compare_op0 != const0_rtx) - sh_compare_op0 = force_reg (DImode, sh_compare_op0); + sh_compare_op0 = force_reg (mode, sh_compare_op0); if (sh_compare_op1 != const0_rtx) - sh_compare_op1 = force_reg (DImode, sh_compare_op1); + sh_compare_op1 = force_reg (mode, sh_compare_op1); emit_jump_insn (gen_bgeu_media (operands[0], sh_compare_op1, sh_compare_op0)); DONE; @@ -5508,15 +7087,45 @@ [(set (match_dup 1) (unordered:DI (match_dup 2) (match_dup 3))) (set (pc) (if_then_else (ne (match_dup 1) (const_int 0)) - (label_ref:DI (match_operand 0 "" "")) + (match_operand 0 "" "") (pc)))] "TARGET_SHMEDIA" " { + operands[0] = gen_rtx_LABEL_REF (Pmode, operands[0]); operands[1] = gen_reg_rtx (DImode); operands[2] = force_reg (GET_MODE (sh_compare_op0), sh_compare_op0); operands[3] = force_reg (GET_MODE (sh_compare_op1), sh_compare_op1); }") + +;; combiner splitter for test-and-branch on single bit in register. This +;; is endian dependent because the non-paradoxical subreg looks different +;; on big endian. +(define_split + [(set (pc) + (if_then_else + (match_operator 3 "equality_comparison_operator" + [(subreg:SI (zero_extract:DI (subreg:DI (match_operand:SI 1 + "extend_reg_operand" "") + 0) + (const_int 1) + (match_operand 2 + "const_int_operand" "")) 0) + (const_int 0)]) + (match_operand 0 "target_operand" "") + (pc))) + (clobber (match_operand:SI 4 "arith_reg_dest" ""))] + "TARGET_SHMEDIA && TARGET_LITTLE_ENDIAN" + [(set (match_dup 4) (ashift:SI (match_dup 1) (match_dup 5))) + (set (pc) (if_then_else (match_dup 6) (match_dup 0) (pc)))] + + " +{ + operands[5] = GEN_INT (31 - INTVAL (operands[2])); + operands[6] = (GET_CODE (operands[3]) == EQ + ? 
gen_rtx_GE (VOIDmode, operands[4], const0_rtx) + : gen_rtx_GT (VOIDmode, const0_rtx, operands[4])); +}") ;; ------------------------------------------------------------------------ ;; Jump and linkage insns @@ -5552,9 +7161,9 @@ (define_insn "jump_media" [(set (pc) - (match_operand:DI 0 "target_operand" "b"))] + (match_operand 0 "target_operand" "b"))] "TARGET_SHMEDIA" - "blink %0, r63" + "blink %0, r63%>" [(set_attr "type" "jump_media")]) (define_expand "jump" @@ -5569,7 +7178,7 @@ { if (reload_in_progress || reload_completed) FAIL; - emit_jump_insn (gen_jump_media (gen_rtx_LABEL_REF (DImode, + emit_jump_insn (gen_jump_media (gen_rtx_LABEL_REF (Pmode, operands[0]))); } DONE; @@ -5679,7 +7288,7 @@ (set_attr "needs_delay_slot" "yes")]) (define_insn "call_media" - [(call (mem:DI (match_operand:DI 0 "target_reg_operand" "b")) + [(call (mem:DI (match_operand 0 "target_reg_operand" "b")) (match_operand 1 "" "")) (clobber (reg:DI PR_MEDIA_REG))] "TARGET_SHMEDIA" @@ -5786,7 +7395,7 @@ (define_insn "call_value_media" [(set (match_operand 0 "" "=rf") - (call (mem:DI (match_operand:DI 1 "target_reg_operand" "b")) + (call (mem:DI (match_operand 1 "target_reg_operand" "b")) (match_operand 2 "" ""))) (clobber (reg:DI PR_MEDIA_REG))] "TARGET_SHMEDIA" @@ -5804,48 +7413,7 @@ { if (TARGET_SHMEDIA) { - operands[0] = XEXP (operands[0], 0); - if (flag_pic && GET_CODE (operands[0]) == SYMBOL_REF) - { - if (! SYMBOL_REF_LOCAL_P (operands[0])) - { - rtx reg = gen_reg_rtx (Pmode); - - emit_insn (gen_symGOTPLT2reg (reg, operands[0])); - operands[0] = reg; - } - else - { - operands[0] = gen_sym2PIC (operands[0]); - PUT_MODE (operands[0], Pmode); - } - } - if (GET_MODE (operands[0]) == SImode) - { - if (GET_CODE (operands[0]) == REG) - operands[0] = gen_rtx_SUBREG (DImode, operands[0], 0); - else if (GET_CODE (operands[0]) == SUBREG) - { - operands[0] = SUBREG_REG (operands[0]); - if (GET_MODE (operands[0]) != DImode) - operands[0] = gen_rtx_SUBREG (DImode, operands[0], 0); - } - else if (TARGET_SHMEDIA64) - { - operands[0] = shallow_copy_rtx (operands[0]); - PUT_MODE (operands[0], DImode); - } - else - { - rtx reg = gen_reg_rtx (DImode); - - operands[0] = copy_to_mode_reg (SImode, operands[0]); - emit_insn (gen_extendsidi2 (reg, operands[0])); - operands[0] = reg; - } - } - if (! target_reg_operand (operands[0], DImode)) - operands[0] = copy_to_mode_reg (DImode, operands[0]); + operands[0] = shmedia_prepare_call_address (operands[0], 0); emit_call_insn (gen_call_media (operands[0], operands[1])); DONE; } @@ -5877,14 +7445,9 @@ run out of registers when adjusting fpscr for the call. */ emit_insn (gen_force_mode_for_call ()); - operands[0] = function_symbol (\"__GCC_shcompact_call_trampoline\"); - if (flag_pic) - { - rtx reg = gen_reg_rtx (Pmode); - - emit_insn (gen_symGOTPLT2reg (reg, operands[0])); - operands[0] = reg; - } + operands[0] + = function_symbol (NULL, \"__GCC_shcompact_call_trampoline\", + SFUNC_GOT); operands[0] = force_reg (SImode, operands[0]); emit_move_insn (r0, func); @@ -6001,14 +7564,8 @@ run out of registers when adjusting fpscr for the call. 
*/ emit_insn (gen_force_mode_for_call ()); - operands[0] = function_symbol (\"__GCC_shcompact_call_trampoline\"); - if (flag_pic) - { - rtx reg = gen_reg_rtx (Pmode); - - emit_insn (gen_symGOTPLT2reg (reg, operands[0])); - operands[0] = reg; - } + operands[0] = function_symbol (NULL, \"__GCC_shcompact_call_trampoline\", + SFUNC_GOT); operands[0] = force_reg (SImode, operands[0]); emit_move_insn (r0, func); @@ -6036,48 +7593,7 @@ { if (TARGET_SHMEDIA) { - operands[1] = XEXP (operands[1], 0); - if (flag_pic && GET_CODE (operands[1]) == SYMBOL_REF) - { - if (! SYMBOL_REF_LOCAL_P (operands[1])) - { - rtx reg = gen_reg_rtx (Pmode); - - emit_insn (gen_symGOTPLT2reg (reg, operands[1])); - operands[1] = reg; - } - else - { - operands[1] = gen_sym2PIC (operands[1]); - PUT_MODE (operands[1], Pmode); - } - } - if (GET_MODE (operands[1]) == SImode) - { - if (GET_CODE (operands[1]) == REG) - operands[1] = gen_rtx_SUBREG (DImode, operands[1], 0); - else if (GET_CODE (operands[1]) == SUBREG) - { - operands[1] = SUBREG_REG (operands[1]); - if (GET_MODE (operands[1]) != DImode) - operands[1] = gen_rtx_SUBREG (DImode, operands[1], 0); - } - else if (TARGET_SHMEDIA64) - { - operands[1] = shallow_copy_rtx (operands[1]); - PUT_MODE (operands[1], DImode); - } - else - { - rtx reg = gen_reg_rtx (DImode); - - operands[1] = copy_to_mode_reg (SImode, operands[1]); - emit_insn (gen_extendsidi2 (reg, operands[1])); - operands[1] = reg; - } - } - if (! target_reg_operand (operands[1], DImode)) - operands[1] = copy_to_mode_reg (DImode, operands[1]); + operands[1] = shmedia_prepare_call_address (operands[1], 0); emit_call_insn (gen_call_value_media (operands[0], operands[1], operands[2])); DONE; @@ -6110,14 +7626,9 @@ run out of registers when adjusting fpscr for the call. */ emit_insn (gen_force_mode_for_call ()); - operands[1] = function_symbol (\"__GCC_shcompact_call_trampoline\"); - if (flag_pic) - { - rtx reg = gen_reg_rtx (Pmode); - - emit_insn (gen_symGOTPLT2reg (reg, operands[1])); - operands[1] = reg; - } + operands[1] + = function_symbol (NULL, \"__GCC_shcompact_call_trampoline\", + SFUNC_GOT); operands[1] = force_reg (SImode, operands[1]); emit_move_insn (r0, func); @@ -6185,6 +7696,22 @@ (const_string "single") (const_string "double"))) (set_attr "type" "jump_ind")]) +;; This uses an unspec to describe that the symbol_ref is very close. +(define_insn "sibcalli_thunk" + [(call (mem:SI (unspec:SI [(match_operand:SI 0 "symbol_ref_operand" "")] + UNSPEC_THUNK)) + (match_operand 1 "" "")) + (use (reg:PSI FPSCR_REG)) + (return)] + "TARGET_SH1" + "bra %O0" + [(set_attr "needs_delay_slot" "yes") + (set (attr "fp_mode") + (if_then_else (eq_attr "fpu_single" "yes") + (const_string "single") (const_string "double"))) + (set_attr "type" "jump") + (set_attr "length" "2")]) + (define_insn_and_split "sibcall_pcrel" [(call (mem:SI (match_operand:SI 0 "symbol_ref_operand" "")) (match_operand 1 "" "")) @@ -6232,7 +7759,7 @@ (set_attr "type" "jump_ind")]) (define_insn "sibcall_media" - [(call (mem:DI (match_operand:DI 0 "target_reg_operand" "k")) + [(call (mem:DI (match_operand 0 "target_reg_operand" "k")) (match_operand 1 "" "")) (use (reg:SI PR_MEDIA_REG)) (return)] @@ -6252,42 +7779,7 @@ { if (TARGET_SHMEDIA) { - operands[0] = XEXP (operands[0], 0); - if (flag_pic && GET_CODE (operands[0]) == SYMBOL_REF) - { - if (! SYMBOL_REF_LOCAL_P (operands[0])) - { - rtx reg = gen_reg_rtx (Pmode); - - /* We must not use GOTPLT for sibcalls, because PIC_REG - must be restored before the PLT code gets to run. 
*/ - emit_insn (gen_symGOT2reg (reg, operands[0])); - operands[0] = reg; - } - else - { - operands[0] = gen_sym2PIC (operands[0]); - PUT_MODE (operands[0], Pmode); - } - } - if (GET_MODE (operands[0]) == SImode) - { - if (GET_CODE (operands[0]) == REG) - operands[0] = gen_rtx_SUBREG (DImode, operands[0], 0); - else if (GET_CODE (operands[0]) == SUBREG) - { - operands[0] = SUBREG_REG (operands[0]); - if (GET_MODE (operands[0]) != DImode) - operands[0] = gen_rtx_SUBREG (DImode, operands[0], 0); - } - else - { - operands[0] = shallow_copy_rtx (operands[0]); - PUT_MODE (operands[0], DImode); - } - } - if (! target_reg_operand (operands[0], DImode)) - operands[0] = copy_to_mode_reg (DImode, operands[0]); + operands[0] = shmedia_prepare_call_address (operands[0], 1); emit_call_insn (gen_sibcall_media (operands[0], operands[1])); DONE; } @@ -6329,14 +7821,9 @@ run out of registers when adjusting fpscr for the call. */ emit_insn (gen_force_mode_for_call ()); - operands[0] = function_symbol (\"__GCC_shcompact_call_trampoline\"); - if (flag_pic) - { - rtx reg = gen_reg_rtx (Pmode); - - emit_insn (gen_symGOT2reg (reg, operands[0])); - operands[0] = reg; - } + operands[0] + = function_symbol (NULL, \"__GCC_shcompact_call_trampoline\", + SFUNC_GOT); operands[0] = force_reg (SImode, operands[0]); /* We don't need a return trampoline, since the callee will @@ -6472,14 +7959,8 @@ run out of registers when adjusting fpscr for the call. */ emit_insn (gen_force_mode_for_call ()); - operands[1] = function_symbol (\"__GCC_shcompact_call_trampoline\"); - if (flag_pic) - { - rtx reg = gen_reg_rtx (Pmode); - - emit_insn (gen_symGOTPLT2reg (reg, operands[1])); - operands[1] = reg; - } + operands[1] = function_symbol (NULL, \"__GCC_shcompact_call_trampoline\", + SFUNC_GOT); operands[1] = force_reg (SImode, operands[1]); emit_move_insn (r0, func); @@ -6548,8 +8029,8 @@ "" " { - if (TARGET_SHMEDIA && GET_MODE (operands[0]) == SImode) - operands[0] = gen_rtx_SUBREG (DImode, operands[0], 0); + if (GET_MODE (operands[0]) != Pmode) + operands[0] = gen_rtx_SUBREG (Pmode, operands[0], 0); }") ;; The use of operand 1 / 2 helps us distinguish case table jumps @@ -6578,7 +8059,7 @@ (set_attr "type" "jump_ind")]) (define_insn "casesi_jump_media" - [(set (pc) (match_operand:DI 0 "target_reg_operand" "b")) + [(set (pc) (match_operand 0 "target_reg_operand" "b")) (use (label_ref (match_operand 1 "" "")))] "TARGET_SHMEDIA" "blink %0, r63" @@ -6620,7 +8101,7 @@ (define_insn "dect" [(set (reg:SI T_REG) - (eq:SI (match_operand:SI 0 "arith_reg_operand" "+r") (const_int 1))) + (eq:SI (match_operand:SI 0 "arith_reg_dest" "+r") (const_int 1))) (set (match_dup 0) (plus:SI (match_dup 0) (const_int -1)))] "TARGET_SH2" "dt %0" @@ -6664,40 +8145,35 @@ operands[0] = gen_rtx_REG (Pmode, PIC_REG); operands[1] = gen_rtx_SYMBOL_REF (VOIDmode, GOT_SYMBOL_NAME); - if (TARGET_SH5) - operands[1] = gen_datalabel_ref (operands[1]); - if (TARGET_SHMEDIA) { - rtx tr = gen_rtx_REG (DImode, TR0_REG); - rtx dipic = operands[0]; + rtx tr = gen_rtx_REG (Pmode, TR0_REG); + rtx pic = operands[0]; rtx lab = PATTERN (gen_call_site ()); rtx insn, equiv; equiv = operands[1]; - operands[1] = gen_rtx_MINUS (DImode, + operands[1] = gen_rtx_MINUS (Pmode, operands[1], gen_rtx_CONST - (DImode, - gen_rtx_MINUS (DImode, - gen_rtx_CONST (DImode, + (Pmode, + gen_rtx_MINUS (Pmode, + gen_rtx_CONST (Pmode, lab), pc_rtx))); operands[1] = gen_sym2PIC (operands[1]); - PUT_MODE (operands[1], DImode); + PUT_MODE (operands[1], Pmode); - if (GET_MODE (dipic) != DImode) - dipic = 
gen_rtx_SUBREG (DImode, dipic, 0); - - if (TARGET_SHMEDIA64) - emit_insn (gen_movdi_const (dipic, operands[1])); + if (Pmode == SImode) + { + emit_insn (gen_movsi_const (pic, operands[1])); + emit_insn (gen_ptrel_si (tr, pic, lab)); + } else - emit_insn (gen_movdi_const_32bit (dipic, operands[1])); - - emit_insn (gen_ptrel (tr, dipic, lab)); - - if (GET_MODE (operands[0]) != GET_MODE (tr)) - tr = gen_lowpart (GET_MODE (operands[0]), tr); + { + emit_insn (gen_movdi_const (pic, operands[1])); + emit_insn (gen_ptrel_di (tr, pic, lab)); + } insn = emit_move_insn (operands[0], tr); @@ -6710,16 +8186,25 @@ ") (define_insn "*ptb" - [(set (match_operand:DI 0 "target_reg_operand" "=b") - (const:DI (unspec:DI [(match_operand:DI 1 "" "Csy")] + [(set (match_operand 0 "target_reg_operand" "=b") + (const (unspec [(match_operand 1 "" "Csy")] UNSPEC_DATALABEL)))] "TARGET_SHMEDIA && flag_pic && EXTRA_CONSTRAINT_Csy (operands[1])" "ptb/u datalabel %1, %0" - [(set_attr "type" "pt_media") + [(set_attr "type" "ptabs_media") (set_attr "length" "*")]) -(define_insn "ptrel" +(define_insn "ptrel_si" + [(set (match_operand:SI 0 "target_reg_operand" "=b") + (plus:SI (match_operand:SI 1 "register_operand" "r") + (pc))) + (match_operand:SI 2 "" "")] + "TARGET_SHMEDIA" + "%O2: ptrel/u %1, %0" + [(set_attr "type" "ptabs_media")]) + +(define_insn "ptrel_di" [(set (match_operand:DI 0 "target_reg_operand" "=b") (plus:DI (match_operand:DI 1 "register_operand" "r") (pc))) @@ -6774,13 +8259,20 @@ { rtx reg = operands[2]; - if (GET_MODE (reg) != DImode) - reg = gen_rtx_SUBREG (DImode, reg, 0); - - if (flag_pic > 1) - emit_insn (gen_movdi_const_32bit (reg, operands[1])); + if (Pmode == DImode) + { + if (flag_pic > 1) + emit_insn (gen_movdi_const_32bit (reg, operands[1])); + else + emit_insn (gen_movdi_const_16bit (reg, operands[1])); + } else - emit_insn (gen_movdi_const_16bit (reg, operands[1])); + { + if (flag_pic > 1) + emit_insn (gen_movsi_const (reg, operands[1])); + else + emit_insn (gen_movsi_const_16bit (reg, operands[1])); + } } else emit_move_insn (operands[2], operands[1]); @@ -6898,7 +8390,7 @@ (define_insn "tls_global_dynamic" [(set (match_operand:SI 0 "register_operand" "=&z") - (call (mem:SI (unspec:SI [(match_operand:SI 1 "" "")] + (call:SI (mem:SI (unspec:SI [(match_operand:SI 1 "" "")] UNSPEC_TLSGD)) (const_int 0))) (use (reg:PSI FPSCR_REG)) @@ -6927,7 +8419,7 @@ mov.l\\t1f,r4\\n\\ (define_insn "tls_local_dynamic" [(set (match_operand:SI 0 "register_operand" "=&z") - (call (mem:SI (unspec:SI [(match_operand:SI 1 "" "")] + (call:SI (mem:SI (unspec:SI [(match_operand:SI 1 "" "")] UNSPEC_TLSLDM)) (const_int 0))) (use (reg:PSI FPSCR_REG)) @@ -7050,9 +8542,10 @@ mov.l\\t1f,r0\\n\\ { rtx reg = gen_reg_rtx (DImode); rtx reg2 = gen_reg_rtx (DImode); - rtx reg3 = gen_reg_rtx (DImode); - rtx reg4 = gen_reg_rtx (DImode); - rtx reg5 = gen_reg_rtx (DImode); + rtx reg3 = gen_reg_rtx (Pmode); + rtx reg4 = gen_reg_rtx (Pmode); + rtx reg5 = gen_reg_rtx (Pmode); + rtx load; operands[0] = convert_modes (DImode, SImode, operands[0], 0); operands[1] = convert_modes (DImode, SImode, operands[1], 0); @@ -7063,9 +8556,18 @@ mov.l\\t1f,r0\\n\\ emit_jump_insn (gen_bgtu_media (operands[4], reg, operands[2])); emit_insn (gen_casesi_shift_media (reg2, reg, operands[3])); emit_move_insn (reg3, gen_datalabel_ref (gen_rtx_LABEL_REF - (DImode, operands[3]))); - emit_insn (gen_casesi_load_media (reg4, reg3, reg2, operands[3])); - emit_move_insn (reg5, gen_rtx_PLUS (DImode, reg3, reg4)); + (Pmode, operands[3]))); + /* Messy: can we subreg to 
clean this up? */ + if (Pmode == DImode) + load = gen_casesi_load_media (reg4, reg3, reg2, operands[3]); + else + load = gen_casesi_load_media (reg4, + gen_rtx_SUBREG (DImode, reg3, 0), + reg2, operands[3]); + PUT_MODE (SET_SRC (load), Pmode); + emit_insn (load); + /* ??? The following add could be eliminated if we used ptrel. */ + emit_move_insn (reg5, gen_rtx_PLUS (Pmode, reg3, reg4)); emit_jump_insn (gen_casesi_jump_media (reg5, operands[3])); emit_barrier (); DONE; @@ -7212,7 +8714,7 @@ mov.l\\t1f,r0\\n\\ [(set_attr "length" "8")]) (define_insn "casesi_shift_media" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") + [(set (match_operand:DI 0 "arith_reg_dest" "=r") (ashift:DI (match_operand:DI 1 "arith_reg_operand" "r") (unspec:DI [(label_ref:DI (match_operand 2 "" ""))] UNSPEC_CASESI)))] @@ -7240,10 +8742,10 @@ mov.l\\t1f,r0\\n\\ [(set_attr "type" "arith_media")]) (define_insn "casesi_load_media" - [(set (match_operand:DI 0 "arith_reg_operand" "=r") - (mem:DI (unspec [(match_operand 1 "arith_reg_operand" "r") - (match_operand 2 "arith_reg_operand" "r") - (label_ref:DI (match_operand 3 "" ""))] 2)))] + [(set (match_operand 0 "any_arith_reg_dest" "=r") + (mem (unspec [(match_operand:DI 1 "arith_reg_operand" "r") + (match_operand:DI 2 "arith_reg_operand" "r") + (label_ref:DI (match_operand 3 "" ""))] UNSPEC_CASESI)))] "TARGET_SHMEDIA" "* { @@ -7307,13 +8809,8 @@ mov.l\\t1f,r0\\n\\ " { rtx reg = gen_rtx_REG (Pmode, R0_REG); - rtx sym = function_symbol (\"__GCC_shcompact_return_trampoline\"); - - if (flag_pic) - emit_insn (gen_symGOTPLT2reg (reg, sym)); - else - emit_move_insn (reg, sym); + function_symbol (reg, \"__GCC_shcompact_return_trampoline\", SFUNC_STATIC); emit_jump_insn (gen_shcompact_return_tramp_i ()); DONE; }") @@ -7327,7 +8824,7 @@ mov.l\\t1f,r0\\n\\ (set_attr "needs_delay_slot" "yes")]) (define_insn "return_media_i" - [(parallel [(return) (use (match_operand:DI 0 "target_reg_operand" "k"))])] + [(parallel [(return) (use (match_operand 0 "target_reg_operand" "k"))])] "TARGET_SHMEDIA && reload_completed" "blink %0, r63" [(set_attr "type" "jump_media")]) @@ -7353,15 +8850,15 @@ mov.l\\t1f,r0\\n\\ } if (tr_regno < 0) { - rtx r18 = gen_rtx_REG (DImode, PR_MEDIA_REG); + rtx r18 = gen_rtx_REG (Pmode, PR_MEDIA_REG); gcc_assert (call_really_used_regs[TR0_REG] && !fixed_regs[TR0_REG]); tr_regno = TR0_REG; - tr = gen_rtx_REG (DImode, tr_regno); + tr = gen_rtx_REG (Pmode, tr_regno); emit_move_insn (tr, r18); } else - tr = gen_rtx_REG (DImode, tr_regno); + tr = gen_rtx_REG (Pmode, tr_regno); emit_jump_insn (gen_return_media_i (tr)); DONE; @@ -7472,31 +8969,73 @@ mov.l\\t1f,r0\\n\\ ;; ------------------------------------------------------------------------ (define_insn "movt" - [(set (match_operand:SI 0 "arith_reg_operand" "=r") + [(set (match_operand:SI 0 "arith_reg_dest" "=r") (eq:SI (reg:SI T_REG) (const_int 1)))] "TARGET_SH1" "movt %0" [(set_attr "type" "arith")]) (define_expand "seq" - [(set (match_operand:SI 0 "arith_reg_operand" "") + [(set (match_operand:SI 0 "arith_reg_dest" "") (match_dup 1))] "" " { if (TARGET_SHMEDIA) { - if (GET_MODE (operands[0]) != DImode) - operands[0] = gen_rtx_SUBREG (DImode, operands[0], 0); sh_compare_op0 = force_reg (GET_MODE (sh_compare_op0), sh_compare_op0); if (sh_compare_op1 != const0_rtx) sh_compare_op1 = force_reg (GET_MODE (sh_compare_op1) == VOIDmode ? 
GET_MODE (sh_compare_op0) : GET_MODE (sh_compare_op1), sh_compare_op1); + if (GET_MODE_SIZE (GET_MODE (operands[0])) <= 4) + { + if (GET_MODE (operands[0]) != SImode) + operands[0] = gen_rtx_SUBREG (SImode, operands[0], 0); + + switch (GET_MODE (sh_compare_op0)) + { + case SImode: + emit_insn (gen_cmpsieqsi_media (operands[0], + sh_compare_op0, sh_compare_op1)); + break; + + case DImode: + emit_insn (gen_cmpsieqdi_media (operands[0], + sh_compare_op0, sh_compare_op1)); + break; + + case SFmode: + if (! TARGET_SHMEDIA_FPU) + FAIL; + emit_insn (gen_cmpsieqsf_media (operands[0], + sh_compare_op0, sh_compare_op1)); + break; + + case DFmode: + if (! TARGET_SHMEDIA_FPU) + FAIL; + emit_insn (gen_cmpsieqdf_media (operands[0], + sh_compare_op0, sh_compare_op1)); + break; + + default: + FAIL; + } + DONE; + } + + if (GET_MODE (operands[0]) != DImode) + operands[0] = gen_rtx_SUBREG (DImode, operands[0], 0); switch (GET_MODE (sh_compare_op0)) { + case SImode: + emit_insn (gen_cmpeqsi_media (operands[0], + sh_compare_op0, sh_compare_op1)); + break; + case DImode: emit_insn (gen_cmpeqdi_media (operands[0], sh_compare_op0, sh_compare_op1)); @@ -7547,6 +9086,11 @@ mov.l\\t1f,r0\\n\\ switch (GET_MODE (sh_compare_op0)) { + case SImode: + emit_insn (gen_cmpgtsi_media (operands[0], + sh_compare_op1, sh_compare_op0)); + break; + case DImode: emit_insn (gen_cmpgtdi_media (operands[0], sh_compare_op1, sh_compare_op0)); @@ -7596,6 +9140,16 @@ mov.l\\t1f,r0\\n\\ switch (GET_MODE (sh_compare_op0)) { + case SImode: + { + tmp = no_new_pseudos ? operands[0] : gen_reg_rtx (DImode); + + emit_insn (gen_cmpgtsi_media (tmp, + sh_compare_op0, sh_compare_op1)); + emit_insn (gen_cmpeqdi_media (operands[0], tmp, const0_rtx)); + break; + } + case DImode: { tmp = no_new_pseudos ? operands[0] : gen_reg_rtx (DImode); @@ -7651,6 +9205,11 @@ mov.l\\t1f,r0\\n\\ switch (GET_MODE (sh_compare_op0)) { + case SImode: + emit_insn (gen_cmpgtsi_media (operands[0], + sh_compare_op0, sh_compare_op1)); + break; + case DImode: emit_insn (gen_cmpgtdi_media (operands[0], sh_compare_op0, sh_compare_op1)); @@ -7688,17 +9247,28 @@ mov.l\\t1f,r0\\n\\ { if (TARGET_SHMEDIA) { + enum machine_mode mode = GET_MODE (sh_compare_op0); + + if ((mode) == VOIDmode) + mode = GET_MODE (sh_compare_op1); if (GET_MODE (operands[0]) != DImode) operands[0] = gen_rtx_SUBREG (DImode, operands[0], 0); - sh_compare_op0 = force_reg (GET_MODE (sh_compare_op0), sh_compare_op0); + sh_compare_op0 = force_reg (mode, sh_compare_op0); if (sh_compare_op1 != const0_rtx) - sh_compare_op1 = force_reg (GET_MODE (sh_compare_op1) == VOIDmode - ? GET_MODE (sh_compare_op0) - : GET_MODE (sh_compare_op1), - sh_compare_op1); + sh_compare_op1 = force_reg (mode, sh_compare_op1); - switch (GET_MODE (sh_compare_op0)) + switch (mode) { + case SImode: + { + rtx tmp = no_new_pseudos ? operands[0] : gen_reg_rtx (DImode); + + emit_insn (gen_cmpgtsi_media (tmp, + sh_compare_op1, sh_compare_op0)); + emit_insn (gen_cmpeqdi_media (operands[0], tmp, const0_rtx)); + break; + } + case DImode: { rtx tmp = no_new_pseudos ? operands[0] : gen_reg_rtx (DImode); @@ -7890,7 +9460,9 @@ mov.l\\t1f,r0\\n\\ if (GET_MODE (operands[0]) != DImode) operands[0] = gen_rtx_SUBREG (DImode, operands[0], 0); - if (! TARGET_SHMEDIA_FPU && GET_MODE (sh_compare_op0) != DImode) + if (! 
TARGET_SHMEDIA_FPU + && GET_MODE (sh_compare_op0) != DImode + && GET_MODE (sh_compare_op0) != SImode) FAIL; sh_compare_op0 = force_reg (GET_MODE (sh_compare_op0), sh_compare_op0); @@ -7927,6 +9499,10 @@ mov.l\\t1f,r0\\n\\ }") ;; Use the same trick for FP sle / sge + +;; Apart from the constant use and the T setting, this is like movt, +;; except that it uses the logically negated value of T, i.e. +;; operand[0] := T ? 0 : 1. (define_expand "movnegt" [(set (match_dup 2) (const_int -1)) (parallel [(set (match_operand 0 "" "") @@ -7943,7 +9519,7 @@ mov.l\\t1f,r0\\n\\ ;; mov/neg for sne. (define_split - [(set (match_operand:SI 0 "arith_reg_operand" "") + [(set (match_operand:SI 0 "arith_reg_dest" "") (plus:SI (reg:SI T_REG) (const_int -1)))] "TARGET_SH1" @@ -8384,9 +9960,9 @@ mov.l\\t1f,r0\\n\\ [(set_attr "type" "fparith_media")]) (define_insn "addsf3_i" - [(set (match_operand:SF 0 "arith_reg_operand" "=f") - (plus:SF (match_operand:SF 1 "arith_reg_operand" "%0") - (match_operand:SF 2 "arith_reg_operand" "f"))) + [(set (match_operand:SF 0 "fp_arith_reg_operand" "=f") + (plus:SF (match_operand:SF 1 "fp_arith_reg_operand" "%0") + (match_operand:SF 2 "fp_arith_reg_operand" "f"))) (use (match_operand:PSI 3 "fpscr_operand" "c"))] "TARGET_SH2E" "fadd %2,%0" @@ -8471,7 +10047,7 @@ mov.l\\t1f,r0\\n\\ "fmul %2,%0" [(set_attr "type" "fp")]) -(define_insn "*mac_media" +(define_insn "mac_media" [(set (match_operand:SF 0 "fp_arith_reg_operand" "=f") (plus:SF (mult:SF (match_operand:SF 1 "fp_arith_reg_operand" "%f") (match_operand:SF 2 "fp_arith_reg_operand" "f")) @@ -8514,7 +10090,7 @@ mov.l\\t1f,r0\\n\\ [(set_attr "type" "fdiv_media")]) (define_insn "divsf3_i" - [(set (match_operand:SF 0 "arith_reg_operand" "=f") + [(set (match_operand:SF 0 "arith_reg_dest" "=f") (div:SF (match_operand:SF 1 "arith_reg_operand" "0") (match_operand:SF 2 "arith_reg_operand" "f"))) (use (match_operand:PSI 3 "fpscr_operand" "c"))] @@ -8567,7 +10143,7 @@ mov.l\\t1f,r0\\n\\ [(set_attr "type" "fp")]) (define_insn "fix_truncsfdi2" - [(set (match_operand:DI 0 "fp_arith_reg_operand" "=f") + [(set (match_operand:DI 0 "fp_arith_reg_dest" "=f") (fix:DI (match_operand:SF 1 "fp_arith_reg_operand" "f")))] "TARGET_SHMEDIA_FPU" "ftrc.sq %1, %0" @@ -8698,6 +10274,14 @@ mov.l\\t1f,r0\\n\\ "fcmpeq.s %1, %2, %0" [(set_attr "type" "fcmp_media")]) +(define_insn "cmpsieqsf_media" + [(set (match_operand:SI 0 "register_operand" "=r") + (eq:SI (match_operand:SF 1 "fp_arith_reg_operand" "f") + (match_operand:SF 2 "fp_arith_reg_operand" "f")))] + "TARGET_SHMEDIA_FPU" + "fcmpeq.s %1, %2, %0" + [(set_attr "type" "fcmp_media")]) + (define_insn "cmpgtsf_media" [(set (match_operand:DI 0 "register_operand" "=r") (gt:DI (match_operand:SF 1 "fp_arith_reg_operand" "f") @@ -9098,7 +10682,7 @@ mov.l\\t1f,r0\\n\\ (set_attr "fp_mode" "double")]) (define_insn "fix_truncdfdi2" - [(set (match_operand:DI 0 "fp_arith_reg_operand" "=f") + [(set (match_operand:DI 0 "fp_arith_reg_dest" "=f") (fix:DI (match_operand:DF 1 "fp_arith_reg_operand" "f")))] "TARGET_SHMEDIA_FPU" "ftrc.dq %1, %0" @@ -9196,6 +10780,14 @@ mov.l\\t1f,r0\\n\\ "fcmpeq.d %1,%2,%0" [(set_attr "type" "fcmp_media")]) +(define_insn "cmpsieqdf_media" + [(set (match_operand:SI 0 "register_operand" "=r") + (eq:SI (match_operand:DF 1 "fp_arith_reg_operand" "f") + (match_operand:DF 2 "fp_arith_reg_operand" "f")))] + "TARGET_SHMEDIA_FPU" + "fcmpeq.d %1,%2,%0" + [(set_attr "type" "fcmp_media")]) + (define_insn "cmpgtdf_media" [(set (match_operand:DI 0 "register_operand" "=r") (gt:DI (match_operand:DF 1 
"fp_arith_reg_operand" "f") @@ -9253,8 +10845,8 @@ mov.l\\t1f,r0\\n\\ [(set_attr "type" "fmove_media")]) (define_insn "negdf2_i" - [(set (match_operand:DF 0 "arith_reg_operand" "=f") - (neg:DF (match_operand:DF 1 "arith_reg_operand" "0"))) + [(set (match_operand:DF 0 "fp_arith_reg_operand" "=f") + (neg:DF (match_operand:DF 1 "fp_arith_reg_operand" "0"))) (use (match_operand:PSI 2 "fpscr_operand" "c"))] "(TARGET_SH4 || TARGET_SH2A_DOUBLE)" "fneg %0" @@ -9282,8 +10874,8 @@ mov.l\\t1f,r0\\n\\ [(set_attr "type" "dfdiv_media")]) (define_insn "sqrtdf2_i" - [(set (match_operand:DF 0 "arith_reg_operand" "=f") - (sqrt:DF (match_operand:DF 1 "arith_reg_operand" "0"))) + [(set (match_operand:DF 0 "fp_arith_reg_operand" "=f") + (sqrt:DF (match_operand:DF 1 "fp_arith_reg_operand" "0"))) (use (match_operand:PSI 2 "fpscr_operand" "c"))] "(TARGET_SH4 || TARGET_SH2A_DOUBLE)" "fsqrt %0" @@ -9311,8 +10903,8 @@ mov.l\\t1f,r0\\n\\ [(set_attr "type" "fmove_media")]) (define_insn "absdf2_i" - [(set (match_operand:DF 0 "arith_reg_operand" "=f") - (abs:DF (match_operand:DF 1 "arith_reg_operand" "0"))) + [(set (match_operand:DF 0 "fp_arith_reg_operand" "=f") + (abs:DF (match_operand:DF 1 "fp_arith_reg_operand" "0"))) (use (match_operand:PSI 2 "fpscr_operand" "c"))] "(TARGET_SH4 || TARGET_SH2A_DOUBLE)" "fabs %0" @@ -9753,8 +11345,7 @@ mov.l\\t1f,r0\\n\\ (match_operand 1 "sh_const_vec" ""))] "TARGET_SHMEDIA && reload_completed && GET_MODE (operands[0]) == GET_MODE (operands[1]) - && sh_vector_mode_supported_p (GET_MODE (operands[0])) - && operands[1] != CONST0_RTX (GET_MODE (operands[1]))" + && sh_vector_mode_supported_p (GET_MODE (operands[0]))" [(set (match_dup 0) (match_dup 1))] " { @@ -9780,13 +11371,17 @@ mov.l\\t1f,r0\\n\\ && (register_operand (operands[0], V2HImode) || sh_register_operand (operands[1], V2HImode))" "@ - addz.l %1, r63, %0 + add.l %1, r63, %0 movi %1, %0 # ld%M1.l %m1, %0 st%M0.l %m0, %N1" [(set_attr "type" "arith_media,arith_media,*,load_media,store_media") - (set_attr "length" "4,4,16,4,4")]) + (set_attr "length" "4,4,16,4,4") + (set (attr "highpart") + (cond [(ne (symbol_ref "sh_contains_memref_p (insn)") (const_int 0)) + (const_string "user")] + (const_string "ignore")))]) (define_expand "movv4hi" [(set (match_operand:V4HI 0 "general_movdst_operand" "") @@ -9807,7 +11402,8 @@ mov.l\\t1f,r0\\n\\ ld%M1.q %m1, %0 st%M0.q %m0, %N1" [(set_attr "type" "arith_media,arith_media,*,load_media,store_media") - (set_attr "length" "4,4,16,4,4")]) + (set_attr "length" "4,4,16,4,4") + (set_attr "highpart" "depend")]) (define_expand "movv2si" [(set (match_operand:V2SI 0 "general_movdst_operand" "") @@ -9828,7 +11424,8 @@ mov.l\\t1f,r0\\n\\ ld%M1.q %m1, %0 st%M0.q %m0, %N1" [(set_attr "type" "arith_media,arith_media,*,load_media,store_media") - (set_attr "length" "4,4,16,4,4")]) + (set_attr "length" "4,4,16,4,4") + (set_attr "highpart" "depend")]) ;; Multimedia Intrinsics @@ -9837,14 +11434,16 @@ mov.l\\t1f,r0\\n\\ (abs:V2SI (match_operand:V2SI 1 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "mabs.l %1, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "absv4hi2" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") (abs:V4HI (match_operand:V4HI 1 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "mabs.w %1, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "addv2si3" [(set (match_operand:V2SI 0 "arith_reg_dest" "=r") @@ -9852,7 +11451,8 @@ mov.l\\t1f,r0\\n\\ 
(match_operand:V2SI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "madd.l %1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "depend")]) (define_insn "addv4hi3" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -9860,7 +11460,30 @@ mov.l\\t1f,r0\\n\\ (match_operand:V4HI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "madd.w %1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "depend")]) + +(define_insn_and_split "addv2hi3" + [(set (match_operand:V2HI 0 "arith_reg_dest" "=r") + (plus:V2HI (match_operand:V2HI 1 "extend_reg_operand" "%r") + (match_operand:V2HI 2 "extend_reg_operand" "r")))] + "TARGET_SHMEDIA" + "#" + "TARGET_SHMEDIA" + [(const_int 0)] + " +{ + rtx src0 = simplify_gen_subreg (V4HImode, operands[1], V2HImode, 0); + rtx src1 = simplify_gen_subreg (V4HImode, operands[2], V2HImode, 0); + rtx v4hi_dst = simplify_gen_subreg (V4HImode, operands[0], V2HImode, 0); + rtx di_dst = simplify_gen_subreg (DImode, operands[0], V2HImode, 0); + rtx si_dst = simplify_gen_subreg (SImode, operands[0], V2HImode, 0); + + emit_insn (gen_addv4hi3 (v4hi_dst, src0, src1)); + emit_insn (gen_truncdisi2 (si_dst, di_dst)); + DONE; +}" + [(set_attr "highpart" "must_split")]) (define_insn "ssaddv2si3" [(set (match_operand:V2SI 0 "arith_reg_dest" "=r") @@ -9868,7 +11491,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V2SI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "madds.l %1, %2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "usaddv8qi3" [(set (match_operand:V8QI 0 "arith_reg_dest" "=r") @@ -9876,7 +11500,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V8QI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "madds.ub %1, %2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "ssaddv4hi3" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -9884,7 +11509,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V4HI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "madds.w %1, %2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "negcmpeqv8qi" [(set (match_operand:V8QI 0 "arith_reg_dest" "=r") @@ -9892,7 +11518,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V8QI 2 "arith_reg_or_0_operand" "rZ"))))] "TARGET_SHMEDIA" "mcmpeq.b %N1, %N2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "negcmpeqv2si" [(set (match_operand:V2SI 0 "arith_reg_dest" "=r") @@ -9900,7 +11527,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V2SI 2 "arith_reg_or_0_operand" "rZ"))))] "TARGET_SHMEDIA" "mcmpeq.l %N1, %N2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "negcmpeqv4hi" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -9908,7 +11536,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V4HI 2 "arith_reg_or_0_operand" "rZ"))))] "TARGET_SHMEDIA" "mcmpeq.w %N1, %N2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "negcmpgtuv8qi" [(set (match_operand:V8QI 0 "arith_reg_dest" "=r") @@ -9916,7 +11545,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V8QI 2 "arith_reg_or_0_operand" "rZ"))))] "TARGET_SHMEDIA" "mcmpgt.ub %N1, %N2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "negcmpgtv2si" 
[(set (match_operand:V2SI 0 "arith_reg_dest" "=r") @@ -9924,7 +11554,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V2SI 2 "arith_reg_or_0_operand" "rZ"))))] "TARGET_SHMEDIA" "mcmpgt.l %N1, %N2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "negcmpgtv4hi" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -9932,7 +11563,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V4HI 2 "arith_reg_or_0_operand" "rZ"))))] "TARGET_SHMEDIA" "mcmpgt.w %N1, %N2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "mcmv" [(set (match_operand:DI 0 "arith_reg_dest" "=r") @@ -9942,7 +11574,8 @@ mov.l\\t1f,r0\\n\\ (not:DI (match_dup 2)))))] "TARGET_SHMEDIA" "mcmv %N1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "depend")]) (define_insn "mcnvs_lw" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -10104,6 +11737,8 @@ mov.l\\t1f,r0\\n\\ DONE; }") +;; This could be highpart ignore if it only had inputs 2 or 3, but input 1 +;; is depend (define_insn "mmacfx_wl_i" [(set (match_operand:V2SI 0 "arith_reg_dest" "=r") (ss_plus:V2SI @@ -10117,7 +11752,8 @@ mov.l\\t1f,r0\\n\\ (const_int 1)))))] "TARGET_SHMEDIA" "mmacfx.wl %2, %3, %0" - [(set_attr "type" "mac_media")]) + [(set_attr "type" "mac_media") + (set_attr "highpart" "depend")]) (define_expand "mmacnfx_wl" [(match_operand:V2SI 0 "arith_reg_dest" "") @@ -10145,7 +11781,8 @@ mov.l\\t1f,r0\\n\\ (const_int 1)))))] "TARGET_SHMEDIA" "mmacnfx.wl %2, %3, %0" - [(set_attr "type" "mac_media")]) + [(set_attr "type" "mac_media") + (set_attr "highpart" "depend")]) (define_insn "mulv2si3" [(set (match_operand:V2SI 0 "arith_reg_dest" "=r") @@ -10153,7 +11790,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V2SI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "mmul.l %1, %2, %0" - [(set_attr "type" "d2mpy_media")]) + [(set_attr "type" "d2mpy_media") + (set_attr "highpart" "depend")]) (define_insn "mulv4hi3" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -10161,7 +11799,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V4HI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "mmul.w %1, %2, %0" - [(set_attr "type" "dmpy_media")]) + [(set_attr "type" "dmpy_media") + (set_attr "highpart" "depend")]) (define_insn "mmulfx_l" [(set (match_operand:V2SI 0 "arith_reg_dest" "=r") @@ -10173,7 +11812,8 @@ mov.l\\t1f,r0\\n\\ (const_int 31))))] "TARGET_SHMEDIA" "mmulfx.l %1, %2, %0" - [(set_attr "type" "d2mpy_media")]) + [(set_attr "type" "d2mpy_media") + (set_attr "highpart" "depend")]) (define_insn "mmulfx_w" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -10185,7 +11825,8 @@ mov.l\\t1f,r0\\n\\ (const_int 15))))] "TARGET_SHMEDIA" "mmulfx.w %1, %2, %0" - [(set_attr "type" "dmpy_media")]) + [(set_attr "type" "dmpy_media") + (set_attr "highpart" "depend")]) (define_insn "mmulfxrp_w" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -10199,7 +11840,9 @@ mov.l\\t1f,r0\\n\\ (const_int 15))))] "TARGET_SHMEDIA" "mmulfxrp.w %1, %2, %0" - [(set_attr "type" "dmpy_media")]) + [(set_attr "type" "dmpy_media") + (set_attr "highpart" "depend")]) + (define_expand "mmulhi_wl" [(match_operand:V2SI 0 "arith_reg_dest" "") @@ -10236,7 +11879,10 @@ mov.l\\t1f,r0\\n\\ "* return (TARGET_LITTLE_ENDIAN ? 
\"mmulhi.wl %1, %2, %0\" : \"mmullo.wl %1, %2, %0\");" - [(set_attr "type" "dmpy_media")]) + [(set_attr "type" "dmpy_media") + (set (attr "highpart") + (cond [(eq_attr "endian" "big") (const_string "ignore")] + (const_string "user")))]) (define_insn "mmul01_wl" [(set (match_operand:V2SI 0 "arith_reg_dest" "=r") @@ -10249,7 +11895,11 @@ mov.l\\t1f,r0\\n\\ "* return (TARGET_LITTLE_ENDIAN ? \"mmullo.wl %1, %2, %0\" : \"mmulhi.wl %1, %2, %0\");" - [(set_attr "type" "dmpy_media")]) + [(set_attr "type" "dmpy_media") + (set (attr "highpart") + (cond [(eq_attr "endian" "little") (const_string "ignore")] + (const_string "user")))]) + (define_expand "mmulsum_wq" [(match_operand:DI 0 "arith_reg_dest" "") @@ -10338,7 +11988,8 @@ mov.l\\t1f,r0\\n\\ "trunc_hi_operand" "r"))))] "TARGET_SHMEDIA" "mperm.w %1, r63, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) (define_expand "msad_ubq" [(match_operand:DI 0 "arith_reg_dest" "") @@ -10405,7 +12056,8 @@ mov.l\\t1f,r0\\n\\ (const_int 31)))))] "TARGET_SHMEDIA" "mshalds.l %1, %2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "mshalds_w" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -10416,7 +12068,8 @@ mov.l\\t1f,r0\\n\\ (const_int 15)))))] "TARGET_SHMEDIA" "mshalds.w %1, %2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "ashrv2si3" [(set (match_operand:V2SI 0 "arith_reg_dest" "=r") @@ -10424,7 +12077,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:DI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "mshard.l %1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "depend")]) (define_insn "ashrv4hi3" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -10432,7 +12086,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:DI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "mshard.w %1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "depend")]) (define_insn "mshards_q" [(set (match_operand:HI 0 "arith_reg_dest" "=r") @@ -10479,7 +12134,10 @@ mov.l\\t1f,r0\\n\\ "* return (TARGET_LITTLE_ENDIAN ? \"mshfhi.b %N1, %N2, %0\" : \"mshflo.b %N1, %N2, %0\");" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set (attr "highpart") + (cond [(eq_attr "endian" "big") (const_string "ignore")] + (const_string "user")))]) (define_insn "mshf0_b" [(set @@ -10493,7 +12151,10 @@ mov.l\\t1f,r0\\n\\ "* return (TARGET_LITTLE_ENDIAN ? \"mshflo.b %N1, %N2, %0\" : \"mshfhi.b %N1, %N2, %0\");" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set (attr "highpart") + (cond [(eq_attr "endian" "little") (const_string "ignore")] + (const_string "user")))]) (define_expand "mshfhi_l" [(match_operand:V2SI 0 "arith_reg_dest" "") @@ -10529,7 +12190,10 @@ mov.l\\t1f,r0\\n\\ "* return (TARGET_LITTLE_ENDIAN ? \"mshfhi.l %N1, %N2, %0\" : \"mshflo.l %N1, %N2, %0\");" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set (attr "highpart") + (cond [(eq_attr "endian" "big") (const_string "ignore")] + (const_string "user")))]) (define_insn "mshf0_l" [(set (match_operand:V2SI 0 "arith_reg_dest" "=r") @@ -10541,7 +12205,10 @@ mov.l\\t1f,r0\\n\\ "* return (TARGET_LITTLE_ENDIAN ? 
\"mshflo.l %N1, %N2, %0\" : \"mshfhi.l %N1, %N2, %0\");" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set (attr "highpart") + (cond [(eq_attr "endian" "little") (const_string "ignore")] + (const_string "user")))]) (define_expand "mshfhi_w" [(match_operand:V4HI 0 "arith_reg_dest" "") @@ -10577,7 +12244,10 @@ mov.l\\t1f,r0\\n\\ "* return (TARGET_LITTLE_ENDIAN ? \"mshfhi.w %N1, %N2, %0\" : \"mshflo.w %N1, %N2, %0\");" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set (attr "highpart") + (cond [(eq_attr "endian" "big") (const_string "ignore")] + (const_string "user")))]) (define_insn "mshf0_w" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -10589,7 +12259,10 @@ mov.l\\t1f,r0\\n\\ "* return (TARGET_LITTLE_ENDIAN ? \"mshflo.w %N1, %N2, %0\" : \"mshfhi.w %N1, %N2, %0\");" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set (attr "highpart") + (cond [(eq_attr "endian" "little") (const_string "ignore")] + (const_string "user")))]) (define_insn "mshflo_w_x" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -10599,7 +12272,8 @@ mov.l\\t1f,r0\\n\\ (parallel [(const_int 2) (const_int 0) (const_int 3) (const_int 1)])))] "TARGET_SHMEDIA" "mshflo.w %N1, %N2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) /* These are useful to expand ANDs and as combiner patterns. */ (define_insn_and_split "mshfhi_l_di" @@ -10663,7 +12337,8 @@ mov.l\\t1f,r0\\n\\ "TARGET_SHMEDIA" "mshflo.l %N1, %N2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) (define_insn "*mshflo_l_di_rev" [(set (match_operand:DI 0 "arith_reg_dest" "=r") @@ -10674,7 +12349,8 @@ mov.l\\t1f,r0\\n\\ "TARGET_SHMEDIA" "mshflo.l %N2, %N1, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) ;; Combiner pattern for trampoline initialization. 
(define_insn_and_split "*double_shori" @@ -10696,7 +12372,8 @@ mov.l\\t1f,r0\\n\\ emit_insn (gen_shori_media (operands[0], operands[0], gen_int_mode (v, HImode))); DONE; -}") +}" + [(set_attr "highpart" "ignore")]) (define_insn "*mshflo_l_di_x" @@ -10708,7 +12385,8 @@ mov.l\\t1f,r0\\n\\ "TARGET_SHMEDIA" "mshflo.l %N1, %N2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) (define_insn_and_split "concat_v2sf" [(set (match_operand:V2SF 0 "register_operand" "=r,f,f?") @@ -10730,7 +12408,8 @@ mov.l\\t1f,r0\\n\\ operands[3] = simplify_gen_subreg (SFmode, operands[0], V2SFmode, 0); operands[4] = simplify_gen_subreg (SFmode, operands[0], V2SFmode, 4); }" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) (define_insn "*mshflo_l_di_x_rev" [(set (match_operand:DI 0 "arith_reg_dest" "=r") @@ -10740,39 +12419,67 @@ mov.l\\t1f,r0\\n\\ "TARGET_SHMEDIA" "mshflo.l %N2, %N1, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "ignore")]) (define_insn "ashlv2si3" [(set (match_operand:V2SI 0 "arith_reg_dest" "=r") (ashift:V2SI (match_operand:V2SI 1 "arith_reg_operand" "r") - (match_operand:DI 2 "arith_reg_operand" "r")))] + (match_operand:DI 2 "shift_count_reg_operand" "r")))] "TARGET_SHMEDIA" "mshlld.l %1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "depend")]) + +(define_split + [(set (match_operand 0 "any_register_operand" "") + (match_operator 3 "shift_operator" + [(match_operand 1 "any_register_operand" "") + (match_operand 2 "shift_count_reg_operand" "")]))] + "TARGET_SHMEDIA && ! register_operand (operands[2], VOIDmode)" + [(set (match_dup 0) (match_dup 3))] + " +{ + rtx count = operands[2]; + enum machine_mode outer_mode = GET_MODE (operands[2]), inner_mode; + + while (GET_CODE (count) == ZERO_EXTEND || GET_CODE (count) == SIGN_EXTEND + || (GET_CODE (count) == SUBREG && SUBREG_BYTE (count) == 0) + || GET_CODE (count) == TRUNCATE) + count = XEXP (count, 0); + inner_mode = GET_MODE (count); + count = simplify_gen_subreg (outer_mode, count, inner_mode, + subreg_lowpart_offset (outer_mode, inner_mode)); + operands[3] = gen_rtx_fmt_ee (GET_CODE (operands[3]), GET_MODE (operands[3]), + operands[1], count); +}") (define_insn "ashlv4hi3" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") (ashift:V4HI (match_operand:V4HI 1 "arith_reg_operand" "r") - (match_operand:DI 2 "arith_reg_operand" "r")))] + (match_operand:DI 2 "shift_count_reg_operand" "r")))] "TARGET_SHMEDIA" "mshlld.w %1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "depend")]) (define_insn "lshrv2si3" [(set (match_operand:V2SI 0 "arith_reg_dest" "=r") (lshiftrt:V2SI (match_operand:V2SI 1 "arith_reg_operand" "r") - (match_operand:DI 2 "arith_reg_operand" "r")))] + (match_operand:DI 2 "shift_count_reg_operand" "r")))] "TARGET_SHMEDIA" "mshlrd.l %1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "depend")]) (define_insn "lshrv4hi3" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") (lshiftrt:V4HI (match_operand:V4HI 1 "arith_reg_operand" "r") - (match_operand:DI 2 "arith_reg_operand" "r")))] + (match_operand:DI 2 "shift_count_reg_operand" "r")))] "TARGET_SHMEDIA" "mshlrd.w %1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "depend")]) (define_insn "subv2si3" [(set 
(match_operand:V2SI 0 "arith_reg_dest" "=r") @@ -10780,7 +12487,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V2SI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "msub.l %N1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "depend")]) (define_insn "subv4hi3" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -10788,7 +12496,30 @@ mov.l\\t1f,r0\\n\\ (match_operand:V4HI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "msub.w %N1, %2, %0" - [(set_attr "type" "arith_media")]) + [(set_attr "type" "arith_media") + (set_attr "highpart" "depend")]) + +(define_insn_and_split "subv2hi3" + [(set (match_operand:V2HI 0 "arith_reg_dest" "=r") + (minus:V2HI (match_operand:V2HI 1 "arith_reg_or_0_operand" "rZ") + (match_operand:V2HI 2 "arith_reg_operand" "r")))] + "TARGET_SHMEDIA" + "#" + "TARGET_SHMEDIA" + [(const_int 0)] + " +{ + rtx src0 = simplify_gen_subreg (V4HImode, operands[1], V2HImode, 0); + rtx src1 = simplify_gen_subreg (V4HImode, operands[2], V2HImode, 0); + rtx v4hi_dst = simplify_gen_subreg (V4HImode, operands[0], V2HImode, 0); + rtx di_dst = simplify_gen_subreg (DImode, operands[0], V2HImode, 0); + rtx si_dst = simplify_gen_subreg (SImode, operands[0], V2HImode, 0); + + emit_insn (gen_subv4hi3 (v4hi_dst, src0, src1)); + emit_insn (gen_truncdisi2 (si_dst, di_dst)); + DONE; +}" + [(set_attr "highpart" "must_split")]) (define_insn "sssubv2si3" [(set (match_operand:V2SI 0 "arith_reg_dest" "=r") @@ -10796,15 +12527,17 @@ mov.l\\t1f,r0\\n\\ (match_operand:V2SI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "msubs.l %N1, %2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "ussubv8qi3" [(set (match_operand:V8QI 0 "arith_reg_dest" "=r") - (us_minus:V8QI (match_operand:V8QI 1 "arith_reg_operand" "r") + (us_minus:V8QI (match_operand:V8QI 1 "arith_reg_or_0_operand" "rZ") (match_operand:V8QI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" - "msubs.ub %1, %2, %0" - [(set_attr "type" "mcmp_media")]) + "msubs.ub %N1, %2, %0" + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) (define_insn "sssubv4hi3" [(set (match_operand:V4HI 0 "arith_reg_dest" "=r") @@ -10812,7 +12545,8 @@ mov.l\\t1f,r0\\n\\ (match_operand:V4HI 2 "arith_reg_operand" "r")))] "TARGET_SHMEDIA" "msubs.w %N1, %2, %0" - [(set_attr "type" "mcmp_media")]) + [(set_attr "type" "mcmp_media") + (set_attr "highpart" "depend")]) ;; Floating Point Intrinsics @@ -10892,6 +12626,353 @@ mov.l\\t1f,r0\\n\\ "ftrv.s %1, %2, %0" [(set_attr "type" "fparith_media")]) +(define_insn "ldhi_l" + [(set (match_operand:SI 0 "arith_reg_dest" "=r") + (zero_extract:SI + (mem:SI (plus:SI (ior:SI (match_operand:QI 1 "ua_address_operand" "p") + (const_int 3)) + (const_int -3))) + (plus:SI (and:SI (match_dup 1) (const_int 3)) (const_int 1)) + (const_int 0)))] + "TARGET_SHMEDIA32" + "ldhi.l %U1, %0" + [(set_attr "type" "load_media")]) + +(define_insn "ldhi_q" + [(set (match_operand:DI 0 "arith_reg_dest" "=r") + (zero_extract:DI + (mem:DI (plus:SI (ior:SI (match_operand:QI 1 "ua_address_operand" "p") + (const_int 7)) + (const_int -7))) + (plus:SI (and:SI (match_dup 1) (const_int 7)) (const_int 1)) + (const_int 0)))] + "TARGET_SHMEDIA32" + "ldhi.q %U1, %0" + [(set_attr "type" "load_media")]) + +(define_insn_and_split "*ldhi_q_comb0" + [(set (match_operand:DI 0 "arith_reg_dest" "=r") + (zero_extract:DI + (mem:DI (plus:SI (ior:SI (plus:SI (match_operand:SI 1 + "register_operand" "r") + (match_operand:SI 2 + "ua_offset" "I06")) + (const_int 
7)) + (const_int -7))) + (plus:SI (and:SI (match_dup 1) (const_int 7)) + (const_int 1)) + (const_int 0)))] + "TARGET_SHMEDIA32 && (INTVAL (operands[2]) & 7) == 0" + "#" + "" + [(pc)] + "emit_insn (gen_ldhi_q (operands[0], + gen_rtx_PLUS (SImode, operands[1], operands[2]))); + DONE;") + + +(define_insn_and_split "*ldhi_q_comb1" + [(set (match_operand:DI 0 "arith_reg_dest" "=r") + (zero_extract:DI + (mem:DI (plus:SI (ior:SI (plus:SI (match_operand:SI 1 + "register_operand" "r") + (match_operand:SI 2 + "ua_offset" "I06")) + (const_int 7)) + (const_int -7))) + (plus:SI (and:SI (plus:SI (match_dup 1) (match_operand:SI 3 + "ua_offset" "I06")) + (const_int 7)) + (const_int 1)) + (const_int 0)))] + "TARGET_SHMEDIA32 && (INTVAL (operands[2]) & -8) + && (INTVAL (operands[2]) & 7) == INTVAL (operands[3])" + "#" + "" + [(pc)] + "emit_insn (gen_ldhi_q (operands[0], + gen_rtx_PLUS (SImode, operands[1], operands[2]))); + DONE;") + + +(define_insn "ldlo_l" + [(set (match_operand:SI 0 "arith_reg_dest" "=r") + (zero_extract:SI + (mem:SI (and:SI (match_operand:QI 1 "ua_address_operand" "p") + (const_int -4))) + (minus:SI (const_int 4) (and:SI (match_dup 1) (const_int 3))) + (and:SI (match_dup 1) (const_int 3))))] + "TARGET_SHMEDIA32" + "ldlo.l %U1, %0" + [(set_attr "type" "load_media")]) + +(define_insn "ldlo_q" + [(set (match_operand:DI 0 "arith_reg_dest" "=r") + (zero_extract:DI + (mem:DI (and:SI (match_operand:QI 1 "ua_address_operand" "p") + (const_int -8))) + (minus:SI (const_int 8) (and:SI (match_dup 1) (const_int 7))) + (and:SI (match_dup 1) (const_int 7))))] + "TARGET_SHMEDIA32" + "ldlo.q %U1, %0" + [(set_attr "type" "load_media")]) + +(define_insn_and_split "*ldlo_q_comb0" + [(set (match_operand:DI 0 "arith_reg_dest" "=r") + (zero_extract:DI + (mem:DI (and:SI (plus:SI (match_operand:SI 1 "register_operand" "r") + (match_operand:SI 2 "ua_offset" "I06")) + (const_int -8))) + (minus:SI (const_int 8) (and:SI (match_dup 1) (const_int 7))) + (and:SI (match_dup 1) (const_int 7))))] + "TARGET_SHMEDIA32 && (INTVAL (operands[2]) & 7) == 0" + "#" + "" + [(pc)] + "emit_insn (gen_ldlo_q (operands[0], + gen_rtx_PLUS (SImode, operands[1], operands[2]))); + DONE;") + +(define_insn_and_split "*ldlo_q_comb1" + [(set (match_operand:DI 0 "arith_reg_dest" "=r") + (zero_extract:DI + (mem:DI (and:SI (plus:SI (match_operand:SI 1 "register_operand" "r") + (match_operand:SI 2 "ua_offset" "I06")) + (const_int -8))) + (minus:SI (const_int 8) + (and:SI (plus:SI (match_dup 1) + (match_operand:SI 3 "ua_offset" "I06")) + (const_int 7))) + (and:SI (plus:SI (match_dup 1) (match_dup 3)) (const_int 7))))] + "TARGET_SHMEDIA32 && (INTVAL (operands[2]) & -8) + && (INTVAL (operands[2]) & 7) == INTVAL (operands[3])" + "#" + "" + [(pc)] + "emit_insn (gen_ldlo_q (operands[0], + gen_rtx_PLUS (SImode, operands[1], operands[2]))); + DONE;") + +(define_insn "sthi_l" + [(set (zero_extract:SI + (mem:SI (plus:SI (ior:SI (match_operand:QI 0 "ua_address_operand" "p") + (const_int 3)) + (const_int -3))) + (plus:SI (and:SI (match_dup 0) (const_int 3)) (const_int 1)) + (const_int 0)) + (match_operand:SI 1 "arith_reg_operand" "r"))] + "TARGET_SHMEDIA32" + "sthi.l %U0, %1" + [(set_attr "type" "ustore_media")]) + +;; All unaligned stores are considered to be 'narrow' because they typically +;; operate on less that a quadword, and when they operate on a full quadword, +;; the vanilla store high / store low sequence will cause a stall if not +;; scheduled apart. 
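+;; As a C-level illustration (using the ushmedia.h helpers added in this
+;; patch), a little-endian unaligned 64-bit store is the pair
+;;   sh_media_STHI_Q (p, 7, value);
+;;   sh_media_STLO_Q (p, 0, value);
+;; which the scheduler should keep apart to avoid the stall mentioned above.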
+(define_insn "sthi_q" + [(set (zero_extract:DI + (mem:DI (plus:SI (ior:SI (match_operand:QI 0 "ua_address_operand" "p") + (const_int 7)) + (const_int -7))) + (plus:SI (and:SI (match_dup 0) (const_int 7)) (const_int 1)) + (const_int 0)) + (match_operand:DI 1 "arith_reg_operand" "r"))] + "TARGET_SHMEDIA32" + "sthi.q %U0, %1" + [(set_attr "type" "ustore_media")]) + +(define_insn_and_split "*sthi_q_comb0" + [(set (zero_extract:DI + (mem:DI (plus:SI (ior:SI (plus:SI (match_operand:SI 0 + "register_operand" "r") + (match_operand:SI 1 "ua_offset" + "I06")) + (const_int 7)) + (const_int -7))) + (plus:SI (and:SI (match_dup 0) (const_int 7)) (const_int 1)) + (const_int 0)) + (match_operand:DI 2 "arith_reg_operand" "r"))] + "TARGET_SHMEDIA32 && (INTVAL (operands[1]) & 7) == 0" + "#" + "" + [(pc)] + "emit_insn (gen_sthi_q (gen_rtx_PLUS (SImode, operands[0], operands[1]), + operands[2])); + DONE;") + +(define_insn_and_split "*sthi_q_comb1" + [(set (zero_extract:DI + (mem:DI (plus:SI (ior:SI (plus:SI (match_operand:SI 0 + "register_operand" "r") + (match_operand:SI 1 "ua_offset" + "I06")) + (const_int 7)) + (const_int -7))) + (plus:SI (and:SI (plus:SI (match_dup 0) + (match_operand:SI 2 "ua_offset" "I06")) + (const_int 7)) + (const_int 1)) + (const_int 0)) + (match_operand:DI 3 "arith_reg_operand" "r"))] + "TARGET_SHMEDIA32 && (INTVAL (operands[1]) & -8) + && (INTVAL (operands[1]) & 7) == INTVAL (operands[2])" + "#" + "" + [(pc)] + "emit_insn (gen_sthi_q (gen_rtx_PLUS (SImode, operands[0], operands[1]), + operands[3])); + DONE;") + +;; This is highpart user because the address is used as full 64 bit. +(define_insn "stlo_l" + [(set (zero_extract:SI + (mem:SI (and:SI (match_operand:QI 0 "ua_address_operand" "p") + (const_int -4))) + (minus:SI (const_int 4) (and:SI (match_dup 0) (const_int 3))) + (and:SI (match_dup 0) (const_int 3))) + (match_operand:SI 1 "arith_reg_operand" "r"))] + "TARGET_SHMEDIA32" + "stlo.l %U0, %1" + [(set_attr "type" "ustore_media")]) + +(define_insn "stlo_q" + [(set (zero_extract:DI + (mem:DI (and:SI (match_operand:QI 0 "ua_address_operand" "p") + (const_int -8))) + (minus:SI (const_int 8) (and:SI (match_dup 0) (const_int 7))) + (and:SI (match_dup 0) (const_int 7))) + (match_operand:DI 1 "arith_reg_operand" "r"))] + "TARGET_SHMEDIA32" + "stlo.q %U0, %1" + [(set_attr "type" "ustore_media")]) + +(define_insn_and_split "*stlo_q_comb0" + [(set (zero_extract:DI + (mem:DI (and:SI (plus:SI (match_operand:SI 0 "register_operand" "r") + (match_operand:SI 1 "ua_offset" "I06")) + (const_int -8))) + (minus:SI (const_int 8) (and:SI (match_dup 0) (const_int 7))) + (and:SI (match_dup 0) (const_int 7))) + (match_operand:DI 2 "arith_reg_operand" "r"))] + "TARGET_SHMEDIA32 && (INTVAL (operands[1]) & 7) == 0" + "#" + "" + [(pc)] + "emit_insn (gen_stlo_q (gen_rtx_PLUS (SImode, operands[0], operands[1]), + operands[2])); + DONE;") + +(define_insn_and_split "*stlo_q_comb1" + [(set (zero_extract:DI + (mem:DI (and:SI (plus:SI (match_operand:SI 0 "register_operand" "r") + (match_operand:SI 1 "ua_offset" "I06")) + (const_int -8))) + (minus:SI (const_int 8) (and:SI (plus:SI (match_dup 0) + (match_operand:SI 2 + "ua_offset" "I06")) + (const_int 7))) + (and:SI (plus:SI (match_dup 0) (match_dup 2)) (const_int 7))) + (match_operand:DI 3 "arith_reg_operand" "r"))] + "TARGET_SHMEDIA32 && (INTVAL (operands[1]) & 7) == INTVAL (operands[2])" + "#" + "" + [(pc)] + "emit_insn (gen_stlo_q (gen_rtx_PLUS (SImode, operands[0], operands[1]), + operands[3])); + DONE;") + +(define_insn "ldhi_l64" + [(set (match_operand:SI 0 
"arith_reg_dest" "=r") + (zero_extract:SI + (mem:SI (plus:DI (ior:DI (match_operand:QI 1 "ua_address_operand" "p") + (const_int 3)) + (const_int -3))) + (plus:DI (and:DI (match_dup 1) (const_int 3)) (const_int 1)) + (const_int 0)))] + "TARGET_SHMEDIA64" + "ldhi.l %U1, %0" + [(set_attr "type" "load_media")]) + +(define_insn "ldhi_q64" + [(set (match_operand:DI 0 "arith_reg_dest" "=r") + (zero_extract:DI + (mem:DI (plus:DI (ior:DI (match_operand:QI 1 "ua_address_operand" "p") + (const_int 7)) + (const_int -7))) + (plus:DI (and:DI (match_dup 1) (const_int 7)) (const_int 1)) + (const_int 0)))] + "TARGET_SHMEDIA64" + "ldhi.q %U1, %0" + [(set_attr "type" "load_media")]) + +(define_insn "ldlo_l64" + [(set (match_operand:SI 0 "arith_reg_dest" "=r") + (zero_extract:SI + (mem:SI (and:DI (match_operand:QI 1 "ua_address_operand" "p") + (const_int -4))) + (minus:DI (const_int 4) (and:DI (match_dup 1) (const_int 3))) + (and:DI (match_dup 1) (const_int 3))))] + "TARGET_SHMEDIA64" + "ldlo.l %U1, %0" + [(set_attr "type" "load_media")]) + +(define_insn "ldlo_q64" + [(set (match_operand:DI 0 "arith_reg_dest" "=r") + (zero_extract:DI + (mem:DI (and:DI (match_operand:QI 1 "ua_address_operand" "p") + (const_int -8))) + (minus:DI (const_int 8) (and:DI (match_dup 1) (const_int 7))) + (and:DI (match_dup 1) (const_int 7))))] + "TARGET_SHMEDIA64" + "ldlo.q %U1, %0" + [(set_attr "type" "load_media")]) + +(define_insn "sthi_l64" + [(set (zero_extract:SI + (mem:SI (plus:DI (ior:DI (match_operand:QI 0 "ua_address_operand" "p") + (const_int 3)) + (const_int -3))) + (plus:DI (and:DI (match_dup 0) (const_int 3)) (const_int 1)) + (const_int 0)) + (match_operand:SI 1 "arith_reg_operand" "r"))] + "TARGET_SHMEDIA64" + "sthi.l %U0, %1" + [(set_attr "type" "ustore_media")]) + +(define_insn "sthi_q64" + [(set (zero_extract:DI + (mem:DI (plus:DI (ior:DI (match_operand:QI 0 "ua_address_operand" "p") + (const_int 7)) + (const_int -7))) + (plus:DI (and:DI (match_dup 0) (const_int 7)) (const_int 1)) + (const_int 0)) + (match_operand:DI 1 "arith_reg_operand" "r"))] + "TARGET_SHMEDIA64" + "sthi.q %U0, %1" + [(set_attr "type" "ustore_media")]) + +(define_insn "stlo_l64" + [(set (zero_extract:SI + (mem:SI (and:DI (match_operand:QI 0 "ua_address_operand" "p") + (const_int -4))) + (minus:DI (const_int 4) (and:DI (match_dup 0) (const_int 3))) + (and:DI (match_dup 0) (const_int 3))) + (match_operand:SI 1 "arith_reg_operand" "r"))] + "TARGET_SHMEDIA64" + "stlo.l %U0, %1" + [(set_attr "type" "ustore_media")]) + +(define_insn "stlo_q64" + [(set (zero_extract:DI + (mem:DI (and:DI (match_operand:QI 0 "ua_address_operand" "p") + (const_int -8))) + (minus:DI (const_int 8) (and:DI (match_dup 0) (const_int 7))) + (and:DI (match_dup 0) (const_int 7))) + (match_operand:DI 1 "arith_reg_operand" "r"))] + "TARGET_SHMEDIA64" + "stlo.q %U0, %1" + [(set_attr "type" "ustore_media")]) + (define_insn "nsb" [(set (match_operand:QI 0 "arith_reg_dest" "=r") (unspec:QI [(match_operand:DI 1 "arith_reg_operand" "r")] @@ -10975,7 +13056,7 @@ mov.l\\t1f,r0\\n\\ "byterev %1, %0" [(set_attr "type" "arith_media")]) -(define_insn "prefetch_media" +(define_insn "*prefetch_media" [(prefetch (match_operand:QI 0 "address_operand" "p") (match_operand:SI 1 "const_int_operand" "n") (match_operand:SI 2 "const_int_operand" "n"))] @@ -10988,11 +13069,11 @@ mov.l\\t1f,r0\\n\\ }" [(set_attr "type" "other")]) -(define_insn "prefetch_i4" +(define_insn "*prefetch_i4" [(prefetch (match_operand:SI 0 "register_operand" "r") (match_operand:SI 1 "const_int_operand" "n") (match_operand:SI 2 
"const_int_operand" "n"))] - "TARGET_HARD_SH4" + "TARGET_HARD_SH4 || TARGET_SHCOMPACT" "* { return \"pref @%0\"; @@ -11000,20 +13081,53 @@ mov.l\\t1f,r0\\n\\ [(set_attr "type" "other")]) (define_expand "prefetch" - [(prefetch (match_operand:QI 0 "address_operand" "p") + [(prefetch (match_operand 0 "address_operand" "p") (match_operand:SI 1 "const_int_operand" "n") (match_operand:SI 2 "const_int_operand" "n"))] - "TARGET_SHMEDIA || TARGET_HARD_SH4" + "TARGET_HARD_SH4 || TARGET_SH5" " { - if (TARGET_HARD_SH4 && ! register_operand (operands[0], SImode)) + if (GET_MODE (operands[0]) != Pmode + || GET_CODE (operands[1]) != CONST_INT + || GET_CODE (operands[2]) != CONST_INT) + FAIL; + if (! TARGET_SHMEDIA) + operands[0] = force_reg (Pmode, operands[0]); +}") + +(define_insn "alloco_i" + [(set (mem:BLK (match_operand:QI 0 "cache_address_operand" "p")) + (unspec:BLK [(const_int 0)] UNSPEC_ALLOCO))] + "TARGET_SHMEDIA32" + "* +{ + rtx xops[2]; + + if (GET_CODE (operands[0]) == PLUS) + { + xops[0] = XEXP (operands[0], 0); + xops[1] = XEXP (operands[0], 1); + } + else { - rtx reg = gen_reg_rtx (SImode); - emit_move_insn (reg, operands[0]); - operands[0] = reg; + xops[0] = operands[0]; + xops[1] = const0_rtx; } + output_asm_insn (\"alloco %0, %1\", xops); + return \"\"; +}" + [(set_attr "type" "other")]) - emit_insn ((TARGET_SHMEDIA ? gen_prefetch_media : gen_prefetch_i4) - (operands[0], operands[1], operands[2])); - DONE; +(define_split + [(set (match_operand 0 "any_register_operand" "") + (match_operand 1 "" ""))] + "TARGET_SHMEDIA && reload_completed" + [(set (match_dup 0) (match_dup 1))] + " +{ + int n_changes = 0; + + for_each_rtx (&operands[1], shmedia_cleanup_truncate, &n_changes); + if (!n_changes) + FAIL; }") diff --git a/gcc/config/sh/shmedia.md b/gcc/config/sh/shmedia.md index 4efed77..1f7ee9c 100644 --- a/gcc/config/sh/shmedia.md +++ b/gcc/config/sh/shmedia.md @@ -25,9 +25,11 @@ ;; the integer and multimedia unit (imu), the load/store unit (lsu), and ;; the floating point unit (fpu). -(define_automaton "shmedia") +(define_automaton "sh5inst_pipe, sh5fpu_pipe") -(define_cpu_unit "sh5issue,sh5fds" "shmedia") +(define_cpu_unit "sh5issue" "sh5inst_pipe") + +(define_cpu_unit "sh5fds" "sh5fpu_pipe") ;; Every instruction on SH-5 occupies the issue resource for at least one ;; cycle. @@ -86,8 +88,8 @@ ;; can continue to issue. (define_insn_reservation "shmedia_fdiv" 19 (and (eq_attr "pipe_model" "sh5media") (eq_attr "type" "fdiv_media")) - "sh5fds*19") + "sh5issue+sh5fds,sh5fds*18") (define_insn_reservation "shmedia_dfdiv" 35 (and (eq_attr "pipe_model" "sh5media") (eq_attr "type" "dfdiv_media")) - "sh5fds*35") + "sh5issue+sh5fds,sh5fds*34") diff --git a/gcc/config/sh/sshmedia.h b/gcc/config/sh/sshmedia.h index 6795a99..c94e598 100644 --- a/gcc/config/sh/sshmedia.h +++ b/gcc/config/sh/sshmedia.h @@ -31,6 +31,9 @@ Boston, MA 02111-1307, USA. 
*/ #define _SSHMEDIA_H #if __SHMEDIA__ +__inline__ static unsigned long long sh_media_GETCON (unsigned int k) + __attribute__((always_inline)); + __inline__ static unsigned long long sh_media_GETCON (unsigned int k) @@ -40,6 +43,9 @@ sh_media_GETCON (unsigned int k) return res; } +__inline__ static void sh_media_PUTCON (unsigned long long mm, unsigned int k) + __attribute__((always_inline)); + __inline__ static void sh_media_PUTCON (unsigned long long mm, unsigned int k) diff --git a/gcc/config/sh/superh.h b/gcc/config/sh/superh.h new file mode 100644 index 0000000..ebdd0471 --- /dev/null +++ b/gcc/config/sh/superh.h @@ -0,0 +1,151 @@ +/* Definitions of target machine for gcc for Super-H using sh-superh-elf. + Copyright (C) 2001 Free Software Foundation, Inc. + +This file is part of GNU CC. + +GNU CC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +GNU CC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GNU CC; see the file COPYING. If not, write to +the Free Software Foundation, 59 Temple Place - Suite 330, +Boston, MA 02111-1307, USA. */ + + +/* This header file is used when the vendor name is set to 'superh'. + It configures the compiler for SH4 only and switches the default + endianness to little (although big endian is still available). + It also configures the spec file to the default board configuration + but in such a way that it can be overridden by a boardspecs file + (using the -specs= option). This file is expected to disable the + defaults and provide options --defsym _start and --defsym _stack + which are required by the SuperH configuration of GNU ld. + + This file is intended to override sh.h */ + + +#ifndef _SUPERH_H +#define _SUPERH_H +#endif + + +#undef TARGET_VERSION +#define TARGET_VERSION fprintf (stderr, " (SuperH SH special %s)", __DATE__); + + +/* We override TARGET_PROCESSOR_SWITCHES in order to remove all the unrequired cpu options + and add options for all the SuperH CPU variants: + -m4-100 is an alias for -m4. + -m4-200 is an alias for -m4. + -m4-400 is an alias for -m4-nofpu and passes -isa=sh4-nommu-nofpu to the assembler. + -m4-500 is an alias for -m4-nofpu and passes -isa=sh4-nofpu to the assembler. 
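+ For example, a hypothetical invocation 'sh-superh-elf-gcc -m4-400 foo.c'
+ compiles as if -m4-nofpu had been given and passes -isa=sh4-nommu-nofpu
+ to the assembler (see SUBTARGET_ASM_SPEC below).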
*/ +#undef TARGET_PROCESSOR_SWITCHES +#define TARGET_PROCESSOR_SWITCHES \ + {"4-500", TARGET_NONE, "SH4 500 series (FPU-less)" }, \ + {"4-500", SELECT_SH4_NOFPU, "" }, \ + {"4-400", TARGET_NONE, "SH4 400 series (MMU/FPU-less)" }, \ + {"4-400", SELECT_SH4_NOFPU, "" }, \ + {"4-200-single-only", TARGET_NONE, "SH4 200 series with double = float (SH3e ABI)" }, \ + {"4-200-single-only", SELECT_SH4_SINGLE_ONLY, "" }, \ + {"4-200-single", TARGET_NONE, "SH4 200 series with single precision pervading" }, \ + {"4-200-single", SELECT_SH4_SINGLE, "" }, \ + {"4-200-nofpu", TARGET_NONE, "SH4 200 series using soft floating point" }, \ + {"4-200-nofpu", SELECT_SH4_NOFPU, "" }, \ + {"4-200", TARGET_NONE, "SH4 200 series" }, \ + {"4-200", SELECT_SH4_NOFPU, "" }, \ + {"4-100-single-only", TARGET_NONE, "SH4 100 series with double = float (SH3e ABI)" }, \ + {"4-100-single-only", SELECT_SH4_SINGLE_ONLY, "" }, \ + {"4-100-single", TARGET_NONE, "SH4 100 series with single precision pervading" }, \ + {"4-100-single", SELECT_SH4_SINGLE, "" }, \ + {"4-100-nofpu", TARGET_NONE, "SH4 100 series using soft floating point" }, \ + {"4-100-nofpu", SELECT_SH4_NOFPU, "" }, \ + {"4-100", TARGET_NONE, "SH4 100 series" }, \ + {"4-100", SELECT_SH4_NOFPU, "" }, \ + {"4-single-only", TARGET_NONE, "Generic SH4 with double = float (SH3e ABI)" }, \ + {"4-single-only", SELECT_SH4_SINGLE_ONLY, "" }, \ + {"4-single", TARGET_NONE, "Generic SH4 with single precision pervading" }, \ + {"4-single", SELECT_SH4_SINGLE, "" }, \ + {"4-nofpu", TARGET_NONE, "Generic SH4 using soft floating point" }, \ + {"4-nofpu", SELECT_SH4_NOFPU, "" }, \ + {"4", TARGET_NONE, "Generic SH4 (default)" }, \ + {"4", SELECT_SH4, "" } + + +/* Provide the -mboard= option used by the boardspecs file */ +#undef SUBTARGET_OPTIONS +#define SUBTARGET_OPTIONS \ + { "board=", &boardtype, "Board name [and momory region].", 0 }, \ + { "runtime=", &osruntime, "Runtime name.", 0 }, \ + +/* These are required by the mboard= option and runtime= option + and are defined in sh.c but are not used anywhere */ +extern const char * boardtype; +extern const char * osruntime; + + +/* Override the linker spec strings to use the new emultation + The specstrings are concatenated as follows + LINK_EMUL_PREFIX.(''|'32'|'64'|LINK_DEFAULT_CPU_EMUL).SUBTARGET_LINK_EMUL_SUFFIX +*/ +#undef LINK_EMUL_PREFIX +#undef SUBTARGET_LINK_EMUL_SUFFIX + +#define LINK_EMUL_PREFIX "superh" +#define SUBTARGET_LINK_EMUL_SUFFIX "" + +/* Add the SUBTARGET_LINK_SPEC to add the board and runtime support and + change the endianness */ +#undef SUBTARGET_LINK_SPEC +#if TARGET_ENDIAN_DEFAULT == LITTLE_ENDIAN_BIT +#define SUBTARGET_LINK_SPEC "%(board_link) %(ldruntime) %{ml|!mb:-EL}%{mb:-EB}" +#else +#define SUBTARGET_LINK_SPEC "%(board_link) %(ldruntime) %{ml:-EL}%{mb|!ml:-EB}" +#endif + + +/* This is used by the link spec if the boardspecs file is not used (for whatever reason). + If the boardspecs file overrides this then an alternative can be used. 
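+   For example, a boardspecs file passed with -specs= could override it
+   with (hypothetical addresses):
+
+     *board_link:
+     --defsym _start=0x0c000000 --defsym _stack=0x0c100000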
*/ +#undef SUBTARGET_EXTRA_SPECS +#define SUBTARGET_EXTRA_SPECS \ +{ "board_link", "--defsym _start=0x1000 --defsym _stack=0x30000" }, \ +{ "asruntime", "" }, \ +{ "cppruntime", "-D__GDB_SIM__" }, \ +{ "cc1runtime", "" }, \ +{ "ldruntime", "" }, \ +{ "libruntime", "-lc -lgloss" } + + +/* Set the SUBTARGET_CPP_SPEC to define __EMBEDDED_CROSS__ which has an effect + on newlib and provide the runtime support */ +#undef SUBTARGET_CPP_SPEC +#define SUBTARGET_CPP_SPEC \ +"-D__EMBEDDED_CROSS__ %{m4-100*:-D__SH4_100__} %{m4-200*:-D__SH4_200__} %{m4-400:-D__SH4_400__} %{m4-500:-D__SH4_500__} \ +%(cppruntime)" + +/* Override the SUBTARGET_ASM_SPEC to add the runtime support */ +#undef SUBTARGET_ASM_SPEC +#define SUBTARGET_ASM_SPEC "%{m4-100*|m4-200*:-isa=sh4} %{m4-400:-isa=sh4-nommu-nofpu} %{m4-500:-isa=sh4-nofpu} %(asruntime)" + +/* Override the SUBTARGET_ASM_RELAX_SPEC so it doesn't interfere with the + runtime support by adding -isa=sh4 in the wrong place. */ +#undef SUBTARGET_ASM_RELAX_SPEC +#define SUBTARGET_ASM_RELAX_SPEC "%{!m4-100*:%{!m4-200*:%{!m4-400:%{!m4-500:-isa=sh4}}}}" + +/* Create the CC1_SPEC to add the runtime support */ +#undef CC1_SPEC +#define CC1_SPEC "%(cc1runtime)" + +#undef CC1PLUS_SPEC +#define CC1PLUS_SPEC "%(cc1runtime)" + + +/* Override the LIB_SPEC to add the runtime support */ +#undef LIB_SPEC +#define LIB_SPEC "%{!shared:%{!symbolic:%(libruntime) -lc}} %{pg:-lprofile -lc}" diff --git a/gcc/config/sh/superh64.h b/gcc/config/sh/superh64.h new file mode 100644 index 0000000..1d07e7e --- /dev/null +++ b/gcc/config/sh/superh64.h @@ -0,0 +1,50 @@ +/* + Definitions of target machine for gcc for SuperH using target sh-superh-elf, + + Copyright 2000 Free Software Foundation, Inc. + Contributed by Alexandre Oliva + Modified for SuperH by Richard Shann + +This file is part of GNU CC. + +GNU CC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +GNU CC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GNU CC; see the file COPYING. If not, write to +the Free Software Foundation, 59 Temple Place - Suite 330, +Boston, MA 02111-1307, USA. */ + +/* This header file is used when the vendor name is set to 'superh'. + It configures the compiler for SH5 only and switches the default + endianess to little. 
+ This file is intended to overide sh.h, superh.h and sh64.h (which + should have been included in that order) */ + + +#ifndef _SUPERH_H + #error superh64.h should not be used without superh.h +#endif + +/* We override TARGET_PROCESSOR_SWITCHES in order to remove all the unrequired cpu options */ +#undef TARGET_PROCESSOR_SWITCHES +#define TARGET_PROCESSOR_SWITCHES \ + {"5-64media", TARGET_NONE, "" }, \ + {"5-64media", SELECT_SH5_64, "SH5 64-bit SHmedia code" }, \ + {"5-64media-nofpu", TARGET_NONE, "" }, \ + {"5-64media-nofpu", SELECT_SH5_64_NOFPU, "SH5 64-bit FPU-less SHmedia code" }, \ + {"5-32media", TARGET_NONE, "" }, \ + {"5-32media", SELECT_SH5_32, "SH5 32-bit SHmedia code" }, \ + {"5-32media-nofpu", TARGET_NONE, "" }, \ + {"5-32media-nofpu", SELECT_SH5_32_NOFPU, "SH5 32-bit FPU-less SHmedia code" }, \ + {"5-compact", TARGET_NONE, "" }, \ + {"5-compact", SELECT_SH5_COMPACT, "SH5 SHcompact code" }, \ + {"5-compact-nofpu", TARGET_NONE, "" }, \ + {"5-compact-nofpu", SELECT_SH5_COMPACT_NOFPU, "SH5 FPU-less SHcompact code" } diff --git a/gcc/config/sh/t-linux b/gcc/config/sh/t-linux index 777d157..698882f 100644 --- a/gcc/config/sh/t-linux +++ b/gcc/config/sh/t-linux @@ -1,5 +1,5 @@ TARGET_LIBGCC2_CFLAGS = -fpic -DNO_FPSCR_VALUES -LIB1ASMFUNCS_CACHE = _ic_invalidate +LIB1ASMFUNCS_CACHE = _ic_invalidate _ic_invalidate_array LIB2FUNCS_EXTRA= diff --git a/gcc/config/sh/t-sh64 b/gcc/config/sh/t-sh64 index 97a13be..0311808 100644 --- a/gcc/config/sh/t-sh64 +++ b/gcc/config/sh/t-sh64 @@ -1,11 +1,9 @@ -EXTRA_MULTILIB_PARTS= crt1.o crti.o crtn.o crtbegin.o crtend.o - LIB1ASMFUNCS = \ _sdivsi3 _sdivsi3_i4 _udivsi3 _udivsi3_i4 _set_fpscr \ _shcompact_call_trampoline _shcompact_return_trampoline \ _shcompact_incoming_args _ic_invalidate _nested_trampoline \ _push_pop_shmedia_regs \ - _udivdi3 _divdi3 _umoddi3 _moddi3 + _udivdi3 _divdi3 _umoddi3 _moddi3 _div_table MULTILIB_CPU_DIRS= $(ML_sh1) $(ML_sh2e) $(ML_sh2) $(ML_sh3e) $(ML_sh3) $(ML_sh4_nofpu) $(ML_sh4_single_only) $(ML_sh4_single) $(ML_sh4) $(ML_sh5_32media:m5-32media/=media32) $(ML_sh5_32media_nofpu:m5-32media-nofpu/=nofpu/media32) $(ML_sh5_compact:m5-compact/=compact) $(ML_sh5_compact_nofpu:m5-compact-nofpu/=nofpu/compact) $(ML_sh5_64media:m5-64media/=media64) $(ML_sh5_64media_nofpu:m5-64media-nofpu/=nofpu/media64) diff --git a/gcc/config/sh/t-superh b/gcc/config/sh/t-superh new file mode 100644 index 0000000..35803ca --- /dev/null +++ b/gcc/config/sh/t-superh @@ -0,0 +1,6 @@ +MULTILIB_OPTIONS= mb m4-nofpu/m4-single/m4-single-only +MULTILIB_DIRNAMES= +MULTILIB_MATCHES = m4=m4-100 m4-nofpu=m4-100-nofpu m4-single=m4-100-single m4-single-only=m4-100-single-only \ + m4=m4-200 m4-nofpu=m4-200-nofpu m4-single=m4-200-single m4-single-only=m4-200-single-only \ + m4-nofpu=m4-400 \ + m4-nofpu=m4-500 diff --git a/gcc/config/sh/ushmedia.h b/gcc/config/sh/ushmedia.h index 6fb7016..86312af 100644 --- a/gcc/config/sh/ushmedia.h +++ b/gcc/config/sh/ushmedia.h @@ -36,767 +36,706 @@ typedef float __GCC_FV __attribute__ ((vector_size (4 * sizeof (float)))); typedef float __GCC_MTRX __attribute__ ((vector_size (16 * sizeof (float)))); #endif -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MABS_L (unsigned long long mm) { - unsigned long long res; - __asm__ ("mabs.l %1, %0" : "=r" (res) : "r" (mm)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_absv2si2 ((v2si) mm); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MABS_W (unsigned 
long long mm) { - unsigned long long res; - __asm__ ("mabs.w %1, %0" : "=r" (res) : "r" (mm)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_absv4hi2 ((v4hi) mm); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MADD_L (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("madd.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_addv2si3 ((v2si) mm, (v2si) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MADD_W (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("madd.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_addv4hi3 ((v4hi) mm, (v4hi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MADDS_L (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("madds.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_ssaddv2si3 ((v2si) mm, (v2si) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MADDS_UB (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("madds.ub %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_usaddv8qi3 ((v8qi) mm, (v8qi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MADDS_W (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("madds.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_ssaddv4hi3 ((v4hi) mm, (v4hi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MCMPEQ_B (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mcmpeq.b %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_sh_media_MCMPEQ_B ((v8qi) mm, + (v8qi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MCMPEQ_L (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mcmpeq.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_sh_media_MCMPEQ_L ((v2si) mm, + (v2si) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MCMPEQ_W (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mcmpeq.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; -} + typedef float v4hi __attribute__ ((mode(V4HI))); -__inline__ static -unsigned long long -sh_media_MCMPGT_L (unsigned long long mm, unsigned long long mn) -{ - unsigned long long res; - __asm__ ("mcmpgt.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + return (unsigned long long) __builtin_sh_media_MCMPEQ_W ((v4hi) mm, + (v4hi) mn); } -__inline__ static -unsigned long long +static 
__inline unsigned long long sh_media_MCMPGT_UB (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mcmpgt.ub %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_sh_media_MCMPGT_UB ((v8qi) mm, + (v8qi) mn); } -__inline__ static -unsigned long long -sh_media_MCMPGT_W (unsigned long long mm, unsigned long long mn) +static __inline unsigned long long +sh_media_MCMPGT_L (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mcmpgt.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_sh_media_MCMPGT_L ((v2si) mm, + (v2si) mn); } -__inline__ static -unsigned long long -sh_media_MCMV (unsigned long long mm, unsigned long long mn, unsigned long long mw) +static __inline unsigned long long +sh_media_MCMPGT_W (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mcmv %1, %2, %0" : "=r" (res) - : "r" (mm), "r" (mn), "0" (mw)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_sh_media_MCMPGT_W ((v4hi) mm, + (v4hi) mn); } -__inline__ static -unsigned long long +#define sh_media_MCMV __builtin_sh_media_MCMV + +static __inline unsigned long long sh_media_MCNVS_LW (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mcnvs.lw %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + typedef unsigned int uv2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_sh_media_MCNVS_LW ((v2si) mm, + (uv2si) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MCNVS_WB (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mcnvs.wb %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_sh_media_MCNVS_WB ((v4hi) mm, + (v4hi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MCNVS_WUB (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mcnvs.wub %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_sh_media_MCNVS_WUB ((v4hi) mm, + (v4hi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MEXTR1 (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mextr1 %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_sh_media_MEXTR1 ((v8qi) mm, + (v8qi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MEXTR2 (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mextr2 %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_sh_media_MEXTR2 ((v8qi) mm, + (v8qi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MEXTR3 (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mextr3 %1, %2, %0" : "=r" (res) : "r" 
(mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_sh_media_MEXTR3 ((v8qi) mm, + (v8qi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MEXTR4 (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mextr4 %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_sh_media_MEXTR4 ((v8qi) mm, + (v8qi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MEXTR5 (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mextr5 %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_sh_media_MEXTR5 ((v8qi) mm, + (v8qi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MEXTR6 (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mextr6 %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_sh_media_MEXTR6 ((v8qi) mm, + (v8qi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MEXTR7 (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mextr7 %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_sh_media_MEXTR7 ((v8qi) mm, + (v8qi) mn); } -__inline__ static -unsigned long long -sh_media_MMACFX_WL (unsigned long long mm, unsigned long long mn, unsigned long long mw) +static __inline unsigned long long +sh_media_MMACFX_WL (unsigned long long mm, unsigned long long mn, + unsigned long long mw) { - unsigned long long res; - __asm__ ("mmacfx.wl %1, %2, %0" : "=r" (res) - : "r" (mm), "r" (mn), "0" (mw)); - return res; + typedef float v2hi __attribute__ ((mode(V2HI))); + typedef float v2si __attribute__ ((mode(V2SI))); + typedef unsigned int uv2si __attribute__ ((mode(V2SI))); + + long mm_l = (long) mm; + long mn_l = (long) mn; + + return ((unsigned long long) + __builtin_sh_media_MMACFX_WL ((v2hi) mm_l, (v2hi) mn_l, + (uv2si) mw)); } -__inline__ static -unsigned long long -sh_media_MMACNFX_WL (unsigned long long mm, unsigned long long mn, unsigned long long mw) +static __inline unsigned long long +sh_media_MMACNFX_WL (unsigned long long mm, unsigned long long mn, + unsigned long long mw) { - unsigned long long res; - __asm__ ("mmacnfx.wl %1, %2, %0" : "=r" (res) - : "r" (mm), "r" (mn), "0" (mw)); - return res; + typedef float v2hi __attribute__ ((mode(V2HI))); + typedef float v2si __attribute__ ((mode(V2SI))); + typedef unsigned int uv2si __attribute__ ((mode(V2SI))); + + long mm_l = (long) mm; + long mn_l = (long) mn; + + return ((unsigned long long) + __builtin_sh_media_MMACNFX_WL ((v2hi) mm_l, (v2hi) mn_l, + (uv2si) mw)); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MMUL_L (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mmul.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_mulv2si3 ((v2si) mm, (v2si) mn); } -__inline__ static -unsigned long long +static __inline unsigned 
long long sh_media_MMUL_W (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mmul.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_mulv4hi3 ((v4hi) mm, (v4hi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MMULFX_L (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mmulfx.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_sh_media_MMULFX_L ((v2si) mm, + (v2si) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MMULFX_W (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mmulfx.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_sh_media_MMULFX_W ((v4hi) mm, + (v4hi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MMULFXRP_W (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mmulfxrp.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_sh_media_MMULFXRP_W ((v4hi) mm, + (v4hi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MMULHI_WL (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mmulhi.wl %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_sh_media_MMULHI_WL ((v4hi) mm, + (v4hi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MMULLO_WL (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mmullo.wl %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_sh_media_MMULLO_WL ((v4hi) mm, + (v4hi) mn); } -__inline__ static -unsigned long long -sh_media_MMULSUM_WQ (unsigned long long mm, unsigned long long mn, unsigned long long mw) +static __inline unsigned long long +sh_media_MMULSUM_WQ (unsigned long long mm, unsigned long long mn, + unsigned long long mw) { - unsigned long long res; - __asm__ ("mmulsum.wq %1, %2, %0" : "=r" (res) - : "r" (mm), "r" (mn), "0" (mw)); - return res; + typedef unsigned int uv4hi __attribute__ ((mode(V4HI))); + + return __builtin_sh_media_MMULSUM_WQ ((uv4hi) mm, (uv4hi) mn, mw); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MPERM_W (unsigned long long mm, unsigned int mn) { - unsigned long long res; - __asm__ ("mperm.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_sh_media_MPERM_W ((v4hi) mm, mn); } -__inline__ static -unsigned long long -sh_media_MSAD_UBQ (unsigned long long mm, unsigned long long mn, unsigned long long mw) +static __inline unsigned long long +sh_media_MSAD_UBQ (unsigned long long mm, unsigned long long mn, + unsigned long long mw) { - unsigned long long res; - __asm__ ("msad.ubq %1, %2, %0" : "=r" (res) - : "r" (mm), "r" (mn), "0" (mw)); - return res; + typedef 
unsigned int uv8qi __attribute__ ((mode(V8QI))); + + return __builtin_sh_media_MSAD_UBQ ((uv8qi) mm, (uv8qi) mn, mw); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHALDS_L (unsigned long long mm, unsigned int mn) { - unsigned long long res; - __asm__ ("mshalds.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_sh_media_MSHALDS_L ((v2si) mm, mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHALDS_W (unsigned long long mm, unsigned int mn) { - unsigned long long res; - __asm__ ("mshalds.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_sh_media_MSHALDS_W ((v4hi) mm, mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHARD_L (unsigned long long mm, unsigned int mn) { - unsigned long long res; - __asm__ ("mshard.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_ashrv2si3 ((v2si) mm, mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHARD_W (unsigned long long mm, unsigned int mn) { - unsigned long long res; - __asm__ ("mshard.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; -} + typedef float v4hi __attribute__ ((mode(V4HI))); -__inline__ static -short -sh_media_MSHARDS_Q (long long mm, unsigned int mn) -{ - short res; - __asm__ ("mshards.q %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + return (unsigned long long) __builtin_ashrv4hi3 ((v4hi) mm, mn); } -__inline__ static -unsigned long long +#define sh_media_MSHARDS_Q __builtin_sh_media_MSHARDS_Q + +static __inline unsigned long long sh_media_MSHFHI_B (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mshfhi.b %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_sh_media_MSHFHI_B ((v8qi) mm, + (v8qi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHFHI_L (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mshfhi.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_sh_media_MSHFHI_L ((v2si) mm, + (v2si) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHFHI_W (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mshfhi.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_sh_media_MSHFHI_W ((v4hi) mm, + (v4hi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHFLO_B (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mshflo.b %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_sh_media_MSHFLO_B ((v8qi) mm, + (v8qi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHFLO_L (unsigned long long mm, unsigned long long mn) { - 
unsigned long long res; - __asm__ ("mshflo.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_sh_media_MSHFLO_L ((v2si) mm, + (v2si) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHFLO_W (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("mshflo.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_sh_media_MSHFLO_W ((v4hi) mm, + (v4hi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHLLD_L (unsigned long long mm, unsigned int mn) { - unsigned long long res; - __asm__ ("mshlld.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_ashlv2si3 ((v2si) mm, mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHLLD_W (unsigned long long mm, unsigned int mn) { - unsigned long long res; - __asm__ ("mshlld.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_ashlv4hi3 ((v4hi) mm, mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHLRD_L (unsigned long long mm, unsigned int mn) { - unsigned long long res; - __asm__ ("mshlrd.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_lshrv2si3 ((v2si) mm, mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSHLRD_W (unsigned long long mm, unsigned int mn) { - unsigned long long res; - __asm__ ("mshlrd.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_lshrv4hi3 ((v4hi) mm, mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSUB_L (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("msub.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_subv2si3 ((v2si) mm, (v2si) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSUB_W (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("msub.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_subv4hi3 ((v4hi) mm, (v4hi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSUBS_L (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("msubs.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v2si __attribute__ ((mode(V2SI))); + + return (unsigned long long) __builtin_sssubv2si3 ((v2si) mm, (v2si) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSUBS_UB (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("msubs.ub %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return 
(unsigned long long) __builtin_ussubv8qi3 ((v8qi) mm, (v8qi) mn); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_MSUBS_W (unsigned long long mm, unsigned long long mn) { - unsigned long long res; - __asm__ ("msubs.w %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + typedef float v4hi __attribute__ ((mode(V4HI))); + + return (unsigned long long) __builtin_sssubv4hi3 ((v4hi) mm, (v4hi) mn); } #if ! __SH4_NOFPU__ -__inline__ static -double -sh_media_FABS_D (double dg) -{ - double res; - __asm__ ("fabs.d %1, %0" : "=f" (res) : "f" (dg)); - return res; -} +/* Floating-point Intrinsics */ -__inline__ static -float -sh_media_FABS_S (float fg) -{ - float res; - __asm__ ("fabs.s %1, %0" : "=f" (res) : "f" (fg)); - return res; -} +#define sh_media_FABS_D __builtin_fabs +#define sh_media_FABS_S __builtin_fabsf +#define sh_media_FCMPUN_D __builtin_isunordered +#define sh_media_FCMPUN_S __builtin_isunordered -__inline__ static -int -sh_media_FCMPUN_D (double dg, double dh) +static __inline float sh_media_FCOSA_S (float fg) { - int res; - __asm__ ("fcmpun.d %1, %2, %0" : "=f" (res) : "f" (dg), "f" (dh)); - return res; -} + union { int i; float f; } u; -__inline__ static -int -sh_media_FCMPUN_S (float fg, float fh) -{ - int res; - __asm__ ("fcmpun.s %1, %2, %0" : "=f" (res) : "f" (fg), "f" (fh)); - return res; + u.f = fg; + return __builtin_sh_media_FCOSA_S (u.i); } -__inline__ static -float +static __inline float sh_media_FGETSCR (void) -{ - float res; - __asm__ ("fgetscr %0" : "=f" (res)); - return res; +{ + float f; + + __asm volatile ("fgetscr %0" : "=f" (f)); + return f; } -__inline__ static -float +static __inline float sh_media_FIPR_S (const void *fvg, const void *fvh) { - float res; - __asm__ ("fipr.s %1, %2, %0" : "=f" (res) - : "f" (*(const __GCC_FV *)fvg), "f" (*(const __GCC_FV *)fvh)); - return res; + typedef float v4sf __attribute__ ((mode(V4SF))); + v4sf vg = *(v4sf*) fvg; + v4sf vh = *(v4sf*) fvh; + + return __builtin_sh_media_FIPR_S (vg, vh); } -__inline__ static -float +#if 0 +/* This gives different results for -O0 */ +static __inline float sh_media_FMAC_S (float fg, float fh, float fq) { - float res; - __asm__ ("fmac.s %1, %2, %0" : "=f" (res) - : "f" (fg), "f" (fh), "0" (fq)); - return res; + return fg * fh + fq; } +#else -__inline__ static -long long +#define sh_media_FMAC_S __builtin_sh_media_FMAC_S +#endif + +static __inline long long sh_media_FMOV_DQ (double dg) { - long long res; - __asm__ ("fmov.dq %1, %0" : "=r" (res) : "f" (dg)); - return res; + union { long long l; double d; } u; + + u.d = dg; + return u.l; } -__inline__ static -float +static __inline float sh_media_FMOV_LS (int mm) { - float res; - __asm__ ("fmov.ls %1, %0" : "=f" (res) : "r" (mm)); - return res; + union { int i; float f; } u; + + u.i = mm; + return u.f; } -__inline__ static -double +static __inline double sh_media_FMOV_QD (long long mm) { - double res; - __asm__ ("fmov.qd %1, %0" : "=f" (res) : "r" (mm)); - return res; + union { long long l; double d; } u; + + u.l = mm; + return u.d; } -__inline__ static -int +static __inline int sh_media_FMOV_SL (float fg) { - int res; - __asm__ ("fmov.sl %1, %0" : "=r" (res) : "f" (fg)); - return res; + union { int i; float f; } u; + + u.f = fg; + return u.i; } -__inline__ static -void +static __inline void sh_media_FPUTSCR (float fg) -{ - __asm__ ("fputscr %0" : : "f" (fg)); +{ + __asm volatile ("fputscr %0" : : "f" (fg)); } -__inline__ static -double -sh_media_FSQRT_D (double dg) +static __inline float 
sh_media_FSINA_S (float fg) { - double res; - __asm__ ("fsqrt.d %1, %0" : "=f" (res) : "f" (dg)); - return res; -} + union { int i; float f; } u; -__inline__ static -float -sh_media_FSQRT_S (float fg) -{ - float res; - __asm__ ("fsqrt.s %1, %0" : "=f" (res) : "f" (fg)); - return res; + u.f = fg; + return __builtin_sh_media_FSINA_S (u.i); } -__inline__ static -void +/* Can't use __builtin_sqrt / __builtin_sqrtf because they still implement + error handling unless -ffast-math is used. */ +#define sh_media_FSQRT_D __builtin_sh_media_FSQRT_D +#define sh_media_FSQRT_S __builtin_sh_media_FSQRT_S +#define sh_media_FSRRA_S __builtin_sh_media_FSRRA_S + +static __inline void sh_media_FTRV_S (const void *mtrxg, const void *fvh, void *fvf) { - __asm__ ("ftrv.s %2, %1, %0" : "=f" (*(__GCC_FV *)fvf) - : "f" (*(const __GCC_FV *)fvh), "f" (*(const __GCC_MTRX *)mtrxg)); + typedef float v16sf __attribute__ ((mode(V16SF))); + typedef float v4sf __attribute__ ((mode(V4SF))); + v16sf mtrx = *(v16sf*) mtrxg; + v4sf vh = *(v4sf*) fvh; + + *(v4sf*) fvf = __builtin_sh_media_FTRV_S (mtrx, vh); } #endif /* ! __SH4_NOFPU__ */ -__inline__ static -unsigned long long +/* Not implemented here: Control and Configuration intrinsics. */ +/* Misaligned Access Support intrinsics */ + +static __inline unsigned long long sh_media_LDHI_L (void *p, int s) { - unsigned long long res; - __asm__ ("ldhi.l %m1, %0" : "=r" (res) : "o" (((char*)p)[s])); - return res; + return __builtin_sh_media_LDHI_L ((char *)p + s); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_LDHI_Q (void *p, int s) { - unsigned long long res; - __asm__ ("ldhi.q %m1, %0" : "=r" (res) : "o" (((char*)p)[s])); - return res; + return __builtin_sh_media_LDHI_Q ((char *)p + s); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_LDLO_L (void *p, int s) { - unsigned long long res; - __asm__ ("ldlo.l %m1, %0" : "=r" (res) : "o" (((char*)p)[s])); - return res; + return __builtin_sh_media_LDLO_L ((char *)p + s); } -__inline__ static -unsigned long long +static __inline unsigned long long sh_media_LDLO_Q (void *p, int s) { - unsigned long long res; - __asm__ ("ldlo.q %m1, %0" : "=r" (res) : "o" (((char*)p)[s])); - return res; + return __builtin_sh_media_LDLO_Q ((char *)p + s); } -__inline__ static -void +static __inline void sh_media_STHI_L (void *p, int s, unsigned int mw) { - __asm__ ("sthi.l %m0, %1" : "+o" (((char*)p)[s]) : "r" (mw)); + __builtin_sh_media_STHI_L ((char*)p + s, mw); } -__inline__ static -void +static __inline void sh_media_STHI_Q (void *p, int s, unsigned long long mw) { - __asm__ ("sthi.q %m0, %1" : "+o" (((char*)p)[s]) : "r" (mw)); + __builtin_sh_media_STHI_Q ((char*)p + s, mw); } -__inline__ static -void +static __inline void sh_media_STLO_L (void *p, int s, unsigned int mw) { - __asm__ ("stlo.l %m0, %1" : "+o" (((char*)p)[s]) : "r" (mw)); + __builtin_sh_media_STLO_L ((char*)p + s, mw); } -__inline__ static -void +static __inline void sh_media_STLO_Q (void *p, int s, unsigned long long mw) { - __asm__ ("stlo.q %m0, %1" : "+o" (((char*)p)[s]) : "r" (mw)); + __builtin_sh_media_STLO_Q ((char*)p + s, mw); } -__inline__ static -unsigned char -sh_media_NSB (long long mm) -{ - unsigned char res; - __asm__ ("nsb %1, %0" : "=r" (res) : "r" (mm)); - return res; -} +/* Miscellaneous intrinsics */ -__inline__ static -unsigned long long +#define sh_media_NSB __builtin_sh_media_NSB + +static __inline unsigned long long sh_media_BYTEREV (unsigned long long mm) { - unsigned long long res; - 
__asm__ ("byterev %1, %0" : "=r" (res) : "r" (mm)); - return res; + typedef float v8qi __attribute__ ((mode(V8QI))); + + return (unsigned long long) __builtin_sh_media_BYTEREV ((v8qi) mm); } -__inline__ static -unsigned long long +__inline__ static unsigned long long +sh_media_CMVEQ (unsigned long long mm, unsigned long long mn, unsigned long long mw) __attribute__ ((always_inline)); + +__inline__ static unsigned long long sh_media_CMVEQ (unsigned long long mm, unsigned long long mn, unsigned long long mw) { - unsigned long long res; - __asm__ ("cmveq %1, %2, %0" : "=r" (res) - : "r" (mm), "r" (mn), "0" (mw)); - return res; + return mm == 0 ? mn : mw; } -__inline__ static -unsigned long long +__inline__ static unsigned long long +sh_media_CMVNE (unsigned long long mm, unsigned long long mn, unsigned long long mw) __attribute__ ((always_inline)); + +__inline__ static unsigned long long sh_media_CMVNE (unsigned long long mm, unsigned long long mn, unsigned long long mw) { - unsigned long long res; - __asm__ ("cmveq %1, %2, %0" : "=r" (res) - : "r" (mm), "r" (mn), "0" (mw)); - return res; + return mm != 0 ? mn : mw; } -__inline__ static -unsigned long long +static __inline long long sh_media_ADDZ_L (unsigned int mm, unsigned int mn) { - unsigned long long res; - __asm__ ("addz.l %1, %2, %0" : "=r" (res) : "r" (mm), "r" (mn)); - return res; + return mm + mn; } -__inline__ static +/* NOP and Synchronization instrinsics not implemented here. */ + +static __inline__ void sh_media_PREFO(void *mm, int s) +{ + __builtin_sh_media_PREFO (mm + s, 0, 0); +} + +/* Event Handling instrinsics not implemented here. */ + +/* Old asm stuff */ + +static __inline__ void sh_media_NOP (void) { - __asm__ __volatile__ ("nop" : :); + __asm__ ("nop" : :); } __inline__ static @@ -827,7 +766,7 @@ __inline__ static void sh_media_ALLOCO (void *mm, int s) { - __asm__ __volatile__ ("alloco %m0" : : "o" (((char*)mm)[s])); + __builtin_sh_media_ALLOCO (mm + s); } __inline__ static @@ -867,13 +806,6 @@ sh_media_PREFI (void *mm, int s) __inline__ static void -sh_media_PREFO (void *mm, int s) -{ - __asm__ __volatile__ ("ld.b %m0, r63" : : "o" (((char*)mm)[s])); -} - -__inline__ static -void sh_media_BRK (void) { __asm__ __volatile__ ("brk"); @@ -911,14 +843,19 @@ sh_media_unaligned_LD_UW (void *p) #endif } +/* We don't use the sh_media_LD* functions here because that turned out + to impede constant propagation of the offsets into the ldhi / ldlo + instructions. */ __inline__ static int sh_media_unaligned_LD_L (void *p) { #if __LITTLE_ENDIAN__ - return sh_media_LDHI_L (p, 3) | sh_media_LDLO_L (p, 0); + return (__builtin_sh_media_LDHI_L ((char *)p + 3) + | __builtin_sh_media_LDLO_L (p)); #else - return sh_media_LDLO_L (p, 3) | sh_media_LDHI_L (p, 0); + return (__builtin_sh_media_LDLO_L ((char *)p + 3) + | __builtin_sh_media_LDHI_L (p)); #endif } @@ -927,9 +864,11 @@ long long sh_media_unaligned_LD_Q (void *p) { #if __LITTLE_ENDIAN__ - return sh_media_LDHI_Q (p, 7) | sh_media_LDLO_Q (p, 0); + return (__builtin_sh_media_LDHI_Q ((char *)p + 7) + | __builtin_sh_media_LDLO_Q (p)); #else - return sh_media_LDLO_Q (p, 7) | sh_media_LDHI_Q (p, 0); + return (__builtin_sh_media_LDLO_Q ((char *)p + 7) + | __builtin_sh_media_LDHI_Q (p)); #endif } @@ -947,16 +886,19 @@ sh_media_unaligned_ST_W (void *p, unsigned int k) #endif } +/* We don't use the sh_media_ST* functions here because that turned out + to impede constant propagation of the offsets into the ldhi / ldlo + instructions. 
*/ __inline__ static void sh_media_unaligned_ST_L (void *p, unsigned int k) { #if __LITTLE_ENDIAN__ - sh_media_STHI_L (p, 3, k); - sh_media_STLO_L (p, 0, k); + __builtin_sh_media_STHI_L (p + 3, k); + __builtin_sh_media_STLO_L (p, k); #else - sh_media_STLO_L (p, 3, k); - sh_media_STHI_L (p, 0, k); + __builtin_sh_media_STLO_L (p + 3, k); + __builtin_sh_media_STHI_L (p, k); #endif } @@ -965,11 +907,11 @@ void sh_media_unaligned_ST_Q (void *p, unsigned long long k) { #if __LITTLE_ENDIAN__ - sh_media_STHI_Q (p, 7, k); - sh_media_STLO_Q (p, 0, k); + __builtin_sh_media_STHI_Q (p + 7, k); + __builtin_sh_media_STLO_Q (p, k); #else - sh_media_STLO_Q (p, 7, k); - sh_media_STHI_Q (p, 0, k); + __builtin_sh_media_STLO_Q (p + 7, k); + __builtin_sh_media_STHI_Q (p, k); #endif } diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index ab95b4d..52d879c 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -660,7 +660,10 @@ See RS/6000 and PowerPC Options. -mb -ml -mdalign -mrelax @gol -mbigtable -mfmovd -mhitachi -mrenesas -mno-renesas -mnomacsave @gol -mieee -misize -mpadstruct -mspace @gol --mprefergot -musermode} +-mprefergot -musermode -multcost=@var{number} -mdiv=@var{strategy} @gol +-mdivsi3_libfunc=@var{name} @gol +-madjust-unroll -mindexed-addressing -mgettrcost=@var{number} -mpt-fixed @gol + -minvalid-symbols} @emph{SPARC Options} @gccoptlist{-mcpu=@var{cpu-type} @gol @@ -11466,6 +11469,11 @@ Mark the @code{MAC} register as call-clobbered, even if @item -mieee @opindex mieee Increase IEEE-compliance of floating-point code. +At the moment, this is equivalent to @option{-fno-finite-math-only}. +When generating 16 bit SH opcodes, getting IEEE-conforming results for +comparisons of NANs / infinities incurs extra overhead in every +floating point comparison; therefore, the default is set to +@option{-ffinite-math-only}. @item -misize @opindex misize @@ -11491,6 +11499,107 @@ Generate a library function call to invalidate instruction cache entries, after fixing up a trampoline. This library function call doesn't assume it can write to the whole memory address space. This is the default when the target is @code{sh-*-linux*}. + +@item -multcost=@var{number} +@opindex multcost=@var{number} +Set the cost to assume for a multiply insn. + +@item -mdiv=@var{strategy} +@opindex mdiv=@var{strategy} +Set the division strategy to use for SHmedia code. @var{strategy} must be +one of: call, call2, fp, inv, inv:minlat, inv20u, inv20l, inv:call, +inv:call2, inv:fp. +"fp" performs the operation in floating point. This has a very high latency, +but needs only a few instructions, so it might be a good choice if +your code has enough easily exploitable ILP to allow the compiler to +schedule the floating point instructions together with other instructions. +Division by zero causes a floating point exception. +"inv" uses integer operations to calculate the inverse of the divisor, +and then multiplies the dividend by the inverse. This strategy allows +cse and hoisting of the inverse calculation. Division by zero calculates +an unspecified result, but does not trap. +"inv:minlat" is a variant of "inv" where if no cse / hoisting opportunities +have been found, or if the entire operation has been hoisted to the same +place, the last stages of the inverse calculation are intertwined with the +final multiply to reduce the overall latency, at the expense of using a few +more instructions, and thus offering fewer scheduling opportunities with +other code.
+"call" calls a library function that usually implements the inv:minlat +strategy. +This gives high code density for m5-*media-nofpu compilations. +"call2" uses a different entry point of the same library function, where it +assumes that a pointer to a lookup table has already been set up, which +exposes the pointer load to cse / code hoisting optimizations. +"inv:call", "inv:call2" and "inv:fp" all use the "inv" algorithm for initial +code generation, but if the code stays unoptimized, revert to the "call", +"call2", or "fp" strategies, resspectively. Note that the +potentially-trapping side effect of division by zero is carried by a +separate instruction, so it is possible that all the integer instructions +are hoisted out, but the marker for the side effect stays where it is. +A recombination to fp operations or a call is not possible in that case. +"inv20u" and "inv20l" are variants of the "inv:minlat" strategy. In the case +that the inverse calculation was nor separated from the multiply, they speed +up division where the dividend fits into 20 bits (plus sign where applicable), +by inserting a test to skip a number of operations in this case; this test +slows down the case of larger divdends. inv20u assumes the case of a such +a small dividend to be unlikely, and inv20l assumes it to be likely. + +@item -mdivsi3_libfunc=@var{name} +@opindex mdivsi3_libfunc=@var{name} +Set the name of the library function used for 32 bit signed division to +@var{name}. This only affect the name used in the call and inv:call +division strategies, and the compiler will still expect the same +sets of input/output/clobbered registers as if this option was not present. + +@item -madjust-unroll +@opindex madjust-unroll +Throttle unrolling to avoid thrashing target registers. +This option only has an effect if the gcc code base supports the +TARGET_ADJUST_UNROLL_MAX target hook. + +@item -mindexed-addressing +@opindex mindexed-addressing +Enable the use of the indexed addressing mode for SHmedia32/SHcompact. +This is only safe if the hardware and/or OS implement 32 bit wrap-around +semantics for the indexed addressing mode. The architecture allows the +implementation of processors with 64 bit MMU, which the OS could use to +get 32 bit addressing, but since no current harware implementation supports +this or any other way to make the indexed addressing mode safe to use in +the 32 bit ABI, the default is -mno-indexed-addressing. + +@item -mgettrcost=@var{number} +@opindex mgettrcost=@var{number} +Set the cost assumed for the gettr instruction to @var{number}. +The default is 2 if @option{-mpt-fixed} is in effect, 100 otherwise. + +@item -mpt-fixed +@opindex mpt-fixed +Assume pt* instructions won't trap. This will generally generate better +scheduled code, but is unsafe on current hardware. The current architecture +definition says that ptabs and ptrel trap when the target anded with 3 is 3. +This has the unintentional effect of making it unsafe to schedule ptabs / +ptrel before a branch, or hoist it out of a loop. For example, +__do_global_ctors, a part of libgcc that runs constructors at program +startup, calls functions in a list which is delimited by -1. With the +-mpt-fixed option, the ptabs will be done before testing against -1. +That means that all the constructors will be run a bit quicker, but when +the loop comes to the end of the list, the pprogram crashes because ptabs +loads -1 into a target register. 
Since this option is unsafe for any +hardware implementing the current architecture specification, the default +is -mno-pt-fixed. Unless the user specifies a specific cost with +@option{-mgettrcost}, -mno-pt-fixed also implies @option{-mgettrcost=100}; +this deters register allocation using target registers for storing +ordinary integers. + +@item -minvalid-symbols +@opindex minvalid-symbols +Assume symbols might be invalid. Ordinary function symbols generated by +the compiler will always be valid to load with movi/shori/ptabs or +movi/shori/ptrel, but with assembler and/or linker tricks it is possible +to generate symbols that will cause ptabs / ptrel to trap. +This option is only meaningful when @option{-mno-pt-fixed} is in effect. +It will then prevent cross-basic-block cse, hoisting and most scheduling +of symbol loads. The default is @option{-mno-invalid-symbols}. @end table @node SPARC Options diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 5f9a7da..67d948a 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,9 @@ +2005-05-06 J"orn Rennecke + + * gcc.dg/pr15784-3.c: Add -fno-finite-math-only option. + + * gcc.dg/20021029-1.c: For sh64*-*-*, add -mpt-fixed. + 2005-05-09 Nathan Sidwell PR c++/21427 diff --git a/gcc/testsuite/gcc.dg/20021029-1.c b/gcc/testsuite/gcc.dg/20021029-1.c index 468f9c0..d8dd8c0 100644 --- a/gcc/testsuite/gcc.dg/20021029-1.c +++ b/gcc/testsuite/gcc.dg/20021029-1.c @@ -2,6 +2,7 @@ variables into writable sections. */ /* { dg-do compile } */ /* { dg-options "-O2 -fpic" } */ +/* { dg-options "-O2 -fpic -mpt-fixed" { target sh64*-*-* } } */ /* { dg-final { scan-assembler-not ".data.rel.ro.local" } } */ int foo (int a) diff --git a/gcc/testsuite/gcc.dg/pr15784-3.c b/gcc/testsuite/gcc.dg/pr15784-3.c index b233eff..f8a3e5a 100644 --- a/gcc/testsuite/gcc.dg/pr15784-3.c +++ b/gcc/testsuite/gcc.dg/pr15784-3.c @@ -1,5 +1,6 @@ /* { dg-do compile } */ -/* { dg-options "-fdump-tree-generic" } */ +/* SH4 without -mieee defaults to -ffinite-math-only. */ +/* { dg-options "-fdump-tree-generic -fno-finite-math-only" } */ /* Test for folding abs(x) where appropriate. */ #define abs(x) x > 0 ? x : -x extern double fabs (double); -- 2.7.4
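
As background for the -mieee / -ffinite-math-only interaction documented above (and for the -fno-finite-math-only addition to gcc.dg/pr15784-3.c), here is a minimal, host-independent C sketch of the kind of comparison whose folding changes under -ffinite-math-only. It is an illustration only, not part of the patch, and the function name is invented for the example:

#include <math.h>
#include <stdio.h>

/* Under -ffinite-math-only the compiler may assume no NaNs occur, so this
   self-comparison can legitimately be folded to 0 at compile time; with
   -fno-finite-math-only (or SH -mieee, as documented above) it must be
   evaluated and returns 1 for a NaN argument.  */
static int
is_nan (double x)
{
  return x != x;
}

int
main (void)
{
  printf ("is_nan (NAN) = %d\n", is_nan (NAN));
  return 0;
}

Compiling this once with -ffinite-math-only and once with -fno-finite-math-only should make it clear why tests that rely on IEEE NaN semantics need the explicit option on SH targets whose default is -ffinite-math-only.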