dnl PowerPC-32 mpn_addmul_1 -- Multiply a limb vector with a limb and add the
dnl result to a second limb vector.
dnl Copyright 1995, 1997, 1998, 2000, 2001, 2002, 2003, 2005 Free Software
dnl This file is part of the GNU MP Library.
dnl The GNU MP Library is free software; you can redistribute it and/or modify
dnl it under the terms of the GNU Lesser General Public License as published
dnl by the Free Software Foundation; either version 3 of the License, or (at
dnl your option) any later version.
dnl The GNU MP Library is distributed in the hope that it will be useful, but
dnl WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
dnl or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
dnl License for more details.
dnl You should have received a copy of the GNU Lesser General Public License
dnl along with the GNU MP Library. If not, see http://www.gnu.org/licenses/.
include(`../config.m4')
C 7400,7410 (G4): 8.7-14.3
C 744x,745x (G4+): 9.5
C This is optimized for the PPC604. It has not been tuned for other
C Loop Analysis for the 604:
C 9 int ops (8 of which serialize)
C The multiply insns need 16 cycles/4limb.
C The integer register writes will need 13 cycles/4limb.
C All-in-all, it should be possible to get to 4 or 5 cycles/limb on PPC604,
C but that will require some clever FPNOPS and BNOPS for exact
PROLOGUE(mpn_addmul_1)
	cmpwi	cr0,r5,9	C more than 9 limbs?
	bgt	cr0,L(big)	C branch if more than 9 limbs
L(big):	stmw	r30,-32(r1)
	adde	r8,r8,r0	C add cy_limb
	addze	r0,r0		C new cy_limb
	adde	r8,r8,r0	C add cy_limb
	addze	r0,r0		C new cy_limb
EPILOGUE(mpn_addmul_1)
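
C For readers checking the carry handling above, here is a minimal portable C
C sketch of the operation described in the header: multiply a limb vector by
C one limb, add the result to a second limb vector, and return the carry-out
C limb.  It is a reference model only, not the code GMP builds.  The names
C limb_t and ref_addmul_1 are hypothetical, and 32-bit limbs are assumed to
C match PowerPC-32.  The variable cy plays the role of cy_limb in the
C comments above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-bit limbs, as on PowerPC-32. */
typedef uint32_t limb_t;

/* Hypothetical reference model (not GMP's implementation):
   rp[0..n-1] += up[0..n-1] * v, returning the final carry limb. */
static limb_t ref_addmul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t cy = 0;                      /* carry limb passed between steps */

    assert(n == 0 || (rp != NULL && up != NULL));

    for (size_t i = 0; i < n; i++) {
        /* 32x32 -> 64 bit product, plus the old destination limb and the
           incoming carry.  The sum cannot overflow 64 bits:
           (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1. */
        uint64_t t = (uint64_t)up[i] * v + rp[i] + cy;

        rp[i] = (limb_t)t;              /* low half goes back to memory */
        cy = (limb_t)(t >> 32);         /* high half is the new carry limb */
    }
    return cy;
}
```

C A model like this can be used to cross-check the assembly on random inputs
C by comparing both the updated destination vector and the returned carry.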