   1 @node Arithmetic, Date and Time, Mathematics, Top
   2 @c %MENU% Low level arithmetic functions
   3 @chapter Arithmetic Functions
   4
   5 This chapter contains information about functions for doing basic
   6 arithmetic operations, such as splitting a float into its integer and
   7 fractional parts or retrieving the imaginary part of a complex value.
   8 These functions are declared in the header files @file{math.h} and
   9 @file{complex.h}.
  10
  11 @menu
  12 * Floating Point Numbers::      Basic concepts.  IEEE 754.
  13 * Floating Point Classes::      The five kinds of floating-point number.
  14 * Floating Point Errors::       When something goes wrong in a calculation.
  15 * Rounding::                    Controlling how results are rounded.
  16 * Control Functions::           Saving and restoring the FPU's state.
  17 * Arithmetic Functions::        Fundamental operations provided by the library.
  18 * Complex Numbers::             The types.  Writing complex constants.
  19 * Operations on Complex::       Projection, conjugation, decomposition.
  20 * Integer Division::            Integer division with guaranteed rounding.
  21 * Parsing of Numbers::          Converting strings to numbers.
  22 * System V Number Conversion::  An archaic way to convert numbers to strings.
  23 @end menu
  24
  25 @node Floating Point Numbers
  26 @section Floating Point Numbers
  27 @cindex floating point
  28 @cindex IEEE 754
  29 @cindex IEEE floating point
  30
  31 Most computer hardware has support for two different kinds of numbers:
  32 integers (@math{@dots{}-3, -2, -1, 0, 1, 2, 3@dots{}}) and
  33 floating-point numbers.  Floating-point numbers have three parts: the
  34 @dfn{mantissa}, the @dfn{exponent}, and the @dfn{sign bit}.  The real
  35 number represented by a floating-point value is given by
  36 @tex
  37 $(s \mathrel? -1 \mathrel: 1) \cdot 2^e \cdot M$
  38 @end tex
  39 @ifnottex
  40 @math{(s ? -1 : 1) @mul{} 2^e @mul{} M}
  41 @end ifnottex
  42 where @math{s} is the sign bit, @math{e} the exponent, and @math{M}
  43 the mantissa.  @xref{Floating Point Concepts}, for details.  (It is
  44 possible to have a different @dfn{base} for the exponent, but all modern
  45 hardware uses @math{2}.)
  46
  47 Floating-point numbers can represent a finite subset of the real
  48 numbers.  While this subset is large enough for most purposes, it is
  49 important to remember that the only reals that can be represented
  50 exactly are rational numbers that have a terminating binary expansion
  51 shorter than the width of the mantissa.  Even simple fractions such as
  52 @math{1/5} can only be approximated by floating point.
  53
  54 Mathematical operations and functions frequently need to produce values
  55 that are not representable.  Often these values can be approximated
  56 closely enough for practical purposes, but sometimes they can't.
  57 Historically there was no way to tell when the results of a calculation
  58 were inaccurate.  Modern computers implement the @w{IEEE 754} standard
  59 for numerical computations, which defines a framework for indicating to
  60 the program when the results of calculation are not trustworthy.  This
  61 framework consists of a set of @dfn{exceptions} that indicate why a
  62 result could not be represented, and the special values @dfn{infinity}
  63 and @dfn{not a number} (NaN).
  64
  65 @node Floating Point Classes
  66 @section Floating-Point Number Classification Functions
  67 @cindex floating-point classes
  68 @cindex classes, floating-point
  69 @pindex math.h
  70
  71 @w{ISO C 9x} defines macros that let you determine what sort of
  72 floating-point number a variable holds.
  73
  74 @comment math.h
  75 @comment ISO
  76 @deftypefn {Macro} int fpclassify (@emph{float-type} @var{x})
  77 This is a generic macro which works on all floating-point types and
  78 which returns a value of type @code{int}.  The possible values are:
  79
  80 @vtable @code
  81 @item FP_NAN
  82 The floating-point number @var{x} is ``Not a Number'' (@pxref{Infinity
  83 and NaN})
  84 @item FP_INFINITE
  85 The value of @var{x} is either plus or minus infinity (@pxref{Infinity
  86 and NaN})
  87 @item FP_ZERO
  88 The value of @var{x} is zero.  In floating-point formats like @w{IEEE
  89 754}, where zero can be signed, this value is also returned if
  90 @var{x} is negative zero.
  91 @item FP_SUBNORMAL
  92 Numbers whose absolute value is too small to be represented in the
  93 normal format are represented in an alternate, @dfn{denormalized} format
  94 (@pxref{Floating Point Concepts}).  This format is less precise but can
  95 represent values closer to zero.  @code{fpclassify} returns this value
  96 for values of @var{x} in this alternate format.
  97 @item FP_NORMAL
  98 This value is returned for all other values of @var{x}.  It indicates
  99 that there is nothing special about the number.
 100 @end vtable
 101
 102 @end deftypefn
 103
 104 @code{fpclassify} is most useful if more than one property of a number
 105 must be tested.  There are more specific macros which only test one
 106 property at a time.  Generally these macros execute faster than
 107 @code{fpclassify}, since there is special hardware support for them.
 108 You should therefore use the specific macros whenever possible.
 109
 110 @comment math.h
 111 @comment ISO
 112 @deftypefn {Macro} int isfinite (@emph{float-type} @var{x})
 113 This macro returns a nonzero value if @var{x} is finite: not plus or
 114 minus infinity, and not NaN.  It is equivalent to
 115
 116 @smallexample
 117 (fpclassify (x) != FP_NAN && fpclassify (x) != FP_INFINITE)
 118 @end smallexample
 119
 120 @code{isfinite} is implemented as a macro which accepts any
 121 floating-point type.
 122 @end deftypefn
 123
 124 @comment math.h
 125 @comment ISO
 126 @deftypefn {Macro} int isnormal (@emph{float-type} @var{x})
 127 This macro returns a nonzero value if @var{x} is finite and normalized.
 128 It is equivalent to
 129
 130 @smallexample
 131 (fpclassify (x) == FP_NORMAL)
 132 @end smallexample
 133 @end deftypefn
 134
 135 @comment math.h
 136 @comment ISO
 137 @deftypefn {Macro} int isnan (@emph{float-type} @var{x})
 138 This macro returns a nonzero value if @var{x} is NaN.  It is equivalent
 139 to
 140
 141 @smallexample
 142 (fpclassify (x) == FP_NAN)
 143 @end smallexample
 144 @end deftypefn
 145
 146 Another set of floating-point classification functions was provided by
 147 BSD.  The GNU C library also supports these functions; however, we
 148 recommend that you use the C9x macros in new code.  Those are standard
 149 and will be available more widely.  Also, since they are macros, you do
 150 not have to worry about the type of their argument.
 151
 152 @comment math.h
 153 @comment BSD
 154 @deftypefun int isinf (double @var{x})
 155 @comment math.h
 156 @comment BSD
 157 @deftypefunx int isinff (float @var{x})
 158 @comment math.h
 159 @comment BSD
 160 @deftypefunx int isinfl (long double @var{x})
 161 This function returns @code{-1} if @var{x} represents negative infinity,
 162 @code{1} if @var{x} represents positive infinity, and @code{0} otherwise.
 163 @end deftypefun
 164
 165 @comment math.h
 166 @comment BSD
 167 @deftypefun int isnan (double @var{x})
 168 @comment math.h
 169 @comment BSD
 170 @deftypefunx int isnanf (float @var{x})
 171 @comment math.h
 172 @comment BSD
 173 @deftypefunx int isnanl (long double @var{x})
 174 This function returns a nonzero value if @var{x} is a ``not a number''
 175 value, and zero otherwise.
 176
 177 @strong{Note:} The @code{isnan} macro defined by @w{ISO C 9x} overrides
 178 the BSD function.  This is normally not a problem, because the two
 179 routines behave identically.  However, if you really need to get the BSD
 180 function for some reason, you can write
 181
 182 @smallexample
 183 (isnan) (x)
 184 @end smallexample
 185 @end deftypefun
 186
 187 @comment math.h
 188 @comment BSD
 189 @deftypefun int finite (double @var{x})
 190 @comment math.h
 191 @comment BSD
 192 @deftypefunx int finitef (float @var{x})
 193 @comment math.h
 194 @comment BSD
 195 @deftypefunx int finitel (long double @var{x})
 196 This function returns a nonzero value if @var{x} is neither infinite nor
 197 a ``not a number'' value, and zero otherwise.
 198 @end deftypefun
 199
 200 @comment math.h
 201 @comment BSD
 202 @deftypefun double infnan (int @var{error})
 203 This function is provided for compatibility with BSD.  Its argument is
 204 an error code, @code{EDOM} or @code{ERANGE}; @code{infnan} returns the
 205 value that a math function would return if it set @code{errno} to that
 206 value.  @xref{Math Error Reporting}.  @code{-ERANGE} is also acceptable
 207 as an argument, and corresponds to @code{-HUGE_VAL} as a value.
 208
 209 In the BSD library, on certain machines, @code{infnan} raises a fatal
 210 signal in all cases.  The GNU library does not do likewise, because that
 211 does not fit the @w{ISO C} specification.
 212 @end deftypefun
 213
 214 @strong{Portability Note:} The functions listed in this section are BSD
 215 extensions.
 216
 217
 218 @node Floating Point Errors
 219 @section Errors in Floating-Point Calculations
 220
 221 @menu
 222 * FP Exceptions::               IEEE 754 math exceptions and how to detect them.
 223 * Infinity and NaN::            Special values returned by calculations.
 224 * Status bit operations::       Checking for exceptions after the fact.
 225 * Math Error Reporting::        How the math functions report errors.
 226 @end menu
 227
 228 @node FP Exceptions
 229 @subsection FP Exceptions
 230 @cindex exception
 231 @cindex signal
 232 @cindex zero divide
 233 @cindex division by zero
 234 @cindex inexact exception
 235 @cindex invalid exception
 236 @cindex overflow exception
 237 @cindex underflow exception
 238
 239 The @w{IEEE 754} standard defines five @dfn{exceptions} that can occur
 240 during a calculation.  Each corresponds to a particular sort of error,
 241 such as overflow.
 242
 243 When exceptions occur (when exceptions are @dfn{raised}, in the language
 244 of the standard), one of two things can happen.  By default the
 245 exception is simply noted in the floating-point @dfn{status word}, and
 246 the program continues as if nothing had happened.  The operation
 247 produces a default value, which depends on the exception (see the table
 248 below).  Your program can check the status word to find out which
 249 exceptions happened.
 250
 251 Alternatively, you can enable @dfn{traps} for exceptions.  In that case,
 252 when an exception is raised, your program will receive the @code{SIGFPE}
 253 signal.  The default action for this signal is to terminate the
 254 program.  @xref{Signal Handling}, for how you can change the effect of
 255 the signal.
 256
 257 @findex matherr
 258 In the System V math library, the user-defined function @code{matherr}
 259 is called when certain exceptions occur inside math library functions.
 260 However, the Unix98 standard deprecates this interface.  We support it
 261 for historical compatibility, but recommend that you do not use it in
 262 new programs.
 263
 264 @noindent
 265 The exceptions defined in @w{IEEE 754} are:
 266
 267 @table @samp
 268 @item Invalid Operation
 269 This exception is raised if the given operands are invalid for the
 270 operation to be performed.  Examples are
 271 (see @w{IEEE 754}, @w{section 7}):
 272 @enumerate
 273 @item
 274 Addition or subtraction: @math{@infinity{} - @infinity{}}.  (But
 275 @math{@infinity{} + @infinity{} = @infinity{}}).
 276 @item
 277 Multiplication: @math{0 @mul{} @infinity{}}.
 278 @item
 279 Division: @math{0/0} or @math{@infinity{}/@infinity{}}.
 280 @item
 281 Remainder: @math{x} REM @math{y}, where @math{y} is zero or @math{x} is
 282 infinite.
 283 @item
 284 Square root if the operand is less than zero.  More generally, any
 285 mathematical function evaluated outside its domain produces this
 286 exception.
 287 @item
 288 Conversion of a floating-point number to an integer or decimal
 289 string, when the number cannot be represented in the target format (due
 290 to overflow, infinity, or NaN).
 291 @item
 292 Conversion of an unrecognizable input string.
 293 @item
 294 Comparison via predicates involving @math{<} or @math{>}, when one or
 295 other of the operands is NaN.  You can prevent this exception by using
 296 the unordered comparison functions instead; see @ref{FP Comparison Functions}.
 297 @end enumerate
 298
 299 If the exception does not trap, the result of the operation is NaN.
 300
 301 @item Division by Zero
 302 This exception is raised when a finite nonzero number is divided
 303 by zero.  If no trap occurs the result is either @math{+@infinity{}} or
 304 @math{-@infinity{}}, depending on the signs of the operands.
 305
 306 @item Overflow
 307 This exception is raised whenever the result cannot be represented
 308 as a finite value in the precision format of the destination.  If no trap
 309 occurs the result depends on the sign of the intermediate result and the
 310 current rounding mode (@w{IEEE 754}, @w{section 7.3}):
 311 @enumerate
 312 @item
 313 Round to nearest carries all overflows to @math{@infinity{}}
 314 with the sign of the intermediate result.
 315 @item
 316 Round toward @math{0} carries all overflows to the largest representable
 317 finite number with the sign of the intermediate result.
 318 @item
 319 Round toward @math{-@infinity{}} carries positive overflows to the
 320 largest representable finite number and negative overflows to
 321 @math{-@infinity{}}.
 322
 323 @item
 324 Round toward @math{@infinity{}} carries negative overflows to the
 325 most negative representable finite number and positive overflows
 326 to @math{@infinity{}}.
 327 @end enumerate
 328
 329 Whenever the overflow exception is raised, the inexact exception is also
 330 raised.
 331
 332 @item Underflow
 333 The underflow exception is raised when an intermediate result is too
 334 small to be calculated accurately, or if the operation's result rounded
 335 to the destination precision is too small to be normalized.
 336
 337 When no trap is installed for the underflow exception, underflow is
 338 signaled (via the underflow flag) only when both tininess and loss of
 339 accuracy have been detected.  The operation then continues with an
 340 imprecise small value, or zero if the destination precision cannot hold
 341 the small exact result.
 342
 343 @item Inexact
 344 This exception is signalled if a rounded result is not exact (such as
 345 when calculating the square root of two) or a result overflows without
 346 an overflow trap.
 347 @end table
 348
 349 @node Infinity and NaN
 350 @subsection Infinity and NaN
 351 @cindex infinity
 352 @cindex not a number
 353 @cindex NaN
 354
 355 @w{IEEE 754} floating point numbers can represent positive or negative
 356 infinity, and @dfn{NaN} (not a number).  These three values arise from
 357 calculations whose result is undefined or cannot be represented
 358 accurately.  You can also deliberately set a floating-point variable to
 359 any of them, which is sometimes useful.  Some examples of calculations
 360 that produce infinity or NaN:
 361
 362 @ifnottex
 363 @smallexample
 364 @math{1/0 = @infinity{}}
 365 @math{log (0) = -@infinity{}}
 366 @math{sqrt (-1) = NaN}
 367 @end smallexample
 368 @end ifnottex
 369 @tex
 370 $${1\over0} = \infty$$
 371 $$\log 0 = -\infty$$
 372 $$\sqrt{-1} = \hbox{NaN}$$
 373 @end tex
 374
 375 When a calculation produces any of these values, an exception also
 376 occurs; see @ref{FP Exceptions}.
 377
 378 The basic operations and math functions all accept infinity and NaN and
 379 produce sensible output.  Infinities propagate through calculations as
 380 one would expect: for example, @math{2 + @infinity{} = @infinity{}},
 381 @math{4/@infinity{} = 0}, atan @math{(@infinity{}) = @pi{}/2}.  NaN, on
 382 the other hand, infects any calculation that involves it.  Unless the
 383 calculation would produce the same result no matter what real value
 384 replaced NaN, the result is NaN.
 385
 386 In comparison operations, positive infinity is larger than all values
 387 except itself and NaN, and negative infinity is smaller than all values
 388 except itself and NaN.  NaN is @dfn{unordered}: it is not equal to,
 389 greater than, or less than anything, @emph{including itself}. @code{x ==
 390 x} is false if the value of @code{x} is NaN.  You can use this to test
 391 whether a value is NaN or not, but the recommended way to test for NaN
 392 is with the @code{isnan} function (@pxref{Floating Point Classes}).  In
 393 addition, @code{<}, @code{>}, @code{<=}, and @code{>=} will raise an
 394 exception when applied to NaNs.
 395
 396 @file{math.h} defines macros that allow you to explicitly set a variable
 397 to infinity or NaN.
 398
 399 @comment math.h
 400 @comment ISO
 401 @deftypevr Macro float INFINITY
 402 An expression representing positive infinity.  It is equal to the value
 403 produced  by mathematical operations like @code{1.0 / 0.0}.
 404 @code{-INFINITY} represents negative infinity.
 405
 406 You can test whether a floating-point value is infinite by comparing it
 407 to this macro.  However, this is not recommended; you should use the
 408 @code{isfinite} macro instead.  @xref{Floating Point Classes}.
 409
 410 This macro was introduced in the @w{ISO C 9X} standard.
 411 @end deftypevr
 412
 413 @comment math.h
 414 @comment GNU
 415 @deftypevr Macro float NAN
 416 An expression representing a value which is ``not a number''.  This
 417 macro is a GNU extension, available only on machines that support the
 418 ``not a number'' value---that is to say, on all machines that support
 419 IEEE floating point.
 420
 421 You can use @samp{#ifdef NAN} to test whether the machine supports
 422 NaN.  (Of course, you must arrange for GNU extensions to be visible,
 423 such as by defining @code{_GNU_SOURCE}, and then you must include
 424 @file{math.h}.)
 425 @end deftypevr
 426
 427 @w{IEEE 754} also allows for another unusual value: negative zero.  This
 428 value is produced when you divide a positive number by negative
 429 infinity, or when a negative result is smaller than the limits of
 430 representation.  Negative zero behaves identically to zero in all
 431 calculations, unless you explicitly test the sign bit with
 432 @code{signbit} or @code{copysign}.
 433
 434 @node Status bit operations
 435 @subsection Examining the FPU status word
 436
 437 @w{ISO C 9x} defines functions to query and manipulate the
 438 floating-point status word.  You can use these functions to check for
 439 untrapped exceptions when it's convenient, rather than worrying about
 440 them in the middle of a calculation.
 441
 442 These constants represent the various @w{IEEE 754} exceptions.  Not all
 443 FPUs report all the different exceptions.  Each constant is defined if
 444 and only if the FPU you are compiling for supports that exception, so
 445 you can test for FPU support with @samp{#ifdef}.  They are defined in
 446 @file{fenv.h}.
 447
 448 @vtable @code
 449 @comment fenv.h
 450 @comment ISO
 451 @item FE_INEXACT
 452  The inexact exception.
 453 @comment fenv.h
 454 @comment ISO
 455 @item FE_DIVBYZERO
 456  The divide by zero exception.
 457 @comment fenv.h
 458 @comment ISO
 459 @item FE_UNDERFLOW
 460  The underflow exception.
 461 @comment fenv.h
 462 @comment ISO
 463 @item FE_OVERFLOW
 464  The overflow exception.
 465 @comment fenv.h
 466 @comment ISO
 467 @item FE_INVALID
 468  The invalid exception.
 469 @end vtable
 470
 471 The macro @code{FE_ALL_EXCEPT} is the bitwise OR of all exception macros
 472 which are supported by the FP implementation.
 473
 474 These functions allow you to clear exception flags, test for exceptions,
 475 and save and restore the set of exceptions flagged.
 476
 477 @comment fenv.h
 478 @comment ISO
 479 @deftypefun void feclearexcept (int @var{excepts})
 480 This function clears all of the supported exception flags indicated by
 481 @var{excepts}.
 482 @end deftypefun
 483
 484 @comment fenv.h
 485 @comment ISO
 486 @deftypefun int fetestexcept (int @var{excepts})
 487 Test whether the exception flags indicated by the parameter @var{excepts}
 488 are currently set.  If any of them are, a nonzero value is returned
 489 which specifies which exceptions are set.  Otherwise the result is zero.
 490 @end deftypefun
 491
 492 To understand these functions, imagine that the status word is an
 493 integer variable named @var{status}.  @code{feclearexcept} is then
 494 equivalent to @samp{status &= ~excepts} and @code{fetestexcept} is
 495 equivalent to @samp{(status & excepts)}.  The actual implementation may
 496 be very different, of course.
 497
 498 Exception flags are only cleared when the program explicitly requests it,
 499 by calling @code{feclearexcept}.  If you want to check for exceptions
 500 from a set of calculations, you should clear all the flags first.  Here
 501 is a simple example of the way to use @code{fetestexcept}:
 502
 503 @smallexample
 504 @{
 505   double f;
 506   int raised;
 507   feclearexcept (FE_ALL_EXCEPT);
 508   f = compute ();
 509   raised = fetestexcept (FE_OVERFLOW | FE_INVALID);
 510   if (raised & FE_OVERFLOW) @{ /* ... */ @}
 511   if (raised & FE_INVALID) @{ /* ... */ @}
 512   /* ... */
 513 @}
 514 @end smallexample
 515
 516 You cannot explicitly set bits in the status word.  You can, however,
 517 save the entire status word and restore it later.  This is done with the
 518 following functions:
 519
 520 @comment fenv.h
 521 @comment ISO
 522 @deftypefun void fegetexceptflag (fexcept_t *@var{flagp}, int @var{excepts})
 523 This function stores in the variable pointed to by @var{flagp} an
 524 implementation-defined value representing the current setting of the
 525 exception flags indicated by @var{excepts}.
 526 @end deftypefun
 527
 528 @comment fenv.h
 529 @comment ISO
 530 @deftypefun void fesetexceptflag (const fexcept_t *@var{flagp}, int
 531 @var{excepts})
 532 This function restores the flags for the exceptions indicated by
 533 @var{excepts} to the values stored in the variable pointed to by
 534 @var{flagp}.
 535 @end deftypefun
 536
 537 Note that the value stored in @code{fexcept_t} bears no resemblance to
 538 the bit mask returned by @code{fetestexcept}.  The type may not even be
 539 an integer.  Do not attempt to modify an @code{fexcept_t} variable.
 540
 541 @node Math Error Reporting
 542 @subsection Error Reporting by Mathematical Functions
 543 @cindex errors, mathematical
 544 @cindex domain error
 545 @cindex range error
 546
 547 Many of the math functions are defined only over a subset of the real or
 548 complex numbers.  Even if they are mathematically defined, their result
 549 may be larger or smaller than the range representable by their return
 550 type.  These are known as @dfn{domain errors}, @dfn{overflows}, and
 551 @dfn{underflows}, respectively.  Math functions do several things when
 552 one of these errors occurs.  In this manual we will refer to the
 553 complete response as @dfn{signalling} a domain error, overflow, or
 554 underflow.
 555
 556 When a math function suffers a domain error, it raises the invalid
 557 exception and returns NaN.  It also sets @code{errno} to @code{EDOM};
 558 this is for compatibility with old systems that do not support @w{IEEE
 559 754} exception handling.  Likewise, when overflow occurs, math
 560 functions raise the overflow exception and return @math{@infinity{}} or
 561 @math{-@infinity{}} as appropriate.  They also set @code{errno} to
 562 @code{ERANGE}.  When underflow occurs, the underflow exception is
 563 raised, and zero (appropriately signed) is returned.  @code{errno} may be
 564 set to @code{ERANGE}, but this is not guaranteed.
 565
 566 Some of the math functions are defined mathematically to result in a
 567 complex value over parts of their domains.  The most familiar example of
 568 this is taking the square root of a negative number.  The complex math
 569 functions, such as @code{csqrt}, will return the appropriate complex value
 570 in this case.  The real-valued functions, such as @code{sqrt}, will
 571 signal a domain error.
 572
 573 Some older hardware does not support infinities.  On that hardware,
 574 overflows instead return a particular very large number (usually the
 575 largest representable number).  @file{math.h} defines macros you can use
 576 to test for overflow on both old and new hardware.
 577
 578 @comment math.h
 579 @comment ISO
 580 @deftypevr Macro double HUGE_VAL
 581 @comment math.h
 582 @comment ISO
 583 @deftypevrx Macro float HUGE_VALF
 584 @comment math.h
 585 @comment ISO
 586 @deftypevrx Macro {long double} HUGE_VALL
 587 An expression representing a particular very large number.  On machines
 588 that use @w{IEEE 754} floating point format, @code{HUGE_VAL} is infinity.
 589 On other machines, it's typically the largest positive number that can
 590 be represented.
 591
 592 Mathematical functions return the appropriately typed version of
 593 @code{HUGE_VAL} or @code{@minus{}HUGE_VAL} when the result is too large
 594 to be represented.
 595 @end deftypevr
 596
 597 @node Rounding
 598 @section Rounding Modes
 599
 600 Floating-point calculations are carried out internally with extra
 601 precision, and then rounded to fit into the destination type.  This
 602 ensures that results are as precise as the input data.  @w{IEEE 754}
 603 defines four possible rounding modes:
 604
 605 @table @asis
 606 @item Round to nearest.
 607 This is the default mode.  It should be used unless there is a specific
 608 need for one of the others.  In this mode results are rounded to the
 609 nearest representable value.  If the result is midway between two
 610 representable values, the even one is chosen.  @dfn{Even} here
 611 means the lowest-order bit is zero.  This rounding mode prevents
 612 statistical bias and guarantees numeric stability: round-off errors in a
 613 lengthy calculation will remain smaller than half of @code{FLT_EPSILON}.
 614
 615 @c @item Round toward @math{+@infinity{}}
 616 @item Round toward plus Infinity.
 617 All results are rounded to the smallest representable value
 618 which is greater than the result.
 619
 620 @c @item Round toward @math{-@infinity{}}
 621 @item Round toward minus Infinity.
 622 All results are rounded to the largest representable value which is less
 623 than the result.
 624
 625 @item Round toward zero.
 626 All results are rounded to the largest representable value whose
 627 magnitude is less than that of the result.  In other words, if the
 628 result is negative it is rounded up; if it is positive, it is rounded
 629 down.
 630 @end table
 631
 632 @noindent
 633 @file{fenv.h} defines constants which you can use to refer to the
 634 various rounding modes.  Each one will be defined if and only if the FPU
 635 supports the corresponding rounding mode.
 636
 637 @table @code
 638 @comment fenv.h
 639 @comment ISO
 640 @vindex FE_TONEAREST
 641 @item FE_TONEAREST
 642 Round to nearest.
 643
 644 @comment fenv.h
 645 @comment ISO
 646 @vindex FE_UPWARD
 647 @item FE_UPWARD
 648 Round toward @math{+@infinity{}}.
 649
 650 @comment fenv.h
 651 @comment ISO
 652 @vindex FE_DOWNWARD
 653 @item FE_DOWNWARD
 654 Round toward @math{-@infinity{}}.
 655
 656 @comment fenv.h
 657 @comment ISO
 658 @vindex FE_TOWARDZERO
 659 @item FE_TOWARDZERO
 660 Round toward zero.
 661 @end table
 662
 663 Underflow is an unusual case.  Normally, @w{IEEE 754} floating point
 664 numbers are always normalized (@pxref{Floating Point Concepts}).
 665 Numbers smaller than @math{2^r} (where @math{r} is the minimum exponent,
 666 @code{FLT_MIN_EXP-1} for @code{float}) cannot be represented as
 667 normalized numbers.  Rounding all such numbers to zero or @math{2^r}
 668 would cause some algorithms to fail at 0.  Therefore, they are left in
 669 denormalized form.  That produces loss of precision, since some bits of
 670 the mantissa are stolen to indicate the binary point.
 671
 672 If a result is too small to be represented as a denormalized number, it
 673 is rounded to zero.  However, the sign of the result is preserved; if
 674 the calculation was negative, the result is @dfn{negative zero}.
 675 Negative zero can also result from some operations on infinity, such as
 676 @math{4/-@infinity{}}.  Negative zero behaves identically to zero except
 677 when the @code{copysign} or @code{signbit} functions are used to check
 678 the sign bit directly.
 679
 680 At any time one of the above four rounding modes is selected.  You can
 681 find out which one with this function:
 682
 683 @comment fenv.h
 684 @comment ISO
 685 @deftypefun int fegetround (void)
 686 Returns the currently selected rounding mode, represented by one of the
 687 values of the defined rounding mode macros.
 688 @end deftypefun
 689
 690 @noindent
 691 To change the rounding mode, use this function:
 692
 693 @comment fenv.h
 694 @comment ISO
 695 @deftypefun int fesetround (int @var{round})
 696 Changes the currently selected rounding mode to @var{round}.  If
 697 @var{round} does not correspond to one of the supported rounding modes
 698 nothing is changed.  @code{fesetround} returns zero if it changed the
 699 rounding mode, a nonzero value if the mode is not supported.
 700 @end deftypefun
 701
 702 You should avoid changing the rounding mode if possible.  It can be an
 703 expensive operation; also, some hardware requires you to compile your
 704 program differently for it to work.  The resulting code may run slower.
 705 See your compiler documentation for details.
 706 @c This section used to claim that functions existed to round one number
 707 @c in a specific fashion.  I can't find any functions in the library
 708 @c that do that. -zw
 709
 710 @node Control Functions
 711 @section Floating-Point Control Functions
 712
 713 @w{IEEE 754} floating-point implementations allow the programmer to
 714 decide whether traps will occur for each of the exceptions, by setting
 715 bits in the @dfn{control word}.  In C, traps result in the program
 716 receiving the @code{SIGFPE} signal; see @ref{Signal Handling}.
 717
 718 @strong{Note:} @w{IEEE 754} says that trap handlers are given details of
 719 the exceptional situation, and can set the result value.  C signals do
 720 not provide any mechanism to pass this information back and forth.
 721 Trapping exceptions in C is therefore not very useful.
 722
 723 It is sometimes necessary to save the state of the floating-point unit
 724 while you perform some calculation.  The library provides functions
 725 which save and restore the exception flags, the set of exceptions that
 726 generate traps, and the rounding mode.  This information is known as the
 727 @dfn{floating-point environment}.
 728
 729 The functions to save and restore the floating-point environment all use
 730 a variable of type @code{fenv_t} to store information.  This type is
 731 defined in @file{fenv.h}.  Its size and contents are
 732 implementation-defined.  You should not attempt to manipulate a variable
 733 of this type directly.
 734
 735 To save the state of the FPU, use one of these functions:
 736
 737 @comment fenv.h
 738 @comment ISO
 739 @deftypefun void fegetenv (fenv_t *@var{envp})
 740 Store the floating-point environment in the variable pointed to by
 741 @var{envp}.
 742 @end deftypefun
 743
 744 @comment fenv.h
 745 @comment ISO
 746 @deftypefun int feholdexcept (fenv_t *@var{envp})
 747 Store the current floating-point environment in the object pointed to by
 748 @var{envp}.  Then clear all exception flags, and set the FPU to trap no
 749 exceptions.  Not all FPUs support trapping no exceptions; if
 750 @code{feholdexcept} cannot set this mode, it returns a nonzero value.  If it
 751 succeeds, it returns zero.
 752 @end deftypefun
 753
 754 The functions which restore the floating-point environment can take
 755 three kinds of arguments:
 756
 757 @itemize @bullet
 758 @item
 759 Pointers to @code{fenv_t} objects, which were initialized previously by a
 760 call to @code{fegetenv} or @code{feholdexcept}.
 761 @item
 762 @vindex FE_DFL_ENV
 763 The special macro @code{FE_DFL_ENV} which represents the floating-point
 764 environment as it was available at program start.
 765 @item
 766 Implementation defined macros with names starting with @code{FE_}.
 767
 768 @vindex FE_NOMASK_ENV
 769 If possible, the GNU C Library defines a macro @code{FE_NOMASK_ENV}
 770 which represents an environment where every exception raised causes a
 771 trap to occur.  You can test for this macro using @code{#ifdef}.  It is
 772 only defined if @code{_GNU_SOURCE} is defined.
 773
 774 Some platforms might define other predefined environments.
 775 @end itemize
 776
 777 @noindent
 778 To set the floating-point environment, you can use either of these
 779 functions:
 780
 781 @comment fenv.h
 782 @comment ISO
 783 @deftypefun void fesetenv (const fenv_t *@var{envp})
 784 Set the floating-point environment to that described by @var{envp}.
 785 @end deftypefun
 786
 787 @comment fenv.h
 788 @comment ISO
 789 @deftypefun void feupdateenv (const fenv_t *@var{envp})
 790 Like @code{fesetenv}, this function sets the floating-point environment
 791 to that described by @var{envp}.  However, if any exceptions were
 792 flagged in the status word before @code{feupdateenv} was called, they
 793 remain flagged after the call.  In other words, after @code{feupdateenv}
 794 is called, the status word is the bitwise OR of the previous status word
 795 and the one saved in @var{envp}.
 796 @end deftypefun
 797
 798 @node Arithmetic Functions
 799 @section Arithmetic Functions
 800
 801 The C library provides functions to do basic operations on
 802 floating-point numbers.  These include absolute value, maximum and minimum,
 803 normalization, bit twiddling, rounding, and a few others.
 804
 805 @menu
 806 * Absolute Value::              Absolute values of integers and floats.
 807 * Normalization Functions::     Extracting exponents and putting them back.
 808 * Rounding Functions::          Rounding floats to integers.
 809 * Remainder Functions::         Remainders on division, precisely defined.
 810 * FP Bit Twiddling::            Sign bit adjustment.  Adding epsilon.
 811 * FP Comparison Functions::     Comparisons without risk of exceptions.
 812 * Misc FP Arithmetic::          Max, min, positive difference, multiply-add.
 813 @end menu
 814
 815 @node Absolute Value
 816 @subsection Absolute Value
 817 @cindex absolute value functions
 818
 819 These functions are provided for obtaining the @dfn{absolute value} (or
 820 @dfn{magnitude}) of a number.  The absolute value of a real number
 821 @var{x} is @var{x} if @var{x} is positive, @minus{}@var{x} if @var{x} is
 822 negative.  For a complex number @var{z}, whose real part is @var{x} and
 823 whose imaginary part is @var{y}, the absolute value is @w{@code{sqrt
 824 (@var{x}*@var{x} + @var{y}*@var{y})}}.
 825
 826 @pindex math.h
 827 @pindex stdlib.h
 828 Prototypes for @code{abs}, @code{labs} and @code{llabs} are in @file{stdlib.h};
 829 @code{imaxabs} is declared in @file{inttypes.h};
 830 @code{fabs}, @code{fabsf} and @code{fabsl} are declared in @file{math.h}.
 831 @code{cabs}, @code{cabsf} and @code{cabsl} are declared in @file{complex.h}.
 832
 833 @comment stdlib.h
 834 @comment ISO
 835 @deftypefun int abs (int @var{number})
 836 @comment stdlib.h
 837 @comment ISO
 838 @deftypefunx {long int} labs (long int @var{number})
 839 @comment stdlib.h
 840 @comment ISO
 841 @deftypefunx {long long int} llabs (long long int @var{number})
 842 @comment inttypes.h
 843 @comment ISO
 844 @deftypefunx intmax_t imaxabs (intmax_t @var{number})
 845 These functions return the absolute value of @var{number}.
 846
 847 Most computers use a two's complement integer representation, in which
 848 the absolute value of @code{INT_MIN} (the smallest possible @code{int})
 849 cannot be represented; thus, @w{@code{abs (INT_MIN)}} is not defined.
 850
 851 @code{llabs} and @code{imaxabs} are new to @w{ISO C 9x}.
 852 @end deftypefun
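Because the absolute value of the most negative integer may not be
representable, a caller that cannot rule out that input can test for it
first.  This is a minimal sketch; @code{checked_abs} is a hypothetical
helper, not a library function.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: store |n| in *result and return true, or
   return false when the absolute value does not fit in an int.  */
bool
checked_abs (int n, int *result)
{
  if (n < -INT_MAX)   /* only INT_MIN on two's complement machines */
    return false;
  *result = abs (n);
  return true;
}
```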
 853
 854 @comment math.h
 855 @comment ISO
 856 @deftypefun double fabs (double @var{number})
 857 @comment math.h
 858 @comment ISO
 859 @deftypefunx float fabsf (float @var{number})
 860 @comment math.h
 861 @comment ISO
 862 @deftypefunx {long double} fabsl (long double @var{number})
 863 This function returns the absolute value of the floating-point number
 864 @var{number}.
 865 @end deftypefun
 866
 867 @comment complex.h
 868 @comment ISO
 869 @deftypefun double cabs (complex double @var{z})
 870 @comment complex.h
 871 @comment ISO
 872 @deftypefunx float cabsf (complex float @var{z})
 873 @comment complex.h
 874 @comment ISO
 875 @deftypefunx {long double} cabsl (complex long double @var{z})
 876 These functions return the absolute value of the complex number @var{z}
 877 (@pxref{Complex Numbers}).  The absolute value of a complex number is:
 878
 879 @smallexample
 880 sqrt (creal (@var{z}) * creal (@var{z}) + cimag (@var{z}) * cimag (@var{z}))
 881 @end smallexample
 882
 883 These functions should always be used instead of the direct formula
 884 because they take special care to avoid losing precision.  They may also
 885 take advantage of hardware support for this operation.  See @code{hypot}
 886 in @ref{Exponents and Logarithms}.
 887 @end deftypefun
 888
 889 @node Normalization Functions
 890 @subsection Normalization Functions
 891 @cindex normalization functions (floating-point)
 892
 893 The functions described in this section are primarily provided as a way
 894 to efficiently perform certain low-level manipulations on floating point
 895 numbers that are represented internally using a binary radix;
 896 see @ref{Floating Point Concepts}.  These functions are required to
 897 have equivalent behavior even if the representation does not use a radix
 898 of 2, but of course they are unlikely to be particularly efficient in
 899 those cases.
 900
 901 @pindex math.h
 902 All these functions are declared in @file{math.h}.
 903
 904 @comment math.h
 905 @comment ISO
 906 @deftypefun double frexp (double @var{value}, int *@var{exponent})
 907 @comment math.h
 908 @comment ISO
 909 @deftypefunx float frexpf (float @var{value}, int *@var{exponent})
 910 @comment math.h
 911 @comment ISO
 912 @deftypefunx {long double} frexpl (long double @var{value}, int *@var{exponent})
 913 These functions are used to split the number @var{value}
 914 into a normalized fraction and an exponent.
 915
 916 If the argument @var{value} is not zero, the return value is @var{value}
 917 times a power of two, and is always in the range 1/2 (inclusive) to 1
 918 (exclusive).  The corresponding exponent is stored in
 919 @code{*@var{exponent}}; the return value multiplied by 2 raised to this
 920 exponent equals the original number @var{value}.
 921
 922 For example, @code{frexp (12.8, &exponent)} returns @code{0.8} and
 923 stores @code{4} in @code{exponent}.
 924
 925 If @var{value} is zero, then the return value is zero and
 926 zero is stored in @code{*@var{exponent}}.
 927 @end deftypefun
 928
 929 @comment math.h
 930 @comment ISO
 931 @deftypefun double ldexp (double @var{value}, int @var{exponent})
 932 @comment math.h
 933 @comment ISO
 934 @deftypefunx float ldexpf (float @var{value}, int @var{exponent})
 935 @comment math.h
 936 @comment ISO
 937 @deftypefunx {long double} ldexpl (long double @var{value}, int @var{exponent})
 938 These functions return the result of multiplying the floating-point
 939 number @var{value} by 2 raised to the power @var{exponent}.  (They can
 940 be used to reassemble floating-point numbers that were taken apart
 941 by @code{frexp}.)
 942
 943 For example, @code{ldexp (0.8, 4)} returns @code{12.8}.
 944 @end deftypefun
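The two functions are natural partners: take a number apart with
@code{frexp}, adjust the exponent, and put it back together with
@code{ldexp}.  A minimal sketch, in which @code{halve_via_exponent} is a
hypothetical helper:

```c
#include <math.h>

/* Hypothetical helper: halve a value by editing its exponent.
   Scaling by a power of two is exact unless the result underflows.  */
double
halve_via_exponent (double value)
{
  int exponent;
  double fraction = frexp (value, &exponent); /* value = fraction * 2^exponent */
  return ldexp (fraction, exponent - 1);
}
```

For example, @code{halve_via_exponent (12.8)} splits the argument into
@code{0.8} and @code{4}, then reassembles @code{0.8} with exponent
@code{3}, giving @code{6.4}.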
 945
 946 The following functions, which come from BSD, provide facilities
 947 equivalent to those of @code{ldexp} and @code{frexp}.
 948
 949 @comment math.h
 950 @comment BSD
 951 @deftypefun double logb (double @var{x})
 952 @comment math.h
 953 @comment BSD
 954 @deftypefunx float logbf (float @var{x})
 955 @comment math.h
 956 @comment BSD
 957 @deftypefunx {long double} logbl (long double @var{x})
 958 These functions return the integer part of the base-2 logarithm of
 959 @var{x}, an integer value represented in type @code{double}.  This is
 960 the highest integer power of @code{2} contained in @var{x}.  The sign of
 961 @var{x} is ignored.  For example, @code{logb (3.5)} is @code{1.0} and
 962 @code{logb (4.0)} is @code{2.0}.
 963
 964 When @code{2} raised to this power is divided into @var{x}, it gives a
 965 quotient between @code{1} (inclusive) and @code{2} (exclusive).
 966
 967 If @var{x} is zero, the return value is minus infinity if the machine
 968 supports infinities, and a very small number if it does not.  If @var{x}
 969 is infinity, the return value is infinity.
 970
 971 For finite nonzero @var{x}, the value returned by @code{logb} is one less than
 972 the value that @code{frexp} would store into @code{*@var{exponent}}.
 973 @end deftypefun
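The relationship between @code{logb} and @code{frexp} can be checked
directly.  In this sketch @code{logb_via_frexp} is a hypothetical helper
that is only meaningful for finite, nonzero arguments.

```c
#include <math.h>

/* Hypothetical helper: the value logb would return for a finite,
   nonzero x, computed from frexp instead.  */
double
logb_via_frexp (double x)
{
  int exponent;
  (void) frexp (x, &exponent);  /* fraction is in [1/2, 1) */
  return exponent - 1;          /* logb normalizes to [1, 2) instead */
}
```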
 974
 975 @comment math.h
 976 @comment BSD
 977 @deftypefun double scalb (double @var{value}, double @var{exponent})
 978 @comment math.h
 979 @comment BSD
 980 @deftypefunx float scalbf (float @var{value}, float @var{exponent})
 981 @comment math.h
 982 @comment BSD
 983 @deftypefunx {long double} scalbl (long double @var{value}, long double @var{exponent})
 984 The @code{scalb} function is the BSD name for @code{ldexp}.
 985 @end deftypefun
 986
 987 @comment math.h
 988 @comment BSD
 989 @deftypefun double scalbn (double @var{x}, int @var{n})
 990 @comment math.h
 991 @comment BSD
 992 @deftypefunx float scalbnf (float @var{x}, int @var{n})
 993 @comment math.h
 994 @comment BSD
 995 @deftypefunx {long double} scalbnl (long double @var{x}, int @var{n})
 996 @code{scalbn} is identical to @code{scalb}, except that the exponent
 997 @var{n} is an @code{int} instead of a floating-point number.
 998 @end deftypefun
 999
1000 @comment math.h
1001 @comment BSD
1002 @deftypefun double scalbln (double @var{x}, long int @var{n})
1003 @comment math.h
1004 @comment BSD
1005 @deftypefunx float scalblnf (float @var{x}, long int @var{n})
1006 @comment math.h
1007 @comment BSD
1008 @deftypefunx {long double} scalblnl (long double @var{x}, long int @var{n})
1009 @code{scalbln} is identical to @code{scalb}, except that the exponent
1010 @var{n} is a @code{long int} instead of a floating-point number.
1011 @end deftypefun
1012
1013 @comment math.h
1014 @comment BSD
1015 @deftypefun double significand (double @var{x})
1016 @comment math.h
1017 @comment BSD
1018 @deftypefunx float significandf (float @var{x})
1019 @comment math.h
1020 @comment BSD
1021 @deftypefunx {long double} significandl (long double @var{x})
1022 @code{significand} returns the mantissa of @var{x} scaled to the range
1023 @math{[1, 2)}.
1024 It is equivalent to @w{@code{scalb (@var{x}, (double) -ilogb (@var{x}))}}.
1025
1026 This function exists mainly for use in certain standardized tests
1027 of @w{IEEE 754} conformance.
1028 @end deftypefun
1029
1030 @node Rounding Functions
1031 @subsection Rounding Functions
1032 @cindex converting floats to integers
1033
1034 @pindex math.h
1035 The functions listed here perform operations such as rounding and
1036 truncation of floating-point values.  Some of these functions convert
1037 floating point numbers to integer values.  They are all declared in
1038 @file{math.h}.
1039
1040 You can also convert floating-point numbers to integers simply by
1041 casting them to @code{int}.  This discards the fractional part,
1042 effectively rounding towards zero.  However, this only works if the
1043 result can actually be represented as an @code{int}---for very large
1044 numbers, this is impossible.  The functions listed here return the
1045 result as a @code{double} instead to get around this problem.
1046
1047 @comment math.h
1048 @comment ISO
1049 @deftypefun double ceil (double @var{x})
1050 @comment math.h
1051 @comment ISO
1052 @deftypefunx float ceilf (float @var{x})
1053 @comment math.h
1054 @comment ISO
1055 @deftypefunx {long double} ceill (long double @var{x})
1056 These functions round @var{x} upwards to the nearest integer,
1057 returning that value as a @code{double}.  Thus, @code{ceil (1.5)}
1058 is @code{2.0}.
1059 @end deftypefun
1060
1061 @comment math.h
1062 @comment ISO
1063 @deftypefun double floor (double @var{x})
1064 @comment math.h
1065 @comment ISO
1066 @deftypefunx float floorf (float @var{x})
1067 @comment math.h
1068 @comment ISO
1069 @deftypefunx {long double} floorl (long double @var{x})
1070 These functions round @var{x} downwards to the nearest
1071 integer, returning that value as a @code{double}.  Thus, @code{floor
1072 (1.5)} is @code{1.0} and @code{floor (-1.5)} is @code{-2.0}.
1073 @end deftypefun
1074
1075 @comment math.h
1076 @comment ISO
1077 @deftypefun double trunc (double @var{x})
1078 @comment math.h
1079 @comment ISO
1080 @deftypefunx float truncf (float @var{x})
1081 @comment math.h
1082 @comment ISO
1083 @deftypefunx {long double} truncl (long double @var{x})
1084 These functions round @var{x} towards zero to an integer; @code{trunc (-1.5)} is @code{-1.0}.
1085 @end deftypefun
1086
1087 @comment math.h
1088 @comment ISO
1089 @deftypefun double rint (double @var{x})
1090 @comment math.h
1091 @comment ISO
1092 @deftypefunx float rintf (float @var{x})
1093 @comment math.h
1094 @comment ISO
1095 @deftypefunx {long double} rintl (long double @var{x})
1096 These functions round @var{x} to an integer value according to the
1097 current rounding mode.  @xref{Floating Point Parameters}, for
1098 information about the various rounding modes.  The default
1099 rounding mode is to round to the nearest integer; some machines
1100 support other modes, but round-to-nearest is always used unless
1101 you explicitly select another.
1102
1103 If @var{x} was not initially an integer, these functions raise the
1104 inexact exception.
1105 @end deftypefun
1106
1107 @comment math.h
1108 @comment ISO
1109 @deftypefun double nearbyint (double @var{x})
1110 @comment math.h
1111 @comment ISO
1112 @deftypefunx float nearbyintf (float @var{x})
1113 @comment math.h
1114 @comment ISO
1115 @deftypefunx {long double} nearbyintl (long double @var{x})
1116 These functions return the same value as the @code{rint} functions, but
1117 do not raise the inexact exception if @var{x} is not an integer.
1118 @end deftypefun
1119
1120 @comment math.h
1121 @comment ISO
1122 @deftypefun double round (double @var{x})
1123 @comment math.h
1124 @comment ISO
1125 @deftypefunx float roundf (float @var{x})
1126 @comment math.h
1127 @comment ISO
1128 @deftypefunx {long double} roundl (long double @var{x})
1129 These functions are similar to @code{rint}, but they always round to
1130 nearest, with halfway cases away from zero rather than to the even integer.
1131 @end deftypefun
1132
1133 @comment math.h
1134 @comment ISO
1135 @deftypefun {long int} lrint (double @var{x})
1136 @comment math.h
1137 @comment ISO
1138 @deftypefunx {long int} lrintf (float @var{x})
1139 @comment math.h
1140 @comment ISO
1141 @deftypefunx {long int} lrintl (long double @var{x})
1142 These functions are just like @code{rint}, but they return a
1143 @code{long int} instead of a floating-point number.
1144 @end deftypefun
1145
1146 @comment math.h
1147 @comment ISO
1148 @deftypefun {long long int} llrint (double @var{x})
1149 @comment math.h
1150 @comment ISO
1151 @deftypefunx {long long int} llrintf (float @var{x})
1152 @comment math.h
1153 @comment ISO
1154 @deftypefunx {long long int} llrintl (long double @var{x})
1155 These functions are just like @code{rint}, but they return a
1156 @code{long long int} instead of a floating-point number.
1157 @end deftypefun
1158
1159 @comment math.h
1160 @comment ISO
1161 @deftypefun {long int} lround (double @var{x})
1162 @comment math.h
1163 @comment ISO
1164 @deftypefunx {long int} lroundf (float @var{x})
1165 @comment math.h
1166 @comment ISO
1167 @deftypefunx {long int} lroundl (long double @var{x})
1168 These functions are just like @code{round}, but they return a
1169 @code{long int} instead of a floating-point number.
1170 @end deftypefun
1171
1172 @comment math.h
1173 @comment ISO
1174 @deftypefun {long long int} llround (double @var{x})
1175 @comment math.h
1176 @comment ISO
1177 @deftypefunx {long long int} llroundf (float @var{x})
1178 @comment math.h
1179 @comment ISO
1180 @deftypefunx {long long int} llroundl (long double @var{x})
1181 These functions are just like @code{round}, but they return a
1182 @code{long long int} instead of a floating-point number.
1183 @end deftypefun
1184
1185
1186 @comment math.h
1187 @comment ISO
1188 @deftypefun double modf (double @var{value}, double *@var{integer-part})
1189 @comment math.h
1190 @comment ISO
1191 @deftypefunx float modff (float @var{value}, float *@var{integer-part})
1192 @comment math.h
1193 @comment ISO
1194 @deftypefunx {long double} modfl (long double @var{value}, long double *@var{integer-part})
1195 These functions break the argument @var{value} into an integer part and a
1196 fractional part (between @code{-1} and @code{1}, exclusive).  Their sum
1197 equals @var{value}.  Each of the parts has the same sign as @var{value},
1198 and the integer part is always rounded toward zero.
1199
1200 @code{modf} stores the integer part in @code{*@var{integer-part}}, and
1201 returns the fractional part.  For example, @code{modf (2.5, &intpart)}
1202 returns @code{0.5} and stores @code{2.0} into @code{intpart}.
1203 @end deftypefun
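A typical use of @code{modf} is to separate a quantity into whole units
and a remainder that is then rescaled.  In this sketch
@code{split_seconds} is a hypothetical helper that assumes a
non-negative argument small enough for the whole part to fit in a
@code{long}.

```c
#include <math.h>

/* Hypothetical helper: split a non-negative number of seconds into
   whole seconds and nanoseconds (truncated).  */
void
split_seconds (double seconds, long *whole, long *nanos)
{
  double integer_part;
  double fraction = modf (seconds, &integer_part);
  *whole = (long) integer_part;
  *nanos = (long) (fraction * 1e9);
}
```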
1204
1205 @node Remainder Functions
1206 @subsection Remainder Functions
1207
1208 The functions in this section compute the remainder on division of two
1209 floating-point numbers.  Each is a little different; pick the one that
1210 suits your problem.
1211
1212 @comment math.h
1213 @comment ISO
1214 @deftypefun double fmod (double @var{numerator}, double @var{denominator})
1215 @comment math.h
1216 @comment ISO
1217 @deftypefunx float fmodf (float @var{numerator}, float @var{denominator})
1218 @comment math.h
1219 @comment ISO
1220 @deftypefunx {long double} fmodl (long double @var{numerator}, long double @var{denominator})
1221 These functions compute the remainder from the division of
1222 @var{numerator} by @var{denominator}.  Specifically, the return value is
1223 @code{@var{numerator} - @w{@var{n} * @var{denominator}}}, where @var{n}
1224 is the quotient of @var{numerator} divided by @var{denominator}, rounded
1225 towards zero to an integer.  Thus, @w{@code{fmod (6.5, 2.3)}} returns
1226 @code{1.9}, which is @code{6.5} minus @code{4.6}.
1227
1228 The result has the same sign as the @var{numerator} and has magnitude
1229 less than the magnitude of the @var{denominator}.
1230
1231 If @var{denominator} is zero, @code{fmod} signals a domain error.
1232 @end deftypefun
1233
1234 @comment math.h
1235 @comment BSD
1236 @deftypefun double drem (double @var{numerator}, double @var{denominator})
1237 @comment math.h
1238 @comment BSD
1239 @deftypefunx float dremf (float @var{numerator}, float @var{denominator})
1240 @comment math.h
1241 @comment BSD
1242 @deftypefunx {long double} dreml (long double @var{numerator}, long double @var{denominator})
1243 These functions are like @code{fmod} except that they round the
1244 internal quotient @var{n} to the nearest integer instead of towards zero
1245 to an integer.  For example, @code{drem (6.5, 2.3)} returns @code{-0.4},
1246 which is @code{6.5} minus @code{6.9}.
1247
1248 The absolute value of the result is less than or equal to half the
1249 absolute value of the @var{denominator}.  The difference between
1250 @code{fmod (@var{numerator}, @var{denominator})} and @code{drem
1251 (@var{numerator}, @var{denominator})} is always either
1252 @var{denominator}, minus @var{denominator}, or zero.
1253
1254 If @var{denominator} is zero, @code{drem} signals a domain error.
1255 @end deftypefun
1256
1257 @comment math.h
1258 @comment BSD
1259 @deftypefun double remainder (double @var{numerator}, double @var{denominator})
1260 @comment math.h
1261 @comment BSD
1262 @deftypefunx float remainderf (float @var{numerator}, float @var{denominator})
1263 @comment math.h
1264 @comment BSD
1265 @deftypefunx {long double} remainderl (long double @var{numerator}, long double @var{denominator})
1266 These functions are other names for the @code{drem} functions.
1267 @end deftypefun
1268
1269 @node FP Bit Twiddling
1270 @subsection Setting and modifying single bits of FP values
1271 @cindex FP arithmetic
1272
1273 There are some operations that are too complicated or expensive to
1274 perform by hand on floating-point numbers.  @w{ISO C 9x} defines
1275 functions to do these operations, which mostly involve changing single
1276 bits.
1277
1278 @comment math.h
1279 @comment ISO
1280 @deftypefun double copysign (double @var{x}, double @var{y})
1281 @comment math.h
1282 @comment ISO
1283 @deftypefunx float copysignf (float @var{x}, float @var{y})
1284 @comment math.h
1285 @comment ISO
1286 @deftypefunx {long double} copysignl (long double @var{x}, long double @var{y})
1287 These functions return @var{x} but with the sign of @var{y}.  They work
1288 even if @var{x} or @var{y} are NaN or zero.  Both of these can carry a
1289 sign (although not all implementations support it) and this is one of
1290 the few operations that can tell the difference.
1291
1292 @code{copysign} never raises an exception.
1293 @c except signalling NaNs
1294
1295 This function is defined in @w{IEC 559} (and the appendix with
1296 recommended functions in @w{IEEE 754}/@w{IEEE 854}).
1297 @end deftypefun
1298
1299 @comment math.h
1300 @comment ISO
1301 @deftypefun int signbit (@emph{float-type} @var{x})
1302 @code{signbit} is a generic macro which can work on all floating-point
1303 types.  It returns a nonzero value if the value of @var{x} has its sign
1304 bit set.
1305
1306 This is not the same as @code{x < 0.0}, because @w{IEEE 754} floating
1307 point allows zero to be signed.  The comparison @code{-0.0 < 0.0} is
1308 false, but @code{signbit (-0.0)} will return a nonzero value.
1309 @end deftypefun
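The distinction between a comparison and the sign bit can be made
concrete with a test for negative zero.  This is a minimal sketch;
@code{is_negative_zero} is a hypothetical helper.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true only for negative zero.  */
bool
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x) != 0;
}
```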
1310
1311 @comment math.h
1312 @comment ISO
1313 @deftypefun double nextafter (double @var{x}, double @var{y})
1314 @comment math.h
1315 @comment ISO
1316 @deftypefunx float nextafterf (float @var{x}, float @var{y})
1317 @comment math.h
1318 @comment ISO
1319 @deftypefunx {long double} nextafterl (long double @var{x}, long double @var{y})
1320 The @code{nextafter} function returns the next representable neighbor of
1321 @var{x} in the direction towards @var{y}.  The size of the step between
1322 @var{x} and the result depends on the type of the result.  If
1323 @math{@var{x} = @var{y}} the function simply returns @var{x}.  If either
1324 value is @code{NaN}, @code{NaN} is returned.  Otherwise
1325 a value corresponding to the value of the least significant bit in the
1326 mantissa is added or subtracted, depending on the direction.
1327 @code{nextafter} will signal overflow or underflow if the result goes
1328 outside of the range of normalized numbers.
1329
1330 This function is defined in @w{IEC 559} (and the appendix with
1331 recommended functions in @w{IEEE 754}/@w{IEEE 854}).
1332 @end deftypefun
1333
1334 @comment math.h
1335 @comment ISO
1336 @deftypefun double nexttoward (double @var{x}, long double @var{y})
1337 @comment math.h
1338 @comment ISO
1339 @deftypefunx float nexttowardf (float @var{x}, long double @var{y})
1340 @comment math.h
1341 @comment ISO
1342 @deftypefunx {long double} nexttowardl (long double @var{x}, long double @var{y})
1343 These functions are identical to the corresponding versions of
1344 @code{nextafter} except that their second argument is a @code{long
1345 double}.
1346 @end deftypefun
1347
1348 @cindex NaN
1349 @comment math.h
1350 @comment ISO
1351 @deftypefun double nan (const char *@var{tagp})
1352 @comment math.h
1353 @comment ISO
1354 @deftypefunx float nanf (const char *@var{tagp})
1355 @comment math.h
1356 @comment ISO
1357 @deftypefunx {long double} nanl (const char *@var{tagp})
1358 The @code{nan} function returns a representation of NaN, provided that
1359 NaN is supported by the target platform.
1360 @code{nan ("@var{n-char-sequence}")} is equivalent to
1361 @code{strtod ("NAN(@var{n-char-sequence})", NULL)}.
1362
1363 The argument @var{tagp} is used in an unspecified manner.  On @w{IEEE
1364 754} systems, there are many representations of NaN, and @var{tagp}
1365 selects one.  On other systems it may do nothing.
1366 @end deftypefun
1367
1368 @node FP Comparison Functions
1369 @subsection Floating-Point Comparison Functions
1370 @cindex unordered comparison
1371
1372 The standard C comparison operators provoke exceptions when one or other
1373 of the operands is NaN.  For example,
1374
1375 @smallexample
1376 int v = a < 1.0;
1377 @end smallexample
1378
1379 @noindent
1380 will raise an exception if @var{a} is NaN.  (This does @emph{not}
1381 happen with @code{==} and @code{!=}; those merely return false and true,
1382 respectively, when NaN is examined.)  Frequently this exception is
1383 undesirable.  @w{ISO C 9x} therefore defines comparison functions that
1384 do not raise exceptions when NaN is examined.  All of the functions are
1385 implemented as macros which allow their arguments to be of any
1386 floating-point type.  The macros are guaranteed to evaluate their
1387 arguments only once.
1388
1389 @comment math.h
1390 @comment ISO
1391 @deftypefn Macro int isgreater (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
1392 This macro determines whether the argument @var{x} is greater than
1393 @var{y}.  It is equivalent to @code{(@var{x}) > (@var{y})}, but no
1394 exception is raised if @var{x} or @var{y} are NaN.
1395 @end deftypefn
1396
1397 @comment math.h
1398 @comment ISO
1399 @deftypefn Macro int isgreaterequal (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
1400 This macro determines whether the argument @var{x} is greater than or
1401 equal to @var{y}.  It is equivalent to @code{(@var{x}) >= (@var{y})}, but no
1402 exception is raised if @var{x} or @var{y} are NaN.
1403 @end deftypefn
1404
1405 @comment math.h
1406 @comment ISO
1407 @deftypefn Macro int isless (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
1408 This macro determines whether the argument @var{x} is less than @var{y}.
1409 It is equivalent to @code{(@var{x}) < (@var{y})}, but no exception is
1410 raised if @var{x} or @var{y} are NaN.
1411 @end deftypefn
1412
1413 @comment math.h
1414 @comment ISO
1415 @deftypefn Macro int islessequal (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
1416 This macro determines whether the argument @var{x} is less than or equal
1417 to @var{y}.  It is equivalent to @code{(@var{x}) <= (@var{y})}, but no
1418 exception is raised if @var{x} or @var{y} are NaN.
1419 @end deftypefn
1420
1421 @comment math.h
1422 @comment ISO
1423 @deftypefn Macro int islessgreater (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
1424 This macro determines whether the argument @var{x} is less or greater
1425 than @var{y}.  It is equivalent to @code{(@var{x}) < (@var{y}) ||
1426 (@var{x}) > (@var{y})} (although it only evaluates @var{x} and @var{y}
1427 once), but no exception is raised if @var{x} or @var{y} are NaN.
1428
1429 This macro is not equivalent to @code{@var{x} != @var{y}}, because that
1430 expression is true if @var{x} or @var{y} are NaN.
1431 @end deftypefn
1432
1433 @comment math.h
1434 @comment ISO
1435 @deftypefn Macro int isunordered (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
1436 This macro determines whether its arguments are unordered.  In other
1437 words, it is true if @var{x} or @var{y} are NaN, and false otherwise.
1438 @end deftypefn
1439
1440 Not all machines provide hardware support for these operations.  On
1441 machines that don't, the macros can be very slow.  Therefore, you should
1442 not use these macros when NaN is not a concern.
1443
1444 @strong{Note:} There are no macros @code{isequal} or @code{isunequal}.
1445 They are unnecessary, because the @code{==} and @code{!=} operators do
1446 @emph{not} raise an exception if one or both of the operands are NaN.
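A range check is a common place where a NaN may turn up and an exception
would be unwelcome.  The sketch below builds one from the comparison
macros; @code{in_range_quiet} is a hypothetical helper.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true if lo <= x <= hi.  A NaN in any argument
   gives false, and the macros raise no exception while deciding.  */
bool
in_range_quiet (double x, double lo, double hi)
{
  return isgreaterequal (x, lo) && islessequal (x, hi);
}
```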
1447
1448 @node Misc FP Arithmetic
1449 @subsection Miscellaneous FP arithmetic functions
1450 @cindex minimum
1451 @cindex maximum
1452 @cindex positive difference
1453 @cindex multiply-add
1454
1455 The functions in this section perform miscellaneous but common
1456 operations that are awkward to express with C operators.  On some
1457 processors these functions can use special machine instructions to
1458 perform these operations faster than the equivalent C code.
1459
1460 @comment math.h
1461 @comment ISO
1462 @deftypefun double fmin (double @var{x}, double @var{y})
1463 @comment math.h
1464 @comment ISO
1465 @deftypefunx float fminf (float @var{x}, float @var{y})
1466 @comment math.h
1467 @comment ISO
1468 @deftypefunx {long double} fminl (long double @var{x}, long double @var{y})
1469 The @code{fmin} function returns the lesser of the two values @var{x}
1470 and @var{y}.  It is similar to the expression
1471 @smallexample
1472 ((x) < (y) ? (x) : (y))
1473 @end smallexample
1474 except that @var{x} and @var{y} are only evaluated once.
1475
1476 If an argument is NaN, the other argument is returned.  If both arguments
1477 are NaN, NaN is returned.
1478 @end deftypefun
1479
1480 @comment math.h
1481 @comment ISO
1482 @deftypefun double fmax (double @var{x}, double @var{y})
1483 @comment math.h
1484 @comment ISO
1485 @deftypefunx float fmaxf (float @var{x}, float @var{y})
1486 @comment math.h
1487 @comment ISO
1488 @deftypefunx {long double} fmaxl (long double @var{x}, long double @var{y})
1489 The @code{fmax} function returns the greater of the two values @var{x}
1490 and @var{y}.
1491
1492 If an argument is NaN, the other argument is returned.  If both arguments
1493 are NaN, NaN is returned.
1494 @end deftypefun
1495
1496 @comment math.h
1497 @comment ISO
1498 @deftypefun double fdim (double @var{x}, double @var{y})
1499 @comment math.h
1500 @comment ISO
1501 @deftypefunx float fdimf (float @var{x}, float @var{y})
1502 @comment math.h
1503 @comment ISO
1504 @deftypefunx {long double} fdiml (long double @var{x}, long double @var{y})
1505 The @code{fdim} function returns the positive difference between
1506 @var{x} and @var{y}.  The positive difference is @math{@var{x} -
1507 @var{y}} if @var{x} is greater than @var{y}, and @math{0} otherwise.
1508
1509 If @var{x}, @var{y}, or both are NaN, NaN is returned.
1510 @end deftypefun
1511
1512 @comment math.h
1513 @comment ISO
1514 @deftypefun double fma (double @var{x}, double @var{y}, double @var{z})
1515 @comment math.h
1516 @comment ISO
1517 @deftypefunx float fmaf (float @var{x}, float @var{y}, float @var{z})
1518 @comment math.h
1519 @comment ISO
1520 @deftypefunx {long double} fmal (long double @var{x}, long double @var{y}, long double @var{z})
1521 @cindex butterfly
1522 The @code{fma} function performs floating-point multiply-add.  This is
1523 the operation @math{(@var{x} @mul{} @var{y}) + @var{z}}, but the
1524 intermediate result is not rounded to the destination type.  This can
1525 sometimes improve the precision of a calculation.
1526
1527 This function was introduced because some processors have a special
1528 instruction to perform multiply-add.  The C compiler cannot use it
1529 directly, because the expression @samp{x*y + z} is defined to round the
1530 intermediate result.  @code{fma} lets you choose when you want to round
1531 only once.
1532
1533 @vindex FP_FAST_FMA
1534 On processors which do not implement multiply-add in hardware,
1535 @code{fma} can be very slow since it must avoid intermediate rounding.
1536 @file{math.h} defines the symbols @code{FP_FAST_FMA},
1537 @code{FP_FAST_FMAF}, and @code{FP_FAST_FMAL} when the corresponding
1538 version of @code{fma} is no slower than the expression @samp{x*y + z}.
1539 In the GNU C library, this always means the operation is implemented in
1540 hardware.
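
One way to see the effect of the single rounding is to recover the
rounding error of an ordinary product.  The sketch below assumes that
neither the product nor the error overflows or underflows; the name
@code{product_error} is ours.

@smallexample
#include <math.h>

/* @r{Return the exact value of @var{a}*@var{b} minus the rounded}  */
/* @r{product, which is representable when nothing overflows}  */
/* @r{or underflows.}  */
double
product_error (double a, double b)
@{
  double p = a * b;         /* @r{Rounded once here.}  */
  return fma (a, b, -p);    /* @r{Exact product minus p, rounded once.}  */
@}
@end smallexample

@noindent
For example, with @math{a = 1 + 2^-30} and @math{b = 1 - 2^-30} the
exact product is @math{1 - 2^-60}, which rounds to @code{1.0} in IEEE
double precision, so the function returns @math{-2^-60}.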
1541 @end deftypefun
1542
1543 @node Complex Numbers
1544 @section Complex Numbers
1545 @pindex complex.h
1546 @cindex complex numbers
1547
1548 @w{ISO C 9x} introduces support for complex numbers in C.  This is done
1549 with a new type qualifier, @code{complex}.  It is a keyword if and only
1550 if @file{complex.h} has been included.  There are three complex types,
1551 corresponding to the three real types:  @code{float complex},
1552 @code{double complex}, and @code{long double complex}.
1553
1554 To construct complex numbers you need a way to indicate the imaginary
1555 part of a number.  There is no standard notation for an imaginary
1556 floating point constant.  Instead, @file{complex.h} defines two macros
1557 that can be used to create complex numbers.
1558
1559 @deftypevr Macro {const float complex} _Complex_I
1560 This macro is a representation of the complex number ``@math{0+1i}''.
1561 Multiplying a real floating-point value by @code{_Complex_I} gives a
1562 complex number whose value is purely imaginary.  You can use this to
1563 construct complex constants:
1564
1565 @smallexample
1566 @math{3.0 + 4.0i} = @code{3.0 + 4.0 * _Complex_I}
1567 @end smallexample
1568
1569 Note that @code{_Complex_I * _Complex_I} has the value @code{-1}, but
1570 the type of that value is @code{complex}.
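
The same construction can be wrapped in a function.  This is only a
sketch, and the name @code{make_complex} is hypothetical:

@smallexample
#include <complex.h>

/* @r{Build the complex number @var{re} + @var{im}i from its parts.}  */
double complex
make_complex (double re, double im)
@{
  return re + im * _Complex_I;
@}
@end smallexample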
1571 @end deftypevr
1572
1573 @c Put this back in when gcc supports _Imaginary_I.  It's too confusing.
1574 @ignore
1575 @noindent
1576 Without an optimizing compiler this is more expensive than the use of
1577 @code{_Imaginary_I} but with is better than nothing.  You can avoid all
1578 the hassles if you use the @code{I} macro below if the name is not
1579 problem.
1580
1581 @deftypevr Macro {const float imaginary} _Imaginary_I
1582 This macro is a representation of the value ``@math{1i}''.  I.e., it is
1583 the value for which
1584
1585 @smallexample
1586 _Imaginary_I * _Imaginary_I = -1
1587 @end smallexample
1588
1589 @noindent
1590 The result is not of type @code{float imaginary} but instead @code{float}.
1591 One can use it to easily construct complex number like in
1592
1593 @smallexample
1594 3.0 - _Imaginary_I * 4.0
1595 @end smallexample
1596
1597 @noindent
1598 which results in the complex number with a real part of 3.0 and a
1599 imaginary part -4.0.
1600 @end deftypevr
1601 @end ignore
1602
1603 @noindent
1604 @code{_Complex_I} is a bit of a mouthful.  @file{complex.h} also defines
1605 a shorter name for the same constant.
1606
1607 @deftypevr Macro {const float complex} I
1608 This macro has exactly the same value as @code{_Complex_I}.  Most of the
1609 time it is preferable.  However, it causes problems if you want to use
1610 the identifier @code{I} for something else.  You can safely write
1611
1612 @smallexample
1613 #include <complex.h>
1614 #undef I
1615 @end smallexample
1616
1617 @noindent
1618 if you need @code{I} for your own purposes.  (In that case we recommend
1619 you also define some other short name for @code{_Complex_I}, such as
1620 @code{J}.)
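
The following sketch shows this arrangement.  The function
@code{sum_of_powers_of_j} is a made-up example whose only purpose is to
use @code{I} as an ordinary variable while @code{J} stands for the
imaginary unit.

@smallexample
#include <complex.h>
#undef I
#define J _Complex_I

/* @r{Return J^0 + J^1 + @dots{} + J^(n-1).}  */
double complex
sum_of_powers_of_j (int n)
@{
  double complex sum = 0.0, term = 1.0;
  int I;    /* @r{@code{I} is free for our own use now.}  */

  for (I = 0; I < n; I++)
    @{
      sum += term;
      term *= J;
    @}
  return sum;
@}
@end smallexample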
1621
1622 @ignore
1623 If the implementation does not support the @code{imaginary} types
1624 @code{I} is defined as @code{_Complex_I} which is the second best
1625 solution.  It still can be used in the same way but requires a most
1626 clever compiler to get the same results.
1627 @end ignore
1628 @end deftypevr
1629
1630 @node Operations on Complex
1631 @section Projections, Conjugates, and Decomposing of Complex Numbers
1632 @cindex project complex numbers
1633 @cindex conjugate complex numbers
1634 @cindex decompose complex numbers
1635 @pindex complex.h
1636
1637 @w{ISO C 9x} also defines functions that perform basic operations on
1638 complex numbers, such as decomposition and conjugation.  The prototypes
1639 for all these functions are in @file{complex.h}.  All functions are
1640 available in three variants, one for each of the three complex types.
1641
1642 @comment complex.h
1643 @comment ISO
1644 @deftypefun double creal (complex double @var{z})
1645 @comment complex.h
1646 @comment ISO
1647 @deftypefunx float crealf (complex float @var{z})
1648 @comment complex.h
1649 @comment ISO
1650 @deftypefunx {long double} creall (complex long double @var{z})
1651 These functions return the real part of the complex number @var{z}.
1652 @end deftypefun
1653
1654 @comment complex.h
1655 @comment ISO
1656 @deftypefun double cimag (complex double @var{z})
1657 @comment complex.h
1658 @comment ISO
1659 @deftypefunx float cimagf (complex float @var{z})
1660 @comment complex.h
1661 @comment ISO
1662 @deftypefunx {long double} cimagl (complex long double @var{z})
1663 These functions return the imaginary part of the complex number @var{z}.
1664 @end deftypefun
1665
1666 @comment complex.h
1667 @comment ISO
1668 @deftypefun {complex double} conj (complex double @var{z})
1669 @comment complex.h
1670 @comment ISO
1671 @deftypefunx {complex float} conjf (complex float @var{z})
1672 @comment complex.h
1673 @comment ISO
1674 @deftypefunx {complex long double} conjl (complex long double @var{z})
1675 These functions return the conjugate value of the complex number
1676 @var{z}.  The conjugate of a complex number has the same real part and a
1677 negated imaginary part.  In other words, @samp{conj(a + bi) = a + -bi}.
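
As an illustration, multiplying a number by its conjugate gives the
square of its magnitude as a purely real value.  The helper name
@code{norm_squared} below is ours.

@smallexample
#include <complex.h>

/* @r{Squared magnitude of @var{z}: the real part of z * conj (z).}  */
double
norm_squared (double complex z)
@{
  return creal (z * conj (z));
@}
@end smallexample

@noindent
For @math{3 + 4i} this yields @code{25.0}.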
1678 @end deftypefun
1679
1680 @comment complex.h
1681 @comment ISO
1682 @deftypefun double carg (complex double @var{z})
1683 @comment complex.h
1684 @comment ISO
1685 @deftypefunx float cargf (complex float @var{z})
1686 @comment complex.h
1687 @comment ISO
1688 @deftypefunx {long double} cargl (complex long double @var{z})
1689 These functions return the argument of the complex number @var{z}.
1690 The argument of a complex number is the angle in the complex plane
1691 between the positive real axis and a line passing through zero and the
number.  This angle is measured in the usual fashion and ranges from
@math{-@pi{}} to @math{@pi{}}.

@code{carg} has a branch cut along the negative real axis.
1696 @end deftypefun
1697
1698 @comment complex.h
1699 @comment ISO
1700 @deftypefun {complex double} cproj (complex double @var{z})
1701 @comment complex.h
1702 @comment ISO
1703 @deftypefunx {complex float} cprojf (complex float @var{z})
1704 @comment complex.h
1705 @comment ISO
1706 @deftypefunx {complex long double} cprojl (complex long double @var{z})
1707 These functions return the projection of the complex value @var{z} onto
the Riemann sphere.  Values with an infinite imaginary part are projected
1709 to positive infinity on the real axis, even if the real part is NaN.  If
1710 the real part is infinite, the result is equivalent to
1711
1712 @smallexample
1713 INFINITY + I * copysign (0.0, cimag (z))
1714 @end smallexample
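
Because every infinite value is mapped to the same point, @code{cproj}
gives a compact test for ``complex infinity''.  This is a sketch only;
@code{is_complex_infinity} is a hypothetical helper.

@smallexample
#include <complex.h>
#include <math.h>

/* @r{Nonzero if either part of @var{z} is infinite, that is, if}  */
/* @r{@var{z} projects to the point at infinity.}  */
int
is_complex_infinity (double complex z)
@{
  return isinf (creal (cproj (z)));
@}
@end smallexample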
1715 @end deftypefun
1716
1717 @node Integer Division
1718 @section Integer Division
1719 @cindex integer division functions
1720
1721 This section describes functions for performing integer division.  These
1722 functions are redundant when GNU CC is used, because in GNU C the
1723 @samp{/} operator always rounds towards zero.  But in other C
1724 implementations, @samp{/} may round differently with negative arguments.
1725 @code{div} and @code{ldiv} are useful because they specify how to round
1726 the quotient: towards zero.  The remainder has the same sign as the
1727 numerator.
1728
1729 These functions are specified to return a result @var{r} such that the value
1730 @code{@var{r}.quot*@var{denominator} + @var{r}.rem} equals
1731 @var{numerator}.
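
Since the rounding direction and the sign of the remainder are both
known, other rounding rules can be built on top of @code{div}.  The
following sketch rounds the quotient towards negative infinity instead;
the name @code{floor_div} is ours, and the behavior is undefined in the
same cases as for @code{div}.

@smallexample
#include <stdlib.h>

/* @r{Divide @var{num} by @var{den}, rounding the quotient down.}  */
int
floor_div (int num, int den)
@{
  div_t r = div (num, den);

  /* @r{A nonzero remainder whose sign differs from the}  */
  /* @r{denominator's means truncation rounded upwards.}  */
  if (r.rem != 0 && ((r.rem < 0) != (den < 0)))
    r.quot--;
  return r.quot;
@}
@end smallexample

@noindent
For instance, @code{floor_div (20, -6)} is @code{-4}, whereas
@code{div (20, -6).quot} is @code{-3}.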
1732
1733 @pindex stdlib.h
To use these facilities, you should include the header file
@file{stdlib.h} in your program; @code{imaxdiv} and @code{imaxdiv_t} are
declared in @file{inttypes.h} instead.
1736
1737 @comment stdlib.h
1738 @comment ISO
1739 @deftp {Data Type} div_t
1740 This is a structure type used to hold the result returned by the @code{div}
1741 function.  It has the following members:
1742
1743 @table @code
1744 @item int quot
1745 The quotient from the division.
1746
1747 @item int rem
1748 The remainder from the division.
1749 @end table
1750 @end deftp
1751
1752 @comment stdlib.h
1753 @comment ISO
1754 @deftypefun div_t div (int @var{numerator}, int @var{denominator})
The @code{div} function computes the quotient and remainder from
1756 the division of @var{numerator} by @var{denominator}, returning the
1757 result in a structure of type @code{div_t}.
1758
1759 If the result cannot be represented (as in a division by zero), the
1760 behavior is undefined.
1761
1762 Here is an example, albeit not a very useful one.
1763
1764 @smallexample
1765 div_t result;
1766 result = div (20, -6);
1767 @end smallexample
1768
1769 @noindent
1770 Now @code{result.quot} is @code{-3} and @code{result.rem} is @code{2}.
1771 @end deftypefun
1772
1773 @comment stdlib.h
1774 @comment ISO
1775 @deftp {Data Type} ldiv_t
1776 This is a structure type used to hold the result returned by the @code{ldiv}
1777 function.  It has the following members:
1778
1779 @table @code
1780 @item long int quot
1781 The quotient from the division.
1782
1783 @item long int rem
1784 The remainder from the division.
1785 @end table
1786
1787 (This is identical to @code{div_t} except that the components are of
1788 type @code{long int} rather than @code{int}.)
1789 @end deftp
1790
1791 @comment stdlib.h
1792 @comment ISO
1793 @deftypefun ldiv_t ldiv (long int @var{numerator}, long int @var{denominator})
1794 The @code{ldiv} function is similar to @code{div}, except that the
1795 arguments are of type @code{long int} and the result is returned as a
1796 structure of type @code{ldiv_t}.
1797 @end deftypefun
1798
1799 @comment stdlib.h
1800 @comment ISO
1801 @deftp {Data Type} lldiv_t
1802 This is a structure type used to hold the result returned by the @code{lldiv}
1803 function.  It has the following members:
1804
1805 @table @code
1806 @item long long int quot
1807 The quotient from the division.
1808
1809 @item long long int rem
1810 The remainder from the division.
1811 @end table
1812
1813 (This is identical to @code{div_t} except that the components are of
1814 type @code{long long int} rather than @code{int}.)
1815 @end deftp
1816
1817 @comment stdlib.h
1818 @comment ISO
1819 @deftypefun lldiv_t lldiv (long long int @var{numerator}, long long int @var{denominator})
1820 The @code{lldiv} function is like the @code{div} function, but the
1821 arguments are of type @code{long long int} and the result is returned as
1822 a structure of type @code{lldiv_t}.
1823
1824 The @code{lldiv} function was added in @w{ISO C 9x}.
1825 @end deftypefun
1826
1827 @comment inttypes.h
1828 @comment ISO
1829 @deftp {Data Type} imaxdiv_t
1830 This is a structure type used to hold the result returned by the @code{imaxdiv}
1831 function.  It has the following members:
1832
1833 @table @code
1834 @item intmax_t quot
1835 The quotient from the division.
1836
1837 @item intmax_t rem
1838 The remainder from the division.
1839 @end table
1840
1841 (This is identical to @code{div_t} except that the components are of
1842 type @code{intmax_t} rather than @code{int}.)
1843 @end deftp
1844
1845 @comment inttypes.h
1846 @comment ISO
1847 @deftypefun imaxdiv_t imaxdiv (intmax_t @var{numerator}, intmax_t @var{denominator})
1848 The @code{imaxdiv} function is like the @code{div} function, but the
1849 arguments are of type @code{intmax_t} and the result is returned as
1850 a structure of type @code{imaxdiv_t}.
1851
1852 The @code{imaxdiv} function was added in @w{ISO C 9x}.
1853 @end deftypefun
1854
1855
1856 @node Parsing of Numbers
1857 @section Parsing of Numbers
1858 @cindex parsing numbers (in formatted input)
1859 @cindex converting strings to numbers
1860 @cindex number syntax, parsing
1861 @cindex syntax, for reading numbers
1862
1863 This section describes functions for ``reading'' integer and
1864 floating-point numbers from a string.  It may be more convenient in some
1865 cases to use @code{sscanf} or one of the related functions; see
1866 @ref{Formatted Input}.  But often you can make a program more robust by
1867 finding the tokens in the string by hand, then converting the numbers
1868 one by one.
1869
1870 @menu
1871 * Parsing of Integers::         Functions for conversion of integer values.
1872 * Parsing of Floats::           Functions for conversion of floating-point
1873                                  values.
1874 @end menu
1875
1876 @node Parsing of Integers
1877 @subsection Parsing of Integers
1878
1879 @pindex stdlib.h
1880 These functions are declared in @file{stdlib.h}.
1881
1882 @comment stdlib.h
1883 @comment ISO
1884 @deftypefun {long int} strtol (const char *@var{string}, char **@var{tailptr}, int @var{base})
1885 The @code{strtol} (``string-to-long'') function converts the initial
1886 part of @var{string} to a signed integer, which is returned as a value
1887 of type @code{long int}.
1888
1889 This function attempts to decompose @var{string} as follows:
1890
1891 @itemize @bullet
1892 @item
1893 A (possibly empty) sequence of whitespace characters.  Which characters
1894 are whitespace is determined by the @code{isspace} function
1895 (@pxref{Classification of Characters}).  These are discarded.
1896
1897 @item
1898 An optional plus or minus sign (@samp{+} or @samp{-}).
1899
1900 @item
1901 A nonempty sequence of digits in the radix specified by @var{base}.
1902
1903 If @var{base} is zero, decimal radix is assumed unless the series of
1904 digits begins with @samp{0} (specifying octal radix), or @samp{0x} or
1905 @samp{0X} (specifying hexadecimal radix); in other words, the same
1906 syntax used for integer constants in C.
1907
Otherwise @var{base} must have a value between @code{2} and @code{36}.
If @var{base} is @code{16}, the digits may optionally be preceded by
@samp{0x} or @samp{0X}.  If @var{base} has no legal value, the value
returned is @code{0l} and the global variable @code{errno} is set to
@code{EINVAL}.
1912
1913 @item
1914 Any remaining characters in the string.  If @var{tailptr} is not a null
1915 pointer, @code{strtol} stores a pointer to this tail in
1916 @code{*@var{tailptr}}.
1917 @end itemize
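
A @var{base} of zero is convenient when the input should follow the
syntax of C integer constants.  A minimal sketch, with a hypothetical
wrapper name:

@smallexample
#include <stdlib.h>

/* @r{Parse @var{text} like a C integer constant: a leading}  */
/* @r{0x or 0X selects hexadecimal, a leading 0 octal, and}  */
/* @r{anything else decimal.  No error checking is done.}  */
long int
parse_c_style (const char *text)
@{
  return strtol (text, NULL, 0);
@}
@end smallexample

@noindent
With this definition @code{"0x1A"} yields 26, @code{"017"} yields 15,
and @code{"42"} yields 42.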
1918
1919 If the string is empty, contains only whitespace, or does not contain an
1920 initial substring that has the expected syntax for an integer in the
1921 specified @var{base}, no conversion is performed.  In this case,
1922 @code{strtol} returns a value of zero and the value stored in
1923 @code{*@var{tailptr}} is the value of @var{string}.
1924
1925 In a locale other than the standard @code{"C"} locale, this function
1926 may recognize additional implementation-dependent syntax.
1927
1928 If the string has valid syntax for an integer but the value is not
1929 representable because of overflow, @code{strtol} returns either
1930 @code{LONG_MAX} or @code{LONG_MIN} (@pxref{Range of Type}), as
1931 appropriate for the sign of the value.  It also sets @code{errno}
1932 to @code{ERANGE} to indicate there was overflow.
1933
You should not check for errors by examining the return value of
@code{strtol}, because the string might be a valid representation of
@code{0l}, @code{LONG_MAX}, or @code{LONG_MIN}.  Instead, check whether
@code{*@var{tailptr}} points to what you expect after the number
(e.g. @code{'\0'} if the string should end after the number).  You also
need to clear @code{errno} before the call and check it afterward, in
case there was overflow.
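
These checks can be combined as in the following sketch, which accepts
a string only if all of it is a decimal number.  The function name
@code{parse_long} and its return convention are our own choices.

@smallexample
#include <errno.h>
#include <stdlib.h>

/* @r{Store the value of @var{text} in @code{*result} and return 0;}  */
/* @r{return -1 if @var{text} is not entirely a number or if}  */
/* @r{the value overflows.}  */
int
parse_long (const char *text, long int *result)
@{
  char *tail;
  long int value;

  errno = 0;
  value = strtol (text, &tail, 10);
  if (tail == text)         /* @r{No conversion was performed.}  */
    return -1;
  if (*tail != '\0')        /* @r{Something follows the number.}  */
    return -1;
  if (errno == ERANGE)      /* @r{The value does not fit.}  */
    return -1;
  *result = value;
  return 0;
@}
@end smallexample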
1941
1942 There is an example at the end of this section.
1943 @end deftypefun
1944
1945 @comment stdlib.h
1946 @comment ISO
1947 @deftypefun {unsigned long int} strtoul (const char *@var{string}, char **@var{tailptr}, int @var{base})
1948 The @code{strtoul} (``string-to-unsigned-long'') function is like
1949 @code{strtol} except it returns an @code{unsigned long int} value.  If
1950 the number has a leading @samp{-} sign, the return value is negated.
1951 The syntax is the same as described above for @code{strtol}.  The value
1952 returned on overflow is @code{ULONG_MAX} (@pxref{Range of
1953 Type}).
1954
@code{strtoul} sets @code{errno} to @code{EINVAL} if @var{base} is out of
range, or @code{ERANGE} on overflow.
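
Because a leading @samp{-} is accepted and the result negated,
@code{strtoul ("-1", NULL, 10)} returns @code{ULONG_MAX} without
reporting an error.  If negative input should be rejected, the sign has
to be checked by hand, as in this sketch (the name
@code{parse_unsigned} is hypothetical):

@smallexample
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* @r{Store the value of @var{text} in @code{*result} and return 0;}  */
/* @r{return -1 for a minus sign, trailing junk, or overflow.}  */
int
parse_unsigned (const char *text, unsigned long int *result)
@{
  char *tail;
  unsigned long int value;

  while (isspace ((unsigned char) *text))
    text++;
  if (*text == '-')
    return -1;
  errno = 0;
  value = strtoul (text, &tail, 10);
  if (tail == text || *tail != '\0' || errno == ERANGE)
    return -1;
  *result = value;
  return 0;
@}
@end smallexample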
1957 @end deftypefun
1958
1959 @comment stdlib.h
1960 @comment ISO
1961 @deftypefun {long long int} strtoll (const char *@var{string}, char **@var{tailptr}, int @var{base})
1962 The @code{strtoll} function is like @code{strtol} except that it returns
1963 a @code{long long int} value, and accepts numbers with a correspondingly
1964 larger range.
1965
1966 If the string has valid syntax for an integer but the value is not
1967 representable because of overflow, @code{strtoll} returns either
1968 @code{LONG_LONG_MAX} or @code{LONG_LONG_MIN} (@pxref{Range of Type}), as
1969 appropriate for the sign of the value.  It also sets @code{errno} to
1970 @code{ERANGE} to indicate there was overflow.
1971
1972 The @code{strtoll} function was introduced in @w{ISO C 9x}.
1973 @end deftypefun
1974
1975 @comment stdlib.h
1976 @comment BSD
1977 @deftypefun {long long int} strtoq (const char *@var{string}, char **@var{tailptr}, int @var{base})
1978 @code{strtoq} (``string-to-quad-word'') is the BSD name for @code{strtoll}.
1979 @end deftypefun
1980
1981 @comment stdlib.h
1982 @comment ISO
1983 @deftypefun {unsigned long long int} strtoull (const char *@var{string}, char **@var{tailptr}, int @var{base})
1984 The @code{strtoull} function is like @code{strtoul} except that it
1985 returns an @code{unsigned long long int}.  The value returned on overflow
1986 is @code{ULONG_LONG_MAX} (@pxref{Range of Type}).
1987
1988 The @code{strtoull} function was introduced in @w{ISO C 9x}.
1989 @end deftypefun
1990
1991 @comment stdlib.h
1992 @comment BSD
1993 @deftypefun {unsigned long long int} strtouq (const char *@var{string}, char **@var{tailptr}, int @var{base})
1994 @code{strtouq} is the BSD name for @code{strtoull}.
1995 @end deftypefun
1996
1997 @comment stdlib.h
1998 @comment ISO
1999 @deftypefun {long int} atol (const char *@var{string})
2000 This function is similar to the @code{strtol} function with a @var{base}
2001 argument of @code{10}, except that it need not detect overflow errors.
2002 The @code{atol} function is provided mostly for compatibility with
2003 existing code; using @code{strtol} is more robust.
2004 @end deftypefun
2005
2006 @comment stdlib.h
2007 @comment ISO
2008 @deftypefun int atoi (const char *@var{string})
2009 This function is like @code{atol}, except that it returns an @code{int}.
2010 The @code{atoi} function is also considered obsolete; use @code{strtol}
2011 instead.
2012 @end deftypefun
2013
2014 @comment stdlib.h
2015 @comment ISO
2016 @deftypefun {long long int} atoll (const char *@var{string})
2017 This function is similar to @code{atol}, except it returns a @code{long
2018 long int}.
2019
2020 The @code{atoll} function was introduced in @w{ISO C 9x}.  It too is
2021 obsolete (despite having just been added); use @code{strtoll} instead.
2022 @end deftypefun
2023
2024 @c !!! please fact check this paragraph -zw
2025 @findex strtol_l
2026 @findex strtoul_l
2027 @findex strtoll_l
2028 @findex strtoull_l
2029 @cindex parsing numbers and locales
2030 @cindex locales, parsing numbers and
2031 Some locales specify a printed syntax for numbers other than the one
2032 that these functions understand.  If you need to read numbers formatted
2033 in some other locale, you can use the @code{strtoX_l} functions.  Each
2034 of the @code{strtoX} functions has a counterpart with @samp{_l} added to
2035 its name.  The @samp{_l} counterparts take an additional argument: a
pointer to a @code{locale_t} structure, which describes how the numbers
2037 to be read are formatted.  @xref{Locales}.
2038
2039 @strong{Portability Note:} These functions are all GNU extensions.  You
2040 can also use @code{scanf} or its relatives, which have the @samp{'} flag
2041 for parsing numeric input according to the current locale
2042 (@pxref{Numeric Input Conversions}).  This feature is standard.
2043
2044 Here is a function which parses a string as a sequence of integers and
2045 returns the sum of them:
2046
2047 @smallexample
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

long int
sum_ints_from_string (const char *string)
@{
  long int sum = 0;

  while (1) @{
    char *tail;
    long int next;

    /* @r{Skip whitespace by hand, to detect the end.}  */
    while (isspace ((unsigned char) *string)) string++;
    if (*string == 0)
      break;

    /* @r{There is more nonwhitespace,}  */
    /* @r{so it ought to be another number.}  */
    errno = 0;
    /* @r{Parse it.}  */
    next = strtol (string, &tail, 0);
    /* @r{Stop if it was not a number after all;}  */
    /* @r{otherwise we would loop forever.}  */
    if (tail == string)
      break;
    /* @r{Add it in, if not overflow.}  */
    if (errno)
      printf ("Overflow\n");
    else
      sum += next;
    /* @r{Advance past it.}  */
    string = tail;
  @}

  return sum;
@}
2078 @end smallexample
2079
2080 @node Parsing of Floats
2081 @subsection Parsing of Floats
2082
2083 @pindex stdlib.h
2084 These functions are declared in @file{stdlib.h}.
2085
2086 @comment stdlib.h
2087 @comment ISO
2088 @deftypefun double strtod (const char *@var{string}, char **@var{tailptr})
2089 The @code{strtod} (``string-to-double'') function converts the initial
2090 part of @var{string} to a floating-point number, which is returned as a
2091 value of type @code{double}.
2092
2093 This function attempts to decompose @var{string} as follows:
2094
2095 @itemize @bullet
2096 @item
2097 A (possibly empty) sequence of whitespace characters.  Which characters
2098 are whitespace is determined by the @code{isspace} function
2099 (@pxref{Classification of Characters}).  These are discarded.
2100
2101 @item
2102 An optional plus or minus sign (@samp{+} or @samp{-}).
2103
2104 @item
2105 A nonempty sequence of digits optionally containing a decimal-point
2106 character---normally @samp{.}, but it depends on the locale
2107 (@pxref{General Numeric}).
2108
2109 @item
2110 An optional exponent part, consisting of a character @samp{e} or
2111 @samp{E}, an optional sign, and a sequence of digits.
2112
2113 @item
2114 Any remaining characters in the string.  If @var{tailptr} is not a null
2115 pointer, a pointer to this tail of the string is stored in
2116 @code{*@var{tailptr}}.
2117 @end itemize
2118
2119 If the string is empty, contains only whitespace, or does not contain an
2120 initial substring that has the expected syntax for a floating-point
2121 number, no conversion is performed.  In this case, @code{strtod} returns
2122 a value of zero and the value returned in @code{*@var{tailptr}} is the
2123 value of @var{string}.
2124
2125 In a locale other than the standard @code{"C"} or @code{"POSIX"} locales,
2126 this function may recognize additional locale-dependent syntax.
2127
2128 If the string has valid syntax for a floating-point number but the value
2129 is outside the range of a @code{double}, @code{strtod} will signal
2130 overflow or underflow as described in @ref{Math Error Reporting}.
2131
2132 @code{strtod} recognizes four special input strings.  The strings
2133 @code{"inf"} and @code{"infinity"} are converted to @math{@infinity{}},
2134 or to the largest representable value if the floating-point format
2135 doesn't support infinities.  You can prepend a @code{"+"} or @code{"-"}
2136 to specify the sign.  Case is ignored when scanning these strings.
2137
2138 The strings @code{"nan"} and @code{"nan(@var{chars...})"} are converted
2139 to NaN.  Again, case is ignored.  If @var{chars...} are provided, they
2140 are used in some unspecified fashion to select a particular
2141 representation of NaN (there can be several).
2142
Since zero is a valid result as well as the value returned on error, you
should check for errors in the same way as for @code{strtol}, by
examining @code{errno} and @code{*@var{tailptr}}.
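
A sketch of such a check follows.  The name @code{parse_double} and the
return codes are our own convention, not part of the library.

@smallexample
#include <errno.h>
#include <stdlib.h>

/* @r{Store the value of @var{text} in @code{*result} and return 0.}  */
/* @r{Return -1 if @var{text} is not entirely a number, and -2}  */
/* @r{if the value overflows or underflows.}  */
int
parse_double (const char *text, double *result)
@{
  char *tail;
  double value;

  errno = 0;
  value = strtod (text, &tail);
  if (tail == text || *tail != '\0')
    return -1;
  if (errno == ERANGE)
    return -2;
  *result = value;
  return 0;
@}
@end smallexample

@noindent
The special strings described above count as valid numbers here, so
@code{"-inf"} and @code{"nan"} are accepted.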
2146 @end deftypefun
2147
2148 @comment stdlib.h
2149 @comment GNU
2150 @deftypefun float strtof (const char *@var{string}, char **@var{tailptr})
2151 @comment stdlib.h
2152 @comment GNU
2153 @deftypefunx {long double} strtold (const char *@var{string}, char **@var{tailptr})
2154 These functions are analogous to @code{strtod}, but return @code{float}
2155 and @code{long double} values respectively.  They report errors in the
2156 same way as @code{strtod}.  @code{strtof} can be substantially faster
2157 than @code{strtod}, but has less precision; conversely, @code{strtold}
2158 can be much slower but has more precision (on systems where @code{long
2159 double} is a separate type).
2160
2161 These functions are GNU extensions.
2162 @end deftypefun
2163
2164 @comment stdlib.h
2165 @comment ISO
2166 @deftypefun double atof (const char *@var{string})
2167 This function is similar to the @code{strtod} function, except that it
2168 need not detect overflow and underflow errors.  The @code{atof} function
2169 is provided mostly for compatibility with existing code; using
2170 @code{strtod} is more robust.
2171 @end deftypefun
2172
The GNU C library also provides @samp{_l} versions of these functions,
2174 which take an additional argument, the locale to use in conversion.
2175 @xref{Parsing of Integers}.
2176
2177 @node System V Number Conversion
2178 @section Old-fashioned System V number-to-string functions
2179
2180 The old @w{System V} C library provided three functions to convert
2181 numbers to strings, with unusual and hard-to-use semantics.  The GNU C
2182 library also provides these functions and some natural extensions.
2183
2184 These functions are only available in glibc and on systems descended
2185 from AT&T Unix.  Therefore, unless these functions do precisely what you
2186 need, it is better to use @code{sprintf}, which is standard.
2187
2188 All these functions are defined in @file{stdlib.h}.
2189
2190 @comment stdlib.h
2191 @comment SVID, Unix98
2192 @deftypefun {char *} ecvt (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
2193 The function @code{ecvt} converts the floating-point number @var{value}
2194 to a string with at most @var{ndigit} decimal digits.
2195 The returned string contains no decimal point or sign. The first
2196 digit of the string is non-zero (unless @var{value} is actually zero)
2197 and the last digit is rounded to nearest.  @var{decpt} is set to the
2198 index in the string of the first digit after the decimal point.
2199 @var{neg} is set to a nonzero value if @var{value} is negative, zero
2200 otherwise.
2201
2202 The returned string is statically allocated and overwritten by each call
2203 to @code{ecvt}.
2204
If @var{value} is zero, it is implementation-defined whether @var{decpt}
is @code{0} or @code{1}.
2207
2208 For example: @code{ecvt (12.3, 5, &decpt, &neg)} returns @code{"12300"}
2209 and sets @var{decpt} to @code{2} and @var{neg} to @code{0}.
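
The digit string and @var{decpt} together are enough to rebuild a
conventional representation.  The following sketch produces scientific
notation; it assumes a finite @var{value} and @var{ndigit} of at least
2, and the name @code{format_scientific} is ours.  In most programs
@code{sprintf} with @samp{%e} is the simpler choice.

@smallexample
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <stdlib.h>

/* @r{Write @var{value} to @var{buf} in the form d.ddde+xx with}  */
/* @r{@var{ndigit} significant digits.  Returns what}  */
/* @r{@code{snprintf} returns.}  */
int
format_scientific (double value, int ndigit, char *buf, size_t len)
@{
  int decpt, neg;
  const char *digits = ecvt (value, ndigit, &decpt, &neg);

  /* @r{The decimal point goes after the first digit, so the}  */
  /* @r{exponent is one less than decpt.}  */
  return snprintf (buf, len, "%s%c.%se%+03d", neg ? "-" : "",
                   digits[0], digits + 1, decpt - 1);
@}
@end smallexample

@noindent
With the arguments of the example above, @code{12.3} and @code{5}, the
buffer receives @code{"1.2300e+01"}.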
2210 @end deftypefun
2211
2212 @comment stdlib.h
2213 @comment SVID, Unix98
@deftypefun {char *} fcvt (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
2215 The function @code{fcvt} is like @code{ecvt}, but @var{ndigit} specifies
the number of digits after the decimal point.  If @var{ndigit} is less
than zero, @var{value} is rounded to the @math{-@var{ndigit}+1}'th place
to the left of the decimal point.  For example, if @var{ndigit} is
@code{-1}, @var{value} will be rounded to the nearest 10.  If
@var{ndigit} is negative and its magnitude is larger than the number of
digits to the left of the decimal point in @var{value}, @var{value} will
be rounded to one significant digit.
2222
2223 The returned string is statically allocated and overwritten by each call
2224 to @code{fcvt}.
2225 @end deftypefun
2226
2227 @comment stdlib.h
2228 @comment SVID, Unix98
2229 @deftypefun {char *} gcvt (double @var{value}, int @var{ndigit}, char *@var{buf})
@code{gcvt} is functionally equivalent to @samp{sprintf(buf, "%.*g",
ndigit, value)}.  It is provided only for compatibility's sake.  It
returns @var{buf}.
2233 @end deftypefun
2234
2235 As extensions, the GNU C library provides versions of these three
2236 functions that take @code{long double} arguments.
2237
2238 @comment stdlib.h
2239 @comment GNU
2240 @deftypefun {char *} qecvt (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
2241 This function is equivalent to @code{ecvt} except that it
2242 takes a @code{long double} for the first parameter.
2243 @end deftypefun
2244
2245 @comment stdlib.h
2246 @comment GNU
@deftypefun {char *} qfcvt (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
2248 This function is equivalent to @code{fcvt} except that it
2249 takes a @code{long double} for the first parameter.
2250 @end deftypefun
2251
2252 @comment stdlib.h
2253 @comment GNU
2254 @deftypefun {char *} qgcvt (long double @var{value}, int @var{ndigit}, char *@var{buf})
2255 This function is equivalent to @code{gcvt} except that it
2256 takes a @code{long double} for the first parameter.
2257 @end deftypefun
2258
2259
2260 @cindex gcvt_r
2261 The @code{ecvt} and @code{fcvt} functions, and their @code{long double}
2262 equivalents, all return a string located in a static buffer which is
2263 overwritten by the next call to the function.  The GNU C library
2264 provides another set of extended functions which write the converted
2265 string into a user-supplied buffer.  These have the conventional
2266 @code{_r} suffix.
2267
2268 @code{gcvt_r} is not necessary, because @code{gcvt} already uses a
2269 user-supplied buffer.
2270
2271 @comment stdlib.h
2272 @comment GNU
@deftypefun int ecvt_r (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
The @code{ecvt_r} function is the same as @code{ecvt}, except
that it places its result into the user-specified buffer pointed to by
@var{buf}, with length @var{len}.  The return value is @code{-1} in
case of an error and zero otherwise.
2277
2278 This function is a GNU extension.
2279 @end deftypefun
2280
2281 @comment stdlib.h
@comment GNU
@deftypefun int fcvt_r (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
The @code{fcvt_r} function is the same as @code{fcvt}, except
that it places its result into the user-specified buffer pointed to by
@var{buf}, with length @var{len}.  The return value is @code{-1} in
case of an error and zero otherwise.
2287
2288 This function is a GNU extension.
2289 @end deftypefun
2290
2291 @comment stdlib.h
2292 @comment GNU
@deftypefun int qecvt_r (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
The @code{qecvt_r} function is the same as @code{qecvt}, except
that it places its result into the user-specified buffer pointed to by
@var{buf}, with length @var{len}.  The return value is @code{-1} in
case of an error and zero otherwise.
2297
2298 This function is a GNU extension.
2299 @end deftypefun
2300
2301 @comment stdlib.h
2302 @comment GNU
@deftypefun int qfcvt_r (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
The @code{qfcvt_r} function is the same as @code{qfcvt}, except
that it places its result into the user-specified buffer pointed to by
@var{buf}, with length @var{len}.  The return value is @code{-1} in
case of an error and zero otherwise.
2307
2308 This function is a GNU extension.
2309 @end deftypefun