man/mawk.1

   1 .TH MAWK 1  "Dec 22 1994" "Version 1.2" "USER COMMANDS"
   2 .\" strings
   3 .ds ex \fIexpr\fR
   4 .SH NAME
   5 mawk \- pattern scanning and text processing language
   6 .SH SYNOPSIS
   7 .B mawk
   8 [\-\fBW
   9 .IR option ]
  10 [\-\fBF
  11 .IR value ]
  12 [\-\fBv
  13 .IR var=value ]
  14 [\-\|\-] 'program text' [file ...]
  15 .br
  16 .B mawk
  17 [\-\fBW
  18 .IR option ]
  19 [\-\fBF
  20 .IR value ]
  21 [\-\fBv
  22 .IR var=value ]
  23 [\-\fBf
  24 .IR program-file ]
  25 [\-\|\-] [file ...]
  26 .SH DESCRIPTION
  27 .B mawk
  28 is an interpreter for the AWK Programming Language.
  29 The AWK language
  30 is useful for manipulation of data files,
  31 text retrieval and processing,
  32 and for prototyping and experimenting with algorithms.
  33 .B mawk
  34 is a \fInew awk\fR, meaning it implements the AWK language as
  35 defined in Aho, Kernighan and Weinberger,
  36 .I "The AWK Programming Language,"
  37 Addison-Wesley Publishing, 1988.  (Hereafter referred to as
  38 the AWK book.)
  39 .B mawk
  40 conforms to the Posix 1003.2
  41 (draft 11.3)
  42 definition of the AWK language
  43 which contains a few features not described in the AWK
  44 book,  and
  45 .B mawk
  46 provides a small number of extensions.
  47 .PP
  48 An AWK program is a sequence of \fIpattern {action}\fR pairs and
  49 function definitions.
  50 Short programs are entered on the command line
  51 usually enclosed in ' ' to avoid shell
  52 interpretation.
  53 Longer programs can be read in from a
  54 file with the \-f option.
  55 Data  input is read from the list of files on
  56 the command line or from standard input when the list is empty.
  57 The input is broken into records as determined by the
  58 record separator variable, \fBRS\fR.  Initially,
  59 .B RS
  60 = "\en" and records are synonymous with lines.
  61 Each record is compared against each
  62 .I pattern
  63 and if it matches, the program text for
  64 .I "{action}"
  65 is executed.
  66 .SH OPTIONS
  67 .TP \w'\-\fBW'u+\w'\fRsprintf=\fInum\fR'u+2n
  68 \-\fBF \fIvalue\fP
  69 sets the field separator, \fBFS\fR, to
  70 .IR value .
  71 .TP
  72 \-\fBf \fIfile
  73 Program text is read from \fIfile\fR instead of from the
  74 command line.  Multiple
  75 .B \-f
  76 options are allowed.
  77 .TP
  78 \-\fBv \fIvar=value\fR
  79 assigns
  80 .I value
  81 to program variable
  82 .IR var .
  83 .TP
  84 \-\|\-
  85 indicates the unambiguous end of options.
  86 .PP
  87 The above options will be available with any Posix compatible
  88 implementation of AWK, and implementation specific options are
  89 prefaced with
  90 .BR \-W .
  91 .B mawk
  92 provides six:
  93 .TP \w'\-\fBW'u+\w'\fRsprintf=\fInum\fR'u+2n
  94 \-\fBW \fRversion
  95 .B mawk
  96 writes its version and copyright
  97 to stdout and compiled limits to
  98 stderr and exits 0.
  99 .TP
 100 \-\fBW \fRdump
 101 writes an assembler like listing of the internal
 102 representation of the program to stdout and exits 0
 103 (on successful compilation).
 104 .TP
 105 \-\fBW \fRinteractive
 106 sets unbuffered writes to stdout and line buffered reads from stdin.
 107 Records from stdin are lines regardless of the value of
 108 .BR RS .
 109 .TP
 110 \-\fBW \fRexec \fIfile
 111 Program text is read from
 112 .I file
 113 and this is the last option. Useful on systems that support the
 114 .B #!
 115 "magic number" convention for executable scripts.
 116 .TP
 117 \-\fBW \fRsprintf=\fInum\fR
 118 adjusts the size of
 119 .B mawk's
 120 internal sprintf buffer to
 121 .I num
 122 bytes.  More than rare use of this option indicates
 123 .B mawk
 124 should be recompiled.
 125 .TP
 126 \-\fBW \fRposix_space
 127 forces
 128 .B mawk
 129 not to consider '\en' to be space.
 130 .PP
 131 The short forms
 132 .BR \-W [vdiesp]
 133 are recognized and on some systems \fB\-W\fRe is mandatory to avoid
 134 command line length limitations.
 135 .SH "THE AWK LANGUAGE"
 136 .SS "\fB1. Program structure"
 137 An AWK program is a sequence of
 138 .I "pattern {action}"
 139 pairs and user
 140 function definitions.
 141 .PP
 142 A pattern can be:
 143 .nf
 144 .RS
 145 \fBBEGIN
 146 END\fR
 147 expression
 148 expression , expression
 149 .sp
 150 .RE
 151 .fi
 152 One, but not both,
 153 of \fIpattern {action}\fR can be omitted.   If
 154 .I {action}
 155 is omitted it is implicitly { print }.  If
 156 .I pattern
 157 is omitted, then it is implicitly matched.
 158 .B BEGIN
 159 and
 160 .B END
 161 patterns require an action.
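.PP
As a brief sketch of these rules (the sample input is made up for
illustration), a pattern with no action prints the matching records,
and an action with no pattern runs on every record:
.nf
.sp
        echo hello | mawk '/ell/'
        hello
        echo hello | mawk '{ print length($0) }'
        5
.sp
.fi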
 162 .PP
 163 Statements are terminated by newlines, semi-colons or both.
 164 Groups of statements such as
 165 actions or loop bodies are blocked via { ... } as in C.  The
 166 last statement in a block doesn't need a terminator.  Blank lines
 167 have no meaning; an empty statement is terminated with a
 168 semi-colon. Long statements
 169 can be continued with a backslash, \e\|.  A statement can be broken
 170 without a backslash after a comma, left brace, &&, ||,
 171 .BR do ,
 172 .BR else  ,
 173 the right parenthesis of an
 174 .BR if ,
 175 .B while
 176 or
 177 .B for
 178 statement, and the
 179 right parenthesis of a function definition.
 180 A comment starts with # and extends to, but does not include,
 181 the end of line.
 182 .PP
 183 The following statements control program flow inside blocks.
 184 .RS
 185 .PP
 186 .B if
 187 ( \*(ex )
 188 .I statement
 189 .PP
 190 .B if
 191 ( \*(ex )
 192 .I statement
 193 .B else
 194 .I statement
 195 .PP
 196 .B while
 197 ( \*(ex )
 198 .I statement
 199 .PP
 200 .B do
 201 .I statement
 202 .B while
 203 ( \*(ex )
 204 .PP
 205 .B for
 206 (
 207 \fIopt_expr\fR ;
 208 \fIopt_expr\fR ;
 209 \fIopt_expr\fR
 210 )
 211 .I statement
 212 .PP
 213 .B for
 214 ( \fIvar \fBin \fIarray\fR )
 215 .I statement
 216 .PP
 217 .B continue
 218 .PP
 219 .B break
 220 .RE
 221 .\"
 222 .SS "\fB2. Data types, conversion and comparison"
 223 There are two basic data types, numeric and string.
 224 Numeric constants can be integer like \-2,
 225 decimal like 1.08, or in scientific notation like
 226 \-1.1e4 or .28E\-3.  All numbers are represented internally and all
 227 computations are done in floating point arithmetic.
 228 So for example, the expression
 229 0.2e2 == 20
 230 is true and true is represented as 1.0.
 231 .PP
 232 String constants are enclosed in double quotes.
 233 .sp
 234 .ce
 235 "This is a string with a newline at the end.\en"
 236 .sp
 237 Strings can be continued across a line by escaping (\e) the newline.
 238 The following escape sequences are recognized.
 239 .nf
 240 .sp
 241         \e\e            \e
 242         \e"             "
 243         \ea             alert, ascii 7
 244         \eb             backspace, ascii 8
 245         \et             tab, ascii 9
 246         \en             newline, ascii 10
 247         \ev             vertical tab, ascii 11
 248         \ef             formfeed, ascii 12
 249         \er             carriage return, ascii 13
 250         \eddd           1, 2 or 3 octal digits for ascii ddd
 251         \exhh           1 or 2 hex digits for ascii  hh
 252 .sp
 253 .fi
 254 If you escape any other character \ec, you get \ec, i.e.,
 255 .B mawk
 256 ignores the escape.
 257 .PP
 258 There are really three basic data types; the third is
 259 .I "number and string"
 260 which has both a numeric value and a string value
 261 at the same time.
 262 User defined variables come into existence when first referenced
 263 and are initialized to
 264 .IR null ,
 265 a number and string value which has numeric value 0 and string value
 266 "".
 267 Non-trivial number and string typed data come from input
 268 and are typically stored in fields.  (See section 4).
 269 .PP
 270 The type of an expression is determined by its context and automatic
 271 type conversion occurs if needed.  For example, consider the
 272 statements
 273 .nf
 274 .sp
 275         y = x + 2  ;  z = x  "hello"
 276 .sp
 277 .fi
 278 The value stored in variable y will be typed numeric.
 279 If x is not numeric,
 280 the value read from x is converted to numeric before it is added to
 281 2 and stored in y.  The value stored in variable z will be typed
 282 string, and the value of x will be converted to string if necessary
 283 and concatenated with "hello".  (Of course, the value and type
 284 stored in x is not changed by any conversions.)
 285 A string expression is converted to numeric using its longest
 286 numeric prefix as with
 287 .IR atof (3).
 288 A numeric expression is converted to string by replacing
 289 .I expr
 290 with
 291 .BR sprintf(CONVFMT ,
 292 .IR expr ),
 293 unless
 294 .I expr
 295 can be represented on the host machine as an exact integer, in which case
 296 it is converted to \fBsprintf\fR("%d", \*(ex).
 297 .B Sprintf()
 298 is an AWK built-in that duplicates the functionality of
 299 .IR sprintf (3),
 300 and
 301 .B CONVFMT
 302 is a built-in variable used for internal conversion
 303 from number to string and initialized to "%.6g".
 304 Explicit type conversions can be forced:
 305 \*(ex ""
 306 is string and
 307 .IR  expr +0
 308 is numeric.
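.PP
The conversions above can be checked with a short sketch
(the variable x and its value "3x" are chosen only for illustration):
.nf
.sp
        mawk 'BEGIN { x = "3x" ; print x + 2, x "hello", 0.2e2 == 20 }'
        5 3xhello 1
.sp
.fi
Here "3x" converts to 3 by its longest numeric prefix, the sum 5 is an
exact integer so it is printed as with "%d", and the true comparison
yields 1.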
 309 .PP
 310 To evaluate,
 311 \*(ex\d1\u \fBrel-op \*(ex\d2\u,
 312 if both operands are numeric or number and string then the comparison
 313 is numeric; if both operands are string the comparison is string;
 314 if one operand is string, the non-string operand is converted and
 315 the comparison is string.  The result is numeric, 1 or 0.
 316 .PP
 317 In boolean contexts such as,
 318 \fBif\fR ( \*(ex ) \fIstatement\fR,
 319 a string expression evaluates true if and only if it is not the
 320 empty string "";
 321 numeric values if and only if not numerically zero.
 322 .\"
 323 .SS "\fB3. Regular expressions"
 324 In the AWK language, records, fields and strings are often
 325 tested for matching a
 326 .IR "regular expression" .
 327 Regular expressions are enclosed in slashes, and
 328 .nf
 329 .sp
 330         \*(ex ~ /\fIr\fR/
 331 .sp
 332 .fi
 333 is an AWK expression that evaluates to 1 if \*(ex "matches"
 334 .IR r ,
 335 which means a substring of \*(ex is in the set of strings
 336 defined by
 337 .IR r .
 338 With no match the expression evaluates to 0; replacing
 339 ~ with the "not match" operator, !~ , reverses the meaning.
 340 As  pattern-action pairs,
 341 .nf
 342 .sp
 343         /\fIr\fR/ { \fIaction\fR }   and\
 344    \fB$0\fR ~ /\fIr\fR/ { \fIaction\fR }
 345 .sp
 346 .fi
 347 are the same,
 348 and for each input record that matches
 349 .IR r ,
 350 .I action
 351 is executed.
 352 In fact, /\fIr\fR/ is an AWK expression that is
 353 equivalent to (\fB$0\fR ~ /\fIr\fR/) anywhere except when on the
 354 right side of a match operator or passed as an argument to
 355 a built-in function that expects a regular expression
 356 argument.
 357 .PP
 358 AWK uses extended regular expressions as with
 359 .IR egrep (1).
 360 The regular expression metacharacters, i.e., those with special
 361 meaning in regular expressions are
 362 .nf
 363 .sp
 364         \e ^ $ . [ ] | ( ) * + ?
 365 .sp
 366 .fi
 367 Regular expressions are built up from characters as follows:
 368 .RS
 369 .TP \w'[^c\d1\uc\d2\uc\d3\u...]'u+1n
 370 \fIc\fR
 371 matches any non-metacharacter
 372 .IR c .
 373 .TP
 374 \e\fIc\fR
 375 matches a character defined by the same escape sequences used
 376 in string constants or the literal
 377 character
 378 .I c
 379 if
 380 \e\fIc\fR
 381 is not an escape sequence.
 382 .TP
 383 \&\.
 384 matches any character (including newline).
 385 .TP
 386 ^
 387 matches the front of a string.
 388 .TP
 389 $
 390 matches the back of a string.
 391 .TP
 392 [c\d1\uc\d2\uc\d3\u...]
 393 matches any character in the class
 394 c\d1\uc\d2\uc\d3\u... .  An interval of characters is denoted
 395 c\d1\u\-c\d2\u inside a class [...].
 396 .TP
 397 [^c\d1\uc\d2\uc\d3\u...]
 398 matches any character not in the class
 399 c\d1\uc\d2\uc\d3\u...
 400 .RE
 401 .sp
 402 Regular expressions are built up from other regular expressions
 403 as follows:
 404 .RS
 405 .TP \w'[^c\d1\uc\d2\uc\d3\u...]'u+1n
 406 \fIr\fR\d1\u\fIr\fR\d2\u
 407 matches
 408 \fIr\fR\d1\u
 409 followed immediately by
 410 \fIr\fR\d2\u
 411 (concatenation).
 412 .TP
 413 \fIr\fR\d1\u | \fIr\fR\d2\u
 414 matches
 415 \fIr\fR\d1\u or
 416 \fIr\fR\d2\u
 417 (alternation).
 418 .TP
 419 \fIr\fR*
 420 matches \fIr\fR repeated zero or more times.
 421 .TP
 422 \fIr\fR+
 423 matches \fIr\fR repeated one or more times.
 424 .TP
 425 \fIr\fR?
 426 matches \fIr\fR zero times or once.
 427 .TP
 428 (\fIr\fR)
 429 matches \fIr\fR, providing grouping.
 430 .RE
 431 .sp
 432 The increasing precedence of operators is alternation,
 433 concatenation and
 434 unary (*, + or ?).
 435 .PP
 436 For example,
 437 .nf
 438 .sp
 439         /^[_a\-zA\-Z][_a\-zA\-Z0\-9]*$/  and
 440         /^[\-+]?([0\-9]+\e\|.?|\e\|.[0\-9])[0\-9]*([eE][\-+]?[0\-9]+)?$/
 441 .sp
 442 .fi
 443 are matched by AWK identifiers and AWK numeric constants
 444 respectively.  Note that . has to be escaped to be
 445 recognized as a decimal point, and that metacharacters are not
 446 special inside character classes.
 447 .PP
 448 Any expression can be used on the right hand side of the ~ or !~
 449 operators or
 450 passed to a built-in that expects
 451 a regular expression.
 452 If needed, it is converted to string, and then interpreted
 453 as a regular expression.  For example,
 454 .nf
 455 .sp
 456         BEGIN { identifier = "[_a\-zA\-Z][_a\-zA\-Z0\-9]*" }
 457
 458         $0 ~ "^" identifier
 459 .sp
 460 .fi
 461 prints all lines that start with an AWK identifier.
 462 .PP
 463 .B mawk
 464 recognizes the empty regular expression, //\|, which matches the
 465 empty string and hence is matched by any string at the front,
 466 back and between every character.  For example,
 467 .nf
 468 .sp
 469         echo  abc | mawk '{ gsub(//, "X") ; print }'
 470         XaXbXcX
 471 .sp
 472 .fi
 473 .\"
 474 .SS "\fB4. Records and fields"
 475 Records are read in one at a time, and stored in the
 476 .I field
 477 variable
 478 .BR $0 .
 479 The record is split into
 480 .I fields
 481 which are stored in
 482 .BR $1 ,
 483 .BR $2 ", ...,"
 484 .BR $NF .
 485 The built-in variable
 486 .B NF
 487 is set to the number of fields,
 488 and
 489 .B NR
 490 and
 491 .B FNR
 492 are incremented by 1.
 493 Fields above
 494 .B $NF
 495 are set to "".
 496 .PP
 497 Assignment to
 498 .B $0
 499 causes the fields and
 500 .B NF
 501 to be recomputed.
 502 Assignment to
 503 .B NF
 504 or to a field
 505 causes
 506 .B $0
 507 to be reconstructed by
 508 concatenating the
 509 .B $i's
 510 separated by
 511 .BR OFS .
 512 Assignment to a field with index greater than
 513 .B NF
 514 increases
 515 .B NF
 516 and causes
 517 .B $0
 518 to be reconstructed.
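.PP
A minimal sketch of these rules, assuming the default FS and an
illustrative input line "a b c":
.nf
.sp
        echo a b c | mawk '{ $5 = "e" ; print NF ; OFS = ":" ; $1 = $1 ; print }'
        5
        a:b:c::e
.sp
.fi
Assigning to $5 raises NF to 5 and leaves $4 empty; the later
assignment to $1 rebuilds $0 with the new OFS.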
 519 .PP
 520 Data input stored in fields
 521 is string, unless the entire field has numeric
 522 form, in which case the type is number and string.
 523 For example,
 524 .sp
 525 .nf
 526         echo 24 24E |
 527         mawk '{ print($1>100, $1>"100", $2>100, $2>"100") }'
 528         0 1 1 1
 529 .fi
 530 .sp
 531 .B $0
 532 and
 533 .B $2
 534 are string and
 535 .B $1
 536 is number and string.  The first comparison is numeric,
 537 the second is string, the third is string
 538 (100 is converted to "100"),
 539 and the last is string.
 540 .\"
 541 .SS "\fB5. Expressions and operators"
 542 .PP
 543 The expression syntax is
 544 similar to C.  Primary expressions are numeric constants,
 545 string constants, variables, fields, arrays and function calls.
 546 The identifier
 547 for a variable, array or function can be a sequence of
 548 letters, digits and underscores, that does
 549 not start with a digit.
 550 Variables are not declared; they exist when first referenced and
 551 are initialized to
 552 .IR null .
 553 .PP
 554 New
 555 expressions are composed with the following operators in
 556 order of increasing precedence.
 557 .PP
 558 .RS
 559 .nf
 560 .vs +2p  \"  open up a little
 561 \fIassignment\fR                =  +=  \-=  *=  /=  %=  ^=
 562 \fIconditional\fR               ?  :
 563 \fIlogical or\fR                ||
 564 \fIlogical and\fR               &&
 565 \fIarray membership\fR  \fBin
 566 \fImatching\fR          ~   !~
 567 \fIrelational\fR                <  >   <=  >=  ==  !=
 568 \fIconcatenation\fR             (no explicit operator)
 569 \fIadd ops\fR                   +  \-
 570 \fImul ops\fR                   *  /  %
 571 \fIunary\fR                     +  \-
 572 \fIlogical not\fR               !
 573 \fIexponentiation\fR            ^
 574 \fIinc and dec\fR               ++ \-\|\- (both post and pre)
 575 \fIfield\fR                     $
 576 .vs
 577 .RE
 578 .PP
 579 .fi
 580 Assignment, conditional and exponentiation associate right to
 581 left; the other operators associate left to right.  Any
 582 expression can be parenthesized.
 583 .\"
 584 .SS "\fB6. Arrays"
 585 .ds ae \fIarray\fR[\fIexpr\fR]
 586 Awk provides one-dimensional arrays.  Array elements are expressed
 587 as \*(ae.
 588 .I Expr
 589 is internally converted to string type, so, for example,
 590 A[1] and A["1"] are the same element and the actual
 591 index is "1".
 592 Arrays indexed by strings are called associative arrays.
 593 Initially an array is empty; elements exist when first accessed.
 594 An expression,
 595 \fIexpr\fB in\fI array\fR
 596 evaluates to 1 if
 597 \*(ae
 598 exists, else to 0.
 599 .PP
 600 There is a form of the
 601 .B for
 602 statement that loops over each index of an array.
 603 .nf
 604 .sp
 605         \fBfor\fR ( \fIvar\fB in \fIarray \fR) \fIstatement\fR
 606 .sp
 607 .fi
 608 sets
 609 .I var
 610 to each index of
 611 .I array
 612 and executes
 613 .IR statement .
 614 The order that
 615 .I var
 616 traverses the indices of
 617 .I array
 618 is not defined.
 619 .PP
 620 The statement,
 621 .B delete
 622 \*(ae,
 623 causes
 624 \*(ae
 625 not to exist.
 626 .B mawk
 627 supports an extension,
 628 .B delete
 629 .IR array ,
 630 which deletes all elements of
 631 .IR array .
 632 .PP
 633 Multidimensional arrays are synthesized with concatenation using
 634 the built-in variable
 635 .BR SUBSEP .
 636 \fIarray\fR[\fIexpr\fR\d1\u,\|\fIexpr\fR\d2\u]
 637 is equivalent to
 638 \fIarray\fR[\fIexpr\fR\d1\u \fBSUBSEP \fIexpr\fR\d2\u].
 639 Testing for a multidimensional element uses a parenthesized index,
 640 such as
 641 .sp
 642 .nf
 643         if ( (i, j) in A )  print A[i, j]
 644 .fi
 645 .sp
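.PP
The array rules above can be sketched as follows
(the array A and its indices are illustrative only):
.nf
.sp
        mawk 'BEGIN { A[1] = "x" ; A[3, 4] = "y"
                      a = ("1" in A) ; b = (2 in A) ; c = ((3, 4) in A)
                      delete A[1] ; d = (1 in A)
                      print a, b, c, d }'
        1 0 1 0
.sp
.fi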
 646 .\"
 647 .SS "\fB7. Built-in variables\fR"
 648 .PP
 649 The following variables are built-in and initialized before program
 650 execution.
 651 .RS
 652 .TP \w'FILENAME'u+2n
 653 .B ARGC
 654 number of command line arguments.
 655 .TP
 656 .B ARGV
 657 array of command line arguments, 0..ARGC-1.
 658 .TP
 659 .B CONVFMT
 660 format for internal conversion of numbers to string,
 661 initially = "%.6g".
 662 .TP
 663 .B ENVIRON
 664 array indexed by environment variables.  An environment string,
 665 \fIvar=value\fR is stored as
 666 \fBENVIRON\fR[\fIvar\fR] =
 667 .IR value .
 668 .TP
 669 .B FILENAME
 670 name of the current input file.
 671 .TP
 672 .B FNR
 673 current record number in
 674 .BR FILENAME .
 675 .TP
 676 .B FS
 677 splits records into fields as a regular expression.
 678 .TP
 679 .B NF
 680 number of fields in the current record.
 681 .TP
 682 .B NR
 683 current record number in the total input stream.
 684 .TP
 685 .B OFMT
 686 format for printing numbers; initially = "%.6g".
 687 .TP
 688 .B OFS
 689 inserted between fields on output, initially = " ".
 690 .TP
 691 .B   ORS
 692 terminates each record on output, initially = "\en".
 693 .TP
 694 .B    RLENGTH
 695 length set by the last call to the built-in function,
 696 .BR match() .
 697 .TP
 698 .B   RS
 699 input record separator, initially = "\en".
 700 .TP
 701 .B  RSTART
 702 index set by the last call to
 703 .BR match() .
 704 .TP
 705 .B SUBSEP
 706 used to build multiple array subscripts, initially = "\e034".
 707 .RE
 708 .\"
 709 .SS "\fB8. Built-in functions"
 710 String functions
 711 .RS
 712 .TP
 713 gsub(\fIr,s,t\fR)  gsub(\fIr,s\fR)
 714 Global substitution, every match of regular expression
 715 .I r
 716 in variable
 717 .I t
 718 is replaced by string
 719 .IR s .
 720 The number of replacements is returned.
 721 If
 722 .I t
 723 is omitted,
 724 .B $0
 725 is used.  An & in the replacement string
 726 .I s
 727 is replaced by the matched substring of
 728 .IR t .
 729 \e& and \e\e put  literal & and \e, respectively,
 730 in the replacement string.
 731 .TP
 732 index(\fIs,t\fR)
 733 If
 734 .I t
 735 is a substring of
 736 .IR s ,
 737 then the position where
 738 .I t
 739 starts is returned, else 0 is returned.
 740 The first character of
 741 .I s
 742 is in position 1.
 743 .TP
 744 length(\fIs\fR)
 745 Returns the length of string
 746 .IR s .
 747 .TP
 748 match(\fIs,r\fR)
 749 Returns the index of the first longest match of regular expression
 750 .I r
 751 in string
 752 .IR s .
 753 Returns 0 if no match.
 754 As a side effect,
 755 .B RSTART
 756 is set to the return value.
 757 .B RLENGTH
 758 is set to the length of the match or \-1 if no match.  If the
 759 empty string is matched,
 760 .B RLENGTH
 761 is set to 0, and 1 is returned if the match is at the front, and
 762 length(\fIs\fR)+1 is returned if the match is at the back.
 763 .TP
 764 split(\fIs,A,r\fR)  split(\fIs,A\fR)
 765 String
 766 .I s
 767 is split into fields by regular expression
 768 .I  r
 769 and the fields are loaded into array
 770 .IR A .
 771 The number of fields
 772 is returned.  See section 11 below for more detail.
 773 If
 774 .I r
 775 is omitted,
 776 .B FS
 777 is used.
 778 .TP
 779 sprintf(\fIformat,expr-list\fR)
 780 Returns a string constructed from
 781 .I expr-list
 782 according to
 783 .IR format .
 784 See the description of printf() below.
 785 .TP
 786 sub(\fIr,s,t\fR)  sub(\fIr,s\fR)
 787 Single substitution, same as gsub() except at most one substitution.
 788 .TP
 789 substr(\fIs,i,n\fR)  substr(\fIs,i\fR)
 790 Returns the substring of string
 791 .IR s ,
 792 starting at index
 793 .IR i ,
 794 of length
 795 .IR n .
 796 If
 797 .I n
 798 is omitted, the suffix of
 799 .IR s ,
 800 starting at
 801 .I i
 802 is returned.
 803 .TP
 804 tolower(\fIs\fR)
 805 Returns a copy of
 806 .I s
 807 with all upper case characters converted to lower case.
 808 .TP
 809 toupper(\fIs\fR)
 810 Returns a copy of
 811 .I s
 812 with all lower case characters converted to upper case.
 813 .RE
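.PP
A short sketch exercising several of the string functions
(the string s is made up for illustration):
.nf
.sp
        mawk 'BEGIN { s = "hello world"
                      print index(s, "wor"), length(s), substr(s, 7)
                      n = gsub(/o/, "0", s) ; print n, s
                      print match(s, /w.r/), RSTART, RLENGTH }'
        7 11 world
        2 hell0 w0rld
        7 7 3
.sp
.fi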
 814 .PP
 815 Arithmetic functions
 816 .RS
 817 .PP
 818 .nf
 819 .ie n \
 820 .ds Pi pi
 821 .el \
 822 .ds Pi \\(*p
 823 atan2(\fIy,x\fR)        Arctan of \fIy\fR/\fIx\fR between -\*(Pi and \*(Pi.
 824 .PP
 825 cos(\fIx\fR)            Cosine function, \fIx\fR in radians.
 826 .PP
 827 exp(\fIx\fR)            Exponential function.
 828 .PP
 829 int(\fIx\fR)            Returns \fIx\fR truncated towards zero.
 830 .PP
 831 log(\fIx\fR)            Natural logarithm.
 832 .PP
 833 rand()          Returns a random number between zero and one.
 834 .PP
 835 sin(\fIx\fR)            Sine function, \fIx\fR in radians.
 836 .PP
 837 sqrt(\fIx\fR)           Returns square root of \fIx\fR.
 838 .fi
 839 .TP
 840 srand(\fIexpr\fR)  srand()
 841 Seeds the random number generator, using the clock if
 842 .I expr
 843 is omitted, and returns the value of the previous seed.
 844 .B mawk
 845 seeds the random number generator from the clock at startup
 846 so there is no real need to call srand().  Srand(\fIexpr\fR)
 847 is useful for repeating pseudo random sequences.
 848 .RE
 849 .\"
 850 .SS "\fB9. Input and output"
 851 There are two output statements,
 852 .B print
 853 and
 854 .BR printf .
 855 .RS
 856 .TP
 857 print
 858 writes
 859 .B "$0  ORS"
 860 to standard output.
 861 .TP
 862 print \*(ex\d1\u, \*(ex\d2\u, ..., \*(ex\dn\u
 863 writes
 864 \*(ex\d1\u \fBOFS \*(ex\d2\u \fBOFS\fR ... \*(ex\dn\u
 865 .B ORS
 866 to standard output.  Numeric expressions are converted to
 867 string with
 868 .BR OFMT .
 869 .TP
 870 printf \fIformat, expr-list\fR
 871 duplicates the printf C library function writing to standard output.
 872 The complete ANSI C format specifications are recognized with
 873 conversions %c, %d, %e, %E, %f, %g, %G,
 874 %i, %o, %s, %u, %x, %X and %%,
 875 and conversion qualifiers h and l.
 876 .RE
 877 .PP
 878 The argument list to print or printf can optionally be enclosed in
 879 parentheses.
 880 Print formats numbers using
 881 .B OFMT
 882 or "%d" for exact integers.
 883 "%c" with a numeric argument prints the corresponding 8 bit
 884 character, with a string argument it prints the first character of
 885 the string.
 886 The output of print and printf can be redirected to a file or
 887 command by appending >
 888 .IR file ,
 889 >>
 890 .I file
 891 or
 892 |
 893 .I command
 894 to the end of the print statement.
 895 Redirection opens
 896 .I file
 897 or
 898 .I command
 899 only once; subsequent redirections append to the already open stream.
 900 By convention,
 901 .B mawk
 902 associates the filename "/dev/stderr" with stderr which allows
 903 print and printf to be redirected to stderr.
 904 .B mawk
 905 also associates "\-" and "/dev/stdout" with stdin and stdout which
 906 allows these streams to be passed to functions.
 907 .PP
 908 The input function
 909 .B getline
 910 has the following variations.
 911 .RS
 912 .TP
 913 getline
 914 reads into
 915 .BR $0 ,
 916 updates the fields,
 917 .BR NF ,
 918 .B  NR
 919 and
 920 .BR FNR .
 921 .TP
 922 getline < \fIfile\fR
 923 reads into
 924 .B $0
 925 from \fIfile\fR,
 926 updates the fields and
 927 .BR NF .
 928 .TP
 929 getline \fIvar
 930 reads the next record into
 931 .IR var ,
 932 updates
 933 .B NR
 934 and
 935 .BR FNR .
 936 .TP
 937 getline \fIvar\fR < \fIfile
 938 reads the next record of
 939 .I file
 940 into
 941 .IR var .
 942 .TP
 943 \fI command\fR | getline
 944 pipes a record from
 945 .I command
 946 into
 947 .B $0
 948 and updates the fields and
 949 .BR NF .
 950 .TP
 951 \fI command\fR | getline \fIvar
 952 pipes a record from
 953 .I command
 954 into
 955 .IR var .
 956 .RE
 957 .PP
 958 Getline returns 0 on end-of-file, \-1 on error, otherwise 1.
 959 .PP
 960 Commands on the end of pipes are executed by /bin/sh.
 961 .PP
 962 The function \fBclose\fR(\*(ex) closes the file or pipe
 963 associated with
 964 .IR expr .
 965 Close returns 0 if
 966 .I expr
 967 is an open file,
 968 the exit status if
 969 .I expr
 970 is a piped command, and \-1 otherwise.
 971 Close is used to reread a file or command, make sure the other
 972 end of an output pipe is finished or conserve file resources.
 973 .PP
 974 The function \fBfflush\fR(\*(ex) flushes the output file or pipe
 975 associated with
 976 .IR expr .
 977 Fflush returns 0 if
 978 .I expr
 979 is an open output stream else \-1.
 980 Fflush without an argument flushes stdout.
 981 Fflush with an empty argument ("") flushes all open output.
 982 .PP
 983 The function
 984 \fBsystem\fR(\fIexpr\fR)
 985 uses
 986 /bin/sh
 987 to execute
 988 .I expr
 989 and returns the exit status of the command
 990 .IR expr .
 991 Changes made to the
 992 .B ENVIRON
 993 array are not passed to commands executed with
 994 .B system
 995 or pipes.
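.PP
A brief sketch of a command pipe read with getline and then closed
(the command "echo hi" and the variable names are illustrative):
.nf
.sp
        mawk 'BEGIN { cmd = "echo hi"
                      while ( (cmd | getline line) > 0 ) n++
                      r = close(cmd) ; print n, line, r }'
        1 hi 0
.sp
.fi
The loop stops when getline returns 0 at end-of-file, and close()
reports the exit status of the command.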
 996 .SS "\fB10. User defined functions"
 997 The syntax for a user defined function is
 998 .nf
 999 .sp
1000         \fBfunction\fR name( \fIargs\fR ) { \fIstatements\fR }
1001 .sp
1002 .fi
1003 The function body can contain a return statement
1004 .nf
1005 .sp
1006         \fBreturn\fI opt_expr\fR
1007 .sp
1008 .fi
1009 A return statement is not required.
1010 Function calls may be nested or recursive.
1011 Functions are passed expressions by value
1012 and arrays by reference.
1013 Extra arguments serve as local variables
1014 and are initialized to
1015 .IR null .
1016 For example, csplit(\fIs,\|A\fR) puts each character of
1017 .I s
1018 into array
1019 .I A
1020 and returns the length of
1021 .IR s .
1022 .nf
1023 .sp
1024         function csplit(s, A,   n, i)
1025         {
1026           n = length(s)
1027           for( i = 1 ; i <= n ; i++ ) A[i] = substr(s, i, 1)
1028           return n
1029         }
1030 .sp
1031 .fi
1032 Putting extra space between passed arguments and local
1033 variables is conventional.
1034 Functions can be referenced before they are defined, but the
1035 function name and the '(' of the arguments must touch to
1036 avoid confusion with concatenation.
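.PP
As a usage sketch, csplit() as defined above could be called like this
(the argument "abc" is illustrative):
.nf
.sp
        mawk 'function csplit(s, A,   n, i)
              {
                n = length(s)
                for( i = 1 ; i <= n ; i++ ) A[i] = substr(s, i, 1)
                return n
              }
              BEGIN { n = csplit("abc", A) ; print n, A[1], A[3] }'
        3 a c
.sp
.fi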
1037 .\"
1038 .SS "\fB11. Splitting strings, records and files"
1039 Awk programs use the same algorithm to
1040 split strings into arrays with split(), and records into fields
1041 on
1042 .BR FS .
1043 .B mawk
1044 uses essentially the same algorithm to split files into
1045 records on
1046 .BR RS .
1047 .PP
1048 Split(\fIexpr,\|A,\|sep\fR) works as follows:
1049 .RS
1050 .TP
1051 (1)
1052 If
1053 .I sep
1054 is omitted, it is replaced by
1055 .BR FS .
1056 .I Sep
1057 can be an expression or regular expression.  If it is an
1058 expression of non-string type, it is converted to string.
1059 .TP
1060 (2)
1061 If
1062 .I sep
1063 = " " (a single space),
1064 then <SPACE> is trimmed from the front and back of
1065 .IR expr ,
1066 and
1067 .I sep
1068 becomes <SPACE>.
1069 .B mawk
1070 defines <SPACE> as the regular expression
1071 /[\ \et\en]+/.
1072 Otherwise
1073 .I sep
1074 is treated as a regular expression, except that meta-characters
1075 are ignored for a string of length 1,
1076 e.g.,
1077 split(x, A, "*") and split(x, A, /\e*/) are the same.
1078 .TP
1079 (3)
1080 If \*(ex is not string, it is converted to string.
1081 If \*(ex is then the empty string "", split() returns 0
1082 and
1083 .I A
1084 is set empty.
1085 Otherwise,
1086 all non-overlapping, non-null and longest matches of
1087 .I sep
1088 in
1089 .IR expr ,
1090 separate
1091 .I expr
1092 into fields which are loaded into
1093 .IR A .
1094 The fields are placed in
1095 A[1], A[2], ..., A[n] and split() returns n, the number
1096 of fields which is the number
1097 of matches plus one.
1098 Data placed in
1099 .I A
1100 that looks numeric is typed number and string.
1101 .RE
1102 .PP
1103 Splitting records into fields works the same except the
1104 pieces are loaded into
1105 .BR $1 ,
1106 \fB$2\fR,...,
1107 .BR $NF .
1108 If
1109 .B $0
1110 is empty,
1111 .B NF
1112 is set to 0 and all
1113 .B $i
1114 to "".
1115 .PP
1116 .B mawk
1117 splits files into records by the same algorithm, but with the
1118 slight difference that
1119 .B RS
1120 is really a terminator instead of a separator.
1121 (\fBORS\fR is really a terminator too).
1122 .RS
1123 .PP
1124 E.g., if
1125 .B FS
1126 = ":+" and
1127 .B $0
1128 = "a::b:" , then
1129 .B NF
1130 = 3 and
1131 .B $1
1132 = "a",
1133 .B $2
1134 = "b" and
1135 .B $3
1136 = "", but
1137 if "a::b:" is the contents of an input file and
1138 .B RS
1139 = ":+", then
1140 there are two records "a" and "b".
1141 .RE
1142 .PP
1143 .B RS
1144 = " " is not special.
1145 .PP
1146 If
1147 .B FS
1148 = "", then
1149 .B mawk
1150 breaks the record into individual characters, and, similarly,
1151 split(\fIs,A,\fR"") places the individual characters of
1152 .I s
1153 into
1154 .IR A .
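.PP
The separator rules can be sketched with split()
(the strings here are illustrative):
.nf
.sp
        mawk 'BEGIN { n = split("a::b:", A, ":+") ; print n, A[1], A[2]
                      m = split("  a  b ", B, " ") ; print m, B[1], B[2] }'
        3 a b
        2 a b
.sp
.fi
In the first call the trailing ":" yields an empty third field; in the
second, the single space separator trims the leading and trailing
blanks.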
.\"
.SS "\fB12. Multi-line records"
Since
.B mawk
interprets
.B RS
as a regular expression, multi-line
records are easy.  Setting
.B RS
= "\en\en+" makes one or more blank
lines separate records.  If
.B FS
= " " (the default), then single
newlines, by the rules for <SPACE> above, become space and
single newlines are field separators.
.RS
.PP
For example, if a file is "a\ b\enc\en\en",
.B RS
= "\en\en+" and
.B FS
= "\ ", then there is one record "a\ b\enc" with three
fields "a", "b" and "c".  Changing
.B FS
= "\en", gives two
fields "a b" and "c"; changing
.B FS
= "", gives five fields,
one for each character of the record.
.RE
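.PP
The first two field counts in this example can be reproduced with a
short program.
This is a sketch only; it assumes the input is exactly
"a\ b\enc\en\en" and uses split() with an explicit separator
in place of changing
.BR FS :
.nf
.sp
        BEGIN { RS = "\en\en+" }

        { print NF                     # default FS: 3
          print split($0, A, "\en")    # newline as separator: 2
        }
.sp
.fi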
.PP
If you want lines with spaces or tabs to be considered blank,
set
.B RS
= "\en([\ \et]*\en)+".
For compatibility with other awks, setting
.B RS
= "" has the same
effect as if blank lines are stripped from the
front and back of files and then records are determined as if
.B RS
= "\en\en+".
Posix requires that "\en" always separates fields when
.B RS
= "" regardless of the value of
.BR FS .
.B mawk
does not support this convention, because defining
"\en" as <SPACE> makes it unnecessary.
.\"
.PP
Most of the time when you change
.B RS
for multi-line records, you
will also want to change
.B ORS
to "\en\en" so the record spacing is preserved on output.
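.PP
For instance, the following sketch prints only the paragraphs that
contain the word "error" (a hypothetical search pattern) and keeps
one blank line after each:
.nf
.sp
        BEGIN { RS = "\en\en+" ; ORS = "\en\en" }

        /error/ { print }
.sp
.fi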
.\"
.SS "\fB13. Program execution"
This section describes the order of program execution.
First,
.B ARGC
is set to the total number of command line arguments passed to
the execution phase of the program.
.B ARGV[0]
is set to the name of the AWK interpreter and
\fBARGV[1]\fR ...
.B ARGV[ARGC\-1]
hold the remaining command line arguments exclusive of
options and program source.
For example, with
.nf
.sp
        mawk  \-f  prog  v=1  A  t=hello  B
.sp
.fi
.B ARGC
= 5 with
.B ARGV[0]
= "mawk",
.B ARGV[1]
= "v=1",
.B ARGV[2]
= "A",
.B ARGV[3]
= "t=hello" and
.B ARGV[4]
= "B".
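.PP
The argument vector can be inspected from a
.B BEGIN
action.
As a sketch, suppose the file prog in the command above holds:
.nf
.sp
        BEGIN { for (i = 0 ; i < ARGC ; i++)  print i, ARGV[i] }
.sp
.fi
Since this program consists entirely of a
.B BEGIN
block, A and B are never opened and need not exist.
The output is the five lines "0 mawk", "1 v=1", "2 A", "3 t=hello"
and "4 B".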
.PP
Next, each
.B BEGIN
block is executed in order.
If the program consists
entirely of
.B BEGIN
blocks, then execution terminates, else
an input stream is opened and execution continues.
If
.B ARGC
equals 1,
the input stream is set to stdin,
else the command line arguments
.BR ARGV[1]  " ..."
.B ARGV[ARGC\-1]
are examined for a file argument.
.PP
The command line arguments divide into three sets:
file arguments, assignment arguments and empty strings "".
An assignment has the form
\fIvar\fR=\fIstring\fR.
When an
.B ARGV[i]
is examined as a possible file argument,
if it is empty it is skipped;
if it is an assignment argument, the assignment to
.I var
takes place and
.B i
skips to the next argument;
else
.B ARGV[i]
is opened for input.
If it fails to open, execution terminates with exit code 2.
If no command line argument is a file argument, then input
comes from stdin.
Getline in a
.B BEGIN
action opens input.  "\-" as a file argument denotes stdin.
.PP
Once an input stream is open, each input record is tested
against each
.IR pattern ,
and if it matches, the associated
.I action
is executed.
An expression pattern matches if it is boolean true (see
the end of section 2).
A
.B BEGIN
pattern matches before any input has been read, and
an
.B END
pattern matches after all input has been read.
A range pattern,
\fIexpr\fR1,\|\fIexpr\fR2 ,
matches every record between the match of
.IR expr 1
and the match of
.IR expr 2
inclusively.
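.PP
As a sketch of a range pattern, the following program prints every
record from one matching "start" through the next one matching
"stop"; both words are hypothetical markers chosen for illustration:
.nf
.sp
        /start/, /stop/ { print NR ": " $0 }
.sp
.fi
With the five input lines "a", "start", "b", "stop" and "c", it
prints records 2 through 4.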
.PP
When end of file occurs on the input stream, the remaining
command line arguments are examined for a file argument, and
if there is one it is opened, else the
.B END
.I pattern
is considered matched
and all
.B END
.I actions
are executed.
.PP
In the example command line above, the assignment
v=1
takes place after the
.B BEGIN
.I actions
are executed, and
the data placed in
v
is typed number and string.
Input is then read from file A.
On end of file A,
t
is set to the string "hello",
and B is opened for input.
On end of file B, the
.B END
.I actions
are executed.
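.PP
The timing of the assignments can be observed with a sketch like
the following, assuming prog holds the three rules below and
A and B each contain at least one record:
.nf
.sp
        BEGIN { print "BEGIN: v=" v }

        FNR == 1 { print FILENAME ": v=" v ", t=" t }

        END { print "END: t=" t }
.sp
.fi
The output is "BEGIN: v=", "A: v=1, t=", "B: v=1, t=hello" and
"END: t=hello", one per line.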
.PP
Program flow at the
.I pattern
.I {action}
level can be changed with the
.nf
.sp
        \fBnext\fR
        \fBexit  \fIopt_expr\fR
.sp
.fi
statements.
A
.B next
statement
causes the next input record to be read and pattern testing
to restart with the first
.I "pattern {action}"
pair in the program.
An
.B  exit
statement
causes immediate execution of the
.B END
actions or program termination if there are none or
if the
.B exit
occurs in an
.B END
action.
The
.I opt_expr
sets the exit value of the program unless overridden by
a later
.B exit
or subsequent error.
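.PP
A minimal sketch combining both statements: it counts the
records that precede a line whose first field is "quit", ignoring
lines that start with "#".
The word "quit" and the exit value 3 are arbitrary choices for
illustration.
.nf
.sp
        /^#/ { next }               # skip comment lines
        $1 == "quit" { exit 3 }     # go straight to the END actions
        { n++ }

        END { print n + 0 }         # exit value remains 3
.sp
.fi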
.SH EXAMPLES
.nf
1. emulate cat.

        { print }

2. emulate wc.

        { chars += length($0) + 1  # add one for the \en
          words += NF
        }

        END{ print NR, words, chars }

3. count the number of unique "real words".

        BEGIN { FS = "[^A-Za-z]+" }

        { for(i = 1 ; i <= NF ; i++)  word[$i] = "" }

        END { delete word[""]
              for ( i in word )  cnt++
              print cnt
        }

.fi
4. sum the second field of
every record based on the first field.
.nf

        $1 ~ /credit\||\|gain/ { sum += $2 }
        $1 ~ /debit\||\|loss/  { sum \-= $2 }

        END { print sum }

5. sort a file, comparing as string

        { line[NR] = $0 "" }  # make sure of comparison type
                              # in case some lines look numeric

        END {  isort(line, NR)
          for(i = 1 ; i <= NR ; i++) print line[i]
        }

        #insertion sort of A[1..n]
        function isort( A, n,   i, j, hold)
        {
          for( i = 2 ; i <= n ; i++)
          {
            hold = A[j = i]
            while ( A[j\-1] > hold )
            { j\-\|\- ; A[j+1] = A[j] }
            A[j] = hold
          }
          # sentinel A[0] = "" will be created if needed
        }

.fi
.SH  "COMPATIBILITY ISSUES"
The Posix 1003.2 (draft 11.3) definition of the AWK language
is AWK as described in the AWK book with a few extensions
that appeared in System V Release 4 nawk. The extensions are:
.sp
.RS
New functions: toupper() and tolower().

New variables: ENVIRON[\|] and CONVFMT.

ANSI C conversion specifications for printf() and sprintf().

New command options:  \-v var=value, multiple \-f options and
implementation options as arguments to \-W.
.RE
.sp

Posix AWK is oriented to operate on files a line at
a time.
.B RS
can be changed from "\en" to another single character,
but it
is hard to find any use for this \(em there are no
examples in the AWK book.
By convention, \fBRS\fR = "" makes one or more blank lines
separate records, allowing multi-line records.  When
\fBRS\fR = "", "\en" is always a field separator
regardless of the value in
.BR FS .
.PP
.BR mawk ,
on the other hand,
allows
.B RS
to be a regular expression.
When "\en" appears in records, it is treated as space, and
.B FS
always determines fields.
1468 Removing the line at a time paradigm can make some programs
1469 simpler and can
1470 often improve performance.  For example,
1471 redoing example 3 from above,
1472 .nf
1473 .sp
1474         BEGIN { RS = "[^A-Za-z]+" }
1475
1476         { word[ $0 ] = "" }
1477
1478         END { delete  word[ "" ]
1479           for( i in word )  cnt++
1480           print cnt
1481         }
1482 .sp
1483 .fi
1484 counts the number of unique words by making each word a record.
1485 On moderate size files,
1486 .B mawk
1487 executes twice as fast, because of the simplified inner loop.
.PP
The following program replaces each comment by a single space in
a C program file,
.nf
.sp
        BEGIN {
          RS = "/\|\e*([^*]\||\|\e*+[^/*])*\e*+/"
                # comment is record separator
          ORS = " "
          getline  hold
        }

        { print hold ; hold = $0 }

        END { printf "%s" , hold }
.sp
.fi
Buffering one record is needed to avoid terminating the last
record with a space.
.PP
With
.BR mawk ,
the following are all equivalent,
.nf
.sp
        x ~ /a\e+b/    x ~ "a\e+b"     x ~ "a\e\e+b"
.sp
.fi
The strings get scanned twice, once as string and once as
regular expression.  On the string scan,
.B mawk
ignores the escape on non-escape characters while the AWK
book advocates
.I \ec
be recognized as
.I c
which necessitates the double escaping of meta-characters in
strings.
Posix explicitly declines to define the behavior, which passively
forces programs that must run under a variety of awks to use
the more portable but less readable double escape.
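.PP
The portable forms can be compared in a short sketch; the test
strings "a+b" and "aab" are chosen only for illustration:
.nf
.sp
        BEGIN {
          print ("a+b" ~ /a\e+b/)      # regular expression constant: 1
          print ("a+b" ~ "a\e\e+b")    # string with double escape: 1
          print ("aab" ~ "a\e\e+b")    # the + is literal here: 0
        }
.sp
.fi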
.PP
Posix AWK does not recognize "/dev/std{out,err}" or \ex hex escape
sequences in strings.  Unlike ANSI C,
.B mawk
limits the number of digits that follow \ex to two as the current
implementation only supports 8 bit characters.
The built-in
.B fflush
first appeared in a recent (1993) AT&T awk released to netlib, and is
not part of the Posix standard.  Aggregate deletion with
.B delete
.I array
is not part of the Posix standard.
.PP
Posix explicitly leaves the behavior of
.B FS
= "" undefined, and mentions splitting the record into characters as
a possible interpretation, but currently this use is not portable
across implementations.
.PP
Finally, here is how
.B mawk
handles exceptional cases not discussed in the
AWK book or the Posix draft.  It is unsafe to assume
consistency across awks and safe to skip to
the next section.
.PP
.RS
substr(s, i, n) returns the characters of s in the intersection
of the closed interval [1, length(s)] and the half-open interval
[i, i+n).  When this intersection is empty, the empty string is
returned; so substr("ABC", 1, 0) = "" and
substr("ABC", \-4, 6) = "A".
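.PP
These cases can be checked with a short sketch; the brackets only
make an empty result visible, and the third call is an extra
illustration of the same interval rule:
.nf
.sp
        BEGIN {
          print "[" substr("ABC", 1, 0) "]"      # []
          print "[" substr("ABC", \-4, 6) "]"     # [A]
          print "[" substr("ABC", 2, 10) "]"     # [BC]
        }
.sp
.fi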
.PP
Every string, including the empty string, matches the empty string
at the
front, so s ~ // and s ~ "" are always 1, as are match(s, //) and
match(s, "").  The last two set
.B RLENGTH
to 0.
.PP
index(s, t) is always the same as match(s, t1) where t1 is the
same as t with metacharacters escaped.  Hence consistency
with match requires that
index(s, "") always returns 1.
Also the condition, index(s,t) != 0 if and only if t is a substring
of s, requires index("","") = 1.
.PP
If getline encounters end of file, getline var leaves var
unchanged.  Similarly, on entry to the
.B END
actions,
.BR $0 ,
the fields and
.B NF
have their value unaltered from the last record.
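.PP
One consequence, sketched below, is that an
.B END
action can still see the last record under
.BR mawk :
.nf
.sp
        END { print NR ": " $0 }
.sp
.fi
For a three-line input whose last line is "c", this prints "3: c".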
.SH "SEE ALSO"
.IR egrep (1)
.PP
Aho, Kernighan and Weinberger,
.IR "The AWK Programming Language" ,
Addison-Wesley Publishing, 1988 (the AWK book),
defines the language, opening with a tutorial
and advancing to many interesting programs that delve into
issues of software design and analysis relevant to programming
in any language.
.PP
.IR "The GAWK Manual" ,
The Free Software Foundation, 1991, is a tutorial
and language reference
that does not attempt the depth of the AWK book
and assumes the reader may be a novice programmer.
The section on AWK arrays is excellent.  It also
discusses Posix requirements for AWK.
.SH BUGS
.B mawk
cannot handle ASCII NUL \e0 in the source or data files.  You
can output NUL using printf with %c, and any other 8 bit
character is acceptable input.
.PP
.B mawk
implements printf() and sprintf() using the C library functions,
printf and sprintf, so full ANSI compatibility requires an ANSI
C library.  In practice this means the h conversion qualifier may
not be available.  Also
.B mawk
inherits any bugs or limitations of the library functions.
.PP
Implementors of the AWK language have shown a consistent lack
of imagination when naming their programs.
.SH AUTHOR
Mike Brennan (brennan@whidbey.com).