From: Kibum Kim Date: Fri, 6 Jan 2012 15:50:00 +0000 (+0900) Subject: Git init X-Git-Tag: 2.0_alpha^0 X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=refs%2Fheads%2F1.0_post;p=external%2Fmawk.git Git init --- diff --git a/ACKNOWLEDGMENT b/ACKNOWLEDGMENT new file mode 100644 index 0000000..e01329b --- /dev/null +++ b/ACKNOWLEDGMENT @@ -0,0 +1,47 @@ +Version 1.2 +=========== + +Thanks for help with beta test to Bill Davidsen, Tom Dickey, Ed +Ferguson, Jack Fitts, Onno van der Linden, Carl Mascott, Jean-Pierre +Radley, John Roll, Ian Searle, Bob Stockler. + +The calendar program examples/hical was written by Bob Stockler. + +Darrel Hankerson ported versions 1.2.x to DOS/OS2. + +Version 1.0 and 1.1 +=================== + +Carl Mascott ported mawk to V7 and in the process rooted out +some subtle (and not so subtle) bugs. + +Ian Searle ported mawk to System V and put up with my insane +attempts to get fpe exception trapping off. + +An anonymous reviewer for comp.sources.reviewed did the +MSC and Mac ports and wrote .bat files for the tests. +Another or maybe the same reviewer did the Dynix port. + +Ports to new systems: + Ed Ferguson MIPS M2000 C2.20 OS4.52 + Jwahar R. Bammi Atari ST + Berry Kercheval SGI IRIX 4.0.1 + Andy Newman Next 2.1 + Mike Carlton Next 2.1 + Elliot Jaffe AIX 3.1 + Jeremy Martin Convex 9.1 + Scott Hunziker Coherent 4.0 + Ken Poulton Hpux + Onno van der Linden 386bsd 0.1 + Bob Hutchinson Linux 0.98p14 + +The DOS version is a lot better thanks to suggestions and testing +from Ed Ferguson, Jack Fitts, Nadav Horesh, Michael Golan and +Conny Ohstrom. The DOS additions for 1.1.2d are all ideas of +Ben Myers; much of the code is his too. + +Arnold Robbins kept me current on POSIX standards for AWK, and +explained some of the "dark corners". + +Thank you to everyone who reported bugs or offered encouragement, +suggestions or criticism. (At least the bugs got fixed). 
diff --git a/CHANGES b/CHANGES new file mode 100644 index 0000000..8497492 --- /dev/null +++ b/CHANGES @@ -0,0 +1,42 @@ +1.3.1 -> 1.3.2 Sep 1996 + +1) Numeric but not integer indices caused core dump in new array scheme. + Fixed bug and fired test division. + +2) Added ferror() checks on writes. + +3) Added some static storage specs to array.c to keep non-ansi + compilers happy. + +1.3 -> 1.3.1 Sep 1996 +Release to new ftp site ftp://ftp.whidbey.net. + +1) Workaround for overflow exception in strtod, sunos5.5 solaris. + +2) []...] and [^]...] put ] in a class (or not in a class) without + having to use back-slash escape. + +1.2.2 -> 1.3 Jul 1996 +Extensive redesign of array data structures to support large arrays and +fast access to arrays created with split. Many of the ideas in the +new design were inspired by reading "The Design and Implementation of +Dynamic Hashing Sets and Tables in Icon" by William Griswold and +Gregg Townsend, SPE 23,351-367. + +1.2.1 -> 1.2.2 Jan 1996 + +1) Improved autoconfig, in particular, fpe tests. This is far from + perfect and never will be until C standardizes an interface to ieee754. + +2) Removed automatic error message on open failure for getline. + +3) Flush all output before system(). Previous behavior was to only + flush std{out,err}. + +4) Explicitly fclose() all output on exit to work around AIX4.1 bug. + +5) Fixed random number generator to work with longs larger than + 32bits. + +6) Added a type Int which is int on real machines and long on dos machines. + Believe that all implicit assumptions that int=32bits are now gone. diff --git a/COPYING b/COPYING new file mode 100644 index 0000000..a43ea21 --- /dev/null +++ b/COPYING @@ -0,0 +1,339 @@ + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc. + 675 Mass Ave, Cambridge, MA 02139, USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. 
+ + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Library General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. 
If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. 
You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. 
If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. 
(This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. 
Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. 
+ +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. 
If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. 
+ + END OF TERMS AND CONDITIONS + + Appendix: How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) 19yy + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) 19yy name of author + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. 
Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + , 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Library General +Public License instead of this License. diff --git a/INSTALL b/INSTALL new file mode 100644 index 0000000..f114627 --- /dev/null +++ b/INSTALL @@ -0,0 +1,47 @@ +Look at the file config.user and edit to set user defines. + +if your system is one of + apollo + convex + mips + sgi + ultrix-mips + cray + hpux (read below) + unixware (read below) + +and you don't have gcc or prefer to use cc, then you may want to +copy config-user/your_system to config.user and edit that. + +run + + configure + make + + +If you have problems, please report it. If you can fix the problem, by +changing config.user, please send the results. Else send output from +configure, make and config.h. Send to brennan@whidbey.com. + + + +DOS: +Look at the file msdos/INSTALL + + +HPUX: +Evidently there is more than one compiler and/or math library. Some +configurations work out of the box (configure/make). Others need +CFLAGS='+O2 +FPZO'. On HPUX 9.05 with the ansi compiler HP92453-01 +A.09.77 set CFLAGS='-Ae +O2 +FPZO'. Thanks to Dr. Rafael R. +Pappalardo for this info. 
+ + + +UNIXWARE: +On some but not all versions, configure might decide you don't have +memcpy. Remove #define NO_MEMCPY 1 from config.h. +If the fpe_test check fails, change the definition of TURN_ON_FPE_TRAPS +to + +#define TURN_ON_FPE_TRAPS() fpsetmask(fpgetmask()|FP_X_DZ|FP_X_OFL|FP_X_INV) diff --git a/Makefile b/Makefile new file mode 100644 index 0000000..05e63b0 --- /dev/null +++ b/Makefile @@ -0,0 +1,125 @@ +# Generated automatically from Makefile.in by configure. + +SHELL=/bin/sh + +#################################### + +CC = gcc + +CFLAGS = -g -O2 + +MATHLIB = -lm + +YACC = bison -y + +# where to put mawk +BINDIR = /usr/local/bin +# where to put the man pages +MANDIR = /usr/local/man/man1 +MANEXT = 1 +####################################### + + +O=parse.o scan.o memory.o main.o hash.o execute.o code.o\ + da.o error.o init.o bi_vars.o cast.o print.o bi_funct.o\ + kw.o jmp.o array.o field.o split.o re_cmpl.o zmalloc.o\ + fin.o files.o scancode.o matherr.o fcall.o version.o\ + missing.o + +REXP_O=rexp/rexp.o rexp/rexp0.o rexp/rexp1.o rexp/rexp2.o\ + rexp/rexp3.o + +REXP_C=rexp/rexp.c rexp/rexp0.c rexp/rexp1.c rexp/rexp2.c\ + rexp/rexp3.c + + +mawk_and_test : mawk mawk_test fpe_test + +mawk : $(O) rexp/.done + $(CC) $(CFLAGS) -o mawk $(O) $(REXP_O) $(MATHLIB) + +mawk_test : mawk # test that we have a sane mawk + @cp mawk test/mawk + cd test ; ./mawktest + @rm test/mawk + +fpe_test : mawk # test FPEs are handled OK + @cp mawk test/mawk + @echo ; echo testing floating point exception handling + cd test ; ./fpe_test + @rm test/mawk + +rexp/.done : $(REXP_C) + cd rexp ;\ + $(MAKE) CC="$(CC)" CFLAGS="$(CFLAGS) -DMAWK -I.." 
+ +parse.c : parse.y + @echo expect 4 shift/reduce conflicts + $(YACC) -d parse.y + mv y.tab.c parse.c + -if cmp -s y.tab.h parse.h ;\ + then rm y.tab.h ;\ + else mv y.tab.h parse.h ; fi + +array.c : array.w + notangle -R'"array.c"' array.w | cpif array.c + +array.h : array.w + notangle -R'"array.h"' array.w | cpif array.h + +scancode.c : makescan.c scan.h + $(CC) -o makescan.exe makescan.c + rm -f scancode.c + ./makescan.exe > scancode.c + rm makescan.exe + +MAWKMAN = $(MANDIR)/mawk.$(MANEXT) +install : mawk + cp mawk $(BINDIR) + chmod 0755 $(BINDIR)/mawk + cp man/mawk.1 $(MAWKMAN) + chmod 0644 $(MAWKMAN) + +clean : + rm -f *.o rexp/*.o rexp/.done test/mawk core test/core mawk + +distclean : clean + rm -f config.h Makefile \ + config.status config.user config.log config.cache + rm -f defines.out maxint.out fpe_check + cp config-user/.config.user config.user ; chmod +w config.user + +configure : configure.in mawk.ac.m4 + autoconf + + + +# output from mawk -f deps.awk *.c +array.o : config.h field.h bi_vars.h mawk.h symtype.h nstd.h memory.h array.h zmalloc.h types.h sizes.h +bi_funct.o : config.h field.h bi_vars.h mawk.h init.h regexp.h symtype.h nstd.h repl.h memory.h bi_funct.h array.h files.h zmalloc.h fin.h types.h sizes.h +bi_vars.o : config.h field.h bi_vars.h mawk.h init.h symtype.h nstd.h memory.h array.h zmalloc.h types.h sizes.h +cast.o : config.h field.h mawk.h parse.h symtype.h nstd.h memory.h repl.h scan.h array.h zmalloc.h types.h sizes.h +code.o : config.h field.h code.h mawk.h init.h symtype.h nstd.h memory.h array.h jmp.h zmalloc.h types.h sizes.h +da.o : config.h field.h code.h mawk.h symtype.h nstd.h memory.h repl.h bi_funct.h array.h zmalloc.h types.h sizes.h +error.o : config.h bi_vars.h mawk.h parse.h vargs.h symtype.h nstd.h scan.h array.h types.h sizes.h +execute.o : config.h field.h bi_vars.h code.h mawk.h regexp.h symtype.h nstd.h memory.h repl.h bi_funct.h array.h zmalloc.h types.h fin.h sizes.h +fcall.o : config.h code.h mawk.h symtype.h 
nstd.h memory.h array.h zmalloc.h types.h sizes.h +field.o : config.h field.h bi_vars.h mawk.h init.h parse.h regexp.h symtype.h nstd.h memory.h repl.h scan.h array.h zmalloc.h types.h sizes.h +files.o : config.h mawk.h nstd.h memory.h files.h zmalloc.h types.h fin.h sizes.h +fin.o : config.h field.h bi_vars.h mawk.h parse.h symtype.h nstd.h memory.h scan.h array.h zmalloc.h types.h fin.h sizes.h +hash.o : config.h mawk.h symtype.h nstd.h memory.h array.h zmalloc.h types.h sizes.h +init.o : config.h field.h bi_vars.h code.h mawk.h init.h symtype.h nstd.h memory.h array.h zmalloc.h types.h sizes.h +jmp.o : config.h code.h mawk.h init.h symtype.h nstd.h memory.h array.h jmp.h zmalloc.h types.h sizes.h +kw.o : config.h mawk.h init.h parse.h symtype.h nstd.h array.h types.h sizes.h +main.o : config.h code.h mawk.h init.h symtype.h nstd.h memory.h array.h files.h zmalloc.h types.h sizes.h +makescan.o : parse.h symtype.h scan.h array.h +matherr.o : config.h mawk.h nstd.h types.h sizes.h +memory.o : config.h mawk.h nstd.h memory.h zmalloc.h types.h sizes.h +missing.o : config.h nstd.h +parse.o : config.h field.h bi_vars.h code.h mawk.h symtype.h nstd.h memory.h bi_funct.h array.h files.h zmalloc.h jmp.h types.h sizes.h +print.o : config.h field.h bi_vars.h mawk.h parse.h symtype.h nstd.h memory.h scan.h bi_funct.h array.h files.h zmalloc.h types.h sizes.h +re_cmpl.o : config.h mawk.h parse.h regexp.h symtype.h nstd.h memory.h repl.h scan.h array.h zmalloc.h types.h sizes.h +scan.o : config.h field.h code.h mawk.h init.h parse.h symtype.h nstd.h memory.h repl.h scan.h array.h files.h zmalloc.h types.h fin.h sizes.h +split.o : config.h field.h bi_vars.h mawk.h parse.h regexp.h symtype.h nstd.h memory.h scan.h bi_funct.h array.h zmalloc.h types.h sizes.h +version.o : config.h mawk.h patchlev.h nstd.h types.h sizes.h +zmalloc.o : config.h mawk.h nstd.h zmalloc.h types.h sizes.h diff --git a/Makefile.in b/Makefile.in new file mode 100644 index 0000000..b1387ed --- /dev/null 
+++ b/Makefile.in @@ -0,0 +1,124 @@ + +SHELL=/bin/sh + +#################################### + +CC = @CC@ + +CFLAGS = @CFLAGS@ + +MATHLIB = @MATHLIB@ + +YACC = @YACC@ + +# where to put mawk +BINDIR = @BINDIR@ +# where to put the man pages +MANDIR = @MANDIR@ +MANEXT = @MANEXT@ +####################################### + + +O=parse.o scan.o memory.o main.o hash.o execute.o code.o\ + da.o error.o init.o bi_vars.o cast.o print.o bi_funct.o\ + kw.o jmp.o array.o field.o split.o re_cmpl.o zmalloc.o\ + fin.o files.o scancode.o matherr.o fcall.o version.o\ + missing.o + +REXP_O=rexp/rexp.o rexp/rexp0.o rexp/rexp1.o rexp/rexp2.o\ + rexp/rexp3.o + +REXP_C=rexp/rexp.c rexp/rexp0.c rexp/rexp1.c rexp/rexp2.c\ + rexp/rexp3.c + + +mawk_and_test : mawk mawk_test fpe_test + +mawk : $(O) rexp/.done + $(CC) $(CFLAGS) -o mawk $(O) $(REXP_O) $(MATHLIB) + +mawk_test : mawk # test that we have a sane mawk + @cp mawk test/mawk + cd test ; ./mawktest + @rm test/mawk + +fpe_test : mawk # test FPEs are handled OK + @cp mawk test/mawk + @echo ; echo testing floating point exception handling + cd test ; ./fpe_test + @rm test/mawk + +rexp/.done : $(REXP_C) + cd rexp ;\ + $(MAKE) CC="$(CC)" CFLAGS="$(CFLAGS) -DMAWK -I.." 
+ +parse.c : parse.y + @echo expect 4 shift/reduce conflicts + $(YACC) -d parse.y + mv y.tab.c parse.c + -if cmp -s y.tab.h parse.h ;\ + then rm y.tab.h ;\ + else mv y.tab.h parse.h ; fi + +array.c : array.w + notangle -R'"array.c"' array.w | cpif array.c + +array.h : array.w + notangle -R'"array.h"' array.w | cpif array.h + +scancode.c : makescan.c scan.h + $(CC) -o makescan.exe makescan.c + rm -f scancode.c + ./makescan.exe > scancode.c + rm makescan.exe + +MAWKMAN = $(MANDIR)/mawk.$(MANEXT) +install : mawk + cp mawk $(BINDIR) + chmod 0755 $(BINDIR)/mawk + cp man/mawk.1 $(MAWKMAN) + chmod 0644 $(MAWKMAN) + +clean : + rm -f *.o rexp/*.o rexp/.done test/mawk core test/core mawk + +distclean : clean + rm -f config.h Makefile \ + config.status config.user config.log config.cache + rm -f defines.out maxint.out fpe_check + cp config-user/.config.user config.user ; chmod +w config.user + +configure : configure.in mawk.ac.m4 + autoconf + + + +# output from mawk -f deps.awk *.c +array.o : config.h field.h bi_vars.h mawk.h symtype.h nstd.h memory.h array.h zmalloc.h types.h sizes.h +bi_funct.o : config.h field.h bi_vars.h mawk.h init.h regexp.h symtype.h nstd.h repl.h memory.h bi_funct.h array.h files.h zmalloc.h fin.h types.h sizes.h +bi_vars.o : config.h field.h bi_vars.h mawk.h init.h symtype.h nstd.h memory.h array.h zmalloc.h types.h sizes.h +cast.o : config.h field.h mawk.h parse.h symtype.h nstd.h memory.h repl.h scan.h array.h zmalloc.h types.h sizes.h +code.o : config.h field.h code.h mawk.h init.h symtype.h nstd.h memory.h array.h jmp.h zmalloc.h types.h sizes.h +da.o : config.h field.h code.h mawk.h symtype.h nstd.h memory.h repl.h bi_funct.h array.h zmalloc.h types.h sizes.h +error.o : config.h bi_vars.h mawk.h parse.h vargs.h symtype.h nstd.h scan.h array.h types.h sizes.h +execute.o : config.h field.h bi_vars.h code.h mawk.h regexp.h symtype.h nstd.h memory.h repl.h bi_funct.h array.h zmalloc.h types.h fin.h sizes.h +fcall.o : config.h code.h mawk.h symtype.h 
nstd.h memory.h array.h zmalloc.h types.h sizes.h +field.o : config.h field.h bi_vars.h mawk.h init.h parse.h regexp.h symtype.h nstd.h memory.h repl.h scan.h array.h zmalloc.h types.h sizes.h +files.o : config.h mawk.h nstd.h memory.h files.h zmalloc.h types.h fin.h sizes.h +fin.o : config.h field.h bi_vars.h mawk.h parse.h symtype.h nstd.h memory.h scan.h array.h zmalloc.h types.h fin.h sizes.h +hash.o : config.h mawk.h symtype.h nstd.h memory.h array.h zmalloc.h types.h sizes.h +init.o : config.h field.h bi_vars.h code.h mawk.h init.h symtype.h nstd.h memory.h array.h zmalloc.h types.h sizes.h +jmp.o : config.h code.h mawk.h init.h symtype.h nstd.h memory.h array.h jmp.h zmalloc.h types.h sizes.h +kw.o : config.h mawk.h init.h parse.h symtype.h nstd.h array.h types.h sizes.h +main.o : config.h code.h mawk.h init.h symtype.h nstd.h memory.h array.h files.h zmalloc.h types.h sizes.h +makescan.o : parse.h symtype.h scan.h array.h +matherr.o : config.h mawk.h nstd.h types.h sizes.h +memory.o : config.h mawk.h nstd.h memory.h zmalloc.h types.h sizes.h +missing.o : config.h nstd.h +parse.o : config.h field.h bi_vars.h code.h mawk.h symtype.h nstd.h memory.h bi_funct.h array.h files.h zmalloc.h jmp.h types.h sizes.h +print.o : config.h field.h bi_vars.h mawk.h parse.h symtype.h nstd.h memory.h scan.h bi_funct.h array.h files.h zmalloc.h types.h sizes.h +re_cmpl.o : config.h mawk.h parse.h regexp.h symtype.h nstd.h memory.h repl.h scan.h array.h zmalloc.h types.h sizes.h +scan.o : config.h field.h code.h mawk.h init.h parse.h symtype.h nstd.h memory.h repl.h scan.h array.h files.h zmalloc.h types.h fin.h sizes.h +split.o : config.h field.h bi_vars.h mawk.h parse.h regexp.h symtype.h nstd.h memory.h scan.h bi_funct.h array.h zmalloc.h types.h sizes.h +version.o : config.h mawk.h patchlev.h nstd.h types.h sizes.h +zmalloc.o : config.h mawk.h nstd.h zmalloc.h types.h sizes.h diff --git a/README b/README new file mode 100644 index 0000000..8e835f3 --- /dev/null +++ b/README 
@@ -0,0 +1,92 @@ +Mawk -- an implementation of new/posix awk +version 1.3.2 + +Installation instructions in file INSTALL. + +Bug reports, comments, questions, etc. to +Mike Brennan, brennan@whidbey.com. +ftp site: ftp.whidbey.net in ~/pub/brennan + +Version 1.3 implements a new internal design for arrays. See file +CHANGES. + +Version 1.2.2 is best for MsDOS +--------------------------------------------------------- + +Changes from version 1.1.4 to 1.2: + +1) Limit on code size set by #define in sizes.h is gone. + +2) A number of obscure bugs have been fixed such as, + you can now make a recursive function call inside a for( i in A) loop. + Function calls with array parameters in loop expressions sometimes + generated erroneous internal code. + + See RCS log comments in code for details. + + Reported bugs are fixed. + +3) new -W options + + + -We file : reads commands from file and next argument, regardless + of form, is ARGV[1]. Useful for passing -v , -f etc to + an awk program started with #!/.../mawk + + + #!/usr/local/bin/mawk -We + + myprogram -v works, while + + #!/usr/local/bin/mawk -f + + myprogram -v gives error message + mawk: option -v lacks argument + + This is really a posix bozo. Posix says you end arguments with + -- , but this doesn't work with the #! convention. + + + + -W interactive : forces stdout to be unbuffered and stdin to + be line buffered. Records from stdin are lines regardless of + the value of RS. Useful for interaction with a mawk on a pipe. + + -W dump, -Wd : disassembles internal code to stdout (used to be + stderr) and exits 0. + +4) FS = "" causes each record to be broken into characters and placed + into $1,$2 ... 
+ + same with split(x,A,"") and split(x,A,//) + + +5) print > "/dev/stdout" writes to stdout, exactly the same as + print + + This is useful for passing stdout to + + function my_special_output_routine(s, file) + { + # do something fancy with s + print s > file + } + + +6) New built-in function fflush() -- copied from the latest att awk. + + fflush() : flushes stdout and returns 0 + fflush(file) flushes file and returns 0; if file was not an + open output file then returns -1. + +7) delete A ; -- removes all elements of the array A + + intended to replace: + + for( i in A) delete A[i] + +8) mawk errors such as compilation failure, file open failure, etc. + now exit 2 which reserves exit 1 for the user. + +9) No program now silently exits 0; prior behavior was to exit 2 with + an error message diff --git a/array.c b/array.c new file mode 100644 index 0000000..f02da26 --- /dev/null +++ b/array.c @@ -0,0 +1,607 @@ +/* +array.c +copyright 1991-96, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +*/ + +/* +This file was generated with the command + + notangle -R'"array.c"' array.w > array.c + +Notangle is part of Norman Ramsey's noweb literate programming package +available from CTAN(ftp.shsu.edu). + +It's easiest to read or modify this file by working with array.w. 
+*/ + +#include "mawk.h" +#include "symtype.h" +#include "memory.h" +#include "field.h" +#include "bi_vars.h" +struct anode ; +typedef struct {struct anode *slink, *ilink ;} DUAL_LINK ; + +typedef struct anode { + struct anode *slink ; + struct anode *ilink ; + STRING *sval ; + unsigned hval ; + Int ival ; + CELL cell ; +} ANODE ; + + +#define NOT_AN_IVALUE (-Max_Int-1) /* usually 0x80000000 */ + +#define STARTING_HMASK 63 /* 2^6-1, must have form 2^n-1 */ +#define MAX_AVE_LIST_LENGTH 12 +#define hmask_to_limit(x) (((x)+1)*MAX_AVE_LIST_LENGTH) + +static ANODE* PROTO(find_by_ival,(ARRAY, Int, int)) ; +static ANODE* PROTO(find_by_sval,(ARRAY, STRING*, int)) ; +static void PROTO(add_string_associations,(ARRAY)) ; +static void PROTO(make_empty_table,(ARRAY, int)) ; +static void PROTO(convert_split_array_to_table,(ARRAY)) ; +static void PROTO(double_the_hash_table,(ARRAY)) ; +static unsigned PROTO(ahash, (STRING*)) ; + + +CELL* array_find(A, cp, create_flag) + ARRAY A ; + CELL *cp ; + int create_flag ; +{ + ANODE *ap ; + if (A->size == 0 && !create_flag) + /* eliminating this trivial case early avoids unnecessary conversions later */ + return (CELL*) 0 ; + switch (cp->type) { + case C_DOUBLE: + { + double d = cp->dval ; + Int ival = d_to_I(d) ; + if ((double)ival == d) { + if (A->type == AY_SPLIT) { + if (ival >= 1 && ival <= A->size) + return (CELL*)A->ptr+(ival-1) ; + if (!create_flag) return (CELL*) 0 ; + convert_split_array_to_table(A) ; + } + else if (A->type == AY_NULL) make_empty_table(A, AY_INT) ; + ap = find_by_ival(A, ival, create_flag) ; + } + else { + /* convert to string */ + char buff[260] ; + STRING *sval ; + sprintf(buff, string(CONVFMT)->str, d) ; + sval = new_STRING(buff) ; + ap = find_by_sval(A,sval,create_flag) ; + free_STRING(sval) ; + } + } + + break ; + case C_NOINIT: + ap = find_by_sval(A, &null_str, create_flag) ; + break ; + default: + ap = find_by_sval(A, string(cp), create_flag) ; + break ; + } + return ap ? 
&ap->cell : (CELL *) 0 ; +} + +void array_delete(A, cp) + ARRAY A ; + CELL *cp ; +{ + ANODE *ap ; + if (A->size == 0) return ; + switch(cp->type) { + case C_DOUBLE : + { + double d = cp->dval ; + Int ival = d_to_I(d) ; + if ((double)ival == d) { + if (A->type == AY_SPLIT) + { + if (ival >=1 && ival <= A->size) convert_split_array_to_table(A) ; + else return ; /* ival not in range */ + } + ap = find_by_ival(A, ival, NO_CREATE) ; + if (ap) { /* remove from the front of the ilist */ + DUAL_LINK *table = (DUAL_LINK*) A->ptr ; + table[ap->ival & A->hmask].ilink = ap->ilink ; + if (ap->sval) { + ANODE *p, *q = 0 ; + int index = ap->hval & A->hmask ; + p = table[index].slink ; + while(p != ap) { q = p ; p = q->slink ; } + if (q) q->slink = p->slink ; + else table[index].slink = p->slink ; + free_STRING(ap->sval) ; + } + + cell_destroy(&ap->cell) ; + ZFREE(ap) ; + if (--A->size == 0) array_clear(A) ; + + + } + return ; + } + + else { /* get the string value */ + char buff[260] ; + STRING *sval ; + sprintf(buff, string(CONVFMT)->str, d) ; + sval = new_STRING(buff) ; + ap = find_by_sval(A, sval, NO_CREATE) ; + free_STRING(sval) ; + } + } + break ; + case C_NOINIT : + ap = find_by_sval(A, &null_str, NO_CREATE) ; + break ; + default : + ap = find_by_sval(A, string(cp), NO_CREATE) ; + break ; + } + if (ap) { /* remove from the front of the slist */ + DUAL_LINK *table = (DUAL_LINK*) A->ptr ; + table[ap->hval&A->hmask].slink = ap->slink ; + if (ap->ival != NOT_AN_IVALUE) { + ANODE *p, *q = 0 ; + int index = ap->ival & A->hmask ; + p = table[index].ilink ; + while(p != ap) { q = p ; p = q->ilink ; } + if (q) q->ilink = p->ilink ; + else table[index].ilink = p->ilink ; + } + + free_STRING(ap->sval) ; + cell_destroy(&ap->cell) ; + ZFREE(ap) ; + if (--A->size == 0) array_clear(A) ; + + + } +} + +void array_load(A, cnt) + ARRAY A ; + int cnt ; +{ + CELL *cells ; /* storage for A[1..cnt] */ + int i ; /* index into cells[] */ + if (A->type != AY_SPLIT || A->limit < cnt) { + 
array_clear(A) ; + A->limit = (cnt&~3)+4 ; + A->ptr = zmalloc(A->limit*sizeof(CELL)) ; + A->type = AY_SPLIT ; + } + else + for(i=0;i < A->size; i++) cell_destroy((CELL*)A->ptr+i) ; + + cells = (CELL*) A->ptr ; + A->size = cnt ; + if (cnt > MAX_SPLIT) { + SPLIT_OV *p = split_ov_list ; + SPLIT_OV *q ; + split_ov_list = (SPLIT_OV*) 0 ; + i = MAX_SPLIT ; + while( p ) { + cells[i].type = C_MBSTRN ; + cells[i].ptr = (PTR) p->sval ; + q = p ; p = q->link ; ZFREE(q) ; + i++ ; + } + cnt = MAX_SPLIT ; + } + + for(i=0;i < cnt; i++) { + cells[i].type = C_MBSTRN ; + cells[i].ptr = split_buff[i] ; + } +} + +void array_clear(A) + ARRAY A ; +{ + int i ; + ANODE *p, *q ; + if (A->type == AY_SPLIT) { + for(i=0;i < A->size; i++) cell_destroy((CELL*)A->ptr+i) ; + zfree(A->ptr, A->limit * sizeof(CELL)) ; + } + else if (A->type & AY_STR) { + DUAL_LINK *table = (DUAL_LINK*) A->ptr ; + for(i=0;i <= A->hmask; i++) { + p = table[i].slink ; + while(p) { + q = p ; p = q->slink ; + free_STRING(q->sval) ; + cell_destroy(&q->cell) ; + ZFREE(q) ; + } + } + zfree(A->ptr, (A->hmask+1)*sizeof(DUAL_LINK)) ; + } + else if (A->type & AY_INT) { + DUAL_LINK *table = (DUAL_LINK*) A->ptr ; + for(i=0;i <= A->hmask; i++) { + p = table[i].ilink ; + while(p) { + q = p ; p = q->ilink ; + cell_destroy(&q->cell) ; + ZFREE(q) ; + } + } + zfree(A->ptr, (A->hmask+1)*sizeof(DUAL_LINK)) ; + } + memset(A, 0, sizeof(*A)) ; +} + + + +STRING** array_loop_vector(A, sizep) + ARRAY A ; + unsigned *sizep ; +{ + STRING** ret ; + *sizep = A->size ; + if (A->size > 0) { + if (!(A->type & AY_STR)) add_string_associations(A) ; + ret = (STRING**) zmalloc(A->size*sizeof(STRING*)) ; + { + int r = 0 ; /* indexes ret */ + DUAL_LINK* table = (DUAL_LINK*) A->ptr ; + int i ; /* indexes table */ + ANODE *p ; /* walks slists */ + for(i=0;i <= A->hmask; i++) { + for(p = table[i].slink; p ; p = p->slink) { + ret[r++] = p->sval ; + p->sval->ref_cnt++ ; + } + } + } + + return ret ; + } + else return (STRING**) 0 ; +} + +CELL *array_cat(sp, cnt) 
+ CELL *sp ; + int cnt ; +{ + CELL *p ; /* walks the eval stack */ + CELL subsep ; /* local copy of SUBSEP */ + unsigned subsep_len ; /* string length of subsep_str */ + char *subsep_str ; + + unsigned total_len ; /* length of cat'ed expression */ + CELL *top ; /* value of sp at entry */ + char *target ; /* build cat'ed char* here */ + STRING *sval ; /* build cat'ed STRING here */ + cellcpy(&subsep, SUBSEP) ; + if ( subsep.type < C_STRING ) cast1_to_s(&subsep) ; + subsep_len = string(&subsep)->len ; + subsep_str = string(&subsep)->str ; + + top = sp ; sp -= (cnt-1) ; + + total_len = (cnt-1)*subsep_len ; + for(p = sp ; p <= top ; p++) { + if ( p->type < C_STRING ) cast1_to_s(p) ; + total_len += string(p)->len ; + } + + sval = new_STRING0(total_len) ; + target = sval->str ; + for(p = sp ; p < top ; p++) { + memcpy(target, string(p)->str, string(p)->len) ; + target += string(p)->len ; + memcpy(target, subsep_str, subsep_len) ; + target += subsep_len ; + } + /* now p == top */ + memcpy(target, string(p)->str, string(p)->len) ; + + for(p = sp; p <= top ; p++) free_STRING(string(p)) ; + free_STRING(string(&subsep)) ; + /* set contents of sp , sp->type > C_STRING is possible so reset */ + sp->type = C_STRING ; + sp->ptr = (PTR) sval ; + return sp ; + +} + +static ANODE* find_by_ival(A, ival, create_flag) + ARRAY A ; + Int ival ; + int create_flag ; +{ + DUAL_LINK *table = (DUAL_LINK*) A->ptr ; + unsigned index = ival & A->hmask ; + ANODE *p = table[index].ilink ; /* walks ilist */ + ANODE *q = (ANODE*) 0 ; /* trails p */ + while(1) { + if (!p) { + /* search failed */ + if (A->type & AY_STR) { + /* need to search by string */ + char buff[256] ; + STRING *sval ; + sprintf(buff, INT_FMT, ival) ; + sval = new_STRING(buff) ; + p = find_by_sval(A, sval, create_flag) ; + free_STRING(sval) ; + if (!p) return (ANODE*) 0 ; + } + else if (create_flag) { + p = ZMALLOC(ANODE) ; + p->sval = (STRING*) 0 ; + p->cell.type = C_NOINIT ; + if (++A->size > A->limit) { + 
double_the_hash_table(A) ; /* changes table, may change index */ + table = (DUAL_LINK*) A->ptr ; + index = A->hmask & ival ; + } + } + else return (ANODE*) 0 ; + p->ival = ival ; + A->type |= AY_INT ; + + break ; + } + else if (p->ival == ival) { + /* found it, now move to the front */ + if (!q) /* already at the front */ + return p ; + /* delete for insertion at the front */ + q->ilink = p->ilink ; + break ; + } + q = p ; p = q->ilink ; + } + /* insert at the front */ + p->ilink = table[index].ilink ; + table[index].ilink = p ; + return p ; +} + +static ANODE* find_by_sval(A, sval, create_flag) + ARRAY A ; + STRING *sval ; + int create_flag ; +{ + unsigned hval = ahash(sval) ; + char *str = sval->str ; + DUAL_LINK *table ; + int index ; + ANODE *p ; /* walks list */ + ANODE *q = (ANODE*) 0 ; /* trails p */ + if (! (A->type & AY_STR)) add_string_associations(A) ; + table = (DUAL_LINK*) A->ptr ; + index = hval & A->hmask ; + p = table[index].slink ; + while(1) { + if (!p) { + if (create_flag) { + { + p = ZMALLOC(ANODE) ; + p->sval = sval ; + sval->ref_cnt++ ; + p->ival = NOT_AN_IVALUE ; + p->hval = hval ; + p->cell.type = C_NOINIT ; + if (++A->size > A->limit) { + double_the_hash_table(A) ; /* changes table, may change index */ + table = (DUAL_LINK*) A->ptr ; + index = hval & A->hmask ; + } + } + + break ; + } + else return (ANODE*) 0 ; + } + else if (p->hval == hval && strcmp(p->sval->str,str) == 0 ) { + /* found */ + if (!q) /* already at the front */ + return p ; + else { /* delete for move to the front */ + q->slink = p->slink ; + break ; + } + } + q = p ; p = q->slink ; + } + p->slink = table[index].slink ; + table[index].slink = p ; + return p ; +} + +static void add_string_associations(A) + ARRAY A ; +{ + if (A->type == AY_NULL) make_empty_table(A, AY_STR) ; + else { + DUAL_LINK *table ; + int i ; /* walks table */ + ANODE *p ; /* walks ilist */ + char buff[256] ; + if (A->type == AY_SPLIT) convert_split_array_to_table(A) ; + table = (DUAL_LINK*) A->ptr ; + 
for(i=0;i <= A->hmask; i++) { + p = table[i].ilink ; + while(p) { + sprintf(buff, INT_FMT, p->ival) ; + p->sval = new_STRING(buff) ; + p->hval = ahash(p->sval) ; + p->slink = table[A->hmask&p->hval].slink ; + table[A->hmask&p->hval].slink = p ; + p = p->ilink ; + } + } + A->type |= AY_STR ; + } +} + +static void make_empty_table(A, type) + ARRAY A ; + int type ; /* AY_INT or AY_STR */ +{ + size_t sz = (STARTING_HMASK+1)*sizeof(DUAL_LINK) ; + A->type = type ; + A->hmask = STARTING_HMASK ; + A->limit = hmask_to_limit(STARTING_HMASK) ; + A->ptr = memset(zmalloc(sz), 0, sz) ; +} + +static void convert_split_array_to_table(A) + ARRAY A ; +{ + CELL *cells = (CELL*) A->ptr ; + int i ; /* walks cells */ + DUAL_LINK *table ; + int j ; /* walks table */ + unsigned entry_limit = A->limit ; + A->hmask = STARTING_HMASK ; + A->limit = hmask_to_limit(STARTING_HMASK) ; + while(A->size > A->limit) { + A->hmask = (A->hmask<<1) + 1 ; /* double the size */ + A->limit = hmask_to_limit(A->hmask) ; + } + { + size_t sz = (A->hmask+1)*sizeof(DUAL_LINK) ; + A->ptr = memset(zmalloc(sz), 0, sz) ; + table = (DUAL_LINK*) A->ptr ; + } + + + /* insert each cells[i] in the new hash table on an ilist */ + for(i=0, j=1 ;i < A->size; i++) { + ANODE *p = ZMALLOC(ANODE) ; + p->sval = (STRING*) 0 ; + p->ival = i+1 ; + p->cell = cells[i] ; + p->ilink = table[j].ilink ; + table[j].ilink = p ; + j++ ; j &= A->hmask ; + } + A->type = AY_INT ; + zfree(cells, entry_limit*sizeof(CELL)) ; +} + +static void double_the_hash_table(A) + ARRAY A ; +{ + unsigned old_hmask = A->hmask ; + unsigned new_hmask = (old_hmask<<1)+1 ; + DUAL_LINK *table ; + A->ptr = zrealloc(A->ptr, (old_hmask+1)*sizeof(DUAL_LINK), + (new_hmask+1)*sizeof(DUAL_LINK)) ; + table = (DUAL_LINK*) A->ptr ; + /* zero out the new part which is the back half */ + memset(&table[old_hmask+1], 0, (old_hmask+1)*sizeof(DUAL_LINK)) ; + + if (A->type & AY_STR) { + int i ; /* index to old lists */ + int j ; /* index to new lists */ + ANODE *p ; /* walks an old 
list */ + ANODE *q ; /* trails p for deletion */ + ANODE *tail ; /* builds new list from the back */ + ANODE dummy0, dummy1 ; + for(i=0, j=old_hmask+1;i <= old_hmask; i++, j++) + { + q = &dummy0 ; + q->slink = p = table[i].slink ; + tail = &dummy1 ; + while (p) { + if ((p->hval&new_hmask) != i) { /* move it */ + q->slink = p->slink ; + tail = tail->slink = p ; + } + else q = p ; + p = q->slink ; + } + table[i].slink = dummy0.slink ; + tail->slink = (ANODE*) 0 ; + table[j].slink = dummy1.slink ; + } + + } + + if (A->type & AY_INT) { + int i ; /* index to old lists */ + int j ; /* index to new lists */ + ANODE *p ; /* walks an old list */ + ANODE *q ; /* trails p for deletion */ + ANODE *tail ; /* builds new list from the back */ + ANODE dummy0, dummy1 ; + for(i=0, j=old_hmask+1;i <= old_hmask; i++, j++) + { + q = &dummy0 ; + q->ilink = p = table[i].ilink ; + tail = &dummy1 ; + while (p) { + if ((p->ival&new_hmask) != i) { /* move it */ + q->ilink = p->ilink ; + tail = tail->ilink = p ; + } + else q = p ; + p = q->ilink ; + } + table[i].ilink = dummy0.ilink ; + tail->ilink = (ANODE*) 0 ; + table[j].ilink = dummy1.ilink ; + } + + } + + A->hmask = new_hmask ; + A->limit = hmask_to_limit(new_hmask) ; +} + + +static unsigned ahash(sval) + STRING* sval ; +{ + unsigned sum1 = sval->len ; + unsigned sum2 = sum1 ; + unsigned char *p , *q ; + if (sum1 <= 10) { + for(p=(unsigned char*)sval->str; *p ; p++) { + sum1 += sum1 + *p ; + sum2 += sum1 ; + } + } + else { + int cnt = 5 ; + p = (unsigned char*)sval->str ; /* p starts at the front */ + q = (unsigned char*)sval->str + (sum1-1) ; /* q starts at the back */ + while( cnt ) { + cnt-- ; + sum1 += sum1 + *p ; + sum2 += sum1 ; + sum1 += sum1 + *q ; + sum2 += sum1 ; + p++ ; q-- ; + } + } + return sum2 ; +} + + + diff --git a/array.h b/array.h new file mode 100644 index 0000000..84b6e4e --- /dev/null +++ b/array.h @@ -0,0 +1,51 @@ +/* +array.h +copyright 1991-96, Michael D. 
Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +*/ + +/* +This file was generated with the command + + notangle -R'"array.h"' array.w > array.h + +Notangle is part of Norman Ramsey's noweb literate programming package +available from CTAN(ftp.shsu.edu). + +It's easiest to read or modify this file by working with array.w. +*/ + +#ifndef ARRAY_H +#define ARRAY_H 1 +typedef struct array { + PTR ptr ; /* What this points to depends on the type */ + unsigned size ; /* number of elts in the table */ + unsigned limit ; /* Meaning depends on type */ + unsigned hmask ; /* bitwise and with hash value to get table index */ + short type ; /* values in AY_NULL .. AY_SPLIT */ +} *ARRAY ; + +#define AY_NULL 0 +#define AY_INT 1 +#define AY_STR 2 +#define AY_SPLIT 4 + +#define NO_CREATE 0 +#define CREATE 1 + +#define new_ARRAY() ((ARRAY)memset(ZMALLOC(struct array),0,sizeof(struct array))) + +CELL* PROTO(array_find, (ARRAY,CELL*,int)) ; +void PROTO(array_delete, (ARRAY,CELL*)) ; +void PROTO(array_load, (ARRAY,int)) ; +void PROTO(array_clear, (ARRAY)) ; +STRING** PROTO(array_loop_vector, (ARRAY,unsigned*)) ; +CELL* PROTO(array_cat, (CELL*,int)) ; + +#endif /* ARRAY_H */ + diff --git a/array.o b/array.o new file mode 100644 index 0000000..499d885 Binary files /dev/null and b/array.o differ diff --git a/array.w b/array.w new file mode 100644 index 0000000..8ed62fc --- /dev/null +++ b/array.w @@ -0,0 +1,1092 @@ + +% $Log: array.w,v $ +% Revision 1.4 1996/09/18 00:37:25 mike +% 1) Fix stupid bozo in A[expr], expr is numeric and not integer. +% 2) Add static for non-ansi compilers. +% 3) Minor tweaks to documentation. +% +% Revision 1.3 1996/07/28 21:55:32 mike +% trivial change -- add extra {} +% +% Revision 1.2 1996/02/25 23:42:25 mike +% Fix zfree bug in array_clear. +% Clean up documentation. 
+% + +%\hi -- hang item +\def\hi{\smallskip\hangindent\parindent\indent\ignorespaces} +\def\expr{{\it expr}} +\def\Null{{\it null}} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +<<"array.h">>= +<> +#ifndef ARRAY_H +#define ARRAY_H 1 +<> +<> +#endif /* ARRAY_H */ + +<<"array.c">>= +<> +#include "mawk.h" +#include "symtype.h" +#include "memory.h" +#include "field.h" +#include "bi_vars.h" +<> +<> +<> + +@ Array Structure +The type [[ARRAY]] is a pointer to a [[struct array]]. +The [[size]] field is the number of elements in the table. +The meaning of the other fields depends on the [[type]] field. + +<>= +typedef struct array { + PTR ptr ; /* What this points to depends on the type */ + unsigned size ; /* number of elts in the table */ + unsigned limit ; /* Meaning depends on type */ + unsigned hmask ; /* bitwise and with hash value to get table index */ + short type ; /* values in AY_NULL .. AY_SPLIT */ +} *ARRAY ; + +@ +There are three types of arrays and these are distinguished by the +[[type]] field in the structure. The types are: + +\hi [[AY_NULL]]\quad The array is empty and the [[size]] field is always +zero. The other fields have no meaning. + +\hi [[AY_SPLIT]]\quad The array was created by the [[AWK]] built-in +[[split]]. The return value from [[split]] is stored in the [[size]] +field. The [[ptr]] field points at a vector of [[CELLs]]. The number +of [[CELLs]] is the [[limit]] field. It is always true that +${\it size}\leq{\it limit}$. The address of [[A[i]]] is [[(CELL*)A->ptr+i-1]] +for $1\leq i\leq{\it size}$. The [[hmask]] field has no meaning. + +\hi {\bf Hash Table}\quad The array is a hash table. If the [[AY_STR]] bit +in the [[type]] field is set, then the table is keyed on strings. +If the [[AY_INT]] bit in the [[type]] field is set, then the table is +keyed on integers. 
Both bits can be set, and then the two keys are +consistent, i.e., look up of [[A[-14]]] and [[A["-14"]]] will +return identical [[CELL]] pointers although the look up methods will +be different. In this case, the [[size]] field is the number of hash nodes +in the table. When insertion of a new element would cause [[size]] to +exceed [[limit]], the table grows by doubling the number of hash chains. +The invariant, +$({\it hmask}+1){\it max\_ave\_list\_length}={\it limit}$, is always true. +{\it Max\_ave\_list\_length} is a tunable constant. + + +<>= +#define AY_NULL 0 +#define AY_INT 1 +#define AY_STR 2 +#define AY_SPLIT 4 + +@ Hash Tables +The hash tables are linked lists of nodes, called [[ANODEs]]. +The number of lists is [[hmask+1]] which is always a power of two. +The [[ptr]] field points at a vector of list heads. Since there are +potentially two types of lists, integer lists and strings lists, +each list head is a structure, [[DUAL_LINK]]. + +<>= +struct anode ; +typedef struct {struct anode *slink, *ilink ;} DUAL_LINK ; + +@ +The string lists are chains connected by [[slinks]] and the integer +lists are chains connected by [[ilinks]]. We sometimes refer to these +lists as slists and ilists, respectively. +The elements on the lists are [[ANODEs]]. +The fields of an [[ANODE]] are: + +\hi [[slink]]\quad The link field for slists. +\hi [[ilink]]\quad The link field for ilists. +\hi [[sval]]\quad If non-null, then [[sval]] is a pointer to a string +key. For a given table, if the [[AY_STR]] bit is set then every +[[ANODE]] has a non-null [[sval]] field and conversely, if [[AY_STR]] +is not set, then every [[sval]] field is null. + +\hi [[hval]]\quad The hash value of [[sval]]. This field has no +meaning if [[sval]] is null. +\hi [[ival]]\quad The integer key. The field has no meaning if set +to the constant, [[NOT_AN_IVALUE]]. If the [[AY_STR]] bit is off, +then every [[ANODE]] will have a valid [[ival]] field. 
If the +[[AY_STR]] bit is on, then the [[ival]] field may or may not be +valid. + +\hi [[cell]]\quad The data field in the hash table. + +\smallskip\noindent +So the value of $A[\expr]$ is stored in the [[cell]] field, and if +\expr{} is an integer, then \expr{} is stored in [[ival]], else it +is stored in [[sval]]. + + +<>= +typedef struct anode { + struct anode *slink ; + struct anode *ilink ; + STRING *sval ; + unsigned hval ; + Int ival ; + CELL cell ; +} ANODE ; + + +@ Interface Functions +The interface functions are: + +\nobreak +\hi [[CELL* array_find(ARRAY A, CELL *cp, int create_flag)]] returns a +pointer to $A[\expr]$ where [[cp]] is a pointer to the [[CELL]] +holding \expr\/. If the [[create_flag]] is on and \expr\/ is not +an element of [[A]], then the element is created with value \Null\/. + +\hi [[void array_delete(ARRAY A, CELL *cp)]] removes an element +$A[\expr]$ from the array $A$. [[cp]] points at the [[CELL]] holding +\expr\/. + +\hi [[void array_load(ARRAY A, int cnt)]] builds a split array. The +values $A[1..{\it cnt}]$ are copied from the array +${\it split\_buff}[0..{\it cnt}-1]$. + +\hi [[void array_clear(ARRAY A)]] removes all elements of $A$. The +type of $A$ is then [[AY_NULL]]. + +\hi [[STRING** array_loop_vector(ARRAY A, unsigned *sizep)]] +returns a pointer +to a linear vector that holds all the strings that are indices of $A$. +The size of the vector is returned indirectly in [[*sizep]]. +If [[A->size==0]], a \Null{} pointer is returned. + +\hi [[CELL* array_cat(CELL *sp, int cnt)]] concatenates the elements +of ${\it sp}[1-cnt..0]$, with each element separated by [[SUBSEP]], to +compute an array index. For example, on a reference to $A[i,j]$, +[[array_cat]] computes $i\circ{\it SUBSEP}\circ j$ where +$\circ$ denotes concatenation. 
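The index computation performed by [[array_cat]] can be sketched in isolation. The following is a minimal illustration and not mawk's code: it works on plain C strings rather than [[CELL]]s on the evaluation stack, and the name [[join_subscripts]] is hypothetical. Like [[array_cat]], it computes the total length first and then copies each piece, with the separator between pieces.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of mawk: join cnt subscripts with sep,
   the way A[i,j] becomes the single key  i SUBSEP j.
   Returns a malloc'ed string the caller must free, or NULL on failure. */
char *join_subscripts(const char **parts, int cnt, const char *sep)
{
    size_t sep_len = strlen(sep) ;
    size_t total ;
    char *result, *target ;
    int i ;

    if (cnt < 1) return NULL ;

    /* first pass: cnt-1 separators plus the length of every part */
    total = (size_t)(cnt - 1) * sep_len ;
    for (i = 0 ; i < cnt ; i++) total += strlen(parts[i]) ;

    result = malloc(total + 1) ;
    if (!result) return NULL ;

    /* second pass: copy each part, with a separator between parts */
    target = result ;
    for (i = 0 ; i < cnt ; i++) {
        size_t len = strlen(parts[i]) ;
        memcpy(target, parts[i], len) ;
        target += len ;
        if (i < cnt - 1) {
            memcpy(target, sep, sep_len) ;
            target += sep_len ;
        }
    }
    *target = '\0' ;
    return result ;
}
```

With the default [[SUBSEP]] of [["\034"]], joining the two subscripts [["i"]] and [["j"]] gives a three byte key, so $A[i,j]$ and $A[i\circ{\it SUBSEP}\circ j]$ name the same element.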
+ + +<>= +CELL* PROTO(array_find, (ARRAY,CELL*,int)) ; +void PROTO(array_delete, (ARRAY,CELL*)) ; +void PROTO(array_load, (ARRAY,int)) ; +void PROTO(array_clear, (ARRAY)) ; +STRING** PROTO(array_loop_vector, (ARRAY,unsigned*)) ; +CELL* PROTO(array_cat, (CELL*,int)) ; + +@ Array Find +Any reference to $A[\expr]$ creates a call to +[[array_find(A,cp,CREATE)]] where [[cp]] points at the cell holding +\expr\/. The test, $\expr \hbox{ in } A$, creates a call to +[[array_find(A,cp,NO_CREATE)]]. + +<>= +#define NO_CREATE 0 +#define CREATE 1 + +@ +[[Array_find]] is a hash-table lookup that breaks into two cases: + +\hi 1)\quad If [[*cp]] is numeric and integer valued, then lookup by +integer value using [[find_by_ival]]. If [[*cp]] is numeric, but not +integer valued, then convert to string with [[sprintf(CONVFMT,...)]] and +go to case~2. + +\hi 2)\quad If [[*cp]] is string valued, then lookup by string value +using [[find_by_sval]]. + +<>= +CELL* array_find(A, cp, create_flag) + ARRAY A ; + CELL *cp ; + int create_flag ; +{ + ANODE *ap ; + if (A->size == 0 && !create_flag) + /* eliminating this trivial case early avoids unnecessary conversions later */ + return (CELL*) 0 ; + switch (cp->type) { + case C_DOUBLE: + <> + break ; + case C_NOINIT: + ap = find_by_sval(A, &null_str, create_flag) ; + break ; + default: + ap = find_by_sval(A, string(cp), create_flag) ; + break ; + } + return ap ? &ap->cell : (CELL *) 0 ; +} + +@ +To test whether [[cp->dval]] is integer, we convert to the nearest +integer by rounding towards zero (done by [[d_to_I]]) and then cast +back to double. If we get the same number we started with, then +[[cp->dval]] is integer valued. 
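The round trip test can be tried on its own. The sketch below is written under stated assumptions: [[clamp_to_int]] only imitates the clamping behavior described for [[d_to_I]], with [[INT_MAX]] standing in for [[Max_Int]]; the real definitions live elsewhere in mawk and may differ in detail. Both function names are hypothetical.

```c
#include <limits.h>

/* Sketch of a clamping double-to-int conversion in the spirit of d_to_I.
   Assumption: INT_MAX stands in for mawk's Max_Int. */
static int clamp_to_int(double d)
{
    if (d >= (double) INT_MAX) return INT_MAX ;
    if (d > -(double) INT_MAX) return (int) d ;  /* truncates toward zero */
    return -INT_MAX ;  /* also reached for NaN, which fails both tests */
}

/* Returns 1 and stores the integer in *ival if d is integer valued,
   otherwise returns 0 and leaves *ival untouched. */
int is_integer_valued(double d, int *ival)
{
    int i = clamp_to_int(d) ;
    if ((double) i == d) { *ival = i ; return 1 ; }
    return 0 ;
}
```

Because the conversion never returns [[-INT_MAX-1]], that one value stays free to act as a marker, which is the role [[NOT_AN_IVALUE]] plays in the [[ANODE]] structure.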
+ +<>= +{ + double d = cp->dval ; + Int ival = d_to_I(d) ; + if ((double)ival == d) { + if (A->type == AY_SPLIT) { + if (ival >= 1 && ival <= A->size) + return (CELL*)A->ptr+(ival-1) ; + if (!create_flag) return (CELL*) 0 ; + convert_split_array_to_table(A) ; + } + else if (A->type == AY_NULL) make_empty_table(A, AY_INT) ; + ap = find_by_ival(A, ival, create_flag) ; + } + else { + /* convert to string */ + char buff[260] ; + STRING *sval ; + sprintf(buff, string(CONVFMT)->str, d) ; + sval = new_STRING(buff) ; + ap = find_by_sval(A,sval,create_flag) ; + free_STRING(sval) ; + } +} + +@ +When we get to the function [[find_by_ival]], the search has been reduced +to lookup in a hash table by integer value. + +<>= +static ANODE* find_by_ival(A, ival, create_flag) + ARRAY A ; + Int ival ; + int create_flag ; +{ + DUAL_LINK *table = (DUAL_LINK*) A->ptr ; + unsigned index = ival & A->hmask ; + ANODE *p = table[index].ilink ; /* walks ilist */ + ANODE *q = (ANODE*) 0 ; /* trails p */ + while(1) { + if (!p) { + /* search failed */ + <> + break ; + } + else if (p->ival == ival) { + /* found it, now move to the front */ + if (!q) /* already at the front */ + return p ; + /* delete for insertion at the front */ + q->ilink = p->ilink ; + break ; + } + q = p ; p = q->ilink ; + } + /* insert at the front */ + p->ilink = table[index].ilink ; + table[index].ilink = p ; + return p ; +} + +@ +When a search by integer value fails, we have to check by string +value to correctly +handle the case of insertion as [[A["123"]]] and a later search as +[[A[123]]]. This string search is necessary if and only if the +[[AY_STR]] bit is set. An important point is that all [[ANODEs]] get +created with a valid [[sval]] if [[AY_STR]] is set, because then creation +of new nodes always occurs in a call to [[find_by_sval]]. 
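The fallback works because an integer key has exactly one string form, the one produced by formatting it with [[INT_FMT]]. A small sketch of that correspondence follows. It assumes [["%d"]] plays the role of [[INT_FMT]], whose actual definition is not shown in this file, and [[same_element]] is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: does the string key skey name the same element
   as the integer key ival?  Assumes "%d" plays the role of INT_FMT. */
int same_element(const char *skey, int ival)
{
    char buff[32] ;
    snprintf(buff, sizeof buff, "%d", ival) ;
    return strcmp(buff, skey) == 0 ;
}
```

Under this rule [[A["123"]]] and [[A[123]]] are one element, while [[A["0123"]]] is a different element that only a string search can find.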
+ +<>= +if (A->type & AY_STR) { + /* need to search by string */ + char buff[256] ; + STRING *sval ; + sprintf(buff, INT_FMT, ival) ; + sval = new_STRING(buff) ; + p = find_by_sval(A, sval, create_flag) ; + free_STRING(sval) ; + if (!p) return (ANODE*) 0 ; +} +else if (create_flag) { + p = ZMALLOC(ANODE) ; + p->sval = (STRING*) 0 ; + p->cell.type = C_NOINIT ; + if (++A->size > A->limit) { + double_the_hash_table(A) ; /* changes table, may change index */ + table = (DUAL_LINK*) A->ptr ; + index = A->hmask & ival ; + } +} +else return (ANODE*) 0 ; +p->ival = ival ; +A->type |= AY_INT ; + +@ +Searching by string value is easier because [[AWK]] arrays are really +string associations. If the array does not have the [[AY_STR]] bit set, +then we have to convert the array to a dual hash table with strings +which is done by the function [[add_string_associations]]. + +<>= +static ANODE* find_by_sval(A, sval, create_flag) + ARRAY A ; + STRING *sval ; + int create_flag ; +{ + unsigned hval = ahash(sval) ; + char *str = sval->str ; + DUAL_LINK *table ; + int index ; + ANODE *p ; /* walks list */ + ANODE *q = (ANODE*) 0 ; /* trails p */ + if (! (A->type & AY_STR)) add_string_associations(A) ; + table = (DUAL_LINK*) A->ptr ; + index = hval & A->hmask ; + p = table[index].slink ; + while(1) { + if (!p) { + if (create_flag) { + <> + break ; + } + else return (ANODE*) 0 ; + } + else if (p->hval == hval && strcmp(p->sval->str,str) == 0 ) { + /* found */ + if (!q) /* already at the front */ + return p ; + else { /* delete for move to the front */ + q->slink = p->slink ; + break ; + } + } + q = p ; p = q->slink ; + } + p->slink = table[index].slink ; + table[index].slink = p ; + return p ; +} + +@ +One [[Int]] value is reserved to show that the [[ival]] field is invalid. +This works because [[d_to_I]] returns a value in [[[-Max_Int, Max_Int]]]. 
+ +<>= +#define NOT_AN_IVALUE (-Max_Int-1) /* usually 0x80000000 */ + +<>= +{ + p = ZMALLOC(ANODE) ; + p->sval = sval ; + sval->ref_cnt++ ; + p->ival = NOT_AN_IVALUE ; + p->hval = hval ; + p->cell.type = C_NOINIT ; + if (++A->size > A->limit) { + double_the_hash_table(A) ; /* changes table, may change index */ + table = (DUAL_LINK*) A->ptr ; + index = hval & A->hmask ; + } +} + +@ +On entry to [[add_string_associations]], we know that the [[AY_STR]] bit +is not set. We convert to a dual hash table, then walk all the integer +lists and put each [[ANODE]] on a string list. + +<>= +static void add_string_associations(A) + ARRAY A ; +{ + if (A->type == AY_NULL) make_empty_table(A, AY_STR) ; + else { + DUAL_LINK *table ; + int i ; /* walks table */ + ANODE *p ; /* walks ilist */ + char buff[256] ; + if (A->type == AY_SPLIT) convert_split_array_to_table(A) ; + table = (DUAL_LINK*) A->ptr ; + for(i=0;i <= A->hmask; i++) { + p = table[i].ilink ; + while(p) { + sprintf(buff, INT_FMT, p->ival) ; + p->sval = new_STRING(buff) ; + p->hval = ahash(p->sval) ; + p->slink = table[A->hmask&p->hval].slink ; + table[A->hmask&p->hval].slink = p ; + p = p->ilink ; + } + } + A->type |= AY_STR ; + } +} + +@ Array Delete +The execution of the statement, $\hbox{\it delete }A[\expr]$, creates a +call to [[array_delete(ARRAY A, CELL *cp)]]. Depending on the +type of [[*cp]], the call is routed to [[find_by_sval]] or [[find_by_ival]]. +Each of these functions leaves its return value on the front of an +slist or ilist, respectively, and then it is deleted from the front of +the list. The case where $A[\expr]$ is on two lists, e.g., +[[A[12]]] and [[A["12"]]] is checked by examining the [[sval]] and +[[ival]] fields of the returned [[ANODE*]]. 
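Removal from the second list needs a walk, because only the list that was searched has the node at its front. A minimal sketch of that unlink step on a generic singly linked list is given here. The [[NODE]] type and [[unlink_node]] are illustrative stand-ins, not mawk's [[ANODE]] code; unlike the original, the sketch also copes with a node that is not on the list.

```c
#include <stddef.h>

typedef struct node {
    struct node *link ;
    int key ;
} NODE ;

/* Unlink target from the list at *head using a trailing pointer q.
   Returns 1 if the node was found and unlinked, else 0. */
int unlink_node(NODE **head, NODE *target)
{
    NODE *p = *head ;
    NODE *q = NULL ;       /* trails p */

    while (p && p != target) { q = p ; p = p->link ; }
    if (!p) return 0 ;     /* target is not on this list */

    if (q) q->link = p->link ;   /* unlink from the middle or the end */
    else *head = p->link ;       /* target was at the front */
    p->link = NULL ;
    return 1 ;
}
```

The [[q]] test is the same distinction the delete code makes: a node with a predecessor is bypassed, and a node at the front is removed by updating the list head.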
+ +<>= +void array_delete(A, cp) + ARRAY A ; + CELL *cp ; +{ + ANODE *ap ; + if (A->size == 0) return ; + switch(cp->type) { + case C_DOUBLE : + { + double d = cp->dval ; + Int ival = d_to_I(d) ; + if ((double)ival == d) <> + else { /* get the string value */ + char buff[260] ; + STRING *sval ; + sprintf(buff, string(CONVFMT)->str, d) ; + sval = new_STRING(buff) ; + ap = find_by_sval(A, sval, NO_CREATE) ; + free_STRING(sval) ; + } + } + break ; + case C_NOINIT : + ap = find_by_sval(A, &null_str, NO_CREATE) ; + break ; + default : + ap = find_by_sval(A, string(cp), NO_CREATE) ; + break ; + } + if (ap) { /* remove from the front of the slist */ + DUAL_LINK *table = (DUAL_LINK*) A->ptr ; + table[ap->hval&A->hmask].slink = ap->slink ; + <> + free_STRING(ap->sval) ; + cell_destroy(&ap->cell) ; + ZFREE(ap) ; + <size]]>> + } +} + +<>= +{ + if (A->type == AY_SPLIT) + if (ival >=1 && ival <= A->size) convert_split_array_to_table(A) ; + else return ; /* ival not in range */ + ap = find_by_ival(A, ival, NO_CREATE) ; + if (ap) { /* remove from the front of the ilist */ + DUAL_LINK *table = (DUAL_LINK*) A->ptr ; + table[ap->ival & A->hmask].ilink = ap->ilink ; + <> + cell_destroy(&ap->cell) ; + ZFREE(ap) ; + <size]]>> + } + return ; +} + +@ +Even though we found a node by searching an ilist it might also +be on an slist and vice-versa. + +<>= +if (ap->sval) { + ANODE *p, *q = 0 ; + int index = ap->hval & A->hmask ; + p = table[index].slink ; + while(p != ap) { q = p ; p = q->slink ; } + if (q) q->slink = p->slink ; + else table[index].slink = p->slink ; + free_STRING(ap->sval) ; +} + +<>= +if (ap->ival != NOT_AN_IVALUE) { + ANODE *p, *q = 0 ; + int index = ap->ival & A->hmask ; + p = table[index].ilink ; + while(p != ap) { q = p ; p = q->ilink ; } + if (q) q->ilink = p->ilink ; + else table[index].ilink = p->ilink ; +} + +@ +When the size of a hash table drops below a certain value, it might +be profitable to shrink the hash table. 
Currently we don't do this, +because our guess is that it would be a waste of time for most +[[AWK]] applications. However, we do convert an array to [[AY_NULL]] +when the size goes to zero which would resize a large hash table +that had been completely cleared by successive deletions. + +<size]]>>= +if (--A->size == 0) array_clear(A) ; + + +@ Building an Array with Split +A simple operation is to create an array with the [[AWK]] +primitive [[split]]. The code that performs [[split]] puts the +pieces in the global buffer [[split_buff]]. The call +[[array_load(A, cnt)]] moves the [[cnt]] elements from [[split_buff]] to +[[A]]. This is the only way an array of type [[AY_SPLIT]] is +created. + +<>= +void array_load(A, cnt) + ARRAY A ; + int cnt ; +{ + CELL *cells ; /* storage for A[1..cnt] */ + int i ; /* index into cells[] */ + <> + cells = (CELL*) A->ptr ; + A->size = cnt ; + <> + for(i=0;i < cnt; i++) { + cells[i].type = C_MBSTRN ; + cells[i].ptr = split_buff[i] ; + } +} + +@ +When [[cnt > MAX_SPLIT]], [[split_buff]] was not big enough to hold +everything so the overflow went on the [[split_ov_list]]. +The elements from [[MAX_SPLIT+1]] to [[cnt]] get loaded into +[[cells[MAX_SPLIT..cnt-1]]] from this list. + +<>= +if (cnt > MAX_SPLIT) { + SPLIT_OV *p = split_ov_list ; + SPLIT_OV *q ; + split_ov_list = (SPLIT_OV*) 0 ; + i = MAX_SPLIT ; + while( p ) { + cells[i].type = C_MBSTRN ; + cells[i].ptr = (PTR) p->sval ; + q = p ; p = q->link ; ZFREE(q) ; + i++ ; + } + cnt = MAX_SPLIT ; +} + +@ +If the array [[A]] is a split array and big enough then we reuse it, +otherwise we need to allocate a new split array. +When we allocate a block of [[CELLs]] for a split array, we round up +to a multiple of 4. 
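As a hedged illustration of the rounding rule, the sketch below isolates the expression [[(cnt&~3)+4]] that sets [[A->limit]]. The helper name [[split_limit]] and the use of [[unsigned]] are assumptions made for this example; neither appears in [[array.c]].

```c
/* Hypothetical helper, not part of array.c: the capacity chosen for a
   split array that must hold cnt cells, using the same expression as
   the source, (cnt & ~3) + 4. */
static unsigned split_limit(unsigned cnt)
{
    /* clear the two low bits, then step up to the next multiple of 4 */
    return (cnt & ~3u) + 4u;
}
```

Under this expression the limit is always a multiple of 4 that is strictly greater than [[cnt]]: a count of 3 gives 4, while a count of 4 gives 8.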
+ +<>= +if (A->type != AY_SPLIT || A->limit < cnt) { + array_clear(A) ; + A->limit = (cnt&~3)+4 ; + A->ptr = zmalloc(A->limit*sizeof(CELL)) ; + A->type = AY_SPLIT ; +} +else + for(i=0;i < A->size; i++) cell_destroy((CELL*)A->ptr+i) ; + +@ Array Clear +The function [[array_clear(ARRAY A)]] converts [[A]] to type [[AY_NULL]] +and frees all storage used by [[A]] except for the [[struct array]] +itself. This function gets called in two contexts: +(1)~when an array local to a user function goes out of scope and +(2)~execution of the [[AWK]] statement, [[delete A]]. + +<>= +void array_clear(A) + ARRAY A ; +{ + int i ; + ANODE *p, *q ; + if (A->type == AY_SPLIT) { + for(i=0;i < A->size; i++) cell_destroy((CELL*)A->ptr+i) ; + zfree(A->ptr, A->limit * sizeof(CELL)) ; + } + else if (A->type & AY_STR) { + DUAL_LINK *table = (DUAL_LINK*) A->ptr ; + for(i=0;i <= A->hmask; i++) { + p = table[i].slink ; + while(p) { + q = p ; p = q->slink ; + free_STRING(q->sval) ; + cell_destroy(&q->cell) ; + ZFREE(q) ; + } + } + zfree(A->ptr, (A->hmask+1)*sizeof(DUAL_LINK)) ; + } + else if (A->type & AY_INT) { + DUAL_LINK *table = (DUAL_LINK*) A->ptr ; + for(i=0;i <= A->hmask; i++) { + p = table[i].ilink ; + while(p) { + q = p ; p = q->ilink ; + cell_destroy(&q->cell) ; + ZFREE(q) ; + } + } + zfree(A->ptr, (A->hmask+1)*sizeof(DUAL_LINK)) ; + } + memset(A, 0, sizeof(*A)) ; +} + + + +@ Constructor and Conversions +Arrays are always created as empty arrays of type [[AY_NULL]]. +Global arrays are never destroyed although they can go empty or have +their type change by conversion. The only constructor function is +a macro. + +<>= +#define new_ARRAY() ((ARRAY)memset(ZMALLOC(struct array),0,sizeof(struct array))) + +@ +Hash tables only get constructed by conversion. This happens in two +ways. +The function [[make_empty_table]] converts an empty array of type +[[AY_NULL]] to an empty hash table. The number of lists in the table +is a power of 2 determined by the constant [[STARTING_HMASK]]. 
+The limit size of the table is determined by the constant +[[MAX_AVE_LIST_LENGTH]] which is the largest average size of the hash +lists that we are willing to tolerate before enlarging the table. +When [[A->size]] exceeds [[A->limit]], +the hash table grows in size by doubling the number of lists. +[[A->limit]] is then reset to [[MAX_AVE_LIST_LENGTH]] times +[[A->hmask+1]]. + +<>= +#define STARTING_HMASK 63 /* 2^6-1, must have form 2^n-1 */ +#define MAX_AVE_LIST_LENGTH 12 +#define hmask_to_limit(x) (((x)+1)*MAX_AVE_LIST_LENGTH) + +<>= +static void make_empty_table(A, type) + ARRAY A ; + int type ; /* AY_INT or AY_STR */ +{ + size_t sz = (STARTING_HMASK+1)*sizeof(DUAL_LINK) ; + A->type = type ; + A->hmask = STARTING_HMASK ; + A->limit = hmask_to_limit(STARTING_HMASK) ; + A->ptr = memset(zmalloc(sz), 0, sz) ; +} + +@ +The other way a hash table gets constructed is when a split array is +converted to a hash table of type [[AY_INT]]. + +<>= +static void convert_split_array_to_table(A) + ARRAY A ; +{ + CELL *cells = (CELL*) A->ptr ; + int i ; /* walks cells */ + DUAL_LINK *table ; + int j ; /* walks table */ + unsigned entry_limit = A->limit ; + <> + /* insert each cells[i] in the new hash table on an ilist */ + for(i=0, j=1 ;i < A->size; i++) { + ANODE *p = ZMALLOC(ANODE) ; + p->sval = (STRING*) 0 ; + p->ival = i+1 ; + p->cell = cells[i] ; + p->ilink = table[j].ilink ; + table[j].ilink = p ; + j++ ; j &= A->hmask ; + } + A->type = AY_INT ; + zfree(cells, entry_limit*sizeof(CELL)) ; +} + +@ +To determine the size of the table, we set the initial size to +[[STARTING_HMASK+1]] and then double the size until +[[A->size <= A->limit]]. 
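The sizing rule can be sketched in isolation. The constants are copied from the text; the function [[hmask_for_size]] is a hypothetical name introduced only for this example, and it returns the mask instead of storing it in an [[ARRAY]].

```c
#define STARTING_HMASK 63          /* 2^6-1, must have form 2^n-1 */
#define MAX_AVE_LIST_LENGTH 12
#define hmask_to_limit(x) (((x)+1)*MAX_AVE_LIST_LENGTH)

/* Hypothetical helper, not part of array.c: the hash mask chosen for a
   table that must hold `size` elements.  Start at STARTING_HMASK and
   double the number of lists until size <= limit. */
static unsigned hmask_for_size(unsigned size)
{
    unsigned hmask = STARTING_HMASK ;
    unsigned limit = hmask_to_limit(hmask) ;
    while (size > limit) {
        hmask = (hmask << 1) + 1 ;      /* double the number of lists */
        limit = hmask_to_limit(hmask) ;
    }
    return hmask ;
}
```

With these constants a split array of up to 768 elements converts to a table of 64 lists, and one of 769 elements needs 128 lists.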
+ +<>= +A->hmask = STARTING_HMASK ; +A->limit = hmask_to_limit(STARTING_HMASK) ; +while(A->size > A->limit) { + A->hmask = (A->hmask<<1) + 1 ; /* double the size */ + A->limit = hmask_to_limit(A->hmask) ; +} +{ + size_t sz = (A->hmask+1)*sizeof(DUAL_LINK) ; + A->ptr = memset(zmalloc(sz), 0, sz) ; + table = (DUAL_LINK*) A->ptr ; +} + + +@ Doubling the Size of a Hash Table +The whole point of making the table size a power of two is to +facilitate resizing the table. If the table size is $2^n$ and +$h$ is the hash key, then $h\bmod 2^n$ is the hash chain index +which can be calculated with bit-wise and, +{\mathchardef~="2026 $h ~ (2^n-1)$}. +When the table size doubles, the new bit-mask has one more bit +turned on. Elements of an old hash chain whose hash value have this bit +turned on get moved to a new chain. Elements with this bit turned off +stay on the same chain. On average only half the old chain moves to the +new chain. If the old chain is at ${\it table}[i],\ 0\le i < 2^n$, +then the elements that move, all move to the new chain at +${\it table}[i+2^n]$. 
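A small sketch of the index arithmetic may help here. The helper names [[chain_index]] and [[moves_on_doubling]] are assumptions for illustration and do not occur in [[array.c]], which works directly on the lists.

```c
/* Chain index of hash value h in a table whose mask is hmask = 2^n - 1.
   Equivalent to h mod 2^n. */
static unsigned chain_index(unsigned h, unsigned hmask)
{
    return h & hmask ;
}

/* Hypothetical helper: returns 1 if an element with hash value h changes
   chains when the table with mask old_hmask doubles, 0 if it stays. */
static int moves_on_doubling(unsigned h, unsigned old_hmask)
{
    unsigned new_hmask = (old_hmask << 1) + 1 ;  /* one more bit turned on */
    return chain_index(h, new_hmask) != chain_index(h, old_hmask) ;
}
```

For a table of size 4 (mask 3), the hash value 5 sits on chain 1 and moves to chain $1+4=5$ after doubling, while the hash value 9 stays on chain 1.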
+ +<>= +static void double_the_hash_table(A) + ARRAY A ; +{ + unsigned old_hmask = A->hmask ; + unsigned new_hmask = (old_hmask<<1)+1 ; + DUAL_LINK *table ; + <> + <> + <> + A->hmask = new_hmask ; + A->limit = hmask_to_limit(new_hmask) ; +} + + +<>= +A->ptr = zrealloc(A->ptr, (old_hmask+1)*sizeof(DUAL_LINK), + (new_hmask+1)*sizeof(DUAL_LINK)) ; +table = (DUAL_LINK*) A->ptr ; +/* zero out the new part which is the back half */ +memset(&table[old_hmask+1], 0, (old_hmask+1)*sizeof(DUAL_LINK)) ; + +<>= +if (A->type & AY_STR) { + int i ; /* index to old lists */ + int j ; /* index to new lists */ + ANODE *p ; /* walks an old list */ + ANODE *q ; /* trails p for deletion */ + ANODE *tail ; /* builds new list from the back */ + ANODE dummy0, dummy1 ; + for(i=0, j=old_hmask+1;i <= old_hmask; i++, j++) + <> +} + +@ +As we walk an old string list with pointer [[p]], the expression +[[p->hval & new_hmask]] takes one of two values. If it is equal +to [[p->hval & old_hmask]] (which equals [[i]]), +then the node stays otherwise it gets moved +to a new string list at [[j]]. The new string list preserves order so that +the positions of the move-to-the-front heuristic are preserved. +Nodes moving to the new list are appended at pointer [[tail]]. +The [[ANODEs]], [[dummy0]]~and [[dummy1]], are sentinels that remove +special handling of boundary conditions. + +<>= +{ + q = &dummy0 ; + q->slink = p = table[i].slink ; + tail = &dummy1 ; + while (p) { + if ((p->hval&new_hmask) != i) { /* move it */ + q->slink = p->slink ; + tail = tail->slink = p ; + } + else q = p ; + p = q->slink ; + } + table[i].slink = dummy0.slink ; + tail->slink = (ANODE*) 0 ; + table[j].slink = dummy1.slink ; +} + +@ +The doubling of the integer lists is exactly the same except that +[[slink]] is replaced by [[ilink]] and [[hval]] is replaced by [[ival]]. 
+ +<>= +if (A->type & AY_INT) { + int i ; /* index to old lists */ + int j ; /* index to new lists */ + ANODE *p ; /* walks an old list */ + ANODE *q ; /* trails p for deletion */ + ANODE *tail ; /* builds new list from the back */ + ANODE dummy0, dummy1 ; + for(i=0, j=old_hmask+1;i <= old_hmask; i++, j++) + <> +} + +<>= +{ + q = &dummy0 ; + q->ilink = p = table[i].ilink ; + tail = &dummy1 ; + while (p) { + if ((p->ival&new_hmask) != i) { /* move it */ + q->ilink = p->ilink ; + tail = tail->ilink = p ; + } + else q = p ; + p = q->ilink ; + } + table[i].ilink = dummy0.ilink ; + tail->ilink = (ANODE*) 0 ; + table[j].ilink = dummy1.ilink ; +} + +@ Initializing Array Loops +Our mechanism for dealing with execution of the statement, +\medskip +\centerline{[[for(i in A) {]] {\it statements} [[}]]} +\medskip +\noindent +is simple. We allocate a vector of [[STRING*]] of size, +[[A->size]]. Each element of the vector is a string key for~[[A]]. +Note that if the [[AY_STR]] bit of [[A]] is not set, then [[A]] +has to be converted to a string hash table, because the index +[[i]] walks string indices. + +To execute the loop, the only state that needs to be saved is the +address of [[i]] and an index into the vector of string keys. Since +nothing about [[A]] is saved as state, the user +program can do anything to [[A]] inside the body of +the loop, even [[delete A]], and the loop +still works. Essentially, we have traded data space (the string vector) +in exchange for implementation simplicity. On a 32-bit system, each +[[ANODE]] is 36 bytes, so the extra memory needed for the array loop is +11\% more than the memory consumed by the [[ANODEs]] of the array. +Note that the large size of the [[ANODEs]] is indicative of our whole +design which pays data space for integer lookup speed and algorithm +simplicity. + +The only aspect of array loops that occurs in [[array.c]] is construction +of the string vector. The rest of the implementation +is in the file [[execute.c]]. 
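The 11\% figure can be checked with a one-line calculation. It assumes each [[STRING*]] in the vector occupies 4 bytes on such a system; the pointer size is our assumption, since the text gives only the [[ANODE]] size.

```latex
\[
  \frac{4\ \mbox{bytes per vector entry}}{36\ \mbox{bytes per ANODE}}
  \approx 0.111
\]
```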
+ +<>= +STRING** array_loop_vector(A, sizep) + ARRAY A ; + unsigned *sizep ; +{ + STRING** ret ; + *sizep = A->size ; + if (A->size > 0) { + if (!(A->type & AY_STR)) add_string_associations(A) ; + ret = (STRING**) zmalloc(A->size*sizeof(STRING*)) ; + <> + return ret ; + } + else return (STRING**) 0 ; +} + +@ +As we walk over the hash table [[ANODEs]], putting each [[sval]] in +[[ret]], we need to increment each reference count. The user of the +return value is responsible for these new reference counts. + +<>= +{ + int r = 0 ; /* indexes ret */ + DUAL_LINK* table = (DUAL_LINK*) A->ptr ; + int i ; /* indexes table */ + ANODE *p ; /* walks slists */ + for(i=0;i <= A->hmask; i++) { + for(p = table[i].slink; p ; p = p->slink) { + ret[r++] = p->sval ; + p->sval->ref_cnt++ ; + } + } +} + +@ The Hash Function +Since a hash value is turned into a table index via bit-wise and with +\hbox{[[A->hmask]]}, it is important that the hash function does a good job +of scrambling the low-order bits of the returned hash value. +Empirical tests indicate the following function does an adequate job. +Note that for strings with length greater than 10, we only hash on +the first five characters, the last five characters and the length. 
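To experiment with the hash outside of mawk, a stand-alone sketch along the following lines may be useful. It takes a NUL-terminated C string and uses [[strlen]] in place of the [[len]] field of a [[STRING]]; that substitution and the name [[ahash_sketch]] are assumptions of this example.

```c
#include <string.h>

/* Stand-alone sketch of the hash described above.  Assumption: the key
   is a NUL-terminated C string, whereas mawk passes a STRING* that
   carries its own length. */
static unsigned ahash_sketch(const char *s)
{
    unsigned len = (unsigned) strlen(s) ;
    unsigned sum1 = len ;
    unsigned sum2 = sum1 ;
    const unsigned char *p = (const unsigned char *) s ;

    if (len <= 10) {                    /* short key: hash every character */
        for ( ; *p ; p++) {
            sum1 += sum1 + *p ;
            sum2 += sum1 ;
        }
    }
    else {                              /* long key: five from each end */
        const unsigned char *q = p + (len - 1) ;
        int cnt = 5 ;
        while (cnt--) {
            sum1 += sum1 + *p ;
            sum2 += sum1 ;
            sum1 += sum1 + *q ;
            sum2 += sum1 ;
            p++ ; q-- ;
        }
    }
    return sum2 ;
}
```

One consequence of hashing only the ends: two long keys of equal length that differ only in their middle characters get the same hash value and so land on the same list.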
+ +<>= +static unsigned ahash(sval) + STRING* sval ; +{ + unsigned sum1 = sval->len ; + unsigned sum2 = sum1 ; + unsigned char *p , *q ; + if (sum1 <= 10) { + for(p=(unsigned char*)sval->str; *p ; p++) { + sum1 += sum1 + *p ; + sum2 += sum1 ; + } + } + else { + int cnt = 5 ; + p = (unsigned char*)sval->str ; /* p starts at the front */ + q = (unsigned char*)sval->str + (sum1-1) ; /* q starts at the back */ + while( cnt ) { + cnt-- ; + sum1 += sum1 + *p ; + sum2 += sum1 ; + sum1 += sum1 + *q ; + sum2 += sum1 ; + p++ ; q-- ; + } + } + return sum2 ; +} + + +@ Concatenating Array Indices +In [[AWK]], an array expression [[A[i,j]]] is equivalent to the +expression [[A[i SUBSEP j]]], i.e., the index is the +concatenation of the three +elements [[i]], [[SUBSEP]] and [[j]]. This is performed by the +function [[array_cat]]. On entry, [[sp]] points at the top of a +stack of [[CELLs]]. +[[Cnt]] cells are popped off the stack and concatenated together +separated by [[SUBSEP]] and the result is pushed back on the stack. +On entry, the first multi-index is in [[sp[1-cnt]]] and the last is +in [[sp[0]]]. The return value is the new stack top. +(The stack is the run-time evaluation stack. +This operation really has nothing to do with array structure, so +logically this code belongs in [[execute.c]], but remains here for +historical reasons.) + + +<>= +CELL *array_cat(sp, cnt) + CELL *sp ; + int cnt ; +{ + CELL *p ; /* walks the eval stack */ + CELL subsep ; /* local copy of SUBSEP */ + <> + unsigned total_len ; /* length of cat'ed expression */ + CELL *top ; /* value of sp at entry */ + char *target ; /* build cat'ed char* here */ + STRING *sval ; /* build cat'ed STRING here */ + <> + <> + <> + <> + <> +} + +@ +We make a copy of [[SUBSEP]] which we can cast to string in the +unlikely event the user has assigned a number to [[SUBSEP]]. 
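The concatenation that [[array_cat]] performs can be sketched independently of the evaluation stack. The version below is only an illustration: it works on plain C strings with [[malloc]] instead of [[CELLs]] and [[STRINGs]], and the name [[join_indices]] is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of mawk: joins cnt NUL-terminated strings
   with sep between neighbours.  Returns a malloc'ed string the caller must
   free, or NULL if cnt < 1 or the allocation fails. */
static char *join_indices(const char *const *parts, int cnt, const char *sep)
{
    size_t sep_len = strlen(sep) ;
    size_t total_len ;
    char *result, *target ;
    int i ;

    if (cnt < 1) return NULL ;
    /* cnt strings plus cnt-1 copies of the separator */
    total_len = (size_t) (cnt - 1) * sep_len ;
    for (i = 0 ; i < cnt ; i++) total_len += strlen(parts[i]) ;

    result = malloc(total_len + 1) ;
    if (!result) return NULL ;

    target = result ;
    for (i = 0 ; i < cnt ; i++) {
        size_t len = strlen(parts[i]) ;
        memcpy(target, parts[i], len) ;
        target += len ;
        if (i < cnt - 1) {              /* no separator after the last part */
            memcpy(target, sep, sep_len) ;
            target += sep_len ;
        }
    }
    *target = '\0' ;
    return result ;
}
```

Called with the two parts [["i"]] and [["j"]] and the separator [["\034"]], a commonly used value for [[SUBSEP]], it builds the single key that [[A[i,j]]] would look up.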
+ +<>= +unsigned subsep_len ; /* string length of subsep_str */ +char *subsep_str ; + +<>= +cellcpy(&subsep, SUBSEP) ; +if ( subsep.type < C_STRING ) cast1_to_s(&subsep) ; +subsep_len = string(&subsep)->len ; +subsep_str = string(&subsep)->str ; + +@ +Set [[sp]] and [[top]] so the cells to concatenate are inclusively +between [[sp]] and [[top]]. + +<>= +top = sp ; sp -= (cnt-1) ; + +@ +The [[total_len]] is the sum of the lengths of the [[cnt]] +strings and the [[cnt-1]] copies of [[subsep]]. + +<>= +total_len = (cnt-1)*subsep_len ; +for(p = sp ; p <= top ; p++) { + if ( p->type < C_STRING ) cast1_to_s(p) ; + total_len += string(p)->len ; +} + +<>= +sval = new_STRING0(total_len) ; +target = sval->str ; +for(p = sp ; p < top ; p++) { + memcpy(target, string(p)->str, string(p)->len) ; + target += string(p)->len ; + memcpy(target, subsep_str, subsep_len) ; + target += subsep_len ; +} +/* now p == top */ +memcpy(target, string(p)->str, string(p)->len) ; + +@ +The return value is [[sp]] and it is already set correctly. We +just need to free the strings and set the contents of [[sp]]. + +<>= +for(p = sp; p <= top ; p++) free_STRING(string(p)) ; +free_STRING(string(&subsep)) ; +/* set contents of sp , sp->type > C_STRING is possible so reset */ +sp->type = C_STRING ; +sp->ptr = (PTR) sval ; +return sp ; + +@ Loose Ends +Here are some things we want to make sure end up in the [[.c]] and +[[.h]] files. +The compiler needs prototypes for the local functions, and we will +put a copyright and links to the source file, [[array.w]], in each +output file. 
+ +<>= +static ANODE* PROTO(find_by_ival,(ARRAY, Int, int)) ; +static ANODE* PROTO(find_by_sval,(ARRAY, STRING*, int)) ; +static void PROTO(add_string_associations,(ARRAY)) ; +static void PROTO(make_empty_table,(ARRAY, int)) ; +static void PROTO(convert_split_array_to_table,(ARRAY)) ; +static void PROTO(double_the_hash_table,(ARRAY)) ; +static unsigned PROTO(ahash, (STRING*)) ; + + +<>= +/* +array.c +<> +*/ + +/* +This file was generated with the command + + notangle -R'"array.c"' array.w > array.c + +<> +*/ + +<>= +Notangle is part of Norman Ramsey's noweb literate programming package +available from CTAN(ftp.shsu.edu). + +It's easiest to read or modify this file by working with array.w. +<>= +/* +array.h +<> +*/ + +/* +This file was generated with the command + + notangle -R'"array.h"' array.w > array.h + +<> +*/ + +<>= +copyright 1991-96, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. diff --git a/atarist/README.ST b/atarist/README.ST new file mode 100644 index 0000000..8a2920a --- /dev/null +++ b/atarist/README.ST @@ -0,0 +1,25 @@ +The following worked for mawk1.1.x. +It has not been updated for mawk1.2(beta) +------------------------------------------------------------- + +This is a quick mawk port to the atariST/StE/TT using +gcc + libs @ Patchlevel 73 + +Compiling: + if you are cross compiling issue: + build_mawk atarist + + if you are native compiling issue: + gnumake CC=gcc CFLAGS=-O RANLIB= AR=gcc-ar MATHLIB=-lpml + +Testing: + use the gulam scripts mawktest.g and fpe_test.g in the test directory. + Make sure you pick up cmp or diff from the gnu stuff + on atari.archive to run the test scripts. + +report atari related problems to: +-- +bang: uunet!cadence!bammi jwahar r. 
bammi +domain: bammi@cadence.com +GEnie: J.Bammi +CIS: 71515,155 diff --git a/bi_funct.c b/bi_funct.c new file mode 100644 index 0000000..b57f1da --- /dev/null +++ b/bi_funct.c @@ -0,0 +1,1014 @@ + +/******************************************** +bi_funct.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/* $Log: bi_funct.c,v $ + * Revision 1.9 1996/01/14 17:16:11 mike + * flush_all_output() before system() + * + * Revision 1.8 1995/08/27 18:13:03 mike + * fix random number generator to work with longs larger than 32bits + * + * Revision 1.7 1995/06/09 22:53:30 mike + * change a memcmp() to strncmp() to make purify happy + * + * Revision 1.6 1994/12/13 00:26:32 mike + * rt_nr and rt_fnr for run-time error messages + * + * Revision 1.5 1994/12/11 22:14:11 mike + * remove THINK_C #defines. Not a political statement, just no indication + * that anyone ever used it. + * + * Revision 1.4 1994/12/10 21:44:12 mike + * fflush builtin + * + * Revision 1.3 1993/07/14 11:46:36 mike + * code cleanup + * + * Revision 1.2 1993/07/14 01:22:27 mike + * rm SIZE_T + * + * Revision 1.1.1.1 1993/07/03 18:58:08 mike + * move source to cvs + * + * Revision 5.5 1993/02/13 21:57:18 mike + * merge patch3 + * + * Revision 5.4 1993/01/01 21:30:48 mike + * split new_STRING() into new_STRING and new_STRING0 + * + * Revision 5.3.1.2 1993/01/27 01:04:06 mike + * minor tuning to str_str() + * + * Revision 5.3.1.1 1993/01/15 03:33:35 mike + * patch3: safer double to int conversion + * + * Revision 5.3 1992/12/17 02:48:01 mike + * 1.1.2d changes for DOS + * + * Revision 5.2 1992/07/08 15:43:41 brennan + * patch2: length returns. 
I am a wimp + * + * Revision 5.1 1991/12/05 07:55:35 brennan + * 1.1 pre-release + * +*/ + + +#include "mawk.h" +#include "bi_funct.h" +#include "bi_vars.h" +#include "memory.h" +#include "init.h" +#include "files.h" +#include "fin.h" +#include "field.h" +#include "regexp.h" +#include "repl.h" +#include +#include + +/* statics */ +static STRING *PROTO(gsub, (PTR, CELL *, char *, int)) ; +static void PROTO(fplib_err, (char *, double, char *)) ; + +/* global for the disassembler */ +BI_REC bi_funct[] = +{ /* info to load builtins */ + + {"length", bi_length, 0, 1}, /* special must come first */ + {"index", bi_index, 2, 2}, + {"substr", bi_substr, 2, 3}, + {"sprintf", bi_sprintf, 1, 255}, + {"sin", bi_sin, 1, 1}, + {"cos", bi_cos, 1, 1}, + {"atan2", bi_atan2, 2, 2}, + {"exp", bi_exp, 1, 1}, + {"log", bi_log, 1, 1}, + {"int", bi_int, 1, 1}, + {"sqrt", bi_sqrt, 1, 1}, + {"rand", bi_rand, 0, 0}, + {"srand", bi_srand, 0, 1}, + {"close", bi_close, 1, 1}, + {"system", bi_system, 1, 1}, + {"toupper", bi_toupper, 1, 1}, + {"tolower", bi_tolower, 1, 1}, + {"fflush", bi_fflush, 0, 1}, + + {(char *) 0, (PF_CP) 0, 0, 0}} ; + + +/* load built-in functions in symbol table */ +void +bi_funct_init() +{ + register BI_REC *p ; + register SYMTAB *stp ; + + /* length is special (posix bozo) */ + stp = insert(bi_funct->name) ; + stp->type = ST_LENGTH ; + stp->stval.bip = bi_funct ; + + for (p = bi_funct + 1; p->name; p++) + { + stp = insert(p->name) ; + stp->type = ST_BUILTIN ; + stp->stval.bip = p ; + } + + /* seed rand() off the clock */ + { + CELL c ; + + c.type = 0 ; bi_srand(&c) ; + } + +} + +/************************************************** + string builtins (except split (in split.c) and [g]sub (at end)) + **************************************************/ + +CELL * +bi_length(sp) + register CELL *sp ; +{ + unsigned len ; + + if (sp->type == 0) cellcpy(sp, field) ; + else sp-- ; + + if (sp->type < C_STRING) cast1_to_s(sp) ; + len = string(sp)->len ; + + free_STRING(string(sp)) ; 
+ sp->type = C_DOUBLE ; + sp->dval = (double) len ; + + return sp ; +} + +char * +str_str(target, key, key_len) + register char *target ; + char *key ; + unsigned key_len ; +{ + register int k = key[0] ; + + switch (key_len) + { + case 0: + return (char *) 0 ; + case 1: + return strchr(target, k) ; + case 2: + { + int k1 = key[1] ; + while ((target = strchr(target, k))) + if (target[1] == k1) return target ; + else target++ ; + /*failed*/ + return (char *) 0 ; + } + } + + key_len-- ; + while ((target = strchr(target, k))) + { + if (strncmp(target + 1, key + 1, key_len) == 0) return target ; + else target++ ; + } + /*failed*/ + return (char *) 0 ; +} + + + +CELL * +bi_index(sp) + register CELL *sp ; +{ + register int idx ; + unsigned len ; + char *p ; + + sp-- ; + if (TEST2(sp) != TWO_STRINGS) cast2_to_s(sp) ; + + if ((len = string(sp + 1)->len)) + idx = (p = str_str(string(sp)->str, string(sp + 1)->str, len)) + ? p - string(sp)->str + 1 : 0 ; + + else /* index of the empty string */ + idx = 1 ; + + free_STRING(string(sp)) ; + free_STRING(string(sp + 1)) ; + sp->type = C_DOUBLE ; + sp->dval = (double) idx ; + return sp ; +} + +/* substr(s, i, n) + if l = length(s) then get the characters + from max(1,i) to min(l,i+n-1) inclusive */ + +CELL * +bi_substr(sp) + CELL *sp ; +{ + int n_args, len ; + register int i, n ; + STRING *sval ; /* substr(sval->str, i, n) */ + + n_args = sp->type ; + sp -= n_args ; + if (sp->type != C_STRING) cast1_to_s(sp) ; + /* don't use < C_STRING shortcut */ + sval = string(sp) ; + + if ((len = sval->len) == 0) /* substr on null string */ + { + if (n_args == 3) { cell_destroy(sp + 2) ; } + cell_destroy(sp + 1) ; + return sp ; + } + + if (n_args == 2) + { + n = MAX__INT ; + if (sp[1].type != C_DOUBLE) cast1_to_d(sp + 1) ; + } + else + { + if (TEST2(sp + 1) != TWO_DOUBLES) cast2_to_d(sp + 1) ; + n = d_to_i(sp[2].dval) ; + } + i = d_to_i(sp[1].dval) - 1 ; /* i now indexes into string */ + + if ( i < 0 ) { n += i ; i = 0 ; } + if (n > len - i) n = 
len - i ; + + if (n <= 0) /* the null string */ + { + sp->ptr = (PTR) & null_str ; + null_str.ref_cnt++ ; + } + else /* got something */ + { + sp->ptr = (PTR) new_STRING0(n) ; + memcpy(string(sp)->str, sval->str + i, n) ; + } + + free_STRING(sval) ; + return sp ; +} + +/* + match(s,r) + sp[0] holds r, sp[-1] holds s +*/ + +CELL * +bi_match(sp) + register CELL *sp ; +{ + char *p ; + unsigned length ; + + if (sp->type != C_RE) cast_to_RE(sp) ; + if ((--sp)->type < C_STRING) cast1_to_s(sp) ; + + cell_destroy(RSTART) ; + cell_destroy(RLENGTH) ; + RSTART->type = C_DOUBLE ; + RLENGTH->type = C_DOUBLE ; + + p = REmatch(string(sp)->str, (sp + 1)->ptr, &length) ; + + if (p) + { + sp->dval = (double) (p - string(sp)->str + 1) ; + RLENGTH->dval = (double) length ; + } + else + { + sp->dval = 0.0 ; + RLENGTH->dval = -1.0 ; /* posix */ + } + + free_STRING(string(sp)) ; + sp->type = C_DOUBLE ; + + RSTART->dval = sp->dval ; + + return sp ; +} + +CELL * +bi_toupper(sp) + CELL *sp ; +{ + STRING *old ; + register char *p, *q ; + + if (sp->type != C_STRING) cast1_to_s(sp) ; + old = string(sp) ; + sp->ptr = (PTR) new_STRING0(old->len) ; + + q = string(sp)->str ; p = old->str ; + while (*p) + { + *q = *p++ ; + if (*q >= 'a' && *q <= 'z') *q += 'A' - 'a' ; + q++ ; + } + free_STRING(old) ; + return sp ; +} + +CELL * +bi_tolower(sp) + CELL *sp ; +{ + STRING *old ; + register char *p, *q ; + + if (sp->type != C_STRING) cast1_to_s(sp) ; + old = string(sp) ; + sp->ptr = (PTR) new_STRING0(old->len) ; + + q = string(sp)->str ; p = old->str ; + while (*p) + { + *q = *p++ ; + if (*q >= 'A' && *q <= 'Z') *q += 'a' - 'A' ; + q++ ; + } + free_STRING(old) ; + return sp ; +} + + +/************************************************ + arithmetic builtins + ************************************************/ + +static void +fplib_err(fname, val, error) + char *fname ; + double val; + char *error ; +{ + rt_error("%s(%g) : %s", fname, val, error) ; +} + + +CELL * +bi_sin(sp) + register CELL *sp ; +{ +#if ! 
STDC_MATHERR + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + sp->dval = sin(sp->dval) ; + return sp ; +#else + double x; + + errno = 0 ; + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + x = sp->dval ; + sp->dval = sin(sp->dval) ; + if (errno) fplib_err("sin", x, "loss of precision") ; + return sp ; +#endif +} + +CELL * +bi_cos(sp) + register CELL *sp ; +{ +#if ! STDC_MATHERR + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + sp->dval = cos(sp->dval) ; + return sp ; +#else + double x; + + errno = 0 ; + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + x = sp->dval ; + sp->dval = cos(sp->dval) ; + if (errno) fplib_err("cos", x, "loss of precision") ; + return sp ; +#endif +} + +CELL * +bi_atan2(sp) + register CELL *sp ; +{ +#if ! STDC_MATHERR + sp-- ; + if (TEST2(sp) != TWO_DOUBLES) cast2_to_d(sp) ; + sp->dval = atan2(sp->dval, (sp + 1)->dval) ; + return sp ; +#else + + errno = 0 ; + sp-- ; + if (TEST2(sp) != TWO_DOUBLES) cast2_to_d(sp) ; + sp->dval = atan2(sp->dval, (sp + 1)->dval) ; + if (errno) rt_error("atan2(0,0) : domain error") ; + return sp ; +#endif +} + +CELL * +bi_log(sp) + register CELL *sp ; +{ +#if ! STDC_MATHERR + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + sp->dval = log(sp->dval) ; + return sp ; +#else + double x; + + errno = 0 ; + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + x = sp->dval ; + sp->dval = log(sp->dval) ; + if (errno) fplib_err("log", x, "domain error") ; + return sp ; +#endif +} + +CELL * +bi_exp(sp) + register CELL *sp ; +{ +#if ! STDC_MATHERR + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + sp->dval = exp(sp->dval) ; + return sp ; +#else + double x; + + errno = 0 ; + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + x = sp->dval ; + sp->dval = exp(sp->dval) ; + if (errno && sp->dval) fplib_err("exp", x, "overflow") ; + /* on underflow sp->dval==0, ignore */ + return sp ; +#endif +} + +CELL * +bi_int(sp) + register CELL *sp ; +{ + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + sp->dval = sp->dval >= 0.0 ? 
floor(sp->dval) : ceil(sp->dval) ; + return sp ; +} + +CELL * +bi_sqrt(sp) + register CELL *sp ; +{ +#if ! STDC_MATHERR + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + sp->dval = sqrt(sp->dval) ; + return sp ; +#else + double x; + + errno = 0 ; + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + x = sp->dval ; + sp->dval = sqrt(sp->dval) ; + if (errno) fplib_err("sqrt", x, "domain error") ; + return sp ; +#endif +} + +#ifndef NO_TIME_H +#include +#else +#include +#endif + + +/* For portability, we'll use our own random number generator , taken + from: Park, SK and Miller KW, "Random Number Generators: + Good Ones are Hard to Find", CACM, 31, 1192-1201, 1988. +*/ + +static long seed ; /* must be >=1 and < 2^31-1 */ +static CELL cseed ; /* argument of last call to srand() */ + +#define M 0x7fffffff /* 2^31-1 */ +#define MX 0xffffffff +#define A 16807 +#define Q 127773 /* M/A */ +#define R 2836 /* M%A */ + +#if M == MAX__LONG +#define crank(s) s = A * (s % Q) - R * (s / Q) ;\ + if ( s <= 0 ) s += M +#else +/* 64 bit longs */ +#define crank(s) { unsigned long t = s ;\ + t = (A * (t % Q) - R * (t / Q)) & MX ;\ + if ( t >= M ) t = (t+M)&M ;\ + s = t ;\ + } +#endif + + +CELL * +bi_srand(sp) + register CELL *sp ; +{ + CELL c ; + + if (sp->type == 0) /* seed off clock */ + { + cellcpy(sp, &cseed) ; + cell_destroy(&cseed) ; + cseed.type = C_DOUBLE ; + cseed.dval = (double) time((time_t *) 0) ; + } + else /* user seed */ + { + sp-- ; + /* swap cseed and *sp ; don't need to adjust ref_cnts */ + c = *sp ; *sp = cseed ; cseed = c ; + } + + /* The old seed is now in *sp ; move the value in cseed to + seed in range [1,M) */ + + cellcpy(&c, &cseed) ; + if (c.type == C_NOINIT) cast1_to_d(&c) ; + + seed = c.type == C_DOUBLE ? 
(d_to_i(c.dval) & M) % M + 1 : + hash(string(&c)->str) % M + 1 ; + if( seed == M ) seed = M-1 ; + + cell_destroy(&c) ; + + /* crank it once so close seeds don't give a close + first result */ + crank(seed) ; + + return sp ; +} + +CELL * +bi_rand(sp) + register CELL *sp ; +{ + crank(seed) ; + sp++ ; + sp->type = C_DOUBLE ; + sp->dval = (double) seed / (double) M ; + return sp ; +} +#undef A +#undef M +#undef MX +#undef Q +#undef R +#undef crank + +/************************************************* + miscellaneous builtins + close, system and getline + fflush + *************************************************/ + +CELL * +bi_close(sp) + register CELL *sp ; +{ + int x ; + + if (sp->type < C_STRING) cast1_to_s(sp) ; + x = file_close((STRING *) sp->ptr) ; + free_STRING(string(sp)) ; + sp->type = C_DOUBLE ; + sp->dval = (double) x ; + return sp ; +} + + +CELL * +bi_fflush(sp) + register CELL *sp ; +{ + int ret = 0 ; + + if ( sp->type == 0 ) fflush(stdout) ; + else + { + sp-- ; + if ( sp->type < C_STRING ) cast1_to_s(sp) ; + ret = file_flush(string(sp)) ; + free_STRING(string(sp)) ; + } + + sp->type = C_DOUBLE ; + sp->dval = (double) ret ; + return sp ; +} + + + +#if HAVE_REAL_PIPES + +CELL * +bi_system(sp) + CELL *sp ; +{ + int pid ; + unsigned ret_val ; + + if (sp->type < C_STRING) cast1_to_s(sp) ; + + flush_all_output() ; + switch (pid = fork()) + { + case -1: /* fork failed */ + + errmsg(errno, "could not create a new process") ; + ret_val = 127 ; + break ; + + case 0: /* the child */ + execl(shell, shell, "-c", string(sp)->str, (char *) 0) ; + /* if get here, execl() failed */ + errmsg(errno, "execute of %s failed", shell) ; + fflush(stderr) ; + _exit(127) ; + + default: /* wait for the child */ + ret_val = wait_for(pid) ; + break ; + } + + cell_destroy(sp) ; + sp->type = C_DOUBLE ; + sp->dval = (double) ret_val ; + return sp ; +} + +#endif /* HAVE_REAL_PIPES */ + + + +#if MSDOS + + +CELL * +bi_system(sp) + register CELL *sp ; +{ + int retval ; + + if (sp->type < 
C_STRING) cast1_to_s(sp) ; + retval = DOSexec(string(sp)->str) ; + free_STRING(string(sp)) ; + sp->type = C_DOUBLE ; + sp->dval = (double) retval ; + return sp ; +} + +#endif + + +/* getline() */ + +/* if type == 0 : stack is 0 , target address + + if type == F_IN : stack is F_IN, expr(filename), target address + if type == PIPE_IN : stack is PIPE_IN, target address, expr(pipename) +*/ + +CELL * +bi_getline(sp) + register CELL *sp ; +{ + CELL tc, *cp ; + char *p ; + unsigned len ; + FIN *fin_p ; + + + switch (sp->type) + { + case 0: + sp-- ; + if (!main_fin) open_main() ; + + if (!(p = FINgets(main_fin, &len))) goto eof ; + + cp = (CELL *) sp->ptr ; + if (TEST2(NR) != TWO_DOUBLES) cast2_to_d(NR) ; + NR->dval += 1.0 ; rt_nr++ ; + FNR->dval += 1.0 ; rt_fnr++ ; + break ; + + case F_IN: + sp-- ; + if (sp->type < C_STRING) cast1_to_s(sp) ; + fin_p = (FIN *) file_find(sp->ptr, F_IN) ; + free_STRING(string(sp)) ; + sp-- ; + + if (!fin_p) goto open_failure ; + if (!(p = FINgets(fin_p, &len))) + { + FINsemi_close(fin_p) ; + goto eof ; + } + cp = (CELL *) sp->ptr ; + break ; + + case PIPE_IN: + sp -= 2 ; + if (sp->type < C_STRING) cast1_to_s(sp) ; + fin_p = (FIN *) file_find(sp->ptr, PIPE_IN) ; + free_STRING(string(sp)) ; + + if (!fin_p) goto open_failure ; + if (!(p = FINgets(fin_p, &len))) + { + FINsemi_close(fin_p) ; +#if HAVE_REAL_PIPES + /* reclaim process slot */ + wait_for(0) ; +#endif + goto eof ; + } + cp = (CELL *) (sp + 1)->ptr ; + break ; + + default: + bozo("type in bi_getline") ; + + } + + /* we've read a line , store it */ + + if (len == 0) + { + tc.type = C_STRING ; + tc.ptr = (PTR) & null_str ; + null_str.ref_cnt++ ; + } + else + { + tc.type = C_MBSTRN ; + tc.ptr = (PTR) new_STRING0(len) ; + memcpy(string(&tc)->str, p, len) ; + } + + slow_cell_assign(cp, &tc) ; + + cell_destroy(&tc) ; + + sp->dval = 1.0 ; goto done ; + open_failure: + sp->dval = -1.0 ; goto done ; + eof: + sp->dval = 0.0 ; /* fall thru to done */ + + done:sp->type = C_DOUBLE; + return sp ; 
+} + +/********************************************** + sub() and gsub() + **********************************************/ + +/* entry: sp[0] = address of CELL to sub on + sp[-1] = substitution CELL + sp[-2] = regular expression to match +*/ + +CELL * +bi_sub(sp) + register CELL *sp ; +{ + CELL *cp ; /* pointer to the replacement target */ + CELL tc ; /* build the new string here */ + CELL sc ; /* copy of the target CELL */ + char *front, *middle, *back ; /* pieces */ + unsigned front_len, middle_len, back_len ; + + sp -= 2 ; + if (sp->type != C_RE) cast_to_RE(sp) ; + if (sp[1].type != C_REPL && sp[1].type != C_REPLV) + cast_to_REPL(sp + 1) ; + cp = (CELL *) (sp + 2)->ptr ; + /* make a copy of the target, because we won't change anything + including type unless the match works */ + cellcpy(&sc, cp) ; + if (sc.type < C_STRING) cast1_to_s(&sc) ; + front = string(&sc)->str ; + + if ((middle = REmatch(front, sp->ptr, &middle_len))) + { + front_len = middle - front ; + back = middle + middle_len ; + back_len = string(&sc)->len - front_len - middle_len ; + + if ((sp + 1)->type == C_REPLV) + { + STRING *sval = new_STRING0(middle_len) ; + + memcpy(sval->str, middle, middle_len) ; + replv_to_repl(sp + 1, sval) ; + free_STRING(sval) ; + } + + tc.type = C_STRING ; + tc.ptr = (PTR) new_STRING0( + front_len + string(sp + 1)->len + back_len) ; + + { + char *p = string(&tc)->str ; + + if (front_len) + { + memcpy(p, front, front_len) ; + p += front_len ; + } + if (string(sp + 1)->len) + { + memcpy(p, string(sp + 1)->str, string(sp + 1)->len) ; + p += string(sp + 1)->len ; + } + if (back_len) memcpy(p, back, back_len) ; + } + + slow_cell_assign(cp, &tc) ; + + free_STRING(string(&tc)) ; + } + + free_STRING(string(&sc)) ; + repl_destroy(sp + 1) ; + sp->type = C_DOUBLE ; + sp->dval = middle != (char *) 0 ? 
1.0 : 0.0 ; + return sp ; +} + +static unsigned repl_cnt ; /* number of global replacements */ + +/* recursive global substitution + dealing with empty matches makes this mildly painful +*/ + +static STRING * +gsub(re, repl, target, flag) + PTR re ; + CELL *repl ; /* always of type REPL or REPLV, + destroyed by caller */ + char *target ; + + /* if on, match of empty string at front is OK */ + int flag ; +{ + char *front, *middle ; + STRING *back ; + unsigned front_len, middle_len ; + STRING *ret_val ; + CELL xrepl ; /* a copy of repl so we can change repl */ + + if (!(middle = REmatch(target, re, &middle_len))) + return new_STRING(target) ;/* no match */ + + cellcpy(&xrepl, repl) ; + + if (!flag && middle_len == 0 && middle == target) + { /* match at front that's not allowed */ + + if (*target == 0) /* target is empty string */ + { + repl_destroy(&xrepl) ; + null_str.ref_cnt++ ; + return &null_str ; + } + else + { + char xbuff[2] ; + + front_len = 0 ; + /* make new repl with target[0] */ + repl_destroy(repl) ; + xbuff[0] = *target++ ; xbuff[1] = 0 ; + repl->type = C_REPL ; + repl->ptr = (PTR) new_STRING(xbuff) ; + back = gsub(re, &xrepl, target, 1) ; + } + } + else /* a match that counts */ + { + repl_cnt++ ; + + front = target ; + front_len = middle - target ; + + if (*middle == 0) /* matched back of target */ + { + back = &null_str ; + null_str.ref_cnt++ ; + } + else back = gsub(re, &xrepl, middle + middle_len, 0) ; + + /* patch the &'s if needed */ + if (repl->type == C_REPLV) + { + STRING *sval = new_STRING0(middle_len) ; + + memcpy(sval->str, middle, middle_len) ; + replv_to_repl(repl, sval) ; + free_STRING(sval) ; + } + } + + /* put the three pieces together */ + ret_val = new_STRING0(front_len + string(repl)->len + back->len) ; + { + char *p = ret_val->str ; + + if (front_len) + { + memcpy(p, front, front_len) ; + p += front_len ; + } + + if (string(repl)->len) + { + memcpy(p, string(repl)->str, string(repl)->len) ; + p += string(repl)->len ; + } + if 
(back->len) memcpy(p, back->str, back->len) ; + } + + /* cleanup, repl is freed by the caller */ + repl_destroy(&xrepl) ; + free_STRING(back) ; + + return ret_val ; +} + +/* set up for call to gsub() */ +CELL * +bi_gsub(sp) + register CELL *sp ; +{ + CELL *cp ; /* pts at the replacement target */ + CELL sc ; /* copy of replacement target */ + CELL tc ; /* build the result here */ + + sp -= 2 ; + if (sp->type != C_RE) cast_to_RE(sp) ; + if ((sp + 1)->type != C_REPL && (sp + 1)->type != C_REPLV) + cast_to_REPL(sp + 1) ; + + cellcpy(&sc, cp = (CELL *) (sp + 2)->ptr) ; + if (sc.type < C_STRING) cast1_to_s(&sc) ; + + repl_cnt = 0 ; + tc.ptr = (PTR) gsub(sp->ptr, sp + 1, string(&sc)->str, 1) ; + + if (repl_cnt) + { + tc.type = C_STRING ; + slow_cell_assign(cp, &tc) ; + } + + /* cleanup */ + free_STRING(string(&sc)) ; free_STRING(string(&tc)) ; + repl_destroy(sp + 1) ; + + sp->type = C_DOUBLE ; + sp->dval = (double) repl_cnt ; + return sp ; +} + diff --git a/bi_funct.h b/bi_funct.h new file mode 100644 index 0000000..d040f7b --- /dev/null +++ b/bi_funct.h @@ -0,0 +1,67 @@ + +/******************************************** +bi_funct.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + + +/* $Log: bi_funct.h,v $ + * Revision 1.2 1994/12/11 22:10:15 mike + * fflush + * + * Revision 1.1.1.1 1993/07/03 18:58:08 mike + * move source to cvs + * + * Revision 5.1 1991/12/05 07:59:03 brennan + * 1.1 pre-release + * +*/ + +#ifndef BI_FUNCT_H +#define BI_FUNCT_H 1 + +#include "symtype.h" + +extern BI_REC bi_funct[] ; + +void PROTO(bi_init, (void) ) ; + +/* builtin string functions */ +CELL *PROTO( bi_print, (CELL *) ) ; +CELL *PROTO( bi_printf, (CELL *) ) ; +CELL *PROTO( bi_length, (CELL *) ) ; +CELL *PROTO( bi_index, (CELL *) ) ; +CELL *PROTO( bi_substr, (CELL *) ) ; +CELL *PROTO( bi_sprintf, (CELL *) ) ; +CELL *PROTO( bi_split, (CELL *) ) ; +CELL *PROTO( bi_match, (CELL *) ) ; +CELL *PROTO( bi_getline, (CELL *) ) ; +CELL *PROTO( bi_sub, (CELL *) ) ; +CELL *PROTO( bi_gsub, (CELL *) ) ; +CELL *PROTO( bi_toupper, (CELL*) ) ; +CELL *PROTO( bi_tolower, (CELL*) ) ; + +/* builtin arith functions */ +CELL *PROTO( bi_sin, (CELL *) ) ; +CELL *PROTO( bi_cos, (CELL *) ) ; +CELL *PROTO( bi_atan2, (CELL *) ) ; +CELL *PROTO( bi_log, (CELL *) ) ; +CELL *PROTO( bi_exp, (CELL *) ) ; +CELL *PROTO( bi_int, (CELL *) ) ; +CELL *PROTO( bi_sqrt, (CELL *) ) ; +CELL *PROTO( bi_srand, (CELL *) ) ; +CELL *PROTO( bi_rand, (CELL *) ) ; + +/* other builtins */ +CELL *PROTO( bi_close, (CELL *) ) ; +CELL *PROTO( bi_system, (CELL *) ) ; +CELL *PROTO( bi_fflush, (CELL *) ) ; + +#endif /* BI_FUNCT_H */ + diff --git a/bi_funct.o b/bi_funct.o new file mode 100644 index 0000000..c4ad0e9 Binary files /dev/null and b/bi_funct.o differ diff --git a/bi_vars.c b/bi_vars.c new file mode 100644 index 0000000..5ae9717 --- /dev/null +++ b/bi_vars.c @@ -0,0 +1,92 @@ + +/******************************************** +bi_vars.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/* $Log: bi_vars.c,v $ + * Revision 1.1.1.1 1993/07/03 18:58:09 mike + * move source to cvs + * + * Revision 5.2 1992/07/10 16:17:10 brennan + * MsDOS: remove NO_BINMODE macro + * + * Revision 5.1 1991/12/05 07:55:38 brennan + * 1.1 pre-release + * +*/ + + +/* bi_vars.c */ + +#include "mawk.h" +#include "symtype.h" +#include "bi_vars.h" +#include "field.h" +#include "init.h" +#include "memory.h" + +/* the builtin variables */ +CELL bi_vars[NUM_BI_VAR] ; + +/* the order here must match the order in bi_vars.h */ + +static char *bi_var_names[NUM_BI_VAR] = { +"NR" , +"FNR" , +"ARGC" , +"FILENAME" , +"OFS" , +"ORS" , +"RLENGTH" , +"RSTART" , +"SUBSEP" +#if MSDOS +, "BINMODE" +#endif +} ; + +/* insert the builtin vars in the hash table */ + +void bi_vars_init() +{ register int i ; + register SYMTAB *s ; + + + for ( i = 0 ; i < NUM_BI_VAR ; i++ ) + { s = insert( bi_var_names[i] ) ; + s->type = i <= 1 ? ST_NR : ST_VAR ; + s->stval.cp = bi_vars + i ; + /* bi_vars[i].type = 0 which is C_NOINIT */ + } + + s = insert("ENVIRON") ; + s->type = ST_ENV ; + + /* set defaults */ + + FILENAME->type = C_STRING ; + FILENAME->ptr = (PTR) new_STRING( "" ) ; + + OFS->type = C_STRING ; + OFS->ptr = (PTR) new_STRING( " " ) ; + + ORS->type = C_STRING ; + ORS->ptr = (PTR) new_STRING( "\n" ) ; + + SUBSEP->type = C_STRING ; + SUBSEP->ptr = (PTR) new_STRING( "\034" ) ; + + NR->type = FNR->type = C_DOUBLE ; + /* dval is already 0.0 */ + +#if MSDOS + BINMODE->type = C_DOUBLE ; +#endif +} diff --git a/bi_vars.h b/bi_vars.h new file mode 100644 index 0000000..31981ca --- /dev/null +++ b/bi_vars.h @@ -0,0 +1,59 @@ + +/******************************************** +bi_vars.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + + +/* $Log: bi_vars.h,v $ + * Revision 1.1.1.1 1993/07/03 18:58:09 mike + * move source to cvs + * + * Revision 5.2 1992/07/10 16:17:10 brennan + * MsDOS: remove NO_BINMODE macro + * + * Revision 5.1 1991/12/05 07:59:05 brennan + * 1.1 pre-release + * +*/ + + +/* bi_vars.h */ + +#ifndef BI_VARS_H +#define BI_VARS_H 1 + + +/* builtin variables NF, RS, FS, OFMT are stored + internally in field[], so side effects of assignment can + be handled +*/ + +/* NR and FNR must be next to each other */ +#define NR bi_vars +#define FNR (bi_vars+1) +#define ARGC (bi_vars+2) +#define FILENAME (bi_vars+3) +#define OFS (bi_vars+4) +#define ORS (bi_vars+5) +#define RLENGTH (bi_vars+6) +#define RSTART (bi_vars+7) +#define SUBSEP (bi_vars+8) + +#if MSDOS +#define BINMODE (bi_vars+9) +#define NUM_BI_VAR 10 +#else +#define NUM_BI_VAR 9 +#endif + +extern CELL bi_vars[NUM_BI_VAR] ; + + +#endif diff --git a/bi_vars.o b/bi_vars.o new file mode 100644 index 0000000..a27d20b Binary files /dev/null and b/bi_vars.o differ diff --git a/build b/build new file mode 100644 index 0000000..e69de29 diff --git a/cast.c b/cast.c new file mode 100644 index 0000000..8fa7fff --- /dev/null +++ b/cast.c @@ -0,0 +1,418 @@ + +/******************************************** +cast.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + + +/* $Log: cast.c,v $ + * Revision 1.6 1996/08/11 22:07:50 mike + * Fix small bozo in rt_error("overflow converting ...") + * + * Revision 1.5 1995/06/18 19:17:45 mike + * Create a type Int which on most machines is an int, but on machines + * with 16bit ints, i.e., the PC is a long. This fixes implicit assumption + * that int==long. 
+ * + * Revision 1.4 1995/06/06 00:02:02 mike + * fix cast in d_to_l() + * + * Revision 1.3 1993/07/17 13:22:45 mike + * indent and general code cleanup + * + * Revision 1.2 1993/07/04 12:51:41 mike + * start on autoconfig changes + * + * Revision 5.5 1993/03/06 18:49:45 mike + * rm rt_overflow from check_strnum + * + * Revision 5.4 1993/02/13 21:57:20 mike + * merge patch3 + * + * Revision 5.3.1.4 1993/01/22 15:05:19 mike + * pow2->mpow2 for linux + * + * Revision 5.3.1.3 1993/01/22 14:18:33 mike + * const for strtod and ansi picky compilers + * + * Revision 5.3.1.2 1993/01/20 12:53:06 mike + * d_to_l() + * + * Revision 5.3.1.1 1993/01/15 03:33:37 mike + * patch3: safer double to int conversion + * + * Revision 5.3 1992/11/28 23:48:42 mike + * For internal conversion numeric->string, when testing + * if integer, use longs instead of ints so 16 and 32 bit + * systems behave the same + * + * Revision 5.2 1992/08/17 14:19:45 brennan + * patch2: After parsing, only bi_sprintf() uses string_buff. 
+ * + * Revision 5.1 1991/12/05 07:55:41 brennan + * 1.1 pre-release + * +*/ + + +/* cast.c */ + +#include "mawk.h" +#include "field.h" +#include "memory.h" +#include "scan.h" +#include "repl.h" + +int mpow2[NUM_CELL_TYPES] = +{1, 2, 4, 8, 16, 32, 64, 128, 256, 512} ; + +void +cast1_to_d(cp) + register CELL *cp ; +{ + switch (cp->type) + { + case C_NOINIT: + cp->dval = 0.0 ; + break ; + + case C_DOUBLE: + return ; + + case C_MBSTRN: + case C_STRING: + { + register STRING *s = (STRING *) cp->ptr ; + +#if FPE_TRAPS_ON /* look for overflow error */ + errno = 0 ; + cp->dval = strtod(s->str, (char **) 0) ; + if (errno && cp->dval != 0.0) /* ignore underflow */ + rt_error("overflow converting %s to double", s->str) ; +#else + cp->dval = strtod(s->str, (char **) 0) ; +#endif + free_STRING(s) ; + } + break ; + + case C_STRNUM: + /* don't need to convert, but do need to free the STRING part */ + free_STRING(string(cp)) ; + break ; + + + default: + bozo("cast on bad type") ; + } + cp->type = C_DOUBLE ; +} + +void +cast2_to_d(cp) + register CELL *cp ; +{ + register STRING *s ; + + switch (cp->type) + { + case C_NOINIT: + cp->dval = 0.0 ; + break ; + + case C_DOUBLE: + goto two ; + case C_STRNUM: + free_STRING(string(cp)) ; + break ; + + case C_MBSTRN: + case C_STRING: + s = (STRING *) cp->ptr ; + +#if FPE_TRAPS_ON /* look for overflow error */ + errno = 0 ; + cp->dval = strtod(s->str, (char **) 0) ; + if (errno && cp->dval != 0.0) /* ignore underflow */ + rt_error("overflow converting %s to double", s->str) ; +#else + cp->dval = strtod(s->str, (char **) 0) ; +#endif + free_STRING(s) ; + break ; + + default: + bozo("cast on bad type") ; + } + cp->type = C_DOUBLE ; + + two:cp++ ; + + switch (cp->type) + { + case C_NOINIT: + cp->dval = 0.0 ; + break ; + + case C_DOUBLE: + return ; + case C_STRNUM: + free_STRING(string(cp)) ; + break ; + + case C_MBSTRN: + case C_STRING: + s = (STRING *) cp->ptr ; + +#if FPE_TRAPS_ON /* look for overflow error */ + errno = 0 ; + cp->dval = 
strtod(s->str, (char **) 0) ; + if (errno && cp->dval != 0.0) /* ignore underflow */ + rt_error("overflow converting %s to double", s->str) ; +#else + cp->dval = strtod(s->str, (char **) 0) ; +#endif + free_STRING(s) ; + break ; + + default: + bozo("cast on bad type") ; + } + cp->type = C_DOUBLE ; +} + +void +cast1_to_s(cp) + register CELL *cp ; +{ + register Int lval ; + char xbuff[260] ; + + switch (cp->type) + { + case C_NOINIT: + null_str.ref_cnt++ ; + cp->ptr = (PTR) & null_str ; + break ; + + case C_DOUBLE: + + lval = d_to_I(cp->dval) ; + if (lval == cp->dval) sprintf(xbuff, INT_FMT, lval) ; + else sprintf(xbuff, string(CONVFMT)->str, cp->dval) ; + + cp->ptr = (PTR) new_STRING(xbuff) ; + break ; + + case C_STRING: + return ; + + case C_MBSTRN: + case C_STRNUM: + break ; + + default: + bozo("bad type on cast") ; + } + cp->type = C_STRING ; +} + +void +cast2_to_s(cp) + register CELL *cp ; +{ + register Int lval ; + char xbuff[260] ; + + switch (cp->type) + { + case C_NOINIT: + null_str.ref_cnt++ ; + cp->ptr = (PTR) & null_str ; + break ; + + case C_DOUBLE: + + lval = d_to_I(cp->dval) ; + if (lval == cp->dval) sprintf(xbuff, INT_FMT, lval) ; + else sprintf(xbuff, string(CONVFMT)->str, cp->dval) ; + + cp->ptr = (PTR) new_STRING(xbuff) ; + break ; + + case C_STRING: + goto two ; + + case C_MBSTRN: + case C_STRNUM: + break ; + + default: + bozo("bad type on cast") ; + } + cp->type = C_STRING ; + +two: + cp++ ; + + switch (cp->type) + { + case C_NOINIT: + null_str.ref_cnt++ ; + cp->ptr = (PTR) & null_str ; + break ; + + case C_DOUBLE: + + lval = d_to_I(cp->dval) ; + if (lval == cp->dval) sprintf(xbuff, INT_FMT, lval) ; + else sprintf(xbuff, string(CONVFMT)->str, cp->dval) ; + + cp->ptr = (PTR) new_STRING(xbuff) ; + break ; + + case C_STRING: + return ; + + case C_MBSTRN: + case C_STRNUM: + break ; + + default: + bozo("bad type on cast") ; + } + cp->type = C_STRING ; +} + +void +cast_to_RE(cp) + register CELL *cp ; +{ + register PTR p ; + + if (cp->type < C_STRING) 
cast1_to_s(cp) ; + + p = re_compile(string(cp)) ; + free_STRING(string(cp)) ; + cp->type = C_RE ; + cp->ptr = p ; + +} + +void +cast_for_split(cp) + register CELL *cp ; +{ + static char meta[] = "^$.*+?|[]()" ; + static char xbuff[] = "\\X" ; + int c ; + unsigned len ; + + if (cp->type < C_STRING) cast1_to_s(cp) ; + + if ((len = string(cp)->len) == 1) + { + if ((c = string(cp)->str[0]) == ' ') + { + free_STRING(string(cp)) ; + cp->type = C_SPACE ; + return ; + } + else if (strchr(meta, c)) + { + xbuff[1] = c ; + free_STRING(string(cp)) ; + cp->ptr = (PTR) new_STRING(xbuff) ; + } + } + else if (len == 0) + { + free_STRING(string(cp)) ; + cp->type = C_SNULL ; + return ; + } + + cast_to_RE(cp) ; +} + +/* input: cp-> a CELL of type C_MBSTRN (maybe strnum) + test it -- casting it to the appropriate type + which is C_STRING or C_STRNUM +*/ + +void +check_strnum(cp) + CELL *cp ; +{ + char *test ; + register unsigned char *s, *q ; + + cp->type = C_STRING ; /* assume not C_STRNUM */ + s = (unsigned char *) string(cp)->str ; + q = s + string(cp)->len ; + while (scan_code[*s] == SC_SPACE) s++ ; + if (s == q) return ; + + while (scan_code[q[-1]] == SC_SPACE) q-- ; + if (scan_code[q[-1]] != SC_DIGIT && + q[-1] != '.') + return ; + + switch (scan_code[*s]) + { + case SC_DIGIT: + case SC_PLUS: + case SC_MINUS: + case SC_DOT: + +#if FPE_TRAPS_ON + errno = 0 ; + cp->dval = strtod((char *) s, &test) ; + /* make overflow pure string */ + if (errno && cp->dval != 0.0) return ; +#else + cp->dval = strtod((char *) s, &test) ; +#endif + + if ((char *) q <= test) cp->type = C_STRNUM ; + /* <= instead of == , for some buggy strtod + e.g. 
Apple Unix */ + } +} + +/* cast a CELL to a replacement cell */ + +void +cast_to_REPL(cp) + register CELL *cp ; +{ + register STRING *sval ; + + if (cp->type < C_STRING) cast1_to_s(cp) ; + sval = (STRING *) cp->ptr ; + + cellcpy(cp, repl_compile(sval)) ; + free_STRING(sval) ; +} + + +/* convert a double to Int (this is not as simple as a + cast because the results are undefined if it won't fit). + Truncate large values to +Max_Int or -Max_Int + Send nans to -Max_Int +*/ + +Int +d_to_I(d) + double d; +{ + if (d >= Max_Int) return Max_Int ; + if (d > -Max_Int) return (Int) d ; + return -Max_Int ; +} diff --git a/cast.o b/cast.o new file mode 100644 index 0000000..f30d8b7 Binary files /dev/null and b/cast.o differ diff --git a/code.c b/code.c new file mode 100644 index 0000000..40fca41 --- /dev/null +++ b/code.c @@ -0,0 +1,256 @@ + +/******************************************** +code.c +copyright 1991-93, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + + +/* $Log: code.c,v $ + * Revision 1.6 1995/06/18 19:42:13 mike + * Remove some redundant declarations and add some prototypes + * + * Revision 1.5 1995/06/09 23:21:36 mike + * make sure there is an execution block in case user defines function, + * but no pattern-action pairs + * + * Revision 1.4 1995/03/08 00:06:22 mike + * add a pointer cast + * + * Revision 1.3 1994/10/08 19:15:29 mike + * remove SM_DOS + * + * Revision 1.2 1993/07/07 00:07:38 mike + * more work on 1.2 + * + * Revision 1.1.1.1 1993/07/03 18:58:10 mike + * move source to cvs + * + * Revision 5.4 1993/01/14 13:11:11 mike + * code2() -> xcode2() + * + * Revision 5.3 1993/01/09 20:15:35 mike + * code_pop checks if the resolve_list needs relocation + * + * Revision 5.2 1993/01/07 02:50:33 mike + * relative vs absolute code + * + * Revision 5.1 1991/12/05 07:55:43 brennan + * 1.1 pre-release + * +*/ + +/* code.c */ + +#include "mawk.h" +#include "code.h" +#include "init.h" +#include "jmp.h" +#include "field.h" + + +static CODEBLOCK *PROTO(new_code, (void)) ; + +CODEBLOCK active_code ; + +CODEBLOCK *main_code_p, *begin_code_p, *end_code_p ; + +INST *begin_start, *main_start, *end_start ; +unsigned begin_size, main_size ; + +INST *execution_start = 0 ; + + +/* grow the active code */ +void +code_grow() +{ + unsigned oldsize = code_limit - code_base ; + unsigned newsize = PAGESZ + oldsize ; + unsigned delta = code_ptr - code_base ; + + if (code_ptr > code_limit) bozo("CODEWARN is too small") ; + + code_base = (INST *) + zrealloc(code_base, INST_BYTES(oldsize), + INST_BYTES(newsize)) ; + code_limit = code_base + newsize ; + code_warn = code_limit - CODEWARN ; + code_ptr = code_base + delta ; +} + +/* shrinks executable code that's done to its final size */ +INST * +code_shrink(p, sizep) + CODEBLOCK *p ; + unsigned *sizep ; +{ + + unsigned oldsize = INST_BYTES(p->limit - p->base) ; + unsigned newsize = INST_BYTES(p->ptr - p->base) ; + INST *retval ; + + 
*sizep = newsize ; + + retval = (INST *) zrealloc(p->base, oldsize, newsize) ; + ZFREE(p) ; + return retval ; +} + + +/* code an op and a pointer in the active_code */ +void +xcode2(op, ptr) + int op ; + PTR ptr ; +{ + register INST *p = code_ptr + 2 ; + + if (p >= code_warn) + { + code_grow() ; + p = code_ptr + 2 ; + } + + p[-2].op = op ; + p[-1].ptr = ptr ; + code_ptr = p ; +} + +/* code two ops in the active_code */ +void +code2op(x, y) + int x, y ; +{ + register INST *p = code_ptr + 2 ; + + if (p >= code_warn) + { + code_grow() ; + p = code_ptr + 2 ; + } + + p[-2].op = x ; + p[-1].op = y ; + code_ptr = p ; +} + +void +code_init() +{ + main_code_p = new_code() ; + + active_code = *main_code_p ; + code1(_OMAIN) ; +} + +/* final code relocation + set_code() as in set concrete */ +void +set_code() +{ + /* set the main code which is active_code */ + if (end_code_p || code_offset > 1) + { + int gl_offset = code_offset ; + extern int NR_flag ; + + if (NR_flag) code2op(OL_GL_NR, _HALT) ; + else code2op(OL_GL, _HALT) ; + + *main_code_p = active_code ; + main_start = code_shrink(main_code_p, &main_size) ; + next_label = main_start + gl_offset ; + execution_start = main_start ; + } + else /* only BEGIN */ + { + zfree(code_base, INST_BYTES(PAGESZ)) ; + ZFREE(main_code_p) ; + } + + /* set the END code */ + if (end_code_p) + { + unsigned dummy ; + + active_code = *end_code_p ; + code2op(_EXIT0, _HALT) ; + *end_code_p = active_code ; + end_start = code_shrink(end_code_p, &dummy) ; + } + + /* set the BEGIN code */ + if (begin_code_p) + { + active_code = *begin_code_p ; + if (main_start) code2op(_JMAIN, _HALT) ; + else code2op(_EXIT0, _HALT) ; + *begin_code_p = active_code ; + begin_start = code_shrink(begin_code_p, &begin_size) ; + + execution_start = begin_start ; + } + + if ( ! 
execution_start ) + { + /* program had functions but no pattern-action bodies */ + execution_start = begin_start = (INST*) zmalloc(2*sizeof(INST)) ; + execution_start[0].op = _EXIT0 ; + execution_start[1].op = _HALT ; + } +} + +void +dump_code() +{ + fdump() ; /* dumps all user functions */ + if (begin_start) + { fprintf(stdout, "BEGIN\n") ; + da(begin_start, stdout) ; } + if (end_start) + { fprintf(stdout, "END\n") ; + da(end_start, stdout) ; } + if (main_start) + { fprintf(stdout, "MAIN\n") ; + da(main_start, stdout) ; } +} + + +static CODEBLOCK * +new_code() +{ + CODEBLOCK *p = ZMALLOC(CODEBLOCK) ; + + p->base = (INST *) zmalloc(INST_BYTES(PAGESZ)) ; + p->limit = p->base + PAGESZ ; + p->warn = p->limit - CODEWARN ; + p->ptr = p->base ; + + return p ; +} + +/* moves the active_code from MAIN to a BEGIN or END */ + +void +be_setup(scope) + int scope ; +{ + *main_code_p = active_code ; + + if (scope == SCOPE_BEGIN) + { + if (!begin_code_p) begin_code_p = new_code() ; + active_code = *begin_code_p ; + } + else + { + if (!end_code_p) end_code_p = new_code() ; + active_code = *end_code_p ; + } +} diff --git a/code.h b/code.h new file mode 100644 index 0000000..474c7e8 --- /dev/null +++ b/code.h @@ -0,0 +1,192 @@ + +/******************************************** +code.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + + +/* $Log: code.h,v $ + * Revision 1.5 1995/06/18 19:42:15 mike + * Remove some redundant declarations and add some prototypes + * + * Revision 1.4 1994/12/13 00:13:01 mike + * delete A statement to delete all of A at once + * + * Revision 1.3 1993/12/01 14:25:06 mike + * reentrant array loops + * + * Revision 1.2 1993/07/22 00:04:01 mike + * new op code _LJZ _LJNZ + * + * Revision 1.1.1.1 1993/07/03 18:58:10 mike + * move source to cvs + * + * Revision 5.3 1993/01/14 13:11:11 mike + * code2() -> xcode2() + * + * Revision 5.2 1993/01/07 02:50:33 mike + * relative vs absolute code + * + * Revision 5.1 1991/12/05 07:59:07 brennan + * 1.1 pre-release + * +*/ + + +/* code.h */ + +#ifndef CODE_H +#define CODE_H + +#include "memory.h" + +#define PAGESZ 512 + /* number of code instructions allocated at one time */ +#define CODEWARN 16 + +/* coding scope */ +#define SCOPE_MAIN 0 +#define SCOPE_BEGIN 1 +#define SCOPE_END 2 +#define SCOPE_FUNCT 3 + + +typedef struct { +INST *base, *limit, *warn, *ptr ; +} CODEBLOCK ; + +extern CODEBLOCK active_code ; +extern CODEBLOCK *main_code_p, *begin_code_p, *end_code_p ; + +extern INST *main_start, *begin_start, *end_start ; +extern unsigned main_size, begin_size ; +extern INST *execution_start ; +extern INST *next_label ; /* next statements jump to here */ +extern int dump_code_flag ; + +#define code_ptr active_code.ptr +#define code_base active_code.base +#define code_warn active_code.warn +#define code_limit active_code.limit +#define code_offset (code_ptr-code_base) + +#define INST_BYTES(x) (sizeof(INST)*(unsigned)(x)) + +extern CELL eval_stack[] ; +extern int exit_code ; + + +#define code1(x) code_ptr++ -> op = (x) +/* shutup picky compilers */ +#define code2(x,p) xcode2(x,(PTR)(p)) + +void PROTO(xcode2, (int, PTR)) ; +void PROTO(code2op, (int, int)) ; +INST *PROTO(code_shrink, (CODEBLOCK*, unsigned*)) ; +void PROTO(code_grow, (void)) ; +void PROTO(set_code, (void)) ; +void 
PROTO(be_setup, (int)) ; +void PROTO(dump_code, (void)) ; + + +/* the machine opcodes */ +/* to avoid confusion with a ptr FE_PUSHA must have op code 0 */ +/* unfortunately enums are less portable than defines */ + +#define FE_PUSHA 0 +#define FE_PUSHI 1 +#define F_PUSHA 2 +#define F_PUSHI 3 +#define NF_PUSHI 4 +#define _HALT 5 +#define _STOP 6 +#define _PUSHC 7 +#define _PUSHD 8 +#define _PUSHS 9 +#define _PUSHINT 10 +#define _PUSHA 11 +#define _PUSHI 12 +#define L_PUSHA 13 +#define L_PUSHI 14 +#define AE_PUSHA 15 +#define AE_PUSHI 16 +#define A_PUSHA 17 +#define LAE_PUSHA 18 +#define LAE_PUSHI 19 +#define LA_PUSHA 20 +#define _POP 21 +#define _ADD 22 +#define _SUB 23 +#define _MUL 24 +#define _DIV 25 +#define _MOD 26 +#define _POW 27 +#define _NOT 28 +#define _TEST 29 +#define A_TEST 30 +#define A_DEL 31 +#define ALOOP 32 +#define A_CAT 33 +#define _UMINUS 34 +#define _UPLUS 35 +#define _ASSIGN 36 +#define _ADD_ASG 37 +#define _SUB_ASG 38 +#define _MUL_ASG 39 +#define _DIV_ASG 40 +#define _MOD_ASG 41 +#define _POW_ASG 42 +#define F_ASSIGN 43 +#define F_ADD_ASG 44 +#define F_SUB_ASG 45 +#define F_MUL_ASG 46 +#define F_DIV_ASG 47 +#define F_MOD_ASG 48 +#define F_POW_ASG 49 +#define _CAT 50 +#define _BUILTIN 51 +#define _PRINT 52 +#define _POST_INC 53 +#define _POST_DEC 54 +#define _PRE_INC 55 +#define _PRE_DEC 56 +#define F_POST_INC 57 +#define F_POST_DEC 58 +#define F_PRE_INC 59 +#define F_PRE_DEC 60 +#define _JMP 61 +#define _JNZ 62 +#define _JZ 63 +#define _LJZ 64 +#define _LJNZ 65 +#define _EQ 66 +#define _NEQ 67 +#define _LT 68 +#define _LTE 69 +#define _GT 70 +#define _GTE 71 +#define _MATCH0 72 +#define _MATCH1 73 +#define _MATCH2 74 +#define _EXIT 75 +#define _EXIT0 76 +#define _NEXT 77 +#define _RANGE 78 +#define _CALL 79 +#define _RET 80 +#define _RET0 81 +#define SET_ALOOP 82 +#define POP_AL 83 +#define OL_GL 84 +#define OL_GL_NR 85 +#define _OMAIN 86 +#define _JMAIN 87 +#define DEL_A 88 + +#endif /* CODE_H */ diff --git a/code.o b/code.o new file mode 
100644 index 0000000..c5fd390 Binary files /dev/null and b/code.o differ diff --git a/config-user/.config.user b/config-user/.config.user new file mode 100644 index 0000000..b34e0b1 --- /dev/null +++ b/config-user/.config.user @@ -0,0 +1,41 @@ +# config.user (user configuration template) + +# User settable configuration parameters +# +# Uncomment and change as needed +# (no space around '=' , this gets sourced) + +# Most people will not need to do anything with this file. + +# If you want or need changes, edit this file. +############ + +# default is to look for gcc and use cc if no gcc +# change if you do not want gcc or want a different compiler from +# gcc or cc +#CC=lcc + +# change if need special C compiler flags +# otherwise default is -O +# CFLAGS='-O4 -special flags' + +# configure will look for libm. Change if you know this will fail +# or want a different math library +#MATHLIB=-lspecialmath +#MATHLIB='' # if you don't need a special lib to get sin,sqrt,etc + +# where to put the binary +BINDIR=/usr/local/bin + +# where to put the man pages and man page extension +MANDIR=/usr/local/man/man1 +MANEXT=1 + +# fix up things the configuration script bungles here +# most likely candidate is fpe tests +# This gets put in config.h via: echo "$USER_DEFINES" +# example: +#USER_DEFINES=' +#define FPE_TRAPS_ON 1 +#define NOINFO_SIGFPE 1' + diff --git a/config-user/apollo b/config-user/apollo new file mode 100644 index 0000000..16e814d --- /dev/null +++ b/config-user/apollo @@ -0,0 +1,24 @@ +# Apollo system 10.4 +# domain C compiler 6.9 +# +# mawk will not work with 6.8 or earlier compiler + + +# $Log: apollo,v $ +# Revision 1.1 1994/12/14 00:11:03 mike +# initial ci +# + +############ + +CC='cc -A nansi' # otherwise matherr() won't load + +MATHLIB=' ' # don't need a mathlib + +# where to put the binary +BINDIR=/usr/local/bin + +# where to put the man pages and man page extension +MANDIR=/usr/local/man/man1 +MANEXT=1 + diff --git a/config-user/convex 
b/config-user/convex new file mode 100644 index 0000000..67962c3 --- /dev/null +++ b/config-user/convex @@ -0,0 +1,21 @@ +# convex +# if you prefer to use cc or no gcc + +# $Log: convex,v $ +# Revision 1.1 1995/06/03 09:27:17 mike +# initial checkin +# + +############ + +CC=cc + +CFLAGS='-O2 -std' + + +# where to put the binary +BINDIR=/usr/local/bin + +# where to put the man pages and man page extension +MANDIR=/usr/local/man/man1 +MANEXT=1 diff --git a/config-user/cray b/config-user/cray new file mode 100644 index 0000000..4e4b8b4 --- /dev/null +++ b/config-user/cray @@ -0,0 +1,34 @@ +# cray +# $Log: cray,v $ +# Revision 1.2 1996/09/04 23:35:33 mike +# fix typo +# +# Revision 1.1 1995/12/17 22:47:43 mike +# initial checkin +# + +############ + +# $uname -a +# sn9069 sn9069 8.0.4 wd4.1 CRAY J90 + +CC=cc + +CFLAGS='-h matherror=errno' +# should -O be added to this ?? + +# where to put the binary +BINDIR=/usr/local/bin + +# where to put the man pages and man page extension +MANDIR=/usr/local/man/man1 +MANEXT=1 + +# this stuff goes in config.h +# autoconfiguration failures are fixed by hand here +USER_DEFINES=' +#define NO_MATHERR 1 +#define FPE_TRAPS_ON 1 +#define NOINFO_SIGFPE 1 +' + diff --git a/config-user/mips b/config-user/mips new file mode 100644 index 0000000..4ee857f --- /dev/null +++ b/config-user/mips @@ -0,0 +1,20 @@ +# mips +# if you prefer to use cc or no gcc + +# $Log: mips,v $ +# Revision 1.1 1995/06/03 09:27:19 mike +# initial checkin +# + +############ + +CC=cc + +CFLAGS='-O -Olimit 700 -systype bsd43' + +# where to put the binary +BINDIR=/usr/local/bin + +# where to put the man pages and man page extension +MANDIR=/usr/local/man/man1 +MANEXT=1 diff --git a/config-user/sgi b/config-user/sgi new file mode 100644 index 0000000..58a342d --- /dev/null +++ b/config-user/sgi @@ -0,0 +1,20 @@ +# sgi +# if you prefer to use cc or no gcc + +# $Log: sgi,v $ +# Revision 1.1 1995/06/03 09:27:20 mike +# initial checkin +# + +############ + +CC=cc + +CFLAGS='-O 
-cckr -w' + +# where to put the binary +BINDIR=/usr/local/bin + +# where to put the man pages and man page extension +MANDIR=/usr/local/man/man1 +MANEXT=1 diff --git a/config-user/ultrix-mips b/config-user/ultrix-mips new file mode 100644 index 0000000..2d321c0 --- /dev/null +++ b/config-user/ultrix-mips @@ -0,0 +1,21 @@ +# ultrix4.2 mips +# prefer to use cc or no gcc + +# $Log: ultrix-mips,v $ +# Revision 1.1 1994/12/14 00:11:06 mike +# initial ci +# + +############ + +CC=cc + +CFLAGS='-O -Olimit 700' + +# where to put the binary +BINDIR=/usr/local/bin + +# where to put the man pages and man page extension +MANDIR=/usr/local/man/man1 +MANEXT=1 + diff --git a/config.cache b/config.cache new file mode 100644 index 0000000..53535f4 --- /dev/null +++ b/config.cache @@ -0,0 +1,34 @@ +# This file is a shell script that caches the results of configure +# tests run on this system so they can be shared between configure +# scripts and configure runs. It is not useful on other systems. +# If it contains results you don't want to keep, you may remove or edit it. +# +# By default, configure uses ./config.cache as the cache file, +# creating it if it does not exist already. You can give configure +# the --cache-file=FILE option to use a different cache file; that is +# what configure does when it calls configure scripts in +# subdirectories, so they share the cache. +# Giving --cache-file=/dev/null disables caching, for debugging configure. +# config.status only pays attention to the cache file if you give it the +# --recheck option to rerun configure. 
+# +ac_cv_c_const=${ac_cv_c_const='yes'} +ac_cv_func_fmod=${ac_cv_func_fmod='yes'} +ac_cv_func_matherr=${ac_cv_func_matherr='yes'} +ac_cv_func_memcpy=${ac_cv_func_memcpy='yes'} +ac_cv_func_strchr=${ac_cv_func_strchr='yes'} +ac_cv_func_strerror=${ac_cv_func_strerror='yes'} +ac_cv_func_strtod=${ac_cv_func_strtod='yes'} +ac_cv_func_vfprintf=${ac_cv_func_vfprintf='yes'} +ac_cv_header_errno_h=${ac_cv_header_errno_h='yes'} +ac_cv_header_fcntl_h=${ac_cv_header_fcntl_h='yes'} +ac_cv_header_limits_h=${ac_cv_header_limits_h='yes'} +ac_cv_header_stdarg_h=${ac_cv_header_stdarg_h='yes'} +ac_cv_header_stddef_h=${ac_cv_header_stddef_h='yes'} +ac_cv_header_time_h=${ac_cv_header_time_h='yes'} +ac_cv_lib_m=${ac_cv_lib_m='yes'} +ac_cv_prog_CC=${ac_cv_prog_CC='gcc'} +ac_cv_prog_CPP=${ac_cv_prog_CPP=''gcc -E''} +ac_cv_prog_YACC=${ac_cv_prog_YACC='bison'} +ac_cv_prog_gcc=${ac_cv_prog_gcc='yes'} +ac_cv_type_signal=${ac_cv_type_signal='void'} diff --git a/config.h b/config.h new file mode 100644 index 0000000..16c4614 --- /dev/null +++ b/config.h @@ -0,0 +1,9 @@ +/* config.h -- generated by configure */ +#ifndef CONFIG_H +#define CONFIG_H + + +#define SIZE_T_STDDEF_H 1 + +#define HAVE_REAL_PIPES 1 +#endif /* CONFIG_H */ diff --git a/config.log b/config.log new file mode 100644 index 0000000..4d1217d --- /dev/null +++ b/config.log @@ -0,0 +1,7 @@ +This file contains any messages produced by compilers while +running configure, to aid debugging if configure makes a mistake. + +configure:1006:6: warning: conflicting types for built-in function 'memcpy' +configure:1055:6: warning: conflicting types for built-in function 'strchr' +configure:1153:6: warning: conflicting types for built-in function 'vfprintf' +configure:1251:6: warning: conflicting types for built-in function 'fmod' diff --git a/config.status b/config.status new file mode 100755 index 0000000..4fb91ef --- /dev/null +++ b/config.status @@ -0,0 +1,107 @@ +#! /bin/sh +# Generated automatically by configure. 
+# Run this file to recreate the current configuration. +# This directory was configured as follows, +# on host jinhyuk-desktop: +# +# ./configure +# +# Compiler output produced by configure, useful for debugging +# configure, is in ./config.log if it exists. + +ac_cs_usage="Usage: ./config.status [--recheck] [--version] [--help]" +for ac_option +do + case "$ac_option" in + -recheck | --recheck | --rechec | --reche | --rech | --rec | --re | --r) + echo "running ${CONFIG_SHELL-/bin/sh} ./configure --no-create --no-recursion" + exec ${CONFIG_SHELL-/bin/sh} ./configure --no-create --no-recursion ;; + -version | --version | --versio | --versi | --vers | --ver | --ve | --v) + echo "./config.status generated by autoconf version 2.4" + exit 0 ;; + -help | --help | --hel | --he | --h) + echo "$ac_cs_usage"; exit 0 ;; + *) echo "$ac_cs_usage"; exit 1 ;; + esac +done + +ac_given_srcdir=. + +trap 'rm -fr Makefile conftest*; exit 1' 1 2 15 + +# Protect against being on the right side of a sed subst in config.status. +sed 's/%@/@@/; s/@%/@@/; s/%g$/@g/; /@g$/s/[\\&%]/\\&/g; + s/@@/%@/; s/@@/@%/; s/@g$/%g/' > conftest.subs <<\CEOF +/^[ ]*VPATH[ ]*=[^:]*$/d + +s%@CFLAGS@%-g -O2%g +s%@CPPFLAGS@%%g +s%@CXXFLAGS@%-g -O2%g +s%@DEFS@% -DRETSIGTYPE=void %g +s%@LDFLAGS@%%g +s%@LIBS@% -lm%g +s%@exec_prefix@%${prefix}%g +s%@prefix@%/usr/local%g +s%@program_transform_name@%s,x,x,%g +s%@BINDIR@%/usr/local/bin%g +s%@MANDIR@%/usr/local/man/man1%g +s%@MANEXT@%1%g +s%@CC@%gcc%g +s%@CPP@%gcc -E%g +s%@MATHLIB@%-lm%g +s%@YACC@%bison -y%g + +CEOF + +CONFIG_FILES=${CONFIG_FILES-"Makefile"} +for ac_file in .. $CONFIG_FILES; do if test "x$ac_file" != x..; then + # Support "outfile[:infile]", defaulting infile="outfile.in". + case "$ac_file" in + *:*) ac_file_in=`echo "$ac_file"|sed 's%.*:%%'` + ac_file=`echo "$ac_file"|sed 's%:.*%%'` ;; + *) ac_file_in="${ac_file}.in" ;; + esac + + # Adjust relative srcdir, etc. for subdirectories. + + # Remove last slash and all that follows it. 
Not all systems have dirname. + ac_dir=`echo $ac_file|sed 's%/[^/][^/]*$%%'` + if test "$ac_dir" != "$ac_file" && test "$ac_dir" != .; then + # The file is in a subdirectory. + test ! -d "$ac_dir" && mkdir "$ac_dir" + ac_dir_suffix="/`echo $ac_dir|sed 's%^\./%%'`" + # A "../" for each directory in $ac_dir_suffix. + ac_dots=`echo $ac_dir_suffix|sed 's%/[^/]*%../%g'` + else + ac_dir_suffix= ac_dots= + fi + + case "$ac_given_srcdir" in + .) srcdir=. + if test -z "$ac_dots"; then top_srcdir=. + else top_srcdir=`echo $ac_dots|sed 's%/$%%'`; fi ;; + /*) srcdir="$ac_given_srcdir$ac_dir_suffix"; top_srcdir="$ac_given_srcdir" ;; + *) # Relative path. + srcdir="$ac_dots$ac_given_srcdir$ac_dir_suffix" + top_srcdir="$ac_dots$ac_given_srcdir" ;; + esac + + echo creating "$ac_file" + rm -f "$ac_file" + configure_input="Generated automatically from `echo $ac_file_in|sed 's%.*/%%'` by configure." + case "$ac_file" in + *Makefile*) ac_comsub="1i\\ +# $configure_input" ;; + *) ac_comsub= ;; + esac + sed -e "$ac_comsub +s%@configure_input@%$configure_input%g +s%@srcdir@%$srcdir%g +s%@top_srcdir@%$top_srcdir%g +" -f conftest.subs $ac_given_srcdir/$ac_file_in > $ac_file +fi; done +rm -f conftest.subs + + + +exit 0 diff --git a/config.user b/config.user new file mode 100644 index 0000000..b34e0b1 --- /dev/null +++ b/config.user @@ -0,0 +1,41 @@ +# config.user (user configuration template) + +# User settable configuration parameters +# +# Uncomment and change as needed +# (no space around '=' , this gets sourced) + +# Most people will not need to do anything with this file. + +# If you want or need changes, edit this file. +############ + +# default is to look for gcc and use cc if no gcc +# change if you do not want gcc or want a different compiler from +# gcc or cc +#CC=lcc + +# change if need special C compiler flags +# otherwise default is -O +# CFLAGS='-O4 -special flags' + +# configure will look for libm. 
Change if you know this will fail +# or want a different math library +#MATHLIB=-lspecialmath +#MATHLIB='' # if you don't need a special lib to get sin,sqrt,etc + +# where to put the binary +BINDIR=/usr/local/bin + +# where to put the man pages and man page extension +MANDIR=/usr/local/man/man1 +MANEXT=1 + +# fix up things the configuration script bungles here +# most likely candidate is fpe tests +# This gets put in config.h via: echo "$USER_DEFINES" +# example: +#USER_DEFINES=' +#define FPE_TRAPS_ON 1 +#define NOINFO_SIGFPE 1' + diff --git a/configure b/configure new file mode 100755 index 0000000..ecbb759 --- /dev/null +++ b/configure @@ -0,0 +1,2336 @@ +#! /bin/sh + +# Guess values for system-dependent variables and create Makefiles. +# Generated automatically using autoconf version 2.4 +# Copyright (C) 1992, 1993, 1994 Free Software Foundation, Inc. +# +# This configure script is free software; the Free Software Foundation +# gives unlimited permission to copy, distribute and modify it. + +# Defaults: +ac_help= +ac_default_prefix=/usr/local +# Any additions from configure.in: + +# Initialize some variables set by options. +# The variables have the same names as the options, with +# dashes changed to underlines. +build=NONE +cache_file=./config.cache +exec_prefix=NONE +host=NONE +no_create= +nonopt=NONE +no_recursion= +prefix=NONE +program_prefix=NONE +program_suffix=NONE +program_transform_name=s,x,x, +silent= +site= +srcdir= +target=NONE +verbose= +x_includes=NONE +x_libraries=NONE + +# Initialize some other variables. +subdirs= + +ac_prev= +for ac_option +do + + # If the previous option needs an argument, assign it. + if test -n "$ac_prev"; then + eval "$ac_prev=\$ac_option" + ac_prev= + continue + fi + + case "$ac_option" in + -*=*) ac_optarg=`echo "$ac_option" | sed 's/[-_a-zA-Z0-9]*=//'` ;; + *) ac_optarg= ;; + esac + + # Accept the important Cygnus configure options, so we can diagnose typos. 
+ + case "$ac_option" in + + -build | --build | --buil | --bui | --bu | --b) + ac_prev=build ;; + -build=* | --build=* | --buil=* | --bui=* | --bu=* | --b=*) + build="$ac_optarg" ;; + + -cache-file | --cache-file | --cache-fil | --cache-fi \ + | --cache-f | --cache- | --cache | --cach | --cac | --ca | --c) + ac_prev=cache_file ;; + -cache-file=* | --cache-file=* | --cache-fil=* | --cache-fi=* \ + | --cache-f=* | --cache-=* | --cache=* | --cach=* | --cac=* | --ca=* | --c=*) + cache_file="$ac_optarg" ;; + + -disable-* | --disable-*) + ac_feature=`echo $ac_option|sed -e 's/-*disable-//'` + # Reject names that are not valid shell variable names. + if test -n "`echo $ac_feature| sed 's/[-a-zA-Z0-9_]//g'`"; then + { echo "configure: error: $ac_feature: invalid feature name" 1>&2; exit 1; } + fi + ac_feature=`echo $ac_feature| sed 's/-/_/g'` + eval "enable_${ac_feature}=no" ;; + + -enable-* | --enable-*) + ac_feature=`echo $ac_option|sed -e 's/-*enable-//' -e 's/=.*//'` + # Reject names that are not valid shell variable names. + if test -n "`echo $ac_feature| sed 's/[-_a-zA-Z0-9]//g'`"; then + { echo "configure: error: $ac_feature: invalid feature name" 1>&2; exit 1; } + fi + ac_feature=`echo $ac_feature| sed 's/-/_/g'` + case "$ac_option" in + *=*) ;; + *) ac_optarg=yes ;; + esac + eval "enable_${ac_feature}='$ac_optarg'" ;; + + -exec-prefix | --exec_prefix | --exec-prefix | --exec-prefi \ + | --exec-pref | --exec-pre | --exec-pr | --exec-p | --exec- \ + | --exec | --exe | --ex) + ac_prev=exec_prefix ;; + -exec-prefix=* | --exec_prefix=* | --exec-prefix=* | --exec-prefi=* \ + | --exec-pref=* | --exec-pre=* | --exec-pr=* | --exec-p=* | --exec-=* \ + | --exec=* | --exe=* | --ex=*) + exec_prefix="$ac_optarg" ;; + + -gas | --gas | --ga | --g) + # Obsolete; use --with-gas. + with_gas=yes ;; + + -help | --help | --hel | --he) + # Omit some internal or obsolete options to make the list less imposing. + # This message is too long to be a string in the A/UX 3.1 sh. 
+ cat << EOF +Usage: configure [options] [host] +Options: [defaults in brackets after descriptions] +Configuration: + --cache-file=FILE cache test results in FILE + --help print this message + --no-create do not create output files + --quiet, --silent do not print \`checking...' messages + --version print the version of autoconf that created configure +Directory and file names: + --prefix=PREFIX install architecture-independent files in PREFIX + [$ac_default_prefix] + --exec-prefix=PREFIX install architecture-dependent files in PREFIX + [same as prefix] + --srcdir=DIR find the sources in DIR [configure dir or ..] + --program-prefix=PREFIX prepend PREFIX to installed program names + --program-suffix=SUFFIX append SUFFIX to installed program names + --program-transform-name=PROGRAM run sed PROGRAM on installed program names +Host type: + --build=BUILD configure for building on BUILD [BUILD=HOST] + --host=HOST configure for HOST [guessed] + --target=TARGET configure for TARGET [TARGET=HOST] +Features and packages: + --disable-FEATURE do not include FEATURE (same as --enable-FEATURE=no) + --enable-FEATURE[=ARG] include FEATURE [ARG=yes] + --with-PACKAGE[=ARG] use PACKAGE [ARG=yes] + --without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no) + --x-includes=DIR X include files are in DIR + --x-libraries=DIR X library files are in DIR +--enable and --with options recognized:$ac_help +EOF + exit 0 ;; + + -host | --host | --hos | --ho) + ac_prev=host ;; + -host=* | --host=* | --hos=* | --ho=*) + host="$ac_optarg" ;; + + -nfp | --nfp | --nf) + # Obsolete; use --without-fp. 
+ with_fp=no ;; + + -no-create | --no-create | --no-creat | --no-crea | --no-cre \ + | --no-cr | --no-c) + no_create=yes ;; + + -no-recursion | --no-recursion | --no-recursio | --no-recursi \ + | --no-recurs | --no-recur | --no-recu | --no-rec | --no-re | --no-r) + no_recursion=yes ;; + + -prefix | --prefix | --prefi | --pref | --pre | --pr | --p) + ac_prev=prefix ;; + -prefix=* | --prefix=* | --prefi=* | --pref=* | --pre=* | --pr=* | --p=*) + prefix="$ac_optarg" ;; + + -program-prefix | --program-prefix | --program-prefi | --program-pref \ + | --program-pre | --program-pr | --program-p) + ac_prev=program_prefix ;; + -program-prefix=* | --program-prefix=* | --program-prefi=* \ + | --program-pref=* | --program-pre=* | --program-pr=* | --program-p=*) + program_prefix="$ac_optarg" ;; + + -program-suffix | --program-suffix | --program-suffi | --program-suff \ + | --program-suf | --program-su | --program-s) + ac_prev=program_suffix ;; + -program-suffix=* | --program-suffix=* | --program-suffi=* \ + | --program-suff=* | --program-suf=* | --program-su=* | --program-s=*) + program_suffix="$ac_optarg" ;; + + -program-transform-name | --program-transform-name \ + | --program-transform-nam | --program-transform-na \ + | --program-transform-n | --program-transform- \ + | --program-transform | --program-transfor \ + | --program-transfo | --program-transf \ + | --program-trans | --program-tran \ + | --progr-tra | --program-tr | --program-t) + ac_prev=program_transform_name ;; + -program-transform-name=* | --program-transform-name=* \ + | --program-transform-nam=* | --program-transform-na=* \ + | --program-transform-n=* | --program-transform-=* \ + | --program-transform=* | --program-transfor=* \ + | --program-transfo=* | --program-transf=* \ + | --program-trans=* | --program-tran=* \ + | --progr-tra=* | --program-tr=* | --program-t=*) + program_transform_name="$ac_optarg" ;; + + -q | -quiet | --quiet | --quie | --qui | --qu | --q \ + | -silent | --silent | --silen | --sile | 
--sil) + silent=yes ;; + + -site | --site | --sit) + ac_prev=site ;; + -site=* | --site=* | --sit=*) + site="$ac_optarg" ;; + + -srcdir | --srcdir | --srcdi | --srcd | --src | --sr) + ac_prev=srcdir ;; + -srcdir=* | --srcdir=* | --srcdi=* | --srcd=* | --src=* | --sr=*) + srcdir="$ac_optarg" ;; + + -target | --target | --targe | --targ | --tar | --ta | --t) + ac_prev=target ;; + -target=* | --target=* | --targe=* | --targ=* | --tar=* | --ta=* | --t=*) + target="$ac_optarg" ;; + + -v | -verbose | --verbose | --verbos | --verbo | --verb) + verbose=yes ;; + + -version | --version | --versio | --versi | --vers) + echo "configure generated by autoconf version 2.4" + exit 0 ;; + + -with-* | --with-*) + ac_package=`echo $ac_option|sed -e 's/-*with-//' -e 's/=.*//'` + # Reject names that are not valid shell variable names. + if test -n "`echo $ac_package| sed 's/[-_a-zA-Z0-9]//g'`"; then + { echo "configure: error: $ac_package: invalid package name" 1>&2; exit 1; } + fi + ac_package=`echo $ac_package| sed 's/-/_/g'` + case "$ac_option" in + *=*) ;; + *) ac_optarg=yes ;; + esac + eval "with_${ac_package}='$ac_optarg'" ;; + + -without-* | --without-*) + ac_package=`echo $ac_option|sed -e 's/-*without-//'` + # Reject names that are not valid shell variable names. + if test -n "`echo $ac_package| sed 's/[-a-zA-Z0-9_]//g'`"; then + { echo "configure: error: $ac_package: invalid package name" 1>&2; exit 1; } + fi + ac_package=`echo $ac_package| sed 's/-/_/g'` + eval "with_${ac_package}=no" ;; + + --x) + # Obsolete; use --with-x. 
+ with_x=yes ;; + + -x-includes | --x-includes | --x-include | --x-includ | --x-inclu \ + | --x-incl | --x-inc | --x-in | --x-i) + ac_prev=x_includes ;; + -x-includes=* | --x-includes=* | --x-include=* | --x-includ=* | --x-inclu=* \ + | --x-incl=* | --x-inc=* | --x-in=* | --x-i=*) + x_includes="$ac_optarg" ;; + + -x-libraries | --x-libraries | --x-librarie | --x-librari \ + | --x-librar | --x-libra | --x-libr | --x-lib | --x-li | --x-l) + ac_prev=x_libraries ;; + -x-libraries=* | --x-libraries=* | --x-librarie=* | --x-librari=* \ + | --x-librar=* | --x-libra=* | --x-libr=* | --x-lib=* | --x-li=* | --x-l=*) + x_libraries="$ac_optarg" ;; + + -*) { echo "configure: error: $ac_option: invalid option; use --help to show usage" 1>&2; exit 1; } + ;; + + *) + if test -n "`echo $ac_option| sed 's/[-a-z0-9.]//g'`"; then + echo "configure: warning: $ac_option: invalid host type" 1>&2 + fi + if test "x$nonopt" != xNONE; then + { echo "configure: error: can only configure for one host and one target at a time" 1>&2; exit 1; } + fi + nonopt="$ac_option" + ;; + + esac +done + +if test -n "$ac_prev"; then + { echo "configure: error: missing argument to --`echo $ac_prev | sed 's/_/-/g'`" 1>&2; exit 1; } +fi + +trap 'rm -fr conftest* confdefs* core core.* *.core $ac_clean_files; exit 1' 1 2 15 + +# File descriptor usage: +# 0 standard input +# 1 file creation +# 2 errors and warnings +# 3 some systems may open it to /dev/tty +# 4 used on the Kubota Titan +# 6 checking for... messages and results +# 5 compiler messages saved in config.log +if test "$silent" = yes; then + exec 6>/dev/null +else + exec 6>&1 +fi +exec 5>./config.log + +echo "\ +This file contains any messages produced by compilers while +running configure, to aid debugging if configure makes a mistake. +" 1>&5 + +# Strip out --no-create and --no-recursion so they do not pile up. +# Also quote any args containing shell metacharacters. 
+ac_configure_args= +for ac_arg +do + case "$ac_arg" in + -no-create | --no-create | --no-creat | --no-crea | --no-cre \ + | --no-cr | --no-c) ;; + -no-recursion | --no-recursion | --no-recursio | --no-recursi \ + | --no-recurs | --no-recur | --no-recu | --no-rec | --no-re | --no-r) ;; + *" "*|*" "*|*[\[\]\~\#\$\^\&\*\(\)\{\}\\\|\;\<\>\?]*) + ac_configure_args="$ac_configure_args '$ac_arg'" ;; + *) ac_configure_args="$ac_configure_args $ac_arg" ;; + esac +done + +# NLS nuisances. +# Only set LANG and LC_ALL to C if already set. +# These must not be set unconditionally because not all systems understand +# e.g. LANG=C (notably SCO). +if test "${LC_ALL+set}" = set; then LC_ALL=C; export LC_ALL; fi +if test "${LANG+set}" = set; then LANG=C; export LANG; fi + +# confdefs.h avoids OS command line length limits that DEFS can exceed. +rm -rf conftest* confdefs.h +# AIX cpp loses on an empty file, so make sure it contains at least a newline. +echo > confdefs.h + +# A filename unique to this package, relative to the directory that +# configure is in, which we can look for to find out if srcdir is correct. +ac_unique_file=mawk.h + +# Find the source files, if location was not specified. +if test -z "$srcdir"; then + ac_srcdir_defaulted=yes + # Try the directory containing this script, then its parent. + ac_prog=$0 + ac_confdir=`echo $ac_prog|sed 's%/[^/][^/]*$%%'` + test "x$ac_confdir" = "x$ac_prog" && ac_confdir=. + srcdir=$ac_confdir + if test ! -r $srcdir/$ac_unique_file; then + srcdir=.. + fi +else + ac_srcdir_defaulted=no +fi +if test ! -r $srcdir/$ac_unique_file; then + if test "$ac_srcdir_defaulted" = yes; then + { echo "configure: error: can not find sources in $ac_confdir or .." 1>&2; exit 1; } + else + { echo "configure: error: can not find sources in $srcdir" 1>&2; exit 1; } + fi +fi +srcdir=`echo "${srcdir}" | sed 's%\([^/]\)/*$%\1%'` + +# Prefer explicitly selected file to automatically selected ones. 
+if test -z "$CONFIG_SITE"; then + if test "x$prefix" != xNONE; then + CONFIG_SITE="$prefix/share/config.site $prefix/etc/config.site" + else + CONFIG_SITE="$ac_default_prefix/share/config.site $ac_default_prefix/etc/config.site" + fi +fi +for ac_site_file in $CONFIG_SITE; do + if test -r "$ac_site_file"; then + echo "loading site script $ac_site_file" + . "$ac_site_file" + fi +done + +if test -r "$cache_file"; then + echo "loading cache $cache_file" + . $cache_file +else + echo "creating cache $cache_file" + > $cache_file +fi + +ac_ext=c +# CFLAGS is not in ac_cpp because -g, -O, etc. are not valid cpp options. +ac_cpp='$CPP $CPPFLAGS' +ac_compile='${CC-cc} -c $CFLAGS $CPPFLAGS conftest.$ac_ext 1>&5 2>&5' +ac_link='${CC-cc} -o conftest $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS 1>&5 2>&5' + +if (echo "testing\c"; echo 1,2,3) | grep c >/dev/null; then + # Stardent Vistra SVR4 grep lacks -e, says ghazi@caip.rutgers.edu. + if (echo -n testing; echo 1,2,3) | sed s/-n/xn/ | grep xn >/dev/null; then + ac_n= ac_c=' +' ac_t=' ' + else + ac_n=-n ac_c= ac_t= + fi +else + ac_n= ac_c='\c' ac_t= +fi + + + + + + + + +cat < /dev/null > defines.out +test -f config.user && . ./config.user +test "${BINDIR+set}" = set || BINDIR="/usr/local/bin" + +test "${MANDIR+set}" = set || MANDIR="/usr/local/man/man1" + +test "${MANEXT+set}" = set || MANEXT="1" + +echo "$USER_DEFINES" >> defines.out +# Extract the first word of "gcc", so it can be a program name with args. +set dummy gcc; ac_word=$2 +echo $ac_n "checking for $ac_word""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_prog_CC'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + if test -n "$CC"; then + ac_cv_prog_CC="$CC" # Let the user override the test. +else + IFS="${IFS= }"; ac_save_ifs="$IFS"; IFS="${IFS}:" + for ac_dir in $PATH; do + test -z "$ac_dir" && ac_dir=. 
+ if test -f $ac_dir/$ac_word; then + ac_cv_prog_CC="gcc" + break + fi + done + IFS="$ac_save_ifs" + test -z "$ac_cv_prog_CC" && ac_cv_prog_CC="cc" +fi +fi +CC="$ac_cv_prog_CC" +if test -n "$CC"; then + echo "$ac_t""$CC" 1>&6 +else + echo "$ac_t""no" 1>&6 +fi + +echo $ac_n "checking whether we are using GNU C""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_prog_gcc'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.c <&5 | egrep yes >/dev/null 2>&1; then + ac_cv_prog_gcc=yes +else + ac_cv_prog_gcc=no +fi +fi +echo "$ac_t""$ac_cv_prog_gcc" 1>&6 +rm -f conftest* + +echo $ac_n "checking how to run the C preprocessor""... $ac_c" 1>&6 +# On Suns, sometimes $CPP names a directory. +if test -n "$CPP" && test -d "$CPP"; then + CPP= +fi +if test -z "$CPP"; then +if eval "test \"`echo '$''{'ac_cv_prog_CPP'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + # This must be in double quotes, not single quotes, because CPP may get + # substituted into the Makefile and "${CC-cc}" will confuse make. + CPP="${CC-cc} -E" + # On the NeXT, cc -E runs the code through the compiler's parser, + # not just through cpp. + cat > conftest.$ac_ext < +Syntax Error +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + : +else + echo "$ac_err" >&5 + rm -rf conftest* + CPP="${CC-cc} -E -traditional-cpp" + cat > conftest.$ac_ext < +Syntax Error +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + : +else + echo "$ac_err" >&5 + rm -rf conftest* + CPP=/lib/cpp +fi +rm -f conftest* +fi +rm -f conftest* + ac_cv_prog_CPP="$CPP" +fi + CPP="$ac_cv_prog_CPP" +else + ac_cv_prog_CPP="$CPP" +fi +echo "$ac_t""$CPP" 1>&6 + +test "${CFLAGS+set}" = set || CFLAGS="-O" + + +if test "${MATHLIB+set}" != set ; then +echo $ac_n "checking for -lm""... 
$ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_lib_m'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + ac_save_LIBS="$LIBS" +LIBS="-lm $LIBS" +cat > conftest.$ac_ext <&6 + MATHLIB=-lm ; LIBS="$LIBS -lm" +else + echo "$ac_t""no" 1>&6 +# maybe don't need separate math library +echo $ac_n "checking for log""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_log'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char log(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. */ +#if defined (__stub_log) || defined (__stub___log) +choke me +#else +log(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_log=yes" +else + rm -rf conftest* + eval "ac_cv_func_log=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'log`\" = yes"; then + echo "$ac_t""yes" 1>&6 + log=yes +else + echo "$ac_t""no" 1>&6 +fi + +if test "$log$" = yes +then + MATHLIB='' # evidently don't need one +else + { echo "configure: error: Cannot find a math library. You need to set MATHLIB in config.user" 1>&2; exit 1; } +fi +fi +fi + +for ac_prog in byacc bison yacc +do +# Extract the first word of "$ac_prog", so it can be a program name with args. +set dummy $ac_prog; ac_word=$2 +echo $ac_n "checking for $ac_word""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_prog_YACC'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + if test -n "$YACC"; then + ac_cv_prog_YACC="$YACC" # Let the user override the test. +else + IFS="${IFS= }"; ac_save_ifs="$IFS"; IFS="${IFS}:" + for ac_dir in $PATH; do + test -z "$ac_dir" && ac_dir=. 
+ if test -f $ac_dir/$ac_word; then + ac_cv_prog_YACC="$ac_prog" + break + fi + done + IFS="$ac_save_ifs" +fi +fi +YACC="$ac_cv_prog_YACC" +if test -n "$YACC"; then + echo "$ac_t""$YACC" 1>&6 +else + echo "$ac_t""no" 1>&6 +fi + +test -n "$YACC" && break +done + +test "$YACC" = bison && YACC='bison -y' +echo $ac_n "checking compiler supports void*""... $ac_c" 1>&6 +cat > conftest.$ac_ext <&6 +test "$void_star" = no && echo X 'NO_VOID_STAR' '1' >> defines.out +echo $ac_n "checking compiler groks prototypes""... $ac_c" 1>&6 +cat > conftest.$ac_ext <&6 +test "$protos" = no && echo X 'NO_PROTOS' '1' >> defines.out +echo $ac_n "checking for working const""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_c_const'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext <j = 5; +} +{ /* ULTRIX-32 V3.1 (Rev 9) vcc rejects this */ + const int foo = 10; +} + +; return 0; } +EOF +if eval $ac_compile; then + rm -rf conftest* + ac_cv_c_const=yes +else + rm -rf conftest* + ac_cv_c_const=no +fi +rm -f conftest* + +fi +echo "$ac_t""$ac_cv_c_const" 1>&6 +if test $ac_cv_c_const = no; then + cat >> confdefs.h <<\EOF +#define const +EOF + +fi + +test "$ac_cv_c_const" = no && echo X 'const' '' >> defines.out + + if test "$size_t_defed" != 1 ; then + ac_safe=`echo "stddef.h" | tr './\055' '___'` +echo $ac_n "checking for stddef.h""... 
$ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + size_t_header=ok +else + echo "$ac_t""no" 1>&6 +fi + + if test "$size_t_header" = ok ; then + cat > conftest.$ac_ext < +int main() { return 0; } +int t() { +size_t *n ; + +; return 0; } +EOF +if eval $ac_compile; then + rm -rf conftest* + size_t_defed=1; +echo X 'SIZE_T_STDDEF_H' '1' >> defines.out +echo getting size_t from '' +fi +rm -f conftest* + +fi;fi + + if test "$size_t_defed" != 1 ; then + ac_safe=`echo "sys/types.h" | tr './\055' '___'` +echo $ac_n "checking for sys/types.h""... 
$ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + size_t_header=ok +else + echo "$ac_t""no" 1>&6 +fi + + if test "$size_t_header" = ok ; then + cat > conftest.$ac_ext < +int main() { return 0; } +int t() { +size_t *n ; + +; return 0; } +EOF +if eval $ac_compile; then + rm -rf conftest* + size_t_defed=1; +echo X 'SIZE_T_TYPES_H' '1' >> defines.out +echo getting size_t from '' +fi +rm -f conftest* + +fi;fi +ac_safe=`echo "fcntl.h" | tr './\055' '___'` +echo $ac_n "checking for fcntl.h""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + : +else + echo "$ac_t""no" 1>&6 +cat >> confdefs.h <<\EOF +#define NO_FCNTL_H 1 +EOF + +echo X 'NO_FCNTL_H' '1' >> defines.out +fi + +ac_safe=`echo "errno.h" | tr './\055' '___'` +echo $ac_n "checking for errno.h""... 
$ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + : +else + echo "$ac_t""no" 1>&6 +cat >> confdefs.h <<\EOF +#define NO_ERRNO_H 1 +EOF + +echo X 'NO_ERRNO_H' '1' >> defines.out +fi + +ac_safe=`echo "time.h" | tr './\055' '___'` +echo $ac_n "checking for time.h""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + : +else + echo "$ac_t""no" 1>&6 +cat >> confdefs.h <<\EOF +#define NO_TIME_H 1 +EOF + +echo X 'NO_TIME_H' '1' >> defines.out +fi + +ac_safe=`echo "stdarg.h" | tr './\055' '___'` +echo $ac_n "checking for stdarg.h""... 
$ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + : +else + echo "$ac_t""no" 1>&6 +cat >> confdefs.h <<\EOF +#define NO_STDARG_H 1 +EOF + +echo X 'NO_STDARG_H' '1' >> defines.out +fi + +echo $ac_n "checking for memcpy""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_memcpy'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char memcpy(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. */ +#if defined (__stub_memcpy) || defined (__stub___memcpy) +choke me +#else +memcpy(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_memcpy=yes" +else + rm -rf conftest* + eval "ac_cv_func_memcpy=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'memcpy`\" = yes"; then + echo "$ac_t""yes" 1>&6 + : +else + echo "$ac_t""no" 1>&6 +cat >> confdefs.h <<\EOF +#define NO_MEMCPY 1 +EOF + +echo X 'NO_MEMCPY' '1' >> defines.out +fi + +echo $ac_n "checking for strchr""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_strchr'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. 
*/ +char strchr(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. */ +#if defined (__stub_strchr) || defined (__stub___strchr) +choke me +#else +strchr(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_strchr=yes" +else + rm -rf conftest* + eval "ac_cv_func_strchr=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'strchr`\" = yes"; then + echo "$ac_t""yes" 1>&6 + : +else + echo "$ac_t""no" 1>&6 +cat >> confdefs.h <<\EOF +#define NO_STRCHR 1 +EOF + +echo X 'NO_STRCHR' '1' >> defines.out +fi + +echo $ac_n "checking for strerror""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_strerror'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char strerror(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. */ +#if defined (__stub_strerror) || defined (__stub___strerror) +choke me +#else +strerror(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_strerror=yes" +else + rm -rf conftest* + eval "ac_cv_func_strerror=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'strerror`\" = yes"; then + echo "$ac_t""yes" 1>&6 + : +else + echo "$ac_t""no" 1>&6 +cat >> confdefs.h <<\EOF +#define NO_STRERROR 1 +EOF + +echo X 'NO_STRERROR' '1' >> defines.out +fi + +echo $ac_n "checking for vfprintf""... 
$ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_vfprintf'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char vfprintf(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. */ +#if defined (__stub_vfprintf) || defined (__stub___vfprintf) +choke me +#else +vfprintf(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_vfprintf=yes" +else + rm -rf conftest* + eval "ac_cv_func_vfprintf=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'vfprintf`\" = yes"; then + echo "$ac_t""yes" 1>&6 + : +else + echo "$ac_t""no" 1>&6 +cat >> confdefs.h <<\EOF +#define NO_VFPRINTF 1 +EOF + +echo X 'NO_VFPRINTF' '1' >> defines.out +fi + +echo $ac_n "checking for strtod""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_strtod'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char strtod(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. 
*/ +#if defined (__stub_strtod) || defined (__stub___strtod) +choke me +#else +strtod(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_strtod=yes" +else + rm -rf conftest* + eval "ac_cv_func_strtod=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'strtod`\" = yes"; then + echo "$ac_t""yes" 1>&6 + : +else + echo "$ac_t""no" 1>&6 +cat >> confdefs.h <<\EOF +#define NO_STRTOD 1 +EOF + +echo X 'NO_STRTOD' '1' >> defines.out +fi + +echo $ac_n "checking for fmod""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_fmod'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char fmod(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. */ +#if defined (__stub_fmod) || defined (__stub___fmod) +choke me +#else +fmod(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_fmod=yes" +else + rm -rf conftest* + eval "ac_cv_func_fmod=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'fmod`\" = yes"; then + echo "$ac_t""yes" 1>&6 + : +else + echo "$ac_t""no" 1>&6 +cat >> confdefs.h <<\EOF +#define NO_FMOD 1 +EOF + +echo X 'NO_FMOD' '1' >> defines.out +fi + +echo $ac_n "checking for matherr""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_matherr'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char matherr(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. 
*/ +#if defined (__stub_matherr) || defined (__stub___matherr) +choke me +#else +matherr(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_matherr=yes" +else + rm -rf conftest* + eval "ac_cv_func_matherr=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'matherr`\" = yes"; then + echo "$ac_t""yes" 1>&6 + : +else + echo "$ac_t""no" 1>&6 +cat >> confdefs.h <<\EOF +#define NO_MATHERR 1 +EOF + +echo X 'NO_MATHERR' '1' >> defines.out +fi + +cat > conftest.$ac_ext < +EOF +if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | + egrep "[^v]fprintf" >/dev/null 2>&1; then + : +else + rm -rf conftest* + cat >> confdefs.h <<\EOF +#define NO_FPRINTF_IN_STDIO 1 +EOF + +echo X 'NO_FPRINTF_IN_STDIO' '1' >> defines.out +fi +rm -f conftest* + +cat > conftest.$ac_ext < +EOF +if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | + egrep "[^v]sprintf" >/dev/null 2>&1; then + : +else + rm -rf conftest* + cat >> confdefs.h <<\EOF +#define NO_SPRINTF_IN_STDIO 1 +EOF + +echo X 'NO_SPRINTF_IN_STDIO' '1' >> defines.out +fi +rm -f conftest* + +ac_safe=`echo "limits.h" | tr './\055' '___'` +echo $ac_n "checking for limits.h""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + limits_h=yes +else + echo "$ac_t""no" 1>&6 +fi + +if test "$limits_h" = yes ; then : +else +ac_safe=`echo "values.h" | tr './\055' '___'` +echo $ac_n "checking for values.h""... 
$ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + values_h=yes +else + echo "$ac_t""no" 1>&6 +fi + + if test "$values_h" = yes ; then + # If we cannot run a trivial program, we must be cross compiling. +echo $ac_n "checking whether cross-compiling""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_c_cross'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + if test "$cross_compiling" = yes; then + ac_cv_c_cross=yes +else +cat > conftest.$ac_ext </dev/null; then + ac_cv_c_cross=no +else + ac_cv_c_cross=yes +fi +fi +rm -fr conftest* +fi +cross_compiling=$ac_cv_c_cross +echo "$ac_t""$ac_cv_c_cross" 1>&6 + +if test "$cross_compiling" = yes; then + { echo "configure: error: can not run test program while cross compiling" 1>&2; exit 1; } +else +cat > conftest.$ac_ext < +#include +int main() +{ FILE *out = fopen("maxint.out", "w") ; + if ( ! out ) exit(1) ; + fprintf(out, "X MAX__INT 0x%x\n", MAXINT) ; + fprintf(out, "X MAX__LONG 0x%lx\n", MAXLONG) ; + exit(0) ; return(0) ; +} + +EOF +eval $ac_link +if test -s conftest && (./conftest; exit) 2>/dev/null; then + maxint_set=1 +else + { echo "configure: error: C program to compute maxint and maxlong failed. +Please send bug report to brennan@whidbey.com." 
1>&2; exit 1; } +fi +fi +rm -fr conftest* + fi +if test "$maxint_set" != 1 ; then +# compute it -- assumes two's complement +if test "$cross_compiling" = yes; then + { echo "configure: error: can not run test program while cross compiling" 1>&2; exit 1; } +else +cat > conftest.$ac_ext < +int main() +{ int y ; long yy ; + FILE *out ; + + if ( !(out = fopen("maxint.out","w")) ) exit(1) ; + /* find max int and max long */ + y = 0x1000 ; + while ( y > 0 ) y *= 2 ; + fprintf(out,"X MAX__INT 0x%x\n", y-1) ; + yy = 0x1000 ; + while ( yy > 0 ) yy *= 2 ; + fprintf(out,"X MAX__LONG 0x%lx\n", yy-1) ; + exit(0) ; + return 0 ; + } +EOF +eval $ac_link +if test -s conftest && (./conftest; exit) 2>/dev/null; then + : +else + { echo "configure: error: C program to compute maxint and maxlong failed. +Please send bug report to brennan@whidbey.com." 1>&2; exit 1; } +fi +fi +rm -fr conftest* +fi +cat maxint.out >> defines.out ; rm -f maxint.out +fi ; +if echo "$USER_DEFINES" | grep FPE_TRAPS_ON >/dev/null +then echo skipping fpe tests based on '$'USER_DEFINES +else +echo $ac_n "checking return type of signal handlers""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_type_signal'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +#include +#ifdef signal +#undef signal +#endif +#ifdef __cplusplus +extern "C" +#endif +void (*signal ()) (); +int main() { return 0; } +int t() { +int i; +; return 0; } +EOF +if eval $ac_compile; then + rm -rf conftest* + ac_cv_type_signal=void +else + rm -rf conftest* + ac_cv_type_signal=int +fi +rm -f conftest* + +fi +echo "$ac_t""$ac_cv_type_signal" 1>&6 +cat >> confdefs.h </dev/null + status=$? +else + echo fpe_check.c failed to compile 1>&2 + status=100 +fi + +case $status in + 0) ;; # good news do nothing + 3) # reasonably good news +cat >> confdefs.h <<\EOF +#define FPE_TRAPS_ON 1 +EOF + +echo X 'FPE_TRAPS_ON' '1' >> defines.out +echo $ac_n "checking for sigaction""... 
$ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_sigaction'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char sigaction(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. */ +#if defined (__stub_sigaction) || defined (__stub___sigaction) +choke me +#else +sigaction(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_sigaction=yes" +else + rm -rf conftest* + eval "ac_cv_func_sigaction=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'sigaction`\" = yes"; then + echo "$ac_t""yes" 1>&6 + sigaction=1 +else + echo "$ac_t""no" 1>&6 +fi + +ac_safe=`echo "siginfo.h" | tr './\055' '___'` +echo $ac_n "checking for siginfo.h""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + siginfo_h=1 +else + echo "$ac_t""no" 1>&6 +fi + +if test "$sigaction" = 1 && test "$siginfo_h" = 1 ; then + cat >> confdefs.h <<\EOF +#define SV_SIGINFO 1 +EOF + +echo X 'SV_SIGINFO' '1' >> defines.out +else + echo $ac_n "checking for sigvec""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_sigvec'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. 
*/ +char sigvec(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. */ +#if defined (__stub_sigvec) || defined (__stub___sigvec) +choke me +#else +sigvec(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_sigvec=yes" +else + rm -rf conftest* + eval "ac_cv_func_sigvec=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'sigvec`\" = yes"; then + echo "$ac_t""yes" 1>&6 + sigvec=1 +else + echo "$ac_t""no" 1>&6 +fi + + if test "$sigvec" = 1 && ./fpe_check phoney_arg >> defines.out ; then : + else cat >> confdefs.h <<\EOF +#define NOINFO_SIGFPE 1 +EOF + +echo X 'NOINFO_SIGFPE' '1' >> defines.out + fi +fi ;; + + 1|2|4) # bad news have to turn off traps + # only know how to do this on systemV and solaris +ac_safe=`echo "ieeefp.h" | tr './\055' '___'` +echo $ac_n "checking for ieeefp.h""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + ieeefp_h=1 +else + echo "$ac_t""no" 1>&6 +fi + +echo $ac_n "checking for fpsetmask""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_fpsetmask'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. 
*/ +char fpsetmask(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. */ +#if defined (__stub_fpsetmask) || defined (__stub___fpsetmask) +choke me +#else +fpsetmask(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_fpsetmask=yes" +else + rm -rf conftest* + eval "ac_cv_func_fpsetmask=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'fpsetmask`\" = yes"; then + echo "$ac_t""yes" 1>&6 + fpsetmask=1 +else + echo "$ac_t""no" 1>&6 +fi + +if test "$ieeefp_h" = 1 && test "$fpsetmask" = 1 ; then +cat >> confdefs.h <<\EOF +#define FPE_TRAPS_ON 1 +EOF + +echo X 'FPE_TRAPS_ON' '1' >> defines.out +cat >> confdefs.h <<\EOF +#define USE_IEEEFP_H 1 +EOF + +echo X 'USE_IEEEFP_H' '1' >> defines.out +echo X 'TURN_ON_FPE_TRAPS()' 'fpsetmask(fpgetmask()|FP_X_DZ|FP_X_OFL)' >> defines.out +echo $ac_n "checking for sigaction""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_sigaction'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char sigaction(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. 
*/ +#if defined (__stub_sigaction) || defined (__stub___sigaction) +choke me +#else +sigaction(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_sigaction=yes" +else + rm -rf conftest* + eval "ac_cv_func_sigaction=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'sigaction`\" = yes"; then + echo "$ac_t""yes" 1>&6 + sigaction=1 +else + echo "$ac_t""no" 1>&6 +fi + +ac_safe=`echo "siginfo.h" | tr './\055' '___'` +echo $ac_n "checking for siginfo.h""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + siginfo_h=1 +else + echo "$ac_t""no" 1>&6 +fi + +if test "$sigaction" = 1 && test "$siginfo_h" = 1 ; then + cat >> confdefs.h <<\EOF +#define SV_SIGINFO 1 +EOF + +echo X 'SV_SIGINFO' '1' >> defines.out +else + echo $ac_n "checking for sigvec""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_sigvec'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char sigvec(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. 
*/ +#if defined (__stub_sigvec) || defined (__stub___sigvec) +choke me +#else +sigvec(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_sigvec=yes" +else + rm -rf conftest* + eval "ac_cv_func_sigvec=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'sigvec`\" = yes"; then + echo "$ac_t""yes" 1>&6 + sigvec=1 +else + echo "$ac_t""no" 1>&6 +fi + + if test "$sigvec" = 1 && ./fpe_check phoney_arg >> defines.out ; then : + else cat >> confdefs.h <<\EOF +#define NOINFO_SIGFPE 1 +EOF + +echo X 'NOINFO_SIGFPE' '1' >> defines.out + fi +fi +# look for strtod overflow bug +echo $ac_n "checking strtod bug on overflow""... $ac_c" 1>&6 +rm -f fpe_check +$CC $CFLAGS -DRETSIGTYPE=$ac_cv_type_signal -DUSE_IEEEFP_H \ + -o fpe_check fpe_check.c $MATHLIB +if ./fpe_check phoney_arg phoney_arg 2>/dev/null +then + echo "$ac_t""no bug" 1>&6 +else + echo "$ac_t""buggy -- will use work around" 1>&6 + echo X 'HAVE_STRTOD_OVF_BUG' '1' >> defines.out +fi + +else + if test $status != 4 ; then + cat >> confdefs.h <<\EOF +#define FPE_TRAPS_ON 1 +EOF + +echo X 'FPE_TRAPS_ON' '1' >> defines.out + echo $ac_n "checking for sigaction""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_sigaction'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char sigaction(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. 
*/ +#if defined (__stub_sigaction) || defined (__stub___sigaction) +choke me +#else +sigaction(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_sigaction=yes" +else + rm -rf conftest* + eval "ac_cv_func_sigaction=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'sigaction`\" = yes"; then + echo "$ac_t""yes" 1>&6 + sigaction=1 +else + echo "$ac_t""no" 1>&6 +fi + +ac_safe=`echo "siginfo.h" | tr './\055' '___'` +echo $ac_n "checking for siginfo.h""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + siginfo_h=1 +else + echo "$ac_t""no" 1>&6 +fi + +if test "$sigaction" = 1 && test "$siginfo_h" = 1 ; then + cat >> confdefs.h <<\EOF +#define SV_SIGINFO 1 +EOF + +echo X 'SV_SIGINFO' '1' >> defines.out +else + echo $ac_n "checking for sigvec""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_sigvec'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char sigvec(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. 
*/ +#if defined (__stub_sigvec) || defined (__stub___sigvec) +choke me +#else +sigvec(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_sigvec=yes" +else + rm -rf conftest* + eval "ac_cv_func_sigvec=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'sigvec`\" = yes"; then + echo "$ac_t""yes" 1>&6 + sigvec=1 +else + echo "$ac_t""no" 1>&6 +fi + + if test "$sigvec" = 1 && ./fpe_check phoney_arg >> defines.out ; then : + else cat >> confdefs.h <<\EOF +#define NOINFO_SIGFPE 1 +EOF + +echo X 'NOINFO_SIGFPE' '1' >> defines.out + fi +fi + fi + + case $status in + 1) +cat 1>&2 <<'EOF' +Warning: Your system defaults generate floating point exception +on divide by zero but not on overflow. You need to +#define TURN_ON_FPE_TRAPS() to handle overflow. +Please report this so I can fix this script to do it automatically. +EOF +;; + 2) +cat 1>&2 <<'EOF' +Warning: Your system defaults generate floating point exception +on overflow but not on divide by zero. You need to +#define TURN_ON_FPE_TRAPS() to handle divide by zero. +Please report this so I can fix this script to do it automatically. +EOF +;; + 4) +cat 1>&2 <<'EOF' +Warning: Your system defaults do not generate floating point +exceptions, but your math library does not support this behavior. +You need to +#define TURN_ON_FPE_TRAPS() to use fp exceptions for consistency. +Please report this so I can fix this script to do it automatically. +EOF +;; + esac +echo brennan@whidbey.com +echo You can continue with the build and the resulting mawk will be +echo useable, but getting FPE_TRAPS_ON correct eventually is best. +fi ;; + + *) # some sort of disaster +cat 1>&2 <<'EOF' +The program `fpe_check' compiled from fpe_check.c seems to have +unexpectly blown up. Please report this to brennan@whidbey.com. +EOF +# quit or not ??? +;; +esac +rm -f fpe_check # whew!! 
+fi +# output config.h +rm -f config.h +( +cat<<'EOF' +/* config.h -- generated by configure */ +#ifndef CONFIG_H +#define CONFIG_H + +EOF +sed 's/^X/#define/' defines.out +cat<<'EOF' + +#define HAVE_REAL_PIPES 1 +#endif /* CONFIG_H */ +EOF +) | tee config.h +rm defines.out +trap '' 1 2 15 +cat > confcache <<\EOF +# This file is a shell script that caches the results of configure +# tests run on this system so they can be shared between configure +# scripts and configure runs. It is not useful on other systems. +# If it contains results you don't want to keep, you may remove or edit it. +# +# By default, configure uses ./config.cache as the cache file, +# creating it if it does not exist already. You can give configure +# the --cache-file=FILE option to use a different cache file; that is +# what configure does when it calls configure scripts in +# subdirectories, so they share the cache. +# Giving --cache-file=/dev/null disables caching, for debugging configure. +# config.status only pays attention to the cache file if you give it the +# --recheck option to rerun configure. +# +EOF +# Ultrix sh set writes to stderr and can't be redirected directly, +# and sets the high bit in the cache file unless we assign to the vars. +(set) 2>&1 | + sed -n "s/^\([a-zA-Z0-9_]*_cv_[a-zA-Z0-9_]*\)=\(.*\)/\1=\${\1='\2'}/p" \ + >> confcache +if cmp -s $cache_file confcache; then + : +else + if test -w $cache_file; then + echo "updating cache $cache_file" + cat confcache > $cache_file + else + echo "not updating unwritable cache $cache_file" + fi +fi +rm -f confcache + +trap 'rm -fr conftest* confdefs* core core.* *.core $ac_clean_files; exit 1' 1 2 15 + +test "x$prefix" = xNONE && prefix=$ac_default_prefix +# Let make expand exec_prefix. +test "x$exec_prefix" = xNONE && exec_prefix='${prefix}' + +# Any assignment to VPATH causes Sun make to only execute +# the first set of double-colon rules, so remove it if not needed. +# If there is a colon in the path, we need to keep it. 
+if test "x$srcdir" = x.; then + ac_vpsub='/^[ ]*VPATH[ ]*=[^:]*$/d' +fi + +trap 'rm -f $CONFIG_STATUS conftest*; exit 1' 1 2 15 + +# Transform confdefs.h into DEFS. +# Protect against shell expansion while executing Makefile rules. +# Protect against Makefile macro expansion. +cat > conftest.defs <<\EOF +s%#define \([A-Za-z_][A-Za-z0-9_]*\) \(.*\)%-D\1=\2%g +s%[ `~#$^&*(){}\\|;'"<>?]%\\&%g +s%\[%\\&%g +s%\]%\\&%g +s%\$%$$%g +EOF +DEFS=`sed -f conftest.defs confdefs.h | tr '\012' ' '` +rm -f conftest.defs + + +# Without the "./", some shells look in PATH for config.status. +: ${CONFIG_STATUS=./config.status} + +echo creating $CONFIG_STATUS +rm -f $CONFIG_STATUS +cat > $CONFIG_STATUS </dev/null | sed 1q`: +# +# $0 $ac_configure_args +# +# Compiler output produced by configure, useful for debugging +# configure, is in ./config.log if it exists. + +ac_cs_usage="Usage: $CONFIG_STATUS [--recheck] [--version] [--help]" +for ac_option +do + case "\$ac_option" in + -recheck | --recheck | --rechec | --reche | --rech | --rec | --re | --r) + echo "running \${CONFIG_SHELL-/bin/sh} $0 $ac_configure_args --no-create --no-recursion" + exec \${CONFIG_SHELL-/bin/sh} $0 $ac_configure_args --no-create --no-recursion ;; + -version | --version | --versio | --versi | --vers | --ver | --ve | --v) + echo "$CONFIG_STATUS generated by autoconf version 2.4" + exit 0 ;; + -help | --help | --hel | --he | --h) + echo "\$ac_cs_usage"; exit 0 ;; + *) echo "\$ac_cs_usage"; exit 1 ;; + esac +done + +ac_given_srcdir=$srcdir + +trap 'rm -fr `echo "Makefile" | sed "s/:[^ ]*//g"` conftest*; exit 1' 1 2 15 + +# Protect against being on the right side of a sed subst in config.status. 
+sed 's/%@/@@/; s/@%/@@/; s/%g$/@g/; /@g$/s/[\\\\&%]/\\\\&/g; + s/@@/%@/; s/@@/@%/; s/@g$/%g/' > conftest.subs <<\CEOF +$ac_vpsub +$extrasub +s%@CFLAGS@%$CFLAGS%g +s%@CPPFLAGS@%$CPPFLAGS%g +s%@CXXFLAGS@%$CXXFLAGS%g +s%@DEFS@%$DEFS%g +s%@LDFLAGS@%$LDFLAGS%g +s%@LIBS@%$LIBS%g +s%@exec_prefix@%$exec_prefix%g +s%@prefix@%$prefix%g +s%@program_transform_name@%$program_transform_name%g +s%@BINDIR@%$BINDIR%g +s%@MANDIR@%$MANDIR%g +s%@MANEXT@%$MANEXT%g +s%@CC@%$CC%g +s%@CPP@%$CPP%g +s%@MATHLIB@%$MATHLIB%g +s%@YACC@%$YACC%g + +CEOF +EOF +cat >> $CONFIG_STATUS <> $CONFIG_STATUS <<\EOF +for ac_file in .. $CONFIG_FILES; do if test "x$ac_file" != x..; then + # Support "outfile[:infile]", defaulting infile="outfile.in". + case "$ac_file" in + *:*) ac_file_in=`echo "$ac_file"|sed 's%.*:%%'` + ac_file=`echo "$ac_file"|sed 's%:.*%%'` ;; + *) ac_file_in="${ac_file}.in" ;; + esac + + # Adjust relative srcdir, etc. for subdirectories. + + # Remove last slash and all that follows it. Not all systems have dirname. + ac_dir=`echo $ac_file|sed 's%/[^/][^/]*$%%'` + if test "$ac_dir" != "$ac_file" && test "$ac_dir" != .; then + # The file is in a subdirectory. + test ! -d "$ac_dir" && mkdir "$ac_dir" + ac_dir_suffix="/`echo $ac_dir|sed 's%^\./%%'`" + # A "../" for each directory in $ac_dir_suffix. + ac_dots=`echo $ac_dir_suffix|sed 's%/[^/]*%../%g'` + else + ac_dir_suffix= ac_dots= + fi + + case "$ac_given_srcdir" in + .) srcdir=. + if test -z "$ac_dots"; then top_srcdir=. + else top_srcdir=`echo $ac_dots|sed 's%/$%%'`; fi ;; + /*) srcdir="$ac_given_srcdir$ac_dir_suffix"; top_srcdir="$ac_given_srcdir" ;; + *) # Relative path. + srcdir="$ac_dots$ac_given_srcdir$ac_dir_suffix" + top_srcdir="$ac_dots$ac_given_srcdir" ;; + esac + + echo creating "$ac_file" + rm -f "$ac_file" + configure_input="Generated automatically from `echo $ac_file_in|sed 's%.*/%%'` by configure." 
+ case "$ac_file" in + *Makefile*) ac_comsub="1i\\ +# $configure_input" ;; + *) ac_comsub= ;; + esac + sed -e "$ac_comsub +s%@configure_input@%$configure_input%g +s%@srcdir@%$srcdir%g +s%@top_srcdir@%$top_srcdir%g +" -f conftest.subs $ac_given_srcdir/$ac_file_in > $ac_file +fi; done +rm -f conftest.subs + + + +exit 0 +EOF +chmod +x $CONFIG_STATUS +rm -fr confdefs* $ac_clean_files +test "$no_create" = yes || ${CONFIG_SHELL-/bin/sh} $CONFIG_STATUS || exit 1 + diff --git a/configure.in b/configure.in new file mode 100644 index 0000000..d9acaf8 --- /dev/null +++ b/configure.in @@ -0,0 +1,49 @@ +dnl configure.in for mawk +dnl +dnl $Log: configure.in,v $ +dnl Revision 1.13 1995/10/16 12:25:00 mike +dnl configure cleanup +dnl +dnl Revision 1.12 1995/04/20 20:26:51 mike +dnl beta improvements from Carl Mascott +dnl +dnl Revision 1.11 1995/01/09 01:22:30 mike +dnl check sig handler ret type to make fpe_check.c more robust +dnl +dnl Revision 1.10 1994/12/18 20:46:24 mike +dnl fpe_check -> ./fpe_check +dnl +dnl Revision 1.9 1994/12/14 14:42:55 mike +dnl more explicit that " " MATHLIB means none +dnl +dnl Revision 1.8 1994/12/11 21:26:25 mike +dnl tweak egrep for [fs]printf prototypes +dnl +dnl Revision 1.7 1994/10/16 18:38:23 mike +dnl use sed on defines.out +dnl +dnl Revision 1.6 1994/10/11 02:49:06 mike +dnl systemVr4 siginfo +dnl +dnl Revision 1.5 1994/10/11 00:39:25 mike +dnl fpe check stuff +dnl +dnl +dnl +AC_INIT(mawk.h) +builtin(include,mawk.ac.m4) +GET_USER_DEFAULTS +PROG_CC_NO_MINUS_G_NONSENSE +AC_PROG_CPP +NOTSET_THEN_DEFAULT(CFLAGS,-O) +LOOK_FOR_MATH_LIBRARY +WHICH_YACC +COMPILER_ATTRIBUTES +WHERE_SIZE_T +CHECK_HEADERS(fcntl.h,errno.h, time.h,stdarg.h) +CHECK_FUNCTIONS(memcpy,strchr,strerror,vfprintf,strtod,fmod,matherr) +FPRINTF_IN_STDIO +FIND_OR_COMPUTE_MAX__INT +DREADED_FPE_TESTS +DO_CONFIG_H +AC_OUTPUT(Makefile) diff --git a/da.c b/da.c new file mode 100644 index 0000000..788375b --- /dev/null +++ b/da.c @@ -0,0 +1,431 @@ + 
+/******************************************** +da.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + + +/* $Log: da.c,v $ + * Revision 1.6 1995/06/18 19:19:59 mike + * remove use of comma operator that broke some sysVr3 compilers + * + * Revision 1.5 1994/12/13 00:12:08 mike + * delete A statement to delete all of A at once + * + * Revision 1.4 1994/10/08 19:15:32 mike + * remove SM_DOS + * + * Revision 1.3 1993/12/01 14:25:10 mike + * reentrant array loops + * + * Revision 1.2 1993/07/22 00:04:05 mike + * new op code _LJZ _LJNZ + * + * Revision 1.1.1.1 1993/07/03 18:58:10 mike + * move source to cvs + * + * Revision 5.4 1993/01/09 19:05:48 mike + * dump code to stdout and exit(0) + * + * Revision 5.3 1993/01/07 02:50:33 mike + * relative vs absolute code + * + * Revision 5.2 1992/07/25 21:35:25 brennan + * patch2 + * fixed small typo on da of _PRE_DEC + * + * Revision 5.1 1991/12/05 07:55:45 brennan + * 1.1 pre-release + * +*/ + + +/* da.c */ +/* disassemble code */ + + + +#include "mawk.h" + + +#include "code.h" +#include "bi_funct.h" +#include "repl.h" +#include "field.h" + +static char *PROTO(find_bi_name, (PF_CP)) ; + +static struct sc +{ + char op ; + char *name ; +} simple_code[] = + +{ + {_STOP, "stop"}, + {FE_PUSHA, "fe_pusha"}, + {FE_PUSHI, "fe_pushi"}, + {A_TEST, "a_test"}, + {A_DEL, "a_del"}, + {DEL_A, "del_a"}, + {POP_AL, "pop_al"}, + {_POP, "pop"}, + {_ADD, "add"}, + {_SUB, "sub"}, + {_MUL, "mul"}, + {_DIV, "div"}, + {_MOD, "mod"}, + {_POW, "pow"}, + {_NOT, "not"}, + {_UMINUS, "uminus"}, + {_UPLUS, "uplus"}, + {_TEST, "test"}, + {_CAT, "cat"}, + {_ASSIGN, "assign"}, + {_ADD_ASG, "add_asg"}, + {_SUB_ASG, "sub_asg"}, + {_MUL_ASG, "mul_asg"}, + {_DIV_ASG, "div_asg"}, + {_MOD_ASG, "mod_asg"}, + {_POW_ASG, "pow_asg"}, + 
{NF_PUSHI, "nf_pushi"}, + {F_ASSIGN, "f_assign"}, + {F_ADD_ASG, "f_add_asg"}, + {F_SUB_ASG, "f_sub_asg"}, + {F_MUL_ASG, "f_mul_asg"}, + {F_DIV_ASG, "f_div_asg"}, + {F_MOD_ASG, "f_mod_asg"}, + {F_POW_ASG, "f_pow_asg"}, + {_POST_INC, "post_inc"}, + {_POST_DEC, "post_dec"}, + {_PRE_INC, "pre_inc"}, + {_PRE_DEC, "pre_dec"}, + {F_POST_INC, "f_post_inc"}, + {F_POST_DEC, "f_post_dec"}, + {F_PRE_INC, "f_pre_inc"}, + {F_PRE_DEC, "f_pre_dec"}, + {_EQ, "eq"}, + {_NEQ, "neq"}, + {_LT, "lt"}, + {_LTE, "lte"}, + {_GT, "gt"}, + {_GTE, "gte"}, + {_MATCH2, "match2"}, + {_EXIT, "exit"}, + {_EXIT0, "exit0"}, + {_NEXT, "next"}, + {_RET, "ret"}, + {_RET0, "ret0"}, + {_OMAIN, "omain"}, + {_JMAIN, "jmain"}, + {OL_GL, "ol_gl"}, + {OL_GL_NR, "ol_gl_nr"}, + {_HALT, (char *) 0} +} ; + +static char *jfmt = "%s%s%03d\n" ; + /* format to print jumps */ +static char *tab2 = "\t\t" ; + +void +da(start, fp) + INST *start ; + FILE *fp ; +{ + CELL *cp ; + register INST *p = start ; + char *name ; + + while (p->op != _HALT) + { + /* print the relative code address (label) */ + fprintf(fp, "%03d ", p - start) ; + + switch (p++->op) + { + + case _PUSHC: + cp = (CELL *) p++->ptr ; + switch (cp->type) + { + case C_RE: + fprintf(fp, "pushc\t0x%lx\t/%s/\n", (long) cp->ptr, + re_uncompile(cp->ptr)) ; + break ; + + case C_SPACE: + fprintf(fp, "pushc\tspace split\n") ; + break ; + + case C_SNULL: + fprintf(fp, "pushc\tnull split\n") ; + break ; + case C_REPL: + fprintf(fp, "pushc\trepl\t%s\n", + repl_uncompile(cp)) ; + break ; + case C_REPLV: + fprintf(fp, "pushc\treplv\t%s\n", + repl_uncompile(cp)) ; + break ; + + default: + fprintf(fp,"pushc\tWEIRD\n") ; ; + break ; + } + break ; + + case _PUSHD: + fprintf(fp, "pushd\t%.6g\n", *(double *) p++->ptr) ; + break ; + case _PUSHS: + { + STRING *sval = (STRING *) p++->ptr ; + fprintf(fp, "pushs\t\"%s\"\n", sval->str) ; + break ; + } + + case _MATCH0: + case _MATCH1: + fprintf(fp, "match%d\t0x%lx\t/%s/\n", + p[-1].op == _MATCH1, (long) p->ptr, + 
re_uncompile(p->ptr)) ; + p++ ; + break ; + + case _PUSHA: + fprintf(fp, "pusha\t%s\n", + reverse_find(ST_VAR, &p++->ptr)) ; + break ; + + case _PUSHI: + cp = (CELL *) p++->ptr ; + if (cp == field) fprintf(fp, "pushi\t$0\n") ; + else if (cp == &fs_shadow) + fprintf(fp, "pushi\t@fs_shadow\n") ; + else + { + if ( +#ifdef MSDOS + SAMESEG(cp, field) && +#endif + cp > NF && cp <= LAST_PFIELD) + name = reverse_find(ST_FIELD, &cp) ; + else name = reverse_find(ST_VAR, &cp) ; + + fprintf(fp, "pushi\t%s\n", name) ; + } + break ; + + case L_PUSHA: + fprintf(fp, "l_pusha\t%d\n", p++->op) ; + break ; + + case L_PUSHI: + fprintf(fp, "l_pushi\t%d\n", p++->op) ; + break ; + + case LAE_PUSHI: + fprintf(fp, "lae_pushi\t%d\n", p++->op) ; + break ; + + case LAE_PUSHA: + fprintf(fp, "lae_pusha\t%d\n", p++->op) ; + break ; + + case LA_PUSHA: + fprintf(fp, "la_pusha\t%d\n", p++->op) ; + break ; + + case F_PUSHA: + cp = (CELL *) p++->ptr ; + if ( +#ifdef MSDOS + SAMESEG(cp, field) && +#endif + cp >= NF && cp <= LAST_PFIELD) + fprintf(fp, "f_pusha\t%s\n", + reverse_find(ST_FIELD, &cp)) ; + else fprintf(fp, "f_pusha\t$%d\n", + field_addr_to_index(cp)) ; + break ; + + case F_PUSHI: + p++ ; + fprintf(fp, "f_pushi\t$%d\n", p++->op) ; + break ; + + case AE_PUSHA: + fprintf(fp, "ae_pusha\t%s\n", + reverse_find(ST_ARRAY, &p++->ptr)) ; + break ; + + case AE_PUSHI: + fprintf(fp, "ae_pushi\t%s\n", + reverse_find(ST_ARRAY, &p++->ptr)) ; + break ; + + case A_PUSHA: + fprintf(fp, "a_pusha\t%s\n", + reverse_find(ST_ARRAY, &p++->ptr)) ; + break ; + + case _PUSHINT: + fprintf(fp, "pushint\t%d\n", p++->op) ; + break ; + + case _BUILTIN: + fprintf(fp, "%s\n", + find_bi_name((PF_CP) p++->ptr)) ; + break ; + + case _PRINT: + fprintf(fp, "%s\n", + (PF_CP) p++->ptr == bi_printf + ? 
"printf" : "print") ; + break ; + + case _JMP: + fprintf(fp, jfmt, "jmp", tab2, (p - start) + p->op) ; + p++ ; + break ; + + case _JNZ: + fprintf(fp, jfmt, "jnz", tab2, (p - start) + p->op) ; + p++ ; + break ; + + case _JZ: + fprintf(fp, jfmt, "jz", tab2, (p - start) + p->op) ; + p++ ; + break ; + + case _LJZ: + fprintf(fp, jfmt, "ljz", tab2, (p - start) + p->op) ; + p++ ; + break ; + + case _LJNZ: + fprintf(fp, jfmt, "ljnz", tab2+1 , (p - start) + p->op) ; + p++ ; + break ; + + case SET_ALOOP: + fprintf(fp, "set_al\t%03d\n", p + p->op - start) ; + p++ ; + break ; + + case ALOOP: + fprintf(fp, "aloop\t%03d\n", p - start + p->op) ; + p++ ; + break ; + + case A_CAT : + fprintf(fp,"a_cat\t%d\n", p++->op) ; + break ; + + case _CALL: + fprintf(fp, "call\t%s\t%d\n", + ((FBLOCK *) p->ptr)->name, p[1].op) ; + p += 2 ; + break ; + + case _RANGE: + fprintf(fp, "range\t%03d %03d %03d\n", + /* label for pat2, action, follow */ + p - start + p[1].op, + p - start + p[2].op, + p - start + p[3].op) ; + p += 4 ; + break ; + default: + { + struct sc *q = simple_code ; + int k = (p - 1)->op ; + + while (q->op != _HALT && q->op != k) q++ ; + + fprintf(fp, "%s\n", + q->op != _HALT ? 
q->name : "bad instruction") ; + } + break ; + } + } + fflush(fp) ; +} + +static struct +{ + PF_CP action ; + char *name ; +} +special_cases[] = +{ + {bi_split, "split"}, + {bi_match, "match"}, + {bi_getline, "getline"}, + {bi_sub, "sub"}, + {bi_gsub, "gsub"}, + {(PF_CP) 0, (char *) 0} +} ; + +static char * +find_bi_name(p) + PF_CP p ; +{ + BI_REC *q ; + int i ; + + for (q = bi_funct; q->name; q++) + { + if (q->fp == p) + { + /* found */ + return q->name ; + } + } + /* next check some special cases */ + for (i = 0; special_cases[i].action; i++) + { + if (special_cases[i].action == p) return special_cases[i].name ; + } + + return "unknown builtin" ; +} + +static struct fdump +{ + struct fdump *link ; + FBLOCK *fbp ; +} *fdump_list ; /* linked list of all user functions */ + +void +add_to_fdump_list(fbp) + FBLOCK *fbp ; +{ + struct fdump *p = ZMALLOC(struct fdump) ; + p->fbp = fbp ; + p->link = fdump_list ; fdump_list = p ; +} + +void +fdump() +{ + register struct fdump *p, *q = fdump_list ; + + while (q) + { + p = q ; + q = p->link ; + fprintf(stdout, "function %s\n", p->fbp->name) ; + da(p->fbp->code, stdout) ; + ZFREE(p) ; + } +} + diff --git a/da.o b/da.o new file mode 100644 index 0000000..e6132ce Binary files /dev/null and b/da.o differ diff --git a/debian/changelog b/debian/changelog new file mode 100644 index 0000000..728e41d --- /dev/null +++ b/debian/changelog @@ -0,0 +1,293 @@ +mawk (1.3.3-15+s4) unstable; urgency=low + + * Apply debian patches + * Remove generated files + + -- Mike McCormack Wed, 25 May 2011 11:46:28 +0900 + +mawk (1.3.3-15+s3) unstable; urgency=low + + * Set myself as maintainer. + + -- Rafal Krypa Wed, 14 Apr 2010 13:38:34 +0200 + +mawk (1.3.3-15+s2) unstable; urgency=low + + * debian/rules: don't run the tests after compilation, explicitly + specify make target. + + -- Rafal Krypa Wed, 03 Feb 2010 21:20:31 +0900 + +mawk (1.3.3-15+s1) unstable; urgency=low + + * debian/control: add the Uploaders field. 
+ + -- Rafal Krypa Fri, 29 Jan 2010 16:58:49 +0900 + +mawk (1.3.3-15) unstable; urgency=high + + * Fix debian/copyright to correctly list the license as GPLv2, not GPLv2 + or later. Closes: #536689 + + -- Steve Langasek Mon, 27 Jul 2009 11:26:47 -0700 + +mawk (1.3.3-14) unstable; urgency=low + + * Build-Conflict with byacc, as the current version doesn't appear to be + compatible with mawk; though we ought to fix the upstream build rules + to not check for byacc first in this case, this is an ok fix for now. + Closes: #509832. + + -- Steve Langasek Fri, 26 Dec 2008 16:17:53 -0800 + +mawk (1.3.3-13) unstable; urgency=low + + * New maintainer; closes: #496711. + * Drop versioned gcc build-dependency, which has been satisfied since + before oldstable. + * debian/rules: fix up clean target to use a simpler, standard distclean + call, fixing a lintian warning. + * debian/rules: future-proof the clean target for patch interaction + with the build system, moving all the cleaning into a + "clean-patched" target that fires before the unpatch target + + -- Steve Langasek Wed, 27 Aug 2008 10:03:33 -0700 + +mawk (1.3.3-12) unstable; urgency=low + + * New maintainer; closes: 496711 + * Fix the following lintian issues: + W: ancient-standards-version 3.5.10.0 (current is 3.8.0) + W: mawk: unknown-section base + W: mawk: old-fsf-address-in-copyright-file + + -- Anibal Monsalve Salazar Wed, 27 Aug 2008 17:41:50 +1000 + +mawk (1.3.3-11.1) unstable; urgency=low + + * Non-maintainer upload. + * debian/postinst: fix bashism. Closes: #308134 + + -- Peter Eisentraut Sat, 05 Apr 2008 17:11:11 +0200 + +mawk (1.3.3-11) unstable; urgency=low + + * 08_fix-for-gcc3.3.dpatch: grossly hack configure to work around + gcc-3.3 providing a builtin log() function which broke the configure + tests. Thanks to Daniel Schepler for the + report. Closes: #195371 + + * debian/control: add build-depends on gcc (>= 3:3.3-1) for hppa. + * debian/rules: remove de-optimization hack for hppa. 
Thanks to LaMont + Jones and Matthias Klose + . Closes: #105816 + + * debian/control (Standards-Version): bump to 3.5.10.0. + + -- James Troup Fri, 30 May 2003 15:24:50 +0100 + +mawk (1.3.3-10) unstable; urgency=low + + * Move to dpatch; existing non-debian/ changes split into + 01_error-on-full-fs, 02_fix-examples, 03_read-and-close-redefinition, + 04_mawk.1-fix-pi and 05_-Wall-fixes. + * debian/rules: include /usr/share/dpatch/dpatch.make. + * debian/rules (build): depend on patch-stamp. + * debian/rules (clean): depend on unpatch. Remove debian/patched. + * debian/control (Build-Depends): add dpatch. + + * debian/rules: update copyright and use install_foo convenience + variables. + * debian/copyright: update copyright. + + * debian/control (Standards-Version): bump to 3.5.9.0. + * debian/postinst, debian/prerm: no longer do /usr/doc symlinks. + + * debian/prerm: use set -e rather than #!/bin/sh -e. + + * 06_parse.y-semicolons.dpatch: new patch to fix missing semi-colons + that upset recent versions of bison. Thanks to Paul Eggert + . Closes: #170973 + * debian/control (Build-Depends): add bison. + * debian/rules (clean): remove parse.c and parse.h so they're not + included in the .diff.gz. + + * 07_mawktest-check-devfull: new patch to conditionalize the write error + tests on the existence of /dev/full since apparently some systems + don't have it. Requested by Marcus.Brinkmann@ruhr-uni-bochum.de. + Closes: #51875 + + * debian/postinst: demote mawk to priority 5 so that gawk will be + selected by default. [mawk isn't being actively maintained upstream + and has both long-standing bugs and isn't feature-complete WRT POSIX + at least.] + + -- James Troup Thu, 10 Apr 2003 02:22:27 +0100 + +mawk (1.3.3-9) unstable; urgency=low + + * debian/control: capitalize POSIX, thanks to Matt Zimmerman + . Closes: #125120 + * debian/changelog: remove obsolete local variables. 
+ * man/mawk.1: add a macro provided by Colin Watson + to force PI to be displayed as "pi" (rather than 'n') when processed + by nroff. Closes: #103699 + * debian/control (Standards-Version): update to 3.5.6.1. + + -- James Troup Fri, 9 Aug 2002 15:04:23 +0100 + +mawk (1.3.3-8) unstable; urgency=low + + * debian/rules (build): compile with -O1 on hppa to work around probable + compiler bug. Thanks to LaMont Jones . + + -- James Troup Wed, 18 Jul 2001 20:40:37 +0100 + +mawk (1.3.3-7) unstable; urgency=low + + * mawk.h: remove bogus redefinition of read() and close() and #include + instead, thanks to LaMont Jones . + Closes: #104124 + + -- James Troup Tue, 10 Jul 2001 03:09:24 +0100 + +mawk (1.3.3-6) unstable; urgency=low + + * debian/control (Maintainer): fixed to be me. + * debian/changelog: remove add-log-mailing-address. + * debian/rules: rewritten. + * debian/control (Standards-Version): bump to 3.5.5.0. + * Half hearted attempt at -Wall cleaning of the code. + + -- James Troup Mon, 25 Jun 2001 05:33:51 +0100 + +mawk (1.3.3-5) unstable; urgency=low + + * debian/postinst: manpages are in /usr/share/man now; forgot to update + the arguments to update-alternatives. Thanks to Malcolm Parsons + for noticing. [#54440] + + -- James Troup Sun, 9 Jan 2000 16:54:14 +0000 + +mawk (1.3.3-4) unstable; urgency=low + + * debian/rules (build): make configure and the test scripts executables + to make builds work under aegis. + * debian/copyright: remove references to Linux. Update location of the + GPL. We are mawk, not hello. + * debian/rules (binary-arch): move to FHS; install documentation into + /usr/share/doc/ and manpages into /usr/share/man/. + * debian/postinst: add /usr/doc/ symlink. + * debian/prerm: remove /usr/doc/ symlink. + * debian/control (Standards-Version): update to 3.1.1.1. + * debian/rules (binary-arch): pass -isp to dpkg-gencontrol. 
+ + -- James Troup Fri, 31 Dec 1999 13:53:42 +0000 + +mawk (1.3.3-3) unstable; urgency=low + + * debian/rules (binary-arch): add chmod -R go=rX to correct permissions + on directories. + * debian/control (Standards-Version): update [FSVO] to 2.5.0.0. + + * The following entries are a patch from Torsten Landschoff + . [#4293, #28249] + * files.c: Added handling of write errors delivered when closing the + output file. + * test/mawktest: Added checking for correct handling of write errors on + full disks. + + * The remaining entries are a patch from Edward Betts + . [#36011] + * examples/hical: use /bin/echo to avoid bash's builtin. + * examples/{ct_length.awk,eatc.awk,nocomment,primes,qsort}: fix bang + path. + * examples/{decl.awk,gdecl.awk,hcal}: fix bang path, remove trailing + white space. + + -- James Troup Fri, 8 Oct 1999 18:26:49 +0100 + +mawk (1.3.3-2) frozen unstable; urgency=low + + * debian/control (Maintainer): New maintainer. However, I'm just an + interim real maintainer, the package will go back to Chris as soon as + he's ready. + * debian/control (Standards-Version): Upgraded to 2.4.1.0. + * debian/control (Depends): Made a Pre-Depends. [#20601] + * debian/copyright: corrected URL of upstream source. [#20603] + * debian/copyright: updated the address of the FSF. + * Pristine upstream source. + + -- James Troup Thu, 30 Apr 1998 16:02:45 +0200 + +mawk (1.3.3-1.1) unstable; urgency=low + + * Non-maintainer release. + * Rebuilt under libc6 [#11707]. + + -- James Troup Fri, 3 Oct 1997 20:19:36 +0200 + +mawk (1.3.3-1) unstable; urgency=low + + * Upgrade to latest upstream source (very minor bug fix) + * Change update-alternatives links to reflect compressed man pages. + * postinst: remove bad links in /usr/man/man1. 
+ + -- Chris Fearnley Fri, 7 Mar 1997 14:41:20 -0500 + +mawk (1.3.2-3) unstable; urgency=low + + * I accidently built the last mawk from my hacked source :( + This version reverts to the pristine upstream (but fails to + completely close Bug #4293 -- which was the point of the dubious + hack found in mawk-1.3.2-2). Sorry. Glad I noticed so soon! + + -- Chris Fearnley Sun, 3 Nov 1996 22:57:45 -0500 + +mawk (1.3.2-2) unstable; urgency=low + + * Fixed postinst to install awk and nawk man pages correctly (bug#5001) + + -- Chris Fearnley Sun, 3 Nov 1996 22:28:56 -0500 + +mawk (1.3.2-1) unstable; urgency=low + + * upgrade to latest upstream source (solves bug #4293) + * development environment: gcc-2.7.2.1-1, libc5-5.2.18-10, + libc5-dev-5.2.18-10, binutils-2.6-2, and make-3.74-12 + + -- Chris Fearnley Sat, 28 Sep 1996 22:58:42 -0400 + +mawk (1.3.1-1) unstable; urgency=low + + * upgrade to latest upstream source + * upgrade packaging to Debian Standards-Version 2.1.1.0 + * i386 development environment: gcc-2.7.2.1-1, libc5-5.2.18-10, + libc5-dev-5.2.18-10, binutils-2.6-2, and make-3.74-12 + + -- Chris Fearnley Wed, 18 Sep 1996 14:24:31 -0400 + +mawk (1.2.2-2) unstable; urgency=low + + * upgrade to latest debian packaging guidelines + * provides: awk + * mawk is now the default awk/nawk for Debian GNU/Linux + * Development environment for i386: gcc-2.7.2-8 libc5-5.2.18-9 make-3.74-12 + * Section: base ; Priority: important + + -- Chris Fearnley Wed, 7 Aug 1996 21:51:21 -0400 + +mawk (1.2.2-1) unstable; urgency=low + + * Upgrade to new upsteam version + * Compiled with: gcc-2.7.2-2, binutils-2.6-2, and libc5-5.2.18-1. 
+ + -- Chris Fearnley Mon, 29 Jan 1996 04:02:39 -0400 + +mawk (1.2.1-1) unstable; urgency=low + + * added Debian GNU/Linux package maintenance system files + * patched Makefile.in to make debianization more flexible + * initial release - ELF package + + -- Chris Fearnley Sun, 3 Dec 1995 00:48:23 -0400 diff --git a/debian/control b/debian/control new file mode 100644 index 0000000..2ebd91e --- /dev/null +++ b/debian/control @@ -0,0 +1,26 @@ +Source: mawk +Section: utils +Priority: required +Maintainer: Rafal Krypa +X-Original-Maintainer: Steve Langasek +Build-Depends: bison +Build-Conflicts: byacc +Standards-Version: 3.8.2 + +Package: mawk +Architecture: any +Provides: awk +Pre-Depends: ${shlibs:Depends} +Description: a pattern scanning and text processing language + Mawk is an interpreter for the AWK Programming Language. The AWK + language is useful for manipulation of data files, text retrieval and + processing, and for prototyping and experimenting with algorithms. Mawk + is a new awk meaning it implements the AWK language as defined in Aho, + Kernighan and Weinberger, The AWK Programming Language, Addison-Wesley + Publishing, 1988. (Hereafter referred to as the AWK book.) Mawk conforms + to the POSIX 1003.2 (draft 11.3) definition of the AWK language + which contains a few features not described in the AWK book, and mawk + provides a small number of extensions. + . + Mawk is smaller and much faster than gawk. It has some compile-time + limits such as NF = 32767 and sprintf buffer = 1020. diff --git a/debian/copyright b/debian/copyright new file mode 100644 index 0000000..fe24f9b --- /dev/null +++ b/debian/copyright @@ -0,0 +1,28 @@ +This is the Debian GNU prepackaged version of mawk, an implementation +of the AWK Programming Language. mawk was written by Mike Brennan + + +This package was put together by Chris Fearnley , +from sources obtained from: + ftp://ftp.whidbey.net/pub/brennan/mawk1.3.3.tar.gz + +It is currently being maintained by James Troup . 
+ +mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan +Modifications for Debian GNU/Linux Copyright (C) 1995-96 Chris Fearnley. +Modifications for Debian GNU/Linux Copyright (C) 1998-2003 James Troup. + +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License version 2 as published +by the Free Software Foundation. + +This program is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +General Public License for more details. + +You should have received a copy of the GNU General Public License with +your Debian GNU system, in /usr/share/common-licenses/GPL-2, or with the +Debian GNU mawk source package as the file COPYING. If not, write to +the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, +Boston, MA 02110-1301, USA. diff --git a/debian/files b/debian/files new file mode 100644 index 0000000..c3efb60 --- /dev/null +++ b/debian/files @@ -0,0 +1 @@ +mawk_1.3.3-15+s4_i386.deb utils required diff --git a/debian/postinst b/debian/postinst new file mode 100644 index 0000000..4f19365 --- /dev/null +++ b/debian/postinst @@ -0,0 +1,14 @@ +#!/bin/sh + +set -e + +update-alternatives --quiet --install /usr/bin/awk awk /usr/bin/mawk 5 \ + --slave /usr/share/man/man1/awk.1.gz awk.1.gz /usr/share/man/man1/mawk.1.gz \ + --slave /usr/bin/nawk nawk /usr/bin/mawk \ + --slave /usr/share/man/man1/nawk.1.gz nawk.1.gz /usr/share/man/man1/mawk.1.gz +for badlink in /usr/man/man1/awk.1 /usr/man/man1/nawk.1 /usr/share/man/man1/awk.1 /usr/share/man/man1/nawk.1; do + if [ -L $badlink ]; then + if ! 
ls -l $(ls -l $badlink | cut -d">" -f2) >/dev/null 2>&1; then + rm -f $badlink; fi; fi; done + +exit 0 diff --git a/debian/prerm b/debian/prerm new file mode 100644 index 0000000..d153132 --- /dev/null +++ b/debian/prerm @@ -0,0 +1,9 @@ +#!/bin/sh + +set -e + +if [ "$1" != "upgrade" ]; then + update-alternatives --remove awk /usr/bin/mawk +fi + +exit 0 diff --git a/debian/rules b/debian/rules new file mode 100755 index 0000000..5ad7110 --- /dev/null +++ b/debian/rules @@ -0,0 +1,79 @@ +#!/usr/bin/make -f +# debian/rules file - for Mawk (1.3.3) +# Based on sample debian/rules file - for GNU Hello (1.3). +# Copyright 1994,1995 by Ian Jackson. +# Copyright 2001-2003 James Troup +# I hereby give you perpetual unlimited permission to copy, +# modify and relicense this file, provided that you do not remove +# my name from the file itself. (I assert my moral right of +# paternity under the Copyright, Designs and Patents Act 1988.) +# This file may have to be extensively modified + +install_dir=install -d -m 755 +install_file=install -m 644 +install_script=install -m 755 +install_binary=install -m 755 -s + +include /usr/share/dpatch/dpatch.make + +build: patch-stamp + $(checkdir) + chmod 755 configure test/mawktest test/fpe_test + ./configure + $(MAKE) CC="$(CC)" CFLAGS="-g -Wall -O2" LDFLAGS="$(LDFLAGS)" mawk + touch build + +clean: clean-patched unpatch + -rm -rf debian/patched + +clean-patched: + $(checkdir) + -rm -f build parse.c parse.h + [ ! -f Makefile ] || make distclean + -rm -f man/index.db + -rm -rf debian/tmp debian/substvars debian/files* + find . 
-name \*~ | xargs rm -vf + +binary-indep: + +binary-arch: checkroot build + $(checkdir) + rm -rf debian/tmp + + $(install_dir) debian/tmp/DEBIAN + $(install_script) debian/prerm debian/postinst debian/tmp/DEBIAN/ + + $(install_dir) debian/tmp/usr/bin + $(install_binary) mawk debian/tmp/usr/bin/ + + $(install_dir) debian/tmp/usr/share/man/man1 + $(install_file) man/mawk.1 debian/tmp/usr/share/man/man1/ + gzip -9v debian/tmp/usr/share/man/man1/mawk.1 + + $(install_dir) debian/tmp/usr/share/doc/mawk/ + $(install_file) CHANGES debian/tmp/usr/share/doc/mawk/changelog + $(install_file) debian/changelog debian/tmp/usr/share/doc/mawk/changelog.Debian + $(install_file) README ACKNOWLEDGMENT debian/tmp/usr/share/doc/mawk/ + gzip -9v debian/tmp/usr/share/doc/mawk/* + + $(install_dir) debian/tmp/usr/share/doc/mawk/examples + $(install_file) examples/* debian/tmp/usr/share/doc/mawk/examples/ + gzip -9v debian/tmp/usr/share/doc/mawk/examples/* + + $(install_file) debian/copyright debian/tmp/usr/share/doc/mawk/ + + dpkg-shlibdeps mawk + dpkg-gencontrol -isp + chown -R root.root debian/tmp + chmod -R go=rX debian/tmp + dpkg --build debian/tmp .. + +# Below here is fairly generic really + +binary: binary-indep binary-arch + +checkroot: + $(checkdir) + test root = "`whoami`" + +.PHONY: binary binary-arch binary-indep clean checkroot diff --git a/debian/substvars b/debian/substvars new file mode 100644 index 0000000..2089e51 --- /dev/null +++ b/debian/substvars @@ -0,0 +1 @@ +shlibs:Depends=libc6 (>= 2.1) diff --git a/debian/tmp/DEBIAN/control b/debian/tmp/DEBIAN/control new file mode 100644 index 0000000..d8931b4 --- /dev/null +++ b/debian/tmp/DEBIAN/control @@ -0,0 +1,22 @@ +Package: mawk +Version: 1.3.3-15+s4 +Architecture: i386 +Maintainer: Rafal Krypa +Installed-Size: 236 +Pre-Depends: libc6 (>= 2.1) +Provides: awk +Section: utils +Priority: required +Description: a pattern scanning and text processing language + Mawk is an interpreter for the AWK Programming Language. 
The AWK + language is useful for manipulation of data files, text retrieval and + processing, and for prototyping and experimenting with algorithms. Mawk + is a new awk meaning it implements the AWK language as defined in Aho, + Kernighan and Weinberger, The AWK Programming Language, Addison-Wesley + Publishing, 1988. (Hereafter referred to as the AWK book.) Mawk conforms + to the POSIX 1003.2 (draft 11.3) definition of the AWK language + which contains a few features not described in the AWK book, and mawk + provides a small number of extensions. + . + Mawk is smaller and much faster than gawk. It has some compile-time + limits such as NF = 32767 and sprintf buffer = 1020. diff --git a/debian/tmp/DEBIAN/postinst b/debian/tmp/DEBIAN/postinst new file mode 100755 index 0000000..4f19365 --- /dev/null +++ b/debian/tmp/DEBIAN/postinst @@ -0,0 +1,14 @@ +#!/bin/sh + +set -e + +update-alternatives --quiet --install /usr/bin/awk awk /usr/bin/mawk 5 \ + --slave /usr/share/man/man1/awk.1.gz awk.1.gz /usr/share/man/man1/mawk.1.gz \ + --slave /usr/bin/nawk nawk /usr/bin/mawk \ + --slave /usr/share/man/man1/nawk.1.gz nawk.1.gz /usr/share/man/man1/mawk.1.gz +for badlink in /usr/man/man1/awk.1 /usr/man/man1/nawk.1 /usr/share/man/man1/awk.1 /usr/share/man/man1/nawk.1; do + if [ -L $badlink ]; then + if ! 
ls -l $(ls -l $badlink | cut -d">" -f2) >/dev/null 2>&1; then + rm -f $badlink; fi; fi; done + +exit 0 diff --git a/debian/tmp/DEBIAN/prerm b/debian/tmp/DEBIAN/prerm new file mode 100755 index 0000000..d153132 --- /dev/null +++ b/debian/tmp/DEBIAN/prerm @@ -0,0 +1,9 @@ +#!/bin/sh + +set -e + +if [ "$1" != "upgrade" ]; then + update-alternatives --remove awk /usr/bin/mawk +fi + +exit 0 diff --git a/debian/tmp/usr/bin/mawk b/debian/tmp/usr/bin/mawk new file mode 100755 index 0000000..c8b8b11 Binary files /dev/null and b/debian/tmp/usr/bin/mawk differ diff --git a/debian/tmp/usr/share/doc/mawk/ACKNOWLEDGMENT.gz b/debian/tmp/usr/share/doc/mawk/ACKNOWLEDGMENT.gz new file mode 100644 index 0000000..e0cdce0 Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/ACKNOWLEDGMENT.gz differ diff --git a/debian/tmp/usr/share/doc/mawk/README.gz b/debian/tmp/usr/share/doc/mawk/README.gz new file mode 100644 index 0000000..08c338b Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/README.gz differ diff --git a/debian/tmp/usr/share/doc/mawk/changelog.Debian.gz b/debian/tmp/usr/share/doc/mawk/changelog.Debian.gz new file mode 100644 index 0000000..c0309cb Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/changelog.Debian.gz differ diff --git a/debian/tmp/usr/share/doc/mawk/changelog.gz b/debian/tmp/usr/share/doc/mawk/changelog.gz new file mode 100644 index 0000000..c3464aa Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/changelog.gz differ diff --git a/debian/tmp/usr/share/doc/mawk/copyright b/debian/tmp/usr/share/doc/mawk/copyright new file mode 100644 index 0000000..fe24f9b --- /dev/null +++ b/debian/tmp/usr/share/doc/mawk/copyright @@ -0,0 +1,28 @@ +This is the Debian GNU prepackaged version of mawk, an implementation +of the AWK Programming Language. 
mawk was written by Mike Brennan + + +This package was put together by Chris Fearnley , +from sources obtained from: + ftp://ftp.whidbey.net/pub/brennan/mawk1.3.3.tar.gz + +It is currently being maintained by James Troup . + +mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan +Modifications for Debian GNU/Linux Copyright (C) 1995-96 Chris Fearnley. +Modifications for Debian GNU/Linux Copyright (C) 1998-2003 James Troup. + +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License version 2 as published +by the Free Software Foundation. + +This program is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +General Public License for more details. + +You should have received a copy of the GNU General Public License with +your Debian GNU system, in /usr/share/common-licenses/GPL-2, or with the +Debian GNU mawk source package as the file COPYING. If not, write to +the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, +Boston, MA 02110-1301, USA. 
diff --git a/debian/tmp/usr/share/doc/mawk/examples/ct_length.awk.gz b/debian/tmp/usr/share/doc/mawk/examples/ct_length.awk.gz new file mode 100644 index 0000000..2789d77 Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/examples/ct_length.awk.gz differ diff --git a/debian/tmp/usr/share/doc/mawk/examples/decl.awk.gz b/debian/tmp/usr/share/doc/mawk/examples/decl.awk.gz new file mode 100644 index 0000000..ef04055 Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/examples/decl.awk.gz differ diff --git a/debian/tmp/usr/share/doc/mawk/examples/deps.awk.gz b/debian/tmp/usr/share/doc/mawk/examples/deps.awk.gz new file mode 100644 index 0000000..1217c01 Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/examples/deps.awk.gz differ diff --git a/debian/tmp/usr/share/doc/mawk/examples/eatc.awk.gz b/debian/tmp/usr/share/doc/mawk/examples/eatc.awk.gz new file mode 100644 index 0000000..08c3976 Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/examples/eatc.awk.gz differ diff --git a/debian/tmp/usr/share/doc/mawk/examples/gdecl.awk.gz b/debian/tmp/usr/share/doc/mawk/examples/gdecl.awk.gz new file mode 100644 index 0000000..159c25d Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/examples/gdecl.awk.gz differ diff --git a/debian/tmp/usr/share/doc/mawk/examples/hcal.gz b/debian/tmp/usr/share/doc/mawk/examples/hcal.gz new file mode 100644 index 0000000..5da6f12 Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/examples/hcal.gz differ diff --git a/debian/tmp/usr/share/doc/mawk/examples/hical.gz b/debian/tmp/usr/share/doc/mawk/examples/hical.gz new file mode 100644 index 0000000..72b670d Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/examples/hical.gz differ diff --git a/debian/tmp/usr/share/doc/mawk/examples/nocomment.awk.gz b/debian/tmp/usr/share/doc/mawk/examples/nocomment.awk.gz new file mode 100644 index 0000000..071bed4 Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/examples/nocomment.awk.gz differ 
diff --git a/debian/tmp/usr/share/doc/mawk/examples/primes.awk.gz b/debian/tmp/usr/share/doc/mawk/examples/primes.awk.gz new file mode 100644 index 0000000..448d8ad Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/examples/primes.awk.gz differ diff --git a/debian/tmp/usr/share/doc/mawk/examples/qsort.awk.gz b/debian/tmp/usr/share/doc/mawk/examples/qsort.awk.gz new file mode 100644 index 0000000..f77ca4b Binary files /dev/null and b/debian/tmp/usr/share/doc/mawk/examples/qsort.awk.gz differ diff --git a/debian/tmp/usr/share/man/man1/mawk.1.gz b/debian/tmp/usr/share/man/man1/mawk.1.gz new file mode 100644 index 0000000..ca344ee Binary files /dev/null and b/debian/tmp/usr/share/man/man1/mawk.1.gz differ diff --git a/error.c b/error.c new file mode 100644 index 0000000..e278d9c --- /dev/null +++ b/error.c @@ -0,0 +1,397 @@ + +/******************************************** +error.c +copyright 1991, 1992 Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + + +/* $Log: error.c,v $ + * Revision 1.6 1995/06/06 00:18:22 mike + * change mawk_exit(1) to mawk_exit(2) + * + * Revision 1.5 1994/12/13 00:26:33 mike + * rt_nr and rt_fnr for run-time error messages + * + * Revision 1.4 1994/09/23 00:20:00 mike + * minor bug fix: handle \ in eat_nl() + * + * Revision 1.3 1993/07/17 13:22:49 mike + * indent and general code cleanup + * + * Revision 1.2 1993/07/04 12:51:44 mike + * start on autoconfig changes + * + * Revision 1.1.1.1 1993/07/03 18:58:11 mike + * move source to cvs + * + * Revision 5.3 1993/01/22 14:55:46 mike + * trivial change for unexpected_char() + * + * Revision 5.2 1992/10/02 23:26:04 mike + * using vargs.h + * + * Revision 5.1 1991/12/05 07:55:48 brennan + * 1.1 pre-release + * +*/ + + +#include "mawk.h" +#include "scan.h" +#include "bi_vars.h" +#include "vargs.h" + + +#ifndef EOF +#define EOF (-1) +#endif + +static void PROTO( rt_where, (void) ) ; +static void PROTO( missing, (int, char *, int) ) ; +static char *PROTO( type_to_str, (int) ) ; + + +#ifdef NO_VFPRINTF +#define vfprintf simple_vfprintf +#endif + + +/* for run time error messages only */ +unsigned rt_nr , rt_fnr ; + +static struct token_str { +short token ; +char *str ; } token_str[] = { +{EOF , "end of file" }, +{NL , "end of line"}, +{SEMI_COLON , ";" }, +{LBRACE , "{" }, +{RBRACE , "}" }, +{SC_FAKE_SEMI_COLON, "}"}, +{LPAREN , "(" }, +{RPAREN , ")" }, +{LBOX , "["}, +{RBOX , "]"}, +{QMARK , "?"}, +{COLON , ":"}, +{OR, "||"}, +{AND, "&&"}, +{ASSIGN , "=" }, +{ADD_ASG, "+="}, +{SUB_ASG, "-="}, +{MUL_ASG, "*="}, +{DIV_ASG, "/="}, +{MOD_ASG, "%="}, +{POW_ASG, "^="}, +{EQ , "==" }, +{NEQ , "!="}, +{LT, "<" }, +{LTE, "<=" }, +{GT, ">"}, +{GTE, ">=" }, +{MATCH, string_buff}, +{PLUS , "+" }, +{MINUS, "-" }, +{MUL , "*" }, +{DIV, "/" }, +{MOD, "%" }, +{POW, "^" }, +{NOT, "!" 
}, +{COMMA, "," }, +{INC_or_DEC , string_buff }, +{DOUBLE , string_buff }, +{STRING_ , string_buff }, +{ID , string_buff }, +{FUNCT_ID , string_buff }, +{BUILTIN , string_buff }, +{IO_OUT , string_buff }, +{IO_IN, "<" }, +{PIPE, "|" }, +{DOLLAR, "$" }, +{FIELD, "$" }, +{0, (char *) 0 }} ; + +/* if paren_cnt >0 and we see one of these, we are missing a ')' */ +static int missing_rparen[] = +{ EOF, NL, SEMI_COLON, SC_FAKE_SEMI_COLON, RBRACE, 0 } ; + +/* ditto for '}' */ +static int missing_rbrace[] = +{ EOF, BEGIN, END , 0 } ; + +static void missing( c, n , ln) + int c ; + char *n ; + int ln ; +{ char *s0, *s1 ; + + if ( pfile_name ) + { s0 = pfile_name ; s1 = ": " ; } + else s0 = s1 = "" ; + + errmsg(0, "%s%sline %u: missing %c near %s" ,s0, s1, ln, c, n) ; +} + +void yyerror(s) + char *s ; /* we won't use s as input + (yacc and bison force this). + We will use s for storage to keep lint or the compiler + off our back */ +{ struct token_str *p ; + int *ip ; + + s = (char *) 0 ; + + for ( p = token_str ; p->token ; p++ ) + if ( current_token == p->token ) + { s = p->str ; break ; } + + if ( ! 
s ) /* search the keywords */ + s = find_kw_str(current_token) ; + + if ( s ) + { + if ( paren_cnt ) + for( ip = missing_rparen ; *ip ; ip++) + if ( *ip == current_token ) + { missing(')', s, token_lineno) ; + paren_cnt = 0 ; + goto done ; + } + + if ( brace_cnt ) + for( ip = missing_rbrace ; *ip ; ip++) + if ( *ip == current_token ) + { missing('}', s, token_lineno) ; + brace_cnt = 0 ; + goto done ; + } + + compile_error("syntax error at or near %s", s) ; + + } + else /* special cases */ + switch ( current_token ) + { + case UNEXPECTED : + unexpected_char() ; + goto done ; + + case BAD_DECIMAL : + compile_error( + "syntax error in decimal constant %s", + string_buff ) ; + break ; + + case RE : + compile_error( + "syntax error at or near /%s/", + string_buff ) ; + break ; + + default : + compile_error("syntax error") ; + break ; + } + return ; + +done : + if ( ++compile_error_count == MAX_COMPILE_ERRORS ) mawk_exit(2) ; +} + + +/* generic error message with a hook into the system error + messages if errnum > 0 */ + +void errmsg VA_ALIST2(int , errnum, char *, format) + va_list args ; + + fprintf(stderr, "%s: " , progname) ; + + VA_START2(args, int, errnum, char *, format) ; + vfprintf(stderr, format, args) ; + va_end(args) ; + + if ( errnum > 0 ) fprintf(stderr, " (%s)" , strerror(errnum) ) ; + + fprintf( stderr, "\n") ; +} + +void compile_error VA_ALIST(char *, format) + va_list args ; + char *s0, *s1 ; + + /* with multiple program files put program name in + error message */ + if ( pfile_name ) + { s0 = pfile_name ; s1 = ": " ; } + else + { s0 = s1 = "" ; } + + fprintf(stderr, "%s: %s%sline %u: " , progname, s0, s1,token_lineno) ; + VA_START(args, char *, format) ; + vfprintf(stderr, format, args) ; + va_end(args) ; + fprintf(stderr, "\n") ; + if ( ++compile_error_count == MAX_COMPILE_ERRORS ) mawk_exit(2) ; +} + +void rt_error VA_ALIST( char *, format) + va_list args ; + + fprintf(stderr, "%s: run time error: " , progname ) ; + VA_START(args, char *, format) ; + 
vfprintf(stderr, format, args) ; + va_end(args) ; + putc('\n',stderr) ; + rt_where() ; + mawk_exit(2) ; +} + + +void bozo(s) + char *s ; +{ + errmsg(0, "bozo: %s" , s) ; + mawk_exit(3) ; +} + +void overflow(s, size) + char *s ; unsigned size ; +{ + errmsg(0 , "program limit exceeded: %s size=%u", s, size) ; + mawk_exit(2) ; +} + + +/* print as much as we know about where a rt error occured */ + +static void rt_where() +{ + if ( FILENAME->type != C_STRING ) cast1_to_s(FILENAME) ; + + fprintf(stderr, "\tFILENAME=\"%s\" FNR=%u NR=%u\n", + string(FILENAME)->str, rt_fnr, rt_nr) ; +} + +/* run time */ +void rt_overflow(s, size) + char *s ; unsigned size ; +{ + errmsg(0 , "program limit exceeded: %s size=%u", s, size) ; + rt_where() ; + mawk_exit(2) ; +} + +void +unexpected_char() +{ int c = yylval.ival ; + + fprintf(stderr, "%s: %u: ", progname, token_lineno) ; + if ( c > ' ' && c < 127 ) + fprintf(stderr, "unexpected character '%c'\n" , c) ; + else + fprintf(stderr, "unexpected character 0x%02x\n" , c) ; +} + +static char *type_to_str( type ) + int type ; +{ char *retval ; + + switch( type ) + { + case ST_VAR : retval = "variable" ; break ; + case ST_ARRAY : retval = "array" ; break ; + case ST_FUNCT : retval = "function" ; break ; + case ST_LOCAL_VAR : retval = "local variable" ; break ; + case ST_LOCAL_ARRAY : retval = "local array" ; break ; + default : bozo("type_to_str") ; + } + return retval ; +} + +/* emit an error message about a type clash */ +void type_error(p) + SYMTAB *p ; +{ compile_error("illegal reference to %s %s", + type_to_str(p->type) , p->name) ; +} + + + +#ifdef NO_VFPRINTF + +/* a minimal vfprintf */ +int simple_vfprintf( fp, format, argp) + FILE *fp ; + char *format ; + va_list argp ; +{ + char *q , *p, *t ; + int l_flag ; + char xbuff[64] ; + + q = format ; + xbuff[0] = '%' ; + + while ( *q != 0 ) + { + if ( *q != '%' ) + { + putc(*q, fp) ; q++ ; continue ; + } + + /* mark the start with p */ + p = ++q ; t = xbuff + 1 ; + + if ( *q == '-' ) *t++ 
= *q++ ; + while ( scan_code[*(unsigned char*)q] == SC_DIGIT ) *t++ = *q++ ; + if ( *q == '.' ) + { *t++ = *q++ ; + while ( scan_code[*(unsigned char*)q] == SC_DIGIT ) *t++ = *q++ ; + } + + if ( *q == 'l' ) { l_flag = 1 ; *t++ = *q++ ; } + else l_flag = 0 ; + + + *t = *q++ ; t[1] = 0 ; + + switch( *t ) + { + case 'c' : + case 'd' : + case 'o' : + case 'x' : + case 'u' : + if ( l_flag ) fprintf(fp, xbuff, va_arg(argp,long) ) ; + else fprintf(fp, xbuff, va_arg(argp, int)) ; + break ; + + case 's' : + fprintf(fp, xbuff, va_arg(argp, char*)) ; + break ; + + case 'g' : + case 'f' : + fprintf(fp, xbuff, va_arg(argp, double)) ; + break ; + + default: + putc('%', fp) ; + q = p ; + break ; + } + } + return 0 ; /* shut up */ +} + +#endif /* USE_SIMPLE_VFPRINTF */ + + diff --git a/error.o b/error.o new file mode 100644 index 0000000..29566e8 Binary files /dev/null and b/error.o differ diff --git a/examples/ct_length.awk b/examples/ct_length.awk new file mode 100755 index 0000000..2b7a448 --- /dev/null +++ b/examples/ct_length.awk @@ -0,0 +1,28 @@ +#!/usr/bin/mawk -f + +# ct_length.awk +# +# replaces all length +# by length($0) +# + + +{ + + while ( i = index($0, "length") ) + { + printf "%s" , substr($0,1, i+5) # ...length + $0 = substr($0,i+6) + + if ( match($0, /^[ \t]*\(/) ) + { + # its OK + printf "%s", substr($0, 1, RLENGTH) + $0 = substr($0, RLENGTH+1) + } + else # length alone + printf "($0)" + + } + print +} diff --git a/examples/decl.awk b/examples/decl.awk new file mode 100644 index 0000000..52bb810 --- /dev/null +++ b/examples/decl.awk @@ -0,0 +1,140 @@ +#!/usr/bin/awk -f + +# parse a C declaration by recursive descent +# based on a C program in KR ANSI edition +# +# run on a C file it finds the declarations +# +# restrictions: one declaration per line +# doesn't understand struct {...} +# makes assumptions about type names +# +# +# some awks need double escapes on strings used as +# regular expressions. 
If not run on mawk, use gdecl.awk + + +################################################ +# lexical scanner -- gobble() +# input : string s -- treated as a regular expression +# gobble eats SPACE, then eats longest match of s off front +# of global variable line. +# Cuts the matched part off of line +# + + +function gobble(s, x) +{ + sub( /^ /, "", line) # eat SPACE if any + + # surround s with parenthesis to make sure ^ acts on the + # whole thing + + match(line, "^" "(" s ")") + x = substr(line, 1, RLENGTH) + line = substr(line, RLENGTH+1) + return x +} + + +function ptr_to(n, x) # print "pointer to" , n times +{ n = int(n) + if ( n <= 0 ) return "" + x = "pointer to" ; n-- + while ( n-- ) x = x " pointer to" + return x +} + + +#recursively get a decl +# returns an english description of the declaration or +# "" if not a C declaration. + +function decl( x, t, ptr_part) +{ + + x = gobble("[* ]+") # get list of *** ... + gsub(/ /, "", x) # remove all SPACES + ptr_part = ptr_to( length(x) ) + + # We expect to see either an identifier or '(' + # + + if ( gobble("\(") ) + { + # this is the recursive descent part + # we expect to match a declaration and closing ')' + # If not return "" to indicate failure + + if ( (x = decl()) == "" || gobble( "\)" ) == "" ) return "" + + } + else # expecting an identifier + { + if ( (x = gobble(id)) == "" ) return "" + x = x ":" + } + + # finally look for () + # or [ opt_size ] + + while ( 1 ) + if ( gobble( funct_mark ) ) x = x " function returning" + else + if ( t = gobble( array_mark ) ) + { gsub(/ /, "", t) + x = x " array" t " of" + } + else break + + + x = x " " ptr_part + return x +} + + +BEGIN { id = "[_A-Za-z][_A-Za-z0-9]*" + funct_mark = "\([ \t]*\)" + array_mark = "\[[ \t]*[_A-Za-z0-9]*[ \t]*\]" + +# I've assumed types are keywords or all CAPS or end in _t +# Other conventions could be added. 
+ + type0 = "int|char|short|long|double|float|void" + type1 = "[_A-Z][_A-Z0-9]*" # types are CAPS + type2 = "[_A-Za-z][_A-Za-z0-9]*_t" # end in _t + + types = "(" type0 "|" type1 "|" type2 ")" +} + + +{ + + gsub( "/\*([^*]|\*[^/])*(\*/|$)" , " ") # remove comments + gsub( /[ \t]+/, " ") # squeeze white space to a single space + + + line = $0 + + scope = gobble( "extern|static" ) + + if ( type = gobble("(struct|union|enum) ") ) + type = type gobble(id) # get the tag + else + { + + type = gobble("(un)?signed ") gobble( types ) + + } + + if ( ! type ) next + + if ( (x = decl()) && gobble( ";") ) + { + x = x " " type + if ( scope ) x = x " (" scope ")" + gsub( / +/, " ", x) # + print x + } + +} diff --git a/examples/deps.awk b/examples/deps.awk new file mode 100644 index 0000000..4eba11d --- /dev/null +++ b/examples/deps.awk @@ -0,0 +1,58 @@ +#!/usr/bin/mawk -f + +# find include dependencies in C source +# +# mawk -f deps.awk C_source_files +# -- prints a dependency list suitable for make +# -- ignores #include < > +# + + +BEGIN { stack_index = 0 # stack[] holds the input files + + for(i = 1 ; i < ARGC ; i++) + { + file = ARGV[i] + if ( file !~ /\.[cC]$/ ) continue # skip it + outfile = substr(file, 1, length(file)-2) ".o" + + # INCLUDED[] stores the set of included files + # -- start with the empty set + for( j in INCLUDED ) delete INCLUDED[j] + + while ( 1 ) + { + if ( getline line < file <= 0 ) # no open or EOF + { close(file) + if ( stack_index == 0 ) break # empty stack + else + { file = stack[ stack_index-- ] + continue + } + } + + if ( line ~ /^#include[ \t]+".*"/ ) + { + split(line, X, "\"") # filename is in X[2] + + if ( X[2] in INCLUDED ) # we've already included it + continue + + #push current file + stack[ ++stack_index ] = file + INCLUDED[ file = X[2] ] = "" + } + } # end of while + + # test if INCLUDED is empty + flag = 0 # on once the front is printed + for( j in INCLUDED ) + if ( ! 
flag ) + { printf "%s : %s" , outfile, j ; flag = 1 } + else printf " %s" , j + + if ( flag ) print "" + + }# end of loop over files in ARGV[i] + +} diff --git a/examples/eatc.awk b/examples/eatc.awk new file mode 100644 index 0000000..2c29deb --- /dev/null +++ b/examples/eatc.awk @@ -0,0 +1,32 @@ +#!/usr/bin/mawk -f + +# eatc.awk +# another program to remove comments +# + + +{ while( t = index($0 , "/*") ) + { + printf "%s" , substr($0,1,t-1) + $0 = eat_comment( substr($0, t+2) ) + } + + print +} + + +function eat_comment(s, t) +{ + #replace comment by one space + printf " " + + while ( (t = index(s, "*/")) == 0 ) + if ( getline s == 0 ) + { # input error -- unterminated comment + system("/bin/sh -c 'echo unterminated comment' 1>&2") + exit 1 + } + + return substr(s,t+2) +} + diff --git a/examples/gdecl.awk b/examples/gdecl.awk new file mode 100644 index 0000000..cad0774 --- /dev/null +++ b/examples/gdecl.awk @@ -0,0 +1,133 @@ +#!/usr/bin/mawk + +# parse a C declaration by recursive descent +# +# decl.awk with extra escapes \ + +################################################ +############################################ + + +# lexical scanner -- gobble() +# input : string s -- treated as a regular expression +# gobble eats SPACE, then eats longest match of s off front +# of global variable line. +# Cuts the matched part off of line +# + + +function gobble(s, x) +{ + sub( /^ /, "", line) # eat SPACE if any + + # surround s with parenthesis to make sure ^ acts on the + # whole thing + + match(line, "^" "(" s ")") + x = substr(line, 1, RLENGTH) + line = substr(line, RLENGTH+1) + return x +} + + +function ptr_to(n, x) # print "pointer to" , n times +{ n = int(n) + if ( n <= 0 ) return "" + x = "pointer to" ; n-- + while ( n-- ) x = x " pointer to" + return x +} + + +#recursively get a decl +# returns an english description of the declaration or +# "" if not a C declaration. + +function decl( x, t, ptr_part) +{ + + x = gobble("[* ]+") # get list of *** ... 
+ gsub(/ /, "", x) # remove all SPACES + ptr_part = ptr_to( length(x) ) + + # We expect to see either an identifier or '(' + # + + if ( gobble("\\(") ) + { + # this is the recursive descent part + # we expect to match a declaration and closing ')' + # If not return "" to indicate failure + + if ( (x = decl()) == "" || gobble( "\\)" ) == "" ) return "" + + } + else # expecting an identifier + { + if ( (x = gobble(id)) == "" ) return "" + x = x ":" + } + + # finally look for () + # or [ opt_size ] + + while ( 1 ) + if ( gobble( funct_mark ) ) x = x " function returning" + else + if ( t = gobble( array_mark ) ) + { gsub(/ /, "", t) + x = x " array" t " of" + } + else break + + + x = x " " ptr_part + return x +} + + +BEGIN { id = "[_A-Za-z][_A-Za-z0-9]*" + funct_mark = "\\([ \t]*\\)" + array_mark = "\\[[ \t]*[_A-Za-z0-9]*[ \t]*\\]" + +# I've assumed types are keywords or all CAPS or end in _t +# Other conventions could be added. + + type0 = "int|char|short|long|double|float|void" + type1 = "[_A-Z][_A-Z0-9]*" # types are CAPS + type2 = "[_A-Za-z][_A-Za-z0-9]*_t" # end in _t + + types = "(" type0 "|" type1 "|" type2 ")" +} + + +{ + + gsub( /\/\*([^*]|\*[^\/])*(\*\/|$)/ , " ") # remove comments + gsub( /[ \t]+/, " ") # squeeze white space to a single space + + + line = $0 + + scope = gobble( "extern|static" ) + + if ( type = gobble("(struct|union|enum) ") ) + type = type gobble(id) # get the tag + else + { + + type = gobble("(un)?signed ") gobble( types ) + + } + + if ( ! 
type ) next + + if ( (x = decl()) && gobble( ";") ) + { + x = x " " type + if ( scope ) x = x " (" scope ")" + gsub( / +/, " ", x) # + print x + } + +} diff --git a/examples/hcal b/examples/hcal new file mode 100755 index 0000000..1e97d67 --- /dev/null +++ b/examples/hcal @@ -0,0 +1,417 @@ +#!/usr/bin/mawk -We + +# edit the above to be the full pathname of 'mawk' +# @(#) hcal - v01.00.02 - Tue Feb 27 21:21:21 EST 1996 +# @(#) prints a 3-month (highlighted) calendar centered on the target month +# @(#) may be edited for week to start with Sun or Mon & for local language +# @(#) to display a usage screen, execute: hcal -h +# NOTE: to edit, set ts=4 in 'vi' (or equivalent) +# to print, pipe through 'pr -t -e4' + +# Using ideas from a KornShell script by Mikhail Kuperblum (mikhail@klm.com) +# Bob Stockler - bob@trebor.iglou.com - Sysop CompuServe SCOForum [75162,1612] + +BEGIN { +# Local Edits: + PROG = "hcal" # Program name given to this script +# FMT = 0 # date format dd/mm/yyyy +# FMT1 = 0 # for weekdays ordered "Mo Tu We Th Fr Sa Su" + FMT = 1 # date format mm/dd/yyyy + FMT1 = 1 # for weekdays ordered "Su Mo Tu We Th Fr Sa" +# edit day & month names and abbreviations for local language names + Days[0] = "Mo Tu We Th Fr Sa Su" + Days[1] = "Su Mo Tu We Th Fr Sa" + MONTHS = "January February March April May June July August" + MONTHS = MONTHS " September October November December" + Months = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec" +# STDOUT = 0 # emulate SCO Unix 'cal' (NO highlighting) + STDOUT = 1 # default to highlight mode + MINUS = "-" # possible input date field delimiter + SLASH = "/" # possible input date field delimiter + DOT = "." 
# possible input date field delimiter + IDFD = "[" MINUS # make MINUS the first character in this series + IDFD = IDFD SLASH # so that it stands for itself in the RE + IDFD = IDFD DOT "]" # Input Date Field Delimiters RE + ODFD = SLASH # Output Date Field Delimiter (default) + DATE_FMT = "%.2d%s%.2d%s%.4d" # date format +## this script presumes 'date' recognizes these arguments in these ways: +## w - Day of the week - Sunday = 0 +## m - Month of year - 01 to 12 +## d - Day of month - 01 to 31 +## y Last 2 digits of year - 00 to 99 +## Y - Year (including century), as decimal numbers +## j - Day of the year - 001 to 366 (Julian date) +## T - Time as HH:MM:SS +## X Current time, as defined by the locale +## a - Abbreviated weekday - Sun to Sat +## b - Abbreviated month name +## Z - Timezone name, or no characters if no timezone exists +## Command to get today's date information: +## DATE = "/bin/date '+%w %m %d 19%y %j~%a %b %d %T %Z 19%y'" +## For sunos4 +## DATE = DATE = "/bin/date '+%w %m %d 19%y %j~%a %h %d %T 19%y'" + DATE = "/bin/date '+%w %m %d %Y %j~%a %b %d %X %Z %Y'" +# End of Local Edits + + INT_RE = "^[0-9]+$" # unsigned integer RE + S_INT_RE = "^[-+][0-9]+$" # signed integer RE + MNAM_RE = "^[A-Za-z]+$" # month name RE + YEAR_RE = "^[0-9]?[0-9]?[0-9]?[0-9]$" + DATE_RE = "^[0-9]?[0-9]" IDFD "[0-9]?[0-9]" IDFD "[0-9]?[0-9]?[0-9]?[0-9]$" + DAT1_RE = "^[0-9]?[0-9]" IDFD "[0-9]?[0-9]$" + + split(Months,M_Name) + split("31 28 31 30 31 30 31 31 30 31 30 31",Mdays) ; Mdays[0] = 0 + + NUM_ARGS = ARGC - 1 + if ( ARGV[1] == "-x" ) { + # standout mode switch + if ( STDOUT == 1 ) STDOUT = 0 ; else STDOUT = 1 + ARG1 = ARGV[2] ; ARG2 = ARGV[3] ; NUM_ARGS -= 1 + } + else if ( ARGV[1] ~ /^-[h?]$/ ) { HELP = 1 ; exit } + else { ARG1 = ARGV[1] ; ARG2 = ARGV[2] } + + if ( STDOUT == 1 ) { + # get the terminal standout-start & standout-end control codes + so = ENVIRON["so"] ; if ( ! so ) "tput smso" | getline so + se = ENVIRON["se"] ; if ( ! 
se ) "tput rmso" | getline se + } + + if ( NUM_ARGS == 0 ) { + # no arguments - print a calendar display centered on today + DEFAULT = 1 + } + else if ( NUM_ARGS == 1 ) { + # one argument - may be a month name, date, year, or interval of days + if ( ARG1 ~ DATE_RE ) DATE1 = Fmt_Date(ARG1) + else if ( ARG1 ~ DAT1_RE ) DATE1 = ARG1 + else if ( ARG1 ~ MNAM_RE ) { Get_Mnum() ; DATE1 = RMSO = ARG1 "/1" } + else if ( ARG1 ~ S_INT_RE ) INTERVAL = ARG1 + 0 + else if ( ARG1 ~ INT_RE ) { + if ( ARG1 > 0 && ARG1 <= 9999 ) YEAR = ARG1 + 0 + else if ( ARG1 > 9999 ) { ERR = 9 ; exit } + else { ERR = 7 ; exit } + } + else { ERR = 1 ; exit } + } + else if ( NUM_ARGS == 2 ) { + # two arguments, the second of which must be an integer + if ( ARG2 ~ INT_RE ) { + ARG2 = ARG2 + 0 + if ( ARG2 < 1 ) { ERR = 7 ; exit } + else if ( ARG2 > 9999 ) { ERR = 9 ; exit } + } + else { ERR = 1 ; exit } + RMSO = 1 + # the first may be a string or an integer + if ( ARG1 ~ INT_RE ) { + # a month number and a year + if ( ARG1 < 1 || ARG1 > 12 ) { ERR = 4 ; mm = ARG1 ; exit } + } + else if ( ARG1 ~ MNAM_RE ) { + Get_Mnum() + } + else { ERR = 6 ; exit } + DATE1 = ARG1 "/1/" ARG2 + } + else { ERR = 2 ; exit } + + if ( DEFAULT ) { Get_Now() } + else if ( INTERVAL ) { + Get_Now() + daynum = daynum + ( INTERVAL % 7 ) + this_date = "" + DATE1 = Get_Date(INTERVAL,m,d,y,j) + split(DATE1,mdy,IDFD) + Mon[2] = mdy[1] + 0 + today = mdy[2] + 0 + Year[1] = Year[2] = Year[3] = mdy[3] + 0 + } + else if ( DATE1 ) { + Get_Now() + if ( split(DATE1,mdy,IDFD) == 2 ) DATE1 = DATE1 "/" This_Year + Chk_Date(DATE1) + Mon[2] = mdy[1] + 0 + today = mdy[2] + 0 + Year[1] = Year[2] = Year[3] = mdy[3] + 0 + DATE1 = sprintf( "%.2d/%.2d/%.4d", Mon[2], today, Year[2] ) + INTERVAL = Get_Num(DATE1,m,d,y,j) + daynum = daynum + ( INTERVAL % 7 ) + this_date = "" + } + else if ( YEAR ) { + so = se = "" + Get_Now() + Mon[2] = 2 + today = 1 + Year[1] = Year[2] = Year[3] = YEAR + DATE1 = sprintf( "%.2d/%.2d/%.4d", Mon[2], today, Year[2] ) + 
INTERVAL = Get_Num(DATE1,m,d,y,j) + daynum = daynum + ( INTERVAL % 7 ) + this_date = "" + } + else { ERR = 5 ; exit } + + if ( Mon[2] != 1 ) Mon[1] = Mon[2] - 1 + else { Mon[1] = 12 ; Year[1] -= 1 } + if ( Mon[2] != 12 ) Mon[3] = Mon[2] + 1 + else { Mon[3] = 1 ; Year[3] += 1 } + if ( Mon[1] == 2 ) Leap(Year[1]) + else if ( Mon[2] == 2 ) Leap(Year[2]) + else if ( Mon[3] == 2 ) Leap(Year[3]) + + Start[2] = 7 - ( ( today - daynum ) % 7 ) + Start[1] = 7 - ( ( Mdays[Mon[1]] - Start[2] ) % 7 ) + Start[3] = ( Mdays[Mon[2]] + Start[2] ) % 7 + + if ( ! YEAR ) quarters = 1 + else { + quarters = 4 ; s[3] = Start[3] + for (i=4;i<=12;i++) { s[i] = ( Mdays[i-1] + s[i-1] ) % 7 } + } + for ( quarter = 1 ; quarter <= quarters ; quarter++ ) { + if ( quarter > 1 ) { + delete cal + ll = 0 ; Mon[1] += 3 ; Mon[2] += 3 ; Mon[3] += 3 + Start[1] = s[Mon[1]] ; Start[2] = s[Mon[2]] ; Start[3] = s[Mon[3]] + } + if ( Year[2] == 1752 && Mon[2] ~ /8|9|10/ ) Kludge_1752() + if ( ARG1 ) print "" ; else printf( "\n%s\n\n", this_date ) + for (i=1;i<=3;i++) { while ( Start[i] >= 7 ) Start[i] -= 7 } + for (mm=1;mm<=3;mm++) { l = 1 + if ( mm != 2 ) { So = Se = "" } else { So = so ; Se = se } + cal[mm SUBSEP l++] = sprintf( "%s %-4s%.4d %s ", \ + So, M_Name[Mon[mm]], Year[mm], Se ) + cal[mm SUBSEP l++] = sprintf( "%s%3s", Days[FMT1], "" ) + j = k = 1 + while ( j <= Mdays[Mon[mm]] ) { + line = "" + for (i=1;i<=7;i++) { + if ( Start[mm] > 0 || j > Mdays[Mon[mm]] ) { + date = "" ; Start[mm]-- } + else date = j++ + if ( Year[mm] == 1752 && Mon[mm] == 9 && date == 3 ) { + date = 14 ; j = 15 } + if ( date == today && mm == 2 && ! 
RMSO ) { + So = so ; Se = se } + else { So = Se = "" } + line = sprintf( "%s%s%2s%s ", line, So, date, Se ) + } + cal[mm SUBSEP l++] = sprintf( "%s ", line ) + } + if ( l > ll ) ll = l + } + for (l=1;l"/dev/tty" + print usage >"/dev/tty" + exit ERR +} + +function Get_Now() { + # get the week, month, date & year numbers and the time-of-day + DATE | getline date + split(date,Date,"~") + split(Date[1],field) + daynum = field[1] + FMT1 + m = field[2] ; This_Mon = Mon[2] = m + 0 + d = field[3] ; This_Date = today = d + 0 + y = This_Year = Year[1] = Year[2] = Year[3] = field[4] + j = julian = field[5] + 0 + this_date = Date[2] +} + +function Fmt_Date(date) { + # format dates as mm/dd/yyyy or dd/mm/yyyy + split(date,MorD_DorM_Y,IDFD) + if ( FMT == 1 ) { Dt_Fld1 = MorD_DorM_Y[1] ; Dt_Fld2 = MorD_DorM_Y[2] } + else { Dt_Fld1 = MorD_DorM_Y[2] ; Dt_Fld2 = MorD_DorM_Y[1] } + Dt_Fld3 = MorD_DorM_Y[3] + return sprintf( DATE_FMT, Dt_Fld1, ODFD, Dt_Fld2, ODFD, Dt_Fld3 ) +} + +function Kludge_1752() { + # kludge for September 1752 & the change to the Gregorian Calendar + Mdays[9] = 30 + if ( Mon[2] == 9 ) { + Start[1] = Start[2] = 1 + FMT1 ; Start[3] = -1 + FMT1 + } + else if ( Mon[2] == 8 ) { + Start[1] = 2 + FMT1 ; Start[2] = 5 + FMT1 ; Start[3] = 1 + FMT1 + } + else if ( Mon[2] == 10 ) { + Start[1] = 1 + FMT1 ; Start[2] = -1 + FMT1 ; Start[3] = 3 + } +} + +function Get_Mnum() { + ARG1 = tolower(ARG1) + months = tolower(MONTHS) + split(months,month) + for (i=1;i<=12;i++) { + if ( index(month[i],ARG1) == 1 ) { ARG = i ; n++ } + } + if ( n == 1 ) ARG1 = ARG + else if ( n == 0 ) { ERR = 1 ; exit } + else { ERR = 8 ; exit } +} + +function Get_Num(date,m,d,y,j) { + # get the number of days from one date to another date + NOW = y m d ; N = 0 ; M = m + 0 ; D = d + 0 ; Y = y + 0 ; J = j + 0 + split(date,mdy,IDFD) + M2 = mdy[1] ; D2 = mdy[2] ; Y2 = mdy[3] + THEN = Y2 M2 D2 ; M2 = M2 + 0 ; D2 = D2 + 0 ; Y2 = Y2 + 0 + Leap(Y2) + if ( M2 > 12 ) { ERR = 4 ; exit } + if ( D2 > Mdays[M2] && Y2 
!= 1752 && M2 != 9 ) { ERR = 5 ; exit } + if ( THEN ~ /^1752090[3-9]$|^1752091[0-3]$/ ) { ERR = 6 ; exit } + Leap(Y) + if ( THEN > NOW ) { + Ydays = Ydays - J + 1 ; mdays = Mdays[M] - D + 1 + while ( Y < Y2 ) Next_Y() + while ( M < M2 ) Next_M() + while ( D < D2 ) Next_D() + N *= -1 + } + else { + Ydays = J ; mdays = D + while ( Y > Y2 ) Prev_Y() + while ( M > M2 ) Prev_M() + if ( Y == 1752 && M == 9 && D == 19 ) D = 30 + while ( D > D2 ) Prev_D() + } + return N +} + +function Get_Date(n,m,d,y,j) { + # get the date a number of days before or after a date + N = n + 0 ; M = m + 0 ; D = d + 0 ; Y = y + 0 ; J = j + 0 + if ( N != 0 ) { + Leap(Y) + if ( N > 0 ) { + Ydays = Ydays - J + 1 ; mdays = Mdays[M] - D + 1 + while ( N >= Ydays ) { Next_Y() ; Leap(Y) } + while ( N >= ( ( mdays > 0 ) ? mdays : Mdays[M] ) ) { Next_M() } + while ( N > 0 ) Next_D() + } + else { + Ydays = J ; mdays = D ; N *= -1 + while ( N >= Ydays ) { Prev_Y() ; Leap(Y) } + while ( N >= ( ( mdays > 0 ) ? mdays : Mdays[M] ) ) { Prev_M() } + if ( Y == 1752 && M == 9 && D == 19 ) D = 30 + while ( N > 0 ) Prev_D() + } + if ( Y < 1 ) { ERR = 3 ; exit } + } + return M ODFD D ODFD Y +} + +function Leap(YR) { + # adjust for Leap Years + if ( YR % 4 == 0 && ( YR % 100 != 0 || YR % 400 == 0 || YR < 1800 ) ) { + Ydays = 366 ; Mdays[2] = 29 } + else { Ydays = 365 ; Mdays[2] = 28 } + if ( YR != 1752 ) Mdays[9] = 30 + else { Ydays = 355 ; Mdays[9] = 19 } +} + +function Chk_Date(date) { + # check validity of input dates + split(date,mdy,IDFD) + mm = mdy[1] + 0 ; dd = mdy[2] + 0 ; yy = mdy[3] + 0 + if ( mm == 2 ) Leap(yy) + if ( yy < 1 ) { ERR = 3 ; exit } + if ( mm < 1 || mm > 12 ) { ERR = 4 ; exit } + if ( dd < 1 || dd > Mdays[mm] ) { ERR = 5 ; exit } +} + +# day counting functions for next or previous year, month and day +function Next_Y() { + N -= Ydays ; Y += 1 ; M = 1 ; D = 1 ; mdays = 0 ; Leap(Y) +} +function Next_M() { + if ( mdays != 0 ) N -= mdays ; else N -= Mdays[M] + M += 1 ; D = 1 ; mdays = 0 +} 
+function Next_D() { + N -= 1 ; D += 1 + if ( D > Mdays[M] ) { M += 1 ; D = 1 } + else if ( Y == 1752 && M == 9 && D == 2 ) D = 13 +} +function Prev_Y() { + N -= Ydays ; Y -= 1 ; M = 12 ; D = 31 ; mdays = 0 ; Leap(Y) +} +function Prev_M() { + if ( mdays != 0 ) N -= mdays ; else N -= Mdays[M] + M -= 1 ; D = Mdays[M] ; mdays = 0 +} +function Prev_D() { + N -= 1 ; D -= 1 ; if ( Y == 1752 && M == 9 && D == 13 ) D = 2 +} + +function Get_J(m,d,y) { + # get the Julian date for an input date + m = m + 0 ; d = d + 0 ; y = y + 0 + Leap(y) + j = d + for (i=1;i) +/bin/echo '{ +# USER EDITS MAY BE REQUIRED (for FMT, day & month names, and the time stuff) +# FMT = 0 # for weekdays ordered "Mo Tu We Th Fr Sa Su" + FMT = 1 # for weekdays ordered "Su Mo Tu We Th Fr Sa" + Header[0] = "Mo Tu We Th Fr Sa Su" + Header[1] = "Su Mo Tu We Th Fr Sa" + months = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec" + time_is = "The time is:" ; time_fmt = "%s %s %s %s\\n" +# NO MORE USER EDITS REQUIRED (I think!) + split(months,M_Name) ; split("31 28 31 30 31 30 31 31 30 31 30 31",M_Len) + daynum = $1 + FMT + Mon[2] = $2 + 0 + today = $3 + 0 + time = $4 + Year[1] = Year[2] = Year[3] = $NF + if ( Mon[2] == 1 ) { Year[1] = Year[1] - 1 ; Mon[1] = 12 } + else { Mon[1] = Mon[2] - 1 } + if ( Mon[2] == 12 ) { Year[3] = Year[3] + 1 ; Mon[3] = 1 } + else { Mon[3] = Mon[2] + 1 } + if ( Year[2] % 4 == 0 && \ + Year[2] % 100 != 0 || \ + Year[2] % 400 == 0 ) M_Len[2] = 29 + Start[2] = 7 - ( ( today - daynum ) % 7 ) + Start[1] = 7 - ( ( M_Len[Mon[1]] - Start[2] ) % 7 ) + Start[3] = ( M_Len[Mon[2]] + Start[2] ) % 7 + for (i=1;i<=3;i++) { while ( Start[i] >= 7 ) Start[i] -= 7 } + for (mm=1;mm<=3;mm++) { + if ( Year[mm] != Year[mm-1] ) + printf( "%s %s %s\\n", so, Year[mm], se ) + if ( mm == 1 ) printf( "%s %s %s\\n", so, Header[FMT], se ) + j = k = 1 + while ( j <= M_Len[Mon[mm]] ) { + line = "" + for (i=1;i<=7;i++) { + if ( Start[mm] > 0 || j > M_Len[Mon[mm]] ) { date = "" ; Start[mm]-- } + else date = j++ + if 
( mm == 2 && date == today ) { So = so ; Se = se } + else { So = Se = "" } + line = sprintf( "%s%s%2s%s ", line, So, date, Se ) + } + m1 = substr(M_Name[Mon[mm]],k++,1) + printf( "%s %1s %s %s%s %s\\n", so, m1, se, line, so, se ) + } + } + printf( time_fmt, so, time_is, time, se ) +}' >$prog + +date +"$DATE_ARGS" | ${AWK:=mawk} -f $prog so=$so se=$se + +exit 0 + +# EOF 'hical' - Tue Dec 19 19:19:19 EST 1994 +# Bob Stockler - bob@trebor.iglou.com - CIS: 72726,452 diff --git a/examples/nocomment.awk b/examples/nocomment.awk new file mode 100644 index 0000000..f7d3164 --- /dev/null +++ b/examples/nocomment.awk @@ -0,0 +1,31 @@ +#!/usr/bin/mawk -f + +# remove C comments from a list of files +# using a comment as the record separator +# +# this is trickier than I first thought +# The first version in .97-.9993 was wrong + +BEGIN { + # RS is set to a comment (this is mildly tricky, I blew it here + RS = "/\*([^*]|\*+[^*/])*\*+/" + ORS = " " + getline hold + filename = FILENAME +} + +# if changing files +filename != FILENAME { + filename = FILENAME + printf "%s" , hold + hold = $0 + next +} + +{ # hold one record because we don't want ORS on the last + # record in each file + print hold + hold = $0 +} + +END { printf "%s", hold } diff --git a/examples/primes.awk b/examples/primes.awk new file mode 100644 index 0000000..430ad2d --- /dev/null +++ b/examples/primes.awk @@ -0,0 +1,63 @@ +#!/usr/bin/mawk -f + +# primes.awk +# +# mawk -f primes.awk [START] STOP +# find all primes between 2 and STOP +# or START and STOP +# + + + +function usage() +{ ustr = sprintf("usage: %s [start] stop", ARGV[0]) + system( "echo " ustr) + exit 1 +} + + +BEGIN { if (ARGC == 1 || ARGC > 3 ) usage() + if ( ARGC == 2 ) { start = 2 ; stop = ARGV[1]+0 } + else + if ( ARGC == 3 ) { start = ARGV[1]+0 ; stop = ARGV[2]+0 } + + if ( start < 2 ) start = 2 + if ( stop < start ) stop = start + + prime[ p_cnt = 1 ] = 3 # keep primes in prime[] + +# keep track of integer part of square root by adding +# odd 
integers + odd = test = 5 + root = 2 + squares = 9 + + +while ( test <= stop ) +{ + if ( test >= squares ) + { root++ + odd += 2 + squares += odd + } + + flag = 1 + for ( i = 1 ; prime[i] <= root ; i++ ) + if ( test % prime[i] == 0 ) # not prime + { flag = 0 ; break } + + if ( flag ) prime[ ++p_cnt ] = test + + test += 2 +} + +prime[0] = 2 + +for( i = 0 ; prime[i] < start ; i++) ; + +for ( ; i <= p_cnt ; i++ ) print prime[i] + +} + + + diff --git a/examples/qsort.awk b/examples/qsort.awk new file mode 100644 index 0000000..2b4390e --- /dev/null +++ b/examples/qsort.awk @@ -0,0 +1,78 @@ +#!/usr/bin/mawk -f + +# qsort text files +# + +function middle(x,y,z) #return middle of 3 +{ + if ( x <= y ) + { if ( z >= y ) return y + if ( z < x ) return x + return z + } + + if ( z >= x ) return x + if ( z < y ) return y + return z +} + + +function isort(A , n, i, j, hold) +{ + # if needed a sentinal at A[0] will be created + + for( i = 2 ; i <= n ; i++) + { + hold = A[ j = i ] + while ( A[j-1] > hold ) + { j-- ; A[j+1] = A[j] } + + A[j] = hold + } +} + + +# recursive quicksort +function qsort(A, left, right ,i , j, pivot, hold) +{ + + pivot = middle(A[left], A[int((left+right)/2)], A[right]) + + i = left + j = right + + while ( i <= j ) + { + while ( A[i] < pivot ) i++ + while ( A[j] > pivot ) j-- + + if ( i <= j ) + { hold = A[i] + A[i++] = A[j] + A[j--] = hold + } + } + + if ( j - left > BLOCK ) qsort(A,left,j) + if ( right - i > BLOCK ) qsort(A,i,right) +} + +BEGIN { BLOCK = 5 } + + +{ line[NR] = $0 "" # sort as string +} + +END { + + if ( NR > BLOCK ) qsort(line, 1, NR) + + isort(line, NR) + + for(i = 1 ; i <= NR ; i++) print line[i] +} + + + + + diff --git a/execute.c b/execute.c new file mode 100644 index 0000000..9629640 --- /dev/null +++ b/execute.c @@ -0,0 +1,1456 @@ + +/******************************************** +execute.c +copyright 1991-1996, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. 
+ +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/* $Log: execute.c,v $ + * Revision 1.13 1996/02/01 04:39:40 mike + * dynamic array scheme + * + * Revision 1.12 1995/06/06 00:18:24 mike + * change mawk_exit(1) to mawk_exit(2) + * + * Revision 1.11 1995/03/08 00:06:24 mike + * add a pointer cast + * + * Revision 1.10 1994/12/13 00:12:10 mike + * delete A statement to delete all of A at once + * + * Revision 1.9 1994/10/25 23:36:11 mike + * clear aloop stack on _NEXT + * + * Revision 1.8 1994/10/08 19:15:35 mike + * remove SM_DOS + * + * Revision 1.7 1993/12/30 19:10:03 mike + * minor cleanup to _CALL + * + * Revision 1.6 1993/12/01 14:25:13 mike + * reentrant array loops + * + * Revision 1.5 1993/07/22 00:04:08 mike + * new op code _LJZ _LJNZ + * + * Revision 1.4 1993/07/14 12:18:21 mike + * run thru indent + * + * Revision 1.3 1993/07/14 11:50:17 mike + * rm SIZE_T and void casts + * + * Revision 1.2 1993/07/04 12:51:49 mike + * start on autoconfig changes + * + * Revision 5.10 1993/02/13 21:57:22 mike + * merge patch3 + * + * Revision 5.9 1993/01/07 02:50:33 mike + * relative vs absolute code + * + * Revision 5.8 1993/01/01 21:30:48 mike + * split new_STRING() into new_STRING and new_STRING0 + * + * Revision 5.7.1.1 1993/01/15 03:33:39 mike + * patch3: safer double to int conversion + * + * Revision 5.7 1992/12/17 02:48:01 mike + * 1.1.2d changes for DOS + * + * Revision 5.6 1992/11/29 18:57:50 mike + * field expressions convert to long so 16 bit and 32 bit + * systems behave the same + * + * Revision 5.5 1992/08/11 15:24:55 brennan + * patch2: F_PUSHA and FE_PUSHA + * If this is preparation for g?sub(r,s,$expr) or (++|--) on $expr, + * then if expr > NF, make sure $expr is set to "" + * + * Revision 5.4 1992/08/11 14:51:54 brennan + * patch2: $expr++ is numeric even if $expr is string. + * I forgot to do this earlier when handling x++ case. 
+ * + * Revision 5.3 1992/07/08 17:03:30 brennan + * patch 2 + * revert to version 1.0 comparisons, i.e. + * page 44-45 of AWK book + * + * Revision 5.2 1992/04/20 21:40:40 brennan + * patch 2 + * x++ is numeric, even if x is string + * + * Revision 5.1 1991/12/05 07:55:50 brennan + * 1.1 pre-release + * +*/ + + +#include "mawk.h" +#include "code.h" +#include "memory.h" +#include "symtype.h" +#include "field.h" +#include "bi_funct.h" +#include "bi_vars.h" +#include "regexp.h" +#include "repl.h" +#include "fin.h" +#include <math.h> + +static int PROTO(compare, (CELL *)) ; +static int PROTO(d_to_index, (double)) ; + +#ifdef NOINFO_SIGFPE +static char dz_msg[] = "division by zero" ; +#define CHECK_DIVZERO(x) if( (x) == 0.0 )rt_error(dz_msg);else +#endif + +#ifdef DEBUG +static void PROTO(eval_overflow, (void)) ; + +#define inc_sp() if( ++sp == eval_stack+EVAL_STACK_SIZE )\ + eval_overflow() +#else + +/* If things are working, the eval stack should not overflow */ + +#define inc_sp() sp++ +#endif + +#define SAFETY 16 +#define DANGER (EVAL_STACK_SIZE-SAFETY) + +/* The stack machine that executes the code */ + +CELL eval_stack[EVAL_STACK_SIZE] ; +/* these can move for deep recursion */ +static CELL *stack_base = eval_stack ; +static CELL *stack_danger = eval_stack + DANGER ; + +#ifdef DEBUG +static void +eval_overflow() +{ + overflow("eval stack", EVAL_STACK_SIZE) ; +} +#endif + +/* holds info for array loops (on a stack) */ +typedef struct aloop_state { + struct aloop_state *link ; + CELL *var ; /* for(var in A) */ + STRING **base ; + STRING **ptr ; + STRING **limit ; +} ALOOP_STATE ; + +/* clean up aloop stack on next, return, exit */ +#define CLEAR_ALOOP_STACK() if(aloop_state){\ + clear_aloop_stack(aloop_state);\ + aloop_state=(ALOOP_STATE*)0;}else + +static void clear_aloop_stack(top) + ALOOP_STATE *top ; +{ + ALOOP_STATE *q ; + + do { + while(top->ptr<top->limit) { + free_STRING(*top->ptr) ; + top->ptr++ ; + } + if (top->base < top->limit) + zfree(top->base, 
(top->limit-top->base)*sizeof(STRING*)) ; + q = top ; top = q->link ; + ZFREE(q) ; + } while (top) ; +} + + +static INST *restart_label ; /* control flow labels */ +INST *next_label ; +static CELL tc ; /*useful temp */ + +void +execute(cdp, sp, fp) + register INST *cdp ; /* code ptr, start execution here */ + register CELL *sp ; /* eval_stack pointer */ + CELL *fp ; /* frame ptr into eval_stack for + user defined functions */ +{ + /* some useful temporaries */ + CELL *cp ; + int t ; + + /* save state for array loops via a stack */ + ALOOP_STATE *aloop_state = (ALOOP_STATE*) 0 ; + + /* for moving the eval stack on deep recursion */ + CELL *old_stack_base ; + CELL *old_sp ; + +#ifdef DEBUG + CELL *entry_sp = sp ; +#endif + + + if (fp) + { + /* we are a function call, check for deep recursion */ + if (sp > stack_danger) + { /* change stacks */ + old_stack_base = stack_base ; + old_sp = sp ; + stack_base = (CELL *) zmalloc(sizeof(CELL) * EVAL_STACK_SIZE) ; + stack_danger = stack_base + DANGER ; + sp = stack_base ; + /* waste 1 slot for ANSI, actually large model msdos breaks in + RET if we don't */ +#ifdef DEBUG + entry_sp = sp ; +#endif + } + else old_stack_base = (CELL *) 0 ; + } + + while (1) + switch (cdp++->op) + { + +/* HALT only used by the disassemble now ; this remains + so compilers don't offset the jump table */ + case _HALT: + + case _STOP: /* only for range patterns */ +#ifdef DEBUG + if (sp != entry_sp + 1) bozo("stop0") ; +#endif + return ; + + case _PUSHC: + inc_sp() ; + cellcpy(sp, cdp++->ptr) ; + break ; + + case _PUSHD: + inc_sp() ; + sp->type = C_DOUBLE ; + sp->dval = *(double *) cdp++->ptr ; + break ; + + case _PUSHS: + inc_sp() ; + sp->type = C_STRING ; + sp->ptr = cdp++->ptr ; + string(sp)->ref_cnt++ ; + break ; + + case F_PUSHA: + cp = (CELL *) cdp->ptr ; + if (cp != field) + { + if (nf < 0) split_field0() ; + + if (!( +#ifdef MSDOS + SAMESEG(cp, field) && +#endif + cp >= NF && cp <= LAST_PFIELD)) + { + /* its a real field $1, $2 ... 
+ If its greater than $NF, we have to + make sure its set to "" so that + (++|--) and g?sub() work right + */ + t = field_addr_to_index(cp) ; + if (t > nf) + { + cell_destroy(cp) ; + cp->type = C_STRING ; + cp->ptr = (PTR) & null_str ; + null_str.ref_cnt++ ; + } + } + } + /* fall thru */ + + case _PUSHA: + case A_PUSHA: + inc_sp() ; + sp->ptr = cdp++->ptr ; + break ; + + case _PUSHI: + /* put contents of next address on stack*/ + inc_sp() ; + cellcpy(sp, cdp++->ptr) ; + break ; + + case L_PUSHI: + /* put the contents of a local var on stack, + cdp->op holds the offset from the frame pointer */ + inc_sp() ; + cellcpy(sp, fp + cdp++->op) ; + break ; + + case L_PUSHA: + /* put a local address on eval stack */ + inc_sp() ; + sp->ptr = (PTR) (fp + cdp++->op) ; + break ; + + + case F_PUSHI: + + /* push contents of $i + cdp[0] holds & $i , cdp[1] holds i */ + + inc_sp() ; + if (nf < 0) split_field0() ; + cp = (CELL *) cdp->ptr ; + t = (cdp + 1)->op ; + cdp += 2 ; + + if (t <= nf) cellcpy(sp, cp) ; + else /* an unset field */ + { + sp->type = C_STRING ; + sp->ptr = (PTR) & null_str ; + null_str.ref_cnt++ ; + } + break ; + + case NF_PUSHI: + + inc_sp() ; + if (nf < 0) split_field0() ; + cellcpy(sp, NF) ; + break ; + + case FE_PUSHA: + + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + + t = d_to_index(sp->dval) ; + if (t && nf < 0) split_field0() ; + sp->ptr = (PTR) field_ptr(t) ; + if (t > nf) + { + /* make sure its set to "" */ + cp = (CELL *) sp->ptr ; + cell_destroy(cp) ; + cp->type = C_STRING ; + cp->ptr = (PTR) & null_str ; + null_str.ref_cnt++ ; + } + break ; + + case FE_PUSHI: + + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + + t = d_to_index(sp->dval) ; + + if (nf < 0) split_field0() ; + if (t <= nf) cellcpy(sp, field_ptr(t)) ; + else + { + sp->type = C_STRING ; + sp->ptr = (PTR) & null_str ; + null_str.ref_cnt++ ; + } + break ; + + + case AE_PUSHA: + /* top of stack has an expr, cdp->ptr points at an + array, replace the expr with the cell address inside + the array */ 
+ + cp = array_find((ARRAY) cdp++->ptr, sp, CREATE) ; + cell_destroy(sp) ; + sp->ptr = (PTR) cp ; + break ; + + case AE_PUSHI: + /* top of stack has an expr, cdp->ptr points at an + array, replace the expr with the contents of the + cell inside the array */ + + cp = array_find((ARRAY) cdp++->ptr, sp, CREATE) ; + cell_destroy(sp) ; + cellcpy(sp, cp) ; + break ; + + case LAE_PUSHI: + /* sp[0] is an expression + cdp->op is offset from frame pointer of a CELL which + has an ARRAY in the ptr field, replace expr + with array[expr] + */ + cp = array_find((ARRAY) fp[cdp++->op].ptr, sp, CREATE) ; + cell_destroy(sp) ; + cellcpy(sp, cp) ; + break ; + + case LAE_PUSHA: + /* sp[0] is an expression + cdp->op is offset from frame pointer of a CELL which + has an ARRAY in the ptr field, replace expr + with & array[expr] + */ + cp = array_find((ARRAY) fp[cdp++->op].ptr, sp, CREATE) ; + cell_destroy(sp) ; + sp->ptr = (PTR) cp ; + break ; + + case LA_PUSHA: + /* cdp->op is offset from frame pointer of a CELL which + has an ARRAY in the ptr field. 
Push this ARRAY + on the eval stack + */ + inc_sp() ; + sp->ptr = fp[cdp++->op].ptr ; + break ; + + case SET_ALOOP: + { + ALOOP_STATE *ap = ZMALLOC(ALOOP_STATE) ; + unsigned vector_size ; + + ap->var = (CELL *) sp[-1].ptr ; + ap->base = ap->ptr = array_loop_vector( + (ARRAY)sp->ptr, &vector_size) ; + ap->limit = ap->base + vector_size ; + sp -= 2 ; + + /* push onto aloop stack */ + ap->link = aloop_state ; + aloop_state = ap ; + cdp += cdp->op ; + } + break ; + + case ALOOP : + { + ALOOP_STATE *ap = aloop_state ; + if (ap->ptr < ap->limit) + { + cell_destroy(ap->var) ; + ap->var->type = C_STRING ; + ap->var->ptr = (PTR) *ap->ptr++ ; + cdp += cdp->op ; + } + else cdp++ ; + } + break ; + + case POP_AL : + { + /* finish up an array loop */ + ALOOP_STATE *ap = aloop_state ; + aloop_state = ap->link ; + while(ap->ptr < ap->limit) { + free_STRING(*ap->ptr) ; + ap->ptr++ ; + } + if (ap->base < ap->limit) + zfree(ap->base,(ap->limit-ap->base)*sizeof(STRING*)) ; + ZFREE(ap) ; + } + break ; + + case _POP: + cell_destroy(sp) ; + sp-- ; + break ; + + case _ASSIGN: + /* top of stack has an expr, next down is an + address, put the expression in *address and + replace the address with the expression */ + + /* don't propagate type C_MBSTRN */ + if (sp->type == C_MBSTRN) check_strnum(sp) ; + sp-- ; + cell_destroy(((CELL *) sp->ptr)) ; + cellcpy(sp, cellcpy(sp->ptr, sp + 1)) ; + cell_destroy(sp + 1) ; + break ; + + case F_ASSIGN: + /* assign to a field */ + if (sp->type == C_MBSTRN) check_strnum(sp) ; + sp-- ; + field_assign((CELL *) sp->ptr, sp + 1) ; + cell_destroy(sp + 1) ; + cellcpy(sp, (CELL *) sp->ptr) ; + break ; + + case _ADD_ASG: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + cp = (CELL *) (sp - 1)->ptr ; + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; + +#if SW_FP_CHECK /* specific to V7 and XNX23A */ + clrerr() ; +#endif + cp->dval += sp--->dval ; +#if SW_FP_CHECK + fpcheck() ; +#endif + sp->type = C_DOUBLE ; + sp->dval = cp->dval ; + break ; + + case _SUB_ASG: + if 
(sp->type != C_DOUBLE) cast1_to_d(sp) ; + cp = (CELL *) (sp - 1)->ptr ; + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; +#if SW_FP_CHECK + clrerr() ; +#endif + cp->dval -= sp--->dval ; +#if SW_FP_CHECK + fpcheck() ; +#endif + sp->type = C_DOUBLE ; + sp->dval = cp->dval ; + break ; + + case _MUL_ASG: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + cp = (CELL *) (sp - 1)->ptr ; + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; +#if SW_FP_CHECK + clrerr() ; +#endif + cp->dval *= sp--->dval ; +#if SW_FP_CHECK + fpcheck() ; +#endif + sp->type = C_DOUBLE ; + sp->dval = cp->dval ; + break ; + + case _DIV_ASG: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + cp = (CELL *) (sp - 1)->ptr ; + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; + +#if NOINFO_SIGFPE + CHECK_DIVZERO(sp->dval) ; +#endif + +#if SW_FP_CHECK + clrerr() ; +#endif + cp->dval /= sp--->dval ; +#if SW_FP_CHECK + fpcheck() ; +#endif + sp->type = C_DOUBLE ; + sp->dval = cp->dval ; + break ; + + case _MOD_ASG: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + cp = (CELL *) (sp - 1)->ptr ; + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; + +#if NOINFO_SIGFPE + CHECK_DIVZERO(sp->dval) ; +#endif + + cp->dval = fmod(cp->dval, sp--->dval) ; + sp->type = C_DOUBLE ; + sp->dval = cp->dval ; + break ; + + case _POW_ASG: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + cp = (CELL *) (sp - 1)->ptr ; + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; + cp->dval = pow(cp->dval, sp--->dval) ; + sp->type = C_DOUBLE ; + sp->dval = cp->dval ; + break ; + + /* will anyone ever use these ? 
*/ + + case F_ADD_ASG: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + cp = (CELL *) (sp - 1)->ptr ; + cast1_to_d(cellcpy(&tc, cp)) ; +#if SW_FP_CHECK + clrerr() ; +#endif + tc.dval += sp--->dval ; +#if SW_FP_CHECK + fpcheck() ; +#endif + sp->type = C_DOUBLE ; + sp->dval = tc.dval ; + field_assign(cp, &tc) ; + break ; + + case F_SUB_ASG: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + cp = (CELL *) (sp - 1)->ptr ; + cast1_to_d(cellcpy(&tc, cp)) ; +#if SW_FP_CHECK + clrerr() ; +#endif + tc.dval -= sp--->dval ; +#if SW_FP_CHECK + fpcheck() ; +#endif + sp->type = C_DOUBLE ; + sp->dval = tc.dval ; + field_assign(cp, &tc) ; + break ; + + case F_MUL_ASG: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + cp = (CELL *) (sp - 1)->ptr ; + cast1_to_d(cellcpy(&tc, cp)) ; +#if SW_FP_CHECK + clrerr() ; +#endif + tc.dval *= sp--->dval ; +#if SW_FP_CHECK + fpcheck() ; +#endif + sp->type = C_DOUBLE ; + sp->dval = tc.dval ; + field_assign(cp, &tc) ; + break ; + + case F_DIV_ASG: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + cp = (CELL *) (sp - 1)->ptr ; + cast1_to_d(cellcpy(&tc, cp)) ; + +#if NOINFO_SIGFPE + CHECK_DIVZERO(sp->dval) ; +#endif + +#if SW_FP_CHECK + clrerr() ; +#endif + tc.dval /= sp--->dval ; +#if SW_FP_CHECK + fpcheck() ; +#endif + sp->type = C_DOUBLE ; + sp->dval = tc.dval ; + field_assign(cp, &tc) ; + break ; + + case F_MOD_ASG: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + cp = (CELL *) (sp - 1)->ptr ; + cast1_to_d(cellcpy(&tc, cp)) ; + +#if NOINFO_SIGFPE + CHECK_DIVZERO(sp->dval) ; +#endif + + tc.dval = fmod(tc.dval, sp--->dval) ; + sp->type = C_DOUBLE ; + sp->dval = tc.dval ; + field_assign(cp, &tc) ; + break ; + + case F_POW_ASG: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + cp = (CELL *) (sp - 1)->ptr ; + cast1_to_d(cellcpy(&tc, cp)) ; + tc.dval = pow(tc.dval, sp--->dval) ; + sp->type = C_DOUBLE ; + sp->dval = tc.dval ; + field_assign(cp, &tc) ; + break ; + + case _ADD: + sp-- ; + if (TEST2(sp) != TWO_DOUBLES) cast2_to_d(sp) ; +#if SW_FP_CHECK + clrerr() ; 
+#endif + sp[0].dval += sp[1].dval ; +#if SW_FP_CHECK + fpcheck() ; +#endif + break ; + + case _SUB: + sp-- ; + if (TEST2(sp) != TWO_DOUBLES) cast2_to_d(sp) ; +#if SW_FP_CHECK + clrerr() ; +#endif + sp[0].dval -= sp[1].dval ; +#if SW_FP_CHECK + fpcheck() ; +#endif + break ; + + case _MUL: + sp-- ; + if (TEST2(sp) != TWO_DOUBLES) cast2_to_d(sp) ; +#if SW_FP_CHECK + clrerr() ; +#endif + sp[0].dval *= sp[1].dval ; +#if SW_FP_CHECK + fpcheck() ; +#endif + break ; + + case _DIV: + sp-- ; + if (TEST2(sp) != TWO_DOUBLES) cast2_to_d(sp) ; + +#if NOINFO_SIGFPE + CHECK_DIVZERO(sp[1].dval) ; +#endif + +#if SW_FP_CHECK + clrerr() ; +#endif + sp[0].dval /= sp[1].dval ; +#if SW_FP_CHECK + fpcheck() ; +#endif + break ; + + case _MOD: + sp-- ; + if (TEST2(sp) != TWO_DOUBLES) cast2_to_d(sp) ; + +#if NOINFO_SIGFPE + CHECK_DIVZERO(sp[1].dval) ; +#endif + + sp[0].dval = fmod(sp[0].dval, sp[1].dval) ; + break ; + + case _POW: + sp-- ; + if (TEST2(sp) != TWO_DOUBLES) cast2_to_d(sp) ; + sp[0].dval = pow(sp[0].dval, sp[1].dval) ; + break ; + + case _NOT: + /* evaluates to 0.0 or 1.0 */ + reswitch_1: + switch (sp->type) + { + case C_NOINIT: + sp->dval = 1.0 ; break ; + case C_DOUBLE: + sp->dval = sp->dval != 0.0 ? 0.0 : 1.0 ; + break ; + case C_STRING: + sp->dval = string(sp)->len ? 0.0 : 1.0 ; + free_STRING(string(sp)) ; + break ; + case C_STRNUM: /* test as a number */ + sp->dval = sp->dval != 0.0 ? 0.0 : 1.0 ; + free_STRING(string(sp)) ; + break ; + case C_MBSTRN: + check_strnum(sp) ; + goto reswitch_1 ; + default: + bozo("bad type on eval stack") ; + } + sp->type = C_DOUBLE ; + break ; + + case _TEST: + /* evaluates to 0.0 or 1.0 */ + reswitch_2: + switch (sp->type) + { + case C_NOINIT: + sp->dval = 0.0 ; break ; + case C_DOUBLE: + sp->dval = sp->dval != 0.0 ? 1.0 : 0.0 ; + break ; + case C_STRING: + sp->dval = string(sp)->len ? 1.0 : 0.0 ; + free_STRING(string(sp)) ; + break ; + case C_STRNUM: /* test as a number */ + sp->dval = sp->dval != 0.0 ? 
1.0 : 0.0 ; + free_STRING(string(sp)) ; + break ; + case C_MBSTRN: + check_strnum(sp) ; + goto reswitch_2 ; + default: + bozo("bad type on eval stack") ; + } + sp->type = C_DOUBLE ; + break ; + + case _UMINUS: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + sp->dval = -sp->dval ; + break ; + + case _UPLUS: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + break ; + + case _CAT: + { + unsigned len1, len2 ; + char *str1, *str2 ; + STRING *b ; + + sp-- ; + if (TEST2(sp) != TWO_STRINGS) cast2_to_s(sp) ; + str1 = string(sp)->str ; + len1 = string(sp)->len ; + str2 = string(sp + 1)->str ; + len2 = string(sp + 1)->len ; + + b = new_STRING0(len1 + len2) ; + memcpy(b->str, str1, len1) ; + memcpy(b->str + len1, str2, len2) ; + free_STRING(string(sp)) ; + free_STRING(string(sp + 1)) ; + + sp->ptr = (PTR) b ; + break ; + } + + case _PUSHINT: + inc_sp() ; + sp->type = cdp++->op ; + break ; + + case _BUILTIN: + case _PRINT: + sp = (*(PF_CP) cdp++->ptr) (sp) ; + break ; + + case _POST_INC: + cp = (CELL *) sp->ptr ; + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; + sp->type = C_DOUBLE ; + sp->dval = cp->dval ; + cp->dval += 1.0 ; + break ; + + case _POST_DEC: + cp = (CELL *) sp->ptr ; + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; + sp->type = C_DOUBLE ; + sp->dval = cp->dval ; + cp->dval -= 1.0 ; + break ; + + case _PRE_INC: + cp = (CELL *) sp->ptr ; + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; + sp->dval = cp->dval += 1.0 ; + sp->type = C_DOUBLE ; + break ; + + case _PRE_DEC: + cp = (CELL *) sp->ptr ; + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; + sp->dval = cp->dval -= 1.0 ; + sp->type = C_DOUBLE ; + break ; + + + case F_POST_INC: + cp = (CELL *) sp->ptr ; + cellcpy(&tc, cp) ; + cast1_to_d(&tc) ; + sp->type = C_DOUBLE ; + sp->dval = tc.dval ; + tc.dval += 1.0 ; + field_assign(cp, &tc) ; + break ; + + case F_POST_DEC: + cp = (CELL *) sp->ptr ; + cellcpy(&tc, cp) ; + cast1_to_d(&tc) ; + sp->type = C_DOUBLE ; + sp->dval = tc.dval ; + tc.dval -= 1.0 ; + field_assign(cp, &tc) ; + break ; + + 
case F_PRE_INC: + cp = (CELL *) sp->ptr ; + cast1_to_d(cellcpy(sp, cp)) ; + sp->dval += 1.0 ; + field_assign(cp, sp) ; + break ; + + case F_PRE_DEC: + cp = (CELL *) sp->ptr ; + cast1_to_d(cellcpy(sp, cp)) ; + sp->dval -= 1.0 ; + field_assign(cp, sp) ; + break ; + + case _JMP: + cdp += cdp->op ; + break ; + + case _JNZ: + /* jmp if top of stack is non-zero and pop stack */ + if (test(sp)) cdp += cdp->op ; + else cdp++ ; + cell_destroy(sp) ; + sp-- ; + break ; + + case _JZ: + /* jmp if top of stack is zero and pop stack */ + if (!test(sp)) cdp += cdp->op ; + else cdp++ ; + cell_destroy(sp) ; + sp-- ; + break ; + + case _LJZ: + /* special jump for logical and */ + /* this is always preceded by _TEST */ + if ( sp->dval == 0.0 ) + { + /* take jump, but don't pop stack */ + cdp += cdp->op ; + } + else + { + /* pop and don't jump */ + sp-- ; + cdp++ ; + } + break ; + + case _LJNZ: + /* special jump for logical or */ + /* this is always preceded by _TEST */ + if ( sp->dval != 0.0 ) + { + /* take jump, but don't pop stack */ + cdp += cdp->op ; + } + else + { + /* pop and don't jump */ + sp-- ; + cdp++ ; + } + break ; + + /* the relation operations */ + /* compare() makes sure string ref counts are OK */ + case _EQ: + t = compare(--sp) ; + sp->type = C_DOUBLE ; + sp->dval = t == 0 ? 1.0 : 0.0 ; + break ; + + case _NEQ: + t = compare(--sp) ; + sp->type = C_DOUBLE ; + sp->dval = t ? 1.0 : 0.0 ; + break ; + + case _LT: + t = compare(--sp) ; + sp->type = C_DOUBLE ; + sp->dval = t < 0 ? 1.0 : 0.0 ; + break ; + + case _LTE: + t = compare(--sp) ; + sp->type = C_DOUBLE ; + sp->dval = t <= 0 ? 1.0 : 0.0 ; + break ; + + case _GT: + t = compare(--sp) ; + sp->type = C_DOUBLE ; + sp->dval = t > 0 ? 1.0 : 0.0 ; + break ; + + case _GTE: + t = compare(--sp) ; + sp->type = C_DOUBLE ; + sp->dval = t >= 0 ? 1.0 : 0.0 ; + break ; + + case _MATCH0: + /* does $0 match, the RE at cdp? 
*/ + + inc_sp() ; + if (field->type >= C_STRING) + { + sp->type = C_DOUBLE ; + sp->dval = REtest(string(field)->str, cdp++->ptr) + ? 1.0 : 0.0 ; + + break /* the case */ ; + } + else + { + cellcpy(sp, field) ; + /* and FALL THRU */ + } + + case _MATCH1: + /* does expr at sp[0] match RE at cdp */ + if (sp->type < C_STRING) cast1_to_s(sp) ; + t = REtest(string(sp)->str, cdp++->ptr) ; + free_STRING(string(sp)) ; + sp->type = C_DOUBLE ; + sp->dval = t ? 1.0 : 0.0 ; + break ; + + + case _MATCH2: + /* does sp[-1] match sp[0] as re */ + cast_to_RE(sp) ; + + if ((--sp)->type < C_STRING) cast1_to_s(sp) ; + t = REtest(string(sp)->str, (sp + 1)->ptr) ; + + free_STRING(string(sp)) ; + sp->type = C_DOUBLE ; + sp->dval = t ? 1.0 : 0.0 ; + break ; + + case A_TEST: + /* entry : sp[0].ptr-> an array + sp[-1] is an expression + + we compute (expression in array) */ + sp-- ; + cp = array_find((sp + 1)->ptr, sp, NO_CREATE) ; + cell_destroy(sp) ; + sp->type = C_DOUBLE ; + sp->dval = (cp != (CELL *) 0) ? 1.0 : 0.0 ; + break ; + + case A_DEL: + /* sp[0].ptr -> array + sp[-1] is an expr + delete array[expr] */ + + array_delete(sp->ptr, sp - 1) ; + cell_destroy(sp - 1) ; + sp -= 2 ; + break ; + + case DEL_A: + /* free all the array at once */ + array_clear(sp->ptr) ; + sp-- ; + break ; + + /* form a multiple array index */ + case A_CAT: + sp = array_cat(sp, cdp++->op) ; + break ; + + case _EXIT: + if (sp->type != C_DOUBLE) cast1_to_d(sp) ; + exit_code = d_to_i(sp->dval) ; + sp-- ; + /* fall thru */ + + case _EXIT0: + + if (!end_start) mawk_exit(exit_code) ; + + cdp = end_start ; + end_start = (INST *) 0 ; /* makes sure next exit exits */ + + if (begin_start) zfree(begin_start, begin_size) ; + if (main_start) zfree(main_start, main_size) ; + sp = eval_stack - 1 ;/* might be in user function */ + CLEAR_ALOOP_STACK() ; /* ditto */ + break ; + + case _JMAIN: /* go from BEGIN code to MAIN code */ + zfree(begin_start, begin_size) ; + begin_start = (INST *) 0 ; + cdp = main_start ; + break ; + + 
case _OMAIN: + if (!main_fin) open_main() ; + restart_label = cdp ; + cdp = next_label ; + break ; + + case _NEXT: + /* next might be inside an aloop -- clear stack */ + CLEAR_ALOOP_STACK() ; + cdp = next_label ; + break ; + + case OL_GL: + { + char *p ; + unsigned len ; + + if (!(p = FINgets(main_fin, &len))) + { + if (!end_start) mawk_exit(0) ; + + cdp = end_start ; + zfree(main_start, main_size) ; + main_start = end_start = (INST *) 0 ; + } + else + { + set_field0(p, len) ; + cdp = restart_label ; + rt_nr++ ; rt_fnr++ ; + } + } + break ; + + /* two kinds of OL_GL is a historical stupidity from working on + a machine with very slow floating point emulation */ + case OL_GL_NR: + { + char *p ; + unsigned len ; + + if (!(p = FINgets(main_fin, &len))) + { + if (!end_start) mawk_exit(0) ; + + cdp = end_start ; + zfree(main_start, main_size) ; + main_start = end_start = (INST *) 0 ; + } + else + { + set_field0(p, len) ; + cdp = restart_label ; + + if (TEST2(NR) != TWO_DOUBLES) cast2_to_d(NR) ; + + NR->dval += 1.0 ; rt_nr++ ; + FNR->dval += 1.0 ; rt_fnr++ ; + } + } + break ; + + + case _RANGE: +/* test a range pattern: pat1, pat2 { action } + entry : + cdp[0].op -- a flag, test pat1 if on else pat2 + cdp[1].op -- offset of pat2 code from cdp + cdp[2].op -- offset of action code from cdp + cdp[3].op -- offset of code after the action from cdp + cdp[4] -- start of pat1 code +*/ + +#define FLAG cdp[0].op +#define PAT2 cdp[1].op +#define ACTION cdp[2].op +#define FOLLOW cdp[3].op +#define PAT1 4 + + if (FLAG) /* test again pat1 */ + { + execute(cdp + PAT1, sp, fp) ; + t = test(sp + 1) ; + cell_destroy(sp + 1) ; + if (t) FLAG = 0 ; + else + { + cdp += FOLLOW ; + break ; /* break the switch */ + } + } + + /* test against pat2 and then perform the action */ + execute(cdp + PAT2, sp, fp) ; + FLAG = test(sp + 1) ; + cell_destroy(sp + 1) ; + cdp += ACTION ; + break ; + +/* function calls */ + + case _RET0: + inc_sp() ; + sp->type = C_NOINIT ; + /* fall thru */ + + case _RET: + 
+#ifdef DEBUG + if (sp != entry_sp + 1) bozo("ret") ; +#endif + if (old_stack_base) /* reset stack */ + { + /* move the return value */ + cellcpy(old_sp + 1, sp) ; + cell_destroy(sp) ; + zfree(stack_base, sizeof(CELL) * EVAL_STACK_SIZE) ; + stack_base = old_stack_base ; + stack_danger = old_stack_base + DANGER ; + } + + /* return might be inside an aloop -- clear stack */ + CLEAR_ALOOP_STACK() ; + + return ; + + case _CALL: + + /* cdp[0] holds ptr to "function block" + cdp[1] holds number of input arguments + */ + + { + FBLOCK *fbp = (FBLOCK *) cdp++->ptr ; + int a_args = cdp++->op ; /* actual number of args */ + CELL *nfp = sp - a_args + 1 ; /* new fp for callee */ + CELL *local_p = sp + 1 ; /* first local argument on stack */ + char *type_p ; /* pts to type of an argument */ + + if (fbp->nargs) type_p = fbp->typev + a_args - 1 ; + + /* create space for locals */ + t = fbp->nargs - a_args ; /* t is number of locals */ + while (t>0) + { + t-- ; sp++ ; type_p++ ; + sp->type = C_NOINIT ; + if (*type_p == ST_LOCAL_ARRAY) + sp->ptr = (PTR) new_ARRAY() ; + } + + execute(fbp->code, sp, nfp) ; + + /* cleanup the callee's arguments */ + /* putting return value at top of eval stack */ + if (sp >= nfp) + { + cp = sp + 1 ; /* cp -> the function return */ + + do + { + if (*type_p == ST_LOCAL_ARRAY) + { + if (sp >= local_p) + { + array_clear(sp->ptr) ; + ZFREE((ARRAY)sp->ptr) ; + } + } + else cell_destroy(sp) ; + + type_p-- ; sp-- ; + + } + while (sp >= nfp); + + cellcpy(++sp, cp) ; + cell_destroy(cp) ; + } + else sp++ ; /* no arguments passed */ + } + break ; + + default: + bozo("bad opcode") ; + } +} + + +/* + return 0 if a numeric is zero else return non-zero + return 0 if a string is "" else return non-zero +*/ +int +test(cp) + register CELL *cp ; +{ + reswitch: + + switch (cp->type) + { + case C_NOINIT: + return 0 ; + case C_STRNUM: /* test as a number */ + case C_DOUBLE: + return cp->dval != 0.0 ; + case C_STRING: + return string(cp)->len ; + case C_MBSTRN : 
check_strnum(cp) ; goto reswitch ; + default: + bozo("bad cell type in call to test") ; + } + return 0 ; /*can't get here: shutup */ +} + +/* compare cells at cp and cp+1 and + frees STRINGs at those cells +*/ +static int +compare(cp) + register CELL *cp ; +{ + int k ; + + reswitch: + + switch (TEST2(cp)) + { + case TWO_NOINITS: + return 0 ; + + case TWO_DOUBLES: + two_d: + return cp->dval > (cp + 1)->dval ? 1 : + cp->dval < (cp + 1)->dval ? -1 : 0 ; + + case TWO_STRINGS: + case STRING_AND_STRNUM: + two_s: + k = strcmp(string(cp)->str, string(cp + 1)->str) ; + free_STRING(string(cp)) ; + free_STRING(string(cp + 1)) ; + return k ; + + case NOINIT_AND_DOUBLE: + case NOINIT_AND_STRNUM: + case DOUBLE_AND_STRNUM: + case TWO_STRNUMS: + cast2_to_d(cp) ; goto two_d ; + case NOINIT_AND_STRING: + case DOUBLE_AND_STRING: + cast2_to_s(cp) ; goto two_s ; + case TWO_MBSTRNS: + check_strnum(cp) ; check_strnum(cp+1) ; + goto reswitch ; + + case NOINIT_AND_MBSTRN: + case DOUBLE_AND_MBSTRN: + case STRING_AND_MBSTRN: + case STRNUM_AND_MBSTRN: + check_strnum(cp->type == C_MBSTRN ? 
cp : cp + 1) ; + goto reswitch ; + + default: /* there are no default cases */ + bozo("bad cell type passed to compare") ; + } + return 0 ; /* shut up */ +} + +/* does not assume target was a cell, if so + then caller should have made a previous + call to cell_destroy */ + +CELL * +cellcpy(target, source) + register CELL *target, *source ; +{ + switch (target->type = source->type) + { + case C_NOINIT: + case C_SPACE: + case C_SNULL: + break ; + + case C_DOUBLE: + target->dval = source->dval ; + break ; + + case C_STRNUM: + target->dval = source->dval ; + /* fall thru */ + + case C_REPL: + case C_MBSTRN: + case C_STRING: + string(source)->ref_cnt++ ; + /* fall thru */ + + case C_RE: + target->ptr = source->ptr ; + break ; + + case C_REPLV: + replv_cpy(target, source) ; + break ; + + default: + bozo("bad cell passed to cellcpy()") ; + break ; + } + return target ; +} + +#ifdef DEBUG + +void +DB_cell_destroy(cp) /* HANGOVER time */ + register CELL *cp ; +{ + switch (cp->type) + { + case C_NOINIT: + case C_DOUBLE: + break ; + + case C_MBSTRN: + case C_STRING: + case C_STRNUM: + if (--string(cp)->ref_cnt == 0) + zfree(string(cp), string(cp)->len + STRING_OH) ; + break ; + + case C_RE: + bozo("cell destroy called on RE cell") ; + default: + bozo("cell destroy called on bad cell type") ; + } +} + +#endif + + + +/* convert a double d to a field index $d -> $i */ +static int +d_to_index(d) + double d; +{ + + if (d > MAX_FIELD) + rt_overflow("maximum number of fields", MAX_FIELD) ; + + if (d >= 0.0) return (int) d ; + + /* might include nan */ + rt_error("negative field index $%.6g", d) ; + return 0 ; /* shutup */ +} diff --git a/execute.o b/execute.o new file mode 100644 index 0000000..8b5d1db Binary files /dev/null and b/execute.o differ diff --git a/fcall.c b/fcall.c new file mode 100644 index 0000000..ca77f15 --- /dev/null +++ b/fcall.c @@ -0,0 +1,442 @@ + +/******************************************** +fcall.c +copyright 1991, Michael D. 
Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + + +/*$Log: fcall.c,v $ + * Revision 1.7 1995/08/27 15:46:47 mike + * change some errmsgs to compile_errors + * + * Revision 1.6 1995/06/09 22:58:24 mike + * cast to shutup solaris cc on comparison of short to ushort + * + * Revision 1.5 1995/06/06 00:18:26 mike + * change mawk_exit(1) to mawk_exit(2) + * + * Revision 1.4 1995/04/21 14:20:14 mike + * move_level variable to fix bug in arglist patching of moved code. + * + * Revision 1.3 1995/02/19 22:15:37 mike + * Always set the call_offset field in a CA_REC (for obscure + * reasons in fcall.c (see comments) there.) + * + * Revision 1.2 1993/07/17 13:22:52 mike + * indent and general code cleanup + * + * Revision 1.1.1.1 1993/07/03 18:58:11 mike + * move source to cvs + * + * Revision 5.4 1993/01/09 19:03:44 mike + * code_pop checks if the resolve_list needs relocation + * + * Revision 5.3 1993/01/07 02:50:33 mike + * relative vs absolute code + * + * Revision 5.2 1993/01/01 21:30:48 mike + * split new_STRING() into new_STRING and new_STRING0 + * + * Revision 5.1 1991/12/05 07:55:54 brennan + * 1.1 pre-release + * +*/ + +#include "mawk.h" +#include "symtype.h" +#include "code.h" + +/* This file has functions involved with type checking of + function calls +*/ + +static FCALL_REC *PROTO(first_pass, (FCALL_REC *)) ; +static CA_REC *PROTO(call_arg_check, (FBLOCK *, CA_REC *, + INST *, unsigned)) ; +static int PROTO(arg_cnt_ok, (FBLOCK *, CA_REC *, unsigned)) ; +static void PROTO(relocate_arglist, (CA_REC *, int, unsigned, int)) ; + + +static int check_progress ; + /* flag that indicates call_arg_check() was able to type + check some call arguments */ + +/* type checks a list of call arguments, + returns a list of arguments whose type is still unknown +*/ +static CA_REC * 
+call_arg_check(callee, entry_list, start, line_no) + FBLOCK *callee ; + CA_REC *entry_list ; + INST *start ; /* to locate patch */ + unsigned line_no ; /* for error messages */ +{ + register CA_REC *q ; + CA_REC *exit_list = (CA_REC *) 0 ; + + check_progress = 0 ; + + /* loop : + take q off entry_list + test it + if OK zfree(q) else put on exit_list */ + while ((q = entry_list)) + { + entry_list = q->link ; + + if (q->type == ST_NONE) + { + /* try to infer the type */ + /* it might now be in symbol table */ + if (q->sym_p->type == ST_VAR) + { + /* set type and patch */ + q->type = CA_EXPR ; + start[q->call_offset + 1].ptr = (PTR) q->sym_p->stval.cp ; + } + else if (q->sym_p->type == ST_ARRAY) + { + q->type = CA_ARRAY ; + start[q->call_offset].op = A_PUSHA ; + start[q->call_offset + 1].ptr = (PTR) q->sym_p->stval.array ; + } + else /* try to infer from callee */ + { + switch (callee->typev[q->arg_num]) + { + case ST_LOCAL_VAR: + q->type = CA_EXPR ; + q->sym_p->type = ST_VAR ; + q->sym_p->stval.cp = ZMALLOC(CELL) ; + q->sym_p->stval.cp->type = C_NOINIT ; + start[q->call_offset + 1].ptr = + (PTR) q->sym_p->stval.cp ; + break ; + + case ST_LOCAL_ARRAY: + q->type = CA_ARRAY ; + q->sym_p->type = ST_ARRAY ; + q->sym_p->stval.array = new_ARRAY() ; + start[q->call_offset].op = A_PUSHA ; + start[q->call_offset + 1].ptr = + (PTR) q->sym_p->stval.array ; + break ; + } + } + } + else if (q->type == ST_LOCAL_NONE) + { + /* try to infer the type */ + if (*q->type_p == ST_LOCAL_VAR) + { + /* set type , don't need to patch */ + q->type = CA_EXPR ; + } + else if (*q->type_p == ST_LOCAL_ARRAY) + { + q->type = CA_ARRAY ; + start[q->call_offset].op = LA_PUSHA ; + /* offset+1 op is OK */ + } + else /* try to infer from callee */ + { + switch (callee->typev[q->arg_num]) + { + case ST_LOCAL_VAR: + q->type = CA_EXPR ; + *q->type_p = ST_LOCAL_VAR ; + /* do not need to patch */ + break ; + + case ST_LOCAL_ARRAY: + q->type = CA_ARRAY ; + *q->type_p = ST_LOCAL_ARRAY ; + 
start[q->call_offset].op = LA_PUSHA ; + break ; + } + } + } + + /* if we still do not know the type put on the new list + else type check */ + if (q->type == ST_NONE || q->type == ST_LOCAL_NONE) + { + q->link = exit_list ; + exit_list = q ; + } + else /* type known */ + { + if (callee->typev[q->arg_num] == ST_LOCAL_NONE) + callee->typev[q->arg_num] = q->type ; + else if (q->type != callee->typev[q->arg_num]) + compile_error("type error in arg(%d) in call to %s", + q->arg_num + 1, callee->name) ; + + ZFREE(q) ; + check_progress = 1 ; + } + } /* while */ + + return exit_list ; +} + + +static int +arg_cnt_ok(fbp, q, line_no) + FBLOCK *fbp ; + CA_REC *q ; + unsigned line_no ; +{ + if ((int)q->arg_num >= (int)fbp->nargs) + /* casts shutup stupid warning from solaris sun cc */ + { + compile_error("too many arguments in call to %s", fbp->name) ; + return 0 ; + } + else return 1 ; +} + + +FCALL_REC *resolve_list ; + /* function calls whose arg types need checking + are stored on this list */ + + +/* on first pass thru the resolve list + we check : + if forward referenced functions were really defined + if right number of arguments + and compute call_start which is now known +*/ + +static FCALL_REC * +first_pass(p) + register FCALL_REC *p ; +{ + FCALL_REC dummy ; + register FCALL_REC *q = &dummy ; /* trails p */ + + q->link = p ; + while (p) + { + if (!p->callee->code) + { + /* callee never defined */ + compile_error("function %s never defined", p->callee->name) ; + /* delete p from list */ + q->link = p->link ; + /* don't worry about freeing memory, we'll exit soon */ + } + /* note p->arg_list starts with last argument */ + else if (!p->arg_list /* nothing to do */ || + (!p->arg_cnt_checked && + !arg_cnt_ok(p->callee, p->arg_list, p->line_no))) + { + q->link = p->link ; /* delete p */ + /* the ! 
arg_list case is not an error so free memory */ + ZFREE(p) ; + } + else + { + /* keep p and set call_start */ + q = p ; + switch (p->call_scope) + { + case SCOPE_MAIN: + p->call_start = main_start ; + break ; + + case SCOPE_BEGIN: + p->call_start = begin_start ; + break ; + + case SCOPE_END: + p->call_start = end_start ; + break ; + + case SCOPE_FUNCT: + p->call_start = p->call->code ; + break ; + } + } + p = q->link ; + } + return dummy.link ; +} + +/* continuously walk the resolve_list making type deductions + until this list goes empty or no more progress can be made + (An example where no more progress can be made is at end of file) +*/ + +void +resolve_fcalls() +{ + register FCALL_REC *p, *old_list, *new_list ; + int progress ; /* a flag */ + + old_list = first_pass(resolve_list) ; + new_list = (FCALL_REC *) 0 ; + progress = 0 ; + + while (1) + { + if (!old_list) + { + /* flop the lists */ + old_list = new_list ; + if (!old_list /* nothing left */ + || !progress /* can't do any more */ ) + return ; + + new_list = (FCALL_REC *) 0 ; progress = 0 ; + } + + p = old_list ; + old_list = p->link ; + + if ((p->arg_list = call_arg_check(p->callee, p->arg_list, + p->call_start, p->line_no))) + { + /* still have work to do , put on new_list */ + progress |= check_progress ; + p->link = new_list ; new_list = p ; + } + else + { + /* done with p */ + progress = 1 ; + ZFREE(p) ; + } + } +} + +/* the parser has just reduced a function call ; + the info needed to type check is passed in. If type checking + can not be done yet (most common reason -- function referenced + but not defined), a node is added to the resolve list. 
+*/ +void +check_fcall(callee, call_scope, move_level, call, arg_list, line_no) + FBLOCK *callee ; + int call_scope ; + int move_level ; + FBLOCK *call ; + CA_REC *arg_list ; + unsigned line_no ; +{ + FCALL_REC *p ; + + if (!callee->code) + { + /* forward reference to a function to be defined later */ + p = ZMALLOC(FCALL_REC) ; + p->callee = callee ; + p->call_scope = call_scope ; + p->move_level = move_level ; + p->call = call ; + p->arg_list = arg_list ; + p->arg_cnt_checked = 0 ; + p->line_no = line_no ; + /* add to resolve list */ + p->link = resolve_list ; resolve_list = p ; + } + else if (arg_list && arg_cnt_ok(callee, arg_list, line_no)) + { + /* usually arg_list disappears here and all is well + otherwise add to resolve list */ + + if ((arg_list = call_arg_check(callee, arg_list, + code_base, line_no))) + { + p = ZMALLOC(FCALL_REC) ; + p->callee = callee ; + p->call_scope = call_scope ; + p->move_level = move_level ; + p->call = call ; + p->arg_list = arg_list ; + p->arg_cnt_checked = 1 ; + p->line_no = line_no ; + /* add to resolve list */ + p->link = resolve_list ; resolve_list = p ; + } + } +} + + +/* code_pop() has just moved some code. If this code contains + a function call, it might need to be relocated on the + resolve list too. This function does it. +*/ + +void +relocate_resolve_list(scope, move_level, fbp, orig_offset, len, delta) + int scope ; + int move_level ; + FBLOCK *fbp ; + int orig_offset ; + unsigned len ; + int delta ; /* relocation distance */ +{ + FCALL_REC *p = resolve_list ; + + while (p) + { + if (scope == p->call_scope && move_level == p->move_level && + (scope == SCOPE_FUNCT ? 
fbp == p->call : 1)) + { + relocate_arglist(p->arg_list, orig_offset, + len, delta) ; + } + p = p->link ; + } +} + +static void +relocate_arglist(arg_list, offset, len, delta) + CA_REC *arg_list ; + int offset ; + unsigned len ; + int delta ; +{ + register CA_REC *p ; + + if (!arg_list) return ; + + p = arg_list ; + /* all nodes must be relocated or none, so test the + first one */ + + /* Note: call_offset is always set even for args that don't need to + be patched so that this check works. */ + if ( p->call_offset < offset || p->call_offset >= offset + len ) + return ; + + /* relocate the whole list */ + do + { + p->call_offset += delta ; + p = p->link ; + } + while (p); +} + + + + + +/* example where typing cannot progress + +{ f(z) } + +function f(x) { print NR } + +# this is legal, does something useful, but absurdly written +# We have to design so this works +*/ diff --git a/fcall.o b/fcall.o new file mode 100644 index 0000000..edce486 Binary files /dev/null and b/fcall.o differ diff --git a/field.c b/field.c new file mode 100644 index 0000000..6ffc4a0 --- /dev/null +++ b/field.c @@ -0,0 +1,688 @@ + +/******************************************** +field.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/* $Log: field.c,v $ + * Revision 1.5 1995/06/18 19:17:47 mike + * Create a type Int which on most machines is an int, but on machines + * with 16bit ints, i.e., the PC is a long. This fixes implicit assumption + * that int==long. 
+ * + * Revision 1.4 1994/10/08 19:15:38 mike + * remove SM_DOS + * + * Revision 1.3 1993/07/14 12:32:39 mike + * run thru indent + * + * Revision 1.2 1993/07/14 12:22:11 mike + * rm SIZE_T and (void) casts + * + * Revision 1.1.1.1 1993/07/03 18:58:12 mike + * move source to cvs + * + * Revision 5.7 1993/05/08 18:06:00 mike + * null_split + * + * Revision 5.6 1993/02/13 21:57:25 mike + * merge patch3 + * + * Revision 5.5 1993/01/01 21:30:48 mike + * split new_STRING() into new_STRING and new_STRING0 + * + * Revision 5.4.1.2 1993/01/20 12:53:08 mike + * d_to_l() + * + * Revision 5.4.1.1 1993/01/15 03:33:42 mike + * patch3: safer double to int conversion + * + * Revision 5.4 1992/11/29 22:52:11 mike + * double->string conversions uses long ints for 16/32 bit + * compatibility. + * Fixed small LM_DOS bozo. + * + * Revision 5.3 1992/08/17 14:21:10 brennan + * patch2: After parsing, only bi_sprintf() uses string_buff. + * + * Revision 5.2 1992/07/10 16:17:10 brennan + * MsDOS: remove NO_BINMODE macro + * + * Revision 5.1 1991/12/05 07:55:57 brennan + * 1.1 pre-release + * +*/ + + +/* field.c */ + +#include "mawk.h" +#include "field.h" +#include "init.h" +#include "memory.h" +#include "scan.h" +#include "bi_vars.h" +#include "repl.h" +#include "regexp.h" + +CELL field[FBANK_SZ + NUM_PFIELDS] ; + +CELL *fbank[NUM_FBANK] = +{field} ; + +static int max_field = MAX_SPLIT ; /* maximum field actually created*/ + +static void PROTO(build_field0, (void)) ; +static void PROTO(set_rs_shadow, (void)) ; +static void PROTO(load_pfield, (char *, CELL *)) ; +static void PROTO(load_field_ov, (void)) ; + + + +/* a description of how to split based on RS. + If RS is changed, so is rs_shadow */ +SEPARATOR rs_shadow = +{SEP_CHAR, '\n'} ; +/* a splitting CELL version of FS */ +CELL fs_shadow = +{C_SPACE} ; +int nf ; + /* nf holds the true value of NF. 
If nf < 0 , then + NF has not been computed, i.e., $0 has not been split + */ + +static void +set_rs_shadow() +{ + CELL c ; + STRING *sval ; + char *s ; + unsigned len ; + + if (posix_space_flag && mawk_state == EXECUTION) + scan_code['\n'] = SC_UNEXPECTED ; + + if (rs_shadow.type == SEP_STR) + { + free_STRING((STRING *) rs_shadow.ptr) ; + } + + cast_for_split(cellcpy(&c, RS)) ; + switch (c.type) + { + case C_RE: + if ((s = is_string_split(c.ptr, &len))) + { + if (len == 1) + { + rs_shadow.type = SEP_CHAR ; + rs_shadow.c = s[0] ; + } + else + { + rs_shadow.type = SEP_STR ; + rs_shadow.ptr = (PTR) new_STRING(s) ; + } + } + else + { + rs_shadow.type = SEP_RE ; + rs_shadow.ptr = c.ptr ; + } + break ; + + case C_SPACE: + rs_shadow.type = SEP_CHAR ; + rs_shadow.c = ' ' ; + break ; + + case C_SNULL: /* RS becomes one or more blank lines */ + if (mawk_state == EXECUTION) scan_code['\n'] = SC_SPACE ; + rs_shadow.type = SEP_MLR ; + sval = new_STRING("\n\n+") ; + rs_shadow.ptr = re_compile(sval) ; + free_STRING(sval) ; + break ; + + default: + bozo("bad cell in set_rs_shadow") ; + } +} + +static void +load_pfield(name, cp) + char *name ; + CELL *cp ; +{ + SYMTAB *stp ; + + stp = insert(name) ; stp->type = ST_FIELD ; + stp->stval.cp = cp ; +} + +/* initialize $0 and the pseudo fields */ +void +field_init() +{ + field[0].type = C_STRING ; + field[0].ptr = (PTR) & null_str ; + null_str.ref_cnt++ ; + + load_pfield("NF", NF) ; + NF->type = C_DOUBLE ; + NF->dval = 0.0 ; + + load_pfield("RS", RS) ; + RS->type = C_STRING ; + RS->ptr = (PTR) new_STRING("\n") ; + /* rs_shadow already set */ + + load_pfield("FS", FS) ; + FS->type = C_STRING ; + FS->ptr = (PTR) new_STRING(" ") ; + /* fs_shadow is already set */ + + load_pfield("OFMT", OFMT) ; + OFMT->type = C_STRING ; + OFMT->ptr = (PTR) new_STRING("%.6g") ; + + load_pfield("CONVFMT", CONVFMT) ; + CONVFMT->type = C_STRING ; + CONVFMT->ptr = OFMT->ptr ; + string(OFMT)->ref_cnt++ ; +} + + + +void +set_field0(s, len) + char *s ; + unsigned 
len ; +{ + cell_destroy(&field[0]) ; + nf = -1 ; + + if (len) + { + field[0].type = C_MBSTRN ; + field[0].ptr = (PTR) new_STRING0(len) ; + memcpy(string(&field[0])->str, s, len) ; + } + else + { + field[0].type = C_STRING ; + field[0].ptr = (PTR) & null_str ; + null_str.ref_cnt++ ; + } +} + + + +/* split field[0] into $1, $2 ... and set NF */ + +void +split_field0() +{ + register CELL *cp ; + register int cnt ; + CELL c ; /* copy field[0] here if not string */ + + + if (field[0].type < C_STRING) + { + cast1_to_s(cellcpy(&c, field + 0)) ; + cp = &c ; + } + else cp = &field[0] ; + + if (string(cp)->len == 0) nf = 0 ; + else + { + switch (fs_shadow.type) + { + case C_SNULL: /* FS == "" */ + nf = null_split(string(cp)->str) ; + break ; + + case C_SPACE: + nf = space_split(string(cp)->str, string(cp)->len) ; + break ; + + default: + nf = re_split(string(cp)->str, fs_shadow.ptr) ; + break ; + } + + } + + cell_destroy(NF) ; + NF->type = C_DOUBLE ; + NF->dval = (double) nf ; + + if (nf > MAX_SPLIT) + { + cnt = MAX_SPLIT ; load_field_ov() ; + } + else cnt = nf ; + + while (cnt > 0) + { + cell_destroy(field + cnt) ; + field[cnt].ptr = (PTR) split_buff[cnt - 1] ; + field[cnt--].type = C_MBSTRN ; + } + + if (cp == &c) { free_STRING(string(cp)) ; } +} + +/* + assign CELL *cp to field or pseudo field + and take care of all side effects +*/ + +void +field_assign(fp, cp) + register CELL *fp ; + CELL *cp ; +{ + CELL c ; + int i, j ; + + /* the most common case first */ + if (fp == field) + { + cell_destroy(field) ; + cellcpy(fp, cp) ; + nf = -1 ; + return ; + } + + /* its not important to do any of this fast */ + + if (nf < 0) split_field0() ; + +#ifdef MSDOS + if (!SAMESEG(fp, field)) + { + i = -1 ; + goto lm_dos_label ; + } +#endif + + switch (i = (fp - field)) + { + + case NF_field: + + cell_destroy(NF) ; + cellcpy(NF, cellcpy(&c, cp)) ; + if (c.type != C_DOUBLE) cast1_to_d(&c) ; + + if ((j = d_to_i(c.dval)) < 0) + rt_error("negative value assigned to NF") ; + + if (j > nf) + 
for (i = nf + 1; i <= j; i++) + { + cp = field_ptr(i) ; + cell_destroy(cp) ; + cp->type = C_STRING ; + cp->ptr = (PTR) & null_str ; + null_str.ref_cnt++ ; + } + + nf = j ; + build_field0() ; + break ; + + case RS_field: + cell_destroy(RS) ; + cellcpy(RS, cp) ; + set_rs_shadow() ; + break ; + + case FS_field: + cell_destroy(FS) ; + cast_for_split(cellcpy(&fs_shadow, cellcpy(FS, cp))) ; + break ; + + case OFMT_field: + case CONVFMT_field: + /* If the user does something stupid with OFMT or CONVFMT, + we could crash. + We'll make an attempt to protect ourselves here. This is + why OFMT and CONVFMT are pseudo fields. + + The ptrs of OFMT and CONVFMT always have a valid STRING, + even if assigned a DOUBLE or NOINIT + */ + + free_STRING(string(fp)) ; + cellcpy(fp, cp) ; + if (fp->type < C_STRING) /* !! */ + fp->ptr = (PTR) new_STRING("%.6g") ; + else if (fp == CONVFMT) + { + /* It's a string, but if it's really goofy and CONVFMT, + it could still damage us. Test it . + */ + char xbuff[512] ; + + xbuff[256] = 0 ; + sprintf(xbuff, string(fp)->str, 3.1459) ; + if (xbuff[256]) + rt_error("CONVFMT assigned unusable value") ; + } + break ; + +#ifdef MSDOS + lm_dos_label: +#endif + + default: /* $1 or $2 or ... 
*/ + + + cell_destroy(fp) ; + cellcpy(fp, cp) ; + + if (i < 0 || i > MAX_SPLIT) i = field_addr_to_index(fp) ; + + if (i > nf) + { + for (j = nf + 1; j < i; j++) + { + cp = field_ptr(j) ; + cell_destroy(cp) ; + cp->type = C_STRING ; + cp->ptr = (PTR) & null_str ; + null_str.ref_cnt++ ; + } + nf = i ; + cell_destroy(NF) ; + NF->type = C_DOUBLE ; + NF->dval = (double) i ; + } + + build_field0() ; + + } +} + + +/* construct field[0] from the other fields */ + +static void +build_field0() +{ + + +#ifdef DEBUG + if (nf < 0) bozo("nf <0 in build_field0") ; +#endif + + cell_destroy(field + 0) ; + + if (nf == 0) + { + field[0].type = C_STRING ; + field[0].ptr = (PTR) & null_str ; + null_str.ref_cnt++ ; + } + else if (nf == 1) + { + cellcpy(field, field + 1) ; + } + else + { + CELL c ; + STRING *ofs, *tail ; + unsigned len ; + register CELL *cp ; + register char *p, *q ; + int cnt ; + CELL **fbp, *cp_limit ; + + + cast1_to_s(cellcpy(&c, OFS)) ; + ofs = (STRING *) c.ptr ; + cast1_to_s(cellcpy(&c, field_ptr(nf))) ; + tail = (STRING *) c.ptr ; + cnt = nf - 1 ; + + len = cnt * ofs->len + tail->len ; + + fbp = fbank ; cp_limit = field + FBANK_SZ ; + cp = field + 1 ; + + while (cnt-- > 0) + { + if (cp->type < C_STRING) + { /* use the string field temporarily */ + if (cp->type == C_NOINIT) + { + cp->ptr = (PTR) & null_str ; + null_str.ref_cnt++ ; + } + else /* its a double */ + { + Int ival ; + char xbuff[260] ; + + ival = d_to_I(cp->dval) ; + if (ival == cp->dval) sprintf(xbuff, INT_FMT, ival) ; + else sprintf(xbuff, string(CONVFMT)->str, cp->dval) ; + + cp->ptr = (PTR) new_STRING(xbuff) ; + } + } + + len += string(cp)->len ; + + if (++cp == cp_limit) + { + cp = *++fbp ; + cp_limit = cp + FBANK_SZ ; + } + + } + + field[0].type = C_STRING ; + field[0].ptr = (PTR) new_STRING0(len) ; + + p = string(field)->str ; + + /* walk it again , putting things together */ + cnt = nf-1 ; fbp = fbank ; + cp = field+1 ; cp_limit = field + FBANK_SZ ; + while (cnt-- > 0) + { + memcpy(p, 
string(cp)->str, string(cp)->len) ; + p += string(cp)->len ; + /* if not really string, free temp use of ptr */ + if (cp->type < C_STRING) { free_STRING(string(cp)) ; } + if (++cp == cp_limit) + { + cp = *++fbp ; + cp_limit = cp + FBANK_SZ ; + } + /* add the separator */ + q = ofs->str ; while( *q ) *p++ = *q++ ; + } + /* tack tail on the end */ + memcpy(p, tail->str, tail->len) ; + + /* cleanup */ + free_STRING(tail) ; free_STRING(ofs) ; + } +} + +/* We are assigning to a CELL and we aren't sure if its + a field */ + +void +slow_cell_assign(target, source) + register CELL *target ; + CELL *source ; +{ + if ( + +#ifdef MSDOS /* the dreaded segment nonsense */ + SAMESEG(target, field) && +#endif + target >= field && target <= LAST_PFIELD) + field_assign(target, source) ; + else + { + CELL **p = fbank + 1 ; + + while (*p) + { + if ( +#ifdef MSDOS + SAMESEG(target, *p) && +#endif + target >= *p && target < *p + FBANK_SZ) + { + field_assign(target, source) ; + return ; + } + p++ ; + } + /* its not a field */ + cell_destroy(target) ; + cellcpy(target, source) ; + } +} + +int +field_addr_to_index(cp) + CELL *cp ; +{ + CELL **p = fbank ; + + while ( + +#ifdef MSDOS + !SAMESEG(cp, *p) || +#endif + + cp < *p || cp >= *p + FBANK_SZ) + p++ ; + + return ((p - fbank) << FB_SHIFT) + (cp - *p) ; +} + +/*------- more than 1 fbank needed ------------*/ + +/* + compute the address of a field with index + > MAX_SPLIT +*/ + +CELL * +slow_field_ptr(i) + register int i ; +{ + + if (i > max_field) + { + int j ; + + if (i > MAX_FIELD) + rt_overflow("maximum number of fields", MAX_FIELD) ; + + j = 1 ; + while (fbank[j]) j++ ; + + do + { + fbank[j] = (CELL *) zmalloc(sizeof(CELL) * FBANK_SZ) ; + memset(fbank[j], 0, sizeof(CELL) * FBANK_SZ) ; + j++ ; + max_field += FBANK_SZ ; + } + while (i > max_field); + } + + return &fbank[i >> FB_SHIFT][i & (FBANK_SZ - 1)] ; +} + +/* + $0 split into more than MAX_SPLIT fields, + $(MAX_FIELD+1) ... are on the split_ov_list. 
+ Copy into fields which start at fbank[1] +*/ + +static void +load_field_ov() +{ + register SPLIT_OV *p ; /* walks split_ov_list */ + register CELL *cp ; /* target of copy */ + int j ; /* current fbank[] */ + CELL *cp_limit ; /* change fbank[] */ + SPLIT_OV *q ; /* trails p */ + + /* make sure the fields are allocated */ + slow_field_ptr(nf) ; + + p = split_ov_list ; split_ov_list = (SPLIT_OV*) 0 ; + j = 1 ; cp = fbank[j] ; cp_limit = cp + FBANK_SZ ; + while (p) + { + cell_destroy(cp) ; + cp->type = C_MBSTRN ; + cp->ptr = (PTR) p->sval ; + + if (++cp == cp_limit) + { + cp = fbank[++j] ; cp_limit = cp + FBANK_SZ ; + } + + q = p ; p = p->link ; ZFREE(q) ; + } +} + + +#if MSDOS + +int +binmode() /* read current value of BINMODE */ +{ + CELL c ; + + cast1_to_d(cellcpy(&c, BINMODE)) ; + return d_to_i(c.dval) ; +} + +/* set BINMODE and RS and ORS + from environment or -W binmode= */ + +void +set_binmode(x) + int x ; +{ + CELL c ; + + /* set RS */ + c.type = C_STRING ; + c.ptr = (PTR) new_STRING((x & 1) ? "\r\n" : "\n") ; + field_assign(RS, &c) ; + free_STRING(string(&c)) ; + + /* set ORS */ + cell_destroy(ORS) ; + ORS->type = C_STRING ; + ORS->ptr = (PTR) new_STRING((x & 2) ? "\r\n" : "\n") ; + + cell_destroy(BINMODE) ; + BINMODE->type = C_DOUBLE ; + BINMODE->dval = (double) x ; +} + +#endif /* MSDOS */ diff --git a/field.h b/field.h new file mode 100644 index 0000000..84448aa --- /dev/null +++ b/field.h @@ -0,0 +1,105 @@ + +/******************************************** +field.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/* $Log: field.h,v $ + * Revision 1.2 1995/06/18 19:42:16 mike + * Remove some redundant declarations and add some prototypes + * + * Revision 1.1.1.1 1993/07/03 18:58:12 mike + * move source to cvs + * + * Revision 5.2 1992/01/06 08:10:24 brennan + * set_binmode() proto for MSDOS + * + * Revision 5.1 91/12/05 07:59:16 brennan + * 1.1 pre-release + * +*/ + +/* field.h */ + + +#ifndef FIELD_H +#define FIELD_H 1 + +void PROTO( set_field0, (char *, unsigned) ) ; +void PROTO( split_field0, (void) ) ; +int PROTO( space_split, (char *, unsigned) ) ; +int PROTO( re_split, (char *, PTR) ) ; +int PROTO( null_split, (char *)) ; +void PROTO( field_assign, (CELL*, CELL *) ) ; +char *PROTO( is_string_split, (PTR , unsigned *) ) ; +void PROTO( slow_cell_assign, (CELL*, CELL*)) ; +CELL *PROTO( slow_field_ptr, (int)) ; +int PROTO( field_addr_to_index, (CELL*)) ; +void PROTO( set_binmode, (int)) ; + + +#define NUM_PFIELDS 5 +extern CELL field[FBANK_SZ+NUM_PFIELDS] ; + /* $0, $1 ... $(MAX_SPLIT), NF, RS, FS, CONVFMT, OFMT */ + +/* more fields if needed go here */ +extern CELL *fbank[NUM_FBANK] ; /* fbank[0] == field */ + +/* index to CELL * for a field */ +#define field_ptr(i) ((i)<=MAX_SPLIT?field+(i):slow_field_ptr(i)) + +/* the pseudo fields, assignment has side effects */ +#define NF (field+MAX_SPLIT+1) /* must be first */ +#define RS (field+MAX_SPLIT+2) +#define FS (field+MAX_SPLIT+3) +#define CONVFMT (field+MAX_SPLIT+4) +#define OFMT (field+MAX_SPLIT+5) /* must be last */ + +#define LAST_PFIELD OFMT + +/* some compilers choke on (NF-field) in a case statement + even though it's constant so ... 
+*/ +#define NF_field (MAX_SPLIT+1) +#define RS_field (MAX_SPLIT+2) +#define FS_field (MAX_SPLIT+3) +#define CONVFMT_field (MAX_SPLIT+4) +#define OFMT_field (MAX_SPLIT+5) + + +extern int nf ; /* shadows NF */ + +/* a shadow type for RS and FS */ +#define SEP_SPACE 0 +#define SEP_CHAR 1 +#define SEP_STR 2 +#define SEP_RE 3 +#define SEP_MLR 4 + +typedef struct { +char type ; +char c ; +PTR ptr ; /* STRING* or RE machine* */ +} SEPARATOR ; + +extern SEPARATOR rs_shadow ; +extern CELL fs_shadow ; + + +/* types for splitting overflow */ + +typedef struct spov { +struct spov *link ; +STRING *sval ; +} SPLIT_OV ; + +extern SPLIT_OV *split_ov_list ; + + +#endif /* FIELD_H */ diff --git a/field.o b/field.o new file mode 100644 index 0000000..8669853 Binary files /dev/null and b/field.o differ diff --git a/files.c b/files.c new file mode 100644 index 0000000..aab115f --- /dev/null +++ b/files.c @@ -0,0 +1,661 @@ + +/******************************************** +files.c +copyright 1991-94. Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/*$Log: files.c,v $ + * Revision 1.9 1996/01/14 17:14:10 mike + * flush_all_output() + * + * Revision 1.8 1995/06/06 00:18:27 mike + * change mawk_exit(1) to mawk_exit(2) + * + * Revision 1.7 1994/12/11 20:48:50 mike + * fflush builtin + * + * Revision 1.6 1994/10/08 19:15:40 mike + * remove SM_DOS + * + * Revision 1.5 1994/04/17 20:01:37 mike + * recognize filename "/dev/stdout" + * + * Revision 1.4 1994/02/21 00:11:07 mike + * code cleanup + * + * Revision 1.3 1993/07/16 01:00:36 mike + * cleanup and indent + * + * Revision 5.5 1992/12/17 02:48:01 mike + * 1.1.2d changes for DOS + * + * Revision 5.4 1992/07/10 16:10:30 brennan + * patch2 + * MsDOS: remove useless NO_BINMODE macro + * get process exit code on in pipes + * + * Revision 5.3 1992/04/07 20:21:17 brennan + * patch 2 + * unbuffered output to a tty + * + * Revision 5.2 1992/04/07 16:03:08 brennan + * patch 2 + * allow same filename for output and input, but use different descriptors + * E.g. 
< "/dev/tty" and > "/dev/tty" + * + * Revision 5.1 91/12/05 07:56:00 brennan + * 1.1 pre-release + * +*/ + +/* files.c */ + +#include "mawk.h" +#include "files.h" +#include "memory.h" +#include "fin.h" + +#include +#include +#include + +#ifdef V7 +#include /* defines FIOCLEX */ +#endif + +#ifndef NO_FCNTL_H + +#include +#define CLOSE_ON_EXEC(fd) fcntl(fd, F_SETFD, 1) + +#else +#define CLOSE_ON_EXEC(fd) ioctl(fd, FIOCLEX, (PTR) 0) +#endif + + +/* We store dynamically created files on a linked linear + list with move to the front (big surprise) */ + +typedef struct file +{ + struct file *link ; + STRING *name ; + short type ; + int pid ; /* we need to wait() when we close a pipe */ + /* holds temp file index under MSDOS */ + +#if HAVE_FAKE_PIPES + int inpipe_exit ; +#endif + + PTR ptr ; /* FIN* or FILE* */ +} +FILE_NODE ; + +static FILE_NODE *file_list ; + +/* Prototypes for local functions */ + +static FILE *PROTO(tfopen, (char *, char *)) ; +static void PROTO(efflush, (FILE*)) ; +static void PROTO(add_to_child_list, (int, int)) ; +static struct child *PROTO(remove_from_child_list, (int)) ; +extern int PROTO(isatty, (int)) ; +static void PROTO(close_error, (FILE_NODE *p)); + +/* find a file on file_list */ +PTR +file_find(sval, type) + STRING *sval ; + int type ; +{ + register FILE_NODE *p = file_list ; + FILE_NODE *q = (FILE_NODE *) 0 ; + char *name = sval->str ; + char *ostr ; + + while (1) + { + if (!p) + { + /* open a new one */ + p = ZMALLOC(FILE_NODE) ; + + switch (p->type = type) + { + case F_TRUNC: +#if MSDOS + ostr = (binmode() & 2) ? "wb" : "w" ; +#else + ostr = "w" ; +#endif + if (!(p->ptr = (PTR) tfopen(name, ostr))) + goto out_failure ; + break ; + + case F_APPEND: +#if MSDOS + ostr = (binmode() & 2) ? 
"ab" : "a" ; +#else + ostr = "a" ; +#endif + if (!(p->ptr = (PTR) tfopen(name, ostr))) + goto out_failure ; + break ; + + case F_IN: + if (!(p->ptr = (PTR) FINopen(name, 0))) + { + zfree(p, sizeof(FILE_NODE)) ; + return (PTR) 0 ; + } + break ; + + case PIPE_OUT: + case PIPE_IN: + +#if HAVE_REAL_PIPES || HAVE_FAKE_PIPES + + if (!(p->ptr = get_pipe(name, type, &p->pid))) + { + if (type == PIPE_OUT) goto out_failure ; + else + { + zfree(p, sizeof(FILE_NODE)) ; + return (PTR) 0 ; + } + } +#else + rt_error("pipes not supported") ; +#endif + break ; + +#ifdef DEBUG + default: + bozo("bad file type") ; +#endif + } + /* successful open */ + p->name = sval ; + sval->ref_cnt++ ; + break ; /* while loop */ + } + + /* search is by name and type */ + if (strcmp(name, p->name->str) == 0 && + (p->type == type || + /* no distinction between F_APPEND and F_TRUNC here */ + (p->type >= F_APPEND && type >= F_APPEND))) + + { + /* found */ + if (!q) /*at front of list */ + return p->ptr ; + /* delete from list for move to front */ + q->link = p->link ; + break ; /* while loop */ + } + + q = p ; p = p->link ; + } /* end while loop */ + + /* put p at the front of the list */ + p->link = file_list ; + return (PTR) (file_list = p)->ptr ; + +out_failure: + errmsg(errno, "cannot open \"%s\" for output", name) ; + mawk_exit(2) ; + +} + + +/* Close a file and delete it's node from the file_list. + Walk the whole list, in case a name has two nodes, + e.g. < "/dev/tty" and > "/dev/tty" +*/ + +int +file_close(sval) + STRING *sval ; +{ + FILE_NODE dummy ; + register FILE_NODE *p ; + FILE_NODE *q = &dummy ; /* trails p */ + FILE_NODE *hold ; + char *name = sval->str ; + int retval = -1 ; + + dummy.link = p = file_list ; + while (p) + { + if (strcmp(name, p->name->str) == 0) + { + /* found */ + + /* Remove it from the list first because we might be called + again if an error occurs leading to an infinite loop. 
+ + Note that we don't have to consider the list corruption + caused by a recursive call because it will never return. */ + + q->link = p->link ; + file_list = dummy.link ; /* maybe it was the first file */ + + switch (p->type) + { + case F_TRUNC: + case F_APPEND: + if( fclose((FILE *) p->ptr) != 0 ) + close_error(p) ; + retval = 0 ; + break ; + + case PIPE_OUT: + if( fclose((FILE *) p->ptr) != 0 ) + close_error(p) ; + +#if HAVE_REAL_PIPES + retval = wait_for(p->pid) ; +#endif +#if HAVE_FAKE_PIPES + retval = close_fake_outpipe(p->name->str, p->pid) ; +#endif + break ; + + case F_IN: + FINclose((FIN *) p->ptr) ; + retval = 0 ; + break ; + + case PIPE_IN: + FINclose((FIN *) p->ptr) ; + +#if HAVE_REAL_PIPES + retval = wait_for(p->pid) ; +#endif +#if HAVE_FAKE_PIPES + { + char xbuff[100] ; + unlink(tmp_file_name(p->pid, xbuff)) ; + retval = p->inpipe_exit ; + } +#endif + break ; + } + + free_STRING(p->name) ; + hold = p ; + p = p->link ; + ZFREE(hold) ; + } + else + { + q = p ; p = p->link ; + } + } + + return retval ; +} + +/* +find an output file with name == sval and fflush it +*/ + +int +file_flush(sval) + STRING *sval ; +{ + int ret = -1 ; + register FILE_NODE *p = file_list ; + unsigned len = sval->len ; + char *str = sval->str ; + + if (len==0) + { + /* for consistency with gawk */ + flush_all_output() ; + return 0 ; + } + + while( p ) + { + if ( IS_OUTPUT(p->type) && + len == p->name->len && + strcmp(str,p->name->str) == 0 ) + { + ret = 0 ; + efflush((FILE*)p->ptr) ; + /* it's possible for a command and a file to have the same + name -- so keep looking */ + } + p = p->link ; + } + return ret ; +} + +void +flush_all_output() +{ + FILE_NODE *p ; + + for(p=file_list; p ; p = p->link) + if (IS_OUTPUT(p->type)) efflush((FILE*)p->ptr) ; +} + +static void +efflush(fp) + FILE *fp ; +{ + if (fflush(fp) < 0) + { + errmsg(errno, "unexpected write error") ; + mawk_exit(2) ; + } +} + + +/* When we exit, we need to close and wait for all output pipes */ + +#if 
HAVE_REAL_PIPES + +/* work around for bug in AIX 4.1 -- If there are exactly 16 or + 32 or 48 ..., open files then the last one doesn't get flushed on + exit. So the following is now a misnomer as we'll really close + all output. +*/ + +void +close_out_pipes() +{ + register FILE_NODE *p = file_list ; + + while (p) + { + if (IS_OUTPUT(p->type)) + { + if( fclose((FILE *) p->ptr) != 0 ) + { + /* if another error occurs we do not want to be called + for the same file again */ + + file_list = p->link ; + close_error(p) ; + } + if (p->type == PIPE_OUT) wait_for(p->pid) ; + } + + p = p->link ; + } +} + +#else +#if HAVE_FAKE_PIPES /* pipes are faked with temp files */ + +void +close_fake_pipes() +{ + register FILE_NODE *p = file_list ; + char xbuff[100] ; + + /* close input pipes first to free descriptors for children */ + while (p) + { + if (p->type == PIPE_IN) + { + FINclose((FIN *) p->ptr) ; + unlink(tmp_file_name(p->pid, xbuff)) ; + } + p = p->link ; + } + /* doit again */ + p = file_list ; + while (p) + { + if (p->type == PIPE_OUT) + { + if( fclose(p->ptr) != 0 ) + close_error(p) ; + close_fake_outpipe(p->name->str, p->pid) ; + } + p = p->link ; + } +} +#endif /* HAVE_FAKE_PIPES */ +#endif /* ! 
HAVE_REAL_PIPES */ + +/* hardwire to /bin/sh for portability of programs */ +char *shell = "/bin/sh" ; + +#if HAVE_REAL_PIPES + +PTR +get_pipe(name, type, pid_ptr) + char *name ; + int type ; + int *pid_ptr ; +{ + int the_pipe[2], local_fd, remote_fd ; + + if (pipe(the_pipe) == -1) return (PTR) 0 ; + local_fd = the_pipe[type == PIPE_OUT] ; + remote_fd = the_pipe[type == PIPE_IN] ; + /* to keep output ordered correctly */ + fflush(stdout) ; fflush(stderr) ; + + switch (*pid_ptr = fork()) + { + case -1: + close(local_fd) ; + close(remote_fd) ; + return (PTR) 0 ; + + case 0: + close(local_fd) ; + close(type == PIPE_IN) ; + dup(remote_fd) ; + close(remote_fd) ; + execl(shell, shell, "-c", name, (char *) 0) ; + errmsg(errno, "failed to exec %s -c %s", shell, name) ; + fflush(stderr) ; + _exit(128) ; + + default: + close(remote_fd) ; + /* we could deadlock if future child inherit the local fd , + set close on exec flag */ + CLOSE_ON_EXEC(local_fd) ; + break ; + } + + return type == PIPE_IN ? (PTR) FINdopen(local_fd, 0) : + (PTR) fdopen(local_fd, "w") ; +} + + + +/*------------ children ------------------*/ + +/* we need to wait for children at the end of output pipes to + complete so we know any files they have created are complete */ + +/* dead children are kept on this list */ + +static struct child +{ + int pid ; + int exit_status ; + struct child *link ; +} *child_list ; + +static void +add_to_child_list(pid, exit_status) + int pid, exit_status ; +{ + register struct child *p = ZMALLOC(struct child) ; + + p->pid = pid ; p->exit_status = exit_status ; + p->link = child_list ; child_list = p ; +} + +static struct child * +remove_from_child_list(pid) + int pid ; +{ + struct child dummy ; + register struct child *p ; + struct child *q = &dummy ; + + dummy.link = p = child_list ; + while (p) + { + if (p->pid == pid) + { + q->link = p->link ; + break ; + } + else + { + q = p ; p = p->link ; + } + } + + child_list = dummy.link ; + return p ; + /* null return if not in the 
list */ +} + + +/* wait for a specific child to complete and return its + exit status + + If pid is zero, wait for any single child and + put it on the dead children list +*/ + +int +wait_for(pid) + int pid ; +{ + int exit_status ; + struct child *p ; + int id ; + + if (pid == 0) + { + id = wait(&exit_status) ; + add_to_child_list(id, exit_status) ; + } + /* see if an earlier wait() caught our child */ + else if ((p = remove_from_child_list(pid))) + { + exit_status = p->exit_status ; + ZFREE(p) ; + } + else + { + /* need to really wait */ + while ((id = wait(&exit_status)) != pid) + { + if (id == -1) /* can't happen */ + bozo("wait_for") ; + else + { + /* we got the exit status of another child + put it on the child list and try again */ + add_to_child_list(id, exit_status) ; + } + } + } + + if (exit_status & 0xff) exit_status = 128 + (exit_status & 0xff) ; + else exit_status = (exit_status & 0xff00) >> 8 ; + + return exit_status ; +} + +#endif /* HAVE_REAL_PIPES */ + + +void +set_stderr() /* and stdout */ +{ + FILE_NODE *p, *q ; + + /* We insert stderr first to get it at the end of the list. This is + needed because we want to output errors encountered on closing + stdout. 
*/ + + q = ZMALLOC(FILE_NODE); + q->link = (FILE_NODE*) 0 ; + q->type = F_TRUNC ; + q->name = new_STRING("/dev/stderr") ; + q->ptr = (PTR) stderr ; + + p = ZMALLOC(FILE_NODE) ; + p->link = q; + p->type = F_TRUNC ; + p->name = new_STRING("/dev/stdout") ; + p->ptr = (PTR) stdout ; + + file_list = p ; +} + +/* fopen() but no buffering to ttys */ +static FILE * +tfopen(name, mode) + char *name, *mode ; +{ + FILE *retval = fopen(name, mode) ; + + if (retval) + { + if (isatty(fileno(retval))) setbuf(retval, (char *) 0) ; + else + { +#ifdef MSDOS + enlarge_output_buffer(retval) ; +#endif + } + } + return retval ; +} + +#ifdef MSDOS +void +enlarge_output_buffer(fp) + FILE *fp ; +{ + if (setvbuf(fp, (char *) 0, _IOFBF, BUFFSZ) < 0) + { + errmsg(errno, "setvbuf failed on fileno %d", fileno(fp)) ; + mawk_exit(2) ; + } +} + +void +stdout_init() +{ + if (!isatty(1)) enlarge_output_buffer(stdout) ; + if (binmode() & 2) + { + setmode(1,O_BINARY) ; setmode(2,O_BINARY) ; + } +} +#endif /* MSDOS */ + +/* An error occurred closing the file referred to by P. We tell the + user and terminate the program. */ + +static void close_error(p) + FILE_NODE *p ; +{ + errmsg(errno, "close failed on file %s", p->name->str) ; + mawk_exit(2) ; +} diff --git a/files.h b/files.h new file mode 100644 index 0000000..7f7f50d --- /dev/null +++ b/files.h @@ -0,0 +1,67 @@ + +/******************************************** +files.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/*$Log: files.h,v $ + * Revision 1.3 1996/01/14 17:14:11 mike + * flush_all_output() + * + * Revision 1.2 1994/12/11 22:14:13 mike + * remove THINK_C #defines. Not a political statement, just no indication + * that anyone ever used it. 
+ * + * Revision 1.1.1.1 1993/07/03 18:58:13 mike + * move source to cvs + * + * Revision 5.2 1992/12/17 02:48:01 mike + * 1.1.2d changes for DOS + * + * Revision 5.1 1991/12/05 07:59:18 brennan + * 1.1 pre-release + * +*/ + +#ifndef FILES_H +#define FILES_H + +/* IO redirection types */ +#define F_IN (-5) +#define PIPE_IN (-4) +#define PIPE_OUT (-3) +#define F_APPEND (-2) +#define F_TRUNC (-1) +#define IS_OUTPUT(type) ((type)>=PIPE_OUT) + +extern char *shell ; /* for pipes and system() */ + +PTR PROTO(file_find, (STRING *, int)) ; +int PROTO(file_close, (STRING *)) ; +int PROTO(file_flush, (STRING *)) ; +void PROTO(flush_all_output, (void)) ; +PTR PROTO(get_pipe, (char *, int, int *) ) ; +int PROTO(wait_for, (int) ) ; +void PROTO( close_out_pipes, (void) ) ; + +#if HAVE_FAKE_PIPES +void PROTO(close_fake_pipes, (void)) ; +int PROTO(close_fake_outpipe, (char *,int)) ; +char *PROTO(tmp_file_name, (int, char*)) ; +#endif + +#if MSDOS +int PROTO(DOSexec, (char *)) ; +int PROTO(binmode, (void)) ; +void PROTO(set_binmode, (int)) ; +void PROTO(enlarge_output_buffer, (FILE*)) ; +#endif + + +#endif diff --git a/files.o b/files.o new file mode 100644 index 0000000..ae2726c Binary files /dev/null and b/files.o differ diff --git a/fin.c b/fin.c new file mode 100644 index 0000000..9cc166c --- /dev/null +++ b/fin.c @@ -0,0 +1,588 @@ + +/******************************************** +fin.c +copyright 1991, 1992. Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/*$Log: fin.c,v $ + * Revision 1.10 1995/12/24 22:23:22 mike + * remove errmsg() from inside FINopen + * + * Revision 1.9 1995/06/06 00:18:29 mike + * change mawk_exit(1) to mawk_exit(2) + * + * Revision 1.8 1994/12/13 00:26:35 mike + * rt_nr and rt_fnr for run-time error messages + * + * Revision 1.7 1994/12/11 23:25:05 mike + * -Wi option + * + * Revision 1.6 1994/12/11 22:14:15 mike + * remove THINK_C #defines. Not a political statement, just no indication + * that anyone ever used it. + * + * Revision 1.5 1994/10/08 19:15:42 mike + * remove SM_DOS + * + * Revision 1.4 1993/07/17 13:22:55 mike + * indent and general code cleanup + * + * Revision 1.3 1993/07/15 13:26:55 mike + * SIZE_T and indent + * + * Revision 1.2 1993/07/04 12:51:57 mike + * start on autoconfig changes + * + * Revision 1.1.1.1 1993/07/03 18:58:13 mike + * move source to cvs + * + * Revision 5.7 1993/01/01 21:30:48 mike + * split new_STRING() into new_STRING and new_STRING0 + * + * Revision 5.6 1992/12/17 02:48:01 mike + * 1.1.2d changes for DOS + * + * Revision 5.5 1992/07/28 15:11:30 brennan + * minor change in finding eol, needed for MsDOS + * + * Revision 5.4 1992/07/10 16:17:10 brennan + * MsDOS: remove NO_BINMODE macro + * + * Revision 5.3 1992/07/08 16:14:27 brennan + * FILENAME and FNR retain last values in the + * END block. + * + * Revision 5.2 1992/02/21 13:30:08 brennan + * fixed bug that free'd FILENAME twice if + * command line was var=value only + * + * Revision 5.1 91/12/05 07:56:02 brennan + * 1.1 pre-release + * +*/ + +/* fin.c */ + +#include "mawk.h" +#include "fin.h" +#include "memory.h" +#include "bi_vars.h" +#include "field.h" +#include "symtype.h" +#include "scan.h" + +#ifndef NO_FCNTL_H +#include +#endif + +/* This file handles input files. Opening, closing, + buffering and (most important) splitting files into + records, FINgets(). 
+*/ + +int PROTO(isatty, (int)) ; + +static FIN *PROTO(next_main, (int)) ; +static char *PROTO(enlarge_fin_buffer, (FIN *)) ; +static void PROTO(set_main_to_stdin, (void)) ; +int PROTO(is_cmdline_assign, (char *)) ; /* also used by init */ + +/* convert file-descriptor to FIN*. + It's the main stream if main_flag is set +*/ +FIN * +FINdopen(fd, main_flag) + int fd, main_flag ; +{ + register FIN *fin = ZMALLOC(FIN) ; + + fin->fd = fd ; + fin->flags = main_flag ? (MAIN_FLAG | START_FLAG) : START_FLAG ; + fin->buffp = fin->buff = (char *) zmalloc(BUFFSZ + 1) ; + fin->nbuffs = 1 ; + fin->buff[0] = 0 ; + + if ((isatty(fd) && rs_shadow.type == SEP_CHAR && rs_shadow.c == '\n') + || (interactive_flag && fd == 0) ) + { + /* interactive, i.e., line buffer this file */ + if (fd == 0) fin->fp = stdin ; + else if (!(fin->fp = fdopen(fd, "r"))) + { + errmsg(errno, "fdopen failed") ; mawk_exit(2) ; + } + } + else fin->fp = (FILE *) 0 ; + + return fin ; +} + +/* open a FIN* by filename. + It's the main stream if main_flag is set. + Recognizes "-" as stdin. 
+*/ + +FIN * +FINopen(filename, main_flag) + char *filename ; + int main_flag ; +{ + int fd ; + int oflag = O_RDONLY ; + +#if MSDOS + int bm = binmode() & 1 ; + if (bm) oflag |= O_BINARY ; +#endif + + if (filename[0] == '-' && filename[1] == 0) + { +#if MSDOS + if (bm) setmode(0, O_BINARY) ; +#endif + return FINdopen(0, main_flag) ; + } + + if ((fd = open(filename, oflag, 0)) == -1) + return (FIN *) 0 ; + else + return FINdopen(fd, main_flag) ; +} + +/* frees the buffer and fd, but leaves FIN structure until + the user calls close() */ + +void +FINsemi_close(fin) + register FIN *fin ; +{ + static char dead = 0 ; + + if (fin->buff != &dead) + { + zfree(fin->buff, fin->nbuffs * BUFFSZ + 1) ; + + if (fin->fd) + { + if (fin->fp) fclose(fin->fp) ; + else close(fin->fd) ; + } + + fin->buff = fin->buffp = &dead ; /* marks it semi_closed */ + } + /* else was already semi_closed */ +} + +/* user called close() on input file */ +void +FINclose(fin) + FIN *fin ; +{ + FINsemi_close(fin) ; + ZFREE(fin) ; +} + +/* return one input record as determined by RS, + from input file (FIN) fin +*/ + +char * +FINgets(fin, len_p) + FIN *fin ; + unsigned *len_p ; +{ + register char *p, *q ; + unsigned match_len ; + unsigned r ; + +restart : + + if (!(p = fin->buffp)[0]) /* need a refill */ + { + if (fin->flags & EOF_FLAG) + { + if (fin->flags & MAIN_FLAG) + { + fin = next_main(0) ; goto restart ; + } + else + { + *len_p = 0 ; return (char *) 0 ; + } + } + + if (fin->fp) + { + /* line buffering */ + if (!fgets(fin->buff, BUFFSZ + 1, fin->fp)) + { + fin->flags |= EOF_FLAG ; + fin->buff[0] = 0 ; + fin->buffp = fin->buff ; + goto restart ; /* might be main_fin */ + } + else /* return this line */ + { + /* find eol */ + p = fin->buff ; + while (*p != '\n' && *p != 0) p++ ; + + *p = 0 ; *len_p = p - fin->buff ; + fin->buffp = p ; + return fin->buff ; + } + } + else + { + /* block buffering */ + r = fillbuff(fin->fd, fin->buff, fin->nbuffs * BUFFSZ) ; + if (r == 0) + { + fin->flags |= EOF_FLAG ; 
+ fin->buffp = fin->buff ; + goto restart ; /* might be main */ + } + else if (r < fin->nbuffs * BUFFSZ) + { + fin->flags |= EOF_FLAG ; + } + + p = fin->buffp = fin->buff ; + + if (fin->flags & START_FLAG) + { + fin->flags &= ~START_FLAG ; + if (rs_shadow.type == SEP_MLR) + { + /* trim blank lines from front of file */ + while (*p == '\n') p++ ; + fin->buffp = p ; + if (*p == 0) goto restart ; + } + } + } + } + +retry: + + switch (rs_shadow.type) + { + case SEP_CHAR: + q = strchr(p, rs_shadow.c) ; + match_len = 1 ; + break ; + + case SEP_STR: + q = str_str(p, ((STRING *) rs_shadow.ptr)->str, + match_len = ((STRING *) rs_shadow.ptr)->len) ; + break ; + + case SEP_MLR: + case SEP_RE: + q = re_pos_match(p, rs_shadow.ptr, &match_len) ; + /* if the match is at the end, there might still be + more to match in the file */ + if (q && q[match_len] == 0 && !(fin->flags & EOF_FLAG)) + q = (char *) 0 ; + break ; + + default: + bozo("type of rs_shadow") ; + } + + if (q) + { + /* the easy and normal case */ + *q = 0 ; *len_p = q - p ; + fin->buffp = q + match_len ; + return p ; + } + + if (fin->flags & EOF_FLAG) + { + /* last line without a record terminator */ + *len_p = r = strlen(p) ; fin->buffp = p+r ; + + if (rs_shadow.type == SEP_MLR && fin->buffp[-1] == '\n' + && r != 0) + { + (*len_p)-- ; + *--fin->buffp = 0 ; + } + return p ; + } + + if (p == fin->buff) + { + /* current record is too big for the input buffer, grow buffer */ + p = enlarge_fin_buffer(fin) ; + } + else + { + /* move a partial line to front of buffer and try again */ + unsigned rr ; + + p = (char *) memcpy(fin->buff, p, r = strlen(p)) ; + q = p+r ; rr = fin->nbuffs*BUFFSZ - r ; + + if ((r = fillbuff(fin->fd, q, rr)) < rr) + fin->flags |= EOF_FLAG ; + } + goto retry ; +} + +static char * +enlarge_fin_buffer(fin) + FIN *fin ; +{ + unsigned r ; + unsigned oldsize = fin->nbuffs * BUFFSZ + 1 ; + +#ifdef MSDOS + /* I'm not sure this can really happen: + avoid "16bit wrap" */ + if (fin->nbuffs == MAX_BUFFS) + { + 
errmsg(0, "out of input buffer space") ; + mawk_exit(2) ; + } +#endif + + fin->buffp = + fin->buff = (char *) zrealloc(fin->buff, oldsize, oldsize + BUFFSZ) ; + fin->nbuffs++ ; + + r = fillbuff(fin->fd, fin->buff + (oldsize - 1), BUFFSZ) ; + if (r < BUFFSZ) fin->flags |= EOF_FLAG ; + + return fin->buff ; +} + +/*-------- + target is big enough to hold size + 1 chars + on exit the back of the target is zero terminated + *--------------*/ +unsigned +fillbuff(fd, target, size) + int fd ; + register char *target ; + unsigned size ; +{ + register int r ; + unsigned entry_size = size ; + + while (size) + switch (r = read(fd, target, size)) + { + case -1: + errmsg(errno, "read error") ; + mawk_exit(2) ; + + case 0: + goto out ; + + default: + target += r ; size -= r ; + break ; + } + +out : + *target = 0 ; + return entry_size - size ; +} + +/* main_fin is a handle to the main input stream + == 0 never been opened */ + +FIN *main_fin ; +ARRAY Argv ; /* to the user this is ARGV */ +static double argi = 1.0 ; /* index of next ARGV[argi] to try to open */ + + +static void +set_main_to_stdin() +{ + cell_destroy(FILENAME) ; + FILENAME->type = C_STRING ; + FILENAME->ptr = (PTR) new_STRING("-") ; + cell_destroy(FNR) ; + FNR->type = C_DOUBLE ; + FNR->dval = 0.0 ; rt_fnr = 0 ; + main_fin = FINdopen(0, 1) ; +} + + +/* this gets called once to get the input stream going. 
+ It is called after the execution of the BEGIN block + unless there is a getline inside BEGIN {} +*/ +void +open_main() +{ + CELL argc ; + +#if MSDOS + int k = binmode() ; + + if (k & 1) setmode(0, O_BINARY) ; + if ( k & 2 ) { setmode(1,O_BINARY) ; setmode(2,O_BINARY) ; } +#endif + + cellcpy(&argc, ARGC) ; + if (argc.type != C_DOUBLE) cast1_to_d(&argc) ; + + if (argc.dval == 1.0) set_main_to_stdin() ; + else next_main(1) ; +} + +/* get the next command line file open */ +static FIN * +next_main(open_flag) + int open_flag ; /* called by open_main() if on */ +{ + register CELL *cp ; + CELL argc ; /* copy of ARGC */ + CELL c_argi ; /* cell copy of argi */ + CELL argval ; /* copy of ARGV[c_argi] */ + + + argval.type = C_NOINIT ; + c_argi.type = C_DOUBLE ; + + if (main_fin) FINclose(main_fin) ; + /* FILENAME and FNR don't change unless we really open + a new file */ + + /* make a copy of ARGC to avoid side effect */ + if (cellcpy(&argc, ARGC)->type != C_DOUBLE) cast1_to_d(&argc) ; + + while (argi < argc.dval) + { + c_argi.dval = argi ; + argi += 1.0 ; + + if (!(cp = array_find(Argv, &c_argi, NO_CREATE))) + continue ; /* it's deleted */ + + /* make a copy so we can cast w/o side effect */ + cell_destroy(&argval) ; + cp = cellcpy(&argval, cp) ; + if (cp->type < C_STRING) cast1_to_s(cp) ; + if (string(cp)->len == 0) continue ; + /* file argument is "" */ + + /* it might be a command line assignment */ + if (is_cmdline_assign(string(cp)->str)) continue ; + + /* try to open it -- we used to continue on failure, + but posix says we should quit */ + if (!(main_fin = FINopen(string(cp)->str, 1))) + { + errmsg(errno, "cannot open %s", string(cp)->str) ; + mawk_exit(2) ; + } + + /* success -- set FILENAME and FNR */ + cell_destroy(FILENAME) ; + cellcpy(FILENAME, cp) ; + free_STRING(string(cp)) ; + cell_destroy(FNR) ; + FNR->type = C_DOUBLE ; + FNR->dval = 0.0 ; rt_fnr = 0 ; + + return main_fin ; + } + /* failure */ + cell_destroy(&argval) ; + + if (open_flag) + { + /* all
arguments were null or assignment */ + set_main_to_stdin() ; + return main_fin ; + } + + /* real failure */ + { + /* this is how we mark EOF on main_fin */ + static char dead_buff = 0 ; + static FIN dead_main = + {0, (FILE *) 0, &dead_buff, &dead_buff, + 1, EOF_FLAG} ; + + return main_fin = &dead_main ; + /* since MAIN_FLAG is not set, FINgets won't call next_main() */ + } +} + + +int +is_cmdline_assign(s) + char *s ; +{ + register char *p ; + int c ; + SYMTAB *stp ; + CELL *cp ; + unsigned len ; + CELL cell ; /* used if command line assign to pseudo field */ + CELL *fp = (CELL *) 0 ; /* ditto */ + + if (scan_code[*(unsigned char *) s] != SC_IDCHAR) return 0 ; + + p = s + 1 ; + while ((c = scan_code[*(unsigned char *) p]) == SC_IDCHAR + || c == SC_DIGIT) + p++ ; + + if (*p != '=') return 0 ; + + *p = 0 ; + stp = find(s) ; + + switch (stp->type) + { + case ST_NONE: + stp->type = ST_VAR ; + stp->stval.cp = cp = ZMALLOC(CELL) ; + break ; + + case ST_VAR: + case ST_NR: /* !! no one will do this */ + cp = stp->stval.cp ; + cell_destroy(cp) ; + break ; + + case ST_FIELD: + /* must be pseudo field */ + fp = stp->stval.cp ; + cp = &cell ; + break ; + + default: + rt_error( + "cannot command line assign to %s\n\ttype clash or keyword" + ,s) ; + } + + /* we need to keep ARGV[i] intact */ + *p++ = '=' ; + len = strlen(p) + 1 ; + /* posix says escape sequences are on from command line */ + p = rm_escape(strcpy((char *) zmalloc(len), p)) ; + cp->ptr = (PTR) new_STRING(p) ; + zfree(p, len) ; + check_strnum(cp) ; /* sets cp->type */ + if (fp) /* move it from cell to pfield[] */ + { + field_assign(fp, cp) ; + free_STRING(string(cp)) ; + } + return 1 ; +} diff --git a/fin.h b/fin.h new file mode 100644 index 0000000..c47fd2f --- /dev/null +++ b/fin.h @@ -0,0 +1,56 @@ + +/******************************************** +fin.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. 
+ +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/*$Log: fin.h,v $ + * Revision 1.1.1.1 1993/07/03 18:58:13 mike + * move source to cvs + * + * Revision 5.2 1992/01/06 08:16:24 brennan + * setmode() proto for MSDOS + * + * Revision 5.1 91/12/05 07:59:20 brennan + * 1.1 pre-release + * +*/ + +/* fin.h */ + +#ifndef FIN_H +#define FIN_H +/* structure to control input files */ + +typedef struct { +int fd ; +FILE *fp ; /* NULL unless interactive */ +char *buff ; +char *buffp ; +unsigned nbuffs ; /* sizeof *buff in BUFFSZs */ +int flags ; +} FIN ; + +#define MAIN_FLAG 1 /* part of main input stream if on */ +#define EOF_FLAG 2 +#define START_FLAG 4 /* used when RS == "" */ + +FIN * PROTO (FINdopen, (int, int) ); +FIN * PROTO (FINopen, (char *, int) ); +void PROTO (FINclose, (FIN *) ) ; +void PROTO (FINsemi_close, (FIN *)) ; +char* PROTO (FINgets, (FIN *, unsigned *) ) ; +unsigned PROTO ( fillbuff, (int, char *, unsigned) ) ; + + +extern FIN *main_fin ; /* for the main input stream */ +void PROTO( open_main, (void) ) ; + +void PROTO(setmode, (int,int)) ; +#endif /* FIN_H */ diff --git a/fin.o b/fin.o new file mode 100644 index 0000000..d868b74 Binary files /dev/null and b/fin.o differ diff --git a/fpe_check.c b/fpe_check.c new file mode 100644 index 0000000..247cebd --- /dev/null +++ b/fpe_check.c @@ -0,0 +1,263 @@ + +/* This code attempts to figure out what the default + floating point exception handling does. +*/ + +/* $Log: fpe_check.c,v $ + * Revision 1.7 1996/08/30 00:07:14 mike + * Modifications to the test and implementation of the bug fix for + * solaris overflow in strtod. + * + * Revision 1.6 1996/08/25 19:25:46 mike + * Added test for solaris strtod overflow bug. + * + * Revision 1.5 1996/08/11 22:10:39 mike + * Some systems blow the !(d==d) test for a NAN. Added a work around. 
+ * + * Revision 1.4 1995/01/09 01:22:28 mike + * check sig handler ret type to make fpe_check.c more robust + * + * Revision 1.3 1994/12/18 20:54:00 mike + * check NetBSD mathlib defines + * + * Revision 1.2 1994/12/14 14:37:26 mike + * add messages to user + * +*/ + +#include +#include +#include + +/* Sets up NetBSD 1.0A for ieee floating point */ +#if defined(_LIB_VERSION_TYPE) && defined(_LIB_VERSION) && defined(_IEEE_) +_LIB_VERSION_TYPE _LIB_VERSION = _IEEE_; +#endif + +void message(s) + char *s ; +{ + printf("\t%s\n", s) ; +} + +jmp_buf jbuff ; +int may_be_safe_to_look_at_why = 0 ; +int why_v ; +int checking_for_strtod_ovf_bug = 0 ; + +RETSIGTYPE fpe_catch() ; +int is_nan() ; +void check_strtod_ovf() ; +double strtod() ; + +double +div_by(x,y) + double x ; + double y ; +{ + return x/y ; +} + +double overflow(x) + double x ; +{ + double y ; + + do + { + y = x ; + x *= x ; + } while( y != x ) ; + return x ; +} + + +void check_fpe_traps() +{ + int traps = 0 ; + + if (setjmp(jbuff) == 0) + { + div_by(44.0, 0.0) ; + message("division by zero does not generate an exception") ; + } + else + { + traps = 1 ; + message("division by zero generates an exception") ; + signal(SIGFPE, fpe_catch) ; /* set again if sysV */ + } + + if ( setjmp(jbuff) == 0 ) + { + overflow(1000.0) ; + message("overflow does not generate an exception") ; + } + else + { + traps |= 2 ; + message("overflow generates an exception") ; + signal(SIGFPE, fpe_catch) ; + } + + if ( traps == 0 ) + { + double maybe_nan = log(-8.0) ; + + if (is_nan(maybe_nan)) + { + message("math library supports ieee754") ; + } + else + { + traps |= 4 ; + message("math library does not support ieee754") ; + } + } + + exit(traps) ; +} + +int is_nan(d) + double d ; +{ + char command[128] ; + + if (!(d==d)) return 1 ; + + /* on some systems with an ieee754 bug, we need to make another check */ + sprintf(command, + "echo '%f' | egrep '[nN][aA][nN]|\\?' 
>/dev/null", d) ; + return system(command)==0 ; +} + +/* +Only get here if we think we have Berkeley type signals so we can +look at a second argument to fpe_catch() to get the reason for +an exception +*/ +void +get_fpe_codes() +{ + int divz ; + int ovf ; + + may_be_safe_to_look_at_why = 1 ; + + if( setjmp(jbuff) == 0 ) div_by(1000.0, 0.0) ; + else + { + divz = why_v ; + signal(SIGFPE, fpe_catch) ; + } + + if( setjmp(jbuff) == 0 ) overflow(1000.0) ; + else + { + ovf = why_v ; + signal(SIGFPE, fpe_catch) ; + } + + + /* make some guesses if sane values */ + if ( divz>0 && ovf>0 && divz != ovf ) + { + printf("X FPE_ZERODIVIDE %d\n", divz) ; + printf("X FPE_OVERFLOW %d\n", ovf) ; + exit(0) ; + } + else exit(1) ; +} + +int +main(argc) + int argc ; +{ + + signal(SIGFPE, fpe_catch) ; + switch(argc) { + case 1 : + check_fpe_traps() ; + break ; + case 2 : + get_fpe_codes() ; + break ; + default: + check_strtod_ovf() ; + break ; + } + /* not reached */ + return 0 ; +} + +/* put this down here in attempt to defeat ambitious compiler that + may have seen a prototype without 2nd argument */ + +RETSIGTYPE fpe_catch(signal, why) + int signal ; + int why ; +{ + if (checking_for_strtod_ovf_bug) exit(1) ; + if ( may_be_safe_to_look_at_why ) why_v = why ; + longjmp(jbuff,1) ; +} + +char longstr[] = +"1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890\ +1234567890" ; + +#ifdef USE_IEEEFP_H +#include +#endif + +void +check_strtod_ovf() +{ + double x ; + +#ifdef USE_IEEEFP_H + fpsetmask(fpgetmask()|FP_X_OFL|FP_X_DZ) ; +#endif + + 
checking_for_strtod_ovf_bug = 1 ; + strtod(longstr,(char**)0) ; + exit(0) ; +} diff --git a/hash.c b/hash.c new file mode 100644 index 0000000..ef32c8f --- /dev/null +++ b/hash.c @@ -0,0 +1,258 @@ + +/******************************************** +hash.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + + +/* $Log: hash.c,v $ + * Revision 1.3 1994/10/08 19:15:43 mike + * remove SM_DOS + * + * Revision 1.2 1993/07/16 00:17:35 mike + * cleanup + * + * Revision 1.1.1.1 1993/07/03 18:58:14 mike + * move source to cvs + * + * Revision 5.1 1991/12/05 07:56:05 brennan + * 1.1 pre-release + * +*/ + + +/* hash.c */ + +#include "mawk.h" +#include "memory.h" +#include "symtype.h" + + +unsigned +hash(s) + register char *s ; +{ + register unsigned h = 0 ; + + while (*s) h += h + *s++ ; + return h ; +} + +typedef struct hash +{ + struct hash *link ; + SYMTAB symtab ; +} HASHNODE ; + +static HASHNODE *PROTO(delete, (char *)) ; + +static HASHNODE *hash_table[HASH_PRIME] ; + +/* +insert a string in the symbol table. 
+Caller knows the symbol is not there +-- used for initialization +*/ + +SYMTAB * +insert(s) + char *s ; +{ + register HASHNODE *p = ZMALLOC(HASHNODE) ; + register unsigned h ; + + p->link = hash_table[h = hash(s) % HASH_PRIME] ; + p->symtab.name = s ; + hash_table[h] = p ; + return &p->symtab ; +} + +/* Find s in the symbol table, + if not there insert it, s must be dup'ed */ + +SYMTAB * +find(s) + char *s ; +{ + register HASHNODE *p ; + HASHNODE *q ; + unsigned h ; + + p = hash_table[h = hash(s) % HASH_PRIME] ; + q = (HASHNODE *) 0 ; + while (1) + { + if (!p) + { + p = ZMALLOC(HASHNODE) ; + p->symtab.type = ST_NONE ; + p->symtab.name = strcpy(zmalloc(strlen(s) + 1), s) ; + break ; + } + + if (strcmp(p->symtab.name, s) == 0) /* found */ + { + if (!q) /* already at the front */ + return &p->symtab ; + else /* delete from the list */ + { + q->link = p->link ; break ; + } + } + + q = p ; p = p->link ; + } + /* put p on front of the list */ + p->link = hash_table[h] ; + hash_table[h] = p ; + return &p->symtab ; +} + + +/* remove a node from the hash table + return a ptr to the node */ + +static unsigned last_hash ; + +static HASHNODE * +delete(s) + char *s ; +{ + register HASHNODE *p ; + HASHNODE *q = (HASHNODE *) 0 ; + unsigned h ; + + p = hash_table[last_hash = h = hash(s) % HASH_PRIME] ; + while (p) + { + if (strcmp(p->symtab.name, s) == 0) /* found */ + { + if (q) q->link = p->link ; + else hash_table[h] = p->link ; + return p ; + } + else + { + q = p ; p = p->link ; + } + } + +#ifdef DEBUG /* we should not ever get here */ + bozo("delete") ; +#endif + return (HASHNODE *) 0 ; +} + +/* when processing user functions, global ids which are + replaced by local ids are saved on this list */ + +static HASHNODE *save_list ; + +/* store a global id on the save list, + return a ptr to the local symtab */ +SYMTAB * +save_id(s) + char *s ; +{ + HASHNODE *p, *q ; + unsigned h ; + + p = delete(s) ; + q = ZMALLOC(HASHNODE) ; + q->symtab.type = ST_LOCAL_NONE ; + q->symtab.name = 
p->symtab.name ; + /* put q in the hash table */ + q->link = hash_table[h = last_hash] ; + hash_table[h] = q ; + + /* save p */ + p->link = save_list ; save_list = p ; + + return &q->symtab ; +} + +/* restore all global identifiers */ +void +restore_ids() +{ + register HASHNODE *p, *q ; + register unsigned h ; + + q = save_list ; save_list = (HASHNODE *) 0 ; + while (q) + { + p = q ; q = q->link ; + zfree(delete(p->symtab.name), sizeof(HASHNODE)) ; + p->link = hash_table[h = last_hash] ; + hash_table[h] = p ; + } +} + + +/* search the symbol table backwards for the + disassembler. This is slow -- so what +*/ + + +char * +reverse_find(type, ptr) + int type ; + PTR ptr ; +{ + CELL *cp ; + ARRAY array ; + static char uk[] = "unknown" ; + + int i ; + HASHNODE *p ; + + + switch (type) + { + case ST_VAR: + case ST_FIELD: + cp = *(CELL **) ptr ; + break ; + + case ST_ARRAY: + array = *(ARRAY *) ptr ; + break ; + + default: + return uk ; + } + + for (i = 0; i < HASH_PRIME; i++) + { + p = hash_table[i] ; + while (p) + { + if (p->symtab.type == type) + { + switch (type) + { + case ST_VAR: + case ST_FIELD: + if (cp == p->symtab.stval.cp) + return p->symtab.name ; + break ; + + case ST_ARRAY: + if (array == p->symtab.stval.array) + return p->symtab.name ; + break ; + } + } + + p = p->link ; + } + } + return uk ; +} + diff --git a/hash.o b/hash.o new file mode 100644 index 0000000..c3e5761 Binary files /dev/null and b/hash.o differ diff --git a/init.c b/init.c new file mode 100644 index 0000000..af3e3f4 --- /dev/null +++ b/init.c @@ -0,0 +1,389 @@ + +/******************************************** +init.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + + +/* $Log: init.c,v $ + * Revision 1.11 1995/08/20 17:35:21 mike + * include for MSC, needed for environ decl + * + * Revision 1.10 1995/06/09 22:51:50 mike + * silently exit(0) if no program + * + * Revision 1.9 1995/06/06 00:18:30 mike + * change mawk_exit(1) to mawk_exit(2) + * + * Revision 1.8 1994/12/14 14:40:34 mike + * -Wi option + * + * Revision 1.7 1994/12/11 22:43:20 mike + * don't assume **environ is writable + * + * Revision 1.6 1994/12/11 22:14:16 mike + * remove THINK_C #defines. Not a political statement, just no indication + * that anyone ever used it. + * + * Revision 1.5 1994/10/08 19:15:45 mike + * remove SM_DOS + * + * Revision 1.4 1994/03/11 02:23:49 mike + * -We option + * + * Revision 1.3 1993/07/17 00:45:14 mike + * indent + * + * Revision 1.2 1993/07/04 12:52:00 mike + * start on autoconfig changes + * + * Revision 5.5 1993/01/07 02:50:33 mike + * relative vs absolute code + * + * Revision 5.4 1992/12/24 01:58:19 mike + * 1.1.2d changes for MsDOS + * + * Revision 5.3 1992/07/10 16:17:10 brennan + * MsDOS: remove NO_BINMODE macro + * + * Revision 5.2 1992/01/09 08:46:14 brennan + * small change for MSC + * + * Revision 5.1 91/12/05 07:56:07 brennan + * 1.1 pre-release + * +*/ + + +/* init.c */ +#include "mawk.h" +#include "code.h" +#include "memory.h" +#include "symtype.h" +#include "init.h" +#include "bi_vars.h" +#include "field.h" +#include + +#ifdef MSDOS +#include +#ifdef MSDOS_MSC +#include +#endif +#endif + +static void PROTO(process_cmdline, (int, char **)) ; +static void PROTO(set_ARGV, (int, char **, int)) ; +static void PROTO(bad_option, (char *)) ; +static void PROTO(no_program, (void)) ; + +extern void PROTO(print_version, (void)) ; +extern int PROTO(is_cmdline_assign, (char *)) ; + +#if MSDOS +void PROTO(stdout_init, (void)) ; +#if HAVE_REARGV +void PROTO(reargv, (int *, char ***)) ; +#endif +#endif + +char *progname ; +short interactive_flag = 0 ; + +#ifndef SET_PROGNAME 
+#define SET_PROGNAME() \ + {char *p = strrchr(argv[0],'/') ;\ + progname = p ? p+1 : argv[0] ; } +#endif + +void +initialize(argc, argv) +int argc ; char **argv ; +{ + + SET_PROGNAME() ; + + bi_vars_init() ; /* load the builtin variables */ + bi_funct_init() ; /* load the builtin functions */ + kw_init() ; /* load the keywords */ + field_init() ; + +#if MSDOS + { + char *p = getenv("MAWKBINMODE") ; + + if (p) set_binmode(atoi(p)) ; + } +#endif + + + process_cmdline(argc, argv) ; + + code_init() ; + fpe_init() ; + set_stderr() ; + +#if MSDOS + stdout_init() ; +#endif +} + +int dump_code_flag ; /* if on dump internal code */ +short posix_space_flag ; + +#ifdef DEBUG +int dump_RE ; /* if on dump compiled REs */ +#endif + + +static void +bad_option(s) + char *s ; +{ + errmsg(0, "not an option: %s", s) ; mawk_exit(2) ; +} + +static void +no_program() +{ + mawk_exit(0) ; +} + +static void +process_cmdline(argc, argv) + int argc ; + char **argv ; +{ + int i, nextarg ; + char *optarg ; + PFILE dummy ; /* starts linked list of filenames */ + PFILE *tail = &dummy ; + + for (i = 1; i < argc && argv[i][0] == '-'; i = nextarg) + { + if (argv[i][1] == 0) /* - alone */ + { + if (!pfile_name) no_program() ; + break ; /* the for loop */ + } + /* safe to look at argv[i][2] */ + + if (argv[i][2] == 0) + { + if (i == argc - 1 && argv[i][1] != '-') + { + if (strchr("WFvf", argv[i][1])) + { + errmsg(0, "option %s lacks argument", argv[i]) ; + mawk_exit(2) ; + } + bad_option(argv[i]) ; + } + + optarg = argv[i + 1] ; + nextarg = i + 2 ; + } + else /* argument glued to option */ + { + optarg = &argv[i][2] ; + nextarg = i + 1 ; + } + + switch (argv[i][1]) + { + case 'W': + + if (optarg[0] >= 'a' && optarg[0] <= 'z') + optarg[0] += 'A' - 'a' ; + if (optarg[0] == 'V') print_version() ; + else if (optarg[0] == 'D') + { + dump_code_flag = 1 ; + } + else if (optarg[0] == 'S') + { + char *p = strchr(optarg, '=') ; + int x = p ? 
atoi(p + 1) : 0 ; + + if (x > SPRINTF_SZ) + { + sprintf_buff = (char *) zmalloc(x) ; + sprintf_limit = sprintf_buff + x ; + } + } +#if MSDOS + else if (optarg[0] == 'B') + { + char *p = strchr(optarg, '=') ; + int x = p ? atoi(p + 1) : 0 ; + + set_binmode(x) ; + } +#endif + else if (optarg[0] == 'P') + { + posix_space_flag = 1 ; + } + else if (optarg[0] == 'E') + { + if ( pfile_name ) + { + errmsg(0, "-W exec is incompatible with -f") ; + mawk_exit(2) ; + } + else if ( nextarg == argc ) no_program() ; + + pfile_name = argv[nextarg] ; + i = nextarg + 1 ; + goto no_more_opts ; + } + else if (optarg[0] == 'I') + { + interactive_flag = 1 ; + setbuf(stdout,(char*)0) ; + } + else errmsg(0, "vacuous option: -W %s", optarg) ; + + + break ; + + case 'v': + if (!is_cmdline_assign(optarg)) + { + errmsg(0, "improper assignment: -v %s", optarg) ; + mawk_exit(2) ; + } + break ; + + case 'F': + + rm_escape(optarg) ; /* recognize escape sequences */ + cell_destroy(FS) ; + FS->type = C_STRING ; + FS->ptr = (PTR) new_STRING(optarg) ; + cast_for_split(cellcpy(&fs_shadow, FS)) ; + break ; + + case '-': + if (argv[i][2] != 0) bad_option(argv[i]) ; + i++ ; + goto no_more_opts ; + + case 'f': + /* first file goes in pfile_name ; any more go + on a list */ + if (!pfile_name) pfile_name = optarg ; + else + { + tail = tail->link = ZMALLOC(PFILE) ; + tail->fname = optarg ; + } + break ; + + default: + bad_option(argv[i]) ; + } + } + + no_more_opts: + + tail->link = (PFILE *) 0 ; + pfile_list = dummy.link ; + + if (pfile_name) + { + set_ARGV(argc, argv, i) ; + scan_init((char *) 0) ; + } + else /* program on command line */ + { + if (i == argc) no_program() ; + set_ARGV(argc, argv, i + 1) ; + +#if MSDOS && ! 
HAVE_REARGV /* reversed quotes */ + { + char *p ; + + for (p = argv[i]; *p; p++) + if (*p == '\'') *p = '\"' ; + } +#endif + scan_init(argv[i]) ; +/* #endif */ + } +} + + +static void +set_ARGV(argc, argv, i) +int argc ; char **argv ; + int i ; /* argv[i] = ARGV[i] */ +{ + SYMTAB *st_p ; + CELL argi ; + register CELL *cp ; + + st_p = insert("ARGV") ; + st_p->type = ST_ARRAY ; + Argv = st_p->stval.array = new_ARRAY() ; + argi.type = C_DOUBLE ; + argi.dval = 0.0 ; + cp = array_find(st_p->stval.array, &argi, CREATE) ; + cp->type = C_STRING ; + cp->ptr = (PTR) new_STRING(progname) ; + + /* ARGV[0] is set, do the rest + The type of ARGV[1] ... should be C_MBSTRN + because the user might enter numbers from the command line */ + + for (argi.dval = 1.0; i < argc; i++, argi.dval += 1.0) + { + cp = array_find(st_p->stval.array, &argi, CREATE) ; + cp->type = C_MBSTRN ; + cp->ptr = (PTR) new_STRING(argv[i]) ; + } + ARGC->type = C_DOUBLE ; + ARGC->dval = argi.dval ; +} + + +/*----- ENVIRON ----------*/ + +void +load_environ(ENV) + ARRAY ENV ; +{ + CELL c ; +#ifndef MSDOS_MSC /* MSC declares it near */ + extern char **environ ; +#endif + register char **p = environ ; /* walks environ */ + char *s ; /* looks for the '=' */ + CELL *cp ; /* pts at ENV[&c] */ + + c.type = C_STRING ; + + while (*p) + { + if ((s = strchr(*p, '='))) /* shouldn't fail */ + { + int len = s - *p ; + c.ptr = (PTR) new_STRING0(len) ; + memcpy(string(&c)->str, *p, len) ; + s++ ; + + cp = array_find(ENV, &c, CREATE) ; + cp->type = C_MBSTRN ; + cp->ptr = (PTR) new_STRING(s) ; + + free_STRING(string(&c)) ; + } + p++ ; + } +} diff --git a/init.h b/init.h new file mode 100644 index 0000000..9ac4794 --- /dev/null +++ b/init.h @@ -0,0 +1,60 @@ + +/******************************************** +init.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. 
+ +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/* $Log: init.h,v $ + * Revision 1.2 1995/06/18 19:42:18 mike + * Remove some redundant declarations and add some prototypes + * + * Revision 1.1.1.1 1993/07/03 18:58:14 mike + * move source to cvs + * + * Revision 5.1 1991/12/05 07:59:22 brennan + * 1.1 pre-release + * +*/ + +/* init.h */ + + +#ifndef INIT_H +#define INIT_H + +#include "symtype.h" + +/* nodes to link file names for multiple + -f option */ + +typedef struct pfile { +struct pfile *link ; +char *fname ; +} PFILE ; + +extern PFILE *pfile_list ; + +extern char *sprintf_buff, *sprintf_limit ; + + +void PROTO( initialize, (int, char **) ) ; +void PROTO( code_init, (void) ) ; +void PROTO( code_cleanup, (void) ) ; +void PROTO( compile_cleanup, (void) ) ; +void PROTO(scan_init, ( char *) ) ; +void PROTO(bi_vars_init, (void) ) ; +void PROTO(bi_funct_init, (void) ) ; +void PROTO(print_init, (void) ) ; +void PROTO(kw_init, (void) ) ; +void PROTO( field_init, (void) ) ; +void PROTO( fpe_init, (void) ) ; +void PROTO( load_environ, (ARRAY)) ; +void PROTO( set_stderr, (void)) ; + +#endif /* INIT_H */ diff --git a/init.o b/init.o new file mode 100644 index 0000000..a536f82 Binary files /dev/null and b/init.o differ diff --git a/jmp.c b/jmp.c new file mode 100644 index 0000000..9182fc9 --- /dev/null +++ b/jmp.c @@ -0,0 +1,283 @@ + +/******************************************** +jmp.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/* $Log: jmp.c,v $ + * Revision 1.4 1995/06/18 19:42:19 mike + * Remove some redundant declarations and add some prototypes + * + * Revision 1.3 1995/04/21 14:20:16 mike + * move_level variable to fix bug in arglist patching of moved code. + * + * Revision 1.2 1993/07/14 13:17:49 mike + * rm SIZE_T and run thru indent + * + * Revision 1.1.1.1 1993/07/03 18:58:15 mike + * move source to cvs + * + * Revision 5.3 1993/01/09 19:03:44 mike + * code_pop checks if the resolve_list needs relocation + * + * Revision 5.2 1993/01/07 02:50:33 mike + * relative vs absolute code + * + * Revision 5.1 1991/12/05 07:56:10 brennan + * 1.1 pre-release + * +*/ + +/* this module deals with back patching jumps, breaks and continues, + and with save and restoring code when we move code. + There are three stacks. If we encounter a compile error, the + stacks are frozen, i.e., we do not attempt error recovery + on the stacks +*/ + + +#include "mawk.h" +#include "symtype.h" +#include "jmp.h" +#include "code.h" +#include "sizes.h" +#include "init.h" +#include "memory.h" + +#define error_state (compile_error_count>0) + + +/*---------- back patching jumps ---------------*/ + +typedef struct jmp +{ + struct jmp *link ; + int source_offset ; +} +JMP ; + +static JMP *jmp_top ; + +void +code_jmp(jtype, target) + int jtype ; + INST *target ; +{ + if (error_state) return ; + + /* WARNING: Don't emit any code before using target or + relocation might make it invalid */ + + if (target) code2op(jtype, target - (code_ptr + 1)) ; + else + { + register JMP *p = ZMALLOC(JMP) ; + + /* stack for back patch */ + code2op(jtype, 0) ; + p->source_offset = code_offset - 1 ; + p->link = jmp_top ; + jmp_top = p ; + } +} + +void +patch_jmp(target) /* patch a jump on the jmp_stack */ + INST *target ; +{ + register JMP *p ; + register INST *source ; /* jmp starts here */ + + if (!error_state) + { +#ifdef DEBUG + if (!jmp_top) bozo("jmp stack underflow") ; +#endif + + p 
= jmp_top ; jmp_top = p->link ; + source = p->source_offset + code_base ; + source->op = target - source ; + + ZFREE(p) ; + } +} + + +/*-- break and continue -------*/ + +typedef struct bc +{ + struct bc *link ; /* stack as linked list */ + int type ; /* 'B' or 'C' or mark start with 0 */ + int source_offset ; /* position of _JMP */ +} +BC ; + +static BC *bc_top ; + + + +void +BC_new() /* mark the start of a loop */ +{ + BC_insert(0, (INST *) 0) ; +} + +void +BC_insert(type, address) +int type ; INST *address ; +{ + register BC *p ; + + if (error_state) return ; + + if (type && !bc_top) + { + compile_error("%s statement outside of loop", + type == 'B' ? "break" : "continue") ; + + return ; + } + else + { + p = ZMALLOC(BC) ; + p->type = type ; + p->source_offset = address - code_base ; + p->link = bc_top ; + bc_top = p ; + } +} + + +/* patch all break and continues for one loop */ +void +BC_clear(B_address, C_address) + INST *B_address, *C_address ; +{ + register BC *p, *q ; + INST *source ; + + if (error_state) return ; + + p = bc_top ; + /* pop down to the mark node */ + while (p->type) + { + source = code_base + p->source_offset ; + source->op = (p->type == 'B' ? B_address : C_address) + - source ; + + q = p ; p = p->link ; ZFREE(q) ; + } + /* remove the mark node */ + bc_top = p->link ; + ZFREE(p) ; +} + +/*----- moving code --------------------------*/ + +/* a stack to hold some pieces of code while + reorganizing loops . 
+*/ + +typedef struct mc +{ /* mc -- move code */ + struct mc *link ; + INST *code ; /* the save code */ + unsigned len ; /* its length */ + int scope ; /* its scope */ + int move_level ; /* size of this stack when coded */ + FBLOCK *fbp ; /* if scope FUNCT */ + int offset ; /* distance from its code base */ +} +MC ; + +static MC *mc_top ; +int code_move_level = 0 ; /* see comment in jmp.h */ + +#define NO_SCOPE -1 + /* means relocation of resolve list not needed */ + +void +code_push(code, len, scope, fbp) + INST *code ; + unsigned len ; + int scope ; + FBLOCK *fbp ; +{ + register MC *p ; + + if (!error_state) + { + p = ZMALLOC(MC) ; + p->len = len ; + p->link = mc_top ; + mc_top = p ; + + if (len) + { + p->code = (INST *) zmalloc(sizeof(INST) * len) ; + memcpy(p->code, code, sizeof(INST) * len) ; + } + if (!resolve_list) p->scope = NO_SCOPE ; + else + { + p->scope = scope ; + p->move_level = code_move_level ; + p->fbp = fbp ; + p->offset = code - code_base ; + } + } + code_move_level++ ; +} + +/* copy the code at the top of the mc stack to target. 
+ return the number of INSTs moved */ + +unsigned +code_pop(target) + INST *target ; +{ + register MC *p ; + unsigned len ; + int target_offset ; + + if (error_state) return 0 ; + +#ifdef DEBUG + if (!mc_top) bozo("mc underflow") ; +#endif + + p = mc_top ; mc_top = p->link ; + len = p->len ; + + while (target + len >= code_warn) + { + target_offset = target - code_base ; + code_grow() ; + target = code_base + target_offset ; + } + + if (len) + { + memcpy(target, p->code, len * sizeof(INST)) ; + zfree(p->code, len * sizeof(INST)) ; + } + + if (p->scope != NO_SCOPE) + { + target_offset = target - code_base ; + relocate_resolve_list(p->scope, p->move_level, p->fbp, + p->offset, len, target_offset - p->offset) ; + } + + ZFREE(p) ; + code_move_level-- ; + return len ; +} diff --git a/jmp.h b/jmp.h new file mode 100644 index 0000000..14d2cca --- /dev/null +++ b/jmp.h @@ -0,0 +1,45 @@ + +/******************************************** +jmp.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/* $Log: jmp.h,v $ + * Revision 1.2 1995/04/21 14:20:19 mike + * move_level variable to fix bug in arglist patching of moved code. 
+ * + * Revision 1.1.1.1 1993/07/03 18:58:15 mike + * move source to cvs + * + * Revision 5.2 1993/01/09 19:03:44 mike + * code_pop checks if the resolve_list needs relocation + * + * Revision 5.1 1991/12/05 07:59:24 brennan + * 1.1 pre-release + * +*/ + +#ifndef JMP_H +#define JMP_H + +void PROTO(BC_new, (void) ) ; +void PROTO(BC_insert, (int, INST*) ) ; +void PROTO(BC_clear, (INST *, INST *) ) ; +void PROTO(code_push, (INST *, unsigned, int, FBLOCK*) ) ; +unsigned PROTO(code_pop, (INST *) ) ; +void PROTO(code_jmp, (int, INST *) ) ; +void PROTO(patch_jmp, (INST *) ) ; + +extern int code_move_level ; + /* used as one part of the unique identification of context when + moving code. Global for communication with parser. + */ + +#endif /* JMP_H */ + diff --git a/jmp.o b/jmp.o new file mode 100644 index 0000000..94241f2 Binary files /dev/null and b/jmp.o differ diff --git a/kw.c b/kw.c new file mode 100644 index 0000000..7fbf7dc --- /dev/null +++ b/kw.c @@ -0,0 +1,95 @@ + +/******************************************** +kw.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991.
+********************************************/ + + +/* $Log: kw.c,v $ + * Revision 1.2 1993/07/17 13:22:59 mike + * indent and general code cleanup + * + * Revision 1.1.1.1 1993/07/03 18:58:15 mike + * move source to cvs + * + * Revision 5.1 1991/12/05 07:56:12 brennan + * 1.1 pre-release + * +*/ + + +/* kw.c */ + + +#include "mawk.h" +#include "symtype.h" +#include "parse.h" +#include "init.h" + + +static struct kw +{ + char *text ; + short kw ; +} +keywords[] = +{ + + {"print", PRINT}, + {"printf", PRINTF}, + {"do", DO}, + {"while", WHILE}, + {"for", FOR}, + {"break", BREAK}, + {"continue", CONTINUE}, + {"if", IF}, + {"else", ELSE}, + {"in", IN}, + {"delete", DELETE}, + {"split", SPLIT}, + {"match", MATCH_FUNC}, + {"BEGIN", BEGIN}, + {"END", END}, + {"exit", EXIT}, + {"next", NEXT}, + {"return", RETURN}, + {"getline", GETLINE}, + {"sub", SUB}, + {"gsub", GSUB}, + {"function", FUNCTION}, + {(char *) 0, 0} +} ; + +/* put keywords in the symbol table */ +void +kw_init() +{ + register struct kw *p = keywords ; + register SYMTAB *q ; + + while (p->text) + { + q = insert(p->text) ; + q->type = ST_KEYWORD ; + q->stval.kw = p++->kw ; + } +} + +/* find a keyword to emit an error message */ +char * +find_kw_str(kw_token) + int kw_token ; +{ + struct kw *p ; + + for (p = keywords; p->text; p++) + if (p->kw == kw_token) return p->text ; + /* search failed */ + return (char *) 0 ; +} diff --git a/kw.o b/kw.o new file mode 100644 index 0000000..d4ba6f0 Binary files /dev/null and b/kw.o differ diff --git a/main.c b/main.c new file mode 100644 index 0000000..bc468f2 --- /dev/null +++ b/main.c @@ -0,0 +1,84 @@ + +/******************************************** +main.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/* $Log: main.c,v $ + * Revision 1.4 1995/06/09 22:57:19 mike + * parse() no longer returns on error + * + * Revision 1.3 1995/06/06 00:18:32 mike + * change mawk_exit(1) to mawk_exit(2) + * + * Revision 1.2 1993/07/17 00:45:19 mike + * indent + * + * Revision 1.1.1.1 1993/07/03 18:58:16 mike + * move source to cvs + * + * Revision 5.4 1993/02/13 21:57:27 mike + * merge patch3 + * + * Revision 5.3 1993/01/07 02:50:33 mike + * relative vs absolute code + * + * Revision 5.2.1.1 1993/01/15 03:33:44 mike + * patch3: safer double to int conversion + * + * Revision 5.2 1992/12/17 02:48:01 mike + * 1.1.2d changes for DOS + * + * Revision 5.1 1991/12/05 07:56:14 brennan + * 1.1 pre-release + * +*/ + + + +/* main.c */ + +#include "mawk.h" +#include "init.h" +#include "code.h" +#include "files.h" + + +short mawk_state ; /* 0 is compiling */ +int exit_code ; + +int +main(argc, argv) +int argc ; char **argv ; +{ + + initialize(argc, argv) ; + + parse() ; + + mawk_state = EXECUTION ; + execute(execution_start, eval_stack - 1, 0) ; + /* never returns */ + return 0 ; +} + +void +mawk_exit(x) + int x ; +{ +#if HAVE_REAL_PIPES + close_out_pipes() ; /* no effect, if no out pipes */ +#else +#if HAVE_FAKE_PIPES + close_fake_pipes() ; +#endif +#endif + + exit(x) ; +} diff --git a/main.o b/main.o new file mode 100644 index 0000000..b4f01f6 Binary files /dev/null and b/main.o differ diff --git a/makescan.c b/makescan.c new file mode 100644 index 0000000..33771cc --- /dev/null +++ b/makescan.c @@ -0,0 +1,119 @@ + +/******************************************** +makescan.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/*$Log: makescan.c,v $ + * Revision 1.3 1993/07/17 13:23:01 mike + * indent and general code cleanup + * + * Revision 1.2 1993/07/15 13:26:59 mike + * SIZE_T and indent + * + * Revision 1.1.1.1 1993/07/03 18:58:16 mike + * move source to cvs + * + * Revision 5.1 1991/12/05 07:56:16 brennan + * 1.1 pre-release + * +*/ + +/* source for makescan.exe which builds the scancode[] + via: makescan.exe > scancode.c +*/ + + +#define MAKESCAN + +#include "scan.h" + +char scan_code[256] ; + +void +scan_init() +{ + register char *p ; + + memset(scan_code, SC_UNEXPECTED, sizeof(scan_code)) ; + for (p = scan_code + '0'; p <= scan_code + '9'; p++) + *p = SC_DIGIT ; + scan_code[0] = 0 ; + scan_code[' '] = scan_code['\t'] = scan_code['\f'] = SC_SPACE ; + scan_code['\r'] = scan_code['\013'] = SC_SPACE ; + + scan_code[';'] = SC_SEMI_COLON ; + scan_code['\n'] = SC_NL ; + scan_code['{'] = SC_LBRACE ; + scan_code['}'] = SC_RBRACE ; + scan_code['+'] = SC_PLUS ; + scan_code['-'] = SC_MINUS ; + scan_code['*'] = SC_MUL ; + scan_code['/'] = SC_DIV ; + scan_code['%'] = SC_MOD ; + scan_code['^'] = SC_POW ; + scan_code['('] = SC_LPAREN ; + scan_code[')'] = SC_RPAREN ; + scan_code['_'] = SC_IDCHAR ; + scan_code['='] = SC_EQUAL ; + scan_code['#'] = SC_COMMENT ; + scan_code['\"'] = SC_DQUOTE ; + scan_code[','] = SC_COMMA ; + scan_code['!'] = SC_NOT ; + scan_code['<'] = SC_LT ; + scan_code['>'] = SC_GT ; + scan_code['|'] = SC_OR ; + scan_code['&'] = SC_AND ; + scan_code['?'] = SC_QMARK ; + scan_code[':'] = SC_COLON ; + scan_code['['] = SC_LBOX ; + scan_code[']'] = SC_RBOX ; + scan_code['\\'] = SC_ESCAPE ; + scan_code['.'] = SC_DOT ; + scan_code['~'] = SC_MATCH ; + scan_code['$'] = SC_DOLLAR ; + + for (p = scan_code + 'A'; p <= scan_code + 'Z'; p++) + *p = *(p + 'a' - 'A') = SC_IDCHAR ; + +} + +void +scan_print() +{ + register char *p = scan_code ; + register int c ; /* column */ + register int r ; /* row */ + + printf("\n\n/* scancode.c */\n\n\n") ; 
+ printf("char scan_code[256] = {\n") ; + + for (r = 1; r <= 16; r++) + { + for (c = 1; c <= 16; c++) + { + printf("%2d", *p++) ; + if (r != 16 || c != 16) putchar(',') ; + } + putchar('\n') ; + } + + printf("} ;\n") ; +} + + +int +main(argc, argv) + int argc ; + char **argv ; +{ + scan_init() ; + scan_print() ; + return 0 ; +} diff --git a/man/mawk.1 b/man/mawk.1 new file mode 100644 index 0000000..442442d --- /dev/null +++ b/man/mawk.1 @@ -0,0 +1,1620 @@ +.TH MAWK 1 "Dec 22 1994" "Version 1.2" "USER COMMANDS" +.\" strings +.ds ex \fIexpr\fR +.SH NAME +mawk \- pattern scanning and text processing language +.SH SYNOPSIS +.B mawk +[\-\fBW +.IR option ] +[\-\fBF +.IR value ] +[\-\fBv +.IR var=value ] +[\-\|\-] 'program text' [file ...] +.br +.B mawk +[\-\fBW +.IR option ] +[\-\fBF +.IR value ] +[\-\fBv +.IR var=value ] +[\-\fBf +.IR program-file ] +[\-\|\-] [file ...] +.SH DESCRIPTION +.B mawk +is an interpreter for the AWK Programming Language. +The AWK language +is useful for manipulation of data files, +text retrieval and processing, +and for prototyping and experimenting with algorithms. +.B mawk +is a \fInew awk\fR meaning it implements the AWK language as +defined in Aho, Kernighan and Weinberger, +.I "The AWK Programming Language," +Addison-Wesley Publishing, 1988. (Hereafter referred to as +the AWK book.) +.B mawk +conforms to the Posix 1003.2 +(draft 11.3) +definition of the AWK language +which contains a few features not described in the AWK +book, and +.B mawk +provides a small number of extensions. +.PP +An AWK program is a sequence of \fIpattern {action}\fR pairs and +function definitions. +Short programs are entered on the command line +usually enclosed in ' ' to avoid shell +interpretation. +Longer programs can be read in from a +file with the \-f option. +Data input is read from the list of files on +the command line or from standard input when the list is empty. 
+The input is broken into records as determined by the +record separator variable, \fBRS\fR. Initially, +.B RS += "\en" and records are synonymous with lines. +Each record is compared against each +.I pattern +and if it matches, the program text for +.I "{action}" +is executed. +.SH OPTIONS +.TP \w'\-\fBW'u+\w'\fRsprintf=\fInum\fR'u+2n +\-\fBF \fIvalue\fP +sets the field separator, \fBFS\fR, to +.IR value . +.TP +\-\fBf \fIfile +Program text is read from \fIfile\fR instead of from the +command line. Multiple +.B \-f +options are allowed. +.TP +\-\fBv \fIvar=value\fR +assigns +.I value +to program variable +.IR var . +.TP +\-\|\- +indicates the unambiguous end of options. +.PP +The above options will be available with any Posix compatible +implementation of AWK, and implementation specific options are +prefaced with +.BR \-W . +.B mawk +provides six: +.TP \w'\-\fBW'u+\w'\fRsprintf=\fInum\fR'u+2n +\-\fBW \fRversion +.B mawk +writes its version and copyright +to stdout and compiled limits to +stderr and exits 0. +.TP +\-\fBW \fRdump +writes an assembler like listing of the internal +representation of the program to stdout and exits 0 +(on successful compilation). +.TP +\-\fBW \fRinteractive +sets unbuffered writes to stdout and line buffered reads from stdin. +Records from stdin are lines regardless of the value of +.BR RS . +.TP +\-\fBW \fRexec \fIfile +Program text is read from +.I file +and this is the last option. Useful on systems that support the +.B #! +"magic number" convention for executable scripts. +.TP +\-\fBW \fRsprintf=\fInum\fR +adjusts the size of +.B mawk's +internal sprintf buffer to +.I num +bytes. More than rare use of this option indicates +.B mawk +should be recompiled. +.TP +\-\fBW \fRposix_space +forces +.B mawk +not to consider '\en' to be space. +.PP +The short forms +.BR \-W [vdiesp] +are recognized and on some systems \fB\-W\fRe is mandatory to avoid +command line length limitations. +.SH "THE AWK LANGUAGE" +.SS "\fB1. 
Program structure" +An AWK program is a sequence of +.I "pattern {action}" +pairs and user +function definitions. +.PP +A pattern can be: +.nf +.RS +\fBBEGIN +END\fR +expression +expression , expression +.sp +.RE +.fi +One, but not both, +of \fIpattern {action}\fR can be omitted. If +.I {action} +is omitted it is implicitly { print }. If +.I pattern +is omitted, then it is implicitly matched. +.B BEGIN +and +.B END +patterns require an action. +.PP +Statements are terminated by newlines, semi-colons or both. +Groups of statements such as +actions or loop bodies are blocked via { ... } as in C. The +last statement in a block doesn't need a terminator. Blank lines +have no meaning; an empty statement is terminated with a +semi-colon. Long statements +can be continued with a backslash, \e\|. A statement can be broken +without a backslash after a comma, left brace, &&, ||, +.BR do , +.BR else , +the right parenthesis of an +.BR if , +.B while +or +.B for +statement, and the +right parenthesis of a function definition. +A comment starts with # and extends to, but does not include +the end of line. +.PP +The following statements control program flow inside blocks. +.RS +.PP +.B if +( \*(ex ) +.I statement +.PP +.B if +( \*(ex ) +.I statement +.B else +.I statement +.PP +.B while +( \*(ex ) +.I statement +.PP +.B do +.I statement +.B while +( \*(ex ) +.PP +.B for +( +\fIopt_expr\fR ; +\fIopt_expr\fR ; +\fIopt_expr\fR +) +.I statement +.PP +.B for +( \fIvar \fBin \fIarray\fR ) +.I statement +.PP +.B continue +.PP +.B break +.RE +.\" +.SS "\fB2. Data types, conversion and comparison" +There are two basic data types, numeric and string. +Numeric constants can be integer like \-2, +decimal like 1.08, or in scientific notation like +\-1.1e4 or .28E\-3. All numbers are represented internally and all +computations are done in floating point arithmetic. +So for example, the expression +0.2e2 == 20 +is true and true is represented as 1.0. 
+.PP +String constants are enclosed in double quotes. +.sp +.ce +"This is a string with a newline at the end.\en" +.sp +Strings can be continued across a line by escaping (\e) the newline. +The following escape sequences are recognized. +.nf +.sp + \e\e \e + \e" " + \ea alert, ascii 7 + \eb backspace, ascii 8 + \et tab, ascii 9 + \en newline, ascii 10 + \ev vertical tab, ascii 11 + \ef formfeed, ascii 12 + \er carriage return, ascii 13 + \eddd 1, 2 or 3 octal digits for ascii ddd + \exhh 1 or 2 hex digits for ascii hh +.sp +.fi +If you escape any other character \ec, you get \ec, i.e., +.B mawk +ignores the escape. +.PP +There are really three basic data types; the third is +.I "number and string" +which has both a numeric value and a string value +at the same time. +User defined variables come into existence when first referenced +and are initialized to +.IR null , +a number and string value which has numeric value 0 and string value +"". +Non-trivial number and string typed data come from input +and are typically stored in fields. (See section 4). +.PP +The type of an expression is determined by its context and automatic +type conversion occurs if needed. For example, to evaluate the +statements +.nf +.sp + y = x + 2 ; z = x "hello" +.sp +.fi +The value stored in variable y will be typed numeric. +If x is not numeric, +the value read from x is converted to numeric before it is added to +2 and stored in y. The value stored in variable z will be typed +string, and the value of x will be converted to string if necessary +and concatenated with "hello". (Of course, the value and type +stored in x is not changed by any conversions.) +A string expression is converted to numeric using its longest +numeric prefix as with +.IR atof (3). +A numeric expression is converted to string by replacing +.I expr +with +.BR sprintf(CONVFMT , +.IR expr ), +unless +.I expr +can be represented on the host machine as an exact integer then +it is converted to \fBsprintf\fR("%d", \*(ex). 
+.B Sprintf() +is an AWK built-in that duplicates the functionality of +.IR sprintf (3), +and +.B CONVFMT +is a built-in variable used for internal conversion +from number to string and initialized to "%.6g". +Explicit type conversions can be forced, +\*(ex "" +is string and +.IR expr +0 +is numeric. +.PP +To evaluate, +\*(ex\d1\u \fBrel-op \*(ex\d2\u, +if both operands are numeric or number and string then the comparison +is numeric; if both operands are string the comparison is string; +if one operand is string, the non-string operand is converted and +the comparison is string. The result is numeric, 1 or 0. +.PP +In boolean contexts such as, +\fBif\fR ( \*(ex ) \fIstatement\fR, +a string expression evaluates true if and only if it is not the +empty string ""; +numeric values if and only if not numerically zero. +.\" +.SS "\fB3. Regular expressions" +In the AWK language, records, fields and strings are often +tested for matching a +.IR "regular expression" . +Regular expressions are enclosed in slashes, and +.nf +.sp + \*(ex ~ /\fIr\fR/ +.sp +.fi +is an AWK expression that evaluates to 1 if \*(ex "matches" +.IR r , +which means a substring of \*(ex is in the set of strings +defined by +.IR r . +With no match the expression evaluates to 0; replacing +~ with the "not match" operator, !~ , reverses the meaning. +As pattern-action pairs, +.nf +.sp + /\fIr\fR/ { \fIaction\fR } and\ + \fB$0\fR ~ /\fIr\fR/ { \fIaction\fR } +.sp +.fi +are the same, +and for each input record that matches +.IR r , +.I action +is executed. +In fact, /\fIr\fR/ is an AWK expression that is +equivalent to (\fB$0\fR ~ /\fIr\fR/) anywhere except when on the +right side of a match operator or passed as an argument to +a built-in function that expects a regular expression +argument. +.PP +AWK uses extended regular expressions as with +.IR egrep (1). +The regular expression metacharacters, i.e., those with special +meaning in regular expressions are +.nf +.sp + \ ^ $ . [ ] | ( ) * + ? 
+.sp +.fi +Regular expressions are built up from characters as follows: +.RS +.TP \w'[^c\d1\uc\d2\uc\d3\u...]'u+1n +\fIc\fR +matches any non-metacharacter +.IR c . +.TP +\e\fIc\fR +matches a character defined by the same escape sequences used +in string constants or the literal +character +.I c +if +\e\fIc\fR +is not an escape sequence. +.TP +\&\. +matches any character (including newline). +.TP +^ +matches the front of a string. +.TP +$ +matches the back of a string. +.TP +[c\d1\uc\d2\uc\d3\u...] +matches any character in the class +c\d1\uc\d2\uc\d3\u... . An interval of characters is denoted +c\d1\u\-c\d2\u inside a class [...]. +.TP +[^c\d1\uc\d2\uc\d3\u...] +matches any character not in the class +c\d1\uc\d2\uc\d3\u... +.RE +.sp +Regular expressions are built up from other regular expressions +as follows: +.RS +.TP \w'[^c\d1\uc\d2\uc\d3\u...]'u+1n +\fIr\fR\d1\u\fIr\fR\d2\u +matches +\fIr\fR\d1\u +followed immediately by +\fIr\fR\d2\u +(concatenation). +.TP +\fIr\fR\d1\u | \fIr\fR\d2\u +matches +\fIr\fR\d1\u or +\fIr\fR\d2\u +(alternation). +.TP +\fIr\fR* +matches \fIr\fR repeated zero or more times. +.TP +\fIr\fR+ +matches \fIr\fR repeated one or more times. +.TP +\fIr\fR? +matches \fIr\fR zero or once. +.TP +(\fIr\fR) +matches \fIr\fR, providing grouping. +.RE +.sp +The increasing precedence of operators is alternation, +concatenation and +unary (*, + or ?). +.PP +For example, +.nf +.sp + /^[_a\-zA-Z][_a\-zA\-Z0\-9]*$/ and + /^[\-+]?([0\-9]+\e\|.?|\e\|.[0\-9])[0\-9]*([eE][\-+]?[0\-9]+)?$/ +.sp +.fi +are matched by AWK identifiers and AWK numeric constants +respectively. Note that . has to be escaped to be +recognized as a decimal point, and that metacharacters are not +special inside character classes. +.PP +Any expression can be used on the right hand side of the ~ or !~ +operators or +passed to a built-in that expects +a regular expression. +If needed, it is converted to string, and then interpreted +as a regular expression. 
For example, +.nf +.sp + BEGIN { identifier = "[_a\-zA\-Z][_a\-zA\-Z0\-9]*" } + + $0 ~ "^" identifier +.sp +.fi +prints all lines that start with an AWK identifier. +.PP +.B mawk +recognizes the empty regular expression, //\|, which matches the +empty string and hence is matched by any string at the front, +back and between every character. For example, +.nf +.sp + echo abc | mawk '{ gsub(//, "X") ; print }' + XaXbXcX +.sp +.fi +.\" +.SS "\fB4. Records and fields" +Records are read in one at a time, and stored in the +.I field +variable +.BR $0 . +The record is split into +.I fields +which are stored in +.BR $1 , +.BR $2 ", ...," +.BR $NF . +The built-in variable +.B NF +is set to the number of fields, +and +.B NR +and +.B FNR +are incremented by 1. +Fields above +.B $NF +are set to "". +.PP +Assignment to +.B $0 +causes the fields and +.B NF +to be recomputed. +Assignment to +.B NF +or to a field +causes +.B $0 +to be reconstructed by +concatenating the +.B $i's +separated by +.BR OFS . +Assignment to a field with index greater than +.BR NF , +increases +.B NF +and causes +.B $0 +to be reconstructed. +.PP +Data input stored in fields +is string, unless the entire field has numeric +form and then the type is number and string. +For example, +.sp +.nf + echo 24 24E | + mawk '{ print($1>100, $1>"100", $2>100, $2>"100") }' + 0 1 1 1 +.fi +.sp +.B $0 +and +.B $2 +are string and +.B $1 +is number and string. The first comparison is numeric, +the second is string, the third is string +(100 is converted to "100"), +and the last is string. +.\" +.SS "\fB5. Expressions and operators" +.PP +The expression syntax is +similar to C. Primary expressions are numeric constants, +string constants, variables, fields, arrays and function calls. +The identifier +for a variable, array or function can be a sequence of +letters, digits and underscores, that does +not start with a digit. +Variables are not declared; they exist when first referenced and +are initialized to +.IR null .
+.PP +New +expressions are composed with the following operators in +order of increasing precedence. +.PP +.RS +.nf +.vs +2p \" open up a little +\fIassignment\fR = += \-= *= /= %= ^= +\fIconditional\fR ? : +\fIlogical or\fR || +\fIlogical and\fR && +\fIarray membership\fR \fBin +\fImatching\fR ~ !~ +\fIrelational\fR < > <= >= == != +\fIconcatenation\fR (no explicit operator) +\fIadd ops\fR + \- +\fImul ops\fR * / % +\fIunary\fR + \- +\fIlogical not\fR ! +\fIexponentiation\fR ^ +\fIinc and dec\fR ++ \-\|\- (both post and pre) +\fIfield\fR $ +.vs +.RE +.PP +.fi +Assignment, conditional and exponentiation associate right to +left; the other operators associate left to right. Any +expression can be parenthesized. +.\" +.SS "\fB6. Arrays" +.ds ae \fIarray\fR[\fIexpr\fR] +Awk provides one-dimensional arrays. Array elements are expressed +as \*(ae. +.I Expr +is internally converted to string type, so, for example, +A[1] and A["1"] are the same element and the actual +index is "1". +Arrays indexed by strings are called associative arrays. +Initially an array is empty; elements exist when first accessed. +An expression, +\fIexpr\fB in\fI array\fR +evaluates to 1 if +\*(ae +exists, else to 0. +.PP +There is a form of the +.B for +statement that loops over each index of an array. +.nf +.sp + \fBfor\fR ( \fIvar\fB in \fIarray \fR) \fIstatement\fR +.sp +.fi +sets +.I var +to each index of +.I array +and executes +.IR statement . +The order that +.I var +traverses the indices of +.I array +is not defined. +.PP +The statement, +.B delete +\*(ae, +causes +\*(ae +not to exist. +.B mawk +supports an extension, +.B delete +.IR array , +which deletes all elements of +.IR array . +.PP +Multidimensional arrays are synthesized with concatenation using +the built-in variable +.BR SUBSEP . +\fIarray\fR[\fIexpr\fR\d1\u,\|\fIexpr\fR\d2\u] +is equivalent to +\fIarray\fR[\fIexpr\fR\d1\u \fBSUBSEP \fIexpr\fR\d2\u].
+Testing for a multidimensional element uses a parenthesized index, +such as +.sp +.nf + if ( (i, j) in A ) print A[i, j] +.fi +.sp +.\" +.SS "\fB7. Built-in variables\fR" +.PP +The following variables are built-in and initialized before program +execution. +.RS +.TP \w'FILENAME'u+2n +.B ARGC +number of command line arguments. +.TP +.B ARGV +array of command line arguments, 0..ARGC-1. +.TP +.B CONVFMT +format for internal conversion of numbers to string, +initially = "%.6g". +.TP +.B ENVIRON +array indexed by environment variables. An environment string, +\fIvar=value\fR is stored as +\fBENVIRON\fR[\fIvar\fR] = +.IR value . +.TP +.B FILENAME +name of the current input file. +.TP +.B FNR +current record number in +.BR FILENAME . +.TP +.B FS +splits records into fields as a regular expression. +.TP +.B NF +number of fields in the current record. +.TP +.B NR +current record number in the total input stream. +.TP +.B OFMT +format for printing numbers; initially = "%.6g". +.TP +.B OFS +inserted between fields on output, initially = " ". +.TP +.B ORS +terminates each record on output, initially = "\en". +.TP +.B RLENGTH +length set by the last call to the built-in function, +.BR match() . +.TP +.B RS +input record separator, initially = "\en". +.TP +.B RSTART +index set by the last call to +.BR match() . +.TP +.B SUBSEP +used to build multiple array subscripts, initially = "\e034". +.RE +.\" +.SS "\fB8. Built-in functions" +String functions +.RS +.TP +gsub(\fIr,s,t\fR) gsub(\fIr,s\fR) +Global substitution, every match of regular expression +.I r +in variable +.I t +is replaced by string +.IR s . +The number of replacements is returned. +If +.I t +is omitted, +.B $0 +is used. An & in the replacement string +.I s +is replaced by the matched substring of +.IR t . +\e& and \e\e put literal & and \e, respectively, +in the replacement string. +.TP +index(\fIs,t\fR) +If +.I t +is a substring of +.IR s , +then the position where +.I t +starts is returned, else 0 is returned.
+The first character of +.I s +is in position 1. +.TP +length(\fIs\fR) +Returns the length of string +.IR s . +.TP +match(\fIs,r\fR) +Returns the index of the first longest match of regular expression +.I r +in string +.IR s . +Returns 0 if no match. +As a side effect, +.B RSTART +is set to the return value. +.B RLENGTH +is set to the length of the match or \-1 if no match. If the +empty string is matched, +.B RLENGTH +is set to 0, and 1 is returned if the match is at the front, and +length(\fIs\fR)+1 is returned if the match is at the back. +.TP +split(\fIs,A,r\fR) split(\fIs,A\fR) +String +.I s +is split into fields by regular expression +.I r +and the fields are loaded into array +.IR A . +The number of fields +is returned. See section 11 below for more detail. +If +.I r +is omitted, +.B FS +is used. +.TP +sprintf(\fIformat,expr-list\fR) +Returns a string constructed from +.I expr-list +according to +.IR format . +See the description of printf() below. +.TP +sub(\fIr,s,t\fR) sub(\fIr,s\fR) +Single substitution, same as gsub() except at most one substitution. +.TP +substr(\fIs,i,n\fR) substr(\fIs,i\fR) +Returns the substring of string +.IR s , +starting at index +.IR i , +of length +.IR n . +If +.I n +is omitted, the suffix of +.IR s , +starting at +.I i +is returned. +.TP +tolower(\fIs\fR) +Returns a copy of +.I s +with all upper case characters converted to lower case. +.TP +toupper(\fIs\fR) +Returns a copy of +.I s +with all lower case characters converted to upper case. +.RE +.PP +Arithmetic functions +.RS +.PP +.nf +.ie n \ +.ds Pi pi +.el \ +.ds Pi \\(*p +atan2(\fIy,x\fR) Arctan of \fIy\fR/\fIx\fR between -\*(Pi and \*(Pi. +.PP +cos(\fIx\fR) Cosine function, \fIx\fR in radians. +.PP +exp(\fIx\fR) Exponential function. +.PP +int(\fIx\fR) Returns \fIx\fR truncated towards zero. +.PP +log(\fIx\fR) Natural logarithm. +.PP +rand() Returns a random number between zero and one. +.PP +sin(\fIx\fR) Sine function, \fIx\fR in radians. 
+.PP +sqrt(\fIx\fR) Returns square root of \fIx\fR. +.fi +.TP +srand(\fIexpr\fR) srand() +Seeds the random number generator, using the clock if +.I expr +is omitted, and returns the value of the previous seed. +.B mawk +seeds the random number generator from the clock at startup +so there is no real need to call srand(). Srand(\fIexpr\fR) +is useful for repeating pseudo random sequences. +.RE +.\" +.SS "\fB9. Input and output" +There are two output statements, +.B print +and +.BR printf . +.RS +.TP +print +writes +.B "$0 ORS" +to standard output. +.TP +print \*(ex\d1\u, \*(ex\d2\u, ..., \*(ex\dn\u +writes +\*(ex\d1\u \fBOFS \*(ex\d2\u \fBOFS\fR ... \*(ex\dn\u +.B ORS +to standard output. Numeric expressions are converted to +string with +.BR OFMT . +.TP +printf \fIformat, expr-list\fR +duplicates the printf C library function writing to standard output. +The complete ANSI C format specifications are recognized with +conversions %c, %d, %e, %E, %f, %g, %G, +%i, %o, %s, %u, %x, %X and %%, +and conversion qualifiers h and l. +.RE +.PP +The argument list to print or printf can optionally be enclosed in +parentheses. +Print formats numbers using +.B OFMT +or "%d" for exact integers. +"%c" with a numeric argument prints the corresponding 8 bit +character, with a string argument it prints the first character of +the string. +The output of print and printf can be redirected to a file or +command by appending > +.IR file , +>> +.I file +or +| +.I command +to the end of the print statement. +Redirection opens +.I file +or +.I command +only once, subsequent redirections append to the already open stream. +By convention, +.B mawk +associates the filename "/dev/stderr" with stderr which allows +print and printf to be redirected to stderr. +.B mawk +also associates "\-" and "/dev/stdout" with stdin and stdout which +allows these streams to be passed to functions. +.PP +The input function +.B getline +has the following variations. 
+.RS +.TP +getline +reads into +.BR $0 , +updates the fields, +.BR NF , +.B NR +and +.BR FNR . +.TP +getline < \fIfile\fR +reads into +.B $0 +from \fIfile\fR, +updates the fields and +.BR NF . +.TP +getline \fIvar +reads the next record into +.IR var , +updates +.B NR +and +.BR FNR . +.TP +getline \fIvar\fR < \fIfile +reads the next record of +.I file +into +.IR var . +.TP +\fI command\fR | getline +pipes a record from +.I command +into +.B $0 +and updates the fields and +.BR NF . +.TP +\fI command\fR | getline \fIvar +pipes a record from +.I command +into +.IR var . +.RE +.PP +Getline returns 0 on end-of-file, \-1 on error, otherwise 1. +.PP +Commands on the end of pipes are executed by /bin/sh. +.PP +The function \fBclose\fR(\*(ex) closes the file or pipe +associated with +.IR expr . +Close returns 0 if +.I expr +is an open file, +the exit status if +.I expr +is a piped command, and \-1 otherwise. +Close is used to reread a file or command, make sure the other +end of an output pipe is finished or conserve file resources. +.PP +The function \fBfflush\fR(\*(ex) flushes the output file or pipe +associated with +.IR expr . +Fflush returns 0 if +.I expr +is an open output stream else \-1. +Fflush without an argument flushes stdout. +Fflush with an empty argument ("") flushes all open output. +.PP +The function +\fBsystem\fR(\fIexpr\fR) +uses +/bin/sh +to execute +.I expr +and returns the exit status of the command +.IR expr . +Changes made to the +.B ENVIRON +array are not passed to commands executed with +.B system +or pipes. +.SS \fB10. User defined functions +The syntax for a user defined function is +.nf +.sp + \fBfunction\fR name( \fIargs\fR ) { \fIstatements\fR } +.sp +.fi +The function body can contain a return statement +.nf +.sp + \fBreturn\fI opt_expr\fR +.sp +.fi +A return statement is not required. +Function calls may be nested or recursive. +Functions are passed expressions by value +and arrays by reference. 
+Extra arguments serve as local variables +and are initialized to +.IR null . +For example, csplit(\fIs,\|A\fR) puts each character of +.I s +into array +.I A +and returns the length of +.IR s . +.nf +.sp + function csplit(s, A,    n, i) + { + n = length(s) + for( i = 1 ; i <= n ; i++ ) A[i] = substr(s, i, 1) + return n + } +.sp +.fi +Putting extra space between passed arguments and local +variables is conventional. +Functions can be referenced before they are defined, but the +function name and the '(' of the arguments must touch to +avoid confusion with concatenation. +.\" +.SS "\fB11. Splitting strings, records and files" +Awk programs use the same algorithm to +split strings into arrays with split(), and records into fields +on +.BR FS . +.B mawk +uses essentially the same algorithm to split files into +records on +.BR RS . +.PP +Split(\fIexpr,\|A,\|sep\fR) works as follows: +.RS +.TP +(1) +If +.I sep +is omitted, it is replaced by +.BR FS . +.I Sep +can be an expression or regular expression. If it is an +expression of non-string type, it is converted to string. +.TP +(2) +If +.I sep += " " (a single space), +then <SPACE> is trimmed from the front and back of +.IR expr , +and +.I sep +becomes <SPACE>. +.B mawk +defines <SPACE> as the regular expression +/[\ \et\en]+/. +Otherwise +.I sep +is treated as a regular expression, except that meta-characters +are ignored for a string of length 1, +e.g., +split(x, A, "*") and split(x, A, /\e*/) are the same. +.TP +(3) +If \*(ex is not string, it is converted to string. +If \*(ex is then the empty string "", split() returns 0 +and +.I A +is set empty. +Otherwise, +all non-overlapping, non-null and longest matches of +.I sep +in +.IR expr , +separate +.I expr +into fields which are loaded into +.IR A . +The fields are placed in +A[1], A[2], ..., A[n] and split() returns n, the number +of fields which is the number +of matches plus one. +Data placed in +.I A +that looks numeric is typed number and string.
+.RE +.PP +Splitting records into fields works the same except the +pieces are loaded into +.BR $1 , +\fB$2\fR,..., +.BR $NF . +If +.B $0 +is empty, +.B NF +is set to 0 and all +.B $i +to "". +.PP +.B mawk +splits files into records by the same algorithm, but with the +slight difference that +.B RS +is really a terminator instead of a separator. +(\fBORS\fR is really a terminator too). +.RS +.PP +E.g., if +.B FS += ":+" and +.B $0 += "a::b:" , then +.B NF += 3 and +.B $1 += "a", +.B $2 += "b" and +.B $3 += "", but +if "a::b:" is the contents of an input file and +.B RS += ":+", then +there are two records "a" and "b". +.RE +.PP +.B RS += " " is not special. +.PP +If +.B FS += "", then +.B mawk +breaks the record into individual characters, and, similarly, +split(\fIs,A,\fR"") places the individual characters of +.I s +into +.IR A . +.\" +.SS "\fB12. Multi-line records" +Since +.B mawk +interprets +.B RS +as a regular expression, multi-line +records are easy. Setting +.B RS += "\en\en+", makes one or more blank +lines separate records. If +.B FS += " " (the default), then single +newlines, by the rules for <SPACE> above, become space and +single newlines are field separators. +.RS +.PP +For example, if a file is "a\ b\enc\en\en", +.B RS += "\en\en+" and +.B FS += "\ ", then there is one record "a\ b\enc" with three +fields "a", "b" and "c". Changing +.B FS += "\en", gives two +fields "a b" and "c"; changing +.B FS += "", gives one field +identical to the record. +.RE +.PP +If you want lines with spaces or tabs to be considered blank, +set +.B RS += "\en([\ \et]*\en)+". +For compatibility with other awks, setting +.B RS += "" has the same +effect as if blank lines are stripped from the +front and back of files and then records are determined as if +.B RS += "\en\en+". +Posix requires that "\en" always separates records when +.B RS += "" regardless of the value of +.BR FS . +.B mawk +does not support this convention, because defining +"\en" as <SPACE> makes it unnecessary.
+.\" +.PP +Most of the time when you change +.B RS +for multi-line records, you +will also want to change +.B ORS +to "\en\en" so the record spacing is preserved on output. +.\" +.SS "\fB13. Program execution" +This section describes the order of program execution. +First +.B ARGC +is set to the total number of command line arguments passed to +the execution phase of the program. +.B ARGV[0] +is set to the name of the AWK interpreter and +\fBARGV[1]\fR ... +.B ARGV[ARGC-1] +holds the remaining command line arguments exclusive of +options and program source. +For example with +.nf +.sp + mawk \-f prog v=1 A t=hello B +.sp +.fi +.B ARGC += 5 with +.B ARGV[0] += "mawk", +.B ARGV[1] += "v=1", +.B ARGV[2] += "A", +.B ARGV[3] += "t=hello" and +.B ARGV[4] += "B". +.PP +Next, each +.B BEGIN +block is executed in order. +If the program consists +entirely of +.B BEGIN +blocks, then execution terminates, else +an input stream is opened and execution continues. +If +.B ARGC +equals 1, +the input stream is set to stdin, +else the command line arguments +.BR ARGV[1] " ... +.B ARGV[ARGC-1] +are examined for a file argument. +.PP +The command line arguments divide into three sets: +file arguments, assignment arguments and empty strings "". +An assignment has the form +\fIvar\fR=\fIstring\fR. +When an +.B ARGV[i] +is examined as a possible file argument, +if it is empty it is skipped; +if it is an assignment argument, the assignment to +.I var +takes place and +.B i +skips to the next argument; +else +.B ARGV[i] +is opened for input. +If it fails to open, execution terminates with exit code 2. +If no command line argument is a file argument, then input +comes from stdin. +Getline in a +.B BEGIN +action opens input. "\-" as a file argument denotes stdin. +.PP +Once an input stream is open, each input record is tested +against each +.IR pattern , +and if it matches, the associated +.I action +is executed. +An expression pattern matches if it is boolean true (see +the end of section 2).
+A +.B BEGIN +pattern matches before any input has been read, and +an +.B END +pattern matches after all input has been read. +A range pattern, +\fIexpr\fR1,\|\fIexpr\fR2 , +matches every record between the match of +.IR expr 1 +and the match +.IR expr 2 +inclusively. +.PP +When end of file occurs on the input stream, the remaining +command line arguments are examined for a file argument, and +if there is one it is opened, else the +.B END +.I pattern +is considered matched +and all +.B END +.I actions +are executed. +.PP +In the example, the assignment +v=1 +takes place after the +.B BEGIN +.I actions +are executed, and +the data placed in +v +is typed number and string. +Input is then read from file A. +On end of file A, +t +is set to the string "hello", +and B is opened for input. +On end of file B, the +.B END +.I actions +are executed. +.PP +Program flow at the +.I pattern +.I {action} +level can be changed with the +.nf +.sp + \fBnext + \fBexit \fIopt_expr\fR +.sp +.fi +statements. +A +.B next +statement +causes the next input record to be read and pattern testing +to restart with the first +.I "pattern {action}" +pair in the program. +An +.B exit +statement +causes immediate execution of the +.B END +actions or program termination if there are none or +if the +.B exit +occurs in an +.B END +action. +The +.I opt_expr +sets the exit value of the program unless overridden by +a later +.B exit +or subsequent error. +.SH EXAMPLES +.nf +1. emulate cat. + + { print } + +2. emulate wc. + + { chars += length($0) + 1 # add one for the \en + words += NF + } + + END{ print NR, words, chars } + +3. count the number of unique "real words". + + BEGIN { FS = "[^A-Za-z]+" } + + { for(i = 1 ; i <= NF ; i++) word[$i] = "" } + + END { delete word[""] + for ( i in word ) cnt++ + print cnt + } + +.fi +4. sum the second field of +every record based on the first field. +.nf + + $1 ~ /credit\||\|gain/ { sum += $2 } + $1 ~ /debit\||\|loss/ { sum \-= $2 } + + END { print sum } + +5. 
sort a file, comparing as string + + { line[NR] = $0 "" } # make sure of comparison type + # in case some lines look numeric + + END { isort(line, NR) + for(i = 1 ; i <= NR ; i++) print line[i] + } + + #insertion sort of A[1..n] + function isort( A, n, i, j, hold) + { + for( i = 2 ; i <= n ; i++) + { + hold = A[j = i] + while ( A[j\-1] > hold ) + { j\-\|\- ; A[j+1] = A[j] } + A[j] = hold + } + # sentinel A[0] = "" will be created if needed + } + +.fi +.SH "COMPATIBILITY ISSUES" +The Posix 1003.2(draft 11.3) definition of the AWK language +is AWK as described in the AWK book with a few extensions +that appeared in SystemVR4 nawk. The extensions are: +.sp +.RS +New functions: toupper() and tolower(). + +New variables: ENVIRON[\|] and CONVFMT. + +ANSI C conversion specifications for printf() and sprintf(). + +New command options: \-v var=value, multiple -f options and +implementation options as arguments to \-W. +.RE +.sp + +Posix AWK is oriented to operate on files a line at +a time. +.B RS +can be changed from "\en" to another single character, +but it +is hard to find any use for this \(em there are no +examples in the AWK book. +By convention, \fBRS\fR = "", makes one or more blank lines +separate records, allowing multi-line records. When +\fBRS\fR = "", "\en" is always a field separator +regardless of the value in +.BR FS . +.PP +.BR mawk , +on the other hand, +allows +.B RS +to be a regular expression. +When "\en" appears in records, it is treated as space, and +.B FS +always determines fields. +.PP +Removing the line at a time paradigm can make some programs +simpler and can +often improve performance. For example, +redoing example 3 from above, +.nf +.sp + BEGIN { RS = "[^A-Za-z]+" } + + { word[ $0 ] = "" } + + END { delete word[ "" ] + for( i in word ) cnt++ + print cnt + } +.sp +.fi +counts the number of unique words by making each word a record. +On moderate size files, +.B mawk +executes twice as fast, because of the simplified inner loop. 
+.PP +The following program replaces each comment by a single space in +a C program file, +.nf +.sp + BEGIN { + RS = "/\|\e*([^*]\||\|\e*+[^/*])*\e*+/" + # comment is record separator + ORS = " " + getline hold + } + + { print hold ; hold = $0 } + + END { printf "%s" , hold } +.sp +.fi +Buffering one record is needed to avoid terminating the last +record with a space. +.PP +With +.BR mawk , +the following are all equivalent, +.nf +.sp + x ~ /a\e+b/ x ~ "a\e+b" x ~ "a\e\e+b" +.sp +.fi +The strings get scanned twice, once as string and once as +regular expression. On the string scan, +.B mawk +ignores the escape on non-escape characters while the AWK +book advocates +.I \ec +be recognized as +.I c +which necessitates the double escaping of meta-characters in +strings. +Posix explicitly declines to define the behavior which passively +forces programs that must run under a variety of awks to use +the more portable but less readable, double escape. +.PP +Posix AWK does not recognize "/dev/std{out,err}" or \ex hex escape +sequences in strings. Unlike ANSI C, +.B mawk +limits the number of digits that follows \ex to two as the current +implementation only supports 8 bit characters. +The built-in +.B fflush +first appeared in a recent (1993) AT&T awk released to netlib, and is +not part of the posix standard. Aggregate deletion with +.B delete +.I array +is not part of the posix standard. +.PP +Posix explicitly leaves the behavior of +.B FS += "" undefined, and mentions splitting the record into characters as +a possible interpretation, but currently this use is not portable +across implementations. +.PP +Finally, here is how +.B mawk +handles exceptional cases not discussed in the +AWK book or the Posix draft. It is unsafe to assume +consistency across awks and safe to skip to +the next section. +.PP +.RS +substr(s, i, n) returns the characters of s in the intersection +of the closed interval [1, length(s)] and the half-open interval +[i, i+n). 
When this intersection is empty, the empty string is +returned; so substr("ABC", 1, 0) = "" and +substr("ABC", \-4, 6) = "A". +.PP +Every string, including the empty string, matches the empty string +at the +front so, s ~ // and s ~ "", are always 1 as is match(s, //) and +match(s, ""). The last two set +.B RLENGTH +to 0. +.PP +index(s, t) is always the same as match(s, t1) where t1 is the +same as t with metacharacters escaped. Hence consistency +with match requires that +index(s, "") always returns 1. +Also the condition, index(s,t) != 0 if and only if t is a substring +of s, requires index("","") = 1. +.PP +If getline encounters end of file, getline var, leaves var +unchanged. Similarly, on entry to the +.B END +actions, +.BR $0 , +the fields and +.B NF +have their value unaltered from the last record. +.SH SEE ALSO +.IR egrep (1) +.PP +Aho, Kernighan and Weinberger, +.IR "The AWK Programming Language" , +Addison-Wesley Publishing, 1988, (the AWK book), +defines the language, opening with a tutorial +and advancing to many interesting programs that delve into +issues of software design and analysis relevant to programming +in any language. +.PP +.IR "The GAWK Manual" , +The Free Software Foundation, 1991, is a tutorial +and language reference +that does not attempt the depth of the AWK book +and assumes the reader may be a novice programmer. +The section on AWK arrays is excellent. It also +discusses Posix requirements for AWK. +.SH BUGS +.B mawk +cannot handle ascii NUL \e0 in the source or data files. You +can output NUL using printf with %c, and any other 8 bit +character is acceptable input. +.PP +.B mawk +implements printf() and sprintf() using the C library functions, +printf and sprintf, so full ANSI compatibility requires an ANSI +C library. In practice this means the h conversion qualifier may +not be available. Also +.B mawk +inherits any bugs or limitations of the library functions.
+.PP +Implementors of the AWK language have shown a consistent lack +of imagination when naming their programs. +.SH AUTHOR +Mike Brennan (brennan@whidbey.com). diff --git a/man/mawk.doc b/man/mawk.doc new file mode 100644 index 0000000..7a4aa78 --- /dev/null +++ b/man/mawk.doc @@ -0,0 +1,1254 @@ + + + +MAWK(1) USER COMMANDS MAWK(1) + + + +NAME + mawk - pattern scanning and text processing language + +SYNOPSIS + mawk [-W _o_p_t_i_o_n] [-F _v_a_l_u_e] [-v _v_a_r=_v_a_l_u_e] [--] 'program + text' [file ...] + mawk [-W _o_p_t_i_o_n] [-F _v_a_l_u_e] [-v _v_a_r=_v_a_l_u_e] [-f _p_r_o_g_r_a_m-_f_i_l_e] + [--] [file ...] + +DESCRIPTION + mawk is an interpreter for the AWK Programming Language. + The AWK language is useful for manipulation of data files, + text retrieval and processing, and for prototyping and + experimenting with algorithms. mawk is a _n_e_w _a_w_k meaning it + implements the AWK language as defined in Aho, Kernighan and + Weinberger, _T_h_e _A_W_K _P_r_o_g_r_a_m_m_i_n_g _L_a_n_g_u_a_g_e, Addison-Wesley + Publishing, 1988. (Hereafter referred to as the AWK book.) + mawk conforms to the Posix 1003.2 (draft 11.3) definition of + the AWK language which contains a few features not described + in the AWK book, and mawk provides a small number of exten- + sions. + + An AWK program is a sequence of _p_a_t_t_e_r_n {_a_c_t_i_o_n} pairs and + function definitions. Short programs are entered on the + command line usually enclosed in ' ' to avoid shell + interpretation. Longer programs can be read in from a file + with the -f option. Data input is read from the list of + files on the command line or from standard input when the + list is empty. The input is broken into records as deter- + mined by the record separator variable, RS. Initially, RS = + "\n" and records are synonymous with lines. Each record is + compared against each _p_a_t_t_e_r_n and if it matches, the program + text for {_a_c_t_i_o_n} is executed. 
+ +OPTIONS + -F _v_a_l_u_e sets the field separator, FS, to _v_a_l_u_e. + + -f _f_i_l_e Program text is read from _f_i_l_e instead of + from the command line. Multiple -f options + are allowed. + + -v _v_a_r=_v_a_l_u_e assigns _v_a_l_u_e to program variable _v_a_r. + + -- indicates the unambiguous end of options. + + The above options will be available with any Posix compati- + ble implementation of AWK, and implementation specific + options are prefaced with -W. mawk provides six: + + -W version mawk writes its version and copyright to + stdout and compiled limits to stderr and + exits 0. + + + +Version 1.2 Last change: Dec 22 1994 1 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + -W dump writes an assembler like listing of the + internal representation of the program to + stdout and exits 0 (on successful compila- + tion). + + -W interactive sets unbuffered writes to stdout and line + buffered reads from stdin. Records from + stdin are lines regardless of the value of + RS. + + -W exec _f_i_l_e Program text is read from _f_i_l_e and this is + the last option. Useful on systems that sup- + port the #! "magic number" convention for + executable scripts. + + -W sprintf=_n_u_m adjusts the size of mawk's internal sprintf + buffer to _n_u_m bytes. More than rare use of + this option indicates mawk should be recom- + piled. + + -W posix_space forces mawk not to consider '\n' to be space. + + The short forms -W[vdiesp] are recognized and on some sys- + tems -We is mandatory to avoid command line length limita- + tions. + +THE AWK LANGUAGE + 1. Program structure + An AWK program is a sequence of _p_a_t_t_e_r_n {_a_c_t_i_o_n} pairs and + user function definitions. + + A pattern can be: + BEGIN + END + expression + expression , expression + + One, but not both, of _p_a_t_t_e_r_n {_a_c_t_i_o_n} can be omitted. If + {_a_c_t_i_o_n} is omitted it is implicitly { print }. If _p_a_t_t_e_r_n + is omitted, then it is implicitly matched. 
BEGIN and END + patterns require an action. + + Statements are terminated by newlines, semi-colons or both. + Groups of statements such as actions or loop bodies are + blocked via { ... } as in C. The last statement in a block + doesn't need a terminator. Blank lines have no meaning; an + empty statement is terminated with a semi-colon. Long state- + ments can be continued with a backslash, \. A statement can + be broken without a backslash after a comma, left brace, &&, + ||, do, else, the right parenthesis of an if, while or for + statement, and the right parenthesis of a function defini- + tion. A comment starts with # and extends to, but does not + + + +Version 1.2 Last change: Dec 22 1994 2 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + include the end of line. + + The following statements control program flow inside blocks. + + if ( _e_x_p_r ) _s_t_a_t_e_m_e_n_t + + if ( _e_x_p_r ) _s_t_a_t_e_m_e_n_t else _s_t_a_t_e_m_e_n_t + + while ( _e_x_p_r ) _s_t_a_t_e_m_e_n_t + + do _s_t_a_t_e_m_e_n_t while ( _e_x_p_r ) + + for ( _o_p_t__e_x_p_r ; _o_p_t__e_x_p_r ; _o_p_t__e_x_p_r ) _s_t_a_t_e_m_e_n_t + + for ( _v_a_r in _a_r_r_a_y ) _s_t_a_t_e_m_e_n_t + + continue + + break + + 2. Data types, conversion and comparison + There are two basic data types, numeric and string. Numeric + constants can be integer like -2, decimal like 1.08, or in + scientific notation like -1.1e4 or .28E-3. All numbers are + represented internally and all computations are done in + floating point arithmetic. So for example, the expression + 0.2e2 == 20 is true and true is represented as 1.0. + + String constants are enclosed in double quotes. + + "This is a string with a newline at the end.\n" + + Strings can be continued across a line by escaping (\) the + newline. The following escape sequences are recognized. 
+ + \\ \ + \" " + \a alert, ascii 7 + \b backspace, ascii 8 + \t tab, ascii 9 + \n newline, ascii 10 + \v vertical tab, ascii 11 + \f formfeed, ascii 12 + \r carriage return, ascii 13 + \ddd 1, 2 or 3 octal digits for ascii ddd + \xhh 1 or 2 hex digits for ascii hh + + If you escape any other character \c, you get \c, i.e., mawk + ignores the escape. + + There are really three basic data types; the third is _n_u_m_b_e_r + _a_n_d _s_t_r_i_n_g which has both a numeric value and a string value + + + +Version 1.2 Last change: Dec 22 1994 3 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + at the same time. User defined variables come into + existence when first referenced and are initialized to _n_u_l_l, + a number and string value which has numeric value 0 and + string value "". Non-trivial number and string typed data + come from input and are typically stored in fields. (See + section 4). + + The type of an expression is determined by its context and + automatic type conversion occurs if needed. For example, to + evaluate the statements + + y = x + 2 ; z = x "hello" + + The value stored in variable y will be typed numeric. If x + is not numeric, the value read from x is converted to + numeric before it is added to 2 and stored in y. The value + stored in variable z will be typed string, and the value of + x will be converted to string if necessary and concatenated + with "hello". (Of course, the value and type stored in x is + not changed by any conversions.) A string expression is con- + verted to numeric using its longest numeric prefix as with + _a_t_o_f(3). A numeric expression is converted to string by + replacing _e_x_p_r with sprintf(CONVFMT, _e_x_p_r), unless _e_x_p_r can + be represented on the host machine as an exact integer then + it is converted to sprintf("%d", _e_x_p_r). 
Sprintf() is an AWK + built-in that duplicates the functionality of _s_p_r_i_n_t_f(3), + and CONVFMT is a built-in variable used for internal conver- + sion from number to string and initialized to "%.6g". + Explicit type conversions can be forced, _e_x_p_r "" is string + and _e_x_p_r+0 is numeric. + + To evaluate, _e_x_p_r1 rel-op _e_x_p_r2, if both operands are + numeric or number and string then the comparison is numeric; + if both operands are string the comparison is string; if one + operand is string, the non-string operand is converted and + the comparison is string. The result is numeric, 1 or 0. + + In boolean contexts such as, if ( _e_x_p_r ) _s_t_a_t_e_m_e_n_t, a string + expression evaluates true if and only if it is not the empty + string ""; numeric values if and only if not numerically + zero. + + 3. Regular expressions + In the AWK language, records, fields and strings are often + tested for matching a _r_e_g_u_l_a_r _e_x_p_r_e_s_s_i_o_n. Regular expres- + sions are enclosed in slashes, and + + _e_x_p_r ~ /_r/ + + is an AWK expression that evaluates to 1 if _e_x_p_r "matches" + _r, which means a substring of _e_x_p_r is in the set of strings + defined by _r. With no match the expression evaluates to 0; + + + +Version 1.2 Last change: Dec 22 1994 4 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + replacing ~ with the "not match" operator, !~ , reverses the + meaning. As pattern-action pairs, + + /_r/ { _a_c_t_i_o_n } and $0 ~ /_r/ { _a_c_t_i_o_n } + + are the same, and for each input record that matches _r, + _a_c_t_i_o_n is executed. In fact, /_r/ is an AWK expression that + is equivalent to ($0 ~ /_r/) anywhere except when on the + right side of a match operator or passed as an argument to a + built-in function that expects a regular expression argu- + ment. + + AWK uses extended regular expressions as with _e_g_r_e_p(1). The + regular expression metacharacters, i.e., those with special + meaning in regular expressions are + + ^ $ . 
[ ] | ( ) * + ? + + Regular expressions are built up from characters as follows: + + _c matches any non-metacharacter _c. + + \_c matches a character defined by the same + escape sequences used in string constants + or the literal character _c if \_c is not an + escape sequence. + + . matches any character (including newline). + + ^ matches the front of a string. + + $ matches the back of a string. + + [c1c2c3...] matches any character in the class + c1c2c3... . An interval of characters is + denoted c1-c2 inside a class [...]. + + [^c1c2c3...] matches any character not in the class + c1c2c3... + + Regular expressions are built up from other regular expres- + sions as follows: + + _r1_r2 matches _r1 followed immediately by _r2 + (concatenation). + + _r1 | _r2 matches _r1 or _r2 (alternation). + + _r* matches _r repeated zero or more times. + + _r+ matches _r repeated one or more times. + + + + +Version 1.2 Last change: Dec 22 1994 5 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + _r? matches _r zero or once. + + (_r) matches _r, providing grouping. + + The increasing precedence of operators is alternation, con- + catenation and unary (*, + or ?). + + For example, + + /^[_a-zA-Z][_a-zA-Z0-9]*$/ and + /^[-+]?([0-9]+\.?|\.[0-9])[0-9]*([eE][-+]?[0-9]+)?$/ + + are matched by AWK identifiers and AWK numeric constants + respectively. Note that . has to be escaped to be recog- + nized as a decimal point, and that metacharacters are not + special inside character classes. + + Any expression can be used on the right hand side of the ~ + or !~ operators or passed to a built-in that expects a regu- + lar expression. If needed, it is converted to string, and + then interpreted as a regular expression. For example, + + BEGIN { identifier = "[_a-zA-Z][_a-zA-Z0-9]*" } + + $0 ~ "^" identifier + + prints all lines that start with an AWK identifier. 
+ + mawk recognizes the empty regular expression, //, which + matches the empty string and hence is matched by any string + at the front, back and between every character. For exam- + ple, + + echo abc | mawk { gsub(//, "X") ; print } + XaXbXcX + + + 4. Records and fields + Records are read in one at a time, and stored in the _f_i_e_l_d + variable $0. The record is split into _f_i_e_l_d_s which are + stored in $1, $2, ..., $NF. The built-in variable NF is set + to the number of fields, and NR and FNR are incremented by + 1. Fields above $NF are set to "". + + Assignment to $0 causes the fields and NF to be recomputed. + Assignment to NF or to a field causes $0 to be reconstructed + by concatenating the $i's separated by OFS. Assignment to a + field with index greater than NF, increases NF and causes $0 + to be reconstructed. + + Data input stored in fields is string, unless the entire + field has numeric form and then the type is number and + + + +Version 1.2 Last change: Dec 22 1994 6 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + string. For example, + + echo 24 24E | + mawk '{ print($1>100, $1>"100", $2>100, $2>"100") }' + 0 1 1 1 + + $0 and $2 are string and $1 is number and string. The first + comparison is numeric, the second is string, the third is + string (100 is converted to "100"), and the last is string. + + 5. Expressions and operators + The expression syntax is similar to C. Primary expressions + are numeric constants, string constants, variables, fields, + arrays and function calls. The identifier for a variable, + array or function can be a sequence of letters, digits and + underscores, that does not start with a digit. Variables + are not declared; they exist when first referenced and are + initialized to _n_u_l_l. + + New expressions are composed with the following operators in + order of increasing precedence. + + _a_s_s_i_g_n_m_e_n_t = += -= *= /= %= ^= + _c_o_n_d_i_t_i_o_n_a_l ? 
: + _l_o_g_i_c_a_l _o_r || + _l_o_g_i_c_a_l _a_n_d && + _a_r_r_a_y _m_e_m_b_e_r_s_h_i_p in + _m_a_t_c_h_i_n_g ~ !~ + _r_e_l_a_t_i_o_n_a_l < > <= >= == != + _c_o_n_c_a_t_e_n_a_t_i_o_n (no explicit operator) + _a_d_d _o_p_s + - + _m_u_l _o_p_s * / % + _u_n_a_r_y + - + _l_o_g_i_c_a_l _n_o_t ! + _e_x_p_o_n_e_n_t_i_a_t_i_o_n ^ + _i_n_c _a_n_d _d_e_c ++ -- (both post and pre) + _f_i_e_l_d $ + + Assignment, conditional and exponentiation associate right + to left; the other operators associate left to right. Any + expression can be parenthesized. + + 6. Arrays + Awk provides one-dimensional arrays. Array elements are + expressed as _a_r_r_a_y[_e_x_p_r]. _E_x_p_r is internally converted to + string type, so, for example, A[1] and A["1"] are the same + element and the actual index is "1". Arrays indexed by + strings are called associative arrays. Initially an array + is empty; elements exist when first accessed. An expres- + sion, _e_x_p_r in _a_r_r_a_y evaluates to 1 if _a_r_r_a_y[_e_x_p_r] exists, + else to 0. + + + + +Version 1.2 Last change: Dec 22 1994 7 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + There is a form of the for statement that loops over each + index of an array. + + for ( _v_a_r in _a_r_r_a_y ) _s_t_a_t_e_m_e_n_t + + sets _v_a_r to each index of _a_r_r_a_y and executes _s_t_a_t_e_m_e_n_t. The + order that _v_a_r transverses the indices of _a_r_r_a_y is not + defined. + + The statement, delete _a_r_r_a_y[_e_x_p_r], causes _a_r_r_a_y[_e_x_p_r] not to + exist. mawk supports an extension, delete _a_r_r_a_y, which + deletes all elements of _a_r_r_a_y. + + Multidimensional arrays are synthesized with concatenation + using the built-in variable SUBSEP. _a_r_r_a_y[_e_x_p_r1,_e_x_p_r2] is + equivalent to _a_r_r_a_y[_e_x_p_r1 SUBSEP _e_x_p_r2]. Testing for a mul- + tidimensional element uses a parenthesized index, such as + + if ( (i, j) in A ) print A[i, j] + + + 7. 
Builtin-variables + The following variables are built-in and initialized before + program execution. + + ARGC number of command line arguments. + + ARGV array of command line arguments, 0..ARGC-1. + + CONVFMT format for internal conversion of numbers to + string, initially = "%.6g". + + ENVIRON array indexed by environment variables. An + environment string, _v_a_r=_v_a_l_u_e is stored as + ENVIRON[_v_a_r] = _v_a_l_u_e. + + FILENAME name of the current input file. + + FNR current record number in FILENAME. + + FS splits records into fields as a regular + expression. + + NF number of fields in the current record. + + NR current record number in the total input + stream. + + OFMT format for printing numbers; initially = + "%.6g". + + OFS inserted between fields on output, initially + + + +Version 1.2 Last change: Dec 22 1994 8 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + = " ". + + ORS terminates each record on output, initially = + "\n". + + RLENGTH length set by the last call to the built-in + function, match(). + + RS input record separator, initially = "\n". + + RSTART index set by the last call to match(). + + SUBSEP used to build multiple array subscripts, ini- + tially = "\034". + + 8. Built-in functions + String functions + + gsub(_r,_s,_t) gsub(_r,_s) + Global substitution, every match of regular + expression _r in variable _t is replaced by string + _s. The number of replacements is returned. If _t + is omitted, $0 is used. An & in the replacement + string _s is replaced by the matched substring of + _t. \& and \\ put literal & and \, respectively, + in the replacement string. + + index(_s,_t) + If _t is a substring of _s, then the position where + _t starts is returned, else 0 is returned. The + first character of _s is in position 1. + + length(_s) + Returns the length of string _s. + + match(_s,_r) + Returns the index of the first longest match of + regular expression _r in string _s. Returns 0 if no + match. 
As a side effect, RSTART is set to the + return value. RLENGTH is set to the length of the + match or -1 if no match. If the empty string is + matched, RLENGTH is set to 0, and 1 is returned if + the match is at the front, and length(_s)+1 is + returned if the match is at the back. + + split(_s,_A,_r) split(_s,_A) + String _s is split into fields by regular expres- + sion _r and the fields are loaded into array _A. + The number of fields is returned. See section 11 + below for more detail. If _r is omitted, FS is + used. + + + + +Version 1.2 Last change: Dec 22 1994 9 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + sprintf(_f_o_r_m_a_t,_e_x_p_r-_l_i_s_t) + Returns a string constructed from _e_x_p_r-_l_i_s_t + according to _f_o_r_m_a_t. See the description of + printf() below. + + sub(_r,_s,_t) sub(_r,_s) + Single substitution, same as gsub() except at most + one substitution. + + substr(_s,_i,_n) substr(_s,_i) + Returns the substring of string _s, starting at + index _i, of length _n. If _n is omitted, the suffix + of _s, starting at _i is returned. + + tolower(_s) + Returns a copy of _s with all upper case characters + converted to lower case. + + toupper(_s) + Returns a copy of _s with all lower case characters + converted to upper case. + + Arithmetic functions + + atan2(_y,_x) Arctan of _y/_x between -pi and pi. + + cos(_x) Cosine function, _x in radians. + + exp(_x) Exponential function. + + int(_x) Returns _x truncated towards zero. + + log(_x) Natural logarithm. + + rand() Returns a random number between zero and one. + + sin(_x) Sine function, _x in radians. + + sqrt(_x) Returns square root of _x. + + srand(_e_x_p_r) srand() + Seeds the random number generator, using the clock + if _e_x_p_r is omitted, and returns the value of the + previous seed. mawk seeds the random number gen- + erator from the clock at startup so there is no + real need to call srand(). Srand(_e_x_p_r) is useful + for repeating pseudo random sequences. + + 9. 
Input and output + There are two output statements, print and printf. + + print + + + +Version 1.2 Last change: Dec 22 1994 10 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + writes $0 ORS to standard output. + + print _e_x_p_r1, _e_x_p_r2, ..., _e_x_p_rn + writes _e_x_p_r1 OFS _e_x_p_r2 OFS ... _e_x_p_rn ORS to stan- + dard output. Numeric expressions are converted to + string with OFMT. + + printf _f_o_r_m_a_t, _e_x_p_r-_l_i_s_t + duplicates the printf C library function writing + to standard output. The complete ANSI C format + specifications are recognized with conversions %c, + %d, %e, %E, %f, %g, %G, %i, %o, %s, %u, %x, %X and + %%, and conversion qualifiers h and l. + + The argument list to print or printf can optionally be + enclosed in parentheses. Print formats numbers using OFMT + or "%d" for exact integers. "%c" with a numeric argument + prints the corresponding 8 bit character, with a string + argument it prints the first character of the string. The + output of print and printf can be redirected to a file or + command by appending > _f_i_l_e, >> _f_i_l_e or | _c_o_m_m_a_n_d to the end + of the print statement. Redirection opens _f_i_l_e or _c_o_m_m_a_n_d + only once, subsequent redirections append to the already + open stream. By convention, mawk associates the filename + "/dev/stderr" with stderr which allows print and printf to + be redirected to stderr. mawk also associates "-" and + "/dev/stdout" with stdin and stdout which allows these + streams to be passed to functions. + + The input function getline has the following variations. + + getline + reads into $0, updates the fields, NF, NR and FNR. + + getline < _f_i_l_e + reads into $0 from _f_i_l_e, updates the fields and + NF. + + getline _v_a_r + reads the next record into _v_a_r, updates NR and + FNR. + + getline _v_a_r < _f_i_l_e + reads the next record of _f_i_l_e into _v_a_r. + + _c_o_m_m_a_n_d | getline + pipes a record from _c_o_m_m_a_n_d into $0 and updates + the fields and NF. 
+ + _c_o_m_m_a_n_d | getline _v_a_r + pipes a record from _c_o_m_m_a_n_d into _v_a_r. + + + + +Version 1.2 Last change: Dec 22 1994 11 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + Getline returns 0 on end-of-file, -1 on error, otherwise 1. + + Commands on the end of pipes are executed by /bin/sh. + + The function close(_e_x_p_r) closes the file or pipe associated + with _e_x_p_r. Close returns 0 if _e_x_p_r is an open file, the + exit status if _e_x_p_r is a piped command, and -1 otherwise. + Close is used to reread a file or command, make sure the + other end of an output pipe is finished or conserve file + resources. + + The function fflush(_e_x_p_r) flushes the output file or pipe + associated with _e_x_p_r. Fflush returns 0 if _e_x_p_r is an open + output stream else -1. Fflush without an argument flushes + stdout. Fflush with an empty argument ("") flushes all open + output. + + The function system(_e_x_p_r) uses /bin/sh to execute _e_x_p_r and + returns the exit status of the command _e_x_p_r. Changes made + to the ENVIRON array are not passed to commands executed + with system or pipes. + + 10. User defined functions + The syntax for a user defined function is + + function name( _a_r_g_s ) { _s_t_a_t_e_m_e_n_t_s } + + The function body can contain a return statement + + return _o_p_t__e_x_p_r + + A return statement is not required. Function calls may be + nested or recursive. Functions are passed expressions by + value and arrays by reference. Extra arguments serve as + local variables and are initialized to _n_u_l_l. For example, + csplit(_s,_A) puts each character of _s into array _A and + returns the length of _s. + + function csplit(s, A, n, i) + { + n = length(s) + for( i = 1 ; i <= n ; i++ ) A[i] = substr(s, i, 1) + return n + } + + Putting extra space between passed arguments and local vari- + ables is conventional. 
Functions can be referenced before + they are defined, but the function name and the '(' of the + arguments must touch to avoid confusion with concatenation. + + 11. Splitting strings, records and files + Awk programs use the same algorithm to split strings into + arrays with split(), and records into fields on FS. mawk + uses essentially the same algorithm to split files into + records on RS. + + Split(_e_x_p_r,_A,_s_e_p) works as follows: + + (1) If _s_e_p is omitted, it is replaced by FS. _S_e_p can + be an expression or regular expression. If it is + an expression of non-string type, it is converted + to string. + + (2) If _s_e_p = " " (a single space), then <SPACE> is + trimmed from the front and back of _e_x_p_r, and _s_e_p + becomes <SPACE>. mawk defines <SPACE> as the reg- + ular expression /[ \t\n]+/. Otherwise _s_e_p is + treated as a regular expression, except that + meta-characters are ignored for a string of length + 1, e.g., split(x, A, "*") and split(x, A, /\*/) + are the same. + + (3) If _e_x_p_r is not string, it is converted to string. + If _e_x_p_r is then the empty string "", split() + returns 0 and _A is set empty. Otherwise, all + non-overlapping, non-null and longest matches of + _s_e_p in _e_x_p_r, separate _e_x_p_r into fields which are + loaded into _A. The fields are placed in A[1], + A[2], ..., A[n] and split() returns n, the number + of fields which is the number of matches plus one. + Data placed in _A that looks numeric is typed + number and string. + + Splitting records into fields works the same except the + pieces are loaded into $1, $2,..., $NF. If $0 is empty, NF + is set to 0 and all $i to "". + + mawk splits files into records by the same algorithm, but + with the slight difference that RS is really a terminator + instead of a separator. (ORS is really a terminator too).
+ + E.g., if FS = ":+" and $0 = "a::b:" , then NF = 3 and + $1 = "a", $2 = "b" and $3 = "", but if "a::b:" is the + contents of an input file and RS = ":+", then there are + two records "a" and "b". + + RS = " " is not special. + + If FS = "", then mawk breaks the record into individual + characters, and, similarly, split(_s,_A,"") places the indivi- + dual characters of _s into _A. + + 12. Multi-line records + Since mawk interprets RS as a regular expression, multi-line + records are easy. Setting RS = "\n\n+", makes one or more + blank lines separate records. If FS = " " (the default), + then single newlines, by the rules for <SPACE> above, become + space and single newlines are field separators. + + For example, if a file is "a b\nc\n\n", RS = "\n\n+" + and FS = " ", then there is one record "a b\nc" with + three fields "a", "b" and "c". Changing FS = "\n", + gives two fields "a b" and "c"; changing FS = "", gives + one field identical to the record. + + If you want lines with spaces or tabs to be considered + blank, set RS = "\n([ \t]*\n)+". For compatibility with + other awks, setting RS = "" has the same effect as if blank + lines are stripped from the front and back of files and then + records are determined as if RS = "\n\n+". Posix requires + that "\n" always separates fields when RS = "" regardless + of the value of FS. mawk does not support this convention, + because defining "\n" as <SPACE> makes it unnecessary. + + Most of the time when you change RS for multi-line records, + you will also want to change ORS to "\n\n" so the record + spacing is preserved on output. + + 13. Program execution + This section describes the order of program execution. + First ARGC is set to the total number of command line argu- + ments passed to the execution phase of the program. ARGV[0] + is set to the name of the AWK interpreter and ARGV[1] ...
+ ARGV[ARGC-1] holds the remaining command line arguments + exclusive of options and program source. For example with + + mawk -f prog v=1 A t=hello B + + ARGC = 5 with ARGV[0] = "mawk", ARGV[1] = "v=1", ARGV[2] = + "A", ARGV[3] = "t=hello" and ARGV[4] = "B". + + Next, each BEGIN block is executed in order. If the program + consists entirely of BEGIN blocks, then execution ter- + minates, else an input stream is opened and execution con- + tinues. If ARGC equals 1, the input stream is set to stdin, + else the command line arguments ARGV[1] ... ARGV[ARGC-1] + are examined for a file argument. + + The command line arguments divide into three sets: file + arguments, assignment arguments and empty strings "". An + assignment has the form _v_a_r=_s_t_r_i_n_g. When an ARGV[i] is + examined as a possible file argument, if it is empty it is + skipped; if it is an assignment argument, the assignment to + _v_a_r takes place and i skips to the next argument; else + ARGV[i] is opened for input. If it fails to open, execution + terminates with exit code 2. If no command line argument is + a file argument, then input comes from stdin. Getline in a + BEGIN action opens input. "-" as a file argument denotes + stdin. + + Once an input stream is open, each input record is tested + against each _p_a_t_t_e_r_n, and if it matches, the associated + _a_c_t_i_o_n is executed. An expression pattern matches if it is + boolean true (see the end of section 2). A BEGIN pattern + matches before any input has been read, and an END pattern + matches after all input has been read. A range pattern, + _e_x_p_r1,_e_x_p_r2 , matches every record between the match of + _e_x_p_r1 and the match of _e_x_p_r2 inclusively.
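The inclusive behaviour of a range pattern can be sketched with a small shell function. The name `print_range`, the `start` and `stop` marker lines and the `AWK` variable (defaulting to the system `awk`, an assumption) are hypothetical and chosen only for illustration.

```shell
# Hypothetical helper: print each record from the first line matching
# ^start$ through the next line matching ^stop$, both ends included.
# AWK defaults to the system awk (an assumption); use AWK=mawk for mawk.
print_range() {
    "${AWK:-awk}" '/^start$/,/^stop$/ { print NR ": " $0 }'
}

# Records 2 through 4 fall inside the range; records 1 and 5 do not.
printf '%s\n' before start inside stop after | print_range
```

Prefixing each line with NR makes it visible that both the record matching the first expression and the record matching the second are part of the range.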
+ + When end of file occurs on the input stream, the remaining + command line arguments are examined for a file argument, and + if there is one it is opened, else the END _p_a_t_t_e_r_n is con- + sidered matched and all END _a_c_t_i_o_n_s are executed. + + In the example, the assignment v=1 takes place after the + BEGIN _a_c_t_i_o_n_s are executed, and the data placed in v is + typed number and string. Input is then read from file A. + On end of file A, t is set to the string "hello", and B is + opened for input. On end of file B, the END _a_c_t_i_o_n_s are + executed. + + Program flow at the _p_a_t_t_e_r_n {_a_c_t_i_o_n} level can be changed + with the + + next + exit _o_p_t__e_x_p_r + + statements. A next statement causes the next input record + to be read and pattern testing to restart with the first + _p_a_t_t_e_r_n {_a_c_t_i_o_n} pair in the program. An exit statement + causes immediate execution of the END actions or program + termination if there are none or if the exit occurs in an + END action. The _o_p_t__e_x_p_r sets the exit value of the program + unless overridden by a later exit or subsequent error. + +EXAMPLES + 1. emulate cat. + + { print } + + 2. emulate wc. + + { chars += length($0) + 1 # add one for the \n + words += NF + } + + END{ print NR, words, chars } + + + + +Version 1.2 Last change: Dec 22 1994 15 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + 3. count the number of unique "real words". + + BEGIN { FS = "[^A-Za-z]+" } + + { for(i = 1 ; i <= NF ; i++) word[$i] = "" } + + END { delete word[""] + for ( i in word ) cnt++ + print cnt + } + + 4. sum the second field of every record based on the first + field. + + $1 ~ /credit|gain/ { sum += $2 } + $1 ~ /debit|loss/ { sum -= $2 } + + END { print sum } + + 5. 
sort a file, comparing as string + + { line[NR] = $0 "" } # make sure of comparison type + # in case some lines look numeric + + END { isort(line, NR) + for(i = 1 ; i <= NR ; i++) print line[i] + } + + #insertion sort of A[1..n] + function isort( A, n, i, j, hold) + { + for( i = 2 ; i <= n ; i++) + { + hold = A[j = i] + while ( A[j-1] > hold ) + { j-- ; A[j+1] = A[j] } + A[j] = hold + } + # sentinel A[0] = "" will be created if needed + } + + +COMPATIBILITY ISSUES + The Posix 1003.2(draft 11.3) definition of the AWK language + is AWK as described in the AWK book with a few extensions + that appeared in SystemVR4 nawk. The extensions are: + + New functions: toupper() and tolower(). + + New variables: ENVIRON[] and CONVFMT. + + ANSI C conversion specifications for printf() and + + + +Version 1.2 Last change: Dec 22 1994 16 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + sprintf(). + + New command options: -v var=value, multiple -f options + and implementation options as arguments to -W. + + + Posix AWK is oriented to operate on files a line at a time. + RS can be changed from "\n" to another single character, but + it is hard to find any use for this - there are no examples + in the AWK book. By convention, RS = "", makes one or more + blank lines separate records, allowing multi-line records. + When RS = "", "\n" is always a field separator regardless of + the value in FS. + + mawk, on the other hand, allows RS to be a regular expres- + sion. When "\n" appears in records, it is treated as space, + and FS always determines fields. + + Removing the line at a time paradigm can make some programs + simpler and can often improve performance. For example, + redoing example 3 from above, + + BEGIN { RS = "[^A-Za-z]+" } + + { word[ $0 ] = "" } + + END { delete word[ "" ] + for( i in word ) cnt++ + print cnt + } + + counts the number of unique words by making each word a + record. On moderate size files, mawk executes twice as + fast, because of the simplified inner loop. 
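A runnable sketch of the word-per-record idea, under the assumption that the awk in use treats RS as a regular expression (mawk and gawk do; an awk that only honours the first character of RS will give a different answer). The function name `count_unique_words`, the sample sentence and the `AWK` variable are hypothetical.

```shell
# Hypothetical helper: count distinct words by making each word a record.
# Assumes an awk with regular expression RS, such as mawk or gawk.
count_unique_words() {
    "${AWK:-awk}" '
        BEGIN { RS = "[^A-Za-z]+" }     # any run of non-letters ends a record
        { word[$0] = "" }               # one array element per distinct word
        END {
            delete word[""]             # drop the empty record, if any
            for (i in word) cnt++
            print cnt + 0               # + 0 prints 0 instead of "" when empty
        }
    '
}

# "the" and "The" differ in case, so they count as two distinct words.
printf 'the cat, the dog; The end\n' | count_unique_words
```

No per-field loop is needed because each record already is a single word, which is the simplification the text credits for the speedup.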
+ + The following program replaces each comment by a single + space in a C program file, + + BEGIN { + RS = "/\*([^*]|\*+[^/*])*\*+/" + # comment is record separator + ORS = " " + getline hold + } + + { print hold ; hold = $0 } + + END { printf "%s" , hold } + + Buffering one record is needed to avoid terminating the last + record with a space. + + + + +Version 1.2 Last change: Dec 22 1994 17 + + + + + + +MAWK(1) USER COMMANDS MAWK(1) + + + + With mawk, the following are all equivalent, + + x ~ /a\+b/ x ~ "a\+b" x ~ "a\\+b" + + The strings get scanned twice, once as string and once as + regular expression. On the string scan, mawk ignores the + escape on non-escape characters while the AWK book advocates + _\_c be recognized as _c which necessitates the double escaping + of meta-characters in strings. Posix explicitly declines to + define the behavior which passively forces programs that + must run under a variety of awks to use the more portable + but less readable, double escape. + + Posix AWK does not recognize "/dev/std{out,err}" or \x hex + escape sequences in strings. Unlike ANSI C, mawk limits the + number of digits that follows \x to two as the current + implementation only supports 8 bit characters. The built-in + fflush first appeared in a recent (1993) AT&T awk released + to netlib, and is not part of the posix standard. Aggregate + deletion with delete _a_r_r_a_y is not part of the posix stan- + dard. + + Posix explicitly leaves the behavior of FS = "" undefined, + and mentions splitting the record into characters as a pos- + sible interpretation, but currently this use is not portable + across implementations. + + Finally, here is how mawk handles exceptional cases not dis- + cussed in the AWK book or the Posix draft. It is unsafe to + assume consistency across awks and safe to skip to the next + section. + + substr(s, i, n) returns the characters of s in the + intersection of the closed interval [1, length(s)] and + the half-open interval [i, i+n). 
When this intersec- + tion is empty, the empty string is returned; so + substr("ABC", 1, 0) = "" and substr("ABC", -4, 6) = + "A". + + Every string, including the empty string, matches the + empty string at the front so, s ~ // and s ~ "", are + always 1 as is match(s, //) and match(s, ""). The last + two set RLENGTH to 0. + + index(s, t) is always the same as match(s, t1) where t1 + is the same as t with metacharacters escaped. Hence + consistency with match requires that index(s, "") + always returns 1. Also the condition, index(s,t) != 0 + if and only if t is a substring of s, requires + index("","") = 1. + + If getline encounters end of file, getline var, leaves + var unchanged. Similarly, on entry to the END actions, + $0, the fields and NF have their value unaltered from + the last record. + +SEE ALSO + _e_g_r_e_p(1) + + Aho, Kernighan and Weinberger, _T_h_e _A_W_K _P_r_o_g_r_a_m_m_i_n_g _L_a_n_g_u_a_g_e, + Addison-Wesley Publishing, 1988, (the AWK book), defines the + language, opening with a tutorial and advancing to many + interesting programs that delve into issues of software + design and analysis relevant to programming in any language. + + _T_h_e _G_A_W_K _M_a_n_u_a_l, The Free Software Foundation, 1991, is a + tutorial and language reference that does not attempt the + depth of the AWK book and assumes the reader may be a novice + programmer. The section on AWK arrays is excellent. It also + discusses Posix requirements for AWK. + +BUGS + mawk cannot handle ascii NUL \0 in the source or data files. + You can output NUL using printf with %c, and any other 8 bit + character is acceptable input. + + mawk implements printf() and sprintf() using the C library + functions, printf and sprintf, so full ANSI compatibility + requires an ANSI C library. In practice this means the h + conversion qualifier may not be available.
Also mawk inher- + its any bugs or limitations of the library functions. + + Implementors of the AWK language have shown a consistent + lack of imagination when naming their programs. + +AUTHOR + Mike Brennan (brennan@whidbey.com). + + + + + + + + + + + + + + + + + + + +Version 1.2 Last change: Dec 22 1994 19 + + + diff --git a/matherr.c b/matherr.c new file mode 100644 index 0000000..8131faf --- /dev/null +++ b/matherr.c @@ -0,0 +1,277 @@ + +/******************************************** +matherr.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/*$Log: matherr.c,v $ + *Revision 1.9 1996/09/01 16:54:35 mike + *Third try at bug fix for solaris strtod. + * + * Revision 1.6 1994/12/18 20:53:43 mike + * check NetBSD mathlib defines + * + * Revision 1.5 1994/12/14 14:48:57 mike + * add include -- sysV doesn't have it inside + * restore #else that had been removed + * + * Revision 1.4 1994/10/11 00:36:17 mike + * systemVr4 siginfo + * + * Revision 1.3 1993/07/17 13:23:04 mike + * indent and general code cleanup + * + * Revision 1.2 1993/07/04 12:52:03 mike + * start on autoconfig changes + * + * Revision 5.2 1992/03/31 16:14:44 brennan + * patch2: + * TURN_ON_FPE_TRAPS() macro + * USE_IEEEFP_H macro + * + * Revision 5.1 91/12/05 07:56:18 brennan + * 1.1 pre-release + * +*/ + +#include "mawk.h" +#include + +/* Sets up NetBSD 1.0A for ieee floating point */ +#if defined(_LIB_VERSION_TYPE) && defined(_LIB_VERSION) && defined(_IEEE_) +_LIB_VERSION_TYPE _LIB_VERSION = _IEEE_; +#endif + +#ifdef USE_IEEEFP_H +#include +#ifdef HAVE_STRTOD_OVF_BUG +static fp_except entry_mask ; +static fp_except working_mask ; +#endif +#endif + +#ifndef TURN_OFF_FPE_TRAPS +#define TURN_OFF_FPE_TRAPS() /* nothing */ +#endif + +#ifndef TURN_ON_FPE_TRAPS +#define 
TURN_ON_FPE_TRAPS() /* nothing */ +#endif + +#ifdef SV_SIGINFO +#include <siginfo.h> +#define FPE_ZERODIVIDE FPE_FLTDIV +#define FPE_OVERFLOW FPE_FLTOVF +#endif + +#ifdef FPE_TRAPS_ON +#include <signal.h> + +/* machine dependent changes might be needed here */ + +#ifdef SV_SIGINFO +static void +fpe_catch(signal, sip) + int signal; + siginfo_t *sip ; +{ + int why = sip->si_code ; + +#else + +static void +fpe_catch(signal, why) + int signal, why ; +{ +#endif /* SV_SIGINFO */ + +#if NOINFO_SIGFPE + rt_error("floating point exception, probably overflow") ; + /* does not return */ +#else + + switch (why) + { + case FPE_ZERODIVIDE: + rt_error("division by zero") ; + + case FPE_OVERFLOW: + rt_error("floating point overflow") ; + + default: + rt_error("floating point exception") ; + } +#endif /* noinfo_sigfpe */ +} + +void +fpe_init() +{ + TURN_ON_FPE_TRAPS() ; + +#ifndef SV_SIGINFO + signal(SIGFPE, fpe_catch) ; + +#else + { struct sigaction x ; + + memset(&x, 0, sizeof(x)) ; + x.sa_handler = fpe_catch ; + x.sa_flags = SA_SIGINFO ; + + sigaction(SIGFPE, &x, (struct sigaction*)0) ; + } +#endif + +#ifdef HAVE_STRTOD_OVF_BUG + /* we've already turned the traps on */ + working_mask = fpgetmask() ; + entry_mask = working_mask & ~FP_X_DZ & ~FP_X_OFL ; +#endif +} + +#else /* FPE_TRAPS not defined */ + +void +fpe_init() +{ + TURN_OFF_FPE_TRAPS() ; +} +#endif + +#ifndef NO_MATHERR + +#ifndef FPE_TRAPS_ON + +/* If we are not trapping math errors, we will shutup the library calls +*/ + +int +matherr(e) + struct exception *e ; +{ + return 1 ; +} + +#else /* print error message and exit */ + +int +matherr(e) + struct exception *e ; +{ + char *error ; + + switch (e->type) + { + case DOMAIN: + case SING: + error = "domain error" ; + break ; + + case OVERFLOW: + error = "overflow" ; + break ; + + case TLOSS: + case PLOSS: + error = "loss of significance" ; + break ; + + case UNDERFLOW: + e->retval = 0.0 ; + return 1 ; /* ignore it */ + } + + if (strcmp(e->name, "atan2") == 0) rt_error("atan2(%g,%g) : %s", +
e->arg1, e->arg2, error) ; + else rt_error("%s(%g) : %s", e->name, e->arg1, error) ; + + /* won't get here */ + return 0 ; +} +#endif /* FPE_TRAPS_ON */ + +#endif /* ! no matherr */ + + +/* this is how one gets the libm calls to do the right +thing on bsd43_vax +*/ + +#ifdef BSD43_VAX + +#include + +double infnan(arg) + int arg ; +{ + switch (arg) + { + case ERANGE : errno = ERANGE ; return HUGE ; + case -ERANGE : errno = EDOM ; return -HUGE ; + default: + errno = EDOM ; + } + return 0.0 ; +} + +#endif /* BSD43_VAX */ + +/* This routine is for XENIX-68K 2.3A. + Error check routine to be called after fp arithmetic. +*/ + +#if SW_FP_CHECK +/* Definitions of bit values in iserr() return value */ + +#define OVFLOW 2 +#define UFLOW 4 +#define ZERODIV 8 +#define OVFLFIX 32 +#define INFNAN 64 + +void +fpcheck() +{ + register int fperrval ; + char *errdesc ; + + if ((fperrval = iserr()) == 0) + return ; /* no error */ + + errdesc = (char *) 0 ; + + if (fperrval & INFNAN) errdesc = "arg is infinity or NAN" ; + else if (fperrval & ZERODIV) errdesc = "division by zero" ; + else if (fperrval & OVFLOW) errdesc = "overflow" ; + else if (fperrval & UFLOW) ; /* ignored */ + + if (errdesc) rt_error("%s", errdesc) ; +} + +#endif + +#ifdef HAVE_STRTOD_OVF_BUG +/* buggy strtod in solaris, probably any sysv with ieee754 + strtod can generate an fpe */ + +double +strtod_with_ovf_bug(s, ep) + const char *s ; + char **ep ; +{ + double ret ; + + fpsetmask(entry_mask) ; /* traps off */ +#undef strtod /* make real strtod visible */ + ret = strtod(s, ep) ; + fpsetmask(working_mask) ; /* traps on */ + return ret ; +} +#endif diff --git a/matherr.o b/matherr.o new file mode 100644 index 0000000..f993f0c Binary files /dev/null and b/matherr.o differ diff --git a/mawk b/mawk new file mode 100755 index 0000000..a99409b Binary files /dev/null and b/mawk differ diff --git a/mawk.ac.m4 b/mawk.ac.m4 new file mode 100644 index 0000000..1efd798 --- /dev/null +++ b/mawk.ac.m4 @@ -0,0 +1,369 @@ +dnl +dnl 
custom mawk macros for autoconf +dnl +dnl $Log: mawk.ac.m4,v $ +dnl Revision 1.15 1996/09/04 23:40:34 mike +dnl Small tweak to strtod bug check +dnl +dnl Revision 1.14 1996/08/30 00:07:18 mike +dnl Modifications to the test and implementation of the bug fix for +dnl solaris overflow in strtod. +dnl +dnl Revision 1.13 1996/08/25 19:31:03 mike +dnl Added work-around for solaris strtod overflow bug. +dnl +dnl Revision 1.12 1996/01/18 11:51:36 mike +dnl 1.2.2 release +dnl +dnl Revision 1.11 1995/10/16 12:25:03 mike +dnl configure cleanup +dnl +dnl Revision 1.10 1995/04/20 20:26:54 mike +dnl beta improvements from Carl Mascott +dnl +dnl Revision 1.9 1994/12/18 20:46:23 mike +dnl fpe_check -> ./fpe_check +dnl +dnl Revision 1.8 1994/12/14 14:38:22 mike +dnl don't assume siginfo.h is inside signal.h +dnl +dnl Revision 1.7 1994/12/11 21:25:18 mike +dnl tweak XDEFINE +dnl +dnl Revision 1.6 1994/10/16 18:38:26 mike +dnl use sed on defines.out +dnl +dnl Revision 1.5 1994/10/11 02:49:08 mike +dnl systemVr4 siginfo +dnl +dnl Revision 1.4 1994/10/11 00:39:27 mike +dnl fpe check stuff +dnl +dnl +dnl ********** look for math library ***************** +dnl +define(MIKE, brennan@whidbey.com) +dnl +define(LOOK_FOR_MATH_LIBRARY,[ +if test "${MATHLIB+set}" != set ; then +AC_CHECK_LIB(m,log,[MATHLIB=-lm ; LIBS="$LIBS -lm"], +[# maybe don't need separate math library +AC_CHECK_FUNC(log, log=yes) +if test "$log$" = yes +then + MATHLIB='' # evidently don't need one +else + AC_MSG_ERROR( +Cannot find a math library. 
You need to set MATHLIB in config.user) +fi])dnl +fi +AC_SUBST(MATHLIB)])dnl +dnl +dnl ********* utility macros ********************** +dnl +dnl I can't get AC_DEFINE_NOQUOTE to work so give up +define([XDEFINE],[AC_DEFINE($1) +echo X '$1' 'ifelse($2,,1,$2)' >> defines.out])dnl +define([XXDEFINE], +[echo X '$1' '$2' >> defines.out])dnl +dnl +dnl +dnl We want #define NO_STRERROR +dnl instead of #define HAVE_STRERROR +dnl +dnl +define([XADD_NO],[NO_[$1]])dnl +define([ADD_NO], [XADD_NO(translit($1, a-z. , A-Z_))])dnl +define([HEADER_CHECK],[AC_CHECK_HEADER($1, ,XDEFINE(ADD_NO($1)))])dnl +define([FUNC_CHECK],[AC_CHECK_FUNC($1, ,XDEFINE(ADD_NO($1)))])dnl +dnl +dnl how to repeat a macro on a list of args +dnl (probably won't work if the args are expandable +dnl +define([REPEAT_IT], +[ifelse($#,1,[$1],$#,2,[$1($2)], +[$1($2) +REPEAT_IT([$1], +builtin(shift,builtin(shift,$*)))])])dnl + +define([CHECK_HEADERS],[REPEAT_IT([HEADER_CHECK],$*)])dnl +define([CHECK_FUNCTIONS],[REPEAT_IT([FUNC_CHECK],$*)])dnl +dnl +dnl ******* find size_t ******************** +dnl +define([SIZE_T_CHECK],[ + [if test "$size_t_defed" != 1 ; then] + AC_CHECK_HEADER($1,size_t_header=ok) + [if test "$size_t_header" = ok ; then] + AC_TRY_COMPILE([ +#include <$1>], +[size_t *n ; +], [size_t_defed=1; +XXDEFINE($2,1) +echo getting size_t from '<$1>']) +[fi;fi]])dnl +define(WHERE_SIZE_T, +[SIZE_T_CHECK(stddef.h,SIZE_T_STDDEF_H) +SIZE_T_CHECK(sys/types.h,SIZE_T_TYPES_H)])dnl +dnl +dnl ********** check compiler ****************** +dnl +define(COMPILER_ATTRIBUTES, +[AC_MSG_CHECKING(compiler supports void*) +AC_TRY_COMPILE( +[char *cp ; +void *foo() ;] , +[cp = (char*)(void*)(int*)foo() ;],void_star=yes,void_star=no) +AC_MSG_RESULT($void_star) +test "$void_star" = no && XXDEFINE(NO_VOID_STAR,1) +AC_MSG_CHECKING(compiler groks prototypes) +AC_TRY_COMPILE(,[int x(char*);],protos=yes,protos=no) +AC_MSG_RESULT([$protos]) +test "$protos" = no && XXDEFINE(NO_PROTOS,1) +AC_C_CONST +test "$ac_cv_c_const" = no && 
XXDEFINE(const)])dnl +dnl +dnl +dnl +dnl ********** which yacc *********** +define(WHICH_YACC, +[AC_CHECK_PROGS(YACC, byacc bison yacc) +test "$YACC" = bison && YACC='bison -y'])dnl +dnl +dnl ************* header and footer for config.h ******************* +dnl +define(CONFIG_H_HEADER, +[cat<<'EOF' +/* config.h -- generated by configure */ +#ifndef CONFIG_H +#define CONFIG_H + +EOF])dnl +define(CONFIG_H_TRAILER, +[cat<<'EOF' + +#define HAVE_REAL_PIPES 1 +#endif /* CONFIG_H */ +EOF])dnl +dnl +dnl ************* build config.h *********************** +define(DO_CONFIG_H, +[# output config.h +rm -f config.h +( +CONFIG_H_HEADER +[sed 's/^X/#define/' defines.out] +CONFIG_H_TRAILER +) | tee config.h +rm defines.out])dnl +dnl +dnl +dnl *************** [sf]printf checks needed for print.c *********** +dnl +dnl sometimes fprintf() and sprintf() are not proto'ed in +dnl stdio.h +define(FPRINTF_IN_STDIO, +[AC_EGREP_HEADER([[[^v]]fprintf],stdio.h,,XDEFINE(NO_FPRINTF_IN_STDIO)) +AC_EGREP_HEADER([[[^v]]sprintf],stdio.h,,XDEFINE(NO_SPRINTF_IN_STDIO))])dnl +dnl +dnl ************************************************** +dnl C program to compute MAX__INT and MAX__LONG +dnl if looking at headers fails +define([MAX__INT_PROGRAM], +[[#include <stdio.h> +int main() +{ int y ; long yy ; + FILE *out ; + + if ( !(out = fopen("maxint.out","w")) ) exit(1) ; + /* find max int and max long */ + y = 0x1000 ; + while ( y > 0 ) y *= 2 ; + fprintf(out,"X MAX__INT 0x%x\n", y-1) ; + yy = 0x1000 ; + while ( yy > 0 ) yy *= 2 ; + fprintf(out,"X MAX__LONG 0x%lx\n", yy-1) ; + exit(0) ; + return 0 ; + }]])dnl +dnl +dnl *** Try to find a definition of MAX__INT from limits.h else compute*** +dnl +define(FIND_OR_COMPUTE_MAX__INT, +[AC_CHECK_HEADER(limits.h,limits_h=yes) +if test "$limits_h" = yes ; then : +else +AC_CHECK_HEADER(values.h,values_h=yes) + if test "$values_h" = yes ; then + AC_TRY_RUN( +[#include <values.h> +#include <stdio.h> +int main() +{ FILE *out = fopen("maxint.out", "w") ; + if ( ! 
out ) exit(1) ; + fprintf(out, "X MAX__INT 0x%x\n", MAXINT) ; + fprintf(out, "X MAX__LONG 0x%lx\n", MAXLONG) ; + exit(0) ; return(0) ; +} +], maxint_set=1,[MAX_INT_ERRMSG]) + fi +if test "$maxint_set" != 1 ; then +# compute it -- assumes two's complement +AC_TRY_RUN(MAX__INT_PROGRAM,:,[MAX_INT_ERRMSG]) +fi +cat maxint.out >> defines.out ; rm -f maxint.out +fi ;])dnl +dnl +define(MAX_INT_ERRMSG, +[AC_MSG_ERROR(C program to compute maxint and maxlong failed. +Please send bug report to MIKE.)])dnl +dnl +dnl ********** input config.user ****************** +define(GET_USER_DEFAULTS, +[cat < /dev/null > defines.out +test -f config.user && . ./config.user +NOTSET_THEN_DEFAULT(BINDIR,/usr/local/bin) +NOTSET_THEN_DEFAULT(MANDIR,/usr/local/man/man1) +NOTSET_THEN_DEFAULT(MANEXT,1) +echo "$USER_DEFINES" >> defines.out]) +dnl +dnl ************************************************ +dnl +define([NOTSET_THEN_DEFAULT], +[test "[$]{$1+set}" = set || $1="$2" +AC_SUBST($1)])dnl +dnl +dnl ****************** sysV and solaris fpe checks *********** +dnl +define(LOOK_FOR_FPE_SIGINFO, +[AC_CHECK_FUNC(sigaction, sigaction=1) +AC_CHECK_HEADER(siginfo.h,siginfo_h=1) +if test "$sigaction" = 1 && test "$siginfo_h" = 1 ; then + XDEFINE(SV_SIGINFO) +else + AC_CHECK_FUNC(sigvec,sigvec=1) + if test "$sigvec" = 1 && ./fpe_check phoney_arg >> defines.out ; then : + else XDEFINE(NOINFO_SIGFPE) + fi +fi]) +dnl +dnl +dnl ******** AC_PROG_CC with defaultout -g to cflags ************** +dnl +AC_DEFUN([PROG_CC_NO_MINUS_G_NONSENSE], +[AC_BEFORE([$0], [AC_PROG_CPP])dnl +AC_CHECK_PROG(CC, gcc, gcc, cc) +dnl +AC_MSG_CHECKING(whether we are using GNU C) +AC_CACHE_VAL(ac_cv_prog_gcc, +[dnl The semicolon is to pacify NeXT's syntax-checking cpp. 
+cat > conftest.c <&AC_FD_CC | egrep yes >/dev/null 2>&1; then + ac_cv_prog_gcc=yes +else + ac_cv_prog_gcc=no +fi])dnl +AC_MSG_RESULT($ac_cv_prog_gcc) +rm -f conftest* +])dnl +dnl +dnl *********** dreaded fpe tests ************* +dnl +define(DREADED_FPE_TESTS, +[if echo "$USER_DEFINES" | grep FPE_TRAPS_ON >/dev/null +then echo skipping fpe tests based on '$'USER_DEFINES +else +AC_TYPE_SIGNAL +[ +echo checking handling of floating point exceptions +rm -f fpe_check +$CC $CFLAGS -DRETSIGTYPE=$ac_cv_type_signal -o fpe_check fpe_check.c $MATHLIB +if test -f fpe_check ; then + ./fpe_check 2>/dev/null + status=$? +else + echo fpe_check.c failed to compile 1>&2 + status=100 +fi + +case $status in + 0) ;; # good news do nothing + 3) # reasonably good news] +XDEFINE(FPE_TRAPS_ON) +LOOK_FOR_FPE_SIGINFO ;; + + 1|2|4) # bad news have to turn off traps + # only know how to do this on systemV and solaris +AC_CHECK_HEADER(ieeefp.h, ieeefp_h=1) +AC_CHECK_FUNC(fpsetmask, fpsetmask=1) +[if test "$ieeefp_h" = 1 && test "$fpsetmask" = 1 ; then] +XDEFINE(FPE_TRAPS_ON) +XDEFINE(USE_IEEEFP_H) +XXDEFINE([TURN_ON_FPE_TRAPS()], +[fpsetmask(fpgetmask()|FP_X_DZ|FP_X_OFL)]) +LOOK_FOR_FPE_SIGINFO +# look for strtod overflow bug +AC_MSG_CHECKING([strtod bug on overflow]) +rm -f fpe_check +$CC $CFLAGS -DRETSIGTYPE=$ac_cv_type_signal -DUSE_IEEEFP_H \ + -o fpe_check fpe_check.c $MATHLIB +if ./fpe_check phoney_arg phoney_arg 2>/dev/null +then + AC_MSG_RESULT([no bug]) +else + AC_MSG_RESULT([buggy -- will use work around]) + XXDEFINE([HAVE_STRTOD_OVF_BUG],1) +fi + +else + [if test $status != 4 ; then] + XDEFINE(FPE_TRAPS_ON) + LOOK_FOR_FPE_SIGINFO + fi + + [case $status in + 1) +cat 1>&2 <<'EOF' +Warning: Your system defaults generate floating point exception +on divide by zero but not on overflow. You need to +#define TURN_ON_FPE_TRAPS() to handle overflow. +Please report this so I can fix this script to do it automatically. 
+EOF +;; + 2) +cat 1>&2 <<'EOF' +Warning: Your system defaults generate floating point exception +on overflow but not on divide by zero. You need to +#define TURN_ON_FPE_TRAPS() to handle divide by zero. +Please report this so I can fix this script to do it automatically. +EOF +;; + 4) +cat 1>&2 <<'EOF' +Warning: Your system defaults do not generate floating point +exceptions, but your math library does not support this behavior. +You need to +#define TURN_ON_FPE_TRAPS() to use fp exceptions for consistency. +Please report this so I can fix this script to do it automatically. +EOF +;; + esac] +echo MIKE +[echo You can continue with the build and the resulting mawk will be +echo useable, but getting FPE_TRAPS_ON correct eventually is best. +fi ;; + + *) # some sort of disaster +cat 1>&2 <<'EOF' +The program `fpe_check' compiled from fpe_check.c seems to have +unexpectedly blown up. Please report this to ]MIKE.[ +EOF +# quit or not ??? +;; +esac +rm -f fpe_check # whew!!] +fi]) diff --git a/mawk.h b/mawk.h new file mode 100644 index 0000000..2eb358d --- /dev/null +++ b/mawk.h @@ -0,0 +1,175 @@ + +/******************************************** +mawk.h +copyright 1991-94, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + + +/* $Log: mawk.h,v $ + * Revision 1.10 1996/08/25 19:31:04 mike + * Added work-around for solaris strtod overflow bug. + * + * Revision 1.9 1995/06/18 19:42:21 mike + * Remove some redundant declarations and add some prototypes + * + * Revision 1.8 1995/06/18 19:17:48 mike + * Create a type Int which on most machines is an int, but on machines + * with 16bit ints, i.e., the PC is a long. This fixes implicit assumption + * that int==long. 
+ * + * Revision 1.7 1995/06/09 22:57:17 mike + * parse() no longer returns on error + * + * Revision 1.6 1994/12/13 00:09:55 mike + * rt_nr and rt_fnr for run-time error messages + * + * Revision 1.5 1994/12/11 23:25:09 mike + * -Wi option + * + * Revision 1.4 1994/12/11 22:14:18 mike + * remove THINK_C #defines. Not a political statement, just no indication + * that anyone ever used it. + * + * Revision 1.3 1993/07/07 00:07:41 mike + * more work on 1.2 + * + * Revision 1.2 1993/07/04 12:52:06 mike + * start on autoconfig changes + * +*/ + + +/* mawk.h */ + +#ifndef MAWK_H +#define MAWK_H + +#include "nstd.h" +#include +#include +#include "types.h" + +#ifdef DEBUG +#define YYDEBUG 1 +extern int yydebug ; /* print parse if on */ +extern int dump_RE ; +#endif + +extern short posix_space_flag , interactive_flag ; + + +/*---------------- + * GLOBAL VARIABLES + *----------------*/ + +/* a well known string */ +extern STRING null_str ; + +#ifndef TEMPBUFF_GOES_HERE +#define EXTERN extern +#else +#define EXTERN /* empty */ +#endif + +/* a useful scratch area */ +EXTERN union { +STRING *_split_buff[MAX_SPLIT] ; +char _string_buff[MIN_SPRINTF] ; +} tempbuff ; + +/* anonymous union */ +#define string_buff tempbuff._string_buff +#define split_buff tempbuff._split_buff + +#define SPRINTF_SZ sizeof(tempbuff) + +/* help with casts */ +extern int mpow2[] ; + + + /* these are used by the parser, scanner and error messages + from the compile */ + +extern char *pfile_name ; /* program input file */ +extern int current_token ; +extern unsigned token_lineno ; /* lineno of current token */ +extern unsigned compile_error_count ; +extern int paren_cnt, brace_cnt ; +extern int print_flag, getline_flag ; +extern short mawk_state ; +#define EXECUTION 1 /* other state is 0 compiling */ + + +extern char *progname ; /* for error messages */ +extern unsigned rt_nr , rt_fnr ; /* ditto */ + +/* macro to test the type of two adjacent cells */ +#define TEST2(cp) 
(mpow2[(cp)->type]+mpow2[((cp)+1)->type]) + +/* macro to get at the string part of a CELL */ +#define string(cp) ((STRING *)(cp)->ptr) + +#ifdef DEBUG +#define cell_destroy(cp) DB_cell_destroy(cp) +#else + +#define cell_destroy(cp) if ( (cp)->type >= C_STRING &&\ + -- string(cp)->ref_cnt == 0 )\ + zfree(string(cp),string(cp)->len+STRING_OH);else +#endif + +/* prototypes */ + +void PROTO( cast1_to_s, (CELL *) ) ; +void PROTO( cast1_to_d, (CELL *) ) ; +void PROTO( cast2_to_s, (CELL *) ) ; +void PROTO( cast2_to_d, (CELL *) ) ; +void PROTO( cast_to_RE, (CELL *) ) ; +void PROTO( cast_for_split, (CELL *) ) ; +void PROTO( check_strnum, (CELL *) ) ; +void PROTO( cast_to_REPL, (CELL *) ) ; +Int PROTO( d_to_I, (double)) ; + +#define d_to_i(d) ((int)d_to_I(d)) + + +int PROTO( test, (CELL *) ) ; /* test for null non-null */ +CELL *PROTO( cellcpy, (CELL *, CELL *) ) ; +CELL *PROTO( repl_cpy, (CELL *, CELL *) ) ; +void PROTO( DB_cell_destroy, (CELL *) ) ; +void PROTO( overflow, (char *, unsigned) ) ; +void PROTO( rt_overflow, (char *, unsigned) ) ; +void PROTO( rt_error, ( char *, ...) ) ; +void PROTO( mawk_exit, (int) ) __attribute__ ((noreturn)) ; +void PROTO( da, (INST *, FILE *)) ; +char *PROTO( str_str, (char*, char*, unsigned) ) ; +char *PROTO( rm_escape, (char *) ) ; +char *PROTO( re_pos_match, (char *, PTR, unsigned *) ) ; +int PROTO( binmode, (void)) ; + + +void PROTO ( parse, (void) ) ; +int PROTO ( yylex, (void) ) ; +int PROTO( yyparse, (void) ) ; +void PROTO( yyerror, (char *) ) ; +void PROTO( scan_cleanup, (void)) ; + +void PROTO( bozo, (char *) ) __attribute__ ((noreturn)); +void PROTO( errmsg , (int, char*, ...) ) ; +void PROTO( compile_error, ( char *, ...) 
) ; + +void PROTO( execute, (INST *, CELL *, CELL *) ) ; +char *PROTO( find_kw_str, (int) ) ; + +#ifdef HAVE_STRTOD_OVF_BUG +double PROTO(strtod_with_ovf_bug, (const char*, char**)) ; +#define strtod strtod_with_ovf_bug +#endif + +#endif /* MAWK_H */ diff --git a/memory.c b/memory.c new file mode 100644 index 0000000..f06c843 --- /dev/null +++ b/memory.c @@ -0,0 +1,101 @@ + +/******************************************** +memory.c +copyright 1991, 1992 Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + + +/* $Log: memory.c,v $ + * Revision 1.2 1993/07/17 13:23:08 mike + * indent and general code cleanup + * + * Revision 1.1.1.1 1993/07/03 18:58:17 mike + * move source to cvs + * + * Revision 5.2 1993/01/01 21:30:48 mike + * split new_STRING() into new_STRING and new_STRING0 + * + * Revision 5.1 1991/12/05 07:56:21 brennan + * 1.1 pre-release + * +*/ + + +/* memory.c */ + +#include "mawk.h" +#include "memory.h" + +static STRING *PROTO(xnew_STRING, (unsigned)) ; + + +STRING null_str = +{0, 1, ""} ; + +static STRING * +xnew_STRING(len) + unsigned len ; +{ + STRING *sval = (STRING *) zmalloc(len + STRING_OH) ; + + sval->len = len ; + sval->ref_cnt = 1 ; + return sval ; +} + +/* allocate space for a STRING */ + +STRING * +new_STRING0(len) + unsigned len ; +{ + if (len == 0) + { + null_str.ref_cnt++ ; + return &null_str ; + } + else + { + STRING *sval = xnew_STRING(len) ; + sval->str[len] = 0 ; + return sval ; + } +} + +/* convert char* to STRING* */ + +STRING * +new_STRING(s) + char *s ; +{ + + if (s[0] == 0) + { + null_str.ref_cnt++ ; + return &null_str ; + } + else + { + STRING *sval = xnew_STRING(strlen(s)) ; + strcpy(sval->str, s) ; + return sval ; + } +} + + +#ifdef DEBUG + +void +DB_free_STRING(sval) + register STRING *sval ; +{ + if (--sval->ref_cnt == 0) 
zfree(sval, sval->len + STRING_OH) ; +} + +#endif diff --git a/memory.h b/memory.h new file mode 100644 index 0000000..434e45f --- /dev/null +++ b/memory.h @@ -0,0 +1,50 @@ + +/******************************************** +memory.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + + +/* $Log: memory.h,v $ + * Revision 1.1.1.1 1993/07/03 18:58:17 mike + * move source to cvs + * + * Revision 5.2 1993/01/01 21:30:48 mike + * split new_STRING() into new_STRING and new_STRING0 + * + * Revision 5.1 1991/12/05 07:59:28 brennan + * 1.1 pre-release + * +*/ + + +/* memory.h */ + +#ifndef MEMORY_H +#define MEMORY_H + +#include "zmalloc.h" + + +STRING *PROTO(new_STRING, (char*)) ; +STRING *PROTO(new_STRING0, (unsigned)) ; + +#ifdef DEBUG +void PROTO( DB_free_STRING , (STRING *) ) ; + +#define free_STRING(s) DB_free_STRING(s) + +#else + +#define free_STRING(sval) if ( -- (sval)->ref_cnt == 0 )\ + zfree(sval, (sval)->len+STRING_OH) ; else +#endif + + +#endif /* MEMORY_H */ diff --git a/memory.o b/memory.o new file mode 100644 index 0000000..60bd79b Binary files /dev/null and b/memory.o differ diff --git a/missing.c b/missing.c new file mode 100644 index 0000000..f205477 --- /dev/null +++ b/missing.c @@ -0,0 +1,192 @@ + +/* missing.c */ + +/*$Log: missing.c,v $ + * Revision 1.2 1995/06/03 09:31:11 mike + * handle strchr(s,0) correctly + * + **/ + +#include "nstd.h" + + +#ifdef NO_STRCHR +char * +strchr(s, c) + char *s ; + int c ; +{ + if( c == 0 ) return s + strlen(s) ; + + while (*s) + { + if (*s == c) return s ; + s++ ; + } + return (char *) 0 ; +} + +char * +strrchr(s, c) + char *s ; + int c ; +{ + char *ret = (char *) 0 ; + + if ( c == 0 ) return s + strlen(s) ; + + while (*s) + { + if (*s == c) ret = s ; + s++ ; + } + return ret ; +} +#endif /* 
NO_STRCHR */ + +#ifdef NO_STRERROR +extern int sys_nerr ; +extern char *sys_errlist[] ; +char * +strerror(n) + int n ; +{ + return n > 0 && n < sys_nerr ? sys_errlist[n] : "" ; +} +#endif + + +#ifdef NO_MEMCPY +PTR +memcpy(t, s, n) + PTR t, s ; + size_t n ; +{ + char *tt = t ; + char *ss = s ; + + while (n > 0) + { + n-- ; + *tt++ = *ss++ ; + } + return t ; +} + +int +memcmp(t, s, n) + PTR t, s ; + size_t n ; +{ + char *tt = t ; + char *ss = s ; + + while (n > 0) + { + if (*tt < *ss) return -1 ; + if (*tt > *ss) return 1 ; + tt++ ; ss++ ; n-- ; + } + return 0 ; +} + +PTR +memset(t, c, n) + PTR t ; + int c ; + size_t n ; +{ + char *tt = (char *) t ; + + while (n > 0) + { + n-- ; + *tt++ = c ; + } + return t ; +} +#endif /* NO_MEMCPY */ + +#ifdef NO_STRTOD + +/* don't use this unless you really don't have strtod() because + (1) it's probably slower than your real strtod() + (2) atof() may call the real strtod() +*/ + +double +strtod(s, endptr) + const char *s ; + char **endptr ; +{ + register unsigned char *p ; + int flag ; + double atof(); + + if (endptr) + { + p = (unsigned char *) s ; + + flag = 0 ; + while (*p == ' ' || *p == '\t') p++ ; + if (*p == '-' || *p == '+') p++ ; + while ( scan_code[*p] == SC_DIGIT ) { flag++ ; p++ ; } + if (*p == '.') + { + p++ ; + while ( scan_code[*p] == SC_DIGIT ) { flag++ ; p++ ; } + } + /* done with number part */ + if (flag == 0) + { /* no number part */ + *endptr = (char *) s ; return 0.0 ; + } + else *endptr = (char *) p ; + + /* now look for exponent */ + if (*p == 'e' || *p == 'E') + { + flag = 0 ; + p++ ; + if (*p == '-' || *p == '+') p++ ; + while ( scan_code[*p] == SC_DIGIT ) { flag++ ; p++ ; } + if (flag) *endptr = (char *) p ; + } + } + return atof(s) ; +} +#endif /* no strtod() */ + +#ifdef NO_FMOD + +#ifdef SW_FP_CHECK /* this is V7 and XNX23A specific */ + +double +fmod(x, y) + double x, y; +{ + double modf(); + double dtmp, ipart; + + clrerr() ; + dtmp = x / y ; + fpcheck() ; + modf(dtmp, &ipart) ; + return x - ipart * y ; +} + 
+#else + +double +fmod(x, y) + double x, y; +{ + double modf(); + double ipart; + + modf(x / y, &ipart) ; + return x - ipart * y ; +} + +#endif +#endif /* NO_FMOD */ diff --git a/missing.o b/missing.o new file mode 100644 index 0000000..ad9f40d Binary files /dev/null and b/missing.o differ diff --git a/msdos/INSTALL b/msdos/INSTALL new file mode 100644 index 0000000..fe5d729 --- /dev/null +++ b/msdos/INSTALL @@ -0,0 +1,30 @@ + +READ NOTES about 1.3 for DOS + +how to make mawk under MsDOS +--------------------------- + +TurboC: Move msdos\makefile.tcc to makefile + +MSC : Move msdos\makefile.msc to makefile + Currently you don't get wildcard expansion on filenames with + msc mawk -- this must be simple to fix, if you do please send + the changes. (You have to do something with SETARGV in the link + file) + +Assuming you keep the same directory structure: + +1) If you want a Unix style command line for mawk, you'll need to + write a function called reargv(int *, char ***) which passes + mawk the modified argc and argv via reargv(&argc,&argv). + Put it in a file called reargv.c + + The supplied argvmks.c works with the MKS shell. + +2) YACC -- +Take some care that you don't trash parse.[ch] unless you're +sure you want to remake them. + +3) The same test suite that is run on mawk under Unix can now +be run under DOS. See test\mawktest.bat and test\fpe_test.bat. + diff --git a/msdos/NOTES b/msdos/NOTES new file mode 100644 index 0000000..c712922 --- /dev/null +++ b/msdos/NOTES @@ -0,0 +1,150 @@ + +Version 1.3: + The new array design will fail under msdos if you put more than + 16K items into an array and then walk it with for(i in A). + Unfortunately things will probably fail ungracefully. The new + array design runs into 64K limits at 16K elements in an array and + there are no checks in the code. This is fixable, but tedious and + 1.2.2 works well on DOS. + + If this is a problem use version 1.2.2. 
+ + You can get updated source and executables (1.2.2) for DOS from + ftp.wustl.edu ~/systems/msdos/gnuish/mawk122[sx].zip + + +Version 1.2: + + I no longer have a dos machine so must count on others to verify that + things work under msdos. + + 1.2 has been ported to TCC, but not MSC (I wouldn't expect this to + be too hard). + + You now have to compile with the large model. Code will no longer fit in + 64K, and code and data pointers must both fit in a void*, hence + the large model is required. + + Installation instructions are in file INSTALL. + + +Notes for 1.1.2d + + Three changes specific to DOS. + + (1) Internal conversions from doubles to strings that are integers + use internal conversion to long so DOS and unix now give the same + output. E.g. + + '{ print 2^30 }' + + 1073741824 + + (2) Large model uses 8K as opposed to 4K I/O buffers. + + (3) MAWKSHELL is no longer required. If not set in the + environment, MAWKSHELL defaults to %COMSPEC% /c, e.g., if + comspec is + + c:\system\command.com + + then this has the effect of setting MAWKSHELL to + + c:\system\command.com /c + + Comspec should give a full drive-path specification. + + +Notes for MsDOS (mawk 1.1) +--------------- + +command.com doesn't understand ' so if you use command.com as your +shell (the norm under DOS) then on the command line (and NOT from +files) the meanings of " and ' are reversed. + + mawk "{ print 'hello world' }" + +If this seems too weird, use + + mawk -f con + { print "hello world" } + ^Z + +If you use a DOS shell that gives you a Unix style command line, to use +it you'll need to provide a C function reargv() that retrieves argc and +argv[] from your shell. + +To enable system and pipes you need to tell mawk about your shell by +setting the environment variable MAWKSHELL. E.g., with command.com + + set MAWKSHELL=c:\sys\command.com /c + +or with a unix like shell + + set MAWKSHELL=c:\bin\sh.exe -c + +in your autoexec.bat. 
The full path with drive and extension and +appropriate switch is required. (Small model is a tight squeeze +and there's not enough room for PATH searching code.) + +If you want to use a ram disk for the pipes, set MAWKTMPDIR. + + set MAWKTMPDIR=d:\ + +The trailing backslash is required. You have to set MAWKSHELL; +MAWKTMPDIR is optional -- defaulting to the current directory. + + +For compatibility with Unix, CR are silently stripped from input and LF +silently become CRLF on output. Also ^Z indicates EOF on input ( +evidently for compatibility with CPM!!!). + +CR control can be turned off, with a new variable BINMODE. +BINMODE defaults to 0. + + BINMODE = 1 gives binary input. + BINMODE = 2 gives binary output. + BINMODE = 3 gives both. + +Setting BINMODE with -v or in the BEGIN section affects all +files, otherwise it only affects files opened after the +assignment to BINMODE. Once a file is open, later assignment to +BINMODE does not affect it. Note that with binary output, printf +will behave strangely -- you'll need to explicitly use \r + + E.g. mawk -v BINMODE=2 '{ printf "%d %s\r\n", NR, $0}' + +Assignment to BINMODE does not change RS or ORS; however there is +a -W feature + + -W BINMODE=1 is the same as + -v BINMODE=1 -v RS="\r\n" or BEGIN{BINMODE=1 ; RS = "\r\n" } + + -W BINMODE=2 is the same as + -v BINMODE=2 -v ORS="\r\n" or BEGIN{BINMODE=2 ; ORS = "\r\n" } + + -W BINMODE=3 is the same as + -v BINMODE=3 -v RS="\r\n" -v ORS="\r\n" or BEGIN{BINMODE=3 ; RS=ORS = "\r\n" } + + +Setting MAWKBINMODE in the environment is the same as using -W, +except it's permanent. +If you rarely have to deal with text files that contain ^Z, +then setting MAWKBINMODE=1 in the environment would speed up input +slightly. + + +---------------------------------------------------------- +WARNING: If you write an infinite loop that does not print to the +screen, then you will have to reboot. For example + + x = 1 + while( x < 10 ) A[x] = x + x++ + +By mistake the x++ is outside the loop. 
What you need to do is type +control break and the keyboard hardware will generate an interrupt and +the operating system will service that interrupt and terminate your +program, but unfortunately MsDOS does not have such a feature. + + diff --git a/msdos/argvmks.c b/msdos/argvmks.c new file mode 100644 index 0000000..86f2a72 --- /dev/null +++ b/msdos/argvmks.c @@ -0,0 +1,112 @@ + +/* argvmks.c + + for MKS Korn Shell + + If you use this file, add -DHAVE_REARGV=1 to your + CFLAGS + + Contributed by Jack Fitts (fittsj%wmv009@bcsaic.boeing.com) + +*/ + +/* +$Log: argvmks.c,v $ + * Revision 1.2 1995/01/07 14:47:24 mike + * remove return 1 from void function + * + * Revision 1.1.1.1 1993/07/03 18:58:49 mike + * move source to cvs + * + * Revision 1.2 1992/12/17 02:48:01 mike + * 1.1.2d changes for DOS + * + * Revision 1.1 1992/12/05 22:38:41 mike + * Initial revision + * +*/ + + +/***********************************************************/ +/* */ +/* prototypes for reargv */ +/* */ +/***********************************************************/ + +void *malloc(unsigned) ; +char * basename ( char * ); +char *strcpy(char* , char*) ; + + +/***********************************************************/ +/* */ +/* reargv reset argc/argv from environment for MKS shell */ +/* */ +/***********************************************************/ + + +void reargv ( int *argcp, char *** argvp ) { + + int i = 0; + int cnt ; + char ** v; + extern char **environ ; + register char **pe = environ; + +/* MKS Command line args are in the first n lines of the environment */ +/* each arg is preceded with a tilde (~)*/ + + while ( **(pe++) == '~' ) + i++; + +/* if no tilde found then not running under MKS */ + + if ( ! i ) return ; + +/* malloc space for array of char pointers */ + + if ( ! 
( v = ( char ** ) malloc (( i + 1 ) * sizeof ( char* ))) ) + return ; + +/* set argc to number of args in environ */ + + *argcp = cnt = i; + +/* set char pointers to each command line arg */ +/* jump over the tilde which is the first char in each string */ + + for ( i = 0; i < cnt ; i++ ) + v[i] = environ[i]+1; + + /*set last arg to null*/ + + v[cnt] = (char *) 0 ; + + /*strip leading directory stuff from argv[0] */ + + v[0] = basename(v[0]); + + *argvp = v; +} + + +/***********************************************************/ +/* */ +/* basename */ +/* */ +/***********************************************************/ + +static char * basename ( char * s ) { + + register char * p ; + char *last ; + + /* find the last occurrence of ':' '\\' or '/' */ + p = s ; last = (char *) 0 ; + while ( *p ) { + if ( *p == ':' || *p == '\\' || *p == '/' ) last = p ; + p++ ; + } + + return last ? last+1 : s ; +} diff --git a/msdos/argvpoly.c b/msdos/argvpoly.c new file mode 100644 index 0000000..35f544e --- /dev/null +++ b/msdos/argvpoly.c @@ -0,0 +1,80 @@ + +/* argvpoly.c + -- set arguments via POLYSHELL (now Thompson Shell??) + -- no errors, don't change anything if + -- it seems shell is not activated */ + +/* POLYSHELL puts the shell expanded command line + in the environment variable CMDLINE. Ascii 0 is + replaced by \xff. +*/ + +char *strchr(char *, int), *getenv(char *) ; +char *basename(char *) ; +void *malloc(unsigned) ; +int strcmp(char *, char *) ; + +static char *basename(char *s) +/* strip path and extension , upcase the rest */ +{ + register char *p ; + + for ( p = strchr(s,0) ; p > s ; p-- ) + switch( p[-1] ) + { case '\\' : + case ':' : + case '/' : return p ; + case '.' 
: p[-1] = 0 ; break ; + default : + if ( p[-1] >= 'a' && p[-1] <= 'z' ) p[-1] -= 32 ; + break ; + } + + return p ; +} + +/*--------------------- + reargv -- recompute argc and argv for PolyShell + if not under shell do nothing + *------------------------------- */ + +extern char *progname ; +extern unsigned char _osmajor ; + +void reargv(int *argcp , char ***argvp) +{ register char *p ; + char **v , *q, *cmdline, **vx ; + int cnt, cntx ; + + if ( _osmajor == 2 ) /* ugh */ + (*argvp)[0] = progname ; + else (*argvp)[0] = basename( (*argvp)[0] ) ; + + if ( ! (cmdline = getenv("CMDLINE")) ) return ; + + if ( *(q = strchr(cmdline,0) - 1) != 0xff ) + return ; /* shexpand set wrong */ + + for ( *q = 0, cnt = 1 , p = cmdline ; p < q ; p++ ) + if ( *p == 0xff ) { cnt++ ; *p = 0 ; } + + if ( ! (v = (char **) malloc((cnt+1)*sizeof(char*))) ) + return ; /* shouldn't happen */ + + p = cmdline ; + vx = v ; cntx = cnt ; + while ( cnt ) + { *v++ = p ; + cnt-- ; + while ( *p ) p++ ; + p++ ; + } + *v = (char *) 0 ; + v = vx ; + + v[0] = basename( v[0] ) ; + if ( strcmp(v[0], (*argvp)[0]) ) return ;/* running under command + and sh earlier */ + /* running under PolyShell */ + *argcp = cntx ; *argvp = v ; +} diff --git a/msdos/dosexec.c b/msdos/dosexec.c new file mode 100644 index 0000000..74093c2 --- /dev/null +++ b/msdos/dosexec.c @@ -0,0 +1,178 @@ + +/******************************************** +dosexec.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/*$Log: dosexec.c,v $ + * Revision 1.3 1995/08/20 16:37:22 mike + * exit(1) -> exit(2) + * + * Revision 1.2 1994/10/08 18:50:03 mike + * remove SM_DOS + * + * Revision 1.1.1.1 1993/07/03 18:58:47 mike + * move source to cvs + * + * Revision 1.4 1992/12/05 22:29:43 mike + * dos patch 112d: + * don't use string_buff + * check COMSPEC + * + * Revision 1.3 1992/07/10 16:21:57 brennan + * store exit code of input pipes + * + * Revision 1.2 1991/11/16 10:27:18 brennan + * BINMODE + * + * Revision 1.1 91/10/29 09:45:56 brennan + * Initial revision + * +*/ + +/* system() and pipes() for MSDOS */ + +#include "mawk.h" + +#if MSDOS +#include "memory.h" +#include "files.h" +#include "fin.h" + +#include + +static void PROTO(get_shell, (void)) ; + +static char *shell ; /* e.g. "c:\\sys\\command.com" */ +static char *command_opt ; /* " /c" */ + +static void get_shell() +{ char *s , *p ; + int len ; + + if ( s = getenv("MAWKSHELL") ) + { + /* break into shell part and option part */ + p = s ; + while ( *p != ' ' && *p != '\t' ) p++ ; + len = p - s ; + shell = (char *) zmalloc(len+1) ; + memcpy(shell, s, len) ; shell[len] = 0 ; + command_opt = p ; + } + else + if ( s = getenv("COMSPEC") ) + { + shell = s ; + command_opt = " /c" ; + /* leading space needed because of bug in command.com */ + } + else + { + errmsg(0, + "cannot exec(), must set MAWKSHELL or COMSPEC in environment" ) ; + exit(2) ; + } +} + + +int DOSexec( command ) + char *command ; +{ + char xbuff[256] ; + + if ( ! 
shell ) get_shell() ; + + sprintf(xbuff, "%s %s", command_opt, command) ; + + fflush(stderr) ; fflush(stdout) ; + + return spawnl(P_WAIT, shell, shell, xbuff, (char *) 0 ) ; +} + + +static int next_tmp ; /* index for naming temp files */ +static char *tmpdir ; /* directory to hold temp files */ +static unsigned mawkid ; /* unique to this mawk process */ + + +/* compute the unique temp file name associated with id */ +char *tmp_file_name(id, buffer ) + int id ; + char *buffer ; +{ + if ( mawkid == 0 ) + { + /* first time */ + union { + void far *ptr ; + unsigned w[2] ; + } xptr ; + + xptr.ptr = (void far*)&mawkid ; + mawkid = xptr.w[1] ; + + tmpdir = getenv("MAWKTMPDIR") ; + if ( ! tmpdir || strlen(tmpdir) > 80 ) tmpdir = "" ; + } + + (void) sprintf(buffer, "%sMAWK%04X.%03X",tmpdir, mawkid, id) ; + return buffer ; +} + +/* open a pipe, returning a temp file identifier by + reference +*/ + +PTR get_pipe( command, type, tmp_idp) + char *command ; + int type, *tmp_idp ; +{ + PTR retval ; + char xbuff[256] ; + char *tmpname ; + + + *tmp_idp = next_tmp ; + tmpname = tmp_file_name(next_tmp, xbuff+163) ; + + if ( type == PIPE_OUT ) + { + retval = (PTR) fopen(tmpname, (binmode()&2)? 
"wb":"w") ; + } + else + { + sprintf(xbuff, "%s > %s" , command, tmpname) ; + tmp_idp[1] = DOSexec(xbuff) ; + retval = (PTR) FINopen(tmpname, 0) ; + } + + next_tmp++ ; + return retval ; +} + +/* closing a fake pipes involves running the out pipe + command +*/ + +int close_fake_outpipe(command, tid) + char *command ; + int tid ; /* identifies the temp file */ +{ + char xbuff[256] ; + char *tmpname = tmp_file_name(tid, xbuff+163) ; + int retval ; + + sprintf(xbuff, "%s < %s", command, tmpname) ; + retval = DOSexec(xbuff) ; + (void) unlink(tmpname) ; + return retval ; +} + +#endif /* MSDOS */ diff --git a/msdos/examples/add_cr.awk b/msdos/examples/add_cr.awk new file mode 100644 index 0000000..ba14217 --- /dev/null +++ b/msdos/examples/add_cr.awk @@ -0,0 +1,93 @@ + +# add_cr.awk +# converts from Unix (LF only) text files to +# DOS (CRLF) text files +# +# works on both unix and dos +# if used on unix change COPY and DEL in BEGIN section +# +# mawk -f add_cr.awk [files] + +# with no files reads stdin writes stdout +# otherwise the original is overwritten +# +# If a file of the form `@file', then arguments are read from +# `file', one per line + +# +# To add cr's to the whole distribution +# +# mawk -f doslist.awk packing.lis | mawk "{print $2}" > list +# mawk -f add_cr.awk @list +# + + +# read arguments for @file into ARGV[] +function reset_argv(T, i, j, flag, file) #all args local +{ + for( i = 1 ; i < ARGC ; i++ ) + { + T[i] = ARGV[i] + if ( T[i] ~ /^@/ ) flag = 1 + } + + if ( ! 
flag ) return + + # need to read from a @file into ARGV + j = 1 + for( i = 1 ; i < ARGC ; i++ ) + { + if ( T[i] !~ /^@/ ) ARGV[j++] = T[i] + else + { + T[i] = substr(T[i],2) + # read arguments from T[i] + while ( (getline file < T[i]) > 0 ) ARGV[j++] = file + } + } + ARGC = j +} + + +BEGIN { + COPY = "copy" # unix: "cp" + DEL = "del" # unix: "rm" + + tmpfile = ENVIRON["MAWKTMPDIR"] "MAWK.TMP" + + reset_argv() +} + + +FILENAME == "-" { + # just write to stdout + printf "%s\r\n" , $0 + next +} + +FILENAME != filename { + + if ( filename ) + { + close(tmpfile) + syscmd = sprintf( "%s %s %s", COPY, tmpfile, filename ) + system(syscmd) + } + + filename = FILENAME +} + +{ printf "%s\r\n" , $0 > tmpfile } + + +END { + if ( filename ) + { + close(tmpfile) + syscmd = sprintf( "%s %s %s", COPY, tmpfile, filename ) + system(syscmd) + system(DEL " " tmpfile) + } +} + + diff --git a/msdos/examples/doslist.awk b/msdos/examples/doslist.awk new file mode 100644 index 0000000..cb991ed --- /dev/null +++ b/msdos/examples/doslist.awk @@ -0,0 +1,34 @@ + +# print truncated DOS file names +# from packing.list (packing.lis) +# +# mawk -f doslist.awk packing.lis + + +# discard blanks and comments +/^#/ || /^[ \t]*$/ {next} + + +function dos_name(s, n, front, X) +{ + #lowercase, split on extension and truncate pieces + s = tolower(s) + n = split(s, X, ".") + + front = substr(X[1],1,8) + + if ( n == 1 ) return front + else return front "." 
substr(X[2], 1, 3) +} + +{ + n = split($1, X, "/") + new = dos_name(X[1]) + + for( i = 2 ; i <= n ; i++ ) + new = new "\\" dos_name(X[i]) + + printf "%-30s%s\n", $1, new +} + + diff --git a/msdos/examples/objstat.awk b/msdos/examples/objstat.awk new file mode 100644 index 0000000..7340288 --- /dev/null +++ b/msdos/examples/objstat.awk @@ -0,0 +1,20 @@ +# Ben Myers <0003571400@mcimail.com> + +# Sum up sizes of OBJ files in current directory +# A clumsy script to count OBJs and sum up their sizes +# run with +# bmawk -fobjsize.awk workfile +# or similar command syntax with your awk program +# where workfile is a work file +BEGIN { +# redirection done by shelled command +system("dir *.obj >" ARGV[1]) +osize = 0 # size accumulator +ocount = 0 # obj counter +} +# Now read workfile back, skipping lines that are not files +$2 == "OBJ" { osize += $3 ; ocount++ } +END { +print ocount " OBJs, total size " osize " bytes" +system("del "ARGV[1]) +} diff --git a/msdos/examples/shell.awk b/msdos/examples/shell.awk new file mode 100644 index 0000000..6b700fb --- /dev/null +++ b/msdos/examples/shell.awk @@ -0,0 +1,16 @@ +# Ben Myers <0003571400@mcimail.com> + +# Test pipes under DOS. comment/uncomment print statements below +BEGIN { +# redirection done by shelled command +system("dir *.* /b >pippo.") +lcount = 0 +} +{ +# print +# Below is redirection done by mawk +# print >"pippo2." 
+print $0 | "sort" +lcount++ +} +END { print "mawk NR line count=" NR " our line count=" lcount " lines in pippo"} diff --git a/msdos/examples/srcstat.awk b/msdos/examples/srcstat.awk new file mode 100644 index 0000000..3f63e87 --- /dev/null +++ b/msdos/examples/srcstat.awk @@ -0,0 +1,26 @@ +# Ben Myers <0003571400@mcimail.com> + +# Sum up number, line count, and sizes of SOURCE files in current directory +# run with +# bmawk -fsrcsize.awk workfile +# or similar command syntax with your awk program +# where workfile is a work file +BEGIN { +# redirection done by shelled command +# system("dir *.* >workfile") +system("dir *.* >" ARGV[1]) +ssize = 0 # size accumulator +slines = 0 # line counter +scount = 0 # obj counter +} +# Now read workfile back in +$2 == "C" || $2 == "H" || $2 == "CPP" || $2 == "HPP" { + filename = sprintf("%s.%s", $1, $2) + ssize += $3 + while (getline < filename > 0) {slines++} + scount++ + } +END { +print scount " files, " slines " lines, total size " ssize " bytes" +system("del " ARGV[1]) +} diff --git a/msdos/examples/srcstat2.awk b/msdos/examples/srcstat2.awk new file mode 100644 index 0000000..6a11cdd --- /dev/null +++ b/msdos/examples/srcstat2.awk @@ -0,0 +1,28 @@ +# Ben Myers <0003571400@mcimail.com> + +# Sum up number, line count, and sizes of SOURCE files in current directory +# run with +# bmawk -fsrcsize.awk workfile +# or similar command syntax with your awk program +# where workfile is a work file +BEGIN { +# redirection done by shelled command +system("dir *.* >workfile") +ssize = 0 # size accumulator +slines = 0 # line counter +scount = 0 # obj counter +exit +} +END { +# Now read workfile back in + while (getline < "workfile" > 0) { + if ($2 == "C" || $2 == "H" || $2 == "CPP" || $2 == "HPP") { + filename = sprintf("%s.%s", $1, $2) + ssize += $3 + while (getline < filename > 0) {slines++} + scount++ + } + } +print scount " files, " slines " lines, total size " ssize " bytes" +system("del workfile") +} diff --git 
a/msdos/examples/texttest.awk b/msdos/examples/texttest.awk new file mode 100644 index 0000000..6c952d9 --- /dev/null +++ b/msdos/examples/texttest.awk @@ -0,0 +1,11 @@ +# Ben Myers <0003571400@mcimail.com> + +/^#include/ { +# got #include, see if it has at least one quote. We don't want #include <> + z = gsub(/"/, "", $2) + while ((z > 0) && (getline x <$2 > 0)) +# while (getline x <$2 > 0) + print x + next +} +{ print } diff --git a/msdos/examples/winexe.awk b/msdos/examples/winexe.awk new file mode 100644 index 0000000..80018fe --- /dev/null +++ b/msdos/examples/winexe.awk @@ -0,0 +1,106 @@ +# Ben Myers <0003571400@mcimail.com> + +# Sum up segment sizes of all Windows EXEs in current directory +# requires DOS 5.0 and Borland TDUMP +# run with +# awk -fwinexe.awk work1 +# where work1 is a work file +# You must have at least one filename as an arg, else awk will want to read +# from con:, hence the requirement for work1 +BEGIN { +# redirection done by shelled command +system("del workfile.$%$") # Will probably cause a File Not Found message +# Generate a list of EXEs +system("dir *.exe /b > workfile.$%$") +while (getline < "workfile.$%$" > 0) { +# TDUMP keeps on piping to the workfile +system("tdump " $1 ">> " ARGV[1]) +} +module_name = "" # initialize +# Now read workfile back, processing lines that: +# 1. contain EXE file name +# 2. 
contain segment type +# Print EXE name and stats for each segment type processed +# When there is a new EXE name, print summary for EXE just processed +j = 1 +while (getline < ARGV[1] > 0) { +# module name +if($1 == "Display" && $2 == "of" && $3 == "File") { +# Print program summary for all but last program +if(module_name != "") { Print_Summary() } +otcount = 0 # text segment counter +odcount = 0 # data segment counter +otsize = 0 # text size accumulator +odsize = 0 # data size accumulator +module_name = $4 } +# File Size +if($1 == "DOS" && $2 == "File" && $3 == "Size") { +# 6+ digit file size with leading left paren +DOS_Size = substr($5,2,7) +# file size < 6 digits +if(DOS_Size == 0 || DOS_Size == "") { DOS_Size = $6 } +} +# CODE segment +if($1 == "Segment" && $2 == "Type:" && $3 =="CODE") { +decval = hexdec(substr($7,1,4)) +otsize += decval +# printf ("%12s CODE %4s %7u\n", module_name, $7, decval) +otcount++ } +# DATA segment +if($1 == "Segment" && $2 == "Type:" && $3 =="DATA") { +decval = hexdec(substr($7,1,4)) +odsize += decval +# printf ("%12s DATA %4s %7u\n", module_name, $7, decval) +odcount++ } +} # while +} # end of BEGIN section +# no main loop at all! 
+END { +# print record for last program +Print_Summary() +# delete work files +system("del "ARGV[1]) +system("del workfile.$%$") +} # end of END section + +# No scanf in awk, so convert hex string x to decimal the hard way +function hexdec (x) { +result = 0 +for (i=1; i<=length(x); i++) { +thechar = substr(x,i,1) +# digits 0-9 and lower case hex produced by TDUMP +# use brute force +if (thechar == "0") {result = result*16} +if (thechar == "1") {result = result*16 + 1} +if (thechar == "2") {result = result*16 + 2} +if (thechar == "3") {result = result*16 + 3} +if (thechar == "4") {result = result*16 + 4} +if (thechar == "5") {result = result*16 + 5} +if (thechar == "6") {result = result*16 + 6} +if (thechar == "7") {result = result*16 + 7} +if (thechar == "8") {result = result*16 + 8} +if (thechar == "9") {result = result*16 + 9} +if (thechar == "a") {result = result*16 + 10} +if (thechar == "b") {result = result*16 + 11} +if (thechar == "c") {result = result*16 + 12} +if (thechar == "d") {result = result*16 + 13} +if (thechar == "e") {result = result*16 + 14} +if (thechar == "f") {result = result*16 + 15} +if (thechar == "A") {result = result*16 + 10} +if (thechar == "B") {result = result*16 + 11} +if (thechar == "C") {result = result*16 + 12} +if (thechar == "D") {result = result*16 + 13} +if (thechar == "E") {result = result*16 + 14} +if (thechar == "F") {result = result*16 + 15} +} # for (i=1;i + +# Sum up sizes of Windows OBJ files in current directory +# requires DOS 5.0 and Borland TDUMP +# A clumsy script to count Windows OBJs and sum up the CODE sizes +# run with +# awk -fwinobj.awk work1 +# where work1 is a work file +# You must have at least one filename as an arg, else awk will want to read +# from con:, hence the requirement for work1 +BEGIN { +# redirection done by shelled command +ocount = 0 # obj module counter +otsize = 0 # text size accumulator +odsize = 0 # data size accumulator +system("del workfile.$%$") # Will probably cause a File Not Found 
message +# Generate a list of OBJs +system("dir *.obj /b >" ARGV[1]) +while (getline < ARGV[1] > 0) { +# TDUMP selects only the SEGDEFs to speed things up a lot +# and keeps on piping to the workfile +system("tdump " $1 " -oiSEGDEF >>workfile.$%$") +ocount++ +} +# Now read workfile back, processing lines that are module ids and SEGDEF info +# Print one line for each SEGDEF processed +j = 1 +while (getline < "workfile.$%$" > 0) { +# module name +if($1 == "Display" && $2 == "of" && $3 == "File") { module_name = $4 } +# SEGDEF CODE +if($2 == "SEGDEF" && $9 =="'CODE'") { +decval = hexdec($11) +otsize += decval +printf ("%12s CODE %4s %7i\n", module_name, $11, decval) +j++ } +# SEGDEF DATA +if($2 == "SEGDEF" && $9 =="'DATA'") { +decval = hexdec($11) +odsize += decval +printf ("%12s DATA %4s %7i\n", module_name, $11, decval) +j++ } +} # while +} # end of BEGIN section +# no main loop at all! +END { +# print summary and delete work files +printf ("%i OBJ files\n", ocount) +printf ("Total CODE size %04x %7li bytes\n", otsize, otsize) +printf ("Total DATA size %04x %7li bytes\n", odsize, odsize) +system("del "ARGV[1]) +system("del workfile.$%$") +} # end of END section + +# No scanf in awk, so convert hex string x to decimal the hard way +function hexdec (x) { +result = 0 +for (i=1; i<=length(x); i++) { +thechar = substr(x,i,1) +# digits 0-9 and lower case hex produced by TDUMP +# use brute force +if (thechar == "0") {result = result*16} +if (thechar == "1") {result = result*16 + 1} +if (thechar == "2") {result = result*16 + 2} +if (thechar == "3") {result = result*16 + 3} +if (thechar == "4") {result = result*16 + 4} +if (thechar == "5") {result = result*16 + 5} +if (thechar == "6") {result = result*16 + 6} +if (thechar == "7") {result = result*16 + 7} +if (thechar == "8") {result = result*16 + 8} +if (thechar == "9") {result = result*16 + 9} +if (thechar == "a") {result = result*16 + 10} +if (thechar == "b") {result = result*16 + 11} +if (thechar == "c") {result = 
result*16 + 12} +if (thechar == "d") {result = result*16 + 13} +if (thechar == "e") {result = result*16 + 14} +if (thechar == "f") {result = result*16 + 15} +} # for (i=1;i $@ + echo $(OBJ2)+ >> $@ + echo $(OBJ3)+ >> $@ + echo $(REXP_OBJ)+ >> $@ + +RFLAGS=-I. -Irexp -DMAWK + +rexp.obj : rexp/rexp.c rexp/rexp.h + $(CC) $(CFLAGS) $(RFLAGS) -c rexp/rexp.c + +rexp0.obj : rexp/rexp0.c rexp/rexp.h + $(CC) $(CFLAGS) $(RFLAGS) -c rexp/rexp0.c + +rexp1.obj : rexp/rexp1.c rexp/rexp.h + $(CC) $(CFLAGS) $(RFLAGS) -c rexp/rexp1.c + +rexp2.obj : rexp/rexp2.c rexp/rexp.h + $(CC) $(CFLAGS) $(RFLAGS) -c rexp/rexp2.c + +rexp3.obj : rexp/rexp3.c rexp/rexp.h + $(CC) $(CFLAGS) $(RFLAGS) -c rexp/rexp3.c + +rexpdb.obj : rexp/rexpdb.c rexp/rexp.h + $(CC) $(CFLAGS) $(RFLAGS) -c rexp/rexpdb.c + +config.h : msdos/msc.h + copy msdos\msc.h config.h + copy msdos\mawk.def . + +dosexec.c : msdos/dosexec.c + copy msdos\dosexec.c dosexec.c + +test : mawk.exe # test that we have a sane mawk + @echo you may have to run the test manually + cd test && mawktest.bat + +fpe_test : mawk.exe # test FPEs are handled OK + @echo testing floating point exception handling + @echo you may have to run the test manually + cd test && fpe_test.bat + +################################################### +# parse.c is provided +# so you don't need to make it. 
+# +# But if you do: here's how: +# To make it with bison under msdos +# YACC=bison -y +# parse.c : parse.y +# $(YACC) -d parse.y +# rename y_tab.h parse.h +# rename y_tab.c parse.c +######################################## + +#scancode.c : makescan.c scan.h +# $(CC) -o makescan.exe makescan.c +# makescan.exe > scancode.c +# del makescan.exe + +clean : + del *.obj + +distclean : + del *.obj + del config.h dosexec.c + del mawk.exe + + +# dependencies of .objs on .h +array.obj : config.h field.h bi_vars.h mawk.h symtype.h nstd.h memory.h zmalloc.h types.h sizes.h +bi_funct.obj : config.h field.h bi_vars.h mawk.h init.h regexp.h symtype.h nstd.h repl.h memory.h bi_funct.h files.h zmalloc.h fin.h types.h sizes.h +bi_vars.obj : config.h field.h bi_vars.h mawk.h init.h symtype.h nstd.h memory.h zmalloc.h types.h sizes.h +cast.obj : config.h field.h mawk.h parse.h symtype.h nstd.h memory.h repl.h scan.h zmalloc.h types.h sizes.h +code.obj : config.h field.h code.h mawk.h init.h symtype.h nstd.h memory.h jmp.h zmalloc.h types.h sizes.h +da.obj : config.h field.h code.h mawk.h symtype.h nstd.h memory.h repl.h bi_funct.h zmalloc.h types.h sizes.h +error.obj : config.h bi_vars.h mawk.h parse.h vargs.h symtype.h nstd.h scan.h types.h sizes.h +execute.obj : config.h field.h bi_vars.h code.h mawk.h regexp.h symtype.h nstd.h memory.h repl.h bi_funct.h zmalloc.h types.h fin.h sizes.h +fcall.obj : config.h code.h mawk.h symtype.h nstd.h memory.h zmalloc.h types.h sizes.h +field.obj : config.h field.h bi_vars.h mawk.h init.h parse.h regexp.h symtype.h nstd.h memory.h repl.h scan.h zmalloc.h types.h sizes.h +files.obj : config.h mawk.h nstd.h memory.h files.h zmalloc.h types.h fin.h sizes.h +fin.obj : config.h field.h bi_vars.h mawk.h parse.h symtype.h nstd.h memory.h scan.h zmalloc.h types.h fin.h sizes.h +hash.obj : config.h mawk.h symtype.h nstd.h memory.h zmalloc.h types.h sizes.h +init.obj : config.h field.h bi_vars.h code.h mawk.h init.h symtype.h nstd.h memory.h zmalloc.h 
types.h sizes.h +jmp.obj : config.h code.h mawk.h init.h symtype.h nstd.h memory.h jmp.h zmalloc.h types.h sizes.h +kw.obj : config.h mawk.h init.h parse.h symtype.h nstd.h types.h sizes.h +main.obj : config.h field.h bi_vars.h code.h mawk.h init.h symtype.h nstd.h memory.h files.h zmalloc.h types.h fin.h sizes.h +makescan.obj : parse.h symtype.h scan.h +matherr.obj : config.h mawk.h nstd.h types.h sizes.h +memory.obj : config.h mawk.h nstd.h memory.h zmalloc.h types.h sizes.h +parse.obj : config.h field.h bi_vars.h code.h mawk.h symtype.h nstd.h memory.h bi_funct.h files.h zmalloc.h jmp.h types.h sizes.h +print.obj : config.h field.h bi_vars.h mawk.h parse.h symtype.h nstd.h memory.h scan.h bi_funct.h files.h zmalloc.h types.h sizes.h +re_cmpl.obj : config.h mawk.h parse.h regexp.h symtype.h nstd.h memory.h repl.h scan.h zmalloc.h types.h sizes.h +scan.obj : config.h field.h code.h mawk.h init.h parse.h symtype.h nstd.h memory.h repl.h scan.h files.h zmalloc.h types.h fin.h sizes.h +split.obj : config.h field.h bi_vars.h mawk.h parse.h regexp.h symtype.h nstd.h memory.h scan.h bi_funct.h zmalloc.h types.h sizes.h +version.obj : config.h mawk.h patchlev.h nstd.h types.h sizes.h +zmalloc.obj : config.h mawk.h nstd.h zmalloc.h types.h sizes.h diff --git a/msdos/makefile.tcc b/msdos/makefile.tcc new file mode 100644 index 0000000..6cfd946 --- /dev/null +++ b/msdos/makefile.tcc @@ -0,0 +1,217 @@ + +# this is a makefile for mawk under DOS +# with Borland make +# +# make -- mawk.exe + +# for a unix style command line add +# -DREARGV=your_reargv_file without the extension +# +# e.g. -DREARGV=argvmks + +#$Log: makefile.tcc,v $ +# Revision 1.1 1995/08/20 17:44:37 mike +# minor fixes to msc and lower case makefile names +# +# Revision 1.3 1995/01/08 22:56:34 mike +# minor tweaks +# +# Revision 1.2 1995/01/07 21:16:03 mike +# remove small model +# + +.SWAP + +# user settable +# change here or override from command line e.g. -DCC=bcc + +TARGET=mawk + +!if ! 
$d(CC) +CC=tcc # bcc or ? +!endif + +!if ! $d(LIBDIR) +LIBDIR =c:\lib # where are your Borland C libraries ? +!endif + +!if ! $d(FLOATLIB) +FLOATLIB=emu # or fp87 if you have fp87 hardware +!endif + +!if ! $d(WILDCARD) +WILDCARD=$(LIBDIR)\wildargs.obj +!endif + +# compiler flags +# -G optimize for speed +# -d merge duplicate strings +# -v- symbolic debugging off +# -O optimize +# -ml large model +CFLAGS = -ml -c -d -v- -O -G + +LFLAGS = /c #case sensitive linking + +# how to delete a file +!if ! $d(RM) +RM = del # rm +!endif + +# how to rename a file +!if ! $d(RENAME) +RENAME = rename # mv +!endif + +!if ! $d(COPY) +COPY = copy # cp +!endif + +############################## +# end of user settable +# + +MODEL=l + +CFLAGS=-m$(MODEL) $(CFLAGS) + +!if $d(REARGV) +CFLAGS=$(CFLAGS) -DHAVE_REARGV=1 +!endif + +OBS = parse.obj \ +array.obj \ +bi_funct.obj \ +bi_vars.obj \ +cast.obj \ +code.obj \ +da.obj \ +error.obj \ +execute.obj \ +fcall.obj \ +field.obj \ +files.obj \ +fin.obj \ +hash.obj \ +init.obj \ +jmp.obj \ +kw.obj \ +main.obj \ +matherr.obj \ +memory.obj \ +missing.obj \ +print.obj \ +re_cmpl.obj \ +scan.obj \ +scancode.obj \ +split.obj \ +zmalloc.obj \ +version.obj \ +dosexec.obj + +!if $d(REARGV) +OBS = $(OBS) $(REARGV).obj +!endif + +REXP_OBS = rexp.obj \ +rexp0.obj \ +rexp1.obj \ +rexp2.obj \ +rexp3.obj + +LIBS = $(LIBDIR)\$(FLOATLIB) \ +$(LIBDIR)\math$(MODEL) $(LIBDIR)\c$(MODEL) + +$(TARGET).exe : $(OBS) $(REXP_OBS) + tlink $(LFLAGS) @&&! + $(LIBDIR)\c0$(MODEL) $(WILDCARD) $(OBS) $(REXP_OBS) + $(TARGET),$(TARGET) + $(LIBS) +! + +.c.obj : + $(CC) $(CFLAGS) {$*.c } + + +config.h : msdos\tcc.h + $(COPY) msdos\tcc.h config.h + +dosexec.c : msdos\dosexec.c + $(COPY) msdos\dosexec.c dosexec.c + +#scancode.c : makescan.c scan.h +# $(CC) makescan.c +# makescan.exe > scancode.c +# $(RM) makescan.obj +# $(RM) makescan.exe + + +################################################### +# parse.c is provided +# so you don't need to make it. 
+# +# But if you do: here's how: +# To make it with bison under msdos +# YACC=bison -y +# parse.c : parse.y +# $(YACC) -d parse.y +# $(RENAME) y_tab.h parse.h +# $(RENAME) y_tab.c parse.c +######################################## + + +clean : + $(RM) *.obj + +distclean : + $(RM) *.obj + $(RM) config.h dosexec.c + $(RM) mawk.exe + +RFLAGS=-Irexp -DMAWK + +rexp.obj : rexp\rexp.c rexp\rexp.h + $(CC) $(CFLAGS) $(RFLAGS) rexp\rexp.c + +rexp0.obj : rexp\rexp0.c rexp\rexp.h + $(CC) $(CFLAGS) $(RFLAGS) rexp\rexp0.c + +rexp1.obj : rexp\rexp1.c rexp\rexp.h + $(CC) $(CFLAGS) $(RFLAGS) rexp\rexp1.c + +rexp2.obj : rexp\rexp2.c rexp\rexp.h + $(CC) $(CFLAGS) $(RFLAGS) rexp\rexp2.c + +rexp3.obj : rexp\rexp3.c rexp\rexp.h + $(CC) $(CFLAGS) $(RFLAGS) rexp\rexp3.c + + +# dependencies of .objs on .h +array.obj : config.h field.h bi_vars.h mawk.h symtype.h nstd.h memory.h zmalloc.h types.h sizes.h +bi_funct.obj : config.h field.h bi_vars.h mawk.h init.h regexp.h symtype.h nstd.h repl.h memory.h bi_funct.h files.h zmalloc.h fin.h types.h sizes.h +bi_vars.obj : config.h field.h bi_vars.h mawk.h init.h symtype.h nstd.h memory.h zmalloc.h types.h sizes.h +cast.obj : config.h field.h mawk.h parse.h symtype.h nstd.h memory.h repl.h scan.h zmalloc.h types.h sizes.h +code.obj : config.h field.h code.h mawk.h init.h symtype.h nstd.h memory.h jmp.h zmalloc.h types.h sizes.h +da.obj : config.h field.h code.h mawk.h symtype.h nstd.h memory.h repl.h bi_funct.h zmalloc.h types.h sizes.h +error.obj : config.h bi_vars.h mawk.h parse.h vargs.h symtype.h nstd.h scan.h types.h sizes.h +execute.obj : config.h field.h bi_vars.h code.h mawk.h regexp.h symtype.h nstd.h memory.h repl.h bi_funct.h zmalloc.h types.h fin.h sizes.h +fcall.obj : config.h code.h mawk.h symtype.h nstd.h memory.h zmalloc.h types.h sizes.h +field.obj : config.h field.h bi_vars.h mawk.h init.h parse.h regexp.h symtype.h nstd.h memory.h repl.h scan.h zmalloc.h types.h sizes.h +files.obj : config.h mawk.h nstd.h memory.h files.h 
zmalloc.h types.h fin.h sizes.h +fin.obj : config.h field.h bi_vars.h mawk.h parse.h symtype.h nstd.h memory.h scan.h zmalloc.h types.h fin.h sizes.h +hash.obj : config.h mawk.h symtype.h nstd.h memory.h zmalloc.h types.h sizes.h +init.obj : config.h field.h bi_vars.h code.h mawk.h init.h symtype.h nstd.h memory.h zmalloc.h types.h sizes.h +jmp.obj : config.h code.h mawk.h init.h symtype.h nstd.h memory.h jmp.h zmalloc.h types.h sizes.h +kw.obj : config.h mawk.h init.h parse.h symtype.h nstd.h types.h sizes.h +main.obj : config.h field.h bi_vars.h code.h mawk.h init.h symtype.h nstd.h memory.h files.h zmalloc.h types.h fin.h sizes.h +makescan.obj : parse.h symtype.h scan.h +matherr.obj : config.h mawk.h nstd.h types.h sizes.h +memory.obj : config.h mawk.h nstd.h memory.h zmalloc.h types.h sizes.h +missing.obj : config.h nstd.h +parse.obj : config.h field.h bi_vars.h code.h mawk.h symtype.h nstd.h memory.h bi_funct.h files.h zmalloc.h jmp.h types.h sizes.h +print.obj : config.h field.h bi_vars.h mawk.h parse.h symtype.h nstd.h memory.h scan.h bi_funct.h files.h zmalloc.h types.h sizes.h +re_cmpl.obj : config.h mawk.h parse.h regexp.h symtype.h nstd.h memory.h repl.h scan.h zmalloc.h types.h sizes.h +scan.obj : config.h field.h code.h mawk.h init.h parse.h symtype.h nstd.h memory.h repl.h scan.h files.h zmalloc.h types.h fin.h sizes.h +split.obj : config.h field.h bi_vars.h mawk.h parse.h regexp.h symtype.h nstd.h memory.h scan.h bi_funct.h zmalloc.h types.h sizes.h +version.obj : config.h mawk.h patchlev.h nstd.h types.h sizes.h +zmalloc.obj : config.h mawk.h nstd.h zmalloc.h types.h sizes.h diff --git a/msdos/makefile.ztc b/msdos/makefile.ztc new file mode 100644 index 0000000..a7baff7 --- /dev/null +++ b/msdos/makefile.ztc @@ -0,0 +1,28 @@ +# Makefile for Zortech C + +OBJ1=parse.obj scan.obj memory.obj main.obj hash.obj execute.obj code.obj\ + da.obj error.obj init.obj bi_vars.obj cast.obj print.obj bi_funct.obj\ + kw.obj jmp.obj array.obj field.obj split.obj 
re_cmpl.obj zmalloc.obj\ + fin.obj files.obj scancode.obj matherr.obj fcall.obj version.obj\ + dosexec.obj + +#OBJ2=rexp\rexp.obj rexp\rexp0.obj rexp\rexp1.obj rexp\rexp2.obj\ +# rexp\rexp3.obj rexp\rexpdb.obj + +OBJ2=rexp.obj rexp0.obj rexp1.obj rexp2.obj\ + rexp3.obj rexpdb.obj + +CFLAGS=-ml -bx -o -A- -DLARGE -DMAWK -DHAVE_SMALL_MEMORY=0 + +LFLAGS = -L/ST:32768 + +.c.obj: + ztc -c $(CFLAGS) $< + +bmawkztc.exe: $(OBJ1) $(OBJ2) + ztc $(LFLAGS) -obmawkztc $(OBJ1) $(OBJ2) + +$(OBJ1): BI_FUNCT.H BI_VARS.H CODE.H FIELD.H FILES.H INIT.H JMP.H MEMORY.H\ + PARSE.H PATCHLEV.H REGEXP.H REPL.H SCAN.H SIZES.H SYMTYPE.H TYPES.H\ + ZMALLOC.H CONFIG.H FIN.H MAWK.H +$(OBJ2): rexp.h diff --git a/msdos/mawk.def b/msdos/mawk.def new file mode 100644 index 0000000..d943178 --- /dev/null +++ b/msdos/mawk.def @@ -0,0 +1,2 @@ +NAME mawk WINDOWCOMPAT NEWFILES +DESCRIPTION 'mawk for OS/2 and DOS' diff --git a/msdos/msc.h b/msdos/msc.h new file mode 100644 index 0000000..ac16763 --- /dev/null +++ b/msdos/msc.h @@ -0,0 +1,69 @@ + +/******************************************** +msc.h +copyright 1994, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/* Microsoft C 6.0A under MSDOS */ + +/*$Log: msc.h,v $ + *Revision 1.6 1996/07/28 21:46:16 mike + *gnuish patch + * + * Revision 1.5 1995/08/20 17:44:38 mike + * minor fixes to msc and lower case makefile names + * + * Revision 1.4 1995/01/08 21:50:43 mike + * remove extra #endif + * + * Revision 1.3 1994/10/08 19:12:05 mike + * SET_PROGNAME + * + * Revision 1.2 1994/10/08 18:49:28 mike + * add MAX__INT etc + * + * Revision 1.1 1994/10/08 18:24:40 mike + * moved from config directory + * +*/ + +#ifndef CONFIG_H +#define CONFIG_H 1 + + +#define MSDOS_MSC 1 +#define MSDOS 1 + +#define SIZE_T_STDDEF_H 1 +#define MAX__INT 0x7fff +#define MAX__LONG 0x7fffffff +#define HAVE_FAKE_PIPES 1 + + +#define FPE_TRAPS_ON 1 +#define NOINFO_SIGFPE 1 + +/* how to test far pointers have the same segment */ +#define SAMESEG(p,q) \ + (((unsigned long)(p)^(unsigned long)(q))<0x10000L) + +#if HAVE_REARGV +#define SET_PROGNAME() reargv(&argc,&argv) ; progname = argv[0] +#else +#define SET_PROGNAME() progname = "mawk" +#ifdef OS2 +# ifdef MSDOS +# define DOS_STRING "dos+os2" +# else +# define DOS_STRING "os2" +# endif +#endif +#endif + +#endif /* CONFIG_H */ diff --git a/msdos/tcc.h b/msdos/tcc.h new file mode 100644 index 0000000..1c6bc9a --- /dev/null +++ b/msdos/tcc.h @@ -0,0 +1,68 @@ + +/******************************************** +tcc.h +copyright 1994, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/* Turbo C under MSDOS */ + +/* $Log: tcc.h,v $ + * Revision 1.5 1995/08/20 17:14:13 mike + * get size_t from <stddef.h> + * + * Revision 1.4 1995/01/08 21:48:00 mike + * remove extra #endif + * + * Revision 1.3 1994/10/08 19:12:07 mike + * SET_PROGNAME + * + * Revision 1.2 1994/10/08 18:49:29 mike + * add MAX__INT etc + * + * Revision 1.1 1994/10/08 18:24:41 mike + * moved from config directory + * +*/ + +#ifndef CONFIG_H +#define CONFIG_H 1 + +#define MSDOS 1 + +#define SIZE_T_STDDEF_H 1 + +#define MAX__INT 0x7fff +#define MAX__LONG 0x7fffffff +#define HAVE_FAKE_PIPES 1 + +/* strerror() used to not work because all the lines were + terminated with \n -- if no longer true then this can go + away + ?????????????? +*/ +#define NO_STRERROR 1 + +/* Turbo C float lib bungles comparison of NaNs so we + have to keep traps on */ +#define FPE_TRAPS_ON 1 +#define FPE_ZERODIVIDE 131 +#define FPE_OVERFLOW 132 + +/* how to test far pointers have the same segment */ +#include <dos.h> +#define SAMESEG(p,q) (FP_SEG(p)==FP_SEG(q)) + +#if HAVE_REARGV +#define SET_PROGNAME() reargv(&argc,&argv) ; progname = argv[0] +#else +#define SET_PROGNAME() progname = "mawk" +#endif + + +#endif /* CONFIG_H */ diff --git a/msdos/ztc.h b/msdos/ztc.h new file mode 100644 index 0000000..390ddaf --- /dev/null +++ b/msdos/ztc.h @@ -0,0 +1,69 @@ + +/******************************************** +ztc.h +copyright 1992-4, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/* Zortech C++ under MSDOS */ + +/* $Log: ztc.h,v $ + * Revision 1.3 1994/10/08 19:12:08 mike + * SET_PROGNAME + * + * Revision 1.2 1994/10/08 18:49:30 mike + * add MAX__INT etc + * + * Revision 1.1 1994/10/08 18:24:43 mike + * moved from config directory + * + * Revision 1.1.1.1 1993/07/03 18:58:37 mike + * move source to cvs + * + * Revision 1.1 1992/12/27 01:42:50 mike + * Initial revision + * + * Revision 4.2.1 92/06/01 00:00:00 bmyers + * create Zortech C++ version from Borland C++ version + * ZTC has matherr function and no info for floating point exceptions. + * +*/ + +/* +This might not work anymore under mawk 1.2 +MDB 10/94 +*/ + +#ifndef CONFIG_H +#define CONFIG_H 1 + +#define MSDOS 1 + +#define SIZE_T_HFILE +#define MAX__INT 0x7fff +#define MAX__LONG 0x7fffffff +#define HAVE_FAKE_PIPES 1 +/* contradicts comment above ??? */ +#define NO_MATHERR 1 + + +#define FPE_TRAPS_ON 1 +#define NOINFO_SIGFPE 1 + + +/* how to test far pointers have the same segment */ +#include <dos.h> +#define SAMESEG(p,q) (FP_SEG(p)==FP_SEG(q)) + +#if HAVE_REARGV +#define SET_PROGNAME() reargv(&argc,&argv) ; progname = argv[0] +#else +#define SET_PROGNAME() progname = "mawk" +#endif + +#endif /* CONFIG_H */ diff --git a/nstd.h b/nstd.h new file mode 100644 index 0000000..298b8ec --- /dev/null +++ b/nstd.h @@ -0,0 +1,102 @@ +/* nstd.h */ + +/* Never Standard.h + + This has all the prototypes that are supposed to + be in a standard place but never are, and when they are + the standard place isn't standard +*/ + +/* +$Log: nstd.h,v $ + * Revision 1.6 1995/06/18 19:42:22 mike + * Remove some redundant declarations and add some prototypes + * + * Revision 1.5 1995/04/20 20:26:56 mike + * beta improvements from Carl Mascott + * + * Revision 1.4 1994/12/11 22:08:24 mike + * add STDC_MATHERR + * + * Revision 1.3 1993/07/15 23:56:09 mike + * general cleanup + * + * Revision 1.2 1993/07/07 00:07:43 mike + * more work on 1.2 + * + * Revision 1.1 
1993/07/04 12:38:06 mike + * Initial revision + * +*/ + +#ifndef NSTD_H +#define NSTD_H 1 + +#include "config.h" + +#ifdef NO_PROTOS +#define PROTO(name,args) name() +#else +#define PROTO(name,args) name args +#endif + + + +/* types */ + +#ifdef NO_VOID_STAR +typedef char *PTR ; +#else +typedef void *PTR ; +#endif + +#ifdef SIZE_T_STDDEF_H +#include <stddef.h> +#else +#ifdef SIZE_T_TYPES_H +#include <sys/types.h> +#else +typedef unsigned size_t ; +#endif +#endif + +/* stdlib.h */ + +double PROTO(strtod, (const char*, char**)) ; +void PROTO(free, (void*)) ; +PTR PROTO(malloc, (size_t)) ; +PTR PROTO(realloc, (void*,size_t)) ; +void PROTO(exit, (int)) ; +char* PROTO(getenv, (const char*)) ; + +/* string.h */ + +int PROTO(memcmp, (const void*,const void*,size_t)) ; +PTR PROTO(memcpy, (void*,const void*,size_t)) ; +PTR PROTO(memset, (void*,int,size_t)) ; +char* PROTO(strchr, (const char*, int)) ; +int PROTO(strcmp, (const char*,const char*)) ; +char* PROTO(strcpy, (char *, const char*)) ; +size_t PROTO(strlen, (const char*)) ; +int PROTO(strncmp, (const char*,const char*,size_t)) ; +char* PROTO(strncpy, (char*, const char*, size_t)) ; +char* PROTO(strrchr, (const char*,int)) ; +char* PROTO(strerror, (int)) ; + + +#ifdef NO_ERRNO_H +extern int errno ; +#else +#include <errno.h> +#endif + +/* math.h */ +double PROTO(fmod,(double,double)) ; + +/* if have to diddle with errno to get errors from the math library */ +#ifndef STDC_MATHERR +#define STDC_MATHERR (FPE_TRAPS_ON && NO_MATHERR) +#endif + +#endif /* NSTD_H */ + diff --git a/packing.list b/packing.list new file mode 100644 index 0000000..5b0a49f --- /dev/null +++ b/packing.list @@ -0,0 +1,171 @@ + +#$Id: packing.list,v 1.17 1996/09/18 00:40:21 mike Exp $ + +################################################ +# These files form the mawk distribution 1.3 +# +# Mawk is an implementation of the AWK Programming Language as +# defined and described in Aho, Kernighan and Weinberger, The +# Awk Programming Language, Addison-Wesley, 1988 and extended +# by Posix 
1003.2 D11.3 +# +################################################ +packing.list this file +README description of mawk 1.3 +INSTALL installation instructions +COPYING GNU General Public License, version 2 +ACKNOWLEDGMENT +CHANGES +Makefile.in makefile template +configure Configuration script +config.user user settable configuration parameters +configure.in autoconf script +mawk.ac.m4 ditto +################################# +# directory: config-user hints on CFLAGS and odd configurations +config-user/.config.user readonly copy of template +config-user/apollo +config-user/convex +config-user/mips +config-user/sgi +config-user/ultrix-mips +config-user/cray +###################### +bi_funct.c source files +bi_vars.c +cast.c +code.c +da.c +error.c +execute.c +fcall.c +field.c +files.c +fin.c +hash.c +init.c +jmp.c +kw.c +main.c +makescan.c +matherr.c +memory.c +missing.c +print.c +re_cmpl.c +scan.c +scancode.c +split.c +version.c +zmalloc.c +bi_funct.h +bi_vars.h +code.h +field.h +files.h +fin.h +init.h +jmp.h +mawk.h +memory.h +nstd.h +patchlev.h +regexp.h +repl.h +scan.h +sizes.h +symtype.h +types.h +vargs.h +zmalloc.h +parse.y +parse.c +parse.h +array.w +array.c +array.h +fpe_check.c +######################## +# directory: man +man/mawk.1 troff source for unix style man pages +man/mawk.doc ascii man pages +######################## +# directory: rexp +rexp/Makefile make rexp*.o files +rexp/rexp.c source for regular matching library +rexp/rexp.h +rexp/rexp0.c +rexp/rexp1.c +rexp/rexp2.c +rexp/rexp3.c +rexp/rexpdb.c +####################### +# directory: test testing and benchmarking directory +test/mawktest scripts to test mawk compiled OK +test/mawktest.v7 +test/mawktest.bat DOS +test/mawktest.g atarist +test/mawktest.dat input data for the test +test/fpe_test scripts to test if fpe handling compiled OK +test/fpe_test.v7 +test/fpe_test.bat +test/fpe_test.g +test/wc.awk awk programs used by the tests +test/reg0.awk +test/reg1.awk +test/reg2.awk +test/wfrq0.awk 
+test/decl-awk.out +test/fpetest1.awk +test/fpetest2.awk +test/fpetest3.awk +test/reg-awk.out +test/wc-awk.out +test/wfrq-awk.out +###################### +# directory: examples useful awk programs +examples/hical calendar program by Bob Stockler +examples/hcal Bob's latest +examples/decl.awk +examples/deps.awk +examples/gdecl.awk +examples/nocomment.awk +examples/eatc.awk +examples/primes.awk +examples/qsort.awk +examples/ct_length.awk change length to length() +###################### +# directory msdos +msdos/NOTES +msdos/INSTALL installation instructions for DOS +msdos/dosexec.c system() and pipes() for DOS +msdos/argvpoly.c for polyshell +msdos/argvmks.c for MKS Korn Shell +msdos/makefile.tcc for [TB]CC and Borland make +msdos/makefile.msc nmake and MSC 6.0A +msdos/makefile.ztc +msdos/mawk.def +msdos/tcc.h +msdos/msc.h +msdos/ztc.h +##################### +# directory msdos/examples awk programs for msdos +msdos/examples/add_cr.awk +msdos/examples/doslist.awk +msdos/examples/objstat.awk +msdos/examples/shell.awk +msdos/examples/srcstat.awk +msdos/examples/srcstat2.awk +msdos/examples/texttest.awk +msdos/examples/winexe.awk +msdos/examples/winobj.awk +##################### +# directory atarist +atarist/README.ST +#################### +# directory v7 +v7/Makefile.v7 +v7/README +v7/V7.h +v7/V7_notes +v7/config.h diff --git a/parse.c b/parse.c new file mode 100644 index 0000000..01d948b --- /dev/null +++ b/parse.c @@ -0,0 +1,3960 @@ + +/* A Bison parser, made by GNU Bison 2.4.1. */ + +/* Skeleton implementation for Bison's Yacc-like parsers in C + + Copyright (C) 1984, 1989, 1990, 2000, 2001, 2002, 2003, 2004, 2005, 2006 + Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +/* As a special exception, you may create a larger work that contains + part or all of the Bison parser skeleton and distribute that work + under terms of your choice, so long as that work isn't itself a + parser generator using the skeleton or a modified version thereof + as a parser skeleton. Alternatively, if you modify or redistribute + the parser skeleton itself, you may (at your option) remove this + special exception, which will cause the skeleton and the resulting + Bison output files to be licensed under the GNU General Public + License without this special exception. + + This special exception was added by the Free Software Foundation in + version 2.2 of Bison. */ + +/* C LALR(1) parser skeleton written by Richard Stallman, by + simplifying the original so-called "semantic" parser. */ + +/* All symbols defined below should begin with yy or YY, to avoid + infringing on user name space. This should be done even for local + variables, as they might otherwise be expanded by user macros. + There are some unavoidable exceptions within include files to + define necessary library symbols; they are noted "INFRINGES ON + USER NAME SPACE" below. */ + +/* Identify Bison output. */ +#define YYBISON 1 + +/* Bison version. */ +#define YYBISON_VERSION "2.4.1" + +/* Skeleton name. */ +#define YYSKELETON_NAME "yacc.c" + +/* Pure parsers. */ +#define YYPURE 0 + +/* Push parsers. */ +#define YYPUSH 0 + +/* Pull parsers. */ +#define YYPULL 1 + +/* Using locations. */ +#define YYLSP_NEEDED 0 + + + +/* Copy the first part of user declarations.
*/ + +/* Line 189 of yacc.c */ +#line 79 "parse.y" + +#include <stdio.h> +#include "mawk.h" +#include "symtype.h" +#include "code.h" +#include "memory.h" +#include "bi_funct.h" +#include "bi_vars.h" +#include "jmp.h" +#include "field.h" +#include "files.h" + + +#define YYMAXDEPTH 200 + + +extern void PROTO( eat_nl, (void) ) ; +static void PROTO( resize_fblock, (FBLOCK *) ) ; +static void PROTO( switch_code_to_main, (void)) ; +static void PROTO( code_array, (SYMTAB *) ) ; +static void PROTO( code_call_id, (CA_REC *, SYMTAB *) ) ; +static void PROTO( field_A2I, (void)) ; +static void PROTO( check_var, (SYMTAB *) ) ; +static void PROTO( check_array, (SYMTAB *) ) ; +static void PROTO( RE_as_arg, (void)) ; + +static int scope ; +static FBLOCK *active_funct ; + /* when scope is SCOPE_FUNCT */ + +#define code_address(x) if( is_local(x) ) \ + code2op(L_PUSHA, (x)->offset) ;\ + else code2(_PUSHA, (x)->stval.cp) + +#define CDP(x) (code_base+(x)) +/* WARNING: These CDP() calculations become invalid after calls + that might change code_base. Which are: code2(), code2op(), + code_jmp() and code_pop(). +*/ + +/* this nonsense caters to MSDOS large model */ +#define CODE_FE_PUSHA() code_ptr->ptr = (PTR) 0 ; code1(FE_PUSHA) + + + +/* Line 189 of yacc.c */ +#line 119 "y.tab.c" + +/* Enabling traces. */ +#ifndef YYDEBUG +# define YYDEBUG 0 +#endif + +/* Enabling verbose error messages. */ +#ifdef YYERROR_VERBOSE +# undef YYERROR_VERBOSE +# define YYERROR_VERBOSE 1 +#else +# define YYERROR_VERBOSE 0 +#endif + +/* Enabling the token table. */ +#ifndef YYTOKEN_TABLE +# define YYTOKEN_TABLE 0 +#endif + + +/* Tokens. */ +#ifndef YYTOKENTYPE +# define YYTOKENTYPE + /* Put the tokens into the symbol table, so that GDB and other debuggers + know about them.
*/ + enum yytokentype { + UNEXPECTED = 258, + BAD_DECIMAL = 259, + NL = 260, + SEMI_COLON = 261, + LBRACE = 262, + RBRACE = 263, + LBOX = 264, + RBOX = 265, + COMMA = 266, + IO_OUT = 267, + POW_ASG = 268, + MOD_ASG = 269, + DIV_ASG = 270, + MUL_ASG = 271, + SUB_ASG = 272, + ADD_ASG = 273, + ASSIGN = 274, + COLON = 275, + QMARK = 276, + OR = 277, + AND = 278, + IN = 279, + MATCH = 280, + GTE = 281, + GT = 282, + LTE = 283, + LT = 284, + NEQ = 285, + EQ = 286, + CAT = 287, + GETLINE = 288, + MINUS = 289, + PLUS = 290, + MOD = 291, + DIV = 292, + MUL = 293, + UMINUS = 294, + NOT = 295, + PIPE = 296, + IO_IN = 297, + POW = 298, + INC_or_DEC = 299, + FIELD = 300, + DOLLAR = 301, + RPAREN = 302, + LPAREN = 303, + DOUBLE = 304, + STRING_ = 305, + RE = 306, + ID = 307, + D_ID = 308, + FUNCT_ID = 309, + BUILTIN = 310, + LENGTH = 311, + PRINT = 312, + PRINTF = 313, + SPLIT = 314, + MATCH_FUNC = 315, + SUB = 316, + GSUB = 317, + DO = 318, + WHILE = 319, + FOR = 320, + BREAK = 321, + CONTINUE = 322, + IF = 323, + ELSE = 324, + DELETE = 325, + BEGIN = 326, + END = 327, + EXIT = 328, + NEXT = 329, + RETURN = 330, + FUNCTION = 331 + }; +#endif +/* Tokens. 
*/ +#define UNEXPECTED 258 +#define BAD_DECIMAL 259 +#define NL 260 +#define SEMI_COLON 261 +#define LBRACE 262 +#define RBRACE 263 +#define LBOX 264 +#define RBOX 265 +#define COMMA 266 +#define IO_OUT 267 +#define POW_ASG 268 +#define MOD_ASG 269 +#define DIV_ASG 270 +#define MUL_ASG 271 +#define SUB_ASG 272 +#define ADD_ASG 273 +#define ASSIGN 274 +#define COLON 275 +#define QMARK 276 +#define OR 277 +#define AND 278 +#define IN 279 +#define MATCH 280 +#define GTE 281 +#define GT 282 +#define LTE 283 +#define LT 284 +#define NEQ 285 +#define EQ 286 +#define CAT 287 +#define GETLINE 288 +#define MINUS 289 +#define PLUS 290 +#define MOD 291 +#define DIV 292 +#define MUL 293 +#define UMINUS 294 +#define NOT 295 +#define PIPE 296 +#define IO_IN 297 +#define POW 298 +#define INC_or_DEC 299 +#define FIELD 300 +#define DOLLAR 301 +#define RPAREN 302 +#define LPAREN 303 +#define DOUBLE 304 +#define STRING_ 305 +#define RE 306 +#define ID 307 +#define D_ID 308 +#define FUNCT_ID 309 +#define BUILTIN 310 +#define LENGTH 311 +#define PRINT 312 +#define PRINTF 313 +#define SPLIT 314 +#define MATCH_FUNC 315 +#define SUB 316 +#define GSUB 317 +#define DO 318 +#define WHILE 319 +#define FOR 320 +#define BREAK 321 +#define CONTINUE 322 +#define IF 323 +#define ELSE 324 +#define DELETE 325 +#define BEGIN 326 +#define END 327 +#define EXIT 328 +#define NEXT 329 +#define RETURN 330 +#define FUNCTION 331 + + + + +#if ! defined YYSTYPE && ! 
defined YYSTYPE_IS_DECLARED +typedef union YYSTYPE +{ + +/* Line 214 of yacc.c */ +#line 124 "parse.y" + +CELL *cp ; +SYMTAB *stp ; +int start ; /* code starting address as offset from code_base */ +PF_CP fp ; /* ptr to a (print/printf) or (sub/gsub) function */ +BI_REC *bip ; /* ptr to info about a builtin */ +FBLOCK *fbp ; /* ptr to a function block */ +ARG2_REC *arg2p ; +CA_REC *ca_p ; +int ival ; +PTR ptr ; + + + +/* Line 214 of yacc.c */ +#line 322 "y.tab.c" +} YYSTYPE; +# define YYSTYPE_IS_TRIVIAL 1 +# define yystype YYSTYPE /* obsolescent; will be withdrawn */ +# define YYSTYPE_IS_DECLARED 1 +#endif + + +/* Copy the second part of user declarations. */ + + +/* Line 264 of yacc.c */ +#line 334 "y.tab.c" + +#ifdef short +# undef short +#endif + +#ifdef YYTYPE_UINT8 +typedef YYTYPE_UINT8 yytype_uint8; +#else +typedef unsigned char yytype_uint8; +#endif + +#ifdef YYTYPE_INT8 +typedef YYTYPE_INT8 yytype_int8; +#elif (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +typedef signed char yytype_int8; +#else +typedef short int yytype_int8; +#endif + +#ifdef YYTYPE_UINT16 +typedef YYTYPE_UINT16 yytype_uint16; +#else +typedef unsigned short int yytype_uint16; +#endif + +#ifdef YYTYPE_INT16 +typedef YYTYPE_INT16 yytype_int16; +#else +typedef short int yytype_int16; +#endif + +#ifndef YYSIZE_T +# ifdef __SIZE_TYPE__ +# define YYSIZE_T __SIZE_TYPE__ +# elif defined size_t +# define YYSIZE_T size_t +# elif ! 
defined YYSIZE_T && (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +# include <stddef.h> /* INFRINGES ON USER NAME SPACE */ +# define YYSIZE_T size_t +# else +# define YYSIZE_T unsigned int +# endif +#endif + +#define YYSIZE_MAXIMUM ((YYSIZE_T) -1) + +#ifndef YY_ +# if YYENABLE_NLS +# if ENABLE_NLS +# include <libintl.h> /* INFRINGES ON USER NAME SPACE */ +# define YY_(msgid) dgettext ("bison-runtime", msgid) +# endif +# endif +# ifndef YY_ +# define YY_(msgid) msgid +# endif +#endif + +/* Suppress unused-variable warnings by "using" E. */ +#if ! defined lint || defined __GNUC__ +# define YYUSE(e) ((void) (e)) +#else +# define YYUSE(e) /* empty */ +#endif + +/* Identity function, used to suppress warnings about constant conditions. */ +#ifndef lint +# define YYID(n) (n) +#else +#if (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +static int +YYID (int yyi) +#else +static int +YYID (yyi) + int yyi; +#endif +{ + return yyi; +} +#endif + +#if ! defined yyoverflow || YYERROR_VERBOSE + +/* The parser invokes alloca or malloc; define the necessary symbols. */ + +# ifdef YYSTACK_USE_ALLOCA +# if YYSTACK_USE_ALLOCA +# ifdef __GNUC__ +# define YYSTACK_ALLOC __builtin_alloca +# elif defined __BUILTIN_VA_ARG_INCR +# include <alloca.h> /* INFRINGES ON USER NAME SPACE */ +# elif defined _AIX +# define YYSTACK_ALLOC __alloca +# elif defined _MSC_VER +# include <malloc.h> /* INFRINGES ON USER NAME SPACE */ +# define alloca _alloca +# else +# define YYSTACK_ALLOC alloca +# if ! defined _ALLOCA_H && ! defined _STDLIB_H && (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +# include <stdlib.h> /* INFRINGES ON USER NAME SPACE */ +# ifndef _STDLIB_H +# define _STDLIB_H 1 +# endif +# endif +# endif +# endif +# endif + +# ifdef YYSTACK_ALLOC + /* Pacify GCC's `empty if-body' warning.
*/ +# define YYSTACK_FREE(Ptr) do { /* empty */; } while (YYID (0)) +# ifndef YYSTACK_ALLOC_MAXIMUM + /* The OS might guarantee only one guard page at the bottom of the stack, + and a page size can be as small as 4096 bytes. So we cannot safely + invoke alloca (N) if N exceeds 4096. Use a slightly smaller number + to allow for a few compiler-allocated temporary stack slots. */ +# define YYSTACK_ALLOC_MAXIMUM 4032 /* reasonable circa 2006 */ +# endif +# else +# define YYSTACK_ALLOC YYMALLOC +# define YYSTACK_FREE YYFREE +# ifndef YYSTACK_ALLOC_MAXIMUM +# define YYSTACK_ALLOC_MAXIMUM YYSIZE_MAXIMUM +# endif +# if (defined __cplusplus && ! defined _STDLIB_H \ + && ! ((defined YYMALLOC || defined malloc) \ + && (defined YYFREE || defined free))) +# include <stdlib.h> /* INFRINGES ON USER NAME SPACE */ +# ifndef _STDLIB_H +# define _STDLIB_H 1 +# endif +# endif +# ifndef YYMALLOC +# define YYMALLOC malloc +# if ! defined malloc && ! defined _STDLIB_H && (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +void *malloc (YYSIZE_T); /* INFRINGES ON USER NAME SPACE */ +# endif +# endif +# ifndef YYFREE +# define YYFREE free +# if ! defined free && ! defined _STDLIB_H && (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +void free (void *); /* INFRINGES ON USER NAME SPACE */ +# endif +# endif +# endif +#endif /* ! defined yyoverflow || YYERROR_VERBOSE */ + + +#if (! defined yyoverflow \ + && (! defined __cplusplus \ + || (defined YYSTYPE_IS_TRIVIAL && YYSTYPE_IS_TRIVIAL))) + +/* A type that is properly aligned for any stack member. */ +union yyalloc +{ + yytype_int16 yyss_alloc; + YYSTYPE yyvs_alloc; +}; + +/* The size of the maximum gap between one aligned stack and the next. */ +# define YYSTACK_GAP_MAXIMUM (sizeof (union yyalloc) - 1) + +/* The size of an array large to enough to hold all stacks, each with + N elements.
*/ +# define YYSTACK_BYTES(N) \ + ((N) * (sizeof (yytype_int16) + sizeof (YYSTYPE)) \ + + YYSTACK_GAP_MAXIMUM) + +/* Copy COUNT objects from FROM to TO. The source and destination do + not overlap. */ +# ifndef YYCOPY +# if defined __GNUC__ && 1 < __GNUC__ +# define YYCOPY(To, From, Count) \ + __builtin_memcpy (To, From, (Count) * sizeof (*(From))) +# else +# define YYCOPY(To, From, Count) \ + do \ + { \ + YYSIZE_T yyi; \ + for (yyi = 0; yyi < (Count); yyi++) \ + (To)[yyi] = (From)[yyi]; \ + } \ + while (YYID (0)) +# endif +# endif + +/* Relocate STACK from its old location to the new one. The + local variables YYSIZE and YYSTACKSIZE give the old and new number of + elements in the stack, and YYPTR gives the new location of the + stack. Advance YYPTR to a properly aligned location for the next + stack. */ +# define YYSTACK_RELOCATE(Stack_alloc, Stack) \ + do \ + { \ + YYSIZE_T yynewbytes; \ + YYCOPY (&yyptr->Stack_alloc, Stack, yysize); \ + Stack = &yyptr->Stack_alloc; \ + yynewbytes = yystacksize * sizeof (*Stack) + YYSTACK_GAP_MAXIMUM; \ + yyptr += yynewbytes / sizeof (*yyptr); \ + } \ + while (YYID (0)) + +#endif + +/* YYFINAL -- State number of the termination state. */ +#define YYFINAL 95 +/* YYLAST -- Last index in YYTABLE. */ +#define YYLAST 1173 + +/* YYNTOKENS -- Number of terminals. */ +#define YYNTOKENS 77 +/* YYNNTS -- Number of nonterminals. */ +#define YYNNTS 57 +/* YYNRULES -- Number of rules. */ +#define YYNRULES 172 +/* YYNRULES -- Number of states. */ +#define YYNSTATES 331 + +/* YYTRANSLATE(YYLEX) -- Bison symbol number corresponding to YYLEX. */ +#define YYUNDEFTOK 2 +#define YYMAXUTOK 331 + +#define YYTRANSLATE(YYX) \ + ((unsigned int) (YYX) <= YYMAXUTOK ? yytranslate[YYX] : YYUNDEFTOK) + +/* YYTRANSLATE[YYLEX] -- Bison symbol number corresponding to YYLEX. 
*/ +static const yytype_uint8 yytranslate[] = +{ + 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 1, 2, 3, 4, + 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, + 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, + 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, + 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, + 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, + 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, + 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, + 75, 76 +}; + +#if YYDEBUG +/* YYPRHS[YYN] -- Index of the first RHS symbol of rule number YYN in + YYRHS. 
*/ +static const yytype_uint16 yyprhs[] = +{ + 0, 0, 3, 5, 8, 10, 12, 15, 17, 18, + 22, 23, 27, 28, 32, 33, 34, 41, 45, 49, + 51, 53, 55, 58, 60, 63, 65, 68, 71, 74, + 76, 79, 81, 83, 85, 89, 93, 97, 101, 105, + 109, 113, 117, 121, 125, 129, 133, 137, 141, 142, + 147, 148, 153, 154, 155, 163, 165, 168, 170, 172, + 174, 178, 180, 184, 188, 192, 196, 200, 204, 207, + 210, 213, 215, 218, 221, 224, 227, 229, 230, 232, + 234, 238, 244, 246, 247, 253, 255, 257, 259, 263, + 266, 270, 274, 275, 278, 283, 286, 288, 293, 295, + 303, 308, 311, 316, 320, 325, 327, 330, 332, 335, + 339, 345, 351, 357, 364, 372, 376, 383, 386, 388, + 391, 398, 401, 405, 407, 411, 415, 419, 423, 427, + 431, 435, 438, 444, 446, 450, 457, 459, 462, 466, + 469, 473, 475, 478, 481, 485, 490, 492, 494, 496, + 499, 503, 510, 512, 514, 516, 520, 523, 528, 531, + 534, 535, 537, 539, 543, 545, 549, 552, 555, 557, + 561, 565, 568 +}; + +/* YYRHS -- A `-1'-separated list of the rules' RHS. */ +static const yytype_int16 yyrhs[] = +{ + 78, 0, -1, 79, -1, 78, 79, -1, 80, -1, + 125, -1, 130, 86, -1, 86, -1, -1, 71, 81, + 86, -1, -1, 72, 82, 86, -1, -1, 91, 83, + 87, -1, -1, -1, 91, 11, 84, 91, 85, 87, + -1, 7, 88, 8, -1, 7, 1, 8, -1, 86, + -1, 90, -1, 89, -1, 88, 89, -1, 86, -1, + 91, 90, -1, 90, -1, 1, 90, -1, 66, 90, + -1, 67, 90, -1, 119, -1, 74, 90, -1, 5, + -1, 6, -1, 96, -1, 98, 19, 91, -1, 98, + 18, 91, -1, 98, 17, 91, -1, 98, 16, 91, + -1, 98, 15, 91, -1, 98, 14, 91, -1, 98, + 13, 91, -1, 91, 31, 91, -1, 91, 30, 91, + -1, 91, 29, 91, -1, 91, 28, 91, -1, 91, + 27, 91, -1, 91, 26, 91, -1, 91, 25, 91, + -1, -1, 91, 22, 92, 91, -1, -1, 91, 23, + 93, 91, -1, -1, -1, 91, 21, 94, 91, 20, + 95, 91, -1, 97, -1, 96, 97, -1, 49, -1, + 50, -1, 52, -1, 48, 91, 47, -1, 51, -1, + 97, 35, 97, -1, 97, 34, 97, -1, 97, 38, + 97, -1, 97, 37, 97, -1, 97, 36, 97, -1, + 97, 43, 97, -1, 40, 97, -1, 35, 97, -1, + 34, 97, -1, 101, -1, 52, 44, -1, 44, 98, + -1, 115, 44, -1, 44, 115, -1, 52, -1, -1, + 100, -1, 91, -1, 100, 11, 91, 
-1, 55, 102, + 48, 99, 47, -1, 56, -1, -1, 103, 102, 104, + 106, 90, -1, 57, -1, 58, -1, 99, -1, 48, + 105, 47, -1, 48, 47, -1, 91, 11, 91, -1, + 105, 11, 91, -1, -1, 12, 91, -1, 68, 48, + 91, 47, -1, 107, 89, -1, 69, -1, 107, 89, + 108, 89, -1, 63, -1, 109, 89, 64, 48, 91, + 47, 90, -1, 64, 48, 91, 47, -1, 110, 89, + -1, 111, 112, 113, 89, -1, 65, 48, 6, -1, + 65, 48, 91, 6, -1, 6, -1, 91, 6, -1, + 47, -1, 91, 47, -1, 91, 24, 52, -1, 48, + 105, 47, 24, 52, -1, 52, 102, 9, 100, 10, + -1, 52, 102, 9, 100, 10, -1, 52, 102, 9, + 100, 10, 44, -1, 70, 52, 102, 9, 100, 10, + 90, -1, 70, 52, 90, -1, 65, 48, 52, 24, + 52, 47, -1, 114, 89, -1, 45, -1, 46, 53, + -1, 46, 53, 102, 9, 100, 10, -1, 46, 97, + -1, 48, 115, 47, -1, 115, -1, 115, 19, 91, + -1, 115, 18, 91, -1, 115, 17, 91, -1, 115, + 16, 91, -1, 115, 15, 91, -1, 115, 14, 91, + -1, 115, 13, 91, -1, 116, 117, -1, 59, 48, + 91, 11, 52, -1, 47, -1, 11, 91, 47, -1, + 60, 48, 91, 11, 118, 47, -1, 91, -1, 73, + 90, -1, 73, 91, 90, -1, 75, 90, -1, 75, + 91, 90, -1, 120, -1, 120, 121, -1, 122, 97, + -1, 97, 41, 33, -1, 97, 41, 33, 121, -1, + 33, -1, 98, -1, 115, -1, 120, 42, -1, 120, + 121, 42, -1, 123, 48, 118, 11, 91, 124, -1, + 61, -1, 62, -1, 47, -1, 11, 121, 47, -1, + 126, 86, -1, 127, 48, 128, 47, -1, 76, 52, + -1, 76, 54, -1, -1, 129, -1, 52, -1, 129, + 11, 52, -1, 1, -1, 54, 102, 131, -1, 48, + 47, -1, 132, 133, -1, 48, -1, 132, 91, 11, + -1, 132, 52, 11, -1, 91, 47, -1, 52, 47, + -1 +}; + +/* YYRLINE[YYN] -- source line where rule number YYN was defined. 
*/ +static const yytype_uint16 yyrline[] = +{ + 0, 200, 200, 201, 204, 205, 206, 209, 215, 214, + 221, 220, 227, 226, 234, 250, 233, 263, 265, 271, + 272, 279, 280, 284, 285, 287, 289, 295, 298, 301, + 305, 313, 313, 316, 317, 318, 319, 320, 321, 322, + 323, 324, 325, 326, 327, 328, 329, 331, 359, 358, + 366, 365, 372, 373, 372, 378, 379, 383, 385, 387, + 395, 399, 403, 404, 405, 406, 407, 408, 409, 411, + 413, 415, 418, 426, 433, 437, 444, 453, 454, 457, + 459, 464, 475, 485, 489, 498, 499, 502, 503, 507, + 511, 516, 520, 521, 528, 533, 537, 541, 550, 555, + 561, 581, 607, 631, 632, 636, 637, 654, 658, 671, + 676, 687, 700, 712, 729, 737, 748, 762, 779, 781, + 790, 804, 806, 810, 814, 815, 816, 817, 818, 819, + 820, 826, 830, 837, 839, 863, 870, 893, 896, 900, + 903, 909, 916, 922, 927, 932, 939, 941, 941, 943, + 947, 955, 974, 975, 979, 984, 992, 1001, 1020, 1043, + 1050, 1051, 1054, 1060, 1073, 1086, 1098, 1100, 1115, 1117, + 1124, 1133, 1139 +}; +#endif + +#if YYDEBUG || YYERROR_VERBOSE || YYTOKEN_TABLE +/* YYTNAME[SYMBOL-NUM] -- String name of the symbol SYMBOL-NUM. + First, the terminals, then, starting at YYNTOKENS, nonterminals. 
*/ +static const char *const yytname[] = +{ + "$end", "error", "$undefined", "UNEXPECTED", "BAD_DECIMAL", "NL", + "SEMI_COLON", "LBRACE", "RBRACE", "LBOX", "RBOX", "COMMA", "IO_OUT", + "POW_ASG", "MOD_ASG", "DIV_ASG", "MUL_ASG", "SUB_ASG", "ADD_ASG", + "ASSIGN", "COLON", "QMARK", "OR", "AND", "IN", "MATCH", "GTE", "GT", + "LTE", "LT", "NEQ", "EQ", "CAT", "GETLINE", "MINUS", "PLUS", "MOD", + "DIV", "MUL", "UMINUS", "NOT", "PIPE", "IO_IN", "POW", "INC_or_DEC", + "FIELD", "DOLLAR", "RPAREN", "LPAREN", "DOUBLE", "STRING_", "RE", "ID", + "D_ID", "FUNCT_ID", "BUILTIN", "LENGTH", "PRINT", "PRINTF", "SPLIT", + "MATCH_FUNC", "SUB", "GSUB", "DO", "WHILE", "FOR", "BREAK", "CONTINUE", + "IF", "ELSE", "DELETE", "BEGIN", "END", "EXIT", "NEXT", "RETURN", + "FUNCTION", "$accept", "program", "program_block", "PA_block", "$@1", + "$@2", "$@3", "$@4", "$@5", "block", "block_or_separator", + "statement_list", "statement", "separator", "expr", "$@6", "$@7", "$@8", + "$@9", "cat_expr", "p_expr", "lvalue", "arglist", "args", "builtin", + "mark", "print", "pr_args", "arg2", "pr_direction", "if_front", "else", + "do", "while_front", "for1", "for2", "for3", "array_loop_front", "field", + "split_front", "split_back", "re_arg", "return_statement", "getline", + "fvalue", "getline_file", "sub_or_gsub", "sub_back", "function_def", + "funct_start", "funct_head", "f_arglist", "f_args", "outside_error", + "call_args", "ca_front", "ca_back", 0 +}; +#endif + +# ifdef YYPRINT +/* YYTOKNUM[YYLEX-NUM] -- Internal token number corresponding to + token YYLEX-NUM. 
*/ +static const yytype_uint16 yytoknum[] = +{ + 0, 256, 257, 258, 259, 260, 261, 262, 263, 264, + 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, + 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, + 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, + 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, + 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, + 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, + 325, 326, 327, 328, 329, 330, 331 +}; +# endif + +/* YYR1[YYN] -- Symbol number of symbol that rule YYN derives. */ +static const yytype_uint8 yyr1[] = +{ + 0, 77, 78, 78, 79, 79, 79, 80, 81, 80, + 82, 80, 83, 80, 84, 85, 80, 86, 86, 87, + 87, 88, 88, 89, 89, 89, 89, 89, 89, 89, + 89, 90, 90, 91, 91, 91, 91, 91, 91, 91, + 91, 91, 91, 91, 91, 91, 91, 91, 92, 91, + 93, 91, 94, 95, 91, 96, 96, 97, 97, 97, + 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, + 97, 97, 97, 97, 97, 97, 98, 99, 99, 100, + 100, 101, 101, 102, 89, 103, 103, 104, 104, 104, + 105, 105, 106, 106, 107, 89, 108, 89, 109, 89, + 110, 89, 89, 111, 111, 112, 112, 113, 113, 91, + 91, 98, 97, 97, 89, 89, 114, 89, 115, 115, + 115, 115, 115, 97, 91, 91, 91, 91, 91, 91, + 91, 97, 116, 117, 117, 97, 118, 89, 89, 119, + 119, 97, 97, 97, 97, 97, 120, 121, 121, 122, + 122, 97, 123, 123, 124, 124, 125, 126, 127, 127, + 128, 128, 129, 129, 130, 97, 131, 131, 132, 132, + 132, 133, 133 +}; + +/* YYR2[YYN] -- Number of symbols composing right hand side of rule YYN. 
*/ +static const yytype_uint8 yyr2[] = +{ + 0, 2, 1, 2, 1, 1, 2, 1, 0, 3, + 0, 3, 0, 3, 0, 0, 6, 3, 3, 1, + 1, 1, 2, 1, 2, 1, 2, 2, 2, 1, + 2, 1, 1, 1, 3, 3, 3, 3, 3, 3, + 3, 3, 3, 3, 3, 3, 3, 3, 0, 4, + 0, 4, 0, 0, 7, 1, 2, 1, 1, 1, + 3, 1, 3, 3, 3, 3, 3, 3, 2, 2, + 2, 1, 2, 2, 2, 2, 1, 0, 1, 1, + 3, 5, 1, 0, 5, 1, 1, 1, 3, 2, + 3, 3, 0, 2, 4, 2, 1, 4, 1, 7, + 4, 2, 4, 3, 4, 1, 2, 1, 2, 3, + 5, 5, 5, 6, 7, 3, 6, 2, 1, 2, + 6, 2, 3, 1, 3, 3, 3, 3, 3, 3, + 3, 2, 5, 1, 3, 6, 1, 2, 3, 2, + 3, 1, 2, 2, 3, 4, 1, 1, 1, 2, + 3, 6, 1, 1, 1, 3, 2, 4, 2, 2, + 0, 1, 1, 3, 1, 3, 2, 2, 1, 3, + 3, 2, 2 +}; + +/* YYDEFACT[STATE-NAME] -- Default rule to reduce with in state + STATE-NUM when YYTABLE doesn't specify something else to do. Zero + means the default is an error. */ +static const yytype_uint8 yydefact[] = +{ + 0, 164, 0, 146, 0, 0, 0, 0, 118, 0, + 0, 57, 58, 61, 59, 83, 83, 82, 0, 0, + 152, 153, 8, 10, 0, 0, 2, 4, 7, 12, + 33, 55, 0, 71, 123, 0, 141, 0, 0, 5, + 0, 0, 0, 0, 31, 32, 85, 86, 98, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 23, 0, + 21, 25, 0, 83, 0, 0, 0, 0, 0, 29, + 0, 59, 70, 123, 69, 68, 0, 76, 73, 75, + 119, 121, 0, 0, 123, 72, 0, 0, 0, 0, + 0, 0, 0, 158, 159, 1, 3, 14, 52, 48, + 50, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 56, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 74, 0, 133, 131, 149, 147, 148, 142, + 143, 0, 156, 160, 6, 18, 26, 0, 0, 27, + 28, 0, 83, 137, 0, 30, 139, 0, 0, 17, + 22, 24, 77, 95, 0, 101, 105, 0, 0, 117, + 0, 0, 0, 0, 0, 0, 60, 0, 0, 122, + 0, 168, 165, 0, 77, 0, 0, 9, 11, 0, + 0, 0, 0, 109, 47, 46, 45, 44, 43, 42, + 41, 19, 13, 20, 63, 62, 66, 65, 64, 144, + 67, 40, 39, 38, 37, 36, 35, 34, 130, 129, + 128, 127, 126, 125, 124, 0, 150, 136, 0, 162, + 0, 161, 0, 103, 59, 0, 0, 115, 0, 138, + 140, 0, 79, 87, 78, 92, 96, 0, 0, 106, + 107, 0, 0, 0, 0, 0, 90, 91, 0, 0, + 166, 59, 0, 167, 0, 0, 0, 15, 0, 49, + 51, 145, 134, 0, 157, 0, 100, 0, 104, 94, + 0, 89, 0, 0, 0, 0, 97, 0, 108, 102, + 0, 0, 0, 110, 112, 170, 172, 169, 171, 81, + 132, 
0, 0, 53, 0, 163, 0, 0, 88, 80, + 93, 84, 0, 112, 111, 120, 113, 135, 16, 0, + 0, 154, 151, 116, 0, 0, 54, 0, 114, 99, + 155 +}; + +/* YYDEFGOTO[NTERM-NUM]. */ +static const yytype_int16 yydefgoto[] = +{ + -1, 25, 26, 27, 91, 92, 109, 189, 302, 58, + 202, 59, 60, 61, 62, 191, 192, 190, 319, 30, + 31, 32, 243, 244, 33, 86, 63, 245, 83, 285, + 64, 247, 65, 66, 67, 168, 252, 68, 34, 35, + 135, 228, 69, 36, 139, 37, 38, 322, 39, 40, + 41, 230, 231, 42, 182, 183, 263 +}; + +/* YYPACT[STATE-NUM] -- Index in YYTABLE of the portion describing + STATE-NUM. */ +#define YYPACT_NINF -189 +static const yytype_int16 yypact[] = +{ + 324, -189, 457, -189, 799, 799, 799, 1, -189, 709, + 829, -189, -189, -189, 389, -189, -189, -189, -39, -26, + -189, -189, -189, -189, 39, 289, -189, -189, -189, 1077, + 799, 129, 966, -189, 466, 3, 35, 799, -21, -189, + 29, -3, 29, 18, -189, -189, -189, -189, -189, 40, + 69, 34, 34, 86, -1, 594, 34, 594, -189, 386, + -189, -189, 706, -189, 528, 528, 528, 624, 528, -189, + 829, 10, 51, 14, 51, 51, 30, 128, -189, -189, + 128, -189, 446, 5, 704, -189, 130, 94, 104, 829, + 829, 29, 29, -189, -189, -189, -189, -189, -189, -189, + -189, 105, 829, 829, 829, 829, 829, 829, 829, 26, + 129, 799, 799, 799, 799, 799, 136, 799, 829, 829, + 829, 829, 829, 829, 829, 829, 829, 829, 829, 829, + 829, 829, -189, 829, -189, -189, -189, -189, -189, 119, + 73, 829, -189, 121, -189, -189, -189, 829, 654, -189, + -189, 829, 34, -189, 706, -189, -189, 706, 34, -189, + -189, -189, 859, 102, 110, -189, -189, 1045, 739, -189, + 931, 167, 131, 170, 172, 829, -189, 829, 158, -189, + 829, 140, -189, 889, 829, 1098, 1119, -189, -189, 829, + 829, 829, 829, -189, 70, -189, -189, -189, -189, -189, + -189, -189, -189, -189, 23, 23, 51, 51, 51, 1, + 143, 1142, 1142, 1142, 1142, 1142, 1142, 1142, 1142, 1142, + 1142, 1142, 1142, 1142, 1142, 942, -189, 1142, 177, -189, + 146, 187, 969, -189, 216, 1056, 980, -189, 192, -189, + -189, 769, 1142, -189, 193, 190, -189, 528, 157, -189, + -189, 
1007, 528, 829, 829, 829, 1142, 1142, 154, 52, + -189, 199, 517, -189, 162, 159, 829, 1142, 1131, 387, + 595, -189, -189, 829, -189, 168, -189, 171, -189, -189, + 829, -189, 9, 829, 829, 34, -189, 829, -189, -189, + 61, 135, 139, -189, 537, -189, -189, -189, -189, -189, + -189, 175, 26, -189, 586, -189, 179, 145, 158, 1142, + 1142, -189, 1018, 180, -189, -189, -189, -189, -189, 829, + 1, -189, -189, -189, 34, 34, 1142, 189, -189, -189, + -189 +}; + +/* YYPGOTO[NTERM-NUM]. */ +static const yytype_int16 yypgoto[] = +{ + -189, -189, 194, -189, -189, -189, -189, -189, -189, 44, + -75, -189, -53, -14, 0, -189, -189, -189, -189, -189, + 191, -6, 53, -95, -189, 2, -189, -189, 4, -189, + -189, -189, -189, -189, -189, -189, -189, -189, -2, -189, + -189, -28, -189, -189, -188, -189, -189, -189, -189, -189, + -189, -189, -189, -189, -189, -189, -189 +}; + +/* YYTABLE[YYPACT[STATE-NUM]]. What to do in state STATE-NUM. If + positive, shift that token. If negative, reduce the rule which + number is the opposite. If zero, do what YYDEFACT says. + If YYTABLE_NINF, syntax error. 
*/ +#define YYTABLE_NINF -112 +static const yytype_int16 yytable[] = +{ + 29, 78, 73, 73, 73, 79, 160, 73, 84, 89, + 82, 163, 164, 165, 133, 169, 177, 87, 88, -83, + 177, 271, 90, 44, 45, 29, 145, 141, 73, 146, + 137, 44, 45, 2, 138, 73, 2, 149, 150, 44, + 45, 153, 155, 156, 28, 143, 8, 9, 161, 76, + 134, 152, 178, 77, 85, 154, 308, 157, 132, 113, + 114, 115, 294, 283, 116, 162, 117, 167, 84, 28, + 170, 313, 283, 171, 172, 8, 9, 136, 76, 173, + 8, 9, 174, 76, 142, 259, 144, 77, 147, 185, + 186, 93, 116, 94, 117, 203, 103, 104, 105, 106, + 107, 108, 194, 195, 196, 197, 198, 199, 200, 73, + 73, 73, 73, 73, -112, 73, 117, 148, 211, 212, + 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, + 223, 224, 327, 225, 151, 187, 188, -83, 237, 180, + 239, 227, 181, 240, 146, 314, 283, 232, 235, 315, + 283, 236, 184, 201, 238, 324, 283, 193, 290, 291, + 292, 226, 242, 111, 112, 113, 114, 115, 251, 209, + 116, 246, 117, 229, 248, 256, 253, 257, 179, 254, + 242, 255, 258, 262, 242, 307, 117, 260, 273, 267, + 268, 269, 270, 274, 286, 72, 74, 75, 275, 289, + 81, 280, 284, 137, 283, 287, 293, 138, -83, 299, + 295, 300, -76, -76, -76, -76, -76, -76, -76, 96, + 305, 110, 317, 306, 316, -83, 323, 318, 140, -76, + -76, -76, -76, -76, -76, -76, 330, 264, 301, 84, + 277, 82, 0, 85, 0, 282, 296, 0, 0, 0, + 0, 0, 0, 242, 242, 242, 0, 0, 0, 0, + 85, 0, 0, 0, 0, 0, 227, 0, 0, 0, + 0, 311, 0, 304, 0, 0, 0, 0, 0, 0, + 242, 0, 0, 309, 310, 0, 0, 312, 203, 95, + 1, 0, 0, 0, 0, 0, 2, 0, 0, 0, + 0, 0, 204, 205, 206, 207, 208, 0, 210, 0, + 328, 329, 0, 0, 137, 0, 0, 0, 138, 326, + 0, 0, 3, 4, 5, 1, 0, 0, 0, 6, + 0, 2, 0, 7, 8, 9, 0, 10, 11, 12, + 13, 14, 0, 15, 16, 17, 201, 0, 18, 19, + 20, 21, 0, 0, 0, 0, 0, 3, 4, 5, + 22, 23, 0, 0, 6, 24, 0, 0, 7, 8, + 9, 0, 10, 11, 12, 13, 14, 0, 15, 16, + 17, 0, 0, 18, 19, 20, 21, 158, 0, 0, + 0, 44, 45, 2, 159, 22, 23, 0, -83, 0, + 24, 0, -76, -76, -76, -76, -76, -76, -76, 0, + 100, 101, 102, 103, 104, 105, 106, 107, 108, 3, + 4, 5, 0, 0, 0, 0, 6, 0, 0, 0, + 7, 
8, 9, 85, 10, 11, 12, 13, 14, 0, + 15, 16, 17, 46, 47, 18, 19, 20, 21, 48, + 49, 50, 51, 52, 53, 0, 54, 175, 43, 55, + 56, 57, 44, 45, 2, 0, 0, 98, 99, 100, + 101, 102, 103, 104, 105, 106, 107, 108, 0, 125, + 126, 127, 128, 129, 130, 131, 0, 0, 0, 0, + 3, 4, 5, 176, 0, 0, 0, 6, 0, 0, + 0, 7, 8, 9, 0, 10, 11, 12, 13, 14, + 132, 15, 16, 17, 46, 47, 18, 19, 20, 21, + 48, 49, 50, 51, 52, 53, 0, 54, 297, 158, + 55, 56, 57, 44, 45, 2, 0, 0, 98, 99, + 100, 101, 102, 103, 104, 105, 106, 107, 108, 0, + -111, -111, -111, -111, -111, -111, -111, 0, 0, 0, + 0, 3, 4, 5, 298, 0, 0, 0, 6, 0, + 0, 0, 7, 8, 9, 0, 10, 11, 12, 13, + 14, 316, 15, 16, 17, 46, 47, 18, 19, 20, + 21, 48, 49, 50, 51, 52, 53, 320, 54, 44, + 45, 55, 56, 57, 0, 0, 0, 98, 99, 100, + 101, 102, 103, 104, 105, 106, 107, 108, 0, 101, + 102, 103, 104, 105, 106, 107, 108, 3, 4, 5, + 166, 0, 0, 321, 6, 0, 0, 0, 7, 8, + 9, 0, 10, 11, 12, 13, 14, 0, 15, 16, + 17, 0, 0, 18, 19, 20, 21, 3, 4, 5, + 233, 0, 0, 0, 6, 0, 0, 0, 7, 8, + 9, 0, 10, 11, 12, 13, 14, 0, 15, 16, + 17, 0, 0, 18, 19, 20, 21, 3, 4, 5, + 0, 0, 0, 0, 6, 0, 0, 0, 7, 8, + 9, 0, 10, 11, 12, 13, 234, 0, 15, 16, + 17, 44, 45, 18, 19, 20, 21, 125, 126, 127, + 128, 129, 130, 131, 0, 0, 0, 98, 99, 100, + 101, 102, 103, 104, 105, 106, 107, 108, 0, 0, + 0, 0, 3, 4, 5, 0, 0, 0, 132, 6, + 0, 179, 0, 7, 8, 9, 0, 70, 11, 12, + 13, 71, 80, 15, 16, 17, 0, 0, 18, 19, + 20, 21, 3, 4, 5, 0, 0, 0, 0, 6, + 0, 0, 0, 7, 8, 9, 250, 10, 11, 12, + 13, 14, 0, 15, 16, 17, 0, 0, 18, 19, + 20, 21, 3, 4, 5, 0, 0, 0, 0, 6, + 0, 0, 0, 7, 8, 9, 281, 10, 11, 12, + 13, 14, 0, 15, 16, 17, 0, 0, 18, 19, + 20, 21, 3, 4, 5, 0, 0, 0, 0, 6, + 0, 0, 0, 7, 8, 9, 0, 70, 11, 12, + 13, 71, 0, 15, 16, 17, 0, 0, 18, 19, + 20, 21, 3, 4, 5, 0, 0, 0, 0, 6, + 0, 0, 0, 7, 8, 9, 0, 10, 11, 12, + 13, 14, 0, 15, 16, 17, 0, 0, 18, 19, + 20, 21, 3, 4, 5, 0, 0, 0, 0, 6, + 0, 0, 0, 7, 8, 9, 0, 241, 11, 12, + 13, 14, 0, 15, 16, 17, 0, 0, 18, 19, + 20, 21, 3, 4, 5, 0, 0, 0, 0, 6, + 0, 0, 0, 7, 8, 9, 0, 10, 11, 12, + 
13, 261, 0, 15, 16, 17, 0, 0, 18, 19, + 20, 21, 98, 99, 100, 101, 102, 103, 104, 105, + 106, 107, 108, 98, 99, 100, 101, 102, 103, 104, + 105, 106, 107, 108, 0, 0, 0, 0, 176, 118, + 119, 120, 121, 122, 123, 124, 0, 0, 0, 272, + 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, + 108, 98, 99, 100, 101, 102, 103, 104, 105, 106, + 107, 108, 0, 0, 0, 0, 276, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 279, 98, 99, + 100, 101, 102, 103, 104, 105, 106, 107, 108, 98, + 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, + 0, 249, 0, 0, 288, 0, 0, 0, 0, 0, + 0, 0, 278, 0, 0, 325, 98, 99, 100, 101, + 102, 103, 104, 105, 106, 107, 108, 98, 99, 100, + 101, 102, 103, 104, 105, 106, 107, 108, 97, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 98, 99, + 100, 101, 102, 103, 104, 105, 106, 107, 108, 265, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 98, + 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, + 266, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, + 108, 303, 98, 99, 100, 101, 102, 103, 104, 105, + 106, 107, 108, 98, 99, 100, 101, 102, 103, 104, + 105, 106, 107, 108 +}; + +static const yytype_int16 yycheck[] = +{ + 0, 7, 4, 5, 6, 7, 59, 9, 10, 48, + 10, 64, 65, 66, 11, 68, 11, 15, 16, 9, + 11, 209, 48, 5, 6, 25, 8, 48, 30, 43, + 36, 5, 6, 7, 36, 37, 7, 51, 52, 5, + 6, 55, 56, 57, 0, 48, 45, 46, 62, 48, + 47, 52, 47, 52, 44, 55, 47, 57, 44, 36, + 37, 38, 10, 11, 41, 63, 43, 67, 70, 25, + 70, 10, 11, 71, 76, 45, 46, 42, 48, 77, + 45, 46, 80, 48, 40, 180, 42, 52, 48, 89, + 90, 52, 41, 54, 43, 109, 26, 27, 28, 29, + 30, 31, 102, 103, 104, 105, 106, 107, 108, 111, + 112, 113, 114, 115, 41, 117, 43, 48, 118, 119, + 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, + 130, 131, 320, 133, 48, 91, 92, 9, 152, 9, + 154, 141, 48, 157, 158, 10, 11, 147, 148, 10, + 11, 151, 48, 109, 152, 10, 11, 52, 253, 254, + 255, 42, 162, 34, 35, 36, 37, 38, 168, 33, + 41, 69, 43, 52, 64, 175, 9, 177, 47, 9, + 180, 9, 24, 183, 184, 280, 43, 47, 11, 189, + 190, 191, 192, 47, 247, 4, 5, 6, 11, 252, + 9, 9, 12, 209, 11, 48, 
52, 209, 9, 47, + 11, 52, 13, 14, 15, 16, 17, 18, 19, 25, + 52, 30, 47, 52, 44, 9, 47, 302, 37, 13, + 14, 15, 16, 17, 18, 19, 47, 184, 266, 241, + 24, 241, -1, 44, -1, 241, 47, -1, -1, -1, + -1, -1, -1, 253, 254, 255, -1, -1, -1, -1, + 44, -1, -1, -1, -1, -1, 266, -1, -1, -1, + -1, 285, -1, 273, -1, -1, -1, -1, -1, -1, + 280, -1, -1, 283, 284, -1, -1, 287, 302, 0, + 1, -1, -1, -1, -1, -1, 7, -1, -1, -1, + -1, -1, 111, 112, 113, 114, 115, -1, 117, -1, + 324, 325, -1, -1, 320, -1, -1, -1, 320, 319, + -1, -1, 33, 34, 35, 1, -1, -1, -1, 40, + -1, 7, -1, 44, 45, 46, -1, 48, 49, 50, + 51, 52, -1, 54, 55, 56, 302, -1, 59, 60, + 61, 62, -1, -1, -1, -1, -1, 33, 34, 35, + 71, 72, -1, -1, 40, 76, -1, -1, 44, 45, + 46, -1, 48, 49, 50, 51, 52, -1, 54, 55, + 56, -1, -1, 59, 60, 61, 62, 1, -1, -1, + -1, 5, 6, 7, 8, 71, 72, -1, 9, -1, + 76, -1, 13, 14, 15, 16, 17, 18, 19, -1, + 23, 24, 25, 26, 27, 28, 29, 30, 31, 33, + 34, 35, -1, -1, -1, -1, 40, -1, -1, -1, + 44, 45, 46, 44, 48, 49, 50, 51, 52, -1, + 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, + 64, 65, 66, 67, 68, -1, 70, 11, 1, 73, + 74, 75, 5, 6, 7, -1, -1, 21, 22, 23, + 24, 25, 26, 27, 28, 29, 30, 31, -1, 13, + 14, 15, 16, 17, 18, 19, -1, -1, -1, -1, + 33, 34, 35, 47, -1, -1, -1, 40, -1, -1, + -1, 44, 45, 46, -1, 48, 49, 50, 51, 52, + 44, 54, 55, 56, 57, 58, 59, 60, 61, 62, + 63, 64, 65, 66, 67, 68, -1, 70, 11, 1, + 73, 74, 75, 5, 6, 7, -1, -1, 21, 22, + 23, 24, 25, 26, 27, 28, 29, 30, 31, -1, + 13, 14, 15, 16, 17, 18, 19, -1, -1, -1, + -1, 33, 34, 35, 47, -1, -1, -1, 40, -1, + -1, -1, 44, 45, 46, -1, 48, 49, 50, 51, + 52, 44, 54, 55, 56, 57, 58, 59, 60, 61, + 62, 63, 64, 65, 66, 67, 68, 11, 70, 5, + 6, 73, 74, 75, -1, -1, -1, 21, 22, 23, + 24, 25, 26, 27, 28, 29, 30, 31, -1, 24, + 25, 26, 27, 28, 29, 30, 31, 33, 34, 35, + 6, -1, -1, 47, 40, -1, -1, -1, 44, 45, + 46, -1, 48, 49, 50, 51, 52, -1, 54, 55, + 56, -1, -1, 59, 60, 61, 62, 33, 34, 35, + 6, -1, -1, -1, 40, -1, -1, -1, 44, 45, + 46, -1, 48, 49, 50, 51, 52, -1, 54, 55, + 
56, -1, -1, 59, 60, 61, 62, 33, 34, 35, + -1, -1, -1, -1, 40, -1, -1, -1, 44, 45, + 46, -1, 48, 49, 50, 51, 52, -1, 54, 55, + 56, 5, 6, 59, 60, 61, 62, 13, 14, 15, + 16, 17, 18, 19, -1, -1, -1, 21, 22, 23, + 24, 25, 26, 27, 28, 29, 30, 31, -1, -1, + -1, -1, 33, 34, 35, -1, -1, -1, 44, 40, + -1, 47, -1, 44, 45, 46, -1, 48, 49, 50, + 51, 52, 53, 54, 55, 56, -1, -1, 59, 60, + 61, 62, 33, 34, 35, -1, -1, -1, -1, 40, + -1, -1, -1, 44, 45, 46, 47, 48, 49, 50, + 51, 52, -1, 54, 55, 56, -1, -1, 59, 60, + 61, 62, 33, 34, 35, -1, -1, -1, -1, 40, + -1, -1, -1, 44, 45, 46, 47, 48, 49, 50, + 51, 52, -1, 54, 55, 56, -1, -1, 59, 60, + 61, 62, 33, 34, 35, -1, -1, -1, -1, 40, + -1, -1, -1, 44, 45, 46, -1, 48, 49, 50, + 51, 52, -1, 54, 55, 56, -1, -1, 59, 60, + 61, 62, 33, 34, 35, -1, -1, -1, -1, 40, + -1, -1, -1, 44, 45, 46, -1, 48, 49, 50, + 51, 52, -1, 54, 55, 56, -1, -1, 59, 60, + 61, 62, 33, 34, 35, -1, -1, -1, -1, 40, + -1, -1, -1, 44, 45, 46, -1, 48, 49, 50, + 51, 52, -1, 54, 55, 56, -1, -1, 59, 60, + 61, 62, 33, 34, 35, -1, -1, -1, -1, 40, + -1, -1, -1, 44, 45, 46, -1, 48, 49, 50, + 51, 52, -1, 54, 55, 56, -1, -1, 59, 60, + 61, 62, 21, 22, 23, 24, 25, 26, 27, 28, + 29, 30, 31, 21, 22, 23, 24, 25, 26, 27, + 28, 29, 30, 31, -1, -1, -1, -1, 47, 13, + 14, 15, 16, 17, 18, 19, -1, -1, -1, 47, + 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, + 31, 21, 22, 23, 24, 25, 26, 27, 28, 29, + 30, 31, -1, -1, -1, -1, 47, -1, -1, -1, + -1, -1, -1, -1, -1, -1, -1, 47, 21, 22, + 23, 24, 25, 26, 27, 28, 29, 30, 31, 21, + 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, + -1, 6, -1, -1, 47, -1, -1, -1, -1, -1, + -1, -1, 6, -1, -1, 47, 21, 22, 23, 24, + 25, 26, 27, 28, 29, 30, 31, 21, 22, 23, + 24, 25, 26, 27, 28, 29, 30, 31, 11, -1, + -1, -1, -1, -1, -1, -1, -1, -1, 21, 22, + 23, 24, 25, 26, 27, 28, 29, 30, 31, 11, + -1, -1, -1, -1, -1, -1, -1, -1, -1, 21, + 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, + 11, -1, -1, -1, -1, -1, -1, -1, -1, -1, + 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, + 31, 20, 21, 22, 23, 24, 25, 
26, 27, 28, + 29, 30, 31, 21, 22, 23, 24, 25, 26, 27, + 28, 29, 30, 31 +}; + +/* YYSTOS[STATE-NUM] -- The (internal number of the) accessing + symbol of state STATE-NUM. */ +static const yytype_uint8 yystos[] = +{ + 0, 1, 7, 33, 34, 35, 40, 44, 45, 46, + 48, 49, 50, 51, 52, 54, 55, 56, 59, 60, + 61, 62, 71, 72, 76, 78, 79, 80, 86, 91, + 96, 97, 98, 101, 115, 116, 120, 122, 123, 125, + 126, 127, 130, 1, 5, 6, 57, 58, 63, 64, + 65, 66, 67, 68, 70, 73, 74, 75, 86, 88, + 89, 90, 91, 103, 107, 109, 110, 111, 114, 119, + 48, 52, 97, 115, 97, 97, 48, 52, 98, 115, + 53, 97, 91, 105, 115, 44, 102, 102, 102, 48, + 48, 81, 82, 52, 54, 0, 79, 11, 21, 22, + 23, 24, 25, 26, 27, 28, 29, 30, 31, 83, + 97, 34, 35, 36, 37, 38, 41, 43, 13, 14, + 15, 16, 17, 18, 19, 13, 14, 15, 16, 17, + 18, 19, 44, 11, 47, 117, 42, 98, 115, 121, + 97, 48, 86, 48, 86, 8, 90, 48, 48, 90, + 90, 48, 52, 90, 91, 90, 90, 91, 1, 8, + 89, 90, 102, 89, 89, 89, 6, 91, 112, 89, + 91, 102, 115, 102, 102, 11, 47, 11, 47, 47, + 9, 48, 131, 132, 48, 91, 91, 86, 86, 84, + 94, 92, 93, 52, 91, 91, 91, 91, 91, 91, + 91, 86, 87, 90, 97, 97, 97, 97, 97, 33, + 97, 91, 91, 91, 91, 91, 91, 91, 91, 91, + 91, 91, 91, 91, 91, 91, 42, 91, 118, 52, + 128, 129, 91, 6, 52, 91, 91, 90, 102, 90, + 90, 48, 91, 99, 100, 104, 69, 108, 64, 6, + 47, 91, 113, 9, 9, 9, 91, 91, 24, 100, + 47, 52, 91, 133, 99, 11, 11, 91, 91, 91, + 91, 121, 47, 11, 47, 11, 47, 24, 6, 47, + 9, 47, 105, 11, 12, 106, 89, 48, 47, 89, + 100, 100, 100, 52, 10, 11, 47, 11, 47, 47, + 52, 118, 85, 20, 91, 52, 52, 100, 47, 91, + 91, 90, 91, 10, 10, 10, 44, 47, 87, 95, + 11, 47, 124, 47, 10, 47, 91, 121, 90, 90, + 47 +}; + +#define yyerrok (yyerrstatus = 0) +#define yyclearin (yychar = YYEMPTY) +#define YYEMPTY (-2) +#define YYEOF 0 + +#define YYACCEPT goto yyacceptlab +#define YYABORT goto yyabortlab +#define YYERROR goto yyerrorlab + + +/* Like YYERROR except do call yyerror. 
This remains here temporarily + to ease the transition to the new meaning of YYERROR, for GCC. + Once GCC version 2 has supplanted version 1, this can go. */ + +#define YYFAIL goto yyerrlab + +#define YYRECOVERING() (!!yyerrstatus) + +#define YYBACKUP(Token, Value) \ +do \ + if (yychar == YYEMPTY && yylen == 1) \ + { \ + yychar = (Token); \ + yylval = (Value); \ + yytoken = YYTRANSLATE (yychar); \ + YYPOPSTACK (1); \ + goto yybackup; \ + } \ + else \ + { \ + yyerror (YY_("syntax error: cannot back up")); \ + YYERROR; \ + } \ +while (YYID (0)) + + +#define YYTERROR 1 +#define YYERRCODE 256 + + +/* YYLLOC_DEFAULT -- Set CURRENT to span from RHS[1] to RHS[N]. + If N is 0, then set CURRENT to the empty location which ends + the previous symbol: RHS[0] (always defined). */ + +#define YYRHSLOC(Rhs, K) ((Rhs)[K]) +#ifndef YYLLOC_DEFAULT +# define YYLLOC_DEFAULT(Current, Rhs, N) \ + do \ + if (YYID (N)) \ + { \ + (Current).first_line = YYRHSLOC (Rhs, 1).first_line; \ + (Current).first_column = YYRHSLOC (Rhs, 1).first_column; \ + (Current).last_line = YYRHSLOC (Rhs, N).last_line; \ + (Current).last_column = YYRHSLOC (Rhs, N).last_column; \ + } \ + else \ + { \ + (Current).first_line = (Current).last_line = \ + YYRHSLOC (Rhs, 0).last_line; \ + (Current).first_column = (Current).last_column = \ + YYRHSLOC (Rhs, 0).last_column; \ + } \ + while (YYID (0)) +#endif + + +/* YY_LOCATION_PRINT -- Print the location on the stream. + This macro was not mandated originally: define only if we know + we won't break user code: when these are the locations we know. */ + +#ifndef YY_LOCATION_PRINT +# if YYLTYPE_IS_TRIVIAL +# define YY_LOCATION_PRINT(File, Loc) \ + fprintf (File, "%d.%d-%d.%d", \ + (Loc).first_line, (Loc).first_column, \ + (Loc).last_line, (Loc).last_column) +# else +# define YY_LOCATION_PRINT(File, Loc) ((void) 0) +# endif +#endif + + +/* YYLEX -- calling `yylex' with the right arguments. 
*/ + +#ifdef YYLEX_PARAM +# define YYLEX yylex (YYLEX_PARAM) +#else +# define YYLEX yylex () +#endif + +/* Enable debugging if requested. */ +#if YYDEBUG + +# ifndef YYFPRINTF +# include <stdio.h> /* INFRINGES ON USER NAME SPACE */ +# define YYFPRINTF fprintf +# endif + +# define YYDPRINTF(Args) \ +do { \ + if (yydebug) \ + YYFPRINTF Args; \ +} while (YYID (0)) + +# define YY_SYMBOL_PRINT(Title, Type, Value, Location) \ +do { \ + if (yydebug) \ + { \ + YYFPRINTF (stderr, "%s ", Title); \ + yy_symbol_print (stderr, \ + Type, Value); \ + YYFPRINTF (stderr, "\n"); \ + } \ +} while (YYID (0)) + + +/*--------------------------------. +| Print this symbol on YYOUTPUT. | +`--------------------------------*/ + +/*ARGSUSED*/ +#if (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +static void +yy_symbol_value_print (FILE *yyoutput, int yytype, YYSTYPE const * const yyvaluep) +#else +static void +yy_symbol_value_print (yyoutput, yytype, yyvaluep) + FILE *yyoutput; + int yytype; + YYSTYPE const * const yyvaluep; +#endif +{ + if (!yyvaluep) + return; +# ifdef YYPRINT + if (yytype < YYNTOKENS) + YYPRINT (yyoutput, yytoknum[yytype], *yyvaluep); +# else + YYUSE (yyoutput); +# endif + switch (yytype) + { + default: + break; + } +} + + +/*--------------------------------. +| Print this symbol on YYOUTPUT. 
| +`--------------------------------*/ + +#if (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +static void +yy_symbol_print (FILE *yyoutput, int yytype, YYSTYPE const * const yyvaluep) +#else +static void +yy_symbol_print (yyoutput, yytype, yyvaluep) + FILE *yyoutput; + int yytype; + YYSTYPE const * const yyvaluep; +#endif +{ + if (yytype < YYNTOKENS) + YYFPRINTF (yyoutput, "token %s (", yytname[yytype]); + else + YYFPRINTF (yyoutput, "nterm %s (", yytname[yytype]); + + yy_symbol_value_print (yyoutput, yytype, yyvaluep); + YYFPRINTF (yyoutput, ")"); +} + +/*------------------------------------------------------------------. +| yy_stack_print -- Print the state stack from its BOTTOM up to its | +| TOP (included). | +`------------------------------------------------------------------*/ + +#if (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +static void +yy_stack_print (yytype_int16 *yybottom, yytype_int16 *yytop) +#else +static void +yy_stack_print (yybottom, yytop) + yytype_int16 *yybottom; + yytype_int16 *yytop; +#endif +{ + YYFPRINTF (stderr, "Stack now"); + for (; yybottom <= yytop; yybottom++) + { + int yybot = *yybottom; + YYFPRINTF (stderr, " %d", yybot); + } + YYFPRINTF (stderr, "\n"); +} + +# define YY_STACK_PRINT(Bottom, Top) \ +do { \ + if (yydebug) \ + yy_stack_print ((Bottom), (Top)); \ +} while (YYID (0)) + + +/*------------------------------------------------. +| Report that the YYRULE is going to be reduced. 
| +`------------------------------------------------*/ + +#if (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +static void +yy_reduce_print (YYSTYPE *yyvsp, int yyrule) +#else +static void +yy_reduce_print (yyvsp, yyrule) + YYSTYPE *yyvsp; + int yyrule; +#endif +{ + int yynrhs = yyr2[yyrule]; + int yyi; + unsigned long int yylno = yyrline[yyrule]; + YYFPRINTF (stderr, "Reducing stack by rule %d (line %lu):\n", + yyrule - 1, yylno); + /* The symbols being reduced. */ + for (yyi = 0; yyi < yynrhs; yyi++) + { + YYFPRINTF (stderr, " $%d = ", yyi + 1); + yy_symbol_print (stderr, yyrhs[yyprhs[yyrule] + yyi], + &(yyvsp[(yyi + 1) - (yynrhs)]) + ); + YYFPRINTF (stderr, "\n"); + } +} + +# define YY_REDUCE_PRINT(Rule) \ +do { \ + if (yydebug) \ + yy_reduce_print (yyvsp, Rule); \ +} while (YYID (0)) + +/* Nonzero means print parse trace. It is left uninitialized so that + multiple parsers can coexist. */ +int yydebug; +#else /* !YYDEBUG */ +# define YYDPRINTF(Args) +# define YY_SYMBOL_PRINT(Title, Type, Value, Location) +# define YY_STACK_PRINT(Bottom, Top) +# define YY_REDUCE_PRINT(Rule) +#endif /* !YYDEBUG */ + + +/* YYINITDEPTH -- initial size of the parser's stacks. */ +#ifndef YYINITDEPTH +# define YYINITDEPTH 200 +#endif + +/* YYMAXDEPTH -- maximum size the stacks can grow to (effective only + if the built-in stack extension method is used). + + Do not make this value too large; the results are undefined if + YYSTACK_ALLOC_MAXIMUM < YYSTACK_BYTES (YYMAXDEPTH) + evaluated with infinite-precision integer arithmetic. */ + +#ifndef YYMAXDEPTH +# define YYMAXDEPTH 10000 +#endif + + + +#if YYERROR_VERBOSE + +# ifndef yystrlen +# if defined __GLIBC__ && defined _STRING_H +# define yystrlen strlen +# else +/* Return the length of YYSTR. 
*/ +#if (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +static YYSIZE_T +yystrlen (const char *yystr) +#else +static YYSIZE_T +yystrlen (yystr) + const char *yystr; +#endif +{ + YYSIZE_T yylen; + for (yylen = 0; yystr[yylen]; yylen++) + continue; + return yylen; +} +# endif +# endif + +# ifndef yystpcpy +# if defined __GLIBC__ && defined _STRING_H && defined _GNU_SOURCE +# define yystpcpy stpcpy +# else +/* Copy YYSRC to YYDEST, returning the address of the terminating '\0' in + YYDEST. */ +#if (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +static char * +yystpcpy (char *yydest, const char *yysrc) +#else +static char * +yystpcpy (yydest, yysrc) + char *yydest; + const char *yysrc; +#endif +{ + char *yyd = yydest; + const char *yys = yysrc; + + while ((*yyd++ = *yys++) != '\0') + continue; + + return yyd - 1; +} +# endif +# endif + +# ifndef yytnamerr +/* Copy to YYRES the contents of YYSTR after stripping away unnecessary + quotes and backslashes, so that it's suitable for yyerror. The + heuristic is that double-quoting is unnecessary unless the string + contains an apostrophe, a comma, or backslash (other than + backslash-backslash). YYSTR is taken from yytname. If YYRES is + null, do not copy; instead, return the length of what the result + would have been. */ +static YYSIZE_T +yytnamerr (char *yyres, const char *yystr) +{ + if (*yystr == '"') + { + YYSIZE_T yyn = 0; + char const *yyp = yystr; + + for (;;) + switch (*++yyp) + { + case '\'': + case ',': + goto do_not_strip_quotes; + + case '\\': + if (*++yyp != '\\') + goto do_not_strip_quotes; + /* Fall through. */ + default: + if (yyres) + yyres[yyn] = *yyp; + yyn++; + break; + + case '"': + if (yyres) + yyres[yyn] = '\0'; + return yyn; + } + do_not_strip_quotes: ; + } + + if (! 
yyres) + return yystrlen (yystr); + + return yystpcpy (yyres, yystr) - yyres; +} +# endif + +/* Copy into YYRESULT an error message about the unexpected token + YYCHAR while in state YYSTATE. Return the number of bytes copied, + including the terminating null byte. If YYRESULT is null, do not + copy anything; just return the number of bytes that would be + copied. As a special case, return 0 if an ordinary "syntax error" + message will do. Return YYSIZE_MAXIMUM if overflow occurs during + size calculation. */ +static YYSIZE_T +yysyntax_error (char *yyresult, int yystate, int yychar) +{ + int yyn = yypact[yystate]; + + if (! (YYPACT_NINF < yyn && yyn <= YYLAST)) + return 0; + else + { + int yytype = YYTRANSLATE (yychar); + YYSIZE_T yysize0 = yytnamerr (0, yytname[yytype]); + YYSIZE_T yysize = yysize0; + YYSIZE_T yysize1; + int yysize_overflow = 0; + enum { YYERROR_VERBOSE_ARGS_MAXIMUM = 5 }; + char const *yyarg[YYERROR_VERBOSE_ARGS_MAXIMUM]; + int yyx; + +# if 0 + /* This is so xgettext sees the translatable formats that are + constructed on the fly. */ + YY_("syntax error, unexpected %s"); + YY_("syntax error, unexpected %s, expecting %s"); + YY_("syntax error, unexpected %s, expecting %s or %s"); + YY_("syntax error, unexpected %s, expecting %s or %s or %s"); + YY_("syntax error, unexpected %s, expecting %s or %s or %s or %s"); +# endif + char *yyfmt; + char const *yyf; + static char const yyunexpected[] = "syntax error, unexpected %s"; + static char const yyexpecting[] = ", expecting %s"; + static char const yyor[] = " or %s"; + char yyformat[sizeof yyunexpected + + sizeof yyexpecting - 1 + + ((YYERROR_VERBOSE_ARGS_MAXIMUM - 2) + * (sizeof yyor - 1))]; + char const *yyprefix = yyexpecting; + + /* Start YYX at -YYN if negative to avoid negative indexes in + YYCHECK. */ + int yyxbegin = yyn < 0 ? -yyn : 0; + + /* Stay within bounds of both yycheck and yytname. */ + int yychecklim = YYLAST - yyn + 1; + int yyxend = yychecklim < YYNTOKENS ? 
yychecklim : YYNTOKENS; + int yycount = 1; + + yyarg[0] = yytname[yytype]; + yyfmt = yystpcpy (yyformat, yyunexpected); + + for (yyx = yyxbegin; yyx < yyxend; ++yyx) + if (yycheck[yyx + yyn] == yyx && yyx != YYTERROR) + { + if (yycount == YYERROR_VERBOSE_ARGS_MAXIMUM) + { + yycount = 1; + yysize = yysize0; + yyformat[sizeof yyunexpected - 1] = '\0'; + break; + } + yyarg[yycount++] = yytname[yyx]; + yysize1 = yysize + yytnamerr (0, yytname[yyx]); + yysize_overflow |= (yysize1 < yysize); + yysize = yysize1; + yyfmt = yystpcpy (yyfmt, yyprefix); + yyprefix = yyor; + } + + yyf = YY_(yyformat); + yysize1 = yysize + yystrlen (yyf); + yysize_overflow |= (yysize1 < yysize); + yysize = yysize1; + + if (yysize_overflow) + return YYSIZE_MAXIMUM; + + if (yyresult) + { + /* Avoid sprintf, as that infringes on the user's name space. + Don't have undefined behavior even if the translation + produced a string with the wrong number of "%s"s. */ + char *yyp = yyresult; + int yyi = 0; + while ((*yyp = *yyf) != '\0') + { + if (*yyp == '%' && yyf[1] == 's' && yyi < yycount) + { + yyp += yytnamerr (yyp, yyarg[yyi++]); + yyf += 2; + } + else + { + yyp++; + yyf++; + } + } + } + return yysize; + } +} +#endif /* YYERROR_VERBOSE */ + + +/*-----------------------------------------------. +| Release the memory associated to this symbol. | +`-----------------------------------------------*/ + +/*ARGSUSED*/ +#if (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +static void +yydestruct (const char *yymsg, int yytype, YYSTYPE *yyvaluep) +#else +static void +yydestruct (yymsg, yytype, yyvaluep) + const char *yymsg; + int yytype; + YYSTYPE *yyvaluep; +#endif +{ + YYUSE (yyvaluep); + + if (!yymsg) + yymsg = "Deleting"; + YY_SYMBOL_PRINT (yymsg, yytype, yyvaluep, yylocationp); + + switch (yytype) + { + + default: + break; + } +} + +/* Prevent warnings from -Wmissing-prototypes. 
*/ +#ifdef YYPARSE_PARAM +#if defined __STDC__ || defined __cplusplus +int yyparse (void *YYPARSE_PARAM); +#else +int yyparse (); +#endif +#else /* ! YYPARSE_PARAM */ +#if defined __STDC__ || defined __cplusplus +int yyparse (void); +#else +int yyparse (); +#endif +#endif /* ! YYPARSE_PARAM */ + + +/* The lookahead symbol. */ +int yychar; + +/* The semantic value of the lookahead symbol. */ +YYSTYPE yylval; + +/* Number of syntax errors so far. */ +int yynerrs; + + + +/*-------------------------. +| yyparse or yypush_parse. | +`-------------------------*/ + +#ifdef YYPARSE_PARAM +#if (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +int +yyparse (void *YYPARSE_PARAM) +#else +int +yyparse (YYPARSE_PARAM) + void *YYPARSE_PARAM; +#endif +#else /* ! YYPARSE_PARAM */ +#if (defined __STDC__ || defined __C99__FUNC__ \ + || defined __cplusplus || defined _MSC_VER) +int +yyparse (void) +#else +int +yyparse () + +#endif +#endif +{ + + + int yystate; + /* Number of tokens to shift before error messages enabled. */ + int yyerrstatus; + + /* The stacks and their tools: + `yyss': related to states. + `yyvs': related to semantic values. + + Refer to the stacks thru separate pointers, to allow yyoverflow + to reallocate them elsewhere. */ + + /* The state stack. */ + yytype_int16 yyssa[YYINITDEPTH]; + yytype_int16 *yyss; + yytype_int16 *yyssp; + + /* The semantic value stack. */ + YYSTYPE yyvsa[YYINITDEPTH]; + YYSTYPE *yyvs; + YYSTYPE *yyvsp; + + YYSIZE_T yystacksize; + + int yyn; + int yyresult; + /* Lookahead token as an internal (translated) token number. */ + int yytoken; + /* The variables used to return semantic value and location from the + action routines. */ + YYSTYPE yyval; + +#if YYERROR_VERBOSE + /* Buffer for error messages, and its allocated size. 
*/ + char yymsgbuf[128]; + char *yymsg = yymsgbuf; + YYSIZE_T yymsg_alloc = sizeof yymsgbuf; +#endif + +#define YYPOPSTACK(N) (yyvsp -= (N), yyssp -= (N)) + + /* The number of symbols on the RHS of the reduced rule. + Keep to zero when no symbol should be popped. */ + int yylen = 0; + + yytoken = 0; + yyss = yyssa; + yyvs = yyvsa; + yystacksize = YYINITDEPTH; + + YYDPRINTF ((stderr, "Starting parse\n")); + + yystate = 0; + yyerrstatus = 0; + yynerrs = 0; + yychar = YYEMPTY; /* Cause a token to be read. */ + + /* Initialize stack pointers. + Waste one element of value and location stack + so that they stay on the same level as the state stack. + The wasted elements are never initialized. */ + yyssp = yyss; + yyvsp = yyvs; + + goto yysetstate; + +/*------------------------------------------------------------. +| yynewstate -- Push a new state, which is found in yystate. | +`------------------------------------------------------------*/ + yynewstate: + /* In all cases, when you get here, the value and location stacks + have just been pushed. So pushing a state here evens the stacks. */ + yyssp++; + + yysetstate: + *yyssp = yystate; + + if (yyss + yystacksize - 1 <= yyssp) + { + /* Get the current used size of the three stacks, in elements. */ + YYSIZE_T yysize = yyssp - yyss + 1; + +#ifdef yyoverflow + { + /* Give user a chance to reallocate the stack. Use copies of + these so that the &'s don't force the real ones into + memory. */ + YYSTYPE *yyvs1 = yyvs; + yytype_int16 *yyss1 = yyss; + + /* Each stack pointer address is followed by the size of the + data in use in that stack, in bytes. This used to be a + conditional around just the two extra args, but that might + be undefined if yyoverflow is a macro. 
*/ + yyoverflow (YY_("memory exhausted"), + &yyss1, yysize * sizeof (*yyssp), + &yyvs1, yysize * sizeof (*yyvsp), + &yystacksize); + + yyss = yyss1; + yyvs = yyvs1; + } +#else /* no yyoverflow */ +# ifndef YYSTACK_RELOCATE + goto yyexhaustedlab; +# else + /* Extend the stack our own way. */ + if (YYMAXDEPTH <= yystacksize) + goto yyexhaustedlab; + yystacksize *= 2; + if (YYMAXDEPTH < yystacksize) + yystacksize = YYMAXDEPTH; + + { + yytype_int16 *yyss1 = yyss; + union yyalloc *yyptr = + (union yyalloc *) YYSTACK_ALLOC (YYSTACK_BYTES (yystacksize)); + if (! yyptr) + goto yyexhaustedlab; + YYSTACK_RELOCATE (yyss_alloc, yyss); + YYSTACK_RELOCATE (yyvs_alloc, yyvs); +# undef YYSTACK_RELOCATE + if (yyss1 != yyssa) + YYSTACK_FREE (yyss1); + } +# endif +#endif /* no yyoverflow */ + + yyssp = yyss + yysize - 1; + yyvsp = yyvs + yysize - 1; + + YYDPRINTF ((stderr, "Stack size increased to %lu\n", + (unsigned long int) yystacksize)); + + if (yyss + yystacksize - 1 <= yyssp) + YYABORT; + } + + YYDPRINTF ((stderr, "Entering state %d\n", yystate)); + + if (yystate == YYFINAL) + YYACCEPT; + + goto yybackup; + +/*-----------. +| yybackup. | +`-----------*/ +yybackup: + + /* Do appropriate processing given the current state. Read a + lookahead token if we need one and don't already have one. */ + + /* First try to decide what to do without reference to lookahead token. */ + yyn = yypact[yystate]; + if (yyn == YYPACT_NINF) + goto yydefault; + + /* Not known => get a lookahead token if don't already have one. */ + + /* YYCHAR is either YYEMPTY or YYEOF or a valid lookahead symbol. 
*/ + if (yychar == YYEMPTY) + { + YYDPRINTF ((stderr, "Reading a token: ")); + yychar = YYLEX; + } + + if (yychar <= YYEOF) + { + yychar = yytoken = YYEOF; + YYDPRINTF ((stderr, "Now at end of input.\n")); + } + else + { + yytoken = YYTRANSLATE (yychar); + YY_SYMBOL_PRINT ("Next token is", yytoken, &yylval, &yylloc); + } + + /* If the proper action on seeing token YYTOKEN is to reduce or to + detect an error, take that action. */ + yyn += yytoken; + if (yyn < 0 || YYLAST < yyn || yycheck[yyn] != yytoken) + goto yydefault; + yyn = yytable[yyn]; + if (yyn <= 0) + { + if (yyn == 0 || yyn == YYTABLE_NINF) + goto yyerrlab; + yyn = -yyn; + goto yyreduce; + } + + /* Count tokens shifted since error; after three, turn off error + status. */ + if (yyerrstatus) + yyerrstatus--; + + /* Shift the lookahead token. */ + YY_SYMBOL_PRINT ("Shifting", yytoken, &yylval, &yylloc); + + /* Discard the shifted token. */ + yychar = YYEMPTY; + + yystate = yyn; + *++yyvsp = yylval; + + goto yynewstate; + + +/*-----------------------------------------------------------. +| yydefault -- do the default action for the current state. | +`-----------------------------------------------------------*/ +yydefault: + yyn = yydefact[yystate]; + if (yyn == 0) + goto yyerrlab; + goto yyreduce; + + +/*-----------------------------. +| yyreduce -- Do a reduction. | +`-----------------------------*/ +yyreduce: + /* yyn is the number of a rule to reduce with. */ + yylen = yyr2[yyn]; + + /* If YYLEN is nonzero, implement the default value of the action: + `$$ = $1'. + + Otherwise, the following line sets YYVAL to garbage. + This behavior is undocumented and Bison + users should not rely upon it. Assigning to YYVAL + unconditionally makes the parser a bit smaller, and it avoids a + GCC warning that YYVAL may be used uninitialized. 
*/ + yyval = yyvsp[1-yylen]; + + + YY_REDUCE_PRINT (yyn); + switch (yyn) + { + case 7: + +/* Line 1455 of yacc.c */ +#line 210 "parse.y" + { /* this do nothing action removes a vacuous warning + from Bison */ + } + break; + + case 8: + +/* Line 1455 of yacc.c */ +#line 215 "parse.y" + { be_setup(scope = SCOPE_BEGIN) ; } + break; + + case 9: + +/* Line 1455 of yacc.c */ +#line 218 "parse.y" + { switch_code_to_main() ; } + break; + + case 10: + +/* Line 1455 of yacc.c */ +#line 221 "parse.y" + { be_setup(scope = SCOPE_END) ; } + break; + + case 11: + +/* Line 1455 of yacc.c */ +#line 224 "parse.y" + { switch_code_to_main() ; } + break; + + case 12: + +/* Line 1455 of yacc.c */ +#line 227 "parse.y" + { code_jmp(_JZ, (INST*)0) ; } + break; + + case 13: + +/* Line 1455 of yacc.c */ +#line 230 "parse.y" + { patch_jmp( code_ptr ) ; } + break; + + case 14: + +/* Line 1455 of yacc.c */ +#line 234 "parse.y" + { + INST *p1 = CDP((yyvsp[(1) - (2)].start)) ; + int len ; + + code_push(p1, code_ptr - p1, scope, active_funct) ; + code_ptr = p1 ; + + code2op(_RANGE, 1) ; + code_ptr += 3 ; + len = code_pop(code_ptr) ; + code_ptr += len ; + code1(_STOP) ; + p1 = CDP((yyvsp[(1) - (2)].start)) ; + p1[2].op = code_ptr - (p1+1) ; + } + break; + + case 15: + +/* Line 1455 of yacc.c */ +#line 250 "parse.y" + { code1(_STOP) ; } + break; + + case 16: + +/* Line 1455 of yacc.c */ +#line 253 "parse.y" + { + INST *p1 = CDP((yyvsp[(1) - (6)].start)) ; + + p1[3].op = CDP((yyvsp[(6) - (6)].start)) - (p1+1) ; + p1[4].op = code_ptr - (p1+1) ; + } + break; + + case 17: + +/* Line 1455 of yacc.c */ +#line 264 "parse.y" + { (yyval.start) = (yyvsp[(2) - (3)].start) ; } + break; + + case 18: + +/* Line 1455 of yacc.c */ +#line 266 "parse.y" + { (yyval.start) = code_offset ; /* does nothing won't be executed */ + print_flag = getline_flag = paren_cnt = 0 ; + yyerrok ; } + break; + + case 20: + +/* Line 1455 of yacc.c */ +#line 273 "parse.y" + { (yyval.start) = code_offset ; + code1(_PUSHINT) ; code1(0) ; 
+ code2(_PRINT, bi_print) ; + } + break; + + case 24: + +/* Line 1455 of yacc.c */ +#line 286 "parse.y" + { code1(_POP) ; } + break; + + case 25: + +/* Line 1455 of yacc.c */ +#line 288 "parse.y" + { (yyval.start) = code_offset ; } + break; + + case 26: + +/* Line 1455 of yacc.c */ +#line 290 "parse.y" + { (yyval.start) = code_offset ; + print_flag = getline_flag = 0 ; + paren_cnt = 0 ; + yyerrok ; + } + break; + + case 27: + +/* Line 1455 of yacc.c */ +#line 296 "parse.y" + { (yyval.start) = code_offset ; BC_insert('B', code_ptr+1) ; + code2(_JMP, 0) /* don't use code_jmp ! */ ; } + break; + + case 28: + +/* Line 1455 of yacc.c */ +#line 299 "parse.y" + { (yyval.start) = code_offset ; BC_insert('C', code_ptr+1) ; + code2(_JMP, 0) ; } + break; + + case 29: + +/* Line 1455 of yacc.c */ +#line 302 "parse.y" + { if ( scope != SCOPE_FUNCT ) + compile_error("return outside function body") ; + } + break; + + case 30: + +/* Line 1455 of yacc.c */ +#line 306 "parse.y" + { if ( scope != SCOPE_MAIN ) + compile_error( "improper use of next" ) ; + (yyval.start) = code_offset ; + code1(_NEXT) ; + } + break; + + case 34: + +/* Line 1455 of yacc.c */ +#line 317 "parse.y" + { code1(_ASSIGN) ; } + break; + + case 35: + +/* Line 1455 of yacc.c */ +#line 318 "parse.y" + { code1(_ADD_ASG) ; } + break; + + case 36: + +/* Line 1455 of yacc.c */ +#line 319 "parse.y" + { code1(_SUB_ASG) ; } + break; + + case 37: + +/* Line 1455 of yacc.c */ +#line 320 "parse.y" + { code1(_MUL_ASG) ; } + break; + + case 38: + +/* Line 1455 of yacc.c */ +#line 321 "parse.y" + { code1(_DIV_ASG) ; } + break; + + case 39: + +/* Line 1455 of yacc.c */ +#line 322 "parse.y" + { code1(_MOD_ASG) ; } + break; + + case 40: + +/* Line 1455 of yacc.c */ +#line 323 "parse.y" + { code1(_POW_ASG) ; } + break; + + case 41: + +/* Line 1455 of yacc.c */ +#line 324 "parse.y" + { code1(_EQ) ; } + break; + + case 42: + +/* Line 1455 of yacc.c */ +#line 325 "parse.y" + { code1(_NEQ) ; } + break; + + case 43: + +/* Line 1455 of 
yacc.c */ +#line 326 "parse.y" + { code1(_LT) ; } + break; + + case 44: + +/* Line 1455 of yacc.c */ +#line 327 "parse.y" + { code1(_LTE) ; } + break; + + case 45: + +/* Line 1455 of yacc.c */ +#line 328 "parse.y" + { code1(_GT) ; } + break; + + case 46: + +/* Line 1455 of yacc.c */ +#line 329 "parse.y" + { code1(_GTE) ; } + break; + + case 47: + +/* Line 1455 of yacc.c */ +#line 332 "parse.y" + { + INST *p3 = CDP((yyvsp[(3) - (3)].start)) ; + + if ( p3 == code_ptr - 2 ) + { + if ( p3->op == _MATCH0 ) p3->op = _MATCH1 ; + + else /* check for string */ + if ( p3->op == _PUSHS ) + { CELL *cp = ZMALLOC(CELL) ; + + cp->type = C_STRING ; + cp->ptr = p3[1].ptr ; + cast_to_RE(cp) ; + code_ptr -= 2 ; + code2(_MATCH1, cp->ptr) ; + ZFREE(cp) ; + } + else code1(_MATCH2) ; + } + else code1(_MATCH2) ; + + if ( !(yyvsp[(2) - (3)].ival) ) code1(_NOT) ; + } + break; + + case 48: + +/* Line 1455 of yacc.c */ +#line 359 "parse.y" + { code1(_TEST) ; + code_jmp(_LJNZ, (INST*)0) ; + } + break; + + case 49: + +/* Line 1455 of yacc.c */ +#line 363 "parse.y" + { code1(_TEST) ; patch_jmp(code_ptr) ; } + break; + + case 50: + +/* Line 1455 of yacc.c */ +#line 366 "parse.y" + { code1(_TEST) ; + code_jmp(_LJZ, (INST*)0) ; + } + break; + + case 51: + +/* Line 1455 of yacc.c */ +#line 370 "parse.y" + { code1(_TEST) ; patch_jmp(code_ptr) ; } + break; + + case 52: + +/* Line 1455 of yacc.c */ +#line 372 "parse.y" + { code_jmp(_JZ, (INST*)0) ; } + break; + + case 53: + +/* Line 1455 of yacc.c */ +#line 373 "parse.y" + { code_jmp(_JMP, (INST*)0) ; } + break; + + case 54: + +/* Line 1455 of yacc.c */ +#line 375 "parse.y" + { patch_jmp(code_ptr) ; patch_jmp(CDP((yyvsp[(7) - (7)].start))) ; } + break; + + case 56: + +/* Line 1455 of yacc.c */ +#line 380 "parse.y" + { code1(_CAT) ; } + break; + + case 57: + +/* Line 1455 of yacc.c */ +#line 384 "parse.y" + { (yyval.start) = code_offset ; code2(_PUSHD, (yyvsp[(1) - (1)].ptr)) ; } + break; + + case 58: + +/* Line 1455 of yacc.c */ +#line 386 "parse.y" + 
{ (yyval.start) = code_offset ; code2(_PUSHS, (yyvsp[(1) - (1)].ptr)) ; } + break; + + case 59: + +/* Line 1455 of yacc.c */ +#line 388 "parse.y" + { check_var((yyvsp[(1) - (1)].stp)) ; + (yyval.start) = code_offset ; + if ( is_local((yyvsp[(1) - (1)].stp)) ) + { code2op(L_PUSHI, (yyvsp[(1) - (1)].stp)->offset) ; } + else code2(_PUSHI, (yyvsp[(1) - (1)].stp)->stval.cp) ; + } + break; + + case 60: + +/* Line 1455 of yacc.c */ +#line 396 "parse.y" + { (yyval.start) = (yyvsp[(2) - (3)].start) ; } + break; + + case 61: + +/* Line 1455 of yacc.c */ +#line 400 "parse.y" + { (yyval.start) = code_offset ; code2(_MATCH0, (yyvsp[(1) - (1)].ptr)) ; } + break; + + case 62: + +/* Line 1455 of yacc.c */ +#line 403 "parse.y" + { code1(_ADD) ; } + break; + + case 63: + +/* Line 1455 of yacc.c */ +#line 404 "parse.y" + { code1(_SUB) ; } + break; + + case 64: + +/* Line 1455 of yacc.c */ +#line 405 "parse.y" + { code1(_MUL) ; } + break; + + case 65: + +/* Line 1455 of yacc.c */ +#line 406 "parse.y" + { code1(_DIV) ; } + break; + + case 66: + +/* Line 1455 of yacc.c */ +#line 407 "parse.y" + { code1(_MOD) ; } + break; + + case 67: + +/* Line 1455 of yacc.c */ +#line 408 "parse.y" + { code1(_POW) ; } + break; + + case 68: + +/* Line 1455 of yacc.c */ +#line 410 "parse.y" + { (yyval.start) = (yyvsp[(2) - (2)].start) ; code1(_NOT) ; } + break; + + case 69: + +/* Line 1455 of yacc.c */ +#line 412 "parse.y" + { (yyval.start) = (yyvsp[(2) - (2)].start) ; code1(_UPLUS) ; } + break; + + case 70: + +/* Line 1455 of yacc.c */ +#line 414 "parse.y" + { (yyval.start) = (yyvsp[(2) - (2)].start) ; code1(_UMINUS) ; } + break; + + case 72: + +/* Line 1455 of yacc.c */ +#line 419 "parse.y" + { check_var((yyvsp[(1) - (2)].stp)) ; + (yyval.start) = code_offset ; + code_address((yyvsp[(1) - (2)].stp)) ; + + if ( (yyvsp[(2) - (2)].ival) == '+' ) code1(_POST_INC) ; + else code1(_POST_DEC) ; + } + break; + + case 73: + +/* Line 1455 of yacc.c */ +#line 427 "parse.y" + { (yyval.start) = (yyvsp[(2) - 
(2)].start) ; + if ( (yyvsp[(1) - (2)].ival) == '+' ) code1(_PRE_INC) ; + else code1(_PRE_DEC) ; + } + break; + + case 74: + +/* Line 1455 of yacc.c */ +#line 434 "parse.y" + { if ((yyvsp[(2) - (2)].ival) == '+' ) code1(F_POST_INC ) ; + else code1(F_POST_DEC) ; + } + break; + + case 75: + +/* Line 1455 of yacc.c */ +#line 438 "parse.y" + { (yyval.start) = (yyvsp[(2) - (2)].start) ; + if ( (yyvsp[(1) - (2)].ival) == '+' ) code1(F_PRE_INC) ; + else code1( F_PRE_DEC) ; + } + break; + + case 76: + +/* Line 1455 of yacc.c */ +#line 445 "parse.y" + { (yyval.start) = code_offset ; + check_var((yyvsp[(1) - (1)].stp)) ; + code_address((yyvsp[(1) - (1)].stp)) ; + } + break; + + case 77: + +/* Line 1455 of yacc.c */ +#line 453 "parse.y" + { (yyval.ival) = 0 ; } + break; + + case 79: + +/* Line 1455 of yacc.c */ +#line 458 "parse.y" + { (yyval.ival) = 1 ; } + break; + + case 80: + +/* Line 1455 of yacc.c */ +#line 460 "parse.y" + { (yyval.ival) = (yyvsp[(1) - (3)].ival) + 1 ; } + break; + + case 81: + +/* Line 1455 of yacc.c */ +#line 465 "parse.y" + { BI_REC *p = (yyvsp[(1) - (5)].bip) ; + (yyval.start) = (yyvsp[(2) - (5)].start) ; + if ( (int)p->min_args > (yyvsp[(4) - (5)].ival) || (int)p->max_args < (yyvsp[(4) - (5)].ival) ) + compile_error( + "wrong number of arguments in call to %s" , + p->name ) ; + if ( p->min_args != p->max_args ) /* variable args */ + { code1(_PUSHINT) ; code1((yyvsp[(4) - (5)].ival)) ; } + code2(_BUILTIN , p->fp) ; + } + break; + + case 82: + +/* Line 1455 of yacc.c */ +#line 476 "parse.y" + { + (yyval.start) = code_offset ; + code1(_PUSHINT) ; code1(0) ; + code2(_BUILTIN, (yyvsp[(1) - (1)].bip)->fp) ; + } + break; + + case 83: + +/* Line 1455 of yacc.c */ +#line 485 "parse.y" + { (yyval.start) = code_offset ; } + break; + + case 84: + +/* Line 1455 of yacc.c */ +#line 490 "parse.y" + { code2(_PRINT, (yyvsp[(1) - (5)].fp)) ; + if ( (yyvsp[(1) - (5)].fp) == bi_printf && (yyvsp[(3) - (5)].ival) == 0 ) + compile_error("no arguments in call to printf") 
; + print_flag = 0 ; + (yyval.start) = (yyvsp[(2) - (5)].start) ; + } + break; + + case 85: + +/* Line 1455 of yacc.c */ +#line 498 "parse.y" + { (yyval.fp) = bi_print ; print_flag = 1 ;} + break; + + case 86: + +/* Line 1455 of yacc.c */ +#line 499 "parse.y" + { (yyval.fp) = bi_printf ; print_flag = 1 ; } + break; + + case 87: + +/* Line 1455 of yacc.c */ +#line 502 "parse.y" + { code2op(_PUSHINT, (yyvsp[(1) - (1)].ival)) ; } + break; + + case 88: + +/* Line 1455 of yacc.c */ +#line 504 "parse.y" + { (yyval.ival) = (yyvsp[(2) - (3)].arg2p)->cnt ; zfree((yyvsp[(2) - (3)].arg2p),sizeof(ARG2_REC)) ; + code2op(_PUSHINT, (yyval.ival)) ; + } + break; + + case 89: + +/* Line 1455 of yacc.c */ +#line 508 "parse.y" + { (yyval.ival)=0 ; code2op(_PUSHINT, 0) ; } + break; + + case 90: + +/* Line 1455 of yacc.c */ +#line 512 "parse.y" + { (yyval.arg2p) = (ARG2_REC*) zmalloc(sizeof(ARG2_REC)) ; + (yyval.arg2p)->start = (yyvsp[(1) - (3)].start) ; + (yyval.arg2p)->cnt = 2 ; + } + break; + + case 91: + +/* Line 1455 of yacc.c */ +#line 517 "parse.y" + { (yyval.arg2p) = (yyvsp[(1) - (3)].arg2p) ; (yyval.arg2p)->cnt++ ; } + break; + + case 93: + +/* Line 1455 of yacc.c */ +#line 522 "parse.y" + { code2op(_PUSHINT, (yyvsp[(1) - (2)].ival)) ; } + break; + + case 94: + +/* Line 1455 of yacc.c */ +#line 529 "parse.y" + { (yyval.start) = (yyvsp[(3) - (4)].start) ; eat_nl() ; code_jmp(_JZ, (INST*)0) ; } + break; + + case 95: + +/* Line 1455 of yacc.c */ +#line 534 "parse.y" + { patch_jmp( code_ptr ) ; } + break; + + case 96: + +/* Line 1455 of yacc.c */ +#line 537 "parse.y" + { eat_nl() ; code_jmp(_JMP, (INST*)0) ; } + break; + + case 97: + +/* Line 1455 of yacc.c */ +#line 542 "parse.y" + { patch_jmp(code_ptr) ; + patch_jmp(CDP((yyvsp[(4) - (4)].start))) ; + } + break; + + case 98: + +/* Line 1455 of yacc.c */ +#line 551 "parse.y" + { eat_nl() ; BC_new() ; } + break; + + case 99: + +/* Line 1455 of yacc.c */ +#line 556 "parse.y" + { (yyval.start) = (yyvsp[(2) - (7)].start) ; + 
code_jmp(_JNZ, CDP((yyvsp[(2) - (7)].start))) ; + BC_clear(code_ptr, CDP((yyvsp[(5) - (7)].start))) ; } + break; + + case 100: + +/* Line 1455 of yacc.c */ +#line 562 "parse.y" + { eat_nl() ; BC_new() ; + (yyval.start) = (yyvsp[(3) - (4)].start) ; + + /* check if const expression */ + if ( code_ptr - 2 == CDP((yyvsp[(3) - (4)].start)) && + code_ptr[-2].op == _PUSHD && + *(double*)code_ptr[-1].ptr != 0.0 + ) + code_ptr -= 2 ; + else + { INST *p3 = CDP((yyvsp[(3) - (4)].start)) ; + code_push(p3, code_ptr-p3, scope, active_funct) ; + code_ptr = p3 ; + code2(_JMP, (INST*)0) ; /* code2() not code_jmp() */ + } + } + break; + + case 101: + +/* Line 1455 of yacc.c */ +#line 582 "parse.y" + { + int saved_offset ; + int len ; + INST *p1 = CDP((yyvsp[(1) - (2)].start)) ; + INST *p2 = CDP((yyvsp[(2) - (2)].start)) ; + + if ( p1 != p2 ) /* real test in loop */ + { + p1[1].op = code_ptr-(p1+1) ; + saved_offset = code_offset ; + len = code_pop(code_ptr) ; + code_ptr += len ; + code_jmp(_JNZ, CDP((yyvsp[(2) - (2)].start))) ; + BC_clear(code_ptr, CDP(saved_offset)) ; + } + else /* while(1) */ + { + code_jmp(_JMP, p1) ; + BC_clear(code_ptr, CDP((yyvsp[(2) - (2)].start))) ; + } + } + break; + + case 102: + +/* Line 1455 of yacc.c */ +#line 608 "parse.y" + { + int cont_offset = code_offset ; + unsigned len = code_pop(code_ptr) ; + INST *p2 = CDP((yyvsp[(2) - (4)].start)) ; + INST *p4 = CDP((yyvsp[(4) - (4)].start)) ; + + code_ptr += len ; + + if ( p2 != p4 ) /* real test in for2 */ + { + p4[-1].op = code_ptr - p4 + 1 ; + len = code_pop(code_ptr) ; + code_ptr += len ; + code_jmp(_JNZ, CDP((yyvsp[(4) - (4)].start))) ; + } + else /* for(;;) */ + code_jmp(_JMP, p4) ; + + BC_clear(code_ptr, CDP(cont_offset)) ; + + } + break; + + case 103: + +/* Line 1455 of yacc.c */ +#line 631 "parse.y" + { (yyval.start) = code_offset ; } + break; + + case 104: + +/* Line 1455 of yacc.c */ +#line 633 "parse.y" + { (yyval.start) = (yyvsp[(3) - (4)].start) ; code1(_POP) ; } + break; + + case 105: + +/* Line 
1455 of yacc.c */ +#line 636 "parse.y" + { (yyval.start) = code_offset ; } + break; + + case 106: + +/* Line 1455 of yacc.c */ +#line 638 "parse.y" + { + if ( code_ptr - 2 == CDP((yyvsp[(1) - (2)].start)) && + code_ptr[-2].op == _PUSHD && + * (double*) code_ptr[-1].ptr != 0.0 + ) + code_ptr -= 2 ; + else + { + INST *p1 = CDP((yyvsp[(1) - (2)].start)) ; + code_push(p1, code_ptr-p1, scope, active_funct) ; + code_ptr = p1 ; + code2(_JMP, (INST*)0) ; + } + } + break; + + case 107: + +/* Line 1455 of yacc.c */ +#line 655 "parse.y" + { eat_nl() ; BC_new() ; + code_push((INST*)0,0, scope, active_funct) ; + } + break; + + case 108: + +/* Line 1455 of yacc.c */ +#line 659 "parse.y" + { INST *p1 = CDP((yyvsp[(1) - (2)].start)) ; + + eat_nl() ; BC_new() ; + code1(_POP) ; + code_push(p1, code_ptr - p1, scope, active_funct) ; + code_ptr -= code_ptr - p1 ; + } + break; + + case 109: + +/* Line 1455 of yacc.c */ +#line 672 "parse.y" + { check_array((yyvsp[(3) - (3)].stp)) ; + code_array((yyvsp[(3) - (3)].stp)) ; + code1(A_TEST) ; + } + break; + + case 110: + +/* Line 1455 of yacc.c */ +#line 677 "parse.y" + { (yyval.start) = (yyvsp[(2) - (5)].arg2p)->start ; + code2op(A_CAT, (yyvsp[(2) - (5)].arg2p)->cnt) ; + zfree((yyvsp[(2) - (5)].arg2p), sizeof(ARG2_REC)) ; + + check_array((yyvsp[(5) - (5)].stp)) ; + code_array((yyvsp[(5) - (5)].stp)) ; + code1(A_TEST) ; + } + break; + + case 111: + +/* Line 1455 of yacc.c */ +#line 688 "parse.y" + { + if ( (yyvsp[(4) - (5)].ival) > 1 ) + { code2op(A_CAT, (yyvsp[(4) - (5)].ival)) ; } + + check_array((yyvsp[(1) - (5)].stp)) ; + if( is_local((yyvsp[(1) - (5)].stp)) ) + { code2op(LAE_PUSHA, (yyvsp[(1) - (5)].stp)->offset) ; } + else code2(AE_PUSHA, (yyvsp[(1) - (5)].stp)->stval.array) ; + (yyval.start) = (yyvsp[(2) - (5)].start) ; + } + break; + + case 112: + +/* Line 1455 of yacc.c */ +#line 701 "parse.y" + { + if ( (yyvsp[(4) - (5)].ival) > 1 ) + { code2op(A_CAT, (yyvsp[(4) - (5)].ival)) ; } + + check_array((yyvsp[(1) - (5)].stp)) ; + if( 
is_local((yyvsp[(1) - (5)].stp)) ) + { code2op(LAE_PUSHI, (yyvsp[(1) - (5)].stp)->offset) ; } + else code2(AE_PUSHI, (yyvsp[(1) - (5)].stp)->stval.array) ; + (yyval.start) = (yyvsp[(2) - (5)].start) ; + } + break; + + case 113: + +/* Line 1455 of yacc.c */ +#line 713 "parse.y" + { + if ( (yyvsp[(4) - (6)].ival) > 1 ) + { code2op(A_CAT,(yyvsp[(4) - (6)].ival)) ; } + + check_array((yyvsp[(1) - (6)].stp)) ; + if( is_local((yyvsp[(1) - (6)].stp)) ) + { code2op(LAE_PUSHA, (yyvsp[(1) - (6)].stp)->offset) ; } + else code2(AE_PUSHA, (yyvsp[(1) - (6)].stp)->stval.array) ; + if ( (yyvsp[(6) - (6)].ival) == '+' ) code1(_POST_INC) ; + else code1(_POST_DEC) ; + + (yyval.start) = (yyvsp[(2) - (6)].start) ; + } + break; + + case 114: + +/* Line 1455 of yacc.c */ +#line 730 "parse.y" + { + (yyval.start) = (yyvsp[(3) - (7)].start) ; + if ( (yyvsp[(5) - (7)].ival) > 1 ) { code2op(A_CAT, (yyvsp[(5) - (7)].ival)) ; } + check_array((yyvsp[(2) - (7)].stp)) ; + code_array((yyvsp[(2) - (7)].stp)) ; + code1(A_DEL) ; + } + break; + + case 115: + +/* Line 1455 of yacc.c */ +#line 738 "parse.y" + { + (yyval.start) = code_offset ; + check_array((yyvsp[(2) - (3)].stp)) ; + code_array((yyvsp[(2) - (3)].stp)) ; + code1(DEL_A) ; + } + break; + + case 116: + +/* Line 1455 of yacc.c */ +#line 749 "parse.y" + { eat_nl() ; BC_new() ; + (yyval.start) = code_offset ; + + check_var((yyvsp[(3) - (6)].stp)) ; + code_address((yyvsp[(3) - (6)].stp)) ; + check_array((yyvsp[(5) - (6)].stp)) ; + code_array((yyvsp[(5) - (6)].stp)) ; + + code2(SET_ALOOP, (INST*)0) ; + } + break; + + case 117: + +/* Line 1455 of yacc.c */ +#line 763 "parse.y" + { + INST *p2 = CDP((yyvsp[(2) - (2)].start)) ; + + p2[-1].op = code_ptr - p2 + 1 ; + BC_clear( code_ptr+2 , code_ptr) ; + code_jmp(ALOOP, p2) ; + code1(POP_AL) ; + } + break; + + case 118: + +/* Line 1455 of yacc.c */ +#line 780 "parse.y" + { (yyval.start) = code_offset ; code2(F_PUSHA, (yyvsp[(1) - (1)].cp)) ; } + break; + + case 119: + +/* Line 1455 of yacc.c */ +#line 
782 "parse.y" + { check_var((yyvsp[(2) - (2)].stp)) ; + (yyval.start) = code_offset ; + if ( is_local((yyvsp[(2) - (2)].stp)) ) + { code2op(L_PUSHI, (yyvsp[(2) - (2)].stp)->offset) ; } + else code2(_PUSHI, (yyvsp[(2) - (2)].stp)->stval.cp) ; + + CODE_FE_PUSHA() ; + } + break; + + case 120: + +/* Line 1455 of yacc.c */ +#line 791 "parse.y" + { + if ( (yyvsp[(5) - (6)].ival) > 1 ) + { code2op(A_CAT, (yyvsp[(5) - (6)].ival)) ; } + + check_array((yyvsp[(2) - (6)].stp)) ; + if( is_local((yyvsp[(2) - (6)].stp)) ) + { code2op(LAE_PUSHI, (yyvsp[(2) - (6)].stp)->offset) ; } + else code2(AE_PUSHI, (yyvsp[(2) - (6)].stp)->stval.array) ; + + CODE_FE_PUSHA() ; + + (yyval.start) = (yyvsp[(3) - (6)].start) ; + } + break; + + case 121: + +/* Line 1455 of yacc.c */ +#line 805 "parse.y" + { (yyval.start) = (yyvsp[(2) - (2)].start) ; CODE_FE_PUSHA() ; } + break; + + case 122: + +/* Line 1455 of yacc.c */ +#line 807 "parse.y" + { (yyval.start) = (yyvsp[(2) - (3)].start) ; } + break; + + case 123: + +/* Line 1455 of yacc.c */ +#line 811 "parse.y" + { field_A2I() ; } + break; + + case 124: + +/* Line 1455 of yacc.c */ +#line 814 "parse.y" + { code1(F_ASSIGN) ; } + break; + + case 125: + +/* Line 1455 of yacc.c */ +#line 815 "parse.y" + { code1(F_ADD_ASG) ; } + break; + + case 126: + +/* Line 1455 of yacc.c */ +#line 816 "parse.y" + { code1(F_SUB_ASG) ; } + break; + + case 127: + +/* Line 1455 of yacc.c */ +#line 817 "parse.y" + { code1(F_MUL_ASG) ; } + break; + + case 128: + +/* Line 1455 of yacc.c */ +#line 818 "parse.y" + { code1(F_DIV_ASG) ; } + break; + + case 129: + +/* Line 1455 of yacc.c */ +#line 819 "parse.y" + { code1(F_MOD_ASG) ; } + break; + + case 130: + +/* Line 1455 of yacc.c */ +#line 820 "parse.y" + { code1(F_POW_ASG) ; } + break; + + case 131: + +/* Line 1455 of yacc.c */ +#line 827 "parse.y" + { code2(_BUILTIN, bi_split) ; } + break; + + case 132: + +/* Line 1455 of yacc.c */ +#line 831 "parse.y" + { (yyval.start) = (yyvsp[(3) - (5)].start) ; + check_array((yyvsp[(5) 
- (5)].stp)) ; + code_array((yyvsp[(5) - (5)].stp)) ; + } + break; + + case 133: + +/* Line 1455 of yacc.c */ +#line 838 "parse.y" + { code2(_PUSHI, &fs_shadow) ; } + break; + + case 134: + +/* Line 1455 of yacc.c */ +#line 840 "parse.y" + { + if ( CDP((yyvsp[(2) - (3)].start)) == code_ptr - 2 ) + { + if ( code_ptr[-2].op == _MATCH0 ) + RE_as_arg() ; + else + if ( code_ptr[-2].op == _PUSHS ) + { CELL *cp = ZMALLOC(CELL) ; + + cp->type = C_STRING ; + cp->ptr = code_ptr[-1].ptr ; + cast_for_split(cp) ; + code_ptr[-2].op = _PUSHC ; + code_ptr[-1].ptr = (PTR) cp ; + } + } + } + break; + + case 135: + +/* Line 1455 of yacc.c */ +#line 864 "parse.y" + { (yyval.start) = (yyvsp[(3) - (6)].start) ; + code2(_BUILTIN, bi_match) ; + } + break; + + case 136: + +/* Line 1455 of yacc.c */ +#line 871 "parse.y" + { + INST *p1 = CDP((yyvsp[(1) - (1)].start)) ; + + if ( p1 == code_ptr - 2 ) + { + if ( p1->op == _MATCH0 ) RE_as_arg() ; + else + if ( p1->op == _PUSHS ) + { CELL *cp = ZMALLOC(CELL) ; + + cp->type = C_STRING ; + cp->ptr = p1[1].ptr ; + cast_to_RE(cp) ; + p1->op = _PUSHC ; + p1[1].ptr = (PTR) cp ; + } + } + } + break; + + case 137: + +/* Line 1455 of yacc.c */ +#line 894 "parse.y" + { (yyval.start) = code_offset ; + code1(_EXIT0) ; } + break; + + case 138: + +/* Line 1455 of yacc.c */ +#line 897 "parse.y" + { (yyval.start) = (yyvsp[(2) - (3)].start) ; code1(_EXIT) ; } + break; + + case 139: + +/* Line 1455 of yacc.c */ +#line 901 "parse.y" + { (yyval.start) = code_offset ; + code1(_RET0) ; } + break; + + case 140: + +/* Line 1455 of yacc.c */ +#line 904 "parse.y" + { (yyval.start) = (yyvsp[(2) - (3)].start) ; code1(_RET) ; } + break; + + case 141: + +/* Line 1455 of yacc.c */ +#line 910 "parse.y" + { (yyval.start) = code_offset ; + code2(F_PUSHA, &field[0]) ; + code1(_PUSHINT) ; code1(0) ; + code2(_BUILTIN, bi_getline) ; + getline_flag = 0 ; + } + break; + + case 142: + +/* Line 1455 of yacc.c */ +#line 917 "parse.y" + { (yyval.start) = (yyvsp[(2) - (2)].start) ; + 
code1(_PUSHINT) ; code1(0) ; + code2(_BUILTIN, bi_getline) ; + getline_flag = 0 ; + } + break; + + case 143: + +/* Line 1455 of yacc.c */ +#line 923 "parse.y" + { code1(_PUSHINT) ; code1(F_IN) ; + code2(_BUILTIN, bi_getline) ; + /* getline_flag already off in yylex() */ + } + break; + + case 144: + +/* Line 1455 of yacc.c */ +#line 928 "parse.y" + { code2(F_PUSHA, &field[0]) ; + code1(_PUSHINT) ; code1(PIPE_IN) ; + code2(_BUILTIN, bi_getline) ; + } + break; + + case 145: + +/* Line 1455 of yacc.c */ +#line 933 "parse.y" + { + code1(_PUSHINT) ; code1(PIPE_IN) ; + code2(_BUILTIN, bi_getline) ; + } + break; + + case 146: + +/* Line 1455 of yacc.c */ +#line 939 "parse.y" + { getline_flag = 1 ; } + break; + + case 149: + +/* Line 1455 of yacc.c */ +#line 944 "parse.y" + { (yyval.start) = code_offset ; + code2(F_PUSHA, field+0) ; + } + break; + + case 150: + +/* Line 1455 of yacc.c */ +#line 948 "parse.y" + { (yyval.start) = (yyvsp[(2) - (3)].start) ; } + break; + + case 151: + +/* Line 1455 of yacc.c */ +#line 956 "parse.y" + { + INST *p5 = CDP((yyvsp[(5) - (6)].start)) ; + INST *p6 = CDP((yyvsp[(6) - (6)].start)) ; + + if ( p6 - p5 == 2 && p5->op == _PUSHS ) + { /* cast from STRING to REPL at compile time */ + CELL *cp = ZMALLOC(CELL) ; + cp->type = C_STRING ; + cp->ptr = p5[1].ptr ; + cast_to_REPL(cp) ; + p5->op = _PUSHC ; + p5[1].ptr = (PTR) cp ; + } + code2(_BUILTIN, (yyvsp[(1) - (6)].fp)) ; + (yyval.start) = (yyvsp[(3) - (6)].start) ; + } + break; + + case 152: + +/* Line 1455 of yacc.c */ +#line 974 "parse.y" + { (yyval.fp) = bi_sub ; } + break; + + case 153: + +/* Line 1455 of yacc.c */ +#line 975 "parse.y" + { (yyval.fp) = bi_gsub ; } + break; + + case 154: + +/* Line 1455 of yacc.c */ +#line 980 "parse.y" + { (yyval.start) = code_offset ; + code2(F_PUSHA, &field[0]) ; + } + break; + + case 155: + +/* Line 1455 of yacc.c */ +#line 985 "parse.y" + { (yyval.start) = (yyvsp[(2) - (3)].start) ; } + break; + + case 156: + +/* Line 1455 of yacc.c */ +#line 993 
"parse.y" + { + resize_fblock((yyvsp[(1) - (2)].fbp)) ; + restore_ids() ; + switch_code_to_main() ; + } + break; + + case 157: + +/* Line 1455 of yacc.c */ +#line 1002 "parse.y" + { eat_nl() ; + scope = SCOPE_FUNCT ; + active_funct = (yyvsp[(1) - (4)].fbp) ; + *main_code_p = active_code ; + + (yyvsp[(1) - (4)].fbp)->nargs = (yyvsp[(3) - (4)].ival) ; + if ( (yyvsp[(3) - (4)].ival) ) + (yyvsp[(1) - (4)].fbp)->typev = (char *) + memset( zmalloc((yyvsp[(3) - (4)].ival)), ST_LOCAL_NONE, (yyvsp[(3) - (4)].ival)) ; + else (yyvsp[(1) - (4)].fbp)->typev = (char *) 0 ; + + code_ptr = code_base = + (INST *) zmalloc(INST_BYTES(PAGESZ)); + code_limit = code_base + PAGESZ ; + code_warn = code_limit - CODEWARN ; + } + break; + + case 158: + +/* Line 1455 of yacc.c */ +#line 1021 "parse.y" + { FBLOCK *fbp ; + + if ( (yyvsp[(2) - (2)].stp)->type == ST_NONE ) + { + (yyvsp[(2) - (2)].stp)->type = ST_FUNCT ; + fbp = (yyvsp[(2) - (2)].stp)->stval.fbp = + (FBLOCK *) zmalloc(sizeof(FBLOCK)) ; + fbp->name = (yyvsp[(2) - (2)].stp)->name ; + fbp->code = (INST*) 0 ; + } + else + { + type_error( (yyvsp[(2) - (2)].stp) ) ; + + /* this FBLOCK will not be put in + the symbol table */ + fbp = (FBLOCK*) zmalloc(sizeof(FBLOCK)) ; + fbp->name = "" ; + } + (yyval.fbp) = fbp ; + } + break; + + case 159: + +/* Line 1455 of yacc.c */ +#line 1044 "parse.y" + { (yyval.fbp) = (yyvsp[(2) - (2)].fbp) ; + if ( (yyvsp[(2) - (2)].fbp)->code ) + compile_error("redefinition of %s" , (yyvsp[(2) - (2)].fbp)->name) ; + } + break; + + case 160: + +/* Line 1455 of yacc.c */ +#line 1050 "parse.y" + { (yyval.ival) = 0 ; } + break; + + case 162: + +/* Line 1455 of yacc.c */ +#line 1055 "parse.y" + { (yyvsp[(1) - (1)].stp) = save_id((yyvsp[(1) - (1)].stp)->name) ; + (yyvsp[(1) - (1)].stp)->type = ST_LOCAL_NONE ; + (yyvsp[(1) - (1)].stp)->offset = 0 ; + (yyval.ival) = 1 ; + } + break; + + case 163: + +/* Line 1455 of yacc.c */ +#line 1061 "parse.y" + { if ( is_local((yyvsp[(3) - (3)].stp)) ) + compile_error("%s is 
duplicated in argument list", + (yyvsp[(3) - (3)].stp)->name) ; + else + { (yyvsp[(3) - (3)].stp) = save_id((yyvsp[(3) - (3)].stp)->name) ; + (yyvsp[(3) - (3)].stp)->type = ST_LOCAL_NONE ; + (yyvsp[(3) - (3)].stp)->offset = (yyvsp[(1) - (3)].ival) ; + (yyval.ival) = (yyvsp[(1) - (3)].ival) + 1 ; + } + } + break; + + case 164: + +/* Line 1455 of yacc.c */ +#line 1074 "parse.y" + { /* we may have to recover from a bungled function + definition */ + /* can have local ids, before code scope + changes */ + restore_ids() ; + + switch_code_to_main() ; + } + break; + + case 165: + +/* Line 1455 of yacc.c */ +#line 1087 "parse.y" + { (yyval.start) = (yyvsp[(2) - (3)].start) ; + code2(_CALL, (yyvsp[(1) - (3)].fbp)) ; + + if ( (yyvsp[(3) - (3)].ca_p) ) code1((yyvsp[(3) - (3)].ca_p)->arg_num+1) ; + else code1(0) ; + + check_fcall((yyvsp[(1) - (3)].fbp), scope, code_move_level, active_funct, + (yyvsp[(3) - (3)].ca_p), token_lineno) ; + } + break; + + case 166: + +/* Line 1455 of yacc.c */ +#line 1099 "parse.y" + { (yyval.ca_p) = (CA_REC *) 0 ; } + break; + + case 167: + +/* Line 1455 of yacc.c */ +#line 1101 "parse.y" + { (yyval.ca_p) = (yyvsp[(2) - (2)].ca_p) ; + (yyval.ca_p)->link = (yyvsp[(1) - (2)].ca_p) ; + (yyval.ca_p)->arg_num = (yyvsp[(1) - (2)].ca_p) ? (yyvsp[(1) - (2)].ca_p)->arg_num+1 : 0 ; + } + break; + + case 168: + +/* Line 1455 of yacc.c */ +#line 1116 "parse.y" + { (yyval.ca_p) = (CA_REC *) 0 ; } + break; + + case 169: + +/* Line 1455 of yacc.c */ +#line 1118 "parse.y" + { (yyval.ca_p) = ZMALLOC(CA_REC) ; + (yyval.ca_p)->link = (yyvsp[(1) - (3)].ca_p) ; + (yyval.ca_p)->type = CA_EXPR ; + (yyval.ca_p)->arg_num = (yyvsp[(1) - (3)].ca_p) ? (yyvsp[(1) - (3)].ca_p)->arg_num+1 : 0 ; + (yyval.ca_p)->call_offset = code_offset ; + } + break; + + case 170: + +/* Line 1455 of yacc.c */ +#line 1125 "parse.y" + { (yyval.ca_p) = ZMALLOC(CA_REC) ; + (yyval.ca_p)->link = (yyvsp[(1) - (3)].ca_p) ; + (yyval.ca_p)->arg_num = (yyvsp[(1) - (3)].ca_p) ? 
(yyvsp[(1) - (3)].ca_p)->arg_num+1 : 0 ; + + code_call_id((yyval.ca_p), (yyvsp[(2) - (3)].stp)) ; + } + break; + + case 171: + +/* Line 1455 of yacc.c */ +#line 1134 "parse.y" + { (yyval.ca_p) = ZMALLOC(CA_REC) ; + (yyval.ca_p)->type = CA_EXPR ; + (yyval.ca_p)->call_offset = code_offset ; + } + break; + + case 172: + +/* Line 1455 of yacc.c */ +#line 1140 "parse.y" + { (yyval.ca_p) = ZMALLOC(CA_REC) ; + code_call_id((yyval.ca_p), (yyvsp[(1) - (2)].stp)) ; + } + break; + + + +/* Line 1455 of yacc.c */ +#line 3517 "y.tab.c" + default: break; + } + YY_SYMBOL_PRINT ("-> $$ =", yyr1[yyn], &yyval, &yyloc); + + YYPOPSTACK (yylen); + yylen = 0; + YY_STACK_PRINT (yyss, yyssp); + + *++yyvsp = yyval; + + /* Now `shift' the result of the reduction. Determine what state + that goes to, based on the state we popped back to and the rule + number reduced by. */ + + yyn = yyr1[yyn]; + + yystate = yypgoto[yyn - YYNTOKENS] + *yyssp; + if (0 <= yystate && yystate <= YYLAST && yycheck[yystate] == *yyssp) + yystate = yytable[yystate]; + else + yystate = yydefgoto[yyn - YYNTOKENS]; + + goto yynewstate; + + +/*------------------------------------. +| yyerrlab -- here on detecting error | +`------------------------------------*/ +yyerrlab: + /* If not already recovering from an error, report this error. */ + if (!yyerrstatus) + { + ++yynerrs; +#if ! YYERROR_VERBOSE + yyerror (YY_("syntax error")); +#else + { + YYSIZE_T yysize = yysyntax_error (0, yystate, yychar); + if (yymsg_alloc < yysize && yymsg_alloc < YYSTACK_ALLOC_MAXIMUM) + { + YYSIZE_T yyalloc = 2 * yysize; + if (! 
(yysize <= yyalloc && yyalloc <= YYSTACK_ALLOC_MAXIMUM)) + yyalloc = YYSTACK_ALLOC_MAXIMUM; + if (yymsg != yymsgbuf) + YYSTACK_FREE (yymsg); + yymsg = (char *) YYSTACK_ALLOC (yyalloc); + if (yymsg) + yymsg_alloc = yyalloc; + else + { + yymsg = yymsgbuf; + yymsg_alloc = sizeof yymsgbuf; + } + } + + if (0 < yysize && yysize <= yymsg_alloc) + { + (void) yysyntax_error (yymsg, yystate, yychar); + yyerror (yymsg); + } + else + { + yyerror (YY_("syntax error")); + if (yysize != 0) + goto yyexhaustedlab; + } + } +#endif + } + + + + if (yyerrstatus == 3) + { + /* If just tried and failed to reuse lookahead token after an + error, discard it. */ + + if (yychar <= YYEOF) + { + /* Return failure if at end of input. */ + if (yychar == YYEOF) + YYABORT; + } + else + { + yydestruct ("Error: discarding", + yytoken, &yylval); + yychar = YYEMPTY; + } + } + + /* Else will try to reuse lookahead token after shifting the error + token. */ + goto yyerrlab1; + + +/*---------------------------------------------------. +| yyerrorlab -- error raised explicitly by YYERROR. | +`---------------------------------------------------*/ +yyerrorlab: + + /* Pacify compilers like GCC when the user code never invokes + YYERROR and the label yyerrorlab therefore never appears in user + code. */ + if (/*CONSTCOND*/ 0) + goto yyerrorlab; + + /* Do not reclaim the symbols of the rule which action triggered + this YYERROR. */ + YYPOPSTACK (yylen); + yylen = 0; + YY_STACK_PRINT (yyss, yyssp); + yystate = *yyssp; + goto yyerrlab1; + + +/*-------------------------------------------------------------. +| yyerrlab1 -- common code for both syntax error and YYERROR. | +`-------------------------------------------------------------*/ +yyerrlab1: + yyerrstatus = 3; /* Each real token shifted decrements this. 
*/ + + for (;;) + { + yyn = yypact[yystate]; + if (yyn != YYPACT_NINF) + { + yyn += YYTERROR; + if (0 <= yyn && yyn <= YYLAST && yycheck[yyn] == YYTERROR) + { + yyn = yytable[yyn]; + if (0 < yyn) + break; + } + } + + /* Pop the current state because it cannot handle the error token. */ + if (yyssp == yyss) + YYABORT; + + + yydestruct ("Error: popping", + yystos[yystate], yyvsp); + YYPOPSTACK (1); + yystate = *yyssp; + YY_STACK_PRINT (yyss, yyssp); + } + + *++yyvsp = yylval; + + + /* Shift the error token. */ + YY_SYMBOL_PRINT ("Shifting", yystos[yyn], yyvsp, yylsp); + + yystate = yyn; + goto yynewstate; + + +/*-------------------------------------. +| yyacceptlab -- YYACCEPT comes here. | +`-------------------------------------*/ +yyacceptlab: + yyresult = 0; + goto yyreturn; + +/*-----------------------------------. +| yyabortlab -- YYABORT comes here. | +`-----------------------------------*/ +yyabortlab: + yyresult = 1; + goto yyreturn; + +#if !defined(yyoverflow) || YYERROR_VERBOSE +/*-------------------------------------------------. +| yyexhaustedlab -- memory exhaustion comes here. | +`-------------------------------------------------*/ +yyexhaustedlab: + yyerror (YY_("memory exhausted")); + yyresult = 2; + /* Fall through. */ +#endif + +yyreturn: + if (yychar != YYEMPTY) + yydestruct ("Cleanup: discarding lookahead", + yytoken, &yylval); + /* Do not reclaim the symbols of the rule which action triggered + this YYABORT or YYACCEPT. */ + YYPOPSTACK (yylen); + YY_STACK_PRINT (yyss, yyssp); + while (yyssp != yyss) + { + yydestruct ("Cleanup: popping", + yystos[*yyssp], yyvsp); + YYPOPSTACK (1); + } +#ifndef yyoverflow + if (yyss != yyssa) + YYSTACK_FREE (yyss); +#endif +#if YYERROR_VERBOSE + if (yymsg != yymsgbuf) + YYSTACK_FREE (yymsg); +#endif + /* Make sure YYID is used. 
*/ + return YYID (yyresult); +} + + + +/* Line 1675 of yacc.c */ +#line 1148 "parse.y" + + +/* resize the code for a user function */ + +static void resize_fblock( fbp ) + FBLOCK *fbp ; +{ + CODEBLOCK *p = ZMALLOC(CODEBLOCK) ; + unsigned dummy ; + + code2op(_RET0, _HALT) ; + /* make sure there is always a return */ + + *p = active_code ; + fbp->code = code_shrink(p, &dummy) ; + /* code_shrink() zfrees p */ + + if ( dump_code_flag ) add_to_fdump_list(fbp) ; +} + + +/* convert FE_PUSHA to FE_PUSHI + or F_PUSH to F_PUSHI +*/ + +static void field_A2I() +{ CELL *cp ; + + if ( code_ptr[-1].op == FE_PUSHA && + code_ptr[-1].ptr == (PTR) 0) + /* On most architectures, the two tests are the same; a good + compiler might eliminate one. On LM_DOS, and possibly other + segmented architectures, they are not */ + { code_ptr[-1].op = FE_PUSHI ; } + else + { + cp = (CELL *) code_ptr[-1].ptr ; + + if ( cp == field || + +#ifdef MSDOS + SAMESEG(cp,field) && +#endif + cp > NF && cp <= LAST_PFIELD ) + { + code_ptr[-2].op = _PUSHI ; + } + else if ( cp == NF ) + { code_ptr[-2].op = NF_PUSHI ; code_ptr-- ; } + + else + { + code_ptr[-2].op = F_PUSHI ; + code_ptr -> op = field_addr_to_index( code_ptr[-1].ptr ) ; + code_ptr++ ; + } + } +} + +/* we've seen an ID in a context where it should be a VAR, + check that's consistent with previous usage */ + +static void check_var( p ) + register SYMTAB *p ; +{ + switch(p->type) + { + case ST_NONE : /* new id */ + p->type = ST_VAR ; + p->stval.cp = ZMALLOC(CELL) ; + p->stval.cp->type = C_NOINIT ; + break ; + + case ST_LOCAL_NONE : + p->type = ST_LOCAL_VAR ; + active_funct->typev[p->offset] = ST_LOCAL_VAR ; + break ; + + case ST_VAR : + case ST_LOCAL_VAR : break ; + + default : + type_error(p) ; + break ; + } +} + +/* we've seen an ID in a context where it should be an ARRAY, + check that's consistent with previous usage */ +static void check_array(p) + register SYMTAB *p ; +{ + switch(p->type) + { + case ST_NONE : /* a new array */ + p->type = 
ST_ARRAY ; + p->stval.array = new_ARRAY() ; + break ; + + case ST_ARRAY : + case ST_LOCAL_ARRAY : + break ; + + case ST_LOCAL_NONE : + p->type = ST_LOCAL_ARRAY ; + active_funct->typev[p->offset] = ST_LOCAL_ARRAY ; + break ; + + default : type_error(p) ; break ; + } +} + +static void code_array(p) + register SYMTAB *p ; +{ + if ( is_local(p) ) code2op(LA_PUSHA, p->offset) ; + else code2(A_PUSHA, p->stval.array) ; +} + + +/* we've seen an ID as an argument to a user defined function */ + +static void code_call_id( p, ip ) + register CA_REC *p ; + register SYMTAB *ip ; +{ static CELL dummy ; + + p->call_offset = code_offset ; + /* This always get set now. So that fcall:relocate_arglist + works. */ + + switch( ip->type ) + { + case ST_VAR : + p->type = CA_EXPR ; + code2(_PUSHI, ip->stval.cp) ; + break ; + + case ST_LOCAL_VAR : + p->type = CA_EXPR ; + code2op(L_PUSHI, ip->offset) ; + break ; + + case ST_ARRAY : + p->type = CA_ARRAY ; + code2(A_PUSHA, ip->stval.array) ; + break ; + + case ST_LOCAL_ARRAY : + p->type = CA_ARRAY ; + code2op(LA_PUSHA, ip->offset) ; + break ; + + /* not enough info to code it now; it will have to + be patched later */ + + case ST_NONE : + p->type = ST_NONE ; + p->sym_p = ip ; + code2(_PUSHI, &dummy) ; + break ; + + case ST_LOCAL_NONE : + p->type = ST_LOCAL_NONE ; + p->type_p = & active_funct->typev[ip->offset] ; + code2op(L_PUSHI, ip->offset) ; + break ; + + +#ifdef DEBUG + default : + bozo("code_call_id") ; +#endif + + } +} + +/* an RE by itself was coded as _MATCH0 , change to + push as an expression */ + +static void RE_as_arg() +{ CELL *cp = ZMALLOC(CELL) ; + + code_ptr -= 2 ; + cp->type = C_RE ; + cp->ptr = code_ptr[1].ptr ; + code2(_PUSHC, cp) ; +} + +/* reset the active_code back to the MAIN block */ +static void +switch_code_to_main() +{ + switch(scope) + { + case SCOPE_BEGIN : + *begin_code_p = active_code ; + active_code = *main_code_p ; + break ; + + case SCOPE_END : + *end_code_p = active_code ; + active_code = *main_code_p ; + 
break ; + + case SCOPE_FUNCT : + active_code = *main_code_p ; + break ; + + case SCOPE_MAIN : + break ; + } + active_funct = (FBLOCK*) 0 ; + scope = SCOPE_MAIN ; +} + + +void +parse() +{ + if ( yyparse() || compile_error_count != 0 ) mawk_exit(2) ; + + scan_cleanup() ; + set_code() ; + /* code must be set before call to resolve_fcalls() */ + if ( resolve_list ) resolve_fcalls() ; + + if ( compile_error_count != 0 ) mawk_exit(2) ; + if ( dump_code_flag ) { dump_code() ; mawk_exit(0) ; } +} + + diff --git a/parse.h b/parse.h new file mode 100644 index 0000000..698b555 --- /dev/null +++ b/parse.h @@ -0,0 +1,227 @@ + +/* A Bison parser, made by GNU Bison 2.4.1. */ + +/* Skeleton interface for Bison's Yacc-like parsers in C + + Copyright (C) 1984, 1989, 1990, 2000, 2001, 2002, 2003, 2004, 2005, 2006 + Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +/* As a special exception, you may create a larger work that contains + part or all of the Bison parser skeleton and distribute that work + under terms of your choice, so long as that work isn't itself a + parser generator using the skeleton or a modified version thereof + as a parser skeleton.
Alternatively, if you modify or redistribute + the parser skeleton itself, you may (at your option) remove this + special exception, which will cause the skeleton and the resulting + Bison output files to be licensed under the GNU General Public + License without this special exception. + + This special exception was added by the Free Software Foundation in + version 2.2 of Bison. */ + + +/* Tokens. */ +#ifndef YYTOKENTYPE +# define YYTOKENTYPE + /* Put the tokens into the symbol table, so that GDB and other debuggers + know about them. */ + enum yytokentype { + UNEXPECTED = 258, + BAD_DECIMAL = 259, + NL = 260, + SEMI_COLON = 261, + LBRACE = 262, + RBRACE = 263, + LBOX = 264, + RBOX = 265, + COMMA = 266, + IO_OUT = 267, + POW_ASG = 268, + MOD_ASG = 269, + DIV_ASG = 270, + MUL_ASG = 271, + SUB_ASG = 272, + ADD_ASG = 273, + ASSIGN = 274, + COLON = 275, + QMARK = 276, + OR = 277, + AND = 278, + IN = 279, + MATCH = 280, + GTE = 281, + GT = 282, + LTE = 283, + LT = 284, + NEQ = 285, + EQ = 286, + CAT = 287, + GETLINE = 288, + MINUS = 289, + PLUS = 290, + MOD = 291, + DIV = 292, + MUL = 293, + UMINUS = 294, + NOT = 295, + PIPE = 296, + IO_IN = 297, + POW = 298, + INC_or_DEC = 299, + FIELD = 300, + DOLLAR = 301, + RPAREN = 302, + LPAREN = 303, + DOUBLE = 304, + STRING_ = 305, + RE = 306, + ID = 307, + D_ID = 308, + FUNCT_ID = 309, + BUILTIN = 310, + LENGTH = 311, + PRINT = 312, + PRINTF = 313, + SPLIT = 314, + MATCH_FUNC = 315, + SUB = 316, + GSUB = 317, + DO = 318, + WHILE = 319, + FOR = 320, + BREAK = 321, + CONTINUE = 322, + IF = 323, + ELSE = 324, + DELETE = 325, + BEGIN = 326, + END = 327, + EXIT = 328, + NEXT = 329, + RETURN = 330, + FUNCTION = 331 + }; +#endif +/* Tokens. 
*/ +#define UNEXPECTED 258 +#define BAD_DECIMAL 259 +#define NL 260 +#define SEMI_COLON 261 +#define LBRACE 262 +#define RBRACE 263 +#define LBOX 264 +#define RBOX 265 +#define COMMA 266 +#define IO_OUT 267 +#define POW_ASG 268 +#define MOD_ASG 269 +#define DIV_ASG 270 +#define MUL_ASG 271 +#define SUB_ASG 272 +#define ADD_ASG 273 +#define ASSIGN 274 +#define COLON 275 +#define QMARK 276 +#define OR 277 +#define AND 278 +#define IN 279 +#define MATCH 280 +#define GTE 281 +#define GT 282 +#define LTE 283 +#define LT 284 +#define NEQ 285 +#define EQ 286 +#define CAT 287 +#define GETLINE 288 +#define MINUS 289 +#define PLUS 290 +#define MOD 291 +#define DIV 292 +#define MUL 293 +#define UMINUS 294 +#define NOT 295 +#define PIPE 296 +#define IO_IN 297 +#define POW 298 +#define INC_or_DEC 299 +#define FIELD 300 +#define DOLLAR 301 +#define RPAREN 302 +#define LPAREN 303 +#define DOUBLE 304 +#define STRING_ 305 +#define RE 306 +#define ID 307 +#define D_ID 308 +#define FUNCT_ID 309 +#define BUILTIN 310 +#define LENGTH 311 +#define PRINT 312 +#define PRINTF 313 +#define SPLIT 314 +#define MATCH_FUNC 315 +#define SUB 316 +#define GSUB 317 +#define DO 318 +#define WHILE 319 +#define FOR 320 +#define BREAK 321 +#define CONTINUE 322 +#define IF 323 +#define ELSE 324 +#define DELETE 325 +#define BEGIN 326 +#define END 327 +#define EXIT 328 +#define NEXT 329 +#define RETURN 330 +#define FUNCTION 331 + + + + +#if ! defined YYSTYPE && ! 
defined YYSTYPE_IS_DECLARED +typedef union YYSTYPE +{ + +/* Line 1676 of yacc.c */ +#line 124 "parse.y" + +CELL *cp ; +SYMTAB *stp ; +int start ; /* code starting address as offset from code_base */ +PF_CP fp ; /* ptr to a (print/printf) or (sub/gsub) function */ +BI_REC *bip ; /* ptr to info about a builtin */ +FBLOCK *fbp ; /* ptr to a function block */ +ARG2_REC *arg2p ; +CA_REC *ca_p ; +int ival ; +PTR ptr ; + + + +/* Line 1676 of yacc.c */ +#line 219 "y.tab.h" +} YYSTYPE; +# define YYSTYPE_IS_TRIVIAL 1 +# define yystype YYSTYPE /* obsolescent; will be withdrawn */ +# define YYSTYPE_IS_DECLARED 1 +#endif + +extern YYSTYPE yylval; + + diff --git a/parse.o b/parse.o new file mode 100644 index 0000000..9cb3c88 Binary files /dev/null and b/parse.o differ diff --git a/parse.y b/parse.y new file mode 100644 index 0000000..6bd363c --- /dev/null +++ b/parse.y @@ -0,0 +1,1378 @@ + +/******************************************** +parse.y +copyright 1991-94, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/* $Log: parse.y,v $ + * Revision 1.11 1995/06/11 22:40:09 mike + * change if(dump_code) -> if(dump_code_flag) + * cleanup of parse() + * add cast to shutup solaris cc compiler on char to int comparison + * switch_code_to_main() which cleans up outside_error production + * + * Revision 1.10 1995/04/21 14:20:21 mike + * move_level variable to fix bug in arglist patching of moved code. + * + * Revision 1.9 1995/02/19 22:15:39 mike + * Always set the call_offset field in a CA_REC (for obscure + * reasons in fcall.c (see comments) there.) 
+ * + * Revision 1.8 1994/12/13 00:39:20 mike + * delete A statement to delete all of A at once + * + * Revision 1.7 1994/10/08 19:15:48 mike + * remove SM_DOS + * + * Revision 1.6 1993/12/01 14:25:17 mike + * reentrant array loops + * + * Revision 1.5 1993/07/22 00:04:13 mike + * new op code _LJZ _LJNZ + * + * Revision 1.4 1993/07/15 23:38:15 mike + * SIZE_T and indent + * + * Revision 1.3 1993/07/07 00:07:46 mike + * more work on 1.2 + * + * Revision 1.2 1993/07/03 21:18:01 mike + * bye to yacc_mem + * + * Revision 1.1.1.1 1993/07/03 18:58:17 mike + * move source to cvs + * + * Revision 5.8 1993/05/03 01:07:18 mike + * fix bozo in LENGTH production + * + * Revision 5.7 1993/01/09 19:03:44 mike + * code_pop checks if the resolve_list needs relocation + * + * Revision 5.6 1993/01/07 02:50:33 mike + * relative vs absolute code + * + * Revision 5.5 1993/01/01 21:30:48 mike + * split new_STRING() into new_STRING and new_STRING0 + * + * Revision 5.4 1992/08/08 17:17:20 brennan + * patch 2: improved timing of error recovery in + * bungled function definitions. Fixes a core dump + * + * Revision 5.3 1992/07/08 15:43:41 brennan + * patch2: length returns. 
I am a wimp + * + * Revision 5.2 1992/01/08 16:11:42 brennan + * code FE_PUSHA carefully for MSDOS large mode + * + * Revision 5.1 91/12/05 07:50:22 brennan + * 1.1 pre-release + * +*/ + + +%{ +#include <stdio.h> +#include "mawk.h" +#include "symtype.h" +#include "code.h" +#include "memory.h" +#include "bi_funct.h" +#include "bi_vars.h" +#include "jmp.h" +#include "field.h" +#include "files.h" + + +#define YYMAXDEPTH 200 + + +extern void PROTO( eat_nl, (void) ) ; +static void PROTO( resize_fblock, (FBLOCK *) ) ; +static void PROTO( switch_code_to_main, (void)) ; +static void PROTO( code_array, (SYMTAB *) ) ; +static void PROTO( code_call_id, (CA_REC *, SYMTAB *) ) ; +static void PROTO( field_A2I, (void)) ; +static void PROTO( check_var, (SYMTAB *) ) ; +static void PROTO( check_array, (SYMTAB *) ) ; +static void PROTO( RE_as_arg, (void)) ; + +static int scope ; +static FBLOCK *active_funct ; + /* when scope is SCOPE_FUNCT */ + +#define code_address(x) if( is_local(x) ) \ + code2op(L_PUSHA, (x)->offset) ;\ + else code2(_PUSHA, (x)->stval.cp) + +#define CDP(x) (code_base+(x)) +/* WARNING: These CDP() calculations become invalid after calls + that might change code_base. Which are: code2(), code2op(), + code_jmp() and code_pop(). 
+*/ + +/* this nonsense caters to MSDOS large model */ +#define CODE_FE_PUSHA() code_ptr->ptr = (PTR) 0 ; code1(FE_PUSHA) + +%} + +%union{ +CELL *cp ; +SYMTAB *stp ; +int start ; /* code starting address as offset from code_base */ +PF_CP fp ; /* ptr to a (print/printf) or (sub/gsub) function */ +BI_REC *bip ; /* ptr to info about a builtin */ +FBLOCK *fbp ; /* ptr to a function block */ +ARG2_REC *arg2p ; +CA_REC *ca_p ; +int ival ; +PTR ptr ; +} + +/* two tokens to help with errors */ +%token UNEXPECTED /* unexpected character */ +%token BAD_DECIMAL + +%token NL +%token SEMI_COLON +%token LBRACE RBRACE +%token LBOX RBOX +%token COMMA +%token IO_OUT /* > or output pipe */ + +%right ASSIGN ADD_ASG SUB_ASG MUL_ASG DIV_ASG MOD_ASG POW_ASG +%right QMARK COLON +%left OR +%left AND +%left IN +%left MATCH /* ~ or !~ */ +%left EQ NEQ LT LTE GT GTE +%left CAT +%left GETLINE +%left PLUS MINUS +%left MUL DIV MOD +%left NOT UMINUS +%nonassoc IO_IN PIPE +%right POW +%left INC_or_DEC +%left DOLLAR FIELD /* last to remove a SR conflict + with getline */ +%right LPAREN RPAREN /* removes some SR conflicts */ + +%token DOUBLE STRING_ RE +%token ID D_ID +%token FUNCT_ID +%token BUILTIN LENGTH +%token FIELD + +%token PRINT PRINTF SPLIT MATCH_FUNC SUB GSUB +/* keywords */ +%token DO WHILE FOR BREAK CONTINUE IF ELSE IN +%token DELETE BEGIN END EXIT NEXT RETURN FUNCTION + +%type block block_or_separator +%type statement_list statement mark +%type pr_args +%type arg2 +%type builtin +%type getline_file +%type lvalue field fvalue +%type expr cat_expr p_expr +%type while_front if_front +%type for1 for2 +%type array_loop_front +%type return_statement +%type split_front re_arg sub_back +%type arglist args +%type print sub_or_gsub +%type funct_start funct_head +%type call_args ca_front ca_back +%type f_arglist f_args + +%% +/* productions */ + +program : program_block + | program program_block + ; + +program_block : PA_block /* pattern-action */ + | function_def + | outside_error block + ; + 
+PA_block : block + { /* this do nothing action removes a vacuous warning + from Bison */ + } + + | BEGIN + { be_setup(scope = SCOPE_BEGIN) ; } + + block + { switch_code_to_main() ; } + + | END + { be_setup(scope = SCOPE_END) ; } + + block + { switch_code_to_main() ; } + + | expr /* this works just like an if statement */ + { code_jmp(_JZ, (INST*)0) ; } + + block_or_separator + { patch_jmp( code_ptr ) ; } + + /* range pattern, see comment in execute.c near _RANGE */ + | expr COMMA + { + INST *p1 = CDP($1) ; + int len ; + + code_push(p1, code_ptr - p1, scope, active_funct) ; + code_ptr = p1 ; + + code2op(_RANGE, 1) ; + code_ptr += 3 ; + len = code_pop(code_ptr) ; + code_ptr += len ; + code1(_STOP) ; + p1 = CDP($1) ; + p1[2].op = code_ptr - (p1+1) ; + } + expr + { code1(_STOP) ; } + + block_or_separator + { + INST *p1 = CDP($1) ; + + p1[3].op = CDP($6) - (p1+1) ; + p1[4].op = code_ptr - (p1+1) ; + } + ; + + + +block : LBRACE statement_list RBRACE + { $$ = $2 ; } + | LBRACE error RBRACE + { $$ = code_offset ; /* does nothing won't be executed */ + print_flag = getline_flag = paren_cnt = 0 ; + yyerrok ; } + ; + +block_or_separator : block + | separator /* default print action */ + { $$ = code_offset ; + code1(_PUSHINT) ; code1(0) ; + code2(_PRINT, bi_print) ; + } + ; + +statement_list : statement + | statement_list statement + ; + + +statement : block + | expr separator + { code1(_POP) ; } + | /* empty */ separator + { $$ = code_offset ; } + | error separator + { $$ = code_offset ; + print_flag = getline_flag = 0 ; + paren_cnt = 0 ; + yyerrok ; + } + | BREAK separator + { $$ = code_offset ; BC_insert('B', code_ptr+1) ; + code2(_JMP, 0) /* don't use code_jmp ! 
*/ ; } + | CONTINUE separator + { $$ = code_offset ; BC_insert('C', code_ptr+1) ; + code2(_JMP, 0) ; } + | return_statement + { if ( scope != SCOPE_FUNCT ) + compile_error("return outside function body") ; + } + | NEXT separator + { if ( scope != SCOPE_MAIN ) + compile_error( "improper use of next" ) ; + $$ = code_offset ; + code1(_NEXT) ; + } + ; + +separator : NL | SEMI_COLON + ; + +expr : cat_expr + | lvalue ASSIGN expr { code1(_ASSIGN) ; } + | lvalue ADD_ASG expr { code1(_ADD_ASG) ; } + | lvalue SUB_ASG expr { code1(_SUB_ASG) ; } + | lvalue MUL_ASG expr { code1(_MUL_ASG) ; } + | lvalue DIV_ASG expr { code1(_DIV_ASG) ; } + | lvalue MOD_ASG expr { code1(_MOD_ASG) ; } + | lvalue POW_ASG expr { code1(_POW_ASG) ; } + | expr EQ expr { code1(_EQ) ; } + | expr NEQ expr { code1(_NEQ) ; } + | expr LT expr { code1(_LT) ; } + | expr LTE expr { code1(_LTE) ; } + | expr GT expr { code1(_GT) ; } + | expr GTE expr { code1(_GTE) ; } + + | expr MATCH expr + { + INST *p3 = CDP($3) ; + + if ( p3 == code_ptr - 2 ) + { + if ( p3->op == _MATCH0 ) p3->op = _MATCH1 ; + + else /* check for string */ + if ( p3->op == _PUSHS ) + { CELL *cp = ZMALLOC(CELL) ; + + cp->type = C_STRING ; + cp->ptr = p3[1].ptr ; + cast_to_RE(cp) ; + code_ptr -= 2 ; + code2(_MATCH1, cp->ptr) ; + ZFREE(cp) ; + } + else code1(_MATCH2) ; + } + else code1(_MATCH2) ; + + if ( !$2 ) code1(_NOT) ; + } + +/* short circuit boolean evaluation */ + | expr OR + { code1(_TEST) ; + code_jmp(_LJNZ, (INST*)0) ; + } + expr + { code1(_TEST) ; patch_jmp(code_ptr) ; } + + | expr AND + { code1(_TEST) ; + code_jmp(_LJZ, (INST*)0) ; + } + expr + { code1(_TEST) ; patch_jmp(code_ptr) ; } + + | expr QMARK { code_jmp(_JZ, (INST*)0) ; } + expr COLON { code_jmp(_JMP, (INST*)0) ; } + expr + { patch_jmp(code_ptr) ; patch_jmp(CDP($7)) ; } + ; + +cat_expr : p_expr %prec CAT + | cat_expr p_expr %prec CAT + { code1(_CAT) ; } + ; + +p_expr : DOUBLE + { $$ = code_offset ; code2(_PUSHD, $1) ; } + | STRING_ + { $$ = code_offset ; code2(_PUSHS, $1) ; 
} + | ID %prec AND /* anything less than IN */ + { check_var($1) ; + $$ = code_offset ; + if ( is_local($1) ) + { code2op(L_PUSHI, $1->offset) ; } + else code2(_PUSHI, $1->stval.cp) ; + } + + | LPAREN expr RPAREN + { $$ = $2 ; } + ; + +p_expr : RE + { $$ = code_offset ; code2(_MATCH0, $1) ; } + ; + +p_expr : p_expr PLUS p_expr { code1(_ADD) ; } + | p_expr MINUS p_expr { code1(_SUB) ; } + | p_expr MUL p_expr { code1(_MUL) ; } + | p_expr DIV p_expr { code1(_DIV) ; } + | p_expr MOD p_expr { code1(_MOD) ; } + | p_expr POW p_expr { code1(_POW) ; } + | NOT p_expr + { $$ = $2 ; code1(_NOT) ; } + | PLUS p_expr %prec UMINUS + { $$ = $2 ; code1(_UPLUS) ; } + | MINUS p_expr %prec UMINUS + { $$ = $2 ; code1(_UMINUS) ; } + | builtin + ; + +p_expr : ID INC_or_DEC + { check_var($1) ; + $$ = code_offset ; + code_address($1) ; + + if ( $2 == '+' ) code1(_POST_INC) ; + else code1(_POST_DEC) ; + } + | INC_or_DEC lvalue + { $$ = $2 ; + if ( $1 == '+' ) code1(_PRE_INC) ; + else code1(_PRE_DEC) ; + } + ; + +p_expr : field INC_or_DEC + { if ($2 == '+' ) code1(F_POST_INC ) ; + else code1(F_POST_DEC) ; + } + | INC_or_DEC field + { $$ = $2 ; + if ( $1 == '+' ) code1(F_PRE_INC) ; + else code1( F_PRE_DEC) ; + } + ; + +lvalue : ID + { $$ = code_offset ; + check_var($1) ; + code_address($1) ; + } + ; + + +arglist : /* empty */ + { $$ = 0 ; } + | args + ; + +args : expr %prec LPAREN + { $$ = 1 ; } + | args COMMA expr + { $$ = $1 + 1 ; } + ; + +builtin : + BUILTIN mark LPAREN arglist RPAREN + { BI_REC *p = $1 ; + $$ = $2 ; + if ( (int)p->min_args > $4 || (int)p->max_args < $4 ) + compile_error( + "wrong number of arguments in call to %s" , + p->name ) ; + if ( p->min_args != p->max_args ) /* variable args */ + { code1(_PUSHINT) ; code1($4) ; } + code2(_BUILTIN , p->fp) ; + } + | LENGTH /* this is an irritation */ + { + $$ = code_offset ; + code1(_PUSHINT) ; code1(0) ; + code2(_BUILTIN, $1->fp) ; + } + ; + +/* an empty production to store the code_ptr */ +mark : /* empty */ + { $$ = code_offset ; 
} + ; + +/* print_statement */ +statement : print mark pr_args pr_direction separator + { code2(_PRINT, $1) ; + if ( $1 == bi_printf && $3 == 0 ) + compile_error("no arguments in call to printf") ; + print_flag = 0 ; + $$ = $2 ; + } + ; + +print : PRINT { $$ = bi_print ; print_flag = 1 ;} + | PRINTF { $$ = bi_printf ; print_flag = 1 ; } + ; + +pr_args : arglist { code2op(_PUSHINT, $1) ; } + | LPAREN arg2 RPAREN + { $$ = $2->cnt ; zfree($2,sizeof(ARG2_REC)) ; + code2op(_PUSHINT, $$) ; + } + | LPAREN RPAREN + { $$=0 ; code2op(_PUSHINT, 0) ; } + ; + +arg2 : expr COMMA expr + { $$ = (ARG2_REC*) zmalloc(sizeof(ARG2_REC)) ; + $$->start = $1 ; + $$->cnt = 2 ; + } + | arg2 COMMA expr + { $$ = $1 ; $$->cnt++ ; } + ; + +pr_direction : /* empty */ + | IO_OUT expr + { code2op(_PUSHINT, $1) ; } + ; + + +/* IF and IF-ELSE */ + +if_front : IF LPAREN expr RPAREN + { $$ = $3 ; eat_nl() ; code_jmp(_JZ, (INST*)0) ; } + ; + +/* if_statement */ +statement : if_front statement + { patch_jmp( code_ptr ) ; } + ; + +else : ELSE { eat_nl() ; code_jmp(_JMP, (INST*)0) ; } + ; + +/* if_else_statement */ +statement : if_front statement else statement + { patch_jmp(code_ptr) ; + patch_jmp(CDP($4)) ; + } + ; + + +/* LOOPS */ + +do : DO + { eat_nl() ; BC_new() ; } + ; + +/* do_statement */ +statement : do statement WHILE LPAREN expr RPAREN separator + { $$ = $2 ; + code_jmp(_JNZ, CDP($2)) ; + BC_clear(code_ptr, CDP($5)) ; } + ; + +while_front : WHILE LPAREN expr RPAREN + { eat_nl() ; BC_new() ; + $$ = $3 ; + + /* check if const expression */ + if ( code_ptr - 2 == CDP($3) && + code_ptr[-2].op == _PUSHD && + *(double*)code_ptr[-1].ptr != 0.0 + ) + code_ptr -= 2 ; + else + { INST *p3 = CDP($3) ; + code_push(p3, code_ptr-p3, scope, active_funct) ; + code_ptr = p3 ; + code2(_JMP, (INST*)0) ; /* code2() not code_jmp() */ + } + } + ; + +/* while_statement */ +statement : while_front statement + { + int saved_offset ; + int len ; + INST *p1 = CDP($1) ; + INST *p2 = CDP($2) ; + + if ( p1 != p2 ) /* real 
test in loop */ + { + p1[1].op = code_ptr-(p1+1) ; + saved_offset = code_offset ; + len = code_pop(code_ptr) ; + code_ptr += len ; + code_jmp(_JNZ, CDP($2)) ; + BC_clear(code_ptr, CDP(saved_offset)) ; + } + else /* while(1) */ + { + code_jmp(_JMP, p1) ; + BC_clear(code_ptr, CDP($2)) ; + } + } + ; + + +/* for_statement */ +statement : for1 for2 for3 statement + { + int cont_offset = code_offset ; + unsigned len = code_pop(code_ptr) ; + INST *p2 = CDP($2) ; + INST *p4 = CDP($4) ; + + code_ptr += len ; + + if ( p2 != p4 ) /* real test in for2 */ + { + p4[-1].op = code_ptr - p4 + 1 ; + len = code_pop(code_ptr) ; + code_ptr += len ; + code_jmp(_JNZ, CDP($4)) ; + } + else /* for(;;) */ + code_jmp(_JMP, p4) ; + + BC_clear(code_ptr, CDP(cont_offset)) ; + + } + ; + +for1 : FOR LPAREN SEMI_COLON { $$ = code_offset ; } + | FOR LPAREN expr SEMI_COLON + { $$ = $3 ; code1(_POP) ; } + ; + +for2 : SEMI_COLON { $$ = code_offset ; } + | expr SEMI_COLON + { + if ( code_ptr - 2 == CDP($1) && + code_ptr[-2].op == _PUSHD && + * (double*) code_ptr[-1].ptr != 0.0 + ) + code_ptr -= 2 ; + else + { + INST *p1 = CDP($1) ; + code_push(p1, code_ptr-p1, scope, active_funct) ; + code_ptr = p1 ; + code2(_JMP, (INST*)0) ; + } + } + ; + +for3 : RPAREN + { eat_nl() ; BC_new() ; + code_push((INST*)0,0, scope, active_funct) ; + } + | expr RPAREN + { INST *p1 = CDP($1) ; + + eat_nl() ; BC_new() ; + code1(_POP) ; + code_push(p1, code_ptr - p1, scope, active_funct) ; + code_ptr -= code_ptr - p1 ; + } + ; + + +/* arrays */ + +expr : expr IN ID + { check_array($3) ; + code_array($3) ; + code1(A_TEST) ; + } + | LPAREN arg2 RPAREN IN ID + { $$ = $2->start ; + code2op(A_CAT, $2->cnt) ; + zfree($2, sizeof(ARG2_REC)) ; + + check_array($5) ; + code_array($5) ; + code1(A_TEST) ; + } + ; + +lvalue : ID mark LBOX args RBOX + { + if ( $4 > 1 ) + { code2op(A_CAT, $4) ; } + + check_array($1) ; + if( is_local($1) ) + { code2op(LAE_PUSHA, $1->offset) ; } + else code2(AE_PUSHA, $1->stval.array) ; + $$ = $2 ; + } + ; + 
+p_expr : ID mark LBOX args RBOX %prec AND + { + if ( $4 > 1 ) + { code2op(A_CAT, $4) ; } + + check_array($1) ; + if( is_local($1) ) + { code2op(LAE_PUSHI, $1->offset) ; } + else code2(AE_PUSHI, $1->stval.array) ; + $$ = $2 ; + } + + | ID mark LBOX args RBOX INC_or_DEC + { + if ( $4 > 1 ) + { code2op(A_CAT,$4) ; } + + check_array($1) ; + if( is_local($1) ) + { code2op(LAE_PUSHA, $1->offset) ; } + else code2(AE_PUSHA, $1->stval.array) ; + if ( $6 == '+' ) code1(_POST_INC) ; + else code1(_POST_DEC) ; + + $$ = $2 ; + } + ; + +/* delete A[i] or delete A */ +statement : DELETE ID mark LBOX args RBOX separator + { + $$ = $3 ; + if ( $5 > 1 ) { code2op(A_CAT, $5) ; } + check_array($2) ; + code_array($2) ; + code1(A_DEL) ; + } + | DELETE ID separator + { + $$ = code_offset ; + check_array($2) ; + code_array($2) ; + code1(DEL_A) ; + } + ; + +/* for ( i in A ) statement */ + +array_loop_front : FOR LPAREN ID IN ID RPAREN + { eat_nl() ; BC_new() ; + $$ = code_offset ; + + check_var($3) ; + code_address($3) ; + check_array($5) ; + code_array($5) ; + + code2(SET_ALOOP, (INST*)0) ; + } + ; + +/* array_loop */ +statement : array_loop_front statement + { + INST *p2 = CDP($2) ; + + p2[-1].op = code_ptr - p2 + 1 ; + BC_clear( code_ptr+2 , code_ptr) ; + code_jmp(ALOOP, p2) ; + code1(POP_AL) ; + } + ; + +/* fields + D_ID is a special token , same as an ID, but yylex() + only returns it after a '$'. In essense, + DOLLAR D_ID is really one token. 
+*/ + +field : FIELD + { $$ = code_offset ; code2(F_PUSHA, $1) ; } + | DOLLAR D_ID + { check_var($2) ; + $$ = code_offset ; + if ( is_local($2) ) + { code2op(L_PUSHI, $2->offset) ; } + else code2(_PUSHI, $2->stval.cp) ; + + CODE_FE_PUSHA() ; + } + | DOLLAR D_ID mark LBOX args RBOX + { + if ( $5 > 1 ) + { code2op(A_CAT, $5) ; } + + check_array($2) ; + if( is_local($2) ) + { code2op(LAE_PUSHI, $2->offset) ; } + else code2(AE_PUSHI, $2->stval.array) ; + + CODE_FE_PUSHA() ; + + $$ = $3 ; + } + | DOLLAR p_expr + { $$ = $2 ; CODE_FE_PUSHA() ; } + | LPAREN field RPAREN + { $$ = $2 ; } + ; + +p_expr : field %prec CAT /* removes field (++|--) sr conflict */ + { field_A2I() ; } + ; + +expr : field ASSIGN expr { code1(F_ASSIGN) ; } + | field ADD_ASG expr { code1(F_ADD_ASG) ; } + | field SUB_ASG expr { code1(F_SUB_ASG) ; } + | field MUL_ASG expr { code1(F_MUL_ASG) ; } + | field DIV_ASG expr { code1(F_DIV_ASG) ; } + | field MOD_ASG expr { code1(F_MOD_ASG) ; } + | field POW_ASG expr { code1(F_POW_ASG) ; } + ; + +/* split is handled different than a builtin because + it takes an array and optionally a regular expression as args */ + +p_expr : split_front split_back + { code2(_BUILTIN, bi_split) ; } + ; + +split_front : SPLIT LPAREN expr COMMA ID + { $$ = $3 ; + check_array($5) ; + code_array($5) ; + } + ; + +split_back : RPAREN + { code2(_PUSHI, &fs_shadow) ; } + | COMMA expr RPAREN + { + if ( CDP($2) == code_ptr - 2 ) + { + if ( code_ptr[-2].op == _MATCH0 ) + RE_as_arg() ; + else + if ( code_ptr[-2].op == _PUSHS ) + { CELL *cp = ZMALLOC(CELL) ; + + cp->type = C_STRING ; + cp->ptr = code_ptr[-1].ptr ; + cast_for_split(cp) ; + code_ptr[-2].op = _PUSHC ; + code_ptr[-1].ptr = (PTR) cp ; + } + } + } + ; + + + +/* match(expr, RE) */ + +p_expr : MATCH_FUNC LPAREN expr COMMA re_arg RPAREN + { $$ = $3 ; + code2(_BUILTIN, bi_match) ; + } + ; + + +re_arg : expr + { + INST *p1 = CDP($1) ; + + if ( p1 == code_ptr - 2 ) + { + if ( p1->op == _MATCH0 ) RE_as_arg() ; + else + if ( p1->op == 
_PUSHS ) + { CELL *cp = ZMALLOC(CELL) ; + + cp->type = C_STRING ; + cp->ptr = p1[1].ptr ; + cast_to_RE(cp) ; + p1->op = _PUSHC ; + p1[1].ptr = (PTR) cp ; + } + } + } + ; + + +/* exit_statement */ +statement : EXIT separator + { $$ = code_offset ; + code1(_EXIT0) ; } + | EXIT expr separator + { $$ = $2 ; code1(_EXIT) ; } + ; + +return_statement : RETURN separator + { $$ = code_offset ; + code1(_RET0) ; } + | RETURN expr separator + { $$ = $2 ; code1(_RET) ; } + ; + +/* getline */ + +p_expr : getline %prec GETLINE + { $$ = code_offset ; + code2(F_PUSHA, &field[0]) ; + code1(_PUSHINT) ; code1(0) ; + code2(_BUILTIN, bi_getline) ; + getline_flag = 0 ; + } + | getline fvalue %prec GETLINE + { $$ = $2 ; + code1(_PUSHINT) ; code1(0) ; + code2(_BUILTIN, bi_getline) ; + getline_flag = 0 ; + } + | getline_file p_expr %prec IO_IN + { code1(_PUSHINT) ; code1(F_IN) ; + code2(_BUILTIN, bi_getline) ; + /* getline_flag already off in yylex() */ + } + | p_expr PIPE GETLINE + { code2(F_PUSHA, &field[0]) ; + code1(_PUSHINT) ; code1(PIPE_IN) ; + code2(_BUILTIN, bi_getline) ; + } + | p_expr PIPE GETLINE fvalue + { + code1(_PUSHINT) ; code1(PIPE_IN) ; + code2(_BUILTIN, bi_getline) ; + } + ; + +getline : GETLINE { getline_flag = 1 ; } ; + +fvalue : lvalue | field ; + +getline_file : getline IO_IN + { $$ = code_offset ; + code2(F_PUSHA, field+0) ; + } + | getline fvalue IO_IN + { $$ = $2 ; } + ; + +/*========================================== + sub and gsub + ==========================================*/ + +p_expr : sub_or_gsub LPAREN re_arg COMMA expr sub_back + { + INST *p5 = CDP($5) ; + INST *p6 = CDP($6) ; + + if ( p6 - p5 == 2 && p5->op == _PUSHS ) + { /* cast from STRING to REPL at compile time */ + CELL *cp = ZMALLOC(CELL) ; + cp->type = C_STRING ; + cp->ptr = p5[1].ptr ; + cast_to_REPL(cp) ; + p5->op = _PUSHC ; + p5[1].ptr = (PTR) cp ; + } + code2(_BUILTIN, $1) ; + $$ = $3 ; + } + ; + +sub_or_gsub : SUB { $$ = bi_sub ; } + | GSUB { $$ = bi_gsub ; } + ; + + +sub_back : RPAREN /* 
substitute into $0 */ + { $$ = code_offset ; + code2(F_PUSHA, &field[0]) ; + } + + | COMMA fvalue RPAREN + { $$ = $2 ; } + ; + +/*================================================ + user defined functions + *=================================*/ + +function_def : funct_start block + { + resize_fblock($1) ; + restore_ids() ; + switch_code_to_main() ; + } + ; + + +funct_start : funct_head LPAREN f_arglist RPAREN + { eat_nl() ; + scope = SCOPE_FUNCT ; + active_funct = $1 ; + *main_code_p = active_code ; + + $1->nargs = $3 ; + if ( $3 ) + $1->typev = (char *) + memset( zmalloc($3), ST_LOCAL_NONE, $3) ; + else $1->typev = (char *) 0 ; + + code_ptr = code_base = + (INST *) zmalloc(INST_BYTES(PAGESZ)); + code_limit = code_base + PAGESZ ; + code_warn = code_limit - CODEWARN ; + } + ; + +funct_head : FUNCTION ID + { FBLOCK *fbp ; + + if ( $2->type == ST_NONE ) + { + $2->type = ST_FUNCT ; + fbp = $2->stval.fbp = + (FBLOCK *) zmalloc(sizeof(FBLOCK)) ; + fbp->name = $2->name ; + fbp->code = (INST*) 0 ; + } + else + { + type_error( $2 ) ; + + /* this FBLOCK will not be put in + the symbol table */ + fbp = (FBLOCK*) zmalloc(sizeof(FBLOCK)) ; + fbp->name = "" ; + } + $$ = fbp ; + } + + | FUNCTION FUNCT_ID + { $$ = $2 ; + if ( $2->code ) + compile_error("redefinition of %s" , $2->name) ; + } + ; + +f_arglist : /* empty */ { $$ = 0 ; } + | f_args + ; + +f_args : ID + { $1 = save_id($1->name) ; + $1->type = ST_LOCAL_NONE ; + $1->offset = 0 ; + $$ = 1 ; + } + | f_args COMMA ID + { if ( is_local($3) ) + compile_error("%s is duplicated in argument list", + $3->name) ; + else + { $3 = save_id($3->name) ; + $3->type = ST_LOCAL_NONE ; + $3->offset = $1 ; + $$ = $1 + 1 ; + } + } + ; + +outside_error : error + { /* we may have to recover from a bungled function + definition */ + /* can have local ids, before code scope + changes */ + restore_ids() ; + + switch_code_to_main() ; + } + ; + +/* a call to a user defined function */ + +p_expr : FUNCT_ID mark call_args + { $$ = $2 ; + code2(_CALL, 
$1) ; + + if ( $3 ) code1($3->arg_num+1) ; + else code1(0) ; + + check_fcall($1, scope, code_move_level, active_funct, + $3, token_lineno) ; + } + ; + +call_args : LPAREN RPAREN + { $$ = (CA_REC *) 0 ; } + | ca_front ca_back + { $$ = $2 ; + $$->link = $1 ; + $$->arg_num = $1 ? $1->arg_num+1 : 0 ; + } + ; + +/* The funny definition of ca_front with the COMMA bound to the ID is to + force a shift to avoid a reduce/reduce conflict + ID->id or ID->array + + Or to avoid a decision, if the type of the ID has not yet been + determined +*/ + +ca_front : LPAREN + { $$ = (CA_REC *) 0 ; } + | ca_front expr COMMA + { $$ = ZMALLOC(CA_REC) ; + $$->link = $1 ; + $$->type = CA_EXPR ; + $$->arg_num = $1 ? $1->arg_num+1 : 0 ; + $$->call_offset = code_offset ; + } + | ca_front ID COMMA + { $$ = ZMALLOC(CA_REC) ; + $$->link = $1 ; + $$->arg_num = $1 ? $1->arg_num+1 : 0 ; + + code_call_id($$, $2) ; + } + ; + +ca_back : expr RPAREN + { $$ = ZMALLOC(CA_REC) ; + $$->type = CA_EXPR ; + $$->call_offset = code_offset ; + } + + | ID RPAREN + { $$ = ZMALLOC(CA_REC) ; + code_call_id($$, $1) ; + } + ; + + + + +%% + +/* resize the code for a user function */ + +static void resize_fblock( fbp ) + FBLOCK *fbp ; +{ + CODEBLOCK *p = ZMALLOC(CODEBLOCK) ; + unsigned dummy ; + + code2op(_RET0, _HALT) ; + /* make sure there is always a return */ + + *p = active_code ; + fbp->code = code_shrink(p, &dummy) ; + /* code_shrink() zfrees p */ + + if ( dump_code_flag ) add_to_fdump_list(fbp) ; +} + + +/* convert FE_PUSHA to FE_PUSHI + or F_PUSH to F_PUSHI +*/ + +static void field_A2I() +{ CELL *cp ; + + if ( code_ptr[-1].op == FE_PUSHA && + code_ptr[-1].ptr == (PTR) 0) + /* On most architectures, the two tests are the same; a good + compiler might eliminate one. 
On LM_DOS, and possibly other + segmented architectures, they are not */ + { code_ptr[-1].op = FE_PUSHI ; } + else + { + cp = (CELL *) code_ptr[-1].ptr ; + + if ( cp == field || + +#ifdef MSDOS + SAMESEG(cp,field) && +#endif + cp > NF && cp <= LAST_PFIELD ) + { + code_ptr[-2].op = _PUSHI ; + } + else if ( cp == NF ) + { code_ptr[-2].op = NF_PUSHI ; code_ptr-- ; } + + else + { + code_ptr[-2].op = F_PUSHI ; + code_ptr -> op = field_addr_to_index( code_ptr[-1].ptr ) ; + code_ptr++ ; + } + } +} + +/* we've seen an ID in a context where it should be a VAR, + check that's consistent with previous usage */ + +static void check_var( p ) + register SYMTAB *p ; +{ + switch(p->type) + { + case ST_NONE : /* new id */ + p->type = ST_VAR ; + p->stval.cp = ZMALLOC(CELL) ; + p->stval.cp->type = C_NOINIT ; + break ; + + case ST_LOCAL_NONE : + p->type = ST_LOCAL_VAR ; + active_funct->typev[p->offset] = ST_LOCAL_VAR ; + break ; + + case ST_VAR : + case ST_LOCAL_VAR : break ; + + default : + type_error(p) ; + break ; + } +} + +/* we've seen an ID in a context where it should be an ARRAY, + check that's consistent with previous usage */ +static void check_array(p) + register SYMTAB *p ; +{ + switch(p->type) + { + case ST_NONE : /* a new array */ + p->type = ST_ARRAY ; + p->stval.array = new_ARRAY() ; + break ; + + case ST_ARRAY : + case ST_LOCAL_ARRAY : + break ; + + case ST_LOCAL_NONE : + p->type = ST_LOCAL_ARRAY ; + active_funct->typev[p->offset] = ST_LOCAL_ARRAY ; + break ; + + default : type_error(p) ; break ; + } +} + +static void code_array(p) + register SYMTAB *p ; +{ + if ( is_local(p) ) code2op(LA_PUSHA, p->offset) ; + else code2(A_PUSHA, p->stval.array) ; +} + + +/* we've seen an ID as an argument to a user defined function */ + +static void code_call_id( p, ip ) + register CA_REC *p ; + register SYMTAB *ip ; +{ static CELL dummy ; + + p->call_offset = code_offset ; + /* This always get set now. So that fcall:relocate_arglist + works. 
*/ + + switch( ip->type ) + { + case ST_VAR : + p->type = CA_EXPR ; + code2(_PUSHI, ip->stval.cp) ; + break ; + + case ST_LOCAL_VAR : + p->type = CA_EXPR ; + code2op(L_PUSHI, ip->offset) ; + break ; + + case ST_ARRAY : + p->type = CA_ARRAY ; + code2(A_PUSHA, ip->stval.array) ; + break ; + + case ST_LOCAL_ARRAY : + p->type = CA_ARRAY ; + code2op(LA_PUSHA, ip->offset) ; + break ; + + /* not enough info to code it now; it will have to + be patched later */ + + case ST_NONE : + p->type = ST_NONE ; + p->sym_p = ip ; + code2(_PUSHI, &dummy) ; + break ; + + case ST_LOCAL_NONE : + p->type = ST_LOCAL_NONE ; + p->type_p = & active_funct->typev[ip->offset] ; + code2op(L_PUSHI, ip->offset) ; + break ; + + +#ifdef DEBUG + default : + bozo("code_call_id") ; +#endif + + } +} + +/* an RE by itself was coded as _MATCH0 , change to + push as an expression */ + +static void RE_as_arg() +{ CELL *cp = ZMALLOC(CELL) ; + + code_ptr -= 2 ; + cp->type = C_RE ; + cp->ptr = code_ptr[1].ptr ; + code2(_PUSHC, cp) ; +} + +/* reset the active_code back to the MAIN block */ +static void +switch_code_to_main() +{ + switch(scope) + { + case SCOPE_BEGIN : + *begin_code_p = active_code ; + active_code = *main_code_p ; + break ; + + case SCOPE_END : + *end_code_p = active_code ; + active_code = *main_code_p ; + break ; + + case SCOPE_FUNCT : + active_code = *main_code_p ; + break ; + + case SCOPE_MAIN : + break ; + } + active_funct = (FBLOCK*) 0 ; + scope = SCOPE_MAIN ; +} + + +void +parse() +{ + if ( yyparse() || compile_error_count != 0 ) mawk_exit(2) ; + + scan_cleanup() ; + set_code() ; + /* code must be set before call to resolve_fcalls() */ + if ( resolve_list ) resolve_fcalls() ; + + if ( compile_error_count != 0 ) mawk_exit(2) ; + if ( dump_code_flag ) { dump_code() ; mawk_exit(0) ; } +} + diff --git a/patch-stamp b/patch-stamp new file mode 100644 index 0000000..7f70809 --- /dev/null +++ b/patch-stamp @@ -0,0 +1,2 @@ +Patches applied in the Debian version of : + diff --git a/patchlev.h 
b/patchlev.h new file mode 100644 index 0000000..04e35ea --- /dev/null +++ b/patchlev.h @@ -0,0 +1,5 @@ +/* mawk 1.3 */ +#define PATCHLEVEL 3 +#define PATCH_STRING ".3" +#define DATE_STRING "Nov 1996" +#define MAWK_ID "@(#)mawk 1.3.3" diff --git a/print.c b/print.c new file mode 100644 index 0000000..ec982fb --- /dev/null +++ b/print.c @@ -0,0 +1,580 @@ + +/******************************************** +print.c +copyright 1991-1993. Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/* $Log: print.c,v $ + * Revision 1.7 1996/09/18 01:04:36 mike + * Check ferror() after print and printf. + * + * Revision 1.6 1995/10/13 16:56:45 mike + * Some assumptions that int==long were still in do_printf -- now removed. + * + * Revision 1.5 1995/06/18 19:17:50 mike + * Create a type Int which on most machines is an int, but on machines + * with 16bit ints, i.e., the PC is a long. This fixes implicit assumption + * that int==long. 
+ * + * Revision 1.4 1994/10/08 19:15:50 mike + * remove SM_DOS + * + * Revision 1.3 1993/07/15 23:38:19 mike + * SIZE_T and indent + * + * Revision 1.2 1993/07/07 00:07:50 mike + * more work on 1.2 + * + * Revision 1.1.1.1 1993/07/03 18:58:18 mike + * move source to cvs + * + * Revision 5.6 1993/02/13 21:57:30 mike + * merge patch3 + * + * Revision 5.5 1993/01/01 21:30:48 mike + * split new_STRING() into new_STRING and new_STRING0 + * + * Revision 5.4.1.2 1993/01/20 12:53:11 mike + * d_to_l() + * + * Revision 5.4.1.1 1993/01/15 03:33:47 mike + * patch3: safer double to int conversion + * + * Revision 5.4 1992/11/29 18:03:11 mike + * when printing integers, convert doubles to + * longs so output is the same on 16bit systems as 32bit systems + * + * Revision 5.3 1992/08/17 14:23:21 brennan + * patch2: After parsing, only bi_sprintf() uses string_buff. + * + * Revision 5.2 1992/02/24 10:52:16 brennan + * printf and sprintf() can now have more args than % conversions + * removed HAVE_PRINTF_HD -- it was too obscure + * + * Revision 5.1 91/12/05 07:56:22 brennan + * 1.1 pre-release + * +*/ + +#include "mawk.h" +#include "bi_vars.h" +#include "bi_funct.h" +#include "memory.h" +#include "field.h" +#include "scan.h" +#include "files.h" + +static void PROTO(print_cell, (CELL *, FILE *)) ; +static STRING *PROTO(do_printf, (FILE *, char *, unsigned, CELL *)) ; +static void PROTO(bad_conversion, (int, char *, char *)) ; +static void PROTO(write_error,(void)) ; + +/* prototyping fprintf() or sprintf() is a loser as ellipses will + always cause problems with ansi compilers depending on what + they've already seen, + but we need them here and sometimes they are missing +*/ + +#ifdef NO_FPRINTF_IN_STDIO +int PROTO(fprintf, (FILE *, const char *,...)) ; +#endif +#ifdef NO_SPRINTF_IN_STDIO +int PROTO(sprintf, (char *, const char *,...)) ; +#endif + +/* this can be moved and enlarged by -W sprintf=num */ +char *sprintf_buff = string_buff ; +char *sprintf_limit = string_buff + 
SPRINTF_SZ ; + +/* Once execute() starts the sprintf code is (belatedly) the only + code allowed to use string_buff */ + +static void +print_cell(p, fp) + register CELL *p ; + register FILE *fp ; +{ + int len ; + + switch (p->type) + { + case C_NOINIT: + break ; + case C_MBSTRN: + case C_STRING: + case C_STRNUM: + switch (len = string(p)->len) + { + case 0: + break ; + case 1: + putc(string(p)->str[0], fp) ; + break ; + + default: + fwrite(string(p)->str, 1, len, fp) ; + } + break ; + + case C_DOUBLE: + { + Int ival = d_to_I(p->dval) ; + + /* integers print as "%[l]d" */ + if ((double) ival == p->dval) fprintf(fp, INT_FMT, ival) ; + else fprintf(fp, string(OFMT)->str, p->dval) ; + } + break ; + + default: + bozo("bad cell passed to print_cell") ; + } +} + +/* on entry to bi_print or bi_printf the stack is: + + sp[0] = an integer k + if ( k < 0 ) output is to a file with name in sp[-1] + { so open file and sp -= 2 } + + sp[0] = k >= 0 is the number of print args + sp[-k] holds the first argument +*/ + +CELL * +bi_print(sp) + CELL *sp ; /* stack ptr passed in */ +{ + register CELL *p ; + register int k ; + FILE *fp ; + + k = sp->type ; + if (k < 0) + { + /* k holds redirection */ + if ((--sp)->type < C_STRING) cast1_to_s(sp) ; + fp = (FILE *) file_find(string(sp), k) ; + free_STRING(string(sp)) ; + k = (--sp)->type ; + /* k now has number of arguments */ + } + else fp = stdout ; + + if (k) + { + p = sp - k ; /* clear k variables off the stack */ + sp = p - 1 ; + k-- ; + + while (k > 0) + { + print_cell(p,fp) ; print_cell(OFS,fp) ; + cell_destroy(p) ; + p++ ; k-- ; + } + + print_cell(p, fp) ; cell_destroy(p) ; + } + else + { /* print $0 */ + sp-- ; + print_cell(&field[0], fp) ; + } + + print_cell(ORS, fp) ; + if (ferror(fp)) write_error() ; + return sp ; +} + +/*---------- types and defs for doing printf and sprintf----*/ +#define PF_C 0 /* %c */ +#define PF_S 1 /* %s */ +#define PF_D 2 /* int conversion */ +#define PF_F 3 /* float conversion */ + +/* for switch on 
number of '*' and type */ +#define AST(num,type) ((PF_F+1)*(num)+(type)) + +/* some picky ANSI compilers go berserk without this */ +#ifdef NO_PROTOS +typedef int (*PRINTER) () ; +#else +typedef int (*PRINTER) (PTR, const char *,...) ; +#endif + +/*-------------------------------------------------------*/ + +static void +bad_conversion(cnt, who, format) + int cnt ; + char *who, *format ; +{ + rt_error("improper conversion(number %d) in %s(\"%s\")", + cnt, who, format) ; +} + +/* the contents of format are preserved, + caller does CELL cleanup + + This routine does both printf and sprintf (if fp==0) +*/ +static STRING * +do_printf(fp, format, argcnt, cp) + FILE *fp ; + char *format ; + unsigned argcnt ; /* number of args on eval stack */ + CELL *cp ; /* ptr to an array of arguments + (on the eval stack) */ +{ + char save ; + char *p ; + register char *q = format ; + register char *target ; + int l_flag, h_flag ; /* seen %ld or %hd */ + int ast_cnt ; + int ast[2] ; + Int Ival ; + int num_conversion = 0 ; /* for error messages */ + char *who ; /*ditto*/ + int pf_type ; /* conversion type */ + PRINTER printer ; /* pts at fprintf() or sprintf() */ + +#ifdef SHORT_INTS + char xbuff[256] ; /* splice in l qualifier here */ +#endif + + if (fp == (FILE *) 0) /* doing sprintf */ + { + target = sprintf_buff ; + printer = (PRINTER) sprintf ; + who = "sprintf" ; + } + else /* doing printf */ + { + target = (char *) fp ; /* will never change */ + printer = (PRINTER) fprintf ; + who = "printf" ; + } + + while (1) + { + if (fp) /* printf */ + { + while (*q != '%') { + if (*q == 0) { + if (ferror(fp)) write_error() ; + /* return is ignored */ + return (STRING *) 0 ; + } + else { putc(*q,fp) ; q++ ; } + } + } + else /* sprintf */ + { + while (*q != '%') + if (*q == 0) + { + if (target > sprintf_limit) /* damaged */ + { + /* hope this works */ + rt_overflow("sprintf buffer", + sprintf_limit - sprintf_buff) ; + } + else /* really done */ + { + STRING *retval ; + int len = target - 
sprintf_buff ; + + retval = new_STRING0(len) ; + memcpy(retval->str, sprintf_buff, len) ; + return retval ; + } + } + else *target++ = *q++ ; + } + + + /* *q == '%' */ + num_conversion++ ; + + if (*++q == '%') /* %% */ + { + if (fp) putc(*q, fp) ; + else *target++ = *q ; + + q++ ; continue ; + } + + /* mark the '%' with p */ + p = q - 1 ; + + /* eat the flags */ + while (*q == '-' || *q == '+' || *q == ' ' || + *q == '#' || *q == '0') + q++ ; + + ast_cnt = 0 ; + if (*q == '*') + { + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; + ast[ast_cnt++] = d_to_i(cp++->dval) ; + argcnt-- ; q++ ; + } + else + while (scan_code[*(unsigned char *) q] == SC_DIGIT) q++ ; + /* width is done */ + + if (*q == '.') /* have precision */ + { + q++ ; + if (*q == '*') + { + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; + ast[ast_cnt++] = d_to_i(cp++->dval) ; + argcnt-- ; q++ ; + } + else + while (scan_code[*(unsigned char *) q] == SC_DIGIT) q++ ; + } + + if (argcnt <= 0) + rt_error("not enough arguments passed to %s(\"%s\")", + who, format) ; + + l_flag = h_flag = 0 ; + + if (*q == 'l') { q++ ; l_flag = 1 ; } + else if (*q == 'h') { q++ ; h_flag = 1 ; } + switch (*q++) + { + case 's': + if (l_flag + h_flag) + bad_conversion(num_conversion, who, format) ; + if (cp->type < C_STRING) cast1_to_s(cp) ; + pf_type = PF_S ; + break ; + + case 'c': + if (l_flag + h_flag) + bad_conversion(num_conversion, who, format) ; + + switch (cp->type) + { + case C_NOINIT: + Ival = 0 ; + break ; + + case C_STRNUM: + case C_DOUBLE: + Ival = d_to_I(cp->dval) ; + break ; + + case C_STRING: + Ival = string(cp)->str[0] ; + break ; + + case C_MBSTRN: + check_strnum(cp) ; + Ival = cp->type == C_STRING ? 
+ string(cp)->str[0] : d_to_I(cp->dval) ; + break ; + + default: + bozo("printf %c") ; + } + + pf_type = PF_C ; + break ; + + case 'd': + case 'o': + case 'x': + case 'X': + case 'i': + case 'u': + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; + Ival = d_to_I(cp->dval) ; + pf_type = PF_D ; + break ; + + case 'e': + case 'g': + case 'f': + case 'E': + case 'G': + if (h_flag + l_flag) + bad_conversion(num_conversion, who, format) ; + if (cp->type != C_DOUBLE) cast1_to_d(cp) ; + pf_type = PF_F ; + break ; + + default: + bad_conversion(num_conversion, who, format) ; + } + + save = *q ; + *q = 0 ; + +#ifdef SHORT_INTS + if (pf_type == PF_D) + { + /* need to splice in long modifier */ + strcpy(xbuff, p) ; + + if (l_flag) /* do nothing */ ; + else + { + int k = q - p ; + + if (h_flag) + { + Ival = (short) Ival ; + /* replace the 'h' with 'l' (really!) */ + xbuff[k - 2] = 'l' ; + if (xbuff[k - 1] != 'd' && xbuff[k - 1] != 'i') + Ival &= 0xffff ; + } + else + { + /* the usual case */ + xbuff[k] = xbuff[k - 1] ; + xbuff[k - 1] = 'l' ; + xbuff[k + 1] = 0 ; + } + } + } +#endif + + /* ready to call printf() */ + switch (AST(ast_cnt, pf_type)) + { + case AST(0, PF_C): + (*printer) ((PTR) target, p, (int) Ival) ; + break ; + + case AST(1, PF_C): + (*printer) ((PTR) target, p, ast[0], (int) Ival) ; + break ; + + case AST(2, PF_C): + (*printer) ((PTR) target, p, ast[0], ast[1], (int) Ival) ; + break ; + + case AST(0, PF_S): + (*printer) ((PTR) target, p, string(cp)->str) ; + break ; + + case AST(1, PF_S): + (*printer) ((PTR) target, p, ast[0], string(cp)->str) ; + break ; + + case AST(2, PF_S): + (*printer) ((PTR) target, p, ast[0], ast[1], string(cp)->str) ; + break ; + +#ifdef SHORT_INTS +#define FMT xbuff /* format in xbuff */ +#else +#define FMT p /* p -> format */ +#endif + case AST(0, PF_D): + (*printer) ((PTR) target, FMT, Ival) ; + break ; + + case AST(1, PF_D): + (*printer) ((PTR) target, FMT, ast[0], Ival) ; + break ; + + case AST(2, PF_D): + (*printer) ((PTR) target, FMT, 
ast[0], ast[1], Ival) ; + break ; + +#undef FMT + + + case AST(0, PF_F): + (*printer) ((PTR) target, p, cp->dval) ; + break ; + + case AST(1, PF_F): + (*printer) ((PTR) target, p, ast[0], cp->dval) ; + break ; + + case AST(2, PF_F): + (*printer) ((PTR) target, p, ast[0], ast[1], cp->dval) ; + break ; + } + if (fp == (FILE *) 0) + while (*target) target++ ; + *q = save ; argcnt-- ; cp++ ; + } +} + +CELL * +bi_printf(sp) + register CELL *sp ; +{ + register int k ; + register CELL *p ; + FILE *fp ; + + k = sp->type ; + if (k < 0) + { + /* k has redirection */ + if ((--sp)->type < C_STRING) cast1_to_s(sp) ; + fp = (FILE *) file_find(string(sp), k) ; + free_STRING(string(sp)) ; + k = (--sp)->type ; + /* k is now number of args including format */ + } + else fp = stdout ; + + sp -= k ; /* sp points at the format string */ + k-- ; + + if (sp->type < C_STRING) cast1_to_s(sp) ; + do_printf(fp, string(sp)->str, k, sp + 1); + free_STRING(string(sp)) ; + + /* cleanup arguments on eval stack */ + for (p = sp + 1; k; k--, p++) cell_destroy(p) ; + return --sp ; +} + +CELL * +bi_sprintf(sp) + CELL *sp ; +{ + CELL *p ; + int argcnt = sp->type ; + STRING *sval ; + + sp -= argcnt ; /* sp points at the format string */ + argcnt-- ; + + if (sp->type != C_STRING) cast1_to_s(sp) ; + sval = do_printf((FILE *) 0, string(sp)->str, argcnt, sp + 1) ; + free_STRING(string(sp)) ; + sp->ptr = (PTR) sval ; + + /* cleanup */ + for (p = sp + 1; argcnt; argcnt--, p++) cell_destroy(p) ; + + return sp ; +} + + +static void +write_error() +{ + errmsg(errno, "write failure") ; + mawk_exit(2) ; +} diff --git a/print.o b/print.o new file mode 100644 index 0000000..a8327fe Binary files /dev/null and b/print.o differ diff --git a/re_cmpl.c b/re_cmpl.c new file mode 100644 index 0000000..6609189 --- /dev/null +++ b/re_cmpl.c @@ -0,0 +1,419 @@ + +/******************************************** +re_cmpl.c +copyright 1991, Michael D. 
Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/* $Log: re_cmpl.c,v $ + * Revision 1.6 1994/12/13 00:14:58 mike + * \\ -> \ on second replacement scan + * + * Revision 1.5 1994/10/08 19:15:51 mike + * remove SM_DOS + * + * Revision 1.4 1993/07/21 01:17:53 mike + * handle "&" as replacement correctly + * + * Revision 1.3 1993/07/17 13:23:10 mike + * indent and general code cleanup + * + * Revision 1.2 1993/07/15 23:38:23 mike + * SIZE_T and indent + * + * Revision 1.1.1.1 1993/07/03 18:58:19 mike + * move source to cvs + * + * Revision 5.2 1993/01/01 21:30:48 mike + * split new_STRING() into new_STRING and new_STRING0 + * + * Revision 5.1 1991/12/05 07:56:25 brennan + * 1.1 pre-release + * +*/ + + +/* re_cmpl.c */ + +#include "mawk.h" +#include "memory.h" +#include "scan.h" +#include "regexp.h" +#include "repl.h" + + +static CELL *PROTO(REPL_compile, (STRING *)) ; + +typedef struct re_node +{ + STRING *sval ; + PTR re ; + struct re_node *link ; +} RE_NODE ; + +/* a list of compiled regular expressions */ +static RE_NODE *re_list ; + +static char efmt[] = "regular expression compile failed (%s)\n%s" ; + +/* compile a STRING to a regular expression machine. 
+ Search a list of pre-compiled strings first +*/ +PTR +re_compile(sval) + STRING *sval ; +{ + register RE_NODE *p ; + RE_NODE *q ; + char *s ; + + /* search list */ + s = sval->str ; + p = re_list ; + q = (RE_NODE *) 0 ; + while (p) + { + if (strcmp(s, p->sval->str) == 0) /* found */ + { + if (!q) /* already at front */ + goto _return ; + else /* delete from list for move to front */ + { + q->link = p->link ; goto found ; + } + + } + else + { + q = p ; p = p->link ; + } + } + + /* not found */ + p = ZMALLOC(RE_NODE) ; + p->sval = sval ; + + sval->ref_cnt++ ; + if (!(p->re = REcompile(s))) + { + if (mawk_state == EXECUTION) + rt_error(efmt, REerrlist[REerrno], s) ; + else /* compiling */ + { + compile_error(efmt, REerrlist[REerrno], s) ; + return (PTR) 0 ; + } + } + + +found : +/* insert p at the front of the list */ + p->link = re_list ; re_list = p ; + +_return : + +#ifdef DEBUG + if (dump_RE) REmprint(p->re, stderr) ; +#endif + return p->re ; +} + + + +/* this is only used by da() */ + + +char * +re_uncompile(m) + PTR m ; +{ + register RE_NODE *p ; + + for (p = re_list; p; p = p->link) + if (p->re == m) return p->sval->str ; +#ifdef DEBUG + bozo("non compiled machine") ; +#else + return NULL; +#endif +} + + + +/*=================================================*/ +/* replacement operations */ + +/* create a replacement CELL from a STRING * */ + +static CELL * +REPL_compile(sval) + STRING *sval ; +{ + int i = 0 ; + register char *p = sval->str ; + register char *q ; + char *xbuff ; + CELL *cp ; + + q = xbuff = (char *) zmalloc(sval->len + 1) ; + + while (1) + { + switch (*p) + { + case 0: + *q = 0 ; + goto done ; + + case '\\': + if (p[1] == '&'|| p[1] == '\\') + { + *q++ = p[1] ; + p += 2 ; + continue ; + } + else break ; + + case '&': + /* if empty we don't need to make a node */ + if (q != xbuff) + { + *q = 0 ; + split_buff[i++] = new_STRING(xbuff) ; + } + /* and a null node for the '&' */ + split_buff[i++] = (STRING *) 0 ; + /* reset */ + p++ ; q = xbuff ; + 
continue ; + + default: + break ; + } + + *q++ = *p++ ; + } + +done : + /* if we have one empty string it will get made now */ + if (q > xbuff || i == 0) split_buff[i++] = new_STRING(xbuff) ; + + /* This will never happen */ + if (i > MAX_SPLIT) overflow("replacement pieces", MAX_SPLIT) ; + + cp = ZMALLOC(CELL) ; + if (i == 1 && split_buff[0]) + { + cp->type = C_REPL ; + cp->ptr = (PTR) split_buff[0] ; + } + else + { + STRING **sp = (STRING **) + (cp->ptr = zmalloc(sizeof(STRING *) * i)) ; + int j = 0 ; + + while (j < i) *sp++ = split_buff[j++] ; + + cp->type = C_REPLV ; + cp->vcnt = i ; + } + zfree(xbuff, sval->len + 1) ; + return cp ; +} + +/* free memory used by a replacement CELL */ + +void +repl_destroy(cp) + register CELL *cp ; +{ + register STRING **p ; + unsigned cnt ; + + if (cp->type == C_REPL) free_STRING(string(cp)) ; + else /* an C_REPLV */ + { + p = (STRING **) cp->ptr ; + for (cnt = cp->vcnt; cnt; cnt--) + { + if (*p) { free_STRING(*p) ; } + p++ ; + } + zfree(cp->ptr, cp->vcnt * sizeof(STRING *)) ; + } +} + +/* copy a C_REPLV cell to another CELL */ + +CELL * +replv_cpy(target, source) + CELL *target, *source ; +{ + STRING **t, **s ; + unsigned cnt ; + + target->type = C_REPLV ; + cnt = target->vcnt = source->vcnt ; + target->ptr = (PTR) zmalloc(cnt * sizeof(STRING *)) ; + + t = (STRING **) target->ptr ; + s = (STRING **) source->ptr ; + while (cnt) + { + cnt-- ; + if ( *s ) (*s)->ref_cnt++ ; + *t++ = *s++ ; + } + return target ; +} + + +/* here's our old friend linked linear list with move to the front + for compilation of replacement CELLs */ + +typedef struct repl_node +{ + struct repl_node *link ; + STRING *sval ; /* the input */ + CELL *cp ; /* the output */ +} REPL_NODE ; + +static REPL_NODE *repl_list ; + +/* search the list (with move to the front) for a compiled + separator. 
+ return a ptr to a CELL (C_REPL or C_REPLV) +*/ + +CELL * +repl_compile(sval) + STRING *sval ; +{ + register REPL_NODE *p ; + REPL_NODE *q ; + char *s ; + + /* search the list */ + s = sval->str ; + p = repl_list ; + q = (REPL_NODE *) 0 ; + while (p) + { + if (strcmp(s, p->sval->str) == 0) /* found */ + { + if (!q) /* already at front */ + return p->cp ; + else /* delete from list for move to front */ + { + q->link = p->link ; + goto found ; + } + + } + else + { + q = p ; p = p->link ; + } + } + + /* not found */ + p = ZMALLOC(REPL_NODE) ; + p->sval = sval ; + sval->ref_cnt++ ; + p->cp = REPL_compile(sval) ; + +found : +/* insert p at the front of the list */ + p->link = repl_list ; repl_list = p ; + return p->cp ; +} + +/* return the string for a CELL or type REPL or REPLV, + this is only used by da() */ + + +char * +repl_uncompile(cp) + CELL *cp ; +{ + register REPL_NODE *p = repl_list ; + + if (cp->type == C_REPL) + { + while (p) + { + if (p->cp->type == C_REPL && p->cp->ptr == cp->ptr) + return p->sval->str ; + else p = p->link ; + } + } + else + { + while (p) + { + if (p->cp->type == C_REPLV && + memcmp(cp->ptr, p->cp->ptr, cp->vcnt * sizeof(STRING *)) + == 0) + return p->sval->str ; + else p = p->link ; + } + } + +#if DEBUG + bozo("unable to uncompile an repl") ; +#else + return NULL; +#endif +} + +/* + convert a C_REPLV to C_REPL + replacing the &s with sval +*/ + +CELL * +replv_to_repl(cp, sval) +CELL *cp ; STRING *sval ; +{ + register STRING **p ; + STRING **sblock = (STRING **) cp->ptr ; + unsigned cnt, vcnt = cp->vcnt ; + unsigned len ; + char *target ; + +#ifdef DEBUG + if (cp->type != C_REPLV) bozo("not replv") ; +#endif + + p = sblock ; cnt = vcnt ; len = 0 ; + while (cnt--) + { + if (*p) len += (*p++)->len ; + else + { + *p++ = sval ; + sval->ref_cnt++ ; + len += sval->len ; + } + } + cp->type = C_REPL ; + cp->ptr = (PTR) new_STRING0(len) ; + + p = sblock ; cnt = vcnt ; target = string(cp)->str ; + while (cnt--) + { + memcpy(target, (*p)->str, 
(*p)->len) ; + target += (*p)->len ; + free_STRING(*p) ; + p++ ; + } + + zfree(sblock, vcnt * sizeof(STRING *)) ; + return cp ; +} diff --git a/re_cmpl.o b/re_cmpl.o new file mode 100644 index 0000000..8b41c76 Binary files /dev/null and b/re_cmpl.o differ diff --git a/regexp.h b/regexp.h new file mode 100644 index 0000000..8b57740 --- /dev/null +++ b/regexp.h @@ -0,0 +1,32 @@ + +/******************************************** +regexp.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/*$Log: regexp.h,v $ + * Revision 1.1.1.1 1993/07/03 18:58:19 mike + * move source to cvs + * + * Revision 5.1 1991/12/05 07:59:30 brennan + * 1.1 pre-release + * +*/ + +#include + +PTR PROTO( REcompile , (char *) ) ; +int PROTO( REtest, (char *, PTR) ) ; +char *PROTO( REmatch, (char *, PTR, unsigned *) ) ; +void PROTO( REmprint, (PTR , FILE*) ) ; + +extern int REerrno ; +extern char *REerrlist[] ; + + diff --git a/repl.h b/repl.h new file mode 100644 index 0000000..3410db4 --- /dev/null +++ b/repl.h @@ -0,0 +1,37 @@ + +/******************************************** +repl.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/*$Log: repl.h,v $ + * Revision 1.1.1.1 1993/07/03 18:58:19 mike + * move source to cvs + * + * Revision 5.1 1991/12/05 07:59:32 brennan + * 1.1 pre-release + * +*/ + +/* repl.h */ + +#ifndef REPL_H +#define REPL_H + +PTR PROTO( re_compile, (STRING *) ) ; +char *PROTO( re_uncompile, (PTR) ) ; + + +CELL *PROTO( repl_compile, (STRING *) ) ; +char *PROTO( repl_uncompile, (CELL *) ) ; +void PROTO( repl_destroy, (CELL *) ) ; +CELL *PROTO( replv_cpy, (CELL *, CELL *) ) ; +CELL *PROTO( replv_to_repl, (CELL *, STRING *) ) ; + +#endif diff --git a/rexp/.done b/rexp/.done new file mode 100644 index 0000000..e69de29 diff --git a/rexp/Makefile b/rexp/Makefile new file mode 100644 index 0000000..4f2a5ff --- /dev/null +++ b/rexp/Makefile @@ -0,0 +1,24 @@ + +#################################### +# This is a makefile for mawk, +# an implementation of AWK (1988). +#################################### +# +# + +CC = cc +CFLAGS = -O -DMAWK -I.. + +O=rexp.o rexp0.o rexp1.o rexp2.o rexp3.o +DB=rexpdb.o + +all : $(O) + @cat .done + +debug : $(O) $(DB) + @cat .done + +$(O) : rexp.h + +clean : + rm -f *.o .done diff --git a/rexp/rexp.c b/rexp/rexp.c new file mode 100644 index 0000000..a47af2a --- /dev/null +++ b/rexp/rexp.c @@ -0,0 +1,238 @@ + +/******************************************** +rexp.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/*$Log: rexp.c,v $ + *Revision 1.3 1996/09/02 18:47:36 mike + *Make ^* and ^+ syntax errors. 
+ * + *Revision 1.2 1993/07/23 13:21:32 mike + *cleanup rexp code + * + * Revision 1.1.1.1 1993/07/03 18:58:26 mike + * move source to cvs + * + * Revision 3.4 1991/08/13 09:09:59 brennan + * VERSION .9994 + * + * Revision 3.3 91/08/04 15:45:03 brennan + * no longer attempt to recover mem on failed REcompile + * Its not worth the effort + * + * Revision 3.2 91/08/03 07:24:06 brennan + * check for empty machine stack (missing operand) wasn't quite right + * + * Revision 3.1 91/06/07 10:33:16 brennan + * VERSION 0.995 + * + * Revision 1.7 91/06/05 08:58:47 brennan + * change RE_free to free + * + * Revision 1.6 91/06/03 07:07:17 brennan + * moved parser stacks inside REcompile + * removed unnecessary copying + * +*/ + +/* op precedence parser for regular expressions */ + +#include "rexp.h" + + +/* DATA */ +int REerrno ; +char *REerrlist[] = +{(char *) 0, + /* 1 */ "missing '('", + /* 2 */ "missing ')'", + /* 3 */ "bad class -- [], [^] or [", + /* 4 */ "missing operand", + /* 5 */ "resource exhaustion -- regular expression too large" , + /* 6 */ "syntax error ^* or ^+" +} ; +/* E5 is very unlikely to occur */ + + +/* This table drives the operator precedence parser */ +static short table[8][8] = { + +/* 0 | CAT * + ? ( ) */ +/* 0 */ {0, L, L, L, L, L, L, E1}, +/* | */ {G, G, L, L, L, L, L, G}, +/* CAT*/ {G, G, G, L, L, L, L, G}, +/* * */ {G, G, G, G, G, G, E7, G}, +/* + */ {G, G, G, G, G, G, E7, G}, +/* ? 
*/ {G, G, G, G, G, G, E7, G}, +/* ( */ {E2, L, L, L, L, L, L, EQ}, +/* ) */ {G , G, G, G, G, G, E7, G} } ; + + +#define STACKSZ 64 + + +static jmp_buf err_buf ; /* used to trap on error */ + +void +RE_error_trap(x) + int x ; +{ + REerrno = x ; + longjmp(err_buf, 1) ; +} + + +PTR +REcompile(re) + char *re ; +{ + MACHINE m_stack[STACKSZ] ; + struct op + { + int token ; + int prec ; + } + op_stack[STACKSZ] ; + register MACHINE *m_ptr ; + register struct op *op_ptr ; + register int t ; + + /* do this first because it also checks if we have a + run time stack */ + RE_lex_init(re) ; + + if (*re == 0) + { + STATE *p = (STATE *) RE_malloc(sizeof(STATE)) ; + p->type = M_ACCEPT ; + return (PTR) p ; + } + + if (setjmp(err_buf)) return (PTR) 0 ; + /* we used to try to recover memory left on machine stack ; + but now m_ptr is in a register so it won't be right unless + we force it out of a register which isn't worth the trouble */ + + /* initialize the stacks */ + m_ptr = m_stack - 1 ; + op_ptr = op_stack ; + op_ptr->token = 0 ; + + t = RE_lex(m_stack) ; + + while (1) + { + switch (t) + { + case T_STR: + case T_ANY: + case T_U: + case T_START: + case T_END: + case T_CLASS: + m_ptr++ ; + break ; + + case 0: /* end of reg expr */ + if (op_ptr->token == 0) + { + /* done */ + if (m_ptr == m_stack) return (PTR) m_ptr->start ; + else + { + /* machines still on the stack */ + RE_panic("values still on machine stack") ; + } + } + + /* otherwise fall thru to default + which is operator case */ + + default: + + if ((op_ptr->prec = table[op_ptr->token][t]) == G) + { + do + { /* op_pop */ + + if (op_ptr->token <= T_CAT) /*binary op*/ + m_ptr-- ; + /* if not enough values on machine stack + then we have a missing operand */ + if (m_ptr < m_stack) RE_error_trap(-E4) ; + + switch (op_ptr->token) + { + case T_CAT: + RE_cat(m_ptr, m_ptr + 1) ; + break ; + + case T_OR: + RE_or(m_ptr, m_ptr + 1) ; + break ; + + case T_STAR: + RE_close(m_ptr) ; + break ; + + case T_PLUS: + RE_poscl(m_ptr) ; + 
break ; + + case T_Q: + RE_01(m_ptr) ; + break ; + + default: + /*nothing on ( or ) */ + break ; + } + + op_ptr-- ; + } + while (op_ptr->prec != L); + + continue ; /* back thru switch at top */ + } + + if (op_ptr->prec < 0) + { + if (op_ptr->prec == E7) RE_panic("parser returns E7") ; + else RE_error_trap(-op_ptr->prec) ; + } + + if (++op_ptr == op_stack + STACKSZ) + { + /* stack overflow */ + RE_error_trap(-E5) ; + } + + op_ptr->token = t ; + } /* end of switch */ + + if (m_ptr == m_stack + (STACKSZ - 1)) + { + /*overflow*/ + RE_error_trap(-E5) ; + } + + t = RE_lex(m_ptr + 1) ; + } +} + + +/* getting here means a logic flaw or unforeseen case */ +void +RE_panic(s) + char *s ; +{ + fprintf(stderr, "REcompile() - panic: %s\n", s) ; + exit(100) ; +} diff --git a/rexp/rexp.h b/rexp/rexp.h new file mode 100644 index 0000000..947b916 --- /dev/null +++ b/rexp/rexp.h @@ -0,0 +1,163 @@ + +/******************************************** +rexp.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/*$Log: rexp.h,v $ + * Revision 1.2 1993/07/23 13:21:35 mike + * cleanup rexp code + * + * Revision 1.1.1.1 1993/07/03 18:58:27 mike + * move source to cvs + * + * Revision 3.6 1992/01/21 17:31:45 brennan + * moved ison() macro out of rexp[23].c + * + * Revision 3.5 91/10/29 10:53:55 brennan + * SIZE_T + * + * Revision 3.4 91/08/13 09:10:02 brennan + * VERSION .9994 + * + * Revision 3.3 91/06/15 09:40:25 brennan + * gcc defines __STDC__ but might not have stdlib.h + * + * Revision 3.2 91/06/10 16:18:19 brennan + * changes for V7 + * + * Revision 3.1 91/06/07 10:33:18 brennan + * VERSION 0.995 + * + * Revision 1.3 91/06/05 08:57:57 brennan + * removed RE_xmalloc() + * + * Revision 1.2 91/06/03 07:23:26 brennan + * changed type of RE_error_trap + * + * Revision 1.1 91/06/03 07:05:41 brennan + * Initial revision + * +*/ + +#ifndef REXP_H +#define REXP_H + +#include "nstd.h" +#include +#include + +PTR PROTO( RE_malloc, (unsigned) ) ; +PTR PROTO( RE_realloc, (void *,unsigned) ) ; + + +/* finite machine state types */ + +#define M_STR 0 +#define M_CLASS 1 +#define M_ANY 2 +#define M_START 3 +#define M_END 4 +#define M_U 5 +#define M_1J 6 +#define M_2JA 7 +#define M_2JB 8 +#define M_ACCEPT 9 +#define U_ON 10 + +#define U_OFF 0 +#define END_OFF 0 +#define END_ON (2*U_ON) + + +typedef unsigned char BV[32] ; /* bit vector */ + +typedef struct +{ char type ; + unsigned char len ; /* used for M_STR */ + union + { + char *str ; /* string */ + BV *bvp ; /* class */ + int jump ; + } data ; +} STATE ; + +#define STATESZ (sizeof(STATE)) + +typedef struct +{ STATE *start, *stop ; } MACHINE ; + + +/* tokens */ +#define T_OR 1 /* | */ +#define T_CAT 2 +#define T_STAR 3 /* * */ +#define T_PLUS 4 /* + */ +#define T_Q 5 /* ? */ +#define T_LP 6 /* ( */ +#define T_RP 7 /* ) */ +#define T_START 8 /* ^ */ +#define T_END 9 /* $ */ +#define T_ANY 10 /* . 
*/ +#define T_CLASS 11 /* starts with [ */ +#define T_SLASH 12 /* \ */ +#define T_CHAR 13 /* all the rest */ +#define T_STR 14 +#define T_U 15 + +/* precedences and error codes */ +#define L 0 +#define EQ 1 +#define G 2 +#define E1 (-1) +#define E2 (-2) +#define E3 (-3) +#define E4 (-4) +#define E5 (-5) +#define E6 (-6) +#define E7 (-7) + +#define MEMORY_FAILURE 5 + +#define ison(b,x) ((b)[((unsigned char)(x))>>3] & (1<<((x)&7))) + +/* struct for the run time stack */ +typedef struct { +STATE *m ; /* save the machine ptr */ +int u ; /* save the u_flag */ +char *s ; /* save the active string ptr */ +char *ss ; /* save the match start -- only used by REmatch */ +} RT_STATE ; /* run time state */ + +/* error trap */ +extern int REerrno ; +void PROTO(RE_error_trap, (int) ) ; + + +MACHINE PROTO( RE_u, (void) ) ; +MACHINE PROTO( RE_start, (void) ) ; +MACHINE PROTO( RE_end, (void) ) ; +MACHINE PROTO( RE_any, (void) ) ; +MACHINE PROTO( RE_str, (char *, unsigned) ) ; +MACHINE PROTO( RE_class, (BV *) ) ; +void PROTO( RE_cat, (MACHINE *, MACHINE *) ) ; +void PROTO( RE_or, (MACHINE *, MACHINE *) ) ; +void PROTO( RE_close, (MACHINE *) ) ; +void PROTO( RE_poscl, (MACHINE *) ) ; +void PROTO( RE_01, (MACHINE *) ) ; +void PROTO( RE_panic, (char *) ) __attribute__((noreturn)) ; +char *PROTO( str_str, (char *, char *, unsigned) ) ; + +void PROTO( RE_lex_init , (char *) ) ; +int PROTO( RE_lex , (MACHINE *) ) ; +void PROTO( RE_run_stack_init, (void) ) ; +RT_STATE *PROTO( RE_new_run_stack, (void) ) ; + +#endif /* REXP_H */ diff --git a/rexp/rexp.o b/rexp/rexp.o new file mode 100644 index 0000000..f02a223 Binary files /dev/null and b/rexp/rexp.o differ diff --git a/rexp/rexp0.c b/rexp/rexp0.c new file mode 100644 index 0000000..9068632 --- /dev/null +++ b/rexp/rexp0.c @@ -0,0 +1,633 @@ + +/******************************************** +rexp0.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. 
+ +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/*$Log: rexp0.c,v $ + *Revision 1.5 1996/11/08 15:39:27 mike + *While cleaning up block_on, I introduced a bug. Now fixed. + * + *Revision 1.4 1996/09/02 18:47:09 mike + *Allow []...] and [^]...] to put ] in a class. + *Make ^* and ^+ syntax errors. + * + * Revision 1.3 1994/12/26 16:37:52 mike + * 1.1.2 fix to do_str was incomplete -- fix it + * + * Revision 1.2 1993/07/23 13:21:38 mike + * cleanup rexp code + * + * Revision 1.1.1.1 1993/07/03 18:58:27 mike + * move source to cvs + * + * Revision 3.8 1992/04/21 20:22:38 brennan + * 1.1 patch2 + * [c1-c2] now works if c2 is an escaped character + * + * Revision 3.7 1992/03/24 09:33:12 brennan + * 1.1 patch2 + * When backing up in do_str, check if last character was escaped + * + * Revision 3.6 92/01/21 17:32:51 brennan + * added some casts so that character classes work with signed chars + * + * Revision 3.5 91/10/29 10:53:57 brennan + * SIZE_T + * + * Revision 3.4 91/08/13 09:10:05 brennan + * VERSION .9994 + * + * Revision 3.3 91/07/19 07:29:24 brennan + * backslash at end of regular expression now stands for backslash + * + * Revision 3.2 91/07/19 06:58:23 brennan + * removed small bozo in processing of escape characters + * + * Revision 3.1 91/06/07 10:33:20 brennan + * VERSION 0.995 + * + * Revision 1.2 91/06/05 08:59:36 brennan + * changed RE_free to free + * + * Revision 1.1 91/06/03 07:10:15 brennan + * Initial revision + * +*/ + +/* lexical scanner */ + +#include "rexp.h" + +/* static functions */ +static int PROTO(do_str, (int, char **, MACHINE *)) ; +static int PROTO(do_class, (char **, MACHINE *)) ; +static int PROTO(escape, (char **)) ; +static BV *PROTO(store_bvp, (BV *)) ; +static int PROTO(ctohex, (int)) ; + + +#ifndef EG +/* make next array visible */ +static +#endif +char RE_char2token[ '|' + 1 ] = { 
+0,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13, +13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,9,13,13,13, +6,7,3,4,13,13,10,13,13,13,13,13,13,13,13,13,13,13,13,13,13, +13,13,5,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13, +13,13,13,13,13,13,13,13,13,13,11,12,13,8,13,13,13,13,13,13, +13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13, +13,13,13,13,1} ; + +#define char2token(x) \ +( (unsigned char)(x) > '|' ? T_CHAR : RE_char2token[x] ) + +#define NOT_STARTED (-1) + +static int prev ; +static char *lp ; /* ptr to reg exp string */ +static unsigned re_len ; + + +void +RE_lex_init(re) + char *re ; +{ + lp = re ; + re_len = strlen(re) + 1 ; + prev = NOT_STARTED ; + RE_run_stack_init() ; +} + + +int +RE_lex(mp) + MACHINE *mp ; +{ + register int c ; + + switch (c = char2token(*lp)) + { + case T_PLUS: + case T_STAR: + if (prev == T_START) RE_error_trap(6) ; + /* fall thru */ + + case T_OR: + case T_Q: + case T_RP: + lp++ ; return prev = c ; + + case T_SLASH: + break ; + + case 0: + return 0 ; + + case T_LP: + switch (prev) + { + case T_CHAR: + case T_STR: + case T_ANY: + case T_CLASS: + case T_START: + case T_RP: + case T_PLUS: + case T_STAR: + case T_Q: + case T_U: + return prev = T_CAT ; + + default: + lp++ ; + return prev = T_LP ; + } + } + + /* *lp is an operand, but implicit cat op is possible */ + switch (prev) + { + case NOT_STARTED: + case T_OR: + case T_LP: + case T_CAT: + + switch (c) + { + case T_ANY: + { + static int plus_is_star_flag = 0 ; + + if (*++lp == '*') + { + lp++ ; + *mp = RE_u() ; + return prev = T_U ; + } + else if (*lp == '+') + { + if (plus_is_star_flag) + { + lp++ ; + *mp = RE_u() ; + plus_is_star_flag = 0 ; + return prev = T_U ; + } + else + { + plus_is_star_flag = 1 ; + lp-- ; + *mp = RE_any() ; + return prev = T_ANY ; + } + } + else + { + *mp = RE_any() ; + prev = T_ANY ; + } + } + break ; + + case T_SLASH: + lp++ ; + c = escape(&lp) ; + prev = do_str(c, &lp, mp) ; + break ; + + case T_CHAR: + c = *lp++ ; + prev = 
do_str(c, &lp, mp) ; + break ; + + case T_CLASS: + prev = do_class(&lp, mp) ; + break ; + + case T_START: + *mp = RE_start() ; + lp++ ; + prev = T_START ; + break ; + + case T_END: + lp++ ; + *mp = RE_end() ; + return prev = T_END ; + + default: + RE_panic("bad switch in RE_lex") ; + } + break ; + + default: + /* don't advance the pointer */ + return prev = T_CAT ; + } + + /* check for end character */ + if (*lp == '$') + { + mp->start->type += END_ON ; + lp++ ; + } + + return prev ; +} + +/* + Collect a run of characters into a string machine. + If the run ends at *,+, or ?, then don't take the last + character unless the string has length one. +*/ + +static int +do_str(c, pp, mp) + int c ; /* the first character */ + char **pp ; /* where to put the re_char pointer on exit */ + MACHINE *mp ; /* where to put the string machine */ +{ + register char *p ; /* runs thru the input */ + char *pt ; /* trails p by one */ + char *str ; /* collect it here */ + register char *s ; /* runs thru the output */ + unsigned len ; /* length collected */ + + + p = *pp ; + s = str = RE_malloc(re_len) ; + *s++ = c ; len = 1 ; + + while (1) + { + char *save ; + + switch (char2token(*p)) + { + case T_CHAR: + pt = p ; + *s++ = *p++ ; + break ; + + case T_SLASH: + pt = p ; + save = p+1 ; /* keep p in a register */ + *s++ = escape(&save) ; + p = save ; + break ; + + default: + goto out ; + } + len++ ; + } + +out: + /* if len > 1 and we stopped on a ? 
+ or * , need to back up */ + if (len > 1 && (*p == '*' || *p == '+' || *p == '?')) + { + len-- ; + p = pt ; + s-- ; + } + + *s = 0 ; + *pp = p ; + *mp = RE_str((char *) RE_realloc(str, len + 1), len) ; + return T_STR ; +} + + +/*-------------------------------------------- + BUILD A CHARACTER CLASS + *---------------------------*/ + +#define on( b, x) ((b)[(x)>>3] |= ( 1 << ((x)&7) )) + +static void PROTO(block_on, (BV, int, int)) ; + +static void +block_on(b, x, y) + BV b ; + int x, y ; + /* caller makes sure x<=y and x>0 y>0 */ +{ + int lo = x >> 3 ; + int hi = y >> 3 ; + int r_lo = x&7 ; + int r_hi = y&7 ; + + if (lo == hi) + { + b[lo] |= (1<<(r_hi+1)) - (1<<r_lo) ; + } [...] +#define isoctal(x) ((x)>='0'&&(x)<='7') + +#define NOT_HEX 16 +static char hex_val['f' - 'A' + 1] = +{ + 10, 11, 12, 13, 14, 15, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 10, 11, 12, 13, 14, 15} ; + +/* interpret 1 character as hex */ +static int +ctohex(c) + register int c ; +{ + int t ; + + if (c >= '0' && c <= '9') return c - '0' ; + if (c >= 'A' && c <= 'f' && (t = hex_val[c - 'A'])) return t ; + return NOT_HEX ; +} + +#define ET_END 7 + +static struct +{ + char in, out ; +} +escape_test[ET_END + 1] = +{ + {'n', '\n'}, + {'t', '\t'}, + {'f', '\f'}, + {'b', '\b'}, + {'r', '\r'}, + {'a', '\07'}, + {'v', '\013'}, + {0, 0} +} ; + + +/*----------------- + return the char + and move the pointer forward + on entry *s -> at the character after the slash + *-------------------*/ + +static int +escape(start_p) + char **start_p ; +{ + register char *p = *start_p ; + register unsigned x ; + unsigned xx ; + int i ; + + + escape_test[ET_END].in = *p ; + i = 0 ; + while (escape_test[i].in != *p) i++ ; + if (i != ET_END) + { + /* in escape_test table */ + *start_p = p + 1 ; + return escape_test[i].out ; + } + + if (isoctal(*p)) + { + x = *p++ - '0' ; + if (isoctal(*p)) + { + x = (x << 3) + *p++ - '0' ; + if (isoctal(*p)) x = (x << 3) + *p++ - '0' ; + } + *start_p = p ; + return x & 0xff ; + } + + if 
(*p == 0) return '\\' ; + + if (*p++ == 'x') + { + if ((x = ctohex(*p)) == NOT_HEX) + { + *start_p = p ; return 'x' ; + } + + /* look for another hex digit */ + if ((xx = ctohex(*++p)) != NOT_HEX) + { + x = (x<<4) + xx ; p++ ; + } + + *start_p = p ; return x ; + } + + /* anything else \c -> c */ + *start_p = p ; + return *(unsigned char *) (p - 1) ; +} + diff --git a/rexp/rexp0.o b/rexp/rexp0.o new file mode 100644 index 0000000..8366070 Binary files /dev/null and b/rexp/rexp0.o differ diff --git a/rexp/rexp1.c b/rexp/rexp1.c new file mode 100644 index 0000000..c6a4317 --- /dev/null +++ b/rexp/rexp1.c @@ -0,0 +1,246 @@ + +/******************************************** +rexp1.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/*$Log: rexp1.c,v $ + * Revision 1.3 1993/07/24 17:55:10 mike + * more cleanup + * + * Revision 1.2 1993/07/23 13:21:41 mike + * cleanup rexp code + * + * Revision 1.1.1.1 1993/07/03 18:58:27 mike + * move source to cvs + * + * Revision 3.4 1992/02/20 16:08:12 brennan + * change new_TWO() to work around sun acc bug + * + * Revision 3.3 91/10/29 10:54:01 brennan + * SIZE_T + * + * Revision 3.2 91/08/13 09:10:11 brennan + * VERSION .9994 + * + * Revision 3.1 91/06/07 10:33:22 brennan + * VERSION 0.995 + * +*/ + +/* re machine operations */ + +#include "rexp.h" + +static void PROTO(new_TWO, (int, MACHINE *)) ; + + + +/* initialize a two state machine */ +static void +new_TWO(type, mp) + int type ; + MACHINE *mp ; /* init mp-> */ +{ + mp->start = (STATE *) RE_malloc(2 * STATESZ) ; + mp->stop = mp->start + 1 ; + mp->start->type = type ; + mp->stop->type = M_ACCEPT ; +} + +/* build a machine that recognizes any */ +MACHINE +RE_any() +{ + MACHINE x ; + + new_TWO(M_ANY, &x) ; + return x ; +} + +/* build a machine that 
recognizes the start of string */ +MACHINE +RE_start() +{ + MACHINE x ; + + new_TWO(M_START, &x) ; + return x ; +} + +MACHINE +RE_end() +{ + MACHINE x ; + + new_TWO(M_END, &x) ; + return x ; +} + +/* build a machine that recognizes a class */ +MACHINE +RE_class(bvp) + BV *bvp ; +{ + MACHINE x ; + + new_TWO(M_CLASS, &x) ; + x.start->data.bvp = bvp ; + return x ; +} + +MACHINE +RE_u() +{ + MACHINE x ; + + new_TWO(M_U, &x) ; + return x ; +} + +MACHINE +RE_str(str, len) + char *str ; + unsigned len ; +{ + MACHINE x ; + + new_TWO(M_STR, &x) ; + x.start->len = len ; + x.start->data.str = str ; + return x ; +} + + +/* replace m and n by a machine that recognizes mn */ +void +RE_cat(mp, np) + MACHINE *mp, *np ; +{ + unsigned sz1, sz2, sz ; + + sz1 = mp->stop - mp->start ; + sz2 = np->stop - np->start + 1 ; + sz = sz1 + sz2 ; + + mp->start = (STATE *) RE_realloc(mp->start, sz * STATESZ) ; + mp->stop = mp->start + (sz - 1) ; + memcpy(mp->start + sz1, np->start, sz2 * STATESZ) ; + free(np->start) ; +} + + /* replace m by a machine that recognizes m|n */ + +void +RE_or(mp, np) + MACHINE *mp, *np ; +{ + register STATE *p ; + unsigned szm, szn ; + + szm = mp->stop - mp->start + 1 ; + szn = np->stop - np->start + 1 ; + + p = (STATE *) RE_malloc((szm + szn + 1) * STATESZ) ; + memcpy(p + 1, mp->start, szm * STATESZ) ; + free(mp->start) ; + mp->start = p ; + (mp->stop = p + szm + szn)->type = M_ACCEPT ; + p->type = M_2JA ; + p->data.jump = szm + 1 ; + memcpy(p + szm + 1, np->start, szn * STATESZ) ; + free(np->start) ; + (p += szm)->type = M_1J ; + p->data.jump = szn ; +} + +/* UNARY OPERATIONS */ + +/* replace m by m* */ + +void +RE_close(mp) + MACHINE *mp ; +{ + register STATE *p ; + unsigned sz ; + + sz = mp->stop - mp->start + 1 ; + p = (STATE *) RE_malloc((sz + 2) * STATESZ) ; + memcpy(p + 1, mp->start, sz * STATESZ) ; + free(mp->start) ; + mp->start = p ; + mp->stop = p + (sz + 1) ; + p->type = M_2JA ; + p->data.jump = sz + 1 ; + (p += sz)->type = M_2JB ; + p->data.jump = -(sz 
- 1) ; + (p + 1)->type = M_ACCEPT ; +} + +/* replace m by m+ (positive closure) */ + +void +RE_poscl(mp) + MACHINE *mp ; +{ + register STATE *p ; + unsigned sz ; + + sz = mp->stop - mp->start + 1 ; + mp->start = p = (STATE *) RE_realloc(mp->start, (sz + 1) * STATESZ) ; + mp->stop = p + sz ; + p += --sz ; + p->type = M_2JB ; + p->data.jump = -sz ; + (p + 1)->type = M_ACCEPT ; +} + +/* replace m by m? (zero or one) */ + +void +RE_01(mp) + MACHINE *mp ; +{ + unsigned sz ; + register STATE *p ; + + sz = mp->stop - mp->start + 1 ; + p = (STATE *) RE_malloc((sz + 1) * STATESZ) ; + memcpy(p + 1, mp->start, sz * STATESZ) ; + free(mp->start) ; + mp->start = p ; + mp->stop = p + sz ; + p->type = M_2JB ; + p->data.jump = sz ; +} + +/*=================================== +MEMORY ALLOCATION + *==============================*/ + + +PTR +RE_malloc(sz) + unsigned sz ; +{ + register PTR p ; + + if (!(p = malloc(sz))) RE_error_trap(MEMORY_FAILURE) ; + return p ; +} + +PTR +RE_realloc(p, sz) + register PTR p ; + unsigned sz ; +{ + if (!(p = realloc(p, sz))) RE_error_trap(MEMORY_FAILURE) ; + return p ; +} diff --git a/rexp/rexp1.o b/rexp/rexp1.o new file mode 100644 index 0000000..1f12df3 Binary files /dev/null and b/rexp/rexp1.o differ diff --git a/rexp/rexp2.c b/rexp/rexp2.c new file mode 100644 index 0000000..c46a1ac --- /dev/null +++ b/rexp/rexp2.c @@ -0,0 +1,376 @@ + +/******************************************** +rexp2.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/*$Log: rexp2.c,v $ + * Revision 1.3 1993/07/24 17:55:12 mike + * more cleanup + * + * Revision 1.2 1993/07/23 13:21:44 mike + * cleanup rexp code + * + * Revision 1.1.1.1 1993/07/03 18:58:28 mike + * move source to cvs + * + * Revision 3.8 1992/12/24 00:36:44 mike + * fixed major bozo for LMDOS when growing stack + * fixed potential LMDOS bozo with M_STR+U_ON+END_ON + * fixed minor bug in M_CLASS+U_ON+END_ON + * + * Revision 3.7 92/01/21 17:33:15 brennan + * added some casts so that character classes work with signed chars + * + * Revision 3.6 91/10/29 10:54:03 brennan + * SIZE_T + * + * Revision 3.5 91/08/13 09:10:15 brennan + * VERSION .9994 + * + * Revision 3.4 91/08/08 07:53:34 brennan + * work around for turboC realloc() bug + * + * Revision 3.4 91/08/07 07:10:47 brennan + * work around for TurboC realloc() bug + * + * Revision 3.3 91/08/04 15:45:57 brennan + * minor change for large model dos + * + * Revision 3.2 91/06/10 16:18:14 brennan + * changes for V7 + * + * Revision 3.1 91/06/07 10:33:25 brennan + * VERSION 0.995 + * + * Revision 1.8 91/06/05 09:01:33 brennan + * changes to RE_new_run_stack + * + * Revision 1.7 91/05/31 10:56:02 brennan + * stack_empty hack for DOS large model + * +*/ + + + +/* test a string against a machine */ + +#include "rexp.h" + +#define STACKGROWTH 16 + +#ifdef DEBUG +static RT_STATE *PROTO(slow_push, (RT_STATE *, STATE *, char *, int)) ; +#endif + + +RT_STATE *RE_run_stack_base ; +RT_STATE *RE_run_stack_limit ; + +/* Large model DOS segment arithmetic breaks the current stack. 
+ This hack fixes it without rewriting the whole thing, 5/31/91 */ +RT_STATE *RE_run_stack_empty ; + +void +RE_run_stack_init() +{ + if (!RE_run_stack_base) + { + RE_run_stack_base = (RT_STATE *) + RE_malloc(sizeof(RT_STATE) * STACKGROWTH) ; + RE_run_stack_limit = RE_run_stack_base + STACKGROWTH ; + RE_run_stack_empty = RE_run_stack_base - 1 ; + } +} + +/* sometimes during REmatch(), this stack can grow pretty large. + In real life cases, the back tracking usually fails. Some + work is needed here to improve the algorithm. + I.e., figure out how not to stack useless paths. +*/ + +RT_STATE * +RE_new_run_stack() +{ + int oldsize = RE_run_stack_limit - RE_run_stack_base ; + int newsize = oldsize + STACKGROWTH ; + +#ifdef LMDOS /* large model DOS */ + /* have to worry about overflow on multiplication (ugh) */ + if (newsize >= 4096) RE_run_stack_base = (RT_STATE *) 0 ; + else +#endif + + RE_run_stack_base = (RT_STATE *) realloc(RE_run_stack_base, + newsize * sizeof(RT_STATE)) ; + + if (!RE_run_stack_base) + { + fprintf(stderr, "out of memory for RE run time stack\n") ; + /* this is pretty unusual, I've only seen it happen on + weird input to REmatch() under 16bit DOS , the same + situation worked easily on 32bit machine. 
*/ + exit(100) ; + } + + RE_run_stack_limit = RE_run_stack_base + newsize ; + RE_run_stack_empty = RE_run_stack_base - 1 ; + + /* return the new stackp */ + return RE_run_stack_base + oldsize ; +} + +#ifdef DEBUG +static RT_STATE * +slow_push(sp, m, s, u) + RT_STATE *sp ; + STATE *m ; + char *s ; + int u ; +{ + if (sp == RE_run_stack_limit) sp = RE_new_run_stack() ; + sp->m = m ; sp->s = s ; sp->u = u ; + return sp ; +} +#endif + +#ifdef DEBUG +#define push(mx,sx,ux) stackp = slow_push(++stackp, mx, sx, ux) +#else +#define push(mx,sx,ux) if (++stackp == RE_run_stack_limit)\ + stackp = RE_new_run_stack() ;\ +stackp->m=(mx);stackp->s=(sx);stackp->u=(ux) +#endif + + +#define CASE_UANY(x) case x + U_OFF : case x + U_ON + +/* test if str ~ /machine/ +*/ + +int +REtest(str, machine) + char *str ; + PTR machine ; +{ + register STATE *m = (STATE *) machine ; + register char *s = str ; + register RT_STATE *stackp ; + int u_flag ; + char *str_end ; + int t ; /*convenient temps */ + STATE *tm ; + + /* handle the easy case quickly */ + if ((m + 1)->type == M_ACCEPT && m->type == M_STR) + return str_str(s, m->data.str, m->len) != (char *) 0 ; + else + { + u_flag = U_ON ; str_end = (char *) 0 ; + stackp = RE_run_stack_empty ; + goto reswitch ; + } + +refill : + if (stackp == RE_run_stack_empty) return 0 ; + m = stackp->m ; + s = stackp->s ; + u_flag = stackp--->u ; + + +reswitch : + + switch (m->type + u_flag) + { + case M_STR + U_OFF + END_OFF: + if (strncmp(s, m->data.str, m->len)) goto refill ; + s += m->len ; m++ ; + goto reswitch ; + + case M_STR + U_OFF + END_ON: + if (strcmp(s, m->data.str)) goto refill ; + s += m->len ; m++ ; + goto reswitch ; + + case M_STR + U_ON + END_OFF: + if (!(s = str_str(s, m->data.str, m->len))) goto refill ; + push(m, s + 1, U_ON) ; + s += m->len ; m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_STR + U_ON + END_ON: + if (!str_end) str_end = s + strlen(s) ; + t = (str_end - s) - m->len ; + if (t < 0 || memcmp(s + t, m->data.str, m->len)) + 
goto refill ; + s = str_end ; m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_CLASS + U_OFF + END_OFF: + if (!ison(*m->data.bvp, s[0])) goto refill ; + s++ ; m++ ; + goto reswitch ; + + case M_CLASS + U_OFF + END_ON: + if (s[1] || !ison(*m->data.bvp, s[0])) goto refill ; + s++ ; m++ ; + goto reswitch ; + + case M_CLASS + U_ON + END_OFF: + while (!ison(*m->data.bvp, s[0])) + { + if (s[0] == 0) goto refill ; + else s++ ; + } + s++ ; + push(m, s, U_ON) ; + m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_CLASS + U_ON + END_ON: + if (!str_end) str_end = s + strlen(s) ; + if (s[0] == 0 || !ison(*m->data.bvp, str_end[-1])) + goto refill ; + s = str_end ; m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_ANY + U_OFF + END_OFF: + if (s[0] == 0) goto refill ; + s++ ; m++ ; + goto reswitch ; + + case M_ANY + U_OFF + END_ON: + if (s[0] == 0 || s[1] != 0) goto refill ; + s++ ; m++ ; + goto reswitch ; + + case M_ANY + U_ON + END_OFF: + if (s[0] == 0) goto refill ; + s++ ; + push(m, s, U_ON) ; + m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_ANY + U_ON + END_ON: + if (s[0] == 0) goto refill ; + if (!str_end) str_end = s + strlen(s) ; + s = str_end ; m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_START + U_OFF + END_OFF: + case M_START + U_ON + END_OFF: + if (s != str) goto refill ; + m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_START + U_OFF + END_ON: + case M_START + U_ON + END_ON: + if (s != str || s[0] != 0) goto refill ; + m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_END + U_OFF: + if (s[0] != 0) goto refill ; + m++ ; goto reswitch ; + + case M_END + U_ON: + s += strlen(s) ; + m++ ; u_flag = U_OFF ; + goto reswitch ; + + CASE_UANY(M_U): + u_flag = U_ON ; m++ ; + goto reswitch ; + + CASE_UANY(M_1J): + m += m->data.jump ; + goto reswitch ; + + CASE_UANY(M_2JA): /* take the non jump branch */ + /* don't stack an ACCEPT */ + if ((tm = m + m->data.jump)->type == M_ACCEPT) return 1 ; + push(tm, s, u_flag) ; + m++ ; + goto reswitch ; + + 
CASE_UANY(M_2JB): /* take the jump branch */ + /* don't stack an ACCEPT */ + if ((tm = m + 1)->type == M_ACCEPT) return 1 ; + push(tm, s, u_flag) ; + m += m->data.jump ; + goto reswitch ; + + CASE_UANY(M_ACCEPT): + return 1 ; + + default: + RE_panic("unexpected case in REtest") ; + } +} + + + +#ifdef MAWK + +char * +is_string_split(p, lenp) + register STATE *p ; + unsigned *lenp ; +{ + if (p[0].type == M_STR && p[1].type == M_ACCEPT) + { + *lenp = p->len ; + return p->data.str ; + } + else return (char *) 0 ; +} +#else /* mawk provides its own str_str */ + +char * +str_str(target, key, klen) + register char *target ; + register char *key ; + unsigned klen ; +{ + int c = key[0] ; + + switch (klen) + { + case 0: + return (char *) 0 ; + + case 1: + return strchr(target, c) ; + + case 2: + { + int c1 = key[1] ; + + while (target = strchr(target, c)) + { + if (target[1] == c1) return target ; + else target++ ; + } + break ; + } + + default: + klen-- ; key++ ; + while (target = strchr(target, c)) + { + if (memcmp(target + 1, key, klen) == 0) return target ; + else target++ ; + } + break ; + } + return (char *) 0 ; +} + + +#endif /* MAWK */ diff --git a/rexp/rexp2.o b/rexp/rexp2.o new file mode 100644 index 0000000..d496a3c Binary files /dev/null and b/rexp/rexp2.o differ diff --git a/rexp/rexp3.c b/rexp/rexp3.c new file mode 100644 index 0000000..9c91545 --- /dev/null +++ b/rexp/rexp3.c @@ -0,0 +1,339 @@ + +/******************************************** +rexp3.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/*$Log: rexp3.c,v $ + * Revision 1.3 1993/07/24 17:55:15 mike + * more cleanup + * + * Revision 1.2 1993/07/23 13:21:48 mike + * cleanup rexp code + * + * Revision 1.1.1.1 1993/07/03 18:58:28 mike + * move source to cvs + * + * Revision 3.6 1992/12/24 00:44:53 mike + * fixed potential LMDOS bozo with M_STR+U_ON+END_ON + * fixed minor bug in M_CLASS+U_ON+END_ON + * + * Revision 3.5 1992/01/21 17:33:20 brennan + * added some casts so that character classes work with signed chars + * + * Revision 3.4 91/10/29 10:54:09 brennan + * SIZE_T + * + * Revision 3.3 91/08/13 09:10:18 brennan + * VERSION .9994 + * + * Revision 3.2 91/06/10 16:18:17 brennan + * changes for V7 + * + * Revision 3.1 91/06/07 10:33:28 brennan + * VERSION 0.995 + * + * Revision 1.4 91/05/31 10:56:32 brennan + * stack_empty hack for DOS large model + * +*/ + +/* match a string against a machine */ + +#include "rexp.h" + + + +extern RT_STATE *RE_run_stack_base ; +extern RT_STATE *RE_run_stack_limit ; +extern RT_STATE *RE_run_stack_empty ; + +RT_STATE *RE_new_run_stack() ; + + +#define push(mx,sx,ssx,ux) if (++stackp == RE_run_stack_limit)\ + stackp = RE_new_run_stack() ;\ +stackp->m=(mx);stackp->s=(sx);stackp->ss=(ssx);\ +stackp->u = (ux) + + +#define CASE_UANY(x) case x + U_OFF : case x + U_ON + +/* returns start of first longest match and the length by + reference. 
If no match returns NULL and length zero */ + + char *REmatch(str, machine, lenp) + char *str ; + PTR machine ; + unsigned *lenp ; +{ + register STATE *m = (STATE *) machine ; + register char *s = str ; + char *ss ; + register RT_STATE *stackp ; + int u_flag, t ; + char *str_end, *ts ; + + /* state of current best match stored here */ + char *cb_ss ; /* the start */ + char *cb_e ; /* the end , pts at first char not matched */ + + *lenp = 0 ; + + /* check for the easy case */ + if ((m + 1)->type == M_ACCEPT && m->type == M_STR) + { + if ((ts = str_str(s, m->data.str, m->len))) *lenp = m->len ; + return ts ; + } + + u_flag = U_ON ; cb_ss = ss = str_end = (char *) 0 ; + stackp = RE_run_stack_empty ; + goto reswitch ; + +refill : + if (stackp == RE_run_stack_empty) + { + if (cb_ss) *lenp = cb_e - cb_ss ; + return cb_ss ; + } + ss = stackp->ss ; + s = stackp--->s ; + if (cb_ss) /* does new state start too late ? */ + { + if (ss) + { + if (cb_ss < ss) goto refill ; + } + else if (cb_ss < s) goto refill ; + } + + m = (stackp + 1)->m ; + u_flag = (stackp + 1)->u ; + + +reswitch : + + switch (m->type + u_flag) + { + case M_STR + U_OFF + END_OFF: + if (strncmp(s, m->data.str, m->len)) goto refill ; + if (!ss) + { + if (cb_ss && s > cb_ss) goto refill ; + else ss = s ; + } + s += m->len ; m++ ; + goto reswitch ; + + case M_STR + U_OFF + END_ON: + if (strcmp(s, m->data.str)) goto refill ; + if (!ss) + { + if (cb_ss && s > cb_ss) goto refill ; + else ss = s ; + } + s += m->len ; m++ ; + goto reswitch ; + + case M_STR + U_ON + END_OFF: + if (!(s = str_str(s, m->data.str, m->len))) goto refill ; + push(m, s + 1, ss, U_ON) ; + if (!ss) + { + if (cb_ss && s > cb_ss) goto refill ; + else ss = s ; + } + s += m->len ; m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_STR + U_ON + END_ON: + if (!str_end) str_end = s + strlen(s) ; + t = (str_end - s) - m->len ; + if (t < 0 || memcmp(ts = s + t, m->data.str, m->len)) + goto refill ; + if (!ss) + { + if (cb_ss && ts > cb_ss) goto refill 
; + else ss = ts ; + } + s = str_end ; m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_CLASS + U_OFF + END_OFF: + if (!ison(*m->data.bvp, s[0])) goto refill ; + if (!ss) + { + if (cb_ss && s > cb_ss) goto refill ; + else ss = s ; + } + s++ ; m++ ; + goto reswitch ; + + case M_CLASS + U_OFF + END_ON: + if (s[1] || !ison(*m->data.bvp, s[0])) goto refill ; + if (!ss) + { + if (cb_ss && s > cb_ss) goto refill ; + else ss = s ; + } + s++ ; m++ ; + goto reswitch ; + + case M_CLASS + U_ON + END_OFF: + while (!ison(*m->data.bvp, s[0])) + { + if (s[0] == 0) goto refill ; + else s++ ; + } + s++ ; + push(m, s, ss, U_ON) ; + if (!ss) + { + if (cb_ss && s - 1 > cb_ss) goto refill ; + else ss = s - 1 ; + } + m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_CLASS + U_ON + END_ON: + if (!str_end) str_end = s + strlen(s) ; + if (s[0] == 0 || !ison(*m->data.bvp, str_end[-1])) + goto refill ; + if (!ss) + { + if (cb_ss && str_end - 1 > cb_ss) goto refill ; + else ss = str_end - 1 ; + } + s = str_end ; m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_ANY + U_OFF + END_OFF: + if (s[0] == 0) goto refill ; + if (!ss) + { + if (cb_ss && s > cb_ss) goto refill ; + else ss = s ; + } + s++ ; m++ ; + goto reswitch ; + + case M_ANY + U_OFF + END_ON: + if (s[0] == 0 || s[1] != 0) goto refill ; + if (!ss) + { + if (cb_ss && s > cb_ss) goto refill ; + else ss = s ; + } + s++ ; m++ ; + goto reswitch ; + + case M_ANY + U_ON + END_OFF: + if (s[0] == 0) goto refill ; + s++ ; + push(m, s, ss, U_ON) ; + if (!ss) + { + if (cb_ss && s - 1 > cb_ss) goto refill ; + else ss = s - 1 ; + } + m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_ANY + U_ON + END_ON: + if (s[0] == 0) goto refill ; + if (!str_end) str_end = s + strlen(s) ; + if (!ss) + { + if (cb_ss && str_end - 1 > cb_ss) goto refill ; + else ss = str_end - 1 ; + } + s = str_end ; m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_START + U_OFF + END_OFF: + case M_START + U_ON + END_OFF: + if (s != str) goto refill ; + ss = s ; + m++ ; 
u_flag = U_OFF ; + goto reswitch ; + + case M_START + U_OFF + END_ON: + case M_START + U_ON + END_ON: + if (s != str || s[0] != 0) goto refill ; + ss = s ; + m++ ; u_flag = U_OFF ; + goto reswitch ; + + case M_END + U_OFF: + if (s[0] != 0) goto refill ; + if (!ss) + { + if (cb_ss && s > cb_ss) goto refill ; + else ss = s ; + } + m++ ; goto reswitch ; + + case M_END + U_ON: + s = str_end ? str_end : (str_end = s + strlen(s)) ; + if (!ss) + { + if (cb_ss && s > cb_ss) goto refill ; + else ss = s ; + } + m++ ; u_flag = U_OFF ; + goto reswitch ; + + CASE_UANY(M_U): + if (!ss) + { + if (cb_ss && s > cb_ss) goto refill ; + else ss = s ; + } + u_flag = U_ON ; m++ ; + goto reswitch ; + + CASE_UANY(M_1J): + m += m->data.jump ; + goto reswitch ; + + CASE_UANY(M_2JA): /* take the non jump branch */ + push(m + m->data.jump, s, ss, u_flag) ; + m++ ; + goto reswitch ; + + CASE_UANY(M_2JB): /* take the jump branch */ + push(m + 1, s, ss, u_flag) ; + m += m->data.jump ; + goto reswitch ; + + case M_ACCEPT + U_OFF: + if (!ss) ss = s ; + if (!cb_ss || ss < cb_ss || (ss == cb_ss && s > cb_e)) + { + /* we have a new current best */ + cb_ss = ss ; cb_e = s ; + } + goto refill ; + + case M_ACCEPT + U_ON: + if (!ss) ss = s ; + else s = str_end ? str_end : (str_end = s + strlen(s)) ; + + if (!cb_ss || ss < cb_ss || (ss == cb_ss && s > cb_e)) + { + /* we have a new current best */ + cb_ss = ss ; cb_e = s ; + } + goto refill ; + + default: + RE_panic("unexpected case in REmatch") ; + } +} diff --git a/rexp/rexp3.o b/rexp/rexp3.o new file mode 100644 index 0000000..0f8e1a7 Binary files /dev/null and b/rexp/rexp3.o differ diff --git a/rexp/rexpdb.c b/rexp/rexpdb.c new file mode 100644 index 0000000..1c21411 --- /dev/null +++ b/rexp/rexpdb.c @@ -0,0 +1,85 @@ + +/******************************************** +rexpdb.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. 
+ +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + + +/*$Log: rexpdb.c,v $ + * Revision 1.2 1993/07/23 13:21:51 mike + * cleanup rexp code + * + * Revision 1.1.1.1 1993/07/03 18:58:28 mike + * move source to cvs + * + * Revision 3.2 1991/08/13 09:10:09 brennan + * VERSION .9994 + * + * Revision 3.1 91/06/07 10:33:30 brennan + * VERSION 0.995 + * +*/ + + +#include "rexp.h" +#include + +/* print a machine for debugging */ + +static char *xlat[] = { +"M_STR" , +"M_CLASS" , +"M_ANY" , +"M_START" , +"M_END" , +"M_U", +"M_1J" , +"M_2JA" , +"M_2JB" , +"M_ACCEPT" } ; + +void REmprint(m, f) + PTR m ; + FILE *f ; +{ register STATE *p = (STATE *) m ; + char *end_on_string ; + + while ( 1 ) + { + if ( p->type >= END_ON ) + { p->type -= END_ON ; end_on_string = "$" ; } + else end_on_string = "" ; + + if ( p->type < 0 || p->type >= END_ON ) + { fprintf(f, "unknown STATE type\n") ; return ; } + + fprintf(f, "%-10s" , xlat[p->type]) ; + switch( p->type ) + { + case M_STR : fprintf(f, "%s", p->data.str ) ; + break ; + + case M_1J: + case M_2JA: + case M_2JB : fprintf(f, "%d", p->data.jump) ; + break ; + case M_CLASS: + { unsigned char *q = (unsigned char *) p->data.bvp ; + unsigned char *r = q + sizeof(BV) ; + while ( q < r ) fprintf(f, "%x " , *q++) ; + } + break ; + } + fprintf(f, "%s\n" , end_on_string) ; + if ( end_on_string[0] ) p->type += END_ON ; + if ( p->type == M_ACCEPT ) return ; + p++ ; + } +} + diff --git a/scan.c b/scan.c new file mode 100644 index 0000000..ea3880e --- /dev/null +++ b/scan.c @@ -0,0 +1,1080 @@ + +/******************************************** +scan.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + + +/* $Log: scan.c,v $ + * Revision 1.8 1996/07/28 21:47:05 mike + * gnuish patch + * + * Revision 1.7 1995/06/18 19:42:24 mike + * Remove some redundant declarations and add some prototypes + * + * Revision 1.6 1995/06/10 16:57:52 mike + * silently exit(0) if no program + * always add a '\n' on eof in scan_fillbuff() + * + * Revision 1.5 1995/06/06 00:18:33 mike + * change mawk_exit(1) to mawk_exit(2) + * + * Revision 1.4 1994/09/23 00:20:04 mike + * minor bug fix: handle \ in eat_nl() + * + * Revision 1.3 1993/07/17 00:45:21 mike + * indent + * + * Revision 1.2 1993/07/04 12:52:09 mike + * start on autoconfig changes + * + * Revision 1.1.1.1 1993/07/03 18:58:20 mike + * move source to cvs + * + * Revision 5.6 1993/02/13 21:57:33 mike + * merge patch3 + * + * Revision 5.5 1993/01/01 21:30:48 mike + * split new_STRING() into new_STRING and new_STRING0 + * + * Revision 5.4.1.1 1993/01/15 03:33:50 mike + * patch3: safer double to int conversion + * + * Revision 5.4 1992/11/29 18:57:50 mike + * field expressions convert to long so 16 bit and 32 bit + * systems behave the same + * + * Revision 5.3 1992/07/08 15:43:41 brennan + * patch2: length returns. 
I am a wimp + * + * Revision 5.2 1992/02/21 14:16:53 brennan + * fix: getline <= + * + * Revision 5.1 91/12/05 07:56:27 brennan + * 1.1 pre-release + * +*/ + + +#include "mawk.h" +#include "scan.h" +#include "memory.h" +#include "field.h" +#include "init.h" +#include "fin.h" +#include "repl.h" +#include "code.h" + +#ifndef NO_FCNTL_H +#include <fcntl.h> +#endif + +#include "files.h" + + +/* static functions */ +static void PROTO(scan_fillbuff, (void)) ; +static void PROTO(scan_open, (void)) ; +static int PROTO(slow_next, (void)) ; +static void PROTO(eat_comment, (void)) ; +static void PROTO(eat_semi_colon, (void)) ; +static double PROTO(collect_decimal, (int, int *)) ; +static int PROTO(collect_string, (void)) ; +static int PROTO(collect_RE, (void)) ; + + +/*----------------------------- + program file management + *----------------------------*/ + +char *pfile_name ; +STRING *program_string ; +PFILE *pfile_list ; +static unsigned char *buffer ; +static unsigned char *buffp ; + /* unsigned so it works with 8 bit chars */ +static int program_fd ; +static int eof_flag ; + +void +scan_init(cmdline_program) + char *cmdline_program ; +{ + if (cmdline_program) + { + program_fd = -1 ; /* command line program */ + program_string = new_STRING0(strlen(cmdline_program) + 1) ; + strcpy(program_string->str, cmdline_program) ; + /* simulate file termination */ + program_string->str[program_string->len - 1] = '\n' ; + buffp = (unsigned char *) program_string->str ; + eof_flag = 1 ; + } + else /* program from file[s] */ + { + scan_open() ; + buffp = buffer = (unsigned char *) zmalloc(BUFFSZ + 1) ; + scan_fillbuff() ; + } + +#ifdef OS2 /* OS/2 "extproc" is similar to #! 
*/ + if (strnicmp(buffp, "extproc ", 8) == 0) + eat_comment(); +#endif + eat_nl() ; /* scan to first token */ + if (next() == 0) + { + /* no program */ + mawk_exit(0) ; + } + + un_next() ; + +} + +static void +scan_open() /* open pfile_name */ +{ + if (pfile_name[0] == '-' && pfile_name[1] == 0) + { + program_fd = 0 ; + } + else if ((program_fd = open(pfile_name, O_RDONLY, 0)) == -1) + { + errmsg(errno, "cannot open %s", pfile_name) ; + mawk_exit(2) ; + } +} + +void +scan_cleanup() +{ + if (program_fd >= 0) zfree(buffer, BUFFSZ + 1) ; + else free_STRING(program_string) ; + + if (program_fd > 0) close(program_fd) ; + + /* redefine SPACE as [ \t\n] */ + + scan_code['\n'] = posix_space_flag && rs_shadow.type != SEP_MLR + ? SC_UNEXPECTED : SC_SPACE ; + scan_code['\f'] = SC_UNEXPECTED ; /*value doesn't matter */ + scan_code['\013'] = SC_UNEXPECTED ; /* \v not space */ + scan_code['\r'] = SC_UNEXPECTED ; +} + +/*-------------------------------- + global variables shared by yyparse() and yylex() + and used for error messages too + *-------------------------------*/ + +int current_token = -1 ; +unsigned token_lineno ; +unsigned compile_error_count ; +int NR_flag ; /* are we tracking NR */ +int paren_cnt ; +int brace_cnt ; +int print_flag ; /* changes meaning of '>' */ +int getline_flag ; /* changes meaning of '<' */ + + +/*---------------------------------------- + file reading functions + next() and un_next(c) are macros in scan.h + + *---------------------*/ + +static unsigned lineno = 1 ; + + +static void +scan_fillbuff() +{ + unsigned r ; + + r = fillbuff(program_fd, (char *) buffer, BUFFSZ) ; + if (r < BUFFSZ) + { + eof_flag = 1 ; + /* make sure eof is terminated */ + buffer[r] = '\n' ; + buffer[r + 1] = 0 ; + } +} + +/* read one character -- slowly */ +static int +slow_next() +{ + + while (*buffp == 0) + { + if (!eof_flag) + { + buffp = buffer ; + scan_fillbuff() ; + } + else if (pfile_list /* open another program file */ ) + { + PFILE *q ; + + if (program_fd > 0) 
close(program_fd) ; + eof_flag = 0 ; + pfile_name = pfile_list->fname ; + q = pfile_list ; + pfile_list = pfile_list->link ; + ZFREE(q) ; + scan_open() ; + token_lineno = lineno = 1 ; + } + else break /* real eof */ ; + } + + return *buffp++ ; /* note can un_next() , eof which is zero */ +} + +static void +eat_comment() +{ + register int c ; + + while ((c = next()) != '\n' && scan_code[c]) ; + un_next() ; +} + +/* this is how we handle extra semi-colons that are + now allowed to separate pattern-action blocks + + A proof that they are useless clutter to the language: + we throw them away +*/ + +static void +eat_semi_colon() +/* eat one semi-colon on the current line */ +{ + register int c ; + + while (scan_code[c = next()] == SC_SPACE) ; + if (c != ';') un_next() ; +} + +void +eat_nl() /* eat all space including newlines */ +{ + while (1) + switch (scan_code[next()]) + { + case SC_COMMENT: + eat_comment() ; + break ; + + case SC_NL: + lineno++ ; + /* fall thru */ + + case SC_SPACE: + break ; + + case SC_ESCAPE: + /* bug fix - surprised anyone did this, + a csh user with backslash dyslexia.(Not a joke) + */ + { + unsigned c ; + + while (scan_code[c = next()] == SC_SPACE) ; + if (c == '\n') + token_lineno = ++lineno ; + else if (c == 0) + { + un_next() ; + return ; + } + else /* error */ + { + un_next() ; + /* can't un_next() twice so deal with it */ + yylval.ival = '\\' ; + unexpected_char() ; + if( ++compile_error_count == MAX_COMPILE_ERRORS ) + mawk_exit(2) ; + return ; + } + } + break ; + + default: + un_next() ; + return ; + } +} + +int +yylex() +{ + register int c ; + + token_lineno = lineno ; + +reswitch: + + switch (scan_code[c = next()]) + { + case 0: + ct_ret(EOF) ; + + case SC_SPACE: + goto reswitch ; + + case SC_COMMENT: + eat_comment() ; + goto reswitch ; + + case SC_NL: + lineno++ ; + eat_nl() ; + ct_ret(NL) ; + + case SC_ESCAPE: + while (scan_code[c = next()] == SC_SPACE) ; + if (c == '\n') + { + token_lineno = ++lineno ; + goto reswitch ; + } + + if 
(c == 0) ct_ret(EOF) ; + un_next() ; + yylval.ival = '\\' ; + ct_ret(UNEXPECTED) ; + + + case SC_SEMI_COLON: + eat_nl() ; + ct_ret(SEMI_COLON) ; + + case SC_LBRACE: + eat_nl() ; + brace_cnt++ ; + ct_ret(LBRACE) ; + + case SC_PLUS: + switch (next()) + { + case '+': + yylval.ival = '+' ; + string_buff[0] = + string_buff[1] = '+' ; + string_buff[2] = 0 ; + ct_ret(INC_or_DEC) ; + + case '=': + ct_ret(ADD_ASG) ; + + default: + un_next() ; + ct_ret(PLUS) ; + } + + case SC_MINUS: + switch (next()) + { + case '-': + yylval.ival = '-' ; + string_buff[0] = + string_buff[1] = '-' ; + string_buff[2] = 0 ; + ct_ret(INC_or_DEC) ; + + case '=': + ct_ret(SUB_ASG) ; + + default: + un_next() ; + ct_ret(MINUS) ; + } + + case SC_COMMA: + eat_nl() ; + ct_ret(COMMA) ; + + case SC_MUL: + test1_ret('=', MUL_ASG, MUL) ; + + case SC_DIV: + { + static int can_precede_div[] = + {DOUBLE, STRING_, RPAREN, ID, D_ID, RE, RBOX, FIELD, + GETLINE, INC_or_DEC, -1} ; + + int *p = can_precede_div ; + + do + { + if (*p == current_token) + { + if (*p != INC_or_DEC) { test1_ret('=', DIV_ASG, DIV) ; } + + if (next() == '=') + { + un_next() ; + ct_ret(collect_RE()) ; + } + } + } + while (*++p != -1) ; + + ct_ret(collect_RE()) ; + } + + case SC_MOD: + test1_ret('=', MOD_ASG, MOD) ; + + case SC_POW: + test1_ret('=', POW_ASG, POW) ; + + case SC_LPAREN: + paren_cnt++ ; + ct_ret(LPAREN) ; + + case SC_RPAREN: + if (--paren_cnt < 0) + { + compile_error("extra ')'") ; + paren_cnt = 0 ; + goto reswitch ; + } + + ct_ret(RPAREN) ; + + case SC_LBOX: + ct_ret(LBOX) ; + + case SC_RBOX: + ct_ret(RBOX) ; + + case SC_MATCH: + string_buff[0] = '~' ; + string_buff[1] = 0 ; + yylval.ival = 1 ; + ct_ret(MATCH) ; + + case SC_EQUAL: + test1_ret('=', EQ, ASSIGN) ; + + case SC_NOT: /* ! */ + if ((c = next()) == '~') + { + string_buff[0] = '!'
; + string_buff[1] = '~' ; + string_buff[2] = 0 ; + yylval.ival = 0 ; + ct_ret(MATCH) ; + } + else if (c == '=') ct_ret(NEQ) ; + + un_next() ; + ct_ret(NOT) ; + + + case SC_LT: /* '<' */ + if (next() == '=') ct_ret(LTE) ; + else un_next() ; + + if (getline_flag) + { + getline_flag = 0 ; + ct_ret(IO_IN) ; + } + else ct_ret(LT) ; + + case SC_GT: /* '>' */ + if (print_flag && paren_cnt == 0) + { + print_flag = 0 ; + /* there are 3 types of IO_OUT + -- build the error string in string_buff */ + string_buff[0] = '>' ; + if (next() == '>') + { + yylval.ival = F_APPEND ; + string_buff[1] = '>' ; + string_buff[2] = 0 ; + } + else + { + un_next() ; + yylval.ival = F_TRUNC ; + string_buff[1] = 0 ; + } + return current_token = IO_OUT ; + } + + test1_ret('=', GTE, GT) ; + + case SC_OR: + if (next() == '|') + { + eat_nl() ; + ct_ret(OR) ; + } + else + { + un_next() ; + + if (print_flag && paren_cnt == 0) + { + print_flag = 0 ; + yylval.ival = PIPE_OUT ; + string_buff[0] = '|' ; + string_buff[1] = 0 ; + ct_ret(IO_OUT) ; + } + else ct_ret(PIPE) ; + } + + case SC_AND: + if (next() == '&') + { + eat_nl() ; + ct_ret(AND) ; + } + else + { + un_next() ; + yylval.ival = '&' ; + ct_ret(UNEXPECTED) ; + } + + case SC_QMARK: + ct_ret(QMARK) ; + + case SC_COLON: + ct_ret(COLON) ; + + case SC_RBRACE: + if (--brace_cnt < 0) + { + compile_error("extra '}'") ; + eat_semi_colon() ; + brace_cnt = 0 ; + goto reswitch ; + } + + if ((c = current_token) == NL || c == SEMI_COLON + || c == SC_FAKE_SEMI_COLON || c == RBRACE) + { + /* if the brace_cnt is zero , we've completed + a pattern action block. If the user insists + on adding a semi-colon on the same line + we will eat it. 
Note what we do below: + physical law -- conservation of semi-colons */ + + if (brace_cnt == 0) eat_semi_colon() ; + eat_nl() ; + ct_ret(RBRACE) ; + } + + /* supply missing semi-colon to statement that + precedes a '}' */ + brace_cnt++ ; + un_next() ; + current_token = SC_FAKE_SEMI_COLON ; + return SEMI_COLON ; + + case SC_DIGIT: + case SC_DOT: + { + double d; + int flag ; + static double double_zero = 0.0 ; + static double double_one = 1.0 ; + + if ((d = collect_decimal(c, &flag)) == 0.0) + { + if (flag) ct_ret(flag) ; + else yylval.ptr = (PTR) & double_zero ; + } + else if (d == 1.0) + { + yylval.ptr = (PTR) & double_one ; + } + else + { + yylval.ptr = (PTR) ZMALLOC(double) ; + *(double *) yylval.ptr = d ; + } + ct_ret(DOUBLE) ; + } + + case SC_DOLLAR: /* '$' */ + { + double d; + int flag ; + + while (scan_code[c = next()] == SC_SPACE) ; + if (scan_code[c] != SC_DIGIT && + scan_code[c] != SC_DOT) + { + un_next() ; + ct_ret(DOLLAR) ; + } + + /* compute field address at compile time */ + if ((d = collect_decimal(c, &flag)) == 0.0) + { + if (flag) ct_ret(flag) ; /* an error */ + else yylval.cp = &field[0] ; + } + else + { + if (d > MAX_FIELD) + { + compile_error( + "$%g exceeds maximum field(%d)", d, MAX_FIELD) ; + d = MAX_FIELD ; + } + yylval.cp = field_ptr((int) d) ; + } + + ct_ret(FIELD) ; + } + + case SC_DQUOTE: + return current_token = collect_string() ; + + case SC_IDCHAR: /* collect an identifier */ + { + unsigned char *p = + (unsigned char *) string_buff + 1 ; + SYMTAB *stp ; + + string_buff[0] = c ; + + while ( + (c = scan_code[*p++ = next()]) == SC_IDCHAR || + c == SC_DIGIT) ; + + un_next() ; + *--p = 0 ; + + switch ((stp = find(string_buff))->type) + { + case ST_NONE: + /* check for function call before defined */ + if (next() == '(') + { + stp->type = ST_FUNCT ; + stp->stval.fbp = (FBLOCK *) + zmalloc(sizeof(FBLOCK)) ; + stp->stval.fbp->name = stp->name ; + stp->stval.fbp->code = (INST *) 0 ; + yylval.fbp = stp->stval.fbp ; + current_token = FUNCT_ID ; + 
} + else + { + yylval.stp = stp ; + current_token = + current_token == DOLLAR ? D_ID : ID ; + } + un_next() ; + break ; + + case ST_NR: + NR_flag = 1 ; + stp->type = ST_VAR ; + /* fall thru */ + + case ST_VAR: + case ST_ARRAY: + case ST_LOCAL_NONE: + case ST_LOCAL_VAR: + case ST_LOCAL_ARRAY: + + yylval.stp = stp ; + current_token = + current_token == DOLLAR ? D_ID : ID ; + break ; + + case ST_ENV: + stp->type = ST_ARRAY ; + stp->stval.array = new_ARRAY() ; + load_environ(stp->stval.array) ; + yylval.stp = stp ; + current_token = + current_token == DOLLAR ? D_ID : ID ; + break ; + + case ST_FUNCT: + yylval.fbp = stp->stval.fbp ; + current_token = FUNCT_ID ; + break ; + + case ST_KEYWORD: + current_token = stp->stval.kw ; + break ; + + case ST_BUILTIN: + yylval.bip = stp->stval.bip ; + current_token = BUILTIN ; + break ; + + case ST_LENGTH: + + yylval.bip = stp->stval.bip ; + + /* check for length alone, this is an ugly + hack */ + while (scan_code[c = next()] == SC_SPACE) ; + un_next() ; + + current_token = c == '(' ? BUILTIN : LENGTH ; + break ; + + case ST_FIELD: + yylval.cp = stp->stval.cp ; + current_token = FIELD ; + break ; + + default: + bozo("find returned bad st type") ; + } + return current_token ; + } + + + case SC_UNEXPECTED: + yylval.ival = c & 0xff ; + ct_ret(UNEXPECTED) ; + } + return 0 ; /* never get here make lint happy */ +} + +/* collect a decimal constant in temp_buff. + Return the value and error conditions by reference */ + +static double +collect_decimal(c, flag) + int c ; + int *flag ; +{ + register unsigned char *p = (unsigned char *) string_buff + 1 ; + unsigned char *endp ; + double d; + + *flag = 0 ; + string_buff[0] = c ; + + if (c == '.') + { + if (scan_code[*p++ = next()] != SC_DIGIT) + { + *flag = UNEXPECTED ; + yylval.ival = '.' 
; + return 0.0 ; + } + } + else + { + while (scan_code[*p++ = next()] == SC_DIGIT) ; + if (p[-1] != '.') + { + un_next() ; + p-- ; + } + } + /* get rest of digits after decimal point */ + while (scan_code[*p++ = next()] == SC_DIGIT) ; + + /* check for exponent */ + if (p[-1] != 'e' && p[-1] != 'E') + { + un_next() ; + *--p = 0 ; + } + else /* get the exponent */ + { + if (scan_code[*p = next()] != SC_DIGIT && + *p != '-' && *p != '+') + { + *++p = 0 ; + *flag = BAD_DECIMAL ; + return 0.0 ; + } + else /* get the rest of the exponent */ + { + p++ ; + while (scan_code[*p++ = next()] == SC_DIGIT) ; + un_next() ; + *--p = 0 ; + } + } + + errno = 0 ; /* check for overflow/underflow */ + d = strtod(string_buff, (char **) &endp) ; + +#ifndef STRTOD_UNDERFLOW_ON_ZERO_BUG + if (errno) compile_error("%s : decimal %sflow", string_buff, + d == 0.0 ? "under" : "over") ; +#else /* ! sun4 bug */ + if (errno && d != 0.0) + compile_error("%s : decimal overflow", string_buff) ; +#endif + + if (endp < p) + { + *flag = BAD_DECIMAL ; + return 0.0 ; + } + return d ; +} + +/*---------- process escape characters ---------------*/ + +static char hex_val['f' - 'A' + 1] = +{ + 10, 11, 12, 13, 14, 15, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 10, 11, 12, 13, 14, 15} ; + +#define isoctal(x) ((x)>='0'&&(x)<='7') + +#define hex_value(x) hex_val[(x)-'A'] + +#define ishex(x) (scan_code[x] == SC_DIGIT ||\ + ('A' <= (x) && (x) <= 'f' && hex_value(x))) + +static int PROTO(octal, (char **)) ; +static int PROTO(hex, (char **)) ; + +/* process one , two or three octal digits + moving a pointer forward by reference */ +static int +octal(start_p) + char **start_p ; +{ + register char *p = *start_p ; + register unsigned x ; + + x = *p++ - '0' ; + if (isoctal(*p)) + { + x = (x << 3) + *p++ - '0' ; + if (isoctal(*p)) x = (x << 3) + *p++ - '0' ; + } + *start_p = p ; + return x & 0xff ; +} + +/* process one or two hex digits + moving a pointer forward by reference */ 
+ +static int +hex(start_p) + char **start_p ; +{ + register unsigned char *p = (unsigned char *) *start_p ; + register unsigned x ; + unsigned t ; + + if (scan_code[*p] == SC_DIGIT) x = *p++ - '0' ; + else x = hex_value(*p++) ; + + if (scan_code[*p] == SC_DIGIT) x = (x << 4) + *p++ - '0' ; + else if ('A' <= *p && *p <= 'f' && (t = hex_value(*p))) + { + x = (x << 4) + t ; + p++ ; + } + + *start_p = (char *) p ; + return x ; +} + +#define ET_END 9 + +static struct +{ + char in, out ; +} +escape_test[ET_END + 1] = +{ + {'n', '\n'}, + {'t', '\t'}, + {'f', '\f'}, + {'b', '\b'}, + {'r', '\r'}, + {'a', '\07'}, + {'v', '\013'}, + {'\\', '\\'}, + {'\"', '\"'}, + {0, 0} +} ; + + +/* process the escape characters in a string, in place . */ + +char * +rm_escape(s) + char *s ; +{ + register char *p, *q ; + char *t ; + int i ; + + q = p = s ; + + while (*p) + { + if (*p == '\\') + { + escape_test[ET_END].in = *++p ; /* sentinel */ + i = 0 ; + while (escape_test[i].in != *p) i++ ; + + if (i != ET_END) /* in table */ + { + p++ ; + *q++ = escape_test[i].out ; + } + else if (isoctal(*p)) + { + t = p ; + *q++ = octal(&t) ; + p = t ; + } + else if (*p == 'x' && ishex(*(unsigned char *) (p + 1))) + { + t = p + 1 ; + *q++ = hex(&t) ; + p = t ; + } + else if (*p == 0) /* can only happen with command line assign */ + *q++ = '\\' ; + else /* not an escape sequence */ + { + *q++ = '\\' ; + *q++ = *p++ ; + } + } + else *q++ = *p++ ; + } + + *q = 0 ; + return s ; +} + +static int +collect_string() +{ + register unsigned char *p = (unsigned char *) string_buff ; + int c ; + int e_flag = 0 ; /* on if have an escape char */ + + while (1) + switch (scan_code[*p++ = next()]) + { + case SC_DQUOTE: /* done */ + *--p = 0 ; + goto out ; + + case SC_NL: + p[-1] = 0 ; + /* fall thru */ + + case 0: /* unterminated string */ + compile_error( + "runaway string constant \"%.10s ...", + string_buff, token_lineno) ; + mawk_exit(2) ; + + case SC_ESCAPE: + if ((c = next()) == '\n') + { + p-- ; + lineno++ ; + }
+ else if (c == 0) un_next() ; + else + { + *p++ = c ; + e_flag = 1 ; + } + + break ; + + default: + break ; + } + +out: + yylval.ptr = (PTR) new_STRING( + e_flag ? rm_escape(string_buff) + : string_buff) ; + return STRING_ ; +} + + +static int +collect_RE() +{ + register unsigned char *p = (unsigned char *) string_buff ; + int c ; + STRING *sval ; + + while (1) + switch (scan_code[*p++ = next()]) + { + case SC_DIV: /* done */ + *--p = 0 ; + goto out ; + + case SC_NL: + p[-1] = 0 ; + /* fall thru */ + + case 0: /* unterminated re */ + compile_error( + "runaway regular expression /%.10s ...", + string_buff, token_lineno) ; + mawk_exit(2) ; + + case SC_ESCAPE: + switch (c = next()) + { + case '/': + p[-1] = '/' ; + break ; + + case '\n': + p-- ; + break ; + + case 0: + un_next() ; + break ; + + default: + *p++ = c ; + break ; + } + break ; + } + +out: + /* now we've got the RE, so compile it */ + sval = new_STRING(string_buff) ; + yylval.ptr = re_compile(sval) ; + free_STRING(sval) ; + return RE ; +} diff --git a/scan.h b/scan.h new file mode 100644 index 0000000..dfe0e66 --- /dev/null +++ b/scan.h @@ -0,0 +1,103 @@ + +/******************************************** +scan.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + + +/* $Log: scan.h,v $ + * Revision 1.3 1995/06/18 19:42:26 mike + * Remove some redundant declarations and add some prototypes + * + * Revision 1.2 1994/09/23 00:20:06 mike + * minor bug fix: handle \ in eat_nl() + * + * Revision 1.1.1.1 1993/07/03 18:58:20 mike + * move source to cvs + * + * Revision 5.1 1991/12/05 07:59:33 brennan + * 1.1 pre-release + * +*/ + + +/* scan.h */ + +#ifndef SCAN_H_INCLUDED +#define SCAN_H_INCLUDED 1 + +#include <stdio.h> + +#ifndef MAKESCAN +#include "symtype.h" +#include "parse.h" +#endif + + +extern char scan_code[256] ; + +/* the scan codes to compactify the main switch */ + +#define SC_SPACE 1 +#define SC_NL 2 +#define SC_SEMI_COLON 3 +#define SC_FAKE_SEMI_COLON 4 +#define SC_LBRACE 5 +#define SC_RBRACE 6 +#define SC_QMARK 7 +#define SC_COLON 8 +#define SC_OR 9 +#define SC_AND 10 +#define SC_PLUS 11 +#define SC_MINUS 12 +#define SC_MUL 13 +#define SC_DIV 14 +#define SC_MOD 15 +#define SC_POW 16 +#define SC_LPAREN 17 +#define SC_RPAREN 18 +#define SC_LBOX 19 +#define SC_RBOX 20 +#define SC_IDCHAR 21 +#define SC_DIGIT 22 +#define SC_DQUOTE 23 +#define SC_ESCAPE 24 +#define SC_COMMENT 25 +#define SC_EQUAL 26 +#define SC_NOT 27 +#define SC_LT 28 +#define SC_GT 29 +#define SC_COMMA 30 +#define SC_DOT 31 +#define SC_MATCH 32 +#define SC_DOLLAR 33 +#define SC_UNEXPECTED 34 + +#ifndef MAKESCAN + +void PROTO(eat_nl, (void) ) ; + +/* in error.c */ +void PROTO( unexpected_char, (void) ) ; + +#define ct_ret(x) return current_token = (x) + +#define next() (*buffp ? *buffp++ : slow_next()) +#define un_next() buffp-- + +#define test1_ret(c,x,d) if ( next() == (c) ) ct_ret(x) ;\ + else { un_next() ; ct_ret(d) ; } + +#define test2_ret(c1,x1,c2,x2,d) switch( next() )\ + { case c1: ct_ret(x1) ;\ + case c2: ct_ret(x2) ;\ + default: un_next() ;\ + ct_ret(d) ; } +#endif /* !
MAKESCAN */ +#endif diff --git a/scan.o b/scan.o new file mode 100644 index 0000000..1fdef4f Binary files /dev/null and b/scan.o differ diff --git a/scancode.c b/scancode.c new file mode 100644 index 0000000..4a4b8fe --- /dev/null +++ b/scancode.c @@ -0,0 +1,23 @@ + + +/* scancode.c */ + + +char scan_code[256] = { + 0,34,34,34,34,34,34,34,34, 1, 2, 1, 1, 1,34,34, +34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34, + 1,27,23,25,33,15,10,34,17,18,13,11,30,12,31,14, +22,22,22,22,22,22,22,22,22,22, 8, 3,28,26,29, 7, +34,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21, +21,21,21,21,21,21,21,21,21,21,21,19,24,20,16,21, +34,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21, +21,21,21,21,21,21,21,21,21,21,21, 5, 9, 6,32,34, +34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34, +34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34, +34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34, +34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34, +34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34, +34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34, +34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34, +34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34 +} ; diff --git a/scancode.o b/scancode.o new file mode 100644 index 0000000..c5d4a29 Binary files /dev/null and b/scancode.o differ diff --git a/sizes.h b/sizes.h new file mode 100644 index 0000000..60cf502 --- /dev/null +++ b/sizes.h @@ -0,0 +1,104 @@ + +/******************************************** +sizes.h +copyright 1991, 1992. Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. 
+********************************************/ + +/* $Log: sizes.h,v $ + * Revision 1.8 1995/10/14 22:09:51 mike + * getting MAX__INT from values.h didn't really work since the value was + * unusable in an #if MAX__INT <= 0x7fff + * at least it didn't work under sunos -- so use of values.h is a goner + * + * Revision 1.7 1995/06/18 19:17:51 mike + * Create a type Int which on most machines is an int, but on machines + * with 16bit ints, i.e., the PC is a long. This fixes implicit assumption + * that int==long. + * + * Revision 1.6 1994/10/10 01:39:01 mike + * get MAX__INT from limits.h or values.h + * + * Revision 1.5 1994/10/08 19:15:53 mike + * remove SM_DOS + * + * Revision 1.4 1994/09/25 23:00:49 mike + * remove #if 0 + * + * Revision 1.3 1993/07/15 23:56:15 mike + * general cleanup + * + * Revision 1.2 1993/07/04 12:52:13 mike + * start on autoconfig changes + * + * Revision 5.3 1992/12/17 02:48:01 mike + * 1.1.2d changes for DOS + * + * Revision 5.2 1992/08/27 03:20:08 mike + * patch2: increase A_HASH_PRIME + * + * Revision 5.1 1991/12/05 07:59:35 brennan + * 1.1 pre-release + * +*/ + +/* sizes.h */ + +#ifndef SIZES_H +#define SIZES_H + +#ifndef MAX__INT +#include <limits.h> +#define MAX__INT INT_MAX +#define MAX__LONG LONG_MAX +#endif /* MAX__INT */ + +#if MAX__INT <= 0x7fff +#define SHORT_INTS +#define INT_FMT "%ld" +typedef long Int ; +#define Max_Int MAX__LONG +#else +#define INT_FMT "%d" +typedef int Int ; +#define Max_Int MAX__INT +#endif + +#define EVAL_STACK_SIZE 256 /* initial size , can grow */ +/* number of fields at startup, must be a power of 2 + and FBANK_SZ-1 must be divisible by 3!
*/ +#define FBANK_SZ 256 +#define FB_SHIFT 8 /* lg(FBANK_SZ) */ +#define NUM_FBANK 128 /* see MAX_FIELD below */ + + +#define MAX_SPLIT (FBANK_SZ-1) /* needs to be divisible by 3*/ +#define MAX_FIELD (NUM_FBANK*FBANK_SZ - 1) + +#define MIN_SPRINTF 400 + + +#define BUFFSZ 4096 + /* starting buffer size for input files, grows if + necessary */ + +#ifdef MSDOS +/* trade some space for IO speed */ +#undef BUFFSZ +#define BUFFSZ 8192 +/* maximum input buffers that will fit in 64K */ +#define MAX_BUFFS ((int)(0x10000L/BUFFSZ) - 1) +#endif + +#define HASH_PRIME 53 +#define A_HASH_PRIME 199 + + +#define MAX_COMPILE_ERRORS 5 /* quit if more than 4 errors */ + +#endif /* SIZES_H */ diff --git a/split.c b/split.c new file mode 100644 index 0000000..dc6f798 --- /dev/null +++ b/split.c @@ -0,0 +1,335 @@ + +/******************************************** +split.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/* $Log: split.c,v $ + * Revision 1.3 1996/02/01 04:39:42 mike + * dynamic array scheme + * + * Revision 1.2 1993/07/15 01:55:03 mike + * rm SIZE_T & indent + * + * Revision 1.1.1.1 1993/07/03 18:58:21 mike + * move source to cvs + * + * Revision 5.4 1993/05/08 18:06:00 mike + * null_split + * + * Revision 5.3 1993/01/01 21:30:48 mike + * split new_STRING() into new_STRING and new_STRING0 + * + * Revision 5.2 1992/07/08 21:19:09 brennan + * patch2 + * change in split() requires that + * bi_split() call load_array() even + * when cnt is 0.
+ * + * Revision 5.1 1991/12/05 07:56:31 brennan + * 1.1 pre-release + * +*/ + +/* split.c */ + + +/* For all splitting up to MAX_SPLIT fields go into + split_buff[], the rest go onto split_ov_list ( split + overflow list) + + We can split one of three ways: + (1) By space: + space_split() and space_ov_split() + (2) By regular expression: + re_split() and re_ov_split() + (3) By "" (null -- split into characters) + null_split() and null_ov_split() +*/ + +#define TEMPBUFF_GOES_HERE + +#include "mawk.h" +#include "symtype.h" +#include "bi_vars.h" +#include "bi_funct.h" +#include "memory.h" +#include "scan.h" +#include "regexp.h" +#include "field.h" + +SPLIT_OV *split_ov_list ; + +static int PROTO(re_ov_split, (char *, PTR)) ; +static int PROTO(space_ov_split, (char *, char *)) ; +static int PROTO(null_ov_split, (char *)) ; + +/* split string s of length slen on SPACE without changing s. + load the pieces into STRINGS and ptrs into + split_buff[] + return the number of pieces */ + +int +space_split(s, slen) + register char *s ; + unsigned slen ; +{ + char *back = s + slen ; + int i = 0 ; + int len ; + char *q ; + STRING *sval ; + int lcnt = MAX_SPLIT / 3 ; + +#define EAT_SPACE() while ( scan_code[*(unsigned char*)s] ==\ + SC_SPACE ) s++ +#define EAT_NON_SPACE() \ + *back = ' ' ; /* sentinel */\ + while ( scan_code[*(unsigned char*)s] != SC_SPACE ) s++ ;\ + *back = 0 + + + while (lcnt--) + { + EAT_SPACE() ; + if (*s == 0) goto done ; + /* mark the front with q */ + q = s++ ; + EAT_NON_SPACE() ; + sval = split_buff[i++] = new_STRING0(len = s - q) ; + memcpy(sval->str, q, len) ; + + EAT_SPACE() ; + if (*s == 0) goto done ; + q = s++ ; + EAT_NON_SPACE() ; + sval = split_buff[i++] = new_STRING0(len = s - q) ; + memcpy(sval->str, q, len) ; + + EAT_SPACE() ; + if (*s == 0) goto done ; + q = s++ ; + EAT_NON_SPACE() ; + sval = split_buff[i++] = new_STRING0(len = s - q) ; + memcpy(sval->str, q, len) ; + + } + /* we've overflowed */ + return i + space_ov_split(s, back) ; + + 
done: + return i ; +} + +static int +space_ov_split(s, back) + register char *s ; + char *back ; + +{ + SPLIT_OV dummy ; + register SPLIT_OV *tail = &dummy ; + char *q ; + int cnt = 0 ; + unsigned len ; + + while (1) + { + EAT_SPACE() ; + if (*s == 0) break ; /* done */ + q = s++ ; + EAT_NON_SPACE() ; + + tail = tail->link = ZMALLOC(SPLIT_OV) ; + tail->sval = new_STRING0(len = s - q) ; + memcpy(tail->sval->str, q, len) ; + cnt++ ; + } + + tail->link = (SPLIT_OV *) 0 ; + split_ov_list = dummy.link ; + return cnt ; +} + +/* match a string with a regular expression, but + only matches of positive length count */ +char * +re_pos_match(s, re, lenp) + register char *s ; +PTR re ; unsigned *lenp ; +{ + while ((s = REmatch(s, re, lenp))) + if (*lenp) return s ; + else if (*s == 0) break ; + else s++ ; + + return (char *) 0 ; +} + +int +re_split(s, re) + char *s ; + PTR re ; +{ + register char *t ; + int i = 0 ; + unsigned mlen, len ; + STRING *sval ; + int lcnt = MAX_SPLIT / 3 ; + + while (lcnt--) + { + if (!(t = re_pos_match(s, re, &mlen))) goto done ; + sval = split_buff[i++] = new_STRING0(len = t - s) ; + memcpy(sval->str, s, len) ; + s = t + mlen ; + + if (!(t = re_pos_match(s, re, &mlen))) goto done ; + sval = split_buff[i++] = new_STRING0(len = t - s) ; + memcpy(sval->str, s, len) ; + s = t + mlen ; + + if (!(t = re_pos_match(s, re, &mlen))) goto done ; + sval = split_buff[i++] = new_STRING0(len = t - s) ; + memcpy(sval->str, s, len) ; + s = t + mlen ; + } + /* we've overflowed */ + return i + re_ov_split(s, re) ; + +done: + split_buff[i++] = new_STRING(s) ; + return i ; +} + +/* + we've overflowed split_buff[] , put + the rest on the split_ov_list + return number of pieces +*/ + +static int +re_ov_split(s, re) + char *s ; + PTR re ; +{ + SPLIT_OV dummy ; + register SPLIT_OV *tail = &dummy ; + int cnt = 1 ; + char *t ; + unsigned len, mlen ; + + while ((t = re_pos_match(s, re, &mlen))) + { + tail = tail->link = ZMALLOC(SPLIT_OV) ; + tail->sval = new_STRING0(len = t - 
s) ; + memcpy(tail->sval->str, s, len) ; + s = t + mlen ; + cnt++ ; + } + /* and one more */ + tail = tail->link = ZMALLOC(SPLIT_OV) ; + tail->sval = new_STRING(s) ; + tail->link = (SPLIT_OV *) 0 ; + split_ov_list = dummy.link ; + + return cnt ; +} + + +int +null_split(s) + char *s ; +{ + int cnt = 0 ; /* number of fields split */ + STRING *sval ; + int i = 0 ; /* indexes split_buff[] */ + + while (*s) + { + if (cnt == MAX_SPLIT) return cnt + null_ov_split(s) ; + + sval = new_STRING0(1) ; + sval->str[0] = *s++ ; + split_buff[i++] = sval ; + cnt++ ; + } + return cnt ; +} + +static int +null_ov_split(s) + char *s ; +{ + SPLIT_OV dummy ; + SPLIT_OV *ovp = &dummy ; + int cnt = 0 ; + + while (*s) + { + ovp = ovp->link = ZMALLOC(SPLIT_OV) ; + ovp->sval = new_STRING0(1) ; + ovp->sval->str[0] = *s++ ; + cnt++ ; + } + ovp->link = (SPLIT_OV *) 0 ; + split_ov_list = dummy.link ; + return cnt ; +} + + +/* split(s, X, r) + split s into array X on r + + entry: sp[0] holds r + sp[-1] pts at X + sp[-2] holds s +*/ +CELL * +bi_split(sp) + register CELL *sp ; +{ + int cnt ; /* the number of pieces */ + + + if (sp->type < C_RE) cast_for_split(sp) ; + /* can be C_RE, C_SPACE or C_SNULL */ + sp -= 2 ; + if (sp->type < C_STRING) cast1_to_s(sp) ; + + if (string(sp)->len == 0) /* nothing to split */ + cnt = 0 ; + else + switch ((sp + 2)->type) + { + case C_RE: + cnt = re_split(string(sp)->str, (sp + 2)->ptr) ; + break ; + + case C_SPACE: + cnt = space_split(string(sp)->str, string(sp)->len) ; + break ; + + case C_SNULL: /* split on empty string */ + cnt = null_split(string(sp)->str) ; + break ; + + default: + bozo("bad splitting cell in bi_split") ; + } + + + free_STRING(string(sp)) ; + sp->type = C_DOUBLE ; + sp->dval = (double) cnt ; + + array_load((ARRAY) (sp + 1)->ptr, cnt) ; + + return sp ; +} diff --git a/split.o b/split.o new file mode 100644 index 0000000..9a72005 Binary files /dev/null and b/split.o differ diff --git a/symtype.h b/symtype.h new file mode 100644 index 
0000000..453c943 --- /dev/null +++ b/symtype.h @@ -0,0 +1,189 @@ + +/******************************************** +symtype.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/*$Log: symtype.h,v $ + * Revision 1.6 1996/02/01 04:39:43 mike + * dynamic array scheme + * + * Revision 1.5 1995/04/21 14:20:23 mike + * move_level variable to fix bug in arglist patching of moved code. + * + * Revision 1.4 1994/12/13 00:13:02 mike + * delete A statement to delete all of A at once + * + * Revision 1.3 1993/12/01 14:25:25 mike + * reentrant array loops + * + * Revision 1.2 1993/07/15 01:55:08 mike + * rm SIZE_T & indent + * + * Revision 1.1.1.1 1993/07/03 18:58:21 mike + * move source to cvs + * + * Revision 5.5 1993/01/09 19:03:44 mike + * code_pop checks if the resolve_list needs relocation + * + * Revision 5.4 1993/01/07 02:50:33 mike + * relative vs absolute code + * + * Revision 5.3 1992/12/17 02:48:01 mike + * 1.1.2d changes for DOS + * + * Revision 5.2 1992/07/08 15:44:44 brennan + * patch2: length returns. 
I am a wimp + * + * Revision 5.1 1991/12/05 07:59:37 brennan + * 1.1 pre-release + * +*/ + +/* types related to symbols are defined here */ + +#ifndef SYMTYPE_H +#define SYMTYPE_H + + +/* struct to hold info about builtins */ +typedef struct { +char *name ; +PF_CP fp ; /* ptr to function that does the builtin */ +unsigned char min_args, max_args ; +/* info for parser to check correct number of arguments */ +} BI_REC ; + +/*--------------------------- + structures and types for arrays + *--------------------------*/ + +#include "array.h" + +extern ARRAY Argv ; + +#if 0 +/* struct to hold the state of an array loop */ +typedef struct al_state { +struct al_state *link ; +CELL *var ; +ARRAY A ; +int index ; /* A[index] */ +ANODE *ptr ; +} ALOOP_STATE ; + +int PROTO( inc_aloop_state, (ALOOP_STATE*)) ; +#endif + +/* for parsing (i,j) in A */ +typedef struct { +int start ; /* offset to code_base */ +int cnt ; +} ARG2_REC ; + +/*------------------------ + user defined functions + ------------------------*/ + +typedef struct fblock { +char *name ; +INST *code ; +unsigned short nargs ; +char *typev ; /* array of size nargs holding types */ +} FBLOCK ; /* function block */ + +void PROTO(add_to_fdump_list, (FBLOCK *) ) ; +void PROTO( fdump, (void) ) ; + +/*------------------------- + elements of the symbol table + -----------------------*/ + +#define ST_NONE 0 +#define ST_VAR 1 +#define ST_KEYWORD 2 +#define ST_BUILTIN 3 /* a pointer to a builtin record */ +#define ST_ARRAY 4 /* a void * ptr to a hash table */ +#define ST_FIELD 5 /* a cell ptr to a field */ +#define ST_FUNCT 6 +#define ST_NR 7 /* NR is special */ +#define ST_ENV 8 /* and so is ENVIRON */ +#define ST_LENGTH 9 /* ditto and bozo */ +#define ST_LOCAL_NONE 10 +#define ST_LOCAL_VAR 11 +#define ST_LOCAL_ARRAY 12 + +#define is_local(stp) ((stp)->type>=ST_LOCAL_NONE) + +typedef struct { +char *name ; +char type ; +unsigned char offset ; /* offset in stack frame for local vars */ +union { +CELL *cp ; +int kw ; +PF_CP fp 
; +BI_REC *bip ; +ARRAY array ; +FBLOCK *fbp ; +} stval ; +} SYMTAB ; + + +/***************************** + structures for type checking function calls + ******************************/ + +typedef struct ca_rec { +struct ca_rec *link ; +short type ; +short arg_num ; /* position in callee's stack */ +/*--------- this data only set if we'll need to patch -------*/ +/* happens if argument is an ID or type ST_NONE or ST_LOCAL_NONE */ + +int call_offset ; +/* where the type is stored */ +SYMTAB *sym_p ; /* if type is ST_NONE */ +char *type_p ; /* if type is ST_LOCAL_NONE */ +} CA_REC ; /* call argument record */ + +/* type field of CA_REC matches with ST_ types */ +#define CA_EXPR ST_LOCAL_VAR +#define CA_ARRAY ST_LOCAL_ARRAY + +typedef struct fcall { +struct fcall *link ; +FBLOCK *callee ; +short call_scope ; +short move_level ; +FBLOCK *call ; /* only used if call_scope == SCOPE_FUNCT */ +INST *call_start ; /* computed later as code may be moved */ +CA_REC *arg_list ; +short arg_cnt_checked ; +unsigned line_no ; /* for error messages */ +} FCALL_REC ; + +extern FCALL_REC *resolve_list ; + +void PROTO(resolve_fcalls, (void) ) ; +void PROTO(check_fcall, (FBLOCK*,int,int,FBLOCK*,CA_REC*,unsigned) ) ; +void PROTO(relocate_resolve_list, (int,int,FBLOCK*,int,unsigned,int)) ; + +/* hash.c */ +unsigned PROTO( hash, (char *) ) ; +SYMTAB *PROTO( insert, (char *) ) ; +SYMTAB *PROTO( find, (char *) ) ; +char *PROTO( reverse_find, (int, PTR)) ; +SYMTAB *PROTO( save_id, (char *) ) ; +void PROTO( restore_ids, (void) ) ; + +/* error.c */ +void PROTO(type_error, (SYMTAB *) ) ; + +#endif /* SYMTYPE_H */ diff --git a/test/decl-awk.out b/test/decl-awk.out new file mode 100644 index 0000000..cd45e90 --- /dev/null +++ b/test/decl-awk.out @@ -0,0 +1,10 @@ +hash: function returning unsigned (extern) +last_dhash: unsigned (static) +A: ARRAY +sval: pointer to STRING +cflag: int +A: ARRAY +d: double +cflag: int +ap: pointer to ANODE +signal: function returning pointer to function returning void 
diff --git a/test/fpe_test b/test/fpe_test new file mode 100755 index 0000000..c150f52 --- /dev/null +++ b/test/fpe_test @@ -0,0 +1,103 @@ +#!/bin/sh + +# tests if mawk has been compiled to correctly handle +# floating point exceptions +# +# $Log: fpe_test,v $ +# Revision 1.3 1995/08/29 14:17:18 mike +# exit 2 changes +# +# Revision 1.2 1994/12/18 18:51:55 mike +# recognize NAN printed as ? for hpux +# + +PATH=.:$PATH + +test1='BEGIN{ print 4/0 }' + + +test2='BEGIN { + x = 100 + do { y = x ; x *= 1000 } while ( y != x ) + print "loop terminated" +}' + +test3='BEGIN{ print log(-8) }' + + +echo "testing division by zero" +echo mawk "$test1" +mawk "$test1" +ret1=$? +echo + +echo "testing overflow" +echo mawk "$test2" +mawk "$test2" +ret2=$? +echo + +echo "testing domain error" +echo mawk "$test3" +mawk "$test3" > temp$$ +ret3=$? +cat temp$$ +echo + + +# the returns should all be zero or all 2 +# core dumps not allowed + +trap ' +echo compilation defines for floating point are incorrect +rm -f temp$$ +exit 1' 0 + +echo +echo ============================== + +echo return1 = $ret1 +echo return2 = $ret2 +echo return3 = $ret3 + + +[ $ret1 -gt 128 ] && { echo test1 failed ; exception=1 ; } +[ $ret2 -gt 128 ] && { echo test2 failed ; exception=1 ; } +[ $ret3 -gt 128 ] && { echo test3 failed ; exception=1 ; } + +[ "$exception" = 1 ] && { rm -f core temp$$ ; exit 1 ; } + + +same=0 + +[ $ret1 = $ret2 ] && [ $ret2 = $ret3 ] && same=1 + + +if [ $same = 1 ] + then + if [ $ret1 = 0 ] + then + echo results consistent: ignoring floating exceptions + # some versions of hpux print NAN as ? + if egrep '[nN][aA][nN]|\?' temp$$ > /dev/null + then : + else + echo "but the library is not IEEE754 compatible" + echo "test 3 failed" + exit 1 + fi + else echo results consistent: trapping floating exceptions + fi + + trap 0 + rm -f temp$$ + exit 0 + + else + echo results are not consistent +echo 'return values should all be 0 if ignoring FPEs (e.g. 
with IEEE754) +or all 2 if trapping FPEs' + +exit 1 +fi + diff --git a/test/fpe_test.bat b/test/fpe_test.bat new file mode 100644 index 0000000..af76ab4 --- /dev/null +++ b/test/fpe_test.bat @@ -0,0 +1,139 @@ +echo off +rem tests if mawk has been compiled to correctly handle +rem floating point exceptions + +echo testing division by zero +type fpetest1.awk +..\mawk -f fpetest1.awk +if errorlevel 128 goto :test1_128 +if errorlevel 3 goto :test1_3 +if errorlevel 2 goto :test1_2 +if errorlevel 1 goto :test1_1 +set ret1=0 +goto :test2 +:test1_128 +set ret1=128 +goto :test2 +:test1_3 +set ret1=3 +goto :test2 +:test1_2 +set ret1=2 +goto :test2 +:test1_1 +set ret1=1 + +:test2 +echo testing overflow +type fpetest2.awk +..\mawk -f fpetest2.awk +if errorlevel 128 goto :test2_128 +if errorlevel 3 goto :test2_3 +if errorlevel 2 goto :test2_2 +if errorlevel 1 goto :test2_1 +set ret2=0 +goto :test3 +:test2_128 +set ret2=128 +goto :test3 +:test2_3 +set ret2=3 +goto :test3 +:test2_2 +set ret2=2 +goto :test3 +:test2_1 +set ret2=1 + +:test3 +echo testing domain error +type fpetest3.awk +..\mawk -f fpetest3.awk > temp$$ +if errorlevel 128 goto :test3_128 +if errorlevel 3 goto :test3_3 +if errorlevel 2 goto :test3_2 +if errorlevel 1 goto :test3_1 +set ret3=0 +goto :type3 +:test3_128 +set ret3=128 +goto :type3 +:test3_3 +set ret3=3 +goto :type3 +:test3_2 +set ret3=2 +goto :type3 +:test3_1 +set ret3=1 + +:type3 +type temp$$ + +rem the returns should all be zero or all 2 + +echo ************************************* +echo return1 = %ret1% +echo return2 = %ret2% +echo return3 = %ret3% + +set exception=0 +if %ret1% == 2 goto :okay1 +if %ret1% == 0 goto :okay1 +echo test1 failed +set exception=1 +:okay1 +if %ret2% == 2 goto :okay2 +if %ret2% == 0 goto :okay2 +echo test2 failed +set exception=1 +:okay2 +if %ret3% == 2 goto :okay3 +if %ret3% == 0 goto :okay3 +echo test3 failed +set exception=1 +:okay3 + +if %exception% == 1 goto :done + +set same=1 +if %ret1% == %ret2% goto :same12 +set 
same=0 +:same12 +if %ret2% == %ret3% goto :same23 +set same=0 +:same23 + +if %same% == 1 goto :same123 +echo results are not consistent +echo return values should all be 0 if ignoring FPEs (e.g. with IEEE754) +echo or all 2 if trapping FPEs +goto :cleanup + +:same123 +if %ret1% == 0 goto :allzero +echo results consistent: trapping floating exceptions +goto :cleanup + +:allzero +echo results consistent: ignoring floating exceptions +grep -i nan temp$$ >NUL +if not errorlevel 1 goto :cleanup +echo but the library is not IEEE754 compatible +echo test 3 failed + +:cleanup +del temp$$ + +:done +set ret1= +set ret2= +set ret3= +set same= +if %exception% == 1 goto :done1 +set exception= +exit 0 +:done1 +set exception= +exit 1 +exit %exception% diff --git a/test/fpe_test.g b/test/fpe_test.g new file mode 100644 index 0000000..25f6f65 --- /dev/null +++ b/test/fpe_test.g @@ -0,0 +1,21 @@ +# tests if mawk has been compiled to correctly handle +# floating point exceptions + +echo testing division by zero +mawk -f fpetest1.awk +echo ========================== status = $status ========================== + +echo testing overflow +mawk -f fpetest2.awk +echo ========================== status = $status ========================== + +echo testing domain error +cat fpetest3.awk +mawk -f fpetest3.awk >temp +echo ========================== status = $status ========================== + +cat temp + +echo the returns should be 1 0 1 +echo note on the atari it cannot be 1 1 1 +# rm temp diff --git a/test/fpe_test.v7 b/test/fpe_test.v7 new file mode 100755 index 0000000..9698fe1 --- /dev/null +++ b/test/fpe_test.v7 @@ -0,0 +1,90 @@ +: tests if mawk has been compiled to correctly handle +: floating point exceptions + +test1='BEGIN{ print 4/0 }' + + +test2='BEGIN { + x = 100 + do { y = x ; x *= 1000 } while ( y != x ) + print "loop terminated" +}' + +test3='BEGIN{ print log(-8) }' + + +echo "testing division by zero" +echo mawk "$test1" +./mawk "$test1" +ret1=$? 
+echo + +echo "testing overflow" +echo mawk "$test2" +./mawk "$test2" +ret2=$? +echo + +echo "testing domain error" +echo mawk "$test3" +./mawk "$test3" > temp$$ +ret3=$? +cat temp$$ +echo + + +: the returns should all be zero or all 1 +: core dumps not allowed + +trap ' +echo compilation defines for floating point are incorrect +rm -f temp$$ +exit 1' 0 + +echo +echo ============================== + +echo return1 = $ret1 +echo return2 = $ret2 +echo return3 = $ret3 + + +[ $ret1 -gt 128 ] && { echo test1 failed ; exception=1 ; } +[ $ret2 -gt 128 ] && { echo test2 failed ; exception=1 ; } +[ $ret3 -gt 128 ] && { echo test3 failed ; exception=1 ; } + +[ "$exception" = 1 ] && { rm -f core temp$$ ; exit 1 ; } + + +same=0 + +[ $ret1 = $ret2 ] && [ $ret2 = $ret3 ] && same=1 + + +if [ $same = 1 ] + then + if [ $ret1 = 0 ] + then + echo results consistent: ignoring floating exceptions + if grep -i nan temp$$ > /dev/null + then : + else + echo "but the library is not IEEE754 compatible" + echo "test 3 failed" + exit 1 + fi + else echo results consistent: trapping floating exceptions + fi + + trap 0 + rm -f temp$$ + exit 0 + + else + echo results are not consistent +echo 'return values should all be 0 if ignoring FPEs (e.g. 
with IEEE754) +or all 1 if trapping FPEs' + +exit 1 +fi + diff --git a/test/fpetest1.awk b/test/fpetest1.awk new file mode 100644 index 0000000..4fcfb4d --- /dev/null +++ b/test/fpetest1.awk @@ -0,0 +1 @@ +BEGIN{ print 4/0 } diff --git a/test/fpetest2.awk b/test/fpetest2.awk new file mode 100644 index 0000000..c9aaccf --- /dev/null +++ b/test/fpetest2.awk @@ -0,0 +1,5 @@ +BEGIN { + x = 100 + do { y = x ; x *= 1000 } while ( y != x ) + print "loop terminated" +} diff --git a/test/fpetest3.awk b/test/fpetest3.awk new file mode 100644 index 0000000..8246bc3 --- /dev/null +++ b/test/fpetest3.awk @@ -0,0 +1 @@ +BEGIN{ print log(-8) } diff --git a/test/full-awk.dat b/test/full-awk.dat new file mode 100644 index 0000000..28eb649 --- /dev/null +++ b/test/full-awk.dat @@ -0,0 +1,3 @@ +This has to be a small file to check if mawk handles write errors correctly +even on a full disk. It has to be smaller than the write buffer of the +C library. diff --git a/test/mawktest b/test/mawktest new file mode 100755 index 0000000..41ccb88 --- /dev/null +++ b/test/mawktest @@ -0,0 +1,78 @@ +#!/bin/sh + +# This is a simple test that a new made mawk seems to +# be working OK. +# It's certainly not exhaustive, but the last two tests in +# particular use most features. 
+# +# It needs to be run from mawk/test +# and mawk needs to be in mawk/test or in PATH + +dat=mawktest.dat + +trap 'echo mawk_test failed ; rm -f temp$$ ; exit 1' 0 + +PATH=.:$PATH + +# find out which mawk we're testing +mawk -W version + + +################################# +echo +echo testing input and field splitting + +mawk -f wc.awk $dat | cmp -s - wc-awk.out || exit + +echo input and field splitting OK +##################################### + +echo +echo testing regular expression matching +mawk -f reg0.awk $dat > temp$$ +mawk -f reg1.awk $dat >> temp$$ +mawk -f reg2.awk $dat >> temp$$ + +cmp -s reg-awk.out temp$$ || exit + +echo regular expression matching OK +####################################### + +echo +if [ -c /dev/full ]; then + echo testing checking for write errors + # Check for write errors noticed when closing the file + mawk '{print}' /dev/full 2>/dev/null && exit + # Check for write errors noticed on writing + # The file has to be bigger than the buffer size of the libc + mawk '{print}' <../scan.c >/dev/full 2>/dev/null && exit + + echo checking for write errors OK +else + echo "No /dev/full - check for write errors skipped" +fi + +####################################### + +echo +echo testing arrays and flow of control + +mawk -f wfrq0.awk $dat | cmp -s - wfrq-awk.out || exit + +echo array test OK +################################# + +echo +echo testing function calls and general stress test + +mawk -f ../examples/decl.awk $dat | cmp -s - decl-awk.out || exit + +echo general stress test passed + + +echo +echo tested mawk seems OK + +trap 0 +rm -f temp$$ +exit 0 diff --git a/test/mawktest.bat b/test/mawktest.bat new file mode 100644 index 0000000..7034671 --- /dev/null +++ b/test/mawktest.bat @@ -0,0 +1,51 @@ +echo off +rem This is a simple test that a new made mawk seems to +rem be working OK. +rem It's certainly not exhaustive, but the last two tests in +rem particular use most features. 
+rem +rem It needs to be run from mawk/test and mawk needs to be in PATH +rem +rem it's too bad that years after MSDOS was introduced that basic +rem system utilities like fc still don't return valid exit codes!!! + +set dat=mawktest.dat +if %CMP%.==. set CMP=cmp + +rem find out which mawk we're testing +..\mawk -Wv + +rem ################################ + +echo testing input and field splitting +..\mawk -f wc.awk %dat% > temp$$ +%CMP% temp$$ wc-awk.out +if errorlevel 1 goto :done + +rem #################################### + +echo testing regular expression matching +..\mawk -f reg0.awk %dat% > temp$$ +..\mawk -f reg1.awk %dat% >> temp$$ +..\mawk -f reg2.awk %dat% >> temp$$ +%CMP% temp$$ reg-awk.out +if errorlevel 1 goto :done + +rem ###################################### + +echo testing arrays and flow of control +..\mawk -f wfrq0.awk %dat% > temp$$ +%CMP% temp$$ wfrq-awk.out +if errorlevel 1 goto :done + +rem ################################ + +echo testing function calls and general stress test +..\mawk -f ../examples/decl.awk %dat% > temp$$ +%CMP% temp$$ decl-awk.out +if errorlevel 1 goto :done + +echo if %CMP% always encountered "no differences", then the tested mawk seems OK +:done +del temp$$ +set dat= diff --git a/test/mawktest.dat b/test/mawktest.dat new file mode 100644 index 0000000..e4e0007 --- /dev/null +++ b/test/mawktest.dat @@ -0,0 +1,107 @@ + +#include + +extern unsigned hash() ; + +/* An array A is a pointer to an array of struct array, + which is two hash tables in one. One for strings + and one for doubles. + + each array is of size A_HASH_PRIME. + + When an index is deleted via delete A[i], the + ANODE is not removed from the hash chain. A[i].cp + and A[i].sval are both freed and sval is set NULL. + This method of deletion simplifies for( i in A ) loops. + + On the D_ANODE list, we use real deletion and move to the + front on access. 
+ + Separate nodes (as opposed to one type of node on two lists) + to + (1) d1 != d2, but sprintf(A_FMT,d1) == sprintf(A_FMT,d1) + so two dnodes can point at the same anode. + (2) Save a little data space(64K PC mentality). + + the cost is an extra level of indirection. + + Some care is needed so that things like + A[1] = 2 ; delete A["1"] work . +*/ + +#define _dhash(d) (((int)(d)&0x7fff)%A_HASH_PRIME) +#define DHASH(d) (last_dhash=_dhash(d)) +static unsigned last_dhash ; + +/* switch =======;;;;;;hhhh */ + +static ANODE *find_by_sval(A, sval, cflag) + ARRAY A ; + STRING *sval ; + int cflag ; /* create if on */ +{ + char *s = sval->str ; + unsigned h = hash(s) % A_HASH_PRIME ; + register ANODE *p = A[h].link ; + ANODE *q = 0 ; /* holds first deleted ANODE */ + + while ( p ) + { + if ( p->sval ) + { if ( strcmp(s,p->sval->str) == 0 ) return p ; } + else /* its deleted, mark with q */ + if ( ! q ) q = p ; + + p = p->link ; + } + + /* not there */ + if ( cflag ) + { + if ( q ) p = q ; /* reuse the deleted node q */ + else + { p = (ANODE *)zmalloc(sizeof(ANODE)) ; + p->link = A[h].link ; A[h].link = p ; + } + + p->sval = sval ; + sval->ref_cnt++ ; + p->cp = (CELL *) zmalloc(sizeof(CELL)) ; + p->cp->type = C_NOINIT ; + } + return p ; +} + + +/* on the D_ANODE list, when we find a node we move it + to the front of the hash chain */ + +static D_ANODE *find_by_dval(A, d, cflag) + ARRAY A ; + double d ; + int cflag ; +{ + unsigned h = DHASH(d) ; + register D_ANODE *p = A[h].dlink ; + D_ANODE *q = 0 ; /* trails p for move to front */ + ANODE *ap ; + + while ( p ) + if ( p->dval == d ) + { /* found */ + if ( ! 
p->ap->sval ) /* but it was deleted by string */ + { if ( q ) q->dlink = p->dlink ; + else A[h].dlink = p->dlink ; + zfree(p, sizeof(D_ANODE)) ; + break ; + } + /* found */ + if ( !q ) return p ; /* already at front */ + else /* delete to put at front */ + { q->dlink = p->dlink ; goto found ; } + } + else + { q = p ; p = p->dlink ; } + +void (*signal())() ; + diff --git a/test/mawktest.g b/test/mawktest.g new file mode 100644 index 0000000..0cd94d0 --- /dev/null +++ b/test/mawktest.g @@ -0,0 +1,49 @@ +# mawk test gulam script +# +# This is a simple test that a new made mawk seems to +# be working OK. +# Its certainly not exhaustive, but the last two tests in +# particular use most features. +# +# It needs to be run from mawk/test and mawk needs to be in PATH +# + +## set dat=mawk_test.dat + +# find out which mawk were testing +echo testing mawk version +.\mawk.ttp -W version +echo ===================== status = $status ===================== +echo " " +# ################################ + +echo testing input and field splitting +.\mawk.ttp -f wc.awk mawk_tes.dat >temp1 +diff -c temp1 wc-awk.out +echo ===================== status = $status ===================== +echo " " +# #################################### + +echo testing regular expression matching +.\mawk.ttp -f reg0.awk mawk_tes.dat >temp2 +.\mawk.ttp -f reg1.awk mawk_tes.dat >>temp2 +.\mawk.ttp -f reg2.awk mawk_tes.dat >>temp2 +diff -c temp2 reg-awk.out +echo ===================== status = $status ===================== +echo " " +# ###################################### + +echo testing arrays and flow of control +.\mawk.ttp -f wfrq0.awk mawk_tes.dat >temp3 +diff -c temp3 wfrq-awk.out +echo ===================== status = $status ===================== +echo " " +# ################################ + +echo testing function calls and general stress test +.\mawk.ttp -f examples\decl.awk mawk_tes.dat >temp4 +diff -c temp4 decl-awk.out +echo ===================== status = $status ===================== +echo " " +echo 
if the status after each test is 0, then the tested mawk seems OK +#rm temp[1-4] diff --git a/test/mawktest.v7 b/test/mawktest.v7 new file mode 100755 index 0000000..9fad1c4 --- /dev/null +++ b/test/mawktest.v7 @@ -0,0 +1,46 @@ +: 'This is a simple test that a new made mawk seems to' +: 'be working OK.' +: 'It is certainly not exhaustive, but the last two tests in' +: 'particular use most features.' +: + +dat=mawktest.dat + +trap 'echo mawk_test failed ; rm -f temp$$ ; exit 1' 0 + +: 'find out which mawk we are testing' +./mawk -Wv + + +echo testing input and field splitting +./mawk -f wc.awk $dat | cmp -s - wc-awk.out || exit + +echo input and field splitting OK + +echo +echo testing regular expression matching +./mawk -f reg0.awk $dat > temp$$ +./mawk -f reg1.awk $dat >> temp$$ +./mawk -f reg2.awk $dat >> temp$$ + +cmp -s reg-awk.out temp$$ || exit +echo regular expression matching OK + +echo +echo testing arrays and flow of control +./mawk -f wfrq0.awk $dat | cmp -s - wfrq-awk.out || exit + +echo array test OK + +echo +echo testing function calls and general stress test +./mawk -f ../examples/decl.awk $dat | cmp -s - decl-awk.out || exit + +echo general stress test passed + +echo +echo tested mawk seems OK + +trap 0 +rm -f temp$$ +exit 0 diff --git a/test/reg-awk.out b/test/reg-awk.out new file mode 100644 index 0000000..5c16610 --- /dev/null +++ b/test/reg-awk.out @@ -0,0 +1,3 @@ +3 +4 +1 diff --git a/test/reg0.awk b/test/reg0.awk new file mode 100644 index 0000000..fdc1411 --- /dev/null +++ b/test/reg0.awk @@ -0,0 +1,3 @@ + +/return/ {cnt++} +END{print cnt} diff --git a/test/reg1.awk b/test/reg1.awk new file mode 100644 index 0000000..2c64f7d --- /dev/null +++ b/test/reg1.awk @@ -0,0 +1,3 @@ + +/return|switch/ {cnt++} +END{print cnt} diff --git a/test/reg2.awk b/test/reg2.awk new file mode 100644 index 0000000..8e27fd0 --- /dev/null +++ b/test/reg2.awk @@ -0,0 +1,3 @@ + +/[A-Za-z_][A-Za-z0-9_]*\[.*\][ \t]*=/ {cnt++} +END{print cnt} diff --git a/test/wc-awk.out 
b/test/wc-awk.out new file mode 100644 index 0000000..52e3c51 --- /dev/null +++ b/test/wc-awk.out @@ -0,0 +1 @@ +107 479 diff --git a/test/wc.awk b/test/wc.awk new file mode 100644 index 0000000..0875399 --- /dev/null +++ b/test/wc.awk @@ -0,0 +1,3 @@ + +{sum += NF} +END{ print NR, sum} diff --git a/test/wfrq-awk.out b/test/wfrq-awk.out new file mode 100644 index 0000000..abc55d0 --- /dev/null +++ b/test/wfrq-awk.out @@ -0,0 +1,20 @@ + 29 p + 21 A + 14 ANODE + 13 q + 12 d + 12 sval + 10 if + 10 the + 8 dlink + 8 h + 8 is + 7 to + 6 D + 6 of + 5 cflag + 5 deleted + 5 else + 5 front + 5 hash + 5 link diff --git a/test/wfrq0.awk b/test/wfrq0.awk new file mode 100644 index 0000000..7791e0b --- /dev/null +++ b/test/wfrq0.awk @@ -0,0 +1,98 @@ + +# this program finds the twenty most freq +# words in document using a heap sort at the end +# +# + +function down_heap(i, k,hold) +{ + while ( 1 ) + { + if ( compare(heap[2*i], heap[2*i+1]) <= 0 ) k = 2*i + else k = 2*i + 1 + + if ( compare(heap[i],heap[k]) <= 0 ) return + + hold = heap[k] ; heap[k] = heap[i] ; heap[i] = hold + i = k + } +} + +# compares two values of form "number word" +# by number and breaks ties by word (reversed) + +function compare(s1, s2, t, X) +{ + t = (s1+0) - (s2+0) # forces types to number + + if ( t == 0 ) + { + split(s1, X); s1 = X[2] + split(s2, X); s2 = X[2] + if ( s2 < s1 ) return -1 + return s1 < s2 + } + + return t +} + + +BEGIN { RS = "[^a-zA-Z]+" ; BIG = "999999:" } + +{ cnt[$0]++ } + +END { delete cnt[ "" ] + +# load twenty values +j = 1 +for( i in cnt ) +{ + heap[j] = num_word( cnt[i] , i ) + delete cnt[i] ; + if ( ++j == 21 ) break ; +} + +# make some sentinals +for( i = j ; i < 43 ; i++ ) heap[i] = BIG + +h_empty = j # save the first empty slot +# make a heap with the smallest in slot 1 +for( i = h_empty - 1 ; i > 0 ; i-- ) down_heap(i) + +# examine the rest of the values +for ( i in cnt ) +{ + j = num_word(cnt[i], i) + if ( compare(j, heap[1]) > 0 ) + { # its bigger + # take the smallest 
out of the heap and readjust + heap[1] = j + down_heap(1) + } +} + +h_empty-- ; + +# what's left are the twenty largest +# smallest at the top +# + +i = 20 +while ( h_empty > 1 ) +{ + buffer[i--] = heap[1] + heap[1] = heap[h_empty] + heap[h_empty] = BIG + down_heap(1) + h_empty-- +} + buffer[i--] = heap[1] + + for(j = 1 ; j <= 20 ; j++ ) print buffer[j] +} + + +function num_word(num, word) +{ + return sprintf("%3d %s", num, word) +} diff --git a/types.h b/types.h new file mode 100644 index 0000000..c4e4736 --- /dev/null +++ b/types.h @@ -0,0 +1,107 @@ + +/******************************************** +types.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + + +/* $Log: types.h,v $ + * Revision 1.3 1993/07/15 23:56:18 mike + * general cleanup + * + * Revision 1.2 1993/07/04 12:52:15 mike + * start on autoconfig changes + * + * Revision 5.1 1991/12/05 07:59:39 brennan + * 1.1 pre-release + * +*/ + + +/* types.h */ + +#ifndef MAWK_TYPES_H +#define MAWK_TYPES_H + +#include "sizes.h" + + +/* CELL types */ + +#define C_NOINIT 0 +#define C_DOUBLE 1 +#define C_STRING 2 +#define C_STRNUM 3 +#define C_MBSTRN 4 + /*could be STRNUM, has not been checked */ +#define C_RE 5 +#define C_SPACE 6 + /* split on space */ +#define C_SNULL 7 + /* split on the empty string */ +#define C_REPL 8 + /* a replacement string '\&' changed to & */ +#define C_REPLV 9 + /* a vector replacement -- broken on & */ +#define NUM_CELL_TYPES 10 + +/* these defines are used to check types for two + CELLs which are adjacent in memory */ + +#define TWO_NOINITS (2*(1<$*.c + $(CC) $(CFLAGS) $(CFLAGS2) -c $*.c 2>&1 | hash8 decode TABLE + rm $*.c + +.c.o: + $(CC) $(CFLAGS) $(CFLAGS2) -c $*.c 2>&1 | hash8 decode TABLE + +.cl.c: + hash8 -r va_alist -r va_start encode TABLE <$< >$*.c + 
+.hl.h: + hash8 -r va_alist -r va_start encode TABLE <$< >$@ + +####################################### + +O=parse.o scan.o memory.o main.o hash.o execute.o code.o\ + da.o error.o init.o bi_vars.o cast.o print.o bi_funct.o\ + kw.o jmp.o array.o field.o split.o re_cmpl.o zmalloc.o\ + fin.o files.o scancode.o matherr.o fcall.o version.o + +REXP_O=rexp/rexp.o rexp/rexp0.o rexp/rexp1.o rexp/rexp2.o\ + rexp/rexp3.o rexp/rexpdb.o + + +mawk_and_test : mawk mawk_test fpe_test + +mawk : $(O) rexp/regexp.a + $(CC) $(LDFLAGS) -o mawk $(O) -lm rexp/regexp.a + +mawk_test : mawk # test that we have a sane mawk + @cp mawk test/mawk + cd test ; ./mawk_test.v7 + @rm test/mawk + +fpe_test : mawk # test FPEs are handled OK + @cp mawk test/mawk + @echo ; echo testing floating point exception handling + cd test ; ./fpe_test.v7 + @rm test/mawk + +rexp/regexp.a : $(REXP_O) + cd rexp ; make CC=$(CC) + + +parse.cl : parse.y parse2.xcl + @echo expect 4 shift/reduce conflicts + $(YACC) parse.y + cat y.tab.c parse2.xcl > parse.cl && rm y.tab.c + -if cmp -s y.tab.h parse.hl ;\ + then rm y.tab.h ;\ + else mv y.tab.h parse.hl ; fi + +scancode.cl : makescan.cl scan.h + hash8 -r va_alist -r va_start encode TABLE makescan.c + $(CC) -o makescan.exe $(CFLAGS) makescan.c + ./makescan.exe > scancode.cl + rm makescan.c makescan.exe + +clean : + rm -f *.o rexp/*.o rexp/regexp.a test/mawk core test/core + + +# output from mawk -f deps.awk *.c +array.o : bi_vars.h sizes.h zmalloc.h memory.h types.h field.h mawk.h config.h symtype.h config/Idefault.h +bi_funct.o : fin.h bi_vars.h sizes.h memory.h zmalloc.h regexp.h types.h field.h repl.h files.h bi_funct.h mawk.h config.h symtype.h init.h config/Idefault.h +bi_vars.o : bi_vars.h sizes.h memory.h zmalloc.h types.h field.h mawk.h config.h symtype.h config/Idefault.h init.h +cast.o : parse.h sizes.h memory.h zmalloc.h types.h field.h scan.h repl.h mawk.h config.h symtype.h config/Idefault.h +code.o : sizes.h memory.h zmalloc.h types.h field.h code.h jmp.h 
mawk.h config.h symtype.h config/Idefault.h init.h +da.o : sizes.h memory.h zmalloc.h types.h field.h repl.h code.h bi_funct.h mawk.h config.h symtype.h config/Idefault.h +error.o : parse.h bi_vars.h sizes.h types.h scan.h mawk.h config.h symtype.h config/Idefault.h +execute.o : bi_vars.h fin.h sizes.h memory.h zmalloc.h regexp.h types.h field.h code.h repl.h bi_funct.h mawk.h config.h symtype.h config/Idefault.h +fcall.o : sizes.h memory.h zmalloc.h types.h code.h mawk.h config.h symtype.h config/Idefault.h +field.o : parse.h bi_vars.h sizes.h memory.h zmalloc.h regexp.h types.h field.h scan.h repl.h mawk.h config.h symtype.h config/Idefault.h init.h +files.o : fin.h sizes.h memory.h zmalloc.h types.h files.h mawk.h config.h config/Idefault.h +fin.o : parse.h fin.h bi_vars.h sizes.h memory.h zmalloc.h types.h field.h scan.h mawk.h config.h symtype.h config/Idefault.h +hash.o : sizes.h memory.h zmalloc.h types.h mawk.h config.h symtype.h config/Idefault.h +init.o : bi_vars.h sizes.h memory.h zmalloc.h types.h field.h code.h mawk.h config.h symtype.h config/Idefault.h init.h +jmp.o : sizes.h memory.h zmalloc.h types.h code.h mawk.h jmp.h config.h symtype.h config/Idefault.h init.h +kw.o : parse.h sizes.h types.h mawk.h config.h symtype.h config/Idefault.h init.h +main.o : fin.h bi_vars.h sizes.h memory.h zmalloc.h types.h field.h code.h files.h mawk.h config.h symtype.h config/Idefault.h init.h +makescan.o : parse.h scan.h symtype.h +matherr.o : sizes.h types.h mawk.h config.h config/Idefault.h +memory.o : sizes.h memory.h zmalloc.h types.h mawk.h config.h config/Idefault.h +parse.o : bi_vars.h sizes.h memory.h zmalloc.h types.h field.h code.h files.h bi_funct.h mawk.h jmp.h config.h symtype.h config/Idefault.h +print.o : bi_vars.h parse.h sizes.h memory.h zmalloc.h types.h field.h scan.h files.h bi_funct.h mawk.h config.h symtype.h config/Idefault.h +re_cmpl.o : parse.h sizes.h memory.h zmalloc.h regexp.h types.h scan.h repl.h mawk.h config.h symtype.h 
config/Idefault.h +scan.o : parse.h fin.h sizes.h memory.h zmalloc.h types.h field.h scan.h repl.h code.h files.h mawk.h config.h symtype.h config/Idefault.h init.h +split.o : bi_vars.h parse.h sizes.h memory.h zmalloc.h regexp.h types.h field.h scan.h bi_funct.h mawk.h config.h symtype.h config/Idefault.h +version.o : patchlev.h sizes.h types.h mawk.h config.h config/Idefault.h +zmalloc.o : sizes.h zmalloc.h types.h mawk.h config.h config/Idefault.h diff --git a/v7/README b/v7/README new file mode 100644 index 0000000..2b8179c --- /dev/null +++ b/v7/README @@ -0,0 +1,7 @@ +mawk 1.1.x worked under V7 + +this port has not been updated for 1.2. + +If anyone needs this, the relevant files are here. + +config.h is a guess from V7.h diff --git a/v7/V7.h b/v7/V7.h new file mode 100644 index 0000000..0692c8f --- /dev/null +++ b/v7/V7.h @@ -0,0 +1,83 @@ + +/******************************************** +V7.h +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/* +The port of mawk to V7 is the work of +Carl Mascott (cmascott@world.std.com) +*/ + +/*$Log: V7.h,v $ + * Revision 1.1.1.1 1993/07/03 18:58:32 mike + * move source to cvs + * + * Revision 4.2 1991/11/21 13:30:34 brennan + * + * 11/17/91 C. Mascott declare fprintf, sprintf on V7 + * + * Revision 4.1 91/09/25 11:40:41 brennan + * VERSION 1.0 + * + * Revision 1.4 91/08/16 08:22:09 brennan + * Carl's addition of SW_FP_CHECK for XNX23A + * + * Revision 1.3 91/08/13 09:04:07 brennan + * VERSION .9994 + * + * Revision 1.2 91/06/15 09:28:54 brennan + * Carl's diffs for V7 + * + * 06/11/91 C. 
Mascott change NO_FMOD to HAVE_FMOD + * change NO_STRTOD to HAVE_STRTOD + * + * Revision 1.1 91/06/10 14:20:03 brennan + * Initial revision + * +*/ + +#ifndef CONFIG_H +#define CONFIG_H 1 + +#define V7 + + +#define HAVE_VOID_PTR 0 +#define HAVE_STRTOD 0 +#define HAVE_FMOD 0 +#define HAVE_MATHERR 0 + +#define HAVE_STRING_H 0 +#define HAVE_FCNTL_H 0 + + +#define O_RDONLY 0 +#define O_WRONLY 1 +#define O_RDWR 2 + +#define vfprintf(s,f,a) _doprnt(f,a,s) +#define strchr index +#define strrchr rindex + +#ifdef XNX23A +/* convert double to Boolean. This is a bug work-around for + XENIX-68K 2.3A, where logical test of double doesn't work. This + macro NG for register double. */ +#define D2BOOL(x) (*((long *) &(x))) +#define SW_FP_CHECK 1 +#endif + + +/* these are missing and print.c needs them */ +void fprintf() ; +char *sprintf() ; + +#include "config/Idefault.h" +#endif /* CONFIG_H */ diff --git a/v7/V7_notes b/v7/V7_notes new file mode 100644 index 0000000..c0578cb --- /dev/null +++ b/v7/V7_notes @@ -0,0 +1,65 @@ + MAWK ON V7 UNIX + + +09/08/91 Carl Mascott + + +1. Prerequisites + +hash8 : from comp.sources.unix volume 15 + used by V7 Makefiles + When you build hash8 you should add all long ( > 7 char) + runtime library function names to the reserved word table + +memcmp(), memcpy(), memset() + included in stringlib, comp.sources.unix volume 6 + simple to write if necessary + +2. Procedure + + a. In ~/mawk: + Rename Makefile.v7 Makefile + Rename *.c *.cl + Rename *.xc *.xcl + Rename *.h *.hl + Check CFLAGS and LDFLAGS in Makefile + + Repeat the applicable portions of the above + in ~/mawk/rexp and in ~/mawk/config + + b. From ~/mawk: + make config/V7.h + make config/Idefault.h + ln config/V7.h config.h + + c. Do a make in ~/mawk/rexp + + d. Do a make in ~/mawk + +3. Notes + + a. V7 sh scripts + +The original mawk_test and fpe_test wouldn't run on V7. V7 sh doesn't +have a comment character ('#'). 
Since ':' is actually a statement its +arguments need to be quoted if they contain any special characters. + + b. SW_FP_CHECK + +SW_FP_CHECK has been added. The particular implementation is +for XENIX-68K 2.3A. There are no checks preceding calls to +fmod() because the check is built into mawk's fmod(). This +would be a problem on a system that needs SW_FP_CHECK but +already has fmod() in the RTL. The work-around is to always +use mawk's fmod() if using SW_FP_CHECK. + +SW_FP_CHECK is activated only if XNX23A is defined. The +standard V7 Makefile doesn't define XNX23A, so you needn't +concern yourself with SW_FP_CHECK. + + c. 3-argument open() + +Mawk always calls open() with the 3rd argument set to 0. V7 +open() really takes only 2 arguments. With most UNIX C compilers +extra arguments in function calls are harmless, so the open() +calls have not been altered for V7. diff --git a/v7/config.h b/v7/config.h new file mode 100644 index 0000000..7671841 --- /dev/null +++ b/v7/config.h @@ -0,0 +1,36 @@ + +/* This has never been tested. A first pass for mawk1.2 + based on V7.h that worked on mawk1.1 +*/ + +#ifndef CONFIG_H +#define CONFIG_H 1 + +#define V7 + + +#define NO_VOID_PTR 1 +#define NO_STRTOD 1 +#define NO_FMOD 1 +#define NO_MATHERR 1 +#define NO_FCNTL_H 1 +#define NO_VFPRINTF 1 +#define NO_STRCHR 1 + + +#define O_RDONLY 0 +#define O_WRONLY 1 +#define O_RDWR 2 + + +#ifdef XNX23A +/* convert double to Boolean. This is a bug work-around for + XENIX-68K 2.3A, where logical test of double doesn't work. This + macro NG for register double. */ +#define D2BOOL(x) (*((long *) &(x))) +#define SW_FP_CHECK 1 +#define STDC_MATHERR 1 +#endif + +#define HAVE_REAL_PIPES 1 +#endif /* CONFIG_H */ diff --git a/vargs.h b/vargs.h new file mode 100644 index 0000000..addda42 --- /dev/null +++ b/vargs.h @@ -0,0 +1,74 @@ + +/******************************************** +vargs.h +copyright 1992 Michael D. 
Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/* +$Log: vargs.h,v $ + * Revision 1.4 1994/12/14 14:36:54 mike + * sometimes stdarg.h exists, but depending on compiler flags it is + * unusable -- assume NO_PROTOS => NO_STDARG_H + * + * Revision 1.3 1994/10/08 19:18:38 mike + * trivial change + * + * Revision 1.2 1993/07/04 12:52:19 mike + * start on autoconfig changes + * + * Revision 1.1.1.1 1993/07/03 18:58:22 mike + * move source to cvs + * + * Revision 1.1 1992/10/02 23:23:41 mike + * Initial revision + * +*/ + +/* provides common interface to <stdarg.h> or <varargs.h> + only used for error messages +*/ + +#ifdef NO_PROTOS +#ifndef NO_STDARG_H +#define NO_STDARG_H 1 +#endif +#endif + +#if NO_STDARG_H +#include <varargs.h> + +#ifndef VA_ALIST + +#define VA_ALIST(type, arg) (va_alist) va_dcl { type arg ; +#define VA_ALIST2(t1,a1,t2,a2) (va_alist) va_dcl { t1 a1 ; t2 a2 ; + +#endif + +#define VA_START(p,type, last) va_start(p) ;\ + last = va_arg(p,type) + + +#define VA_START2(p,t1,a1,t2,a2) va_start(p) ;\ + a1 = va_arg(p,t1);\ + a2 = va_arg(p,t2) + +#else /* have stdarg.h */ +#include <stdarg.h> + +#ifndef VA_ALIST +#define VA_ALIST(type, arg) (type arg, ...) { +#define VA_ALIST2(t1,a1,t2,a2) (t1 a1,t2 a2,...) { +#endif + +#define VA_START(p,type,last) va_start(p,last) + +#define VA_START2(p,t1,a1,t2,a2) va_start(p,a2) + +#endif + diff --git a/version.c b/version.c new file mode 100644 index 0000000..35773af --- /dev/null +++ b/version.c @@ -0,0 +1,147 @@ + +/******************************************** +version.c +copyright 1991-95. Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991.
+********************************************/ + +/*$Log: version.c,v $ + *Revision 1.10 1996/07/28 21:47:07 mike + *gnuish patch + * + * Revision 1.9 1996/02/01 04:44:15 mike + * roll a beta version + * + * Revision 1.8 1995/08/20 17:40:45 mike + * changed _stackavail to stackavail for MSC + * + * Revision 1.7 1995/06/10 17:04:10 mike + * "largest field" replaced by "max NF" + * +*/ + +#include "mawk.h" +#include "patchlev.h" + +static char mawkid[] = MAWK_ID ; + +#define VERSION_STRING \ + "mawk 1.3%s%s %s, Copyright (C) Michael D. Brennan\n\n" + +/* If use different command line syntax for MSDOS + mark that in VERSION */ + +#ifndef DOS_STRING +#if MSDOS && ! HAVE_REARGV +#define DOS_STRING "MsDOS" +#endif +#endif + +#ifndef DOS_STRING +#define DOS_STRING "" +#endif + +int print_compiler_id(); +int print_aux_limits(); + +static char fmt[] = "%-14s%10lu\n" ; + +/* print VERSION and exit */ +void +print_version() +{ + + printf(VERSION_STRING, PATCH_STRING, DOS_STRING, DATE_STRING) ; + fflush(stdout) ; + + print_compiler_id() ; + fprintf(stderr, "compiled limits:\n") ; + fprintf(stderr, fmt, "max NF", (long) MAX_FIELD) ; + fprintf(stderr, fmt, "sprintf buffer", (long) SPRINTF_SZ) ; + print_aux_limits() ; + exit(0) ; +} + + +/* + Extra info for MSDOS. 
This code contributed by + Ben Myers +*/ + +#ifdef __TURBOC__ +#include <alloc.h> /* coreleft() */ +#define BORL +#endif + +#ifdef __BORLANDC__ +#include <alloc.h> /* coreleft() */ +#define BORL +#endif + +#ifdef BORL +extern unsigned _stklen = 16 * 1024U ; + /* 4K of stack is enough for a user function call + nesting depth of 75 so this is enough for 300 */ +#endif + +#ifdef _MSC_VER +#include <malloc.h> +#endif + +#ifdef __ZTC__ +#include <dos.h> /* _chkstack */ +#endif + + +int +print_compiler_id() +{ + +#ifdef __TURBOC__ + fprintf(stderr, "MsDOS Turbo C++ %d.%d\n", + __TURBOC__ >> 8, __TURBOC__ & 0xff) ; +#endif + +#ifdef __BORLANDC__ + fprintf(stderr, "MS-DOS Borland C++ __BORLANDC__ %x\n", + __BORLANDC__) ; +#endif + +#ifdef _MSC_VER + fprintf(stderr, "Microsoft C/C++ _MSC_VER %u\n", _MSC_VER) ; +#endif + +#ifdef __ZTC__ + fprintf(stderr, "MS-DOS Zortech C++ __ZTC__ %x\n", __ZTC__) ; +#endif + + return 0 ; /*shut up */ +} + + +int +print_aux_limits() +{ +#ifdef BORL + extern unsigned _stklen ; + fprintf(stderr, fmt, "stack size", (unsigned long) _stklen) ; + fprintf(stderr, fmt, "heap size", (unsigned long) coreleft()) ; +#endif + +#ifdef _MSC_VER + fprintf(stderr, fmt, "stack size", (unsigned long) stackavail()) ; +#endif + +#ifdef __ZTC__ +/* large memory model only with ztc */ + fprintf(stderr, fmt, "stack size??", (unsigned long) _chkstack()) ; + fprintf(stderr, fmt, "heap size", farcoreleft()) ; +#endif + + return 0 ; +} diff --git a/version.o b/version.o new file mode 100644 index 0000000..7435fe1 Binary files /dev/null and b/version.o differ diff --git a/zmalloc.c b/zmalloc.c new file mode 100644 index 0000000..c05eaba --- /dev/null +++ b/zmalloc.c @@ -0,0 +1,195 @@ + +/******************************************** +zmalloc.c +copyright 1991, Michael D. Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991.
+********************************************/ + +/*$Log: zmalloc.c,v $ + * Revision 1.6 1995/06/06 00:18:35 mike + * change mawk_exit(1) to mawk_exit(2) + * + * Revision 1.5 1995/03/08 00:06:26 mike + * add a pointer cast + * + * Revision 1.4 1993/07/14 12:45:15 mike + * run thru indent + * + * Revision 1.3 1993/07/07 00:07:54 mike + * more work on 1.2 + * + * Revision 1.2 1993/07/03 21:15:35 mike + * bye bye yacc_mem + * + * Revision 1.1.1.1 1993/07/03 18:58:23 mike + * move source to cvs + * + * Revision 5.4 1993/02/13 21:57:38 mike + * merge patch3 + * + * Revision 5.3 1993/01/14 13:12:33 mike + * casts in front of malloc + * + * Revision 5.1.1.1 1993/02/06 11:12:19 mike + * fix bug in reuse of parser table memory + * for most users ifdef the mess out + * + * Revision 5.1 1991/12/05 07:56:35 brennan + * 1.1 pre-release + * +*/ + +/* zmalloc.c */ +#include "mawk.h" +#include "zmalloc.h" + + + +/* + zmalloc() gets mem from malloc() in CHUNKS of 2048 bytes + and cuts these blocks into smaller pieces that are multiples + of eight bytes. When a piece is returned via zfree(), it goes + on a linked linear list indexed by its size. The lists are + an array, pool[]. + + E.g., if you ask for 22 bytes with p = zmalloc(22), you actually get + a piece of size 24. When you free it with zfree(p,22) , it is added + to the list at pool[2]. +*/ + +#define POOLSZ 16 + +#define CHUNK 256 + /* number of blocks to get from malloc */ + +static void PROTO(out_of_mem, (void)) ; + + +static void +out_of_mem() +{ + static char out[] = "out of memory" ; + + if (mawk_state == EXECUTION) rt_error(out) ; + else + { + /* I don't think this will ever happen */ + compile_error(out) ; mawk_exit(2) ; + } +} + + +typedef union zblock +{ + char dummy[ZBLOCKSZ] ; + union zblock *link ; +} ZBLOCK ; + +/* ZBLOCKS of sizes 1, 2, ... 16 + which is bytes of sizes 8, 16, ... , 128 + are stored on the linked linear lists in + pool[0], pool[1], ... 
, pool[15] +*/ + +static ZBLOCK *pool[POOLSZ] ; + +/* zmalloc() is a macro in front of bmalloc "BLOCK malloc" */ + +PTR +bmalloc(blocks) + register unsigned blocks ; +{ + register ZBLOCK *p ; + static unsigned amt_avail ; + static ZBLOCK *avail ; + + if (blocks > POOLSZ) + { + p = (ZBLOCK *) malloc(blocks << ZSHIFT) ; + if (!p) out_of_mem() ; + return (PTR) p ; + } + + if ((p = pool[blocks - 1])) + { + pool[blocks - 1] = p->link ; + return (PTR) p ; + } + + if (blocks > amt_avail) + { + if (amt_avail != 0) /* free avail */ + { + avail->link = pool[--amt_avail] ; + pool[amt_avail] = avail ; + } + + if (!(avail = (ZBLOCK *) malloc(CHUNK * ZBLOCKSZ))) + { + /* if we get here, almost out of memory */ + amt_avail = 0 ; + p = (ZBLOCK *) malloc(blocks << ZSHIFT) ; + if (!p) out_of_mem() ; + return (PTR) p ; + } + else amt_avail = CHUNK ; + } + + /* get p from the avail pile */ + p = avail ; avail += blocks ; amt_avail -= blocks ; + return (PTR) p ; +} + +void +bfree(p, blocks) + register PTR p ; + register unsigned blocks ; +{ + + if (blocks > POOLSZ) free(p) ; + else + { + ((ZBLOCK *) p)->link = pool[--blocks] ; + pool[blocks] = (ZBLOCK *) p ; + } +} + +PTR +zrealloc(p, old_size, new_size) + register PTR p ; + unsigned old_size, new_size ; +{ + register PTR q ; + + if (new_size > (POOLSZ << ZSHIFT) && + old_size > (POOLSZ << ZSHIFT)) + { + if (!(q = realloc(p, new_size))) out_of_mem() ; + } + else + { + q = zmalloc(new_size) ; + memcpy(q, p, old_size < new_size ? old_size : new_size) ; + zfree(p, old_size) ; + } + return q ; +} + + + +#ifndef __GNUC__ +/* pacifier for Bison , this is really dead code */ +PTR +alloca(sz) + unsigned sz ; +{ + /* hell just froze over */ + exit(100) ; + return (PTR) 0 ; +} +#endif diff --git a/zmalloc.h b/zmalloc.h new file mode 100644 index 0000000..98e8aec --- /dev/null +++ b/zmalloc.h @@ -0,0 +1,48 @@ + +/******************************************** +zmalloc.h +copyright 1991, Michael D. 
Brennan + +This is a source file for mawk, an implementation of +the AWK programming language. + +Mawk is distributed without warranty under the terms of +the GNU General Public License, version 2, 1991. +********************************************/ + +/*$Log: zmalloc.h,v $ + * Revision 1.2 1993/07/04 12:52:22 mike + * start on autoconfig changes + * + * Revision 1.1.1.1 1993/07/03 18:58:23 mike + * move source to cvs + * + * Revision 5.1 1991/12/05 07:59:41 brennan + * 1.1 pre-release + * +*/ + +/* zmalloc.h */ + +#ifndef ZMALLOC_H +#define ZMALLOC_H + +#include "nstd.h" + +PTR PROTO( bmalloc, (unsigned) ) ; +void PROTO( bfree, (PTR, unsigned) ) ; +PTR PROTO( zrealloc , (PTR,unsigned,unsigned) ) ; + + +#define ZBLOCKSZ 8 +#define ZSHIFT 3 + + +#define zmalloc(size) bmalloc((((unsigned)size)+ZBLOCKSZ-1)>>ZSHIFT) +#define zfree(p,size) bfree(p,(((unsigned)size)+ZBLOCKSZ-1)>>ZSHIFT) + +#define ZMALLOC(type) ((type*)zmalloc(sizeof(type))) +#define ZFREE(p) zfree(p,sizeof(*(p))) + + +#endif /* ZMALLOC_H */ diff --git a/zmalloc.o b/zmalloc.o new file mode 100644 index 0000000..9fef8e7 Binary files /dev/null and b/zmalloc.o differ
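Reviewer note on the zmalloc size arithmetic. The comment in zmalloc.c says a 22-byte request yields a 24-byte piece that is later filed under pool[2]. The sketch below only replays that rounding, using the ZBLOCKSZ, ZSHIFT and POOLSZ values from this diff. The helper names (blocks_for, rounded_size, pool_index, sketch_selfcheck) are hypothetical and are not part of mawk, and the handling of a zero-byte request is an assumption of this sketch rather than something bmalloc() guards against.

```c
#include <assert.h>

/* Constants copied from zmalloc.h and zmalloc.c in this diff. */
#define ZBLOCKSZ 8    /* bytes per block */
#define ZSHIFT   3    /* log2(ZBLOCKSZ) */
#define POOLSZ   16   /* number of free lists in pool[] */

/* Hypothetical helper: blocks needed for `size` bytes.
   Same arithmetic as the zmalloc() and zfree() macros. */
static unsigned blocks_for(unsigned size)
{
    return (size + ZBLOCKSZ - 1) >> ZSHIFT;
}

/* Hypothetical helper: bytes actually handed out for a request. */
static unsigned rounded_size(unsigned size)
{
    return blocks_for(size) << ZSHIFT;
}

/* Hypothetical helper: index into pool[] that a freed piece lands on,
   or -1 when the request is too large for the pools and would go
   straight to malloc()/free().  Returning -1 for a zero-byte request
   is a choice made for this sketch only. */
static int pool_index(unsigned size)
{
    unsigned blocks = blocks_for(size);

    if (blocks == 0 || blocks > POOLSZ)
        return -1;
    return (int) blocks - 1;
}

/* Replays the worked example from the zmalloc.c comment. */
static void sketch_selfcheck(void)
{
    assert(blocks_for(22) == 3);     /* 22 bytes need 3 blocks */
    assert(rounded_size(22) == 24);  /* so 24 bytes are handed out */
    assert(pool_index(22) == 2);     /* and zfree(p,22) uses pool[2] */
}
```

With these values the largest pooled request is 128 bytes (16 blocks, pool[15]); a 129-byte request needs 17 blocks and therefore bypasses the pools, which matches the `blocks > POOLSZ` test in bmalloc() and bfree().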