This appendix is adapted from a section of the same name in the book sed & awk, published by O'Reilly & Associates. It describes the generic structure and organization of an awk program.
An awk program is built around what is called a main input loop. A loop is a routine that is executed over and over again until some condition arises that terminates it. You don't write this loop; it is given. It exists as the framework within which the code that you do write will be executed. The main input loop in awk is a routine that reads one line of input at a time from a file and makes it available for processing. The actions you write to do the processing assume that there is a line of input available. In another programming language, you would have to create the main input loop as part of your program.
The main input loop is executed as many times as there are lines of input. This loop does not execute until there is a line of input. It terminates when there is no more input to be read.
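As a minimal sketch of the implicit loop, the following shell commands run a one-action awk program over three lines of made-up input. The sample text and the shell variable names `out` and `empty` are assumptions chosen for illustration. The program contains no loop of its own, yet the action runs once for each line, and it never runs when there is no input at all.

```sh
# The awk program holds a single action and no explicit loop.
# awk's main input loop runs the action once per input line;
# NR is the number of the line currently being processed.
out=$(printf 'alpha\nbeta\ngamma\n' | awk '{ print NR ": " $0 }')
printf '%s\n' "$out"

# With no input lines, the main input loop never executes the action,
# so nothing is printed and the captured output is empty.
empty=$(printf '' | awk '{ print "never printed" }')
printf 'output for empty input: [%s]\n' "$empty"
```

The first command prints each line prefixed with its line number; the second shows empty brackets because the action was never reached.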
awk allows you to write two special routines that can be executed before any input is read and after all input is read. These are the procedures associated with the BEGIN and END rules, respectively. In other words, you can do some pre-processing before the main input loop is ever executed using the BEGIN procedure, and you can do some post-processing with the END procedure after the main input loop has terminated. The BEGIN and END procedures are optional; you do not need to define either of them.
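The before, during, and after stages might be sketched as follows. The input numbers, the awk variable `total`, and the shell variable `out` are illustrative assumptions, not taken from the original text.

```sh
# BEGIN runs once before any input is read, the unlabeled action
# runs once per input line, and END runs once after all input is read.
out=$(printf '3\n4\n5\n' | awk '
BEGIN { print "start" }            # pre-processing
      { total += $1 }              # main input loop: add up the first field
END   { print "total:", total }    # post-processing
')
printf '%s\n' "$out"
```

Removing either the BEGIN or the END rule leaves a valid program, since both are optional.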
You can think of an awk script as having potentially three major parts: what happens before, during, and after processing the input. The "what happens during processing" part is where most of the work gets done. Inside the main loop, your instructions are written as a series of pattern/action procedures. A pattern is a rule for testing the input line to determine whether or not the action should be applied to it. The actions can be quite complex, consisting of statements, functions, and expressions.
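To make the pattern/action idea concrete, here is a small sketch with two rules. The fruit data, the two patterns, and the shell variable `out` are assumptions made up for this example. Each input line is tested against every pattern in turn, and an action runs only for the lines its pattern matches.

```sh
# Two pattern/action rules applied inside the main input loop.
# The first pattern is a regular expression tested against the whole line;
# the second is a numeric comparison on the second field.
out=$(printf 'apple 3\nbanana 12\ncherry 7\n' | awk '
/an/    { print "matches /an/:", $1 }
$2 > 5  { print "count over 5:", $1 }
')
printf '%s\n' "$out"
```

In this sketch the "apple" line matches neither pattern and produces no output, the "banana" line matches both and triggers both actions, and the "cherry" line triggers only the second.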