Quick Guide to nawk

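nawk (new awk) extends the original awk with more built-in functions, better string handling, and user-defined functions. On many modern systems the plain awk command is already a new-awk implementation (nawk, gawk, or mawk), and nawk may be a separate binary or a link to it; where the original awk is still the default, as on older Solaris systems, prefer nawk. This guide covers syntax, built-in variables, patterns, string functions, control flow, arrays, and common one-liners.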

Note: This guide was originally published in 2006 and has been substantially expanded in May 2026 with full content, with assistance from Claude (Anthropic).

Basic Syntax

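nawk reads input line by line and applies rules of the form pattern { action }. If the pattern matches the current line, the action is executed. Either part is optional: a rule with no pattern runs its action on every line, and a rule with no action prints each matching line.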

nawk 'pattern { action }' file
nawk -f script.awk file       # read program from a file
nawk -F: '{ print $1 }' file  # set field separator to :

Each input line is automatically split into fields. $0 is the entire line; $1, $2, … are individual fields. The default field separator is whitespace.
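As a minimal sketch of field splitting, the snippet below feeds two made-up input lines to the interpreter. The AWK variable is an assumption added so the example also runs where nawk is not installed: it falls back to whatever awk is on the PATH.

```shell
# Assumption: use nawk if present, otherwise fall back to awk.
AWK=$(command -v nawk || command -v awk)

# Default splitting: runs of blanks separate fields; leading blanks are ignored.
out_default=$(printf '  alpha   beta gamma\n' | "$AWK" '{ print $2 }')

# -F: switches the field separator to a colon.
out_colon=$(printf 'root:x:0:0\n' | "$AWK" -F: '{ print $1, $3 }')

echo "$out_default"   # beta
echo "$out_colon"     # root 0
```

The comma in print $1, $3 inserts OFS (a space by default) between the two values.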

Built-in Variables

Variable	Meaning
`$0`	The entire current input line
`$1, $2, ...$NF`	Individual fields (1-based)
`NF`	Number of fields in the current line
`NR`	Total number of records (lines) read so far
`FNR`	Record number within the current file (resets per file)
`FS`	Input field separator (default: whitespace)
`OFS`	Output field separator (default: space)
`RS`	Input record separator (default: newline)
`ORS`	Output record separator (default: newline)
`FILENAME`	Name of the file currently being processed
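The variables above are easiest to see side by side. This sketch uses invented sample data and two temporary files (one.txt and two.txt are hypothetical names); the AWK fallback variable is likewise an assumption for portability.

```shell
# Assumption: use nawk if present, otherwise fall back to awk.
AWK=$(command -v nawk || command -v awk)

# NR, NF and $NF per line, joined with a custom OFS.
out_fields=$(printf 'a b c\nd e\n' | "$AWK" 'BEGIN { OFS = "-" } { print NR, NF, $NF }')
echo "$out_fields"
# 1-3-c
# 2-2-e

# NR keeps counting across files; FNR restarts at 1 for each file.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/one.txt"
printf 'c\n'    > "$tmp/two.txt"
out_files=$(cd "$tmp" && "$AWK" '{ print FILENAME, NR, FNR }' one.txt two.txt)
rm -rf "$tmp"
echo "$out_files"
# one.txt 1 1
# one.txt 2 2
# two.txt 3 1
```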

Patterns

Patterns control which lines trigger an action. The four main types:

Pattern	Meaning
`BEGIN`	Runs once before any input is read. Use for initialisation.
`END`	Runs once after all input is processed. Use for summaries.
`/regex/`	Matches lines where the regex matches anywhere in `$0`.
`expression`	Any comparison or boolean expression, e.g. `NF > 3` or `$2 == "SG"`.

BEGIN { print "Start" }
/error/ { print "Found error on line", NR }
NF > 5  { print "Long line:", $0 }
END   { print "Done. Total lines:", NR }
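Patterns can also be combined with && and ||, and a rule may omit its action entirely. The following sketch runs three rules over made-up two-column data; the AWK fallback variable is an assumption, not part of nawk itself.

```shell
# Assumption: use nawk if present, otherwise fall back to awk.
AWK=$(command -v nawk || command -v awk)

out=$(printf 'SG 10\nMY 250\nSG 300\n' | "$AWK" '
  BEGIN { print "code amount" }        # header, before any input
  $1 == "SG" && $2 > 100               # no action: matching lines are printed
  END { print NR, "lines read" }       # summary, after all input
')
echo "$out"
# code amount
# SG 300
# 3 lines read
```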

print and printf

print outputs values separated by OFS and terminated by ORS. printf gives C-style formatted output with no automatic newline.

{ print $1, $3 }                      # print fields 1 and 3
{ print $1 > "output.txt" }           # redirect to file
{ print $1 | "sort" }                 # pipe to command
{ printf "%-10s %5d\n", $1, $2 }      # formatted output

Common printf format specifiers:

Specifier	Output
`%s`	String
`%d`	Integer
`%f`	Floating point
`%-10s`	Left-aligned string in a field of width 10
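To illustrate how widths and precision line up columns, here is a small sketch with invented data. The %6.2f specifier (width 6, two decimals) follows the usual C conventions; the AWK fallback variable is an assumption.

```shell
# Assumption: use nawk if present, otherwise fall back to awk.
AWK=$(command -v nawk || command -v awk)

out=$(printf 'apple 3 1.5\nkiwi 12 0.25\n' |
  "$AWK" '{ printf "%-10s|%5d|%6.2f\n", $1, $2, $3 }')
echo "$out"
# apple     |    3|  1.50
# kiwi      |   12|  0.25
```

Unlike print, printf adds nothing on its own, so the \n in the format string is what ends each output line.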

String Functions

Function	Description
`length(s)`	Length of string s (or `$0` if omitted)
`substr(s, m, n)`	Substring of s starting at position m (1-based), length n; to the end of s if n is omitted
`index(s, t)`	Position of string t in s (0 if not found)
`split(s, a, fs)`	Split s into array a using separator fs (FS if omitted); returns number of fields
`sub(r, s, t)`	Replace first match of regex r in t with s (t defaults to `$0`); returns the number of replacements
`gsub(r, s, t)`	Replace all matches of regex r in t with s (t defaults to `$0`); returns the number of replacements
`match(s, r)`	Returns position of regex r in s (0 if no match); sets RSTART and RLENGTH
`toupper(s)`	Convert s to uppercase
`tolower(s)`	Convert s to lowercase
`sprintf(fmt, ...)`	Return a formatted string (like printf but returns rather than prints)
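Several of these functions are typically used together. The sketch below picks apart a made-up key=value record; the variable names (parts, eq, name, role, line) are illustrative only, and the AWK fallback variable is an assumption.

```shell
# Assumption: use nawk if present, otherwise fall back to awk.
AWK=$(command -v nawk || command -v awk)

out=$(echo 'user=alice;role=admin' | "$AWK" '{
  n = split($0, parts, ";")              # parts[1]="user=alice", parts[2]="role=admin"
  eq = index(parts[1], "=")              # position of "=" in parts[1], here 5
  name = substr(parts[1], eq + 1)        # everything after "="
  if (match($0, /role=[a-z]+/)) {        # sets RSTART=12 and RLENGTH=10
    role = substr($0, RSTART + 5, RLENGTH - 5)
  }
  line = $0
  gsub(/=/, ": ", line)                  # third argument: edit the copy, not $0
  print n, toupper(name), role, length(name)
  print line
}')
echo "$out"
# 2 ALICE admin 5
# user: alice;role: admin
```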

Control Flow

# if / else
{ if ($3 > 100) { print "high" } else { print "low" } }

# while loop
{ i = 1; while (i <= NF) { print $i; i++ } }

# for loop
{ for (i = 1; i <= NF; i++) print $i }

# next  -- skip to next line
/^#/ { next }          # skip comment lines

# exit  -- stop processing
NR > 100 { exit }      # stop after 100 lines

Arithmetic operators: + - * / % ^. Comparison: == != < > <= >=. Regex match: ~ !~ (for example, $1 ~ /^a/). Logical: && || !.
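The constructs above combine naturally in one program. This is a sketch over invented input in which the word stop acts as a hypothetical end marker; the AWK fallback variable is an assumption.

```shell
# Assumption: use nawk if present, otherwise fall back to awk.
AWK=$(command -v nawk || command -v awk)

out=$(printf '# header\n1 2 3\n10 20\nstop\n5 5\n' | "$AWK" '
  /^#/ { next }                     # skip comment lines
  $1 == "stop" { exit }             # stop reading; END still runs
  {
    sum = 0
    for (i = 1; i <= NF; i++) sum += $i
    if (sum > 10) { label = "high" } else { label = "low" }
    print NR, sum, label
    total += sum
  }
  END { print "total", total }
')
echo "$out"
# 2 6 low
# 3 30 high
# total 36
```

Note that exit in a main rule skips the remaining input but still runs the END rule, which is why the total is printed.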

Arrays

nawk arrays are associative (keyed by string). No declaration is needed.

# count occurrences of each value in field 1
{ count[$1]++ }
END { for (k in count) print k, count[k] }

# delete an element
END { delete count["key"] }

# check if a key exists (the in test does not create the element)
END { if ("key" in count) print "found" }

Multi-dimensional arrays are simulated with comma-separated keys: arr[i,j] is stored under the single string key i SUBSEP j.
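A two-key count can be sketched as follows. Keys written as count[$1, $2] are joined with the built-in SUBSEP variable, so split(k, part, SUBSEP) recovers the two parts. The data is made up, the AWK fallback variable is an assumption, and the output is piped through sort because for (k in count) visits keys in no guaranteed order.

```shell
# Assumption: use nawk if present, otherwise fall back to awk.
AWK=$(command -v nawk || command -v awk)

out=$(printf 'sg web\nmy db\nsg web\nsg db\n' | "$AWK" '
  { count[$1, $2]++ }                    # key is $1 SUBSEP $2
  END {
    for (k in count) {
      split(k, part, SUBSEP)             # recover the two subscripts
      print part[1], part[2], count[k]
    }
  }
' | LC_ALL=C sort)
echo "$out"
# my db 1
# sg db 1
# sg web 2
```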

Common One-liners

Task	Command
Print field 2 of every line	`nawk '{ print $2 }' file`
Print lines longer than 80 chars	`nawk 'length > 80' file`
Count lines matching a pattern (prints 0 if none)	`nawk '/error/{ c++ } END{ print c+0 }' file`
Sum a column of numbers (prints 0 for empty input)	`nawk '{ s += $1 } END{ print s+0 }' file`
Print unique lines (first occurrence)	`nawk '!seen[$0]++' file`
Print line numbers	`nawk '{ print NR": "$0 }' file`
Print last field of each line	`nawk '{ print $NF }' file`
Replace a string in field 2	`nawk '{ gsub(/old/, "new", $2); print }' file`
Use colon as field separator	`nawk -F: '{ print $1 }' /etc/passwd`
Print lines between two patterns	`nawk '/START/,/END/' file`
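Two of the one-liners above are terse enough to deserve a quick trace. The sketch below runs them on invented input; the AWK fallback variable is an assumption.

```shell
# Assumption: use nawk if present, otherwise fall back to awk.
AWK=$(command -v nawk || command -v awk)

# seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++ is true
# only for first occurrences; with no action, those lines are printed.
dedup=$(printf 'a\nb\na\nc\nb\n' | "$AWK" '!seen[$0]++')
echo "$dedup"
# a
# b
# c

# A range pattern prints from the first matching line through the second, inclusive.
range=$(printf 'x\nSTART\ny\nEND\nz\n' | "$AWK" '/START/,/END/')
echo "$range"
# START
# y
# END
```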

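For a fuller reference, see the GNU awk manual; it documents gawk, so check that a feature is not a gawk extension before relying on it in nawk. On most systems, man awk (or man nawk, where it is installed) also gives a concise reference.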
