Practical usage of awk


Many of us know very little about awk, a wonderful C-like scripting language that has been part of almost every Linux distro since the very beginning.

In this post, I want to shed some light on awk and share a few practical examples of its daily use.

The most basic awk construction is often used for printing selected columns. Let’s review the output of a basic command, ‘who’:

$ who
johnny	 pts/0        2020-04-02 02:45 (
root     pts/1        2020-03-31 01:58 (tmux(25770).%0)
kkleim   pts/2        2020-03-31 01:59 (tmux(25770).%1)
howard   pts/3        2020-03-31 02:00 (tmux(25770).%3)
russ     pts/4        2020-03-31 02:01 (tmux(25770).%4)
dude1    pts/5        2020-03-31 02:02 (tmux(25770).%5

The simplest use of awk prints only the first column, here the user names:

$ who | awk '{print $1}'

Generally speaking, awk is a self-sufficient C-like language with variables, loops, conditionals, arrays, and functions. Here is an example of a loop. Writing two strings next to each other concatenates them, so each iteration appends “##” to the string:

awk 'BEGIN { while(i<99){ str=str "##";i++} print str }'
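The same loop can also be written with a for statement. The sketch below uses 3 iterations instead of 99 to keep the output short, and relies on awk treating an unset variable as an empty string (the variable name bar is only for illustration):

```shell
# Append "##" three times; str starts out empty because it was never assigned.
bar=$(awk 'BEGIN { for (i = 0; i < 3; i++) str = str "##"; print str }')
echo "$bar"   # prints ######
```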

This post does not try to explain every nuance of the language. Instead, let’s focus on the basics: useful one-liners that are handy for daily use.

Before we move on with practical examples, I want to mention the most essential construction that will be re-used multiple times in this post as well as in the typical text-parsing tasks.

condition { actions }

This construction can be expanded to something like:

awk '$1 ~ /pattern/ {print $1}'

In the example above, awk tests whether the first column matches the pattern, where the default delimiter is a run of spaces or tabs. If the first column matches, awk prints it.
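As a minimal sketch of the condition { actions } form, the snippet below feeds awk three made-up lines (the user names and terminals are invented for illustration) and prints the first column only when it starts with “ro”:

```shell
# Hypothetical input that mimics the shape of the 'who' output.
matched=$(printf 'root pts/1\nkkleim pts/2\nrobert pts/3\n' |
  awk '$1 ~ /^ro/ {print $1}')
echo "$matched"   # prints root and robert, one per line
```

Lines whose first column does not match the pattern are skipped silently, because the action runs only when the condition is true.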

Field separators

By default, awk splits fields on runs of spaces and tabs. However, we can specify any other separator with the -F option: a comma, a letter of the alphabet, a special symbol, etc.

$ echo 'a,b,c,d,e' | awk -F',' '{print $1 $2 $3 $4 $5}'
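One detail worth a quick sketch: print joins its arguments differently depending on whether they are separated by commas. The variable names below are only for illustration; OFS is awk’s built-in output field separator.

```shell
# Without commas, print concatenates the fields: abcde
joined=$(echo 'a,b,c,d,e' | awk -F',' '{print $1 $2 $3 $4 $5}')
# With commas, print puts OFS (a space by default) between them: a b c d e
spaced=$(echo 'a,b,c,d,e' | awk -F',' '{print $1, $2, $3, $4, $5}')
# Setting OFS changes the separator used in the output: a-b-c-d-e
dashed=$(echo 'a,b,c,d,e' | awk -F',' -v OFS='-' '{print $1, $2, $3, $4, $5}')
echo "$joined"; echo "$spaced"; echo "$dashed"
```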

We can just as easily set a colon as the FS (field separator) and parse /etc/passwd:

$ awk -F: '{print $1}' /etc/passwd | tail -10
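The field separator combines naturally with a condition. As a sketch, the snippet below runs on three made-up passwd-style lines rather than the real file and prints the names of accounts whose third field (the UID) is 1000 or higher; the threshold of 1000 is an assumption about where regular user IDs start on a given system.

```shell
# Hypothetical passwd-style input; field 1 is the name, field 3 is the UID.
users=$(printf 'root:x:0:0:root:/root:/bin/bash\nalice:x:1000:1000::/home/alice:/bin/bash\ndaemon:x:1:1::/usr/sbin:/usr/sbin/nologin\n' |
  awk -F: '$3 >= 1000 {print $1}')
echo "$users"   # prints alice
```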

How to print all columns except the first one?

Another typical example of daily awk usage is printing specific fields while not printing other data. Let’s say we need to print all columns except the first one:

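awk '{ $1 = ""; print $0 }'

As shown above, the first column, “$1”, is set to an empty string, and “$0” stands for the entire line, which awk rebuilds after the assignment. Altogether, it results in printing all columns except the first one. Note that the separator that followed the first column remains, so each output line starts with a space. It might be quite handy!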
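Blanking “$1” leaves the separator that followed it in place, so the output begins with a space. One possible alternative, sketched below on a made-up input line, is to loop over the fields from the second to the last (NF is awk’s built-in field count) and join them with single spaces:

```shell
# Build the output from fields 2..NF, so nothing is left over from field 1.
rest=$(echo 'root pts/1 2020-03-31 01:58' |
  awk '{ out = $2; for (i = 3; i <= NF; i++) out = out " " $i; print out }')
echo "$rest"   # prints pts/1 2020-03-31 01:58
```

The trade-off is that the original spacing between columns is collapsed to single spaces.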

Typical awk use-cases in one-liner scripts

awk '{print $1, $3, $5}'
Prints the first, third, and fifth columns.

awk '{print $0}'
Prints the entire line, including all columns.

awk '/pattern/ {print $2}'
Prints the second column of every line that matches the pattern.

awk 'BEGIN { print "Hello, world" }'
“Hello, world” in awk.

awk '{ if (length($0) > max) max = length($0) } END { print max }' inputfile
Prints the length of the longest line in inputfile.

awk 'length($0) > 99' inputfile
Prints all lines longer than 99 characters.

awk 'BEGIN { for (i = 1; i <= 10; i++) print int(101 * rand()) }'
Generates ten random integers in the range 0…100.

awk -F: '{ print $1 }' /etc/passwd | sort
Prints a sorted list of usernames.
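To see the END block in action, here is a small sketch that runs the longest-line one-liner on three made-up lines instead of a file. The variable max is updated for every input line, and the END block prints it once all input has been read:

```shell
# The input lines have lengths 2, 6, and 3, so the result is 6.
longest=$(printf 'ab\nabcdef\nabc\n' |
  awk '{ if (length($0) > max) max = length($0) } END { print max }')
echo "$longest"   # prints 6
```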

Another awk capability is math. For instance, awk can calculate square roots, logarithms, and trigonometric functions (sine, cosine, arctangent):

$ awk 'BEGIN {print sqrt(2020)}'
$ awk 'BEGIN {print sin(2020)}'
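The trigonometric functions take angles in radians, and standard awk provides no built-in tangent or cotangent. They can be derived from sin and cos, as in this sketch, which also uses the common atan2(0, -1) idiom to obtain pi:

```shell
# tan(x) = sin(x) / cos(x); at x = pi/4 the tangent is 1.
tangent=$(awk 'BEGIN { pi = atan2(0, -1); x = pi / 4; printf "%.4f\n", sin(x) / cos(x) }')
echo "$tangent"   # prints 1.0000
```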

This note could have been much longer, but I didn’t want to write another awk tutorial. There are plenty of them on the internet. The main point of this short post is that awk is still powerful and handy more than forty years after its first release in 1977.
