awk command

Table of Contents


AWK is one of mostly used text-processing utilities on Linux. It takes its name from initials of its authors - Aho, Wienberger and Kernighan. It is a scripting language that processes data files, especially text files that are organized in rows and columns.

AWK is used for stream processing, where the basic unit is the string. It considers a text file as a collection of fields and records. Each row is a record, and a record is a collection of fields. The syntax of AWK is as follows:

awk [options] 'patter {action}' input_file

As an AWK script:

awk [options] script_name input_file

The most commonly used command-line options of awk are “-F” and -f:

-F : to change input field seperator
-f : to name script file

There are two types of buffers used in AWK - the record buffer and field buffer.

The field buffer is denoted as $1, $2… $n, where ‘n’ indicates the field number in the input file. So, $2 indicates the second field (or the second column in a file).

The record buffer is denoted as $0, which indicates the whole record.

Example

fileName : file1.txt

ram    1 get       7
shyam   2 put      8
krishna  3 lost    9
sharma   4 found   8
archana  5 good    7
kumari   6 best    6

To print third and first field in a file, use the command

awk '{print $3, $1}' file1.txt

Output:

get ram
put shyam
lost krishna
found sharma
good archana
best kumari

AWK process flow

AWK scripts are divided into the following three parts - BEGIN (pre-processing), body (processing) and END (post-processing).

BEGIN is the part of the AWK script where variables can be initialised and report headings can be created. The processing body contains the data that needs to be processed, like a loop. END or the post-processing part analyses or prints the data that has been processed.

awk 'BEGIN {print "Sum \t average"} {tot=$2+$4; avg=tot/3; print tot,avg} END {print "======================\nProcessing End...\n================="}' file1.txt

Output:

Sum    average
8  2.66667
10  3.33333
12  4
12  4
12  4
12  4
======================
Processing End...
=================

AWK field seperator

To be added

skip blank lines

There are two approaches you can try to filter out lines:

awk 'NF' data.txt

and

awk 'length' data.txt

Just put these at the start of your command, i.e.,

awk -v variable=$bashvariable 'NF { print variable ... }' myinfile

or

awk -v variable=$bashvariable 'length { print variable  ... }' myinfile

Both of these act as gatekeepers/if-statements.

The first approach works by only printing out lines where the number of fields (NF) is not zero (i.e., greater than zero).

The second method looks at the line length and acts if the length is not zero (i.e., greater than zero)

You can pick the approach that is most suitable for your data/needs.

Reference: https://stackoverflow.com/questions/11687216/awk-to-skip-the-blank-lines

If Else

Reference: https://www.thegeekstuff.com/2010/02/awk-conditional-statements

awk '{if ($3 < 105 && $3>65) print $3}' mass.txt

Generate random number

awk -v min=5 -v max=10 'BEGIN{srand(); print int(min+rand()*(max-min+1))}'

For loop

Command 1

awk -F'[, ]' '{for(i=2;i<=NF;i++){print $1,$i}}' file

Command 2

awk 'BEGIN { for (i = 1; i <= 5; ++i) print i }'

If you want to print awk output in the same line add | xargs | sed -e 's/ /,/g' as shown below:

awk 'BEGIN { for (i = 1; i <= 5; ++i) print i }' | xargs | sed -e 's/ /, /g'

This will print each value seperated by “comma and a space” in same line.

select row and element in awk

Reference: https://stackoverflow.com/questions/1506521/select-row-and-element-in-awk

replace a particular row and column value of one file with another

Reference: https://stackoverflow.com/questions/26089692/replace-a-particular-row-and-column-value-of-one-file-with-another

Count number of character of each line

awk '{print length}' filename.txt

This command prints length of each line. Here length is equivalent to length($0), where $0 denotes the current line1.

  1. https://unix.stackexchange.com/a/18742 




Enjoy Reading This Article?

Here are some more articles you might like to read next:

  • Vi-Editor
  • sed command
  • GitLab workflow for CMS-AN
  • Condor Jobs
  • Grep command
  • ROOT CheatSheet
  • Git CheatSheet
  • EOS uses
  • Mac Settings
  • XDAQ Basics
  • find command