What is AWK?

• AWK is a scripting language and is widely regarded as one of the
most powerful text-processing commands in the Linux environment.
• It is interpreted, so no compilation is required; it processes input
files, analyzes them, and generates reports.
• It is used to manipulate data and is thus
suitable for pattern searching and processing.
What Operations Can AWK Perform?

• Scans a file line by line
• Splits each input record into fields
• Compares the fields to patterns
• Performs actions on the lines that match
How is AWK Command Useful in Linux and Unix?

• It helps in manipulating data files as per the user requirements.
• It generates formatted outputs.
What are the various Program Constructs
that AWK Offers?

• Various programming concepts offered by AWK are:
– Output formatting
– Inbuilt variables
– Pattern matching
– String operations
– Arithmetic operations
Features of AWK command

Various features of the Awk command are as follows:


– It scans a file line by line.
– It splits each line into fields.
– It compares input lines or fields against patterns.
– It performs actions on matching lines, such as searching for a
specified text.
– It formats the output lines.
– It performs arithmetic and string operations.
– It supports conditions and loops.
– It transforms files and data into a specified structure.
– It produces formatted reports.
Awk Working Methodology
• Awk reads the input files one line at a time.
• For each line, it tests the given patterns in order; when a pattern
matches, it performs the corresponding action.
• If no pattern matches, no action is performed.
• In the pattern { action } syntax, either the search pattern or the
action is optional, but not both.
• If the search pattern is not given, awk performs the given actions
for each line of the input.
• If the action is not given, awk prints every line that matches the
given pattern, which is the default action.
• Empty braces without any action do nothing; the default printing is
not performed.
• Statements within an action should be separated by semicolons.
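The rules above can be tried out with a short sketch. The two input lines are made-up sample data in the style of the employee file, and the shell variable names are only illustrative.

```shell
# Two sample lines (hypothetical data, not the slides' employee file)
input='100 Thomas Manager
200 Jason Developer'

# Pattern only: the default action prints each matching line.
pattern_only=$(printf '%s\n' "$input" | awk '/Jason/')

# Action only: the action runs for every input line.
action_only=$(printf '%s\n' "$input" | awk '{ print $2 }')

# Pattern with empty braces: the line matches, but nothing is printed.
empty_braces=$(printf '%s\n' "$input" | awk '/Jason/ {}')

printf 'pattern only: %s\n' "$pattern_only"
printf 'action only:\n%s\n' "$action_only"
printf 'empty braces: [%s]\n' "$empty_braces"
```

The last line prints empty brackets, which shows that an explicit empty action suppresses the default print.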
Basic Terminology: input file
• A field is a unit of data in a line
• Each field is separated from the other
fields by the field separator
– default field separator is whitespace
• A record is the collection of fields in a line
• A data file is made up of records

10 CSCI 330 - The UNIX System


Example Input File

Buffers :
• awk supports two types of buffers:
record and field
• field buffer:
– one for each field in the current record
– names: $1, $2, …
• record buffer :
– $0 holds the entire record
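As a small illustration of the record and field buffers, the sketch below prints $0, $1 and $2 for one made-up record. It also uses the standard -F option to show a separator other than whitespace; the sample data and shell variable names are assumptions for the example.

```shell
# One sample record (hypothetical data in the style of the employee file)
record='300 Sanjay Sysadmin Technology'

# $0 is the whole record; $1, $2, ... are the individual fields.
buffers=$(printf '%s\n' "$record" |
  awk '{ print "record: " $0; print "field 1: " $1; print "field 2: " $2 }')

# -F sets a different field separator (here a colon).
colon_field=$(printf 'a:b:c\n' | awk -F: '{ print $2 }')

printf '%s\n' "$buffers"
printf 'second colon-separated field: %s\n' "$colon_field"
```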
$ cat > employee.txt   # create the sample file
100 Thomas Manager Sales $5,000
200 Jason Developer Technology $5,500
300 Sanjay Sysadmin Technology $7,000
400 Nisha Manager Marketing $9,500
500 Randy DBA Technology $6,000
Awk Example 1. Default behavior of Awk
By default Awk prints every line from the file.
$ awk '{print;}' employee.txt
100 Thomas Manager Sales $5,000
200 Jason Developer Technology $5,500
300 Sanjay Sysadmin Technology $7,000
400 Nisha Manager Marketing $9,500
500 Randy DBA Technology $6,000
Awk Example 2. Print the lines that match the pattern.
$ awk '/Thomas/
> /Nisha/' employee.txt
100 Thomas Manager Sales $5,000
400 Nisha Manager Marketing $9,500
Awk Example 3. Print only specific fields.
$ awk '{print $2,$5;}' employee.txt
Thomas $5,000
Jason $5,500
Sanjay $7,000
Nisha $9,500
Randy $6,000
Variables :
• There are two different types of variables
1. System variables
2. User defined variables
• NF is a system variable that holds the number of fields in the current
record, so $NF refers to the last field.
$ awk '{print $2,$NF;}' employee.txt
Thomas $5,000
Jason $5,500
Sanjay $7,000
Nisha $9,500
Randy $6,000
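A minimal sketch of two system variables working together: NR (the number of the current record) and NF (the number of fields in it). The two input lines are made-up sample data with different field counts.

```shell
# Sample lines with different numbers of fields (hypothetical data)
input='100 Thomas Manager Sales
200 Jason Developer'

# NR counts records as they are read; NF is recomputed for each record.
out=$(printf '%s\n' "$input" | awk '{ print NR ": " NF " fields, last = " $NF }')

printf '%s\n' "$out"
```

Because NF changes per record, $NF picks the last field even when lines have different lengths.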
User defined variables :
• We can define any number of user defined variables within an awk
script.
• They can hold numbers, strings, or arrays. The rules for defining and
using awk variables are:
• An awk variable name should begin with a letter, followed by
alphanumeric characters or underscores.
• Keywords cannot be used as awk variable names.
• Awk does not require variables to be declared, unlike many other
programming languages.
• It is always better to initialize awk variables in the BEGIN section,
which is executed only once, at the beginning.
• There are no data types in awk. Whether an awk variable is treated as
a number or as a string depends on the context in which it is used.
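The context rule can be seen in a short sketch that needs no input file. The variable names (count, x, never_set) are illustrative only.

```shell
out=$(awk 'BEGIN {
    count = 0                 # initialized once, before any input is read
    x = "10"                  # assigned as a string
    print x + 5               # numeric context: the string is used as 10
    print x " apples"         # string context: concatenation
    print count + never_set   # an uninitialized variable acts as 0 here
}')

printf '%s\n' "$out"
```

The same variable x is treated as a number in the addition and as a string in the concatenation, with no declaration in either case.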
awk Scripts
awk scripts are divided into three major parts: BEGIN, body, and END.
Comment lines start with #.


awk Scripts
• BEGIN: pre-processing
– performs processing that must be completed
before the file processing starts (i.e., before
awk starts reading records from the input
file)
– useful for initialization tasks such as to
initialize variables and to create report
headings
awk Scripts
• BODY: Processing
– contains main processing logic to be applied
to input records
– like a loop that processes input data one
record at a time:
• if a file contains 100 records, the body will
be executed 100 times, once for each
record
awk Scripts
• END: post-processing
– contains logic to be executed after all input
data have been processed
– logic such as printing report grand total
should be performed in this part of the script
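The three parts can be sketched with a record counter: BEGIN runs once, the body runs once per record, and END runs once at the end. The three-line input and the counter name n are assumptions for the example.

```shell
# Three sample records (hypothetical data)
input='a
b
c'

out=$(printf '%s\n' "$input" | awk '
BEGIN { print "start"; n = 0 }   # runs once, before the first record
      { n++ }                    # runs once per record
END   { print "records: " n }    # runs once, after the last record
')

printf '%s\n' "$out"
```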
Pattern / Action Syntax
The Command: awk
Examples
$ cat employee.txt
100 Thomas Manager Sales $5,000
200 Jason Developer Technology $5,500
300 Sanjay Sysadmin Technology $7,000
400 Nisha Manager Marketing $9,500
500 Randy DBA Technology $6,000
$ vi start.awk
BEGIN{print "name\t designation\tdepartment\tsalary";}
{print $2,"\t",$3,"\t",$4,"\t",$NF;}
END{print"Report generated\n------";}
Run :
$ awk -f start.awk employee.txt
name designation department salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Randy DBA Technology $6,000
Report generated
------
Awk Example 2: Billing for Books
• In this example, the input file bookdetails.txt contains
records with fields — item number, Book name, Quantity
and Rate per book.
$ cat > bookdetails.txt   # create the file
1 Linux-programming 2 450
2 Advanced-Linux 3 300
3 Computer-Networks 4 400
4 OOAD&UML 3 450
5 Java2 5 200
Question :
Write an awk script that reads and processes the bookdetails.txt file above and
generates a report showing the amount for each book sold and the total amount
for all the books sold.
BEGIN{
total=0;
}
{
itemno=$1;
book=$2;
bookamount=$3*$4;
total=total+bookamount;
print itemno," ",book,"\t","$"bookamount;
}
END{
print "Total amount =$"total;
}
Run :
$ awk -f book2.awk bookdetails.txt
Output:
1 Linux-programming $900
2 Advanced-Linux $900
3 Computer-Networks $1600
4 OOAD&UML $1350
5 Java2 $1000
Total amount =$5750
Awk Example 3. Student Mark Calculation
• In this example, create an input file “studentmarks.txt”
with the following content — Student name, Roll Number,
Test1 score, Test2 score and Test3 score
$ cat > studentmarks.txt   # create the file
Jones 2143 78 84 77
Gondrol 2321 56 58 45
RinRao 2122 38 37 65
Edwin 2537 78 67 45
Dayan 2415 30 47 20
Question :
Now the following Awk script will calculate and generate the report
to show the Average marks of each student, average of Test1, Test2
and Test3 scores.
BEGIN{
test1=0;
test2=0;
test3=0;
print "name\trollno\tAveragescore";
}
{
total=$3+$4+$5;
test1=test1+$3;
test2=test2+$4;
test3=test3+$5;
print $1"\t"$2"\t", total/3;
}
END{
print"average of test1="test1/NR;
print"average of test2="test2/NR;
print"average of test3="test3/NR;
}
Run :
$ awk -f student.awk studentmarks.txt
name rollno Averagescore
Jones 2143 79.6667
Gondrol 2321 53
RinRao 2122 46.6667
Edwin 2537 63.3333
Dayan 2415 32.3333
average of test1=56
average of test2=58.6
average of test3=50.4
Operators
• Like any other programming language, awk has many operators for
number and string operations. Let us discuss the key awk operators.
There are two types of operators in awk:
• Unary operator – an operator that accepts a single operand.
• Binary operator – an operator that accepts two operands.
• Conditional operator – the ternary operator ( ?: ), which accepts
three operands.
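A few unary and binary operators can be tried in a BEGIN block. This is only a small sample of awk's operators, and the values 7 and 2 are arbitrary.

```shell
out=$(awk 'BEGIN {
    a = 7; b = 2
    print -a          # unary minus
    print a + b       # binary addition
    print a % b       # binary modulus (remainder)
    print a " " b     # string concatenation by juxtaposition
    a++               # unary increment
    print a
}')

printf '%s\n' "$out"
```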
awk conditional statements
• Awk supports several conditional statements to control the flow of the
program.
• Most awk conditional statement syntax looks like that of the ‘C’
programming language.
• A conditional statement checks a condition before performing any
action.
• If the condition is true, the action(s) are performed. Similarly, an
alternative action can be performed if the condition is false.
• A conditional statement starts with the keyword ‘if’.
• Awk supports different kinds of if statements:
1. Awk Simple If statement
2. Awk If-Else statement
3. Awk If-ElseIf-Ladder
Awk Simple If Statement
• Single Action: The simple if statement checks a condition; if the
condition is true, it performs the corresponding action.
• Syntax:
if (conditional-expression)
action
• if is a keyword
• conditional-expression – expression to check conditions
• action – any awk statement to perform action.
Multiple Action: If the conditional expression returns true, all of the
actions will be performed.
• If more than one action needs to be performed, the actions should
be enclosed in curly braces and separated by newlines or
semicolons, as shown below.
• Syntax:
if (conditional-expression)
{
action1;
action2;
}
• If the condition is true, all the actions enclosed in braces will be
performed in the given order.
• After all the actions are performed it continues to execute the next
statements.
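A minimal sketch of a simple if with multiple actions. The two-field input (name and one score) is made-up data, and the pass mark of 35 and the counter name failed are assumptions for the example.

```shell
# Name and a single score per line (hypothetical data)
input='Jones 78
Dayan 30'

out=$(printf '%s\n' "$input" | awk '{
    if ($2 < 35) {                    # both actions run only when true
        failed++
        print $1 " needs a retest"
    }
}
END { print "failed: " failed }')

printf '%s\n' "$out"
```

Nothing is printed for Jones because the condition is false and there is no else branch.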
Awk If Else Statement
• The simple awk if statement above has no set of actions for the
case where the condition is false.
• In the awk if-else statement, you can give a list of actions
to perform if the condition is false.
• If the condition is true, action1 will be performed; if
the condition is false, action2 will be performed.
• Syntax:
if (conditional-expression)
action1
else
action2
• Awk also has a conditional operator, the ternary
operator ( ?: ), whose behavior is similar to the awk if-else
statement.
• If the conditional-expression is true, the operator evaluates
expression1 and yields its value; if it is false, it evaluates
expression2 instead.
• Syntax:
conditional-expression ? expression1 : expression2 ;
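A short sketch of the ternary operator choosing between two values. In awk the operator selects between expressions, so its result is usually assigned or printed. The input data and the pass mark of 35 are assumptions for the example.

```shell
# Name and a single score per line (hypothetical data)
input='Jones 78
Dayan 30'

out=$(printf '%s\n' "$input" |
  awk '{ result = ($2 >= 35) ? "Pass" : "Fail"; print $1, result }')

printf '%s\n' "$out"
```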
Example :
Generate Pass/Fail Report based on Student marks in each subject
$ cat > student.txt   # create the file
Jones 2143 78 84 77
Gondrol 2321 56 58 45
RinRao 2122 38 37
Edwin 2537 87 97 95
Dayan 2415 30 47
$ vi file2.awk
BEGIN{}
{
if ($3 >=35 && $4 >= 35 && $5 >= 35)
print $0,"=>","Pass";
else
print $0,"=>","Fail";
}
END{}
Run :
$ awk -f file2.awk student.txt

Output:
Jones 2143 78 84 77 => Pass
Gondrol 2321 56 58 45 => Pass
RinRao 2122 38 37 => Fail
Edwin 2537 87 97 95 => Pass
Dayan 2415 30 47 => Fail
Awk If Else If ladder
Syntax:
if(conditional-expression1)
action1;
else if(conditional-expression2)
action2;
else if(conditional-expression3)
action3;
.
.
else
action n;
• If conditional-expression1 is true, then action1 will be performed.
• If conditional-expression1 is false, then conditional-expression2 will be
checked; if it is true, action2 will be performed, and so on.
• The last else part will be performed if none of the conditional expressions
is true.
Awk If Else If Example: Find the average and grade for every student
$ vi grade.awk
BEGIN{ }
{
total=$3+$4+$5;
avg=total/3;
if ( avg >= 90 )
grade="A";
else if ( avg >= 80)
grade ="B";
else if (avg >= 70)
grade ="C";
else grade="D";
print $0,"=>",grade;
}
END{ }
Run : $ awk -f grade.awk student.txt
Output:
Jones 2143 78 84 77 => C
Gondrol 2321 56 58 45 => D
RinRao 2122 38 37 => D
Edwin 2537 87 97 95 => A
Dayan 2415 30 47 => D
awk loop statements
• Let us review the awk loop statements: while, do-while, and
for loops, along with the break, continue, and exit statements.
• Awk looping statements are used to perform a set of
actions repeatedly.
• A loop executes a statement repeatedly as long as its condition is
true.
• Awk has a number of looping statements similar to those of the ‘C’
programming language.
Awk While Loop
Syntax:
while(condition)
actions
• while is a keyword.
• condition is a conditional expression.
• actions are the body of the while loop, which can have one or more statements.
• If actions has more than one statement, they must be enclosed in
curly braces.
How it works?
• The awk while loop checks the condition first; if the condition is true, it
executes the list of actions.
• After the actions have been executed, the condition is checked again, and
if it is true, the actions are performed again.
• This process repeats until the condition becomes false.
• If the condition is false on the first check, the actions are never
executed.
Example :
$ cat >employee.txt
100 Thomas Manager Sales $5,000
200 Jason Developer Technology $5,500
300 Sanjay Sysadmin Technology $7,000
400 Nisha Manager Marketing $9,500
500 Randy DBA Technology $6,000
Write an awk script that prints the first three fields of each record, one per line.
BEGIN{}
{
i=1;
while (i <= 3)
{
print $i;
i++;
}
}
END{}
Run :
$ awk -f awhile.awk employee.txt
Output :
100
Thomas
Manager
200
Jason
Developer
300
Sanjay
Sysadmin
400
Nisha
Manager
500
Randy
DBA
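Example: Write an awk script to reverse the digits of a number:
$ cat > awhile2   # create the input file
342
134
$ vi awhile2.awk   # the awk script
BEGIN{}
{
no=$1
rem=0
while (no > 0)        # loop until no digits remain
{
rem = no%10           # take the last digit
no = int(no/10)       # drop the last digit (integer division)
printf "%d",rem
}
printf "\n"
}
END{}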
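A hedged sketch of the digit-reversal loop run on inline input. It uses int() so the division drops the last digit cleanly and loops while the number is greater than 0; the third input value, 100, is an added test case that shows trailing zeros are handled.

```shell
out=$(printf '342\n134\n100\n' | awk '{
    no = $1
    while (no > 0) {
        printf "%d", no % 10   # print the last digit
        no = int(no / 10)      # integer division drops that digit
    }
    printf "\n"
}')

printf '%s\n' "$out"
```

Without int(), awk division yields a fractional value such as 34.2, so the loop condition and the remainders no longer track whole digits.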
Awk Do-While Loop
How it works?
• The awk do-while loop is called an exit-controlled loop, whereas
the awk while loop is called an entry-controlled loop.
• The while loop checks the condition first and then
decides whether to execute the body.
• The awk do-while loop executes the body once and then
repeats it as long as the condition is true.
Syntax:
do
action
while(condition)
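The difference between entry-controlled and exit-controlled loops shows up when the condition is false from the start. In this sketch the starting value 5 and the limit 3 are arbitrary.

```shell
out=$(awk 'BEGIN {
    i = 5
    while (i < 3) { print "while body"; i++ }        # condition false: never runs
    i = 5
    do { print "do-while body"; i++ } while (i < 3)  # body runs once, then stops
}')

printf '%s\n' "$out"
```

Only the do-while body is printed, because its condition is checked after the first pass.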
Example: Print each input record 2 times using do-while:
BEGIN{ }
{
i=1
do
{
print $0
i++
} while (i <= 2)
}
END{ }
Output:
100 Thomas Manager Sales $5,000
100 Thomas Manager Sales $5,000
200 Jason Developer Technology $5,500
200 Jason Developer Technology $5,500
300 Sanjay Sysadmin Technology $7,000
300 Sanjay Sysadmin Technology $7,000
400 Nisha Manager Marketing $9,500
400 Nisha Manager Marketing $9,500
500 Randy DBA Technology $6,000
500 Randy DBA Technology $6,000
Awk For Loop Statement
• The awk for statement does the same job as the awk while loop, but its
syntax is often easier to use.
Syntax:
for(initialization ; condition ; increment/decrement)
actions
How it works?
• The awk for statement starts by executing the initialization and then checks
the condition; if the condition is true, it executes the actions and then
performs the increment or decrement.
• As long as the condition remains true, it repeatedly executes the actions
and then the increment/decrement.
Awk For Loop Example :: Print the sum of fields in all lines.
$ cat > awfor   # create the input file
1 2
3 4
$ vi awfor.awk   # the awk script
BEGIN{}
{
for (i=1;i <= NF;i++)
total=total+$i;
}
END{ print total}
RUN:
$ awk -f awfor.awk awfor
10
