

Tanuj Maheshwari
Perl stands for "Practical Extraction and
Report Language"
Created by Larry Wall when awk ran out of steam
Perl grew at almost the same rate as the
Unix operating system
Introduction (cont.)
Perl fills the gaps between program
languages of different levels
A great tool for leverage
High portability and readily available
It's free and runs rather nicely on nearly
everything that calls itself UNIX or UNIX-like
Perl has been ported to the Amiga, the
Atari ST, the Macintosh family, VMS,
OS/2, even MS-DOS and Windows
The sources for Perl (and many
precompiled binaries for non-UNIX
architectures) are available from the
Comprehensive Perl Archive Network
(the CPAN).  
Running Perl on Unix
Set the PATH variable to point to the
directory where Perl is located
Check /usr/local/bin or /usr/bin for “perl”
Run a Perl script by typing "perl <script name>"
Alternatively, change the file attribute to
executable and include “#!/usr/bin/perl” in
the first line of your perl script
The .pl extension is frequently associated
with Perl scripts
Running Perl on Win32
ActivePerl allows Perl scripts to be executed in
Perl is being ported faithfully
The #! directive is no longer used because it
does not mean anything to MS-DOS/Windows
Perl scripts are executed by typing "perl <script name>"
Alternatively, double-click the file if the
.pl extension is associated with the Perl interpreter
An Example
#!/usr/bin/perl
print "Hello World!";
The #! directive directs subsequent
lines in the file to the perl executable
All statements are terminated with ; as
in C/C++/Java
print by default outputs strings to
the terminal console (like printf in C
or cout in C++)
Perl completely parses and compiles the
script before executing it
Three main types of variables: scalar ($), array (@) and hash (%)
Variables are all global in scope unless
defined to be private or local
Note: remember that hash and array are
used to hold scalar values
Assigning values to a scalar
$i = "hello world!";
$j = 1 + 1;
($i,$j) = (2, 3);
Assigning values to an array
$array[0] = 1;
$array[1] = "hello world!";
push(@array,1);       # stores the value 1 at the end of @array
$value = pop(@array); # retrieves and removes the last element from @array
@array = (8,@array);  # inserts 8 in front of @array
Examples (cont.)
Assigning values to a hash
$hash{'greeting'} = "Hello world!";
$hash{'available'} = 1;
#or using a hash slice
@hash{"greeting","available"} = ("Hello world!", 1);
Deleting a key-value pair from a hash:
delete $hash{'key'};
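The three variable types can be exercised together in one short script. This is a minimal sketch: the names $greeting, @queue and %status are illustrative and not from the slides, and "use strict" with "my" declarations is added so the script runs cleanly under a current Perl.

```perl
use strict;
use warnings;

# Scalar: holds a single value (string or number)
my $greeting = "hello world!";
my $sum      = 1 + 1;

# Array: an ordered list of scalars
my @queue = (1, $greeting);
push(@queue, 3);            # append 3 to the end
my $last = pop(@queue);     # remove and return the last element
@queue = (8, @queue);       # put 8 in front

# Hash: maps string keys to scalar values
my %status;
@status{"greeting", "available"} = ("Hello world!", 1);   # hash slice
delete $status{"available"};                               # drop one pair

print "sum=$sum last=$last\n";
print "queue=@queue\n";
print "keys=", join(",", sort keys %status), "\n";
```

Interpolating an array in double quotes, as in "queue=@queue", joins its elements with single spaces.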
Conditional Statements
Variables alone will not support switches or
If-Then-Else like clauses are used to make
decisions based on certain preconditions
Keywords: if, else, elsif, unless
Statement blocks are enclosed by '{' and '}'
A Conditional Statement
print "What is your name? ";
$name = <STDIN>;
chomp ($name);
if ($name eq "Randal") {
    print "Hello, Randal! How good of you to be here!\n";
} else {
    print "Hello, $name!\n"; # ordinary greeting
}
unless ($name eq "Randal") {
    print "You are not Randal!!\n"; # part of the ordinary greeting
}
$name = <STDIN> reads a line from standard input
chomp is a built-in function that removes the trailing newline
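The same branching can be tried without typing at a prompt by passing the name to a function. A minimal sketch, assuming a hypothetical helper greeting_for() that is not part of the slides; the trailing "\n" in the arguments stands in for the newline that <STDIN> would deliver.

```perl
use strict;
use warnings;

# greeting_for is a hypothetical helper used only for this illustration
sub greeting_for {
    my ($name) = @_;
    chomp($name);                       # drop the trailing newline, if any
    if ($name eq "Randal") {
        return "Hello, Randal! How good of you to be here!";
    } elsif ($name eq "") {
        return "Hello, whoever you are!";
    } else {
        return "Hello, $name!";         # ordinary greeting
    }
}

my $visitor = "Tanuj";
my $note    = "";
unless ($visitor eq "Randal") {         # block runs only when the test is false
    $note = "You are not Randal!!";
}

print greeting_for("Randal\n"), "\n";
print greeting_for("$visitor\n"), "\n";
print "$note\n";
```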
Conditional statements cannot handle
repetitive tasks
Keywords: while, foreach, for, until, do-while, do-until
The foreach loop iterates over all of the elements
in an array or list (for a hash, over its keys), executing the loop body
on each element
for is shorthand for a while loop
until is the reverse of while
Loops (cont.)
do-while and do-until loops execute the
loop body once before checking the condition
Statements in the loop body are enclosed
by '{' and '}'
While Loop
while (some expression) {
    statements;
}

#prints the numbers 10 down to 1
$a = 10;
while ($a > 0) {
    print $a;
    $a = $a - 1;
}
Until Loop
until (some expression) {
    statements;
}

#prints the numbers 10 down to 1
$a = 10;
until ($a <= 0) {
    print $a;
    $a = $a - 1;
}
Foreach Loop
foreach [<variable>] (@some_list) {
    statements;
}
#prints each element of @a
@a = (1,2,3,4,5);
foreach $b (@a) {
    print $b;
}
Foreach Loop (cont.)
Accessing a hash with keys function:
foreach $key (keys (%fred)) {
    # once for each key of %fred
    print "at $key we have $fred{$key}\n";
    # show key and value
}
For Loop
for (initial_exp; test_exp; re-init_exp) {
    statements;
}
#prints numbers 1-10
for ($i = 1; $i <= 10; $i++) {
    print "$i ";
}
Do-While and Do-Until Loops
do {
    statements;
} while some_expression;

do {
    statements;
} until some_expression;

Examples that print the numbers 10 down to 1:
$a = 10;
do {
    print $a;
    $a = $a - 1;
} while ($a > 0);
$a = 10;
do {
    print $a;
    $a = $a - 1;
} until ($a <= 0);
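The loop forms can be compared side by side. This is a minimal sketch with illustrative variable names; results are collected into arrays rather than printed inside each loop so the outcome of every loop is visible on one line.

```perl
use strict;
use warnings;

my @countdown;
my $n = 3;
while ($n > 0) {                    # repeats while the test is true
    push(@countdown, $n);
    $n = $n - 1;
}

my @same;
$n = 3;
until ($n <= 0) {                   # repeats until the test becomes true
    push(@same, $n);
    $n = $n - 1;
}

my @squares;
for (my $i = 1; $i <= 3; $i++) {    # initial; test; re-init
    push(@squares, $i * $i);
}

my $total = 0;
foreach my $item (1, 2, 3, 4, 5) {  # visits each element in turn
    $total += $item;
}

my $runs = 0;
do {                                # body runs once before the test
    $runs++;
} while ($runs < 0);

print "@countdown | @same | @squares | $total | $runs\n";
```

The last loop shows the difference from a plain while: its test is false from the start, yet the body still runs once.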
Built-in functions
shift function
Ex: $value = shift(@fred); # like ($value,@fred) = @fred;
unshift function
Ex: unshift(@fred,$a); # like @fred = ($a,@fred);
reverse function
@a = (7,8,9);
@b = reverse(@a); # gives @b the value of (9,8,7)
sort function
@y = (1,2,4,8,16,32,64);
@y = sort(@y); # @y gets 1,16,2,32,4,64,8
Built-In Functions (cont.)
qw function
Ex: @words = qw(camel llama alpaca); # is
equivalent to @words = ("camel","llama","alpaca");
defined function
Returns a Boolean value saying whether the
scalar value resulting from an expression has
a real value or not. Ex: defined $a;
undef function
Undefines a variable, so that defined returns false for it. Ex: undef $a;
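The list functions above can be checked in a few lines. A minimal sketch with illustrative variable names; the numeric comparison block passed to sort ({ $a <=> $b }) is not covered on the slides and is shown only to contrast with the default string ordering.

```perl
use strict;
use warnings;

my @fred  = (7, 8, 9);
my $first = shift(@fred);              # takes 7 off the front
unshift(@fred, 6);                     # puts 6 on the front
my @backwards = reverse(@fred);

my @powers  = (1, 2, 4, 8, 16, 32, 64);
my @as_text = sort(@powers);                 # default sort compares as strings
my @as_nums = sort { $a <=> $b } @as_text;   # numeric comparison

my @words = qw(camel llama alpaca);    # list of quoted words

my $value;                             # declared but never assigned
my $before = defined($value) ? "yes" : "no";
$value = 0;                            # 0 is false, yet defined
my $after  = defined($value) ? "yes" : "no";

print "$first | @fred | @backwards\n";
print "@as_text | @as_nums\n";
print scalar(@words), " words | defined before: $before, after: $after\n";
```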
Built-In Functions (cont.)
uc and ucfirst functions –vs- lc and lcfirst
<result> = uc(<string>)
<result> = ucfirst(<string>)
$string = "abcde";
$string2 = uc($string); #ABCDE
$string3 = ucfirst($string); #Abcde
lc and lcfirst have the reverse effect of uc and
ucfirst
Basic I/O
STDIN Examples:
$a = <STDIN>;
@a = <STDIN>;
while (defined($line = <STDIN>)) {
    # process $line here
}
STDOUT Examples:
print(list of arguments);
print "text";
printf [HANDLE] format, list of arguments;
Regular Expressions
Template to be matched against a string
Patterns are enclosed in ‘/’s
Matching against a variable is done with the =~ operator
Syntax: /<pattern>/
$string =~ /abc/ #matches "abc" anywhere in $string
<STDIN> =~ /abc/ #matches "abc" in a line read from standard input
Creating Patterns
Single character patterns:
"." matches any single character except newline
(\n), for example: /a./
"?" matches zero or one of the preceding character
A character class can be created by using "[" and
"]". A range of characters can be abbreviated by
using "-", and a character class can be negated
by using the "^" symbol.
For example:
 [aeiouAEIOU] matches any one of the vowels
 [a-zA-Z] matches any single letter in the English alphabet
 [^0-9] matches any single non-digit
Creating Patterns (cont.)
Predefined character class abbreviations:
\d == [0-9]
\D == [^0-9]
\w == [a-zA-Z0-9_]
\W == [^a-zA-Z0-9_]
\s == [ \r\t\n\f]
\S == [^ \r\t\n\f]
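The character classes can be tried on a sample string. A minimal sketch: the sample text is made up for illustration, and the counting idiom "= () =" (which forces list context to count matches) is not from the slides. The equivalences listed above hold for plain ASCII text.

```perl
use strict;
use warnings;

my $text = "Room 42_b\tis open";

my @digits    = ($text =~ /\d/g);              # every single digit
my $word_runs = () = $text =~ /\w+/g;          # runs of word characters
my $vowels    = () = $text =~ /[aeiouAEIOU]/g; # custom character class
my $has_space = ($text =~ /\s/) ? 1 : 0;       # the tab counts as whitespace
my $underscore_is_word = ("_" =~ /^\w$/) ? 1 : 0;

print "digits=@digits runs=$word_runs vowels=$vowels ",
      "space=$has_space underscore=$underscore_is_word\n";
```

The run "42_b" counts as one word because \w includes the underscore.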
Creating Patterns (cont.)
Multipliers: *, + and {}
* matches 0 or more of the preceding
ab*c matches a followed by zero or more bs and
followed by a c
+ Matches 1 or more of the preceding
ab+c matches a followed by one or more bs and
followed by a c
{} is a general multiplier
a{3,5} #matches three to five "a"s in a string
a{3,} #matches three or more "a"s
Creating Patterns (cont.)
a{3} #matches exactly three "a"s in a row
Complex patterns can be constructed from
these operators
For example:
/a.*ce.*d/ matches strings such as

Creating Patterns: Exercises
Construct patterns for the following
1. "a xxx c xxxxxxxx c xxx d"
2. a sequence of numbers
3. three or more digits followed by
the string “abc”
4. Strings that have an “a”, one or
more “b”s and at least
five “c”s
5. Strings with three vowels next to
each other. Hint: try
Creating Patterns: Exercises
2. /\d+/ or /[0-9]+/
3. /\d\d\d+abc/ or /\d{3,}abc/
Other possible answers?
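One way to explore other answers is to test candidate patterns against strings that should and should not match. The patterns below are one possible set of answers, not necessarily the author's, and the test strings are made up for illustration.

```perl
use strict;
use warnings;

# 1. "a", then "c", then "c", then "d" with anything in between
my $ex1 = ("a xxx c xxxxxxxx c xxx d" =~ /a.*c.*c.*d/) ? 1 : 0;

# 2. a sequence of digits
my $ex2 = ("order 66" =~ /\d+/) ? 1 : 0;

# 3. three or more digits followed by "abc"
my $ex3 = join(",", map { $_ =~ /\d{3,}abc/ ? 1 : 0 } ("1234abc", "12abc"));

# 4. an "a", one or more "b"s and at least five "c"s
my $ex4 = join(",", map { $_ =~ /ab+c{5,}/ ? 1 : 0 } ("abbccccc", "abcccc"));

# 5. three vowels next to each other
my $ex5 = join(",", map { $_ =~ /[aeiouAEIOU]{3}/ ? 1 : 0 } ("beautiful", "banana"));

print "$ex1 | $ex2 | $ex3 | $ex4 | $ex5\n";
```

Each pair prints 1 for the string that should match and 0 for the one that should not, which makes it easy to swap in an alternative pattern and compare.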
Anchoring Patterns
No boundaries are defined by the previous patterns
Word boundary: the position between a \w and a \W character
\b matches at a word boundary and \B matches
anywhere that is not a word boundary
/fred\b/ #matches fred, but not frederick
/\b\+\b/ #matches "x+y", but not "x + y", "++"
and "+". Why?
/\bfred\B/ #matches "frederick" but not "fred"
Anchoring Patterns (cont.)
^ and $
^ matches beginning of a string
$ matches end of a string
/^Fred$/ #matches only “Fred”
/aaa^bbb/ #matches nothing
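The anchors can be verified by running each pattern over a few candidate strings. A minimal sketch, assuming a hypothetical helper hits() that reports 1 or 0 per candidate; the helper and the extra test strings are not from the slides.

```perl
use strict;
use warnings;

# hits is a hypothetical helper: returns "1" or "0" for each candidate
sub hits {
    my ($re, @candidates) = @_;
    return join(",", map { $_ =~ $re ? 1 : 0 } @candidates);
}

my $word_end = hits(qr/fred\b/,   "fred", "frederick", "alfred!");
my $plus     = hits(qr/\b\+\b/,   "x+y", "x + y", "++");
my $inside   = hits(qr/\bfred\B/, "frederick", "fred");
my $whole    = hits(qr/^Fred$/,   "Fred", "Freddy", "Alfred");

print "$word_end | $plus | $inside | $whole\n";
```

In "x + y" the plus sign sits between two spaces, so neither side is a boundary between a word character and a non-word character, which answers the "Why?" on the slide.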
More on matching operators
Additional flags for the matching operator:
/<pattern>/i #ignores case differences
 /fred/i #matches FRED, fred, Fred, FreD, etc.
/<pattern>/s #treats the string as a single line
/<pattern>/m #treats the string as multiple lines
More on Matching Operators
“(“ and “)” can be used in patterns to
remember matches
Special variables $1, $2, $3 … can be used
to access these matches
For example:
$string = "Hello World!";
if ($string =~ /(\w*) (\w*)/) {
    #prints Hello World
    print "$1 $2\n";
}
More on Matching Operators
$string = "Hello World!";
($first,$second) = ($string =~ /(\w*) (\w*)/);
print "$first $second\n"; #prints Hello World

Line 2: Remember that =~ returns a value just
like a function. Normally it returns true or false,
but here, in list context, the "(" and ")" make it
return the list of matched substrings
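Both ways of reading the remembered matches can be put next to each other. A minimal sketch with illustrative variable names; the third case, a failed match in list context, is added here to show what comes back when nothing matches.

```perl
use strict;
use warnings;

my $string = "Hello World!";

# Boolean use: test the match, then read $1 and $2
my ($g1, $g2) = ("", "");
if ($string =~ /(\w*) (\w*)/) {
    ($g1, $g2) = ($1, $2);
}

# List context: the match returns the captured substrings directly
my ($first, $second) = ($string =~ /(\w*) (\w*)/);

# A failed match in list context returns an empty list
my @none = ("nospace" =~ /(\w*) (\w*)/);

print "$g1 $g2 | $first $second | ", scalar(@none), "\n";
```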
Replacement of patterns in string
s/<pattern to search>/<pattern to
i makes the match case insensitive
g enables the matching to be performed
more than once
$which = "this this this";
$which =~ s/this/that/; #produces "that this this"
Substitution (cont.)
$which =~ s/this/that/g; #produces "that that that"
$which =~ s/THIS/that/i; #produces "that this this"
$which =~ s/THIS/that/ig; #produces "that that that"
Multipliers, anchors and memory operators can be
used as well:
$string = "This is a string";
$string =~ s/^/So /; # "So This is a string"
$string =~ s/(\w{1,})/I think $1/; # applied to the original string: "I think This is a string"
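Because s/// changes its target in place, each result in the examples above assumes a fresh copy of the original string. A minimal sketch that makes this explicit with the copy-then-substitute idiom "(my $copy = $original) =~ s/.../.../"; the variable names are illustrative.

```perl
use strict;
use warnings;

my $which = "this this this";
(my $first_only = $which) =~ s/this/that/;     # first occurrence only
(my $all        = $which) =~ s/this/that/g;    # every occurrence
(my $no_case    = $which) =~ s/THIS/that/i;    # case-insensitive, first only

my $string = "This is a string";
(my $prefixed = $string) =~ s/^/So /;            # insert at the beginning
(my $thinking = $string) =~ s/(\w+)/I think $1/; # reuse the remembered word

# s/// returns the number of substitutions made
my $count = ($which =~ s/this/that/g);

print "$first_only | $all | $no_case\n";
print "$prefixed | $thinking | $count\n";
```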
Split and Join Functions
<return value(s)> = split(/<pattern>/,<string>);
<return value> = join("<separator>",<array>);
$string = "This is a string";
@words = split(/ /,$string); #splits the string into separate words
@words = split(/\s/,$string); #same as above
$string = join(" ",@words); #"This is a string"
Great functions for parsing formatted documents
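A typical parsing use is taking a delimited record apart and putting it back together with another separator. A minimal sketch: the colon-separated record is a made-up example, and the last two lines show how the choice of pattern matters when the input has repeated spaces.

```perl
use strict;
use warnings;

my $string   = "This is a string";
my @words    = split(/ /, $string);
my $rejoined = join(" ", @words);

my $record = "root:x:0:0";              # hypothetical colon-separated record
my @fields = split(/:/, $record);
my $csv    = join(",", @fields);

my @strict = split(/ /,   "a  b");      # two spaces give an empty field
my @loose  = split(/\s+/, "a  b   c");  # runs of whitespace act as one separator

print scalar(@words), " words | $rejoined | $csv | ",
      scalar(@strict), " vs ", scalar(@loose), "\n";
```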
Automates certain tasks
sub <name> {
    <statements>
}
Global to the current package. Since we are
not doing OOP and packages, functions are
“global” to the whole program
Functions (cont.)
sub say_hello {
    print "Hello world!\n";
}
Invoking a function:
say_hello(); #parentheses take the parameter list (empty here)
&say_hello; #no parameter list; the function sees the caller's @_
Functions (cont.)
Return values
Two types of functions: void functions (also
known as routines or procedures), and functions
that return values
void functions have no return values
Functions in Perl can return more than one value
sub threeVar {
    return ($a, $b, $c); #returns a list of 3 variables
}
Functions (cont.)
($one,$two,$three) = threeVar();
@list = threeVar(); #stores the three values into a list
($one, @two, $three) = threeVar(); #$three will not have any value,
#because @two takes all the remaining values
Functions (cont.)
Functions can't do much without parameters
Parameters to a function are stored as a list
in the @_ variable
sub say_hello_two {
    ($string) = @_; #gets the first parameter ($string = @_ would store the parameter count)
}
say_hello_two("hello world!\n");
Functions (cont.)
For example:
sub add {
    ($left,$right) = @_;
    return $left + $right;
}

$three = add(1,2);
Functions (cont.)
Variables are all global even if they are
defined within a function
my keyword defines a variable as being
private to the scope in which it is defined
For example:
sub add {
    my($left,$right) = @_;
    return $left + $right;
}
Functions (cont.)
$three = add(1,2); #$three gets the value of 3
print "$one\n"; #prints 0
print "$two\n"; #prints 0
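Parameter passing, list return values and the effect of my can be seen in one script. A minimal sketch: min_max() is a hypothetical function added for illustration, and the package variable $left (declared with our) exists only to show that the my variable of the same name inside add() does not touch it.

```perl
use strict;
use warnings;

sub add {
    my ($left, $right) = @_;        # private copies of the parameters
    return $left + $right;
}

# min_max is a hypothetical example of returning more than one value
sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}

our $left = "global";               # package variable with the same name
my $three = add(1, 2);
my ($low, $high) = min_max(5, 2, 9);

# After the call, the package variable is unchanged
print "$three | $low $high | $left\n";
```

Without my, the assignment inside add() would overwrite the package variable $left on every call.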
A trim() function that removes leading and
trailing spaces in a string
Hint: use the s/// operator in conjunction with
A date() function that converts date string,
“DD:MM:YY” to “13th of December, 2003”
Hint: use a hash table to create a lookup
table for the month strings.
File I/O
Automatic filehandles: STDIN, STDOUT and STDERR
open(<handle name>,"(<|>|>>)filename");
close(<handle name>);
open(INPUTFILE,"<inputs.txt"); #opens file

close(INPUTFILE); #closes file handle
File I/O (cont.)
Opening a file does not always succeed
Check the return value of the open function
if (open(INPUT,"<inputs.txt")) {
    … #do something
} else {
    print "File open failed\n";
}
File I/O (cont.)
The previous method is the standard practice
Unlike other languages, Perl is for lazy people
Ifs can be simplified by the logical operator "||"
For example:
open(INPUT,"<inputs.txt") || die "File open failed\n";
Use the $! variable to display additional operating
system errors
die "cannot append $!\n";
File I/O (cont.)
Filehandles are similar to standard I/O handles
Use the <> operator to read lines
For example:
while (<INPUT>) {
    print "$_\n";
}
Use print <handle_name> <strings> to output
to a file
File I/O (cont.)
File copy example:
open(IN,$a) || die "cannot open $a for reading: $!";
open(OUT,">$b") || die "cannot create $b: $!";
while (<IN>) {    # read a line from file $a into $_
    print OUT $_; # print that line to file $b
}
close(IN) || die "can't close $a: $!";
close(OUT) || die "can't close $b: $!";
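The copy loop can be run end to end against a scratch file. A minimal sketch that differs from the slides in two labeled ways: it uses the three-argument form of open with lexical filehandles (a common modern style, not the slides' bareword handles), and it uses the core File::Temp module to create a temporary directory. The file names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);    # scratch directory, removed at exit
my $src = "$dir/inputs.txt";
my $dst = "$dir/copy.txt";

# Create a small input file to copy
open(my $out, ">", $src) || die "cannot create $src: $!";
print $out "line one\nline two\n";
close($out) || die "can't close $src: $!";

# Copy it line by line
open(my $in,   "<", $src) || die "cannot open $src for reading: $!";
open(my $copy, ">", $dst) || die "cannot create $dst: $!";
my $lines = 0;
while (<$in>) {                     # each line lands in $_
    print $copy $_;
    $lines++;
}
close($in)   || die "can't close $src: $!";
close($copy) || die "can't close $dst: $!";

# Read the copy back to confirm the result
open(my $check, "<", $dst) || die "cannot open $dst for reading: $!";
my @copied = <$check>;
close($check) || die "can't close $dst: $!";

print "copied $lines lines\n";
```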
File I/O (cont.)
File tests provide a convenient way to check file attributes
-e -r -w -x -d -f -l -T -B
For example:
if (-f $name) {
    print "$name is a file\n";
} elsif (-d $name) {
    print "$name is a directory\n";
}
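The file tests can be tried on paths whose type is known in advance. A minimal sketch, assuming a hypothetical helper describe() and a scratch directory created with the core File::Temp module; neither is from the slides.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);   # scratch directory, removed at exit
my $file = "$dir/notes.txt";
open(my $fh, ">", $file) || die "cannot create $file: $!";
print $fh "some text\n";
close($fh) || die "can't close $file: $!";

# describe is a hypothetical helper built on the file test operators
sub describe {
    my ($name) = @_;
    if (-f $name) {
        return "file";
    } elsif (-d $name) {
        return "directory";
    } elsif (!-e $name) {
        return "missing";
    }
    return "other";
}

print describe($file), " ", describe($dir), " ", describe("$dir/nope"), "\n";
```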
Special Variables
$_, @_
$1, $2… - backreferencing variables
$_ = "this is a test";
/(\w+)\W+(\w+)/; # $1 is "this" and $2 is "is"
$`, $& and $' - match variables
$_ = "this is a simple string";
/si.*le/; #$& is now "simple", $` is "this is a " and $' is " string"
And many more… refer to ActivePerl's online
documentation for their functions
Packages and Modules
Concentrate only on their usage in the
Greenstone environment
Package: a mechanism to protect code from
tampering with each other's variables
Module: reusable package that is stored in
<Name of Module>.pm
The ppm (Perl Package Manager) for Linux
and Win32 version of Perl manages
installation and uninstallation of Perl packages
Packages and Modules
Install the module and put "use ModuleName"
or "require ModuleName" near the top of the program
The :: qualifying operator allows references to things
in the package, such as $Module::Variable
So “use Math::Complex module” refers to the
module Math/
new creates an instance of the object, then use
the handle and operator -> to access its
Packages and Modules
use accepts a list of strings as well, so that
we can access the elements directly without the
qualifying operator
For example:
use Module qw(const1 const2 func1 func2 func3);
const1, const2, func1, func2 and func3 can now
be used directly in the program
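The difference between imported and qualified names can be seen with a module that ships with Perl. A minimal sketch using the core List::Util module, chosen here only for illustration; it is not mentioned on the slides.

```perl
use strict;
use warnings;
use List::Util qw(sum max);     # import only sum and max

my @scores = (4, 9, 2);

my $total   = sum(@scores);               # imported: no qualifier needed
my $biggest = max(@scores);               # imported: no qualifier needed
my $lowest  = List::Util::min(@scores);   # not imported: use the :: qualifier

print "$total $biggest $lowest\n";
```

Listing only the names that are needed keeps the program's namespace small and makes it clear where each function comes from.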
Packages and Modules
Perl locates modules by searching the @INC array
The first instance found will be used for the
module referenced within a program
Locating modules is an automatic
process, as the Makefiles and PPM take care of
placing modules in the correct path
Packages and Modules
An example that uses the CGI package
use CGI;                            # uses the CGI module
$query = CGI->new();                # creates an instance of CGI
$bday = $query->param("birthday");  # gets a named parameter
print $query->header();             # outputs html header
print $query->p("Your birthday is $bday."); # outputs text to html
Packages and Modules
Advantages: Encourages code reuse and
less work
Disadvantages: 33% as fast as procedural
Perl according to the book "Object Oriented
Perl"; generating Perl modules involves
some ugly code
Packages and Modules
Huge library