(PERL)
Introduction
What is PERL?
Practical Extraction and Report Language. It is an interpreted language optimized for scanning arbitrary text files, extracting information from them, and printing reports based on that information. Very powerful string handling features. Available on all major platforms.
Main Advantages
Speed of development
You can enter the program in a text file, and just run it. It is an interpreted language; no separate compilation step is needed.
It is powerful
The regular expressions of Perl are extremely powerful. Perl uses sophisticated pattern matching techniques to scan large amounts of data very quickly.
Portability
Perl is a standard language and is available on all platforms. Free versions are available on the Internet.
Flexibility
Perl does not limit the size of your data. If memory is available, Perl can handle the whole file as a single string. Allows one to write simple programs to perform complex tasks.
Assumptions:
For Windows operating system, you can run Perl programs from the command prompt. Run cmd to get command prompt window. For Unix/Linux, you can run directly from the shell prompt.
On Unix/Linux, an additional line (the #! line giving the path of the Perl interpreter) is placed at the beginning of a Perl program so that it can be run directly as a command.
#!/usr/bin/perl
print "Good day\n";
print "This is my first Perl program\n";
Variables
Scalar variables
A scalar variable holds a single value. Other variable types are also available (array and associative array), to be discussed later. A $ is used before the name of a variable to indicate that it is a scalar variable.
$xyz = 20;
Some examples:
$a = 10;
$name = "Indranil Sen Gupta";
$average = 28.37;
Variables do not have any fixed types. Variables can be printed as:
print "My name is $name, the average temperature is $average\n";
Data types:
Perl does not specify the types of variables. It is a loosely typed language. Languages like C or Java require the type of every variable to be declared.
Variable Interpolation
A powerful feature
Variable names are automatically replaced by values when they appear in double-quoted strings.
An example:
$stud = "Rupak";
$marks = 75;
print "Marks obtained by $stud is $marks\n";
print 'Marks obtained by $stud is $marks\n';
The program will give the following output:
Marks obtained by Rupak is 75
Marks obtained by $stud is $marks\n
What do we see: if we need to do variable interpolation, use double quotes; otherwise, use single quotes. Within single quotes, the \n is also printed literally.
Another example:
$Expense = '$100';   # single quotes keep the $ of $100 literal
print "The expenditure is $Expense.\n";
Operations on strings:
Concatenation: the dot (.) is used.
$a = "Good";
$b = "day";
$c = "\n";
$total = $a.$b.$c;   # concatenate the strings
$a .= "day\n";       # add to the string $a
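Arithmetic operations on strings
$a = "bat";
$b = $a;
$b++;
print $a, " and ", $b;
will print bat and bau
The auto-increment operator (++) steps through a purely alphanumeric string character by character, with carry (for example, "az" becomes "ba"). Ordinary arithmetic such as $a + 1 converts the string to a number first, so "bat" + 1 gives 1. May not always be meaningful.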
String repetition operator (x).
$a = $b x 3;
will concatenate three copies of $b and assign it to $a.
print "Ba" . "na" x 2;
will print the string Banana.
String as a Number
A string can be used in an arithmetic expression.
How is the value evaluated? When converting a string to a number, Perl takes any leading spaces, an optional minus sign, and as many digits as it can find (with dot) at the beginning of the string, and ignores everything else.
"23.54" evaluates to 23.54
"123Hello25" evaluates to 123
"banana" evaluates to 0
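The conversion rule above can be tried out with a small sketch. The hash %value and the fourth test string are introduced here only for illustration; the "numeric" warning is switched off locally because Perl would otherwise warn about each partial conversion.

```perl
use strict;
use warnings;

my %value;
{
    # Silence the "isn't numeric" warning for this demonstration only.
    no warnings 'numeric';
    for my $s ("23.54", "123Hello25", "banana", "  -7 apples") {
        $value{$s} = $s + 0;    # adding 0 forces a numeric conversion
        print "[$s] evaluates to $value{$s}\n";
    }
}
```

Leading spaces and the minus sign are consumed, so the last string converts to -7.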
Escaping
The character \ is used as the escape character.
It escapes all of Perl's special characters (e.g., $, @, #, etc.).
$num = 20;
print "Value of \$num is $num\n";
print "The windows path is c:\\perl\\";
Example:
print <<terminator;
Hello, how are you?
Good day.
terminator
Another example:
print "<HTML>\n";
print "<HEAD><TITLE>Test page </TITLE></HEAD>\n";
print "<BODY>\n";
print "<H2>This is a test document.</H2>\n";
print "</BODY></HTML>";
print <<EOM;
<HTML>
<HEAD><TITLE>Test page </TITLE></HEAD>
<BODY>
<H2>This is a test document.</H2>
</BODY></HTML>
EOM
Basic Difference
A list is an ordered collection of scalars.
An array is a variable that holds a list.
Each element of an array is a scalar.
The size of an array:
Lower limit: 0
Upper limit: no specific limit; depends on virtual memory.
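A minimal sketch of how the size of an array can be inspected, and how an array grows on demand. The array @months and the variable names are chosen here for illustration.

```perl
use strict;
use warnings;

my @months = qw(Jan Feb Mar);
my $count = scalar(@months);        # number of elements: 3
my $last  = $#months;               # index of the last element: 2

$months[5] = "Jun";                 # assigning past the end grows the array
my $new_count = scalar(@months);    # now 6; elements 3 and 4 are undef

print "count=$count last=$last new_count=$new_count\n";
```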
List Literal
Examples:
(10, 20, 50, 100)
("red", "blue", "green")
("a", 1, 2, 3, 'b')
($a, 12)
()
(10..20)
('A'..'Z')
The individual elements of the array are scalars, and can be referred to as:
$months[0]   # first element of @months
$months[1]   # second element of @months
Initializing an Array
Two ways:
Specify values, separated by commas.
@color = ("red", "green", "blue", "black");
Use the quote words (qw) function, which uses space as the delimiter:
@color = qw (red green blue black);
Array Assignment
Assign from a list of literals:
@numbers = (1, 2, 3);
@colors = ("red", "green", "blue");
From the contents of another array:
@array1 = @array2;
Using the qw function:
@word = qw (Hello good morning);
Combination of above:
@allcolors = ("white", @colors, "brown");
Some other examples:
@xyz = (2..5);       # (2, 3, 4, 5)
@xyz = (1, @xyz);    # (1, 2, 3, 4, 5)
@xyz = (@xyz, 6);    # (1, 2, 3, 4, 5, 6)
Multiple Assignments
($x, $y, $z) = (10, 20, 30);
($x, $y) = ($y, $x);   # swap elements
($a, @col) = ("red", "green", "blue");
# $a gets the value red
# @col gets the value (green, blue)
($first, @val, $last) = (1, 2, 3, 4);
# $first gets the value 1
# @val gets the value (2, 3, 4)
# $last is undefined
Internet & Web Based Technology 30
Accessing Elements
@list = (1, 2, 3, 4);
$first = $list[0];
$fourth = $list[3];
$list[1]++;       # array becomes (1, 3, 3, 4)
$x = $list[5];    # $x gets the value undef
Example:
@color = qw (red blue green black);   # no commas inside qw
$first = shift @color;       # $first gets red, and @color becomes (blue, green, black)
unshift (@color, "white");   # @color becomes (white, blue, green, black)
Example:
@color = qw (red blue green black);   # no commas inside qw
$first = pop @color;      # $first gets black, and @color becomes (red, blue, green)
push (@color, "white");   # @color becomes (red, blue, green, white)
Reversing an Array
By using the reverse function.
@names = ("Mina", "Tina", "Rina");
@rev = reverse @names;     # Reversed list stored in @rev.
@names = reverse @names;   # Original array is reversed.
Printing an Array
Example:
@colors = qw (red green blue);
print @colors;     # prints without spaces: redgreenblue
print "@colors";   # prints with spaces: red green blue
Another example:
@num = qw (10 2 5 22 7 15);
@new = sort @num;   # @new will contain (10 15 2 22 5 7)
How to sort numerically?
@num = qw (10 2 5 22 7 15);
@new = sort {$a <=> $b} @num;   # @new will contain (2 5 7 10 15 22)
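The comparison block given to sort can be varied in the same way. The following is a small sketch (the array names are illustrative) showing descending numeric order and a case-insensitive string sort using the cmp operator.

```perl
use strict;
use warnings;

my @num  = qw(10 2 5 22 7 15);
my @desc = sort { $b <=> $a } @num;               # numeric, descending

my @names  = qw(Tina mina Rina);
my @plain  = sort @names;                          # string order: uppercase sorts first
my @nocase = sort { lc($a) cmp lc($b) } @names;    # compare lowercased copies

print "@desc\n@plain\n@nocase\n";
```

Swapping $a and $b in the block reverses the order; lc is applied only for the comparison, so the original strings are kept.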
File Handling
File Operations
Opening a file
The open command opens a file and associates it with a file handle. For standard input, we have a predefined handle STDIN.
$fname = "/home/isg/report.txt";
open XYZ, $fname;
while (<XYZ>) {
    print "Line number $. : $_";
}
Checking the error code:
$fname = "/home/isg/report.txt";
open XYZ, $fname or die "Error in open: $!";
while (<XYZ>) {
    print "Line number $. : $_";
}
$.   holds the line number (starting at 1)
$_   holds the line that was just read
$!   holds the error code/message
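The same loop can also be written with a lexical file handle and the three-argument form of open, which keeps the mode separate from the file name. This is a sketch, not the form used in the rest of these notes; a temporary file (created with the File::Temp module) stands in for report.txt so that the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small temporary file to read back.
my ($tmp_fh, $fname) = tempfile(UNLINK => 1);
print $tmp_fh "first line\nsecond line\n";
close $tmp_fh or die "Error in close: $!";

# Three-argument open: '<' means read mode.
open(my $in, '<', $fname) or die "Error in open: $!";
my @numbered;
while (my $line = <$in>) {
    push @numbered, "Line number $. : $line";
}
close $in;
print @numbered;
```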
Appending to a file:
$out = "/home/isg/out.txt";
open XYZ, ">>$out" or die "Error in write: $!";
for $i (1..20) {
    print XYZ "$i :: Hello, the time is ", scalar(localtime), "\n";
}
Closing a file:
close XYZ;
Printing a file:
This is very easy to do in Perl.
$input = "/home/isg/report.txt";
open IN, $input or die "Error in open: $!";
while (<IN>) {
    print;
}
close IN;
STDOUT
Print to standard output (screen).
STDERR
For outputting error messages.
<ARGV>
Reads the names of the files from the command line and opens them all.
The @ARGV array contains the text after the program's name on the command line. <ARGV> takes each file in turn. If there is nothing specified on the command line, it reads from the standard input. Since this is very commonly used, Perl provides an abbreviation for <ARGV>, namely, <>. An example is shown.
$lineno = 1;
while (<>) {
    print "$lineno: $_";
    $lineno++;
}
In this program, the name of the file has to be given on the command line.
perl list_lines.pl file1.txt
perl list_lines.pl a.txt b.txt c.txt
Control Structures
Introduction
There are many control constructs in Perl.
Similar to those in C.
Would be illustrated through examples.
The available constructs:
for
foreach
if/elsif/else
while
do, etc.
Concept of Block
A statement block is a sequence of statements enclosed in a matching pair of { and }.
if ($year == 2000) {
    print "You have entered new millennium.\n";
}
if .. else
General syntax:
if (test expression) {
    # if TRUE, do this
}
else {
    # if FALSE, do this
}
Examples:
if ($name eq "isg") {
    print "Welcome Indranil. \n";
}
else {
    print "You are somebody else. \n";
}
if ($flag == 1) {
    print "There has been an error. \n";
}
# The else block is optional
elsif
Example:
print "Enter your id: ";
chomp ($name = <STDIN>);
if ($name eq "isg") {
    print "Welcome Indranil. \n";
}
elsif ($name eq "bkd") {
    print "Welcome Bimal. \n";
}
elsif ($name eq "akm") {
    print "Welcome Arun. \n";
}
else {
    print "Sorry, I do not know you. \n";
}
while
Example: (Guessing the correct word)
$your_choice = "";
$secret_word = "India";
while ($your_choice ne $secret_word) {
    print "Enter your guess: \n";
    chomp ($your_choice = <STDIN>);
}
print "Congratulations! Mera Bharat Mahan.";
for
Syntax same as in C. Example:
for ($i=1; $i<10; $i++) {
    print "Iteration number $i \n";
}
foreach
Very commonly used construct that iterates over a list. Example:
@colors = qw (red blue green);
foreach $name (@colors) {
    print "Color is $name. \n";
}
Relational Operators
Logical Connectives
If $a and $b are logical expressions, then the following conjunctions are supported by Perl:
$a and $b     $a && $b
$a or $b      $a || $b
not $a        ! $a
Both the above alternatives are equivalent in meaning; the first one is more readable. The word forms (and, or, not) have lower precedence than the symbol forms.
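One practical point about the two sets of connectives is operator precedence: || binds more tightly than assignment, while or binds more loosely. A small sketch (the variable names are made up for this illustration):

```perl
use strict;
use warnings;

my $zero = 0;
my $fallback_used = 0;

# '||' binds tighter than '=': the right-hand side is ($zero || 7).
my $p = $zero || 7;

# 'or' binds looser than '=': this is (my $q = $zero) or ($fallback_used = 1).
my $q = $zero or $fallback_used = 1;

print "p=$p q=$q fallback_used=$fallback_used\n";
```

This is why "open ... or die ..." is usually written with or: the die applies to the result of the whole open call.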
String Functions
The first parameter to split is a regular expression that specifies what to split on. The second specifies what to split.
Another example:
$_ = 'Indranil isg@iitkgp.ac.in 283496';   # single quotes: the @ must not be interpolated
($name, $email, $phone) = split / /, $_;
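A short sketch of split together with its inverse, join. The pattern /\s+/ (one or more whitespace characters) is used here instead of a single space so that repeated blanks do not produce empty fields; the variable names are illustrative.

```perl
use strict;
use warnings;

# Single quotes keep the @ in the e-mail address from being interpolated.
my $record = 'Indranil   isg@iitkgp.ac.in 283496';

my ($name, $email, $phone) = split /\s+/, $record;   # split on runs of whitespace
my $csv = join ",", $name, $email, $phone;           # glue the fields back with commas

print "$csv\n";
```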
Regular Expressions
Introduction
One of the most useful features of Perl. What is a regular expression (RegEx)?
Refers to a pattern that follows the rules of syntax. Basically specifies a chunk of text. Very powerful way to specify string patterns.
Using RegEx
$_ = "Hello good morning everybody";
if ($_ =~ /every/) {
    print "Found the word every \n";
}
Very easy to use. The text between the forward slashes defines the regular expression. If we use !~ instead of =~, the test succeeds when the pattern is not present in the string.
Point to remember:
When performing the matching, all the characters in the string are considered to be significant, including punctuation and white spaces. For example, /every / will not match in the previous example.
Types of RegEx
Basically two types:
Matching
Checking if a string contains a substring. The symbol m is used (optional if forward slash is used as delimiter).
Substitution
Replacing a substring by another substring. The symbol s is used.
Matching
The =~ Operator
Tells Perl to apply the regular expression on the right to the value on the left. The regular expression is contained within delimiters (forward slash by default).
If some other delimiter is used, then a preceding m is essential.
Examples
$string = "Good day";
if ($string =~ m/day/) {
    print "Match successful \n";
}
if ($string =~ /day/) {
    print "Match successful \n";
}
$string = "Good day";
if ($string =~ m@day@) {
    print "Match successful \n";
}
if ($string =~ m[day]) {
    print "Match successful \n";
}
Character Class
Use square brackets to specify any value in the list of possible values.
my $string = "Some test string 1234";
if ($string =~ /[0123456789]/) {
    print "Found a digit \n";
}
if ($string =~ /[aeiou]/) {
    print "Found a vowel \n";
}
if ($string =~ /[0123456789ABCDEF]/) {
    print "Found a hex digit \n";
}
Pattern Abbreviations
Useful in common cases.
.    Anything except newline (\n)
\d   A digit, same as [0-9]
\w   A word character, [0-9a-zA-Z_]
\s   A space character (tab, space, etc)
\D   Not a digit, same as [^0-9]
\W   Not a word character
\S   Not a space character
$string = "Good and bad days";
if ($string =~ /d..s/) {
    print "Found something like days\n";
}
if ($string =~ /\w\w\w\w\s/) {
    print "Found a four-letter word!\n";
}
Anchors
Three ways to define an anchor:
^ :: anchors to the beginning of the string
$ :: anchors to the end of the string
\b :: anchors to a word boundary
if ($string =~ /^\w/) :: does string start with a word character?
if ($string =~ /\d$/) :: does string end with a digit?
if ($string =~ /\bGood\b/) :: does string contain the word Good?
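The three anchor tests can be run against a sample string as follows. This is a sketch; the string is the one used in the earlier abbreviation example, and the flag variables are introduced here for illustration.

```perl
use strict;
use warnings;

my $string = "Good and bad days";

my $starts_with_word = ($string =~ /^\w/)      ? 1 : 0;   # 'G' is a word character
my $ends_with_digit  = ($string =~ /\d$/)      ? 1 : 0;   # last character is 's'
my $has_good         = ($string =~ /\bGood\b/) ? 1 : 0;   # "Good" as a whole word
my $has_day          = ($string =~ /\bday\b/)  ? 1 : 0;   # only "days" is present

print "$starts_with_word $ends_with_digit $has_good $has_day\n";
```

The last test fails because \b requires a word boundary right after "day", and in "days" the next character is a letter.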
Multipliers
There are three multiplier characters.
* :: Find zero or more occurrences
+ :: Find one or more occurrences
? :: Find zero or one occurrence
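A small sketch comparing the three multipliers on the same family of strings. Each pattern is anchored with ^ and $ so that the whole string must match; the test strings and the hash %result are chosen here for illustration.

```perl
use strict;
use warnings;

my %result;
for my $s ("ac", "abc", "abbbc") {
    my $star = ($s =~ /^ab*c$/) ? "yes" : "no";   # zero or more b
    my $plus = ($s =~ /^ab+c$/) ? "yes" : "no";   # one or more b
    my $opt  = ($s =~ /^ab?c$/) ? "yes" : "no";   # zero or one b
    $result{$s} = "$star $plus $opt";
    print "${s}: * $star, + $plus, ? $opt\n";
}
```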
Substitution
Basic Usage
Uses the s character. Basic syntax is:
$new =~ s/pattern_to_match/new_pattern/;
What does this do?
Looks for pattern_to_match in $new and, if found, replaces it with new_pattern. It looks for the pattern once. That is, only the first occurrence is replaced. There is a way to replace all occurrences (to be discussed shortly).
Examples
$xyz = "Rama and Lakshman went to the forest";
$xyz =~ s/Lakshman/Bharat/;
$xyz =~ s/R\w+a/Bharat/;
$xyz =~ s/[aeiou]/i/;
$abc = "A year has 11 months \n";
$abc =~ s/\d+/12/;
$abc =~ s/\n$//;
Common Modifiers
Two commonly used modifiers:
/i :: ignore case
/g :: match/substitute all occurrences
$string = "Ram and Shyam are very honest";
if ($string =~ /RAM/i) {
    print "Ram is present in the string";
}
$string =~ s/m/j/g;   # Ram -> Raj, Shyam -> Shyaj
Examples
$string = "Ram and Shyam are honest";
$string =~ /^(\w+)/;
print $1, "\n";     # prints Ram\n
$string =~ /(\w+)$/;
print $1, "\n";     # prints honest\n
$string =~ /^(\w+)\s+(\w+)/;
print "$1 $2\n";    # prints Ram and\n
$string = "Ram and Shyam are very poor";
if ($string =~ /(\w)\1/) {
    print "found 2 in a row\n";
}
if ($string =~ /(\w+).*\1/) {
    print "found repeat\n";
}
$string =~ s/(\w+) and (\w+)/$2 and $1/;
Example 1
validating user input
print "Enter age (or 'q' to quit): ";
chomp (my $age = <STDIN>);
exit if ($age =~ /^q$/i);
if ($age =~ /\D/) {
    print "$age is a non-number!\n";
}
$&, $` and $'
What is $&?
It represents the string matched by the last successful pattern match.
What is $`?
It represents the string preceding whatever was matched by the last successful pattern match.
What is $'?
It represents the string following whatever was matched by the last successful pattern match.
So actually ...
$` represents pre match
$& represents present match
$' represents post match
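A minimal sketch showing the three variables after one successful match. The sample string and the names $pre, $match and $post are chosen here for illustration.

```perl
use strict;
use warnings;

my $string = "Ram and Shyam are honest";
my ($pre, $match, $post);

if ($string =~ /Shyam/) {
    # Copy the special variables right away; the next match overwrites them.
    ($pre, $match, $post) = ($`, $&, $');
}

print "pre=($pre) match=($match) post=($post)\n";
```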
Associative Arrays
Introduction
Associative arrays, also known as hashes.
Similar to a list
Every list element consists of a pair, a hash key and a value.
Hash keys must be unique.
Accessing an element
Unlike an array, an element value can be found out by specifying the hash key value.
Associative search.
A hash array name must begin with a %.
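The points above can be sketched as follows. The hash contents are sample data; exists is the built-in test for whether a key is present.

```perl
use strict;
use warnings;

my %directory = ("Rabi", 258345, "Chandan", 325129);   # key, value, key, value

my $number = $directory{"Rabi"};                   # look up a value by its key
my $known  = exists $directory{"Atul"} ? 1 : 0;    # 0: no such key in the hash

$directory{"Rabi"} = 111111;                       # keys are unique: the old value is replaced
my $count = scalar(keys %directory);               # still 2 keys

print "$number $known $count $directory{Rabi}\n";
```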
Using the => operator.
%directory = (
    Rabi => 258345,
    Chandan => 325129,
    Atul => 445287,
    Sruti => 237221
);
Whatever appears on the left hand side of => is treated as a double-quoted string.
Modifying a Value
By simple assignment:
@list = qw (Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
%directory = @list;
$directory{Sruti} = 453322;
$directory{Chandan}++;
Deleting an Entry
A (hash key, value) pair can be deleted from a hash array using the delete function.
Hash key has to be specified.
@list = qw (Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
%directory = @list;
delete $directory{Atul};
An Example
List all person names and telephone numbers.
@list = qw (Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
%directory = @list;
foreach $name (keys %directory) {
    print "$name \t $directory{$name} \n";
}
Subroutines
Introduction
A subroutine ..
Is a user-defined function.
Allows code reuse.
Define once, use multiple times.
How to use?
Defining a subroutine
sub test_sub {
    # the body of the subroutine goes here
    # ..
}
Calling a subroutine
Use the & prefix to call a subroutine.
&test_sub;
&gcd ($val1, $val2);   # Two parameters
However, the & is optional.
A subroutine can also return a non-scalar. Some examples are given next.
Example 1
$name = 'Indranil';
welcome();         # call the first sub
welcome_name();    # call the second sub
exit;
sub welcome {
    print "hi there\n";
}
sub welcome_name {
    print "hi $name\n";   # uses global $name variable
}
Example 2
# Return a non-scalar
sub return_alpha_and_beta {
    return ($alpha, $beta);
}
$alpha = 15;
$beta = 25;
@c = return_alpha_and_beta;   # @c gets (15, 25)
Passing Arguments
All arguments are passed into a Perl function through the special array @_.
Thus, we can send as many arguments as we want.
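A minimal sketch of a subroutine inspecting its argument array @_. The name describe_args is hypothetical and is used only for this illustration.

```perl
use strict;
use warnings;

# describe_args is a made-up name for this sketch.
sub describe_args {
    my $count = scalar(@_);    # number of arguments received
    my $first = $_[0];         # first element of @_
    return "$count argument(s), first is $first";
}

my $report = describe_args("red", "green", "blue");
print "$report\n";
```

Since the subroutine only looks at @_, the same definition works for any number of arguments.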
Example 3
# Two different ways to write a subroutine to add two numbers
sub add_ver1 {
    ($first, $second) = @_;
    return ($first + $second);
}
sub add_ver2 {
    return $_[0] + $_[1];   # $_[0] and $_[1] are the first two elements of @_
}
Example 4
$total = find_total (5, 10, -12, 7, 40);   # $total gets 50
sub find_total {
    # adds all numbers passed to the sub
    $sum = 0;
    for $num (@_) {
        $sum += $num;
    }
    return $sum;
}
my variables
We can define local variables using the my keyword.
Confines a variable to a region of code (within a block { } ).
The storage of a my variable is freed whenever the variable goes out of scope.
All variables in Perl are global by default.
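The scoping rule can be seen in a small sketch where the same name is declared with my both outside and inside a block. The variable names are illustrative.

```perl
use strict;
use warnings;

my $x = "outer";
my $seen_inside;
{
    my $x = "inner";       # a new variable, visible only inside this block
    $seen_inside = $x;
    print "inside: $x\n";
}
print "outside: $x\n";     # the outer $x is unchanged
```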
Example 5
$sum = 7;
$total = add_any (20, 10, -15);   # $total gets 15
sub add_any {
    my $sum = 0;   # local variable, won't interfere with global $sum
    for my $num (@_) {
        $sum += $num;
    }
    return $sum;
}
Introduction
Perl provides a number of facilities for writing CGI scripts.
Standard library modules. Included as part of older Perl distributions, with no need to install them separately (the CGI module was removed from the core in Perl 5.22 and is now installed from CPAN).
#!/usr/bin/perl
use CGI qw (:standard);
end_html
This prints out the closing HTML tags, </body> and </html>.
foreach $name_value (@nv_pairs) {
    my ($name, $value) = split /=/, $name_value;
    $name =~ tr/+/ /;
    $name =~ s/%([\da-f][\da-f])/chr (hex($1))/egi;
    $value =~ tr/+/ /;
    $value =~ s/%([\da-f][\da-f])/chr (hex($1))/egi;
    $form_data{$name} = $value;
}
return %form_data;
}
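The decoding steps above can be tried in isolation with the following self-contained sketch. The subroutine name decode_pairs and the sample query string are hypothetical, and it is assumed here that the name=value pairs arrive joined by & characters.

```perl
use strict;
use warnings;

# decode_pairs is a made-up helper for this sketch; it assumes the input
# is a string of name=value pairs joined by '&'.
sub decode_pairs {
    my ($query) = @_;
    my %form_data;
    foreach my $name_value (split /&/, $query) {
        my ($name, $value) = split /=/, $name_value, 2;
        $value = "" unless defined $value;
        for ($name, $value) {
            tr/+/ /;                                 # '+' encodes a space
            s/%([\da-f][\da-f])/chr(hex($1))/egi;    # %XX encodes one character
        }
        $form_data{$name} = $value;
    }
    return %form_data;
}

my %form = decode_pairs("name=Indranil+Sen&msg=100%25+sure%21");
print "$form{name} | $form{msg}\n";
```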
Using CGI.pm
The decoded form value can be directly accessed as:
$value = param ('fieldname');
Example 4
#!/usr/bin/perl -wT
use CGI qw(:standard);
my %form_data;
foreach my $name (param()) {
    $form_data{$name} = param($name);
}
foreach my $xyz (param()) {
    print MAIL "$xyz = ", param($xyz), "\n";
}
close (MAIL);
print <<EOM;
<h2>Thanks for the comments</h2>
<p>Hope you visit again.</p>
EOM
print end_html;