
Perl

There is more than one way to do it


(TIMTOWTDI)
Perl (Practical Extraction and Report
Language) is a scripting language designed
by Larry Wall.

Perl is a language optimized for scanning
arbitrary text files, extracting information
from those text files, and printing reports
based on that information.

(Perl 5 first compiles a program to an internal form and then interprets it)


What is a scripting language?

Operating systems can do many things
− copy, move, create, delete, compare files
− execute programs, including compilers
− schedule activities, monitor processes, etc.

A command-line interface gives you access to
these functions, but only one at a time

A scripting language is a “wrapper” language
that integrates OS functions
Major scripting languages

UNIX has sh, Perl

Macintosh has AppleScript, Frontier

Windows has batch files, VBScript and PowerShell

Generic scripting languages include:
− Perl (most popular)
− Tcl (easiest for beginners)
− Python (new, Java-like, best for large
programs)
Running Perl

#!/usr/local/bin/perl (first line of the script; tells the
system to run the file through perl [only on *nix platforms])

Use the .pl extension

perl programName (to run the program)

perl -d programName (to run under the debugger)

perl -w programName (to enable many useful
warnings; RECOMMENDED)
“Hello, World Of Perl”
#!/usr/local/bin/perl
#Program to do the obvious
print 'Hello world.'; #Print a message

Perl is a free form language

Perl statements end with semicolons

Perl is case-sensitive

Perl is compiled and run in a single operation

Comments are # to end of line
− But the first line, #!/usr/local/bin/perl, tells
where to find the Perl interpreter on your system
Input


The most heavily used input operator is the angle
operator. <STDIN> reads one line from
standard input.

e.g.
chomp($name=<STDIN>);

The <> operator may also be used: it reads from the files
named on the command line, or from STDIN if none are given.
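
The angle operator is usually called in a loop. The following is a minimal sketch, not part of the original slides: the helper name read_names is hypothetical, and an in-memory filehandle stands in for the keyboard so the example runs without typing anything.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from a filehandle, strip the
# trailing newline with chomp, and return the lines as a list.
sub read_names {
    my ($fh) = @_;
    my @names;
    while (my $line = <$fh>) {   # <$fh> returns undef at end of input
        chomp $line;
        push @names, $line;
    }
    return @names;
}

# Assumption for this sketch: an in-memory filehandle replaces STDIN.
my $input = "Larry\nRandal\n";
open(my $fh, '<', \$input) or die "Cannot open in-memory file: $!";
my @names = read_names($fh);
close $fh;

print "Hello, $_\n" for @names;
```

Passing \*STDIN to the helper instead of $fh would read the same way from the keyboard.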
Output
The print and printf functions are built-in functions
used to display output.

The print function arguments consist of a
comma-separated list of strings and/or numbers.

The printf function is similar to the C printf()
function and is used for formatting output.

print value, value, value;
printf ( string format [, mixed args [, mixed ...]] );
eg.
print "Hello, world\n";
printf("Meet %s:Age %d:Salary \$%10.2f\n", 
"John", 40, 55000);
Literals

All computer programs have to handle data. In


every program there are certain kinds of data that
do not change with time. They are called Literals.
Mainly two types of literals exist in Perl.

Numeric Literal

String Literal

Named constants can also be defined in Perl, e.g.-


use constant BUFFER_SIZE => 4096;
Numeric Literal

In Perl numbers can be expressed in decimal,
hexadecimal or octal notation.

Octal numbers are preceded by a 0
eg. 023 is equivalent to decimal 19

Hexadecimal numbers are preceded by 0x
eg. 0xfe is equivalent to decimal 254

Underscore ( _ ) can be used to make large
numbers more readable.
eg. 4_976_297_305 is same as 4976297305
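
A short sketch of the three notations, using the same values as above. The variable names are chosen for illustration only.

```perl
use strict;
use warnings;

my $octal    = 023;             # leading 0 means octal
my $hex      = 0xfe;            # leading 0x means hexadecimal
my $readable = 4_976_297_305;   # underscores are ignored by Perl

# Whatever notation was used in the source, the value prints in decimal.
print "$octal $hex $readable\n";
```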
String Literal

A string is a sequence of characters enclosed by
either double quotes (") or single quotes ('). They
differ in variable substitution and in the way
escape characters are handled.

Example-
"Hell is filled with amateur musicians."
'Give me $500'
Quoting format

Different quoting mechanisms can be used to make
different kinds of values.

Double quotes do -

Variable interpolation (replacement of variable
name by its value)

Backslash interpretation (escape characters)

Single quotes (') do neither; inside them only \\ and \'
are treated specially.


Escape characters
Code       Meaning
\n         Newline
\r         Carriage return
\t         Horizontal tab
\f         Form feed
\b         Backspace
\a         Alert (bell)
\e         ESC character
\033       ESC in octal
\x7f       DEL in hexadecimal
\cC        Control-C
\x{263a}   Unicode (smiley)
\N{Name}   Named character
Escape characters cont.
\N{Name} is mainly used to denote Unicode
characters, as they can't be typed on the
standard keyboards we use.


\u - Force next character to uppercase

\l - Force next character to lowercase

\U - Force all following characters to uppercase

\L - Force all following characters to lowercase

\Q - backslash all following nonalphanumeric
characters

\E - end \U,\L and \Q
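
A minimal sketch of the case and quoting escapes inside double-quoted strings. The variable names are illustrative.

```perl
use strict;
use warnings;

my $name = "perl";

my $capitalized = "\u$name";           # only the next character: Perl
my $shouted     = "\U$name\E rocks";   # everything up to \E: PERL rocks
my $lowered     = "\LSHOUT\E";         # shout
my $quoted      = "\Qa.b*c\E";         # nonalphanumerics backslashed: a\.b\*c

print "$capitalized\n$shouted\n$lowered\n$quoted\n";
```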
Single and double quotes

$a = 'apples';

$b = 'bananas';

print $a . ' and ' . $b;
− prints: apples and bananas

print '$a and $b \n';
− prints: $a and $b \n

print "$a and $b \n";
− prints: apples and bananas followed by a newline
Backquotes

Backquotes (`) execute an external program and
return the output of the program, so that it can be
captured as a single string containing all the lines of
output.

Example-
$cwd =`pwd`; #string output from a command
Variables naming
Naming convention -

Unlike C or Java, Perl variables don't have to be
declared before being used. A variable is created
when it is first used. Variables are identified by the "funny
characters" that precede them.

Since reserved words and filehandles are not
preceded by a 'funny character', variable names will
not conflict with reserved words or filehandles.

A variable name starts with a letter or an underscore; after
that it may consist of any number of letters (an underscore
counts as a letter) and/or digits.
Types
Perl has mainly these three data types -

Type Symbol
Scalar $
Array @
Hash %

$cents - An individual value (number or string)


@large - A list of values, keyed by number
%interest - A group of values, keyed by string
Scalars
Scalar variables can be assigned any form of scalar
value: integers, floating-point numbers, strings, and even
references to other variables, or to objects.
Example:-
$answer = 42;  # an integer
$pi = 3.14159265;  # a "real" number
$avocados = 6.02e23;  # scientific notation
$pet = "Camel";  # string
$fido = Camel->new("Fido");  # an object (assumes a Camel class exists)
Array

An array is an ordered list of scalars, accessed by
the scalar's position in the list.

The list may contain numbers, or strings, or a
mixture of both.

As in C the array indexes start from 0.

Array subscripts are enclosed in square brackets.
Example-
@array = ("couch", 16, 35.68, "32");
Assignments can also be written as-
$array[0] = "couch";
$array[1] = 16;
$array[2] = 35.68;
$array[3] = "32";
Operation on Array
Some commonly used built-in functions:

push - adds new elements to the end of the array

pop - removes last element

shift - removes first element

unshift - adds new elements to the beginning
of the array

splice - removes or adds elements from
some position in the array

sort - sorts the elements of an array

reverse - returns the elements of a
given array in reverse order
Operations on array cont.

push (@food, "eggs", "bread");
− push returns the new length of the list

$sandwich = pop(@food);

$ret  = shift @food;

unshift(@food,"sandwich","cake");

@discarded=splice(@food,2,2);

@sorted = sort(@food);

@reversed=reverse(@food);

$len = @food;  #$len gets length of @food

$#food  #returns index of last element
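
The calls above can be traced on a small array. This is a sketch with an assumed starting value for @food (the slides do not give one); the comments show the array after each step.

```perl
use strict;
use warnings;

my @food = ("apples", "pears", "eels");     # assumed starting contents

my $count    = push(@food, "eggs", "bread"); # apples pears eels eggs bread
my $sandwich = pop(@food);                   # apples pears eels eggs
my $first    = shift(@food);                 # pears eels eggs
unshift(@food, "sandwich", "cake");          # sandwich cake pears eels eggs
my @discarded = splice(@food, 2, 2);         # sandwich cake eggs
my @sorted    = sort(@food);                 # copy: cake eggs sandwich
my @reversed  = reverse(@food);              # copy: eggs cake sandwich
my $len       = @food;                       # array in scalar context: length
my $last      = $#food;                      # index of the last element

print "food: @food\n";
print "length $len, last index $last\n";
```

Note that sort and reverse return new lists and leave @food itself unchanged.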
split

split breaks a string into parts


$info = "Caine:Michael:Actor:14, Leafy Drive";
@personal = split(/:/, $info);

# has the same effect as
@personal =
    ("Caine", "Michael", "Actor", "14, Leafy Drive");
Lists and arrays


Double dot(..) operator for generating numbers
@x = (1..6); # same as (1, 2, 3, 4, 5, 6)
@z =(2..5,8,11..13);#same as (2,3,4,5,8,11,12,13)
qw() "quote word" function
qw(Jan Piet Marie) is a shorter notation for
("Jan","Piet","Marie").
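
A small sketch of the range operator and qw. The last line shows that ranges also work on letters, which the slides do not mention.

```perl
use strict;
use warnings;

my @x       = (1..6);               # 1 2 3 4 5 6
my @z       = (2..5, 8, 11..13);    # ranges and single values can be mixed
my @names   = qw(Jan Piet Marie);   # split on whitespace, no quotes or commas
my @letters = ('a'..'e');           # a b c d e

print "@x\n@z\n@names\n@letters\n";
```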
Hash
A hash is an unordered set of scalars, accessed by
some string value that is associated with each scalar.
For this reason hashes are often called "associative
arrays".
Example -
%longday = (
    "Sun" => "Sunday",
    "Mon" => "Monday",
    "Tue" => "Tuesday",
    "Wed" => "Wednesday",
    "Thu" => "Thursday",
    "Fri" => "Friday",
    "Sat" => "Saturday",
);
Operations on Hash


To get the value with key Wed we use
$longday{"Wed"}

Some operations on hash -


keys - retrieves all the keys in a hash

values - retrieves all the values in a hash

each - retrieves a key/value pair from a hash

delete - removes a key/value pair
Operations on Hash cont.
@keys=keys(%longday);
returns a list of the keys, in no particular order.

@values=values(%longday);
returns a list of the values, in no particular order.

($key,$value)=each(%longday);
returns a two-element list holding the next key and its
corresponding value; repeated calls walk through the hash
in no particular order.

$del=delete $longday{"Fri"};
deletes the key "Fri" and its value from %longday and
returns the deleted value if the key existed.
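
A minimal sketch of these functions on a shortened %longday. Because a hash has no defined order, the keys are sorted before use so the output is predictable.

```perl
use strict;
use warnings;

my %longday = (Sun => "Sunday", Mon => "Monday", Tue => "Tuesday");

my @keys = sort keys %longday;           # Mon Sun Tue

# each returns one key/value pair per call until the hash is exhausted.
my $pairs_seen = 0;
while (my ($key, $value) = each %longday) {
    $pairs_seen++;
}

my $deleted   = delete $longday{"Mon"};  # returns the removed value
my $has_mon   = exists $longday{"Mon"} ? 1 : 0;
my $remaining = keys %longday;           # keys in scalar context: a count

print "keys: @keys\n";
print "deleted $deleted, $remaining entries left\n";
```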


Scope

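By default, a variable in a Perl script is a package (global)
variable: it is visible to the entire script and can be
changed anywhere within the script.

The our, local, and my declarations change the scope and
namespace of a variable: my creates a lexically scoped
variable, local gives a package variable a temporary value,
and our declares a package variable by its short name.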
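
The difference between the three declarations is easiest to see when one subroutine calls another. This is a sketch; the subroutine names are hypothetical.

```perl
use strict;
use warnings;

our $level = "global";            # package variable

sub show_level { return $level; } # always reads the package variable

sub with_local {
    local $level = "localized";   # temporary value, also seen by called subs
    return show_level();
}

sub with_my {
    my $level = "lexical";        # new variable, visible only in this block
    return show_level();          # still sees the package variable
}

my $from_local = with_local();    # localized
my $from_my    = with_my();       # global
my $afterwards = show_level();    # global again: local was undone on exit

print "$from_local $from_my $afterwards\n";
```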
Operators
Associativity  Operators              Description
left           terms and list         These include variables, quote and
               operators (leftward)   quotelike operators, any expression in
                                      parentheses, and any function whose
                                      arguments are parenthesized.
left           ->                     Infix dereference operator.
nonassoc       ++                     Auto-increment.
nonassoc       --                     Auto-decrement.
right          **                     Exponentiation.
right          \                      Reference to an object (unary).
right          ! ~                    Logical negation, bitwise complement.
right          + -                    Unary plus, minus.
left           =~                     Binds a scalar expression to a pattern
                                      match.
left           !~                     Same, but negates the result.
Operators cont.
Associativity  Operators              Description
left           * / % x                Multiplication, division, modulo,
                                      repetition.
left           + - .                  Addition, subtraction, concatenation.
left           >> <<                  Bitwise shift right, bitwise shift left.
nonassoc       named unary            E.g. sleep, sqrt, -r, -e.
               operators
nonassoc       < > <= >=              Numerical relational operators.
nonassoc       lt gt le ge            String relational operators.
nonassoc       == != <=>              Numerical equal, not equal, compare.
nonassoc       eq ne cmp              Stringwise equal, not equal, compare.
                                      Compare operators return -1 (less),
                                      0 (equal) or 1 (greater).
left           &                      Bitwise AND.
left           | ^                    Bitwise OR, exclusive OR.
left           &&                     Logical AND.
left           ||                     Logical OR.
Operators cont.
Associativity  Operators              Description
nonassoc       ..                     In scalar context, range operator. In
                                      list context, enumeration.
right          ?:                     Conditional (if ? then : else) operator.
right          = += -= *= etc.        Assignment operators.
left           ,                      Comma operator, also list element
                                      separator.
left           =>                     Same as comma, but forces the left
                                      operand to be a string.
nonassoc       list operators         The right side of a list operator
               (rightward)            governs all the list operator's
                                      arguments.
right          not                    Low precedence logical NOT.
left           and                    Low precedence logical AND.
left           or xor                 Low precedence logical OR, exclusive
                                      OR.
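
A few expressions that depend on the precedence and associativity listed above. This is a sketch; the comments show how Perl groups each expression.

```perl
use strict;
use warnings;

my $mixed  = 2 + 3 * 4 ** 2;   # 2 + (3 * (4 ** 2))
my $tower  = 2 ** 3 ** 2;      # ** is right associative: 2 ** (3 ** 2)
my $minus  = -2 ** 2;          # ** binds tighter than unary minus: -(2 ** 2)
my $string = "a" . "b" x 3;    # x binds tighter than . : "a" . ("b" x 3)
my $choice = 0 || "default";   # || returns the first true operand

print "$mixed $tower $minus $string $choice\n";
```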
Arithmetic in Perl
$a = 1 + 2;   # Add 1 and 2 and store in $a
$a = 3 - 4;   # Subtract 4 from 3 and store in $a
$a = 5 * 6;   # Multiply 5 and 6
$a = 7 / 8;   # Divide 7 by 8 to give 0.875
$a = 9 ** 10; # Nine to the power of 10, i.e., 9^10
$a = 5 % 2;   # Remainder of 5 divided by 2
++$a;         # Increment $a and then return it
$a++;         # Return $a and then increment it
--$a;         # Decrement $a and then return it
$a--;         # Return $a and then decrement it
String and assignment operators
$a = $b . $c;   # Concatenate $b and $c
$a = $b x $c;   # $b repeated $c times

$a = $b;        # Assign $b to $a
$a += $b;       # Add $b to $a
$a -= $b;       # Subtract $b from $a
$a .= $b;       # Append $b onto $a
Conditional checks

Perl provides mainly two kinds of conditional
checks: the if statement and its variant, the if..elsif
statement.

Perl has no boolean data type. Anything that
evaluates to the empty string, the undefined value, the
number zero or the string "0" is considered false;
everything else is true (including strings like
"00"!).
if statement

Syntax:-

if (Expression) {Block}

if (Expression) {Block} else {Block}


if statements
if ($a)
{
print "The string is not empty\n";
}
else
{
print "The string is empty\n";
}
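
The truth rules can be checked directly. This is a minimal sketch; the helper name truth is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a value in a condition.
sub truth {
    my ($value) = @_;
    return $value ? "true" : "false";
}

# The first four values are the false ones; the rest are true,
# including "00", "0.0" and a single space.
my @values  = (0, "0", "", undef, "00", "0.0", " ", 1);
my @results = map { truth($_) } @values;

print "@results\n";
```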
if-elsif

Syntax:
if (Expression) {Block} elsif (Expression)
{Block}... else {Block}
if - elsif statements
if (!$a)
  { print "The string is empty\n"; }
elsif (length($a) == 1)
  { print "The string has one character\n"; }
elsif (length($a) == 2)
  { print "The string has two characters\n"; }
else
  { print "The string has many characters\n"; }
Loop Structures
while(Expression){Block}

until (Expression) {Block}

do {Block} while (Expression);

do {Block} until (Expression);

for (Expression1;Expression2;Expression3)
{Block}

foreach VARIABLE (ARRAY) {Block}
Break / continue

Stop a loop, or force continuation:


last; # C break
next; # C continue
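
A short sketch of both statements in one loop. The numbers are arbitrary.

```perl
use strict;
use warnings;

my @picked;
for my $n (1..10) {
    next if $n % 2;    # odd number: skip to the next iteration
    last if $n > 6;    # first even number above 6: leave the loop
    push @picked, $n;
}

print "@picked\n";
```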
while loops
#!/usr/local/bin/perl
print "Password? ";
$a = <STDIN>;
chomp $a;    # Remove the newline at end
while ($a ne "fred")
{
    print "sorry. Again? ";
    $a = <STDIN>;
    chomp $a;
}
Until

The until statement executes the block as long as
the control expression after the until is false, or
zero. When the expression evaluates to true
(nonzero), the loop exits.
Until example

until ($answer eq "yes"){
       sleep(1);
       print "Are you o.k. yet? ";
       chomp($answer=<STDIN>);
}
do..while loops
#!/usr/local/bin/perl
do
{
    print "Password? ";
    $a = <STDIN>;
    chomp $a;    # Remove the newline at end
} while ($a ne "fred");
do..until loop

#!/usr/local/bin/perl
do
{
    $a = <>;
    chomp $a;    # Remove the newline at end
    print $a, "\n";
} until ($a == 1);
for loops

for loops are just as in C or Java


for ($i = 0; $i < 10; ++$i)
{
        print "$i\n";
}
foreach

#Visit each item in turn and call it $morsel

foreach $morsel (@food)
{
        print "$morsel\n";  
        print "Yum yum\n"; 
}
Pattern Matching
s/// (substitute) modifies sequences of characters.
tr/// (translation) changes individual characters.
m// (matching; // for short) checks whether a pattern matches.

The part between the first two slashes contains the search
pattern.

For s/// and tr///, the part between the last two slashes contains
the replacement.

Characters after the final slash (modifiers) change the behavior of
the command.

By default s/// replaces only the first occurrence of the search
pattern.
Pattern Matching cont.
s/// and m// accept the following modifiers-
Modifier    Meaning
g           Match or replace every occurrence.
i           Do case-insensitive pattern matching.
m           Treat string as multiple lines (^ and $ also match next to internal \n).
s           Treat string as single line (. also matches \n).
x           Extend your pattern's legibility with whitespace and comments.
o           Compile pattern once only.

Quantifier  Meaning
+           Matches the preceding pattern element one or more times.
?           Matches zero or one times.
*           Matches zero or more times.
{N,M}       Matches at least N and at most M times. {N} means
            exactly N times; {N,} means at least N times.
Basic pattern matching

$sentence =~ /the/
− True if $sentence contains "the"

$sentence = "The dog bites.";
if ($sentence =~ /the/)  # is false
− …because Perl is case-sensitive

!~ is "does not contain"
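
A minimal sketch of the same test, extended with the i modifier from the table above. The variable names are illustrative.

```perl
use strict;
use warnings;

my $sentence = "The dog bites.";

my $exact  = ($sentence =~ /the/)  ? "match" : "no match";  # case differs
my $nocase = ($sentence =~ /the/i) ? "match" : "no match";  # i ignores case
my $no_cat = ($sentence !~ /cat/)  ? "no cat" : "cat";      # !~ negates

print "$exact, $nocase, $no_cat\n";
```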
RE special characters
. # Any single character except a newline

^ # The beginning of the line or string

$ # The end of the line or string

* # Zero or more of the last character

+ # One or more of the last character

? # Zero or one of the last character


RE examples

^.*$ # matches the entire string

hi.*bye # matches from "hi" to "bye" inclusive

x +y # matches x, one or more blanks, and y

^Dear # matches "Dear" only at beginning


bags? # matches "bag" or "bags"

hiss+ # matches "hiss", "hisss", "hissss", etc.


Square brackets
[qjk] # Either q or j or k
[^qjk] # Neither q nor j nor k
[a-z] # Anything from a to z inclusive
[^a-z] # No lower case letters
[a-zA-Z] # Any letter
[a-z]+ # Any non-zero sequence of
# lower case letters
More examples
[aeiou]+ # matches one or more vowels
[^aeiou]+ # matches one or more nonvowels
[0-9]+ # matches an unsigned integer
[0-9A-F] # matches a single hex digit
[a-zA-Z] # matches any letter
[a-zA-Z0-9_]+ # matches identifiers
More special characters
\n # A newline
\t # A tab
\w # Any word char (alphanumeric or _); same as [a-zA-Z0-9_]
\W # Any non-word char; same as [^a-zA-Z0-9_]
\d # Any digit. The same as [0-9]
\D # Any non-digit. The same as [^0-9]
\s # Any whitespace character
\S # Any non-whitespace character
\b # A word boundary, outside [] only
\B # No word boundary
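
The classes above combine with ^ and $ to validate whole strings. This is a sketch: both helper names are hypothetical, and the g modifier in list context (used on the last line to collect every match) is not covered in the slides.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.
sub is_identifier {
    my ($text) = @_;
    return $text =~ /^[a-zA-Z_]\w*$/ ? 1 : 0;   # letter or _, then word chars
}

sub is_unsigned_integer {
    my ($text) = @_;
    return $text =~ /^\d+$/ ? 1 : 0;            # digits only
}

# With g in list context, a match returns every matching substring.
my @words = ("the quick  brown fox" =~ /\b\w+\b/g);

print is_identifier("my_var1"), is_identifier("1abc"), "\n";
print is_unsigned_integer("2024"), is_unsigned_integer("-5"), "\n";
print scalar(@words), " words\n";
```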
Match

m/Good morning/
/Good evening/
/\/usr\/var\/adm/
m#/usr/var/adm#
m(Good evening)
m'$name'
Substitution

s/old/new/;
s/old/new/i;
s/old/new/g;
s+old+new+g;
s(old)/new/; s[old]{new};
s/old/expression to be evaluated/e;
s/old/new/ige;
s/old/new/x;
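
A sketch of how the modifiers change the result of a substitution. The sample strings are invented; the (my $copy = $text) =~ s/.../.../ form copies the string first so the original stays intact.

```perl
use strict;
use warnings;

my $text = "a bug, another Bug, and a BUG";

(my $first_only = $text) =~ s/bug/feature/;    # first match only
(my $every_one  = $text) =~ s/bug/feature/gi;  # all matches, any case

# s/// returns the number of replacements it made.
(my $scratch = $text);
my $count = ($scratch =~ s/bug/feature/gi);

# With e, the replacement is evaluated as Perl code for each match.
my $order = "3 apples and 4 pears";
$order =~ s/(\d+)/$1 * 2/ge;

print "$first_only\n$every_one\n$count\n$order\n";
```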
Transliterate
The tr/// operator modifies individual characters
tr does character-by-character translation

tr returns the number of characters replaced or deleted

Modifiers:
c (complement the search list: anything not in the first list is
replaced by the second, using the last character of the second list)
d (delete characters found in the search list that have no replacement)
s (squeeze sequences of the same replaced character to one character)

$sentence =~ tr/abc/edf/;
− replaces a with e, b with d, c with f

$count = ($sentence =~ tr/*/*/);
− counts asterisks

tr/a-z/A-Z/;
− converts to all uppercase
Examples
# replace first occurrence of "bug"
$text =~ s/bug/feature/;
# replace all occurrences of "bug"
$text =~ s/bug/feature/g;
# convert to lower case
$text =~ tr/A-Z/a-z/;
# delete vowels
$text =~ tr/AEIOUaeiou//d;
# replace nonnumber sequences with a single x
$text =~ tr/0-9/x/cs;
# replace each capital character by CAPS
$text =~ s/[A-Z]/CAPS/g;
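
A minimal sketch of the tr examples on concrete strings. The sample values are invented, and each string is copied first so the result can be inspected on its own.

```perl
use strict;
use warnings;

# Counting: with an empty replacement and no d, the string is unchanged.
my $sentence = "a*b*c";
my $stars = ($sentence =~ tr/*//);

(my $upper     = "hello world") =~ tr/a-z/A-Z/;       # HELLO WORLD
(my $no_vowels = "Education")   =~ tr/AEIOUaeiou//d;  # dctn
(my $squeezed  = "ab12cd345")   =~ tr/0-9/x/cs;       # x12x345

print "$stars $upper $no_vowels $squeezed\n";
```

Unlike s///, tr/// takes plain character lists rather than regular expressions, so the lists are written without square brackets.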
End 1st part
