Perl
Training for beginners by Guru,Suhas
Module 1
Overview of Perl
Objectives
At the end of this module you will know: what Perl is; why to use Perl; where to get Perl; where to use Perl; how Perl compares with other languages; and how to execute a simple Perl script.
What is Perl
Perl is commonly expanded as "Practical Extraction and Report Language". It was developed by Larry Wall. It is a scripting language used mainly for text and data manipulation. It acts as a glue and gateway between systems, databases, and users.
What is Perl
Perl is an interpreted language: it does not need a special compiler to turn scripts into working code. It is mainly used by system administrators and web programmers. Perl is a composition of sed, awk, C, shell, English, and more.
Why Perl
Perl is easy to learn and implement. There is more than one way to do it. Simple tasks are simple, and complex tasks are possible. Perl is portable. Perl is free.
Where is Perl
On a Unix OS, Perl is usually already available. This can be checked with the commands whereis perl and perl -v. Perl can be installed from the following sites:
1. www.perl.com
2. www.activestate.com/ActivePerl/download.htm
Perl help
The command perldoc -f followed by a function name gives the documentation of that built-in Perl function.
The command perl -v gives the version of Perl on the system. On UNIX, documentation is also available through man pages. Online resources: www.cpan.org, www.perlfaq.com (frequently asked questions), www.perl.org (Perl user groups).
On Windows, the script can be executed directly with the above command.
Summary
In this module we have learnt: what Perl is; why to use Perl; where to get Perl and help for Perl; and how to write a Perl program.
Module 2
Data Types
Perl identifies the following types, each marked by its own symbol: scalars ( $ ), arrays ( @ ), and hashes ( % ).
Points on variables
The length of a variable name is not limited. Variable names are case sensitive. Variables need not be pre-declared. No special character other than the underscore can be part of a variable name.
Packages
Perl's package variables are global variables. By default, all variables and functions belong to the main package. New packages can be created with package package_name. A new symbol table is created for each new package.
Packages
Each package has its own symbol table, where it puts all the variable names and functions defined in that package. There is usually only one package in a source file, but there can be more. Other packages can be included in a file with the use keyword: use package_name. Perl looks for the package in the directories listed in the implicit @INC variable (before Perl 5.26 this list also included the current directory).
Modules
A module is a package that gives the programmer more control over how the identifiers in it can be referenced. A module can export identifiers for use in another file's namespace. To do so, the module must be set up as an Exporter module.
Modules
To make the module usable from another file, the use Exporter directive is included in the module.
Exporter must be included in the list that is the value of the implicit variable @ISA. Finally, we must set the implicit variable @EXPORT to a list of names (as strings) of the identifiers to export, e.g., @EXPORT = qw( $x @ar func );
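The Exporter steps above can be sketched in a single file. The package name Greeter and the function greet are hypothetical, chosen only for illustration, and the explicit Greeter->import call stands in for what use Greeter; would do if the package lived in its own file Greeter.pm.

```perl
use strict;
use warnings;

# Hypothetical module, kept in the same file so the sketch runs on its own.
package Greeter;
use Exporter;
our @ISA    = ('Exporter');   # inherit Exporter's import method
our @EXPORT = qw(greet);      # names to export, given as strings

sub greet {
    my ($name) = @_;
    return "hello, $name";
}

package main;

# With Greeter in its own file Greeter.pm, `use Greeter;` would do this step.
Greeter->import;

print greet('world'), "\n";   # greet is now visible in package main
```

Only names listed in @EXPORT become visible in the importing package; everything else must still be called with its full package name, such as Greeter::greet.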
Lexical variables
Lexical variables are created with the my keyword. Lexical variables do not belong to any package. Perl stores lexical variables in a scratchpad rather than in a symbol table. Outside the block (or file) where the lexical variable is defined, the variable ceases to exist.
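A minimal sketch of the difference between a package variable and a lexical variable follows. The variable name $counter is an assumption made for illustration, and our is used here to declare the package variable so that the example runs under strict.

```perl
use strict;
use warnings;

our $counter = 10;    # package variable: lives in the symbol table of main

{
    my $counter = 1;  # lexical variable: visible only inside this block
    $counter++;
    print "inside block:  $counter\n";      # 2
}

print "outside block: $counter\n";          # 10
print "via package:   $main::counter\n";    # 10
```

The lexical $counter never appears in the symbol table, so $main::counter always refers to the package variable.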
Summary
In this module we learnt: the different types of data containers in Perl; their properties; global scope and lexical scope through the package and my keywords; and modules.
Module 3
Scalars
Objectives
By the end of this module we will: define scalars; describe the intricacies of scalars (a. numeric, b. strings); describe interpolation; define context in Perl; and describe the different operators in Perl.
Scalars
Scalars are denoted with the $ symbol. Eg: $variable. They hold a single quantity. There is no distinction between an integer, a float, and a string:
$x = 1;
$x = 2.15555;
$x = "Welcome to the world of Perl";
$x = 'Welcome to the world of Perl';
Quote operators
Besides the conventional single and double quote characters, there are quote operators that work in the same way. q denotes a single quote and is used the following way: $x = q character string character. Eg: $x = q:another way to declare a single quoted string:; qq denotes a double quote and the syntax is $x = qq character string character. Eg: $x = qq:another way to declare a double quoted string:;
Quote operators
Back quotes are used for executing an external command from within a script. Eg: print `dir c:` (listing the contents of the c drive), print `ls` (in UNIX). The equivalent operator for ` is qx. Eg: print qx^del file_name.c^ The qr operator is used for regular expressions.
Interpolation
Interpolation happens in double quoted strings.
It is a process where variables are replaced by their values and metacharacters like \n, \b, \t, \r etc take on their special meanings.
Interpolation
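In the case of arrays, spaces are embedded between the elements. The interpolation rules also hold for the quote operators: qq and qr interpolate like double quotes, while q, like single quotes, does not. Interpolation occurs in double quoted strings passed to print and in regular expressions.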
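The interpolation rules can be illustrated with a short sketch. The variable names and the sample strings are assumptions made for this example only.

```perl
use strict;
use warnings;

my $name  = 'Perl';
my @birds = ('ostrich', 'kingfisher', 'woodpecker');

my $double = "Hello $name\tbirds: @birds";   # variables and \t are interpolated
my $single = 'Hello $name\tbirds: @birds';   # everything is taken literally
my $qq     = qq{Hello $name};                # behaves like double quotes
my $q      = q{Hello $name};                 # behaves like single quotes

print "$double\n$single\n$qq\n$q\n";
```

The array is joined with single spaces in the double quoted string, while the single quoted forms keep the text exactly as typed.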
Context
There are two types of context with respect to scalars: numeric context and string context. The context is determined by the operation on the scalar. In a numeric context, any string values in an expression are converted to numeric values. Similarly, in a string context, numbers are converted to strings.
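As a small illustration of numeric and string context, the sketch below applies a numeric operator and a string operator to the same two scalars. The variable names are assumptions.

```perl
use strict;
use warnings;

my $text   = "10";   # a string that looks like a number
my $number = 5;

my $sum    = $text + $number;   # numeric context: "10" is converted to 10
my $joined = $text . $number;   # string context: 5 is converted to "5"

print "$sum $joined\n";         # 15 105
```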
Numeric operators
Perl determines the context based on the operation. There are separate numeric operators and string operators. The usual arithmetic operators are present in Perl: +, -, *, /, %, ++ and -- (postfix and prefix versions), and the assignment operator =, to list a few. ** is the power operator. Eg: 9 ** 2 (result is 81).
Numeric operators
The logical operators are: && (logical AND), || (logical OR), and ! (negation, NOT).
Numeric operators
The relational operators for numbers are: == (equal), != (not equal), < (less than), > (greater than), <= (less than or equal to), >= (greater than or equal to), and <=> (numeric comparison).
String operators
. is the operator to concatenate two strings. Eg: $x = "Michelle"; $y = "Mclany"; $z = $x . $y; ($z is equal to MichelleMclany; no space is added). x is the repetition operator. Eg: $a = "hello"; $b = 2; $a x $b gives hellohello.
The relational operators for strings are: eq (equality), ne (not equal), lt (less than), gt (greater than), le (less than or equal to), ge (greater than or equal to), and cmp (string comparison).
Cmp operator
The string comparison operator cmp works similarly to <=>. If the LHS string is greater (by ASCII value) than the RHS, it returns 1. If the LHS string is equal to the RHS, it returns 0. If the LHS string is less than the RHS, it returns -1.
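The return values of cmp and <=> can be checked with a short sketch. The sample operands are assumptions chosen to show where string and numeric comparison disagree.

```perl
use strict;
use warnings;

my @results = (
    'apple' cmp 'banana',   # -1: "apple" sorts before "banana"
    'pear'  cmp 'pear',     #  0: the strings are equal
    10      <=> 9,          #  1: 10 is numerically greater than 9
    '10'    cmp '9',        # -1: as strings, "1" sorts before "9"
);

print "@results\n";         # -1 0 1 -1
```

The last two lines compare the same values and give opposite answers, which is why the operator has to match the kind of data being compared.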
Summary
In this module we learned to: define a numeric scalar and a string; describe the different values which Perl supports; identify the different quoting operators used to define a string; define interpolation; describe context; and identify arithmetic operators and logical operators for numerics and strings.
Module 4
Objectives
By the end of this module we will learn to: describe a list; define an array; implement operations on arrays; describe some built-in functions related to arrays; define hashes; implement operations on hashes; describe rules on hashes; and describe some built-in functions related to hashes.
Arrays
( ) is used to define a list.
@array_name = ( "ostrich", "kingfisher", "woodpecker" );
( $x, $y, $z ) = @array_name;   # assigning an array to a list
@array_1 = ( );                 # assigning an empty list to an array
When an array is assigned to a scalar, it returns the length of the array.
Arrays
The qw operator can be used whenever all the elements in the list need to be quoted. Example:
@array = ( "Mickey", "Donald", "Goofy" );   # instead of this
@array = qw( Mickey Donald Goofy );         # note no commas
Arrays
The index of an array starts with 0. If the nth element is to be accessed, the syntax used is $array_name[n-1]. $#array_name is a special variable which gives the index of the last element. Example:
@array_num = ( "mercury", "venus", "earth", "mars" );
print $#array_num;                # result is 3
print $array_num[$#array_num];    # result is mars
Arrays
array_indexing.plx Example:
@array = qw( Delhi Katmandu Canberra London Paris );
for ( $i = 0; $i <= $#array; $i++ ) {
    print "\$array[$i] = $array[$i]\n";
}
Arrays contd.
The output of the previous script would be $array[0] = Delhi $array[1] = Katmandu $array[2] = Canberra $array[3] = London $array[4] = Paris
Arrays contd.
Arrays can be negatively indexed. Perl counts backwards from the end when given negative subscripts. Eg:
@array_neg = ( "a" .. "d" );
print $array_neg[-1];
print $array_neg[-2];
print $array_neg[-3];
print $array_neg[-4];   # the elements are printed in the order d c b a
Array slices
Perl allows us to work with part of an array, called an array slice. Eg:
@array = ( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 );
@array[0, 2, 4, 6, 8] = ( "a", "b", "d", "e", "f" );
# array becomes ( a, 2, b, 4, d, 6, e, 8, f, 10 )
Sort function
The sort function takes an array (or any list) as its parameter and returns a sorted list. The original array is not affected. By default, the sort function compares the elements as strings, as the cmp operator does.
Sort function
Examples:
@arr = qw( xylophone America Thailand );
sort @arr;                   # for strings
sort { $a cmp $b } @arr;     # more explicit version of the previous one
sort { $a <=> $b } @arr;     # for numbers
The sort function normally sorts in ascending order.
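The difference between the default string sort and a numeric sort can be seen in a short sketch. The @numbers array and the descending variant (which swaps $a and $b) are assumptions added for illustration.

```perl
use strict;
use warnings;

my @words   = qw(xylophone America Thailand);
my @numbers = (10, 9, 100, 1);

my @by_string  = sort @words;                   # America Thailand xylophone
my @as_strings = sort @numbers;                 # 1 10 100 9 (string order)
my @by_number  = sort { $a <=> $b } @numbers;   # 1 9 10 100
my @descending = sort { $b <=> $a } @numbers;   # 100 10 9 1

print "@by_string\n@as_strings\n@by_number\n@descending\n";
print "@numbers\n";                             # original is unchanged: 10 9 100 1
```

Uppercase letters sort before lowercase ones in the default comparison, which is why xylophone comes last.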
Reverse function
The reverse function returns a reversed list. Eg:
@array = ( 1, 10, 20, 30, 40, 50 );
@rev_array = reverse @array;
print "@rev_array";   # the result is 50 40 30 20 10 1
The reverse function does not affect the original array.
Push function
The push function appends a list to an array after the highest indexed position. It returns the new length of the array. Eg:
@array = ( "hello", "hi" );
push @array, "bye", "see you";   # @array is now hello, hi, bye, see you
Pop function
The pop function is used to remove the last element from the array. Eg:
@array = ( "a", "b", "c", "d", "e" );
pop @array;   # array becomes ( a b c d ), the last element is removed
Only one element can be removed with pop at a time.
Unshift function
The unshift function is used to add a list to the beginning of an array. It returns the new length of the array. Eg:
@array = qw( tiger lion );
unshift @array, ( "cheetah", "leopard" );   # the array becomes ( cheetah, leopard, tiger, lion )
Shift function
The shift function is used to remove the first element from the array. Eg:
@array = ( 1, 2, 3, 4, 5 );
shift @array;   # @array is now ( 2, 3, 4, 5 )
Splice function
splice is used to add or remove any number of elements anywhere in an array:
splice ( @array_name, offset, number of elements to be removed, list to be added )
Eg:
@array = ( "a", "b", "c", "d", "e", "f" );
splice ( @array, 2, 3, "A", "B", "C" );
# The array is now ( a b A B C f )
# starting from the 3rd element (index 2), 3 elements have been removed and replaced
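The array functions described above can be combined in one short sketch that also captures their return values. The variable names and sample elements are assumptions.

```perl
use strict;
use warnings;

my @array = qw(b c);

my $length  = push @array, 'd', 'e';   # b c d e; returns the new length, 4
my $last    = pop @array;              # returns 'e'; array is b c d
unshift @array, 'a';                   # a b c d
my $first   = shift @array;            # returns 'a'; array is b c d

# From index 1, remove 2 elements and put 3 new ones in their place.
my @removed = splice(@array, 1, 2, 'X', 'Y', 'Z');

print "array:   @array\n";             # b X Y Z
print "removed: @removed\n";           # c d
```

Unlike sort and reverse, all five of these functions modify the array in place.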
Foreach function
The foreach keyword visits each element of the array in turn. It starts from the first element and continues till the last. Eg:
@my_array = qw( Bangalore Mysore Dharwad Coorg );
foreach $var ( @my_array ) {
    print $var;
}
Array Interpolation
When an array is put into double quotes, interpolation takes place. Array interpolation is marked with a space between the elements. Eg:
@array = ( "a", "b", "c", "d", "e" );
print "@array";   # the result would be a b c d e
# note the spaces in between the elements
Hashes
To access a scalar value from a hash, the syntax is $hash{key}. Eg: %my_hash = qw( Mickey Miney Donald Daisy ); The value of $my_hash{Mickey} is Miney. In the initializing list, all the even indexed elements (0, 2, ...) are keys and all the odd indexed elements are the values of those keys.
Rules in Hashes
The keys in a hash are always strings. The keys are always unique, though their values need not be. There should not be an undefined key in a hash.
Keys function
The keys function returns the list of keys from a hash. In a scalar context, the number of keys is returned.
%cities = ( Kar => "Blore", AP => "Hyd", TN => "Chen" );
for $my_key ( keys %cities ) {
    print "key $my_key : value $cities{$my_key} \n";
}
Values function
The values function returns the list of values from a hash.
my %cities = ( Kar => "Blore", AP => "Hyd", TN => "Chennai" );
for my $value ( values %cities ) {
    print "value : $value \n";
}
Each function
The each function returns both a key and a value from a hash. Eg:
my %cities = ( Kar => "Blore", AP => "Hyd", TN => "Chennai" );
while ( ( $key, $value ) = each %cities ) {
    print "Key : $key - Value : $value \n";
}
Exists function
This function tests whether a given key exists in a hash (associative array). It returns true if the key is present, even if the value stored under it is undefined.
Delete Function
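The delete function is used to delete an element from a hash or an array. Eg:
%my_hash = qw( Disneyland USA Tajmahal India );
delete $my_hash{Disneyland};
@my_array = qw( Cool Cold Warm Hot );
delete $my_array[1];
Deleting an element in the middle of an array leaves that slot undefined without shifting the other elements; splice is the usual way to remove array elements.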
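The hash functions above can be tried together in one sketch. It reuses the %cities data from the earlier slides; sorting the keys before printing is an assumption added so that the output order is predictable, since hashes have no defined order.

```perl
use strict;
use warnings;

my %cities = (Kar => 'Blore', AP => 'Hyd', TN => 'Chennai');

my $count = keys %cities;               # scalar context: number of keys, 3

print "Kar is present\n" if exists $cities{Kar};

my $removed = delete $cities{AP};       # delete returns the removed value, 'Hyd'
print "AP is gone\n" unless exists $cities{AP};

for my $key (sort keys %cities) {       # Kar, then TN
    print "$key => $cities{$key}\n";
}
```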
Summary
In this module we learnt to: define arrays; implement operations on arrays; describe built-in functions related to arrays; define hashes; implement operations on hashes; and describe built-in functions related to hashes.
Module 5
Objectives
By the end of this module we will learn to define the branching, looping, and jumping constructs supported by Perl:
a. Branching: if, if elsif else, unless
b. Looping: while, do while, until, for
c. Jumping
Branching statements
The else statement is used when there is a block to be executed if the if condition evaluates to false. Eg:
if ( condition ) {
    Statements to be executed if condition is true
} else {
    Statements to be executed if condition evaluates to false
}
Branching statements
The elsif statement is used if there is more than one condition to be tested. Eg:
if ( condition_1 ) {
    statements
} elsif ( condition_2 ) {
    statements
} else {
    statements
}
Branching statements
The unless statement is used when the block has to be executed if the condition is false.
unless ( condition ) { }
Eg:
unless ( $x == 5 ) {   # it enters the block only if $x is not equal to 5
    print $x++;
}
Looping statements
Looping statements are used to iterate a block n times as long as the condition is met (n can be 0 or more). while is an entry controlled loop statement.
while ( condition ) {   # the statements are executed
    statements          # till the condition is true
}
Eg:
while ( $x < 5 ) {
    print $x++;
}
Looping statements
until is used if a block has to be iterated as long as the condition is false.
until ( condition ) {
    statements
}
Eg:
until ( $x == 5 ) {   # prints hello till $x is 5
    print "hello";
    $x++;
}
Looping statements
The do while loop is an exit controlled loop. do until works similarly.
do { statements } while ( condition );
Eg:
do {
    print $x;
    $x++;
} while ( $x < 10 );

do {
    print $x;
    $x++;
} until ( $x > 10 );
Looping statements
The for statement combines initialization, condition check, and variable manipulation in one line:
for ( initialize; condition; variable manipulation ) {
    statement;
}
Labels
Labels are used to provide more control over program flow through loops. A label consists of any word, usually in uppercase, followed by a colon. The label appears just before the loop operator (while, for, or foreach) and can be used as an anchor for jumping to from within the block.
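A labeled loop can be sketched as follows, using the loop control keywords next and last with a label. The label name OUTER and the loop bounds are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my @pairs;

OUTER: for my $i (1 .. 3) {
    for my $j (1 .. 3) {
        next OUTER if $j > $i;    # jump to the next iteration of the outer loop
        last OUTER if $i == 3;    # leave both loops at once
        push @pairs, "$i$j";
    }
}

print "@pairs\n";                 # 11 21 22
```

Without the label, next and last would act only on the innermost loop.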
Summary
In this module we learnt to: describe the different branching constructs in Perl; describe the different looping constructs in Perl; break out of blocks with last; and jump to a particular point using goto.
Module 6
Subroutines
Objectives
By the end of this module we will learn to: define subroutines in Perl; describe the way of invoking a subroutine; send arguments to a subroutine; describe the @_ special variable; return values from a subroutine; and describe local variables.
Subroutines
Subroutines give us the ability to name a section of code. When the code is needed, the name can be used to call it. The sub keyword is used to define a subroutine block. A function call is made by using the name: subroutine_name( ) or &subroutine_name.
Subroutines
Arguments can be sent to a subroutine: subroutine_name( arg1, arg2, arg3 ) and so on. A subroutine may return a value with the return keyword. More than one value can be returned from a subroutine.
Subroutines - @_ variable
The arguments passed to a subroutine in Perl are put in @_. @_ is a special array, local to each subroutine call. Its elements are aliases for the caller's arguments, so any change made to an element of @_ is reflected back in the original variable. Thus, in Perl, arguments are effectively sent by reference.
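The aliasing behaviour of @_ can be illustrated with two small subroutines. The names double_in_place and double_copy are hypothetical; the second shows the common idiom of copying @_ into lexical variables so that the caller's data is left untouched.

```perl
use strict;
use warnings;

sub double_in_place {
    $_ *= 2 for @_;     # elements of @_ alias the caller's variables
    return;
}

sub double_copy {
    my ($n) = @_;       # copy the argument into a lexical variable
    $n *= 2;            # only the copy changes
    return $n;
}

my ($x, $y) = (1, 2);

double_in_place($x, $y);     # $x is now 2, $y is now 4
my $z = double_copy($x);     # $z is 4, $x is still 2

print "$x $y $z\n";          # 2 4 4
```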
Local variables
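local is the keyword used to localize a global variable: it gives the variable a temporary value that lasts until the enclosing block exits, after which the original value is restored. Any changes made to the local variable in the block are therefore not reflected on the original value. A local variable extends its visibility to the subroutines called from that subroutine. This is dynamic scoping, in contrast to the lexical scoping provided by my.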
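The difference between local and my shows up when another subroutine reads the variable. In this sketch the variable $level and the subroutine names are assumptions made for illustration.

```perl
use strict;
use warnings;

our $level = 'global';

sub show { return $level; }        # reads the package variable

sub with_local {
    local $level = 'localized';    # temporary value, seen by called subroutines
    return show();
}

sub with_my {
    my $level = 'lexical';         # a new variable that show() cannot see
    return show();
}

my @seen = (with_local(), with_my(), show());
print "@seen\n";                   # localized global global
```

After with_local returns, the package variable has its original value again, as the final call to show confirms.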
Subroutines
When arrays or hashes are passed, they flatten out into a single list in @_. To keep them separate, references to the arrays or hashes should be passed. Similarly, to return more than one list, references to the lists should be returned.
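A minimal sketch of flattening, and of passing and returning references to avoid it, follows. All subroutine and variable names are hypothetical.

```perl
use strict;
use warnings;

sub count_flat { return scalar @_; }    # sees one flattened list

sub sizes {
    my ($array_ref, $hash_ref) = @_;    # two separate references
    return (scalar @$array_ref, scalar keys %$hash_ref);
}

sub split_even_odd {
    my (@even, @odd);
    for my $n (@_) {
        if ($n % 2) { push @odd, $n } else { push @even, $n }
    }
    return (\@even, \@odd);             # return references, not flattened lists
}

my @list = (1, 2, 3);
my %map  = (a => 1, b => 2);

my $flat = count_flat(@list, %map);                  # 3 elements + 4 keys and values = 7
my ($list_size, $map_size) = sizes(\@list, \%map);   # 3 and 2
my ($even, $odd) = split_even_odd(1 .. 6);

print "$flat $list_size $map_size\n";                # 7 3 2
print "@$even | @$odd\n";                            # 2 4 6 | 1 3 5
```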
Summary
In this module we learnt to: define subroutines; invoke a subroutine; pass different arguments to a subroutine; return values from a subroutine; and define local variables.
Module 7
References
Objectives
By the end of this module we will learn to: describe a reference; define references for scalars, arrays, hashes, and subroutines; define references to anonymous arrays, hashes, and subroutines; dereference references; create complex data structures by the use of references; define a filehandle; define a typeglob; and describe the @ARGV array.
References
References are scalars that hold the address of another variable. \ is the operator that returns the address of a variable.
References to a scalar
A reference to a scalar is created in the following way:
$x = 20;
$scal_ref = \$x;
$scal_ref has the address of $x. To get the value at that address we need to dereference it. Dereferencing a scalar reference is done by prefixing it with a $. Eg: $$scal_ref gives 20.
Reference to an Array
A reference to an array is created in the following way:
@array = ( 1, 2, 3, 4, 5, 6, 7, 8 );
$arr_ref = \@array;
To dereference an array reference, prefix it with the @ symbol:
@$arr_ref        # whole array is dereferenced
If a single element has to be dereferenced, a $ symbol has to be prefixed and [ ] have to be used:
$$arr_ref[n]     # n is the index value
References to hashes
A reference to a hash is created in the following way: $hash_ref = \%my_hash; To dereference the whole hash at once, prefix the reference with the % symbol: %$hash_ref. To dereference one element at a time, prefix the reference with a $ symbol. Eg: $$hash_ref{key}
References to Subroutines
References to a subroutine can also be defined in a similar way. Eg:
sub my_sub {
    print "hello, how are you doing";
}
$x = \&my_sub;
&$x;   # function call via reference
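The four kinds of reference can be tried together in one sketch. The arrow forms ( -> ) are an equivalent dereferencing syntax not shown on the slides, and the anonymous structure in $record is a hypothetical example of a nested data structure built from references.

```perl
use strict;
use warnings;

my $x       = 20;
my @array   = (1 .. 8);
my %my_hash = (key => 'value');
sub my_sub { return 'hello, how are you doing'; }

my $scal_ref = \$x;
my $arr_ref  = \@array;
my $hash_ref = \%my_hash;
my $code_ref = \&my_sub;

print $$scal_ref, "\n";                        # 20
print scalar(@$arr_ref), "\n";                 # 8 elements
print "$$arr_ref[2] $arr_ref->[2]\n";          # 3 3
print "$$hash_ref{key} $hash_ref->{key}\n";    # value value
print &$code_ref(), "\n";                      # call via reference
print $code_ref->(), "\n";                     # the same call, arrow form

# Anonymous hash holding an anonymous array.
my $record = { name => 'Perl', versions => [4, 5] };
print "$record->{name} $record->{versions}[1]\n";   # Perl 5
```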
File Handlers
File handlers (filehandles) are connections from the Perl script to external sources. The external source can be a pipe, standard output, standard input, a file, a network socket, etc. Through the filehandles, data flows into or out of the script. Filehandles are created with the open function.
Type globs
Perl uses an internal type called a typeglob to hold an entire symbol table entry. It has 6 slots created implicitly. Eg: *foo has $foo, @foo, %foo, &foo, foo (filehandle), and foo (format). Globs are often used to pass filehandles to a subroutine. The main use of typeglobs in modern Perl is to create symbol table aliases: *this = *that;
@ARGV Array
The @ARGV array contains the command line arguments. The file name of the script being executed is not included in it.
foreach $variable ( @ARGV ) {
    print "Element: $variable \n";
}
> perl file_name a b c d
Summary
In this module we learnt to: define a reference to scalars, arrays, hashes, and subroutines; dereference a reference; create anonymous references and thus complex data structures; define a filehandle; define a typeglob; and describe the @ARGV array.
Module 8
File Handling
Objectives
By the end of this module we will learn to Open a file for operations
a. Describe the open function
File Handling
In Perl, a filehandle is required for any operation on a file. A filehandle is linked to a file by the open function:
open ( FILEHANDLE, file_name )
The open function returns true on successful opening.
open ( FILEHANDLE, $x );   # $x contains a file name
When open fails, the die function is typically used to stop the script: the elements of its LIST are printed to STDERR, and it sets the script's return value to $! (errno).
warn function
The warn( ) function has the same functionality as die( ), with the sole exception that the script does not exit. This function is better suited for non-fatal messages. The $! variable is used to display the system error message.
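The contrast between warn and die can be sketched as follows. The file name is a hypothetical one assumed not to exist, the three-argument form of open with a lexical filehandle is used, and eval is an addition here that catches the die so the sketch can keep running and show the message.

```perl
use strict;
use warnings;

my $missing = 'no_such_file_for_demo.txt';   # hypothetical, assumed absent

# warn: report the problem on STDERR and carry on.
my $ok = open(my $fh, '<', $missing);
warn "warn: cannot open $missing: $!\n" unless $ok;

# die: report the problem and stop; here eval catches it instead.
my $survived = eval {
    open(my $fh2, '<', $missing) or die "die: cannot open $missing: $!\n";
    1;
};
my $error = $@;

print "die was caught by eval: $error" unless $survived;
```

Outside an eval block, the die call would end the script at that point with a non-zero exit status.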
Modes of Files
A file can be opened in 3 different modes: read, write, and append.
For read mode the syntax is
The < > operator is used to read from a filehandle. The syntax is $variable = <FILEHANDLE>; The read operation can be done in scalar context (one record at a time) or list context (all remaining records). $/ is the special variable called the input record separator. The default value of $/ is \n.
When the empty <> operator is used, input is read from the files named on the command line; if no files are specified there, it reads from STDIN. $_ is the default variable into which data is read if no variable is provided in a while loop condition.
Read function
The read function reads n bytes or characters from a file:
read ( filehandle, $scal, n )
The second parameter is the scalar variable in which the content read from the file is stored. The function returns the number of bytes actually read.
eof() function
This function tests whether the file pointer of the file specified by the filehandle is at the end of the file. If no argument is supplied, the file tested is the last file that was read.
Writing to a file
print is used to write to a file:
print FILEHANDLE list
Eg:
$\ = "\n";   # output record separator, appended after each print
open ( FH, ">myfile.txt" );
print FH "line1";
print FH "line2";
print FH "line3";
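A write, append, and read round trip can be sketched as follows. This is a sketch under assumptions: it uses the core File::Temp module to get a scratch file name instead of myfile.txt, and it uses lexical filehandles with the three-argument form of open rather than the bareword FH shown above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file that is removed when the script exits.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

open(my $out, '>', $path) or die "cannot write $path: $!";     # write mode
print {$out} "line1\n", "line2\n";
close $out;

open(my $app, '>>', $path) or die "cannot append $path: $!";   # append mode
print {$app} "line3\n";
close $app;

open(my $in, '<', $path) or die "cannot read $path: $!";       # read mode
my $first = <$in>;    # scalar context: one record
my @rest  = <$in>;    # list context: all remaining records
close $in;

chomp($first, @rest);
print "$first | @rest\n";    # line1 | line2 line3
```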
Close ( ) function
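The close( ) function is used to close the file. It also delinks the filehandle from that file.
The filehandle can then be used for a different file.
close ( FILEHANDLE )
Calling close is optional, since a file is closed automatically when its filehandle is reopened or when the script exits.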
Special variables
$0 - Name of the currently executing script.
$$ - Current pid.
$! - The current system error message from errno.
$_ - Default for pattern operators and implicit I/O.
$. - The current input line number of the last filehandle that was read. Reset when the filehandle is closed.
Special variables
$< - UID of the process.
$( - GID of the process.
$? - The status returned by the last `...` command, pipe close, or system operator.
$| - Output is auto flushed if the value is non-zero.
Unlink function
unlink deletes a list of files: unlink ( FILE_LIST ). If FILE_LIST is not specified, then $_ will be used. It returns the number of files successfully deleted. Therefore, it returns false or 0 if no files were deleted.
Seek function
This function sets the file pointer to a specified offset position in a file:
seek ( filehandle, position, whence )
filehandle is the name of the filehandle. position is the number of bytes to be moved. whence is the reference point from where the file pointer moves.
Seek function
whence can take the following 3 values: 0 indicates to move from the beginning of the file; 1 indicates to move from the current position; 2 indicates to move from the end of the file.
Tell function
The tell function returns the current position of the file pointer: tell ( filehandle ). On error it returns -1. If the filehandle is omitted, the last file read is used.
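The seek, tell, read, and eof functions can be exercised together on a scratch file. This is a sketch under assumptions: the core File::Temp module supplies an anonymous read-write temporary file, and the sample text is made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $fh = tempfile();              # scalar context: read-write handle on a temp file
print {$fh} "Hello, Perl";        # 11 bytes

seek($fh, 0, 0) or die "seek failed: $!";    # whence 0: from the beginning
read($fh, my $head, 5);                      # reads "Hello"
my $after_head = tell($fh);                  # position is now 5

seek($fh, -4, 2) or die "seek failed: $!";   # whence 2: 4 bytes before the end
my $before_tail = tell($fh);                 # position is now 7
read($fh, my $tail, 4);                      # reads "Perl"

my $at_end = eof($fh) ? 1 : 0;               # 1: the pointer is at the end
print "$head $after_head $before_tail $tail $at_end\n";   # Hello 5 7 Perl 1
```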
Summary
In this module we learnt about: opening a file in different modes; implementing read, write, and append operations on files; error checking; special variables; and deleting a file.
Module 9
Pattern Matching
Objectives
By the end of this session we will be able to: describe pattern matching in Perl; describe regular expressions; define the pattern matching operators; define the different metacharacters available for pattern matching; describe special variables in pattern matching; and perform substitution and translation.
Pattern matching
Pattern matching is used to find or extract a pattern in a string. It is used mainly for 3 purposes:
Matching: m/regular expression/modifiers
Substitution: s/regular expression/replacement/modifiers
Translation: tr/characters/replacement/modifiers
Pattern matching
m//, s///, and tr/// all search the special variable $_ by default. The binding operator =~ can be used to search other variables. Eg:
$x = "Good Morning Everybody, Enjoy the Perl session";
if ( $x =~ m/Perl/ )
Special operators
The dot ( . ) operator matches one occurrence of any character except a newline. E.g.: /b.ll/ # matches ball, boll, b:ll, b<ll etc
The alternation operator ( | ) is used for specifying alternative patterns. E.g.: /ball|bull/ # matches either ball or bull. Spaces inside a pattern are significant, so none should be placed around the |.
Quantifiers
Quantifiers are characters that specify the number of times a pattern should occur. * is the metacharacter specifying that the character preceding it can occur 0 to n times.
+ is the metacharacter specifying that the character preceding it can occur 1 to n times.
Quantifiers
? is used if the character can occur either 0 times or once. Eg: /ba?y/ matches by or bay. { } braces can be used to specify a range or an exact number. Eg: /1{6}/ means 1 should occur exactly 6 times in a row; /1{2,5}/ means 1 should occur at least 2 times, and any further occurrences up to 5 will be matched.
Special metacharacters
\w matches an alphanumeric character (or underscore)
\W matches a non-alphanumeric character
\d matches a digit
\D matches a non-digit
\s matches a whitespace character (space, \t, \n, \f, \r)
\b matches a word boundary
\B matches a non-boundary
Anchors
Patterns float unless anchored. Anchors are used to match a pattern only at a particular position. ^ is used to anchor a pattern at the beginning. $ is used to anchor a pattern at the end.
Modifiers
Modifiers change the way pattern matching happens:
s makes dot match a newline as well
m tells that the input is a multiline string (helps in anchors)
x allows comments and spaces inside the / /
i allows a case insensitive match
g allows a repetitive (global) match
e evaluates the second part of s/// as a mini-function
Special variables
Perl provides many special variables in pattern matching. The $` variable stores the text before the matched pattern. $& stores the matched text. $' stores the text after the matched pattern.
Special variables
Perl also allows us to extract parts of a string, called substrings. For this, Perl provides numbered variables starting from 1. The part of the pattern to be extracted has to be enclosed within ( ). If the pattern put in ( ) is matched in the string, the matched text is put in these special numbered variables: $1, $2, $3 and so on.
Special variables
Numbered variables are created as the match occurs. The number of variables created depends on the number of matched patterns within ( ). They do not exist before the pattern matching begins.
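The numbered variables and the three match variables can be seen together in a short sketch. The sample string and the date pattern are assumptions made for this example.

```perl
use strict;
use warnings;

my $text = 'shipped on 2024-01-15 by air';
my ($year, $month, $day, $before, $match, $after);

if ($text =~ /(\d{4})-(\d{2})-(\d{2})/) {
    ($year, $month, $day)     = ($1, $2, $3);    # one variable per ( ) group
    ($before, $match, $after) = ($`, $&, $');    # text around and inside the match
}

print "$year/$month/$day\n";            # 2024/01/15
print "$before|$match|$after\n";        # shipped on |2024-01-15| by air
```

The values are copied into ordinary variables straight away because the next successful match overwrites $1, $2, and the rest.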
Backreferences
Back references are used to match a previously matched pattern again in a regular expression. Eg:
$_ = "Wow ! What beautiful blue sea !";
m/(!).*?\1/
A \ followed by the number of the group puts the text previously matched by that group in its place.
Substitution
The s/// operator is used to match and replace a pattern in a text. Interpolation happens both in the pattern and in the replacement section. Only the first occurrence of the pattern is replaced. If all occurrences of the pattern should be replaced, the modifier g (global) should be used.
Translation
tr/// is the operator used to translate one character at a time. y/// can be used instead of tr///. It translates all the matching characters occurring in the input (g need not be used).
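Substitution and translation can be compared side by side in a short sketch. The sample strings are assumptions, and each result is built on a copy so that the original text stays available.

```perl
use strict;
use warnings;

my $text = 'cat bat cat';

(my $once  = $text) =~ s/cat/dog/;       # first occurrence only: dog bat cat
(my $all   = $text) =~ s/cat/dog/g;      # every occurrence:      dog bat dog
(my $upper = $text) =~ tr/a-z/A-Z/;      # character by character: CAT BAT CAT

# e modifier: the replacement is evaluated as Perl code.
(my $calc = '3 apples') =~ s/(\d+)/$1 * 2/e;    # 6 apples

# tr/// returns the number of characters it matched.
my $count_a = ($text =~ tr/a//);                # 3

print "$once\n$all\n$upper\n$calc\n$count_a\n";
```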
Summary
In this session we learnt to: describe pattern matching in Perl; describe regular expressions; define the pattern matching operators; define the different metacharacters available for pattern matching; describe some special variables in pattern matching; and perform substitution and translation.
Module 12
Appendix
Objectives
By the end of this module we will be able to describe pragmas:
a. strict
b. subs
c. integer
Pragmas
Pragmas are compiler directives in Perl
Strict pragma
This pragma generates compiler errors if unsafe programming is detected. There are three specific things that are detected: symbolic references; non-local variables (those not declared with my()); and non-quoted words that aren't subroutine names or filehandles.
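The symbolic reference check can be demonstrated without stopping the script. This is a sketch under assumptions: the variable names are hypothetical, and eval is used only to capture the error that strict raises for this check when the statement runs.

```perl
use strict;
use warnings;

our $counter = 5;
my $name = 'counter';    # a string holding a variable name
my $value;

# Under strict, using a string as a variable name is an error.
my $ok    = eval { $value = ${$name}; 1 };
my $error = $ok ? '' : $@;
print "strict stopped it: $error" unless $ok;

{
    no strict 'refs';    # switch the check off for this block only
    $value = ${$name};   # symbolic reference to $main::counter
}
print "value: $value\n"; # value: 5
```

Turning the check off inside the smallest possible block keeps the rest of the script protected.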
Summary
In this module we learnt to: define a pragma; describe important pragmas in Perl; and describe the English module.