Introduction to Perl Programming (Perl 5)

Contents

Basics
Variables and Operators
Branching
Looping
File Test Operators
Regular Expressions
Input and Output
Processing Files mentioned on the Command line
Get Filenames
Pipe Input and Output from/to Unix Commands
Execute Unix Commands
The Perl built-in Functions
Subroutines
Some of the Special Variables
Forking
Building Pipes for forked Children
Building a Socket Connection to another Computer
Get User and Network Information
Arithmetics
Formatting Output with "format"
Command line Switches

Perl is a scripting language that is compiled each time before it runs. So that Unix knows the file is a Perl script, the first line of every Perl script must be the following header:
#!/usr/bin/perl
The path to perl has to be correct, and the line must not exceed 32 characters.

Comments and Commands
After the header line (#!/usr/bin/perl) come empty lines, which have no effect, command lines, or comment lines. Everything from a "#" to the end of the line is a comment and has no effect on the program. Commands start with the first non-space character on a line and end with a ";". A command can therefore continue over several lines and is terminated only by the semicolon.

Direct commands and subroutines
Normal commands are executed in the order written in the script. Subroutines can be placed anywhere and are only evaluated when called from a normal command line. Perl knows it is a subroutine if the code is preceded by "sub" and enclosed in a block like:
sub name { command; }

Other special lines
Perl can include other programming code with: require something or with use something.

Single quote: '' or: q//
Double quote: "" or: qq//
Quote for execution: `` or: qx//
Quote a list of words: ('term1','term2','term3') or: qw/term1 term2 term3/
Quote a quoted string: qq/"$name" is $name/;
Quote something which contains "/": qq!/usr/bin/$file is ready!;

Scalar and list context

Perl's distinction between scalar and list context is a major feature that sets it apart from most other scripting languages. A subroutine can return a list, not only a scalar as in C. An array yields its number of elements in scalar context and the elements themselves in list context.
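The two contexts can be illustrated with a minimal sketch. The array contents and the subroutine name min_max are made up for this example and do not come from the original text.

```perl
use strict;
use warnings;

my @names = ('Fred', 'John', 'Mary');

# List context: the array yields its elements.
my @copy = @names;

# Scalar context: the same array yields its number of elements.
my $count = @names;

# A subroutine can return a list, which the caller unpacks directly.
# (min_max is a hypothetical helper for this illustration.)
sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}
my ($min, $max) = min_max(7, 3, 9);

print "count=$count min=$min max=$max\n";
```

Assigning to a parenthesized list of scalars, as in the last assignment, is what puts the call into list context.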

Variables and Operators
There are scalar variables, one and two dimensional arrays and associative arrays. Instead of declaring a variable, one precedes it with a special character. $variable is a normal scalar variable. @variable is an array and %variable is an associative array. The user of perl does not have to distinguish between a number and a string in a variable. Perl switches the type if necessary.

Fill in a scalar with:
$price = 300;
$name = "JOHN";
Calculate with it like:
$price *= 2;
$price = $oldprice * 4;
$count++;
$worth--;
Print out the value of a scalar with:
print $price,"\n";

Fill in a value:
$arr[0] = "Fred";
$arr[1] = "John";
Print out this array:
print join(' ',@arr),"\n";
If two dimensional:
$arr[0][0] = 5;
$arr[0][1] = 7;

Hashes (Associative Arrays)
Fill in a single element with:
$hash{'fred'} = "USA";
$hash{'john'} = "CANADA";
Fill in the entire hash:
%a = (
    'r1', 'this is val of r1',
    'r2', 'this is val of r2',
    'r3', 'this is val of r3',
);

or with:
%a = (
    r1 => 'this is val of r1',
    r2 => 'this is val of r2',
    r3 => 'this is val of r3',
);

Put something into a variable with "=" or with a combined operator which assigns and does something at the same time:
$var = "string";      Puts the string into $var
$var = 5;             Puts a number into $var
$var .= "string";     Appends string to $var
$var += 5;            Adds number to $var
$var *= 5;            Multiplies $var by 5
$var ||= 5;           If $var is false (0, empty or undefined), makes it 5
$var x= 3;            Repeats $var three times as a string: from a to aaa
Modify and assign with:
($new = $old) =~ s/pattern/replacement/;
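A short sketch of the combined operators in action may help. All variable names and values here are illustrative assumptions.

```perl
use strict;
use warnings;

my $var = "a";
$var x= 3;          # repeat the string three times
$var .= "b";        # append a string

my $limit = 0;
$limit ||= 5;       # 0 is false, so the default 5 is assigned

my $old = "hello world";
# Copy $old into $new first, then modify only the copy.
(my $new = $old) =~ s/world/perl/;

print "$var $limit $new $old\n";
```

The parentheses around the assignment matter: without them the substitution would act on $old and $new would receive the number of replacements.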

Compare strings with: eq ne like in: $name eq "mary". Compare numbers with: == != >= <= <=> like in: $price == 400.

Act on success or failure of an expression: $yes or die; means exit if $yes is not set. For AND we have && and "and"; for OR we have || and "or". NOT is "!" or "not". AND, OR and NOT are regularly used in if() statements:
if($first && $second){....;}
if($first || $second){....;}
if($first && ! $second){....;}
The last one means that $first must be non-zero but $second must be zero or unset. Many NOTs can be handled more readably with the unless() statement. Instead of:
print if ! $noway;
one uses:
print unless $noway;


Branching
if
if(condition){
    command;
}elsif(condition){
    command;
}else{
    command;
}
command if condition;

unless (just the opposite of if)
unless(condition){
    command;
}else{
    command;
}
command unless condition;

Looping
while
while(condition){
    command;
}
# Go prematurely to the next iteration
while(condition){
    command;
    next if condition;
    command;
}
# Prematurely abort the loop with last
while(condition){
    command;
    last if condition;
}
# Prematurely go to the next iteration but do continue{} in any case
while(condition){
    command;
    next if condition;
    command;
}continue{
    command;
}
# Redo the loop without evaluating while(condition)
while(condition){
    command;
    redo if condition;
}
command while condition;
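The interplay of next, last and the continue{} block can be sketched as follows. The counters and the array @seen are invented for this example; it assumes nothing beyond core Perl.

```perl
use strict;
use warnings;

my $i     = 0;
my $ticks = 0;
my @seen;

while ($i < 10) {
    next if $i % 2;     # odd number: skip the rest, continue{} still runs
    last if $i > 6;     # leave the loop; continue{} is NOT run here
    push @seen, $i;
} continue {
    # Runs after every normal pass and after every "next".
    $i++;
    $ticks++;
}

print "seen=@seen i=$i ticks=$ticks\n";
```

Putting the increment into continue{} keeps the loop from spinning forever on the iterations that are skipped with next.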

until (just the opposite of while)
until(condition){
    command;
}
until(condition){
    command;
    next if condition;
    command;
}
until(condition){
    command;
    last if condition;
}
until(condition){
    command;
    next if condition;
    command;
}continue{
    command;
}
command until condition;

for (=foreach)
# Iterate over @data and have each value in $_
for(@data){
    print $_,"\n";
}
# Get each value into $info iteratively
for $info (@data){
    print $info,"\n";
}
# Iterate over a range of numbers
for $num (1..100){
    next if $num % 2;
    print $num,"\n";
}
# Eternal loop with (;;)
for (;;){
    $num++;
    last if $num > 100;
}

map
# syntax
map (command,list);
map {comm1;comm2;comm3;} list;
# example
map (rename($_,lc($_)),<*>);

File Test Operators
File test operators check the status of a file. Some examples:
-f $file         It's a plain file
-d $file         It's a directory
-r $file         Readable file
-x $file         Executable file
-w $file         Writable file
-o $file         We are owner
-l $file         File is a link
-e $file         File exists
-z $file         File has zero size, but exists
-s $file         File is greater than zero
-t FILEHANDLE    This filehandle is connected to a tty
-T $file         Text file
-B $file         Binary file
-M $file         Returns the age of the file in days since its last modification
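A few of these operators can be tried on a scratch file. The sketch below assumes the core module File::Temp is available and creates a temporary file purely for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create an empty temporary file that is removed when the program ends.
my ($fh, $file) = tempfile(UNLINK => 1);

my $was_empty = -z $file ? 1 : 0;   # exists, but has zero size

print $fh "data\n";
close $fh;

my $is_file = -f $file ? 1 : 0;     # plain file?
my $is_dir  = -d $file ? 1 : 0;     # directory?
my $size    = -s $file;             # size in bytes if greater than zero

print "empty=$was_empty file=$is_file dir=$is_dir size=$size\n";
```

Note that -s returns the size itself rather than just a true value, so it can be used both as a test and as a measurement.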

Regular Expressions

What it is
A regular expression is an abstract formulation of a string. Usually one has a search pattern and a match which is the found string. There is also a replacement for the match, if a substitution is made.

A pattern stands for either one, any number, several, a particular number or no occurrences of a character or a character set given literally, abstractly or in octal.
PATTERN          MATCH
.                any character (dot)
.*               any number of any character (dot asterisk)
a*               the maximum of consecutive a's
a*?              the minimum of consecutive a's
.?               one or none of any character
.+               one or more of any character
.{3,7}           three up to seven of any character, but as many as possible
.{3,7}?          three up to seven, but the fewest number possible
.{3,}            at least 3 of any character
.{3}             exactly 3 times any character
[ab]             a or b
[^ab]            not a and also not b
[a-z]            any of a through z
^a  \Aa          a at beginning of string
a$  a\Z          a at end of string
A|bb|CCC         A or bb or CCC
tele(f|ph)one    telefone or telephone
\w               A-Z or a-z or 0-9 or _
\W               none of the above
\d               0-9
\D               none of 0-9
\s               space or \t or \n (white space)
\S               non space
\t               tabulator
\n               newline
\r               carriage return
\b               word boundary
\bkey            matches key but not housekey
(?#.......)      Comment
(?i)             Case insensitive match. This can be inside a pattern variable.
(?:a|b|c)        a or b or c, but without string in $n
(?=.....)        Match ..... but do not store in $&
(?!.....)        Anything but ..... and do not store in $&
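The difference between maximal and minimal matching, and the effect of \b and (?i), can be sketched like this. The sample strings are invented for the illustration.

```perl
use strict;
use warnings;

my $text = "aaa <b>bold</b> and <i>italic</i>";

# Greedy: .+ takes as much as possible, up to the last ">".
my ($greedy) = $text =~ /(<.+>)/;

# Non-greedy: .+? stops at the first ">" it can.
my ($lazy) = $text =~ /(<.+?>)/;

# \b demands a word boundary in front of "key".
my $in_housekey = ("housekey" =~ /\bkey/) ? 1 : 0;
my $in_my_key   = ("my key"   =~ /\bkey/) ? 1 : 0;

# (?i) switches on case insensitivity, (?:...) groups without capturing.
my $phone = ("TELEPHONE" =~ /(?i)tele(?:f|ph)one/) ? 1 : 0;

print "$greedy\n$lazy\n$in_housekey $in_my_key $phone\n";
```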

One can replace found matches with the s/pattern/replacement/; statement. The "s" is the command. It is followed by three delimiters, with the search pattern between the first two and the replacement between the last two. If there are "/" within the pattern or the replacement, one chooses another delimiter than "/", for instance "!". To change the content of a variable do:
$var =~ s/pattern/replacement/;
To put the changed value into another variable without altering the original variable do:
($name = $line) =~ s/^(\w+).*$/$1/;
COMMAND                              WHAT IT DOES
s/A/B/;                              substitute the first A in a string with B
s/A/B/g;                             substitute every A with a B
s/A+/A/g;                            substitute any number of A's with one A
s/^#//;                              substitute a leading # with nothing, i.e. remove it
s/^/#/;                              prepend a # to the string
s/A(\d+)/B$1/g;                      substitute A followed by a number with B followed by the same number
s/(\d+)/$1*3/e;                      substitute the found number with 3 times its value
s/here goes date/$date/g;            substitute "here goes date" with the value of $date
s/(Masumi) (Nakatomi)/$2 $1/g;       switch the two terms
s/\000//g;                           remove null characters
s/$/\015/;                           append a ^M (carriage return) to make it readable for DOS
Use two "e" to get an eval effect:
perl -e '$aa = 4; $bb = q($aa); $bb =~ s/(\$\w+)/$1/ee; print $bb,"\n";'
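As a rough sketch of substitutions with captured groups and with the "e" modifier, consider the following. The input strings are assumptions chosen for the example.

```perl
use strict;
use warnings;

my $line = "Masumi Nakatomi";
# Swap the two captured words; the original $line stays untouched.
(my $swapped = $line) =~ s/(\w+) (\w+)/$2 $1/;

my $stock = "3 apples and 10 pears";
# With /e the replacement is evaluated as Perl code; /g repeats it for every number.
(my $tripled = $stock) =~ s/(\d+)/$1*3/ge;

my $comment = "# disabled line";
# Remove a leading "#" together with the spaces that follow it.
(my $active = $comment) =~ s/^#\s*//;

print "$swapped\n$tripled\n$active\n";
```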

Input and Output
Output a value from a variable
print $var,"\n";

Output a formatted string
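The original gives no example under this heading. One common way to produce formatted output in Perl is printf, or sprintf when the result should go into a variable; the sketch below uses it with made-up values and is not necessarily what the author had in mind.

```perl
use strict;
use warnings;

# %-10s : string, left adjusted in 10 columns
# %5.2f : floating point number, 5 columns wide, 2 decimals
# %03d  : integer, padded with zeros to 3 digits
my $line = sprintf("%-10s|%5.2f|%03d", "abc", 3.14159, 7);
print $line, "\n";

# printf formats and prints in one step.
printf("%-20s%-30s\n", "fred", "/home/fred");
```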

Read in a value into a variable and remove the newline
chomp() (perl5) removes a newline if one is there. The chop() (perl4) removes any last character.
chomp($var = <STDIN>);

Read in a file and process it linewise
open(IN,"<filename") || die "Cannot open filename for input\n";
while(<IN>){
    command;
}
close IN;

Read a file into an array
open(AAA,"<infile") || die "Cannot open infile\n"; @bigarray = <AAA>; close AAA;

Output into a file
open(OUT,">file") || die "Cannot open file for output\n";
while(condition){
    print OUT $mystuff;
}
close OUT;

Check, whether open file would yield something (eof)
open(IN,"<file") || die "Cannot open file\n"; if(eof(IN)){ print "File is empty\n"; }else{ while(<IN>){ print; } } close IN;

Process Files mentioned on the Commandline
The empty filehandle "<>" reads in each file iteratively. The name of the currently processed file is in $ARGV. For example, print each line of several files prepended with its filename:
while(<>){
    $file = $ARGV;
    print $file,"\t",$_;
    open(IN,"<$file") or warn "Cannot open $file\n";
    ....commands for this file....
    close(IN);
}

Get Filenames

Get current directory at once
@dir = <*>;

Use current directory iteratively
while(<*>){ ...commands... }

Select files with <>
@files = </longpath/*.c>;

Select files with glob()
This is the official way of globbing:
@files = glob("$mypath/*$suffix");

Perl can also read a directory itself, without a globbing shell. This is faster and more controllable, but one has to use opendir() and closedir().
opendir(DIR,".") or die "Cannot open dir.\n";
# Assign to $_ explicitly: older perls do not do this automatically for readdir
while(defined($_ = readdir DIR)){
    rename $_,lc($_);
}
closedir(DIR);

Pipe Input and Output from/to Unix Commands
Process Data from a Unix Pipe
open(IN,"unixcommand|") || die "Could not execute unixcommand\n"; while(<IN>){ command; } close IN;

Output Data into a Unix Pipe
open(OUT,"|more") || die "Could not open the pipe to more\n";
for $name (@names){
    $length = length($name);
    print OUT "The name $name consists of $length characters\n";
}
close OUT;

Execute Unix Commands
Execute a Unix Command and forget about the Output
system("someprog -auexe -fv $filename");

Execute a Unix Command and store the Output into a Variable
If it's just one line or a string:
chomp($date = qx!/usr/bin/date!);
The chomp() (perl5) removes the trailing "\n". $date gets the date. If the command gives a series of lines, one puts the output into an array:
chomp(@alllines = qx!/usr/bin/who!);

Replace the whole perl program by a unix program
exec "anotherprog";
But then the perl program is gone.

The Perl built-in Functions
String Functions
Get all upper case with:                            $name = uc($name);
Get only first letter uppercase:                    $name = ucfirst($name);
Get all lowercase:                                  $name = lc($name);
Get only first letter lowercase:                    $name = lcfirst($name);
Get the length of a string:                         $size = length($string);
Extract 5-th to 10-th characters from a string:     $part = substr($whole,4,6);
Remove line ending:                                 chomp($var);
Remove last character:                              chop($var);
Crypt a string:                                     $code = crypt($word,$salt);
Execute a string as perl code:                      eval $var;
Show position of substring in string:               $pos = index($string,$substring);
Show position of last substring in string:          $pos = rindex($string,$substring);
Quote all metacharacters:                           $quote = quotemeta($string);

Array Functions
Get expressions for which a command returned true:       @found = grep(/[Jj]ohn/,@users);
Apply a command to each element of an array:             @new = map(lc($_),@start);
Put all array elements into a single string:             $string = join(' ',@arr);
Split a string and make an array out of it:              @data = split(/&/,$ENV{'QUERY_STRING'});
Sort an array:                                           sort(@salary);
Reverse an array:                                        reverse(@salary);
Get the keys of a hash (associative array):              keys(%hash);
Get the values of a hash:                                values(%hash);
Get key and value of a hash iteratively:                 each(%hash);
Delete an array:                                         @arr = ();
Delete an element of a hash:                             delete $hash{$key};
Check for a hash key:                                    if(exists $hash{$key}){;}
Check whether a hash has elements:                       scalar %hash;
Cut off last element of an array and return it:          $last = pop(@IQ_list);
Cut off first element of an array and return it:         $first = shift(@topguy);
Append an array element at the end:                      push(@waiting,$name);
Prepend an array element to the front:                   unshift(@nowait,$name);
Remove first 2 elements and replace them with $var:      splice(@arr,0,2,$var);
Get the number of elements of an array:                  scalar @arr;
Get the last index of an array:                          $lastindex = $#arr;
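Several of the array and hash functions listed above can be seen working together in a small sketch. The data is invented for the example.

```perl
use strict;
use warnings;

my @queue = ('b', 'c');
push(@queue, 'd');              # append at the end
unshift(@queue, 'a');           # prepend at the front
my $last  = pop(@queue);        # remove and return the last element
my $first = shift(@queue);      # remove and return the first element

my @arr = (1, 2, 3, 4);
splice(@arr, 0, 2, 'x');        # replace the first two elements with 'x'

my @users = ('john', 'Mary', 'Johnny');
my @found = grep(/[Jj]ohn/, @users);
my @lower = map(lc($_), @users);

my %hash = (fred => 'USA', john => 'CANADA');
my $keys = join(',', sort(keys(%hash)));   # keys come unordered, so sort them

print "@queue | $first $last | @arr | @found | @lower | $keys\n";
```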

File Functions
Open a file for input:               open(IN,"</path/file") || die "Cannot open file\n";
Open a file for output:              open(OUT,">/path/file") || die "Cannot open file\n";
Open for appending:                  open(OUT,">>$file") || &myerr("Couldn't open $file");
Close a file:                        close OUT;
Set permissions:                     chmod 0755, $file;
Delete a file:                       unlink $file;
Rename a file:                       rename $file, $newname;
Make a hard link:                    link $existing_file, $link_name;
Make a symbolic link:                symlink $existing_file, $link_name;
Make a directory:                    mkdir $dirname, 0755;
Delete a directory:                  rmdir $dirname;
Reduce a file's size:                truncate $file, $size;
Change owner- and group-ID:          chown $uid, $gid, $file;
Find the real file of a symlink:     $file = readlink $linkfile;
Get all the file infos:              @stat = stat $file;

Conversions Functions
Number to character:           chr $num;
Character to number:           ord($char);
Hex to decimal:                hex("4F");
Octal to decimal:              oct("0700");
Get localtime from time:       localtime(time);
Get Greenwich mean time:       gmtime(time);
Pack variables into string:    $string = pack("C4",split(/\./,$IP));
Unpack the above string:       @arr = unpack("C4",$string);
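The conversion functions can be checked with a short sketch. The IP address and the other values are made up; note that hex() and oct() expect their argument as a string.

```perl
use strict;
use warnings;

my $char = chr(65);             # number to character
my $num  = ord('A');            # character to number
my $dec  = hex("4F");           # hexadecimal string to decimal
my $perm = oct("0755");         # octal string to decimal

my $ip = "192.168.1.10";
# Pack the four numbers into four unsigned bytes ...
my $packed = pack("C4", split(/\./, $ip));
# ... and unpack them into a list again.
my @octets = unpack("C4", $packed);

print "$char $num $dec $perm ", length($packed), " ", join('.', @octets), "\n";
```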

Subroutines (=functions in C++)
Define a Subroutine
sub mysub { command; }

sub myerr {
    print "The following error occurred:\n";
    print $_[0],"\n";
    &cleanup;
    exit(1);
}

Call a Subroutine

Give Arguments to a Subroutine

Receive Arguments in the Subroutine
As global variables:
sub mysub {
    @myarr = @_;
}
sub mysub {
    ($dat1,$dat2,$dat3) = @_;
}

As local variables:
sub mysub { local($dat1,$dat2,$dat3) = @_; }
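Calling a subroutine and handing over arguments might look like the following sketch. The subroutine add_tax and its values are hypothetical; the example uses my, which Perl 5 offers for lexically scoped variables in addition to local.

```perl
use strict;
use warnings;

# Hypothetical subroutine: receives its arguments through @_.
sub add_tax {
    my ($price, $rate) = @_;    # private copies of the arguments
    return $price + $price * $rate;
}

# Call with arguments; the return value goes into a scalar.
my $total = add_tax(200, 0.5);

# The older call syntax with a leading & works as well.
my $again = &add_tax(100, 0.5);

print "$total $again\n";
```

Variables declared with my are visible only inside the enclosing block, so they cannot collide with variables of the same name elsewhere in the program.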

Some of the Special Variables
$_        String from current loop, e.g. for(@arr){ $field = $_ . " ok"; }
$.        Line number from current file processed with: while(<XX>){
$0        Program name
$$        Process id of current program
$<        The real uid of current program
$>        Effective uid of current program
$|        For flushing output: select XXX; $| = 1;
$&        The match of the last pattern search
$1....    The ()-embraced matches of the last pattern search
$`        The string to the left of the last match
$'        The string to the right of the last match

Forking
Forking is very easy! Just fork. One puts the fork in a three-way if(){} to separate the parent, the child and the error case. fork returns the child's process id to the parent, 0 to the child, and undef on failure.
if($pid = fork){
    # Parent
    command;
}elsif(defined $pid){
    # Child
    command;
    # The child must end with an exit!!
    exit;
}else{
    # Error
    die "Fork did not work\n";
}
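A minimal runnable sketch of a fork, assuming a Unix-like system: the child exits with an arbitrary status (7, chosen only for illustration) and the parent collects it with waitpid.

```perl
use strict;
use warnings;

my $pid = fork;
die "Fork did not work: $!\n" unless defined $pid;

if ($pid == 0) {
    # Child: do its work, then end with an exit.
    exit 7;
}

# Parent: wait for the child and read its exit status from $?.
waitpid($pid, 0);
my $status = $? >> 8;

print "child $pid exited with status $status\n";
```

Waiting for the child avoids leaving a zombie process behind; the upper byte of $? holds the value the child passed to exit.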

Building Pipes for forked Children
Building a Pipe

Flushing the Pipe
select(WRITEHANDLE); $| = 1; select(STDOUT);

Setting up two Pipes between the Parent and a Child
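# pipe(READHANDLE,WRITEHANDLE): what is written to TOCHILD is read from FROMPARENT
pipe(FROMPARENT,TOCHILD); select(TOCHILD); $| = 1; select(STDOUT);
pipe(FROMCHILD,TOPARENT); select(TOPARENT); $| = 1; select(STDOUT);
if($pid = fork){
    # Parent: close the ends used by the child
    close FROMPARENT; close TOPARENT;
    command;
}elsif(defined $pid){
    # Child: close the ends used by the parent
    close FROMCHILD; close TOCHILD;
    command;
    exit;
}else{
    # Error
    command;
    exit;
}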
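A single pipe from a child to its parent can be sketched as follows, assuming a Unix-like system. The handle names and the message are invented; lexical filehandles are used here instead of the bareword handles of the text above.

```perl
use strict;
use warnings;

# pipe() creates a connected pair: a read end and a write end.
pipe(my $from_child, my $to_parent) or die "pipe failed: $!\n";

my $pid = fork;
die "Fork did not work: $!\n" unless defined $pid;

if ($pid == 0) {
    # Child: only writes, so close the read end.
    close $from_child;
    print {$to_parent} "hello from child\n";
    close $to_parent;           # closing flushes the buffered output
    exit 0;
}

# Parent: only reads, so close the write end.
close $to_parent;
my $msg = <$from_child>;
close $from_child;
waitpid($pid, 0);
chomp $msg;

print "parent got: $msg\n";
```

Each side closes the end it does not use; otherwise the reader would never see end-of-file, because a write end would still be open somewhere.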

Building a Socket Connection to another Computer

# Somewhere at the beginning of the script
require 5.002;
use Socket;
use sigtrap;
# Prepare infos
$port = 80;
$remote = 'remotehost.domain';
$iaddr = inet_aton($remote);
$paddr = sockaddr_in($port,$iaddr);
$proto = getprotobyname('tcp');
# Socket
socket(S,AF_INET,SOCK_STREAM,$proto) or die $!;
# Flush socket
select(S); $| = 1; select(STDOUT);
# Connect
connect(S,$paddr) or die $!;
# Print to socket
print S "something\n";
# Read from socket
$gotit = <S>;
# Or read a single character only
read(S,$char,1);
# Close the socket
close(S);

Get Unix User and Network Information
Get the password entry for a particular user with:
@entry = getpwnam("$user");
Or by user ID:
@entry = getpwuid("$UID");
One can get information for group, host, network, services and protocols in the same way with the commands: getgrnam, getgrgid, gethostbyname, gethostbyaddr, getnetbyname, getnetbyaddr, getservbyname, getservbyport, getprotobyname, getprotobynumber. If one wants to get all the entries of a particular category one can loop through them by:

setpwent;
while(@he = getpwent){
    commands...
}
endpwent;

For example: Get a list of all users with their home directories:
setpwent; while(@he = getpwent){ printf("%-20s%-30s\n",$he[0],$he[7]); } endpwent;

The same principle works for all the above data categories. But most of them need a "stayopen" behind the set command.

Arithmetics
Addition:                   +
Subtraction:                -
Multiplication:             *
Division:                   /
Raise to the power of:      **
Raise e to the power of:    exp()
Modulus:                    %
Square root:                sqrt()
Absolute value:             abs()
Arctangent:                 atan2()
Sine:                       sin()
Cosine:                     cos()
Random number:              rand()

Formatting Output with "format"
This is meant as a simplification of printf formatting. One defines the format once, and it is then used for every write to the specified filehandle. Prepare a format somewhere in the program:
format filehandle =
@<<<<<<<<<<@###.#####@>>>>>>>>>>@||||||||||
$var1, $var2, $var3, $var4
.

Now use write to print into that filehandle according to the format:
write filehandle;

The @<<< does left adjustment, the @>>> right adjustment, @##.## is for numericals and @||| centers.

Command line Switches
Show the version number of perl:                  perl -v;
Check a new program without running it:           perl -wc <file>;
Have an editing command on the command line:      perl -e 'command';
Automatically print while processing lines:       perl -pe 'command' <file>;
Remove line endings and add them again:           perl -lpe 'command' <file>;
Edit a file in place:                             perl -i -pe 'command' <file>;
Autosplit the lines while editing:                perl -ane 'print if $F[3] =~ /ETH/;' <file>;
Have an input loop without printing:              perl -ne 'command' <file>;
