Perl and UNIX Network Programming

Naoya Ito naoya at hatena.ne.jp

Why network programming now?
- httpd is boring
- Some recent web applications have special networking features
  - Comet
  - Socket API of ActionScript 3
- mini server for development, like Catalyst's server.pl

Agenda
- UNIX network programming basics with Perl
- I/O multiplexing
- Perl libraries for modern network programming

UNIX network programming basics with Perl

BSD Socket API with C
#include <netinet/in.h>
#include <strings.h>
#include <sys/socket.h>
#include <unistd.h>

int main (void)
{
    int                listenfd, connfd;
    struct sockaddr_in servaddr;
    char               buf[1024];
    ssize_t            n;

    listenfd = socket(AF_INET, SOCK_STREAM, 0);

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family      = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port        = htons(9999);

    bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr));
    listen(listenfd, 5);

    for (;;) {
        connfd = accept(listenfd, NULL, NULL);
        /* echo back exactly the bytes read; buf is not NUL-terminated,
           so strlen(buf) must not be used for the length */
        while ((n = read(connfd, buf, sizeof(buf))) > 0) {
            write(connfd, buf, n);
        }
        close(connfd);
    }
}

BSD Socket API
- socket()
- struct sockaddr_in
- bind()
- listen()
- accept()
- read() / write()
- close()

Perl Network Programming
- TMTOWTDI
- less code
- CPAN
- performance is good enough
  - right design >> ... >> language advantage

BSD Socket API with Perl
#!/usr/local/bin/perl
use strict;
use warnings;
use Socket;

socket LISTEN_SOCK, AF_INET, SOCK_STREAM, scalar getprotobyname('tcp');
bind LISTEN_SOCK, pack_sockaddr_in(9999, INADDR_ANY);
listen LISTEN_SOCK, SOMAXCONN;

while (1) {
    accept CONN_SOCK, LISTEN_SOCK;
    while (sysread(CONN_SOCK, my $buffer, 1024)) {
        syswrite CONN_SOCK, $buffer;
    }
    close CONN_SOCK;
}

use IO::Socket
#!/usr/local/bin/perl
use strict;
use warnings;
use IO::Socket;

my $server = IO::Socket::INET->new(
    Listen    => 20,
    LocalPort => 9999,
    Reuse     => 1,
) or die $!;

while (1) {
    my $client = $server->accept;
    while ($client->sysread(my $buffer, 1024)) {
        $client->syswrite($buffer);
    }
    $client->close;
}

$server->close;
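
To try the echo server above, a small client is handy. The following is only a sketch: the function name `echo_once` is made up for illustration, and it assumes the server echoes every byte back, as the slides' server does.

```perl
use strict;
use warnings;
use IO::Socket;

# Hypothetical helper: send $message to an echo server and return the reply.
sub echo_once {
    my ($host, $port, $message) = @_;

    my $socket = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
    ) or die "connect to $host:$port failed: $!";

    $socket->syswrite($message);

    # sysread may return fewer bytes than were sent, so keep reading
    # until the whole echo has arrived or the server closes the connection.
    my $reply = '';
    while (length($reply) < length($message)) {
        my $bytes = $socket->sysread($reply, 1024, length $reply);
        last unless $bytes;    # 0 = EOF, undef = error
    }

    $socket->close;
    return $reply;
}
```

With the server from the slide running on port 9999, `print echo_once('127.0.0.1', 9999, "hello\n");` should print the line back.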

blocking on Network I/O
while (1) {
    my $client = $server->accept;
    while ($client->sysread(my $buffer, 1024)) {    # block
        $client->syswrite($buffer);
    }
    $client->close;
}
[Figure: the server is blocked in read(2) on client #1, so it cannot accept(2) client #2, which waits in the listen queue.]

busy loop / blocking
% ps -e -o stat,pid,wchan=WIDE-WCHAN-COLUMN,time,comm

while (1) { $i++ }
STAT   PID WIDE-WCHAN-COLUMN     TIME COMMAND
R+   18684                   00:00:38 perl

while (1) { STDIN->getline }
STAT   PID WIDE-WCHAN-COLUMN     TIME COMMAND
S+    8671 read_chan         00:00:00 perl
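
The same contrast can be measured from inside Perl. This is a rough sketch, not part of the original talk: the helper names are invented, and a timed 4-argument `select` stands in for the blocking read, since both put the process to sleep in the kernel.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# CPU time (user + system) consumed by this process so far.
sub cpu_seconds {
    my ($user, $system) = times;
    return $user + $system;
}

# Run $code and return how much CPU time it consumed.
sub cpu_used_by {
    my ($code) = @_;
    my $before = cpu_seconds();
    $code->();
    return cpu_seconds() - $before;
}

# Compare a busy loop with a blocked (sleeping) process over the same
# wall-clock duration.
sub compare_busy_and_blocked {
    my ($seconds) = @_;

    my $busy = cpu_used_by(sub {
        my $deadline = time + $seconds;
        my $i = 0;
        $i++ while time < $deadline;              # stays runnable (R)
    });

    my $blocked = cpu_used_by(sub {
        select(undef, undef, undef, $seconds);    # sleeps in the kernel (S)
    });

    return (busy => $busy, blocked => $blocked);
}
```

Calling `my %cpu = compare_busy_and_blocked(0.5);` and printing both values should show the busy loop consuming close to the full interval of CPU time while the blocked call consumes almost none.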

Linux internals
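[Figure: path of a blocking read. The process calls fread() (buffered in libc.so), which issues read(2). The system call switches to kernel mode and passes through the VFS, ext3, and the device driver to the hardware (HDD). The user process goes to sleep, moving from TASK_RUNNING to TASK_UNINTERRUPTIBLE, until the hardware interrupt is handled in kernel mode.]

ref: 『Linux カーネル 2.6 解読室』 p.32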

Again: blocking
while (1) {
    my $client = $server->accept;
    while ($client->sysread(my $buffer, 1024)) {    # block
        $client->syswrite($buffer);
    }
    $client->close;
}

We need parallel processing
- fork()
- threads
- Signal I/O
- I/O Multiplexing
- Asynchronous I/O
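
As a point of comparison for the first option in the list, here is a minimal fork-per-connection sketch. It is an illustration only: the function names are hypothetical, and a real server would need more careful error handling and child management.

```perl
use strict;
use warnings;
use IO::Socket;

# Echo everything the client sends until it disconnects.
sub echo_client {
    my ($client) = @_;
    while ($client->sysread(my $buffer, 1024)) {
        $client->syswrite($buffer);
    }
    $client->close;
}

# Accept one connection and hand it to a child process.
# Returns the child's pid in the parent.
sub accept_and_fork {
    my ($server) = @_;
    my $client = $server->accept or return;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {          # child: serve this client only
        $server->close;
        echo_client($client);
        exit 0;
    }

    $client->close;           # parent: the child owns the connection now
    return $pid;
}

# Main loop, shown for completeness (never returns).
sub run_forking_server {
    my ($port) = @_;
    local $SIG{CHLD} = 'IGNORE';    # let the kernel reap finished children
    my $server = IO::Socket::INET->new(
        Listen    => SOMAXCONN,
        LocalPort => $port,
        ReuseAddr => 1,
    ) or die "listen failed: $!";
    accept_and_fork($server) while 1;
}
```

Because each client blocks only its own child, the parent goes straight back to accept(). The cost is one process per connection, which is what I/O multiplexing avoids.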

I/O multiplexing

I/O Multiplexing
- Parallel I/O in a single thread, by watching I/O events on file descriptors
- Uses fewer resources than fork/threads
- select(2) / poll(2)
  - wait for a number of file descriptors to change status

select(2)
[Figure: select(2) watching a listening socket and accepted connections #1 and #2.]
1. The listening socket: "ready!"
2. select(2) to the caller: "now the listening socket is ready to accept a new connection."
3. The caller: "ok, I'll try to accept()"

select(2) on Perl
- select(@args)
  - the number of @args is not 1 but 4
  - difficult interface
- IO::Select
  - OO interface to select(2)
  - easy interface
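
To show why the 4-argument form counts as a difficult interface, here is a small sketch using only the built-in `select` and `vec`. The helper names `fd_mask` and `readable_handles` are made up for this example.

```perl
use strict;
use warnings;

# Build the bit vector that 4-argument select expects: one bit per fileno.
sub fd_mask {
    my @handles = @_;
    my $mask = '';
    vec($mask, fileno($_), 1) = 1 for @handles;
    return $mask;
}

# Return the handles that become readable within $timeout seconds.
sub readable_handles {
    my ($timeout, @handles) = @_;

    my $rout = fd_mask(@handles);     # select overwrites its arguments
    my $nfound = select($rout, undef, undef, $timeout);
    return () if $nfound <= 0;        # 0 = timeout, -1 = error

    # The caller has to test every bit itself to find the ready handles.
    return grep { vec($rout, fileno($_), 1) } @handles;
}
```

IO::Select hides the bit vectors and the scan behind `add` and `can_read`.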

IO::Select SYNOPSIS
use IO::Select;

$s = IO::Select->new();
$s->add(\*STDIN);
$s->add($some_handle);
@ready = $s->can_read($timeout);    # block

use IO::Select
my $listen_socket = IO::Socket::INET->new(...) or die $@;
my $select = IO::Select->new or die $!;
$select->add($listen_socket);

while (1) {
    my @ready = $select->can_read;    # block
    for my $handle (@ready) {
        if ($handle eq $listen_socket) {
            # the listening socket is readable: a new connection is waiting
            my $connection = $listen_socket->accept;
            $select->add($connection);
        }
        else {
            my $bytes = $handle->sysread(my $buffer, 1024);
            if ($bytes) {
                $handle->syswrite($buffer);
            }
            else {
                # EOF (0) or error (undef): stop watching this handle
                $select->remove($handle);
                $handle->close;
            }
        }
    }
}

And more things we must think about...
- blocking when syswrite()
  - use non-blocking socket
- Line-based I/O
- select(2) disadvantage
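
Line-based I/O is a problem because buffered reads such as getline should not be mixed with select: data can sit in Perl's own buffer where select cannot see it. A common workaround is to sysread into a per-client buffer and split off complete lines yourself. The sketch below assumes newline-terminated lines; `extract_lines` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Remove and return every complete line from the buffer.
# An incomplete trailing line stays in the buffer until more data arrives.
sub extract_lines {
    my ($buffer_ref) = @_;
    my @lines;
    while ($$buffer_ref =~ s/\A(.*?\n)//) {
        push @lines, $1;
    }
    return @lines;
}
```

After each sysread, append the new data to the client's buffer and call `extract_lines(\$buffer)`; only whole lines are handed to the application.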

non-blocking socket + Line-based I/O
use POSIX;
use IO::Socket;
use IO::Select;
use Tie::RefHash;

my $server = IO::Socket::INET->new(...);
$server->blocking(0);

my (%inbuffer, %outbuffer, %ready);
tie %ready, "Tie::RefHash";

my $select = IO::Select->new($server);

while (1) {
    # read from every client that has data waiting
    foreach my $client ( $select->can_read(1) ) {
        handle_read($client);
    }

    # echo: queue every complete line for output
    foreach my $client ( keys %ready ) {
        foreach my $request ( @{ $ready{$client} } ) {
            $outbuffer{$client} .= $request;
        }
        delete $ready{$client};
    }

    # flush to every client that can take more data
    foreach my $client ( $select->can_write(1) ) {
        handle_write($client);
    }
}

sub handle_read {
    my $client = shift;

    if ($client == $server) {
        my $new_client = $server->accept();
        $new_client->blocking(0);
        $select->add($new_client);
        return;
    }

    my $data = "";
    my $rv = $client->recv($data, POSIX::BUFSIZ, 0);

    unless (defined($rv) and length($data)) {
        handle_error($client);
        return;
    }

    # keep partial input in the buffer; move complete lines to %ready
    $inbuffer{$client} .= $data;
    while ( $inbuffer{$client} =~ s/(.*\n)// ) {
        push @{$ready{$client}}, $1;
    }
}

sub handle_write {
    my $client = shift;

    return unless exists $outbuffer{$client};

    my $rv = $client->send($outbuffer{$client}, 0);
    unless (defined $rv) {
        warn "I was told I could write, but I can't.\n";
        return;
    }

    if ($rv == length( $outbuffer{$client} ) or $! == POSIX::EWOULDBLOCK) {
        # drop what was sent; keep the rest for the next round
        substr( $outbuffer{$client}, 0, $rv ) = "";
        delete $outbuffer{$client} unless length $outbuffer{$client};
        return;
    }

    handle_error($client);
}

sub handle_error {
    my $client = shift;

    delete $inbuffer{$client};
    delete $outbuffer{$client};
    delete $ready{$client};

    $select->remove($client);
    close $client;
}

oops
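
The handle_write routine above has to cope with partial writes on a non-blocking socket. That one concern can be isolated as a sketch. Here `flush_pending` is a hypothetical helper, and it uses syswrite with the Errno constants rather than the send() call and POSIX constant used in the slides.

```perl
use strict;
use warnings;
use Errno qw(EAGAIN EWOULDBLOCK);

# Try to write as much of the pending buffer as the socket will take.
# Returns the number of bytes written (0 if the socket would block)
# and removes those bytes from the buffer.
sub flush_pending {
    my ($handle, $buffer_ref) = @_;
    return 0 unless length $$buffer_ref;

    my $written = syswrite($handle, $$buffer_ref);
    unless (defined $written) {
        # the kernel buffer is full: try again when select says writable
        return 0 if $! == EAGAIN || $! == EWOULDBLOCK;
        die "write failed: $!";
    }

    substr($$buffer_ref, 0, $written, '');
    return $written;
}
```

The caller keeps one pending buffer per client and calls `flush_pending` only for handles returned by `can_write`, so a slow client never stalls the others.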

select(2) disadvantage
- FD_SETSIZE limitation
  - not good for C10K
- Inefficient processing
  - the list of fds is copied to the kernel on every call
  - you must scan the list of fds in userland

select(2) internals
[Figure: on every select(2) call the process copies its list of fds to the kernel; when an I/O event occurs the kernel copies the list back, and the process scans it with FD_ISSET to find the ready fds.]

ref: http://osdn.jp/event/kernel2003/pdf/C06.pdf

Modern UNIX APIs
- epoll
  - Linux 2.6
- kqueue
  - BSD
- /dev/poll (devpoll)
  - Solaris

epoll(4)
- better than select(2), poll(2)
  - no limit on the number of fds
  - O(1) scalability
    - no need to copy the list of fds on every call
    - epoll_wait(2) returns only the fds that have new events

epoll internals
[Figure: the process calls epoll_create(), registers each fd once with epoll_ctl(ADD), and then calls epoll_wait(); the kernel keeps the fd table and reports only the fds with I/O events.]

ref: http://osdn.jp/event/kernel2003/pdf/C06.pdf

epoll on Perl
- Sys::Syscall
  - epoll
  - sendfile
- IO::Epoll
  - use IO::Epoll qw/:compat/

Perl libraries for modern network programming

Libraries for Perl Network Programming
- TMTOWTDI
  - POE
  - Event::Lib
  - Danga::Socket
  - Event
  - Stem
  - Coro
  - ...
- They provide:
  - Event-based programming for parallel processing
  - system call abstraction
    - select(2) / poll(2) / epoll / kqueue(2) / devpoll

POE
- "POE is a framework for cooperative, event driven multitasking in Perl."
- POE has many "components" on CPAN
- I'm lovin' it :)

Hello, POE
use strict;
use warnings;
use POE qw/Sugar::Args/;

POE::Session->create(
    inline_states => {
        _start => sub {
            my $poe = sweet_args;
            $poe->kernel->yield('hello');    # async / FIFO
        },
        hello => sub {
            STDOUT->print("Hello, POE!");
        },
    },
);

POE::Kernel->run;

Watching handles in Event loop
POE::Session->create(
    inline_states => {
        _start => sub {
            my $poe = sweet_args;
            $poe->kernel->yield('readline');
        },
        readline => sub {
            my $poe = sweet_args;
            STDOUT->syswrite("input> ");
            $poe->kernel->select_read(\*STDIN, 'handle_input');
        },
        handle_input => sub {
            my $poe = sweet_args;
            my $stdin = $poe->args->[0];
            STDOUT->syswrite(sprintf "Hello, %s", $stdin->getline);
            $poe->kernel->yield('readline');
        },
    },
);

Results
% perl hello_poe2.pl
input> naoya
Hello, naoya
input> hatena
Hello, hatena
input> foo bar
Hello, foo bar
input>

Results of strace
% strace -etrace=select,read,write -p `pgrep perl`
Process 8671 attached - interrupt to quit
select(8, [0], [], [], {3570, 620000}) = 1 (in [0], left {3566, 500000})
read(0, "naoya\n", 4096) = 6
write(1, "Hello, naoya\n", 13) = 13
select(8, [0], [], [], {0, 0}) = 0 (Timeout)
write(1, "input> ", 7) = 7
select(8, [0], [], [], {3600, 0}) = 1 (in [0], left {3595, 410000})
read(0, "hatena\n", 4096) = 7
write(1, "Hello, hatena\n", 14) = 14
select(8, [0], [], [], {0, 0}) = 0 (Timeout)
write(1, "input> ", 7) = 7
select(8, [0], [], [], {3600, 0}) = 1 (in [0], left {3598, 860000})
read(0, "foobar\n", 4096) = 7
write(1, "Hello, foobar\n", 14) = 14
select(8, [0], [], [], {0, 0}) = 0 (Timeout)
write(1, "input> ", 7) = 7
select(8, [0], [], [], {3600, 0}

use POE::Wheel::ReadLine
POE::Session->create(
    inline_states => {
        ...
        readline => sub {
            my $poe = sweet_args;
            $poe->heap->{wheel} = POE::Wheel::ReadLine->new(
                InputEvent => 'handle_input',
            );
            $poe->heap->{wheel}->get('input> ');
        },
        handle_input => sub {
            my $poe = sweet_args;
            $poe->heap->{wheel}->put(sprintf "Hello, %s", $poe->args->[0]);
            $poe->heap->{wheel}->get('input> ');
        },
    },
);
...

Parallel echo server using POE
POE::Session->create(
    inline_states => {
        _start => \&server_start,
    },
    package_states => [
        main => [qw/
            accept_new_client
            accept_failed
            client_input
        /],
    ]
);

POE::Kernel->run;

sub server_start {
    my $poe = sweet_args;
    $poe->heap->{listener} = POE::Wheel::SocketFactory->new(
        BindPort     => 9999,
        Reuse        => 'on',
        SuccessEvent => 'accept_new_client',
        FailureEvent => 'accept_failed',
    );
}

sub accept_new_client {
    my $poe = sweet_args;
    my $wheel = POE::Wheel::ReadWrite->new(
        Handle     => $poe->args->[0],
        InputEvent => 'client_input',
    );
    # keep a reference, or the wheel is destroyed when this sub returns
    $poe->heap->{wheel}->{$wheel->ID} = $wheel;
}

sub client_input {
    my $poe = sweet_args;
    my $line     = $poe->args->[0];
    my $wheel_id = $poe->args->[1];
    $poe->heap->{wheel}->{$wheel_id}->put($line);
}

sub accept_failed {}

Again, Parallel echo server using POE
use POE qw/Sugar::Args Component::Server::TCP/;

POE::Component::Server::TCP->new(
    Port        => 9999,
    ClientInput => sub {
        my $poe = sweet_args;
        my $input = $poe->args->[0];
        $poe->heap->{client}->put($input);
    },
);

POE::Kernel->run();

POE has many components on CPAN
- PoCo::IRC
- PoCo::Client::HTTP
- PoCo::Server::HTTP
- PoCo::EasyDBI
- PoCo::Cron
- PoCo::Client::MSN
- PoCo::Client::Linger
- ...

Using POE with epoll
- just use POE::Loop::Epoll
  - use POE qw/Loop::Epoll/;

Event::Lib
- libevent(3) wrapper
  - libevent is used by memcached
- libevent provides:
  - event-based programming
  - devpoll, kqueue, epoll, select, poll abstraction
- Similar to Event.pm
- Simple

echo server using Event::Lib

my $server = IO::Socket::INET->new(...) or die $!;
$server->blocking(0);

event_new($server, EV_READ|EV_PERSIST, \&event_accepted)->add;
event_mainloop;

sub event_accepted {
    my $event = shift;
    my $server = $event->fh;
    my $client = $server->accept;
    $client->blocking(0);
    event_new($client, EV_READ|EV_PERSIST, \&event_client_input)->add;
}

sub event_client_input {
    my $event = shift;
    my $client = $event->fh;
    $client->sysread(my $buffer, 1024);
    event_new($client, EV_WRITE, \&event_client_output, $buffer)->add;
}

sub event_client_output {
    ...
}

Result of strace on Linux 2.6
epoll_wait(4, {{EPOLLIN, {u32=135917448, u64=135917448}}}, 1023, 5000) = 1
gettimeofday({1167127923, 189763}, NULL) = 0
read(7, "gho\r\n", 1024) = 5
epoll_ctl(4, EPOLL_CTL_MOD, 7, {EPOLLIN|EPOLLOUT, {u32=135917448, u64=135917448}}) = 0

Danga::Socket
- by Brad Fitzpatrick - welcome to Japan :)
- It also provides event-driven programming and epoll abstraction
- Perlbal, MogileFS

Summary
- For network programming, you need a little knowledge of the OS, especially process scheduling, I/O, and the TCP/IP implementation.
- Use modern libraries/frameworks to keep your code simple.
- Perl has many good libraries for UNIX network programming.

Thank you!
