Unix Programming with Perl Cybozu Labs, Inc. Kazuho Oku
Writing correct code
  • tests aren’t enough
      • tests don’t ensure that the code is correct
  • writing correct code requires…
      • knowledge of perl and knowledge of the OS
  • the presentation covers the aspects of unix programming using perl, including
      • errno
      • fork and filehandles
      • Unix signals
  Oct 16 2010 Unix Programming with Perl
Errno
The right way to “create a dir if not exists”
  Is this OK?
    if (! -d $dir) {
        mkdir $dir
            or die "failed to create dir:$dir:$!";
    }
The right way to “create a dir if not exists” (2)
  No!
    if (! -d $dir) {
        # what if another process created a dir
        # while we are HERE?
        mkdir $dir
            or die "failed to create dir:$dir:$!";
    }
The right way to “create a dir if not exists” (3)
  The right way is to check the cause of the error when mkdir fails
The right way to “create a dir if not exists” (4)
  So, is this OK?
    if (mkdir $dir) {
        # ok, dir created
    } elsif ($! =~ /File exists/) {
        # ok, directory exists
    } else {
        die "failed to create dir:$dir:$!";
    }
The right way to “create a dir if not exists” (5)
  No! The message stored in $! depends on the OS and / or the locale.
    if (mkdir $dir) {
        # ok, dir created
    } elsif ($! =~ /File exists/) {
        # ok, directory exists
    } else {
        die "failed to create dir:$dir:$!";
    }
The right way to “create a dir if not exists” (6)
  The right way is to use Errno.
    use Errno ();
    if (mkdir $dir) {
        # ok, created dir
    } elsif ($! == Errno::EEXIST) {
        # ok, already exists
    } else {
        die "failed to create dir:$dir:$!";
    }
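The check above can be wrapped into a small helper. This is only a sketch: the name ensure_dir and its return values are made up for illustration. It also adds one step the slide does not show: mkdir reports EEXIST even when the existing path is not a directory, so the helper confirms with -d afterwards.

```perl
use strict;
use warnings;
use Errno ();
use File::Temp qw(tempdir);

# Hypothetical helper: create $dir unless it already exists as a directory.
sub ensure_dir {
    my ($dir) = @_;
    return 'created' if mkdir $dir;
    my $errno = $! + 0;    # save both forms of $! before any other
    my $msg   = "$!";      # call gets a chance to overwrite it
    # EEXIST is also reported when a non-directory occupies the path,
    # so confirm that what exists really is a directory.
    return 'exists' if $errno == Errno::EEXIST && -d $dir;
    die "failed to create dir:$dir:$msg";
}

my $base = tempdir(CLEANUP => 1);
print ensure_dir("$base/work"), "\n";    # first call creates the directory
print ensure_dir("$base/work"), "\n";    # second call finds it in place
```

Because the decision is made from the result of mkdir itself, there is no window between a check and the creation for another process to slip into.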
$! and Errno
  • $! is a dualvar
      • is a number in numeric context (ex. 17)
          • equiv. to the errno global in C
      • is a string in string context (ex. “File exists”)
          • equiv. to strerror(errno) in C
  • the Errno module
      • list of constants (numbers) that errno ($! in numeric context) may take
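A minimal sketch of the two faces of $!, setting errno by hand so that no failing system call is needed. The helper name errno_views is made up for illustration. The last value uses the %! hash, which the Errno module also provides and which is true for the key matching the current errno.

```perl
use strict;
use warnings;
use Errno ();

# Hypothetical helper: return the numeric view, the string view,
# and the %! lookup for one errno value.
sub errno_views {
    $! = Errno::EEXIST;    # assigning a number to $! sets errno
    return ($! + 0, "$!", $!{EEXIST} ? 1 : 0);
}

my ($num, $str, $flag) = errno_views();
print "numeric context: $num\n";     # the value is OS-dependent
print "string context:  $str\n";     # the text is OS- and locale-dependent
print "\$!{EEXIST}:       $flag\n";
```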
How to find the Errnos
  • perldoc -f mkdir doesn’t include a list of errnos it might return
  • see man 2 mkdir
      • man mkdir will show the man page of the mkdir command
      • specify section 2 for system calls
      • specify section 3 for C library calls
How to find the Errnos (2)
  • the errnos listed in the man pages include those defined by POSIX as well as OS-specific constants
  • the POSIX spec. can be found at opengroup.org, etc.
      http://www.opengroup.org/onlinepubs/000095399/
Fork and filehandles
Filehandles aren’t cloned by fork
  • fork clones the memory image
      • uses CoW (Copy-on-Write) for optimization
  • fork does not clone file handles
      • only increments the refcount to the file handle in the OS
      • the file is left open until both the parent and the child close it
      • seek position and lock states are shared between the processes
      • the same goes for TCP / UDP sockets
File handles aren’t cloned by fork (2)
  [Figure: the Parent Process and the Child Process (created by fork), each with its own memory, refer to one File Control Info. (seek pos. / lock state, etc.) in the Operating System; Another Process that calls open(file) gets a separate File Control Info.; both entries refer to the File (lock owner, etc.) in the File System.]
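The shared seek position can be observed directly. The sketch below (the helper name is made up for illustration) lets the child read a few bytes and then asks the parent where its descriptor points. It uses sysread and sysseek so that PerlIO buffering does not blur the picture.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR);
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical helper: returns the parent's file offset after the
# child has read $nbytes from the inherited filehandle.
sub offset_after_child_read {
    my ($nbytes) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    defined syswrite($fh, "0123456789") or die "write failed: $!";
    sysseek($fh, 0, SEEK_SET) or die "seek failed: $!";

    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # child: sysread maps directly onto read(2)
        sysread($fh, my $buf, $nbytes);
        POSIX::_exit(0);    # skip END blocks and destructors in the child
    }
    waitpid($pid, 0);

    # parent: query the current offset without moving it
    return sysseek($fh, 0, SEEK_CUR) + 0;
}

print "parent offset after child read: ", offset_after_child_read(4), "\n";
```

The parent never reads from the file, yet its offset moves by the number of bytes the child consumed, because both descriptors refer to the same entry in the OS.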
Examples of resource collisions due to fork
  • FAQ
      • “The SQLite database becomes corrupt”
      • “MySQL reports malformed packet”
  • mostly due to sharing a single DBI connection created before calling fork
      • SQLite uses file locks for access control
          • a file lock is needed for each process, but after fork the lock is shared between the processes
      • in the case of MySQL, a single TCP (or unix) connection is shared between the processes
Examples of resource collisions due to fork (2)
  The wrong code…
    my $dbh = DBI->connect(...);
    my $pid = fork;
    if ($pid == 0) {
        # child process
        $dbh->do(...);
    } else {
        # parent process
        $dbh->do(...);
    }
How to avoid resource collisions after fork
  • close the file handle in the child process (or in the parent) right after fork
      • only the refcount will be decremented; lock states / seek positions do not change
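A sketch of why closing in the child is harmless to the parent, assuming a perl whose flock is backed by flock(2) (the behavior may differ where perl emulates flock with fcntl locks). The helper name is made up for illustration. The parent takes a lock, the child closes its inherited copy, and a fresh open of the same file then checks whether the lock is still held.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical helper: returns 1 if the parent's lock is still held
# after the child has closed its copy of the filehandle.
sub lock_survives_child_close {
    my ($fh, $path) = tempfile(UNLINK => 1);
    flock($fh, LOCK_EX) or die "flock failed: $!";

    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        close $fh;          # only drops the child's reference
        POSIX::_exit(0);
    }
    waitpid($pid, 0);

    # an independent open gets its own file control info, so a
    # non-blocking lock attempt fails while the parent's lock is held
    open(my $other, '<', $path) or die "open failed: $!";
    my $got = flock($other, LOCK_EX | LOCK_NB);
    return $got ? 0 : 1;
}

print "lock still held by parent: ", lock_survives_child_close(), "\n";
```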
How to avoid collisions after fork (DBI)
  • undef $dbh in the child process doesn’t work
      • since the child process will run things such as unlocking and / or rollbacks on the shared DBI connection
  • the connection needs to be closed without running such operations
How to avoid collisions after fork (DBI) (2)
  the answer: use InactiveDestroy
    my $pid = fork;
    if ($pid == 0) {
        # child process
        $dbh->{InactiveDestroy} = 1;
        undef $dbh;
        ...
    }
How to avoid collisions after fork (DBI) (3)
  if fork is called deep inside a module and can’t be modified, then…
    # thanks to tokuhirom, kazeburo
    BEGIN {
        no strict qw(refs);
        no warnings qw(redefine);
        *CORE::GLOBAL::fork = sub {
            my $pid = CORE::fork;
            # $pid is undef when fork fails, so test definedness first
            if (defined $pid && $pid == 0) {
                # do the cleanup for child process
                $dbh->{InactiveDestroy} = 1;
                undef $dbh;
            }
            $pid;
        };
    }
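The same override can be tried without a database. In this sketch the @child_hooks registry and the helper fork_and_report are made up for illustration; the hook writes to a pipe so the parent can confirm that it ran inside the child. The override has to be installed in a BEGIN block so that calls to fork compiled afterwards pick it up.

```perl
use strict;
use warnings;
use POSIX ();

our @child_hooks;    # hypothetical registry of callbacks to run in each child

BEGIN {
    no warnings qw(once redefine);
    *CORE::GLOBAL::fork = sub {
        my $pid = CORE::fork();
        if (defined $pid && $pid == 0) {
            $_->() for @child_hooks;    # child-side cleanup goes here
        }
        return $pid;
    };
}

# Hypothetical helper: fork once and return what the hook reported.
sub fork_and_report {
    pipe(my $r, my $w) or die "pipe failed: $!";
    local @child_hooks = (sub { syswrite $w, "hook ran in child\n" });

    my $pid = fork;    # resolves to the override installed above
    die "fork failed: $!" unless defined $pid;
    POSIX::_exit(0) if $pid == 0;

    close $w;
    my $line = <$r>;
    waitpid($pid, 0);
    return $line;
}

print fork_and_report();
```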
How to avoid collisions after fork (DBI) (4)
  other ways to change the behavior of fork
  • POSIX::AtFork (gfx)
      • Perl wrapper for pthread_atfork
      • can change the behavior of fork(2) called within XS
  • forks.pm (rybskej)
Close filehandles before calling exec
  • file handles (file descriptors) are passed to the program started by exec
      • some tools (setlock of daemontools, Server::Starter) rely on this feature
  • OTOH, it is good practice to close the file handles that needn’t be passed to the exec’ed process, to prevent the child process from accidentally using them
Close file handles before calling exec (2)
    my $pid = fork;
    if ($pid == 0) {
        # child process, close filehandles
        $dbh->{InactiveDestroy} = 1;
        undef $dbh;
        exec ...;
        ...
    }
Close file handles before calling exec (3)
  • some OSes have the O_CLOEXEC flag
      • designates the file descriptors to be closed when exec(2) is called
      • is OS-dependent
          • linux supports the flag, OSX doesn’t
      • not usable from perl?
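Separately from the O_CLOEXEC open flag, the older per-descriptor FD_CLOEXEC flag can be read and changed from perl through fcntl and the Fcntl constants. The sketch below uses two made-up helper names. Note also that perl itself marks descriptors above $^F (ordinarily 2) as close-on-exec when it opens them, so the flag matters mostly for descriptors that must survive exec or that were opened outside of perl's own open calls.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);
use File::Temp qw(tempfile);

# Hypothetical helper: report whether close-on-exec is set on $fh.
sub has_cloexec {
    my ($fh) = @_;
    my $flags = fcntl($fh, F_GETFD, 0);
    defined $flags or die "fcntl(F_GETFD) failed: $!";
    return (($flags + 0) & FD_CLOEXEC) ? 1 : 0;
}

# Hypothetical helper: turn close-on-exec on or off for $fh.
sub set_cloexec {
    my ($fh, $on) = @_;
    my $flags = fcntl($fh, F_GETFD, 0);
    defined $flags or die "fcntl(F_GETFD) failed: $!";
    $flags += 0;    # fcntl returns "0 but true" when no flag is set
    $flags = $on ? ($flags | FD_CLOEXEC) : ($flags & ~FD_CLOEXEC);
    fcntl($fh, F_SETFD, $flags) or die "fcntl(F_SETFD) failed: $!";
    return;
}

my ($fh, $path) = tempfile(UNLINK => 1);
set_cloexec($fh, 0);    # let an exec'ed program inherit the descriptor
print "after clearing: ", has_cloexec($fh), "\n";
set_cloexec($fh, 1);    # have the OS close it at exec time
print "after setting:  ", has_cloexec($fh), "\n";
```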
Unix Signals
SIGPIPE
  • “my network application suddenly dies without saying anything”
  • often due to not catching SIGPIPE
      • a signal sent when failing to write to a filehandle
          • ex. when the socket is closed by peer
      • the default behavior is to kill the process
  • solution:
      $SIG{PIPE} = 'IGNORE';
  • downside: you should consult the return value of print, etc. to check if the writes succeeded
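A sketch of what checking the return value looks like once SIGPIPE is ignored. A pipe whose read end has been closed stands in for a socket closed by the peer, and the helper name is made up for illustration. The write then fails with EPIPE instead of killing the process.

```perl
use strict;
use warnings;
use Errno ();

# Hypothetical helper: write to a pipe that nobody reads from and
# report how the write ended.
sub write_to_closed_pipe {
    local $SIG{PIPE} = 'IGNORE';    # without this the process would be killed
    pipe(my $r, my $w) or die "pipe failed: $!";
    close $r;                       # simulate the peer going away

    my $n = syswrite($w, "hello");
    return 'written' if defined $n;
    return $! == Errno::EPIPE ? 'EPIPE' : "other error: $!";
}

print "result: ", write_to_closed_pipe(), "\n";
```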
Using alarm
  alarm can be used (together with $SIG{ALRM} and EINTR) to handle timeouts
    local $SIG{ALRM} = sub {};
    alarm($timeout);
    my $len = $sock->read(my $buf, $maxlen);
    if (! defined($len) && $! == Errno::EINTR) {
        warn 'timeout';
        return;
    }
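A self-contained sketch of the same idea, using a pipe with no data in place of a socket. The helper name is made up for illustration. It uses sysread, which maps directly onto read(2), on the assumption that buffered reads may be restarted by the I/O layer on some perl versions and would then never report EINTR. It also cancels the alarm once the read returns.

```perl
use strict;
use warnings;
use Errno ();

# Hypothetical helper: returns the data read, or undef on timeout.
sub read_with_timeout {
    my ($fh, $maxlen, $timeout) = @_;
    local $SIG{ALRM} = sub {};    # empty handler: only there to interrupt
    alarm($timeout);              # whole seconds
    my $len   = sysread($fh, my $buf, $maxlen);
    my $errno = $! + 0;           # save errno before calling anything else
    my $msg   = "$!";
    alarm(0);                     # cancel the alarm if it has not fired

    return $buf if defined $len;
    return undef if $errno == Errno::EINTR;    # interrupted by SIGALRM
    die "read failed: $msg";
}

pipe(my $r, my $w) or die "pipe failed: $!";
my $data = read_with_timeout($r, 16, 1);    # nothing was written yet
print defined $data ? "got data\n" : "timeout\n";
```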
Pros and cons of using alarm
  + can be used to timeout almost all system calls (that may block)
  − the timeout set by alarm(2) is a process-wide global (and so is $SIG{ALRM})
      • use of select (or IO::Select) is preferable for network access
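For comparison, a sketch of a per-call timeout with IO::Select, which touches no process-wide state. The helper name is made up for illustration, and a pipe stands in for a socket.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: returns the data read, or undef if nothing
# became readable within $timeout seconds (fractions are allowed).
sub read_with_select {
    my ($fh, $maxlen, $timeout) = @_;
    my $sel   = IO::Select->new($fh);
    my @ready = $sel->can_read($timeout);
    return undef unless @ready;    # timed out

    my $len = sysread($fh, my $buf, $maxlen);
    defined $len or die "read failed: $!";
    return $buf;
}

pipe(my $r, my $w) or die "pipe failed: $!";
print defined read_with_select($r, 16, 0.2) ? "got data\n" : "timeout\n";
syswrite($w, "abc");
print "read: ", read_with_select($r, 16, 0.2), "\n";
```

Each call carries its own timeout, so two modules in the same process can wait with different limits without fighting over alarm or $SIG{ALRM}.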
Writing cancellable code
  • typical use-case: run forever until receiving a signal, then shut down gracefully
      • ex. Gearman::Worker
Writing cancellable code (2)
  make your module cancellable
    - my $len = $sock->read(my $buf, $maxlen);
    + my ($len, $buf);   # declared outside the block so both stay in scope
    + {
    +     $len = $sock->read($buf, $maxlen);
    +     if (! defined($len) && $! == Errno::EINTR) {
    +         return if $self->{cancel_requested};
    +         redo;
    +     }
    + }
      ...
    + sub request_cancel {
    +     my $self = shift;
    +     $self->{cancel_requested} = 1;
    + }
Writing cancellable code (3)
  The code that cancels the operation on SIGTERM
    $SIG{TERM} = sub { $my_module->request_cancel };
    $my_module->run_forever();
  Or the caller may use alarm to set a timeout
    $SIG{ALRM} = sub { $my_module->request_cancel };
    alarm(100);
    $my_module->run_forever();
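Put together, the pattern might look like the sketch below. Demo::Worker is a made-up class, not a real module; it reads from a pipe with sysread instead of a socket, and the caller cancels it with a one-second alarm as in the second variant above.

```perl
use strict;
use warnings;
use Errno ();

{
    package Demo::Worker;    # hypothetical class for illustration

    sub new {
        my ($class, $fh) = @_;
        return bless { fh => $fh, cancel_requested => 0 }, $class;
    }

    sub request_cancel {
        my $self = shift;
        $self->{cancel_requested} = 1;
    }

    # Reads until EOF, or until a signal arrives after a cancel request.
    sub run_forever {
        my $self = shift;
        while (1) {
            my $len = sysread($self->{fh}, my $buf, 4096);
            if (!defined $len) {
                die "read failed: $!" unless $! == Errno::EINTR;
                return 'cancelled' if $self->{cancel_requested};
                next;    # interrupted for some other reason: retry
            }
            return 'eof' if $len == 0;
            # ... process $buf here ...
        }
    }
}

pipe(my $r, my $w) or die "pipe failed: $!";    # $w stays open: no EOF
my $worker = Demo::Worker->new($r);
$SIG{ALRM} = sub { $worker->request_cancel };
alarm(1);
print "worker finished: ", $worker->run_forever, "\n";
```

The signal handler only sets a flag. The blocking call returns with EINTR, and the loop decides at a point of its own choosing whether to stop, which is what makes the shutdown graceful.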
Proc::Wait3
  • the built-in wait() and waitpid() do not return when receiving signals
  • use Proc::Wait3 instead
Further reading
Further reading
  • this presentation is based on my memo for an article in WEB+DB PRESS
  • if you have any questions or are having problems with Unix programming, please let me know
