[Perl.sig] Perl Tip of the Week - Subroutine and Regular Expression References

Litss Coordinator litss.coord at anu.edu.au
Tue Mar 1 12:04:59 EST 2005


From: Jacinta Richardson

==== Subroutine and Regular Expression References ====

     Last week we discussed variable references. These allow us to keep the
     identity of arrays and hashes that we pass to subroutines. References
     are also used to build complex data structures, which are important in
     many problem domains.

     This week we'll look at two other useful kinds of references:
     subroutines and regular expressions.


==== References to subroutines ====

     As well as taking references to variables, we can take references to
     subroutines. This is useful as it allows us to build dispatch tables,
     pass around processing functions and much more.

     To obtain a reference to a subroutine, we use a backslash, the same
     operator that we use to obtain a reference to a variable. However, to
     indicate to Perl that we want a reference to the subroutine we need to
     prefix the subroutine with an ampersand:

             my $sub_ref = \&my_subroutine;

     Leaving off the ampersand, or supplying parentheses at the end of the
     subroutine name, will result in a reference to its return value. This
     is an easy mistake to make when taking a subroutine reference.

     It's also possible to take a reference to an anonymous subroutine by
     using Perl's ``sub'' keyword without providing a subroutine name:

             my $sub_ref = sub { print "Hello World!\n"; };

     To invoke a subroutine to which we hold a reference, we simply use
     Perl's arrow notation:

             $sub_ref->(@args);
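
     Putting these pieces together, here is a minimal sketch. The
     subroutine ``add'' and the variable names are made up purely for
     illustration:

```perl
use strict;
use warnings;

# Hypothetical named subroutine, used only for illustration.
sub add { my ($x, $y) = @_; return $x + $y; }

my $named_ref = \&add;                        # reference to a named subroutine
my $anon_ref  = sub { return join '-', @_ };  # reference to an anonymous one

my $sum    = $named_ref->(2, 3);              # 5
my $joined = $anon_ref->('a', 'b', 'c');      # 'a-b-c'

# With parentheses the subroutine is called first, so we get a
# reference to its return value rather than to the subroutine itself.
my $result_ref = \add(2, 3);

print "$sum $joined ", ref($named_ref), " ", ref($result_ref), "\n";
```

     Here ``ref'' reports CODE for the subroutine reference but SCALAR for
     the accidental reference to a return value.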


==   Dispatch tables   ==

     When we deal with input from a user, we often end up writing code to
     handle a number of differing cases. In many of these cases our code
     looks a lot like this:

             if ( $action eq "this" ) {
                     this();
             }
             elsif ( $action eq "that" ) {
                     that();
             }
             elsif ( $action eq "something else" ) {
                     something_else();
             }
             else {
                     unknown_action();
             }

     While this is certainly one way to do it, it can seem unwieldy and
     space consuming. However, with subroutine references we can create a
     very elegant and fast alternative:

             # Dispatch table (hash of subroutine references)
             my %dispatch = (
                     this => \&this,
                     that => \&that,
                     "something else" => \&something_else,
             );

             # Check that the action exists in our table
             if ( exists $dispatch{$action} ) {
                     $dispatch{$action}->();
             } else {
                     unknown_action();
             }

     This allows us to add or change cases easily, in a single place, while
     simplifying our code. If your subroutines are designed to accept the
     same parameter list, then you can pass in parameters when you invoke
     the subroutine:

             $dispatch{$action}->($dbh, $cgi, $status);
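
     As a self-contained sketch of the same idea, the following uses a
     made-up set of arithmetic actions. The handler names and the
     ``run_action'' helper are assumptions for illustration only:

```perl
use strict;
use warnings;

# Hypothetical handlers; each accepts the same argument list.
my %dispatch = (
    add      => sub { my ($x, $y) = @_; return $x + $y },
    subtract => sub { my ($x, $y) = @_; return $x - $y },
    multiply => sub { my ($x, $y) = @_; return $x * $y },
);

# Look up the handler for an action, falling back to a default
# handler when the action is not in the table.
sub run_action {
    my ($action, @args) = @_;
    my $handler = $dispatch{$action}
        || sub { return "unknown action '$action'" };
    return $handler->(@args);
}

print run_action('add',      2, 3), "\n";   # 5
print run_action('multiply', 4, 5), "\n";   # 20
print run_action('divide',   1, 2), "\n";   # unknown action 'divide'
```

     Adding a new action now means adding one line to the hash, rather
     than another ``elsif'' branch.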


==   Passing around processing functions   ==

     A good example of passing a subroutine reference to another subroutine
     is when using the File::Find module. File::Find's ``find'' function
     takes a subroutine reference and a list of directories to search. The
     subroutine referred to is then used to process each file and directory.
     For example, to print the paths and file names of all the Perl scripts
     (assuming they end in .pl) in a directory tree we might write:

             use File::Find;

             find(\&wanted, '.');

             sub wanted {
                     print "$File::Find::name\n" if /\.pl$/;
             }

     File::Find's ``find'' takes a subroutine reference to make it easy for
     the programmer to specify what to do with each file and directory.
     Furthermore, as the subroutine reference is an argument to ``find'',
     you can call ``find'' from different places in your code, each time
     with a different subroutine reference.
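
     To illustrate calling ``find'' with different subroutine references,
     here is a sketch which builds a small temporary directory (using
     File::Temp, so that the example is self-contained) and searches it
     twice. The file names are invented for the example:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a small throwaway directory so the example is self-contained.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(one.pl two.pl notes.txt)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

# First search: collect the names of the Perl scripts.
my @scripts;
find(sub { push @scripts, $_ if /\.pl$/ }, $dir);
@scripts = sort @scripts;

# Second search: same function, different callback, counting plain files.
my $file_count = 0;
find(sub { $file_count++ if -f $_ }, $dir);

print "Perl scripts: @scripts\n";
print "Total files:  $file_count\n";
```

     Because the anonymous subroutines can see @scripts and $file_count,
     they can record their results without needing global variables.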


==   Ref and Subroutine References   ==

     To check whether something is a reference to a subroutine, we can use
     the ``ref'' function:

             my $sub_ref = sub { print "I'm a subroutine!\n" };

             print ref $sub_ref;             # prints CODE

     Unfortunately, Data::Dumper and similar tools do not dump the body of
     a subroutine reference by default. (Setting $Data::Dumper::Deparse to
     a true value asks Data::Dumper to reconstruct the code using
     B::Deparse, although the result may differ from the source as written.)
     If we pass a subroutine reference to Data::Dumper with its default
     settings, it will instead print a placeholder:

             use Data::Dumper;
             print Dumper $sub_ref;          # prints $VAR1 = sub { "DUMMY" };
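
     A sketch of two related techniques: using ``ref'' to guard a call,
     and Data::Dumper's $Data::Dumper::Deparse setting, which asks it to
     reconstruct the subroutine's code through B::Deparse. The helper
     ``call_if_code'' is a made-up name, and the reconstructed code will
     not necessarily match the original source character for character:

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: only invoke the argument if it really is a
# reference to a subroutine.
sub call_if_code {
    my ($maybe_code, @args) = @_;
    return ref($maybe_code) eq 'CODE' ? $maybe_code->(@args) : undef;
}

my $double = sub { return 2 * $_[0] };

my $answer  = call_if_code($double, 21);         # 42
my $skipped = call_if_code("not code", 21);      # undef, nothing is called

# Ask Data::Dumper to deparse the subroutine instead of printing
# the "DUMMY" placeholder.
local $Data::Dumper::Deparse = 1;
my $dumped = Dumper($double);

print "$answer\n";
print $dumped;
```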


==== Regular expression references ====

     Once you get used to the idea of passing around subroutine references
     for special processing needs, you might start thinking about whether
     you can do the same with regular expressions. Well, you can, and it's
     also easy.

     To create a regular expression for use later, we use ``qr//'':

             my $regexp = qr/^Perl$/;        # a line only containing Perl

     This compiles the regular expression for use later. If there's a
     problem with our regular expression, we'll hear about it immediately.
     To use this pre-compiled regular expression we can use any of the
     following:

             # See if we have a match
             $string =~ $regexp;

             # A simple substitution
             $string =~ s/$regexp/Camel/;

             # Comparing against $_
             /$regexp/;
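
     Just like a subroutine reference, a pre-compiled regular expression
     can be handed to a subroutine. A minimal sketch, in which
     ``count_matching'' and the sample lines are invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines which match a supplied pattern.
sub count_matching {
    my ($regexp, @lines) = @_;
    return scalar grep { $_ =~ $regexp } @lines;
}

my @lines = ("Perl", "Perl 5", "perl", "Perl");

my $exact    = qr/^Perl$/;     # lines containing only "Perl"
my $any_case = qr/^perl/i;     # lines starting with "perl", in any case

print count_matching($exact,    @lines), "\n";   # 2
print count_matching($any_case, @lines), "\n";   # 4

# The same pre-compiled pattern also works in a substitution.
(my $renamed = $lines[0]) =~ s/$exact/Camel/;
print "$renamed\n";                              # Camel
```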

     Regular expression references can save you time and effort and reduce
     the number of places where bugs can slip into your programs. For
     example, you might have a regular expression which validates a
     filename:

             my $file = qr/\w+\.\w+/;

     If you check filename validity in more than one script or in more than
     one place in a script, then should your definition of a filename
     change, you'll have to update the regular expression in multiple
     places. However, by using a regular expression reference, you can use
     the expression many times, but only need to update it in a single
     location. By giving your references appropriate names, you can also
     significantly improve the readability of your regular expressions:

             # Check our input refers to a valid filename:
             /^Filename: "($file)"\s*$/;

     An alternative to using ``qr//'' is to create your regular expression
     as a string and pass that around. However, this can require Perl to
     recompile your regular expression fragment when it is used in a match.
     You also miss the advantage of compile-time errors for any syntax
     errors that may be present in your expression.
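
     The following sketch builds a larger pattern from the $file fragment
     and shows a broken pattern being rejected at the moment ``qr//''
     compiles it. The names ``extract_filename'' and $filename_line, and
     the sample input, are assumptions made for the example:

```perl
use strict;
use warnings;

my $file          = qr/\w+\.\w+/;
my $filename_line = qr/^Filename: "($file)"\s*$/;   # built from $file

# Returns the captured filename, or undef if the line is not valid.
sub extract_filename {
    my ($line) = @_;
    return $line =~ $filename_line ? $1 : undef;
}

my $good = extract_filename('Filename: "report.txt"');     # 'report.txt'
my $bad  = extract_filename('Filename: "no extension"');   # undef

# A pattern arriving as a string (here a deliberately broken one) is
# checked as soon as qr// compiles it, rather than at some later match.
my $broken   = '(';
my $compiled = eval { qr/$broken/ };
my $error    = $@;

print "good: $good\n";
print "broken pattern rejected\n" if !defined $compiled;
```

     Should the definition of a filename change, only the line which
     defines $file needs to be edited.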


==   Ref and Regular Expression References   ==

     To check whether something is a reference to a regular expression, we
     can use the ``ref'' function:

             my $regexp = qr/^Perl$/;

             print ref $regexp;      # prints 'Regexp'

     Something very special happens when you use a regular expression
     reference as a string. It turns back into a human-readable
     representation of the original regular expression:

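             print "$regexp\n";      # prints '(?-xism:^Perl$)' before Perl 5.14

     The ``(?-xism:...)'' notation is an indication of which regular
     expression switches have been enabled for this fragment. In this
     particular example, all four switches ('x', 'i', 's', and 'm') are
     disabled. Perl 5.14 and later print the shorter '(?^:^Perl$)' instead,
     where the caret stands for the default switch settings.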
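
     Two practical consequences, sketched below: ``ref'' lets a subroutine
     accept either a compiled regular expression or a plain string, and the
     switches recorded in a fragment travel with it when it is interpolated
     into another pattern. The helper ``to_regexp'' is a made-up name:

```perl
use strict;
use warnings;

# Hypothetical helper: accept either a compiled regular expression or a
# plain string, which is then matched literally.
sub to_regexp {
    my ($pattern) = @_;
    return ref($pattern) eq 'Regexp' ? $pattern : qr/\Q$pattern\E/;
}

my $literal  = to_regexp('a.b');        # a string, so the dot is escaped
my $wildcard = to_regexp(qr/a.b/);      # already compiled, passed through

my @literal_hits  = grep { $_ =~ $literal }  ('a.b', 'axb');   # ('a.b')
my @wildcard_hits = grep { $_ =~ $wildcard } ('a.b', 'axb');   # both

# The outer pattern has no /i switch, but the interpolated fragment
# still matches case-insensitively.
my $perl_any_case = qr/perl/i;
my $found = "I like PERL" =~ /like $perl_any_case/ ? 1 : 0;    # 1

print "@literal_hits | @wildcard_hits | $found\n";
```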


==== References and objects ====

     Perl allows you to take references to almost anything you can access
     in Perl. Furthermore, anything you can get a reference to can be
     'blessed' and turned into an object. Should you ever have a need for a
     class based on a regular expression, Perl is very happy to oblige.
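
     As a rough sketch of what such a class might look like, the following
     blesses a compiled regular expression into a hypothetical
     ``FilenamePattern'' class. The class and method names are invented for
     this example:

```perl
use strict;
use warnings;

package FilenamePattern;    # hypothetical class name

# The object is a compiled regular expression blessed into our class.
sub new {
    my ($class) = @_;
    return bless qr/^\w+\.\w+$/, $class;
}

# The object can still be used as a regular expression in a match.
sub matches {
    my ($self, $string) = @_;
    return $string =~ $self ? 1 : 0;
}

package main;

my $pattern = FilenamePattern->new;

print ref($pattern), "\n";                        # FilenamePattern
print $pattern->matches('report.txt'), "\n";      # 1
print $pattern->matches('no extension'), "\n";    # 0
```

     Note that once the regular expression has been blessed into another
     class, ``ref'' reports that class name rather than 'Regexp'.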


==== In summary ====

     Perl's variable references allow us to pass multiple lists into
     subroutines, and to create complex data structures and objects. With
     subroutine references we can pass around processing subroutines, and
     build elegant and powerful dispatch tables. Regular expression
     references allow the creation of a regular expression in one place for
     use in many places, improving speed, readability, and maintainability.

     Furthermore, all of these can be used as a basis for Perl objects,
     although the hash reference remains the most common choice.


==== Upcoming Courses in Sydney ====

     http://perltraining.com.au/bookings/Sydney.html

     Introduction to Perl    15th Mar - 16th Mar
     Intermediate Perl       17th Mar - 18th Mar


==== Corporate Courses ====

     http://perltraining.com.au/corporate.html

     Do you have a large group, or the need for training at a particular
     time? Perl Training Australia is happy to arrange a course in the time
     and place that best suits you. For more information read our page on
     Corporate Courses at http://perltraining.com.au/corporate.html or call
     us on +61-3-9354-6001.

-- 
    ("`-''-/").___..--''"`-._          |  Jacinta Richardson         |
     `6_ 6  )   `-.  (     ).`-.__.`)  |  Perl Training Australia    |
     (_Y_.)'  ._   )  `._ `. ``-..-'   |      +61 3 9354 6001        |
   _..`--'_..-_/  /--'_.' ,'           | contact at perltraining.com.au |
  (il),-''  (li),'  ((!.-'             |   www.perltraining.com.au   |


_______________________________________________
This Perl tooltip and associated text is Copyright Perl Training Australia.
You may freely distribute this text so long as it is distributed in full
with this Copyright notice attached.

If you have any questions please don't hesitate to contact us:
   Email: contact at perltraining.com.au
   Phone: 03 9354 6001
