[Perl.sig] Big-O notation
Litss Coordinator
litss.coord at anu.edu.au
Wed Mar 14 09:01:32 EST 2007
Posted on behalf of Jacinta Richardson.
==== Upcoming courses ====
We're proud to be running the following courses over the next couple of
months:
Programming Perl
Canberra: 17th - 20th April (Early bird: 26th March)
Perth: 15th - 18th May
Object Oriented Perl
Melbourne: 1st - 2nd May (Early bird: 2nd April)
Sydney: 19th - 20th June
Canberra: 26th - 27th June
Databases Programming with Perl:
Melbourne: 3rd - 4th May (Early bird: 2nd April)
Sydney: 21st - 22nd June
Canberra: 28th - 29th June
Web Development with Perl:
Melbourne: 23rd - 24th May
Perl Security:
Melbourne: 25th May
==== Big-O notation ====
Big-O notation is a mathematical construct used to describe algorithmic
complexity. In particular it is allows efficiency comparisons of
different algorithms for large data sets.
Typical orders are (in order of desirability):
O(1) also ``C''
Constant time. Computation does not take appreciably longer as the
data set grows. For example, a hash lookup is considered to occur in
constant time.
``O(log N)''
Logarithmic. Computation time grows as N grows, but this growth is
small for large N. For example doing a binary search over a sorted
list to find an item.
``O((log N)^c)''
Poly-logarithmic. Computation time grows as N grows, but this extra
growth in time is still small for large N.
O(N)
Linear. Computation time grows in equal proportion to N's growth.
For example, adding a value to each item in a list is linear.
``O(N log N)''
Quasi-linear. Computation time grows only slightly faster than N's
growth, for large N.
``O(N^2)'' and ``O(N^c)''
Quadratic and Polynomial. For ``N^2'': doubling ``N'' will quadruple
execution time, ``N^3'': doubling ``N'' increases execution time by
a factor of 8. For large N, ``O(N^2)'' and above rapidly become
unusable.
``O(n!)''
Factorial. Almost never a good idea, large for even small N. 5! =
120.
It is an interesting property of Big-O notation that if a function can
be written as a sum of other functions, then only the fastest growing
function matters. This is because the growth in that function will
rapidly dwarf the growth in the others. For example:
cN + dN^2 => O(N^2) for any constants c and d
N^4 + N! => O(N!)
== Why Big-O matters ==
Understanding the basics of how to calculate Big-O notation can help us
identify bad choices in our code and spot likely causes of slow
execution time. For example consider the following attempt to find all
the elements in list1 which are also in list2:
my @list1 = (10 .. 100);
my @list2 = (50 .. 200);
# Walk over @list1
foreach my $element (@list1) {
# Check in @list2 to see if it's there
foreach my $item (@list2) {
if($item eq $element) {
print "$item is in both lists\n";
}
}
}
If @list1 and @list2 are small, this code works very quickly. However,
if we use:
my @list1 = (10 .. 10000);
my @list2 = (5000 .. 2000000);
then the code takes a very long time. Let's call the size of @list1
``N'' and the size of @list2 ``M'' then what we're doing is:
for each N
for each M
do something
You can see here that this means we walk over @list1 once, ``but'' for
each item we also walk over the whole of @list2. This means we're
dealing with ``O(N M)''. As the smaller list approaches the same size as
the larger, this approaches ``O(N^2)'' which means we should investigate
whether a faster possibility exists.
We can rewrite the above code as follows:
my @list1 = (10 .. 100); # length N
my @list2 = (50 .. 200); # length M
# Create a hash of all the values in @list1: O(N)
my %in_list1;
foreach my $element (@list1) {
$in_list1{$element} = ();
}
# Walk over each element in @list2 and compare against the hash
# O(M)
foreach my $item (@list2) {
if(exists $in_list1{$item}) {
print "$item is in both lists\n";
}
}
Now we can see that the end result is ``O(N+M)'' - we walk over @list1
once, and we walk over @list2 once. This is an O(N) solution so we know
it's good. Note that we're achieving this trade-off by using extra
memory to store our %in_list1 hash. For very large ``N'' this needs to
be considered.
The above style problem comes up regularly, even though it may not quite
look as simple as this. For example, imagine that you have a flat file
containing addresses for businesses, and you wish to display a subset of
those based on a list of suburbs. You could go through the whole file
for each suburb (``O(N M)'') or you could adapt the above solution to
store either the addresses or suburbs in a hash (depending on your
requirements) and go through each only once. The same style solution can
also be used to find the complement of a list.
== Ignoring constants ==
Big-O notation does not predict how fast something will run, but rather
gives us information about how well it will scale. For example doubling
the size of ``N'' for a ``O(N^2)'' problem will *quadruple* the program
execution time regardless of whether the function complexity is actually
``O(0.5 N^2)'' or ``O(10000 N^2)''.
Constants only matter when comparing functions of different
complexities. For example ``O(100*N)'' is slower than ``O(N^2 /100)''
until ``N'' is greater than 10,000.
== See also ==
Mathematical coverage for Big-O notation can be found on Wikipedia at:
<http://en.wikipedia.org/wiki/Big_O_notation>.
There are also some great notes on Big-O notation on perlmonks.org
* <http://www.perlmonks.org/?node_id=573138>,
* <http://www.perlmonks.org/?node_id=227909> and
* <http://www.perlmonks.org/?node_id=25833>.
==== Corporate Courses ====
http://perltraining.com.au/corporate.html
Do you have a large group, or the need for training at a particular
time? Perl Training Australia is happy to arrange a course in the time
and place that best suits you. For more information read our page on
Corporate Courses at http://perltraining.com.au/corporate.html or call
us on +61-3-9354-6001.
_______________________________________________
This Perl tooltip and associated text is Copyright Perl Training Australia.
You may freely distribute this text so long as it is distributed in full
with this Copyright noticed attached.
If you have any questions please don't hesitate to contact us:
Email: contact at perltraining.com.au
Phone: 03 9354 6001
Perl-tips mailing list
To change your subscription details visit:
http://perltraining.com.au/cgi-bin/mailman/listinfo/perl-tips
More information about the Perl.sig
mailing list