[LINK] Re: Rant: NSW School e-mails?

Craig Sanders cas at taz.net.au
Mon Sep 12 10:30:15 EST 2005


On Sun, Sep 11, 2005 at 08:48:29PM +1000, Con Zymaris wrote:
> Yup, and it's not expensive either. 
> 
> Last I was informed by those who had done it, US$3k buys you a
> dual-CPU system which supports 10,000 concurrent IMAP clients. Five
> of these systems and you've covered the needs of the entire student
> population in NSW - the students only spend a small amount of time
> [...]
> Underpin the lot with a decent network file-system and authentication
> server and you've covered half the problem for under $100,000.

yes, the software's cheap (free, even) but there are other expenses.
hardware will be significantly more than you estimate, plus there are
salaries and rent and bills to pay and a reasonable profit for the
implementor.

mail is an I/O bound application. that means that throwing high-powered
CPUs at it doesn't help to scale it up. what it needs is to have the I/O
load spread over lots of fast disks.

on the very small scale, you can do all this with just one or two good
machines. if you're talking about a million users and 10000 concurrent
clients, though, you need a lot more.

so, you need:

 - mail receiving front end boxes (MXs). receive mail, filter it for
   spam and viruses, forward it to back end for storage.

 - file-server backend boxes.  big, fast.  lots of disks, lots of RAM.
   ideally, multiple servers (for redundancy AND load-balancing) all
   using the same fiber-channel disk array.

 - multiple redundant authentication servers (e.g. LDAP).

 - mail reader (MR) boxes.  running imap, pop, and webmail.  these are 
   the machines that the end-users interact with.

 - outgoing mail servers for relaying users' outbound mail. could be
   dedicated machines, or this function could be provided by either
   the MX or MR boxes. realistically, the system is going to receive at
   least 10 to 100 times as much mail as it is going to send (most of
   that being spam and viruses), so making the MX boxes do this job is
   probably best.

 - (optional) web server machines, to allow each user their own home
   page. can use the file-servers for storage.

 - load-balancing boxes (e.g. LVS - www.linuxvirtualserver.org) in front
   of the MX, MR, and web-server boxes.

 - switches, cabling, internet connections, miscellaneous gear.

 - server room(s) with backup power and cooling.


this model scales up indefinitely. have as many MX and MR and webserver
boxes as you need. they're disposable too - if one dies, it's no big
deal because nothing depends upon it, just install a replacement. they
can also be mass-produced/cloned from a disk-image - build and configure
one, then copy it (with minor, scriptable, changes like hostname and IP
address).
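
as a rough sketch of the "minor, scriptable changes" step: the role
prefixes, domain, addresses and the clone_settings() helper below are all
made up for illustration, not taken from any real deployment.

```python
import ipaddress

def clone_settings(role, count, first_ip, domain="example.org"):
    """Return per-host settings for `count` clones of one master image.

    The naming scheme and addressing here are hypothetical.
    """
    start = ipaddress.ip_address(first_ip)
    return [
        # hostnames are numbered from 01; IPs count up from first_ip
        {"hostname": f"{role}{n:02d}.{domain}", "ip": str(start + (n - 1))}
        for n in range(1, count + 1)
    ]

# settings for three cloned MX boxes
for host in clone_settings("mx", 3, "10.0.1.11"):
    print(host["hostname"], host["ip"])
```

each dict would then be fed to whatever rewrites the hostname and network
config on the freshly copied image.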

actually, the MX and MR boxes don't even need disks for the OS - they
can network boot and mount their / filesystem from the file-server, or
copy it into ramdisk and run from that. the MXs do need *non-volatile*
storage for the incoming/in-transit mail spool (i.e. hard disk or the
much-faster solid-state disk).

then once the system has been built, it will need to be looked after by
people who know what they're doing.


my (very rough) guess is somewhere between $500K and $1M, plus ongoing
costs of $150K-$250K (two or three staff plus office & server-room
rental). oh, and somewhere between 2 and 6 months to implement and test,
depending on size (and on tardiness of the HW suppliers).

a lot less than $32 million, but still more than $100K.
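
to put rough numbers on that comparison, here is a small sketch. it
assumes the ongoing costs are per year and picks a three-year horizon;
both are assumptions for illustration, as is the total_cost() helper.

```python
def total_cost(build, ongoing_per_year, years):
    """One-off build cost plus ongoing costs over `years` (assumed annual)."""
    return build + ongoing_per_year * years

YEARS = 3  # assumed horizon, not a figure from this thread

low = total_cost(500_000, 150_000, YEARS)
high = total_cost(1_000_000, 250_000, YEARS)
print(f"{YEARS}-year total: ${low:,} to ${high:,}")

# compare against the two figures mentioned in the thread
print("above $100K:", low > 100_000)
print("below $32M:", high < 32_000_000)
```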

> [...] checking email each day, so you do the statistics to get that
> level of coverage, along with peak daily and weekly spikes.

yes, this is very important. you have to scale the system to cope with
peak demand even if that capacity sits idle for most of the day. for
the MR boxes, that would be Mon-Fri between 8.30 and 10, then morning &
afternoon recess, lunchtime, and an hour or two after school. for the MX
boxes, that would be 24/7 - mail can come in at any time, especially if
users are subscribing to external mailing lists.
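
a minimal sketch of sizing the MR boxes for peak rather than average
load. the million users and the 10,000 clients per box are the figures
quoted earlier in this thread; the 5% peak concurrency, the single spare
box, and the boxes_needed() helper are assumptions for illustration.

```python
import math

def boxes_needed(users, peak_percent, clients_per_box, spares=1):
    """MR boxes needed to carry the peak concurrent load, plus spares.

    peak_percent is the share of users connected at the busiest moment.
    """
    # same as ceil((users * peak_percent / 100) / clients_per_box)
    for_peak = math.ceil(users * peak_percent / (100 * clients_per_box))
    return for_peak + spares

# 5% of a million users online at once, 10,000 clients per box, one spare
print(boxes_needed(1_000_000, 5, 10_000))
```

the capacity is set by the busiest period of the day, so measured login
statistics should replace the guessed percentage as soon as they exist.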


craig

-- 
craig sanders <cas at taz.net.au>           (part time cyborg)

