[LINK] RFI: Bulk-Caching of a Web-Site down onto a PC

Roger Clarke Roger.Clarke at xamax.com.au
Tue Oct 10 20:03:27 AEST 2006


I run a web-site.  (Well, I run a few, but this concerns one in particular.)

It uses PHP to insert common headers and footers on all pages.

I want to download the whole site to my portable, so that I can make 
the web-site available from cache, in a location that has no Internet 
access.

How do I download a web-site that depends on server-side PHP processing?

Possibilities considered:

1.  An FTP-Client (I use Interarchy)
*But* it downloads the unprocessed source-files, with embedded PHP. 
Browsers ignore the embedded PHP, so the pages display without their 
headers, and the headers contain the navigation buttons.

2.  A Browser
I have, and use, Safari, Mozilla and Camino.
Each saves-as one file at a time, and I can see no bulk download facility.
There are 120 HTML files (and 130 in other formats), so there's a lot 
of downloading to do, which costs time and effort, and leaves scope for errors 
(because I have the concentration-span of a gnat, and get bored and 
impatient on repetitive tasks.  Yes, psychiatrists would have a field 
day with me).

(Safari saves-as into a proprietary format, which only Safari can use.
Mozilla and Camino download into a widely-used format, viz. they put 
the HTML under fn.html, and all invoked files in a directory called 
fn_files.
So they are cross-compatible.)

3.  Own-Code
I could write a script to download them.
But hang on, I'm superannuated (I was competent-to-good 1970-1982. 
In fact I used to write program-generators, which was pretty flash 
back then.  I even wrote string-handling routines in COBOL - now 
*there's* a challenge).
And anyway, surely this kind of facility is pretty mainstream?!

4.  Use PHP on the PC
*But* to invoke PHP, I need a web-server, and all the paraphernalia 
that goes with it.  That's been a miscellaneous project for 
years, but I'm still busy, and I need to solve this problem promptly 
and reliably.

5.  Am I missing something?
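
For what it's worth, option 3 need not be a big job. The point is to fetch each page over HTTP rather than FTP, so that the server has already run the PHP and the headers and footers arrive as plain HTML. What follows is only a minimal sketch under stated assumptions, not a tested tool: the names LinkCollector, local_path, fetch_http and mirror are made up for illustration, it follows only href and src attributes, it stays on the starting host, and it ignores query strings when choosing file names.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urldefrag, urljoin, urlsplit
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect href/src attribute values from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def local_path(url, out_dir):
    """Map a URL to a file under out_dir; directory URLs become index.html."""
    path = urlsplit(url).path.lstrip("/")
    if not path or path.endswith("/"):
        path += "index.html"
    return Path(out_dir) / path


def fetch_http(url):
    """Fetch over HTTP, so the server runs the PHP before we see the page."""
    with urlopen(url) as resp:
        return resp.read(), resp.headers.get_content_type()


def mirror(start_url, out_dir, fetch=fetch_http):
    """Breadth-first crawl of same-host URLs, saving each response to disk."""
    host = urlsplit(start_url).netloc
    queue, seen = [start_url], {start_url}
    while queue:
        url = queue.pop(0)
        try:
            body, ctype = fetch(url)
        except OSError as exc:  # urllib's URLError and HTTPError are OSErrors
            print(f"skipped {url}: {exc}")
            continue
        target = local_path(url, out_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(body)
        if ctype != "text/html":
            continue  # images, PDFs etc. are saved but not parsed for links
        parser = LinkCollector()
        parser.feed(body.decode("utf-8", errors="replace"))
        for link in parser.links:
            absolute = urldefrag(urljoin(url, link)).url
            parts = urlsplit(absolute)
            if (parts.scheme in ("http", "https")
                    and parts.netloc == host
                    and absolute not in seen):
                seen.add(absolute)
                queue.append(absolute)
    return sorted(seen)
```

Calling mirror("http://www.example.com/", "site-copy") (a placeholder address) would write the processed pages under site-copy/. Two limitations of the sketch: links inside the saved pages are not rewritten, so only relative links will work offline, and a page served as x.php is saved under that same name, which a browser may not treat as HTML when opened from disk.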

Thanks in advance to the Link Institute, in whom we trust!

-- 
Roger Clarke                  http://www.anu.edu.au/people/Roger.Clarke/

Xamax Consultancy Pty Ltd      78 Sidaway St, Chapman ACT 2611 AUSTRALIA
                    Tel: +61 2 6288 1472, and 6288 6916
mailto:Roger.Clarke at xamax.com.au                http://www.xamax.com.au/

Visiting Professor in Info Science & Eng  Australian National University
Visiting Professor in the eCommerce Program      University of Hong Kong
Visiting Professor in the Cyberspace Law & Policy Centre      Uni of NSW


