[LINK] RFI: Bulk-Caching of a Web-Site down onto a PC

Kim Holburn kim at holburn.net
Tue Oct 10 20:41:46 AEST 2006

I have used wget to make a static mirror of a dynamic site.
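
For illustration, a wget invocation along these lines should produce such a mirror. This is only a sketch, assuming GNU wget is installed; it is not necessarily the exact command used for the mirror mentioned above. The helper name mirror_site, the WGET override variable and the example URL are hypothetical placeholders. Because wget fetches pages over HTTP, the server runs the PHP first, so the saved copies already contain the headers and footers.

```shell
# Hypothetical helper: mirror a site into the current directory.
# Set WGET=echo to preview the arguments without downloading anything.
mirror_site() {
    # --mirror            recursive download with timestamping
    # --page-requisites   also fetch images, CSS etc. needed to display pages
    # --convert-links     rewrite links so they work from the local copy
    # --adjust-extension  save pages with an .html suffix
    #                     (older wget versions call this --html-extension)
    # --no-parent         never climb above the starting directory
    # --wait=1            pause one second between requests
    "${WGET:-wget}" \
        --mirror \
        --page-requisites \
        --convert-links \
        --adjust-extension \
        --no-parent \
        --wait=1 \
        "$1"
}

# Example (placeholder URL):
# mirror_site http://www.example.com/
```

The result is a directory named after the host, which can be opened in any browser without a web-server or Internet access.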

I suppose you could use something like blue crab <http://www.limit- 

On 2006 Oct 10, at 8:03 PM, Roger Clarke wrote:

> I run a web-site.  (Well, I run a few, but one in particular).
> It uses PHP to insert common headers and footers on all pages.
> I want to download the whole site to my portable, so that I can  
> make the web-site available from cache, in a location that has no  
> Internet access.
> How do I download a web-site that depends on server-side PHP  
> processing?
> Possibilities considered:
> 1.  An FTP-Client (I use Interarchy)
> *But* it downloads the unprocessed source-files, with embedded PHP.  
> Browsers ignore embedded PHP;  but the browser displays no headers,  
> and the headers contain the navigation buttons.
> 2.  A Browser
> I have, and use, Safari, Mozilla and Camino.
> Each saves-as one file at a time, and I can see no bulk download  
> facility.
> There are 120 HTML files (and 130 in other formats), so there's a  
> lot of downloading to do, which is time, effort, and scope for  
> errors (because I have the concentration-span of a gnat, and get  
> bored and impatient on repetitive tasks.  Yes, psychiatrists would  
> have a field day with me).
> (Safari saves-as into a proprietary format, which only Safari can use.
> Mozilla and Camino download into a widely-used format, viz. they  
> put the HTML under fn.html, and all invoked files in a directory  
> called fn_files.
> So they are cross-compatible.)
> 3.  Own-Code
> I could write a script to download them.
> But hang on, I'm superannuated (I was competent-to-good 1970-1982.  
> In fact I used to write program-generators, which was pretty flash  
> back then.  I even wrote string-handling routines in COBOL - now  
> *there's* a challenge).
> And anyway, surely this kind of facility is pretty mainstream?!
> 4.  Use PHP on the PC
> *But* to invoke PHP, I need a web-server, and all the paraphernalia  
> (spg?) that goes with it.  That's been a miscellaneous project for  
> years, but I'm still busy, and I need to solve this problem  
> promptly and reliably.
> 5.  Am I missing something?
> Thanks in advance to the Link Institute, in whom we trust!

Kim Holburn
IT Network & Security Consultant
Ph: +61 2 61258620 M: +61 417820641  F: +61 2 6230 6121
mailto:kim at holburn.net  aim://kimholburn
skype://kholburn - PGP Public Key on request
Cacert Root Cert: http://www.cacert.org/cacert.crt
Aust. Spam Act: To stop receiving mail from me: reply and let me know.
Use ISO 8601 dates [YYYY-MM-DD] http://www.saqqara.demon.co.uk/ 

Democracy imposed from without is the severest form of tyranny.
                           -- Lloyd Biggle, Jr. Analog, Apr 1961
