[LINK] RFI: Mirroring Complete Web-Sites?

Paul Bolger pbolger at gmail.com
Wed Aug 31 15:28:38 AEST 2011


Try wget. It's a command-line *nix tool that can recursively grab
anything it can reach by following links or directory listings (it
can't get items that are neither linked to nor in a directory that can
be listed). You'll need to read up on the wget switches, but once you
have the command built it will fetch the lot quickly and neatly.
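
As a rough starting point, here is a small Python sketch that assembles
one plausible set of wget switches for a full mirror. The URL
(www.example.org) and the "mirror" output directory are placeholders I
made up, and the one-second wait is only a guess at what is polite; check
the switches against the man page for your wget version before relying
on them.

```python
import shlex
import subprocess
import sys


def build_wget_command(url, dest_dir="mirror"):
    """Return a wget argument list for a recursive mirror of url.

    dest_dir is a hypothetical local directory name; change it to suit.
    """
    return [
        "wget",
        "--mirror",            # recursive, unlimited depth, with timestamping
        "--convert-links",     # rewrite links so the copy browses offline
        "--page-requisites",   # also fetch the images and CSS pages need
        "--adjust-extension",  # give saved HTML pages an .html suffix
        "--no-parent",         # never climb above the starting directory
        "--wait=1",            # assumed politeness delay between requests
        "--directory-prefix=" + dest_dir,
        url,
    ]


if __name__ == "__main__":
    # Placeholder URL; pass the real site as the first argument.
    url = sys.argv[1] if len(sys.argv) > 1 else "http://www.example.org/"
    cmd = build_wget_command(url)
    print(shlex.join(cmd))  # show the command so it can be pasted into a shell
    if len(sys.argv) > 1:
        # Only touch the network when a URL was actually supplied.
        subprocess.run(cmd, check=True)
```

Run with no arguments it just prints the command, which you can paste
straight into a terminal; run with a URL it calls wget for you.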

Has Pandora already archived the site?

On 31 August 2011 16:29, Roger Clarke <Roger.Clarke at xamax.com.au> wrote:
> I need to urgently mirror a web-site, which is about to disappear
> (don't ask, but assume bumbling governmental incompetencies).
>
> I ought to know about things like this, but I don't
> (don't criticise, but assume bumbling Rogerish incompetencies).
>
> I can't get to a database that's in behind the site, but there's a
> great deal of HTML that's well worth rescuing.
>
> I'm a mere user / member of the public, and have no ftp privileges,
> and my first attempts led nowhere (i.e. timed out).
>
> In any case, can an anonymous ftp user do a recursive download?
>
> Is there an easy way to do a bulk download within a browser?
>
> I frequently mirror individual pages in Firefox 3.0.19, using
> File / Save Page As ... / Web Page Complete
>
> But I don't see a way to get it to follow links or directory-structures.
>
> --
> Roger Clarke                                 http://www.rogerclarke.com/
>
> Xamax Consultancy Pty Ltd      78 Sidaway St, Chapman ACT 2611 AUSTRALIA
>                    Tel: +61 2 6288 1472, and 6288 6916
> mailto:Roger.Clarke at xamax.com.au                http://www.xamax.com.au/
>
> Visiting Professor in the Cyberspace Law & Policy Centre      Uni of NSW
> Visiting Professor in Computer Science    Australian National University
> _______________________________________________
> Link mailing list
> Link at mailman.anu.edu.au
> http://mailman.anu.edu.au/mailman/listinfo/link
>



