[LINK] RFI: Mirroring Complete Web-Sites?

Paul Koerbin pkoerbin at nla.gov.au
Wed Aug 31 16:15:43 AEST 2011


We haven't collected it for PANDORA; in fact we rejected it as not meeting our priority areas for collecting. However, it is picked up in our annual .au domain harvests, the most recent of which was done in February-March 2011 (though these are not currently publicly accessible).


Paul Koerbin
Manager Web Archiving
National Library of Australia

-----Original Message-----
From: link-bounces at mailman.anu.edu.au [mailto:link-bounces at mailman.anu.edu.au] On Behalf Of Paul Bolger
Sent: Wednesday, 31 August 2011 3:29 PM
To: Roger Clarke
Cc: link at anu.edu.au
Subject: Re: [LINK] RFI: Mirroring Complete Web-Sites?

Try wget. It's a command-line *nix tool which can recursively grab anything
reachable by links or directory listings (it can't get items which aren't
linked to and which are in directories which can't be listed). You'll need
to look at the wget switches, but once you have the command built it'll do
it all very quickly and neatly.
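
As a rough sketch of what such a command might look like: the URL below is a placeholder, and the variable and function names are made up for illustration. The switches are standard GNU wget options, but which ones suit a given site is a judgment call, so treat this as a starting point rather than a recipe.

```shell
# Hypothetical target; substitute the site you actually want to mirror.
SITE_URL="https://www.example.org/"

# Standard GNU wget switches for an offline-browsable copy:
#   --mirror            recursive retrieval with timestamping, unlimited depth
#   --convert-links     rewrite links so the local copy can be browsed offline
#   --adjust-extension  save HTML/CSS with matching file extensions
#   --page-requisites   also fetch the images, CSS, etc. each page needs
#   --no-parent         never climb above the starting directory
#   --wait=1            pause one second between requests, to be polite
WGET_OPTS="--mirror --convert-links --adjust-extension --page-requisites --no-parent --wait=1"

mirror_site() {
    # $1 is the URL to mirror. WGET_OPTS is deliberately left unquoted so
    # the shell splits it into separate switches.
    wget $WGET_OPTS "$1"
}

# Uncomment to actually run the mirror:
# mirror_site "$SITE_URL"
```

By default the files end up in a directory named after the host. As noted above, anything that isn't linked to and sits in an unlistable directory won't be picked up.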

Has PANDORA indexed the site?

k



