[LINK] Something rotten at the SMH?

Tom Koltai tomk at unwired.com.au
Mon Jun 8 16:16:41 EST 2009



> -----Original Message-----
> From: link-bounces at mailman1.anu.edu.au 
> [mailto:link-bounces at mailman1.anu.edu.au] On Behalf Of Danny Yee
> Sent: Monday, 8 June 2009 3:44 PM
> To: David Boxall
> Cc: Link
> Subject: Re: [LINK] Something rotten at the SMH?
> 
> 
> David Boxall wrote:
> > Every link on <http://www.smh.com.au/text/> that I've tried threw a 404
> > error. Am I doing something wrong, or is it the site?
>  
> The links on that page are also a week old, despite the date 
> in the heading.  The Herald appears to have given up 
> providing a text version of its articles.  I guess I'll have 
> to AdBlock all their advertisers now...
> 

Could it be the first step in blocking free access to its content?

To an editor, it would seem that text is harder to scrape from an HTML
4.0 site than from a plain text file.
I predict that we will all become experts at keyword-based scraping
very shortly.
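
For anyone wondering what keyword-based scraping of an HTML page might
look like, here is a minimal sketch using only the Python standard
library. It assumes the article text sits in <p> elements, which is a
guess about page layout and not something checked against the SMH site.
The names ParagraphText and paragraphs_matching are made up for
illustration.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the visible text of each <p> element, ignoring markup."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # finished paragraph strings
        self._buf = None       # text pieces of the <p> being read, or None

    def _flush(self):
        # Store the paragraph collected so far, with whitespace normalised.
        if self._buf is not None:
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)
            self._buf = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()      # HTML 4 allows <p> without a closing tag
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        # Text outside any <p> (menus, ads, etc.) is dropped.
        if self._buf is not None:
            self._buf.append(data)

    def close(self):
        super().close()
        self._flush()


def paragraphs_matching(html, keyword):
    """Return the paragraphs whose text contains keyword (case-insensitive)."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    needle = keyword.lower()
    return [p for p in parser.paragraphs if needle in p.lower()]
```

Feed it the page source as a string plus a search word, and it hands
back only the paragraphs that mention that word, with the tags and
anything outside a paragraph stripped away.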

At which point I am sure they will move to the new PDF/Flash version
(e.g. the subscription version of the NY Times).
At which point someone will devise a way to OCR the Flash version...

And so we develop.

Without advertising, newspapers will fold and the wire services will turn
into retail news feeds.
An interesting financial model for the wire services, not so good for
the current crop of wire-service scrapers (I believe they're called
ISPs - Integration Service Publishers... <grin>)


Tom




