[LINK] R programming language
Richard Chirgwin
rchirgwin at ozemail.com.au
Thu Jan 8 06:45:37 AEDT 2009
Wow, the NYT leads the scoops with a story about a bit of software
that's 13 years old ... this is the "slow news week" syndrome with a
vengeance!
RC
stephen at melbpc.org.au wrote:
> Data Analysts Captivated by R Power
>
> By ASHLEE VANCE www.nytimes.com Published: January 6, 2009
>
> . R is the name of a popular programming language used by a growing
> number of data analysts inside corporations and academia.
>
> It is becoming their lingua franca partly because data mining has entered
> a golden age, whether being used to set ad prices, find new drugs more
> quickly or fine-tune financial models. Companies as diverse as Google,
> Pfizer, Merck, Bank of America, the InterContinental Hotels Group and
> Shell use it.
>
> But R has also quickly found a following because statisticians, engineers
> and scientists without computer programming skills find it easy to use.
>
> “R is really important to the point that it’s hard to overvalue it,” said
> Daryl Pregibon, a research scientist at Google, which uses the software
> widely. “It allows statisticians to do very intricate and complicated
> analyses without knowing the blood and guts of computing systems.”
>
> It is also free. R is an open-source program, and its popularity reflects
> a shift in the type of software used inside corporations. Open-source
> software is free for anyone to use and modify ..
>
> R is similar to other programming languages, like C, Java and Perl, in
> that it helps people perform a wide variety of computing tasks by giving
> them access to various commands.
>
> For statisticians, however, R is particularly useful because it contains
> a number of built-in mechanisms for organizing data, running calculations
> on the information and creating graphical representations of data sets.
>
> Some people familiar with R describe it as a supercharged version of
> Microsoft’s Excel spreadsheet software that can help illuminate data
> trends more clearly than is possible by entering information into rows
> and columns.
>
> What makes R so useful — and helps explain its quick acceptance — is that
> statisticians, engineers and scientists can improve the software’s code
> or write variations for specific tasks. Packages written for R add
> advanced algorithms, colored and textured graphs and mining techniques to
> dig deeper into databases.
>
> Close to 1,600 different packages reside on just one of the many Web
> sites devoted to R, and the number of packages has grown exponentially.
>
> One package, called BiodiversityR, offers a graphical interface aimed at
> making calculations of environmental trends easier.
>
> Another package, called Emu, analyzes speech patterns, while GenABEL is
> used to study the human genome.
>
> The financial services community has demonstrated a particular affinity
> for R; dozens of packages exist for derivatives analysis alone.
>
> “The great beauty of R is that you can modify it to do all sorts of
> things,” said Hal Varian, chief economist at Google. “And you have a lot
> of prepackaged stuff that’s already available, so you’re standing on the
> shoulders of giants.”
>
> R first appeared in 1996, when the statistics professors Ross Ihaka and
> Robert Gentleman of the University of Auckland in New Zealand released
> the code as a free software package.
>
> According to them, the notion of devising something like R sprang up
> during a hallway conversation. They both wanted technology better suited
> for their statistics students, who needed to analyze data and produce
> graphical models of the information. Most comparable software had been
> designed by computer scientists and proved hard to use.
>
> Lacking deep computer science training, the professors considered their
> coding efforts more of an academic game than anything else. Nonetheless,
> starting in about 1991, they worked on R full time. “We were pretty much
> inseparable for five or six years,” Mr. Gentleman said. “One person would
> do the typing and one person would do the thinking.”
>
> Some statisticians who took an early look at the software considered it
> rough around the edges. But despite its shortcomings, R immediately
> gained a following with people who saw the possibilities in customizing
> the free software.
>
> John M. Chambers, a former Bell Labs researcher who is now a consulting
> professor of statistics at Stanford University, was an early champion.
>
> At Bell Labs, Mr. Chambers had helped develop S, another statistics
> software project, which was meant to give researchers of all stripes an
> accessible data analysis tool. It was, however, not an open-source
> project.
>
> The software failed to generate broad interest and ultimately the rights
> to S ended up in the hands of Tibco Software. Now R is surpassing what
> Mr. Chambers had imagined possible with S.
>
> “The diversity and excitement around what all of these people are doing
> is great,” Mr. Chambers said.
>
> While it is difficult to calculate exactly how many people use R, those
> most familiar with the software estimate that close to 250,000 people
> work with it regularly.
>
> The popularity of R at universities could threaten SAS Institute, the
> privately held business software company that specializes in data
> analysis software. SAS, with more than $2 billion in annual revenue, has
> been the preferred tool of scholars and corporate managers.
>
> “R has really become the second language for people coming out of grad
> school now, and there’s an amazing amount of code being written for it,”
> said Max Kuhn, associate director of nonclinical statistics at
> Pfizer. “You can look on the SAS message boards and see there is a
> proportional downturn in traffic.”
>
> SAS says it has noticed R’s rising popularity at universities, despite
> educational discounts on its own software, but it dismisses the
> technology as being of interest to a limited set of people working on
> very hard tasks.
>
> “I think it addresses a niche market for high-end data analysts that want
> free, readily available code,” said Anne H. Milley, director of
> technology product marketing at SAS. She adds, “We have customers who
> build engines for aircraft. I am happy they are not using freeware when I
> get on a jet.”
>
> But while SAS plays down R’s corporate appeal, companies like Google and
> Pfizer say they use the software for just about anything they can.
>
> Google, for example, taps R for help understanding trends in ad pricing
> and for illuminating patterns in the search data it collects. Pfizer has
> created customized packages for R to let its scientists manipulate their
> own data during nonclinical drug studies rather than send the information
> off to a statistician.
>
> The co-creators of R express satisfaction that such companies profit from
> the fruits of their labor and that of hundreds of volunteers.
>
> Mr. Ihaka continues to teach statistics at the University of Auckland and
> wants to create more advanced software. Mr. Gentleman is applying R-based
> software, called Bioconductor, in work he is doing on computational
> biology at the Fred Hutchinson Cancer Research Center in Seattle.
>
> “R is a real demonstration of the power of collaboration, and I don’t
> think you could construct something like this any other way,” Mr. Ihaka
> said. “We could have chosen to be commercial, and we would have sold five
> copies of the software.”
>
> » A version of this article appeared in print on January 7, 2009, on page
> B6 of the New York edition.
> --
>
> Cheers people
> Stephen Loosley
> Victoria, Australia
>