[LINK] IBM Builds Biggest Data Drive Ever
stephen at melbpc.org.au
Mon Aug 29 21:07:31 AEST 2011
IBM Builds Biggest Data Drive Ever
Thursday, August 25, 2011 By Tom Simonite
<http://www.technologyreview.com/computing/38440/>
A data repository almost 10 times bigger than any made before is being
built by researchers at IBM's Almaden, California, research lab.
The 120 petabyte "drive" (that's 120 million gigabytes) is made up of
200,000 conventional hard disk drives working together.
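
As a quick check of those figures, here is a rough sketch in Python. It assumes decimal (SI) units and that the 120 petabytes is spread evenly over the 200,000 drives. It also ignores any space used for redundant copies, which the article does not quantify, so the per-drive number is only an average of usable capacity, not a drive specification.

```python
# Decimal (SI) storage units, as drive vendors usually quote them.
GB = 10**9
PB = 10**15

total_bytes = 120 * PB   # capacity quoted in the article
drive_count = 200_000    # drive count quoted in the article

# Total capacity expressed in gigabytes.
total_gb = total_bytes // GB

# Average usable capacity per drive (assumption: even spread, no redundancy).
avg_drive_gb = total_bytes // drive_count // GB

print(f"{total_gb:,} GB in total")
print(f"{avg_drive_gb} GB per drive on average")
```

This gives 120,000,000 GB in total and an average of 600 GB of usable capacity per drive.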
The giant data container is expected to store around one trillion files
and should provide the space needed to allow more powerful simulations of
complex systems, like those used to model weather and climate.
A 120 petabyte drive could hold 24 billion typical five-megabyte MP3
files or comfortably swallow 60 copies of the biggest backup of the Web,
the 150 billion pages that make up the Internet Archive's WayBack Machine.
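
The comparison above can be checked the same way. This sketch again assumes decimal units. The archive size it prints is only what the "60 copies" figure implies as an upper bound, not a number stated in the article.

```python
# Decimal (SI) storage units.
MB = 10**6
PB = 10**15

capacity = 120 * PB
mp3_size = 5 * MB        # "typical" MP3 size used in the article

# How many five-megabyte files fit in 120 petabytes.
mp3_count = capacity // mp3_size

# Largest archive size (in PB) that would still fit 60 times over.
implied_archive_pb = capacity // 60 // PB

print(f"{mp3_count:,} MP3 files")
print(f"at most {implied_archive_pb} PB per archive copy")
```

That works out to 24,000,000,000 MP3 files, and an archive of at most about 2 PB per copy.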
The data storage group at IBM Almaden is developing the record-breaking
storage system for an unnamed client that needs a new supercomputer for
detailed simulations of real-world phenomena. However, the new
technologies developed to build such a large repository could enable
similar systems for more conventional commercial computing, says Bruce
Hillsberg, director of storage research at IBM and leader of the project.
"This 120 petabyte system is on the lunatic fringe now, but in a few
years it may be that all cloud computing systems are like it," Hillsberg
says. Just keeping track of the names, types, and other attributes of the
files stored in the system will consume around two petabytes of its
capacity.
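
To put the metadata figure in perspective, a rough sketch follows, assuming decimal units and taking the article's "around" figures (one trillion files, two petabytes of metadata) at face value.

```python
# Decimal (SI) storage units.
PB = 10**15

capacity = 120 * PB
metadata = 2 * PB        # "around two petabytes" per the article
file_count = 10**12      # "around one trillion files" per the article

# Average metadata per file, in bytes.
metadata_per_file = metadata // file_count

# Share of total capacity spent on metadata, as a percentage.
metadata_percent = round(100 * metadata / capacity, 1)

print(f"{metadata_per_file} bytes of metadata per file")
print(f"{metadata_percent}% of capacity")
```

So each file carries roughly 2,000 bytes of metadata on average, and metadata takes up about 1.7% of the system.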
Steve Conway, a vice president of research with the analyst firm IDC who
specializes in high-performance computing (HPC), says IBM's repository is
significantly bigger than previous storage systems. "A 120-petabyte
storage array would easily be the largest I've encountered," he says.
The largest arrays available today are about 15 petabytes in size.
Supercomputing problems that could benefit from more data storage include
weather forecasts, seismic processing in the petroleum industry, and
molecular studies of genomes or proteins, says Conway.
IBM's engineers developed a series of new hardware and software
techniques to enable such a large hike in data-storage capacity. Finding
a way to efficiently combine the thousands of hard drives that the system
is built from was one challenge.
As in most data centers, the drives sit in horizontal drawers stacked
inside tall racks. Yet IBM's researchers had to make the drawers
significantly wider than usual to fit more disks into a smaller area.
The disks must be cooled with circulating water rather than standard fans.
The inevitable failures that occur regularly in such a large collection
of disks present another major challenge, says Hillsberg. IBM uses the
standard tactic of storing multiple copies of data on different disks,
but it employs new refinements that allow a supercomputer to keep working
at almost full speed even when a drive breaks down.
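
The "standard tactic" mentioned above can be illustrated with a toy Python sketch. This is not IBM's implementation, and it does not attempt the refinements the article alludes to. The class name ReplicatedStore, the default of three copies, and the random placement are all assumptions made for illustration only.

```python
import random


class ReplicatedStore:
    """Toy store that keeps each block on several distinct disks."""

    def __init__(self, disk_count, copies=3, seed=0):
        if copies > disk_count:
            raise ValueError("need at least as many disks as copies")
        self.disks = [dict() for _ in range(disk_count)]  # one dict per disk
        self.failed = set()      # indexes of disks that have broken down
        self.copies = copies     # hypothetical replication factor
        self.placement = {}      # key -> list of disk indexes holding it
        self.rng = random.Random(seed)

    def write(self, key, data):
        healthy = [i for i in range(len(self.disks)) if i not in self.failed]
        if len(healthy) < self.copies:
            raise IOError("not enough healthy disks")
        # sample() picks distinct disks, so no two copies share a disk.
        targets = self.rng.sample(healthy, self.copies)
        for i in targets:
            self.disks[i][key] = data
        self.placement[key] = targets

    def fail_disk(self, index):
        # Simulate a drive breaking down and losing its contents.
        self.failed.add(index)
        self.disks[index].clear()

    def read(self, key):
        # Serve the read from the first surviving copy.
        for i in self.placement[key]:
            if i not in self.failed:
                return self.disks[i][key]
        raise IOError(f"all copies of {key!r} lost")


store = ReplicatedStore(disk_count=10, copies=3)
store.write("block-1", b"climate data")
first, second, _ = store.placement["block-1"]
store.fail_disk(first)
store.fail_disk(second)
result = store.read("block-1")
print(result)
```

Even with two of the three disks holding "block-1" gone, the read still succeeds from the remaining copy. A real system would also have to rebuild the lost copies onto healthy disks, which this sketch leaves out.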
--
Cheers,
Stephen