[LINK] The Next Frontier: Decoding the Internet's Raw Data

Bernard Robertson-Dunn brd at iimetro.com.au
Tue Jun 2 16:34:25 AEST 2009


The Next Frontier: Decoding the Internet's Raw Data
By Kim Hart
Monday, June 1, 2009
Washington Post
http://www.washingtonpost.com/wp-dyn/content/article/2009/05/31/AR2009053102340.html

There's no shortage of uses for the massive amounts of data in every 
nook and cranny of the Internet.

Advertisers want to mine the photos and status updates you post on 
Facebook to better sell their wares. Scientists want to track weather 
patterns based on decades of climate records to better forecast 
troubling storms. And White House officials now want to make government 
data sets available for citizens to use however they see fit.

The problem is figuring out how to organize and display the data in a 
useful and informative way, instead of forcing people to sift through 
heaps of mind-numbing spreadsheets. When are bar graphs and pie charts 
enough to break down a set of numbers? What is the best way to display 
flu outbreaks, cellphone call logs or senators' voting records?

These are some of the questions that were debated last week by 
government researchers, computer science professors and corporate 
financial analysts who attended workshops at the annual symposium of the 
University of Maryland's Human-Computer Interaction Lab.

"We're trying to understand data and make sense of it visually, but 
there's no way of evaluating how effective these visuals really are for 
people," said Mave Houston, a research manager for 
PricewaterhouseCoopers. Part of her job is finding tools to help 
auditors and investigators examine complex financial data, such as 
diagrams that show relationships between revenue and expenditures.

Also in the room were analysts from the Department of Defense, SAIC and 
Lockheed Martin, who expressed frustrations that information 
visualization tools, or "infoviz" as some call it, are too complex for 
novice users. Or they don't work well with user-generated content. Or 
they can't handle large amounts of data.

One of the most common tools was used recently to track the spread of 
the H1N1 virus. It showed outbreaks as they occurred in various cities 
by using larger dots to show higher concentrations of reported illnesses 
and smaller dots for individual cases.
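
As a rough illustration of the dot-sizing idea described above (not the tool the article refers to), a map like this could scale each dot so that its area, rather than its radius, grows with the case count, which is a common cartographic convention. The function name, the radius limits and the city counts below are hypothetical placeholders, not real H1N1 figures.

```python
import math


def dot_radius(cases, max_cases, max_radius=20.0, min_radius=2.0):
    """Return a dot radius whose area is proportional to the case count.

    max_radius and min_radius are arbitrary pixel sizes chosen for this sketch.
    """
    if cases <= 0 or max_cases <= 0:
        return 0.0  # nothing to draw
    # Area grows with r**2, so take the square root of the share of cases.
    radius = max_radius * math.sqrt(cases / max_cases)
    # Keep individual cases visible by enforcing a minimum size.
    return max(min_radius, radius)


# Placeholder counts for illustration only.
reported = {"city_a": 100, "city_b": 25, "city_c": 1}
largest = max(reported.values())
for city, cases in reported.items():
    print(city, round(dot_radius(cases, largest), 1))
```

Scaling the radius directly would make a city with four times the cases look sixteen times larger, which is why the square root is used here.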

Linking information, designing user-friendly technology devices and 
finding ways to improve people's interaction with the Web have been 
part of the Human-Computer Interaction Lab's mission since it was 
founded by Ben Shneiderman in 1983.

Since then, the lab has been credited with creating hyperlinks -- 
highlighted words in a document that direct Web surfers to another site 
-- even down to their characteristic light blue color. Shneiderman also 
developed a tool known as "treemaps," which display information as 
blocks of color to show hierarchical relationships. Hive Group, a Texas 
software company, licensed the technology and now uses it to help 
corporations display a variety of data, such as stock prices or computer 
systems.
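
To make the treemap idea concrete, here is a minimal sketch of a "slice-and-dice" style layout, in which a rectangle is divided among a node's children in proportion to their sizes and the split direction alternates at each level of the hierarchy. It is an illustration only, not the lab's or Hive Group's implementation; the Node class, the layout function and the sample hierarchy are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    size: float = 0.0  # used only for leaves
    children: list[Node] = field(default_factory=list)

    def total(self) -> float:
        """Size of a leaf, or the summed size of all descendants."""
        if not self.children:
            return self.size
        return sum(child.total() for child in self.children)


def layout(node, x, y, w, h, depth=0, out=None):
    """Return a list of (name, x, y, width, height) tuples, one per leaf."""
    if out is None:
        out = []
    if not node.children:
        out.append((node.name, x, y, w, h))
        return out
    total = node.total()
    offset = 0.0  # fraction of the parent rectangle already used
    for child in node.children:
        share = child.total() / total if total else 0.0
        if depth % 2 == 0:
            # Even depth: divide the width among the children.
            layout(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            # Odd depth: divide the height among the children.
            layout(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Made-up hierarchy: two sectors, one of which holds two items.
tree = Node("all", children=[
    Node("sector_1", children=[Node("item_a", 30), Node("item_b", 10)]),
    Node("sector_2", 60),
])
for rect in layout(tree, 0, 0, 100, 100):
    print(rect)
```

Each leaf ends up with an area proportional to its size, and leaves that share a parent sit inside the same enclosing rectangle, which is how the blocks convey the hierarchy. A real tool would also assign a color to each block.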

"It's satisfying to see what was once considered esoteric research turn 
into mainstream computer science that has revolutionized industries," 
Shneiderman said last week. "Just think, YouTube works because designers 
made it easy to search for videos effectively. Now we have high school 
kids creating videos that get 5 million views."

While many of the lab's projects focus on consumer tools, such as 
improving Web site designs and developing devices that are easy for 
children to use, the lab is getting increased attention from 
policymakers looking to leverage technology for government needs.

Last week, Shneiderman met with federal Chief Information Officer Vivek 
Kundra and Deputy Chief Technology Officer Beth Noveck to discuss ways 
of improving public participation in policymaking. The lab also works 
with the Library of Congress, NASA and the National Archives to 
integrate technology into their services.

"Our belief is that technology is not just useful as toys or for 
business," Shneiderman said. "We're talking about using these 
technologies for national priorities."

David Wang, a computer science doctoral student who has worked in the 
lab for several years, has focused his research on electronic health 
records. Working with several Washington area hospitals, he is designing 
ways to organize time-sensitive patient data to keep better track of 
patients who need repeat treatments or who could qualify for drug trials 
or other procedures based on their medical histories.

Wang said he's suddenly received a lot of interest from medical groups 
and companies wanting to learn more about analyzing health records, now 
that more than $19 billion in stimulus funding has been allocated to 
digitizing the information.

"I guess there's buzz wherever there's money," he said. His project is 
open-source, so others can take a look at his progress and the code he 
is using to design his tools.

Allison Druin, the lab's director, said the researchers have become 
much more focused on partnering with nonprofits, non-governmental 
organizations and federal agencies. She said she's received a number of 
questions about improving access to information, providing data on new 
platforms such as cellphones and getting a better handle on online 
threats to children, such as cyber-bullying. All these questions, she 
said, have a direct impact on government actions.

"A lot of what we do affects policy and, of course, the policy affects 
the way we use technology," she said.

Kim Hart writes about the Washington technology scene every Monday. 
Contact her at hartk at washpost.com.

--

 
Regards
brd

Bernard Robertson-Dunn
Canberra Australia
brd at iimetro.com.au



