Two new NASA technologies have squeezed 10
times more power out of climate modelling supercomputers.
by Patrick L. Barry
For many people, faster computers mean
better video games and quicker Internet surfing. But for decision
makers grappling with the future of Earth's climate, faster computers
have a very practical benefit: more realistic climate simulations
that give more reliable predictions.
NASA scientists have managed to squeeze about 10 times more power
out of cutting-edge "parallel" supercomputers through innovations
in software and memory design. This leap in effective computing
"muscle" - together with the data from NASA's Earth-observing satellites
- enables greater realism and statistical confidence in simulations
of global climate.
"That's something that we want to achieve,
so that when policy makers have to make a decision involving hundreds
of millions of dollars in the economy, they can say: "These
simulations are something we have confidence in," says Dr. Ghassem
Asrar, associate administrator for Earth science at NASA Headquarters
in Washington, D.C.
Whether the question concerns the path of an approaching hurricane
or the rise in global temperatures over the next century, predictions
always carry some amount of uncertainty. But the computer "models"
that produce the simulations can be improved so that this uncertainty
is reduced.
Making these improvements will require
all the computing power scientists can get their hands on.
To provide the immense "number crunching" power needed for demanding
scientific applications such as climate simulation, some computer
makers are turning to "parallel" designs with hundreds or thousands
of processors in one supercomputer.
[Image caption: The bottom line shows how the increase in computing power (gigaflops) normally tapers off as the number of processors increases. The top line shows the performance of the same processors using the software tools developed by NASA.]
The numbers on these machines make
even the fastest desktop computers look like pocket calculators.
For example, the newest supercomputer at NASA's Goddard Space Flight
Center boasts 512 processors each running at 400 MHz, 128 GB of
RAM, 2,800 GB of disk space, and a peak performance of 409 gigaflops!
(A "gigaflop" is a billion floating-point calculations per second.) A newer machine
at NASA's Ames Research Center will top even this with 1,024 processors.
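(As a rough check of that peak figure: if each of the 512 processors can carry out two floating-point calculations per tick of its 400 MHz clock - an assumption about the hardware, not a detail given here - then 512 × 400 million × 2 works out to roughly 409 billion calculations per second, matching the quoted 409 gigaflops.)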
But simply adding more processors doesn't guarantee
a proportionate increase in effective power. In fact, the full potential
of these parallel supercomputers still has not been tapped.
"So what's the problem? Each node (i.e. processor) has certain performance,"
Asrar explains. "Individually they perform well, but as you add
them all together, as the number of nodes goes up, the overall efficiency
degrades." For example, a system with 100 processors would not have
100 times the power of a single processor - the overall performance
would be somewhat lower.
This loss of computing efficiency is a bit like
what happens when people must work together to get a task done.
Some effort must go into managing and coordinating the people involved
- effort that's diverted away from producing anything - and even
the productive workers must spend some amount of time communicating
with each other. In a similar way, a supercomputer with more processors
must use more of its power to coordinate those processors, and the
increased communication between all the processors bogs the system
down.
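The article's comparison is qualitative, but a classic rule of thumb known as Amdahl's law captures the effect. The short C sketch below assumes that 5 percent of the work is serial coordination overhead - an illustrative figure, not a NASA measurement - and prints how speedup and efficiency fall further behind the ideal as processors are added.

```c
#include <stdio.h>

/* Illustrative only: Amdahl's law with an assumed 5% serial/coordination
 * fraction. Real climate codes have different, workload-dependent overheads. */
int main(void)
{
    const double serial_fraction = 0.05;   /* assumed share of non-parallel work */
    const int counts[] = {1, 16, 64, 128, 256, 512};

    printf("%10s %10s %12s\n", "processors", "speedup", "efficiency");
    for (int i = 0; i < 6; i++) {
        int n = counts[i];
        double speedup = 1.0 / (serial_fraction + (1.0 - serial_fraction) / n);
        printf("%10d %10.1f %11.0f%%\n", n, speedup, 100.0 * speedup / n);
    }
    return 0;
}
```

With these made-up numbers, 512 processors deliver nowhere near 512 times the single-processor performance - exactly the efficiency problem the NASA work set out to attack.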
"So the challenge was, how do you write the computer
programs such that you get the maximum performance out of a single
machine?" Asrar says.
[Image caption: Using faster computers, forecasters will be able to narrow the estimated paths of hurricanes and perhaps save millions of dollars in unneeded evacuations.]
For the past four years, scientists
at NASA's Ames Research Center have been working in partnership
with computer maker Silicon Graphics, Inc., to tackle this problem.
The fruits of their labour are two new technologies that increase
the effective power of these machines by roughly an order of magnitude
(that is, a factor of 10). Both technologies are freely available
to the supercomputing community, are vendor-independent,
and are not specific to climate modelling.
The first of these technologies is a memory architecture
called "single-image shared memory." In this design, all of the
supercomputer's memory is used as one continuous memory space by
all of the processors. (Other architectures distribute the memory
among the processors.) This lets the processors exchange the messages
needed to coordinate their efforts by accessing this "common ground"
of memory. This scheme is more efficient than passing the messages
directly between the processors, as most parallel supercomputers
do.
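To make the idea concrete, here is a minimal shared-memory sketch in C with OpenMP. It is only an analogy, not SGI's single-image implementation: every thread sees one common array, so partial results are exchanged simply by writing to and reading from that shared memory, with no explicit messages between processors.

```c
#include <stdio.h>
#include <omp.h>

#define MAX_THREADS 512

int main(void)
{
    /* One slot per thread, visible to every thread: the "common ground". */
    static double partial[MAX_THREADS];
    double total = 0.0;

    #pragma omp parallel
    {
        int id = omp_get_thread_num();

        /* Each thread computes its own piece of work... */
        partial[id] = (double)(id + 1);

        /* ...and coordination is just a barrier: once every thread has
         * written its slot, any thread can read all the others. */
        #pragma omp barrier
        #pragma omp single
        {
            for (int i = 0; i < omp_get_num_threads(); i++)
                total += partial[i];
            printf("combined result: %.1f\n", total);
        }
    }
    return 0;
}
```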
But a new memory architecture needs software that
knows how to make good use of it. The second innovation does just
that. It is a software design tool called "multi-level parallelism."
Software made using this tool can use the common pool of memory
to break the problem being solved into both coarse-grained and fine-grained
pieces, as needed, and compute these pieces in parallel. The single
memory space gives more flexibility in dividing up the problem than
other designs in which the memory is physically divided among the
processors.
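The sketch below is a rough analogy for that idea, again in C with OpenMP rather than the actual NASA/SGI tool: an outer loop splits the problem into coarse pieces (here, imaginary latitude bands), and each piece then parallelises its own inner loop over grid points, all within one shared pool of memory. The band and grid sizes are arbitrary illustrative choices.

```c
#include <stdio.h>
#include <omp.h>

#define BANDS  4        /* coarse-grained pieces (illustrative) */
#define POINTS 1000     /* fine-grained pieces within each band */

int main(void)
{
    static double field[BANDS][POINTS];   /* lives in one shared memory pool */

    omp_set_max_active_levels(2);         /* allow two levels of parallelism */

    #pragma omp parallel for num_threads(BANDS)   /* coarse level */
    for (int b = 0; b < BANDS; b++) {
        #pragma omp parallel for num_threads(4)   /* fine level */
        for (int p = 0; p < POINTS; p++)
            field[b][p] = b * 1000.0 + p;         /* stand-in for real physics */
    }

    printf("sample value: %.1f\n", field[BANDS - 1][POINTS - 1]);
    return 0;
}
```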
The extra computing power milked from the processors
by these technologies will help NASA's Earth Science Enterprise
make better models of Earth's climate.
These models work by dividing the atmosphere and
oceans up into a 3-dimensional grid of boxes. These boxes are assigned
values for temperature, moisture content, chemical content, and
so on, and then the interactions between the boxes are calculated
using equations from physics and chemistry. The result is an approximation
of the real system.
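A toy version of that "grid of boxes" fits in a few dozen lines of C. The sketch below is vastly smaller and simpler than anything NASA runs - one variable (temperature), a handful of boxes, and a crude rule that each box drifts toward the average of its six neighbours - but it shows the basic machinery described above.

```c
#include <stdio.h>

#define NX 8
#define NY 8
#define NZ 4

/* Temperature per box, plus a scratch copy for the update step. */
static double T[NX][NY][NZ], Tnew[NX][NY][NZ];

int main(void)
{
    /* Initialise: warm "tropics" in the middle rows, cooler elsewhere. */
    for (int i = 0; i < NX; i++)
        for (int j = 0; j < NY; j++)
            for (int k = 0; k < NZ; k++)
                T[i][j][k] = (j > 2 && j < 5) ? 300.0 : 270.0;

    for (int step = 0; step < 100; step++) {
        /* Interior boxes exchange heat with their six neighbours;
         * the outermost boxes act as fixed boundary conditions. */
        for (int i = 1; i < NX - 1; i++)
            for (int j = 1; j < NY - 1; j++)
                for (int k = 1; k < NZ - 1; k++) {
                    double neighbours = T[i-1][j][k] + T[i+1][j][k] +
                                        T[i][j-1][k] + T[i][j+1][k] +
                                        T[i][j][k-1] + T[i][j][k+1];
                    Tnew[i][j][k] = T[i][j][k]
                                  + 0.1 * (neighbours / 6.0 - T[i][j][k]);
                }
        for (int i = 1; i < NX - 1; i++)
            for (int j = 1; j < NY - 1; j++)
                for (int k = 1; k < NZ - 1; k++)
                    T[i][j][k] = Tnew[i][j][k];
    }

    printf("temperature in one mid-grid box after 100 steps: %.2f K\n", T[4][2][2]);
    return 0;
}
```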
With more computing power available, more of the physics of the
real climate system can be incorporated into the models, and the
atmosphere can be divided into more, smaller boxes. This makes the
models more realistic, and the predictions they produce will be
more useful on a regional scale.
Also, the ability to run these models faster will
mean that more simulations can be performed, which will produce
a larger pool of results. In statistical terms, this larger "population"
will allow for a better analysis of the strength of the conclusions.
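In code, that statistical point looks something like the sketch below. The "model runs" here are just random numbers scattered around an assumed 2-degree warming - fake data for illustration - but the pattern is the real one: the uncertainty in the ensemble average shrinks roughly with the square root of the number of runs.

```c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Crude stand-in for one simulation's result: about 2.0 +/- 0.5 degrees. */
static double fake_run(void)
{
    return 2.0 + ((double)rand() / RAND_MAX - 0.5);
}

/* Mean and standard error of an ensemble of n fake runs. */
static void summarize(int n)
{
    double sum = 0.0, sumsq = 0.0;
    for (int i = 0; i < n; i++) {
        double x = fake_run();
        sum += x;
        sumsq += x * x;
    }
    double mean = sum / n;
    double var  = (sumsq - n * mean * mean) / (n - 1);
    printf("%4d runs: mean %.2f, standard error %.3f\n", n, mean, sqrt(var / n));
}

int main(void)
{
    srand(42);
    summarize(10);
    summarize(100);
    summarize(1000);
    return 0;
}
```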
[Image caption: The software tools developed by NASA and SGI can be used for other simulations, too. Shown here is a supercomputer model of a human protein.]
NASA's suite of Earth-observing satellites, together
with a global network of meteorological stations, provides the dose
of real-world data needed to keep the models on track. And
the archives of this data provide the ultimate proving grounds for
the models: Can the computers accurately "predict" the real weather
observed in the past?
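One simple way to score such a "hindcast" is a root-mean-square error against the archived observations, as in the brief sketch below; the two series in it are placeholder numbers, not real climate data.

```c
#include <stdio.h>
#include <math.h>

int main(void)
{
    /* Placeholder yearly mean temperatures (degrees C): observations vs. hindcast. */
    const double observed[] = {14.1, 14.3, 14.0, 14.5, 14.7, 14.6};
    const double hindcast[] = {14.0, 14.4, 14.1, 14.4, 14.8, 14.5};
    const int n = 6;

    double sumsq = 0.0;
    for (int i = 0; i < n; i++) {
        double err = hindcast[i] - observed[i];
        sumsq += err * err;
    }
    printf("hindcast RMSE: %.2f degrees\n", sqrt(sumsq / n));
    return 0;
}
```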
Asrar says that the computer models are already quite good at this,
but there's still room for improvement. As supercomputers continue
to advance - along with the software that taps that power - climate
models will become more and more accurate, offering better answers
to the vexing questions of climate change.