
The Supercomputer Strikes Back

DQI Bureau

Here, Cray and IBM argue over why their choice of technology is better than the other’s. This is hot stuff, happening in real time...


When Cray announced the launch of its X1 supercomputer last October and predicted a renaissance of specialized supercomputers, the supercomputing community sat up and listened. The reason–just about six months earlier, NEC (Japan) had fired up the fastest supercomputer in the world, built on the vector architecture, an architecture that leading vendors like IBM had dismissed as "old technology." With the launch of the X1, the debate has started all over again.

For nearly four decades, supercomputing was synonymous with supercomputers–machines large enough to occupy six cricket pitches, typically used to crack ‘grand challenge’ problems like weather and climate modeling, high-fidelity crash testing of automobiles, and advanced drug design and discovery simulation. Cray, NEC, and Hitachi made such machines, all based on vector architectures. In the last decade or so, however, the market for these machines has been shrinking rapidly. True to their name, supercomputers crunched voluminous amounts of data, but they were very costly. Hence they adorned government-run labs and very large organizations.


That changed in the early 90s with the emergence of a supercomputing

technology called clustering.

Jargon Buster

Vector supercomputers: Machines optimized for applying arithmetic operations to large arrays called vectors. Vector supercomputers dominated the supercomputing world from the 1960s to the 1990s. Cray was the market leader of this era.

High performance parallel machines: Instead of a single machine with supercomputing capabilities, parallel processing machines are a group of machines or processors that split the load amongst themselves. They can scale up significantly. E.g. IBM’s ASCI White.

High performance clustering: Machines in a cluster operate simultaneously with the aim of getting better number-crunching capabilities. While some machines in the cluster split the computational load, others handle data input/output and still others act as controllers. High-speed networks connect them all.

COTS: Commercial off-the-shelf clusters are a recent attempt to create supercomputers out of PCs. E.g. the SETI@home project, under which home PCs have been linked up in the Search for Extra-Terrestrial Intelligence (SETI).
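To make the vector idea concrete, here is a minimal, purely illustrative sketch in Python (NumPy is used only as a convenient stand-in and has nothing to do with any vendor mentioned here). It contrasts element-by-element arithmetic with a single operation applied to an entire array at once, which is the pattern vector machines accelerate in hardware.

    # Illustrative sketch only: scalar-style vs vector-style arithmetic.
    import numpy as np

    a = np.random.rand(1_000_000)
    b = np.random.rand(1_000_000)

    # Scalar style: one multiply-add at a time, in an explicit loop
    c_scalar = np.empty_like(a)
    for i in range(a.size):
        c_scalar[i] = 2.0 * a[i] + b[i]

    # Vector style: one operation applied to the whole array at once
    c_vector = 2.0 * a + b

    # Both produce the same result; the vector form is what vector
    # hardware (and, here, NumPy) executes far more efficiently.
    assert np.allclose(c_scalar, c_vector)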

The cluster story



Basically, instead of a single machine with supercomputing capabilities, a

cluster is a group of machines or processors that are linked together and can

offer comparable computing power. With the increasing processing power of

commodity microprocessors (thanks to Moore’s Law), the very need for

specialized supercomputers was questioned. After all, any laptop available today packs as much computing power as a specialized supercomputer did 25 years ago.
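As a rough sketch of the divide-and-aggregate idea behind clusters, the illustrative Python snippet below splits one computation across several worker processes on a single PC using the standard multiprocessing module. This is only an analogy: a real cluster coordinates separate machines over a high-speed network, typically through a message-passing library, but the principle of splitting the load is the same.

    # Illustrative sketch only: splitting one job across several workers.
    from multiprocessing import Pool

    def partial_sum(bounds):
        # Each worker computes the sum of squares over its own slice.
        lo, hi = bounds
        return sum(i * i for i in range(lo, hi))

    if __name__ == "__main__":
        n, workers = 10_000_000, 4
        step = n // workers
        chunks = [(i * step, n if i == workers - 1 else (i + 1) * step)
                  for i in range(workers)]
        with Pool(workers) as pool:
            # Partial results come back and are combined, much as a
            # cluster aggregates results from its nodes.
            total = sum(pool.map(partial_sum, chunks))
        print(total)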


Also, the idea of using low-end processors to achieve high performance

computing appealed to corporate users who were looking for affordable

supercomputing solutions. So organizations like Sun and Dell, long associated with personal computing, have entered the field of High Performance Computing (HiPC).

"There are two major advantages of cluster based supercomputing. The

first is scalability. You can start at very small systems (a few processors) and

scale up to the largest systems in the world (tens of thousands of processors).

This allows companies with the smallest budgets or highest performance

requirements to use the same technology to meet their needs. It allows

developers to work on small systems and scale their applications to grand

challenge proportions on the largest systems available," says Peter Ungaro,

IBM Vice President for Sales, Worldwide High Performance Computing. IBM has

largely placed its bets on cluster-based high performance computing in the

recent past.

"The second advantage is that they will get the most sustained

performance for their budget. This is important as organizations worldwide want

to get the highest return on their investments in supercomputing

technologies," he adds.


The Cray argument



But for Cray, such arguments are trivial.

"Organizations like IBM contend that cluster systems are cheaper than

custom systems like those made by Cray and Hitachi.

But if you look at issues like facility costs, power consumption, and support costs, clusters are pretty costly considering the relatively low level of performance that you get from them," says Dr Burton Smith, chief scientist at Cray Inc, who was in India recently.


For Dr Burton, the supercomputing world is divided into two halves–the manufacturers of "Type C" machines like Cray, NEC, and Hitachi, and those who make "Type T" machines like IBM and SGI.

"Basically, there are two different types of supercomputers–the

clusters and grid systems called Type T machines, whose prices are based on

transistor costs and their peak performance is characterized by LINPACK (a

benchmark to measure performance of a dedicated system for solving a dense

system of linear equations). On the other hand, high memory bandwidth and fast

interconnection switches characterize the performance of Type C systems (NEC,

Cray, and Hitachi).

Hence, the cost in Type C systems is in wires (connections) not in

processors," says Dr Burton.


He also contends that both types of supercomputers have their respective niches in the supercomputing ecosystem.

"Type T systems, like some of those made by IBM, perform well with local data, well-balanced workloads, and explicit methods. Type C systems, on the other hand, perform well with global access to data, poorly balanced workloads, sparse linear algebra, implicit methods, and adaptive or irregular meshes," he adds.

Simply put, while Type T systems are good at solving relatively simpler problems like handling large e-commerce transactions, they are simply not good enough to handle highly complex tasks in volatile environments. Weather prediction, for instance, is one such volatile environment.


" To suggest that Type C cluster made machines can undertake grand

challenges like weather modeling is bunkum and sheer marketing fluff of large

corporations like IBM," says Dr Burton.

Buoyed by its success in high performance computing areas like weather modeling, Big Blue is unfazed by such criticism.

The IBM counterpoint



"We see that the HiPC market is primarily focused on purchasing systems

which have the best sustained performance for their money and that has typically

been clusters based on high performance processors, such as the POWER4. Even in

markets traditionally dominated by specialized vector supercomputers, like

weather forecasting, we see them moving to high performance integrated clusters

such as those sold by a number of IT companies, including IBM. An example would be

the European Center for Medium Range Weather Forecasting (ECMWF)," says

Ungaro.

Ungaro points out that IBM’s deal with ECMWF to build the world’s most

powerful supercomputer and storage network for weather prediction is proof

enough that high performance integrated clusters are not far behind.

"Also, it is critical that you don’t just have good hardware, but you

have a pervasive solution that attracts the thousands of important software

developers. You have to make sure that there is a large portfolio of

applications to run so that customers have a choice of what solution is best for

them. Also one needs to have a group of experts who take these ported

applications and optimize them. This broad application portfolio is a major

advantage of using high performance cluster technologies over specialized, niche

technologies such as vectors," says Ungaro.

Renaissance?



However, a recent school of thought has emerged which contends that a

"renaissance of specialized supercomputers" is likely.

The launch of the Earth Simulator by NEC, Japan, is something manufacturers of vector-based systems are proud of. The Earth Simulator, which is based on a vector architecture, is five times faster than the most powerful US configuration and is the most powerful supercomputer in the world today. Cray believes this signals a "renaissance of high-bandwidth vector-based systems."

"The Earth Simulator is a slap on the face of all those who claimed that

specialized supercomputers were a thing of the past.

It’s a major embarrassment for the authorities and vendors in the US who

can’t believe that the fastest supercomputer is now in Japan and is a vector

based system. This is bound to rekindle a renaissance of specialized

supercomputers," says Burton.

Whether such a revival will happen is something only time can tell. But one thing is very clear–it’s going to be an uphill battle for

vector-based supercomputer makers like Cray to take on the sheer muscle power of

large corporations like IBM and make a significant dent in the supercomputing

market.

TV Mahalingam
