[Beowulf] SGI shows off Molecule concept machine

Eugen Leitl eugen at leitl.org
Fri Nov 21 04:03:29 PST 2008


SGI shows off Molecule concept machine

SC08: A dense cluster of Intel Atoms

By Timothy Prickett Morgan

Posted in Servers, 20th November 2008 22:05 GMT

While supercomputer maker Silicon Graphics was showing off its existing Altix
lines of Xeon and Itanium servers at the SC08 supercomputing show in Austin,
Texas, this week, the most interesting thing the company touted was not yet a
real computer, but a concept system, called Molecule.

The Molecule machine takes a few pages out of IBM's BlueGene massively
parallel supercomputer book. The main one is that for workloads where a
large number of compute nodes need to be brought to bear to run a
simulation, it sometimes makes more sense to have relatively modest
processors instead of big fat ones.

IBM built the BlueGene/L super from its embedded PowerPC 440 dual-core
processors. SGI's Molecule concept machine would be built from Intel
dual-core Atom x64 chips, which are based on 45 nanometer processes and are
designed for netbooks and other portable computing devices where long battery
life, not computing power, is the limit of usefulness. The chips run at
between 800 MHz and 1.67 GHz and implement HyperThreading, so they can
deliver up to two virtual threads per core.

With the BlueGene box, IBM controlled not only the chip but also the
interface off the chip and out into the system interconnect. Michael Brown,
sciences segment manager at SGI who was showing off the Molecule concept box,
says that SGI can't really control the interconnect Intel will put on Atom
boards. But presumably a fast enough interconnect could be designed to plug
multiple Atom boards into a chassis.

The Molecule concept machine puts a dual-core Atom N330, code-named
"Diamondville," on a system board that is about the size of a credit card.
This particular chip runs at 1.6 GHz and has a thermal design point of about
8 watts. The Atom N330 is not a true dual-core chip, but rather two
single-core Atoms side-by-side in a single chip package (it really isn't even
a socket) that is mounted to the board. Brown said that the future "Lincroft"
iteration of the Atom chip would be an interesting possibility. It will put a
DDR2 memory controller on the chip, which would eliminate the need for an
external chipset, since the Molecule boards have no direct-attached storage
other than main memory. But Brown made no commitments to SGI actually using
this chip.

In any event, the Molecule board had four memory DIMMs soldered directly to
the board and linked to the chip, which provided 2 GB of memory capacity. The
interconnect is along the same side of the board as the memory chips, and would
plug into a backplane of some sort that would reach out to external storage
and networks, much as blade servers do inside their chassis.

The Molecule design glues two of these Atom boards to a hollow ceramic
cartridge that is used to hold the boards in place, to draw heat off the
boards, and to channel cooling air that comes in through the bottom of the
chassis and is diverted at a 90 degree angle out the back of the chassis. The
cartridges interlace to create a bunch of channels, and have fins and baffles
inside to direct airflow very precisely. SGI calls this Atom board packaging
Kelvin.

[Image: SGI's Molecule Kelvin packaging. Caption: Kelvin, lording over the
Atoms in the Molecule]

The concept machine at the SC08 show was a 3U rack that contained 180 of the
Atom boards, for a total of 360 cores. These boards would present 720 virtual
threads to a clustered application, and have 720 GB of main memory (using 512
MB DDR2 DIMMs mounted on the board) and a total of 720 GB/sec of memory
bandwidth. The important thing to realize, explained Brown, is that if the
interconnect was architected correctly, the entire memory inside the chassis
could be searched in one second. That memory bandwidth, Brown explained, was
up to 15 TB/sec per rack, or about 20 times that of a single-rack cluster
these days. This setup would be good for applications where cache memory or
out-of-order execution doesn't help, but massive numbers of threads do.
(Search, computational fluid dynamics, seismic processing, stochastic
modeling, and others were mentioned.)
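
The chassis figures quoted above can be cross-checked with a little
arithmetic. The sketch below is only an illustration built from numbers in
the article (180 boards, two cores per board, two HyperThreading threads per
core, 2 GB per board); the function and constant names are made up for this
example. Note that 180 boards at 2 GB each works out to 360 GB, not the 720 GB
quoted, so the one-second scan claim is computed here from the quoted 720 GB
and 720 GB/sec rather than from the derived total.

```python
# Figures quoted in the article for the SC08 concept chassis.
BOARDS_PER_CHASSIS = 180
CORES_PER_BOARD = 2            # dual-core Atom N330
THREADS_PER_CORE = 2           # HyperThreading
GB_PER_BOARD = 2               # four 512 MB DIMMs soldered to each board
QUOTED_MEMORY_GB = 720         # as quoted in the article
QUOTED_BANDWIDTH_GB_S = 720    # as quoted in the article


def chassis_totals(boards=BOARDS_PER_CHASSIS):
    """Derive cores, threads, and memory from the per-board figures."""
    cores = boards * CORES_PER_BOARD
    threads = cores * THREADS_PER_CORE
    memory_gb = boards * GB_PER_BOARD
    return {"cores": cores, "threads": threads, "memory_gb": memory_gb}


def full_scan_seconds(memory_gb, bandwidth_gb_s):
    """Time to read all of memory once at the given aggregate bandwidth."""
    return memory_gb / bandwidth_gb_s


totals = chassis_totals()
print(totals)
print(full_scan_seconds(QUOTED_MEMORY_GB, QUOTED_BANDWIDTH_GB_S))
```

The derived core and thread counts match the article (360 and 720). The
memory total is the one place where the per-board figure and the quoted
chassis figure disagree, and the sketch does not try to settle which is right.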

The other advantages that the Molecule system might have are low energy use
and low cost. A rack of these machines (that's 10,080 cores with 9.8 TB of
memory) would deliver about 7 times the aggregate memory bandwidth per watt,
measured in GB per second per watt, of a rack of x64 servers in a cluster
today, according to Brown. On applications where threads rule, the Molecule
would deliver about 7 times the performance per watt of x64 servers, and on
SPEC-style floating point tests, it might even deliver twice the performance
per watt. On
average, SGI is saying performance per watt should be around 3.5 times that
of a rack of x64 servers.
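
The rack-level numbers can be sanity-checked the same way. This is a rough
sketch using only figures quoted in the article; the names are invented for
the example, and it assumes 1 TB is counted as 1024 GB, which is what makes
the quoted 9.8 TB line up.

```python
# Rack-level figures quoted in the article.
RACK_CORES = 10_080
CORES_PER_BOARD = 2            # dual-core Atom N330
GB_PER_BOARD = 2               # 2 GB soldered to each board
RACK_BANDWIDTH_TB_S = 15       # "up to 15 TB/sec per rack"
BANDWIDTH_ADVANTAGE = 20       # "about 20 times" a single-rack cluster


def rack_memory_tb(cores=RACK_CORES):
    """Total rack memory in TB, assuming 1 TB = 1024 GB."""
    boards = cores // CORES_PER_BOARD
    return boards * GB_PER_BOARD / 1024


def implied_baseline_tb_s():
    """Memory bandwidth of the comparison rack implied by the 20x claim."""
    return RACK_BANDWIDTH_TB_S / BANDWIDTH_ADVANTAGE


print(round(rack_memory_tb(), 1))
print(implied_baseline_tb_s())
```

With these inputs, 10,080 cores means 5,040 boards and roughly 9.8 TB of
memory, consistent with the article, and the 20x bandwidth claim implies a
conventional rack at about 0.75 TB/sec.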

One more thing: It has no moving parts, and that increases reliability. And
if storage needs to be added to the Molecule architecture, it will be flash.

The Molecule aims to run off-the-shelf HPC applications on top of Linux or
Windows. Brown said that SGI was showing off the concept box to solicit input
from prospective customers even before it creates an alpha box. If SGI sees
enough interest, it could take 12 to 18 months to produce the concept. If the
idea is sound, let's hope it doesn't take that long.
