[Beowulf] Mare Nostrum (not quite COTS)

Eugen Leitl eugen at leitl.org
Wed Feb 16 02:31:53 PST 2005


Power Architecture Community Newsletter, 15 Feb 2005: MareNostrum: A new
concept in Linux supercomputing

Level: Introductory

developerWorks Power Architecture editors
15 Feb 2005

    The MareNostrum supercomputer at the Barcelona Supercomputing Center,
ranked number four in the world in speed in November 2004, is constructed of
such totally off-the-shelf parts as IBM BladeCenter JS20 servers, 64-bit
970FX PowerPC processors, TotalStorage DS4100 storage servers, and Linux 2.6.
This is its story.

IBM® has long been a supercomputing leader -- its heritage of innovation
currently and spectacularly manifested in its most powerful supercomputer,
Blue Gene®/L. The MareNostrum project is the latest bold experiment in
supercomputing by IBM -- a small but powerful, rapidly deployed and built
system that comes entirely from commercially available components. The Latin
term mare nostrum means "our sea" (which to the Romans meant the
Mediterranean, as familiar and available to the Italici as the air they
breathed, but also the critical key to their success).

MareNostrum is one of the world's most powerful supercomputers, ranked among
the top five in the prestigious TOP500 (see Resources), yet it is constructed
from products available for sale to any business, lives within a relatively
small footprint, and was built on a tight schedule using blade servers, a
Linux operating environment, and other cost-efficient technologies.
MareNostrum represents a new way of thinking about high-performance
computing.

Blade servers, among the thinnest and densest machines available, slide into
chassis that share resources such as power supplies and network switches.
They became the base components of this supercomputer design. Those familiar
with the IBM BladeCenter JS20 servers' shared-resources architecture will
recognize how these servers cost-effectively minimize power consumption and
heat output. Running the Linux operating system, the servers exploit the
capabilities of the 2.6 kernel on 64-bit PowerPC® processors.

MareNostrum also demonstrates something unique in its project timeline: part
of its mission was to prove the speed at which IBM Linux clusters could be
implemented and unleashed. According to the IBM MareNostrum e-Science Lead,
Dr. Juan Jose Porta (Open Systems Design and Development, IBM Boeblingen
Laboratory):

    This is all about timely and focused execution. The speed at which this
project was realized is important. Consider: from the initial concept in late
December of 2003 to assembling the computer in Madrid took less than a year.
Normally, supercomputer projects of this kind take years.

To make a remarkable saga short, MareNostrum is here and will soon be put
into operation by the Barcelona Supercomputing Center (BSC), a public
consortium created by the Spanish Government, the Catalonian Government, and
the Technical University of Catalonia (UPC), which hosts the MareNostrum
supercomputer. The center is located on the UPC campus in Barcelona.

Dr. Porta added, "The supercomputer is based upon commodity technology
already developed and available. We were also playing with another piece of
magic -- an open environment. This has been a collaborative community effort,
where we closely worked with our partners."

The name and the history
Why "MareNostrum?" In the words of Dr. Porta:

    MareNostrum means literally "our sea," which is also the Latin name for
the Mediterranean Sea on which Barcelona is a port. It carries other apt
connotations. "Our sea" refers to a sea of processors and professors who are
flocking to the MareNostrum project with a deep commitment to breakthrough
science. MareNostrum also refers to the fact that our supercomputer is on the
shores of the Mediterranean which, in the days of old Rome, was the middle of
the world. This was the center of the Roman Empire, now to become the center
of European e-Science on the shores of the nice Mediterranean Sea! Thus, we
are talking about an ocean of many professors and a major hub around which
such facilitation will grow and thrive to empower a new generation of

    Another significant aspect of the name is that, being Latin, it is more
culturally inclusive. Not everyone is aware that Spain has actually four
official languages, and we did not want to slight anyone. Latin was a safe
choice. Spain now understandably becomes the proud home to the most powerful
supercomputer in Europe. We see references to its having been assembled in
Madrid, but also references to its permanent home as being in Barcelona. 

MareNostrum is a result of the burgeoning partnership between IBM and the
Spanish Government, which has also led to the creation of the Barcelona
Supercomputing Center (BSC). BSC is a public consortium created by the
Spanish Government, the Catalonian Government, and the Technical University
of Catalonia (UPC), which will host the MareNostrum supercomputer.

Housed in a majestic 1920s chapel on the university grounds, MareNostrum
has a dual purpose: to serve as a primary high-performance computing
resource for the European e-science community and to demonstrate the many
benefits of Linux on POWER at scale.

Meet MareNostrum
With peak system performance of 40 teraflops for the final system
configuration, and a number four spot on the TOP500 list, MareNostrum
continues the IBM tradition of high-performance computing breakthroughs in
the service of scientific advancement with a twist: MareNostrum is built
entirely of commercially available components, including:

    * 2,282 IBM eServer BladeCenter JS20 blade servers housed in 163
      BladeCenter chassis
    * 4,564 64-bit IBM PowerPC 970FX processors
    * 140 TB of IBM TotalStorage® DS4100 storage servers

The thinking behind MareNostrum's construction represents a new way of
looking at compute-intensive areas. Today's typical high-performance
computing installation runs a large, parallel RISC-based UNIX® system in
which performance, rather than reliability, is of utmost importance.
MareNostrum, however, is a small-footprint Linux cluster made up
entirely of off-the-shelf components. With the extreme density of IBM eServer
BladeCenter JS20 servers, diskless nodes, and an open system environment,
MareNostrum offers superior price/performance; greater reliability,
availability, and serviceability; and significant cost efficiencies --
factors that are endearing Linux-based cluster servers to more and more
businesses all the time.

Distinguishing technologies
The next sections explain the hardware and software technologies that
distinguish the high-performance computing strategy behind MareNostrum.

Hardware: Servers
There are 2,282 IBM eServer BladeCenter JS20 servers housed in 163
BladeCenter chassis. Each blade server has two PowerPC 970FX processors
running at 2.2 GHz, providing superior performance for several varieties of
Linux. The BladeCenter technology offers the highest commercially available
computing density in the industry, which results in high performance with a
small footprint: 84 dual-processor servers fit in a single 42U rack, giving
more than 1.4 teraflops of compute power per rack.
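
The per-rack and full-system peak figures can be roughly cross-checked from
the numbers quoted in the article. The sketch below is only a
back-of-the-envelope calculation: the figure of 4 double-precision
floating-point operations per cycle per processor is an assumption (two
floating-point units, each doing a fused multiply-add) and is not stated in
the article. All variable and function names are illustrative.

```python
# Figures quoted in the article.
BLADES = 2282            # JS20 blade servers in the full system
CHASSIS = 163            # BladeCenter chassis
CPUS_PER_BLADE = 2       # processors per blade
CLOCK_GHZ = 2.2          # processor clock
BLADES_PER_RACK = 84     # dual-processor blades per 42U rack

# Assumption, not from the article: each processor can retire 4
# double-precision flops per cycle (2 FPUs x fused multiply-add).
FLOPS_PER_CYCLE = 4


def peak_teraflops(blades, cpus_per_blade=CPUS_PER_BLADE,
                   clock_ghz=CLOCK_GHZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in teraflops for a given number of blades."""
    gigaflops = blades * cpus_per_blade * clock_ghz * flops_per_cycle
    return gigaflops / 1000.0


rack_tf = peak_teraflops(BLADES_PER_RACK)
system_tf = peak_teraflops(BLADES)
blades_per_chassis = BLADES // CHASSIS

print(f"blades per chassis: {blades_per_chassis}")
print(f"peak per rack:      {rack_tf:.2f} teraflops")
print(f"peak full system:   {system_tf:.2f} teraflops")
```

Under that assumption the result is consistent with the article's "more than
1.4 teraflops" per rack and its 40-teraflop peak for the final configuration.
These are theoretical peaks, not measured Linpack results.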

Hot-swappable JS20 servers also allow administrators to change servers
without disrupting applications, maximizing availability. The BladeCenter
shared-resources architecture helps to minimize power consumption and heat
output as well.

Hardware: Storage
MareNostrum's storage subsystem consists of 20 storage server nodes with 7
terabytes of capacity each, for 140 terabytes of total capacity. Its backbone
is the IBM TotalStorage DS4100 storage server which, like the BladeCenter
JS20, uses redundant hot-swappable components for high availability. IBM
TotalStorage DS4100 technology enables tremendous scalability and a wide
range of RAID data protection options.

Hardware: Switching
The interconnect consists of four Myrinet switch frames, including 10 Clos
256+256 switches and 2 Spine 1280 switches. Densely bundled Myrinet cabling
enables faster parallel processing with less switching hardware, and
redundant hot-swappable power supplies ensure greater availability. The
complete switch, with 12 chassis, provides 2,560 uniform ports. This
uniformity simplifies the programming model so researchers can focus on
their programs and not on the system interconnect architecture.
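
The port count can be sanity-checked with simple arithmetic. The sketch below
assumes that "256+256" means 256 host-facing ports plus 256 uplink ports per
Clos switch, that "1280" is the port count of each spine switch, and that
every blade and every storage node occupies exactly one host port. None of
these readings is spelled out in the article, and the names are illustrative.

```python
# Switch counts quoted in the article.
CLOS_SWITCHES = 10
SPINE_SWITCHES = 2

# Assumed reading of the product names (not stated in the article).
HOST_PORTS_PER_CLOS = 256   # ports facing compute and storage nodes
UPLINKS_PER_CLOS = 256      # ports facing the spine switches
PORTS_PER_SPINE = 1280

# Node counts quoted in the article.
COMPUTE_BLADES = 2282
STORAGE_NODES = 20

host_ports = CLOS_SWITCHES * HOST_PORTS_PER_CLOS
uplink_ports = CLOS_SWITCHES * UPLINKS_PER_CLOS
spine_ports = SPINE_SWITCHES * PORTS_PER_SPINE

# Assumption: one host port per blade and per storage node.
attached_nodes = COMPUTE_BLADES + STORAGE_NODES
spare_ports = host_ports - attached_nodes

print(f"host ports:   {host_ports}")
print(f"uplink ports: {uplink_ports} (spine ports: {spine_ports})")
print(f"spare ports:  {spare_ports}")
```

With these assumptions the 10 Clos switches supply the 2,560 uniform ports
the article mentions, and their uplinks exactly fill the two spine switches.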

Software: The power of Linux on POWER
The Linux 2.6 kernel offers an array of enterprise and performance features
that exploit the Power Architecture. The virtualization capabilities of
Linux on POWER allow for more flexible partitioning, better balancing of
workloads, and superior scalability should workloads increase. Dr. Porta
explained, "It is the Linux 2.6 kernel which offers an array of enterprise
and performance features that exploit the Power Architecture."

Software: Diskless Image Management (DIM)
DIM is a prototype utility for managing the Linux distribution for the
compute nodes on the storage servers so that the compute node does not have
to manage the root file system. All the files for operation are obtained
through the cluster network. Because of this, blades can operate immediately
without a local Linux installation; this is on-demand operation. Each blade
does have a disk drive, but it is reserved for future application use such
as checkpointing. DIM also supports the network boot environment in a highly
distributed fashion.

Software: IBM Linux on POWER clustering technologies
The goal is to endow MareNostrum with the same benefits businesses in many
industries derive from IBM Linux clusters, albeit on a larger scale. These
benefits include:

    * Superior density and improved operating efficiency, including smaller
      space, power, and cooling requirements and related costs -- thanks to
      the BladeCenter JS20 architecture
    * Record price/performance and system throughput for high-performance
      computing workloads thanks to innovative POWER semiconductor
      technology, specifically the eight-way superscalar design of the
      PowerPC 970FX processor, which fully supports symmetric
      multi-processing (SMP)
    * The ability of the leading IBM 64-bit POWER microprocessors to address
      four billion times as much physical memory as traditional 32-bit
      processors without resorting to complex memory-extension techniques
    * Better systems management control thanks to embedded service processors
      and software image management
    * Increased reliability, availability, and serviceability, as well as
      lower installation and maintenance costs -- provided by diskless
      compute nodes
    * Improved functionality and performance thanks to the Linux 2.6 kernel
    * Reduced switching hardware requirements and faster parallel processing
      provided by Myrinet switch cabling
    * Improved storage subsystem costs and reliability thanks to TotalStorage
      DS4100 storage technology
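
The "four billion times" figure in the 64-bit bullet above follows from the
ratio of the two address-space sizes. The short sketch below shows the
arithmetic, assuming flat byte addressing with full 32-bit and 64-bit
addresses. It describes the architectural limit only; real processors
typically implement fewer physical address bits than the full 64.

```python
ADDRESS_BITS_32 = 32
ADDRESS_BITS_64 = 64


def addressable_bytes(bits):
    """Number of distinct byte addresses for a given address width."""
    return 2 ** bits


# Ratio of the 64-bit address space to the 32-bit one: 2**64 / 2**32 = 2**32.
ratio = addressable_bytes(ADDRESS_BITS_64) // addressable_bytes(ADDRESS_BITS_32)

gib_32 = addressable_bytes(ADDRESS_BITS_32) // 2 ** 30

print(f"32-bit address space: {gib_32} GiB")
print(f"64-bit / 32-bit ratio: {ratio:,}")
```

The ratio is 2 to the power 32, about 4.29 billion, which is where the
rounded "four billion" comes from.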

View from the crow's nest
When the power of MareNostrum is unleashed later this year, it will be at the
service of scientific, engineering, and medical researchers in the Spanish
and international scientific communities. Its to-do list includes issues that
are familiar in the supercomputing world, such as protein folding, in silico
(computer generated) drug screening and enzymatic reactions. MareNostrum will
be used to support basic and applied research in areas that include biology,
chemistry, physics, and information-based medicine.

As Dr. Porta summed up:

    ...[T]he very thinking that drove MareNostrum's construction is a new way
of looking at compute-intensive areas, particularly in the life sciences, as
we prepare new work to resolve challenging problems in information-based
medicine -- including improvements in diagnostic and therapeutic treatments
in hospitals. In the EU context, many of the projects will be conducted in
collaboration with other leading European research institutions. We are
building collaborative efforts across geographic borders and disciplines. And
remember -- the name of the supercomputer is MareNostrum. Traditionally, it
was the Mediterranean Sea which allowed commerce and communication to
flourish in Europe and beyond. 


Resources

    * Visit the Project MareNostrum site, demonstrating the value of Linux
      clustering for science, for business, for life itself.

    * MareNostrum is now at home at the Barcelona Supercomputing Center (BSC)
      on the Polytechnic University of Catalonia (UPC) campus in Barcelona, a
      prestigious public institution focused on higher education, research,
      and technology transfer.

    * The TOP500 Supercomputer Sites project was started in 1993 to provide a
      reliable basis for tracking and detecting trends in high-performance
      computing -- twice a year, the project releases a list of the 500 sites
      operating the most powerful computer systems.

    * See this chart for the Linpack benchmark for MareNostrum and others.

    * This news article examines MareNostrum, IBM's top-ranked,
      off-the-shelf, blade-based supercomputer.

    * Connecting two or more IBM eServer Cluster Servers can create a single,
      unified computing resource that will dramatically improve availability,
      flexibility, and adaptability for essential services.

    * The IBM BladeCenter JS20 is well-suited for commercial mainstream
      applications and 64-bit high performance computing (HPC) environments.

    * The IBM Redbook, The IBM eServer BladeCenter JS20, takes an in-depth
      look at the two-way Blade eServer for applications requiring 64-bit
      computing.

    * The Linux on IBM eServer product line is Linux-enabled to deliver
      maximum performance, reliability, manageability, and price/performance
      benefits.

    * See this site for more on how IBM supercomputing solutions can help
      remove the barriers to deployment of clustered server systems.

    * IBM TotalStorage DS4000 series has been enhanced with the DS4000
      Storage Manager V9.10, enhanced remote mirror option, DS4100 option for
      larger capacity configurations, and support for EXP100 serial ATA
      expansion units.

    * Take a look at the Myrinet switches used in MareNostrum.

About the author
The developerWorks Power Architecture editors welcome your comments on this
article. E-mail them at dwpower at us.ibm.com.

Eugen* Leitl <a href="http://leitl.org">leitl</a>
ICBM: 48.07078, 11.61144            http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
http://moleculardevices.org         http://nanomachines.net