[Beowulf] Codethink jumps into the ARM server fray with Baserock Slab

Mon Sep 17 04:01:44 PDT 2012

http://www.theregister.co.uk/2012/09/13/codethink_basertock_slab_arm_system/

Codethink jumps into the ARM server fray with Baserock Slab

A Marvell-ous cluster in a box

By Timothy Prickett Morgan

Posted in Servers, 13th September 2012 23:51 GMT

The crafty engineers at embedded software development house Codethink
assembled an ARM-application build server for their own use this June, and
have now decided that you might want one, as well – so starting this week
they'll sell you a commercial version of that box called the Baserock Slab.

The reason for Codethink's entry into this biz is simple: for all the talk
about ARM-based servers out there, it's pretty tough to get your hands on a
machine with enough oomph to both handle build server tasks and be a
deployment box for scalable Linux applications.

That's what Gabriel Vizzard, VP of business development at Codethink, says
is the current state of ARM serverage. And that's why the company's
engineers, who know a thing or two about iron, built their original box,
which is now on sale.

The Baserock Slab is based on Marvell's Armada XP processor, and uses a
system-on-module design that welds everything you need for a multiprocessor
cluster onto a board, including main memory, links out to inter-node
networking, and server management modules. It's a compact design, and one
that is the shape of things to come for both low-power Atom and ARM servers
that will be commercialized in the coming years from myriad vendors.

The initial Baserock Slab servers are based on the quad-core MV78460
processors from Marvell, which acquired Intel's XScale ARM chip business a
few years back and is determined to have a go at the ARM server racket after
having some success peddling ARM chips for embedded and client devices.

The initial Baserock servers are based on the A0 stepping of the "Sheeva"
MV78460 processors, which run at 1.33GHz and support 2GB of memory per
socket. Each of the cores on the Sheeva die implements the ARMv7-A core
spec, and they have 32KB of L1 data cache and 32KB of L1 instruction cache
per core and share 2MB of L2 cache memory on the die.

Each Sheeva ARM chip also sports the VFPv3-D16 floating point unit, which
does single- and double-precision floating point math, and has cryptographic
accelerators that support DES, 3DES, AES-128, AES-256, SHA-1, and MD5
natively in the transistors.

While the Sheeva ARM cores do 32-bit processing, the core has been tweaked to
do 40-bit virtual memory addressing, and with the B0 stepping of the Sheeva
processor, Codethink will be adding a 1.6GHz option that has 4GB of memory
per socket. Larger main memory is possible, and very likely desirable to many
customers – you just use normal 64-bit DDR3-1333 memory with 8-bit error
correction in the server; nothing fancy.

This system-on-module is actually made by Cogent Computer Systems, an
embedded system maker based in Rhode Island; it's the CSB1726 module, to be
specific. At idle, this board, including main memory, burns about 12 watts.
No word on what it does under load.

Each server node has an ST Microelectronics STM32F103 microcontroller on it
as well, which does power management, system monitoring, and general I/O
handling for the node. It monitors voltages and temperature for the nodes and
handles the "warm swapping" of server nodes and their related storage.

A cluster in a box is more than just a server node, of course. Each Sheeva
chip has two SGMII ports running at 1Gb/sec with the A0 stepping and at
2.5Gb/sec with the B0 stepping. These link out to a Marvell "Cheetah 3"
98DX5156 Layer 2/3 network switching fabric that is embedded on an adjacent
board in the box.

Those SGMII trunks can be bonded for a maximum of 5Gb/sec of bandwidth coming
out of the processor into the switching fabric, which links the eight nodes
in the Baserock Slab system together and which also exposes four Gigabit
Ethernet and two 10 Gigabit Ethernet ports to the outside world.
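As a rough sanity check on those numbers, the sketch below adds up the SGMII bandwidth the eight nodes can push into the switch and compares it with the external ports. The port counts and speeds come from the article; treating the comparison as a simple oversubscription ratio is an assumption made for illustration, and the names are made up for this example.

```python
# Figures quoted in the article. The "oversubscription" framing is an
# illustrative assumption, not something Codethink or Marvell has published.
NODES = 8
SGMII_PORTS_PER_NODE = 2
SGMII_GBPS = {"A0": 1.0, "B0": 2.5}   # per-port speed by chip stepping
UPLINK_GBPS = 4 * 1.0 + 2 * 10.0      # four GbE plus two 10GbE external ports


def node_side_gbps(stepping: str) -> float:
    """Aggregate bandwidth from all nodes into the switch, in Gb/sec."""
    return NODES * SGMII_PORTS_PER_NODE * SGMII_GBPS[stepping]


for stepping in ("A0", "B0"):
    inside = node_side_gbps(stepping)
    ratio = inside / UPLINK_GBPS
    print(f"{stepping}: {inside:.0f} Gb/sec node-side, "
          f"{UPLINK_GBPS:.0f} Gb/sec external, ratio {ratio:.2f}")
```

On these figures the A0 nodes total 16Gb/sec against 24Gb/sec of external ports, while the B0 nodes total 40Gb/sec, so only the faster stepping could in principle saturate the uplinks.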

That Cheetah 3 managed switch chip from Marvell is no slouch, either. It can
drive up to 24 SGMII ports at 2.5Gb/sec in full-duplex mode and can process
119 million packets per second of Layer 2 traffic at wire speed. It can do
4,096 active VLANs, supports Layer 2 multicast groups, and has jumbo 10KB
frame support as well as integrated buffer and control memories.

Codethink's Baserock Slab eight-node ARM system

In addition to the eight server nodes in the Baserock Slab, the machine also
includes an out-of-band management controller, which is itself based on a
variant of the Sheeva processor from Marvell. In this case, it is the Armada
300, a single-core chip (number 88F6282 in the Marvell catalog) that is based
on the ARMv5TE core design.

This processor has more modest 16KB L1 data and instruction caches and only
256KB of L2 cache for its one core; it has 1GB of DDR3-800 memory and 512MB
of SLC flash memory soldered onto it and two 100Mbit management ports for
talking slowly to system admins. This board burns about 4 watts at idle, and
is also made by Cogent (part number CSB1724 if you want to buy it and build
your own); it provides network boot services, upgrades to the BIOS on each
node, and other common lights-out management features.

For storage, the Baserock Slab system uses Nocti MLC mSATA flash chips from
OCZ as high-speed storage for a Linux operating system and for quick access
to files for specific workloads. Codethink is offering 30GB, 60GB, and 120GB
mSATA units for the server nodes and is using the Sandforce 2141/2181 SATA II
controller to talk to the flash.

These flash units have a 0.1 millisecond seek time and can do 280MB/sec
reading and 255MB/sec writing data; they burn 0.5 watts when idle and 1.7
watts when active, and have a mean time between failures of 2 million hours.
Each server node has one mSATA unit by default. But there is one regular SATA
port hanging off each processor card if you want to plug a real disk drive in
for fatter storage.

The Baserock Slab management controller

The Baserock Slab chassis is the standard 19 inches wide and 1U high, but is
only 14 inches deep. That means you can pack them back-to-back and get as
many as 76 units into a single rack. (It is not clear why it's not 84 units,
but it is probably to leave room for aggregation switches at the top of the
rack.) That is up to 2,432 cores (76 units x 8 nodes x 4 cores) of compute
capacity in a single rack.
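The density arithmetic is easy to check. This sketch multiplies out the unit, node, and core counts given in the article; it counts only the quad-core compute nodes and leaves out the single-core management controller in each chassis, which is an assumption about how the rack total is meant to be read.

```python
# Counts taken from the article; management-controller cores are excluded
# (an assumption about what the per-rack figure is meant to cover).
UNITS_PER_RACK = 76
NODES_PER_UNIT = 8
CORES_PER_NODE = 4


def cores_per_rack(units: int = UNITS_PER_RACK) -> int:
    """Compute cores in a rack holding the given number of Slab units."""
    return units * NODES_PER_UNIT * CORES_PER_NODE


print(cores_per_rack())    # the 76-unit configuration
print(cores_per_rack(84))  # hypothetical fully packed rack with 84 units
```

With 76 units that comes to 2,432 cores; filling all 84 back-to-back slots would give 2,688.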

Codethink is just starting shipments of the Baserock Slab, which supports its
own homegrown Baserock Embedded Linux variant for ARM servers as well as the
stock Debian Linux if you want to roll your own. Red Hat Fedora and Canonical
Ubuntu Linuxes, which are working on better support for ARM processors, will
eventually be loadable on the Slab.

A fully loaded Slab, with all eight processors and the Baserock Linux stack,
runs $10,000. That's $1,250 per node, including switching and systems
software.
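For anyone who wants the pricing broken down further, here is a minimal sketch that divides the quoted system price by nodes and by cores. The $10,000 price and the node and core counts are from the article; the per-core figure is simply derived from them and is not a quoted price.

```python
# Price and counts as quoted in the article; per-core cost is derived here.
SYSTEM_PRICE_USD = 10_000
NODES_PER_UNIT = 8
CORES_PER_NODE = 4


def price_per_node() -> float:
    """System price divided across the eight server nodes."""
    return SYSTEM_PRICE_USD / NODES_PER_UNIT


def price_per_core() -> float:
    """System price divided across all compute cores in the chassis."""
    return SYSTEM_PRICE_USD / (NODES_PER_UNIT * CORES_PER_NODE)


print(f"per node: ${price_per_node():,.2f}")
print(f"per core: ${price_per_core():,.2f}")
```

That works out to $1,250 per node, matching the article, and $312.50 per core, switching and systems software included.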

What everyone wants to know, of course, is how these Sheeva compute nodes and
clusters stack up on real-world work. Being an embedded Linux software house,
Codethink cares about exactly that. And a lot of ARM developers have been
compiling in an emulated ARM environment on top of x86 machines, which sucks.

"In our tests, we've found that native compilation on each core of the Armada
XP at 1.33GHz is almost exactly twice as fast as a Core i7 core running at
3.4GHz and doing the same compilation under QEMU," says Vizzard.

While that's interesting for ARM developers, what we all want to know is how
a LAMP stack, a Hadoop job, and Memcached all stack up on a node-by-node and
cluster-by-cluster basis with native code running on both ARM and x86 iron
(Atom, low-voltage Xeon, and regular Xeon would be nice). Hopefully Codethink
will sit down with ARM Holdings and work to get such tests done. We are all
waiting, not exactly patiently. Dell and HP, which are both monkeying around
with ARM servers, don't seem to be in any hurry to do such rigorous tests