[Beowulf] Small form computers as cluster nodes - any comments about the Shuttle brand ?
Bogdan Costescu
Bogdan.Costescu at iwr.uni-heidelberg.de
Sun Aug 9 06:50:59 PDT 2009
On Fri, 7 Aug 2009, David Ramirez wrote:
> Due to space constraints I am considering implementing a 8-node (+ master)
> HPC cluster project using small form computers. Knowing that Shuttle is a
> reputable brand, with several years in the market, I wonder if any of you
> out there has already used them on clusters and how has been your experience
> (performance, reliability etc.)
I've built a cluster of 80 nodes, which will turn 5 this month. It uses
the Shuttle SB75G2, which supports ECC and has GigE on board (Broadcom).
The power supply is more than enough for the CPU (PIV Northwood 3.2GHz),
one SATA HDD, a low-power, low-performance graphics card (there's no
on-board graphics, unfortunately) and an extra GigE card (Intel E1000).
The decision for adding an extra NIC was not due to problems with the
Broadcom chip, but simply to have dedicated networks; the Broadcom is
able to do PXE just fine and this is the way these nodes have booted
since setting them up.
I was pleasantly surprised by the reliability of these computers.
Given how tightly packed they are, they require attention and good
skills when building them, e.g. using good quality thermal paste to
avoid local hot spots and routing cables so that they don't obstruct
the airflow that carries the heat away. About 70 of the 80 are still
running well today. Most of the failed ones stopped working correctly
after the 3 years of warranty, so I didn't make much effort to find out
what was wrong; the main symptom was instability under combined CPU and
I/O load. Of course, when RAM and HDDs failed and were easy to
recognize as causes, they were replaced as needed.
As I wrote earlier on this list, the main disadvantage of such SFFs is
the lack of IPMI support. There is no serial console support in the
BIOS, so changing BIOS settings is a pain. Power control can be
achieved with a PDU, but I didn't go this way because I knew that
the nodes should always be up and I wouldn't have to press the power
buttons too often ;-) Another thing to keep in mind is that, because
they are so tightly packed, they are quite sensitive to the ambient
temperature: if the A/C fails, expect a sharp rise in internal
temperature, so setting up monitoring, both environmental and of the
built-in sensors, is recommended.
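
To illustrate the sensor monitoring part, here is a minimal sketch in
Python, not what I actually run on the cluster. It assumes a Linux node
whose sensor driver exposes temperatures through the kernel hwmon sysfs
interface as temp*_input files holding millidegrees Celsius. The
function names and the 60 C limit are made up for the example, and on
some kernels the files sit one level deeper (hwmon*/device/), so adjust
the pattern to what your nodes show.

```python
import glob
import os


def read_temps(base="/sys/class/hwmon"):
    """Return {sensor_file: degrees_C} for each temp*_input file under base."""
    temps = {}
    pattern = os.path.join(base, "hwmon*", "temp*_input")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                # hwmon reports temperatures in millidegrees Celsius
                temps[path] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            # sensor unreadable or returned garbage; skip it
            continue
    return temps


def overheated(temps, limit_c=60.0):
    """Return only the sensors at or above limit_c (an assumed threshold)."""
    return {path: t for path, t in temps.items() if t >= limit_c}


if __name__ == "__main__":
    for path, t in overheated(read_temps()).items():
        print(f"WARNING {path}: {t:.1f} C")
```

Run from cron on every node and mail or log the output; the
environmental side (room temperature, A/C status) needs a separate
probe, which this sketch does not cover.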
Good luck!
--
Bogdan Costescu
IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
Phone: +49 6221 54 8240, Fax: +49 6221 54 8850
E-mail: bogdan.costescu at iwr.uni-heidelberg.de