[Beowulf] recommendations for a good ethernet switch for connecting ~300 compute nodes

Joe Landman landman at scalableinformatics.com
Wed Sep 2 21:15:41 PDT 2009

Rahul Nabar wrote:

> That brings me to another important question. Any hints on speccing
> the head-node? Especially the kind of storage I put in on the head

For a cluster of this size, divide and conquer.  Use the head node for 
cluster admin.  Create separate login nodes where users handle builds, 
job submission, etc.

> node. I need around 1 Terabyte of storage. In the past I've used
> RAID5+SAS in the server. Mostly for running jobs that access their I/O
> via files stored centrally.

Hmmm... We don't recommend burdening the head node with storage, except 
for very small clusters, where it is a bit more cost effective.

How your nodes do IO for your jobs will dictate how your IO needs to be 
designed.  If all nodes will do IO, then you need 
something that can handle *huge* transients from time to time.  If one 
node does IO, you need just a good fast connection.  Is GbE enough?  How 
much IO are we talking about?
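
One rough way to frame the GbE question is a back-of-envelope calculation. 
The sketch below is only illustrative: the 125 MB/s figure is the 
theoretical peak of a single 1 Gb/s link (real throughput is lower), the 
300 node count comes from the thread subject, and the 100 MB written per 
node is a made-up placeholder, not a measurement.  The function names are 
hypothetical.

```python
# Back-of-envelope sizing for a single storage link shared by many nodes.
# All numbers here are assumptions for illustration, not measurements.

GBE_MB_PER_S = 125.0  # theoretical peak of 1 Gb/s expressed in MB/s


def per_node_share(link_mb_per_s: float, active_nodes: int) -> float:
    """MB/s each node gets if active_nodes split one link evenly."""
    if active_nodes < 1:
        raise ValueError("active_nodes must be at least 1")
    return link_mb_per_s / active_nodes


def drain_time_s(per_node_mb: float, active_nodes: int,
                 link_mb_per_s: float) -> float:
    """Seconds to move per_node_mb from every active node over one link."""
    if link_mb_per_s <= 0:
        raise ValueError("link_mb_per_s must be positive")
    return per_node_mb * active_nodes / link_mb_per_s


if __name__ == "__main__":
    burst_mb = 100.0  # hypothetical amount each node writes in one burst
    for nodes in (1, 30, 300):
        share = per_node_share(GBE_MB_PER_S, nodes)
        wait = drain_time_s(burst_mb, nodes, GBE_MB_PER_S)
        print(f"{nodes:4d} nodes: {share:8.2f} MB/s each, "
              f"{wait:7.1f} s to drain a {burst_mb:.0f} MB/node burst")
```

Plug in your own per-node IO volume and the number of nodes that write at 
the same time.  If the drain time is a noticeable fraction of your job run 
time, a single GbE link into the storage is probably not enough.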

Bad storage design can make a nice new 300 node cluster seem very slow.

> For muscle I was thinking of a Nehalem E5520 with 16 GB RAM. Should I
> boost the RAM up? Or any other comments. It is tricky to spec the
> central node.

Head node: from a management perspective (name service, dhcp/tftp/pxe, 
authentication/gateway, status monitor, etc.) this can be relatively light.

Login node(s): should have sufficient RAM/CPU for builds.

Storage node(s): should be built with thought towards the IO patterns.

> Or is it more advisable to go for storage-box external to the server
> for NFS-stores and then figure out a fast way of connecting it to the
> server. Fiber perhaps?

Start with your IO patterns, your IO volume, and how many jobs are running 
at once.  Once you have this, move on to figuring out capacity needs and 
availability needs (replication, fast home vs fast scratch + slow home).

Avoid worrying about specific technologies until you have a better handle 
on how the storage will be used.  The use cases will suggest the 
technologies you should consider.

We are biased (given what we build, sell and support) of course.

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
