[Beowulf] recommendations for a good ethernet switch for connecting ~300 compute nodes

Joe Landman landman at scalableinformatics.com
Thu Sep 3 05:58:14 PDT 2009


Rahul Nabar wrote:
> On Wed, Sep 2, 2009 at 11:15 PM, Joe Landman
> <landman at scalableinformatics.com> wrote:
>> Rahul Nabar wrote:
> 
>> For a cluster of this size, divide and conquer.  Head node to handle cluster
>> admin.  Create login nodes for users to access to handle builds, job
>> submission, etc.
> 
>> Hmmm... We don't recommend burdening the head node with storage, except
>> for very small clusters, where it is a bit more cost effective.
> 
> Thanks Joe! My total number of users is relatively small: ~50, with
> rarely more than 20 concurrently logged-in users. Of course, each user
> might have multiple shell sessions.
> 
> So the experts would recommend three separate central nodes?
> 
> Login node
> Management node (dhcp / schedulers etc.)
> Storage node

You can add more login nodes as you need them.  Management nodes for the 
cluster stack (if any) can be fairly simple.

The storage node is a function of your IO patterns.

For really large clusters, you'd separate out the scheduler and some of 
the other functions as well.  Mark Hahn and other folks on this list run 
some of the largest clusters out there, and they have good advice for 
anyone scaling up.
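
As a rough sketch, the management node at this scale often needs little 
more than dhcp and PXE/tftp for the compute nodes, plus the scheduler.  
Something like the dnsmasq snippet below would cover the dhcp/PXE part 
(the interface name and address range are placeholders, adjust for your 
network):

  # /etc/dnsmasq.conf -- hypothetical cluster management node
  interface=eth1                          # cluster-facing NIC (placeholder)
  dhcp-range=10.1.0.100,10.1.3.254,12h    # compute node pool (placeholder)
  dhcp-boot=pxelinux.0                    # PXE boot loader served to nodes
  enable-tftp
  tftp-root=/var/lib/tftpboot

The scheduler (Torque, SGE, or whatever you run) can live on the same 
box at this size.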

> Or more?
> 
>> How your nodes do IO for your jobs will dictate how your IO needs to be
>> designed.  If all nodes do IO, then you need something that can handle
>> *huge* transients from time to time.  If only one node does IO, you just
>> need one good fast connection.  Is GbE enough?  How much IO are we
>> talking about?
> 
> I did my economics, and on the compute nodes I am stuck with GbE,
> nothing more. If this becomes a totally unworkable proposition I'll be
> forced to split into smaller clusters. 10GbE, Myrinet, and Infiniband
> just do not make economic sense for us. On the central nodes, though, I
> can afford to have better interconnects. Should I? Of what type?

It might be worth asking what your targeted per-node budget is.  24 port 
SDR IB switches are available, and relatively inexpensive.  SDR PCIe 
HCAs are likewise available and relatively inexpensive.  Jeff Layton (a 
great resource BTW) wrote about them last year.  We've used them in a 
number of designs.  Not rock bottom in latency, but we have customers 
using our storage over NFS over RDMA at 500+ MB/s with them, so it's 
not too bad.
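
If you go that route, the client side of NFS over RDMA is basically just 
a mount option.  Roughly (the server name, export, and mount point below 
are placeholders; 20049 is the usual NFS/RDMA port):

  # load the client-side RDMA transport, then mount over the IB fabric
  modprobe xprtrdma
  mount -t nfs -o rdma,port=20049 storage1:/export/scratch /scratch

The server side needs the svcrdma module and an export reachable over 
the IPoIB interface, but it is not much harder to set up than plain NFS.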



-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
        http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
