[Beowulf] cluster building advice?

Gus Correa gus at ldeo.columbia.edu
Mon Sep 17 13:03:03 PDT 2012


On 09/16/2012 05:52 PM, Jeffrey Rossiter wrote:
> Hello everyone!
>
> I am getting started on a cluster building project at my university. We
> just replaced all of our lab machines so I am going to be using the old
> machines to rebuild our cluster. The intention is for the system to be
> used for scientific computation. I am trying to decide on a linux
> distribution to use. Does it matter all that much? Any advice would be
> greatly appreciated. Book suggestions would help too. I am waiting to
> receive Building Clustered Linux Systems
> <http://www.amazon.com/Building-Clustered-Linux-Systems-Robert/dp/0131448536/ref=sr_1_20?s=books&ie=UTF8&qid=1347832213&sr=1-20&keywords=building+linux+clusters>
> by Robert W. Lucke
> <http://www.amazon.com/Robert-W.-Lucke/e/B001IQXRT6/ref=sr_ntt_srch_lnk_20?qid=1347832213&sr=1-20>
> but my advisor for the project is concerned that it may be out of date
> for what we are doing. Please share your ideas. Thanks!
>
> -Jeffrey Rossiter
>
>
>

If yours is a computer science project to learn about
clusters, the list archives are worth searching:

http://www.beowulf.org/

**

On the other hand, if the goal is just to deploy the
cluster quickly and use it for scientific computation,
you can simply use Rocks [although some people frown on it]
and be up and running in a very short time:

http://www.rocksclusters.org/wordpress/

They have decent documentation on the hardware requirements
and how to set up the cluster:

http://www.rocksclusters.org/roll-documentation/base/5.5/

It requires a specific Linux distribution, tied to the
Rocks version. [The current Rocks 6.0 uses CentOS 6.2,
replaceable by RHEL 6.2 or Scientific Linux 6.2, IIRC.]
Most system administration tasks are handled [and sometimes
must be handled exclusively] by their "rocks" command,
which some people like and some don't.

**

Literature:
Douglas Eadline already pointed out ClusterMonkey:
http://www.clustermonkey.net/
and there is also Robert G. Brown's [2004 ?] book:
http://www.phy.duke.edu/~rgb/Beowulf/beowulf_book.php

**

Other things to think about, since you're cannibalizing
old computers:

1. How homogeneous is the hardware: all x86 or x86_64,
how much memory and disk capacity,
what type of network adapter [100BASE-T Ethernet,
Gigabit Ethernet; InfiniBand is unlikely
if the machines are old]?

The more homogeneous the machines are,
the easier it is to cluster them.

2. Network switch
[which depends on the network adapters in your machines]

Do you have a switch [Ethernet/GigE, other] to connect the machines?

Even a SOHO-type switch may work, although with poor performance.

3. Cluster deployment/maintenance [if you don't want to use Rocks]

http://warewulf.lbl.gov/trac
http://www.perceus.org/
http://xcat.sourceforge.net/

or DIY

4. Job scheduler to use:

Torque:
http://www.adaptivecomputing.com/products/open-source/torque/

SGE:
http://gridscheduler.sourceforge.net/

Slurm:
https://computing.llnl.gov/linux/slurm/

There are others, mostly commercial.


5. MPI [if you're doing parallel processing - most likely]

Open MPI [Ethernet, GigE, InfiniBand, Myrinet, etc.]
http://www.open-mpi.org/

MPICH2 [Ethernet/GigE, ...]
http://www.mcs.anl.gov/research/projects/mpich2/

MVAPICH2 [for InfiniBand]
http://mvapich.cse.ohio-state.edu/overview/mvapich2/
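
On item 1 above (hardware homogeneity), a small script can help take
inventory of the old machines before you commit to a layout. The
following is only a rough sketch, not part of any of the tools listed
here. It assumes Linux nodes with Python available, reads memory from
/proc/meminfo, and the names node_summary and odd_nodes are made up
for illustration. How the per-node summaries get collected in one
place [ssh, a shared directory, by hand] is left open.

```python
import os
import platform
from collections import Counter


def node_summary():
    """Collect a few hardware facts about the machine this runs on."""
    mem_kb = None
    try:
        # /proc/meminfo is Linux-specific; it has a line "MemTotal: N kB".
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    mem_kb = int(line.split()[1])
                    break
    except OSError:
        pass  # not Linux, or /proc unavailable: leave memory unknown
    return {
        "host": platform.node(),
        "arch": platform.machine(),  # e.g. "i686" or "x86_64"
        "cpus": os.cpu_count(),
        "mem_kb": mem_kb,
    }


def odd_nodes(summaries, keys=("arch", "cpus", "mem_kb")):
    """Return hosts whose hardware differs from the most common configuration."""
    if not summaries:
        return []
    configs = Counter(tuple(s[k] for k in keys) for s in summaries)
    common = configs.most_common(1)[0][0]
    return [s["host"] for s in summaries if tuple(s[k] for k in keys) != common]


if __name__ == "__main__":
    # Run on each node and gather the printed dictionaries in one place.
    print(node_summary())
```

Running it on every machine and passing the collected dictionaries to
odd_nodes gives the list of hosts that do not match the majority
configuration, which are the ones to look at first.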

**

I hope this helps,
Gus Correa


