[Beowulf] Re: need advice on buying a small cluster (fwd from salathe at washington.edu)
eugen at leitl.org
Fri Apr 15 03:45:18 PDT 2005
----- Forwarded message from Eric Salathe <salathe at washington.edu> -----
From: Eric Salathe <salathe at washington.edu>
Date: Thu, 14 Apr 2005 15:01:31 -0700
To: scitech at lists.apple.com
Subject: Re: need advice on buying a small cluster
X-Mailer: Apple Mail (2.619.2)
On Apr 14, 2005, George Duncan wrote:
>I am thinking about purchasing up to four G5 machines
>with say 2 GB per node.
As others have pointed out, a lot of this depends on what sort of
models you are running. We run climate models, which are probably less
parallel than what I imagine you run in physics research. We use
non-Apple systems, but general considerations are the same. I run MPICH
jobs on around 10 nodes. The domain is split up among each process, so
RAM is a lot less than would be needed on a single node. We are fine
with 1 GB RAM per node. You may want to start with 1 GB and add more
only if necessary.
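
A rough way to see why per-node RAM drops when the domain is split across processes is sketched below. The function name, the 8 GB total, and the 10% halo overhead are illustrative assumptions, not figures from our models.

```python
def memory_per_node_gb(total_domain_gb, nodes, halo_overhead=0.1):
    """Estimate RAM per node when a domain is divided evenly among nodes.

    total_domain_gb: memory the whole problem would need on one node (assumed).
    nodes:           number of nodes sharing the domain.
    halo_overhead:   assumed extra fraction for boundary (halo) cells and
                     buffers that each process keeps in addition to its share.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return total_domain_gb / nodes * (1.0 + halo_overhead)


if __name__ == "__main__":
    # Hypothetical problem needing 8 GB on a single node, spread over 10 nodes.
    print(memory_per_node_gb(8.0, 10))
```

The overhead term is a placeholder; measure it on your own model before sizing RAM.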
>I am wondering whether it makes any sense to do so?
Which part? A small cluster versus some other architecture or G5 versus
some other processor? The real question is: What is the fastest machine
you can afford to buy to perform the computations you want to do? For a
few $10K, small clusters are the obvious solution for most people. Does
a G5 cluster make sense compared to Xeon or Opteron? That depends on
how your model benchmarks on these systems and how the ancillary costs
(system administration, computer room) work out at your location for
the various options. System cost depends on what deals your university
can get with HP, Dell, or Apple. The comparison is difficult since you
might be comparing similar-sized Opteron and Xserve clusters to a much
larger Xeon cluster.
>Will I need fast interconnects?
For your apparent budget ($15,000), gigabit is the only thing that
makes sense. Otherwise, your interconnect will reduce the number of
nodes by so much that you won't need an interconnect! It is a balance
between the marginal performance benefit of adding more nodes versus
adding faster interconnect. Myrinet will run you $6000 for 4 nodes (the
cost of two Xserves); an unmanaged 8-port gigabit switch should be only
a couple hundred bucks. I'd be willing to bet that 4 nodes with gigabit
will be faster than two nodes with Myrinet for the majority of
We use gigabit, and there is considerable communication. However, it
appears that, for smaller clusters (under 20 or so nodes), the problem
is broken into large enough parts that communication is small compared
to computation in a given thread. Myrinet gives a small, but not
overwhelming increase in performance. As a problem gets broken into
more smaller parts (i.e., on a bigger cluster) you'd start to see less
marginal benefit from adding another node and more benefit of better
interconnect. It would be highly problem dependent, however.
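
The trade-off between more nodes and a faster interconnect can be sketched with a toy timing model. Everything here is an assumption for illustration: the function, the 10-second serial step, the 50 MB of halo traffic per step, and the round bandwidth and latency figures are placeholders, not benchmarks. The model also holds per-node traffic constant as nodes are added, which is a simplification.

```python
def step_time(nodes, serial_compute_s, halo_bytes, bandwidth_bytes_per_s,
              latency_s, messages_per_step=8):
    """Toy estimate of wall time for one model time step.

    Compute time divides evenly among nodes; communication time is
    message latency plus halo data divided by link bandwidth.
    """
    compute = serial_compute_s / nodes
    if nodes == 1:
        return compute  # no exchange needed on a single node
    comm = messages_per_step * latency_s + halo_bytes / bandwidth_bytes_per_s
    return compute + comm


# Illustrative link parameters (assumed round numbers, not measurements).
GIGABIT = {"bandwidth_bytes_per_s": 125e6, "latency_s": 50e-6}
MYRINET = {"bandwidth_bytes_per_s": 250e6, "latency_s": 10e-6}

if __name__ == "__main__":
    work = {"serial_compute_s": 10.0, "halo_bytes": 50e6}
    print("4 nodes, gigabit:", step_time(4, **work, **GIGABIT))
    print("2 nodes, Myrinet:", step_time(2, **work, **MYRINET))
```

With a large compute term, the node count dominates; shrink serial_compute_s or raise halo_bytes and the faster link starts to win. Only a benchmark of your own code tells you which regime you are in.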
>What is really needed and what do they cost?
All you really need are the servers, a gigabit switch, and a rack. You
might want a larger head node with RAID storage. You do not need
auxiliary power or UPS; if you lose power, restart your job. Most
models output periodic restart files to account for system failure. For
a small cluster, you most likely do not need special cooling or
upgraded electrical supply, but you don't want to stick this under your
As far as cost, ask Apple for a quote. You may also want comparisons
from HP, Dell, or IBM.
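
For anyone unfamiliar with the restart files mentioned above, here is a minimal sketch of the idea. The file layout, function names, and the trivial "model" are all hypothetical; real climate or physics codes have their own restart formats.

```python
import json
import os


def save_restart(path, state):
    """Write the restart file via a temp file so a crash mid-write
    cannot leave a half-written restart behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def load_restart(path):
    """Resume from the last restart file, or start from scratch."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "value": 0.0}


def run(path, total_steps, restart_every, stop_at=None):
    """Advance a stand-in model, writing a restart every few steps.

    stop_at simulates a power loss: the run simply returns at that step
    without saving, so work since the last restart file is lost.
    """
    state = load_restart(path)
    while state["step"] < total_steps:
        if stop_at is not None and state["step"] >= stop_at:
            return state
        state["value"] += 1.0  # placeholder for one real time step
        state["step"] += 1
        if state["step"] % restart_every == 0:
            save_restart(path, state)
    return state
```

After a failure, calling run() again with the same path picks up from the last saved step, so the cost of an outage is at most restart_every steps of recomputation.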
>And with such a small 'cluster' what is the community's view of
>whether real work can be done?
Real work can be done with a pad of yellow paper and a pencil. We tend
to forget that. It is a question of whether your research objective and
resources are compatible.
>I know that this last question is difficult to answer, but would
Actually, the last one was easy!
Climate Impacts Group <salathe at washington.edu>
University of Washington
----- End forwarded message -----
Eugen* Leitl http://leitl.org
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE