On Apr 14, 2005, George Duncan wrote:
>I am thinking about purchasing up to four G5 machines
>with say 2 GB per node.

As others have pointed out, a lot of this depends on what sort of 
models you are running. We run climate models, which are probably less 
parallel than what I imagine you run in physics research. We use 
non-Apple systems, but general considerations are the same. I run MPICH 
jobs on around 10 nodes. The domain is split up among the processes, so 
each node needs a lot less RAM than a single node running the whole 
problem would. We are fine with 1 GB RAM per node. You may want to 
start with 1 GB and add more only if necessary.
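
As a rough illustration of why splitting the domain cuts the RAM needed 
per node, here is a minimal Python sketch. The grid size, number of 
variables, halo width, and process layout are all made-up numbers for 
illustration, not figures from our model.

```python
def per_process_mb(nx, ny, nz, nvars, procs_x, procs_y,
                   halo=1, bytes_per_value=8):
    """Approximate memory (MB) held by one process for its subdomain.

    All arguments are hypothetical: an nx x ny x nz grid with nvars
    double-precision fields, split horizontally over procs_x x procs_y
    processes, each padded with `halo` ghost cells on every side.
    """
    # Ceiling division gives the largest tile any process receives.
    local_nx = -(-nx // procs_x) + 2 * halo
    local_ny = -(-ny // procs_y) + 2 * halo
    return local_nx * local_ny * nz * nvars * bytes_per_value / 1e6


# Whole domain on one node (no halo needed) versus a 5 x 2 split.
single = per_process_mb(400, 400, 40, 50, 1, 1, halo=0)
split = per_process_mb(400, 400, 40, 50, 5, 2)
print(f"single node: {single:.0f} MB, one of 10 processes: {split:.0f} MB")
```

With these assumed numbers the per-process footprint drops by roughly 
the number of processes, with a small overhead for the halo cells.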

>I am wondering whether it makes any sense to do so?

Which part? A small cluster versus some other architecture or G5 versus 
some other processor? The real question is: What is the fastest machine 
you can afford to buy to perform the computations you want to do? For a 
few $10K, small clusters are the obvious solution for most people. Does 
a G5 cluster make sense compared to Xeon or Opteron? That depends on 
how your model benchmarks on these systems and how the ancillary costs 
(system administration, computer room) work out at your location for 
the various options. System cost depends on what deals your university 
can get with HP, Dell, or Apple. The comparison is difficult since you 
might be comparing similar-sized Opteron and Xserve clusters to a much 
larger Xeon cluster.

>Will I need fast interconnects?

For your apparent budget ($15,000), gigabit is the only thing that 
makes sense. Otherwise, your interconnect will reduce the number of 
nodes by so much that you won't need an interconnect! It is a balance 
between the marginal performance benefit of adding more nodes versus 
adding a faster interconnect. Myrinet will run you $6000 for 4 nodes (the 
cost of two Xserves); an unmanaged 8-port gigabit switch should be only 
a couple hundred bucks. I'd be willing to bet that 4 nodes with gigabit 
will be faster than two nodes with Myrinet for the majority of 

We use gigabit, and there is considerable communication. However, it 
appears that, for smaller clusters (under 20 or so nodes), the problem 
is broken into large enough parts that communication is small compared 
to computation in a given thread. Myrinet gives a small, but not 
overwhelming, increase in performance. As a problem gets broken into 
more, smaller parts (i.e., on a bigger cluster), you'd start to see less 
marginal benefit from adding another node and more benefit from a better 
interconnect. It would be highly problem dependent, however.
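
A toy model shows the trend. This sketch assumes a square 2-D grid 
split into square tiles, with computation proportional to the tile 
area and communication proportional to the tile perimeter (the halo 
exchanged with neighbours). Real models differ, and the grid size here 
is made up.

```python
import math


def comm_to_compute_ratio(n, nprocs):
    """Halo cells exchanged per interior cell computed, for one tile.

    Assumes an n x n grid split into sqrt(nprocs) x sqrt(nprocs) square
    tiles; nprocs must be a perfect square. Purely illustrative.
    """
    p = math.isqrt(nprocs)
    if p * p != nprocs:
        raise ValueError("nprocs must be a perfect square in this toy model")
    tile = n / p             # edge length of one tile
    interior = tile * tile   # work per step scales with tile area
    halo = 4 * tile          # data exchanged scales with tile perimeter
    return halo / interior


for nprocs in (4, 16, 64):
    print(nprocs, comm_to_compute_ratio(400, nprocs))
```

In this model the ratio doubles each time the process count is 
quadrupled, so the interconnect matters more as the tiles shrink.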

>What is really needed and what do they cost?

All you really need are the servers, a gigabit switch, and a rack. You 
might want a larger head node with RAID storage. You do not need 
auxiliary power or a UPS; if you lose power, restart your job. Most 
models output periodic restart files to account for system failure. For 
a small cluster, you most likely do not need special cooling or 
upgraded electrical supply, but you don't want to stick this under your 

As for cost, ask Apple for a quote. You may also want comparisons 
from HP, Dell, or IBM.

>And with such a small 'cluster' what is the community's view of 
>whether real work can be done?

Real work can be done with a pad of yellow paper and a pencil. We tend 
to forget that. It is a question of whether your research objective and 
resources are compatible.

>I know that this last question is difficult to answer, but would 
>appreciate response.

Actually, the last one was easy!


Eric Salathe
Climate Impacts Group             <salathe at washington.edu>
University of Washington          

