Question about clusters

Robert G. Brown rgb at phy.duke.edu
Fri Feb 7 10:39:32 PST 2003


On Fri, 7 Feb 2003, Joe Griffin wrote:

> What kind of units do you want?
> 
> MFLOPS-Bytes?
> 
> 
> 
> KNT wrote:
> 
> >	Greetz!
> >I wanted to ask if there's a way of theoretically calculating a cluster's
> >power with a mathematical formula, based on each node's processor type, RAM,
> >etc.? Assuming also that the components of each node can be different.
> >
> >						-Thanks from the above
> >							  KNT

There are LOTS of theoretical ways, the simplest one being a simple
aggregate of the individual node "power" by whatever measure you like.
This is even a reasonable one for an embarrassingly parallel task with
long runtime, small system footprint, and small I/O and overhead
requirements, e.g. SETI or RC5.  For other tasks, this is completely
meaningless.
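
To make the trivial aggregate measure concrete, here is a toy sketch in
Python (the per-node numbers are invented for illustration, not real
benchmark results):

  # "Aggregate power" is just the sum of whatever per-node figure you
  # trust -- peak MFLOPS, a LINPACK result, jobs per hour, whatever.
  node_mflops = [933, 933, 1200, 1400, 1400, 2000]  # heterogeneous nodes
  print("aggregate:", sum(node_mflops), "MFLOPS")

Of course, this number means something only to the extent that your task
really does scale as a simple sum of node throughputs.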

To get a USEFUL measure of the power of a cluster on YOUR PROBLEM, one
can proceed theoretically, but the answer will depend on the details of
the work being done and how it is parallelized.

To BEGIN to understand at least the most important components of a
parallelized task and how their individual timings affect the parallel
scaling of the work done, split up among many nodes, you might look at
the first few chapters of my online beowulf book:

 http://www.phy.duke.edu/brahma/beowulf_online_book/

Especially focus on Amdahl's law and its generalizations.
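
If it helps to see Amdahl's law as something you can play with
numerically, here is a minimal sketch (my own toy parameterization, not
the notation used in the book): s is the serial fraction of the work, P
the number of nodes, and c an assumed per-node communication cost as a
fraction of the single-node runtime.

  # Amdahl's law: with T(1) = 1, T(P) = s + (1 - s)/P, so the speedup is
  # 1 / (s + (1 - s)/P).  The extra c*P term is a crude stand-in for
  # communication overhead that grows with the number of nodes.
  def speedup(s, P, c=0.0):
      return 1.0 / (s + (1.0 - s) / P + c * P)

  for P in (1, 2, 4, 8, 16, 32, 64):
      print(P, "nodes:", round(speedup(0.05, P, c=0.001), 2))

Even with only 5% serial work and a tiny communication cost, the speedup
tops out far below the node count and eventually turns back down -- which
is the whole point of the exercise.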

However, many tasks are sufficiently complex that estimating parallel
speedup theoretically for a given node and network design is very
difficult and prone to error; it just isn't worth it.  The best way to
proceed is to empirically measure all sorts of things -- ideally the
parallel speedup itself, but sometimes that leaves you with a
chicken-and-egg problem if you're trying to design a cluster that will work
effectively for some particular problem -- and then make your estimates
from an understanding of the basic ideas in the book and the explicit
measurements of task times on your possible hardware.  This is still a
bit risky -- there are lots of nonlinearities and superlinearities in
computer performance as the size, stride, and communications pattern of
a program are varied across the memory, cache, and bus subsystems, and
scaling up to "production" can sometimes lead to pleasant or unpleasant
surprises.
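
As a purely hypothetical example of turning a couple of such measurements
into an estimate, suppose you time the computation for one work unit on a
candidate node and the cost of shipping one unit's worth of data over the
candidate network, and assume (crudely) that the communication serializes
at a master node:

  # All numbers are invented; substitute your own measurements.
  t_compute = 2.40   # seconds of computation per work unit on one node
  t_comm    = 0.15   # seconds of serialized communication per work unit
  n_units   = 10000  # work units in a production run

  t_serial = n_units * t_compute
  for P in (2, 4, 8, 16, 32, 64):
      t_parallel = (n_units / P) * t_compute + n_units * t_comm
      print(P, "nodes: estimated speedup", round(t_serial / t_parallel, 2))

In this particular model the speedup can never exceed t_compute/t_comm
(16 here) no matter how many nodes you buy, which is exactly the kind of
thing you'd like to know BEFORE buying them -- while remembering that the
cache and memory effects above can push the real numbers around in either
direction.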

A last thing to note is that there are often many ways to parallelize a
given task, and some may be better than others.  Some may be MUCH better
than others, as in the code will scale "well" for one algorithm and
"terribly" for another.  One is thus advised that even the best
theoretical estimate of power for a particular problem is based on the
SOFTWARE implementation of that problem as well as the cluster design,
and if one's problem is indeed complex one may have to study parallel
programming extensively to learn enough to be able to get things to work
optimally for you.

In other words, sure, there is lots of theory but it isn't "simple" and
YMMV significantly from task to task, network to network, node to node.

Nobody ever said parallel/cluster computing was "easy"...;-)

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
