[Beowulf] Practicality of a Beowulf Cluster

Thu Aug 27 16:53:29 PDT 2009

In my personal experience, a long time ago I developed CFD software on a single
machine with a single core.
Once I started complicating my life with more complex problems (e.g. going from
2D to 3D), the time needed to solve them grew dramatically (from hours to
several weeks). At some point I therefore needed more computational
infrastructure, despite all the software and numerical tricks I applied to
accelerate the code. I needed more, i.e. a bunch of computers, or a cluster, to
reach my performance or productivity goal (e.g. reducing a 3 week simulation to
an overnight run). But it took me some time to _realize_ this, and to become
more demanding as my computational needs evolved.
In other cases, the data you are crunching simply cannot fit on a single node.
That is a capacity problem, which you solve by distributing the data among more
compute nodes to get the aggregated capacity you need. The performance problem
can be seen the same way: the number of arithmetic operations needed to solve
your problem grows so much that you need the aggregated computing power of
multiple computers, again distributing the computation among more processors
and nodes.
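As a rough illustration of the capacity case, the sketch below estimates how
many nodes are needed just to hold a grid in memory. Every number in it (grid
size, variables per cell, memory per node, usable fraction) and the function
name are assumptions chosen for illustration, not measurements.

```python
import math

def nodes_for_capacity(cells, vars_per_cell, bytes_per_value,
                       node_mem_bytes, usable_fraction=0.8):
    """Minimum node count so the aggregated memory holds the data set.

    usable_fraction is an assumed share of node memory left for the
    application after the OS, buffers, etc.
    """
    data_bytes = cells * vars_per_cell * bytes_per_value
    usable_bytes = node_mem_bytes * usable_fraction
    return math.ceil(data_bytes / usable_bytes)

NODE_MEM = 32 * 2**30  # assumed 32 GiB of memory per node

# Hypothetical 2D case: 1000 x 1000 cells, 10 double-precision values per cell
nodes_2d = nodes_for_capacity(1000**2, 10, 8, NODE_MEM)

# Same resolution in 3D: 1000 x 1000 x 1000 cells
nodes_3d = nodes_for_capacity(1000**3, 10, 8, NODE_MEM)

print(nodes_2d, nodes_3d)
```

With these assumed numbers the 2D case fits comfortably on one node, while the
3D case (about 80 GB of data) no longer does and has to be spread over several.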
Given this introduction, my advice would be to grow your computational
infrastructure along with your needs over time (e.g. every 1 or 2 years, and
much better if aligned with your trusted HW vendor), starting from a single
node, which nowadays looks like a cluster from 5 years ago. Then, if you need
more, start adding computational infrastructure, which could also be more
storage, more network gear, or more GPUs, which these days are also used to
accelerate computation alongside the multicore processors.
With very little knowledge of your computational needs and usage, it is nearly
impossible to guess whether a cluster will satisfy you, and even harder to size
it properly.
People who use clusters typically have computational needs that have been well
understood for many years, so the sizing can be more or less estimated by
running "kernels" (the meat) of their applications on a single node and then
multiplying the performance achieved on that node by the number of nodes
necessary to reach the total performance, productivity or capacity target.
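A minimal sketch of that sizing estimate, applied to the "3 week simulation to
an overnight run" goal mentioned earlier. The function name, the 12 hour value
for "overnight" and the parallel efficiency factor are assumptions for
illustration; an efficiency of 1.0 is the ideal "multiply by the number of
nodes" case, and real applications usually scale somewhat worse than that.

```python
import math

def nodes_for_target(single_node_hours, target_hours, parallel_efficiency=1.0):
    """Nodes needed to bring a single-node run time down to target_hours.

    parallel_efficiency=1.0 means performance scales linearly with node
    count; values below 1.0 model losses (communication, I/O, imbalance).
    """
    if not 0.0 < parallel_efficiency <= 1.0:
        raise ValueError("parallel_efficiency must be in (0, 1]")
    speedup_needed = single_node_hours / target_hours
    return math.ceil(speedup_needed / parallel_efficiency)

three_weeks = 3 * 7 * 24  # 504 hours measured (hypothetically) on one node
overnight = 12            # assumed length of an overnight run, in hours

ideal = nodes_for_target(three_weeks, overnight)            # linear scaling
derated = nodes_for_target(three_weeks, overnight, 0.75)    # assumed 75% efficiency

print(ideal, derated)
```

The gap between the two results shows why the kernel should be measured on
real hardware first: the estimate is only as good as the single-node number
and the scaling assumption behind it.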
Having a cluster without using it for what it was designed and built for is a
waste of money, electric power and time, and a source of unnecessary headaches
in many directions.

Finally, on your comment about Windows: Microsoft has been investing money and
people since 2004 in developing and bringing to market an HPC solution as well.
So yes, there is a Windows solution for clusters, with the same features you
will see on Linux/Unix. You can also run a compute-intensive application
decently on Windows on a single box.

I hope this helps you decide whether or not it makes sense for you to build a
cluster.
This group assumes you already have one and perhaps need analysis, feedback or
friendly advice on components, on the way an application stresses a specific
component of the cluster (e.g. processor, networking, storage, OS, settings),
or on software tools for managing and debugging the clustered HW+SW solution,
among many other things.

Best regards,
Joshua Mora.

------ Original Message ------
Received: 11:42 PM CEST, 08/27/2009
From: J Bickhard <jbickhard at gmail.com>
To: beowulf at beowulf.org
Subject: [Beowulf] Practicality of a Beowulf Cluster

> So, I was thinking of making a cluster, but wondered: what are the
> practical uses of one? I mean, you can't exactly run Windows on these
> things, and it looks like they're mostly for parallel computing of
> complex algorithms.
>
> Would an average Joe like me have a use for a cluster?