[Beowulf] $2500 cluster. What it's good for?

Sun Dec 19 12:02:32 PST 2004

----- Original Message -----
From: "Douglas Eadline, Cluster World Magazine" <deadline at linux-mag.com>
To: "Jim Lux" <jimlux at earthlink.net>
Cc: <beowulf at beowulf.org>
Sent: Sunday, December 19, 2004 10:58 AM
Subject: Re: [Beowulf] $2500 cluster. What it's good for?

> On Sat, 18 Dec 2004, Jim Lux wrote:
>
> > I think it would be interesting to contemplate potential uses of a $2500
> > cluster.  Once you've had the thrill of putting it together and rendering
> > something with POVray, what next?
>
> That is the $64,000 question. Here is my 2 cent answer.
> BTW, your ideas are great. I would love to see a discussion like this
> continue because we all know the hardware is the easy part!
>
> There is part of this project which has a "build it and they will come
> (and write software)" dream. Not being that naive, I believe there are
> some uses for systems like this. The intended audience is not the
> uber-cluster-geeks on this list, but rather the education, home, and hacker
> crowd. With regard to education, I think if cluster technology is readily
> available, then perhaps students will look to these technologies to solve
> problems. And who knows, maybe the "Lotus 123 of the cluster" will be built
> by some person or persons with some low cost hardware and an idea everyone
> said would not work.
>
> If you have followed the magazine, you will see that we highlighted
> many open projects that are useful today. From an educational standpoint,
> a small chemistry/biology department that can do quantum chemistry,
> protein folding, or sequence analysis is pretty interesting to me.
> There are other areas as well.

I was thinking of the cluster video wall idea; however, the video hardware
would be kind of pricey (more than the cluster!).  Something like using the
cluster to supply the crunch for an immersive environment might be
interesting.

>
> There are also some other immediate things like running Mosix or Condor
> on the cluster. A small group that has a need for a computation server
> could find this useful for single process computational jobs.

This brings up an interesting optimization question. Just like in many
things (I'm thinking of RF amplifiers specifically) it's generally cheaper/more
cost effective to buy one big thing IF it's fast enough to meet the
requirements. Once you get past what ONE widget can do, then, you're forced
to some form of parallelism or combining smaller widgets, and to a certain
extent it matters not how many you need to combine (to an order of
magnitude).   The trade comes from the inevitable increase in system
management/support/infrastructure to support N things compared to supporting
just one. (This leaves aside high availability/high reliability kinds of
things).

So, for clusters, where's the breakpoint?  Is it at whatever the fastest
currently available processor is?   This is kind of the question that's been
raised before: do I buy N processors now with my grant money, or do I wait
a year and buy N processors that are 2x as fast and do all the computation
in the second of two years?  If one can predict the speed of future
processors, this might guide you whether you should wait for that single
faster processor, or decide that no matter if you wait 3 years, you'll need
more than the crunch of a single processor to solve your problem, so you
might as well get cracking on the cluster.
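
A rough sketch of that buy-now-versus-wait trade, in Python. The 2x-per-year
speedup is the figure used above; everything else is an assumption made for
illustration: the function names are hypothetical, the budget is assumed to buy
the same N processors whenever it is spent, the job is assumed to scale
perfectly across them, and speed is assumed to grow smoothly rather than in
product-release steps.

```python
import math

def finish_time_years(work, n_procs, wait_years, doubling_years=1.0):
    """Years from today until the job finishes if we buy after `wait_years`.

    work: job size in processor-years at TODAY's per-processor speed.
    Assumes the same budget buys n_procs processors at any purchase date.
    """
    speedup = 2.0 ** (wait_years / doubling_years)
    return wait_years + work / (n_procs * speedup)

def best_wait_years(work, n_procs, doubling_years=1.0):
    """Wait that minimizes finish time under the smooth-doubling assumption.

    Setting d/dt [t + (W/N) * 2**(-t/T)] = 0 gives
    t = T * log2(W * ln(2) / (N * T)); a negative result means buy now.
    """
    x = work * math.log(2) / (n_procs * doubling_years)
    if x <= 1.0:
        return 0.0
    return doubling_years * math.log2(x)

if __name__ == "__main__":
    n = 8
    work = 2.0 * n  # a job that takes two years on N of today's processors
    for wait in (0.0, 0.5, 1.0):
        print(f"wait {wait:.1f} y -> done at {finish_time_years(work, n, wait):.2f} y")
    print(f"best wait: {best_wait_years(work, n):.2f} y")
```

With these assumptions, the two-year job finishes at the same time whether you
buy now or wait a year, and a job short enough gives a best wait of zero, which
is the "get cracking" case.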

Several times, I've contemplated a cluster to solve some problem, and then,
by the time I had it all spec'd out and figured out and costed, it turned
out that I'd been passed by AMD/Intel, and it was better just to go buy a
(single) faster processor.  There are some interesting power/MIPS trades
that are non-obvious in this regime, as well as anomalous application
environments where the development cycle is much slower (not too many "Rad
Hard" Xeons out there).

There are also inherently parallel kinds of tasks where you want to use
commodity hardware to get multiples of some resource, rather than some
special purpose thing (say, recording multi-track audio or the
aforementioned video wall). Another thing is some sort of single input
stream, multiple parallel processes for multiple outputs. High performance
speech recognition might be an example.

What about some sort of search process with applicability to casual users
(route finding for robotics or such)?

>
> I also have an interest in seeing a cluster version of Octave or SciLab
> set to work like a server. (as I recall rgb had some reasons not to use
> these high level tools, but we can save this discussion for later)

I'd be real interested in this... Mathworks hasn't shown much interest in
accommodating clusters in the Matlab model, and I spend a fair amount of time
running Matlab code.

>
> What I can say as part of the project, we will be collecting a software
> list of applications and projects.
>
> Finally, once we all have our local clusters and software running to our
> hearts' content, maybe we can think about a grid to provide spare compute
> cycles to educational and public projects around the world.
>