[Beowulf] Anybody using Redhat HPC Solution in their Beowulf

Tue Oct 26 09:09:12 PDT 2010

On 10/26/10 04:16, Hearns, John wrote:
> I have worked as an engineer for two HPC companies - Clustervision and
> Streamline.
> My slogan phrase on this issue is "Any fool can go down PC World and buy
> a bunch of PCs"

Well, if you are buying PCs in bulk at retail pricing, you are a fool
anyway.  Plus, most PC World PCs won't have ECC RAM, so I wasn't really
referring to those; few of us can tolerate random bit flips.

> However, as regards price, I would say that actually you will be paying
> very, very little premium
> for getting a supported, tested and pre-assembled cluster from a vendor.
> Academic margins are razor thin - the companies are not growing fat over
> academic deals.
> They also can get special pricing from Intel/AMD if the project can be
> justified - probably ending
> up at a price per box near to what you pay at PC World.

Again, I'm not comparing PC World to Tier 1 bulk purchases.  I'm comparing
Tier 1 bulk purchases w/o an OS (so you can DIY) with specialized HPC
vendor purchases where you don't have to DIY.  Even then, perhaps it
breaks even in the first year if you get a very, very good deal from the
HPC vendor.  However, to get that deal you are probably contracted into
four or five years of support, and when it comes to HPC, involving more
humans is the fastest way to get a really inefficient and expensive
cluster.  After the first year and for the rest of the cluster's
lifetime, annual human support adds a large cost overhead that you have
to account for at the beginning (and you will probably buy less hardware
because of it).
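
To make that arithmetic concrete, here is a rough sketch in Python.  Every
number and name in it is a made-up placeholder (not a real quote from any
vendor), and it assumes the simplest possible model: the two options cost
the same in year one, and the vendor route then adds a flat support fee
for each remaining year of the contract.

```python
# Rough cost model for the DIY vs. HPC-vendor comparison.
# ASSUMPTION: all figures and parameter names are hypothetical placeholders.

def paid_support_years(years, included_years=1):
    """Years of support you actually pay for (year one assumed break-even)."""
    return max(0, years - included_years)

def diy_total(hardware, years, admin_per_year=0.0):
    """Total cost of a no-OS Tier 1 bulk purchase you administer yourself."""
    return hardware + admin_per_year * years

def vendor_total(hardware, years, support_per_year, included_years=1):
    """Total cost of a vendor-built cluster with a multi-year support contract."""
    return hardware + support_per_year * paid_support_years(years, included_years)

def hardware_budget(total_budget, years, support_per_year, included_years=1):
    """Money left for hardware once contracted support is set aside up front."""
    return total_budget - support_per_year * paid_support_years(years, included_years)

if __name__ == "__main__":
    years = 5
    print("DIY total:       ", diy_total(100_000, years))
    print("Vendor total:    ", vendor_total(100_000, years, support_per_year=10_000))
    print("Hardware budget: ", hardware_budget(200_000, years, support_per_year=10_000))
```

The point is only the shape of the calculation: the support term scales
with the contract length, so it has to come out of the hardware budget on
day one.  Plug in your own quotes (and your own admin cost for the DIY
case) before drawing any conclusion.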

> Or take (say) rack top switches. Do you want to have a situation where
> the company which supports your cluster
> has switches sitting on a shelf, so when a switch fails someone (me!) is
> sent out the next morning to deliver
> a new switch in a box, cable it in and get you running?

That's probably a hell of a lot faster than waiting on a vendor to get 
you a new switch through some RMA process.  Plus you know the cabling is 
done right :).

Optimally, IMHO, in university setups physical scientists create the need
for HPC.  These types shouldn't (as Kilian mentions) need to inherit all
of the responsibilities and overheads of cluster management to use one
(or pay cluster vendors annually for support).  They should simply walk
over to the CS department, find systems guys (who would probably drool
over the prospect of administering a reasonably sized cluster) and work
out an agreement where the physical science types can "just use it" and
the systems/CS guys administer it and can once in a while trace
workloads, test new load balancing mechanisms, try different kernel
settings for performance, etc.  This way the physical scientists get
their work done on a well-supported HPC system for no extra cash, and
the computer scientists get great, non-toy traces and workloads to
further their own research.  Both parties win.

Now, in organizations that don't have a CS department, I agree that HPC
vendors are the way to go.

ellis