[Beowulf] Building new cluster - estimate

Tue Jul 29 13:11:42 PDT 2008

Ivan Oleynik wrote:
>     vendors have at least list prices available on their websites.
> 
> 
> I saw only one vendor siliconmechanics.com <http://siliconmechanics.com> 
> that has online integrator. Others require direct contact with a salesperson.

This isn't usually a problem if you have good specs that they can work 
with for you.

> 
>     "thermal management"?  servers need cold air in front and unobstructed
>     exhaust.  that means open or mesh front/back (and blanking panels).
> 
> 
> Yes, are there other options? A built-in air-conditioning unit that will 
> exhaust the hot air through a pipe and dump it outside the computer 
> room rather than heating it?

You will pay (significantly) more per rack to have this.  You seemed to 
indicate that bells and whistles are not wanted (e.g. "cost is king").

The hallmarks of good design for management of 
power/heat/performance/systems *all* will add (fairly non-trivial) 
premiums over your pricing.  IPMI will make your life easier on 
management, though there is a cross-over where serial 
consoles/addressable and switchable PDUs make more sense.  Of course 
grad students are "free", though the latency to get one into a server 
room at 2am may be higher than that of the IPMI and other solutions.
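
To make the IPMI point concrete, here is a minimal Python sketch of 
out-of-band power control.  It only builds ipmitool command lines and 
does not run them.  It assumes ipmitool is installed and that the BMCs 
speak IPMI 2.0 over LAN (the "lanplus" interface).  The BMC host names, 
the user name, and the password-file path are hypothetical placeholders.

```python
# Sketch only: builds ipmitool command lines for remote power control.
# Host names, user, and password file are hypothetical placeholders.

POWER_ACTIONS = {"status", "on", "off", "cycle", "reset", "soft"}


def ipmi_power_cmd(host, action, user="admin", password_file="/root/.ipmipass"):
    """Return the argv list for an ipmitool chassis power command."""
    if action not in POWER_ACTIONS:
        raise ValueError("unsupported power action: %r" % action)
    return [
        "ipmitool",
        "-I", "lanplus",       # IPMI 2.0 over LAN
        "-H", host,            # address of the node's BMC
        "-U", user,
        "-f", password_file,   # read the password from a file
        "chassis", "power", action,
    ]


if __name__ == "__main__":
    # Print the commands for two hypothetical nodes.
    for bmc in ("node01-bmc", "node02-bmc"):
        print(" ".join(ipmi_power_cmd(bmc, "status")))
```

Each list could be handed to subprocess.run() from a head node, which is 
the part that replaces the 2am trip to the server room.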

> 
>  
> 
>     wouldn't a 5100-based board allow you to avoid the premium of fbdimms?
> 
> 
> Maybe I am wrong, but I saw only FB-DIMM options and assumed that we 
> need to wait for Nehalems for DDR3?

Some vendors here can deliver the San Clemente based boards in compute 
nodes (DDR2).  DDR3 can be delivered on non-Xeon platforms, though you 
lose other things by going that route.

>  
> 
> 
> 
>         - WD Caviar 750 GB SATA HD: $110
> 
> 
>     I usually figure a node should have zero or as many disks as feasible.
> 
> 
> 
> We prefer to avoid intensive IO over network, therefore, use local scratch.

We are measuring about 460 MB/s with NFS over RDMA from a node to our 
JackRabbit unit.  SDR all the way around, with a PCIx board in the 
client.  Measuring ~800 MB/s on OSU benchmarks, and 750 MB/s on RDMA bw 
tests in OFED 1.3.1.
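
For a rough sense of scale, the short Python sketch below compares those 
figures with the data rate of an SDR link.  It assumes a 4x SDR link 
(10 Gbit/s signalling with 8b/10b encoding, so 8 Gbit/s of data) and 
decimal megabytes; the link width is an assumption, not something stated 
above.

```python
# Sketch only: compare measured throughput with the data rate of an
# assumed 4x SDR InfiniBand link (10 Gbit/s signalling, 8b/10b encoding).

SDR_4X_SIGNAL_MBIT = 10000   # assumed 4x SDR signalling rate, Mbit/s
ENCODED_BITS = 10            # 8b/10b: 10 bits on the wire ...
PAYLOAD_BITS = 8             # ... carry 8 bits of data

# Measured figures quoted above, in MB/s.
MEASURED_MB_S = {
    "NFS over RDMA": 460,
    "OSU benchmark": 800,
    "OFED RDMA bw test": 750,
}


def sdr_data_rate_mb_s():
    """Data rate of the assumed link in MB/s (decimal megabytes)."""
    data_mbit = SDR_4X_SIGNAL_MBIT * PAYLOAD_BITS / ENCODED_BITS
    return data_mbit / 8


def link_fraction(measured_mb_s):
    """Fraction of the link data rate that a measurement reaches."""
    return measured_mb_s / sdr_data_rate_mb_s()


if __name__ == "__main__":
    for name, rate in MEASURED_MB_S.items():
        print("%-18s %4d MB/s  %.0f%% of link" % (name, rate, 100 * link_fraction(rate)))
```

Under that assumption the link carries about 1000 MB/s of data, so the 
NFS figure is roughly 46% of it and the raw RDMA tests 75 to 80%.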

If you are doing IB to the nodes this should work nicely.

Also, 10 GbE would work as well, though NFS over RDMA is more limited here.

>  
> 
>     HP is safe and solid and not cheap.  for a small cluster like this,
>     I don't think vendor integration is terribly important.
> 
> 
> Yes, it is important. Optimized cost is what matters.

If cost is king, then you don't want IPMI, switchable PDUs, serial 
consoles/kvm over IP, fast storage units, ...

The words of wisdom coming from the folks on this list suggest that 
revising this plan to incorporate at least some elements that make your 
life easier is definitely in your interest.

We agree with those voices.  We are often asked to help solve our 
customers' problems remotely.  Having the ability to take complete 
control (power, console, ...) of a node via a connection enables us to 
provide our customers with better support, especially when they are a 
long car/plane ride away.

I might suggest polling, offline, the people who build clusters for 
their research, and asking them what they have done or wish they had 
done.  You can always buy all the parts from Newegg and build it 
yourself if you wish.  Newegg likely won't help you with subtle 
booting/OS load/BIOS versioning problems, or help you identify 
performance bottlenecks under load.  If this is important to you, ask 
yourself (and the folks on the list) what knowledgeable support and 
good design are worth.

Joe

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
        http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615