custom cluster cabinets (was Re: 1U P4 Systems)

Douglas Eadline deadline at plogic.com
Tue May 29 09:16:57 PDT 2001


On Tue, 29 May 2001, Bari Ari wrote:

> Douglas Eadline wrote:
> 
> > Keep in mind there is no easy way to correlate dense CPUs or MFLOPS per
> > watt to actual performance. Packing CPUs in a custom 1U box
> > may be a big win for some problems, but a big waste of money
> > for others.
> > 
> The StoneSoup BeoClusters will always be the best approach for many, as 
> others have pointed out here, due to budget, reuse, and 
> management/organizational constraints.
> 
> > Also, the "lego we use" in the Beowulf community is largely
> > what is produced for other, much larger markets, and while we would like
> > to see some things done differently, there is a reasonable
> > trade-off between cost/flexibility/performance.
> 
> The turn-key clusters of 16 nodes and greater, targeted at high speed 
> with high bandwidth/low latency interconnects, seem to be where things 
> can be improved significantly. Maybe some new "legos" are needed.

The new "legos" we get are usually out of our control and
created through market forces much larger then Beowulf clusters.
We are to some degree parasites. Indeed, there are many among us
that have been burned on "proprietary legos" (legos that
do work the other kids toys) and now will only use those legos
we can get from the bigger markets because we know that they
will be low cost and have an upgrade path and play well with others.


> 
> > 
> > I find it interesting that in all the talk of P4 systems
> > there is little discussion about the Intel chipset
> > for the P4 only supporting 32-bit PCI. If you need anything
> > other than Fast Ethernet, this could be a real drawback
> > (i.e. an imbalanced system with bottlenecks).
> > The new Xeon chipset has two 64-bit slots. Of course there
> > are other chipsets on the immediate horizon, but
> > the market seems to be making a clear differentiation between the
> > "desktop" and "server" products.
> 
> Some applications will churn away at a piece of data for hours or days 
> before spitting out a few bits to pass on or compare to what the other 
> nodes have as results. A 300 bps link between nodes may be adequate here. 
> Other applications may only run through a few CPU cycles before passing 
> on a chunk of data, where even 10 Gb/sec interconnects are bottlenecks. 
> It's this high-speed end of clustering where I see a need for improvements.

And in some cases many slower, less expensive CPUs may be better than
faster, more expensive ones. With clusters, focusing on one part of the
system (usually the CPU) can be dangerous. Other aspects of the
system, from software to hardware, need to be considered.
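
To put rough numbers on that balance point, here is a back-of-the-envelope
sketch comparing theoretical peak PCI bus bandwidth against common
interconnect speeds. All figures are nominal spec-sheet peaks, not
measurements, and sustained rates will be lower:

# Back-of-the-envelope check: can the PCI bus keep up with the NIC?
# All figures are theoretical peaks; sustained throughput is lower.

MBps = 1e6  # decimal megabytes per second

buses = {
    "PCI 32-bit/33MHz": (32 / 8) * 33e6 / MBps,   # ~133 MB/s
    "PCI 64-bit/66MHz": (64 / 8) * 66e6 / MBps,   # ~533 MB/s
}

links = {
    "Fast Ethernet":    100e6 / 8 / MBps,   # ~12.5 MB/s each way
    "Gigabit Ethernet": 1e9 / 8 / MBps,     # ~125 MB/s each way
}

for bus_name, bus_bw in buses.items():
    for link_name, link_bw in links.items():
        # A full-duplex NIC may want ~2x the one-way rate, and the
        # bus is shared with every other device on it.
        headroom = bus_bw / (2 * link_bw)
        print(f"{bus_name} + {link_name}: {headroom:.1f}x headroom")

On these numbers, 32-bit/33MHz PCI (~133 MB/s peak) cannot even cover
full-duplex Gigabit Ethernet (~250 MB/s both ways), which is exactly the
imbalance described above; Fast Ethernet fits with room to spare.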

> 
> I have seen interest in the P4 for its raw performance only, despite 
> the current lack of support for DDR and 64/66 PCI. It's all right around 
> the corner, though, from all the chipset vendors. InfiniBand will really 
> make a big difference for high speed and high bandwidth applications when 
> it comes standard in chipsets.

An increase in processor speed without a corresponding 
increase in network speed can reduce the scalability of an
application. Of course, it depends on the application. If you are 
doing rendering, which needs little communication between nodes, 
then you will not see the effect. If you are
calculating molecular orientation of some kind, you would be 
well rewarded to consider a balanced system.
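
As a minimal sketch of that effect, the toy model below assumes a
fixed-size job whose per-step communication cost stays constant while the
CPUs get faster; the times are made-up illustrative units, not benchmarks:

# Toy scaling model: time on N nodes = compute_time/N + comm_time.
# The communication term is assumed fixed (latency/bandwidth bound),
# so it does not shrink when the CPUs get faster.

def speedup(t_compute, t_comm, nodes):
    t_one = t_compute                     # single node, no communication
    t_par = t_compute / nodes + t_comm
    return t_one / t_par

nodes = 16
t_comm = 1.0                              # fixed network cost per step

for cpu_factor in (1, 2, 4):              # faster CPUs shrink compute only
    t_compute = 100.0 / cpu_factor
    s = speedup(t_compute, t_comm, nodes)
    print(f"CPU {cpu_factor}x: speedup = {s:.1f}, "
          f"efficiency = {s / nodes:.0%}")

With these made-up numbers, quadrupling CPU speed drops 16-node efficiency
from about 86% to about 61%: each node finishes its local work sooner and
spends a larger fraction of every step waiting on the unchanged network.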


Doug


> 
> Bari
> 
> 
> 

-- 
-------------------------------------------------------------------
Paralogic, Inc.           |     PEAK     |      Voice:+610.814.2800
130 Webster Street        |   PARALLEL   |        Fax:+610.814.5844
Bethlehem, PA 18015 USA   |  PERFORMANCE |    http://www.plogic.com
-------------------------------------------------------------------




