CCL:[Beowulf] Question regarding Mac G5 performance (fwd from mmccallum at pacific.edu)

Tue May 25 11:07:24 PDT 2004

----- Forwarded message from Mike McCallum <mmccallum at pacific.edu> -----

From: Mike McCallum <mmccallum at pacific.edu>
Date: Tue, 25 May 2004 10:40:55 -0700
To: chemistry at ccl.net
Cc: Eugen Leitl <eugen at leitl.org>
Subject: Re: CCL:[Beowulf] Question regarding Mac G5 performance 
X-Mailer: Apple Mail (2.613)

I'm sorry that this response is so long in coming --- I had forgotten 
that I was filtering messages to a different mbox!

On May 20, 2004, at 01:54, Eugen Leitl wrote:

>
>Any actual numbers would be very useful, ideally with the compilers,
>compiler options, motherboard, and similar.  Did you compare 1 job per
>node vs 1 job per node?  Or 2 jobs per node vs 2 jobs per node?  4 or 8
>dimms?  Which kind of memory PC2700? PC3200?
>

The numbers I have are for NAMD2, using one of my own jobs.  This was
with the stock PC3200 memory, 512 MB.
I'm asking Rick V. for the input files for his benchmarks, so we can
make a more direct comparison on the kinds of calculations I'm
interested in.  I'm going to try to borrow another dual G5 or two so I
can test the gigabit enet scaling.

>>The built-in Gigabit enet is attractive, also, as charmm and NAMD
>>scale very well with gigabit, and it makes myrinet less
>>price-effective (when used on any platform, wintel included, see
>>http://biobos.nih.gov/apps/charmm/charmmdoc/Bench/c30b1.html for
>>example).  I decided that dual G5 xserve cluster nodes with gigabit
>
>I come to a different conclusion based on those graphs.  In the first
>graph myrinet improves by a factor of 2.6 (250 -> 95 seconds) from 2
>processors to 8, where gige improves by only 20% (255 -> 210).  In the
>second graph gigE gets SLOWER from 2 to 8 processors.  Do you think in
>either case the 8 node (let alone 16) gige cluster would have better
>price/performance than a myrinet cluster?
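
To make the quoted scaling figures easier to compare, here is a minimal
sketch in Python that turns them into speedup and parallel efficiency.
It uses only the run times quoted above from the first graph (250 -> 95
seconds for myrinet, 255 -> 210 seconds for gigE, going from 2 to 8
processors).  The function and variable names are my own, and
"efficiency" here is an assumption about how to normalize: observed
speedup divided by the ideal 4x for a 2 -> 8 processor step.

```python
def speedup(t_base, t_scaled):
    """Ratio of run times; values above 1 mean the larger run is faster."""
    return t_base / t_scaled


def relative_efficiency(t_base, t_scaled, p_base, p_scaled):
    """Observed speedup divided by the ideal speedup (p_scaled / p_base)."""
    return speedup(t_base, t_scaled) / (p_scaled / p_base)


# Run times in seconds at 2 and 8 processors, as quoted from the first graph.
timings = {
    "myrinet": (250.0, 95.0),
    "gigE": (255.0, 210.0),
}

results = {}
for name, (t2, t8) in timings.items():
    s = speedup(t2, t8)
    e = relative_efficiency(t2, t8, p_base=2, p_scaled=8)
    results[name] = (s, e)
    print(f"{name}: speedup {s:.2f}x, efficiency {e:.0%}")
```

With these inputs myrinet comes out at roughly 2.6x and gigE at roughly
1.2x, which matches the figures in the quoted text.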

I realize after actually looking at that URL that that wasn't the data 
I was thinking about when I wrote that message.  I'm struggling to find 
the URL I was thinking of --- it mentioned the very poor performance of 
wintel systems with gigE, and mentioned that the G5 system architecture 
didn't suffer from the same.  I realize this is useful information, so 
I'll find it!

>
>Seems like for applications like shown on that page you shouldn't
>really bother with a cluster over 2 nodes with gigE, not many people
>would be willing to pay a factor of 4 more for 20% or even negative
>scaling.
>
>>switches were much more cost-effective for me than any other processor,
>
>Cost effective = price/performance?  Can you make any numbers available?

It boiled down to this: the smallest myrinet switch + cards cost
almost as much as 5 more 2.0GHz G5s.  Add to that the comparable price
of the wintel hardware (P4s, I'm not talking itaniums here, those guys
are even more expensive), and it was a no-brainer, even with the
slightly poorer scaling.  Also realize that I'm talking about small
clusters here, not > 16 nodes.

>
>>especially any high-bandwidth specialty comm method (apple's gigabit
>>has a pretty low latency also).
>
>Oh?  Can you share the apple gigabit numbers?  What is "pretty low"?
>
>>Additional considerations for us were the BSD environment which is
>>more secure than windows, and the OS is arguably more stable and
>>supported
>
>I'd agree with more stable and supported for use as a desktop, I'd
>disagree with stable and supported as computational node.  OSX is the
>new player on the block in this space.  Do you really think you would
>get a good response from calling apple's tech support line when the
>scheduler or network stack isn't performing to your performance
>expectations?
>
>Certainly a very reasonable thing.
>

Actually, Apple has been very responsive and interested in making
things work.  I've already been over to Cupertino twice, and met with
engineers here (on my campus) twice.  I don't even have a cluster yet.
To be fair, I don't have the resources (time, students or $$) to 
analyze the code to the nth degree, and find out why the network stack 
isn't doing 100 mph.  I'm pretty much forced to analyze stock 
situations, because that is all my situation allows me to do.  This 
clearly puts much more weight on the "familiarity" aspect, as I can't 
afford a whole lot of time spent under the hood, so to speak.

Cheers,

Mike
--
C. Michael McCallum                        
http://chem.cop.uop.edu/cmmccallum.html
Associate Professor
Department of Chemistry, UOP
mmccallum .at. pacific .dot. edu
(209) 946-2636 v  /  (209) 946-2607 fax

----- End forwarded message -----
-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a>
______________________________________________________________
ICBM: 48.07078, 11.61144            http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
http://moleculardevices.org         http://nanomachines.net