[Beowulf] New HPCC results and the Myri viewpoint

Patrick Geoffray patrick at myri.com
Wed Jul 20 20:48:15 PDT 2005

Vincent Diepeveen wrote:
>>>Tests at all processors at the same time make major sense.
>>Yes and no. Most networking people believe the job of a node is to send 
>>messages. Actually, it's mainly to compute, and sometimes to send 
>>messages. So, would running a pingpong test on multiple processors 
>>sharing a NIC at the same time be an interesting benchmark ? Not really, 
>>it won't happen much in real codes that compute most of the time. I prefer 
>>to optimize other things that help the host compute faster.
> If most of the time they are 'just computing', then it just doesn't make
> sense to have a highend network. A $10 gigabit network will do in that case.

And it does for many people. What is the most used interconnect in the 
cluster market ? GigE.

> Reality is however different. Reality is that you simply stress the network
> until it wastes, say, 10-20% of your system time, up to a maximum of 50%.

What do you know about my reality ? Your reality is an 8x8 chessboard. 
Have you looked at a trace of one of the 10 ISV codes that are the 
majority of applications running on real world clusters ? Yes they do 
communicate, but they compute most of the time.

Your reality is very unusual: your problem size is tiny, you add nodes 
to go faster, not bigger. If you added nodes to go bigger, you would 
realize that your compute/communicate ratio (usually) increases.
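
The faster-versus-bigger distinction can be illustrated with a toy model. 
This is only a sketch under an assumption that does not come from this 
thread: a cubic 3D grid split evenly across the nodes, where compute scales 
with the volume of each node's subdomain and communication with its surface. 
The function names are hypothetical.

```python
# Toy model (an assumption, not from the thread): a cubic 3D grid is split
# evenly across `nodes` nodes, each holding a cubic subdomain of edge n.
# Compute scales with the subdomain volume (n**3) and halo exchange with its
# surface (6 * n**2), so the compute/communicate ratio scales as n / 6.

def compute_comm_ratio(total_cells: float, nodes: int) -> float:
    """Volume-to-surface ratio of one node's cubic subdomain."""
    edge = (total_cells / nodes) ** (1.0 / 3.0)
    return edge ** 3 / (6.0 * edge ** 2)


def add_nodes_to_go_faster(total_cells: float, node_counts):
    """Fixed total problem size, spread over more and more nodes."""
    return [compute_comm_ratio(total_cells, p) for p in node_counts]


def add_nodes_to_go_bigger(cells_per_node: float, node_counts):
    """Total problem size grows in proportion to the node count."""
    return [compute_comm_ratio(cells_per_node * p, p) for p in node_counts]


if __name__ == "__main__":
    counts = [1, 8, 64, 512]
    print("faster:", add_nodes_to_go_faster(512.0 ** 3, counts))
    print("bigger:", add_nodes_to_go_bigger(512.0 ** 3, counts))
```

In this simplified model the ratio falls as nodes are added to a fixed 
problem and holds steady when the problem grows with the node count. Whether 
it actually increases depends on the application.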

You have rambled on this list about parallel machines not being suited 
to your usage. Maybe it's the other way around, maybe nobody thinks about 
chess when they buy a cluster.

> In short, if you deliver highend NICs, ASSUME they get used.

Of course they will get used, that's not the question ! It's about what 
is important. Tuning for a pattern that is not common has little return.

An example for your curious and open mind: many interconnect people 
advertise the streamed bandwidth curve, where the sender just keeps 
sending messages as fast as possible. How often does this communication 
pattern happen in my reality ? Never. I have never seen an application 
sending enough messages back to back to fill up the pipeline. So why 
optimize for this case ? Because the curve looks good and people like 
to think they have a bigger pipe than their friends.
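
Why the streamed curve looks better than a pingpong curve can be sketched 
with a simple cost model. This is an illustration in the spirit of LogP-style 
models, not a measurement: the function names, the per-message gap parameter 
and all numeric values are assumptions.

```python
# Simplified cost model; every parameter value used here is hypothetical.

def pingpong_bandwidth(size_bytes, latency_s, link_bytes_per_s):
    """One message in flight at a time: each message pays the full latency
    plus its time on the wire."""
    return size_bytes / (latency_s + size_bytes / link_bytes_per_s)


def streamed_bandwidth(size_bytes, gap_s, link_bytes_per_s):
    """Back-to-back sends: the pipeline hides the latency, so the cost per
    message is the larger of the per-message gap and the wire time."""
    return size_bytes / max(gap_s, size_bytes / link_bytes_per_s)


if __name__ == "__main__":
    latency, gap, link = 5e-6, 1e-6, 250e6  # seconds, seconds, bytes/s
    for size in (64, 1024, 16384, 1048576):
        print(size,
              pingpong_bandwidth(size, latency, link),
              streamed_bandwidth(size, gap, link))
```

Under these assumptions the two curves differ most for small and medium 
messages and converge for large ones, so the streamed curve only describes 
an application that really keeps the pipeline full.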

> At least *i* understand that principle.

Good for you. It must be lonely up there, so many stupid people around.

> Weirdly enough the manufacturer of a product assumes his stuff isn't going
> to get used.
> Why make it then for your users?
> You try to sell a product without your users using it?

What was that procmail filter again ? I just remember the "idiot" part. 
Got to look in the archives...


Patrick Geoffray
Myricom, Inc.
