networking options

Tue Sep 24 12:58:52 PDT 2002

Ken Chase wrote:

>> 
>> OK. Your most obvious choices in 2002 with rough performance estimates
>> (these are generally better-than-MPI numbers) are:
>> 
>>        Technology             ~Best Case Bandwidth    ~Hardware Latency
>> 
>>     1. Fast Ethernet*            ~10 Mbytes/sec       ~50-75  usecs
>>     2. Channel Bonded FE      ~20-25 Mbytes/sec       ~50-75  usecs
>>     3. Gigabit Ethernet*      ~60-110 Mbytes/sec      ~50-200 usecs
>
>I remember reading here on beowulf about much lower latencies than this
>for GigE. Is it really as slow as FE (and with a higher upper limit)?
>I recall something on the order of 5-10us for good GBE NICs, and 15-50us
>for good switches. Does fibre or copper GBE make a diff?

 Hey Ken,

 Numbers that low would surprise me, but ... ;-).

 My GigE estimates above are based on data at:

 http://www.cs.uni.edu/~gray/gig-over-copper/

 where the best latency number reported is 48 usecs for a SysKonnect NIC.
 The data was current as of April 2002. Of course this is copper. Someone
 else might chime in with data showing that fiber is better.

>What about switchless GigE connections? They are useful in some cases where
>only a ring topology is required. We used that in a cluster we built (with
>copper GigE), as there was no internode communication except with adjacent
>nodes. It saves a lot of cash on an n-port GBE switch for a large cluster of
>size n. n actually becomes much larger when you get rid of the switch but
>keep $ constant. :)

 Switchless is good as long as you do not drive the hop count up, but
 climbing hop counts on larger systems with GigE-type latencies would kill
 you in a hurry.  Of course, if you have large messages and long compute
 cycles, who cares what the latency is? Regarding switchless topologies,
 there is also the question of how many NICs your board can hold, how the
 cards share bandwidth, and the increased total cost per node (you are
 making your own switch inside the node ;-).
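
 To put rough numbers on the hop-count point above, here is a minimal
 Python sketch. It assumes a software store-and-forward model, in which
 every hop pays the full per-hop latency plus the time to push the whole
 message through one link. The function names, the 50 usec latency, and
 the 100 Mbytes/sec bandwidth are illustrative assumptions chosen from
 within the GigE ranges in the table, not measurements.

```python
# Rough model, not a measurement: every hop pays the full per-hop latency
# plus the time to push the whole message through one link (store-and-forward).

def ring_max_hops(nodes):
    """Worst-case hop count between two nodes on a bidirectional ring."""
    return nodes // 2


def transfer_time_usecs(message_bytes, hops, latency_usecs, bandwidth_mbytes_per_sec):
    """Estimated end-to-end time in usecs for one message.

    1 Mbyte/sec is taken as 1 byte/usec (1 Mbyte = 10**6 bytes).
    """
    serialization_usecs = message_bytes / bandwidth_mbytes_per_sec
    return hops * (latency_usecs + serialization_usecs)


def latency_fraction(message_bytes, latency_usecs, bandwidth_mbytes_per_sec):
    """Share of a single hop's time that is latency rather than data transfer."""
    one_hop = transfer_time_usecs(message_bytes, 1, latency_usecs, bandwidth_mbytes_per_sec)
    return latency_usecs / one_hop


if __name__ == "__main__":
    LATENCY_USECS = 50.0   # assumed, low end of the GigE range quoted above
    BANDWIDTH = 100.0      # assumed Mbytes/sec, within the quoted GigE range
    for nodes in (4, 32, 128):
        hops = ring_max_hops(nodes)
        t = transfer_time_usecs(64, hops, LATENCY_USECS, BANDWIDTH)
        print(f"{nodes:4d} nodes, {hops:3d} hops, 64-byte message: {t:9.2f} usecs")
    for size in (64, 10_000_000):
        frac = latency_fraction(size, LATENCY_USECS, BANDWIDTH)
        print(f"{size:>10d} bytes: latency is {frac:.2%} of one hop")
```

 Under these assumed values, a short message is almost entirely
 latency-bound, so its cost grows roughly linearly with hop count, while
 for a 10 Mbyte message the per-hop latency is a tiny fraction of the
 transfer time.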

 SCI's switchless latency gets steeply worse as you scale to largish
 (> 128 nodes) sizes ...  you can of course use SCI switches to reduce
 this problem. Few (none??) really large clusters (the Cray T3E is not a
 cluster) use completely switchless topologies these days.

 Regards,

 rbw