[Beowulf] Re: torus versus (fat) tree topologies

Isaac Dooley idooley at isaacdooley.com
Tue Nov 9 14:21:47 PST 2004


>   I have had the luxury of testing an SCI 2D torus cluster and found 
>latency performance to be exceptional.  In fact this has been most 
>the most limiting factor for the major application (Fluent) running on our 
>cluster.  Upon performing some performance profiling I found Fluent 
>scales significantly better on it than the gig/e network currently 
>implemented.  I have not had the opportunity to evaluate any other 
>interconnect hardware so I cannot comment on their performance.

Chances are any of the dedicated cluster interconnects will have better latency than Ethernet, but that comes at a cost. Ethernet can also do harmful things like dropping packets, whereas some interconnects do reliable message transmission in hardware. Thus for Ethernet the OS must use something like TCP to get reliable delivery. So I think it is probably better to compare a real tree or fat-tree dedicated interconnect to the torus for your situation.

The real latency should be measured from application level to application level, but published latency measurements are often NIC to NIC. Many of the expensive interconnect NICs and switches can perform routing and other functions (TCP, sockets, etc.) that would be done by the OS in a Linux cluster with a standard Ethernet card. These are not hugely time-consuming individually, but sending or receiving a single message may require a few OS context switches and a few interrupts, which can take valuable nanoseconds or even microseconds.
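
To make "application level to application level" concrete, here is a minimal ping-pong sketch in Python. It is only an illustration under stated assumptions: the function names (measure_half_rtt, echo_server, recv_exact) are made up for this example, it uses plain TCP sockets, and as written it runs over loopback, so it captures only the software-stack overhead (system calls, context switches, TCP processing) and none of the wire or switch time. To compare interconnects you would run the two ends on different nodes, or use an MPI ping-pong benchmark over the interconnect's own library instead.

```python
import socket
import threading
import time


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket (TCP may split messages)."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf.extend(chunk)
    return bytes(buf)


def echo_server(listener, msg_size, n_messages):
    """Accept one connection and echo back n_messages fixed-size messages."""
    conn, _ = listener.accept()
    with conn:
        # Disable Nagle's algorithm so small messages are sent immediately.
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(n_messages):
            conn.sendall(recv_exact(conn, msg_size))


def measure_half_rtt(msg_size=8, n_messages=1000):
    """Return the average one-way latency in seconds, estimated as RTT / 2.

    Assumption: both ends run on this host over 127.0.0.1, so the result
    reflects OS and TCP overhead only, not a real network.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(
        target=echo_server, args=(listener, msg_size, n_messages)
    )
    server.start()

    payload = b"x" * msg_size
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(n_messages):
            client.sendall(payload)          # application-level send
            recv_exact(client, msg_size)     # wait for the echoed reply
        elapsed = time.perf_counter() - start

    server.join()
    listener.close()
    return elapsed / n_messages / 2


if __name__ == "__main__":
    latency = measure_half_rtt()
    print(f"average one-way latency: {latency * 1e6:.1f} microseconds")
```

The timer starts and stops in the application, so every interrupt, context switch and protocol step between the two processes is included in the number, which is exactly the part a NIC-to-NIC figure leaves out.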

Isaac Dooley
Parallel Programming Laboratory, UIUC



