[Beowulf] Performance characterising a HPC application

Mark Hahn hahn at mcmaster.ca
Wed Mar 21 06:10:13 PDT 2007


>> well, if the node is compute-bound, nearly all time will be user-time.
>> if interconnect-bound, much time will be system or idle.  if system time
>> dominates, then cpu or memory is too slow.  if there is idle time, your
>> bottleneck is probably latency (perhaps network, but possibly also of
>> whoever you're communicating with - compute node or fileserver.)
>
> Thanks - it's starting to look like latency is the issue here alright.
> There is plenty of idle time on the processors. Increasing the frame
> size to jumbo frames resulted in an overall speed-up of the model (about
> 30%) suggesting that the total number of packets being sent was reduced,
> suggesting that packet latency is the current bottleneck.

I don't follow why that indicts latency - multiple smaller packets
don't each require a round trip, for instance.  with TCP, I've only
ever seen jumbo packets resulting in modestly higher bandwidth and
often noticeably lower CPU overhead.  TCP with 1500B packets will
_certainly_ have multiple packets in flight, so on a LAN it is not
terribly latency-sensitive.
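
to put rough numbers on "multiple packets in flight": the sender can have about window/MSS full-sized segments outstanding before it has to wait for an ack.  the sketch below is illustrative only; the 64 KiB window, the 1448B and 8948B segment sizes, the 1 Gb/s link and the 100 us round trip are all assumed values, not measurements from this cluster.

```python
def segments_in_flight(window_bytes: int, mss_bytes: int) -> int:
    """Full-sized segments that fit in one TCP window."""
    return window_bytes // mss_bytes


def bdp_bytes(bandwidth_bits_per_s: int, rtt_us: int) -> int:
    """Bandwidth-delay product in bytes: data needed in flight to fill the link."""
    return bandwidth_bits_per_s * rtt_us // 8_000_000


# Assumed values for illustration only.
WINDOW = 64 * 1024   # 64 KiB window
MSS_STANDARD = 1448  # 1500B MTU minus IP/TCP headers and timestamp option
MSS_JUMBO = 8948     # 9000B MTU, same header assumptions

print(segments_in_flight(WINDOW, MSS_STANDARD))  # segments per window, 1500B MTU
print(segments_in_flight(WINDOW, MSS_JUMBO))     # segments per window, 9000B MTU
print(bdp_bytes(1_000_000_000, 100))             # bytes needed to fill 1 Gb/s at 100 us
```

under these assumptions the window holds dozens of standard segments and covers the bandwidth-delay product several times over, so no single packet's latency stalls the stream.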

if this traffic is to some hotspot, I'd be more inclined to think that 
small packets are overloading the CPU with interrupt and TCP-stack overhead.
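
a back-of-envelope sketch of the packet rate involved, assuming (hypothetically) 100 MB/s arriving at the hotspot and ignoring header bytes.  how many interrupts that actually becomes depends on the NIC and driver (interrupt coalescing, NAPI), so treat it as an upper bound on per-packet work rather than a prediction.

```python
def packets_per_second(bytes_per_s: float, frame_payload_bytes: int) -> float:
    """Packets needed per second to carry a byte rate at a given frame size."""
    return bytes_per_s / frame_payload_bytes


RATE = 100_000_000  # assumed 100 MB/s into the hotspot, for illustration only

standard = packets_per_second(RATE, 1500)
jumbo = packets_per_second(RATE, 9000)

print(f"1500B frames: {standard:.0f} packets/s")
print(f"9000B frames: {jumbo:.0f} packets/s")
print(f"ratio: {standard / jumbo:.1f}x")
```

for the same byte rate, jumbo frames cut the packet count (and the per-packet stack work) by about a factor of six, which would fit a CPU-overhead explanation for the speedup.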

> looking for. Do you eyeball raw tcpdump data or use wireshark to browse it?

I use perl to browse it ;)
I don't really know what wireshark offers, I'm afraid.

> How does Myrinet compare price-wise to IB/10GE? How does it compare in
> terms of reliability?

pricing is always squishy for this category of hardware.  you can see the 
myrinet list prices on the myricom website.  for 16 ports: 16*(495+75)+6600
comes to just under $1k/port.  using local (Canadian) HP public-sector
prices, IB is $1600/port.  I'd be surprised if both couldn't be bettered.
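
the myrinet arithmetic, spelled out.  the labels are assumptions: the figures above aren't itemized, so the sketch treats 495 and 75 as two per-port items and 6600 as a one-off fixed item (presumably the switch).

```python
def cost_per_port(ports: int, per_port_items: list[float], fixed_cost: float) -> float:
    """Total list price divided evenly across ports."""
    total = ports * sum(per_port_items) + fixed_cost
    return total / ports


# Figures from the message; what each one pays for is an assumption.
myrinet = cost_per_port(16, [495, 75], 6600)
print(f"myrinet: ${myrinet:.2f}/port")
```

the fixed item is amortized over the port count, so the per-port figure shifts with cluster size.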

> Interesting suggestion - I had a bit of hair-pulling getting it to
> smoothly handle jumbo frames though - I'm wondering how much hassle a
> 10G module would be :)

no additional hair, afaict.  I have the impression myri's 10G eth driver
works well, and there are at least a couple other 10G vendors who seem to 
like linux.

regards, mark hahn.


