[Beowulf] Help with inconsistent network performance
Joe Landman
landman at scalableinformatics.com
Tue Dec 18 15:27:22 PST 2007
As has been pointed out to me offline, my numbers may be a bit more
pessimistic than needed, in part due to pipelining and other effects. If my
numbers were the result of a correct analysis, the most you would be
able to see from a gigabit link would be about 37 MB/s for 1500 byte
packets. This is obviously not the case, so assume this to be a "worst
case" analysis (and I am going to go back and review what I seem to
have dropped from the TCP bits).
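
The 37 MB/s figure follows from charging every 1500 byte packet the full
41us (11us on the wire plus the 30us latency estimate from the quoted
message below), with nothing overlapped. A minimal sketch of that
arithmetic in Python; the function name is made up for illustration:

```python
def serialized_throughput_mb_s(packet_bytes, wire_us, latency_us):
    """Throughput in MB/s if each packet waits out wire time plus latency."""
    per_packet_s = (wire_us + latency_us) * 1e-6
    return packet_bytes / per_packet_s / 1e6

# 1500 byte packets, 11us transmit time, 30us assumed latency per packet
worst_case = serialized_throughput_mb_s(1500, wire_us=11.0, latency_us=30.0)
print(f"{worst_case:.1f} MB/s")  # about 36.6 MB/s
```

A real gigabit link keeps many packets in flight at once, which is why
measured throughput is far higher than this.
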
Joe
Joe Landman wrote:
> Hi Brendan:
>
> Brendan Moloney wrote:
>> I have a cluster of 8 Linux machines connected with gigabit
>> ethernet (full duplex) to a HP Procurve 2848 switch. I am using the
>> machines to do interactive distributed rendering. I have noticed that
>> the final gather stage (where the intermediate images from the render
>> nodes are sent back to the viewing node) has "hiccups" in the
>> performance. These
>
> How are they sent? NFS? Sockets? ...
>
>> hiccups occur with as few as two render nodes, and become more common
>> as I add more render nodes. With a 512x512 image the final gather
>> usually takes a few milliseconds for each frame, but when the hiccups
>> occur it is more like 200+ milliseconds.
>
> Is this "real time" rendering so that frame rate is the most important
> aspect?
>
>> Since it is a full duplex switched network, there should not be any
>> collisions happening. Since the image is less than 1 MB total, I don't
>
> There could be blocking ... if one unit grabs the single network pipe
> of the display node while another node tries to send data, then the
> late node will back off (well, with TCP it will) in a pre-determined manner.
>
>> think I am saturating the switch. I have checked the contents of
>> /sbin/ifconfig and there are zero erroneous packets being reported.
>> At this
>
> You wouldn't see it there. It would be on the switch, and even then the
> switch wouldn't term it a collision. It is a switch behaving normally.
>
>> point I am really at a loss as to what is causing this. Any input on
>> things to check would be greatly appreciated.
>
> I assume you have a single gigabit from the display node to the switch.
> As you scale up the number of render nodes, you notice more of these
> "hiccups" scaling about linearly with the number of nodes.
>
> This suggests resource contention. Each 512x512 image would be split
> into about 175 1500-byte packets. This assumes 8 bit images. If you are
> using 8 bits per color, 3 colors and an alpha channel, then this is ~700
> packets. Each 1500 byte packet takes about 11us to transmit, and has a
> non-trivial latency associated with it. I will estimate the latency at
> 30us (this is switch latency of ~ 5us + network stack latency on each
> side of about 12.5us). So for each packet, you have about 41us to
> transfer it. If you have 8 bit images, then this corresponds to 7.2
> ms per image. There may be some other caching effects that I am
> missing, or have mis-computed. For 32 bits (3x 8bit color channels + 1
> alpha channel), this is looking like 28.8 ms for each image. Best case
> you could do with this is about 34.7 frames per second.
>
> If on the other hand you used jumbo frames with 9000 byte packets, you
> would need 30 packets to transfer each 8 bit image. Each packet would
> require 67.1us to move, and still 30us of latency, for 97.1us per
> packet. For 30 packets, this is 2.9ms. For the 32 bit version as
> indicated previously (3x 8 bit color channels, and one alpha channel)
> this would be about 11.6ms, or 85.9 frames per second.
>
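
The per-frame estimates above can be reproduced with a short Python
sketch. It assumes every packet is full size, ignores header overhead,
and charges each packet the whole 30us latency with no pipelining. It
also takes "gigabit" as 2**30 bits/s, which is what the 11us and 67.1us
transmit times correspond to. The names are illustrative only:

```python
import math

GIGABIT = 2**30      # bits per second; consistent with 11us / 67.1us per packet
LATENCY_US = 30.0    # assumed: ~5us switch + 2 x 12.5us network stack

def frame_estimate(image_bytes, mtu_bytes):
    """Return (packets, ms per frame, frames per second) with no pipelining."""
    packets = math.ceil(image_bytes / mtu_bytes)
    wire_us = mtu_bytes * 8 / GIGABIT * 1e6
    frame_ms = packets * (wire_us + LATENCY_US) / 1000.0
    return packets, frame_ms, 1000.0 / frame_ms

for bytes_per_pixel in (1, 4):
    for mtu in (1500, 9000):
        packets, ms, fps = frame_estimate(512 * 512 * bytes_per_pixel, mtu)
        print(f"{bytes_per_pixel} B/pixel, mtu {mtu}: "
              f"{packets} packets, {ms:.1f} ms, {fps:.1f} fps")
```

The results land close to the quoted figures rather than exactly on
them. For example, the 32 bit jumbo frame case needs 117 packets rather
than 4 x 30, so it comes out slightly under 11.6ms.
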
> Based on this, I would suggest seeing if changing mtu to 9000 helps.
>
> ifconfig eth0 mtu 9000
>
> on all your nodes (every one).
>
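
Since the change has to be made on every node, it may be worth checking
that it took effect. A minimal sketch, assuming Linux nodes that expose
the interface MTU under /sys/class/net and an interface named eth0 as in
the command above; read_mtu is a hypothetical helper, not part of any
tool:

```python
from pathlib import Path

def read_mtu(interface="eth0", sysfs_root="/sys/class/net"):
    """Return the MTU the Linux kernel reports for an interface."""
    return int(Path(sysfs_root, interface, "mtu").read_text().strip())

if __name__ == "__main__":
    try:
        print("eth0 mtu:", read_mtu("eth0"))
    except (OSError, ValueError) as err:
        # No such interface, or not a Linux sysfs layout
        print("could not read eth0 mtu:", err)
```

Run it on each node after the ifconfig command.
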
> The argument for this is that you pay the per-packet latency fewer
> times, even though each packet takes longer to transmit.
>
> Another possibility is channel bonding on your display node.
>
>
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 866 888 3112
cell : +1 734 612 4615