[vortex] Packet-loss and/or interface blocking on 3c905C-TX cards
Gerrit Einhoff
einhoff@i4.informatik.rwth-aachen.de
Wed Apr 24 11:00:05 2002
Hello.
I got this address from the source of the 3c59x-driver. If this is the wrong
place to ask for help, please point me to a better one. Thanks.
We are experiencing weird problems with our 3Com NICs. We use the following
setup:
- 8 SMP (2 x 1GHz Pentium III) machines
- each has three 3Com 905C-TX NICs
- the first one is connected to a switch, the second via a patch cable to
the third of the next machine to form a ring
- kernel is linux-2.4.17-mosix (the mosix-stuff wasn't used during our tests)
To test the throughput of the interfaces I wrote a program that sends
UDP packets from one system to the next and measures throughput, losses,
jitter, etc. I used it to send batches of one million packets over the
second interface of one machine to the third of the next (i.e. over the
direct connection) with different packet sizes.
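For readers who want to reproduce the test, here is a minimal sketch of what
such a sender could look like. It is not the original program. It assumes that
"packet size" means the UDP payload size and that each payload starts with a
4-byte sequence number so the receiver can count losses; the helper names
(make_packet, send_batch, loss_percent, throughput_mbps) are made up for this
illustration.

```python
import socket
import struct
import time

# Assumption: each payload begins with a 4-byte sequence number
# in network byte order; the rest is zero padding.
HEADER = struct.Struct("!I")


def make_packet(seq, size):
    """Build a payload of exactly `size` bytes that starts with `seq`."""
    if size < HEADER.size:
        raise ValueError("payload too small for the sequence header")
    return HEADER.pack(seq) + b"\x00" * (size - HEADER.size)


def send_batch(sock, addr, count, size):
    """Send `count` packets of `size` bytes to `addr`; return elapsed seconds."""
    start = time.monotonic()
    for seq in range(count):
        sock.sendto(make_packet(seq, size), addr)
    return time.monotonic() - start


def loss_percent(sent, received_seqs):
    """Percentage of sent packets whose sequence number never arrived."""
    return 100.0 * (sent - len(set(received_seqs))) / sent


def throughput_mbps(received, size, seconds):
    """Payload throughput in Mbps for `received` packets of `size` bytes."""
    return received * size * 8 / seconds / 1e6
```

To force traffic over one particular interface, the sender can create a
socket.socket(socket.AF_INET, socket.SOCK_DGRAM) and bind() it to that
interface's local address before calling send_batch(). The receiver collects
the sequence numbers it sees and passes them to loss_percent().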
First we tried the latest drivers from www.scyld.com. Unfortunately,
after the first million packets the sending interface suddenly blocked:
it wouldn't send any more packets, although all send() calls still returned
successfully. This could be solved by an 'ifconfig eth1 down' + 'ifconfig
eth1 up', but only for the next million packets.
Next, we tried the driver distributed with the kernel sources. With that, the
interface didn't block anymore, even after millions of packets. You can see
the results at http://www.einhoff.de/plot.png (sorry, the legend is in
German. Red is bandwidth in Mbps, green is loss in %). As you can see, we
experienced packet losses of up to 80%, but only for packet sizes from 112
bytes to 354 bytes. These results are reliably reproducible. With a packet
size of 354 bytes we have losses of 80%, with 355 bytes no losses at all!
Apparently the packets are lost on the sending side, because the program
finishes five times faster with 354-byte packets than with 355-byte packets.
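The two numbers fit together. As a rough sanity check, assume the run time is
dominated by packets that actually reach the wire and that packets dropped
inside the sender cost almost no time. Then a loss fraction p should shorten
the run by a factor of 1 / (1 - p). The function name below is only
illustrative.

```python
def expected_speedup(loss_fraction):
    """Speedup of a send run if dropped packets cost (almost) no time.

    Assumption: only packets that are really transmitted take wire time,
    so the run takes (1 - loss_fraction) of the loss-free duration.
    """
    if not 0.0 <= loss_fraction < 1.0:
        raise ValueError("loss_fraction must be in [0, 1)")
    return 1.0 / (1.0 - loss_fraction)


# 80% loss at 354 bytes gives a factor of about 5.
speedup = expected_speedup(0.80)
```

A loss of 80% therefore predicts a run that is about five times faster, which
matches what we measured between 354 and 355 bytes.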
When we tried the scyld driver again with smaller packet batches, we noticed
that it produces losses too, this time from 91 to 385 bytes. We didn't notice
this the first time because the interface kept blocking.
Finally, we tried the driver from the 3Com website. This time we saw no
losses, but the interface started blocking again after a million packets.
To summarize:
- scyld-driver: losses for packet sizes from 91 to 385 bytes, interface
blocks after the first million packets.
- kernel-driver: losses for packet sizes from 112 to 354 bytes, no blocking.
- 3Com-driver: no losses, interface blocks after first million packets.
So we've seen all combinations of losses and blocking except the one we're
looking for...
We tried all this on several (identical) machines, so a hardware failure is
highly unlikely.
Any ideas?
Thanks,
Gerrit Einhoff