[vortex] Packet-loss and/or interface blocking on 3c905C-TX cards

Gerrit Einhoff einhoff@i4.informatik.rwth-aachen.de
Wed Apr 24 11:00:05 2002


Hello.

I got this address from the source of the 3c59x-driver. If this is the wrong 
place to ask for help, please point me to a better one. Thanks.

We are experiencing weird problems with our 3Com NICs. We use the following 
setup:

 - 8 SMP (2 x 1GHz Pentium III) machines
 - each has three 3Com 905C-TX NICs
 - the first one is connected to a switch, the second via a patch cable to 
the third of the next machine to form a ring
 - kernel is linux-2.4.17-mosix (the mosix-stuff wasn't used during our tests)

To test the throughput of the interfaces I wrote a program that sends 
UDP-packets from one system to the next and measures the 
throughput/losses/jitter/etc. I used it to send batches of one million 
packets over the second interface of one machine to the third of the next 
(i.e. over the direct connection) with different packet-sizes.
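For readers who want to reproduce the test, here is a minimal sketch of what 
the sending side of such a program might look like. It is not the original 
program. The 4-byte sequence header, the helper names and the assumption that 
"packet size" means the UDP payload size are all illustrative choices.

```python
import socket
import struct
import time

# Assumption: each payload starts with a 4-byte big-endian sequence number
# so the receiver can count which packets arrived.
HEADER = struct.Struct("!I")


def make_packet(seq, size):
    """Build a UDP payload of exactly `size` bytes carrying `seq`."""
    if size < HEADER.size:
        raise ValueError("payload too small for the sequence header")
    return HEADER.pack(seq) + b"\0" * (size - HEADER.size)


def parse_seq(payload):
    """Read the sequence number back out of a received payload."""
    return HEADER.unpack_from(payload)[0]


def loss_percent(sent, received):
    """Percentage of sent packets that never arrived."""
    return 100.0 * (sent - received) / sent if sent else 0.0


def throughput_mbps(total_bytes, seconds):
    """Payload throughput in megabits per second."""
    return total_bytes * 8 / seconds / 1e6


def send_batch(sock, dest, count, size):
    """Send `count` packets of `size` bytes to `dest`; return elapsed seconds."""
    start = time.monotonic()
    for seq in range(count):
        sock.sendto(make_packet(seq, size), dest)
    return time.monotonic() - start
```

A sender would create a socket with socket.socket(socket.AF_INET, 
socket.SOCK_DGRAM), bind it to the address of the second interface, and call 
send_batch() with the address of the neighbour's third interface. The 
receiver counts the distinct sequence numbers it sees and feeds that count 
into loss_percent().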

First we tried it with the latest drivers from www.scyld.com. Unfortunately, 
after the first million packets the sending interface suddenly blocked: it 
wouldn't send any more packets, although all send() calls still returned 
without error. This could be worked around with 'ifconfig eth1 down' + 
'ifconfig eth1 up', but only for the next million packets.

Next, we tried the driver distributed with the kernel sources. With that, the 
interface didn't block anymore, even after millions of packets. You can see 
the results at http://www.einhoff.de/plot.png (sorry, the legend is in 
German. Red is bandwidth in Mbps, green is losses in %). As you can see, we 
experienced packet losses of up to 80%, but only for packet sizes from 112 
bytes to 354 bytes. These results are reliably reproducible. With a packet 
size of 354 bytes we have losses of 80%, with 355 bytes no losses at all! 
Apparently the packets are lost on the sending side, because the program 
finishes five times faster with 354-byte packets than with 355-byte packets.
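The reasoning behind that conclusion can be checked with a little arithmetic. 
This sketch assumes the run time is dominated by packets actually put on the 
wire and that a packet dropped inside the sender costs almost no time. The 
function name is made up for illustration.

```python
def expected_speedup(sent, lost):
    """How much faster a batch finishes if `lost` packets never hit the wire."""
    on_wire = sent - lost
    if on_wire <= 0:
        raise ValueError("no packets reached the wire")
    return sent / on_wire


# 80% of one million packets dropped before transmission
print(expected_speedup(1_000_000, 800_000))  # 5.0
```

An 80% loss on the sending side therefore matches a run that finishes about 
five times faster. Packets lost on the wire or at the receiver would not 
shorten the sender's run in this way.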

When we tried the scyld-driver again, with smaller packet batches, we noticed 
that it produces losses too, this time from 91 to 385 bytes. We didn't notice 
this the first time because the interface kept blocking.

Finally, we tried the driver from the 3Com website. This time we saw no 
losses, but the interface started blocking again after a million packets.

To summarize:

 - scyld-driver: losses for packet sizes from 91 to 385 bytes, interface 
blocks after the first million packets.
 - kernel-driver: losses for packet sizes from 112 to 354 bytes, no blocking.
 - 3Com-driver: no losses, interface blocks after first million packets.

So we've seen all combinations of losses and blocking except the one we're 
looking for...

We tried all this on several (identical) machines, so a hardware-failure is 
highly unlikely.

Any ideas?

Thanks,

      Gerrit Einhoff