[vortex-bug] 3c59x new bug ?

Emmanuel Fleury fleury@cs.auc.dk
Tue, 27 Mar 2001 11:50:28 +0200


Hi,

A friend and I are testing the 2.4 kernel under high bandwidth. We want
to run some experiments on the netfilter stateful firewall.

Our first test used only the plain 2.4 kernel, to calibrate the
experiment. We then noticed a bottleneck in the network (we had a
Pentium 75 box as a router in a test network designed for such
experiments, so we knew how the network behaved without the 2.4 box).

We looked at the connections (using tcpdump) and noticed a large
amount of packet loss.

Actually, the high bandwidth has nothing to do with it. The packet
loss stays the same whatever bandwidth is used.

We first suspected broken hardware (network cable, NIC, ...), but it
happens again and again, except when we put a different network card
(EtherExpress 10/100) in place of one of the two 3C905B-TX cards. In
that case the bottleneck appears in one direction only.

Then we started looking at the log messages and found this one (several
times) from the kernel (in /var/log/kern.log):

Mar 25 10:32:03 cerium kernel: eth0: Transmit error, Tx status register 82.
Mar 25 10:32:03 cerium kernel:   Flags; bus-master 1, full 0; dirty 394461(13) current 394461(13).
Mar 25 10:32:03 cerium kernel:   Transmit list 00000000 vs. c3b412d0.
Mar 25 10:32:03 cerium kernel:   0: @c3b41200  length 8000004a status 0001004a
Mar 25 10:32:03 cerium kernel:   1: @c3b41210  length 8000004a status 0001004a
Mar 25 10:32:03 cerium kernel:   2: @c3b41220  length 80000052 status 00010052
Mar 25 10:32:03 cerium kernel:   3: @c3b41230  length 80000147 status 00010147
Mar 25 10:32:03 cerium kernel:   4: @c3b41240  length 8000004a status 0001004a
Mar 25 10:32:03 cerium kernel:   5: @c3b41250  length 80000052 status 00010052
Mar 25 10:32:03 cerium kernel:   6: @c3b41260  length 8000004a status 0001004a
Mar 25 10:32:03 cerium kernel:   7: @c3b41270  length 80000123 status 00010123
Mar 25 10:32:03 cerium kernel:   8: @c3b41280  length 80000052 status 00010052
Mar 25 10:32:03 cerium kernel:   9: @c3b41290  length 8000012b status 0001012b
Mar 25 10:32:03 cerium kernel:   10: @c3b412a0  length 8000004a status 0001004a
Mar 25 10:32:03 cerium kernel:   11: @c3b412b0  length 80000052 status 00010052
Mar 25 10:32:03 cerium kernel:   12: @c3b412c0  length 800004a3 status 800104a3
Mar 25 10:32:03 cerium kernel:   13: @c3b412d0  length 8000004a status 0001004a
Mar 25 10:32:03 cerium kernel:   14: @c3b412e0  length 80000052 status 00010052
Mar 25 10:32:03 cerium kernel:   15: @c3b412f0  length 8000004a status 0001004a
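
To make the dump easier to read, here is a minimal Python sketch that
pulls the ring entries and the dirty/current counters out of such log
lines. The sample lines are copied from the log above, with wrapped
lines joined. Splitting each 32-bit word into a low 16-bit byte count
and high 16-bit flags is an assumption made for illustration, not
something taken from the driver source; the function names are
hypothetical.

```python
import re

# Sample lines copied from the kernel log (wrapped lines joined).
LOG = """\
Mar 25 10:32:03 cerium kernel:   Flags; bus-master 1, full 0; dirty 394461(13) current 394461(13).
Mar 25 10:32:03 cerium kernel:   11: @c3b412b0  length 80000052 status 00010052
Mar 25 10:32:03 cerium kernel:   12: @c3b412c0  length 800004a3 status 800104a3
Mar 25 10:32:03 cerium kernel:   13: @c3b412d0  length 8000004a status 0001004a
"""

RING_SIZE = 16  # the dump lists entries 0..15

ENTRY_RE = re.compile(
    r"(\d+): @([0-9a-f]{8})\s+length ([0-9a-f]{8}) status ([0-9a-f]{8})"
)
COUNTER_RE = re.compile(r"dirty (\d+)\((\d+)\) current (\d+)\((\d+)\)")


def parse_entries(text):
    """Return one dict per ring entry found in the log text."""
    entries = []
    for m in ENTRY_RE.finditer(text):
        length = int(m.group(3), 16)
        status = int(m.group(4), 16)
        entries.append({
            "index": int(m.group(1)),
            "addr": int(m.group(2), 16),
            # Assumption: low 16 bits hold a byte count, high 16 bits flags.
            "length_bytes": length & 0xFFFF,
            "length_flags": length >> 16,
            "status_bytes": status & 0xFFFF,
            "status_flags": status >> 16,
        })
    return entries


def parse_counters(text):
    """Return (dirty, dirty_slot, current, current_slot) from the Flags line."""
    m = COUNTER_RE.search(text)
    if m is None:
        raise ValueError("no dirty/current counters found")
    return tuple(int(g) for g in m.groups())


if __name__ == "__main__":
    dirty, dirty_slot, current, current_slot = parse_counters(LOG)
    print("dirty slot:", dirty % RING_SIZE, "current slot:", current % RING_SIZE)
    for e in parse_entries(LOG):
        print(e["index"], hex(e["addr"]), e["length_bytes"], hex(e["status_flags"]))
```

Read this way, both counters point at slot 13 (394461 modulo 16), the
address printed after "vs." (c3b412d0) is the address of entry 13, and
entry 12 is the only one whose status word has its top bit set.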

When you look at the tcpdump logs, this problem looks very similar to
http://www.cs.unc.edu/~jeffay/dirt/FAQ/fxp_drain.html

I think that the 3C59x holds back several packets and at some point (I
don't know when!) sends them all very quickly. The receiver is then
overloaded and the sender has to resend the data.
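
One way to check this hypothesis against a capture is to group packet
timestamps into bursts. The sketch below is only an illustration: the
function name, the thresholds, and the sample timestamps are all made
up, and the timestamps would have to be extracted from the tcpdump
output first.

```python
def find_bursts(timestamps_us, max_gap_us=1000, min_size=3):
    """Group sorted packet timestamps (in microseconds) into bursts.

    A burst is a run of at least `min_size` packets in which each packet
    follows the previous one by at most `max_gap_us` microseconds.
    Both thresholds are arbitrary illustrative values.
    """
    bursts = []
    current = []
    for t in timestamps_us:
        # A gap larger than max_gap_us closes the current run.
        if current and t - current[-1] > max_gap_us:
            if len(current) >= min_size:
                bursts.append(current)
            current = []
        current.append(t)
    if len(current) >= min_size:
        bursts.append(current)
    return bursts


if __name__ == "__main__":
    # Hypothetical timestamps: evenly spaced packets, a long pause,
    # then four packets sent back to back.
    sample = [0, 50_000, 100_000, 300_000, 300_200, 300_400, 300_600, 350_000]
    for burst in find_bursts(sample):
        print(len(burst), "packets within", burst[-1] - burst[0], "us")
```

If the card really queues packets and then flushes them, the capture
should show long gaps followed by runs like the four-packet group in
the sample.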

My question is: is this a new bug? And if not, is there a patch for
it? I have looked hard for one but didn't find any.

Thanks
-- 
Emmanuel

I'm killing time while I wait for life to shower me with meaning and
happiness.  -- Calvin & Hobbes