[3c509] 3c509B TCP overflow trouble
Marco Emilio Poleggi
poleggi@dis.uniroma1.it
Fri, 26 Oct 2001 15:45:41 +0200
Donald Becker wrote:
> On Mon, 22 Oct 2001, Marco Emilio Poleggi wrote:
>
>
>>I noticed an anomalous behaviour of my 3c509B TCP ("combo") card: it
>>gets stuck intermittently (can neither receive nor transmit) for some
>>minutes. This happens on both connectors (UTP and BNC), and the
>>log reports only this line:
>>
>> kernel: _M_str_putnext: queue overflow: dropping a message
>>
>>I tried to enlarge /proc/sys/net/core/netdev_max_backlog, but nothing
>>changed (only the kernel messages disappeared!).
>>I compiled the module statically into a 2.4.5 kernel. I noticed the same
>>trouble on a 2.2.16 kernel too.
>>
>
> Presumably this error message is from the 2.4 kernel. Your problem
> isn't with the 3c509 device driver, it's with the kernel.
Yup! Most probably you're right: in fact, I'm now trying a 3c905b with similar
results, as the dmesg output shows:
eth0: Transmit error, Tx status register 82.
Flags; bus-master 1, dirty 23901(13) current 23905(1)
Transmit list 0d512200 vs. cd512540.
0: @cd512200 length 800005ea status 800005ea
1: @cd512240 length 800005ea status 000105ea
2: @cd512280 length 800005ea status 000105ea
3: @cd5122c0 length 800005ea status 000105ea
4: @cd512300 length 800005ea status 000105ea
5: @cd512340 length 800005ea status 000105ea
6: @cd512380 length 800005ea status 000105ea
7: @cd5123c0 length 800005ea status 000105ea
8: @cd512400 length 800005ea status 000105ea
9: @cd512440 length 800005ea status 000105ea
10: @cd512480 length 800005ea status 000105ea
11: @cd5124c0 length 800005ea status 000105ea
12: @cd512500 length 800005ea status 000105ea
13: @cd512540 length 800005ea status 000105ea
14: @cd512580 length 800005ea status 000105ea
15: @cd5125c0 length 800005ea status 000105ea
_M_str_putnext: queue overflow: dropping a message
_M_str_putnext: queue overflow: dropping a message
I found that the last messages (_M_str_putnext...) come from an external module,
but I don't know whether they're related to the network malfunctions. In any case,
I've removed that module to see if things improve...
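As a side note, the numbers in the "Flags" line of the dump can be cross-checked with a few lines of Python. This is only a sketch under two assumptions that the post itself does not confirm: that the ring holds 16 descriptors (inferred from the 16 entries, 0 to 15, in the dump), and that "dirty" and "current" are free-running counters whose parenthesized values are the matching ring indices. The names RING_SIZE and parse_flags_line are made up for the illustration.

```python
import re

# Assumption: ring size inferred from the 16 entries (0..15) in the dump.
RING_SIZE = 16


def parse_flags_line(line):
    """Extract the dirty/current counters and their printed ring indices."""
    m = re.search(r"dirty (\d+)\((\d+)\) current (\d+)\((\d+)\)", line)
    if m is None:
        raise ValueError("not a Tx ring 'Flags' line")
    dirty, dirty_idx, cur, cur_idx = map(int, m.groups())
    return dirty, dirty_idx, cur, cur_idx


# Line copied from the dmesg output above.
line = "Flags; bus-master 1, dirty 23901(13) current 23905(1)"
dirty, dirty_idx, cur, cur_idx = parse_flags_line(line)

# Assumption: the value in parentheses is the counter modulo the ring size.
indices_consistent = (
    dirty % RING_SIZE == dirty_idx and cur % RING_SIZE == cur_idx
)

# Distance between the two counters.
pending = cur - dirty

print(indices_consistent, pending)
```

With the values from the dump, both indices agree with the modulo-16 reading (23901 mod 16 = 13, 23905 mod 16 = 1), and the two counters are 4 apart. The dirty index 13 also matches entry 13 at @cd512540, the second address on the "Transmit list" line.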
>
> What is the error message with 2.2.16?
No errors! The network just gets stuck...
Anyway, I don't want to abuse this mailing list, but I'd like to know two things:
1) Why does the log above show a Tx error on eth0, whereas 'netstat -i' doesn't
report it (see below)?
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0   1500   0 2271047   2275      0      0 25415      0      0      0 BRU
lo    16436   0   20185      0      0      0 20185      0      0      0 LRU
2) I noticed that during the network malfunctions ARP resolution stops working
(the 'arp' command never returns), so the local network is unreachable (except
the gateway!), while the outside world is still reachable (e.g. via HTTP). So I
conjecture that the problem lies in ARP handling.
Does anybody know of ARP problems with 2.4.5 kernels?
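To make the counters in question 1 easier to compare, the 'netstat -i' rows can be split into named fields. The snippet below is only a sketch: the header and rows are copied from the output above, while parse_rows and the other names are hypothetical helpers, not part of any tool.

```python
# Header and rows copied verbatim from the 'netstat -i' output above.
HEADER = "Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg"
ROWS = [
    "eth0 1500 0 2271047 2275 0 0 25415 0 0 0 BRU",
    "lo 16436 0 20185 0 0 0 20185 0 0 0 LRU",
]


def parse_rows(header, rows):
    """Return {iface: {column: value}}, converting the counters to int."""
    names = header.split()
    table = {}
    for row in rows:
        fields = row.split()
        if len(fields) != len(names):
            raise ValueError("unexpected number of columns: %r" % row)
        record = dict(zip(names, fields))
        # Every column except the first (Iface) and last (Flg) is numeric.
        for key in names[1:-1]:
            record[key] = int(record[key])
        table[record["Iface"]] = record
    return table


table = parse_rows(HEADER, ROWS)
eth0 = table["eth0"]

# Receive errors relative to successfully received packets.
rx_err_rate = eth0["RX-ERR"] / eth0["RX-OK"]

print("eth0 RX-ERR:", eth0["RX-ERR"], "TX-ERR:", eth0["TX-ERR"])
print("eth0 RX error rate: %.4f%%" % (100 * rx_err_rate))
```

On this data eth0 reports 2275 receive errors and no transmit errors at all, which is exactly the discrepancy with the dmesg "Transmit error" line.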
Bye!
m.e.p.
--
________________________________________________
Ing. Marco Emilio Poleggi
Universita' degli Studi di Roma "La Sapienza"
Dipartimento di Informatica e Sistemistica
Via Salaria 113, 00198 Roma - Italy
Tel: +39 06 49918479 Fax: +39 06 85300849
E-mail: poleggi@dis.uniroma1.it