network interface zombies randomly [was Re: interface dies under network load on SMP machines]
Al Youngwerth
alberty@apexxtech.com
Fri Aug 28 12:19:13 1998
At 05:50 PM 8/27/98 -0700, david wrote:
>Reply to mail from Mike Simons about interface dies under network load on
>SMP machines
>-----------------
>>> The interface goes dead from time to time without leaving any log
>>> messages - a simple ifconfig down/up brings it back to life. To
>>> stabilise the systems I've written a small program that checks the
>>> network and restarts the interface if required. So everything is
>>> nearly perfectly fine...
>[some parts snipped]
>
>speculations:
>
> - it may be the tulip driver
> - it may be the [xxx] driver
> - it may be the kernel
> - it may be a subtle bug somewhere in the network/device path that's not
>   directly related
>
>I used to feel that it was the tulip driver. Now I am not sure. The
>default version of tulip.c that comes in the 2.1 tarball works fine. If I
>get any of the most recent tulip.c files from cesdis, then the above
>anomaly appears.
>
>I.e., random dying of the interface. It can happen within minutes, it can
>take days. I have not yet found a pattern.
>
>As is mentioned, ifconfig down/up fixes it.
>
>If I recall correctly, packets aren't lost by doing ifconfig down/up. For
>example, ping <host> will flood your screen with all the latent packets
>finally getting through.
>
>comments?
I've been seeing a similar but slightly different problem. Every now and
then a tulip interface becomes "bursty": if you ping the card, you see no
replies for 5 seconds, then 5 replies all at once, then nothing for another
5 seconds, then 5 more replies at once, and so on.
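The bursty pattern can be spotted mechanically from the arrival times of
the ping replies. A rough sketch, assuming you have already collected the
reply arrival timestamps (in seconds) yourself. The function names, the
0.2 second window and the 3-reply threshold are arbitrary choices for
illustration, not anything measured:

```python
def cluster_sizes(arrivals, window=0.2):
    """Group reply arrival times into clusters and return the cluster sizes.

    Two consecutive replies belong to the same cluster when they arrive
    no more than `window` seconds apart (hypothetical threshold).
    """
    sizes = []
    prev = None
    for t in sorted(arrivals):
        if prev is not None and t - prev <= window:
            sizes[-1] += 1      # arrived right behind the previous reply
        else:
            sizes.append(1)     # a gap: start a new cluster
        prev = t
    return sizes


def looks_bursty(arrivals, window=0.2, min_burst=3):
    """Return True if any cluster holds at least `min_burst` replies."""
    return any(n >= min_burst for n in cluster_sizes(arrivals, window))
```

With a healthy interface and one ping per second, every cluster has size 1
and looks_bursty() returns False. With replies held back and released
together, as described above, the clusters grow to several replies each.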
My setup: Linux 2.0.31, tulip driver 0.89K, Lite-On PNIC.
An ifconfig down/up fixes the problem.
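For anyone wanting to automate the workaround, the check-and-restart
program mentioned in the quoted message could look roughly like the sketch
below. This is not the original poster's program, only a minimal guess at
the idea. The function names are made up for illustration, the ping flags
(-c, -w) assume a Linux iputils-style ping, and ifconfig needs root. The
`run` parameter exists only so the logic can be exercised without touching
a real interface.

```python
import subprocess


def interface_alive(host, run=subprocess.run):
    """Return True if a single ping to `host` gets a reply."""
    # -c 1: send one packet; -w 5: give up after 5 seconds
    # (assumes iputils-style ping flags).
    result = run(["ping", "-c", "1", "-w", "5", host],
                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def restart_interface(iface, run=subprocess.run):
    """Take the interface down and bring it back up (needs root)."""
    run(["ifconfig", iface, "down"], check=True)
    run(["ifconfig", iface, "up"], check=True)


def watchdog_once(iface, host, run=subprocess.run):
    """Do one check; restart `iface` if `host` does not answer.

    Returns True if a restart was performed, False otherwise.
    """
    if interface_alive(host, run):
        return False
    restart_interface(iface, run)
    return True
```

Call watchdog_once() from cron or a sleep loop, passing the interface name
and some host that should always answer. A single lost ping triggers a
restart here, so a real version would probably want a few retries first.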
Thanks,
Al Youngwerth
alberty@apexxtech.com