network interface zombies randomly [was Re: interface dies under network load on SMP machines]

Al Youngwerth alberty@apexxtech.com
Fri Aug 28 12:19:13 1998


At 05:50 PM 8/27/98 -0700, david wrote:
>Reply to mail from Mike Simons about interface dies under network load on
>SMP machines
>-----------------
>>>    The interface goes dead from time to time without leaving any log
>>> messages - a simple ifconfig down/up brings them back to life. To
>>> stabilise the systems I've written a small program that checks the
>>> network and restarts the interface if required. So everything is 
>>> nearly perfectly fine...
>[some parts snipped]
>
>stipulations:
>
> - it may be the tulip driver
> - it may be the [xxx] driver
> - it may be the kernel
> - it may be a subtle bug somewhere in the network/device path that's not
>    directly related
>
>I used to feel that it was the tulip driver.  Now I am not sure.  The
>default version of tulip.c that comes in the 2.1 tarball works fine.  If I
>get any of the most recent tulip.c files from cesdis, then the above
>anomaly appears.
>
>I.e., random dying of the interface.  It can happen within minutes, it can
>take days.  I have not yet found a pattern.
>
>As is mentioned, ifconfig down/up fixes it.
>
>If I recall correctly, packets aren't lost by doing ifconfig down/up.  For
>example, ping <host> will flood your screen with all the latent packets
>finally getting through.
>
>comments?

I've been seeing a similar but slightly different problem: every now and
then a tulip interface becomes "bursty". In this state, if you ping the
card, you see no replies for 5 seconds, then 5 replies all at once, then
nothing for another 5 seconds, then 5 more replies at once, and so on.
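
To make "bursty" a bit more concrete, here is a rough sketch of how one
might flag the pattern from the arrival times of ping replies. It is only
an illustration, not something from the driver or from any tool mentioned
in this thread: the function name is_bursty and the parameters min_burst
and max_gap are made up, and the thresholds are guesses. It assumes pings
are sent about once a second, so replies that land almost on top of each
other must have been held up somewhere.

```python
def is_bursty(arrival_times, min_burst=3, max_gap=0.1):
    """Return True if replies arrive in clumps.

    arrival_times: reply arrival timestamps in seconds, in order.
    min_burst:     (hypothetical threshold) how many replies must be
                   clumped together before we call it a burst.
    max_gap:       (hypothetical threshold) consecutive replies closer
                   together than this many seconds count as clumped.
    """
    run = 1  # length of the current clump of closely spaced replies
    for prev, cur in zip(arrival_times, arrival_times[1:]):
        if cur - prev < max_gap:
            run += 1
            if run >= min_burst:
                return True
        else:
            run = 1  # gap is normal again, start a new clump
    return False
```

With one ping a second, a healthy interface gives arrival times about a
second apart and the function returns False; five replies landing within
a few hundredths of a second of each other would return True.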

Linux 2.0.31, tulip 0.89K, Lite-on PNIC

An ifconfig down/up fixes the problem.
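
For anyone who wants to automate the workaround, below is a minimal
sketch of a watchdog along the lines of the "small program" mentioned in
the quoted mail. It is not that program; it is just one way it could be
done. The function names, the 5 second timeout, and the choice of a
single ping as the health check are all assumptions. It shells out to the
standard ping and ifconfig commands, so it needs to run as root to bounce
the interface.

```python
import subprocess


def link_is_alive(host, run=subprocess.run, timeout=5):
    """Send one ping to host; True if a reply came back in time.

    The run parameter exists only so the check can be exercised without
    touching the network; by default it is subprocess.run.
    """
    try:
        result = run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # assumed value; give up after this many seconds
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def bounce_interface(iface, run=subprocess.run):
    """Take the interface down and bring it back up (needs root)."""
    run(["ifconfig", iface, "down"], check=True)
    run(["ifconfig", iface, "up"], check=True)


def check_once(iface, host, run=subprocess.run):
    """Bounce iface if host does not answer. Returns True if it bounced."""
    if link_is_alive(host, run=run):
        return False
    bounce_interface(iface, run=run)
    return True
```

Calling check_once("eth0", "<some host on the local net>") every minute
or so from cron or a sleep loop would approximate the workaround. Note
that a single lost ping will trigger a bounce, so a real version would
probably want to require a few failures in a row first.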

Thanks,

Al Youngwerth
alberty@apexxtech.com