eepro100 Transmit timeouts - time to cut my losses? & misc

Jay Freeman (saurik) saurik@saurik.com
Wed Sep 15 20:14:43 1999


Personally, I am just going to stop using the card as my main interface,
get myself a PCI 3c905, and use that instead, but continue trying the new
drivers and looking for a better diagnosis (and if the card ever does work,
maybe link the two together or something using ifenslave).  This will
actually make it easier to work with the card, since it will operate as a
secondary device whose driver can be unloaded, upgraded to the next version,
and loaded again all the time :P.  Having seen the claims that certain types
of traffic seem to cause a lot of the problems, and personally only getting
the timeouts while doing SMB file sharing and running X terminals
(specifically moving windows down really fast), I am thinking about seeing
whether there is a pattern to the data that is sent right before the chip
goes wacko.  Doing that is hard when the only card you have is the one that
isn't working :).

Here is something strange:  After upgrading to 1.09l I am still getting
problems, but I am getting fewer actual "Transmit timed out" messages, and
am instead getting transmission errors every now and then that temporarily
lock up a single connection while it waits to retry.  Probably useless
information, but information nonetheless.

From an `ifconfig eth0`:
  TX packets:534883 errors:144 dropped:0 overruns:2672 carrier:0
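
To put those counters in proportion, here is a minimal Python sketch that splits the TX line above into its fields and works out errors and overruns per transmitted packet. The helper name `parse_tx_line` is made up for illustration, and the sketch assumes the `name:value` layout shown in the line above; other ifconfig versions may format this line differently.

```python
import re


def parse_tx_line(line):
    """Return the name:value counters found on an ifconfig TX line as ints.

    Hypothetical helper: assumes counters appear as name:value pairs,
    as in the line quoted in this message.
    """
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", line)}


tx = parse_tx_line("TX packets:534883 errors:144 dropped:0 overruns:2672 carrier:0")

# Ratios relative to the total number of transmitted packets.
error_rate = tx["errors"] / tx["packets"]
overrun_rate = tx["overruns"] / tx["packets"]

print(f"errors per packet:   {error_rate:.4%}")
print(f"overruns per packet: {overrun_rate:.4%}")
```

Running the same calculation on counters taken before and after a driver upgrade would show whether the rates are actually changing, rather than just the totals.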

My "Transmit timeout" messages have stopped going to my syslog, though, but
are still going to my console.  I don't remember changing anything in my
syslog.conf, so I am probably just hallucinating :).

On a side note, I am curious: is there any particular reason you don't use
modules on servers?  I have started advocating the use of modules as much as
possible on all mission-critical applications, and would like to know what
possible problems you are seeing so I know whether I should adjust my
thinking.  My main point is that modules seem to allow for the least amount
of down time when a problem occurs that requires an upgrade, such as this
Ethernet card.  When a new version of this driver comes out that fixes a
particular problem, I can upgrade to the new version, taking the Ethernet
driver out of memory for no more than 15 seconds, and not even sever any of
the existing, long-term connections that might be in place.  The best
explanation I could come up with is that you wanted as much performance as
you could get, at the expense of availability, but I noticed that you
mentioned that your client wanted "24/7 connectivity", which is why I bring
it up.

Since no one else commented on your mention of IPv6 (at least, not that I
saw), I thought I would mention that I do not have IPv6 compiled into my
kernel, yet I was still encountering these timeout problems :(.

Sincerely,
Jay Freeman (saurik)
saurik@saurik.com

-----Original Message-----
From: owner-linux-eepro100@beowulf.gsfc.nasa.gov
[mailto:owner-linux-eepro100@beowulf.gsfc.nasa.gov] On Behalf Of David Ford
Sent: Tuesday, September 14, 1999 11:20 AM
To: Osma Ahvenlampi; linux-eepro100@beowulf.gsfc.nasa.gov
Subject: Re: eepro100 Transmit timeouts - time to cut my losses?


Osma Ahvenlampi wrote:

> You guys ARE disabling the hardware multicast filter with
>
> options eepro100 multicast_filter_limit=3
>
> (in /etc/conf.modules, assuming you're using modules), right? I have
> half a dozen eepro100's in various PCs running Linux (all single
> processor, though), all rock solid with this option, all crash within
> a few minutes of uptime without it.

no, i never use modules on servers.

i have eepro100 cards elsewhere and all run fine without problems if i do
-not- compile ipv6 support in the kernel.  this was the solution from a
couple months back from the alan cox and assoc.

-d