RTL 8129/39 transmit timeout error messages

Christopher Schanzle chris@cam.nist.gov
Tue Mar 2 13:52:35 1999


Sir Donald Becker (I bow to thee with honor and respect),

We've just brought up a cluster of eight 400 MHz machines with the RTL
8139 driver version 1.04 (9/22/98 release), running Red Hat 5.2 (with
all updates).  All eight share a non-switched 100 Mbit hub.  With six
machines doing PVM work, all is fine.  When we bump it up to seven or
eight machines, we get errors on the console and much less work gets
done:

eth0: Transmit timeout, status 0d 0000 media 00
eth0 Tx queue start entry 73460 dirty entry 73456

These pairs of lines repeat on the console intermittently.  Other
occurrences have reported these numbers in the second line:

9850, 9846
1586, 1582
2593, 2589
827, 823
449, 445
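
For what it's worth, the gap between the two numbers looks the same in
every report.  Here is a minimal Python sketch that subtracts the dirty
entry from the start entry for each pair quoted in this message.  The
names `pairs` and `outstanding` are my own and do not come from the
driver:

```python
# (start entry, dirty entry) pairs copied from the console messages
# quoted in this report; nothing here is read from the driver itself.
pairs = [
    (73460, 73456),
    (9850, 9846),
    (1586, 1582),
    (2593, 2589),
    (827, 823),
    (449, 445),
]


def outstanding(entries):
    """Return start - dirty for each (start, dirty) pair."""
    return [start - dirty for start, dirty in entries]


gaps = outstanding(pairs)
print(gaps)  # [4, 4, 4, 4, 4, 4]
```

Every pair differs by exactly 4.  I am only guessing at what the two
counters mean, but if they mark the next slot to fill and the oldest
slot not yet cleaned up, then the same number of packets is pending at
each timeout.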

Usually the interface recovers, but sometimes not.  Do you have any
suggestions?

I plan on swapping out the 100 Mbit hub today with something else just
to make sure it's not the hub.

I enjoyed your code comments that basically state these cards stink to
high heaven.  We are investigating getting something based on the Tulip
21140-AF chip, but in the meantime, we've got 16 or so more machines to
bring up and would like to have a plan.  I've got a Kingston KNE 100TX
card that appears to work well -- comments?

Thanks a bunch,

Chris Schanzle
Unix System Administrator