[tulip] ton o' errors with netgear NIC

undefined@engineer.com
Tue, 23 May 2000 01:08:32 +0000


Mr. Becker and other mailing-list participants,

In my home I have 2 computers, each with a Netgear card, connected merely 
by a crossover cable.  Under light network-traffic loads running 
interactive network applications (telnet, ssh, linuxconf, swat for samba) 
between the two computers, the two computers are happy.  But when 
transferring (ftp, scp) a file greater than (approx.) 100 KB between the
computers, one computer complains loudly of errors.  Large files transferred
in one specific direction actually slow to a pathetic crawl after the first
600 KB (see trouble-shooting below for a more detailed description).  So 
it's not just errors, but the problem causes severe network slow-downs.

I have read the mailing-list archives as far back as Jan '99 and did not 
see a problem posted similar to mine, though after the first 3 hours of 
reading the problems all seem the same. I have tried to include all 
information that I saw included and saw requested in previous postings from 
the archive.  I have also included general computer hardware and 
configuration information.  Below is all information I could think of 
extracting from both computers relevant to this problem, and following that 
is a description of my trouble-shooting to date.

NOTE: I simply refer to the two computers as "tulip" and "ne2k" as those 
are generic descriptions of the installed NICs and the easiest way to 
distinguish between the two computers on a mailing-list specifically 
concerning NIC drivers.

tulip:
Netgear FA310TX PCI
Red Hat 6.1
kernel 2.2.12-20
tulip.c v0.89H 5/23/98 and upgraded to v0.92 4/17/2000
circa '98-'99 PII-450 (440BX chipset)

ne2k:
Netgear EA201c
Red Hat 6.1
kernel 2.2.12-20
ne.c v1.10 9/23/94
circa '94 486 (ISA & VLB only)

connected by cat5 crossover cable

EDITORIAL NOTE: Hardware addresses withheld due to privacy concerns.  If 
the hardware addresses are really necessary for trouble-shooting, then 
please request that I mail them directly to your email address, and not to 
a public forum.

--- dmesg on tulip ---

tulip.c:v0.92 4/17/2000  Written by Donald Becker <becker@scyld.com>
   http://www.scyld.com/network/tulip.html
eth0: Lite-On 82c168 PNIC rev 32 at 0xc40f1000, **:**:**:**:**:**, IRQ 5.
eth0:  MII transceiver #1 config 3000 status 7829 advertising 01e1.

--- /var/log/messages on tulip ---

no errors or any references

--- eth0 in /proc/net/dev on tulip ---

    Receive
bytes    packets errs drop fifo frame compressed multicast
  1008022    1427  523    0    0  1046          0         0
   Transmit
bytes    packets errs drop fifo colls carrier compressed
  1662749    1855  307    0    0   276     307          0

--- ifconfig eth0 on tulip ---

Link encap:Ethernet  HWaddr **:**:**:**:**:**
inet addr:192.168.0.2  Bcast:192.168.0.255  Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:1427 errors:523 dropped:0 overruns:0 frame:1046
TX packets:1855 errors:307 dropped:0 overruns:0 carrier:307
collisions:276 txqueuelen:100
Interrupt:5 Base address:0x1000

--- mii-diag on tulip ---

mii-diag.c:v2.00 4/19/2000  Donald Becker (becker@scyld.com)
  http://www.scyld.com/diag/index.html
Basic registers of MII PHY #1:  3000 782d 0040 6212 01e1 0021 0000 0000.
  Basic mode control register 0x3000: Auto-negotiation enabled.
  You have link beat, and everything is working OK.
  Your link partner is generating 10baseT link beat  (no autonegotiation).

--- tulip-diag on tulip ---

tulip-diag.c:v2.00 4/19/2000 Donald Becker (becker@scyld.com)
  http://www.scyld.com/diag/index.html
Index #1: Found a Lite-On 82c168 PNIC adapter at 0xd400.
  Port selection is MII, half-duplex.
  Transmit started, Receive started, half-duplex.
   The Rx process state is 'Waiting for packets'.
   The Tx process state is 'Idle'.
   The transmit threshold is 72.
  MII PHY found at address 1, status 0x782d.
  MII PHY #1 transceiver registers:
    3000 782d 0040 6212 01e1 0021 0000 0000
    0000 0000 0000 0000 0000 0000 0000 0000
    5000 0000 0000 0000 0000 0000 0300 0000
    003c 8006 0f00 ff00 002c 4000 0080 000b.
  Basic mode control register 0x3000: Auto-negotiation enabled.
  Basic mode status register 0x782d ... 782d.
    Link status: established.
    Capable of  100baseTx-FD 100baseTx 10baseT-FD 10baseT.
    Able to perform Auto-negotiation, negotiation complete.
  Vendor ID is 00:10:18:--:--:--, model 33 rev. 2.
    No specific information is known about this transceiver type.
  I'm advertising 01e1: 100baseTx-FD 100baseTx 10baseT-FD 10baseT
    Advertising no additional info pages.
    IEEE 802.3 CSMA/CD protocol.
  Link partner capability is 0021: 10baseT.
    Negotiation did not complete.

--- dmesg on ne2k ---

ne.c:v1.10 9/23/94 Donald Becker (becker@cesdis.gsfc.nasa.gov)
NE*000 ethercard probe at 0x300: ** ** ** ** ** **
eth0: NE2000 found at 0x300, using IRQ 10.

--- /var/log/messages on ne2k ---

no errors or any references

--- eth0 in /proc/net/dev on ne2k ---

    Receive
bytes    packets errs drop fifo frame compressed multicast
  1667226    1940    0    0    0   230          0         1
   Transmit
bytes    packets errs drop fifo colls carrier compressed
  1562737    2043    0    0    0     0       0          0

--- ifconfig eth0 on ne2k ---

Link encap:Ethernet  HWaddr **:**:**:**:**:**
inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:1933 errors:0 dropped:0 overruns:0 frame:230
TX packets:2036 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
Interrupt:10 Base address:0x300

--- ne2k-diag on ne2k ---

ne2k-diag.c:v2.00 4/19/2000 Donald Becker (becker@scyld.com)
Checking the ethercard at 0x300.
   Receive alignment error counter (0x30d) is ff
   Passed initial NE2000 probe, value 00.
Station Address PROM    0: ** ** ** ** ** ** ** ** ** ** ** ** 00 00 00 00
Station Address PROM 0x10: 00 00 00 00 00 00 00 00 00 00 00 00 57 57 57 57
   NE2000 found at 0x300, using start page 0x40 and end page 0x80.
The current MAC stations address is **:**:**:**:**:**.
8390 page 0: 20 ff ff 7a 42 00 ff 00 20 00 7b ff 21 00 00 00.
8390 page 1: 60 ** ** ** ** ** ** 7b 00 00 00 80 00 ff 00 00.
8390 page 2: 00 ff 49 ff 49 ff 49 ff 49 ff 49 ff 49 ff 49 ff.
8390 page 3: e0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff.

---

What have I tried so far?  I noticed that the tulip was the one screaming 
about errors, so I assume (very possibly incorrectly) that the problem lies
with it.  The network slowdown is also greatest when transferring from tulip
to ne2k.  (Though that doesn't really suggest anything as the problem could 
be ne2k RX and not necessarily tulip TX.)  That's why I'm posting to this 
tulip mailing-list, instead of the ne2k mailing-list (if one even 
exists).  I've also focused most of my trouble-shooting on the tulip.
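
To put numbers on "screaming about errors", here is a minimal sketch that
turns the /proc/net/dev counters quoted above into ratios.  It is only an
illustration: the counter values are copied from this message, the field
order is taken from the header lines shown in the listings, and the helper
names (parse, error_ratio, summarize) are made up for this example.  The
ratio is simply the "errs" counter divided by the "packets" counter, which
may not be a true per-packet error rate.

```python
# Counters copied from the /proc/net/dev listings in this message.
# Field order follows the header lines shown in those listings.
RX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo",
             "frame", "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo",
             "colls", "carrier", "compressed"]

COUNTERS = {
    "tulip": {
        "rx": "1008022 1427 523 0 0 1046 0 0",
        "tx": "1662749 1855 307 0 0 276 307 0",
    },
    "ne2k": {
        "rx": "1667226 1940 0 0 0 230 0 1",
        "tx": "1562737 2043 0 0 0 0 0 0",
    },
}


def parse(line, fields):
    """Map one whitespace-separated counter line onto its field names."""
    return dict(zip(fields, (int(value) for value in line.split())))


def error_ratio(stats):
    """Return errs / packets, or 0.0 when no packets were counted."""
    packets = stats["packets"]
    return stats["errs"] / packets if packets else 0.0


def summarize(counters):
    """Build a per-host summary of the counters that look suspicious."""
    summary = {}
    for host, lines in counters.items():
        rx = parse(lines["rx"], RX_FIELDS)
        tx = parse(lines["tx"], TX_FIELDS)
        summary[host] = {
            "rx_err_ratio": error_ratio(rx),
            "tx_err_ratio": error_ratio(tx),
            "rx_frame": rx["frame"],
            "tx_colls": tx["colls"],
            "tx_carrier": tx["carrier"],
        }
    return summary


if __name__ == "__main__":
    for host, s in summarize(COUNTERS).items():
        print(f"{host}: RX errs/pkts {s['rx_err_ratio']:.1%}, "
              f"TX errs/pkts {s['tx_err_ratio']:.1%}, "
              f"frame {s['rx_frame']}, colls {s['tx_colls']}, "
              f"carrier {s['tx_carrier']}")
```

On a live machine the same parse() helper could be fed the eth0 line from
/proc/net/dev directly, provided the column layout matches the headers
above.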

Since all the errors were on the tulip, I first thought maybe it was the 
tulip driver.  So I upgraded to v0.92 from netdriver-2.0-2.src.rpm.  No 
improvement.  Tulip dual-boots between Linux and win98, so I transferred
(ftp) files between tulip (in both OSes) and ne2k, and based on throughput
numbers win98 is considerably faster, which makes me believe the problem is
driver-based and not hardware-based.  But I don't know of an ifconfig or
/proc/net/dev equivalent in win98, so maybe the problem of massive errors 
still exists in win98, I just can't see the errors.

I then started reading the mailing-list archives and saw a mention of bad 
cabling.  "Duh," I thought, but I don't have a spare known-working 
crossover cable (nor do we have them in general use at work for me to 
borrow overnight) to test with.  But if the problem were with the cabling,
wouldn't there be errors reported on both computers instead of all (TX and
RX) on one end?  I did flip-flop which end of the crossover cable was 
plugged into which computer, but all the errors continued to be reported by 
tulip.  If this is very possibly the problem, then I'll purchase another 
crossover cable for testing, but I would prefer to not have to.

I've checked the simple configuration stuff like IRQ, but that doesn't seem 
to be the problem.  Both cards are recognized with no problems reported, so 
it makes me believe it's something else besides driver configuration, like 
the driver itself.
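
One more configuration item that can be read straight from the diagnostics
above is the link mode.  The sketch below decodes the two MII register
values that tulip-diag printed (advertising 01e1, link partner 0021).  The
bit assignments are the standard IEEE 802.3 auto-negotiation ability bits;
the function names are made up for this example, and the result only
restates what tulip-diag already reported rather than diagnosing anything.
Since mii-diag says the partner does not autonegotiate, the partner value
is presumably filled in by the PHY itself, so treat the intersection as
illustrative.

```python
# Ability bits of the IEEE 802.3 auto-negotiation base page
# (MII register 4 = our advertisement, register 5 = link partner ability).
ABILITY_BITS = [
    (0x0200, "100baseT4"),
    (0x0100, "100baseTx-FD"),
    (0x0080, "100baseTx"),
    (0x0040, "10baseT-FD"),
    (0x0020, "10baseT"),
]
SELECTOR_MASK = 0x001F       # low five bits select the protocol family
SELECTOR_IEEE_802_3 = 0x01   # value used for IEEE 802.3 CSMA/CD

# Values copied from the tulip-diag output in this message.
ADVERTISED = 0x01E1
PARTNER = 0x0021


def decode_abilities(reg):
    """List the media types whose ability bits are set in reg."""
    return [name for mask, name in ABILITY_BITS if reg & mask]


def is_ieee_802_3(reg):
    """True when the selector field marks the page as IEEE 802.3."""
    return (reg & SELECTOR_MASK) == SELECTOR_IEEE_802_3


def common_modes(advertised, partner):
    """Media types present in both ability fields."""
    return decode_abilities(advertised & partner)


if __name__ == "__main__":
    print("advertised:", " ".join(decode_abilities(ADVERTISED)))
    print("partner:   ", " ".join(decode_abilities(PARTNER)))
    print("in common: ", " ".join(common_modes(ADVERTISED, PARTNER)))
```

The only mode both sides share is plain 10baseT, which matches the
"half-duplex" port selection that tulip-diag shows.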

Oh, I don't think it's the TCP/IP stack, as I used to have these machines
connected by PLIP, and never encountered any problems (besides balancing 
the use of the parallel port with both PLIP and a printer).

I am unable to solve the problem, and short of testing (which requires 
buying) a new crossover cable and new NICs, I don't know what to test or 
try next.

I apologize if the problem/solution is obvious from the above information, 
and I've just happened to be blind to it.  If that's the case, then I am 
sorry I wasted your time and bandwidth, but please still inform me of the 
solution (with whatever ridicule you feel necessary).

Thank you in advance for your assistance.

C o r e y  W r i g h t
mailto:undefined@pobox.com
http://zeros-ones.homepage.com/Corey-Wright-public-key.html