tulip driver fails - please help

edgar@arrakis.htu.tuwien.ac.at edgar@arrakis.htu.tuwien.ac.at
Wed Dec 8 13:22:59 1999

Recently I bought a new mainboard and installed it in my computer. My
problem is that I can't get my NIC to work with the new hardware:

old hw-configuration:
P100, ASUS Mainboard (AT) with Intel chipset
32 MB RAM
DE530CT+ nic with DEC21041 chip
S3 Vision 968 PCI
Voodoo Graphics Adapter
Sb AWE 32

new hw-configuration:
K6-2 450
Gigabyte GA-5AX (ATX), chipset: ALi Aladdin V
64 MB RAM
same as above, except for the SCSI adapter which I removed

I'm using Debian Linux 2.1 (slink), Kernel 2.2.13, tulip.c version 1.19.
In fact it's still the same installation.

The problem is that as soon as I try to send data over the network (thin
ethernet, 10base2), the tulip-driver starts reporting weird errors. The
data of course is not being sent.

Here is an excerpt from /proc/pci:
Bus  0, device  12, function  0:
  Ethernet controller: DEC DC21041 (rev 33).
    Medium devsel.  Fast back-to-back capable.  IRQ 10.  
    Master Capable.  Latency=32. I/O at 0xe000 [0xe001].
    Non-prefetchable 32 bit memory at 0xee000000 [0xee000000].

The NIC is using IRQ 10, and there are no IRQ conflicts or other
conflicts that I know of. The NIC also works fine under Windows 98
(DEC21041-driver). There are, however, some problems with Windows 98SE
and the Intel2104-driver: the computer crashes frequently during network
games. This may or may not be related to the problem I see in Linux.

Next I do a:
>modprobe tulip options=1

..the tulip driver answers (cut&pasted from /var/log/messages):
Dec  2 12:46:59 Hal kernel: Found Digital DC21041 Tulip
  at PCI I/O address 0xe000. 
Dec  2 12:46:59 Hal kernel: tulip.c:v0.91 4/14/99 
Dec  2 12:46:59 Hal kernel: eth0: Digital DC21041 Tulip 
  rev 33 at 0xe000, 21041 mode, 00:80:C8:7F:0B:67, IRQ 10. 
Dec  2 12:46:59 Hal kernel: eth0:21041 Media information 
  at 30, default media 0800 (Autosense). 
Dec  2 12:46:59 Hal kernel: eth0:  21041 media #0, 10baseT. 
Dec  2 12:46:59 Hal last message repeated 2 times

There are some weird things about the above answer. Why does the driver
insist three times on using media 10baseT although I specified 10base2
on the modprobe line ("options=1" means 10base2)? Up to this point I
always get the same behavior. From here on, however, things seem to
happen randomly. Sometimes the driver recognises its mistake and writes
the following line to "messages":

Dec  2 12:43:55 Hal kernel: eth0: No 21041 10baseT link beat, 
  Media switched to 10base2. 

Sometimes the driver switches the media to AUI, which is completely
absurd, and sometimes it does nothing at all.

Now the ifconfig line:
>ifconfig eth0 netmask broadcast

- no answer on stdout, stderr or messages. So everything seems to be
fine. The routing-table is also set up correctly. ifconfig reports:
eth0  Link encap:Ethernet  HWaddr 00:80:C8:7F:0B:67  
      inet addr:  Bcast:  Mask:
      RX packets:0 errors:0 dropped:0 overruns:0 frame:0
      TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
      Interrupt:10 Base address:0xe000

If I now generate some traffic, for example by pinging something,
nothing happens at all. Ping reports x packets sent, 0 received. About
20 seconds after I start pinging, my terminal (if I run as root) and
also messages get spammed with the following:
Dec  2 12:49:00 Hal kernel: eth0: 21041 transmit 
  timed out, status fc670045, CSR12 000021c4, 
  CSR13 ffffef09, CSR14 fffff7fd, resetting... 
Dec  2 12:49:05 Hal kernel: eth0: 21041 transmit
  timed out, status fc678047, CSR12 000021c4,
  CSR13 ffffef09, CSR14 fffff7fd, resetting...
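
For anyone comparing these timeout reports, here is a minimal Python
sketch that pulls the register values out of the two entries above and
shows which bits differ between them. The helper names (parse_timeouts,
set_bits) are made up for illustration, and the sketch only does bit
arithmetic; it does not try to name the individual status bits, which
would need the 21041 data sheet.

```python
import re

# The two log entries quoted above, with the line wrapping as pasted.
LOG = """\
Dec  2 12:49:00 Hal kernel: eth0: 21041 transmit
  timed out, status fc670045, CSR12 000021c4,
  CSR13 ffffef09, CSR14 fffff7fd, resetting...
Dec  2 12:49:05 Hal kernel: eth0: 21041 transmit
  timed out, status fc678047, CSR12 000021c4,
  CSR13 ffffef09, CSR14 fffff7fd, resetting...
"""

TIMEOUT = re.compile(
    r"transmit timed out, status (?P<status>[0-9a-f]{8}), "
    r"CSR12 (?P<csr12>[0-9a-f]{8}), CSR13 (?P<csr13>[0-9a-f]{8}), "
    r"CSR14 (?P<csr14>[0-9a-f]{8})"
)


def parse_timeouts(text):
    """Return one dict of register values per 'transmit timed out' entry."""
    # Collapse the wrapped lines so each entry can be matched in one piece.
    flat = re.sub(r"\s+", " ", text)
    return [
        {name: int(value, 16) for name, value in m.groupdict().items()}
        for m in TIMEOUT.finditer(flat)
    ]


def set_bits(value):
    """List the positions (0 = least significant) of the bits set in value."""
    return [n for n in range(32) if (value >> n) & 1]


if __name__ == "__main__":
    first, second = parse_timeouts(LOG)
    for name in first:
        changed = first[name] ^ second[name]
        print(f"{name}: {first[name]:08x} -> {second[name]:08x}, "
              f"changed bits: {set_bits(changed)}")
```

Between the two entries only the status word changes (bits 1 and 15);
CSR12, CSR13 and CSR14 are identical.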

The error messages can be stopped with:
>ifconfig eth0 down

From this point on, whenever I bring eth0 up again, ifconfig reports
errors in the TX line:
TX packets:0 errors:36 dropped:0 overruns:0 carrier:0

Attached is the output of "tulip-diag -aa -ee".

I get basically the same behaviour with different IRQs, PCI slots,
kernel versions and tulip.c versions.

Could anybody please interpret the above output/behaviour and give me
some feedback on what is wrong? Is it a software or a hardware problem?
Are there any known incompatibilities involved?

Thanks in advance

Edgar Holleis

Attachment: tulip-diag.txt

tulip-diag.c:v1.19 10/2/99 Donald Becker (becker@cesdis.gsfc.nasa.gov)
Index #1: Found a Digital DC21041 Tulip adapter at 0xe000.
Digital DC21041 Tulip chip registers at 0xe000:
  ffe08000 ffffffff ffffffff 0029a810 0029aa10 fc000000 fffe0000 fffe0000
  fffe0000 ffff4bf8 ffffffff fffe0000 000021c4 ffffef09 fffff7fd ffff0006
 Port selection is half-duplex.
 Transmit stopped, Receive stopped, half-duplex.
  The Rx process state is 'Stopped'.
  The Tx process state is 'Stopped'.
  The transmit unit is set to store-and-forward.
  The NWay status register is 000021c4.
EEPROM size is 6.
PCI Subsystem IDs, vendor 1186, device 0100.
CardBus Information Structure at offset 00000000.
Ethernet MAC Station Address 00:80:C8:7F:0B:67.
EEPROM transceiver/media description for the Digital DC21041 Tulip chip.
Leaf node at offset 30, default media type 0800 (Autosense).
 3 transceiver description blocks:
  21041 media index 00 (10baseT).
  21041 media index 04 (10baseT-Full Duplex).
  21041 media index 01 (10base2).
EEPROM contents:
  1186 0100 0000 0000 0000 0000 0000 0000
  00b1 0101 8000 7fc8 670b 1e00 0000 0800
  0003 0104 0000 0000 0000 0000 0000 0000
  0000 0000 0000 0000 0000 0000 0000 0000
  0000 0000 0000 0000 0000 0000 0000 0000
  0000 0000 0000 0000 0000 0000 0000 0000
  0000 0000 0000 0000 0000 0000 0000 0000
  0000 0000 0000 0000 0000 0000 0000 bc00
 ID block CRC 0xb1 (vs. 0xb1).
  Full contents CRC 0xbc00 (read as 0xbc00).
  Internal autonegotiation state is 'Ability detect'.
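
As a cross-check of the dump above, the following Python sketch decodes
the first three rows of the EEPROM contents by hand. It rests on
assumptions read off this particular dump rather than on a
specification: each printed word stores its low byte first, the station
address sits at byte offset 20, the leaf offset at bytes 27-28, and
every media block is a single byte. The function names are hypothetical.

```python
# First three rows of the "EEPROM contents" dump; the rest is zero padding
# plus the checksum word and is not needed here.
EEPROM_WORDS = """
1186 0100 0000 0000 0000 0000 0000 0000
00b1 0101 8000 7fc8 670b 1e00 0000 0800
0003 0104 0000 0000 0000 0000 0000 0000
"""

# Only the media names that tulip-diag itself printed for this card.
MEDIA_NAMES = {0: "10baseT", 1: "10base2", 4: "10baseT-Full Duplex"}


def words_to_bytes(dump):
    """Turn the printed 16-bit words into bytes, low byte first (assumed)."""
    raw = bytearray()
    for word in dump.split():
        value = int(word, 16)
        raw += bytes((value & 0xFF, value >> 8))
    return bytes(raw)


def decode(raw):
    """Extract MAC address, leaf offset, default media and media indices."""
    mac = ":".join(f"{b:02X}" for b in raw[20:26])
    leaf = raw[27] | (raw[28] << 8)
    default_media = raw[leaf] | (raw[leaf + 1] << 8)
    count = raw[leaf + 2]
    # Assumption: one byte per media block, true for the blocks in this dump.
    media = list(raw[leaf + 3:leaf + 3 + count])
    return mac, leaf, default_media, media


if __name__ == "__main__":
    mac, leaf, default_media, media = decode(words_to_bytes(EEPROM_WORDS))
    print("MAC address  :", mac)
    print("Leaf offset  :", leaf)
    print(f"Default media: {default_media:04x}")
    for index in media:
        print(f"Media index {index:02x}: {MEDIA_NAMES.get(index, 'unknown')}")
```

Decoded this way, the dump gives the same station address, leaf offset
30, default media 0800 and the same three media indices (00, 04, 01)
that tulip-diag reports, so the EEPROM contents at least look
self-consistent.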