even newer info WAS: Re: new info WAS: Re: [tulip] LNE100TX version 4.1 timeouts on tx

Erik Steffl steffl@bigfoot.com
Fri, 01 Dec 2000 03:02:07 -0800


  before you read any further: the HW was tested, I booted win on the
linux computer with LNE100TX and everything worked, there were no
errors, I tested it by copying a lot of megabytes from one machine to
another (simultaneously both ways). there were no errors reported on the
win machine with intel network card.

  there is information about the history of my problems below (quoted
previous e-mails). I did some more testing since then (based on info I
got here, thanks!) and got nowhere close to a working network (but
slightly closer), so here I go again:

  I have two computers connected with crossover cable:

	Belkin Pro Series Cate5 UTP Crossover Cable, meets or exceeds
requirements of the Ethernet IEEE 802.3ab 10/100Base-T (from cable
description)

  system one:

	linux, kernel 2.2.17,
	LNE100TX with driver tulip.c:v0.92 4/17/2000 with patch from Dan
Hollis,
		the patch resets the card when tx-freeze occurs (why are
		there these tx hang-ups in the first place?)

  system two:

	win98
	Intel PRO/100 (driver name: e100bnt5.sys)

  more info on linux side (note that it says half duplex, even though I
used the patch from Dan Hollis; the other computer thinks it's a
100Mb/s full-duplex connection):

  jojda:/home/erik/skusobna/lne100tx/drivers# ~erik/skusobna/lne100tx/diag/tulip-diag/tulip-diag-patched
tulip-diag.c:v2.04 9/26/2000 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a ADMtek AL985 Centaur-P adapter at 0x6100.
 Port selection is 100mbps half duplex (Link is on)
 Transmit started, Receive started, half-duplex.
  The Rx process state is 'Waiting for packets'.
  The Tx process state is 'Idle'.
  The transmit threshold is 128.
 The Comet MAC registers are 12782000 ffffcc14 filter 8000000000000000.
 Use '-a' or '-aa' to show device registers,
     '-e' to show EEPROM contents, -ee for parsed contents,
  or '-m' or '-mm' to show MII management registers.
jojda:/home/erik/skusobna/lne100tx/drivers# 

  description of problem symptoms follows:

  if I only do ping -s 10000 192.168.1.2 (that's from linux to windows),
everything seems to be fine, ifconfig says there were no errors.

  if I do something more network intensive (using vnc to display win on
linux, using X to display linux programs on windows (and back on linux,
via vnc:-)) I get carrier errors, the ping does not work flawlessly
anymore and intel diagnostics on windows machine says there were errors
(it does not say any details, just displays a field labeled "Errors" and
non zero number that increases over time (typically there are thousands
of errors over about hour of my testing)).

  this is what ping's output looks like (when the network is busy):

...
10008 bytes from 192.168.1.2: icmp_seq=5147 ttl=128 time=2.0 ms
10008 bytes from 192.168.1.2: icmp_seq=5148 ttl=128 time=1.8 ms
10008 bytes from 192.168.1.2: icmp_seq=5149 ttl=128 time=1.8 ms
>From 192.168.1.2: Frag reassembly time exceeded
10008 bytes from 192.168.1.2: icmp_seq=5150 ttl=128 time=3610.2 ms
10008 bytes from 192.168.1.2: icmp_seq=5151 ttl=128 time=2611.4 ms
10008 bytes from 192.168.1.2: icmp_seq=5153 ttl=128 time=612.5 ms
10008 bytes from 192.168.1.2: icmp_seq=5154 ttl=128 time=3.6 ms
10008 bytes from 192.168.1.2: icmp_seq=5155 ttl=128 time=4.9 ms
10008 bytes from 192.168.1.2: icmp_seq=5156 ttl=128 time=1.9 ms
10008 bytes from 192.168.1.2: icmp_seq=5157 ttl=128 time=1.9 ms
...

  usually the time is 1.5 to 1.7 ms, but hiccups like the one above are
quite regular. this happens only if there's something going on on the
network, otherwise there are no problems (that I would notice). when I
say 'busy' network it is far from network being overloaded, there are NO
noticeable slowdowns, only isolated problems like the one seen above.
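
  to make the pattern in the excerpt easier to see, here is a minimal
Python sketch that scans ping output for missing icmp_seq numbers and
slow replies. the 100 ms threshold and the function name are my own
choices for illustration, not anything ping or the driver defines:

```python
import re

# Matches reply lines such as:
#   10008 bytes from 192.168.1.2: icmp_seq=5150 ttl=128 time=3610.2 ms
REPLY = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def analyze_ping(lines, spike_ms=100.0):
    """Return (missing_seqs, spikes) for a list of ping output lines.

    spike_ms is an arbitrary threshold picked for illustration.
    """
    replies = []
    for line in lines:
        match = REPLY.search(line)
        if match:
            replies.append((int(match.group(1)), float(match.group(2))))
    if not replies:
        return [], []
    seen = {seq for seq, _ in replies}
    # Any sequence number inside the observed range with no reply.
    missing = [s for s in range(min(seen), max(seen) + 1) if s not in seen]
    # Replies whose round-trip time exceeds the threshold.
    spikes = [(seq, ms) for seq, ms in replies if ms > spike_ms]
    return missing, spikes


SAMPLE = """\
10008 bytes from 192.168.1.2: icmp_seq=5147 ttl=128 time=2.0 ms
10008 bytes from 192.168.1.2: icmp_seq=5148 ttl=128 time=1.8 ms
10008 bytes from 192.168.1.2: icmp_seq=5149 ttl=128 time=1.8 ms
>From 192.168.1.2: Frag reassembly time exceeded
10008 bytes from 192.168.1.2: icmp_seq=5150 ttl=128 time=3610.2 ms
10008 bytes from 192.168.1.2: icmp_seq=5151 ttl=128 time=2611.4 ms
10008 bytes from 192.168.1.2: icmp_seq=5153 ttl=128 time=612.5 ms
10008 bytes from 192.168.1.2: icmp_seq=5154 ttl=128 time=3.6 ms
10008 bytes from 192.168.1.2: icmp_seq=5155 ttl=128 time=4.9 ms
10008 bytes from 192.168.1.2: icmp_seq=5156 ttl=128 time=1.9 ms
10008 bytes from 192.168.1.2: icmp_seq=5157 ttl=128 time=1.9 ms
""".splitlines()

missing, spikes = analyze_ping(SAMPLE)
print("missing icmp_seq:", missing)
print("slow replies:", spikes)
```

  on the excerpt above this reports icmp_seq 5152 as missing and three
slow replies (3610.2, 2611.4 and 612.5 ms). the delays shrink by roughly
one second each, which would be consistent with the replies being held
up and then released together, though that is only a guess.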

  here's what ifconfig says (after the network has been busy for
awhile):

eth0      Link encap:Ethernet  HWaddr 00:20:78:12:14:CC  
          inet addr:192.168.1.1  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:157019 errors:1 dropped:0 overruns:0 frame:1
          TX packets:131985 errors:2152 dropped:0 overruns:0 carrier:2152
          collisions:7425 txqueuelen:100 
          Interrupt:9 Base address:0xa000 

  note the carrier errors, these are most probably not HW problems (when
both computers are running windows there are no errors).
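
  to put the counters in proportion, here is a rough Python sketch that
pulls the TX packet, carrier and collision counts out of ifconfig text
and expresses them per transmitted packet. the function name is made up
for illustration and the regular expressions assume the ifconfig layout
shown above:

```python
import re


def tx_error_rates(ifconfig_text):
    """Return (carrier_rate, collision_rate) relative to TX packets."""
    def counter(pattern):
        match = re.search(pattern, ifconfig_text)
        if match is None:
            raise ValueError("counter not found: " + pattern)
        return int(match.group(1))

    tx_packets = counter(r"TX packets:(\d+)")
    carrier = counter(r"carrier:(\d+)")
    collisions = counter(r"collisions:(\d+)")
    if tx_packets == 0:
        return 0.0, 0.0
    return carrier / tx_packets, collisions / tx_packets


SAMPLE = """\
          RX packets:157019 errors:1 dropped:0 overruns:0 frame:1
          TX packets:131985 errors:2152 dropped:0 overruns:0 carrier:2152
          collisions:7425 txqueuelen:100
"""

carrier_rate, collision_rate = tx_error_rates(SAMPLE)
print("carrier errors per TX packet: %.2f%%" % (100 * carrier_rate))
print("collisions per TX packet:     %.2f%%" % (100 * collision_rate))
```

  with the numbers above that is about 1.6% carrier errors and about
5.6% collisions. a link that really runs full duplex should see no
collisions at all, so the nonzero collision count suggests this card is
in fact transmitting in half-duplex mode.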

  I have also tested the network using ftp. I used it to transfer an 80MB
file from win to linux, no noticeable problems. however, when I try to
transfer the file from linux to win I get disconnected (after part of the
file was transferred).

  final question: it looks like the linux driver thinks that the
connection is half-duplex, the other side (intel card on windows
machine) thinks it's full-duplex. If that were really so (one card
working in half-duplex mode and another one in full duplex mode) would
that cause problems described above?

  is there any way to check if the card is really working in half-duplex
mode? and why would it? note that windows seems to work fine on the same
computer (and the other side still thinks it's full duplex).
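
  on how the two sides could end up disagreeing: as far as I understand
IEEE 802.3 autonegotiation, each side advertises its abilities in MII
register 4, sees the partner's abilities in register 5, and both pick
the best common mode. if the partner does not autonegotiate at all (for
example because it is forced to a fixed mode), the card can only detect
the speed and falls back to half duplex. below is a minimal sketch of
that rule. the register values in it are hypothetical (real ones would
come from tulip-diag -m), and whether this is what happens with my two
cards is only an assumption:

```python
# Ability bits shared by MII register 4 (our advertisement) and
# register 5 (link partner ability), listed in priority order.
ABILITIES = [
    (0x0100, "100baseTx-FD"),
    (0x0200, "100baseT4"),
    (0x0080, "100baseTx-HD"),
    (0x0040, "10baseT-FD"),
    (0x0020, "10baseT-HD"),
]
ABILITY_MASK = 0x03E0


def resolve_link(advertised, partner, detected_speed=100):
    """Return the mode this side would settle on, or None if no common mode.

    detected_speed is only used when the partner advertises nothing,
    i.e. when the speed was found by parallel detection.
    """
    common = advertised & partner & ABILITY_MASK
    for bit, name in ABILITIES:
        if common & bit:
            return name
    if partner & ABILITY_MASK == 0:
        # Partner is not autonegotiating: duplex cannot be detected,
        # so the result is half duplex at the detected speed.
        return "100baseTx-HD" if detected_speed == 100 else "10baseT-HD"
    return None


# Hypothetical register values, for illustration only.
print(resolve_link(0x01E1, 0x41E1))  # both sides autonegotiate
print(resolve_link(0x01E1, 0x0000))  # partner forced to a fixed mode
```

  in the second case this side ends up at half duplex while a partner
forced to full duplex stays at full duplex, which is the kind of
mismatch I am asking about.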

  is there any hope?

  thanks in advance.

	erik

Erik Steffl wrote:
> 
>   after having problems described below I tried to move the LNE100TX to
> different PCI slot and now it's all different. Soundcard works OK (it
> used to play only about 1 second of sound when network card used to be
> in original slot).
> 
>   however, now the network card hangs up, not sure if it's tx or rx
> (probably tx judging by the messages below).
> 
>   I get plenty of messages like this one (all of them are exactly the
> same):
> 
> eth0: Transmit timed out, status fc664010, CSR12 00000000, resetting...
> 
>   the eth0 interface does not work, it starts working if I unload the
> drivers and ifup eth0 again.
> 
>   any ideas what's going on?
> 
>   (/proc/interrupts did not change, see below)
> 
>   TIA
> 
>         erik
> 
> Erik Steffl wrote:
> >
> >   the linksys LNE100TX card with driver from linksys times out when there
> > is busy traffic on tx. From the application point of view only one
> > connection dies (because of the timeout), all the rest are working ok.
> >
> >   I have tested this with various programs, mostly vnc but also ftp,
> > telnet, ping and X.
> >
> >   my system:
> >
> >   linux machine with LNE100TX (PCI), version 4.12 card, running debian
> > distro (unstable), kernel 2.2.17, driver:
> >
> >   "tulip.c:v0.92 4/17/2000  Written by Donald Becker
> > <becker@scyld.com>\n";
> >
> >   windows machine on the other side, intel on board network card (does
> > it matter?)
> >
> >   these two machines are connected by crossover cable, full duplex,
> > 100Mb/s. I guess the cable is OK, I tried to change the direction (errors
> > are in one direction only)... There are no errors (corrupted packets)
> > indicated by any programs I tried (mostly intel diagnostics tool, I also
> > checked various logs on linux machine).
> >
> >   anytime there is a huge transfer from linux to win there is a chance it
> > times out, for example vnc only works for about one minute, X almost
> > does not work (I have X server on win machine), ftp transfer of a 1MB file
> > often works but sometimes does not...
> >
> >   when the transfer goes the other way around, it works like a charm, I
> > used vnc to view win screensaver, it ran all night, no problems.
> >
> >   low bandwidth transfer does not seem to be a problem, ping works both
> > ways (even using 10k packets), telnet works, I use linux machine as
> > gateway/masq/firewall and I haven't noticed any problems (I have only
> > 33.600 modem)
> >
> >   another, probably unrelated problem: once the card stopped working at
> > all, unfortunately I do not have error message...
> >
> >   yet another problem: if the LNE100TX is in the machine, the soundcard
> > (soundblaster 64 awe, ISA) does not work, even if I do not load
> > tulip/pci-scan drivers at all, there is only about one second of any
> > audio heard (and no midi at all). If I take out the network card and
> > reboot, the sound works. I see no error messages at all.
> >
> >   also, why does it say tulip.c: ... (see below) during bootup? After
> > the boot is completed lsmod does not show tulip.o loaded...
> >
> >   here's some more info:
> >
> >   lspci identifies card as:
> >
> > 00:0b.0 Ethernet controller: Bridgecom, Inc: Unknown device 0985 (rev
> > 11)
> >
> >   dmesg says:
> >
> > ...
> > mtrr: v1.35a (19990819) Richard Gooch (rgooch@atnf.csiro.au)
> > PCI: PCI BIOS revision 2.10 entry at 0xfb100
> > PCI: Using configuration type 1
> > PCI: Probing PCI hardware
> > Linux NET4.0 for Linux 2.2
> > ...
> > ttyS02 at 0x03e8 (irq = 4) is a 16550A
> > tulip.c:v0.92 4/17/2000  Written by Donald Becker <becker@scyld.com>
> >   http://www.scyld.com/network/tulip.html
> > eth0: ADMtek Comet rev 17 at 0xc584a000, 00:20:78:12:14:CC, IRQ 9.
> > PnP: Calling quirk for 02:00
> > PnP: Calling quirk for 02:02
> > isapnp: Card 'Creative Modem Blaster DI5600'
> > ...
> >
> >   (I was told that it's ADMtek Centaur-P anyway)
> >
> > jojda:/home/erik# cat /proc/interrupts
> >            CPU0
> >   0:    7465096          XT-PIC  timer
> >   1:      32049          XT-PIC  keyboard
> >   2:          0          XT-PIC  cascade
> >   5:          0          XT-PIC  Sound Blaster 16
> >   8:          1          XT-PIC  rtc
> >   9:    1187655          XT-PIC  eth0
> >  11:   11687930          XT-PIC  serial
> >  12:     347181          XT-PIC  PS/2 Mouse
> >  13:          0          XT-PIC  fpu
> >  14:    3414768          XT-PIC  ide0
> > NMI:          0
> >
> >   is there any chance to get this working?
> >
> >   any ideas? TIA
> >
> >         erik
> >
> > _______________________________________________
> > tulip mailing list
> > tulip@scyld.com
> > http://www.scyld.com/mailman/listinfo/tulip