Alignment errors (update)

James Stevens James.Stevens@jrcs.co.uk
Wed Sep 23 09:20:23 1998


Daniele Orlandi wrote:
> 
> Donald Becker wrote:
> >
> > What about netperf or ttcp while you generate a lot of PCI bus traffic?
> > Perhaps reading your whole disk in the background.
> 
> Bingo!
> 
> Started netperf, no alignment errors...
> 
> Started copying big files from one disk to another, immediately the alignment
> errors began to raise at a rate of about 200/second.
> The disks are two 6 GB UDMA EIDE disks.
> 
> hdc: FUJITSU MPB3064ATU, 6187MB w/0kB Cache, CHS=13410/15/63, DMA
> hdd: FUJITSU MPB3064ATU E, 6187MB w/0kB Cache, CHS=13410/15/63, DMA
> 
> > This might indicate a problem with the FIFO threshold or PCI burst length
> > settings.  (The driver currently uses the recommended defaults.)
> 
> The FIFO probably has much to do with my problem:
> 
> [root@etabeta /]# cat /proc/net/dev
> Inter-|   Receive                  |  Transmit
>  face |packets errs drop fifo frame|packets errs drop fifo colls carrier
>     lo: 242667    0    0    0    0   242667    0    0    0     0    0
>   eth0:2210414    0    0    0    0  3557684    0    0 37264     0    0
> 
> All these 37000 fifo errors were generated in few minutes.


Hi Donald & Daniele,

This sounds like the problem I reported a few weeks ago (Tx resetting a
lot and crashing the entire system). However, I can trigger the problem
really easily using a piece of TCP stressing software I wrote called
"open_socket".

I just ran it on the EEPro-100 again and got the following :-

nin# cat /proc/net/dev 
Inter-|   Receive                  |  Transmit
 face |packets errs drop fifo frame|packets errs drop fifo colls carrier
    lo:     76    0    0    0    0       76    0    0    0     0    0
  eth0:  70487    0    0    0    0    69671    1    0   21     0    0

At this point I then got :-

kernel: eth0: Transmit timed out: status 0050 command 0000.
kernel: eth0: Trying to restart the transmitter...

in the syslog.

Donald, if you can't reproduce this on 10 Mbps Ethernet like I can
(within a couple of seconds), I can always give you a "telnet" account on
my system.
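
For anyone who wants to watch these counters while reproducing the problem,
here is a minimal Python sketch that pulls the per-interface numbers out of
/proc/net/dev text. It assumes the 11-column layout shown in this thread
(5 receive columns, 6 transmit columns); other kernels print a different set
of columns, so the field tuples would need adjusting there. The names
parse_net_dev, RX_FIELDS, TX_FIELDS and SAMPLE are made up for this
illustration, and SAMPLE is simply the output pasted above.

```python
# Sketch only: assumes the 11-column /proc/net/dev layout quoted in this thread.
RX_FIELDS = ("packets", "errs", "drop", "fifo", "frame")
TX_FIELDS = ("packets", "errs", "drop", "fifo", "colls", "carrier")

# The output pasted above, used here as test input.
SAMPLE = """\
Inter-|   Receive                  |  Transmit
 face |packets errs drop fifo frame|packets errs drop fifo colls carrier
    lo:     76    0    0    0    0       76    0    0    0     0    0
  eth0:  70487    0    0    0    0    69671    1    0   21     0    0
"""


def parse_net_dev(text):
    """Return {interface: {"rx": {...}, "tx": {...}}} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        # Split on the colon first: the counter may follow it with no space,
        # as in "eth0:2210414".
        name, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) != len(RX_FIELDS) + len(TX_FIELDS):
            continue  # not the layout this sketch assumes
        values = [int(v) for v in fields]
        rx = dict(zip(RX_FIELDS, values[:len(RX_FIELDS)]))
        tx = dict(zip(TX_FIELDS, values[len(RX_FIELDS):]))
        stats[name.strip()] = {"rx": rx, "tx": tx}
    return stats


if __name__ == "__main__":
    for name, counters in parse_net_dev(SAMPLE).items():
        print(name, "tx fifo errors:", counters["tx"]["fifo"])
```

To use it on a live system, read /proc/net/dev into a string and pass that to
parse_net_dev instead of SAMPLE; calling it in a loop and comparing the "fifo"
values between calls gives the error rate.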

James

> 
> Well, what should I do now ?

Good question. "open_socket" does NOT do any other I/O at all, but still
breaks this driver. I tried a flood ping with 1400-byte packets AND a
disk dump and got NO problems!

> (Now I'm going to try to play with my BIOS settings to see if something changes,
> I'll let you know if something happens).

I'd be surprised if you can fix it like this - IMHO there is a bug in
the driver.

James