[realtek] "too much work at interrupt" and PCI Bus error

Donald Becker becker@scyld.com
Tue, 6 Mar 2001 23:57:16 -0500 (EST)


On Tue, 6 Mar 2001, Lyle Hanson wrote:

> I have an SMC 10/100 EZcard with a rtl8139 chip in a RedHat 6.2 machine
> running on an old 486.  I have installed the most recent drivers.

What driver version?
  (I'm pretty certain that you do _not_ have "the most recent drivers".)

> The card works okay sometimes, this last time it was up and running for
> several hours with no problems whatsoever.  But often, when the card is
> busy I get this message:
> 
> Kernel: eth0: PCI Bus error 22800007
> Kernel: eth0: Too much work at interrupt, IntrStatus=0x8000
> Kernel: ide0: reset: success

Hmmm, likely a hardware problem.
The status indicates Received Master Abort, which means a bus master
transaction started by the chip was terminated because no target claimed
it.  This could be non-fatal; however, the documentation doesn't tell us
how the chip handles the error.
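Try adding the lines marked with '+'
	pci_read_config_dword(tp->pci_dev, PCI_COMMAND, &pci_cmd_status);
+	/* Write the status error bits back to clear them.  Keep the low
+	   (command) word, or the chip's I/O and bus mastering get disabled. */
+	pci_write_config_dword(tp->pci_dev, PCI_COMMAND,
+		 pci_cmd_status & 0xff00ffff);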
	printk(KERN_ERR "%s: PCI Bus error %4.4x.\n",
		   dev->name, pci_cmd_status);
to see if the error is cleared.
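
For reference, the dword at PCI_COMMAND holds the command register in its
low 16 bits and the status register in its high 16 bits, and the status
error bits are cleared by writing a 1 to them.  The sketch below decodes
a logged value along those lines.  The macro and function names are
hypothetical and are not part of the driver; the bit positions are the
standard PCI status bits.

```c
#include <stdint.h>

/* Error bits of the 16-bit PCI status register (write 1 to clear).
   These names are illustrative only, not the kernel's own macros. */
#define ST_MASTER_DATA_PARITY 0x0100u
#define ST_SIG_TARGET_ABORT   0x0800u
#define ST_REC_TARGET_ABORT   0x1000u
#define ST_REC_MASTER_ABORT   0x2000u
#define ST_SIG_SYSTEM_ERROR   0x4000u
#define ST_DETECTED_PARITY    0x8000u
#define ST_ERROR_MASK         0xf900u   /* OR of the six bits above */

/* Low half of the dword read at PCI_COMMAND: the command register. */
uint16_t cmd_word(uint32_t cmd_status)
{
    return (uint16_t)(cmd_status & 0xffffu);
}

/* High half of the same dword: the status register. */
uint16_t status_word(uint32_t cmd_status)
{
    return (uint16_t)(cmd_status >> 16);
}

/* Nonzero if the Received Master Abort bit is set. */
int has_master_abort(uint32_t cmd_status)
{
    return (status_word(cmd_status) & ST_REC_MASTER_ABORT) != 0;
}

/* Value to write back: the command word unchanged, plus only those
   status error bits that are currently set, so they get cleared. */
uint32_t clear_value(uint32_t cmd_status)
{
    uint32_t errs = (uint32_t)(status_word(cmd_status) & ST_ERROR_MASK);
    return (errs << 16) | cmd_word(cmd_status);
}
```

For the value in the report, 22800007, the command word is 0007 (I/O,
memory and bus master enabled) and the status word is 2280, which has
Received Master Abort set.  clear_value() gives 20000007 for it.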

> This gets repeated in the syslog many times.  I'm not sure what the ide0
> message has to do with it, but it usually accompanies the troubles.  At
> this point, the network dies and a full system lockup follows shortly
> after.

That strongly hints that the bus error is just a symptom, not the real
problem.

>  Then I reboot, and the card seems to think its MAC is
> ff:ff:ff:ff:ff:ff, it logs a PCI latency error (0), sets it to 64, then
> says "eth0: RTL8139 Interrupt line blocked, status ffff".

Do a hard power off, not just a soft-off, to clear the error status.
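
A station address of ff:ff:ff:ff:ff:ff together with status ffff is what
you would typically see when register reads return nothing but 1 bits,
which on PCI is the usual result of a read that no device answers.  A
minimal sketch of such a sanity check follows.  The function names are
hypothetical, and this is not something the driver itself does.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every byte of the 6-byte station address is 0xff. */
int mac_is_all_ones(const uint8_t mac[6])
{
    for (size_t i = 0; i < 6; i++) {
        if (mac[i] != 0xff)
            return 0;
    }
    return 1;
}

/* Returns 1 if both the station address and the 16-bit interrupt
   status read back as all ones, i.e. the chip does not appear to be
   answering register reads at all. */
int chip_looks_absent(const uint8_t mac[6], uint16_t intr_status)
{
    return mac_is_all_ones(mac) && intr_status == 0xffff;
}
```

If a check like this fires after a reboot, the chip is probably still
wedged from the earlier error, which is why a full power cycle is worth
trying before anything else.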

> Somebody just gave me this box, and I'm not sure if it's got some flakey
> hardware (I think the cdrom is because I get "lost interrupt" messages
> sometimes when accessing it, or maybe that's an indication of some other,
> related troubles?).

Likely a related problem.
BTW, a 486 PCI system is likely not-quite-PCI v1.0.  The RTL8139 chip is
PCI v2.0 or v2.1 ('C' chip).

Donald Becker				becker@scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993