[eepro100] Unknown receiver error

West, Jeff Jeff.S.West@msfc.nasa.gov
Wed Sep 4 17:24:01 2002


All:

	I have searched the list archives and found some traffic from early
2002 about this error message, but I could not locate a definitive solution.
So I ask:

Linux cluster running Red Hat 7.2 with a custom (trimmed down; .config
available on request) 2.4.18 SMP kernel on dual Athlon 1800s, 1GB per node.

We are running the latest Becker eepro100 driver (downloaded 9-3-02), with
sleep mode disabled.
From dmesg:
pci-scan.c:v1.10 7/13/2002  Donald Becker <becker@scyld.com>
http://www.scyld.com/linux/drivers.html
eepro100.c:v1.25 8/27/2002 Donald Becker <becker@scyld.com>
  http://www.scyld.com/network/eepro100.html
eth0: Intel PCI EtherExpress Pro100 at 0xf88c8000, 00:02:B3:48:4F:03, IRQ 17.
  Board assembly 751767-004, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
    Secondary interface chip i82555.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x3258698e).

We are running a large MPI job that drives the master nearly out of memory,
so the machine goes into swap.  Sometimes the job runs, sometimes it does not.

When the job does not run, we get the following in /var/log/messages on the
master and on several (about 10%) of the slaves:
Sep  4 15:32:12 n9 kernel: Command 80 was not immediately accepted, 112 ticks!
Sep  4 15:32:12 n9 kernel: eth0: Unknown receiver error, status=0x5048.
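
To see which nodes report the error and how often, the messages above can be tallied per host. The following is only a minimal sketch, assuming the syslog line format shown above with each message on a single unwrapped line. The function name tally_rx_errors and the regular expression are illustrative and are not part of the driver or any existing tool. It extracts the status word as a number but does not attempt to interpret its bits.

```python
import re
from collections import Counter

# Matches syslog lines of the form shown above, for example:
#   Sep  4 15:32:12 n9 kernel: eth0: Unknown receiver error, status=0x5048.
# Assumption: every message sits on one line (not wrapped by a mailer).
RX_ERROR = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+(?P<host>\S+)\s+kernel:\s+"
    r"(?P<iface>eth\d+): Unknown receiver error, "
    r"status=0x(?P<status>[0-9a-fA-F]+)"
)


def tally_rx_errors(lines):
    """Count 'Unknown receiver error' messages per (host, interface, status).

    `lines` is any iterable of strings, such as an open log file.
    Lines that do not match the pattern are ignored.
    """
    counts = Counter()
    for line in lines:
        match = RX_ERROR.match(line)
        if match:
            key = (
                match.group("host"),
                match.group("iface"),
                int(match.group("status"), 16),  # status word as an integer
            )
            counts[key] += 1
    return counts


def format_report(counts):
    """Return one human-readable summary line per (host, iface, status)."""
    return [
        f"{host} {iface} status=0x{status:04x} count={n}"
        for (host, iface, status), n in sorted(counts.items())
    ]
```

One possible use is to open each node's /var/log/messages, pass the file object to tally_rx_errors, and print the lines returned by format_report, so that the affected nodes and status values can be compared side by side.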


We would like to overcome this problem.  

What other information is needed to follow up on this problem?  Is there a
known solution?

Jeff



Jeff West, Ph.D.
Applied Fluid Dynamics Analysis Group
MSFC, AL 35812
ph:(256) 544-6309
jeff.west@msfc.nasa.gov