[eepro100] Kernel panic with RedHat 2.2.16-3 and eepro100 v1.13

Emil Sit sit@cisco.com
Thu, 31 May 2001 17:57:30 -0400


Hi,

I have a system running the Red Hat 2.2.16-3 kernel with the eepro100
v1.13 driver downloaded from the web site. Under heavy load (as a web
server), it crashes fairly reliably. I was able to capture an oops, but
I'm not sure how to interpret it. Does anyone have any suggestions?

From dmesg:

eepro100.c:v1.13 1/9/2001 Donald Becker <becker@scyld.com>
  http://www.scyld.com/network/eepro100.html
eth0: Intel PCI EtherExpress Pro100 at 0xf8846000, 00:04:4D:E3:5F:54, IRQ 9.
  Board assembly 668081-002, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x04f4518b).
  Receiver lock-up workaround activated.
eth1: Intel PCI EtherExpress Pro100 at 0xf8848000, 00:04:4D:E3:5F:55, IRQ 10.
  Board assembly 668081-002, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x04f4518b).
  Receiver lock-up workaround activated.

The pre-crash message:

eth0: Transmit timed out: status 0050  00f0 at 4613965/4613977 commands 000c0000 000c0000 000c0000.

The oops:

ksymoops 0.7c on i686 2.2.16-3.  Options used
     -v /boot/vmlinux-2.2.16-3 (specified)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.2.16-3/ (default)
     -m /boot/System.map (specified)

Unable to handle kernel NULL pointer dereference at virtual address 00000000
current->tss.cr3 = 00101000, %cr3 = 00101000 *pde = 00000000 ops: 0000
CPU:    0
EIP:    0010:[<f8841f14>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010292
eax: 00000000   ebx: f6225ce4   ecx: 000003ca   edx: f64d8000
esi: 00000000   edi: f62258e4   ebp: c0233ec8   esp: c0233eb0
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 0, process nr: 0, stackpage=c0233000)
Stack: f6225800 f8844100 f6225f74 00000050 00000001 f8846000 c0233ef0 f884204d
       f6225800 f6225800 0000000c f62258e4 f6225800 c0117e94 c0219410 f62258e4
       c0233f1c f8841df9 f6225800 f6225800 f8841b4c c0219410 182541e1 00200000
Call Trace: [<c0117e94>] [<c0112130>] [<c010ad65>] [<c0118871>] [<c010b0bb>] [<c010ad88>] [<c0108799>]
       [<c0106000>] [<c01087bc>] [<c0109f0c>] [<c0106000>] [<c010607b>] [<c0106000>] [<c0100175>]
Code: ff 30 56 68 a0 40 84 f8 e8 2f 21 8d c7 83 c4 0c 46 83 fe 1f

>>EIP; f8841f14 <[eepro100]speedo_show_state+cc/170>   <=====
Trace; c0117e94 <it_real_fn+0/44>
Trace; c0112130 <timer_bh+2c0/3fc>
Trace; c010ad65 <do_8259A_IRQ+9d/a8>
Trace; c0118871 <do_bottom_half+45/64>
Trace; c010b0bb <do_IRQ+3b/40>
Trace; c010ad88 <common_interrupt+18/20>
Trace; c0108799 <cpu_idle+5d/6c>
Trace; c0106000 <get_options+0/74>
Trace; c01087bc <sys_idle+14/24>
Trace; c0109f0c <system_call+34/38>
Trace; c0106000 <get_options+0/74>
Trace; c010607b <cpu_idle+7/18>
Trace; c0106000 <get_options+0/74>
Trace; c0100175 <L6+0/2>
Code;  f8841f14 <[eepro100]speedo_show_state+cc/170>
00000000 <_EIP>:
Code;  f8841f14 <[eepro100]speedo_show_state+cc/170>   <=====
   0:   ff 30                     pushl  (%eax)   <=====
Code;  f8841f16 <[eepro100]speedo_show_state+ce/170>
   2:   56                        push   %esi
Code;  f8841f17 <[eepro100]speedo_show_state+cf/170>
   3:   68 a0 40 84 f8            push   $0xf88440a0
Code;  f8841f1c <[eepro100]speedo_show_state+d4/170>
   8:   e8 2f 21 8d c7            call   c78d213c <_EIP+0xc78d213c> c0114050 <printk+0/16c>
Code;  f8841f21 <[eepro100]speedo_show_state+d9/170>
   d:   83 c4 0c                  add    $0xc,%esp
Code;  f8841f24 <[eepro100]speedo_show_state+dc/170>
  10:   46                        inc    %esi
Code;  f8841f25 <[eepro100]speedo_show_state+dd/170>
  11:   83 fe 1f                  cmp    $0x1f,%esi

Aiee, killing interrupt handler
Kernel panic: Attempted to kill the idle task!
In swapper task - not syncing
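
As a sanity check on the decode above, the target of the `call` can be recomputed by hand: a near call (opcode e8) jumps to the address of the next instruction plus its 32-bit little-endian displacement, wrapping at 2^32. The sketch below does that arithmetic with the values from the oops; `call_target` is a hypothetical helper name used only for illustration.

```c
#include <stdint.h>

/* Target of an x86 near call (opcode e8): the address of the following
 * instruction plus the 32-bit displacement, modulo 2^32.
 * The instruction is 5 bytes long: e8 plus 4 displacement bytes. */
static uint32_t call_target(uint32_t insn_addr, uint32_t rel32)
{
    return (uint32_t)(insn_addr + 5u + rel32);
}

/* Values from the oops: the call sits at f8841f1c and its bytes are
 * e8 2f 21 8d c7, so the displacement reads 0xc78d212f. */
static uint32_t oops_call_target(void)
{
    return call_target(0xf8841f1cu, 0xc78d212fu);
}
```

`oops_call_target()` evaluates to 0xc0114050, which agrees with the `printk` address ksymoops reports, so the faulting `pushl (%eax)` is setting up an argument for that printk.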


The faulting instruction looks to be the printk at line 1031, but I
can't see why any of those pointers would be NULL, since
speedo_tx_timeout printed them just before that. Then again, I'm not
really familiar with kernel internals.
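
For what it's worth, the disassembly reads like a loop over 32 entries that dereferences a pointer for each printk: `pushl (%eax)` pushes the word that eax points to, `push %esi` pushes the loop index, and `inc %esi` / `cmp $0x1f,%esi` advance and bound the loop. With eax = 0 and esi = 0, the very first entry appears to be a NULL pointer. Below is a minimal user-space sketch of that loop shape with a NULL guard added. The names `ring_entry`, `show_ring` and `RING_SIZE` are hypothetical and are not taken from eepro100.c, and which ring the driver is actually walking is not established here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define RING_SIZE 32   /* assumption: inferred from the cmp $0x1f,%esi bound */

/* Hypothetical stand-in for a ring descriptor. Only the first 32-bit
 * word matters here, because pushl (%eax) reads offset 0. */
struct ring_entry {
    uint32_t status;
};

/* Print every ring entry, reporting NULL slots instead of dereferencing
 * them. Returns the number of NULL slots seen. */
static int show_ring(struct ring_entry *const ring[RING_SIZE])
{
    int i, null_slots = 0;

    for (i = 0; i < RING_SIZE; i++) {
        if (ring[i] == NULL) {
            /* An unguarded ring[i]->status here is the kind of access
             * that faults at virtual address 00000000. */
            printf("  ring entry %d  (null)\n", i);
            null_slots++;
            continue;
        }
        printf("  ring entry %d  %08x\n", i, (unsigned)ring[i]->status);
    }
    return null_slots;
}
```

If this reading is right, the pointers that speedo_tx_timeout printed and the pointers this loop dereferences need not be the same ones, which would explain why the earlier message came out fine.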

Any help would be appreciated. Thanks.

-- 
Emil Sit