[eepro100] Dell 4400 instability with eepro100 driver...
Henrik Schmiediche
"Henrik Schmiediche" <henrik@stat.tamu.edu>
Sun Feb 17 00:36:00 2002
Hello Ben,
I have swapped the RAM for completely different modules and reduced it to
1GB, with no success. I have also tried the 2.4.17 kernel with and without
the IRQ Rate patch, with no success. Granted, I have not tried the 2.4.17
kernel with the latest eepro100 drivers; I tried it when I was still using
the stock eepro100 and the Intel e100 drivers. The number of
driver/kernel/RAM permutations is getting ridiculous.
I have not run memtest. Where do I find this?
Sincerely,
- Henrik
----- Original Message -----
From: "Ben Greear" <greearb@candelatech.com>
To: "Henrik Schmiediche" <henrik@stat.tamu.edu>
Cc: <eepro100@scyld.com>
Sent: Saturday, February 16, 2002 10:39 PM
Subject: Re: [eepro100] Dell 4400 instability with eepro100 driver...
> Have you tried running memtest to see if your memory is
> good? Might want to try limiting the machine to 1 or 2 GB
> of RAM to see if the problems go away. (Not a cure, but it
> will tell us something useful.)
>
> Henrik Schmiediche wrote:
>
> > Hello,
> > I have a single-processor Dell 4400 server with 4GB of RAM that I cannot
> > get to run stably under high network loads (NFS, remote backups). I am
> > about ready to trash this system and go back to a Sun. I am running RH 7.2
> > with 2.4.9-13, and I have used the stock eepro100 drivers that come with
> > RH, the latest Intel 1.6.29 drivers, and the latest eepro100 drivers, and
> > all of them lock up. I also get lockups (WATCHDOG/timeout) when I install
> > a 3com 3c905C card (though I have not tried the latest drivers for this
> > card from the scyld website). I have also tried changing to an external
> > eepro100 card (instead of using the built-in one) with no success. When I
> > installed the latest eepro100 drivers I get this NMI message, which may be
> > related to the lockups, but I am not sure... I have tried changing RAM
> > with no success.
> >
> > Feb 16 07:58:38 s0 kernel: eepro100.c:v1.20 1/28/2002 Donald Becker <becker@scyld.com>
> > Feb 16 07:58:38 s0 kernel: http://www.scyld.com/network/eepro100.html
> > Feb 16 07:58:38 s0 kernel: Uhhuh. NMI received. Dazed and confused, but trying to continue
> > Feb 16 07:58:38 s0 kernel: You probably have a hardware problem with your RAM chips
> > Feb 16 07:58:38 s0 kernel: Uhhuh. NMI received. Dazed and confused, but trying to continue
> > Feb 16 07:58:38 s0 kernel: You probably have a hardware problem with your RAM chips
> > Feb 16 07:58:38 s0 kernel: Uhhuh. NMI received for unknown reason 25.
> > Feb 16 07:58:38 s0 kernel: Dazed and confused, but trying to continue
> > Feb 16 07:58:38 s0 kernel: Do you have a strange power saving mode enabled?
> > Feb 16 07:58:38 s0 kernel: eth0: Intel i82559 rev 8 at 0xf899f000, 00:B0:D0:20:87:60, IRQ 14.
> > Feb 16 07:58:38 s0 kernel: Board assembly 07195d-000, Physical connectors present: RJ45
> > Feb 16 07:58:38 s0 kernel: Primary interface chip i82555 PHY #1.
> > Feb 16 07:58:38 s0 kernel: General self-test: passed.
> > Feb 16 07:58:38 s0 kernel: Serial sub-system self-test: passed.
> > Feb 16 07:58:38 s0 kernel: Internal registers self-test: passed.
> > Feb 16 07:58:38 s0 kernel: ROM checksum self-test: passed (0x04f4518b).
> > Feb 16 07:58:38 s0 kernel: Receiver lock-up workaround activated.
> >
> > The error messages I get (a whole lot of them):
> >
> > Feb 15 23:35:22 s0 kernel: Command 0080 was not immediately accepted, 10001 ticks!
> > Feb 15 23:35:54 s0 last message repeated 19 times
> > Feb 15 23:36:00 s0 last message repeated 3 times
> > Feb 15 23:36:04 s0 kernel: eth0: Transmit timed out: status 0090 0080 at 25279986/25280017 commands 000ca000 000c0000 000c0000.
> > Feb 15 23:36:04 s0 kernel: Command 0080 was not immediately accepted, 10001 ticks!
> > Feb 15 23:36:04 s0 kernel: eth0: Restarting the chip...
> > Feb 15 23:36:04 s0 kernel: Command 0070 was not accepted after 10001 polls!
> > Feb 15 23:36:08 s0 kernel: eth0: Transmit timed out: status 0000 0010 at 25279986/25280018 commands 000ca000 000c0000 000c0000.
> > Feb 15 23:36:08 s0 kernel: eth0: Restarting the chip...
> >
> > A few additional comments:
> >
> > - I cannot recover from this except with a reboot. At least I do not
> > know how.
> > - The eepro100 card shares an interrupt with the SCSI controller. Is
> > there a way to reassign the IRQ of the eepro100 card?
> > - The system is even more unstable when I install a second CPU.
> >
> > Any ideas on what to try? Bad motherboard?
> >
> > Sincerely,
> >
> > - Henrik
> >
> >            CPU0
> >   0:    5115449    XT-PIC  timer
> >   1:       1875    XT-PIC  keyboard
> >   2:          0    XT-PIC  cascade
> >   5:         30    XT-PIC  aic7xxx
> >   8:          1    XT-PIC  rtc
> >  10:   36396202    XT-PIC  aic7xxx
> >  11:          0    XT-PIC  usb-ohci
> >  12:       3151    XT-PIC  PS/2 Mouse
> >  14:    6638596    XT-PIC  aic7xxx, eth0
> > NMI:          3
> > ERR:          0
> >
> > PCI devices found:
> > Bus 0, device 0, function 0:
> > Host bridge: ServerWorks CNB20LE Host Bridge (rev 5).
> > Master Capable. Latency=48.
> > Bus 0, device 0, function 1:
> > Host bridge: ServerWorks CNB20LE Host Bridge (#2) (rev 5).
> > Master Capable. Latency=48.
> > Bus 0, device 17, function 0:
> > Host bridge: ServerWorks CNB20LE Host Bridge (#3) (rev 5).
> > Master Capable. Latency=48.
> > Bus 0, device 17, function 1:
> > Host bridge: ServerWorks CNB20LE Host Bridge (#4) (rev 5).
> > Master Capable. Latency=48.
> > Bus 0, device 4, function 0:
> > Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 8).
> > IRQ 14.
> > Master Capable. Latency=32. Min Gnt=8.Max Lat=56.
> > Non-prefetchable 32 bit memory at 0xfeb02000 [0xfeb02fff].
> > I/O at 0xfcc0 [0xfcff].
> > Non-prefetchable 32 bit memory at 0xfe900000 [0xfe9fffff].
> > Bus 0, device 6, function 0:
> > VGA compatible controller: ATI Technologies Inc 3D Rage IIC (rev 122).
> > Master Capable. Latency=32. Min Gnt=8.
> > Prefetchable 32 bit memory at 0xfd000000 [0xfdffffff].
> > I/O at 0xf800 [0xf8ff].
> > Non-prefetchable 32 bit memory at 0xfeb01000 [0xfeb01fff].
> > Bus 0, device 15, function 0:
> > ISA bridge: ServerWorks OSB4 South Bridge (rev 79).
> > Bus 0, device 15, function 2:
> > USB Controller: ServerWorks OSB4/CSB5 OHCI USB Controller (rev 4).
> > IRQ 11.
> > Master Capable. Latency=32. Max Lat=80.
> > Non-prefetchable 32 bit memory at 0xfeb00000 [0xfeb00fff].
> > Bus 6, device 4, function 0:
> > PCI bridge: PCI device 8086:0962 (Intel Corporation) (rev 1).
> > Master Capable. Latency=32. Min Gnt=6.
> > Bus 7, device 4, function 0:
> > SCSI storage controller: Adaptec 7899P (rev 1).
> > IRQ 10.
> > Master Capable. Latency=32. Min Gnt=40.Max Lat=25.
> > I/O at 0xcc00 [0xccff].
> > Non-prefetchable 64 bit memory at 0xfacff000 [0xfacfffff].
> > Bus 7, device 4, function 1:
> > SCSI storage controller: Adaptec 7899P (#2) (rev 1).
> > IRQ 5.
> > Master Capable. Latency=32. Min Gnt=40.Max Lat=25.
> > I/O at 0xc800 [0xc8ff].
> > Non-prefetchable 64 bit memory at 0xfacfe000 [0xfacfefff].
> > Bus 7, device 6, function 0:
> > SCSI storage controller: Adaptec AIC-7880U (rev 2).
> > IRQ 14.
> > Master Capable. Latency=32. Min Gnt=8.Max Lat=8.
> > I/O at 0xc400 [0xc4ff].
> > Non-prefetchable 32 bit memory at 0xfacfd000 [0xfacfdfff].
> >
> > [root@s0:/var/log]# mii-diag
> > Using the default interface 'eth0'.
> > Basic registers of MII PHY #1: 3000 782d 02a8 0154 05e1 41e1 0003 0000.
> > The autonegotiated capability is 01e0.
> > The autonegotiated media type is 100baseTx-FD.
> > Basic mode control register 0x3000: Auto-negotiation enabled.
> > You have link beat, and everything is working OK.
> > Your link partner advertised 41e1: 100baseTx-FD 100baseTx 10baseT-FD
> > 10baseT.
> > End of basic transceiver information.
> >
> >
> >
> >
> > _______________________________________________
> > eepro100 mailing list
> > eepro100@scyld.com
> > http://www.scyld.com/mailman/listinfo/eepro100
> >
> >
>
>
> --
> Ben Greear <greearb@candelatech.com> <Ben_Greear AT excite.com>
> President of Candela Technologies Inc http://www.candelatech.com
> ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear
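
The observation in the quoted report that eth0 shares an interrupt with the SCSI controller can be checked mechanically from a /proc/interrupts dump. The following is only a rough sketch, not part of the original thread: the function name `shared_irqs` is hypothetical, and it assumes the 2.4-era layout shown in the quoted listing (one or more numeric per-CPU counters, a single-token controller name such as XT-PIC, then a comma-separated device list).

```python
import re

# Excerpt of the /proc/interrupts listing from the quoted report.
SAMPLE = """\
           CPU0
  0:    5115449    XT-PIC  timer
  5:         30    XT-PIC  aic7xxx
 10:   36396202    XT-PIC  aic7xxx
 12:       3151    XT-PIC  PS/2 Mouse
 14:    6638596    XT-PIC  aic7xxx, eth0
NMI:          3
ERR:          0
"""


def shared_irqs(text):
    """Return {irq: [devices]} for numbered IRQs claimed by more than one device."""
    shared = {}
    for line in text.splitlines():
        # Only numbered IRQ rows; the CPU header, NMI and ERR rows do not match.
        match = re.match(r"\s*(\d+):\s+(.*)", line)
        if not match:
            continue
        irq = int(match.group(1))
        fields = match.group(2).split()
        # Skip the per-CPU interrupt counters (one numeric field per CPU).
        i = 0
        while i < len(fields) and fields[i].isdigit():
            i += 1
        # Assumption: the next field is a single-token controller name
        # (e.g. XT-PIC); everything after it is the device list.
        devices = [d.strip() for d in " ".join(fields[i + 1:]).split(",") if d.strip()]
        if len(devices) > 1:
            shared[irq] = devices
    return shared


if __name__ == "__main__":
    for irq, devices in sorted(shared_irqs(SAMPLE).items()):
        print(f"IRQ {irq} is shared by: {', '.join(devices)}")
```

On a live system the same function could be fed the contents of /proc/interrupts instead of the pasted excerpt. Newer kernels format the controller column differently, so the parsing assumption would need adjusting there.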