[vortex] Problem with built-in dual 3c982, Dual AMD S2462 MB.
Heflin, Roger A.
Roger.A.Heflin@conoco.com
Fri Jan 18 10:06:00 2002
Apparently we were not running noapic on the test machines.
Lilo or the kernel cut the kernel boot line at 80 chars or so, which
cut off noapic and apparently also part of our serial port speed
(115 baud vs. 115200 baud). We had noticed that the boot across the
serial line was slow but had not put our finger on why. We are
investigating how to correct this feature.
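A minimal sketch of how both symptoms could be checked on a running
node, in Python. It assumes the serial speed is passed as a
console=ttySN,<speed> kernel parameter (the original boot line is not
shown here, so that form is a guess) and that the text comes from
/proc/cmdline and /proc/interrupts. The function names check_cmdline
and non_xt_pic_lines are hypothetical helpers, not part of any tool.

```python
import re


def check_cmdline(cmdline):
    """Summarize a kernel command line: its length, whether the
    'noapic' token survived, and any serial console speeds found.

    Assumes the console=ttySN,<speed> form; other ways of setting
    the serial speed are not detected.
    """
    tokens = cmdline.split()
    speeds = []
    for tok in tokens:
        m = re.match(r"console=ttyS\d+,(\d+)", tok)
        if m:
            speeds.append(int(m.group(1)))
    return {
        "length": len(cmdline.strip()),
        "noapic": "noapic" in tokens,
        "console_speeds": speeds,
    }


def non_xt_pic_lines(interrupts_text):
    """Return the numbered IRQ lines that are not routed via XT-PIC.

    Header and summary lines (CPU0, NMI, ERR, ...) are skipped
    because they do not start with an IRQ number.
    """
    bad = []
    for line in interrupts_text.splitlines():
        if re.match(r"\s*\d+:", line) and "XT-PIC" not in line:
            bad.append(line.strip())
    return bad
```

On a node, check_cmdline(open("/proc/cmdline").read()) should report
noapic as True and a console speed of 115200 if the boot line arrived
intact, and non_xt_pic_lines(open("/proc/interrupts").read()) should
return an empty list when the machine really is in noapic mode.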
And your assertion does agree with our new test data: the machines still
lose their networking, just not nearly as quickly as before.
Roger
> -----Original Message-----
> From: Bogdan Costescu [SMTP:bogdan.costescu@iwr.uni-heidelberg.de]
> Sent: 1/ 18/ 2002 7:22 AM
> To: Heflin, Roger A.
> Cc: vortex@scyld.com
> Subject: RE: [vortex] Problem with built-in dual 3c982, Dual AMD S2462 MB.
>
> On Wed, 16 Jan 2002, Heflin, Roger A. wrote:
>
> > Jan 16 13:25:57 poeplx1008 kernel: eth0: Interrupt posted but
> > not delivered -- IRQ blocked by another device?
>
> That's the problem: an interrupt gets lost. That's consistent with your
> observations on increasing Tx timeout and Tx ring size. I'm actually
> surprised that the "noapic" didn't help. Could you make sure that it's
> really running in "noapic" mode by making sure that /proc/interrupts
> only contains "XT-PIC" entries?
>
> The problem is that because of the Tx interrupt mitigation, sometimes a
> full Tx ring waits for one interrupt to free it all. If this interrupt
> is lost, the Tx timeout occurs. By increasing the Tx ring size you are
> not solving the problem, just decreasing the probability of happening.
>
> I still wonder what's the difference between using 905C and 982...
>
> --
> Bogdan Costescu
>
> IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
> Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
> Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
> E-mail: Bogdan.Costescu@IWR.Uni-Heidelberg.De
>