Transmit timeouts in v1.06 (v1.09l is worse??)
Andreas Vierengel
andreas@Vierengel.de
Tue Sep 7 13:00:41 1999
Mark Hagger wrote:
>
> Hi,
>
> I've been running kernel 2.2.5 under Red Hat 5.2 on a parallel cluster of
> machines. Unfortunately, under high network, CPU and disk I/O load I kept
> getting repeated transmit timeout messages from the eepro100 driver (v1.06).
> These effectively left the machine hung, and I typically had to power it off
> to reboot it.
>
> I've tried replacing v1.06 with the latest version, v1.09l, but if anything this
> was worse: under the same load conditions the machines now crash fatally.
> I got a kernel oops once, but it didn't appear in the syslog, so I wasn't able to
> process it. Other than that, the machine typically locks solid (blank screen as
> well, sadly), and I couldn't do anything except power off.
>
> Is anyone out there having any success with these eepro100 cards? I see a number
> of people getting similar problems on machines with high network traffic.
>
> Unfortunately this is somewhat killing my parallel application: as I update my
> code to get better network throughput, I am able to crash/hang the machines
> quicker......
:-)
But seriously, I had the same problems under high load, even with a 2.0.x kernel.
I switched back to driver v1.05 and everything is working again. Maybe that will
work for you, too?
--Andy