Unexplained I/O errors
Steven Timm
timm at fnal.gov
Tue Jul 17 11:59:36 PDT 2001
> We had similar problems with UDMA and the 2.4.X kernels
> on this particular type of MB. You could try setting
> /sbin/hdparm -c1 -d1 -X32 /dev/hda
This command immediately took the disk offline and hung the machine.
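
For reference, the numeric argument to -X follows the transfer-mode
numbering in the hdparm manual: PIO modes are 8+n, multiword DMA modes
are 32+n, and UltraDMA modes are 64+n. On that numbering, -X32 asks for
multiword DMA mode 0 rather than a UDMA mode. The sketch below decodes
such a number. The function name xfer_mode_name is hypothetical, and the
16+n singleword DMA range is taken from the ATA transfer-mode encoding,
not from this thread.

```shell
#!/bin/sh
# Hypothetical helper: translate an hdparm -X transfer-mode number
# into a readable name. It only does arithmetic and never touches a disk.
xfer_mode_name() {
    mode=$1
    # Reject empty or non-numeric input before doing integer comparisons.
    case $mode in
        ''|*[!0-9]*) echo "unknown"; return 1 ;;
    esac
    if [ "$mode" -ge 64 ]; then
        echo "udma$((mode - 64))"      # UltraDMA mode n is 64+n (no upper bound check)
    elif [ "$mode" -ge 32 ]; then
        echo "mdma$((mode - 32))"      # multiword DMA mode n is 32+n
    elif [ "$mode" -ge 16 ]; then
        echo "sdma$((mode - 16))"      # singleword DMA mode n is 16+n (obsolete)
    elif [ "$mode" -ge 8 ]; then
        echo "pio$((mode - 8))"        # PIO mode n is 8+n
    elif [ "$mode" -eq 0 ]; then
        echo "pio-default"             # 0 selects the drive's default PIO mode
    else
        echo "unknown"                 # other values below 8 are not decoded here
    fi
}

# Usage: xfer_mode_name 32
```

Running hdparm -i /dev/hda, or hdparm -c -d /dev/hda with no values
after the flags, only reports the current settings, which may be a
gentler first step than forcing a mode.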
>
> You can also try running a kernel without any UDMA
> support compiled in.
>
> A third option is to compile the kernel without initial
> UDMA support.
Does either of these two options apply to 2.2.19 kernels, or only
to 2.4 kernels?
Steve
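
The thread does not say which configuration symbols correspond to these
suggestions on 2.2.19, and the names differ between kernel series. One
hedged way to check is to search the .config of the kernel source tree
actually in use for DMA-related symbols. In this sketch the function
name list_dma_options is hypothetical, and the /usr/src/linux/.config
path in the usage comment is an assumed, conventional location.

```shell
#!/bin/sh
# Hypothetical helper: list DMA-related symbols in a kernel .config file.
# It prints both enabled lines (CONFIG_FOO=y) and disabled ones
# ("# CONFIG_FOO is not set"), and prints nothing if there are none.
list_dma_options() {
    config_file=$1
    if [ ! -r "$config_file" ]; then
        echo "cannot read $config_file" >&2
        return 1
    fi
    # "|| true" keeps the exit status at 0 when grep finds no match.
    grep -E 'CONFIG_[A-Z0-9_]*DMA' "$config_file" || true
}

# Usage (assumed path): list_dma_options /usr/src/linux/.config
```

The output only shows which symbols exist and how they are set in that
tree. Deciding which of them to turn off is still a judgment call.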
>
> A fourth possibility is that one of your hardware components is
> really broken ;-)
>
> Good luck,
> Matthijs
> ____________________________
> Dr ir Matthijs van Leeuwen
> HPC Specialist
> Compusys Plc, 58 Edison Road
> Rabans Lane Industrial Estate
> Aylesbury, Bucks HP19 8UT, UK
> Tel: +44 (0)1296 505143
> Fax: +44 (0)1296 424165
> Email: m.vanleeuwen at compusys.co.uk
> Web: http://www.compusys.co.uk
>
>
> -----Original Message-----
> From: Steven Timm [mailto:timm at fnal.gov]
> Sent: Tuesday, July 17, 2001 4:19 PM
> To: beowulf at beowulf.org
> Subject: Unexplained I/O errors
>
>
>
> Hi everyone,
>
> We are currently burning in a new cluster and seeing the following
> problem:
>
> We see a number of files, usually contiguous in the same directory,
> that ls lists as present but for which ls -l reports "Input/output error".
> Running fsck on the file system clears the I/O errors but also removes
> the affected files. There is no error message on the console, nor
> in /var/log/messages, to indicate any disk controller problems.
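
The symptom described above (a name returned by the directory listing
whose metadata cannot be read) can be surveyed with a small read-only
check. This is a minimal sketch: the function name find_unstatable is
hypothetical, it assumes file names without embedded newlines, and it
reports any failure of ls -ld, not only Input/output errors.

```shell
#!/bin/sh
# Hypothetical helper: print every entry of a directory whose name is
# returned by a plain listing but whose metadata cannot be read.
# It only reads; it does not repair or delete anything.
find_unstatable() {
    dir=$1
    # "ls -A" supplies the names, including dot files but not . and ..
    ls -A "$dir" | while IFS= read -r name; do
        # "ls -ld" must read the entry's metadata; a non-zero exit
        # status means that read failed for this entry.
        if ! ls -ld -- "$dir/$name" >/dev/null 2>&1; then
            printf '%s\n' "$dir/$name"
        fi
    done
}

# Usage: find_unstatable /some/directory
```

On a healthy directory the function prints nothing, so any output is a
list of candidates to examine before running fsck.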
>
> The problem appears to get worse over time: over a period of a few
> days, the majority of our 136 machines come to exhibit these errors.
>
> Our configuration: Supermicro 370DLE motherboard, 2x1000MHz Pentium III,
> 512 MB RAM, Seagate system disk (30 GB) and CD-ROM on IDE primary,
> 2x40GB IBM drives on IDE secondary.
> hda: ST330620A, ATA DISK drive
> hdb: CD-ROM 48X/AKH, ATAPI CDROM drive
> hdc: IC35L040AVER07-0, ATA DISK drive
> hdd: IC35L040AVER07-0, ATA DISK drive
>
> I/O errors happen only on the system disk.
>
> We swapped out a large number of IDE cables for the system disk,
> replacing them with a better grade, with no luck.
>
> We have downgraded a few machines to the 2.2.16 kernel, and this
> appears to be OK, but it is a bit early to tell.
>
> We have also pulled the CD-ROM drives off of a few machines and this
> also appears to be stable, but we need more data yet.
>
> Any idea what could be causing all of this?
>
> Steve
>
>
>
> ------------------------------------------------------------------
> Steven C. Timm (630) 840-8525 timm at fnal.gov http://home.fnal.gov/~timm/
> Fermilab Computing Division/Operating Systems Support
> Scientific Computing Support Group--Computing Farms Operations
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf