[Beowulf] Slow RAID reads, no errors logged, why?
David Mathog
mathog at caltech.edu
Mon Mar 19 14:36:18 PDT 2018
On 19-Mar-2018 14:21, Alex Chekholko wrote:
> Normally I would suggest doing a diagnostic read with dd from each disk,
> but you may not be able to do that with your RAID controller since it
> hides the individual disks.
I run full smartctl scans on the disks once a week. In fact those were
running at the time (on both machines). Nothing turns up other than
what was posted.
>
> My next recommendation would be a full AC cycle; can you power the host
> off for a few minutes? It's a bit cargo cult-y but sometimes it works.
> It may also help (or not) for you to spin around 3 times while the
> machine is off.
Did a reboot, but not a full AC cycle. There is a battery backup on the
RAID controller, so a power down never really cuts power to that board.
When the machine came back up the IO rate was unchanged. Perhaps a few
minutes with the power off might clear some stuck bits elsewhere in the
system.
Checked the speed on a third similar machine, again with an H370P
controller and SAS disks. This one is all SEAGATE ST4000NM0005. It was
about as fast as the 'A' machine.
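
For anyone who wants to repeat this kind of comparison, here is a minimal
sketch of one way to time a sequential read, roughly what a dd read test
does. It is not necessarily how the numbers in this thread were measured.
The function name read_throughput and its parameters are made up for
illustration, and the OS page cache is not bypassed, so a second read of
the same data can look much faster than the disks really are.

```python
import time


def read_throughput(path, block_size=1024 * 1024, max_bytes=None):
    """Read `path` sequentially; return (bytes_read, seconds, MB_per_s).

    Hypothetical helper for illustration only. `path` may be a large file
    or, with enough privileges, a block device node.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 skips Python's own buffer; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while max_bytes is None or total < max_bytes:
            want = block_size
            if max_bytes is not None:
                want = min(block_size, max_bytes - total)
            chunk = f.read(want)
            if not chunk:  # end of file or device
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Decimal megabytes per second, the unit dd reports.
    rate = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, rate
```

Calling read_throughput on the same large file (bigger than RAM, or after
dropping caches) on each machine gives numbers that can be compared
directly. Passing max_bytes limits the test to the first part of the file.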
Thanks,
David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech