[Beowulf] Slow RAID reads, no errors logged, why?

Mon Mar 19 13:58:12 PDT 2018

On one of our CentOS 6.9 systems with a PERC H730 controller I just
noticed that file system reads are quite slow.  Like 30Mb/s slow.
Anybody care to hazard a guess what might be causing this situation?
We have another quite similar machine which is fast (A), compared to
this one (B) which is slow:
             A             B
RAM          512           512           GB
CPUs         48            56            (via /proc/cpuinfo, actually this is threads)
Adapter      H710P         H730
RAID Level   *             *             Primary-5, Secondary-0, RAID Level Qualifier-3
Size         7.275         9.093         TB
state        *             *             Optimal
Drives       5             6
read rate    540           30            Mb/s (dd if=largefile bs=8192 of=/dev/null & ; iotop)
sata disk    ST2000NM0033
sas disk                   ST2000NM0023
patrol       No            No            (megacli shows patrol read not going now)
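
For anyone who wants to repeat the single-reader measurement without
watching iotop, here is a minimal sketch that times one dd pass and
prints the average rate.  The function name read_rate and its optional
block-size argument are made up for illustration; it assumes GNU date
(for %N) and reports MB as 10^6 bytes.  A file that was read recently
may come out of the page cache rather than the RAID, so pick one that
has not been touched lately.

```shell
#!/bin/sh
# read_rate FILE [BLOCK_SIZE]
# Hypothetical helper: read FILE once with dd and print the average
# rate in MB/s (10^6 bytes per second). Block size defaults to the
# 8192 bytes used in the dd command above.
read_rate() {
    file=$1
    bs=${2:-8192}
    bytes=$(wc -c < "$file") || return 1
    start=$(date +%s.%N)                     # GNU date: seconds.nanoseconds
    dd if="$file" of=/dev/null bs="$bs" 2>/dev/null || return 1
    end=$(date +%s.%N)
    awk -v b="$bytes" -v s="$start" -v e="$end" 'BEGIN {
        t = e - s
        if (t <= 0) t = 1e-9                 # guard against a zero interval
        printf "%.1f\n", b / t / 1e6
    }'
}
```

Running "read_rate largefile" and then "read_rate otherlargefile 1048576"
on two different files would show whether the block size makes any
difference on a given machine.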

ulimit -a on both is:
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 2067196
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 60000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Nothing in the SMART values indicates a read problem, although on "B"
one disk is slowly accumulating events in the rereads/rewrites column
of its write row (it has 2346, accumulated at about 10 per week).  The
same column is 0 in that disk's read row.  For "B" the smartctl output
columns are:

          Errors Corrected by           Total   Correction     Gigabytes    Total
              ECC          rereads/    errors   algorithm      processed    uncorrected
          fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors

read: 934353848  0 0 934353848  0 48544.026 0
read: 2017672022 0 0 2017672022 0 48574.489 0
read: 2605398517 3 0 2605398520 3 48516.951 0
read: 3237457411 1 0 3237457412 1 48501.302 0
read: 2028103953 0 0 2028103953 0 14438.132 0
read: 197018276  0 0 197018276  0 48640.023 0

write: 0 0 0 0 0 26394.472 0
write: 0 0 2346 2346 2346 26541.534 0
write: 0 0 0 0 0 27549.205 0
write: 0 0 0 0 0 25779.557 0
write: 0 0 0 0 0 11266.293 0
write: 0 0 0 0 0 26465.227 0

verify: 341863005  0 0 341863005  0 241374.368 0
verify: 866033815  0 0 866033815  0 223849.660 0
verify: 2925377128 0 0 2925377128 0 221697.809 0
verify: 1911833396 6 0 1911833402 6 228054.383 0
verify: 192670736  0 0 192670736  0 66322.573 0
verify: 1181681503 0 0 1181681503 0 222556.693 0
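
To pick the interesting rows out of counters like these, a small awk
filter is enough.  This is only a sketch: the function name
nonzero_rewrites is invented here, and it assumes rows laid out as
above, with the label in the first field and the rereads/rewrites count
in the fourth.

```shell
#!/bin/sh
# nonzero_rewrites
# Hypothetical helper: read error-counter rows on stdin and print only
# those whose rereads/rewrites column (field 4) is greater than zero.
# Field layout: label, ECC fast, ECC delayed, rereads/rewrites, ...
nonzero_rewrites() {
    awk '$1 ~ /^(read|write|verify):$/ && $4 + 0 > 0 { print }'
}
```

Fed the rows above, it prints only the one write row that shows 2346.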

If the process doing the IO is root it doesn't go any faster.

Oddly, if on "B" a second dd process is started on another file it ALSO
reads at 30Mb/s.  So the disk system then does a total of 60Mb/s, but
only 30Mb/s per process.  I added a 3rd and a 4th process doing the
same.  At the 4th it seemed to hit some sort of limit, with each process
now consistently below 30Mb/s and the total at maybe 80Mb/s.  Hard to
say what the exact total was as it was jumping around like crazy.  On
"A" 2 processes each got 270Mb/s, and 3 each got 180Mb/s.  I didn't
try 4.
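
Because the iotop totals jump around so much, the multi-reader test may
be easier to compare between machines if it is timed as a whole.  A
rough sketch follows; the function name parallel_read_rate is
hypothetical, it assumes GNU date, and it counts MB as 10^6 bytes.  It
starts one dd per file, waits for all of them, and divides the total
bytes by the elapsed wall time.  The per-reader figure is just the total
divided by the number of readers, so it only means much when the files
are about the same size.

```shell
#!/bin/sh
# parallel_read_rate FILE...
# Hypothetical helper: start one dd reader per FILE at the same time
# and print the aggregate and mean per-reader rate in MB/s.
parallel_read_rate() {
    [ "$#" -gt 0 ] || return 1
    total=0
    for f in "$@"; do
        size=$(wc -c < "$f") || return 1
        total=$((total + size))
    done
    start=$(date +%s.%N)                     # GNU date: seconds.nanoseconds
    for f in "$@"; do
        dd if="$f" of=/dev/null bs=8192 2>/dev/null &
    done
    wait                                     # until every background dd is done
    end=$(date +%s.%N)
    awk -v n="$#" -v b="$total" -v s="$start" -v e="$end" 'BEGIN {
        t = e - s
        if (t <= 0) t = 1e-9                 # guard against a zero interval
        printf "%d readers: %.1f MB/s total, %.1f MB/s mean per reader\n",
               n, b / t / 1e6, b / t / 1e6 / n
    }'
}
```

For example, "parallel_read_rate file1 file2 file3 file4" would repeat
the four-reader case in one step.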

The only oddness of late on "B" is that a few days ago it was loaded
with too many memory-hungry processes, so the OS killed some.  I have
had that happen before on other systems without them doing anything odd
afterwards.

Any ideas what this slowdown might be?

Thanks,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech