BEOWULF cluster hangs
Josip Loncaric
josip at icase.edu
Thu Sep 26 13:04:13 PDT 2002
Regarding VM/SMP/IDE issues in 2.4 kernels:
We still see some VM problems on SMP machines with memory-intensive jobs, even
in the latest Red Hat kernel 2.4.18-10smp. Single-CPU machines running
2.4.18-10 are generally stable (but see below). Also, support for the
ServerWorks chipset is worse in 2.4 kernels than in 2.2 kernels, resulting in
degraded IDE performance (no UDMA) and outright crashes when the kernel
detects that the OSB4 is in an "impossible state".
Sincerely,
Josip
P.S. "Optimistic memory allocation" in 2.4 kernels can misbehave. A user
application typically gets no indication of a memory shortage when it asks for
memory, but when it tries to use the allocated memory, the application (or
another process) can get terminated without any warning by the kernel's
out-of-memory (OOM) killer. Given this design, I would not want to rely on
any applications staying up under heavy memory demand. Moreover, while this
at least seems to work as designed on uniprocessor machines, our experience is
that when swap is enabled on SMP machines, even the OOM killer often cannot
prevent system crashes during OOM conditions (the machine crashes trying to
find a free memory page).
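
The allocate-then-touch pattern described above can be illustrated with a
small sketch. This is not from the original post: the function name
touch_pages and the sizes used are hypothetical, and the behaviour noted in
the comments assumes a Linux system with its usual lazy page allocation. The
size is kept small on purpose so that the example cannot itself trigger an
out-of-memory condition.

```python
import mmap


def touch_pages(size_bytes):
    """Map size_bytes of anonymous memory, then write one byte per page.

    Returns the number of pages written. Hypothetical helper for
    illustration only.
    """
    page = mmap.PAGESIZE
    # Creating the mapping normally succeeds right away; on Linux the kernel
    # typically does not set aside physical pages at this point.
    buf = mmap.mmap(-1, size_bytes)
    touched = 0
    try:
        for offset in range(0, size_bytes, page):
            # The first write to each page is where the kernel has to find
            # real memory. Under memory pressure, this is the step at which
            # a process may be killed rather than the request above.
            buf[offset] = 1
            touched += 1
    finally:
        buf.close()
    return touched


if __name__ == "__main__":
    size = 16 * 1024 * 1024  # 16 MiB, deliberately modest
    print(touch_pages(size), "pages touched")
```

The point of the sketch is only that the request and the use are separate
steps: an application that checks the result of its allocation call has not
thereby confirmed that the memory will be available when it is written to.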
--
Dr. Josip Loncaric, Research Fellow mailto:josip at icase.edu
ICASE, Mail Stop 132C PGP key at http://www.icase.edu./~josip/
NASA Langley Research Center mailto:j.loncaric at larc.nasa.gov
Hampton, VA 23681-2199, USA Tel. +1 757 864-2192 Fax +1 757 864-6134