[Beowulf] Stress / torture test cluster hardware
John Hearns
john.hearns at streamline-computing.com
Sun Oct 8 01:09:11 PDT 2006
Nico Mittenzwey wrote:
> Dear Beowulf mailing list members,
>
> we are building a new Beowulf cluster at the moment. New hardware is
> arriving every day. Now we want to make certain that this hardware has
> no errors. Therefore we want to stress test it.
> Do you know of any papers, articles, proceedings, tools... concerning
> this topic besides the ones below?
All of the links you suggest look good.
Other things to consider for a stress test are:
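Unpack a clean Linux kernel tree, compile the kernel, and tar up the
resulting tree. Repeat, and compare the two resulting tar files. Both
runs start from identical input, so an unexpected difference between
them suggests a hardware fault (memory, CPU or disk).
A Linux kernel compile is a surprisingly good way of stressing a system.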
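
A minimal sketch of the comparison step, assuming the two build trees
are still unpacked on disk. The function names hash_tree and diff_trees
and the directory names in the usage note are hypothetical. It hashes
file contents instead of comparing the tar files byte for byte, because
a tar archive also records timestamps, which differ between runs even
on healthy hardware. Files that embed a build date or build counter may
still differ legitimately, so treat the result as a list of files to
inspect and not as a verdict.

```python
import hashlib
import os


def hash_tree(root):
    """Return {relative_path: sha256 hex digest} for each regular file under root."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # Read in 1 MiB chunks so large objects do not fill memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, root)] = h.hexdigest()
    return digests


def diff_trees(first, second):
    """Return the sorted relative paths that are missing from one tree
    or whose contents differ between the two trees."""
    a, b = hash_tree(first), hash_tree(second)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

Calling diff_trees("build1", "build2") after two builds returns an
empty list when every file matches, and the paths to look at otherwise.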
On a completed cluster, run HPL on all nodes for an extended period and
let the cluster heat up.
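
For a long HPL run it helps to check the results as well as the
temperatures. A rough sketch, assuming the HPL output has been saved to
a text file and that each residual check line ends in the word PASSED
or FAILED. The function name count_residual_checks is hypothetical, and
the exact output layout depends on the HPL version, so check it against
your own logs first.

```python
import re


def count_residual_checks(log_text):
    """Count lines ending in PASSED or FAILED in saved HPL output.

    Assumption: HPL marks each residual check with one of those two
    upper-case words at the end of the line.
    """
    passed = failed = 0
    for line in log_text.splitlines():
        m = re.search(r"\b(PASSED|FAILED)\s*$", line)
        if m is None:
            continue
        if m.group(1) == "PASSED":
            passed += 1
        else:
            failed += 1
    return passed, failed
```

A nonzero failed count on a node that passed earlier, once the machine
room has warmed up, is a good reason to pull that node for closer
testing.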