[Beowulf] Stress / torture test cluster hardware
William Scullin
wscullin at ncsa.uiuc.edu
Mon Oct 9 13:50:18 PDT 2006
Hi,
I am a big fan of running repeated single-node HPL and HPCC runs - they
beat up memory and the CPU quite nicely. I would emphasize the
repeated part. A lot of hardware issues don't show up until machines
have heated up and cooled down a few times, so it may help to wait a
bit between runs. Also, feel free to exceed your physical memory and
use a bit of swap for a couple of runs - although I'd never do that
for a qualifying or tuning run. All of that said, the individual-node
HPL results are the sort of baseline data that makes tuning multinode
HPL runs easier down the line.
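To make the "repeated, with a pause" idea concrete, here is a minimal
Python sketch of a driver loop. It is only an illustration: the function
name run_repeatedly, the default run count, and the 600 second cool-down
are my own assumptions, and the command you pass in (for example your
own HPL launch line) is up to you.

```python
import subprocess
import time


def run_repeatedly(cmd, runs=5, cooldown_s=600, sleep=time.sleep):
    """Run cmd `runs` times with a pause between runs.

    Returns the list of exit codes, one per run. The `sleep` argument
    is injectable so the loop can be exercised without really waiting.
    """
    exit_codes = []
    for i in range(runs):
        # The benchmark's stdout/stderr are inherited, so its own
        # output still goes wherever you normally collect it.
        result = subprocess.run(cmd)
        exit_codes.append(result.returncode)
        if i < runs - 1:
            # Assumed cool-down: give the node time to drop back
            # toward idle temperature before heating it up again.
            sleep(cooldown_s)
    return exit_codes


# Hypothetical usage (the binary name is a placeholder, not a given):
#   codes = run_repeatedly(["./xhpl"], runs=10, cooldown_s=900)
#   print(codes)
```

A non-zero exit code is only the crudest signal; you would still want
to look at the residual checks in the benchmark output and at the
system logs after each pass.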
I'd also agree with the value of thousands of tars and untars - but
I'd keep it to directories with large numbers of small files. One of
my co-workers favors /usr/include for that purpose.
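A rough sketch of such a tar/untar loop, in Python, is below. The
helper names (tree_digest, tar_untar_cycles), the work_root parameter,
and the choice to archive only regular files and directories are my
own assumptions for the example; it checks each unpacked copy against
a SHA-256 digest of the source tree so that silent corruption shows up
as a mismatch.

```python
import hashlib
import os
import tarfile
import tempfile


def tree_digest(root):
    """Hash the relative paths and contents of regular files under root."""
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # symlinks and special files are skipped
            h.update(os.path.relpath(path, root).encode())
            with open(path, "rb") as f:
                h.update(f.read())
    return h.hexdigest()


def _regular_only(tarinfo):
    """Keep only regular files and directories in the archive."""
    return tarinfo if (tarinfo.isreg() or tarinfo.isdir()) else None


def tar_untar_cycles(src_dir, cycles=1000, work_root=None):
    """Tar and untar src_dir `cycles` times; return the mismatch count.

    work_root (assumed parameter) selects where the scratch copies go,
    so the test can be pointed at the disk you want to exercise.
    """
    expected = tree_digest(src_dir)
    mismatches = 0
    for _ in range(cycles):
        with tempfile.TemporaryDirectory(dir=work_root) as work:
            archive = os.path.join(work, "tree.tar")
            with tarfile.open(archive, "w") as tar:
                tar.add(src_dir, arcname="tree", filter=_regular_only)
            out = os.path.join(work, "out")
            os.mkdir(out)
            with tarfile.open(archive, "r") as tar:
                tar.extractall(out)
            if tree_digest(os.path.join(out, "tree")) != expected:
                mismatches += 1
    return mismatches


# Hypothetical usage, following the /usr/include suggestion:
#   print(tar_untar_cycles("/usr/include", cycles=1000, work_root="/scratch"))
```

Note that the default temporary directory may live on a RAM-backed
filesystem on some systems, which is why the sketch lets you point
work_root at a real disk.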
Happy Testing,
William
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
William Scullin
Systems Engineer
3006E NCSA Building
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801
office: +1-217-244-4866
mobile: +1-225-772-5273
e-mail: wscullin at ncsa.uiuc.edu
AIM IM: WilliamAtNCSA
“Like almost all others that began with metaphysical discussions.
The theory has advanced but the practical science is still in its
infancy and the modern statesman is constantly short of facts on
which he can base his speculations.”
- Antoine Lavoisier
On Oct 8, 2006, at 11:32 AM, Karen Shaeffer wrote:
> On Sun, Oct 08, 2006 at 09:09:11AM +0100, John Hearns wrote:
>> Nico Mittenzwey wrote:
>> Other things to consider for a stress test are:
>>
>> Unpack a clean Linux kernel tree. Do a kernel compile. Tar up the
>> resulting tree. Repeat, and compare the two resulting tar files.
>> A linux kernel compile is a surprisingly good way of stressing a
>> system.
>
> I would agree compiling the linux kernel is an excellent stress test. I've
> set it up in an endless loop, where multiple, independent trees are compiled
> in parallel. It does discover memory problems rather effectively, if you
> let it run a day or two.
>
> Thanks,
> Karen
> --
> Karen Shaeffer
> Neuralscape, Palo Alto, Ca. 94306
> shaeffer at neuralscape.com http://www.neuralscape.com