Diagnostic tools
alvin at Maggie.Linux-Consulting.com
Mon Oct 21 04:38:46 PDT 2002
hi ya manel
On Mon, 21 Oct 2002, Manel Soria wrote:
> Hi,
>
> We are looking for a diagnostic tool that (ideally) would
> allow us to determine what component/s of a node fail. It should
> test the processor, RAM, disk and network cards under heavy load
> but in repeatable conditions.
testing those items individually is a lot of work ...
the test process/procedure is probably more important than the actual test
- many different cpu/disk/memory/nic tests
http://www.Linux-1U.net/Diags/
( not quite finished yet... )
- many ways to tweak the system to maximize its performance
http://www.Linux-1U.net/Tuning/
( way-incomplete but .. maybe it's useful to ya ?? )
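as a rough illustration of "repeatable conditions": the sketch below runs the same seeded workload several times and compares the results. the function names, buffer size and pass count are made up for this example, and it is only a toy check of the idea, not a substitute for the real cpu/memory tests.

```python
import hashlib
import random


def pattern_check(seed: int, size: int = 1 << 20) -> str:
    """Build a seeded pseudo-random buffer of `size` bytes and return its SHA-256 digest."""
    rng = random.Random(seed)  # fixed seed -> identical pattern on every run
    data = rng.getrandbits(8 * size).to_bytes(size, "little")
    return hashlib.sha256(data).hexdigest()


def run_repeatable(seed: int, passes: int = 3, size: int = 1 << 20) -> bool:
    """Repeat the same workload; return True only if every pass gives the same digest."""
    digests = {pattern_check(seed, size) for _ in range(passes)}
    return len(digests) == 1


if __name__ == "__main__":
    # hypothetical seed; a False result would be a hint to look closer at that node
    print("consistent" if run_repeatable(seed=2002) else "MISMATCH")
```

because the seed is fixed, the same digest can also be compared between nodes, not just between passes on one node.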
> Other desirable features would be:
> -Run from a floppy, without OS in the disk, in order to allow
> good quality control of the new nodes.
running from floppy is a wee bit tricky: you have to squeeze a kernel
that can boot and get onto the network into 1.44MB ( 1.77MB )
- newer mb might be simpler/easier for network booting too
( diskless network booting is easier than a floppy boot )
- use a 4MB compact flash ... and booting as a diskless node
becomes trivial
> -Monitor the CPU temperature.
use i2c-2.6.5 and lm_sensors to read the health monitors on the
motherboard
also get a regular digital thermometer from your local hw store
for sanity checking
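a minimal sketch for logging the readings, assuming a kernel that exposes the sensor drivers through sysfs under /sys/class/hwmon with temp*_input files in millidegrees Celsius. that layout is an assumption here: the i2c/lm_sensors release mentioned above used a different /proc interface, and some boards put the files one directory deeper. the function name read_temps is invented for the example.

```python
import glob
import os


def read_temps(root: str = "/sys/class/hwmon") -> dict:
    """Return {file path: degrees Celsius} for every hwmon*/temp*_input under root."""
    temps = {}
    pattern = os.path.join(root, "hwmon*", "temp*_input")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                # assumed format: one integer, in millidegrees Celsius
                temps[path] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            # sensor not readable or not numeric; skip it
            continue
    return temps


if __name__ == "__main__":
    for path, celsius in read_temps().items():
        print(f"{path}: {celsius:.1f} C")
```

run it from cron or a loop during the load test, and compare against the thermometer reading before trusting the numbers.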
have fun
alvin
>
> We would appreciate suggestions and comments about this topic.
>
> Thanks for your help.
More information about the Beowulf mailing list