[Beowulf] How do people keep track of computers in your cluster(s)?

Kilian CAVALOTTI kilian at stanford.edu
Sun Oct 21 12:14:30 PDT 2007


Hi Carsten,

On Sunday 21 October 2007 07:29:49 Carsten Aulbert wrote:
> However, I would like to have something where we have something like a
> large table about the hardware in question. In there information like
>
> * vendor
> * serial number
> * MAC addresses (eth0, eth1,..., IPMI, RAID,...)
> * maybe even firmware versions and serial numbers of exchangeable
> internal hardware (hard disks)
> * basically all physical information of the box

We're using Dell OpenManage for this purpose. It's obviously limited to 
Dell hardware, but it gathers all this information and makes it available 
from a central place if you use their IT Assistant software. It lets you 
batch-upgrade firmware and BIOSes, collect SNMP traps and send email 
alerts, run various reports, and monitor basic performance metrics too.

> another table should hold the current setup, i.e. a mapping between the
> hardware and the "logical" setup, e.g.
>
> Hardware box number #1234 from above table has in the current setup the
> following...
>
> * hostname
> * IP addresses
> * running services

Hostnames and IP addresses are available through OMSA (OpenManage Server 
Administrator) too, but running services are not.

> And finally, another table where special problems, like memory errors
> and the like can be entered.

SNMP and BMC logs are reported to IT Assistant, so you get instant 
notification of hardware errors.

I'm not sure if that fits your environment, but if you own Dell hardware, 
it's definitely worth it.
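
For boxes that OpenManage can't cover, the three tables you describe could 
be sketched roughly as below. This is only a minimal illustration using 
Python's built-in sqlite3 module, not something we run here: every table 
name, column name and helper function is made up for the example, and 
running services and firmware versions are left out to keep it short.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
SCHEMA = """
CREATE TABLE hardware (
    box_id    INTEGER PRIMARY KEY,   -- the "hardware box number"
    vendor    TEXT NOT NULL,
    serial    TEXT UNIQUE
);
CREATE TABLE mac_address (
    box_id    INTEGER NOT NULL REFERENCES hardware(box_id),
    interface TEXT NOT NULL,         -- eth0, eth1, IPMI, RAID, ...
    mac       TEXT NOT NULL UNIQUE
);
CREATE TABLE logical_setup (
    box_id    INTEGER NOT NULL REFERENCES hardware(box_id),
    hostname  TEXT NOT NULL,
    ip        TEXT NOT NULL
);
CREATE TABLE problem_log (
    box_id    INTEGER NOT NULL REFERENCES hardware(box_id),
    noted_on  TEXT NOT NULL,         -- ISO date string, e.g. 2007-10-21
    note      TEXT NOT NULL
);
"""


def open_inventory(path=":memory:"):
    """Open (or create) the inventory database and create the tables."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def problems_for_host(conn, hostname):
    """Return (date, note) pairs logged against the box behind a hostname."""
    rows = conn.execute(
        "SELECT p.noted_on, p.note FROM problem_log p "
        "JOIN logical_setup s ON s.box_id = p.box_id "
        "WHERE s.hostname = ? ORDER BY p.noted_on",
        (hostname,),
    )
    return rows.fetchall()
```

Keeping the problem log keyed on the box number rather than the hostname 
means the history follows the physical machine when it gets renamed or 
reinstalled, which seems to be the point of separating the hardware table 
from the logical setup.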

Cheers,
-- 
Kilian 


