[Beowulf] Best Practices SOL vs Cyclades ACS

Paulo Afonso Lopes poral at fct.unl.pt
Sun Oct 11 09:10:50 PDT 2009


>> We have more than 400 machines. Every month there is one machine that we
>> cannot reboot using IPMI, or on which SOL is not working.
>
> we have something like 2500 nodes, mostly HP dl145g2's, and have a
> BMC-wedge probably 6-12 times/year.  can I ask what brand/model has such
> flakey IPMI?
> if you run "ipmi mc reset" on the node, does it resolve the problem?
> I wonder whether flakiness might also correspond to some config or usage
> pattern.  (ours dhcp from a local server - actually all the traffic is
> local.)

Mark,

Do you have SOL working on the HP DL145-G2?

I also have these nodes, and although I can use most IPMI functions
(including remote power up/cycle), I cannot get SOL to work.

Also, I have noticed that the kipmi0 daemon consumes a small but visible
amount of CPU time, e.g., 45 minutes over 9 days of uptime (with top's
default refresh interval, it shows up every 4 screens or so). This is on
CentOS 5.3.
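
To put that figure in proportion, here is a rough back-of-the-envelope
sketch. It assumes the 45 minutes is cumulative CPU time as reported by top,
measured against one core; the function name cpu_share is only illustrative.

```python
def cpu_share(cpu_minutes: float, uptime_days: float) -> float:
    """Return CPU time as a percentage of wall-clock uptime (one core)."""
    uptime_minutes = uptime_days * 24 * 60  # days -> minutes
    return 100.0 * cpu_minutes / uptime_minutes


# Figures from the observation above: 45 CPU minutes in 9 days of uptime.
share = cpu_share(45, 9)
print(f"kipmi0: {share:.2f}% of one core")  # prints "kipmi0: 0.35% of one core"
```

So the average load is small, even though the thread is visible in top from
time to time.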

Regards,

paulo

-- 
Paulo Afonso Lopes                        | Tel: +351- 21 294 8536
Departamento de Informática               | 294 8300 ext.10702
Faculdade de Ciências e Tecnologia        | Fax: +351- 21 294 8541
Universidade Nova de Lisboa               | e-mail: poral at fct.unl.pt
2829-516 Caparica, PORTUGAL



