[Beowulf] Anyone having IPMI problems on Intel S3200 series
Perry E. Metzger
perry at piermont.com
Wed Apr 15 14:16:15 PDT 2009
Greg Lindahl <lindahl at pbm.com> writes:
> On Wed, Apr 15, 2009 at 04:51:57PM -0400, Perry E. Metzger wrote:
>> Unfortunately, every once in a while, the IPMI BMCs on my test systems
>> simply stop talking to the network. This isn't overly tragic since I can
>> have a process go over to such a board when it detects that pings have
>> stopped working and use a local IPMI command to cold reset the BMC, but
>> it is still really Not The Right Thing.
> Hey, you're lucky that you have a way to reset the BMC without power
> cycling the box. It is not unusual for IPMI implementations to be much
It is in the IPMI spec -- you can request a cold reset provided the BMC
is responding at all. Luckily, in this case it still responds locally.
Usage in ipmitool (which I've largely abandoned for freeipmi for the
moment since it seems less buggy):
# ipmitool bmc reset cold
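For what it's worth, the "process that notices pings have stopped and resets the BMC locally" idea could be sketched roughly as below. This is only an illustration, not the script actually in use: the function names, the failure threshold of 3, and the ping flags (Linux iputils `-c`/`-W`) are all assumptions, and it has to run on the host itself, since the BMC is only answering over the local interface.

```python
import subprocess

# Assumed for illustration: Linux iputils ping flags (1 packet, 2 s timeout).
PING_CMD = ["ping", "-c", "1", "-W", "2"]
# The cold reset command quoted above.
RESET_CMD = ["ipmitool", "bmc", "reset", "cold"]


def bmc_reachable(bmc_addr, run=subprocess.run):
    """Return True if a single ping to the BMC's network address succeeds."""
    result = run(PING_CMD + [bmc_addr],
                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def check_and_reset(bmc_addr, failures, threshold=3, run=subprocess.run):
    """Ping once; cold reset the BMC after `threshold` consecutive misses.

    Returns the updated consecutive-failure count, which is 0 after a
    successful ping or after a reset has been issued.
    """
    if bmc_reachable(bmc_addr, run=run):
        return 0
    failures += 1
    if failures >= threshold:
        # Runs on the host itself: the BMC still answers locally.
        run(RESET_CMD, check=False)
        return 0
    return failures
```

One would call `check_and_reset()` from a loop or a cron job, carrying the returned count from one call to the next. The `run` parameter exists only so the logic can be exercised without touching real hardware.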
>> Also, I suspect every once in a great while I'll get a simultaneous
>> OS and IPMI BMC failure and shoe leather will be needed to reset the
>> box, which I don't like.
> Belt and suspenders -- that's what remote-controlled power strips are
> for. It doesn't sound like you'd see a double-failure very often.
Unless the hardware turns out to be far less reliable than I think, the
double failure rate looks low enough that I'm not going to bother. Having
to reset a box by hand here and there isn't that big a deal. I just wish
it were not even an issue...
Perry E. Metzger perry at piermont.com