[Beowulf] Re: [Linux-HA] Couldn't get watchdog to work
alex at DSRLab.com
Tue Dec 28 09:14:16 PST 2004
> -----Original Message-----
> Paul Chen wrote:
> > Both nodes did restart
> > heartbeat but none of them reboot or shut down. Am I doing
> > something wrong?
> Alan Robertson wrote:
> The watchdog timer will only kill the system if heartbeat goes insane.
> It didn't. So, the watchdog timer is happy.
> At this point in time, the watchdog timer is not a
> replacement for a STONITH device.
Which is exactly what I am looking into (the STONITH device)...
I see two solutions, one hardware and one software. The hardware solution
looks expensive, but I believe the software solution will help Mr. Chen
(above), and would appreciate comments.
I would have my "backup" system execute a command as part of its attempts to
assume the identity, responsibilities and resources of the "primary" system.
The command is run from backup, as follows:
root@backup> ssh root@primary shutdown -h now
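This will not work in all cases (if primary is hung or unreachable over the
network, the ssh command never gets through), but it should work in cases like
the above, where the node is still up and responsive.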
A hardware solution is more general, but it doesn't hurt to run this command
in any case.
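
As a rough sketch only, the ssh command above could be wrapped in a small shell
function that backup calls before it takes over the resources. The function
name fence_peer, the 10-second timeout, and the assumption that root on backup
has key-based (passwordless) ssh access to primary are illustrative choices,
not part of heartbeat or of the original setup.

```shell
#!/bin/sh
# Sketch: ask the peer node to halt itself over ssh.
# Assumes key-based root ssh login to the peer is already configured.
# fence_peer is a hypothetical helper name, not a heartbeat command.

fence_peer() {
    peer="$1"
    # BatchMode=yes      fail instead of prompting for a password
    # ConnectTimeout=10  give up if the peer does not answer (value is a guess)
    ssh -o BatchMode=yes -o ConnectTimeout=10 "root@${peer}" shutdown -h now
}
```

Called as "fence_peer primary", a non-zero exit status suggests that the
request did not get through (peer hung, network down, login refused). A zero
status only means the shutdown command was accepted, not that primary is
actually off, which is why this is no substitute for a real STONITH device.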