[Beowulf] S2466 permanent poweroff, round 2

David Mathog mathog at caltech.edu
Wed Feb 14 12:12:20 PST 2007

Robert G. Brown wrote:

> They were so damn touchy and difficult to get
> running so that they actually were stable and so that the buttons worked
> and so on that once we finally got there, I'd have taken a hammer to the
> head of anybody that tried to change them.

The S2466Ns are incredibly touchy, aren't they?  When reimaging the
other 19 nodes some of them had to be reset, and in a couple of cases,
unplugged and plugged back in, before they all came up properly with
"boel" loaded from the headnode.

I never did get acpi working perfectly, just (barely) good enough.
If the "button" module is loaded once, and never looked at
sideways again, then after "poweroff" the front panel switch
works to restart the system.  Turn acpi on, or even just do:
  rmmod button; modprobe button

and that front panel switch won't work after "poweroff".  
Never could get acpid working at the same time, so no way to
trigger a shutdown from the power button.  That's less of a problem
though, since historically if "rsh nodename poweroff" doesn't get
through, the odds are about 50% that the node will also ignore its
reset and power buttons.
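
Since the front panel switch only keeps working if "button" is loaded
and then left alone, it may help to confirm the module is present before
issuing "poweroff", without reloading it.  A minimal sketch, assuming the
usual /proc/modules layout where the module name is the first field on
each line; the function names are made up for illustration:

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def button_loaded(path="/proc/modules"):
    """Report whether the 'button' module is currently loaded.

    This only reads the module list; it never unloads or reloads anything.
    """
    with open(path) as fh:
        return "button" in loaded_modules(fh.read())
```

This only reports state.  It deliberately does not call rmmod or
modprobe, since reloading the module is what breaks the switch here.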

Anyway, the two main reasons for upgrading these nodes were:

1.  Get athcool working.  This knocks about 50W/CPU off the idle
power consumption and drops the idle CPU temps from 39C to 29C.
(A previous fling with athcool and the previous kernel did not work.)
2.  Hopefully eliminate a bug in i2c that was causing sensors to stop
working every once in a while, resulting in node shutdowns because
the Over Temp scripts would suddenly be unable to obtain valid
temperature or fan speed measurements.
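
To make the second point concrete: a check of this kind has to treat a
missing or implausible reading as a failure, which is why a flaky i2c bus
ends up shutting nodes down.  The following is only a sketch of that
logic, not the actual Over Temp scripts; the thresholds and function
names are assumptions chosen for illustration:

```python
def check_reading(temp_c, fan_rpm, max_temp_c=60.0, min_fan_rpm=1000):
    """Classify one sample as 'ok', 'over_temp', 'fan_fail' or 'invalid'.

    temp_c and fan_rpm are None when the sensor could not be read.
    The limits are hypothetical defaults, not values from any real node.
    """
    if temp_c is None or fan_rpm is None:
        return "invalid"
    # Readings outside a physically plausible range are also rejected.
    if not (0.0 < temp_c < 120.0) or fan_rpm < 0:
        return "invalid"
    if temp_c > max_temp_c:
        return "over_temp"
    if fan_rpm < min_fan_rpm:
        return "fan_fail"
    return "ok"


def should_shut_down(status):
    """Fail safe: anything other than a clean reading triggers shutdown."""
    return status != "ok"
```

With this fail-safe policy a sensor that stops responding is
indistinguishable from a real overheat, so `check_reading(None, 3000)`
leads to a shutdown just as `check_reading(75.0, 3000)` does.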


David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

