Help on cluster hang problem...

Robert G. Brown rgb at phy.duke.edu
Wed May 30 07:35:43 PDT 2001


On Tue, 29 May 2001, David Vos wrote:

> Hmmm.  I've seen Windows do that to enough computers I doubt the problem
> is the power supply.  Although to make Linux hang like that is usually a
> hardware problem.
>
> David
>
> On Tue, 29 May 2001, Greg Lindahl wrote:
>
> > Or, maybe I don't understand power switches and it actually is bios
> > catchable or something.

I don't think it is Linux or Windows -- I think it is just a mismatch
between the power supply capacity and the hardware configuration.  I've
definitely seen difficulty with an ATX system turning itself on and off
with an inferior power supply -- I once had to go through three on a
brand new system to get the damn thing to power up as the vendor clearly
hadn't read the motherboard spec (or powered it up before shipping it,
grrr).  Note that all three had the proper output line voltages (I checked)
-- they simply didn't have the peak power capacity required to do the
switching.

Thus by "inferior" I mean unable to provide the >>peak<< current
required on the switching line to make an ATX board (given the hardware
loading of the total system configuration) turn on or off, not that
there is anything necessarily "wrong with" or cheap about the power
supply itself.  Note also that the supply can actually have plenty of
nominal capacity measured in aggregate watts -- it is its ability to
deliver power on ONE LINE that matters.
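
To make the aggregate-versus-one-line point concrete, here is a minimal
sketch in Python.  The line names and every number in it are made up
purely for illustration; they are not taken from any real power supply
or motherboard spec, and a real check would use the figures from your
own hardware's documentation.

```python
def total_watts(amps_by_line, volts_by_line):
    """Sum volts * amps over every line to get an aggregate wattage."""
    return sum(amps_by_line[name] * volts_by_line[name] for name in amps_by_line)


def overloaded_lines(capacity_amps, peak_demand_amps):
    """Return (sorted) the lines whose peak demand exceeds their capacity."""
    return sorted(
        name
        for name, demand in peak_demand_amps.items()
        if demand > capacity_amps.get(name, 0.0)
    )


# Hypothetical supply and load: names and values are assumptions.
volts = {"main_5v": 5.0, "main_12v": 12.0, "switching": 5.0}
capacity = {"main_5v": 20.0, "main_12v": 8.0, "switching": 0.1}
peak_demand = {"main_5v": 12.0, "main_12v": 5.0, "switching": 0.3}

# The aggregate comparison looks comfortable...
print("capacity (W):", total_watts(capacity, volts))
print("peak demand (W):", total_watts(peak_demand, volts))

# ...but the per-line comparison still flags a problem.
print("overloaded lines:", overloaded_lines(capacity, peak_demand))
```

With these invented numbers the total demand sits well under the total
capacity, yet the per-line check still flags the switching line, which
is the kind of mismatch described above.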

Since I tend to get the cheapest possible systems, I probably see this
more often than some.  In my own experience, it is not at all unusual
for the front panel toggle (which is the thing that controls this) on a
"loaded" hardware configuration to turn the system on from the powered
down state but fail to turn it off when running (presumably the supply
can provide enough juice for the first with the system "off", but when
the motherboard is under even idle load it cannot manage the second).
I've got a couple of these systems sitting in the room
with me right now. One is "loaded" -- CD-RW, a couple of HD's, a floppy,
dual CPUs, lots of memory, a NIC, a high end video card.  The other
isn't as loaded but has an older motherboard and a smaller case and
power supply.  They run only Linux -- this isn't an OS issue.

Motherboards often come with their switching current requirements
indicated somewhere in the technical specs, but given the vast range of
motherboards, cases and "generic" power supplies, and hardware
configurations within the case (with every element making its own
demands, in many cases with e.g. NICs powered up even before the system
is turned "on") it really isn't that surprising that some systems are
mismatched or end up operating on the margins of the switching power
range.  Systems that have a hard time on the front panel switch also
generally can't manage to do a proper powerdown "halt" in software.

A LOT of systems come with both the front panel "hot" switch and a
rocker switch on the power supply itself, and even if the front panel
switch is tired and doesn't want to turn a system off the back one
always works.  So does pulling the plug;-)

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
