[Beowulf] Re: Fedora 6 System won't shut down

David Mathog mathog at caltech.edu
Mon Jul 30 08:33:59 PDT 2007


"A Lenzo" <alenzo at mail.rochester.edu> wrote:

> But I noticed something
> strange on a client machine - this only happens when I log into the client
> with a nonlocal userid (ie, one pulled from the NIS server).  When I am
> working, everything is fine on such a client node.  But when I shut it down,
> it stops with the following errors:
> 
> Unmounting pipe file systems OK
> Unmounting file systems OK
> Halting system...
> nfs: server barneysrv not responding, still trying
> nfs: server barneysrv not responding, still trying
> nfs: server barneysrv not responding, still trying
> (it does this forever)

Does the account use a home directory which is NFS mounted?  If so,
shutting down from that account may jam because the NFS client may
refuse to unmount a file system that is still in use.  If that's the
issue you should see the same thing if you do something like:

 (login to compute node as root)
 cd some_nfs_mounted_directory
 ls
 poweroff
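
Before trying that, it can help to confirm which directories are
actually on NFS.  The following is only a rough sketch, assuming a Linux
client where /proc/mounts lists NFS file systems with type "nfs" or
"nfs4".  The function names list_nfs_mounts and on_nfs, and the optional
table-file argument, are made up for illustration.

```shell
#!/bin/sh
# list_nfs_mounts [TABLE]: print the mount points of NFS file systems.
# TABLE is a file in /proc/mounts format (device mountpoint fstype ...);
# it defaults to /proc/mounts.  (Hypothetical helper, not a standard tool.)
list_nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# on_nfs DIR [TABLE]: succeed if DIR is at or below an NFS mount point.
on_nfs() {
    dir=$1
    for mp in $(list_nfs_mounts "$2"); do
        mp=${mp%/}              # so that an NFS-mounted "/" matches everything
        case "$dir/" in
            "$mp"/*) return 0 ;;
        esac
    done
    return 1
}
```

For example, `on_nfs "$HOME" && echo "home directory is on NFS"` run
from the affected account would show whether the login shell is holding
an NFS directory open at shutdown.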

It's also possible there are some mixed-up Required-Start
and Required-Stop lines in the init scripts, which can lead
to a jam on shutdown when service A needs service B in order
to turn itself off, but service B is turned off first.  The
current SGE scripts had this problem on my systems, which run
SGE 6 from an NFS mounted directory.  The header in
/etc/rc.d/init.d/sgeexecd was like this:

# Required-Start: $network
# Required-Stop:

but had to be changed to this:

# Required-Start: $network $remote_fs
# Required-Stop: $network $remote_fs

Otherwise it hung on shutdown.
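
To look for other scripts with the same gap, something like the sketch
below could be used.  It assumes the scripts carry LSB-style comment
headers as shown above; the function names are hypothetical.  Note that
it reports every script whose Required-Stop line lacks $remote_fs, and
only those that really run from (or depend on) an NFS mount need the
change.

```shell
#!/bin/sh
# stop_lists_remote_fs FILE: succeed if FILE has a Required-Stop header
# line that mentions $remote_fs.
stop_lists_remote_fs() {
    grep -Eq '^#[[:space:]]*Required-Stop:.*\$remote_fs' "$1"
}

# check_init_scripts DIR: print each regular file in DIR whose
# Required-Stop line is missing or does not list $remote_fs.
check_init_scripts() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        stop_lists_remote_fs "$f" || echo "$f"
    done
}
```

Running `check_init_scripts /etc/rc.d/init.d` gives a list of
candidates to inspect by hand.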

To avoid some of this I always shut down compute nodes with
something like this from the root account:

  rsh computenode 'poweroff'

root on the compute nodes has no NIS or NFS dependencies and so
can turn off the system without leaving its own processes hung.
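
For more than one node the same command can be wrapped in a loop.  This
is just a sketch: the function name shutdown_nodes, the DRY_RUN
variable, and the node names in the example are all made up, and it
assumes root can rsh to each node without a password.

```shell
#!/bin/sh
# shutdown_nodes NODE...: run poweroff on each named node via rsh.
# With DRY_RUN=1 the commands are printed instead of executed.
shutdown_nodes() {
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "rsh $node poweroff"
        else
            # stdin comes from /dev/null so rsh cannot swallow the
            # caller's input
            rsh "$node" poweroff </dev/null
        fi
    done
}
```

For example, `DRY_RUN=1 shutdown_nodes node01 node02` prints the two
rsh commands without running them.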

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


