[Beowulf] cli alternative to cluster top?
Donald Becker
becker at scyld.com
Sun Nov 30 09:19:29 PST 2008
On Sun, 30 Nov 2008, Robert G. Brown wrote:
> On Sat, 29 Nov 2008, Greg Kurtzer wrote:
>
> > Warewulf has a real time top like command for the cluster nodes and
> > has been known to scale up to the thousands of nodes:
> >
> > http://www.runlevelzero.net/images/wwtop-screenshot.png
> > On Wed, Nov 26, 2008 at 12:39 PM, Thomas Vixel <tvixel at gmail.com> wrote:
> >> I've been googling for a top-like cli tool to use on our cluster, but
> >> the closest thing that comes up is Rocks' "cluster top" script. That
> >> could be tweaked to work via the cli, but due to factors beyond my
> >> control (management) all functionality has to come from a pre-fab
> >> program rather than a software stack with local, custom modifications.
> >>
> >> I'm sure this has come up more than once in the HPC sector as well --
> >> could anyone point me to any top-like apps for our cluster?
> >>
> >> For reference, wulfware/wulfstat was nixed as well because of the
> >> xmlsysd dependency.
>
> That's fine, but I'm curious. How do you expect to run a cluster
> information tool over a network without a socket at both ends? If not
> xmlsysd, then something else -- sshd, xinetd, dedicated or general
> purpose, where the latter almost certainly will have higher
> overhead? Or are you looking for something with a kernel level network
> interface, more like scyld?
The theoretical architecture of our system has all of the
process control communication going over persistent TCP/IP sockets. The
master node has a 'master daemon'. As compute nodes boot and join the
cluster their 'slave daemon' opens a single TCP socket to the master
daemon.
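In rough outline (this is only an illustrative sketch, not our actual daemon,
and the control port 3333 is invented), the master side accepts each joining
node's connection and simply keeps that socket open for later traffic. A
daemon serving thousands of nodes would use epoll() rather than select(),
which is capped at FD_SETSIZE descriptors; select() is used here only for
brevity:

#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

int main(void)
{
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { 0 };
    int one = 1;
    fd_set all;
    int maxfd;

    addr.sin_family = AF_INET;
    addr.sin_port = htons(3333);              /* invented control port */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    setsockopt(listen_fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
    if (bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listen_fd, 128) < 0) {
        perror("bind/listen");
        return 1;
    }

    FD_ZERO(&all);
    FD_SET(listen_fd, &all);
    maxfd = listen_fd;

    for (;;) {
        fd_set ready = all;
        if (select(maxfd + 1, &ready, NULL, NULL, NULL) < 0)
            break;
        for (int fd = 0; fd <= maxfd; fd++) {
            if (!FD_ISSET(fd, &ready))
                continue;
            if (fd == listen_fd) {
                /* A compute node booted and joined: keep its socket open. */
                int node = accept(listen_fd, NULL, NULL);
                if (node >= 0) {
                    FD_SET(node, &all);
                    if (node > maxfd)
                        maxfd = node;
                }
            } else {
                /* Control/status traffic on an already-open connection. */
                char buf[256];
                ssize_t n = read(fd, buf, sizeof(buf));
                if (n <= 0) {                 /* node left the cluster */
                    close(fd);
                    FD_CLR(fd, &all);
                } else {
                    fwrite(buf, 1, (size_t)n, stdout);
                }
            }
        }
    }
    return 0;
}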
Having a persistent connection is a key element of performance. It
eliminates the cost and delay of name lookup, reverse name lookup, socket
establishment and authentication. (Example: The MPICH people learned this
lesson -- MPD is much faster than MPICH v1 using 'rsh'.)
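The slave side (again only a sketch, with an invented host name and port)
pays for the lookup and connect exactly once, at boot; after that each
report is a single small write() on the already-open socket, with no
per-message lookup, connect, or authentication:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

int main(void)
{
    struct addrinfo hints, *res;
    int fd, err;

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM;

    /* Paid exactly once: name lookup and connection establishment.
     * "master" and port 3333 are invented for this sketch. */
    err = getaddrinfo("master", "3333", &hints, &res);
    if (err != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
        return 1;
    }
    fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0) {
        perror("connect");
        return 1;
    }
    freeaddrinfo(res);

    /* Steady state: one small write per report on the persistent socket. */
    for (;;) {
        char report[64];
        int len = snprintf(report, sizeof(report), "load 0.42 procs 17\n");
        if (write(fd, report, (size_t)len) != len) {
            perror("write");
            break;
        }
        sleep(5);
    }
    close(fd);
    return 0;
}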
We optimized our system extensively, down to the number of bytes in
efficiently constructed and parsed packets. But to get scalability to
thousands of nodes and processes, we found that we needed to "cheat".
While connections are established to the user-level daemon, we optimize by
having some of the communication handled by a kernel module that shares
the socket. The optimization isn't needed for 'only' hundreds of nodes
and processes, or if you are willing to dedicate most of a very powerful
head node to process control. But 'thousands' is much more challenging
than 'hundreds'.
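To give a feel for what "counting the bytes" means (the field names and
sizes below are invented for illustration, not our actual wire format), a
fixed-layout status packet can be built and parsed with a single copy and
no text handling:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Every field is fixed-size, so the whole report is 16 bytes on the wire. */
struct node_status {
    uint16_t node_id;      /* which compute node is reporting      */
    uint16_t nprocs;       /* processes currently under management */
    uint32_t load_milli;   /* 1-minute load average, times 1000    */
    uint32_t mem_free_kb;  /* free memory in kilobytes             */
    uint32_t uptime_sec;   /* seconds since the node joined        */
} __attribute__((packed));

int main(void)
{
    /* Hypothetical values; a real format would also pin the byte order
     * with htons()/htonl() before putting fields on the wire. */
    struct node_status s = { .node_id = 17, .nprocs = 4,
                             .load_milli = 420, .mem_free_kb = 1048576,
                             .uptime_sec = 86400 };
    unsigned char wire[sizeof(s)];

    memcpy(wire, &s, sizeof(s));          /* build: one copy, no formatting */
    printf("status packet is %zu bytes\n", sizeof(wire));
    return 0;
}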
--
Donald Becker becker at scyld.com
Penguin Computing / Scyld Software
www.penguincomputing.com www.scyld.com
Annapolis MD and San Francisco CA