Looking for remote file view utility

Robert G. Brown rgb at phy.duke.edu
Thu Jan 9 10:56:38 PST 2003


On Thu, 9 Jan 2003 mgb at mgbeckett.com wrote:

> Thank you to everyone for their concern.
> Unfortunately the UK lacks the wildlife for proper
> staking out of users over anthills.
> The use in this system is actually very secure - there
> is a 6-foot gap between the cluster and the nearest
> internet cable :-)
> 
> I have a Linux cluster running an MPI job - I want a
> python script on the master to cycle through all the
> nodes checking for local error logs as part of the
> testing.
> 
> I was looking for a simple technique which didn't put
> any load on the working nodes.

"any load" is the question.  If your script cycles and checks ten times
a second, not many tools or methods will work without a load, right?  If
it cycles and checks ten times a minute, the load will likely be barely
visible using ssh as a medium, and still invisible on top of either rsh
or nfs (the VERY simplest way to solve your problem would be to have the
local error logs written to an NFS mount where you could just open them,
read them, and close them again, or better yet just stat them to see if
they've changed and open them only if).  Or if it is a smallish cluster,
you could just run tail -f on the files.  If you check once every ten
minutes, you could just about have the nodes email their current log
file to you or something else immensely heavyweight and still generate
no significant load.
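
By way of illustration, an (untested) python sketch of that NFS
polling loop might look like the following; the node names, log path,
and poll interval are invented, so adjust them to your cluster's
actual layout:

#!/usr/bin/env python
# Sketch of the NFS approach: stat each node's log on an NFS mount
# and open/read it only when it has actually grown.
import os
import time

# Hypothetical mount layout: each node's log visible on the master.
NODES = ["node%02d" % n for n in range(1, 17)]
LOG = "/net/%s/var/log/mpi-errors.log"
POLL = 6                     # seconds, i.e. ten checks a minute

seen = {}                    # last-seen size per node
while 1:
    for node in NODES:
        path = LOG % node
        try:
            st = os.stat(path)       # cheap: no open, no read
        except OSError:
            continue                 # log not there (yet)
        if st.st_size > seen.get(node, 0):
            f = open(path)
            f.seek(seen.get(node, 0))    # skip text already seen
            print(f.read())              # emit only the new text
            f.close()
            seen[node] = st.st_size
    time.sleep(POLL)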

Monitoring is just another parallel process.  It has the same sort of
scaling with granularity as any other parallel task.

If your nodes support NFS, I'd be VERY inclined to try it first, as
the code is bone simple and you can adjust the sampling granularity
to ensure that the load due to the sampling itself stays at "zero".
If they don't, I'd change them so that they do and STILL use NFS,
unless you're e.g. running Scyld, in which case you'll have to talk
to Scyld about what tools might work (or whether any tool is even
needed).

Note that any sort of "tail -f"-like tool is going to do better than
getting the entire file every time you want to scan it.  The minimum
"cost" of getting a remote file is stat, open, read/encapsulate/send
over the network, close.  You'd much rather avoid as much of this as
possible by opening it only once and sending only the new text at the
end.
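
In python, a minimal "tail -f" equivalent along those lines takes
only a few lines; the log path below is hypothetical:

import time

def follow(path, interval=2):
    # Open once, start at the end, and print only new text as it
    # appears; the whole file is never re-fetched.
    f = open(path)
    f.seek(0, 2)                  # offset 0 from the end of the file
    while 1:
        line = f.readline()
        if line:
            print(line.rstrip())  # just the newly appended text
        else:
            time.sleep(interval)  # nothing new; wait and poll again

follow("/var/log/mpi-errors.log")    # hypothetical log path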

The second thing to try, if you just can't or won't use NFS, is a
real remote logger.  There are already syslog-like tools out there
for the particular purpose of monitoring logs on a network of
machines and sending them via a daemon to a central location.  One of
these is almost certainly going to be better than anything you cobble
together on your own that has to actually fetch the files every time
they are to be scanned.
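
If you do end up rolling something yourself anyway, python's logging
module can forward messages to a remote syslog daemon for you.  A
minimal sketch, assuming a central host named "master" whose syslogd
is configured to accept remote messages (-r for classic sysklogd):

import logging
import logging.handlers

# "master" is a hypothetical central host; 514/udp is the standard
# syslog port.
handler = logging.handlers.SysLogHandler(
    address=("master", 514),
    facility=logging.handlers.SysLogHandler.LOG_LOCAL0)
log = logging.getLogger("node-monitor")
log.addHandler(handler)
log.setLevel(logging.INFO)

# Run on each node: the message lands in master's syslog rather than
# waiting for the master to come fetch a file.
log.error("MPI job error on this node")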

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email: rgb at phy.duke.edu
