[Beowulf] Users abusing screen
Ellis H. Wilson III
ellis at runnersroll.com
Fri Oct 21 08:44:37 PDT 2011
On 10/21/11 09:10, Prentice Bisbal wrote:
> I have a question that isn't directly related to clusters, but I suspect
> it's an issue many of you are dealing with or have dealt with: users using
> the screen command to stay logged in on systems and running long jobs
> that they forget about. Have any of you experienced this, and how did
> you deal with it?
I think this is strongly tied to what kind of work the users are doing
(i.e. how interactive it is, how long jobs take, and how likely they are
to hit a failure the user must react to). In my personal experience the
jobs I spawn aren't interactive and tend to take a long time, and because
they run so long I need to react quickly when one fails or I lose out on
valuable compute time. However, they are cumbersome to execute via a
queuing manager (my work is in systems, so perhaps that area is an
exception). Therefore what I always do is just nohup the job, and
tail -f its output if I need to watch it. I've adapted my ssh config so
that I don't get booted off after 5 or 10 minutes without any input from
me (I think the limit I set is something like 2 hours), so I can watch
output fly by to my heart's content.
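For anyone who hasn't used nohup this way, here is a minimal sketch of the
pattern described above. The 'sh -c ...' command is only a stand-in for a real
long-running job, and the file names job.log and job.pid are made up for
illustration.

```shell
#!/bin/sh
# Start a job that keeps running after logout. stdout and stderr both go
# to job.log. The 'sh -c ...' command is a hypothetical placeholder for
# the real long-running job.
nohup sh -c 'for i in 1 2 3; do echo "step $i"; done' > job.log 2>&1 &

# Save the process ID so the job can be checked on or killed later.
echo $! > job.pid

# Watch the output as it arrives (Ctrl-C stops tail, not the job):
#   tail -f job.log
# Check whether the job is still alive:
#   kill -0 "$(cat job.pid)" && echo "still running"
```

Because the output lives in a file rather than in a terminal session, it can be
picked up again from any later login with tail -f.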
If I were you, I think the best way to avoid a user uprising while still
achieving your goal is to give instructions on how a user can nohup a job
(yes, just assume they don't know how) and how to configure ssh to not
die after a short time. This way they don't have to worry about getting
disconnected if they aren't constantly interacting (so they can watch
output), but they also aren't staying logged on indefinitely (since
presumably their laptops/desktops aren't on indefinitely).
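As a rough sketch of the ssh side, assuming the OpenSSH client: the
ServerAliveInterval and ServerAliveCountMax options make the client send
periodic keepalives so an idle session is less likely to be dropped. The values
below are illustrative, not necessarily the settings I use, and the snippet
writes to a scratch file (a made-up name) unless SSH_CONFIG is set, so it is
safe to try. A server-side idle timeout, if one is enforced, may still apply.

```shell
#!/bin/sh
# Assumption: OpenSSH client. By default this writes to a scratch file in
# the current directory; set SSH_CONFIG=~/.ssh/config to apply it for real.
SSH_CONFIG="${SSH_CONFIG:-./ssh_config.example}"

# Append a keepalive stanza: probe the server after every 60 seconds of
# silence, and give up after 3 probes in a row go unanswered.
cat >> "$SSH_CONFIG" <<'EOF'
Host *
    ServerAliveInterval 60
    ServerAliveCountMax 3
EOF
```

To try it without touching the real config, something like
"ssh -F ./ssh_config.example user@cluster" should work, where user@cluster is a
placeholder for the actual login.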
If you give them an alternative that is well defined with an example
(not just, "Oh you can use such-and-such instead.") I can hardly believe
they'll be all that upset.