[Beowulf] block running jobs individually on each node
Chris Dagdigian
dag at sonsorol.org
Thu Apr 7 12:14:13 PDT 2005
Stopping people from gaming or bypassing the cluster scheduler is
possible via various methods (sorry, I can't help you out with a
PBS-specific method!), but in the long run it is an arms race between you
and the misbehaving users that you will probably never win completely and
that will sap lots of your time and effort.
My take on this issue has always been that this is a policy issue, not a
technology issue.
If you have a written policy that says "bypassing the scheduler results
in an unfair allocation of shared resources that hurts your fellow
users" then you have a framework for dealing with abusers. Typically
this means a gentle note to the manager/advisor of the user. Further
abuses result in user account suspension.
I know people who use both approaches, policy and technical enforcement.
On some Grid Engine clusters any user process running on a compute node
that is not a descendant of the proper sge_shepherd daemon is sent a
"kill -9". Users get the message quickly.
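
To make the idea concrete, here is a rough sketch of the ancestry check in
Python. It is not what any particular site runs. The Proc record, the
function names, the min_uid cutoff of 1000 and the "sge_shepherd" name
prefix are all assumptions for illustration. A real reaper would also have
to fill the table from /proc or ps and decide which users and login
sessions to exempt before sending any signal.

```python
from typing import Dict, List, NamedTuple


class Proc(NamedTuple):
    """One row of a (hypothetical) process-table snapshot."""
    ppid: int   # parent process ID
    name: str   # command name
    uid: int    # numeric user ID of the owner


def has_ancestor(pid: int, table: Dict[int, Proc], prefix: str) -> bool:
    """Return True if pid, or any process above it in the tree,
    has a command name starting with prefix."""
    seen = set()
    # Walk up the parent chain until we leave the table or hit a loop.
    while pid in table and pid not in seen:
        seen.add(pid)
        if table[pid].name.startswith(prefix):
            return True
        pid = table[pid].ppid
    return False


def find_rogue_pids(table: Dict[int, Proc],
                    min_uid: int = 1000,
                    prefix: str = "sge_shepherd") -> List[int]:
    """List PIDs owned by ordinary users (uid >= min_uid, an assumed
    cutoff) that were not started underneath a shepherd process."""
    return sorted(
        pid for pid, proc in table.items()
        if proc.uid >= min_uid and not has_ancestor(pid, table, prefix)
    )
```

Note that this only identifies candidates. Whether to kill them, log them,
or mail the user's advisor is exactly the policy question above, and an
interactive ssh shell used to fetch tmp files would be flagged too unless
it is exempted.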
In general though, I think the admins who deal with this problem as a
policy issue are overall "happier" and have a better relationship with
the user community as well.
Just my $.02
-Chris
jerry xu wrote:
> Hi, Dear All:
> I am managing a simple 24-node Beowulf cluster, and I require
> all jobs to run through PBS. However, some undergraduate
> students in our lab keep trying to ssh to individual nodes in the
> cluster and run their jobs there, which makes it hard for me to manage
> the resources and track the status of my programs. I remember there is
> a way to block people from running jobs outside the batch system while
> still allowing them to ssh to each node to grab some tmp
> files. But I don't remember how to do it. Can anyone give me some
> directions?
>