[Beowulf] first cluster
Chris Dagdigian dag at sonsorol.org
Fri Jul 16 10:01:01 PDT 2010
You want the honest answer?

There are technical things you can do to prevent users from bypassing the scheduler and resource allocation policies. One of the cooler things I've seen in Grid Engine environments was a cron job that did a "kill -9" against any user process that was not a child of a sge_shepherd daemon. Very effective. Other people play games with PAM settings and the like.

The honest truth is that technical countermeasures are mostly a waste of time. A motivated user always has more time and effort to spend trying to game the system than an overworked administrator.

My recommendation is to subject users to a cluster acceptable use policy. Any abuses of the policy are treated as a teamwork and human resources issue. The first time you screw up you get a warning; the second time you get caught, I'll send a note to your manager. After that, any abuses are treated with a loss of cluster access and a referral to human resources for further action.

Simply put -- you don't have enough time in the day to deal with users who want to game/abuse the system. It's far easier for all concerned to have everyone agree on a fair use policy and treat any infractions via management rather than cluster settings.

This is another reason why having a cluster governance body helps a lot. A committee of cluster power users and IT staff is a great way to get consensus on queue setup, cluster policies, disk quotas and the like. They can also come down hard with peer pressure on pissy users.

My $.02

-Chris

Douglas Guptill wrote:
> How does the presence of a job scheduler interact with the ability of a user to
> ssh to <head>,
> ssh to <compute-node-n>, and then type
> mpirun -np 64 my_application
>
> Intuition tells me there has to be something in a cluster setup, when
> it has a scheduler, that prevents a user from circumventing the
> scheduler by doing something like the above.
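For anyone curious what the cron-job approach mentioned above might look like, here is a minimal Python sketch of the detection half only. It is not the script from the site described; it is an illustration under several assumptions: the process table comes from `ps -eo pid=,ppid=,user=,comm=`, the shepherd's command name starts with `sge_shepherd`, any descendant of a shepherd (not just a direct child) counts as a scheduled process, and the exempt account names (`root`, `sgeadmin`) are hypothetical placeholders for whatever system accounts a site has. The function names are made up for this example.

```python
from typing import Dict, List, NamedTuple, Set


class Proc(NamedTuple):
    ppid: int
    user: str
    comm: str


def parse_ps_output(text: str) -> Dict[int, Proc]:
    """Parse text in the layout of `ps -eo pid=,ppid=,user=,comm=`."""
    table: Dict[int, Proc] = {}
    for line in text.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        pid, ppid, user, comm = fields
        table[int(pid)] = Proc(int(ppid), user, comm.strip())
    return table


def has_shepherd_ancestor(pid: int, table: Dict[int, Proc],
                          shepherd_prefix: str = "sge_shepherd") -> bool:
    """Walk up the parent chain looking for a shepherd process.

    Assumption: the shepherd's command name starts with `shepherd_prefix`.
    The process itself is checked too, so a shepherd is never flagged.
    """
    seen: Set[int] = set()
    while pid in table and pid not in seen:
        seen.add(pid)  # guard against cycles in a stale snapshot
        proc = table[pid]
        if proc.comm.startswith(shepherd_prefix):
            return True
        pid = proc.ppid
    return False


def find_rogue_pids(table: Dict[int, Proc],
                    exempt_users: Set[str]) -> List[int]:
    """Return PIDs owned by non-exempt users with no shepherd ancestor."""
    return sorted(
        pid for pid, proc in table.items()
        if proc.user not in exempt_users
        and not has_shepherd_ancestor(pid, table)
    )
```

A cron job on each compute node could feed real `ps` output into `parse_ps_output` and log the result of `find_rogue_pids` for a while before anyone wires it up to `os.kill(pid, signal.SIGKILL)`. A dry-run period matters because this logic would also flag interactive ssh sessions and anything else a user starts outside the scheduler, which may or may not be what the site policy intends.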
