[Beowulf] question about enforcement of scheduler use

Daniel Widyono widyono at seas.upenn.edu
Wed Jun 7 07:40:22 PDT 2006

Late response... but what I do (again reiterating that this is only a
technical fix for a partly political problem, and I won't regurgitate the
fine answers already suggested) is essentially the following, which is
simpler than the limits.conf approach (but has its own limitations):

in /etc/pam.d/sshd after this line
	account    required     pam_stack.so service=system-auth
add these two lines
	account    sufficient   pam_access.so       # allow root in always
	account    required     pam_listfile.so file=/etc/pbs_sshauth onerr=fail sense=allow item=user
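A quick note on what those options mean: pam_listfile with sense=allow
item=user grants access only if the connecting username appears in the named
file, and onerr=fail means a missing or unreadable file denies everyone
(root still gets in via the pam_access line above).  The decision logic can
be mimicked in shell roughly like this (a sketch for illustration only, not
part of the PAM config):

```shell
#!/bin/sh
# Mimic pam_listfile's account decision, for illustration.
# Usage: check_user <username> <authfile>
check_user() {
    user=$1
    authfile=$2
    # onerr=fail: a missing/unreadable file denies access
    [ -r "$authfile" ] || { echo deny; return 1; }
    # sense=allow item=user: permit only usernames listed in the file
    if grep -qx "$user" "$authfile"; then
        echo allow
    else
        echo deny
    fi
}

echo alice > /tmp/pbs_sshauth.demo
check_user alice   /tmp/pbs_sshauth.demo   # allow
check_user mallory /tmp/pbs_sshauth.demo   # deny
rm -f /tmp/pbs_sshauth.demo
```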

for prologue AND prologue.parallel ($2 is the job owner's username) use
        /bin/rm -f /etc/pbs_sshauth ; echo $2 > /etc/pbs_sshauth ; exit 0

for epilogue AND epilogue.parallel use
        /bin/rm -f /etc/pbs_sshauth ; echo "" > /etc/pbs_sshauth ; exit 0
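Taken together, the prologue and epilogue just swap the allowed username in
and out around the job's lifetime.  Here is that lifecycle simulated with a
temp file standing in for /etc/pbs_sshauth, and "alice" standing in for $2:

```shell
#!/bin/sh
# Simulate the prologue/epilogue pair around a job's lifetime.
AUTHFILE=/tmp/pbs_sshauth.sim

prologue() {   # real script: /bin/rm -f FILE ; echo $2 > FILE ; exit 0
    /bin/rm -f "$AUTHFILE"
    echo "$1" > "$AUTHFILE"
}

epilogue() {   # real script: /bin/rm -f FILE ; echo "" > FILE ; exit 0
    /bin/rm -f "$AUTHFILE"
    echo "" > "$AUTHFILE"
}

prologue alice
cat "$AUTHFILE"     # while the job runs, only "alice" passes pam_listfile

epilogue
cat "$AUTHFILE"     # after the job: a blank line, so no username matches

rm -f "$AUTHFILE"
```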

in /etc/security/access.conf (the file pam_access reads) put

        -:ALL EXCEPT root:ALL

and in /etc/ssh/sshd_config make sure you have

        UsePAM                                 yes

This is spelled out at www.liniac.upenn.edu/wiki.  Search for PAM.  One major
limitation is that it allows only one user per node.  An associate, Bryan
Cardillo, has already fixed this; with his blessing I'll follow up with his
versions shortly.
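For what it's worth, the one-user-per-node limit comes from the prologue
clobbering the whole file.  One obvious direction (my own sketch, NOT Bryan's
actual version, and the names here are hypothetical) is to keep one username
per line, appending in the prologue and removing just that job's line in the
epilogue, since pam_listfile matches any line in the file:

```shell
#!/bin/sh
# Hypothetical multi-user variant (a sketch, not the fix mentioned above):
# keep one username per line in the auth file.
AUTHFILE=/tmp/pbs_sshauth.multi

add_user() {
    # prologue: append the job owner if not already listed
    grep -qx "$1" "$AUTHFILE" 2>/dev/null || echo "$1" >> "$AUTHFILE"
}

del_user() {
    # epilogue: drop this job owner's line, keep everyone else's.
    # (Naive: two jobs from the same user on one node would collide;
    # a real version would need per-user reference counting.)
    grep -vx "$1" "$AUTHFILE" > "$AUTHFILE.new" 2>/dev/null
    mv "$AUTHFILE.new" "$AUTHFILE"
}

add_user alice
add_user bob
cat "$AUTHFILE"     # both alice and bob may ssh in

del_user alice
cat "$AUTHFILE"     # only bob remains

rm -f "$AUTHFILE"
```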

Dan W.

On Wed, May 24, 2006 at 01:36:02PM -0400, Larry Felton Johnson wrote:
> On Tue, May 23, 2006 at 10:18:45AM -0400, Matt Allen wrote:
> > Larry,
> > 
> > The prologue script only runs on the mother superior node, so it can
> > only alter the other nodes in the job via (probably) ssh.  I think Dr.
> > Weisz's script does this, although the version I've seen has "rsh"
> > hard-coded.  I'd check to see that the prologue script is actually
> > altering limits.conf on all of the nodes, since it looks like that could
> > be why you're seeing connection failures.
> > 
> > I think the way you're going about this is fine; we've tried the same
> > thing here at IU.  In the end, we just didn't have that many problem
> > users connecting directly to the compute nodes, so we abandoned the
> > restriction enforcement.  Our problems have more to do with orphaned MPI
> > processes hanging around on nodes, so we use a script to periodically
> > clean out processes owned by users who shouldn't be on the node.
> > 
> I want to thank all of you for answering this question.  Each of the
> responses I got provided me with useful possible approaches.  I'll
> summarize how I've actually resolved  this since getting your replies
> when I've finished working through the problem.  In the meantime I just 
> wanted to acknowledge your replies and thank you for the help.
> Larry
> -- 
> ========================================================
> "I learned long ago, never to wrestle with a pig. You 
>  get dirty, and besides, the pig likes it."
>                               George Bernard Shaw
> ========================================================
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
