PBS and PGI interactions?

Mon Apr 9 14:15:15 PDT 2001

On Sat, 7 Apr 2001, Don Morton wrote:

> Please note that I'm posting this to the xtreme
> group, the beowulf group, and the Portland Group 
> support email.
> 
> We recently received an 18-CPU Xtreme machine from Paralogic.
> PBS didn't work but, fortunately, I had a "little" experience
> installing it once before, and have finally succeeded in getting
> it to work PROPERLY for running MPICH jobs on the cluster.

Because PBS is so configurable, we usually do very little
pre-configuration.  We will, however, see whether we can get our
build process to do a better job.

> 
> I would like it to run PROPERLY for HPF jobs (and OpenMP jobs)
> using the PGI suite, and had just "assumed" that since both
> PGI and Paralogic package these things together, that somebody,
> somewhere had done this work before.  However, any searches I've
> tried have ended up fruitless.  I see a few instances of people
> using PGI's mpirun with PBS, but that doesn't cover the HPF, etc.
> 

In general, you seem to be one of the few people using HPF or OpenMP
across the cluster. Because OpenMP is designed for shared memory, its
performance on clusters can vary widely.

> Has anybody at Paralogic or PGI actually integrated PBS and
> PGI runtime environments so that PBS is utilized to run HPF
> and OpenMP jobs?  I haven't seen any evidence of this, and
> would sure appreciate some pointers in the right direction.
> 
> Some specific questions:
> 
> 1) Has anybody devised an approach (I would guess a wrapper) that
>    allows parallel jobs to be run only via PBS?  In other words,
>    in my opinion, you can't have a production cluster if any old
>    Joe Sixpack can come in and bypass PBS by typing 
> 
>              mpirun -np 16 a.out
> 
> 2) Has anybody devised an approach for launching PGI HPF (and OpenMP)
>    jobs via PBS, that does so correctly (i.e. keeps track of node 
>    allocations from other jobs, etc.)?  
> 
> 3) With PGI's HPF runtime environment, is it possible to execute 
>    completely on "compute nodes?"  I'm trying to reserve our "head"
>    node for compilation, visualization, etc., but it appears to me that
>    when you run PGI HPF processes, they always put one on the "local" node,
>    which isn't necessarily a good thing.  I don't see a "clear" way around
>    that.
>    
> 
> My "first" impression of PBS inclusion in systems like Paralogic's, and
> in PGI's CDK, is that perhaps PBS isn't a fully-integrated application.
> I hope I'm wrong, and I'd sure appreciate it if someone pointed me in
> the right direction!  

The problem is that the number of combinations of user-requested
options is quite large. In most cases there are 3 possible MPIs
(MPI-PRO, LAM, MPICH), 2-3 possible compilers (PGI, GNU, ABSOFT),
single vs. dual CPU nodes, and several possible interconnects (each
with its own libmpi.a).  That said, we are working to provide a
seamless environment.
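
On question 1, one possible shape for such a wrapper is sketched below.
This is only an illustration under stated assumptions, not something we
ship: it assumes PBS exports PBS_NODEFILE inside a job, that the MPICH
mpirun accepts -np and -machinefile, and that the real launcher has been
moved to the hypothetical path held in REAL_MPIRUN. The function names
are made up for the example.

```python
import os
import sys

# Hypothetical location of the real MPICH launcher after it has been
# moved out of the users' PATH (an assumption for this sketch).
REAL_MPIRUN = "/usr/local/mpich/bin/mpirun.real"


def build_mpirun_command(program_args, environ=None, mpirun=REAL_MPIRUN):
    """Build an mpirun argv restricted to the nodes PBS allocated.

    Raises RuntimeError when called outside a PBS job, i.e. when
    PBS_NODEFILE is unset or does not name a readable file.
    """
    environ = os.environ if environ is None else environ
    nodefile = environ.get("PBS_NODEFILE")
    if not nodefile or not os.path.isfile(nodefile):
        raise RuntimeError("parallel jobs must be submitted through PBS")

    # One line per allocated processor slot; blank lines are ignored.
    with open(nodefile) as handle:
        slots = [line.strip() for line in handle if line.strip()]
    if not slots:
        raise RuntimeError("PBS node file is empty")

    # The process count comes from the allocation, not from the user.
    return [mpirun, "-np", str(len(slots)),
            "-machinefile", nodefile] + list(program_args)


def main(argv):
    """Entry point for a script installed in place of mpirun."""
    try:
        command = build_mpirun_command(argv[1:])
    except RuntimeError as error:
        print("pbs_mpirun: %s" % error, file=sys.stderr)
        return 1
    # Replace this process with the real launcher; does not return.
    os.execvp(command[0], command)
```

A script installed as the users' mpirun would simply call
sys.exit(main(sys.argv)). Note that this only discourages bypassing PBS:
a user who can still reach the real launcher, or who sets PBS_NODEFILE
by hand, gets around it, so file permissions on the real binary matter
as well.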

> 
> I seem to come from an older school where we used NQS on Cray T3E's
> to run ANY parallel job, whether it was written in PVM, MPICH, EPCC
> MPI, PGI HPF, etc.  In fact, Cray's "mpprun" command seemed to 
> abstract away the details of dealing with various parallel libraries.
> You launched a job from a shell node - you could run an interactive job
> for maybe 30 minutes, and the allocation of those nodes would be
> coordinated with NQS.  Anything longer required a batch NQS job
> and, again, you could use the same submission paradigm whether you
> were using PGI HPF, MPICH, etc.  
> 
> Has anybody worked on this in the realm of clusters?  My guess is that
> most of these issues can be resolved through high-level scripts, but
> I know it ain't easy.  If someone has explored this, I'd love
> to hear about it.  If not, I'll put a student or two to work on it.  
> In my opinion, you just can't make clusters accessible to the general
> scientific public without such mechanisms. :)

There is an interesting thing with clusters. Because their cost allows
"single problem" or "single problem class" machines, once the
critical path of tools is established and working, there is
no incentive to do any more.  I have seen this pattern quite a bit.
I do not necessarily think this is a bad thing, just a property of
the whole "cluster thing" (i.e., many people do not want the extra cost
required for a fully integrated system; they want a machine
with the absolute minimum hardware and software cost to run their code).

Finally, I think your feedback is helpful and we will certainly
look at taking the integration up a notch.

Doug Eadline
-------------------------------------------------------------------
Paralogic, Inc.           |     PEAK     |      Voice:+610.814.2800
130 Webster Street        |   PARALLEL   |        Fax:+610.814.5844
Bethlehem, PA 18015 USA   |  PERFORMANCE |    http://www.plogic.com
-------------------------------------------------------------------