[Beowulf] first cluster [was [OMPI users] trouble using openmpi under slurm]
Douglas Guptill
douglas.guptill at dal.ca
Fri Jul 9 09:43:13 PDT 2010
On Thu, Jul 08, 2010 at 09:43:48AM -0400, Gus Correa wrote:
> Douglas Guptill wrote:
>> On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Castain wrote:
>>
>>> No....afraid not. Things work pretty well, but there are places
>>> where things just don't mesh. Sub-node allocation in particular is
>>> an issue as it implies binding, and slurm and ompi have conflicting
>>> methods.
>>>
>>> It all can get worked out, but we have limited time and nobody cares
>>> enough to put in the effort. Slurm just isn't used enough to make it
>>> worthwhile (too small an audience).
>>
>> I am about to get my first HPC cluster (128 nodes), and was
>> considering slurm. We do use MPI.
>>
>> Should I be looking at Torque instead for a queue manager?
>>
> Hi Douglas
>
> Yes, works like a charm along with OpenMPI.
> I also have MVAPICH2 and MPICH2, no integration w/ Torque,
> but no conflicts either.
Thanks, Gus.
After some lurking and reading, I plan this:
Debian (lenny)
+ fai - for compute-node operating system install
+ Torque - job scheduler/manager
+ MPI (Intel MPI) - for the application
+ MPI (OpenMPI) - alternative MPI
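For concreteness, here is a rough sketch of what submitting an MPI job under Torque might look like with this stack. It is a small Python helper that only generates the text of a job script. The job name, node count, cores per node, and executable are hypothetical placeholders, not details of the actual cluster, and it assumes an MPI launcher called mpirun is on the path.

```python
def torque_job_script(job_name, nodes, ppn, walltime, executable):
    """Return the text of a Torque/PBS job script that launches an MPI program.

    All arguments are illustrative; nothing here reflects a real site setup.
    """
    # One MPI rank per requested core (an assumption; sites may differ).
    total_ranks = nodes * ppn
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",               # job name shown by qstat
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and processors per node
        f"#PBS -l walltime={walltime}",      # wall-clock limit, HH:MM:SS
        "cd $PBS_O_WORKDIR",                 # directory qsub was run from
        f"mpirun -np {total_ranks} {executable}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example: 4 nodes with 8 cores each, one hour limit.
    print(torque_job_script("ocean_model", 4, 8, "01:00:00", "./ocean_model"), end="")
```

The generated text could be saved to a file and handed to qsub. Whether mpirun picks up the node list from Torque on its own depends on how the MPI library was built, so that part would need checking on the real system.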
Does anyone see holes in this plan?
Thanks,
Douglas
--
Douglas Guptill voice: 902-461-9749
Research Assistant, LSC 4640 email: douglas.guptill at dal.ca
Oceanography Department fax: 902-494-3877
Dalhousie University
Halifax, NS, B3H 4J1, Canada