[Beowulf] first cluster [was [OMPI users] trouble using openmpi under slurm]
Douglas Guptill
douglas.guptill at dal.ca
Tue Jul 13 15:05:38 PDT 2010
Hello Gus, list:
On Fri, Jul 09, 2010 at 07:06:05PM -0400, Gus Correa wrote:
> Douglas Guptill wrote:
>> On Thu, Jul 08, 2010 at 09:43:48AM -0400, Gus Correa wrote:
>>> Douglas Guptill wrote:
>>>> On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Castain wrote:
>>>>
>>>>> No....afraid not. Things work pretty well, but there are places
>>>>> where things just don't mesh. Sub-node allocation in particular is
>>>>> an issue as it implies binding, and slurm and ompi have conflicting
>>>>> methods.
>>>>>
>>>>> It all can get worked out, but we have limited time and nobody cares
>>>>> enough to put in the effort. Slurm just isn't used enough to make it
>>>>> worthwhile (too small an audience).
>>>> I am about to get my first HPC cluster (128 nodes), and was
>>>> considering slurm. We do use MPI.
>>>>
>>>> Should I be looking at Torque instead for a queue manager?
>>>>
>>> Hi Douglas
>>>
>>> Yes, Torque works like a charm along with OpenMPI.
>>> I also have MVAPICH2 and MPICH2; no integration with Torque,
>>> but no conflicts either.
>>
>> Thanks, Gus.
>>
>> After some lurking and reading, I plan this:
>> Debian (lenny)
>> + fai - for compute-node operating system install
>> + Torque - job scheduler/manager
>> + MPI (Intel MPI) - for the application
>> + MPI (OpenMPI) - alternative MPI
>>
>> Does anyone see holes in this plan?
>>
>> Thanks,
>> Douglas
>
>
> Hi Douglas
>
> I have never used Debian, fai, or Intel MPI.
>
> We have two clusters that use cluster management software, i.e.,
> mostly the operating-system install stuff.
>
> I made a toy Rocks cluster out of old computers.
> Rocks is a minimum-hassle way to deploy and maintain a cluster.
> Of course you can do the same from scratch, or do more, or do better,
> which makes some people frown at Rocks.
> However, Rocks works fine, particularly if your networks
> are Gigabit Ethernet,
> and if you don't mix processor architectures (i.e. only i386
> or only x86_64, although there is some support for mixed setups).
> It is developed/maintained by UCSD under an NSF grant (I think).
> It's been around for quite a while too.
>
> You may want to take a look, perhaps experiment with a subset of your
> nodes before you commit:
>
> http://www.rocksclusters.org/wordpress/
I am sure Rocks suits many, but at first glance not me. I am too
much of a tinkerer. That comes, partly, from starting in this
business too early; my first computer was a Univac II - vacuum
tubes, no operating system.
> What is the interconnect/network hardware you have for MPI?
> Gigabit Ethernet? Infiniband? Myrinet? Other?
Infiniband - QLogic 12300-BS18
> If Infiniband, you may need to add the OFED packages.
Gotcha. Thanks.
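In case it is useful to other first-timers: as a smoke test for
whichever MPI flavor ends up installed, I am thinking of a plain
C hello-world - nothing below is specific to our hardware, just
standard MPI calls (the file name hello.c is arbitrary):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    char name[MPI_MAX_PROCESSOR_NAME];  /* node this rank runs on */
    int rank, size, namelen;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this rank's id    */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total rank count  */
    MPI_Get_processor_name(name, &namelen);
    printf("Hello from rank %d of %d on %s\n", rank, size, name);
    MPI_Finalize();
    return 0;
}

Compiled with the stack's wrapper (mpicc hello.c -o hello) and
launched with mpirun across a couple of nodes, one line per rank
tells me the basic plumbing - compiler, launcher, interconnect -
is alive.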
> If you are going to handle a variety of compilers and MPI
> flavors, in various versions, I recommend using the
> "Environment Modules" package.
My one user has requested that.
> I hope this helps.
A big help. Much appreciated.
Douglas.
--
Douglas Guptill voice: 902-461-9749
Research Assistant, LSC 4640 email: douglas.guptill at dal.ca
Oceanography Department fax: 902-494-3877
Dalhousie University
Halifax, NS, B3H 4J1, Canada