[Beowulf] Mixing 32-bit compute nodes with 64-bit head nodes
Joe Landman
landman at scalableinformatics.com
Wed May 10 17:37:39 PDT 2006
Andrew D. Fant wrote:
> I know that the common wisdom on this subject is "don't do that", but for
Shouldn't be an issue if you have a sane distribution and a distribution
load system that automatically handles the ABI (bit width) during
installation/package selection. Distros which do this (mostly)
correctly include FCx, SuSE, CentOS, ...
> various reasons, I have to look at the possibility of putting a 64-bit system
> (probably EM64T as opposed to Opteron) as the user node of our cluster. I have a
> separate management node that handles the batch scheduler, license management
> and compute node imaging, and related duties, which would remain a 32-bit Xeon,
> so that isn't going to directly factor into the decision. This is motivated by
> a desire to allow users to run interactive jobs on the user node instead of
> playing games with wrapper scripts to run them on compute nodes. My personal
> preference would be to have a separate system that can remotely submit to the
> existing cluster via the batch queues, but there is a desire by management to
> limit the number of different systems that a user needs to know about logging
> into. The 64-bit motivation is mostly about providing adequate memory for
> multiple users running GUI applications.
Hmmm... so you want to provide a single 64-bit machine to run GUI code
on rather than hacking stuff for the cluster? Assuming I understood
this right, apart from contention for that resource, this should be
fine. Is there any reason why the SGE/PBS methods (qrsh/qsub -I)
wouldn't work? Or is this the pain of which you speak?
> Has anyone had any success with this approach, or failing that, any horror
> stories that would support the more flexible approach of separating the shell
> server from the head node?
I think this is actually a good practice. You really don't want users
logging onto a management node to run jobs. You would likely prefer them
to run on some sort of user login node. Lots of cluster distros do fuse
these two roles, though. This assumes a non-SSI machine (e.g. not
Scyld/bproc/Clustermatic/...).
The only major issue is that if they then submit a job with a binary
which happens to be the wrong ABI, you will get lots of dud runs and
unhappy users. You can fix that with some clever defaults on the
submission side for each user-login-node.
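
As a rough illustration of such a submission-side default, here is a minimal Python sketch that reads the ELF header of a binary and refuses anything wider than the compute nodes can run. The function names and the idea of calling this from a qsub wrapper are assumptions made for illustration, not part of any scheduler. It relies only on the ELF identification bytes: the magic number in bytes 0-3 and the class byte at offset 4 (1 = 32-bit, 2 = 64-bit).

```python
# Sketch of a pre-submission ABI check. All names here are hypothetical.
ELF_MAGIC = b"\x7fELF"
EI_CLASS_OFFSET = 4            # index of the class byte in e_ident
ELF_CLASS_BITS = {1: 32, 2: 64}  # ELFCLASS32, ELFCLASS64


def elf_bit_width(header):
    """Return 32 or 64 for an ELF header, or None if it is not ELF."""
    if len(header) <= EI_CLASS_OFFSET or header[:4] != ELF_MAGIC:
        return None
    return ELF_CLASS_BITS.get(header[EI_CLASS_OFFSET])


def submission_ok(header, node_bits=32):
    """True if a binary with this header can plausibly run on the nodes.

    Non-ELF files (shell scripts and the like) are let through, since
    this check cannot say anything about them.
    """
    bits = elf_bit_width(header)
    if bits is None:
        return True
    return bits <= node_bits


def file_submission_ok(path, node_bits=32):
    """Read the first 16 bytes (e_ident) of a file and check them."""
    with open(path, "rb") as f:
        return submission_ok(f.read(16), node_bits)
```

A wrapper on the 64-bit login node could call file_submission_ok() on the job's executable and reject the job (or route it to a different queue) when it returns False. Note that the check only covers the bit width; it assumes the compute nodes otherwise have the libraries the binary needs.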
>
> Thanks,
> Andy
>
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615