[Beowulf] openMosix ending
mwill at penguincomputing.com
Tue Jul 17 12:21:26 PDT 2007
IMHO you don't need dynamic migration for embarrassingly parallel applications, as they can just be launched
on any available compute node directly and run there to completion. A simple queue system / scheduler
like Torque or similar is enough to make sure that no node runs more jobs at the same time than it has CPUs
available, which gives the best throughput. Just throw your 100 parametrized runs into the queue,
and the headnode/scheduler will keep all available nodes busy until all work is done.
The hierarchical approach of a classical Beowulf works just fine for that.
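
To make the queueing idea concrete, here is a minimal Python sketch that simulates a head node handing independent jobs to compute nodes without ever running more jobs on a node than it has CPUs. It is only an illustration of the policy, not how Torque or any real scheduler is implemented. The function name simulate_queue, the node names and the runtimes are all made up for the example.

```python
import heapq
from collections import deque


def simulate_queue(job_runtimes, cpus_per_node):
    """Simulate a head node dispatching independent jobs to compute nodes.

    job_runtimes:  runtimes in arbitrary time units, in queue order.
    cpus_per_node: hypothetical mapping of node name -> number of CPUs.
    Returns (makespan, placements, peak_load_per_node).
    """
    if job_runtimes and sum(cpus_per_node.values()) <= 0:
        raise ValueError("no CPUs available to run the queued jobs")

    pending = deque(enumerate(job_runtimes))    # FIFO queue of (job_id, runtime)
    free = dict(cpus_per_node)                  # free CPU slots per node
    running = []                                # min-heap of (finish_time, job_id, node)
    placements = {}                             # job_id -> node it ran on
    peak = {node: 0 for node in cpus_per_node}  # max concurrent jobs seen per node
    now = 0.0

    while pending or running:
        # Fill every free CPU slot with the next queued job.
        for node in free:
            while pending and free[node] > 0:
                job_id, runtime = pending.popleft()
                free[node] -= 1
                heapq.heappush(running, (now + runtime, job_id, node))
                placements[job_id] = node
                in_use = cpus_per_node[node] - free[node]
                peak[node] = max(peak[node], in_use)
        # Advance the clock to the next completion and release that slot.
        now, _, node = heapq.heappop(running)
        free[node] += 1

    return now, placements, peak


# 100 parametrized runs of equal length on two hypothetical nodes.
nodes = {"node01": 4, "node02": 2}
makespan, placements, peak = simulate_queue([1.0] * 100, nodes)
print("makespan:", makespan)
print("peak concurrent jobs per node:", peak)
```

Each job runs to completion on the node where it started, so no migration is needed; the queue alone keeps every CPU busy until the work runs out.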
Sr. Cluster Engineer
From: beowulf-bounces at beowulf.org on behalf of Tony Travis
Sent: Tue 7/17/2007 8:03 AM
To: beowulf at beowulf.org
Subject: Re: [Beowulf] openMosix ending
Robert G. Brown wrote:
> On Mon, 16 Jul 2007, Jeffrey B. Layton wrote:
>> Afternoon all,
>> I don't know how many people this affects, but I thought it was
>> worth posting in case people are using openMosix. The
>> leader of openMosix, Moshe Bar, has announced that the
>> openMosix project is ending.
>> While I haven't used openMosix, I've seen it used and it is
>> pretty cool to see processes move around nodes.
> Yeah, but it has nearly always had a few tragic flaws. One was that it
> was always basically a hack of a specific kernel version and image,
> meaning that if you used it you were outside of a working kernel update
> stream. The second was that it was basically a hack of a specific
> kernel version and image at all, where one really would prefer a tool
> that did the same thing outside of kernel space (like Condor, for
> example). It survived those flaws, of course -- but it cannot survive
> the advent of virtualization, which will provide new pathways for this
> sort of thing to be done with far greater ease and stability.
I've been using openMosix for a long time, and you're right about the
kernel 'trap' it puts you into. I recently 'ported' linux-2.4.26-om1 to
Ubuntu. Although I've succeeded in getting our 92-node Beowulf up and
running openMosix under Ubuntu 6.06.1 LTS, the end-of-life announcement
means I have to start thinking about replacing it.
Do you really think that Condor is an alternative to openMosix?
I don't know much about Condor, but I thought it was a DRM (Distributed
Resource Manager) like SGE. Is it more than that?
The great thing about openMosix is that most 'ordinary' programs
migrate. I've thought about using openSSI previously: What's your
opinion about that for 'embarrassingly' parallel computation?
Dr. A.J.Travis, | mailto:ajt at rri.sari.ac.uk
Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt
Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751
Aberdeen AB21 9SB, Scotland, UK. | fax:+44 (0)1224 716687