[Beowulf] Mosix

Fri Jan 21 06:19:14 PST 2005

> Before message-passing mechanisms arrived, and before the concept of
> multi-threading was introduced, the favored mechanism for
> multi-processing and parallelism was the good old fork-join method. That

nah, threads have been around a very long time - after all
separate processes assume you have separate address spaces
and thus something like an MMU.  it's not like MMUs were on
the first computers!

> Mosix (or at least Open Mosix) handles this kind of parallelism
> brilliantly in that it will balance the forked child processes around

sure.  Mosix is great.  it just doesn't do everything: in particular,
it doesn't introduce parallelism to a serial application, and it
provides only one fairly restrictive mechanism for parallelism.
stretching that mechanism past its limits obviously leads to inefficiency.

> system like Mosix to provide parallelism without the overhead of the
> message-passing paradigm. Maybe not better, probably not worse, just

"overhead" of message passing?  how strange!  look, either you have 
certain communication needs or you don't.  Mosix permits a certain kind 
of communication (in terms of looseness and granularity), which may actually
work well for your application.

for instance, if your level of parallelism is approximately the same as
your number of CPUs, and you have parallel chunks which do almost no
communication, and they run for long enough, then by all means, fork 'em off,
fire and forget, and Mosix is your best buddy.

otoh, many people regard "real" parallelism as much more tightly coupled
than that.  for instance, suppose you're doing a gravity simulation where
each star in your virtual cosmos influences the motion of each other star.
MPI is what you want, though you can also do it using shared memory (OpenMP).
the point though is that you absolutely must think in terms of message
passing no matter how your parallelism is implemented, because you have so 
much communication.
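
to make the gravity case concrete, here is a minimal Python sketch (not MPI, and not anyone's production code) of one timestep of a direct-sum simulation in 1-D. the "workers" are just loop iterations, and the names accel, partitioned_step, soft and values_sent are invented for illustration. what it shows is that every worker needs every other worker's positions on every step, which is an all-to-all exchange whatever API carries it.

```python
# Sketch only: direct-sum gravity in 1-D with G = 1 and a softening term.
# Workers are simulated by a loop; a real code would run them on separate
# nodes and exchange the slices with something like an allgather.

def accel(i, pos, mass, soft=1e-3):
    """Acceleration on star i due to every other star."""
    a = 0.0
    for j in range(len(pos)):
        if j != i:
            d = pos[j] - pos[i]
            a += mass[j] * d / (abs(d) ** 3 + soft)
    return a

def serial_step(pos, vel, mass, dt):
    """Reference: one timestep computed in a single address space."""
    acc = [accel(i, pos, mass) for i in range(len(pos))]
    new_vel = [v + a * dt for v, a in zip(vel, acc)]
    new_pos = [p + v * dt for p, v in zip(pos, new_vel)]
    return new_pos, new_vel

def partitioned_step(pos, vel, mass, dt, n_workers):
    """Same timestep, with each worker owning one slice of the stars."""
    n = len(pos)
    bounds = [(w * n // n_workers, (w + 1) * n // n_workers)
              for w in range(n_workers)]
    # "send": each worker packs its own slice of positions as one message
    messages = [pos[lo:hi] for lo, hi in bounds]
    values_sent = 0
    new_pos, new_vel = [], []
    for lo, hi in bounds:
        # "receive": rebuild the global position list from all messages
        everyone = [p for msg in messages for p in msg]
        values_sent += n - (hi - lo)  # positions this worker did not own
        for i in range(lo, hi):
            v = vel[i] + accel(i, everyone, mass) * dt
            new_vel.append(v)
            new_pos.append(pos[i] + v * dt)
    return new_pos, new_vel, values_sent
```

with n stars split evenly over p workers, values_sent comes to n * (p - 1) positions per timestep, every timestep. that traffic is set by the physics, not by the library.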

message passing is not an overhead, but rather a consequence of what data
your problem needs to exchange.  if you have a lot of data exchange, and 
do not think in terms of discrete packets of data collected and sent where 
needed, your performance will SUCK.

if you do not have serious communication, there are other paradigms which 
may suit you, and which have implementations which may well work efficiently.
for instance, some applications expose parallelism in streams, which 
transform data, usually in a digraph.  a regular and pipelinable
communication pattern like this just *begs* for an implementation which 
is tuned for it (i.e., something involving producer/consumer/buffer models).
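
a minimal Python sketch of that producer/consumer/buffer idea, assuming the simplest possible digraph (a straight chain of stages) and using threads with bounded queues from the standard library. stage, run_pipeline, DONE and buffer_size are illustrative names, not part of any cluster toolkit.

```python
import queue
import threading

DONE = object()  # sentinel object marking the end of the stream

def stage(fn, inbox, outbox):
    """Consume items from inbox, transform them, produce into outbox."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)  # pass the end-of-stream marker downstream
            return
        outbox.put(fn(item))

def run_pipeline(items, fns, buffer_size=4):
    """Run items through the chain of functions fns, one thread per stage."""
    # one bounded buffer in front of each stage, plus one for the output
    queues = [queue.Queue(maxsize=buffer_size) for _ in range(len(fns) + 1)]
    threads = [threading.Thread(target=stage,
                                args=(fn, queues[k], queues[k + 1]))
               for k, fn in enumerate(fns)]
    for t in threads:
        t.start()

    def feed():
        # the producer blocks whenever the first buffer is full
        for item in items:
            queues[0].put(item)
        queues[0].put(DONE)

    feeder = threading.Thread(target=feed)
    feeder.start()

    # drain the last buffer in the calling thread
    results = []
    while True:
        out = queues[-1].get()
        if out is DONE:
            break
        results.append(out)
    feeder.join()
    for t in threads:
        t.join()
    return results
```

for example, run_pipeline(range(10), [lambda x: x + 1, lambda x: x * 2]) pushes ten items through two stages. the bounded queues are the "buffer" part: a fast producer stalls instead of flooding a slow consumer, and each stage only ever talks to its neighbours.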

if your communication is so sparse that you can literally 

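	for my $problem (@problems) {
		my $pid = fork();
		die "fork failed: $!" unless defined $pid;	# undef == 0 would be true
		if ($pid == 0) { exec("application $problem") or die "exec failed: $!" }
	}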

well, good for you!  what you need is dead easy, and can be done nicely
using Mosix.  actually, almost any cluster will work well, since even a
network of scavenged workstations can do:

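	for my $problem (@problems) {
		my $pid = fork();
		die "fork failed: $!" unless defined $pid;	# undef == 0 would be true
		if ($pid == 0) { exec("submittoqueue application $problem") or die "exec failed: $!" }
	}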

personally, I believe that Mosix is mostly interesting only where
communication is minimal, but parallelism is also extremely dynamic.
after all, if parallelism isn't wildly varying, then any old queue-manager
can create a good load-balance (without bothering with migration).
("another level of indirection solves any problem")
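
a rough Python sketch of why a plain queue is often enough: model the queue-manager as handing each job, in submission order, to whichever worker frees up first. queue_schedule and the job durations are made up for illustration, and this is a simulation of the policy, not a real batch system.

```python
import heapq

def queue_schedule(durations, n_workers):
    """Return each worker's finish time under first-free-worker dispatch."""
    # heap of (time the worker becomes free, worker id)
    free_at = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(free_at)
    load = [0.0] * n_workers
    for d in durations:
        t, w = heapq.heappop(free_at)  # earliest-free worker takes the job
        load[w] = t + d
        heapq.heappush(free_at, (t + d, w))
    return load
```

with durations [5, 3, 3, 2, 2, 1] on two workers, both finish at time 8, and no process ever had to move once it started.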

> distribution system. Mosix is used regularly to do image processing and
> other highly parallel tasks. Creating a system like this for Mosix

"highly parallel" here means lots of loosely-coupled parallelism.
it's really a good idea to distinguish between the *amount* of parallelism
and how coupled it is.

> One final note: most people consider a Mosix cluster to be a Beowulf as
> long as it meets the requirements of using commodity hardware and
> readily available software.

sort of.  purely from the load-balance perspective, Mosix dynamically
balances load through migration, and most "normal" beowulfs don't.
(though Scyld does, in a very useful sense.)

but the real point is that if you use Mosix, and therefore eschew MPI,
you are restricting the set of problems which you can efficiently handle.

regards, mark hahn.