[Beowulf] Mosix

Michael Will mwill at penguincomputing.com
Thu Jan 20 08:18:10 PST 2005


Steve, his original question was why we still bother with MPI and other
parallel programming headaches when we could instead just use Mosix,
which does things transparently. My response was intended to clarify
that you still need parallel programming techniques; your point that
you could then also use Mosix to have those processes migrate around
the cluster (and, in the worst case, away from the resources)
transparently is true.

My point is: There is no automated transparent parallelization of your 
serial code.

My apologies if my answer was not clear enough.

Michael Will

Steve Brenneis wrote:

>On Tue, 2005-01-18 at 15:29, Michael Will wrote:
>
>>On Tuesday 18 January 2005 11:31 am, Rajesh Bhairampally wrote:
>>
>>>I am wondering, when we have something like Mosix (a distributed OS available
>>>at www.mosix.org ), why we should still develop parallel programs and
>>>struggle with PVM/MPI etc. 
>>>
>>Because Mosix does not work?
>>
>>That is of course not really true; for some applications Mosix may be appropriate.
>>But what it really does is transparently move processes around a cluster, not
>>make them suddenly become parallelized. 
>>
>>Let's take an example:
>>
>>Generally your application solves a certain problem, say taking an image and applying
>>a certain filter to it. You can write a program for this that is not parallel-aware and does not
>>use MPI; it simply solves the problem of creating one filtered image from one original image.
>>
>>This serial program might take one hour to run (assuming a really large image and a really 
>>complicated filter). 
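>>
>>For concreteness, a minimal serial version might look something like the following
>>sketch in C (the 3x1 blur and the image dimensions are purely hypothetical stand-ins
>>for the real filter and data):
>>
>>#include <stdlib.h>
>>
>>#define W 4096   /* assumed image width  */
>>#define H 4096   /* assumed image height */
>>
>>/* Hypothetical 3x1 vertical blur standing in for "a complicated filter". */
>>static void filter(const float *in, float *out)
>>{
>>    for (int y = 1; y < H - 1; y++)        /* edge rows skipped for brevity */
>>        for (int x = 0; x < W; x++)
>>            out[y * W + x] = (in[(y - 1) * W + x] + in[y * W + x]
>>                              + in[(y + 1) * W + x]) / 3.0f;
>>}
>>
>>int main(void)
>>{
>>    float *in  = calloc((size_t)W * H, sizeof *in);   /* load the image here */
>>    float *out = calloc((size_t)W * H, sizeof *out);
>>    filter(in, out);                                  /* one CPU, one hour   */
>>    /* write out[] to disk here */
>>    free(in); free(out);
>>    return 0;
>>}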
>>
>>Mosix can now help you run this on a cluster with four nodes, which is great if you have four
>>images and are content to wait one hour before you see the first result.
>>
>>Now if you really want to filter only one image, but in about 15 minutes, you can program your
>>application differently so that it works on only a quarter of the image. Mosix could still help you
>>run your code with different input data across your cluster, but then you have to collect the four
>>pieces and stitch them together, and you would be unpleasantly surprised because seams will show
>>at the borders of the filtered pieces: information was missing, because each process had only a
>>quarter of the image available instead of the whole thing. Once you adjust your code to exchange
>>that border information, you are already on the path to becoming an MPI programmer, and might as
>>well just run it on a Beowulf cluster.
>>
>>So the MPI-aware solution would be a program that splits the image into four quadrants,
>>forks into four processes that are placed on four available nodes, communicates the border data
>>between the pieces, and finally collects the results and writes them out as one final image, all in
>>not much more than the expected 15 minutes.
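>>
>>Here is a rough sketch in C of what such an MPI program might look like. Everything
>>here is illustrative rather than definitive: for simplicity it splits the image into
>>horizontal strips rather than quadrants (one neighbor on each side instead of three),
>>and it reuses the toy 3x1 blur, so only one halo row per side has to be exchanged:
>>
>>#include <mpi.h>
>>#include <stdlib.h>
>>
>>#define W 4096   /* assumed image width; H assumed divisible by the node count */
>>#define H 4096
>>
>>/* in[] holds this node's rows plus one halo row above and below. */
>>static void filter_rows(const float *in, float *out, int rows)
>>{
>>    for (int y = 1; y <= rows; y++)
>>        for (int x = 0; x < W; x++)
>>            out[(y - 1) * W + x] = (in[(y - 1) * W + x] + in[y * W + x]
>>                                    + in[(y + 1) * W + x]) / 3.0f;
>>}
>>
>>int main(int argc, char **argv)
>>{
>>    MPI_Init(&argc, &argv);
>>    int rank, nprocs;
>>    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>>
>>    int rows = H / nprocs;              /* rows per node */
>>    float *image = NULL;
>>    if (rank == 0)                      /* root holds the full image */
>>        image = calloc((size_t)W * H, sizeof *image);
>>    float *strip  = calloc((size_t)W * (rows + 2), sizeof *strip);
>>    float *result = calloc((size_t)W * rows, sizeof *result);
>>
>>    /* 1. split the image: every node gets one contiguous strip */
>>    MPI_Scatter(image, rows * W, MPI_FLOAT,
>>                strip + W, rows * W, MPI_FLOAT, 0, MPI_COMM_WORLD);
>>
>>    /* 2. exchange border rows so the filter can see past the cut */
>>    int up   = (rank > 0)          ? rank - 1 : MPI_PROC_NULL;
>>    int down = (rank < nprocs - 1) ? rank + 1 : MPI_PROC_NULL;
>>    MPI_Sendrecv(strip + W, W, MPI_FLOAT, up, 0,                 /* first real row up  */
>>                 strip + (rows + 1) * W, W, MPI_FLOAT, down, 0,  /* into bottom halo   */
>>                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>>    MPI_Sendrecv(strip + rows * W, W, MPI_FLOAT, down, 1,        /* last real row down */
>>                 strip, W, MPI_FLOAT, up, 1,                     /* into top halo      */
>>                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>>
>>    /* 3. every node filters only its own strip */
>>    filter_rows(strip, result, rows);
>>
>>    /* 4. collect the strips back into one final image on the root */
>>    MPI_Gather(result, rows * W, MPI_FLOAT,
>>               image, rows * W, MPI_FLOAT, 0, MPI_COMM_WORLD);
>>
>>    /* rank 0 would write image[] out here */
>>    free(strip); free(result); free(image);
>>    MPI_Finalize();
>>    return 0;
>>}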
>>
>>That's why you want to learn how to do real parallel programming instead of relying on some
>>transparent mechanism to guess how to solve your problem.
>>
>>Michael
>>
>
>Ignoring the inflammatory opening of the above response, I'll just state
>that its representation of what Mosix does and how it works is neither
>fair nor accurate.
>
>Before message-passing mechanisms arrived, and before the concept of
>multi-threading was introduced, the favored mechanism for
>multi-processing and parallelism was the good old fork-join method. That
>is, a parent process divided the task into small, manageable sub-tasks
>and then forked child processes off to handle each sub-task. When a
>sub-task was complete, the child notified the parent (usually by simply
>exiting) and the parent joined the results of the sub-tasks into the
>final task result. This mechanism works quite well on multi-tasking
>operating systems with various scheduling models. It can be effective on
>multi-CPU single systems or on clusters of single or multiple CPU
>systems.
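>
>As a concrete (and purely illustrative) sketch in C, the skeleton of a fork-join
>worker looks like this; on a Mosix cluster, the forked children are the units that
>get migrated to other nodes:
>
>#include <stdio.h>
>#include <stdlib.h>
>#include <sys/types.h>
>#include <sys/wait.h>
>#include <unistd.h>
>
>#define NTASKS 4   /* hypothetical number of sub-tasks */
>
>/* Stand-in for one sub-task, e.g. filtering one quadrant of an image. */
>static int do_subtask(int id)
>{
>    (void)id;      /* real work would go here */
>    return 0;      /* exit status reports success back to the parent */
>}
>
>int main(void)
>{
>    pid_t pids[NTASKS];
>
>    /* fork: one child per sub-task; Mosix can migrate each child
>       to whichever node in the cluster has spare cycles */
>    for (int i = 0; i < NTASKS; i++) {
>        pids[i] = fork();
>        if (pids[i] < 0) { perror("fork"); exit(1); }
>        if (pids[i] == 0)
>            _exit(do_subtask(i));   /* child does its work and exits */
>    }
>
>    /* join: the parent waits for every child, then merges the results */
>    for (int i = 0; i < NTASKS; i++) {
>        int status;
>        waitpid(pids[i], &status, 0);
>        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
>            fprintf(stderr, "sub-task %d failed\n", i);
>    }
>    return 0;
>}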
>
>Mosix (or at least openMosix) handles this kind of parallelism
>brilliantly in that it will balance the forked child processes around
>the cluster based on load factors. So your image processing, your
>Gaussian signal analysis, your fluid dynamics simulations, your parallel
>software compilations, or your Fibonacci number generations are
>efficiently distributed while you still maintain programmatic control of
>the sub-tasking.
>
>While the fork-join mechanism is not without a downside
>(synchronization, for one, as mentioned above), it can be used with a
>system like Mosix to provide parallelism without the overhead of the
>message-passing paradigm. Maybe not better, probably not worse, just
>different.
>
>The effect described above in which sub-tasks operate completely
>independently to produce an erroneous result is really an artifact of
>poor programming and design skills and cannot be blamed on the task
>distribution system. Mosix is used regularly to do image processing and
>other highly parallel tasks. Creating a system like this for Mosix
>requires no knowledge of a message-passing interface or API, but simply
>requires a working knowledge of standard multi-processing methods and
>parallelism in general.
>
>One final note: most people consider a Mosix cluster to be a Beowulf as
>long as it meets the requirements of using commodity hardware and
>readily available software.
>
>Just keeping the record straight.
>
>>>Though I have never used either Mosix or PVM/MPI, I am
>>>genuinely puzzled by this. Can someone kindly educate me?
>>>
>>>thanks,
>>>rajesh
>>>
>>>_______________________________________________
>>>Beowulf mailing list, Beowulf at beowulf.org
>>>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf