Is Beowulf what I need?

Omri Schwarz omri at NMR.MGH.Harvard.EDU
Fri May 19 23:10:23 PDT 2000


Omri Schwarz --- omri at nmr.mgh.harvard.edu
Timeless wisdom of biomedical engineering:
"Noise is principally due to the presence of the 
patient." -- R.F. Farr


On Fri, 19 May 2000, Greg Lindahl wrote:

> > Our need is not parallelization within the programs but distribution of the
> > processes.
> >
> > We would like to access a single system but have a lot of CPUs.
> 
> It sounds like you have a large number of single-process jobs that you would
> like to run. This is often called "capacity computing". The only sharing
> between these processes would be via disk.
> 
> There are a couple of basic ways to solve this kind of problem, and several
> systems aimed at this kind of problem. One already mentioned is Mosix, which
> will take arbitrary processes and migrate them around. The user does nothing
> different. In a second class of systems you do something special
> to start the process (other than just running it). Condor falls into that
> category, as would something like a PBS wrapper. The wrapper could just be
> a few lines.

I'm using Mosix and PBS together. This means that jobs get queued and can run
fairly quickly (since file I/O usually stays on the same node), but
when PBS makes a mistake allocating jobs (allocating too many jobs, or
mismatching jobs so that two jobs share the same node even though both have
a high memory footprint), Mosix comes to the rescue by migrating processes
to other nodes.
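
A wrapper of the kind Greg mentions might look roughly like the sketch
below. It is an illustration only, not taken from any real setup: the
function names, the default job name and the 512mb memory request are
made up, and it assumes that qsub reads the job script from standard
input when no script file is given, as OpenPBS does.

```python
import shlex
import subprocess


def build_job_script(command, name="job", mem="512mb"):
    """Return the text of a PBS job script that runs one single-process command.

    The name and mem defaults are hypothetical; adjust them to the site.
    """
    lines = [
        "#!/bin/sh",
        f"#PBS -N {name}",        # job name shown by qstat
        "#PBS -l nodes=1",        # one node per job: no parallelism inside the job
        f"#PBS -l mem={mem}",     # declared memory request for the scheduler
        'cd "$PBS_O_WORKDIR"',    # start in the directory qsub was called from
        " ".join(shlex.quote(arg) for arg in command),
    ]
    return "\n".join(lines) + "\n"


def submit(command, **kwargs):
    """Pipe the generated script to qsub and return whatever qsub prints."""
    script = build_job_script(command, **kwargs)
    result = subprocess.run(
        ["qsub"], input=script, text=True, capture_output=True, check=True
    )
    return result.stdout.strip()
```

Calling submit(["./sim", "run1.dat"], name="sim1", mem="1gb") would queue
one job; a loop over input files queues the whole batch. Declaring the
memory request may help the scheduler avoid placing two large jobs on the
same node, though how it is used depends on the local PBS configuration.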




