Beowulf Questions

Randall Jouett rules at
Sat Jan 4 04:07:10 PST 2003

Hello again, Donald.

Donald Becker wrote:
> Our cluster philosophy is that the end user should not be required to
> do anything new or special to run a cluster application.

Great. End users and turn-key solutions are always a nice
thing to have in a business-level environment. Rock on, dewd!

> That means
>    Applications should work even if there is only a single machine in
>    the cluster.  Many beginner MPI applications don't handle this case
>    correctly.

Wow. I would have thought that people would have made plans
to deal with this, especially since something along these lines
can happen, although I'm pretty sure it's rather infrequent.
Go figure.
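
For what it's worth, here is a minimal sketch of the single-machine case, written as plain Python with no MPI library involved. The names `render_ranks` and `split_rows` are made up for illustration and come from neither Donald's code nor any MPI implementation. The only idea borrowed from the thread is that rank 0 is the front end and has to do the work itself when it is the only process.

```python
def render_ranks(world_size):
    """Return the ranks that should do compute work.

    Hypothetical helper: rank 0 is the front end. With more than one
    process it only coordinates; with a single process it must also
    do the work itself.
    """
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    if world_size == 1:
        return [0]  # single machine: the front end does the work too
    return list(range(1, world_size))  # otherwise ranks 1..N-1 compute


def split_rows(total_rows, ranks):
    """Divide rows as evenly as possible among the given ranks.

    Returns {rank: (first_row, one_past_last_row)}.
    """
    base, extra = divmod(total_rows, len(ranks))
    plan, start = {}, 0
    for i, rank in enumerate(ranks):
        count = base + (1 if i < extra else 0)
        plan[rank] = (start, start + count)
        start += count
    return plan
```

A program that divides the work by "number of processes minus one" without a check like the one in `render_ranks` would divide by zero or hand out no work at all on a single machine, which is presumably the beginner mistake being described.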

>    Cluster applications should not require a helper program such as
>    'mpirun' or 'mpiexec'.

In a commercial system, where end users shouldn't and wouldn't
know about such things, I totally agree. OTOH, in a production
environment where the vast majority of users are geekoids, I don't
have a problem with this, especially if mpirun or mpiexec is hidden
by a GUI or something. Since you are doing this as a commercial
endeavor, though, I agree with the way you guys are handling
this, Donald. This lets me and others know that your systems are
well thought out and end-user friendly, and that is something
we all expect when shelling out serious cash for a good
number-cruncher setup.

> The application code should interact with the scheduler to set any
> special scheduling requirements or suggestions.

True. Also, this shouldn't be any big deal, and I'd imagine
this is easily done via shell scripts or a quick C hack,
especially if you feel that this type of code should be
proprietary or something. Personally, I'd want to see something
like this done at the script level, though, so that a geek could
come along and change a few things for tweaks. That's just
me, though. (Shrug.)

> A sophisticated user should still be able to optimize and do clever
> things, but the basic operation shouldn't require any new knowledge.


>>>    It does all of the serial setup and run-time I/O on the front end
>>>      machine (technically, the MPI rank 0 node).  This minimizes
>>>      overall work and keeps the POV-Ray call-out semantics unchanged
>>>    It does the rendering only on compute nodes (except for the N=0 case). 
>>>    It completes the rendering even with crashed or slow nodes.
>>Ah. So it redistributes the work, huh? Kewl.
> Here we use knowledge about the application semantics to implement
> failure tolerance.  When we have idle workers and the rendering isn't
> finished, we send some of the remaining work to the idle machine.

Well, I hate to sound like a knothead here, Donald, and I don't
mean to be rude, but isn't this a de facto setup and standard in
a Beowulf environment?? If not, what the hell are people thinking
about? :^) :^). To me, this just seems like the logical way to
write code, but what the heck do I know? :^)

> If a machine fails we still finish the rendering and do the final
> call-outs, but don't cleanly terminate.

Ah. Ok. Kewl. Sounds logical to me.
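
A rough sketch of the idea as I understand it, simulated in plain Python rather than real MPI: work is handed out one chunk at a time to whichever worker is idle, and a chunk held by a crashed worker goes back on the queue. The function `run_job` and its `fails_on` table are hypothetical, invented only to make the behavior testable. This is not how the POV-Ray port is actually implemented.

```python
from collections import deque


def run_job(chunks, workers, fails_on=None):
    """Simulate handing work chunks to idle workers, one at a time.

    chunks   -- list of hashable work items (e.g. image row numbers)
    workers  -- list of worker names
    fails_on -- hypothetical failure table: {worker: number of chunks
                it completes before crashing}
    Returns {chunk: worker that finished it}.
    """
    fails_on = dict(fails_on or {})
    pending = deque(chunks)
    alive = list(workers)
    completed = {w: 0 for w in workers}
    done = {}

    while pending:
        if not alive:
            raise RuntimeError("all workers failed; work remains")
        # Each pass, every live worker is idle and takes the next chunk.
        for worker in list(alive):
            if not pending:
                break
            chunk = pending.popleft()
            if worker in fails_on and completed[worker] >= fails_on[worker]:
                # Worker crashed on this chunk: requeue it, drop the worker.
                pending.append(chunk)
                alive.remove(worker)
                continue
            done[chunk] = worker
            completed[worker] += 1
    return done
```

Because no chunk is bound to a worker in advance, a slow node simply ends up taking fewer chunks, and a dead node costs only the one chunk it was holding.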

Type at ya' later,

Randall Jouett
Amateur Radio: AB5NI

I eat spaghetti code out of a bit bucket while sitting at a hash table!
