[Beowulf] RE: Compare and contrast MPI implementations

Leif Nixon nixon at nsc.liu.se
Mon Dec 19 05:40:35 PST 2005


"Robert G. Brown" <rgb at phy.duke.edu> writes:

> But there are so many different ways to remove fur from felines.

Oh, yes. And modules are a nice way of storing cat skins, once you're
done with the grisly bit.

> But I think modules already sound like they take care of the problem.

They don't, actually.

One of your Frobotz jobs might need $FROBOTZ_VERSION set to "fritz",
"/usr/local/frobotz/0.992b/bin" added to $PATH and
"/usr/local/frobotz/0.992b/lib" added to $LD_LIBRARY_PATH, while
another of your jobs runs Frobotz 0.993a and needs different values
for these variables.

Modules offer a nice way of packaging these settings into manageable
chunks of configuration. I *like* modules, or rather, I like cmod.

*However*, if Frobotz is one of those applications that like to handle
their own remote process start-up via rsh or ssh, modules don't help
you much. You still have to find that rsh invocation buried deep below
layers of helper scripts and change

  rsh $otherhost frobotz

to

  ssh $otherhost 'export FROBOTZ_VERSION=fritz; export PATH=/usr/local/frobotz/0.992b/bin:$PATH; export LD_LIBRARY_PATH=/usr/local/frobotz/0.992b/lib:$LD_LIBRARY_PATH; frobotz'

OK, modules let you write

  ssh $otherhost "module add frobotz/0992b; frobotz"

instead¹, but the basic problem of setting up the environment for
parallel jobs remains the same.

And an irritating problem, at that, because you often have to fight a
huge, complex program written by an author who has real trouble
understanding the concept of a cluster that *isn't* purpose-built to
run a single frobotz version and nothing else.

[For the sake of brevity, I here omit 300 lines of ranting]

So, getting back to my original point, modules *don't* help
David with his problem.


¹ With the added bonus of not having to care about the user's login
  shell. That first example should of course really be wrapped in a
  "bash -c".

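As a rough illustration of the "bash -c" wrapping mentioned in the
footnote, here is a minimal sketch of a helper that builds the remote
command string. The function name build_remote_cmd and its argument
order are made up for this example; the paths and the "fritz" value
are the ones used above. It only prints the string and does not call
ssh itself, and it assumes the arguments contain no spaces or single
quotes.

```shell
#!/bin/sh
# Hypothetical helper (not part of modules, cmod or Frobotz): build the
# command string that a launcher script would hand to ssh.
build_remote_cmd() {
    prefix=$1    # install prefix, e.g. /usr/local/frobotz/0.992b
    version=$2   # value for FROBOTZ_VERSION, e.g. fritz
    shift 2      # whatever is left is the program to run and its arguments

    # \$PATH and \$LD_LIBRARY_PATH are escaped so that they are expanded
    # on the remote host, not on the host that builds the string.
    inner="export FROBOTZ_VERSION=$version; export PATH=$prefix/bin:\$PATH; export LD_LIBRARY_PATH=$prefix/lib:\$LD_LIBRARY_PATH; exec $*"

    # Wrapping in bash -c means the user's login shell (csh, tcsh, ...)
    # only has to start bash; it never sees the 'export' statements.
    printf "bash -c '%s'\n" "$inner"
}

cmd=$(build_remote_cmd /usr/local/frobotz/0.992b fritz frobotz)
echo "$cmd"
```

A launcher would then run something like: ssh "$otherhost" "$cmd".
This does nothing about the real nuisance, which is finding the rsh
invocation in the first place.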
-- 
Leif Nixon                       -            Systems expert
------------------------------------------------------------
National Supercomputer Centre    -      Linkoping University
------------------------------------------------------------