[Beowulf] RE: Compare and contrast MPI implementations
Leif Nixon
nixon at nsc.liu.se
Mon Dec 19 05:40:35 PST 2005
"Robert G. Brown" <rgb at phy.duke.edu> writes:
> But there are so many different ways to remove fur from felines.
Oh, yes. And modules are a nice way of storing cat skins, once you're
done with the grisly bit.
> But I think modules already sound like they take care of the problem.
They don't, actually.
One of your Frobotz jobs might need $FROBOTZ_VERSION set to "fritz",
"/usr/local/frobotz/0.992b/bin" added to $PATH and
"/usr/local/frobotz/0.992b/lib" added to $LD_LIBRARY_PATH, while
another of your jobs runs Frobotz 0.993a and needs different values
for these variables.
Modules offer a nice way of packaging these settings into manageable
chunks of configuration. I *like* modules, or rather, I like cmod.
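For reference, here is roughly what such a chunk of configuration amounts to when done by hand in a job script. This is only a sketch: the helper name use_frobotz is made up for illustration, and the paths are simply the ones from the example above.

```shell
# Hypothetical helper (not part of modules or cmod): select one Frobotz
# installation for the current shell.
#   $1 = version directory under /usr/local/frobotz, e.g. 0.992b
#   $2 = value for FROBOTZ_VERSION, e.g. fritz
use_frobotz() {
    prefix="/usr/local/frobotz/$1"
    export FROBOTZ_VERSION="$2"
    export PATH="$prefix/bin:$PATH"
    # Append the old value only if it was set, to avoid a trailing colon.
    export LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

use_frobotz 0.992b fritz
```

A job running Frobotz 0.993a would call the same helper with its own two arguments, which is essentially the bookkeeping a module file does for you.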
*However*, if Frobotz is one of those applications that like to handle
their own remote process start-up via rsh or ssh, modules don't help
you much. You still have to find that rsh invocation buried deep below
layers of helper scripts and change
    rsh $otherhost frobotz
to
    ssh $otherhost 'export FROBOTZ_VERSION=fritz; export PATH=/usr/local/frobotz/0.992b/bin:$PATH; export LD_LIBRARY_PATH=/usr/local/frobotz/0.992b/lib:$LD_LIBRARY_PATH; frobotz'
OK, modules let you write
    ssh $otherhost "module add frobotz/0992b; frobotz"
instead¹, but the basic problem of setting up the environment for
parallel jobs remains the same.
And an irritating problem, at that, because you often have to fight a
huge, complex program written by an author who has a real problem
understanding the concept of a cluster that *isn't* purpose-built to
run a single frobotz version and nothing else.
[For the sake of brevity, I here omit 300 lines of ranting]
So, getting back to my original point, modules *don't* help
David with his problem.
¹ With the added bonus of not having to care about the user's login
shell. That first example should of course really be wrapped in a
"bash -c".
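A sketch of that wrapping, assuming bash is installed on the remote host. The variable names remote_env and remote_cmd and the fallback host name node01 are placeholders for illustration only. The command line is printed rather than executed, so the quoting can be inspected first.

```shell
# Placeholder host name; a real helper script would already have set this.
otherhost=${otherhost:-node01}

# Single quotes keep $PATH and $LD_LIBRARY_PATH unexpanded locally, so
# they are expanded on the remote host instead.
remote_env='export FROBOTZ_VERSION=fritz; export PATH=/usr/local/frobotz/0.992b/bin:$PATH; export LD_LIBRARY_PATH=/usr/local/frobotz/0.992b/lib:$LD_LIBRARY_PATH; frobotz'

# Hand the whole thing to bash explicitly, so the user's login shell
# (tcsh, say) only has to pass one single-quoted argument through.
remote_cmd="bash -c '$remote_env'"

# Print the resulting command line for inspection.
printf 'ssh %s "%s"\n' "$otherhost" "$remote_cmd"
```

Once the printed line looks right, the actual call would be ssh "$otherhost" "$remote_cmd".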
--
Leif Nixon - Systems expert
------------------------------------------------------------
National Supercomputer Centre - Linkoping University
------------------------------------------------------------