[Beowulf] PVM on wireless...
kohlja at ornl.gov
Fri Feb 8 09:22:02 PST 2008
On Fri, Feb 08, 2008 at 11:40:50AM -0500, Robert G. Brown wrote:
> On Fri, 8 Feb 2008, kohlja at ornl.gov wrote:
>> Awesome Strangelove Reference...! :-D
>>
>> "I Have A Plan...!" :-)
>>
>> Yep, I am now getting inundated with people having rsh/ssh problems
>> with PVM, so a higher power clearly wants me to better document this.
>>
>> Thanks Much, Will Do... :)
> Excellentamundo!
I'm already getting lots of practice explaining how to get this stuff
to work for 3 separate PVM users... :)
> At some point at your convenience in the future when
> you have all kinds of time to metaphorically sit down and REALLY work
> over PVM...
Ahhh... Lemme picture that moment... :-D
> I have about 800 specific suggestions for bringing it up to
> current and modern and everything. Just a wee list. You know:
> * Purge aimk for all time, die die die
Ha ha ha... You don't like "aimk"...? :-)
Yeah, PVM was originally pre-autoconf... Too bad, eh...? :)
> * Actually use the FSH so e.g. apropos pvm works.
I'm assuming you don't mean FSH="Follicle Stimulating Hormone";
did you mean "SSH", or am I clueless...?
Sorry, I guess I'm not "up" on all the latest \/32/\/4[vL/\r... :-}
> * Document the hell out of everything
Yes! :D
> * Rewrite the network back end in a way that openly encourages high
> end network vendors to contribute reusable non-IP native drivers
Ha ha ha... Tried to cater to vendors many times. See all those funny
arch subdirs in pvm3/src...? Yeah, been there, done that...
(Though I agree that building on top of some generic "standardized"
networking layer would be "nice" - there are so many to choose from... :)
> * Add a (possibly macro-driven) middle layer that makes PVM into MPI
> as well -- one set of actual message-passing functions, two conformally
> mapped call interfaces.
You mean like "PVMPI"...?
http://www.netlib.org/utk/papers/pvmpi/paper.html
Or its offspring "MPI-Glue"...?
http://www.scientific-computing.de/people/rabenseifner/projects/mpi_glue.html
Or do you mean something completely different...? :)
> * Make Ctrl-C work so one can break out of the annoying timeout on add
> hosts when things don't work.
Yeah, bummer eh? :) Where did Bob Manchek go to anyway...?
(He's the real culprit behind the majority of PVM code, btw,
I merely "inherited" the maintenance job... :)
> * Make the console capable of cleaning up after a crash or
> interruption.
We talked about things we could do there, e.g. to clean up old
leftover /tmp/pvmd.* files, etc, but it was always easier to
just remove the files by hand...! ;)
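For anyone who would rather script the by-hand cleanup, here is a minimal
shell sketch. The function name pvm_clean_stale, the assumption that the
daemon process is called pvmd3, and the optional directory argument are
all illustrative and not part of PVM itself.

```shell
# Hypothetical helper: remove leftover /tmp/pvmd.* files owned by the
# current user. The directory argument exists only so the sketch can be
# tried on a scratch directory first.
pvm_clean_stale() {
    tmpdir="${1:-/tmp}"
    # Assumption: the daemon process is named pvmd3. If one is running
    # under this uid, the files are probably still in use, so stop.
    if command -v pgrep >/dev/null 2>&1 &&
       pgrep -u "$(id -u)" -x pvmd3 >/dev/null 2>&1; then
        echo "pvmd3 is running; leaving files alone" >&2
        return 1
    fi
    for f in "$tmpdir"/pvmd.*; do
        [ -e "$f" ] || continue   # the glob matched nothing
        [ -O "$f" ] || continue   # skip files owned by other users
        rm -f -- "$f" && echo "removed $f"
    done
}
```

Calling pvm_clean_stale with no argument acts on /tmp; pass a scratch
directory to see what it would touch before trusting it.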
Good suggestions, though. I'll add them to my "to do" list,
along with any others that may come up...? :-)
Thanks, Man!
Jeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeem ;)
> that kind of thing...;-)
> rgb
>>
>> Jeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeem ;)
>>
>> On Fri, Feb 08, 2008 at 05:35:31AM -0500, Robert G. Brown wrote:
>> > On Thu, 7 Feb 2008, kohlja at ornl.gov wrote:
>>
>> >> I admit this may be an antiquated cynical mentality, and I
>> >> further concur that PVMNETSOCKPORT is an obvious omission
>> >> in the basic documentation/faq...
>>
>> > As they say, you can't RTFM if there ain't no FM... (or if the solution
>> > exists but isn't there).
>>
>> > One is reminded of Dr. Strangelove, where the president (Peter Sellers)
>> > has just learned that if the maverick B52 piloted by Slim Pickens gets
>> > through, a doomsday device that is supposed to deter first nuclear
>> > strikes will go off that will destroy the world. Unfortunately, the
>> > Soviet Union didn't actually tell us that it was built. Dr.
>> > Strangelove (Peter Sellers), after musing for a moment on the
>> > brilliance of the concept, turns and says in an increasingly shrill
>> > voice:
>>
>> > But...the whole point of the Doomsday Machine...is lost...if you keep
>> > it a SECRET. Why didn't you tell the world, eh?
>>
>> > Hmmm...;-)
>>
>> > rgb
>>
>> >> Thanks for your suggested text! (And the suggestion to
>> >> enhance our coverage of rsh/ssh usage... :-)
>>
>> > Ya, well. Just now finished telling the umptieth would-be PVM user how
>> > to go about it in an email message, augmenting further online docs such
>> > as this one:
>>
>> > http://www.uow.edu.au/~suresh/web/cfamily/pvm.html
>>
>> > which is actually pretty decent, although I generally use the ssh
>> > default dsa instead of rsa since on linux boxes it invariably works.
>> > But better than forcing each user to employ google to snarf out
>> > solutions to each problem they encounter, how much better to write a
>> > really nice "Getting Started with PVM" or perhaps better still, a "PVM
>> > HOWTO" on tldp.org. Publish there, and be sure to include a copy in
>> > plain sight in /usr/share/pvm3/PVM_HOWTO.
>>
>> > Truthfully, good documentation, especially a walkthrough tutorial on
>> > getting started (including sample code or links to sample code) that
>> > takes a would-be user from "yum install pvm\*" to executing a Real
>> > Parallel Program (however trivial) on a two node cluster would really
>> > encourage the use of the library. Adding a bit more (such as a PVM
>> > program development template) would be only icing on the cake, so to
>> > speak.
>>
>> > If I had the time I'd write it myself. I've already got a project_pvm
>> > program template up on the web, but it is sadly underdocumented through
>> > the setup of PVM itself.
>>
>> > rgb
>>
>> >>
>> >> All the Best,
>> >>
>> >> Jeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeem ;)
>> >>
>> >> On Thu, Feb 07, 2008 at 04:42:21PM -0500, Robert G. Brown wrote:
>> >> >> > It would really, really help if man pvm (or man pvmd or man
>> >> >> > pvm_intro) documented a suitable firewall setting that will let
>> >> >> > PVM function without just turning off the firewall altogether.
>> >> >> > There is no pvm setup in /etc/services, for example, no pvm
>> >> >> > checkbox in the panels managed by system-config-firewall in the
>> >> >> > latest Fedoras, no suggestion as to what protected port(s) or
>> >> >> > ranges one has to enable explicitly. In fact for once even
>> >> >> > google is failing me -- I'm not finding a lot of documentation
>> >> >> > or remarks by ANYONE on what ports pvm needs open (besides ssh,
>> >> >> > which obviously is open and works). Usually as long as the
>> >> >> > spawning of a network application itself works using an enabled
>> >> >> > protected port (in this case, I would have expected ssh), the
>> >> >> > secondary ports opened in unprotected space just work. Am I
>> >> >> > wrong in this? Do I need to explicitly open more ports
>> >> >> > somewhere?
>> >> >>
>> >> >> Ah Yes. O.K., so I wish it was that simple, but alas PVM can use
>> >> >> as many ports as you have machines in your cluster, or could use
>> >> >> just 1. :-}
>> >> >>
>> >> >> Normally, the master pvmd creates/accepts connections over a small
>> >> >> set of ports, possibly 1, but if PvmRouteDirect is enabled in a
>> >> >> PVM application, then a myriad of direct-connection socket links
>> >> >> are created, to link whichever machines the local PVM application
>> >> >> tasks communicate with, on a demand-driven basis...
>> >> >>
>> >> >> So it's not generally possible to specify an explicit "range" of
>> >> >> ports. However, it _is_ possible to set the "starting" port for
>> >> >> this collection, using the aforementioned "$PVMNETSOCKPORT"
>> >> >> environment variable.
>> >>
>> >> > OK, I'm giving this a try. Although I'd have to ask why pvmd
>> >> > doesn't do the fork thing and clone a single open port on which it
>> >> > listens into a dynamically allocated port that inherits from the
>> >> > open one. In principle one only needs a single port to be open to
>> >> > connect to pretty much any network based application, or so I had
>> >> > thought. At least, I do that in xmlsysd and never have to punch
>> >> > more than one porthole through a firewall.
>> >>
>> >> > Hmmm, it's working sort of -- looks like I need to open UDP ports,
>> >> > right, not TCP? Having trouble on one host where I've punched the
>> >> > hole but didn't >>locally<< set PVMNETSOCKPORT to match, so I'm
>> >> > trying again with the local environment variable set.
>> >>
>> >> > Yup, that works.
>> >>
>> >> > So I'm guessing that pvmd reads it as it starts up wherever. Why
>> >> > does it need to do this on a client? Can't the port(s) be passed
>> >> > from the master when it starts up pvmd?
>> >>
>> >> >> This sets the first port that PVM will try to use, and all
>> >> >> subsequent ports will usually be consecutive positive increments
>> >> >> of that starting port (i.e. PVMNETSOCKPORT++... :-).
>> >> >>
>> >> >> So in most cases, you could probably plan on opening up a 100 or
>> >> >> 1000 ports _somewhere_ in your firewall, depending on your needs,
>> >> >> and then just tell PVM where to start, using $PVMNETSOCKPORT...
>> >> >>
>> >> >> I've always considered this solution a bit of a kludge, which is
>> >> >> why it doesn't show up in the man pages, but if it works well
>> >> >> enough to save people lots of hassle, then I can add some
>> >> >> commentary on it...?
>> >>
>> >> > Kludge or not, how can you have an environment variable in an
>> >> > application and not provide knowledge of it or instructions on its
>> >> > use in the man page? Something like:
>> >>
>> >> > PVM requires open ports on target hosts to function. Many hosts
>> >> > are installed with strong firewall rules by default. If you
>> >> > install pvm on a slave and pvm appears to hang when you attempt to
>> >> > add it, eventually timing out without success, consider adding the
>> >> > following to your local personal or system environment (in, for
>> >> > example, ~/.bash_profile on all hosts):
>> >>
>> >> > PVMNETSOCKPORT=10000
>> >> > export PVMNETSOCKPORT
>> >>
>> >> > Then configure your firewall(s) to open a range of udp ports
>> >> > starting at this value, such as 10000-11024 (which need be no
>> >> > larger than the largest number of machines you expect to have in
>> >> > your virtual machine).
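
The range arithmetic in the suggested text can be sketched in shell. This
only prints a candidate rule and never applies it. The function name
pvm_firewall_rule, the default count of 100 ports, and the choice of
iptables as the firewall tool are assumptions for illustration; udp is
the default because that is what worked in this thread, and direct
task-to-task routes may need tcp as well.

```shell
# Hypothetical helper: print (do not apply) an iptables rule that opens
# a block of ports starting at $PVMNETSOCKPORT.
pvm_firewall_rule() {
    count="${1:-100}"                  # assumed number of ports to open
    proto="${2:-udp}"                  # udp per the thread; tcp is a guess
    start="${PVMNETSOCKPORT:-10000}"   # same starting value as the text above
    end=$((start + count - 1))         # the range is inclusive at both ends
    echo "iptables -A INPUT -p $proto --dport $start:$end -j ACCEPT"
}
```

With PVMNETSOCKPORT=10000 exported, "pvm_firewall_rule 100" prints a rule
covering ports 10000 through 10099. Review the line before running it as
root, and remember to set the same PVMNETSOCKPORT on every host.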
>> >>
>> >> > However a better solution still is to have the daemon fork on a
>> >> > single "permanent" port address > 1024, e.g. 10000, and get a
>> >> > negotiated connection in the upper (non-protected) port space that
>> >> > way.
>> >>
>> >> >> It may depend on the firewall settings, but a nice "Connection
>> >> >> Refused" would usually go a long way toward diagnosing things,
>> >> >> whereas the more secure firewall alternative of simply
>> >> >> "no response" would only result in a "timed out" PVM message...
>> >> >>
>> >> >> I'm open to suggestions on ways to identify or diagnose the
>> >> >> problem...!
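
One rough way to see the "refused" versus "no response" difference from
the outside is a timed TCP connect. The sketch below is TCP only, so it
suits checking the ssh or rsh port on a would-be slave rather than the
daemon's UDP ports. The function name probe_tcp and the three labels it
prints are made up for this example, and it assumes bash (for /dev/tcp)
and the coreutils timeout command are installed.

```shell
# Hypothetical helper: classify a TCP port as open, closed, or filtered.
probe_tcp() {
    host="$1"; port="$2"; secs="${3:-3}"
    # bash opens a TCP connection when /dev/tcp/HOST/PORT is redirected;
    # timeout kills the attempt if nothing answers within $secs seconds.
    timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
    case $? in
        0)   echo "open" ;;       # something accepted the connection
        124) echo "filtered" ;;   # timed out: packets silently dropped
        *)   echo "closed" ;;     # failed quickly, most often "refused"
    esac
}
```

For example, "probe_tcp node1 22" printing "filtered" points at a
firewall dropping packets, which is the case that otherwise shows up only
as a PVM timeout. The host name node1 is a placeholder.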
>> >>
>> >> > As I said, document EVERYTHING in the man page(s). It is what it
>> >> > is for. Lots of users do, in fact, RTFM but get frustrated and
>> >> > give up when they try something and it just doesn't work and they
>> >> > can't see why.
>> >>
>> >> > On the same line, a perennial problem with PVM is getting it to
>> >> > work with rsh and ssh. In fact, half the problems I help people
>> >> > with when they randomly write me involve getting it to work with
>> >> > one or the other. The internal diagnostics are certainly very
>> >> > helpful, at this point, but it would also be worth adding a new
>> >> > man page like pvm_rsh that does nothing but walk users through the
>> >> > ritual of setting PVM_RSH and establishing appropriate e.g. ssh
>> >> > keys.
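
As a rough idea of what such a walkthrough might contain, here is a shell
sketch of the ritual. PVM_RSH is the variable named above; the helper
name pvm_ssh_setup, the default key path, the rsa key type, and the empty
passphrase are assumptions chosen for brevity, not recommendations from
the PVM documentation.

```shell
# Tell PVM to start remote daemons over ssh instead of rsh.
# The fallback path is a guess for systems where ssh is not on PATH.
PVM_RSH="$(command -v ssh || echo /usr/bin/ssh)"
export PVM_RSH

# Hypothetical helper: set up passwordless ssh to one node and verify it.
pvm_ssh_setup() {
    node="$1"
    key="${2:-$HOME/.ssh/id_rsa}"
    # Generate a key only if none exists yet. An empty passphrase is the
    # simplest choice; a passphrase plus ssh-agent is the safer one.
    [ -f "$key" ] || ssh-keygen -t rsa -f "$key" -N "" || return 1
    # Append the public key to authorized_keys on the node (this step
    # asks for the password one last time).
    ssh-copy-id -i "$key.pub" "$node" || return 1
    # BatchMode makes ssh fail instead of prompting, which mimics what
    # happens when pvmd starts a remote daemon without a terminal.
    ssh -o BatchMode=yes "$node" true
}
```

After "pvm_ssh_setup node1" returns success (node1 being a placeholder
host name), adding that host from the pvm console should no longer stall
on a password prompt. Put the PVM_RSH lines in ~/.bash_profile so every
new shell picks them up.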
>> >>
>> >> > Just a thought or two.
>> >>
>> >> > rgb
>> >>
>> >> >>
>> >> >> Thanks Much for your interest and feedback!
>> >> >>
>> >> >> All the Best,
>> >> >>
>> >> >> Jeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeem ;)
>> >> >>
>> >> >> > I actually help a lot of people get started with PVM (they
>> >> >> > write me offline because I have a template PVM tarball up on my
>> >> >> > personal website) and the more I know, the better I can help
>> >> >> > them...;-)
>> >> >>
>> >> >> > rgb
>> >> >>
>> >> >> > --
>> >> >> > Robert G. Brown                 Phone(cell): 1-919-280-8443
>> >> >> > Duke University Physics Dept, Box 90305
>> >> >> > Durham, N.C. 27708-0305
>> >> >> > Web: http://www.phy.duke.edu/~rgb
>> >> >> > Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
>> >> >> > Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977
>> >> >>
>> >> >>
>> >> >> (:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:(:
>> >> >>
>> >> >> James Arthur "Jeeembo" Kohl, Ph.D.   "Da Blooos Brathas?! They
>> >> >> Oak Ridge National Laboratory         still owe you money, Fool!"
>> >> >> kohlja at ornl.gov
>> >> >> http://www.csm.ornl.gov/~kohl/        Long Live Curtis Blues!!!
>> >> >>
>> >> >> :):):):):):):):):):):):):):):):):):):):):):):):):):):):):):):):):):):):):):)
>> >> >>
>> >>
>> >>
>>
>>
>>