[Beowulf] PVM on wireless...
kohlja at ornl.gov
Thu Feb 7 10:53:04 PST 2008
Hi Robert/Rob/RGB! :-)
On Thu, Feb 07, 2008 at 12:55:31PM -0500, Robert G. Brown wrote:
> On Wed, 6 Feb 2008, kohlja at ornl.gov wrote:
>> Hey Gang!
>> Sounds like you're having some "fun" with PVM over wireless...? :-)
>> (A buddy (Wael Elwasif) forwarded your discussion to me;
>> please always feel free to copy "pvm at msr.csm.ornl.gov"
>> with PVM inquiries when you get stuck. I try to be
>> pretty responsive, though this is all unfunded work now... :)
> Bless you.
De nada, you're welcome. :-)
> However, I've just managed to figure the problem out on my own. It is,
> after all, a firewall issue... <snip/>
Ah, Good! Glad that's all it was, not that it wasn't a hassle to identify! :)
Sorry it was so non-obvious from the PVM side of things... :-b
> While I've got the One True PVM Human(s) on the line, though...
> -- a suggestion for PVM to help others avoid this problem in the future
> on networks wired and wireless:
> It would really, really help if man pvm (or man pvmd or man pvm_intro)
> documented a suitable firewall setting that will let PVM function
> without just turning off the firewall altogether. There is no pvm setup
> in /etc/services, for example, no pvm checkbox in the panels managed by
> system-config-firewall in the latest Fedoras, no suggestion as to
> what protected port(s) or ranges one has to enable explicitly. In fact
> for once even google is failing me -- I'm not finding a lot of
> documentation or remarks by ANYONE on what ports pvm needs open (besides
> ssh, which obviously is open and works). Usually as long as the
> spawning of a network application itself works using an enabled
> protected port (in this case, I would have expected ssh), the secondary
> ports opened in unprotected space just work. Am I wrong in this? Do I
> need to explicitly open more ports somewhere?
Ah Yes. O.K., so I wish it was that simple, but alas PVM can use as
many ports as you have machines in your cluster, or could use just 1. :-}
Normally, the master pvmd creates/accepts connections over a small
set of ports, possibly 1, but if PvmRouteDirect is enabled in a PVM
application, then a myriad of direct-connection socket links are
created, to link whichever machines the local PVM application tasks
communicate with, on a demand-driven basis...
So it's not generally possible to specify an explicit "range" of ports.
However, it _is_ possible to set the "starting" port for this collection,
using the aforementioned "$PVMNETSOCKPORT" environment variable.
This sets the first port that PVM will try to use, and all subsequent
ports will usually be consecutive positive increments of that starting
port (i.e. PVMNETSOCKPORT++... :-).
So in most cases, you could probably plan on opening up 100 or 1000
ports _somewhere_ in your firewall, depending on your needs, and then
just tell PVM where to start, using $PVMNETSOCKPORT...
I've always considered this solution a bit of a kludge, which is why
it doesn't show up in the man pages, but if it works well enough to
save people lots of hassle, then I can add some commentary on it...?
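
A minimal sketch of that arithmetic in Python, assuming the ports really are handed out consecutively from the starting port (only "usually" true, per the above). The function names and the 5000/100 values are made up for illustration; only the PVMNETSOCKPORT variable name comes from PVM.

```python
import os


def pvm_port_plan(start_port, count):
    """Return (first, last) for `count` consecutive ports from start_port.

    Assumes PVM walks upward one port at a time from the starting port.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    last_port = start_port + count - 1
    if start_port < 1 or last_port > 65535:
        raise ValueError("port range falls outside 1-65535")
    return start_port, last_port


def pvm_environment(start_port, base_env=None):
    """Copy an environment and set PVMNETSOCKPORT to the starting port."""
    env = dict(os.environ if base_env is None else base_env)
    env["PVMNETSOCKPORT"] = str(start_port)
    return env


if __name__ == "__main__":
    # 5000 and 100 are illustrative values, not PVM defaults.
    first, last = pvm_port_plan(5000, 100)
    print(f"open ports {first}-{last} in the firewall on every host")
    print(f"export PVMNETSOCKPORT={first}")
```

The same starting port and the same open range would need to be in place on every machine in the virtual machine before the pvmds are started.
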
> To find out, this leaves me with running e.g. tcpdump and watching as
> pvm attempts to connect, opening port ranges one at a time and doing a
> binary search, or something similarly painful. Or just asking you. So
> what (minimal set of) ports do I need to leave open besides ssh, which
> is always open on my systems anyway?
> An additional suggestion would be to (if possible) have the RPM install
> "fix" the port situation so that pvm shows up on system-config-firewall
> and/or finish with a message to the installer that a particular firewall
> setting must be installed or enabled and/or add something to the
> debugging info provided by pvm so that on a timeout (in particular) it
> prints something like "Unable to connect due to timeout. Verify that
> pvm is correctly installed and that port range xxxx-xxxx is open on the
You _should_ be getting some sort of timeout message in the slave
pvmd's log file (/tmp/pvml.<uid> on the slave machine), when the
connection request to the master pvmd doesn't get a reply...?
It may depend on the firewall settings, but a nice "Connection
Refused" would usually go a long way toward diagnosing things,
whereas the more secure firewall alternative of simply
"no response" would only result in a "timed out" PVM message...
I'm open to suggestions on ways to identify or diagnose the problem...!
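
One rough way to tell those two firewall behaviours apart, outside of PVM, is a quick connection probe. The sketch below is only an illustration: probe_tcp_port is a made-up helper, it checks TCP only, and it is not something PVM does itself.

```python
import socket


def probe_tcp_port(host, port, timeout=3.0):
    """Classify one TCP connection attempt.

    'open'      : something accepted the connection
    'refused'   : the host (or firewall) actively rejected it
    'timed out' : no reply at all, typical of a firewall that drops packets
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timed out"
    except OSError as exc:
        # Anything else (no route, name lookup failure, ...) is reported as-is.
        return f"error: {exc}"
```

Calling probe_tcp_port with the master's hostname and a port inside the range you opened gives a quick hint: "refused" means the packets arrive but nothing is listening (or the firewall rejects them), while "timed out" points at packets being silently dropped.
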
Thanks Much for your interest and feedback!
All the Best,
> I actually help a lot of people get started with PVM (they write me
> offline because I have a template PVM tarball up on my personal website)
> and the more I know, the better I can help them...;-)
> Robert G. Brown Phone(cell): 1-919-280-8443
> Duke University Physics Dept, Box 90305
> Durham, N.C. 27708-0305
> Web: http://www.phy.duke.edu/~rgb
> Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
> Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977
James Arthur "Jeeembo" Kohl, Ph.D.
Oak Ridge National Laboratory
kohlja at ornl.gov
http://www.csm.ornl.gov/~kohl/

"Da Blooos Brathas?! They still owe you money, Fool!"
Long Live Curtis Blues!!!