[Beowulf] Re: PVM on wireless...
Robert G. Brown
rgb at phy.duke.edu
Wed Feb 6 13:42:04 PST 2008
On Wed, 6 Feb 2008, David Mathog wrote:
>> Anybody on list have any idea why PVM fails to add hosts over a wireless
>> link? I've now tried this over multiple distro versions and at least one
>> PVM update, and it just doesn't work. Works fine over a wire, fails on
>> wireless, and as far as I know wire and wireless are both "identical"
>> at the kernel interface layer so that any e.g. socket one might open is
>> absolutely ecumenical about what the underlying hardware is (good old
>> ISO/OSI layering, right?).
> Sounds like multiple network hell, with some type of name mismatch
> causing the problems. Start up pvmd directly on one of the wireless
> machines and then use pvm to see what it calls itself. If that
> differs in any way from the entries in your host list then that is
> probably the problem. If they come up the same then run pvmd with its
> -d debug settings to find out more information.
> It is also possible the firewall settings are different, and the wired
> interface allows pvm connections in some way that the wireless does not.
> Did you try starting pvmd on a pure wireless machine and see if it can
> connect to other pure wireless machines? It would be good to get the
> wired interfaces completely out of the equation.
Any connection with wireless on at least one end fails. Or if you like,
only wire-to-wire succeeds.
And I HAVE been doing TCP/IP sysadmin for about twenty-one years now,
pro-grade linux for twelve-plus. I really don't think that there is
much of a chance left that there is any trivial networking error
underlying this, as of course I've checked this pretty carefully, in two
completely different instances with significant changes to my home
network: different primary server, different WAP, different wireless
cards, different laptops, NM managing wireless now where then I did it
by hand, kernels 2.4 vs 2.6. As I said, the mapping between IP number
and slave pvmd is exactly correct, as are all host table entries. ping
works by name or IP to the same IP(s), ssh works by name or IP, http
works ditto, wulfware works ditto (and shows both interfaces). Yet the
symptoms are exactly the same. It works up to a point about half-way
through the handshaking and then, AFTER the remote daemon is
successfully spawned with the right lockfiles and IP numbers visible to
ps with ww, it freezes until something times out, then it fails while
claiming that it succeeded in adding the remote host.
I can literally snap the same box onto a wire, wait for it to get an IP
number on the wire, and rerun the experiment on the same hardware and it
works perfectly (with a different but identically entered name, of
course). And it is the wireless name that corresponds with the
`hostname` (in /etc/sysconfig/network), not that this should matter (and
it doesn't on the wire).
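For anyone who wants to repeat the name-to-IP checks described above, here is a minimal sketch in Python. It is not part of the original exchange and it says nothing about what pvmd does internally; it only compares what the resolver returns for each name against the address you expect the host table to give. The function names and the entries in the usage example are hypothetical placeholders.

```python
import socket


def resolve_ipv4(name):
    """Return the sorted IPv4 addresses that `name` resolves to ([] on failure)."""
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def name_mismatches(expected):
    """Compare resolver results against an expected name -> IP mapping.

    `expected` is a dict such as {"laptop-wifi": "192.168.1.20"} (placeholder
    values). Returns a dict of name -> resolved addresses for every name whose
    expected address is missing from what the resolver actually returned.
    """
    bad = {}
    for name, ip in expected.items():
        found = resolve_ipv4(name)
        if ip not in found:
            bad[name] = found
    return bad


if __name__ == "__main__":
    me = socket.gethostname()
    print("hostname:", me, "resolves to:", resolve_ipv4(me))
```

Run it on both ends with the wireless and wired names filled in, e.g. `name_mismatches({"laptop-wifi": "192.168.1.20"})`; an empty result only means the resolver agrees with the table, which is the situation reported here.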
That's not to say that I can't make a mistake -- only that I've checked
all the really obvious ones and EVERYTHING ELSE works perfectly and
universally independent of wire vs wireless. I snap in a wire in my
office, snap it off the wire and onto wireless, and back again, back and
forth home to office many times per boot. After about ten days of this
NM will sometimes destabilize as maybe the wireless card fails to hold
state perfectly, but in the meantime every network-using tool BUT pvm
just works, exactly as one expects.
> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
Robert G. Brown Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977