[Beowulf] PVM on wireless...

Robert G. Brown rgb at phy.duke.edu
Thu Feb 7 08:08:46 PST 2008

On Wed, 6 Feb 2008, Jim Lux wrote:

> Ahh.. but there is a "routing" process of sorts inside your AP.  It has to 
> bridge from the 802.3 wired world to the 802.11 wireless world, and that 
> usually involves some store and forward type processing.  Some of these are 
> implemented as a store and forward router (e.g. home firewall) with one of 
> the logical ports connected to the wireless modem.  Very, very few access 
> points (if any) are actually a dumb packet oriented bridge that just unwraps 
> the payload from one frame type and shoots it out rewrapped for the other. 
> The AP has to do things like send out broadcast frames with the SSID, send 
> and receive the link setup/teardown kinds of frames (i.e. the link between 
> your PC's wireless interface and the AP), as well as bridging/routing traffic 
> from the wired network to the wireless network.
> Who's to say what kind of logic they have inside there to deal with all the 
> issues (the wireless MAC and the wired MAC are different, if nothing else.)

No arguments, but...

As far as the programmer API is concerned, IP is IP is IP, TCP is even
more removed.  The whole point of TCP, in fact, is that one is NOT
supposed to need to know or care if the packet one is wrapping up for
some destination is about to go out on a wire or wireless link, travel
over copper or fiber, pass through hubs, bridges, routers, switches.  A
properly formed packet that isn't in a channel controlled by e.g.
firewalls or port blockers is "guaranteed" to reach its destination, if
its destination be reachable and correctly bidirectionally routed, and
even to be resequenced and/or retransmitted if need be until the entire
message is at least "reasonably" accurately received by the receiver.

UDP is somewhat different.  It is a connectionless protocol, for one
thing.  However, the most important difference is that it is not a
"reliable" protocol -- it is close to what one might call "raw" IP.
Form a packet, drop it on the wire, pray that it is received.  If it is
part of a sequence of packets, pray that they are received in the
correct order, as WAN connections may well switch routes or delay
individual packets en route as the circumstances of traffic dictate or
lose a packet altogether.
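
To make the "pray that they are received in the correct order" point
concrete, here is a minimal Python sketch of what a receiver can learn
if each datagram carries a sequence number.  The 4-byte header layout
and the function names are my own assumptions for illustration; they are
not PVM's wire format.

```python
import struct

# Assumed header: one 4-byte big-endian sequence number (hypothetical layout).
HEADER = struct.Struct("!I")


def make_datagram(seq, payload):
    """Prefix the payload with its sequence number."""
    return HEADER.pack(seq) + payload


def read_seq(datagram):
    """Extract the sequence number from the front of a datagram."""
    return HEADER.unpack_from(datagram)[0]


def summarize_arrivals(datagrams, expected_count):
    """Report which datagrams were lost, duplicated, or arrived late."""
    seen = set()
    duplicates = 0
    out_of_order = 0
    highest = -1
    for datagram in datagrams:
        seq = read_seq(datagram)
        if seq in seen:
            duplicates += 1      # same datagram delivered twice
            continue
        seen.add(seq)
        if seq < highest:
            out_of_order += 1    # arrived after a later datagram
        else:
            highest = seq
    missing = sorted(set(range(expected_count)) - seen)
    return {"missing": missing,
            "duplicates": duplicates,
            "out_of_order": out_of_order}
```

UDP itself does none of this bookkeeping; anything like it has to live
in the application.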

Services built on UDP (PVM and at one time NFS) have to basically
replicate TCP's e.g. packet sequencing and reliable delivery checks in
order to become reliable.  Ordinarily UDP-based services are
non-critical, and they are usually offered only "on the same wire" -- on
a network without a lot of routing hops in between (switched connections
or single-hop bridges don't usually constitute a problem) -- unless UDP
is augmented to make it reliable, and even then it is RARE to run a
UDP-based service over a WAN AFAIK.
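
As a rough illustration of what "replicating TCP" on top of an
unreliable transport involves, here is a stop-and-wait sketch in Python:
number each message, acknowledge it, and retransmit until the
acknowledgement arrives.  The LossyChannel class is a made-up simulation
standing in for the network, and none of this is taken from the PVM
sources, which may well do it differently.

```python
class LossyChannel:
    """Simulated link that drops the transmissions whose index is in drop_at."""

    def __init__(self, drop_at=()):
        self.drop_at = set(drop_at)
        self.count = 0

    def deliver(self, item):
        index = self.count
        self.count += 1
        return None if index in self.drop_at else item


def send_reliably(messages, channel, max_tries=5):
    """Stop-and-wait: retransmit each message until its ack comes back."""
    delivered = []       # what the receiving application sees
    expected = 0         # receiver state: next sequence number to accept
    transmissions = 0
    for seq, payload in enumerate(messages):
        for _ in range(max_tries):
            transmissions += 1
            packet = channel.deliver((seq, payload))
            if packet is None:
                continue             # data lost: "time out" and resend
            rseq, rpayload = packet
            if rseq == expected:     # new data, hand it to the application
                delivered.append(rpayload)
                expected += 1
            # The receiver acks what it got, even a duplicate, so a lost
            # ack does not cause the payload to be delivered twice.
            ack = channel.deliver(rseq)
            if ack == seq:
                break
        else:
            raise TimeoutError("no ack for message %d" % seq)
    return delivered, transmissions
```

With data transmissions 0 and 3 dropped, three messages still arrive
intact and in order, at the cost of two extra sends.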

I still don't seriously suspect the WAP per se, because every other
service in the Universe, TCP or UDP or ICMP, that I've used over
wireless works perfectly, always.  Oh, the connection itself isn't
horribly reliable -- turn on the microwave oven, drop the link, load the
device heavily, links get a bit flaky -- but EVERYthing works when the
link is up and solid.  To the best of my ability to test it (which isn't
terribly shabby, given nmap after all), it is transparent to IP from
broadcasts on down to individual ports on the local bridged 192.168.1.x
network, in both directions.

What is different on a WAP is timing (e.g. latency).  As you say,
there's a fair bit of out-of-band traffic associated with wireless
links.  My MIMO router up to the very latest firmware upgrade would
generate all sorts of spurious traffic that I suspect was associated
with link optimization and so on, but of course it was difficult to know
for sure as it was largely out of band.  Even so, however, it is really,
really odd that PVM has a segment that is so sensitive (or so unusual in
terms of its socket code) that it fails while everything else works.
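
One way to put numbers on the timing difference would be a small UDP
round-trip probe run against a wired and a wireless host in turn.  The
sketch below is Python and assumes a UDP echo service is listening on
the target; the host, port, and function names are placeholders of mine,
not anything PVM provides.

```python
import socket
import statistics
import time


def probe_rtts(host, port, count=20, timeout=1.0):
    """Send small UDP datagrams to an (assumed) echo service.

    Returns one round-trip time in milliseconds per probe, or None
    where no matching reply arrived before the timeout.
    """
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for seq in range(count):
            payload = seq.to_bytes(4, "big")
            start = time.perf_counter()
            sock.sendto(payload, (host, port))
            try:
                data, _ = sock.recvfrom(64)
                while data != payload:       # skip stale replies
                    data, _ = sock.recvfrom(64)
                rtts.append((time.perf_counter() - start) * 1000.0)
            except socket.timeout:
                rtts.append(None)
    return rtts


def summarize(rtts):
    """Reduce a list of RTT samples to loss count and min/median/max."""
    ok = [r for r in rtts if r is not None]
    lost = len(rtts) - len(ok)
    if not ok:
        return {"lost": lost, "min": None, "median": None, "max": None}
    return {"lost": lost,
            "min": min(ok),
            "median": statistics.median(ok),
            "max": max(ok)}
```

Comparing summarize(probe_rtts(...)) for a wired node and a wireless
node would at least show whether latency or loss differs enough to
matter.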

Anyway, it sounds like the general answer is that nobody on list has
really encountered this or knows what is causing it, so I guess my
choices are to grab the PVM sources, do a build, do a -d0x0 run to
isolate once again the precise point where the process of adding a
wireless host fails, instrument the code to the point (possibly on both
ends -- it could be the target PVMD as easily as the master) where I can
actually see what is or isn't getting through, and then either figure
out why and "properly" fix it or muck around with the code to where the
problem goes away even though I don't know why (by e.g. inserting
"arbitrary" delays here or there to give a wireless network time to
catch up and avoid a race, which sucks I agree but which often works
anyway...;-) OR to just blow it off again and live with it, like I did
last time.
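
For what it's worth, the "insert a delay and try again" workaround
usually ends up looking something like the following.  This is a generic
Python sketch with invented names, not a patch against PVM (which is
written in C), and it hides a race rather than fixing it.

```python
import time


def retry_with_delay(operation, attempts=4, first_delay=0.05,
                     factor=2.0, sleep=time.sleep):
    """Call operation(); on OSError wait, lengthen the delay, and retry.

    Returns (result, attempt_number).  Re-raises the last OSError if
    every attempt fails.  The sleep argument exists so the waiting can
    be stubbed out when testing.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation(), attempt
        except OSError:
            if attempt == attempts:
                raise
            sleep(delay)         # give the slow link time to catch up
            delay *= factor      # wait longer before the next try
```

Here operation would be whatever step fails on the wireless host, for
example a hypothetical function that sends the request and waits for
the reply.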

Or I suppose I could always file a bugzilla report and hope that it
filters back to the developers who actually know the code and can
properly fix it.  Hmmm, time time time.  Who has the time.


> Jim Lux

Robert G. Brown                            Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Web: http://www.phy.duke.edu/~rgb
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977
