[Beowulf] cluster for doing real time video panoramas?
Jim Lux
James.P.Lux at jpl.nasa.gov
Wed Dec 21 13:29:57 PST 2005
At 12:48 PM 12/21/2005, Robert G. Brown wrote:
>On Wed, 21 Dec 2005, Jim Lux wrote:
>
>>OK all you cluster fiends.. I've got a cool application (for home, sadly,
>>not for work where I'd get paid to fool with it)..
>>
>>I've got a robot on which I want to put a half dozen or so video cameras
>>(video in that they capture a stream of images, but not necessarily that
>>put out analog video..) with overlapping fields of view. I've also got
>>some telemetry that tells me what the orientation of the robot is. I
>>want to take the video streams and stitch them (in near real time) into a
>>spherical panorama, that I can then render from a corrected viewpoint
>>(based on orientation) to "stabilize" the image.
>
>Goal being to get spherical panorama in 2d, or to reconstruct
>polynocular representation of 3d facing surfaces?

A spherical panorama in 2D. Reconstructing the 3D surfaces is a
substantially harder problem, although there is some work out there on how
to do it, at least in a theoretical way.
> That is, is the robot
>trying to "know where it is" and how far away things are in its field(s)
>of view, or just generating a 2d point-projective representation of
>incoming light...
A 2D projection, so I can be tele-immersed (which is a service mark of a
company called something like Immersive Imaging... they make an 11-camera
ball called the Dodeca, with software, etc.).
>Second question is how are you going to handle the map? Patches?
>Spherical coords? A triangular covering? I ask because (of course)
>there is no nice mapping between the cartesian coords implicit in most
>e.g. video cameras and spherical coordinates e.g. \theta,\phi of a point
>projective view I(\theta,\phi) (incoming light intensity as a function
>of position on the projective sphere). This leaves you with a long-term
>problem in handling nonlinear cartesian to whatever pixel
>transformations as well as an addressing problem.
This is what Dersch's panotools handles: it does the nonlinear
transformation from one coordinate space to another, based on some camera
parameters (direction of look, focal length, lens aberrations, etc.).
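Just to make that concrete, here's a rough sketch of the output-to-camera
mapping. This is not panotools' actual code, just the standard
equirectangular/pinhole math with made-up names (camera_t, pano_to_camera),
and it ignores lens distortion:

#include <math.h>

typedef struct {
    double r[3][3];   /* world-to-camera rotation (direction of look) */
    double fx, fy;    /* focal length, in pixels                      */
    double cx, cy;    /* principal point                              */
} camera_t;

/* Map output pixel (u,v) of a W x H equirectangular panorama to a pixel
   (x,y) in one camera's image.  Returns 0 if the point is behind the
   camera. */
int pano_to_camera(const camera_t *cam, int W, int H,
                   double u, double v, double *x, double *y)
{
    double theta = (u / W) * 2.0 * M_PI - M_PI;   /* longitude */
    double phi   = M_PI_2 - (v / H) * M_PI;       /* latitude  */

    /* direction on the unit sphere -- the transcendental part */
    double d[3] = { cos(phi) * cos(theta),
                    cos(phi) * sin(theta),
                    sin(phi) };

    /* rotate into the camera frame */
    double c[3];
    for (int i = 0; i < 3; i++)
        c[i] = cam->r[i][0]*d[0] + cam->r[i][1]*d[1] + cam->r[i][2]*d[2];

    if (c[2] <= 0.0)
        return 0;                                 /* behind the camera */

    *x = cam->fx * c[0] / c[2] + cam->cx;         /* perspective divide */
    *y = cam->fy * c[1] / c[2] + cam->cy;
    return 1;
}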
> As in this could be
>the MOST expensive part of things computationally, as it involves
>transcendental function calls that are some 1000x slower than ordinary
>flops in a linear transform.
However, once you know what the mapping is from some camera pixel (x,y) to
your spherical panorama (x,y), it stays constant (unless the cameras move
relative to each other)...

So, if you have, say, 6 images of 640x480 that map to a 2000x1000 pixel
projection, you only need the 10 million or so constants (actually, a bit
more... you probably need to do some interpolation, say over a 3x3 pixel
grid, and you have to do colors, etc.).
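A sketch of what that table of constants might look like (again, my own
made-up names -- remap_t, build_remap -- building on the mapping sketch
above, and using a 2x2 bilinear neighborhood instead of 3x3 to keep it
short):

#include <stdlib.h>

/* Precomputed remap table: one entry per output pixel.  Built once
   (the cameras are fixed relative to each other), reused every frame. */
typedef struct {
    int   cam;        /* which input camera, -1 if unmapped     */
    int   src[4];     /* indices of the 2x2 source neighborhood */
    float w[4];       /* bilinear weights, sum to 1             */
} remap_t;

remap_t *build_remap(const camera_t *cams, int ncams,
                     int in_w, int in_h,      /* e.g. 640 x 480   */
                     int out_w, int out_h)    /* e.g. 2000 x 1000 */
{
    remap_t *map = malloc((size_t)out_w * out_h * sizeof *map);
    for (int v = 0; v < out_h; v++) {
        for (int u = 0; u < out_w; u++) {
            remap_t *m = &map[v * out_w + u];
            m->cam = -1;
            for (int k = 0; k < ncams; k++) {
                double x, y;
                if (!pano_to_camera(&cams[k], out_w, out_h,
                                    u + 0.5, v + 0.5, &x, &y))
                    continue;
                int x0 = (int)x, y0 = (int)y;
                if (x0 < 0 || y0 < 0 || x0 + 1 >= in_w || y0 + 1 >= in_h)
                    continue;          /* outside this camera's frame */
                double fx = x - x0, fy = y - y0;
                m->cam    = k;
                m->src[0] =  y0      * in_w + x0;
                m->src[1] =  y0      * in_w + x0 + 1;
                m->src[2] = (y0 + 1) * in_w + x0;
                m->src[3] = (y0 + 1) * in_w + x0 + 1;
                m->w[0] = (float)((1 - fx) * (1 - fy));
                m->w[1] = (float)(fx       * (1 - fy));
                m->w[2] = (float)((1 - fx) * fy);
                m->w[3] = (float)(fx       * fy);
                break;   /* take the first camera that sees this pixel */
            }
        }
    }
    return map;
}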
But at a fundamental level, you wind up with an equation like:

  for all output pixels i:
    outputpixel(i) = a1(i)*inputpixel(j1(i)) + a2(i)*inputpixel(j2(i)) +
                     a3(i)*inputpixel(j3(i)) + ...
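In C, with a table like the one above, the per-frame work is just a
table-driven multiply-accumulate -- no transcendentals left in the inner
loop (names are mine; one color channel shown, so you'd call it per
channel):

/* Per-frame remap through the precomputed table.  inputs[k] is camera
   k's frame, one byte per pixel for the channel being processed. */
void remap_frame(const remap_t *map, unsigned char *const inputs[],
                 unsigned char *output, int npix)
{
    for (int i = 0; i < npix; i++) {
        const remap_t *m = &map[i];
        if (m->cam < 0) { output[i] = 0; continue; }
        const unsigned char *in = inputs[m->cam];
        float acc = m->w[0] * in[m->src[0]] + m->w[1] * in[m->src[1]] +
                    m->w[2] * in[m->src[2]] + m->w[3] * in[m->src[3]];
        output[i] = (unsigned char)(acc + 0.5f);
    }
}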
And the problem is somewhat partitionable, because big chunks of the output
image come mainly from one input image, with two inputs needed only along
the join lines.
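Which is roughly how you'd split it across nodes: give each node a
horizontal band of output rows (plus whatever input frames that band
touches), run the same table-driven loop, and gather the bands. A rough
MPI-flavored sketch, assuming for simplicity that the remap table and the
input frames are available on every node:

#include <mpi.h>
#include <stdlib.h>

/* Each rank remaps its own band of output rows, then the bands are
   gathered on rank 0.  Distributing only the needed input frames to
   each rank is left out for brevity. */
void remap_parallel(const remap_t *map, unsigned char *const inputs[],
                    unsigned char *full_output,    /* valid on rank 0 */
                    int out_w, int out_h)
{
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int rows  = (out_h + nprocs - 1) / nprocs;     /* rows per rank */
    int first = rank * rows;
    if (first + rows > out_h) rows = out_h - first;
    if (rows < 0) rows = 0;

    int npix = rows * out_w;
    unsigned char *band = malloc(npix > 0 ? (size_t)npix : 1);

    /* same table-driven loop as before, over this rank's band only */
    remap_frame(&map[first * out_w], inputs, band, npix);

    /* the last rank's band may be shorter, so gather with counts */
    int *counts = NULL, *displs = NULL;
    if (rank == 0) {
        counts = malloc(nprocs * sizeof *counts);
        displs = malloc(nprocs * sizeof *displs);
        for (int r = 0; r < nprocs; r++) {
            int rrows  = (out_h + nprocs - 1) / nprocs;
            int rfirst = r * rrows;
            if (rfirst + rrows > out_h) rrows = out_h - rfirst;
            if (rrows < 0) rrows = 0;
            counts[r] = rrows * out_w;
            displs[r] = rfirst * out_w;
        }
    }
    MPI_Gatherv(band, npix, MPI_UNSIGNED_CHAR,
                full_output, counts, displs, MPI_UNSIGNED_CHAR,
                0, MPI_COMM_WORLD);

    free(band); free(counts); free(displs);
}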
> rgb
>
>(who faces similar problems in spherical decompositions in some of his
>research, and finds them to be a real PITA.)
>
>
>--
>Robert G. Brown http://www.phy.duke.edu/~rgb/
>Duke University Dept. of Physics, Box 90305
>Durham, N.C. 27708-0305
>Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
>
James Lux, P.E.
Spacecraft Radio Frequency Subsystems Group
Flight Communications Systems Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875