[Beowulf] cluster for doing real time video panoramas?
James.P.Lux at jpl.nasa.gov
Wed Dec 21 20:11:21 PST 2005
At 04:41 PM 12/21/2005, Bogdan Costescu wrote:
>[ I think that most of what I write below is quite OT for this list...
>Apologies to those that don't enjoy the subject! ]
>On Wed, 21 Dec 2005, Jim Lux wrote:
> > I've got a robot on which I want to put a half dozen or so video
> > cameras (video in that they capture a stream of images, but not
> > necessarily that put out analog video..)
>It's not entirely clear to me what you want to say above... How is the
>video coming to the computer ? You are later mentioning 1394 cameras,
>so I assume something similar to the DV output from common camcorders.
Could be anything that works (preferably cheap!)... I had originally
contemplated using analog video cameras and some sort of frame grabber, but
the firewirestuff folks have a nifty 1394 video camera that essentially
rolls the camera and digitizer into one widget.
> > I've also got some telemetry that tells me what the orientation of
> > the robot is.
>Does it also give info about the orientation of the assembly of
Yes.. in the sense that you know the orientation of each camera relative to
the body, and you know the orientation of the body. Say you've got 6
cameras, pointed along the x, y, and z axes. For instance, the cameras
pointed along the x axis have the long direction of the frame in the y
direction and the short in z, the cameras on the y axis have the long
direction in z and the short in x, and the cameras on z have the long
direction in x and the short in y. If the short direction covers 90 degrees,
then you'll have overlap (I think..)
Since the cameras are "bolted down", once you've done the calibration to
establish the relative orientations of the cameras, you're all set.
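
To make the "camera relative to body, body relative to world" idea concrete, here is a rough sketch in Python. The six axis-aligned mounting directions and the function names are assumptions for illustration only; real values would come out of the one-time calibration, and a real attitude would be a full 3-axis rotation rather than just the yaw shown here.

```python
import math

def rot_z(angle_rad):
    """Rotation matrix for a yaw of angle_rad about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]

# Hypothetical mounting: one camera looking along each of +/-x, +/-y, +/-z
# in the body frame. These stand in for calibrated values.
CAMERA_AXES_BODY = {
    "+x": [1.0, 0.0, 0.0], "-x": [-1.0, 0.0, 0.0],
    "+y": [0.0, 1.0, 0.0], "-y": [0.0, -1.0, 0.0],
    "+z": [0.0, 0.0, 1.0], "-z": [0.0, 0.0, -1.0],
}

def camera_axes_world(body_to_world):
    """Optical axis of every camera in the world frame, for one body attitude."""
    return {name: mat_vec(body_to_world, axis)
            for name, axis in CAMERA_AXES_BODY.items()}

# Example: body yawed 90 degrees, so the +x camera now looks along world +y.
axes = camera_axes_world(rot_z(math.pi / 2))
```

The point is only that the per-camera part is constant; the telemetry supplies the one matrix that changes from frame to frame.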
> I'm thinking of the usual problem: is the horizon line going
>down or is my camera tilted ? Although if you really mean spherical
>(read below), this might not be a problem anymore as you might not
>care about what horizon is anyway.
> > I want to take the video streams and stitch them (in near real time)
> > into a spherical panorama
>Do you really mean spherical or only circular (the example that you
>gave being what I call circular) ? IOW: are the focal axes of the
>cameras placed only in a plane (or approximately, given alignment
I'm thinking spherical (or, at least, an approximation to spherical, given
that the focal points of the lenses can't be coincident).
>Given that I have a personal interest in video, I thought about
>something similar: not a circular or spherical coverage, but at least
>a large field of view from which I can choose when editing the
>"window" that is then given to the viewer - this comes from a very
>mundane need: I'm not such a good camera operator, so I always miss
>things or make a wrong framing. So this would be similar to how
>Electronic (as opposed to optical) Image Stabilization (EIS) works,
>although EIS does not need any transformation or calibration as the
>whole big image comes from one source.
Exactly.. If I create an image (by compositing the 6) that covers the
sphere, then, all I have to do is "look" in the direction of the
orientation of the robot to stabilize the image. Imagine a sphere rolling
on the floor, or sinking down through a kelp forest.
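
As a sketch of what "looking" in a fixed direction could mean, suppose (my assumption, nothing is decided) that the composite is stored as an equirectangular image in the body frame, with longitude across and latitude down. Then stabilizing is just rotating the desired world direction into the body frame and finding where that lands in the image. All names below are made up for illustration.

```python
import math

def transpose(m):
    """Transpose of a 3x3 matrix (the inverse, for a rotation matrix)."""
    return [[m[c][r] for c in range(3)] for r in range(3)]

def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]

def direction_to_pixel(d, width, height):
    """Map a direction to (col, row) in an equirectangular image."""
    x, y, z = d
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    lon = math.atan2(y, x)                           # -pi .. pi
    lat = math.asin(max(-1.0, min(1.0, z / norm)))   # -pi/2 .. pi/2
    col = (lon + math.pi) / (2.0 * math.pi) * width
    row = (math.pi / 2.0 - lat) / math.pi * height
    return int(col) % width, min(int(row), height - 1)

def stabilized_view_center(body_to_world, world_dir, width, height):
    """Pixel in the body-frame panorama that shows a fixed world direction."""
    body_dir = mat_vec(transpose(body_to_world), world_dir)
    return direction_to_pixel(body_dir, width, height)

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
center = stabilized_view_center(IDENTITY, [1.0, 0.0, 0.0], 2048, 1024)
```

As the robot tumbles, the view center slides around the panorama while the displayed scene stays put.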
>All that I write below starts from the assumption that the cameras are
>mounted on an assembly in a "permanent" position, such that their
>relative positions (one camera with respect to another) do not change.
That's a valid assumption.
>Also that you don't zoom or that you can control the zoom on all
>cameras simultaneously; otherwise putting all the movies together is
>probably hard (in the circular setup; but doable probably with motion
>vectors or related stuff that is already used in MPEG4 compression) or
>impossible (in the spherical setup where you'd miss parts of the
Indeed.. all cameras are fixed, in terms of lens.
>First step should be the calibration of the cameras with respect to
>each other. In the COTS world, I don't think that you'd be able to get
>cameras to be fixed such that they equally split the space between
>them (so that the overlap between any 2 cameras would be the same);
>then you also need color calibration, sound level calibration (with
>directional mics, otherwise it makes no sense) and so on -
>multi-camera setups are rather difficult to master for an amateur
>(like me, at least :-))
Actually, that's fairly straightforward. PTGui does it quite nicely for
still frames, shot arbitrarily. See, for instance, my very first attempt
some years ago: http://www.luxfamily.com/pano1.htm
It even did all the color and exposure cals (although pretty clunky in that
So, I could capture a few frames with my robot in a "nice" environment
(with calibration points sufficiently far away), hand calibrate, and save
the camera transformation parameters. There are some folks at work (JPL) who
have brought this process to a fine art with automation, but that's
probably overkill (they use big boards with arrays of big white dots to
calibrate, e.g., the cameras on the Mars rovers)
>Another problem that you might face is the
>frame synchronization between the cameras which might come into play
>for moving objects.
That would be a huge problem (which I am ignoring for now!)
>Talking about moving, IMHO you need to have progressive output from
>the cameras. Interlaced movies would probably create artifacts when
>joining together; deinterlacing several video streams at once might be
>a nice application (but very coarse grained - f.e. one stream per CPU)
>for a cluster, but the results might still not be "perfect", as with
>progressive output, as the deinterlacing results for the same part of
>the scene taken from several cameras might be different.
Indeed.. although some empirical evidence (the movies from the guy with the
Mac and 6 cameras driving around) shows that's not all that bad.
>If the position of the cameras can be finely modified, it might be a
>good idea to try to get them close to the ideal equal splitting of the
>space by just looking at their output. But in any case, if the cameras
>are fixed with respect to each other, you don't need to calculate the
>overlapping regions for every frame - which means that the final frame
>size can also be known at this time; knowing the overlapping also
>makes it easy to arrange the blending parameters. If you want a spherical
>setup, it's quite likely to have more than 2 cameras that overlap in a
>certain place so the calibration will likely be more difficult, but
>once it's done you don't need to do it again...
>To come back to cluster usage, I think that you can treat the whole
>thing by doing something like a spatial decomposition, such that
>- each computer gets the same amount of video data (to avoid
>overloading). This way it's clear which computer takes video from
>which camera, but the amount of "real world" space that each of them
>gives is not equal, so putting them together might be difficult.
>- each computer takes care of the same amount of the "real world" space,
>so each computer provides the same amount of output data. However the
>video streams splitting between the CPUs might be a problem as each
>frame might need to be distributed to several CPUs.
It just occurred to me that another way might be to interleave frames among
CPUs (there'd be some lag, and there's the sync problem).. Say it takes 0.1
seconds to process all the cameras into a single image. Then, with three
processors, I could keep up at 30 frames/second.
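
The arithmetic behind that, as a small Python sketch (function names are mine, and the 0.1 second figure is just the guess above, not a measurement): the number of processors is the per-frame processing time multiplied by the frame rate, rounded up, and frames are dealt out round-robin.

```python
import math

def workers_needed(seconds_per_frame, frames_per_second):
    """Processors required so that interleaved frames keep up with the source.

    The product is rounded before taking the ceiling because 0.1 * 30 is
    slightly more than 3 in floating point.
    """
    return math.ceil(round(seconds_per_frame * frames_per_second, 9))

def assign_round_robin(frame_index, n_workers):
    """Which processor handles a given frame when frames are interleaved."""
    return frame_index % n_workers

n = workers_needed(0.1, 30)
schedule = [assign_round_robin(i, n) for i in range(6)]
```

Each finished frame still comes out about 0.1 seconds after it was captured, which is the lag mentioned above; interleaving buys throughput, not latency.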
> > But, then, how do you do the real work... should the camera
> > recalibration be done all on one processor?
>It's not clear to me what you call "recalibration". You mean color
>correction, perspective change and so on ? These could be done in
>parallel, but if the cameras don't move relative to each other, the
>transformations are always the same so it's probably easy to partition
>them on CPUs even as much as to get a balanced load.
That was exactly it..
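
Since the transformations never change, one way to picture the partitioning is a lookup table built once: for each output pixel, which camera supplies it. The sketch below is deliberately crude and rests on my own assumptions: six axis-aligned cameras, an equirectangular output, and "pick the camera whose axis is closest" with no lens model or blending. It then splits the output rows evenly among processors.

```python
import math

# Hypothetical camera optical axes in the body frame.
CAMERA_AXES = {
    "+x": (1.0, 0.0, 0.0), "-x": (-1.0, 0.0, 0.0),
    "+y": (0.0, 1.0, 0.0), "-y": (0.0, -1.0, 0.0),
    "+z": (0.0, 0.0, 1.0), "-z": (0.0, 0.0, -1.0),
}

def pixel_to_direction(col, row, width, height):
    """Unit direction seen through the center of an equirectangular pixel."""
    lon = (col + 0.5) / width * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (row + 0.5) / height * math.pi
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def build_camera_table(width, height):
    """For every output pixel, the name of the camera pointing closest to it."""
    table = []
    for row in range(height):
        table_row = []
        for col in range(width):
            d = pixel_to_direction(col, row, width, height)
            best = max(CAMERA_AXES,
                       key=lambda name: sum(a * b for a, b in
                                            zip(CAMERA_AXES[name], d)))
            table_row.append(best)
        table.append(table_row)
    return table

def split_rows(height, n_workers):
    """Contiguous (start, stop) row ranges, sizes differing by at most one."""
    base, extra = divmod(height, n_workers)
    ranges, start = [], 0
    for w in range(n_workers):
        stop = start + base + (1 if w < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

table = build_camera_table(8, 4)   # tiny output, just to show the idea
row_ranges = split_rows(10, 3)
```

A real table would hold source pixel coordinates and blend weights rather than just a camera name, but it would be computed once in the same way and then reused for every frame.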
> > Should each camera (or pair) get its own CPU, which builds that
> > part of the overall spherical image, and hands them off to yet
> > another processor which "looks" at the appropriate part of the video
> > image and sends that to the user?
>Well, first of all, your words suggest to me that you are talking
>about a circular setup. In a real spherical one, you should have some
>parts that overlap from at least 3 cameras (where their edges look
>like a T) so you can't talk about pairs.
You're probably right.. there are joins where three cameras see the same place.
>Secondly, all my thoughts above try to cover the case where you want
>to get at each moment a complete image out of the system, like when
>different people are watching maybe different parts of the output. If
>there's only one "window" that should be seen, then the image would
>probably come from at most 2-3 cameras; the transformation could
>probably be done on different CPUs (like in the case for full output),
>but putting them together (blending) would be easy enough to do even
>on one CPU, so not much use for a cluster there... unless the
>transformations are so CPU intensive that they can't be done in realtime,
>in which case you could send each frame to a different CPU and get the
>output with a small delay (equal to the time needed to transform one
I envision a scheme where you generate a series of full spherical frames
that you could record (e.g. onto a hard disk), and then, later you could
play it back looking in any direction. And, in real time (while driving
the robot), you could look at a subset.
>That's it, I hope that I made sense... given that it's well past
makes plenty of sense...
>IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
>Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
>Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
>E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De
James Lux, P.E.
Spacecraft Radio Frequency Subsystems Group
Flight Communications Systems Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109