Stereo Vision for Autonomous MAVs

Konstantin Schauwecker

Compared to laser scanners, which are commonly deployed on wheeled robots, cameras offer the advantage of lower weight and power consumption. It is thus tempting to rely on cameras for a payload- and energy-constrained autonomous Micro Aerial Vehicle (MAV). Compared to a monocular camera, a stereo camera pair offers the advantage of depth perception. This allows us to reconstruct the metric 3D position of any point that is observed by both cameras. Hence, a stereo camera is a rich sensor, which in principle offers an extensive three-dimensional perception of the surrounding environment.
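The metric reconstruction mentioned above can be illustrated with the standard triangulation relation for a rectified stereo pair, where depth is focal length times baseline divided by disparity. The following is a minimal sketch under that assumption; the function name, the parameter names, and the pinhole model without lens distortion are illustrative choices and are not taken from the project's code.

```python
def triangulate(u_left, v, disparity, focal_px, baseline_m, cx, cy):
    """Reconstruct a metric 3D point in the left camera frame.

    Assumes a rectified stereo pair and an ideal pinhole model:
    focal_px is the focal length in pixels, baseline_m the camera
    separation in meters, and (cx, cy) the principal point.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    # Depth is inversely proportional to disparity.
    z = focal_px * baseline_m / disparity
    # Back-project the pixel through the pinhole model at that depth.
    x = (u_left - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z
```

Because depth is inversely proportional to disparity, a fixed matching error in pixels translates into a depth error that grows quadratically with distance.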

Our quadrotor research platform with two stereo systems, as seen from front and bottom.

The construction of an autonomous MAV that primarily relies on stereo vision requires us to find solutions for several problems. The first problem is of course stereo matching, which allows us to reconstruct the 3D position of points that are observed by both cameras. Next, the MAV has to be able to determine its current pose, i.e. its 3D position and orientation. Finally, the MAV has to be able to map its environment, in order to determine which space is traversable and which one is not.

For stereo matching, we developed a new and efficient method. Unlike most current research on stereo vision, this method is sparse, which means that it only delivers matching results for a small set of salient image features. This enables the method to achieve high processing rates, which allows us to process the camera images at the video frame rate of 30 Hz. The accuracy of the stereo method is improved by a new feature detector, which was specifically designed for stereo matching. Although our new stereo matching method is sparse, it densely examines the valid disparity range in the opposite matching direction for each stereo correspondence it finds. This allows us to identify features that received inconsistent matching results, or whose matching results are not sufficiently unique. Once these features have been removed, the remaining features show a high matching accuracy.
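The combination of sparse forward matching with a dense check in the opposite direction can be sketched as follows. This is only an illustration of the idea, not the published method: the sum-of-absolute-differences cost, the window radius, the uniqueness ratio, and all function names are assumptions made for the example. Features are assumed to lie on the same row of a rectified image pair and at least `r` pixels away from the image border.

```python
import numpy as np

def patch_cost(left, right, y, xl, xr, r=2):
    """Sum of absolute differences between two square windows (assumed cost)."""
    a = left[y - r:y + r + 1, xl - r:xl + r + 1].astype(np.float64)
    b = right[y - r:y + r + 1, xr - r:xr + r + 1].astype(np.float64)
    return float(np.abs(a - b).sum())

def match_feature(left, right, y, xl, right_features, max_disp, r=2, uniqueness=0.7):
    """Return the disparity of the left feature at (xl, y), or None if rejected."""
    # Sparse step: only right-image features within the disparity range are candidates.
    candidates = [xr for xr in right_features if 0 <= xl - xr <= max_disp]
    if not candidates:
        return None
    costs = [patch_cost(left, right, y, xl, xr, r) for xr in candidates]
    xr_best = candidates[int(np.argmin(costs))]

    # Dense step in the opposite direction: test every left column in the
    # valid disparity range of the matched right position.
    x_max = min(xr_best + max_disp, left.shape[1] - r - 1)
    xs = np.arange(xr_best, x_max + 1)
    back = np.array([patch_cost(left, right, y, int(x), xr_best, r) for x in xs])
    order = np.argsort(back)

    # Consistency: the best backward match must be the original feature.
    if xs[order[0]] != xl:
        return None
    # Uniqueness: the best cost must be clearly lower than the second best.
    if len(order) > 1 and back[order[0]] >= uniqueness * back[order[1]]:
        return None
    return xl - xr_best
```

In this sketch the forward pass stays cheap because it only compares features, while the backward pass spends its dense effort only on the few correspondences that were actually found.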

View seen from front and bottom camera pair, with overlaid stereo matching and ground plane detection results.

We use this fast and accurate stereo matching method for tracking the pose of our MAV. For this task, we use a visual SLAM method that processes both image and depth information from a forward-facing stereo camera pair. Because this SLAM method relies on a sparse set of image features, it integrates well with our sparse stereo matching system. To meet the performance requirements for our MAV, the SLAM method was simplified such that it only retains a small local map. Using this local SLAM method, an autonomous MAV was constructed that relies on a forward-facing stereo camera pair and an Inertial Measurement Unit (IMU) as its only sensors.

We have further developed a localization method that relies on imagery from a downward-facing stereo camera pair. For this approach it was assumed that the ground is flat and level, which is a valid assumption when flying indoors. The ground plane is detected by fitting a plane to the 3D points received from stereo matching. From this plane it is then possible to extract the MAV's current height, and its roll and pitch angles. Horizontal translations and yaw rotations are observed by using another method, which is based on frame-to-frame tracking. With this method we hence receive a full 6DoF pose estimate that can be used as an alternative to the estimate obtained by local SLAM. Both pose estimation methods were integrated on one MAV that has been equipped with two stereo camera pairs. The two redundant pose estimates are fused using an extended Kalman filter. The resulting MAV was successfully evaluated in several flight and offline-processing experiments. Compared to the first autonomous MAV prototype, this MAV exhibits a more robust and more precise pose estimation, which improves the quality of the autonomous flight.
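How height, roll and pitch follow from a fitted ground plane can be sketched with a plain least-squares fit. The example below rests on several assumptions that are not stated in the text: the 3D points are already expressed in a body frame with x forward, y right and z down, the attitude follows the yaw-pitch-roll (Z-Y-X) Euler convention, and no outlier rejection is performed. The function names are hypothetical.

```python
import numpy as np

def fit_ground_plane(points):
    """Least-squares plane through an Nx3 point array.

    Returns the unit normal, oriented from the vehicle toward the
    ground, and the distance of the body-frame origin to the plane.
    """
    centroid = points.mean(axis=0)
    # The right singular vector belonging to the smallest singular
    # value is the direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    if normal @ centroid < 0:
        normal = -normal
    return normal, float(normal @ centroid)

def roll_pitch_from_normal(normal):
    """Roll and pitch in radians from the ground normal in the body frame.

    For a level ground the normal equals the world 'down' axis seen from
    the body: (-sin(pitch), sin(roll)cos(pitch), cos(roll)cos(pitch)).
    """
    nx, ny, nz = normal
    roll = np.arctan2(ny, nz)
    pitch = np.arctan2(-nx, np.hypot(ny, nz))
    return float(roll), float(pitch)
```

The plane carries no information about yaw or horizontal position, which is why these components have to come from a separate method such as the frame-to-frame tracking described above.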

Our occupancy mapping method when applied to map an indoor corridor.

Further, we have approached the problem of environment perception by developing a new method for volumetric occupancy mapping. This method is based on the popular OctoMap approach, which creates voxel-based maps that are stored in octrees. While OctoMap has been shown to provide good results when used with measurements from laser scanners, we have demonstrated that this is not the case for dense measurements from a stereo vision system. We thus introduced an extension of OctoMap, which considers the visibility of a voxel when updating the voxel's occupancy probability. Further, the depth error of a stereo vision system is modeled and considered during the map update procedure. Despite the higher complexity of this method, it achieved shorter processing times in most of the conducted performance measurements. This result can be credited to an optimization of OctoMap's original ray casting scheme.
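To make the map update more concrete, the following sketch combines a clamped log-odds occupancy update, as used by voxel maps in the style of OctoMap, with a first-order stereo depth error model. It does not reproduce the published extension: the visibility term is omitted, the way the depth error weakens a measurement in `hit_probability` is a hypothetical choice made for illustration, and all parameter values are assumptions.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

def depth_sigma(z, focal_px, baseline_m, sigma_disp_px=0.5):
    """First-order depth standard deviation of a stereo measurement.

    From z = f * b / d follows sigma_z = z^2 / (f * b) * sigma_d.
    """
    return z * z / (focal_px * baseline_m) * sigma_disp_px

def hit_probability(z, focal_px, baseline_m, voxel_size, p_hit=0.7):
    """Hypothetical weighting: pull p_hit toward 0.5 (no information)
    once the depth error exceeds the voxel size."""
    w = min(1.0, voxel_size / depth_sigma(z, focal_px, baseline_m))
    return 0.5 + (p_hit - 0.5) * w

def update_log_odds(l_prev, p_meas, l_min=-2.0, l_max=3.5):
    """Add the measurement's log-odds and clamp the result."""
    return max(l_min, min(l_max, l_prev + logit(p_meas)))
```

Under this weighting, a distant stereo point with a large depth error changes a voxel's occupancy far less than a nearby one, and clamping keeps the voxel able to change state again after many consistent updates.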

See Also


[1] Konstantin Schauwecker and Andreas Zell. Robust and Efficient Volumetric Occupancy Mapping with an Application to Stereo Vision. In IEEE International Conference on Robotics and Automation (ICRA), pages 6102--6107, Hong Kong, China, May 2014.
[2] Konstantin Schauwecker and Andreas Zell. On-Board Dual-Stereo-Vision for the Navigation of an Autonomous MAV. Journal of Intelligent & Robotic Systems, 74(1-2):1--16, January 2014.
[3] Konstantin Schauwecker and Andreas Zell. On-Board Dual-Stereo-Vision for Autonomous Quadrotor Navigation. In International Conference on Unmanned Aircraft Systems (ICUAS), pages 332--341, Atlanta, GA, USA, May 2013. IEEE.
[4] Konstantin Schauwecker, Reinhard Klette, and Andreas Zell. A New Feature Detector and Stereo Matching Method for Accurate High-Performance Sparse Stereo Matching. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 5171--5176, Vilamoura, Algarve, Portugal, October 2012. IEEE.
[5] Konstantin Schauwecker, Nan Rosemary Ke, Sebastian A. Scherer, and Andreas Zell. Markerless Visual Control of a Quad-Rotor Micro Aerial Vehicle by Means of On-Board Stereo Processing. In 22nd Conference on Autonomous Mobile Systems (AMS), pages 11--20, Stuttgart, Germany, September 2012. Springer.

