Stereo Vision Explained: Bottlenose Cameras and the Importance of Confidence Maps

Welcome back to the Labforge foundations of Machine Vision series! In this series, we are breaking down basic concepts of machine vision into simple, bite-sized pieces that are easy to understand. Whether you are a beginner or just curious about how computers interpret and interact with the visual world, this series is for you.

In this post, we will focus our attention on stereo vision. We will discuss how a stereo camera such as Bottlenose perceives depth, enabling machines to see the world in 3D. Moreover, we’ll shine a spotlight on the often-overlooked yet invaluable confidence map and its indispensable role in the post-processing of depth data. Join us as we uncover the potential of stereo vision with the Bottlenose camera.

Depth information is lost during the formation of an image, making it difficult for a single camera to estimate the distance to objects in the real world. A stereo camera, such as the Bottlenose camera, simulates the way human eyes perceive depth by simultaneously capturing two images from slightly different viewpoints. These images are then processed to determine the depth, or distance from the camera, of objects in the scene. The depth of each pixel is determined by triangulating corresponding points between the left and right images. The following figure depicts a stereo camera observing a single point P. The view cone of each camera is also shown.

To facilitate the process of depth estimation, a stereo camera removes lens distortion and rectifies the input images using data obtained from calibration. The rectification process transforms a pair of images such that 3D points appear row-aligned on the resulting images. Stereo rectification simplifies the disparity computation problem by allowing the search for pixel correspondences on a single row. This figure from NI highlights the steps a stereo camera follows to acquire, undistort, and rectify images.

Bottlenose cameras perform undistortion and rectification as the first step of depth estimation. The camera then computes the disparity of each pixel in the left image by searching the corresponding row of the right image for its match. The disparity map records, for each pixel, the displacement between its position in the left image and the position of its matching pixel in the right image. Bottlenose uses the SGM (Semi-Global Matching) algorithm to estimate disparity. The following figure shows how the disparity of a given point is obtained by locating its corresponding pixel in the right image. The final disparity d is computed as d = xL - xR, where xL and xR are the horizontal pixel positions of the point in the left and right images, respectively.

Disparity map estimation
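
To make the row-wise search concrete, here is a minimal sketch in plain Python. It is not the SGM algorithm that Bottlenose runs: it uses a simple sum-of-absolute-differences (SAD) window match along one rectified row. The function name, the window size, and the maximum disparity are assumptions chosen for illustration only.

```python
def row_disparity(left_row, right_row, x_left, half_window=2, max_disparity=16):
    """Return d = xL - xR for the pixel at x_left using a SAD window search.

    Illustrative only: a plain block match on one rectified row,
    not the SGM algorithm used by the camera.
    """
    n = len(left_row)
    if x_left - half_window < 0 or x_left + half_window >= n:
        raise ValueError("window around x_left falls outside the row")

    patch = left_row[x_left - half_window : x_left + half_window + 1]

    best_cost, best_d = None, 0
    # A point seen at xL in the left image appears at xR = xL - d in the right image.
    for d in range(max_disparity + 1):
        x_right = x_left - d
        if x_right - half_window < 0:
            break  # the candidate window would leave the image
        candidate = right_row[x_right - half_window : x_right + half_window + 1]
        cost = sum(abs(a - b) for a, b in zip(patch, candidate))
        if best_cost is None or cost < best_cost:
            best_cost, best_d = cost, d
    return best_d


# Synthetic rectified rows: the right row is the left row shifted by 3 pixels.
left = [0, 0, 0, 0, 0, 10, 50, 90, 50, 10, 0, 0, 0, 0, 0, 0]
right = left[3:] + [0, 0, 0]
print(row_disparity(left, right, x_left=7))
```

Because the images are rectified, the search only moves along a single row, which is what keeps the correspondence problem one-dimensional.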

The depth Z, or distance from the camera to a given point P, is obtained from its disparity using the equation below, where f is the focal length of the camera (in pixels), B is the baseline (the distance between the two camera centers), and d is the disparity of the point (in pixels).

Depth map formula: Z = (f × B) / d
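
As a quick worked example, the sketch below applies the standard stereo relation, depth = focal length × baseline / disparity, to a few disparities. The focal length and baseline are made-up illustrative numbers, not Bottlenose specifications, and the function name is our own.

```python
def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Depth Z = f * B / d, in the same unit as the baseline.

    Returns None when the disparity is not positive, since no finite
    depth can be computed in that case.
    """
    if disparity_px <= 0:
        return None
    return focal_length_px * baseline_m / disparity_px


# Illustrative values only (assumed, not taken from any camera datasheet).
focal_length_px = 1000.0
baseline_m = 0.1

for d in (100.0, 50.0, 10.0, 0.0):
    print(d, disparity_to_depth(d, focal_length_px, baseline_m))
```

With these numbers, disparities of 100, 50, and 10 pixels correspond to depths of about 1 m, 2 m, and 10 m. Depth is inversely proportional to disparity, so a one-pixel disparity error matters far more for distant points than for nearby ones.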

The accuracy of this depth estimation process is directly linked to that of the underlying disparity estimation. However, due to various factors such as lighting conditions, surface texture, and occlusions, estimating the disparity of a pixel may become challenging. This makes the resulting depth inaccurate or uncertain. To mitigate this issue, a confidence score can be produced to mark how reliable each disparity is. The generated confidence map may be used to assess the reliability or certainty of the depth measurements obtained from a stereo camera.

While both disparity and depth maps provide information about the spatial layout of a scene captured by a stereo camera, they differ in the nature of the information they convey. A disparity map records, in pixels, the offset between corresponding points in the two stereo images, whereas a depth map provides absolute distance measurements in metric units. A disparity map can be further processed to represent points in 3D space. The next figure shows an example of a disparity map from a Bottlenose camera with the corresponding reprojected 3D view.
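
The reprojection step can be sketched with the pinhole camera model. The code below assumes a rectified left image with a single focal length f (in pixels) and a principal point (cx, cy). These parameter names and the helper functions are our own, and the real values would come from your camera's calibration.

```python
def reproject_pixel(u, v, disparity_px, focal_length_px, baseline_m, cx, cy):
    """Map a left-image pixel (u, v) with disparity d to a 3D point (X, Y, Z).

    Uses the pinhole model: Z = f * B / d, X = (u - cx) * Z / f,
    Y = (v - cy) * Z / f. Returns None for non-positive disparities.
    """
    if disparity_px <= 0:
        return None
    z = focal_length_px * baseline_m / disparity_px
    x = (u - cx) * z / focal_length_px
    y = (v - cy) * z / focal_length_px
    return (x, y, z)


def reproject_map(disparity_map, focal_length_px, baseline_m, cx, cy):
    """Turn a disparity map (a list of rows) into a list of 3D points."""
    points = []
    for v, row in enumerate(disparity_map):
        for u, d in enumerate(row):
            point = reproject_pixel(u, v, d, focal_length_px, baseline_m, cx, cy)
            if point is not None:
                points.append(point)
    return points


# Tiny 2x2 disparity map with one invalid (zero) entry; values are illustrative.
cloud = reproject_map([[50.0, 50.0], [0.0, 25.0]], 1000.0, 0.1, cx=0.5, cy=0.5)
print(len(cloud), cloud[0])
```

Pixels without a valid disparity are skipped rather than projected, so the resulting point cloud only contains points for which a depth exists.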

Bottlenose stereo cameras now produce a confidence map alongside the disparity map. The confidence map can be activated either programmatically or using any GigE Vision-compliant software package. This functionality requires that the camera is properly calibrated. For detailed information on how to calibrate your Bottlenose camera, refer to our documentation page.

Labforge’s StereoViewer is a basic utility that can be used with any Bottlenose camera. It allows you to control camera settings, tune image quality, and stream images from your Bottlenose camera. Use the following steps to request and visualize the confidence map and the disparity map from your camera using StereoViewer:

  1. Click on the Connect button to select your camera
  2. Open Device Control and navigate to ImageFormatControl
  3. Set ComponentSelector to Disparity
  4. Set ComponentEnable to True to activate disparity computation. This assumes that your camera is properly calibrated.
  5. Set ComponentSelector to Confidence
  6. Set ComponentEnable to True to request a confidence map
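
If you prefer to apply the same settings programmatically, the order of the writes matters: ComponentEnable acts on whichever component ComponentSelector currently points to. The sketch below only captures that sequence. The feature names and values come from the steps above, while set_feature is a hypothetical callable that you would wrap around your own GigE Vision SDK's feature-write call. It is not part of any Labforge API.

```python
# Feature names and values taken from the StereoViewer steps above.
COMPONENT_STEPS = [
    ("ComponentSelector", "Disparity"),
    ("ComponentEnable", True),
    ("ComponentSelector", "Confidence"),
    ("ComponentEnable", True),
]


def enable_disparity_and_confidence(set_feature):
    """Apply the steps in order.

    set_feature(name, value) is a hypothetical callable supplied by the
    caller; it stands in for the feature-write call of a real SDK.
    """
    for name, value in COMPONENT_STEPS:
        set_feature(name, value)


# Stand-in for a real camera: record the writes instead of sending them.
written = []
enable_disparity_and_confidence(lambda name, value: written.append((name, value)))
print(written)
```

Keeping the sequence as data makes it easy to log or replay the exact writes, and to swap the recording stand-in for a real device connection.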

The following is an example disparity map with the associated confidence map generated by a Bottlenose stereo camera. Darker areas of the confidence map highlight highly reliable disparity.

A disparity image with corresponding confidence map

A confidence map assigns each depth or disparity measurement a score that expresses how certain that estimate is. A high confidence value suggests that the depth measurement is likely accurate, while a low confidence value indicates the measurement may be less reliable.

The use of a confidence map in stereo vision systems has several applications:

  1. Quality Assessment: It helps in evaluating the overall quality of the depth map generated by the stereo camera. Pixel areas with high confidence values are considered more reliable, while areas with low confidence values may require further analysis.
  2. Scene Understanding: By analyzing the confidence map, the system can identify areas where depth estimation is particularly challenging, such as regions with low texture or strong reflections. This information can be used to improve scene understanding and object recognition algorithms.
  3. Adaptive Processing: The confidence map can be used to adaptively adjust the parameters or algorithms used for depth estimation. For example, more sophisticated algorithms or additional processing can be applied to regions with low confidence values to improve accuracy.
  4. Decision Making: In applications such as industrial automation or robotics, decisions are often based on the depth data obtained from stereo cameras. By considering the confidence and depth maps, the system can make more informed decisions, especially in critical situations where accuracy is crucial.
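
As an illustration of the quality assessment and decision-making uses above, the sketch below discards depth values whose confidence falls below a threshold and reports how much of the map survives. It assumes that a higher confidence value means a more reliable measurement and uses an arbitrary 0 to 255 range and threshold. Check how your camera encodes confidence before reusing it. All names are our own.

```python
def mask_low_confidence(depth_map, confidence_map, threshold):
    """Replace depth values whose confidence is below threshold with None.

    Assumption: a higher confidence value means a more reliable depth.
    Both maps are lists of rows with identical shapes.
    """
    if len(depth_map) != len(confidence_map):
        raise ValueError("depth and confidence maps must have the same shape")
    masked = []
    for depth_row, conf_row in zip(depth_map, confidence_map):
        if len(depth_row) != len(conf_row):
            raise ValueError("depth and confidence maps must have the same shape")
        masked.append(
            [z if c >= threshold else None for z, c in zip(depth_row, conf_row)]
        )
    return masked


def valid_fraction(masked_depth):
    """Fraction of pixels that survived the confidence threshold."""
    total = sum(len(row) for row in masked_depth)
    kept = sum(1 for row in masked_depth for z in row if z is not None)
    return kept / total if total else 0.0


# Illustrative data: the last column is low-confidence (e.g. a textureless wall).
depth = [[1.0, 1.1, 5.0], [1.0, 1.2, 0.3]]
confidence = [[200, 180, 20], [220, 190, 10]]

masked = mask_low_confidence(depth, confidence, threshold=128)
print(masked)
print(valid_fraction(masked))
```

A downstream system could, for example, refuse to act on a region when the surviving fraction is too small, instead of trusting depth values that the camera itself flagged as uncertain.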

In this post, we learned how stereo cameras like Bottlenose perceive depth and enable machines to see the world in 3D. More importantly, we highlighted the use of a confidence map when dealing with unreliable depth information from challenging scenarios such as uniform walls and occlusions.

Guy Martin Tchamgoue
