Depth estimation using monocular and stereo cues

Authors:
Ashutosh Saxena; Jamie Schulte; Andrew Y. Ng
Affiliations:
Computer Science Department, Stanford University, Stanford, CA (all authors)
Venue:
IJCAI'07: Proceedings of the 20th International Joint Conference on Artificial Intelligence
Year:
2007

References cited by this paper (11):
Shape from Shading: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence
Computer Vision: A Modern Approach
A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms. International Journal of Computer Vision
Performance Analysis of Stereo, Vergence, and Focus as Depth Cues for Active Vision. IEEE Transactions on Pattern Analysis and Machine Intelligence
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. ICML '01: Proceedings of the Eighteenth International Conference on Machine Learning
Maximum-likelihood depth-from-defocus for active vision. IROS '95: Proceedings of the International Conference on Intelligent Robots and Systems, Volume 3
Comparison of Graph Cuts with Belief Propagation for Stereo, using Identical MRF Parameters. ICCV '03: Proceedings of the Ninth IEEE International Conference on Computer Vision, Volume 2
Geometric Context from a Single Image. ICCV '05: Proceedings of the Tenth IEEE International Conference on Computer Vision, Volume 1
High speed obstacle avoidance using monocular vision and reinforcement learning. ICML '05: Proceedings of the 22nd international conference on Machine learning
A Dynamic Bayesian Network Model for Autonomous 3D Reconstruction from a Single Indoor Image. CVPR '06: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2
Multiscale conditional random fields for image labeling. CVPR'04: Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition

Papers citing this paper (14):
3-D Depth Reconstruction from a Single Still Image. International Journal of Computer Vision
Robotic Grasping of Novel Objects using Vision. International Journal of Robotics Research
Stereo effect of image converted from planar. Information Sciences: an International Journal
Inter-Image Statistics for 3D Environment Modeling. International Journal of Computer Vision
Belief Propagation for Depth Cue Fusion in Minimally Invasive Surgery. MICCAI '08: Proceedings of the 11th International Conference on Medical Image Computing and Computer-Assisted Intervention, Part II
Make3D: depth perception from a single still image. AAAI'08: Proceedings of the 23rd national conference on Artificial intelligence, Volume 3
Monocular vision SLAM for indoor aerial vehicles. IROS'09: Proceedings of the 2009 IEEE/RSJ international conference on Intelligent robots and systems
A close-form iterative algorithm for depth inferring from a single image. ECCV'10: Proceedings of the 11th European conference on Computer vision, Part V
3D information extraction using Region-based Deformable Net for monocular robot navigation. Journal of Visual Communication and Image Representation
Continuous markov random fields for robust stereo estimation. ECCV'12: Proceedings of the 12th European conference on Computer Vision, Part V
Combining monocular geometric cues with traditional stereo cues for consumer camera stereo. ECCV'12: Proceedings of the 12th international conference on Computer Vision, Volume 2
Depth recovery from a single defocused image based on depth locally consistency. Proceedings of the Fifth International Conference on Internet Multimedia Computing and Service
Image partitioning and illumination in image-based pose detection for teleoperated flexible endoscopes. Artificial Intelligence in Medicine
A robust cost function for stereo matching of road scenes. Pattern Recognition Letters

Abstract

Depth estimation in computer vision and robotics is most commonly done via stereo vision (stereopsis), in which images from two cameras are used to triangulate and estimate distances. However, there are also numerous monocular visual cues (such as texture variations and gradients, defocus, and color/haze) that have so far been little exploited in such systems. Some of these cues apply even in regions without texture, where stereo would work poorly. In this paper, we apply a Markov Random Field (MRF) learning algorithm to capture some of these monocular cues and incorporate them into a stereo system. We show that by adding monocular cues to stereo (triangulation) cues, we obtain significantly more accurate depth estimates than is possible using either monocular or stereo cues alone. This holds for a large variety of environments, including both indoor environments and unstructured outdoor environments containing trees/forests, buildings, etc. Our approach is general and can incorporate monocular cues into any off-the-shelf stereo system.
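
As a rough illustration of the idea in the abstract, the sketch below triangulates depth from a stereo disparity and then combines it with a monocular depth estimate. It is not the paper's MRF model: it assumes independent Gaussian noise on each cue and omits the neighbor (smoothness) terms an MRF would add. The function names, parameters, and variances are hypothetical and chosen only for this example.

```python
def stereo_depth(disparity_px, focal_px, baseline_m):
    """Triangulated depth Z = f * B / d for a rectified stereo pair.

    Returns None when the disparity is missing or non-positive, as can
    happen in textureless regions where stereo matching fails.
    """
    if disparity_px is None or disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px


def fuse_depth(mono_m, var_mono, stereo_m, var_stereo):
    """Inverse-variance weighted combination of two depth estimates.

    Under independent Gaussian noise this is the most probable depth given
    both cues. If stereo is unavailable, fall back to the monocular estimate.
    """
    if stereo_m is None:
        return mono_m
    w_mono = 1.0 / var_mono
    w_stereo = 1.0 / var_stereo
    return (w_mono * mono_m + w_stereo * stereo_m) / (w_mono + w_stereo)


# Hypothetical values: 700 px focal length, 0.1 m baseline, 14 px disparity.
z_stereo = stereo_depth(14.0, focal_px=700.0, baseline_m=0.1)
z_fused = fuse_depth(mono_m=6.0, var_mono=4.0, stereo_m=z_stereo, var_stereo=1.0)

# A pixel with no usable disparity keeps the monocular estimate.
z_textureless = fuse_depth(6.0, 4.0, stereo_depth(0.0, 700.0, 0.1), 1.0)
```

The weighting means the fused value leans toward whichever cue is assigned the smaller variance, and pixels where stereo fails still receive a depth from the monocular cue.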