This paper explores the idea of extracting a dense 3D point cloud corresponding to salient features in a video. The goal is to generate the dense point cloud efficiently, so that the information can be used in other video processing tasks. We present a method that extracts 3D information from videos with no prior knowledge of the scene, while keeping computational costs low. Our method exploits the movement of the camera, robustly tracking features over time in order to obtain multiple views of the scene and perform 3D reconstruction. Additionally, our system copes with people moving independently in the video: it estimates each person's pose and fits a 3D model to it. This 3D model is inserted into the dense point cloud to visualize the reconstructed scene, and does not affect the tracking of the rest of the scene.
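The core geometric step behind reconstruction from multiple views of a tracked feature is triangulation. The sketch below illustrates standard two-view linear (DLT) triangulation; it is a generic textbook formulation, not the paper's implementation, and the camera matrices in the usage example are illustrative assumptions.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2 : 3x4 camera projection matrices.
    x1, x2 : matched 2D image points (u, v), one per view.
    Returns the 3D point in Euclidean coordinates.
    """
    # Each view contributes two rows to the homogeneous system A X = 0,
    # derived from the cross product of x with P X.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # Least-squares solution: right singular vector with smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # de-homogenize

# Illustrative setup: a shared intrinsic matrix and a 1-unit stereo baseline.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.5, -0.2, 4.0])          # a point in front of both cameras
x1_h = P1 @ np.append(X_true, 1.0)           # project into view 1
x2_h = P2 @ np.append(X_true, 1.0)           # project into view 2
x1, x2 = x1_h[:2] / x1_h[2], x2_h[:2] / x2_h[2]

X_est = triangulate_point(P1, P2, x1, x2)
```

With noise-free correspondences, the DLT recovers the point exactly; in a real tracker, many such two-view estimates would be refined jointly (e.g. by bundle adjustment).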