Combined structure and motion extraction from visual data using evolutionary active learning

  • Authors:
  • Krishnanand N. Kaipa, University of Vermont, Burlington, VT, USA
  • Josh C. Bongard, University of Vermont, Burlington, VT, USA
  • Andrew N. Meltzoff, University of Washington, Seattle, WA, USA

  • Venue:
  • Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation
  • Year:
  • 2009

Abstract

We present a novel stereo vision modeling framework that generates approximate yet physically plausible representations of objects, rather than accurate models that are computationally expensive to generate. Our approach models target scenes by carefully selecting a small subset of the total pixels available for visual processing. To achieve this, we use the estimation-exploration algorithm (EEA) to create the visual models: a population of three-dimensional models is optimized against a growing set of training pixels, and periodically a new pixel that causes disagreement among the models is selected from the observed stereo images of the scene and added to the training set. We show here that, using only 5% of the available pixels, the algorithm can generate approximate models of compound objects in a scene. Our algorithm serves the dual goals of extracting the 3D structure and relative motion of objects of interest by modeling the target objects in terms of their physical parameters (e.g., position, orientation, shape) and tracking how these parameters vary over time. We support our claims with results from simulation as well as from a real robot lifting a compound object.
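The estimate-then-explore loop from the abstract can be illustrated with a minimal sketch. Here a toy one-dimensional "scene" stands in for the paper's 3D physical models, and (x, y) samples stand in for stereo pixels; all names, parameter values, and the hill-climbing optimizer are illustrative assumptions, not the authors' implementation.

```python
import random
import statistics

random.seed(0)

def scene(x):
    """Hidden ground truth that the 'cameras' observe (assumed linear here)."""
    return 2.0 * x + 1.0

# All pixels available for visual processing (200 samples on [0, 2)).
pixels = [(i / 100.0, scene(i / 100.0)) for i in range(200)]
training = [pixels[0]]  # start from a single observed pixel

def error(model, data):
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in data) / len(data)

# Estimation phase: a small population of candidate models (slope, intercept).
population = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(8)]

for _ in range(9):  # exploration iterations: grow training set to 10 pixels (5%)
    # Estimation: hill-climb each model against the current training pixels.
    for _ in range(200):
        for i, (a, b) in enumerate(population):
            cand = (a + random.gauss(0, 0.2), b + random.gauss(0, 0.2))
            if error(cand, training) < error((a, b), training):
                population[i] = cand

    # Exploration: add the pixel on which the model population disagrees most.
    def disagreement(pixel):
        x, _ = pixel
        return statistics.pvariance([a * x + b for a, b in population])

    training.append(max((p for p in pixels if p not in training),
                        key=disagreement))

best = min(population, key=lambda m: error(m, training))
```

The key design choice mirrored here is that pixels are never sampled at random: each new training pixel is the one that maximizes disagreement (prediction variance) across the model population, so a small fraction of the image suffices to constrain the models.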