Collaborative track analysis, data cleansing, and labeling

Authors:
George Kamberov;Gerda Kamberova;Matt Burlick;Lazaros Karydas;Bart Luczynski
Affiliations:
Department of Computer Science, Stevens Institute of Technology, Hoboken, NJ;Department of Computer Science, Hofstra University, Hempstead, NY;Department of Computer Science, Stevens Institute of Technology, Hoboken, NJ;Department of Computer Science, Stevens Institute of Technology, Hoboken, NJ;Department of Computer Science, Stevens Institute of Technology, Hoboken, NJ
Venue:
ISVC'11 Proceedings of the 7th international conference on Advances in visual computing - Volume Part I
Year:
2011

Abstract

Tracking output is an attractive source of labeled data sets that can, in turn, be used to train other systems for tracking, detection, recognition, and categorization. Long tracking sequences are particularly important in this context because they provide richer information, multiple views, and a wider range of appearances. This paper addresses two obstacles to using tracking data for training: noise in the tracking data, and the unreliability and slow pace of hand labeling. It introduces a criterion for detecting inconsistencies (noise) in large data collections and a method for selecting typical representatives of consistent collections. These are used to build a pipeline that cleanses the tracking data and employs instantaneous (shotgun) labeling of vast numbers of images. The shotgun-labeled data is shown to compare favorably with hand-labeled data when used in classification tasks. The framework is collaborative: it involves a human in the loop but is designed to minimize the burden on the human.
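
The abstract does not spell out the consistency criterion or how representatives are chosen, so the following is only an illustrative sketch of how such a pipeline could be organized, not the authors' method. It assumes each track is a list of numeric feature vectors, uses a distance-to-medoid threshold as a stand-in consistency test, and picks the medoid as the "typical representative". The names `medoid_index`, `is_consistent`, `shotgun_label`, `ask_human`, and `max_spread` are all hypothetical.

```python
import math


def medoid_index(features):
    """Index of the item with the smallest total distance to all others."""
    totals = [sum(math.dist(f, g) for g in features) for f in features]
    return min(range(len(features)), key=totals.__getitem__)


def is_consistent(features, max_spread):
    """Stand-in criterion (an assumption, not the paper's): a track is
    consistent if every item lies within max_spread of the track's medoid."""
    rep = features[medoid_index(features)]
    return all(math.dist(f, rep) <= max_spread for f in features)


def shotgun_label(tracks, ask_human, max_spread):
    """Cleanse tracks, then label each surviving track in one shot.

    tracks: dict mapping track id -> list of feature vectors.
    ask_human: callback (track_id, representative_index) -> label.
    Returns (labels per track, ids of tracks rejected as inconsistent).
    """
    labeled, rejected = {}, []
    for track_id, features in tracks.items():
        if not is_consistent(features, max_spread):
            rejected.append(track_id)  # cleansing step: drop noisy tracks
            continue
        rep = medoid_index(features)
        label = ask_human(track_id, rep)  # one human query per track
        labeled[track_id] = [label] * len(features)  # propagate to all items
    return labeled, rejected
```

In this sketch the human is asked once per consistent track, regardless of how many images the track contains, which is the sense in which the labeling burden stays small.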