Tracking output is a very attractive source of labeled data sets that, in turn, could be used to train other systems for tracking, detection, recognition and categorization. In this context, long tracking sequences are of particular importance because they provide richer information, multiple views, and a wider range of appearances. This paper addresses two obstacles to the use of tracking data for training: noise in the tracking data, and the unreliability and slow pace of hand labeling. The paper introduces a criterion for detecting inconsistencies (noise) in large data collections and a method for selecting typical representatives of consistent collections. These are used to build a pipeline that cleanses the tracking data and employs instantaneous (shotgun) labeling of vast numbers of images. The shotgun-labeled data is shown to compare favorably with hand-labeled data when used in classification tasks. The framework is collaborative: it involves a human in the loop but is designed to minimize the burden on the human.
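The abstract does not spell out the inconsistency criterion or how representatives are selected, so the following is only an illustrative stand-in and not the paper's method. It assumes each tracked image has already been reduced to a numeric feature vector, takes the medoid of a track as its "typical representative", and flags a frame as inconsistent when it lies much farther from the medoid than the median frame does. The function names, the `factor` parameter, and the toy data are all hypothetical.

```python
import math
from statistics import median


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length feature vectors."""
    return [[math.dist(p, q) for q in points] for p in points]


def medoid_index(dist):
    """Index of the sample with the smallest total distance to all others."""
    return min(range(len(dist)), key=lambda i: sum(dist[i]))


def flag_inconsistent(points, factor=3.0):
    """Return (medoid index, indices of samples flagged as inconsistent).

    Assumed rule (not taken from the paper): a sample is flagged when its
    distance to the medoid exceeds `factor` times the median distance
    to the medoid.
    """
    dist = pairwise_distances(points)
    m = medoid_index(dist)
    to_medoid = dist[m]
    threshold = factor * median(to_medoid)
    flagged = [i for i, d in enumerate(to_medoid) if d > threshold]
    return m, flagged


# Toy "track": four similar feature vectors and one frame that has drifted.
track = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
medoid, noisy = flag_inconsistent(track)
print("representative:", medoid, "flagged as noise:", noisy)
```

In a collaborative setting of the kind the abstract describes, a person would only need to label the representative of each cleansed track, and that label could then be propagated to the remaining consistent frames in one step.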