The construction of robust multimodal interfaces often requires large amounts of labeled training data to account for cross-user differences and variation in the environment. In this work, we investigate whether unlabeled training data can be leveraged to build more reliable audio-visual classifiers through co-training, a multi-view learning algorithm. Multimodal tasks are good candidates for multi-view learning, since each modality provides a potentially redundant view to the learning algorithm. We apply co-training to two problems: audio-visual speech unit classification, and user agreement recognition using spoken utterances and head gestures. We demonstrate that multimodal co-training can learn from only a few labeled examples in one or both of the audio-visual modalities. We also propose a co-adaptation algorithm, which adapts existing audio-visual classifiers to a particular user or noise condition by leveraging the redundancy in the unlabeled data.
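The co-training loop the abstract refers to can be sketched as follows. This is a minimal illustration, not the paper's actual models: the two "views" stand in for the audio and visual modalities, and the toy nearest-centroid classifier (with a distance-margin confidence) stands in for whatever view-specific classifier is used in practice. Each round, the more confident classifier pseudo-labels a few unlabeled examples, which then augment the shared training set.

```python
import random

class CentroidView:
    """Toy one-view classifier: nearest class centroid, with a
    distance-margin confidence (a stand-in for any per-modality model)."""
    def fit(self, xs, ys):
        self.cent = {}
        for c in set(ys):
            pts = [x for x, y in zip(xs, ys) if y == c]
            self.cent[c] = sum(pts) / len(pts)
        return self

    def predict(self, x):
        return min(self.cent, key=lambda c: abs(x - self.cent[c]))

    def confidence(self, x):
        # Margin between the two nearest class centroids.
        d = sorted(abs(x - m) for m in self.cent.values())
        return d[1] - d[0]

def co_train(labeled, unlabeled, rounds=10, per_round=2):
    """Simplified Blum & Mitchell-style co-training.
    labeled: list of ((view_a, view_b), label); unlabeled: list of (view_a, view_b).
    Each round, each view's classifier pseudo-labels its most confident
    unlabeled examples and adds them to the shared labeled pool."""
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        clf_a = CentroidView().fit([xa for (xa, _), _ in labeled],
                                   [y for _, y in labeled])
        clf_b = CentroidView().fit([xb for (_, xb), _ in labeled],
                                   [y for _, y in labeled])
        for clf, view in ((clf_a, 0), (clf_b, 1)):
            pool.sort(key=lambda x, v=view: clf.confidence(x[v]), reverse=True)
            for _ in range(min(per_round, len(pool))):
                x = pool.pop(0)
                labeled.append((x, clf.predict(x[view])))
    # Final classifiers, trained on the original plus pseudo-labeled data.
    clf_a = CentroidView().fit([xa for (xa, _), _ in labeled],
                               [y for _, y in labeled])
    clf_b = CentroidView().fit([xb for (_, xb), _ in labeled],
                               [y for _, y in labeled])
    return clf_a, clf_b

# Synthetic two-view data: class 0 near 0 and class 1 near 4 in both views,
# with only two labeled examples to start from.
random.seed(0)
def sample(c):
    return (c * 4 + random.gauss(0, 0.5), c * 4 + random.gauss(0, 0.5))

labeled = [((0.1, -0.2), 0), ((3.9, 4.2), 1)]
unlabeled = [sample(random.randint(0, 1)) for _ in range(40)]
clf_a, clf_b = co_train(labeled, unlabeled)
```

The key design point mirrored here is that each classifier sees only its own view, so a confident prediction in one modality supplies a training label for the other; co-adaptation would start from pre-trained classifiers and run the same loop on a new user's unlabeled data.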