Categorization is one of the fundamental building blocks of cognitive systems. Object categorization has traditionally been addressed in the vision domain, even though cognitive agents are intrinsically multimodal. Indeed, biological systems combine several modalities in order to achieve robust categorization. In this paper we propose a multimodal approach to object category detection that uses audio and visual information. The auditory channel is modeled by a discriminative classifier over biologically motivated spectral features. The visual channel is modeled by a state-of-the-art part-based model. Multimodality is achieved using two fusion schemes, one high-level and the other low-level. Experiments on six different object categories, under increasingly difficult conditions, show the strengths and weaknesses of the two approaches and clearly underline the open challenges for multimodal category detection.
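
The abstract does not say how the two fusion schemes are implemented. As a rough illustration of the general distinction only, the sketch below assumes that low-level fusion means concatenating the audio and visual feature vectors before a single classifier, and that high-level fusion means a weighted sum of the scores from separate per-modality classifiers. The function names, weights, and feature values are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: all names, weights, and feature values are
# hypothetical assumptions, not the paper's actual method or data.

def linear_score(weights, features, bias=0.0):
    """Score of a linear discriminative classifier: w . x + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def low_level_fusion(audio_feat, visual_feat, joint_weights, bias=0.0):
    """Feature-level fusion: concatenate both vectors, then apply one classifier."""
    joint = list(audio_feat) + list(visual_feat)
    return linear_score(joint_weights, joint, bias)

def high_level_fusion(audio_score, visual_score, alpha=0.5):
    """Decision-level fusion: weighted sum of the per-modality scores."""
    return alpha * audio_score + (1.0 - alpha) * visual_score

# Toy features and weights (assumed values for demonstration).
audio_feat = [0.2, 0.8]
visual_feat = [1.0, 0.0, 0.5]
w_audio = [1.0, -1.0]
w_visual = [0.5, 0.5, 1.0]

# High-level: each modality is scored separately, then scores are combined.
audio_score = linear_score(w_audio, audio_feat)
visual_score = linear_score(w_visual, visual_feat)
fused_high = high_level_fusion(audio_score, visual_score, alpha=0.5)

# Low-level: one classifier sees the concatenated feature vector.
fused_low = low_level_fusion(audio_feat, visual_feat, w_audio + w_visual)
```

In this toy setup the high-level scheme lets one modality be down-weighted through `alpha` when it is unreliable, whereas the low-level scheme lets a single classifier learn interactions between audio and visual features.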