Information Fusion in Multimedia Information Retrieval
Adaptive Multimedia Retrieval: Retrieval, User, and Semantics
This paper proposes a novel representation space for multimodal information, enabling fast and efficient retrieval of video data. We suggest describing documents not directly by selected multimodal features (audio, visual or text), but rather by their cross-document similarities with respect to these multimodal characteristics. This idea leads us to propose a particular form of \emph{dissimilarity space} that is adapted to the asymmetric classification problem and, in turn, to the \emph{query-by-example} and \emph{relevance feedback} paradigms widely used in information retrieval. Based on the proposed dissimilarity space, we then define various strategies to fuse modalities through a kernel-based learning approach. The problem of automatically setting the kernel parameters to adapt the learning process to each query is also discussed. The properties of our strategies are studied and validated on artificial data. In a second phase, a large annotated video corpus (\emph{i.e.}, TRECVID-05), indexed by visual, audio and text features, is used to evaluate the overall performance of the dissimilarity space and fusion strategies. The results confirm the validity of the proposed approach for the representation and retrieval of multimodal information in a real-time framework.
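To make the representation concrete, the following is a minimal sketch (not the authors' implementation) of the two steps outlined in the abstract: mapping each document's per-modality features into a dissimilarity space spanned by a set of prototype documents, and fusing modalities as a weighted combination of per-modality kernels for query-by-example ranking. The function names, the Euclidean distances, the RBF kernel and the fixed fusion weights are illustrative assumptions; the paper's actual features, kernel choices and automatic kernel-setting strategy are not reproduced here.

```python
# Minimal sketch of a dissimilarity-space representation with
# kernel-based late fusion of modalities (illustrative assumptions only).
import numpy as np

def dissimilarity_space(features, prototypes):
    """Represent each document by its Euclidean distances to a set of
    prototype documents, instead of by its raw feature vector."""
    # features: (n_docs, dim), prototypes: (n_protos, dim)
    diff = features[:, None, :] - prototypes[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))          # (n_docs, n_protos)

def rbf_kernel(D_query, D_corpus, gamma=1.0):
    """RBF kernel between dissimilarity vectors of queries and corpus items."""
    diff = D_query[:, None, :] - D_corpus[None, :, :]
    return np.exp(-gamma * (diff ** 2).sum(axis=-1))  # (n_queries, n_docs)

def fuse_kernels(kernels, weights):
    """Late fusion: convex combination of per-modality kernels."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * K for wi, K in zip(w, kernels))

# Toy usage: two modalities (e.g. visual and text) over 100 documents.
rng = np.random.default_rng(0)
visual = rng.normal(size=(100, 64))
text = rng.normal(size=(100, 32))
protos = rng.choice(100, size=20, replace=False)      # prototype documents

D_vis = dissimilarity_space(visual, visual[protos])
D_txt = dissimilarity_space(text, text[protos])

query = [3]                                           # query-by-example: doc 3
K = fuse_kernels(
    [rbf_kernel(D_vis[query], D_vis), rbf_kernel(D_txt[query], D_txt)],
    weights=[0.6, 0.4],
)
ranking = np.argsort(-K[0])                           # most similar documents first
```

In a relevance-feedback setting such as the one targeted by the paper, the fusion weights and kernel parameters would be re-estimated from the user-labelled positive and negative examples rather than fixed as in this toy example.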