Crossing textual and visual content in different application scenarios

Authors:
Julien Ah-Pine;Marco Bressan;Stephane Clinchant;Gabriela Csurka;Yves Hoppenot;Jean-Michel Renders
Affiliations:
Xerox Research Centre Europe, Meylan, France 38240 (all authors)
Venue:
Multimedia Tools and Applications
Year:
2009


Abstract

This paper deals with multimedia information access. We propose two new approaches for hybrid text-image information processing that generalize straightforwardly to other multimodal scenarios. Both approaches fall into the trans-media pseudo-relevance feedback category. The first method uses a mixture model of the aggregate components, treating them as a single relevance concept. The second defines trans-media similarities as an aggregation of monomodal similarities between the elements of the aggregate and the new multimodal object. We also introduce the monomodal similarity measures for text and images that serve as basic components for both proposed trans-media similarities. We show how a large variety of problems can be framed so that the proposed techniques address them: image annotation or captioning, text illustration, and multimedia retrieval and clustering. Finally, we show how these methods can be integrated into two applications: a travel blog assistant system and a tool for browsing Wikipedia that takes the multimedia nature of its content into account.
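
The second approach can be illustrated with a small sketch of trans-media pseudo-relevance feedback for text illustration: a text query retrieves its nearest neighbours in the text modality, and candidate images are then scored by aggregating their visual similarity to the images of those neighbours. This is only a reading of the abstract under assumptions, not the authors' formulation. Cosine similarity stands in for both monomodal measures, the aggregation is a text-similarity-weighted sum, and the names `cosine`, `transmedia_scores`, and the parameter `k` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed monomodal measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def transmedia_scores(query_text, corpus, k=2):
    """Score every corpus image against a text-only query.

    corpus is a list of (text_vector, image_vector) pairs.
    The weighted-sum aggregation below is an illustrative assumption.
    """
    # Step 1: monomodal (text) similarity between the query and each document.
    text_sims = [cosine(query_text, text_vec) for text_vec, _ in corpus]

    # Step 2: keep the k most similar documents as the pseudo-relevant aggregate.
    ranked = sorted(range(len(corpus)), key=lambda i: text_sims[i], reverse=True)
    neighbours = ranked[:k]

    # Step 3: aggregate monomodal (image) similarities between each candidate
    # image and the images of the aggregate, weighted by text similarity.
    scores = []
    for _, candidate_image in corpus:
        score = sum(
            text_sims[n] * cosine(corpus[n][1], candidate_image)
            for n in neighbours
        )
        scores.append(score)
    return scores


if __name__ == "__main__":
    toy_corpus = [
        ([1.0, 0.0], [1.0, 0.0]),
        ([0.9, 0.1], [0.9, 0.1]),
        ([0.0, 1.0], [0.0, 1.0]),
    ]
    print(transmedia_scores([1.0, 0.0], toy_corpus, k=1))
```

Swapping the roles of the two modalities in the same function would give an image-to-text score, which is one way the annotation and captioning scenarios mentioned above could be framed.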