Multimodal representation, indexing, automated annotation and retrieval of image collections via non-negative matrix factorization

  • Authors:
  • Juan C. Caicedo; Jaafar BenAbdallah; Fabio A. González; Olfa Nasraoui

  • Affiliations:
  • Computer Systems and Industrial Engineering Department, National University of Colombia, Cra 30 45-03, Ciudad Universitaria, Edif. 453, Of. 114, Bogotá, Colombia; Department of Computer Engineering and Computer Science, University of Louisville, Louisville, KY, USA

  • Venue:
  • Neurocomputing
  • Year:
  • 2012

Abstract

Massive image collections are increasingly available on the Web. These collections often incorporate complementary non-visual data such as text descriptions, comments, user ratings and tags. These additional modalities may provide a semantic complement to the visual content of images, which can improve the performance of various image content analysis tasks. This paper presents a novel method based on non-negative matrix factorization for generating multimodal image representations that integrate visual features and text information. The proposed approach discovers a set of latent factors that correlate the multimodal data in a shared representation space. We evaluated the potential of this multimodal image representation in tasks associated with image indexing and search. Experimental results show that the proposed method significantly outperforms multimodal latent semantic spaces generated by singular value decomposition on these tasks.
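The general idea described in the abstract can be sketched in a few lines: stack the (non-negative) visual and text feature matrices side by side and factorize the result, so each image gets coordinates in a shared latent-factor space. This is a minimal illustrative sketch, not the authors' implementation; the toy data, matrix sizes, and the use of scikit-learn's `NMF` are all assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical toy data: 6 images, each with 8 visual features
# and 5 text (tag) features. NMF requires non-negative inputs.
rng = np.random.RandomState(0)
V = rng.rand(6, 8)   # visual feature matrix
T = rng.rand(6, 5)   # text/tag feature matrix

# Stack the modalities column-wise: each row combines both views of one image.
X = np.hstack([V, T])            # shape (6, 13)

# Factorize X ~= W @ H. W holds each image's coordinates in the shared
# latent-factor space; H holds latent factors spanning both modalities.
model = NMF(n_components=3, init="nndsvda", random_state=0, max_iter=500)
W = model.fit_transform(X)       # (6, 3) multimodal image representations
H = model.components_            # (3, 13) latent factors over both modalities
```

Images can then be indexed and retrieved by similarity between rows of `W`, which is how a latent multimodal representation supports the search tasks the abstract mentions.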