Multimodal indexing based on semantic cohesion for image retrieval

  • Authors:
  • Hugo Jair Escalante; Manuel Montes; Enrique Sucar

  • Affiliations:
  • Computer Science Department, National Institute of Astrophysics, Optics and Electronics, Puebla, Mexico 72840 (all authors)

  • Venue:
  • Information Retrieval
  • Year:
  • 2012

Abstract

This paper introduces two novel strategies for representing multimodal images, with application to multimedia image retrieval. We consider images that are accompanied by both text and labels: text describes the image content at a high semantic level (e.g., making reference to places, dates, or events), while labels provide a mid-level description of the image (i.e., in terms of the objects that can be seen in it). Accordingly, the main assumption of this work is that by combining information from text and labels we can develop highly effective retrieval methods. We study standard information fusion techniques for combining both sources of information; however, although such techniques are highly competitive, they cannot effectively capture the content of images. Therefore, we propose two novel representations for multimodal images that attempt to exploit the semantic cohesion among terms from different modalities. These representations are based on distributional term representations widely used in computational linguistics. Under the proposed representations, the content of an image is modeled by a distribution of co-occurrences over terms, or of occurrences over other images, in such a way that the representation can be considered an expansion of the multimodal terms in the image. We report experimental results using the SAIAPR TC12 benchmark on two sets of topics used in ImageCLEF competitions, with both manually and automatically generated labels. Experimental results show that the proposed representations significantly outperform both standard multimodal techniques and unimodal methods. Results on manually assigned labels provide an upper bound on the retrieval performance that can be obtained, whereas results with automatically generated labels are encouraging. The novel representations capture the content of multimodal images more effectively. We emphasize that although we have applied our representations to multimedia image retrieval, the same formulation can be adopted for modeling other multimodal documents (e.g., videos).
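To make the idea of the co-occurrence-based expansion more concrete, the following is a minimal sketch (not the authors' implementation) of a distributional term representation in which an image, described by the union of its textual words and visual labels, is expanded into a normalized distribution over co-occurring terms. The function names (build_cooccurrence, represent_image) and the toy corpus are hypothetical and only illustrate the general mechanism described in the abstract.

```python
from collections import Counter, defaultdict

def build_cooccurrence(docs):
    """Build a term-by-term co-occurrence table from a corpus of
    multimodal documents, each given as a list of terms (textual
    words and visual labels treated uniformly)."""
    cooc = defaultdict(Counter)
    for terms in docs:
        unique = set(terms)
        for t in unique:
            for u in unique:
                if t != u:
                    cooc[t][u] += 1
    return cooc

def represent_image(terms, cooc):
    """Represent an image by the normalized distribution of terms that
    co-occur with its own terms, i.e., an expansion of the image's
    multimodal terms (a TCOR-like representation, used here only as
    an illustrative assumption)."""
    expansion = Counter()
    for t in set(terms):
        expansion.update(cooc.get(t, {}))
    total = sum(expansion.values()) or 1
    return {u: c / total for u, c in expansion.items()}

# Toy usage: two annotated images, each a mix of caption words and labels.
corpus = [
    ["beach", "sunset", "sky", "sea"],     # image 1: text + labels
    ["mountain", "sky", "snow", "hiker"],  # image 2: text + labels
]
cooc = build_cooccurrence(corpus)
query_image = ["sky", "sea"]
print(represent_image(query_image, cooc))
```

An analogous sketch over occurrences in other images (rather than co-occurring terms) would replace the co-occurrence table with an inverted index mapping each term to the images it appears in.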