Combining Textual and Visual Cues for Content-based Image Retrieval on the World Wide Web

  • Authors:
  • Marco La Cascia; Sarathendu Sethi; Stan Sclaroff

  • Venue:
  • IEEE Workshop on Content-Based Access of Image and Video Libraries (CAIVL '98)
  • Year:
  • 1998

Abstract

Some WWW image engines allow the user to form a query in terms of text keywords. To build the image index, keywords are extracted heuristically from the HTML documents containing each image, and/or from the image URL and file headers. Unfortunately, text-based image engines have merely retro-fitted standard SQL database query methods, and it is difficult to include image cues within such a framework. On the other hand, visual statistics (e.g., color histograms) are often insufficient for helping users find desired images in a vast WWW index. By truly unifying textual and visual statistics, one would expect to get better results than with either used separately. In this paper, we propose an approach that allows the combination of visual statistics with textual statistics in the vector space representation commonly used in query-by-image-content systems. Text statistics are captured in vector form using latent semantic indexing (LSI). The LSI index for an HTML document is then associated with each of the images contained therein. Visual statistics (e.g., color, orientedness) are also computed for each image. The LSI and visual statistic vectors are then combined into a single index vector that can be used for content-based search of the resulting image database. By using an integrated approach, we are able to take advantage of possible statistical couplings between the topic of the document (latent semantic content) and the contents of images (visual statistics). This allows improved performance in conducting content-based search. This approach has been implemented in a WWW image search engine prototype.
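
The following is a minimal sketch, not the authors' implementation, of the kind of pipeline the abstract describes: computing an LSI vector from the text of the containing HTML document, computing a simple visual statistic (here a color histogram; the paper also mentions orientedness), concatenating the two into one index vector, and ranking images by cosine similarity. The dimensionality k, the text/visual weighting, and the histogram bin count are illustrative assumptions.

    # Sketch of combining LSI text vectors with visual statistics for image search.
    # Parameter choices (k, bins, text_weight) are assumptions, not the paper's values.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD

    def lsi_vectors(html_texts, k=64):
        """Project document term statistics into a k-dimensional LSI space."""
        tfidf = TfidfVectorizer(stop_words="english")
        term_doc = tfidf.fit_transform(html_texts)          # term-document matrix
        svd = TruncatedSVD(n_components=k, random_state=0)  # latent semantic indexing
        return svd.fit_transform(term_doc)                  # one LSI vector per document

    def color_histogram(image_rgb, bins=8):
        """Simple visual statistic: a normalized per-channel color histogram."""
        hist = [np.histogram(image_rgb[..., c], bins=bins, range=(0, 255))[0]
                for c in range(3)]
        hist = np.concatenate(hist).astype(float)
        return hist / (hist.sum() + 1e-9)

    def combined_index(lsi_vec, visual_vec, text_weight=0.5):
        """Concatenate unit-normalized text and visual vectors into one index vector."""
        t = lsi_vec / (np.linalg.norm(lsi_vec) + 1e-9)
        v = visual_vec / (np.linalg.norm(visual_vec) + 1e-9)
        return np.concatenate([text_weight * t, (1.0 - text_weight) * v])

    def search(query_vec, index_matrix, top_k=10):
        """Rank indexed images by cosine similarity to the query vector."""
        q = query_vec / (np.linalg.norm(query_vec) + 1e-9)
        m = index_matrix / (np.linalg.norm(index_matrix, axis=1, keepdims=True) + 1e-9)
        scores = m @ q
        return np.argsort(-scores)[:top_k]

In this sketch, each image inherits the LSI vector of the HTML page it appears on, so a query image (or a previously indexed image chosen by the user) can retrieve results that match both its visual statistics and the latent topic of its surrounding document.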