Using visual cues for the extraction of web image semantic information

Authors:
Georgina Tryfou;Nicolas Tsapatsoulis
Affiliations:
Department of Communication and Internet Studies, Cyprus University of Technology, Limassol, Cyprus;Department of Communication and Internet Studies, Cyprus University of Technology, Limassol, Cyprus
Venue:
TPDL'12 Proceedings of the Second international conference on Theory and Practice of Digital Libraries
Year:
2012

Abstract

Mining information about the images that exist in huge numbers on the web has been a major research interest in recent years. Several methods have been explored, in which web image information is extracted from textual sources such as image file names, anchor texts, existing keywords and, of course, surrounding text. However, systems that mine image information from surrounding text suffer from several problems, such as the inability to assign all relevant text to an image while discarding the irrelevant text. This paper discusses a novel method for extracting web image information. The proposed system uses visual cues to cluster a web page into several regions and to assign to each hosted image the text that most likely refers to it. Three different approaches to the problem of text-to-image assignment are discussed and evaluated. The evaluation indicates the advantages of using visual cues and two-dimensional Euclidean measures for extracting information about web images.
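
To make the idea of text-to-image assignment with a two-dimensional Euclidean measure concrete, here is a minimal sketch. It is not the authors' system, and the abstract does not specify their three approaches. The sketch assumes that each image and text block is already available as a rendered bounding box, and that each text block is assigned to the image whose center is nearest. The names `Box`, `center_distance`, and `assign_text_to_images` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle of a rendered page element, in pixels (assumed input)."""
    name: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    @property
    def center(self) -> tuple[float, float]:
        return (self.x + self.width / 2, self.y + self.height / 2)


def center_distance(a: Box, b: Box) -> float:
    """Two-dimensional Euclidean distance between the centers of two boxes."""
    (ax, ay), (bx, by) = a.center, b.center
    return hypot(ax - bx, ay - by)


def assign_text_to_images(images: list[Box], texts: list[Box]) -> dict[str, list[str]]:
    """Map each image name to the names of the text blocks nearest to it."""
    assignment: dict[str, list[str]] = {img.name: [] for img in images}
    if not images:
        return assignment
    for text in texts:
        # Illustrative rule: a text block belongs to the closest image.
        nearest = min(images, key=lambda img: center_distance(img, text))
        assignment[nearest.name].append(text.name)
    return assignment


# Hypothetical page layout: two images, each with a caption, plus a sidebar.
images = [Box("img_a", 0, 0, 100, 100), Box("img_b", 400, 0, 100, 100)]
texts = [
    Box("caption_a", 0, 110, 100, 20),
    Box("caption_b", 400, 110, 100, 20),
    Box("sidebar", 520, 0, 80, 300),
]
result = assign_text_to_images(images, texts)
```

In this toy layout the sidebar is attached to `img_b` simply because it is closest, which illustrates the problem the abstract raises: a pure nearest-neighbor rule cannot discard irrelevant text, so a practical system would also need a region clustering step or a distance threshold.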