We address the problem of finding the subject of a photographic image that is intended to illustrate some physical object or objects ("depictive") and that was taken by usual optical means without magnification ("natural"). This could help in developing digital image libraries, since important image properties such as the size and color of a photograph's subject are not usually mentioned in accompanying captions, yet they can help rank the photographs retrieved for a user. We explore an approach that identifies the "visual focus" of the image and the "depicted concepts" in a caption and connects them. The visual focus is determined by using eight domain-independent characteristics of regions in the segmented image, and the caption depiction is identified by a set of rules applied to the parsed and interpreted caption. The visual-focus determination also performs combinatorial optimization on sets of regions to find the set that best satisfies the focus criteria. Experiments on 100 randomly selected image-caption pairs show significant improvement in retrieval precision over simpler methods and, in particular, emphasize the value of segmenting the image.
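The idea of optimizing over sets of regions can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not name the eight region characteristics or the optimization procedure, so the two features (`area`, `centrality`), the scoring function `focus_score`, the preferred subject size of 30% of the image, and the exhaustive search in `best_focus_set` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Region:
    """A segmented image region with two hypothetical features."""
    name: str
    area: float        # fraction of the image covered, assumed > 0
    centrality: float  # assumed: 1.0 at the image center, 0.0 at the border


def focus_score(regions):
    """Hypothetical focus criterion for a set of regions.

    Rewards area-weighted centrality and penalizes sets whose combined
    area is far from an assumed preferred subject size (30% of the image).
    """
    total_area = sum(r.area for r in regions)
    mean_centrality = sum(r.area * r.centrality for r in regions) / total_area
    size_penalty = abs(total_area - 0.3)
    return mean_centrality - size_penalty


def best_focus_set(regions, max_size=3):
    """Exhaustively score every subset of up to max_size regions."""
    candidates = (
        subset
        for k in range(1, max_size + 1)
        for subset in combinations(regions, k)
    )
    return max(candidates, key=focus_score)
```

Exhaustive enumeration is only practical for a handful of regions; with many regions a heuristic search would be needed, and the abstract does not say which strategy the authors use.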