With the exponential growth of Web 2.0 applications, tags are used extensively to describe image content on the Web. Because human-generated tags are noisy and sparse, understanding and exploiting them for image retrieval has become an emerging research direction. Low-level visual features carry rich complementary information and are therefore employed to improve retrieval results; however, bridging the semantic gap between image content and tags remains challenging. To address this problem, we propose a unified framework built on a two-level fusion of image content and tags: 1) a unified graph is constructed that fuses the visual feature-based image similarity graph with the image-tag bipartite graph; 2) a novel random walk model is then run on this graph, using a fusion parameter to balance the influence of image content against that of tags. Furthermore, the framework not only naturally incorporates pseudo relevance feedback, but also applies directly to content-based image retrieval, text-based image retrieval, and image annotation. Experiments on a large Flickr dataset demonstrate the effectiveness and efficiency of the proposed framework.
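The two-level fusion described above can be sketched as a random walk with restart over a block adjacency matrix that joins the image-image visual-similarity graph with the image-tag bipartite graph. The sketch below is illustrative only and does not reproduce the paper's exact formulation: the function name, the specific restart scheme, and the choice of row normalization are assumptions; `alpha` plays the role of the fusion parameter balancing visual edges against tag edges.

```python
import numpy as np

def unified_random_walk(W_vis, B, alpha=0.5, restart=0.2, query=0, iters=100):
    """Illustrative random walk with restart on a unified graph fusing an
    image-image visual-similarity graph (W_vis, n x n) with an image-tag
    bipartite graph (B, n x m). alpha is the fusion parameter balancing
    visual edges against tag edges (an assumed weighting scheme)."""
    n, m = B.shape
    # Block adjacency over n image nodes followed by m tag nodes.
    A = np.zeros((n + m, n + m))
    A[:n, :n] = alpha * W_vis          # image-image visual edges
    A[:n, n:] = (1 - alpha) * B        # image-to-tag edges
    A[n:, :n] = (1 - alpha) * B.T      # tag-to-image edges
    # Row-normalize into a transition matrix (guard against empty rows).
    rowsum = A.sum(axis=1, keepdims=True)
    P = np.divide(A, rowsum, out=np.zeros_like(A), where=rowsum > 0)
    # Restart vector concentrated on the query node (image or tag index).
    e = np.zeros(n + m)
    e[query] = 1.0
    p = e.copy()
    for _ in range(iters):
        p = (1 - restart) * P.T @ p + restart * e
    # Scores over image nodes support image retrieval; scores over tag
    # nodes support annotation.
    return p[:n], p[n:]
```

Under this sketch, querying with an image node ranks other images by their stationary relevance, while the tag-node scores of the same walk serve as annotation candidates; varying `alpha` shifts the ranking between purely visual and purely tag-driven behavior.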