Large-scale dataset construction requires a large amount of well-labeled ground truth. For the NUS-WIDE dataset, a less labor-intensive semi-manual annotation process was used, and this paper focuses on improving that method. Improving the average accuracy of the top retrievals for individual concepts directly improves the results of the semi-manual annotation. For web images, both tags and visual features play important roles in predicting the concepts of an image. For visual features, we adopt an adaptive feature selection method that constructs a mid-level feature by concatenating the k-NN results for each type of visual feature. This mid-level feature is more robust than an average combination of single features, and we show that it achieves good performance for concept prediction. For the tag cloud, we construct a concept-tag co-occurrence matrix; the co-occurrence information is used to compute, via Bayes' theorem, the probability that an image belongs to a certain concept given its annotated tags. By examining the WordNet taxonomy level of each concept, which indicates whether the concept is generic or specific, and by exploring the tag cloud distribution, we propose a method for selecting either the tag cloud or the visual features for each concept, enhancing annotation performance. In this way, the advantages of both tags and visual features are exploited. Experimental results show that our method achieves very high average precision on the NUS-WIDE dataset, which greatly facilitates the construction of large-scale web image datasets.
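The mid-level visual feature described above can be sketched as follows: for each type of visual feature, a k-NN classifier produces a score vector over concepts, and the per-feature-type score vectors are concatenated into one vector. This is a minimal illustration, not the paper's exact pipeline; the function names, the Euclidean distance, and the simple vote-counting scores are assumptions introduced here for clarity.

```python
import numpy as np

def knn_concept_scores(query_feat, train_feats, train_labels, n_concepts, k=5):
    """Vote-based k-NN concept scores for one visual feature type (a sketch)."""
    # Euclidean distances from the query image to all training images
    dists = np.linalg.norm(train_feats - query_feat, axis=1)
    nearest = np.argsort(dists)[:k]          # indices of the k nearest neighbours
    scores = np.zeros(n_concepts)
    for idx in nearest:
        scores[train_labels[idx]] += 1.0     # one vote per neighbour
    return scores / k                        # normalised per-concept scores

def mid_level_feature(query_feats, train_sets, n_concepts, k=5):
    """Concatenate per-feature-type k-NN concept scores into one mid-level vector."""
    parts = [knn_concept_scores(q, feats, labels, n_concepts, k)
             for q, (feats, labels) in zip(query_feats, train_sets)]
    return np.concatenate(parts)
```

With F feature types and C concepts, the resulting mid-level vector has length F * C; a classifier trained on it can weight feature types per concept, which is why it tends to be more robust than simply averaging single-feature scores.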
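The tag-based prediction can likewise be sketched: given a concept-tag co-occurrence matrix, estimate P(tag | concept) from the counts and combine the tags of an image with a concept prior via Bayes' theorem. This sketch assumes naive-Bayes independence between tags and Laplace smoothing; both are illustrative choices, not necessarily the paper's exact model.

```python
import numpy as np

def concept_posterior(cooc, concept_prior, tag_ids, alpha=1.0):
    """
    Posterior P(concept | tags) from a concept-by-tag co-occurrence matrix.
    cooc[c, t] counts how often tag t co-occurs with concept c.
    Assumes tags are conditionally independent given the concept (naive Bayes).
    """
    # Laplace-smoothed per-concept tag likelihoods P(tag | concept)
    likelihood = (cooc + alpha) / (cooc.sum(axis=1, keepdims=True)
                                   + alpha * cooc.shape[1])
    # Log-space combination of prior and the likelihoods of the observed tags
    log_post = np.log(concept_prior) + np.log(likelihood[:, tag_ids]).sum(axis=1)
    post = np.exp(log_post - log_post.max())  # subtract max for numerical stability
    return post / post.sum()
```

An image tagged with tags that co-occur mostly with one concept then receives a posterior concentrated on that concept, which is the signal the selection method weighs against the visual-feature prediction.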