Understanding tag-cloud and visual features for better annotation of concepts in NUS-WIDE dataset

  • Authors:
  • Shenghua Gao; Liang-Tien Chia; Xiangang Cheng

  • Affiliations:
  • Nanyang Technological University, Singapore; Nanyang Technological University, Singapore; Nanyang Technological University, Singapore

  • Venue:
  • WSMC '09: Proceedings of the 1st Workshop on Web-Scale Multimedia Corpus
  • Year:
  • 2009

Abstract

Large-scale dataset construction requires a significant amount of well-labeled ground truth. For the NUS-WIDE dataset, a less labor-intensive annotation process was used, and this paper focuses on improving that semi-manual annotation method. For the NUS-WIDE dataset, improving the average accuracy of the top retrievals for individual concepts effectively improves the results of the semi-manual annotation method. For web images, both tags and visual features play important roles in predicting the concept of an image. For visual features, we adopt an adaptive feature selection method that constructs a middle-level feature by concatenating the k-NN results for each type of visual feature. This middle-level feature is more robust than the average combination of single features, and we show that it achieves good performance for concept prediction. For the tag cloud, we construct a concept-tag co-occurrence matrix and use the co-occurrence information, together with Bayes' theorem, to compute the probability that an image belongs to a certain concept given its annotated tags. By examining a concept's WordNet taxonomy level, which indicates whether the concept is generic or specific, and by exploring the tag-cloud distribution, we propose a selection method that uses either the tag cloud or the visual features to enhance concept annotation performance. In this way, the advantages of both tags and visual features are exploited. Experimental results show that our method achieves very high average precision on the NUS-WIDE dataset, which greatly facilitates the construction of large-scale web image datasets.
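
The abstract describes the middle-level visual feature as a concatenation of k-NN results, one block per visual feature type. Below is a minimal sketch of one plausible reading: for each feature type, the k nearest training images vote for concepts, and the normalized vote histograms are concatenated. The exact encoding of the k-NN results (concept-vote histograms, the helper name midlevel_feature, and the dictionary inputs) is an assumption for illustration, not the paper's exact procedure.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def midlevel_feature(query_feats, train_feats_by_type, train_labels, n_concepts, k=10):
    """Build a middle-level feature for one image by concatenating, per visual
    feature type, the concept histogram of its k nearest training neighbors.
    NOTE: the histogram encoding is our assumption; the paper only states that
    k-NN results for each feature type are concatenated."""
    parts = []
    for ftype, query in query_feats.items():
        nn = NearestNeighbors(n_neighbors=k).fit(train_feats_by_type[ftype])
        _, idx = nn.kneighbors(query.reshape(1, -1))
        # Count how many of the k neighbors carry each concept label.
        votes = np.bincount(train_labels[idx[0]], minlength=n_concepts)
        parts.append(votes / k)  # normalized votes for this feature type
    # One block per feature type; robustness comes from keeping blocks separate
    # rather than averaging single-feature scores.
    return np.concatenate(parts)
```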
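For the tag-cloud side, the abstract applies Bayes' theorem to a concept-tag co-occurrence matrix. The sketch below scores P(concept | tags) under a conditional-independence (naive Bayes) assumption over tags, which is our reading rather than a detail stated in the abstract; the smoothing constant and function name concept_posterior are likewise illustrative.

```python
import numpy as np

def concept_posterior(tags, cooc, concept_prior, tag_index, eps=1e-6):
    """Score P(concept | tags) from a concept-tag co-occurrence matrix via
    Bayes' theorem. cooc[c, t] counts co-occurrences of tag t with concept c.
    Tags are assumed conditionally independent given the concept (a
    naive-Bayes assumption on our part)."""
    # P(tag | concept), smoothed so unseen tags do not zero out the product.
    p_tag_given_c = (cooc + eps) / (cooc.sum(axis=1, keepdims=True) + eps * cooc.shape[1])
    log_post = np.log(concept_prior)
    for tag in tags:
        if tag in tag_index:
            log_post = log_post + np.log(p_tag_given_c[:, tag_index[tag]])
    log_post -= log_post.max()   # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()     # normalized posterior over concepts
```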
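Finally, the modality-selection step uses the concept's WordNet taxonomy level as a proxy for how generic or specific it is. A minimal sketch using NLTK's WordNet interface is shown below; the depth threshold and the rule "deeper (more specific) concepts prefer tags" are assumptions we introduce to make the idea concrete, and the paper additionally inspects the tag-cloud distribution before choosing.

```python
from nltk.corpus import wordnet as wn

def prefer_tag_cloud(concept, depth_threshold=7):
    """Decide whether to trust the tag cloud or the visual features for a
    concept, using its WordNet taxonomy depth. The threshold value and the
    depth-only rule are illustrative assumptions."""
    synsets = wn.synsets(concept, pos=wn.NOUN)
    if not synsets:
        return False  # concept unknown to WordNet: fall back to visual features
    # min_depth() is the length of the shortest hypernym path to the root;
    # small depth = generic concept, large depth = specific concept.
    depth = min(s.min_depth() for s in synsets)
    return depth >= depth_threshold
```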