Flickr-tag prediction using multi-modal fusion and meta information

  • Authors:
  • Yu-Chuan Su;Tzu-Hsuan Chiu;Guan-Long Wu;Chun-Yen Yeh;Felix Wu;Winston Hsu

  • Affiliations:
  • National Taiwan University, Taipei, Taiwan ROC;National Taiwan University, Taipei, Taiwan ROC;National Taiwan University, Taipei, Taiwan ROC;National Taiwan University, Taipei, Taiwan ROC;National Taiwan University, Taipei, Taiwan ROC;National Taiwan University, Taipei, Taiwan ROC

  • Venue:
  • Proceedings of the 21st ACM international conference on Multimedia
  • Year:
  • 2013

Abstract

We present our evaluation and analysis on the Yahoo! Large-scale Flickr-tag Image Classification dataset. Our evaluations show that by combining multiple features and different classification models, the MAP of tag prediction can be significantly improved over ordinary linear classification. Further analysis shows that some tags are assigned not because of the visual content but because of the meta information of the images. Our experiments show that, for certain tags, we can make more accurate predictions using meta information without any training process than with classifiers based on visual content. By combining meta information with multi-feature and multi-model fusion, we achieve significantly better performance than simple linear classification. We also evaluate the performance of various mid-level features, and the results suggest that the "Concept Bank" feature may be a promising direction for the task.
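
The abstract does not say how the scores are fused or how MAP is computed, so the following is only an illustrative sketch under stated assumptions: fusion is taken to be a weighted average of per-model scores (one common form of late fusion, not necessarily the authors' method), and MAP is taken to be the mean over tags of the standard ranking-based average precision. The function names, weights, and toy scores are hypothetical.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AP for one tag: mean of precision@k over the ranks k that hold a positive image."""
    # Rank image indices by descending score (ties keep their original order).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx]:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def late_fusion(score_lists: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Assumed fusion rule: weighted average of each model's score per image."""
    total = sum(weights)
    n_images = len(score_lists[0])
    return [
        sum(w * s[i] for w, s in zip(weights, score_lists)) / total
        for i in range(n_images)
    ]


def mean_average_precision(per_tag_scores: Sequence[Sequence[float]],
                           per_tag_labels: Sequence[Sequence[int]]) -> float:
    """MAP: average of the per-tag AP values."""
    aps = [average_precision(s, l) for s, l in zip(per_tag_scores, per_tag_labels)]
    return sum(aps) / len(aps)


# Toy data for a single tag and four images (made-up numbers, not from the paper).
labels = [1, 0, 1, 0]
visual_scores = [0.9, 0.8, 0.3, 0.1]   # hypothetical visual-classifier scores
meta_scores = [0.6, 0.1, 0.9, 0.2]     # hypothetical meta-information scores
fused_scores = late_fusion([visual_scores, meta_scores], weights=[1.0, 1.0])
```

In this toy case the visual scores rank a negative image above the second positive one, while the fused scores rank both positives first, so `average_precision(fused_scores, labels)` is higher than `average_precision(visual_scores, labels)`. Real gains depend on the data and on how the weights are chosen.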