Flickr-tag prediction using multi-modal fusion and meta information

Authors:
Yu-Chuan Su;Tzu-Hsuan Chiu;Guan-Long Wu;Chun-Yen Yeh;Felix Wu;Winston Hsu
Affiliations:
National Taiwan University, Taipei, Taiwan Roc;National Taiwan University, Taipei, Taiwan Roc;National Taiwan University, Taipei, Taiwan Roc;National Taiwan University, Taipei, Taiwan Roc;National Taiwan University, Taipei, Taiwan Roc;National Taiwan University, Taipei, Taiwan Roc
Venue:
Proceedings of the 21st ACM international conference on Multimedia
Year:
2013

Citing 4
Cited 0

LIBLINEAR: A Library for Large Linear Classification

The Journal of Machine Learning Research
Large linear classification when data cannot fit in memory

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Improving the fisher kernel for large-scale image classification

ECCV'10 Proceedings of the 11th European conference on Computer vision: Part IV
What does classifying more than 10,000 image categories tell us?

ECCV'10 Proceedings of the 11th European conference on Computer vision: Part V

Quantified Score

Hi-index	0.00

Visualization

Abstract

We present our evaluation and analysis on Yahoo! Large-scale Flickr-tag Image Classification dataset. Our evaluations show that combining multi-features and different classification models, the MAP of tag prediction can be significantly improve over ordinary linear classification. Further analysis shows that some tags are given not because of the visual content but the meta information of images. Our experiments show that we can make more accurate prediction on certain tags using meta information without any training process, compared with visual content based classifiers. Combine the meta information, multi-features and multi-models fusion, we achieve significantly better performance than simple linear classification. We also evaluate the performance of various mid-level feature, and the results suggest that "Concept Bank" feature may be a promising direction for the task.