A deep-learning model-based and data-driven hybrid architecture for image annotation

Authors:
Zhiyu Wang; Dingyin Xia; Edward Y. Chang
Affiliations:
Google Inc., Beijing, China; Google Inc., Beijing, China; Google Inc., Beijing, China
Venue:
Proceedings of the international workshop on Very-large-scale multimedia corpus, mining and retrieval
Year:
2010

Abstract

Does adding more training data always improve the effectiveness of a machine-learning or pattern-recognition task? Recent evidence from machine translation and speech recognition seems to suggest that the data-driven approach outperforms the traditional model-based approach. Instead of carefully modeling rules and their exceptions, the data-driven approach identifies similar patterns in massive datasets and then uses those patterns to predict the labels (or other outcomes) of unseen instances. In this work, we compare representative data-driven and model-based schemes on an image annotation task. We enumerate the pros and cons of the two approaches and propose a hybrid approach that harnesses the strengths of both.