Enhanced max margin learning on multimodal data mining in a multimedia database

Authors:
Zhen Guo;Zhongfei Zhang;Eric Xing;Christos Faloutsos
Affiliations:
State University of New York at Binghamton;State University of New York at Binghamton;Carnegie Mellon University;Carnegie Mellon University
Venue:
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2007

Abstract

The problem of multimodal data mining in a multimedia database can be addressed as a structured prediction problem, in which we learn the mapping from an input to structured, interdependent output variables. In this paper, building on the existing literature on max margin based learning, we develop a new max margin learning approach called the Enhanced Max Margin Learning (EMML) framework. In addition, we apply the EMML framework to develop an effective and efficient solution to the multimodal data mining problem in a multimedia database. The main contributions are: (1) we have developed a new max margin learning approach, the enhanced max margin learning framework, that is much more efficient in learning, with a much faster convergence rate, as verified in empirical evaluations; (2) we have applied this EMML approach to develop an effective and efficient solution to the multimodal data mining problem that is highly scalable in the sense that the query response time is independent of the database scale. This scalability makes multimodal data mining queries against a very large scale multimedia database feasible and gives the solution an advantage over many existing multimodal data mining methods in the literature that do not scale up at all; this advantage is also supported by the complexity analysis and by empirical evaluations against a state-of-the-art multimodal data mining method from the literature. While EMML is a general framework, for evaluation purposes we apply it to the Berkeley Drosophila embryo image database and report the performance comparison with a state-of-the-art multimodal data mining method.
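
For orientation, the generic max margin formulation of structured prediction that this line of work builds on can be sketched as follows. This is the standard margin-rescaling form from the structured-output literature, not the EMML formulation itself, which the abstract does not state. All symbols are assumptions introduced for illustration: training pairs $(x_i, y_i)$, a joint feature map $\Phi$, a structured loss $\Delta$, slack variables $\xi_i$, and a trade-off constant $C$.

```latex
% Prediction: choose the structured output with the highest score.
f(x) = \arg\max_{y \in \mathcal{Y}} \; w^{\top} \Phi(x, y)

% Learning: the true output y_i must outscore every other output y
% by a margin that grows with the loss \Delta(y_i, y).
\min_{w,\; \xi \ge 0} \quad \frac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad
w^{\top} \bigl[ \Phi(x_i, y_i) - \Phi(x_i, y) \bigr] \;\ge\; \Delta(y_i, y) - \xi_i
\qquad \forall i, \; \forall y \in \mathcal{Y} \setminus \{ y_i \}
```

Because the output space $\mathcal{Y}$ is typically very large, this program has a correspondingly large number of constraints, which is why learning efficiency and convergence rate are central concerns for max margin methods of this kind.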