MEGA: the maximizing expected generalization algorithm for learning complex query concepts

Authors:
Edward Chang; Beitao Li
Affiliations:
University of California, Santa Barbara, Santa Barbara, CA; University of California, Santa Barbara, Santa Barbara, CA
Venue:
ACM Transactions on Information Systems (TOIS)
Year:
2003

Abstract

Specifying exact query concepts has become increasingly challenging for end users, because many query concepts (e.g., those for looking up a multimedia object) are hard to articulate, and articulation can be subjective. In this study, we propose a query-concept learner that learns query criteria through an intelligent sampling process. Our concept learner aims to fulfill two primary design objectives: (1) it must be expressive enough to model most practical query concepts, and (2) it must learn a concept quickly and from a small number of labeled examples, since online users tend to be too impatient to provide much feedback. To fulfill the first objective, we model query concepts in k-CNF, which can express almost all practical query concepts. To fulfill the second, we propose the maximizing expected generalization algorithm (MEGA), which converges to target concepts quickly through two complementary steps: sample selection and concept refinement. We also propose a divide-and-conquer method that divides the concept-learning task into G subtasks to achieve speedup. A task must be divided carefully, or search accuracy may suffer. Through analysis and mining results, we observe that organizing image features in a multiresolution manner and minimizing intragroup feature correlation can speed up query-concept learning substantially while maintaining high search accuracy. Through examples, analysis, experiments, and a prototype implementation, we show that MEGA converges to query concepts significantly faster than traditional methods.
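
To make the k-CNF representation and the idea of concept refinement concrete, the following is a minimal sketch of the classic elimination learner for k-CNF from computational learning theory. It is not MEGA itself: the abstract does not specify MEGA's sample-selection or refinement rules, so the sketch assumes Boolean features, refines only on positive examples, and performs no intelligent sampling. All function names (`all_clauses`, `clause_holds`, `refine`, `predict`) are hypothetical and chosen for illustration.

```python
from itertools import combinations, product


def all_clauses(n_vars, k):
    """Enumerate every disjunction of 1..k literals over n_vars Boolean features.

    A literal is a pair (feature_index, polarity); a clause is a frozenset
    of literals over distinct features.
    """
    clauses = set()
    for size in range(1, k + 1):
        for idxs in combinations(range(n_vars), size):
            for pols in product((True, False), repeat=size):
                clauses.add(frozenset(zip(idxs, pols)))
    return clauses


def clause_holds(clause, x):
    """A disjunction holds if at least one of its literals matches x."""
    return any(x[i] == pol for i, pol in clause)


def refine(hypothesis, positive_example):
    """Drop every clause that a positively labeled example contradicts."""
    return {c for c in hypothesis if clause_holds(c, positive_example)}


def predict(hypothesis, x):
    """A k-CNF is a conjunction: x matches only if every clause holds."""
    return all(clause_holds(c, x) for c in hypothesis)


# Illustrative target concept (an assumption): x0 AND (x1 OR x2), a 2-CNF.
hypothesis = all_clauses(n_vars=3, k=2)  # start with the most specific concept
for positive in [(True, True, False), (True, False, True), (True, True, True)]:
    hypothesis = refine(hypothesis, positive)

print(predict(hypothesis, (True, True, False)))   # a labeled positive
print(predict(hypothesis, (False, True, True)))   # violates x0
print(predict(hypothesis, (True, False, False)))  # violates (x1 OR x2)
```

The learner starts from the conjunction of all candidate clauses and generalizes only when a positive example forces it to, so how quickly it converges depends on which examples get labeled. That dependence is what a sample-selection step is meant to exploit.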