Incremental query evaluation for support vector machines

Authors:
Danzhou Liu; Kien A. Hua
Affiliations:
University of Central Florida, Orlando, FL, USA; University of Central Florida, Orlando, FL, USA
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Abstract

Support vector machines (SVMs) have been widely used in multimedia retrieval to learn a concept in order to find the best matches. In such an SVM active learning environment, the system first processes k sampling queries and top-k uncertain queries to select the candidate data items for training. The user's top-k relevant queries are then evaluated to compute the answer. This approach has been shown to be effective. However, it scales poorly as the database grows. To address this limitation, we propose an incremental query evaluation technique for these three types of queries. Based on the observation that most queries are not revised dramatically during the iterative evaluation, the proposed technique reuses the results of previous queries to reduce the computation cost. Furthermore, the technique takes advantage of a tuned index structure to efficiently prune irrelevant data, so that only a small portion of the data set needs to be accessed for query processing. The index structure also provides an inexpensive means to process the set of candidates to evaluate the final query result. The technique works with different kernel functions and kernel parameters. Our experimental results indicate that the proposed technique significantly reduces the overall computation cost and offers a promising solution to the scalability issue.
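
To make the three query types concrete, the following is a minimal sketch of how they might be evaluated by a full scan of the data set, which is the kind of baseline whose cost grows with database size. It is not the incremental, index-based technique the abstract proposes. It assumes a linear kernel, so the decision function is f(x) = w·x + b. It also assumes that "uncertain" means closest to the decision boundary (smallest |f(x)|) and "relevant" means the largest f(x). All function names, the weight vector, and the sample data are hypothetical and chosen only for illustration.

```python
import heapq
import random


def decision_value(w, b, x):
    """Linear-kernel SVM decision value f(x) = w.x + b (assumed form)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def k_sampling(data, k, seed=0):
    """Return the indices of k items drawn uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(len(data)), k)


def top_k_uncertain(data, w, b, k):
    """Return the indices of the k items closest to the decision boundary."""
    return heapq.nsmallest(
        k, range(len(data)), key=lambda i: abs(decision_value(w, b, data[i]))
    )


def top_k_relevant(data, w, b, k):
    """Return the indices of the k items with the largest decision value."""
    return heapq.nlargest(
        k, range(len(data)), key=lambda i: decision_value(w, b, data[i])
    )


# Hypothetical 2-D feature vectors and a hypothetical learned hyperplane.
data = [(0.5, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (-1.0, 0.5)]
w, b = (1.0, 1.0), -2.0

sampled = k_sampling(data, 3)                # 3 randomly chosen indices
uncertain = top_k_uncertain(data, w, b, 2)   # items nearest the boundary
relevant = top_k_relevant(data, w, b, 2)     # best matches for the concept
```

Every call above touches all items in `data`. The abstract's proposal is to avoid exactly this, by reusing results from the previous iteration and pruning with an index, so that only a small portion of the data set is accessed.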