Concept boundary detection for speeding up SVMs

Authors:
Navneet Panda; Edward Y. Chang; Gang Wu
Affiliations:
UCSB, CA; UCSB, CA; UCSB, CA
Venue:
ICML '06 Proceedings of the 23rd international conference on Machine learning
Year:
2006

Citing 11

The nature of statistical learning theory.
Bagging predictors. Machine Learning.
Geometry and invariance in kernel based methods. Advances in kernel methods.
Making large-scale support vector machine learning practical. Advances in kernel methods.
Towards scalable support vector machines using squashing. Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining.
Scaling Kernel-Based Systems to Large Data Sets. Data Mining and Knowledge Discovery.
Sparse Greedy Matrix Approximation for Machine Learning. ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning.
Similarity Search in High Dimensions via Hashing. VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases.
Efficient SVM training using low-rank kernel representations. The Journal of Machine Learning Research.
Classifying large data sets using SVMs with hierarchical clusters. Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining.
LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST).

Cited 7

A New Variant of the Optimum-Path Forest Classifier. ISVC '08 Proceedings of the 4th International Symposium on Advances in Visual Computing.
A discrete approach for supervised pattern recognition. IWCIA'08 Proceedings of the 12th international conference on Combinatorial image analysis.
Petroleum well drilling monitoring through cutting image analysis and artificial intelligence techniques. Engineering Applications of Artificial Intelligence.
Active learning in multimedia annotation and retrieval: A survey. ACM Transactions on Intelligent Systems and Technology (TIST).
Tree Decomposition for Large-Scale SVM Problems. The Journal of Machine Learning Research.
Fast instance selection for speeding up support vector machines. Knowledge-Based Systems.
Neighbors' distribution property and sample reduction for support vector machines. Applied Soft Computing.

Abstract

Support Vector Machines (SVMs) suffer from an O(n^2) training cost, where n denotes the number of training instances. In this paper, we propose an algorithm that selects boundary instances as training data to substantially reduce n. Our algorithm is motivated by the result of (Burges, 1999) that removing non-support vectors from the training set does not change SVM training results; it therefore eliminates instances that are likely to be non-support vectors. In the concept-independent preprocessing step, we prepare nearest-neighbor lists for the training instances. In the concept-specific sampling step, we can then effectively select useful training data for each target concept. Empirical studies show that our algorithm is effective in reducing n and outperforms competing downsampling algorithms without significantly compromising testing accuracy.
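
The two-step structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the selection criterion, so the rule used here (keep an instance if at least one of its k nearest neighbors carries a different label) is an assumption, as are the function names `knn_lists` and `select_boundary`, the brute-force neighbor search, and the toy data. The point it shows is that the neighbor lists are built once, without labels, and then reused for each target concept.

```python
import math

def knn_lists(points, k):
    """Concept-independent step (assumed brute force): for each point,
    return the indices of its k nearest neighbors. Labels are not used."""
    lists = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j)
                       for j, q in enumerate(points) if j != i)
        lists.append([j for _, j in dists[:k]])
    return lists

def select_boundary(neighbors, labels):
    """Concept-specific step (assumed rule): keep an instance if any of
    its precomputed neighbors has a different label for this concept."""
    return [i for i, nbrs in enumerate(neighbors)
            if any(labels[j] != labels[i] for j in nbrs)]

# Toy 1-D data; the neighbor lists are computed once.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
neighbors = knn_lists(points, k=2)

# Two different target concepts (labelings) reuse the same lists.
labels_a = [0, 0, 0, 1, 1, 1]
labels_b = [0, 0, 0, 0, 1, 1]
kept_a = select_boundary(neighbors, labels_a)  # -> [2, 3]
kept_b = select_boundary(neighbors, labels_b)  # -> [3, 4, 5]
```

Only the retained indices would be passed to the SVM trainer. For large n, the brute-force search above would itself cost O(n^2) distance computations, so an approximate nearest-neighbor index would be the natural replacement for `knn_lists`.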