The nature of statistical learning theory. Machine Learning.
Geometry and invariance in kernel based methods. Advances in Kernel Methods.
Making large-scale support vector machine learning practical. Advances in Kernel Methods.
Towards scalable support vector machines using squashing. Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Scaling kernel-based systems to large data sets. Data Mining and Knowledge Discovery.
Sparse greedy matrix approximation for machine learning. ICML '00: Proceedings of the Seventeenth International Conference on Machine Learning.
Similarity search in high dimensions via hashing. VLDB '99: Proceedings of the 25th International Conference on Very Large Data Bases.
Efficient SVM training using low-rank kernel representations. Journal of Machine Learning Research.
Classifying large data sets using SVMs with hierarchical clusters. Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST).
A new variant of the optimum-path forest classifier. ISVC '08: Proceedings of the 4th International Symposium on Advances in Visual Computing.
A discrete approach for supervised pattern recognition. IWCIA '08: Proceedings of the 12th International Conference on Combinatorial Image Analysis.
Engineering Applications of Artificial Intelligence.
Active learning in multimedia annotation and retrieval: a survey. ACM Transactions on Intelligent Systems and Technology (TIST).
Tree decomposition for large-scale SVM problems. Journal of Machine Learning Research.
Fast instance selection for speeding up support vector machines. Knowledge-Based Systems.
Neighbors' distribution property and sample reduction for support vector machines. Applied Soft Computing.
Support Vector Machines (SVMs) suffer from an O(n²) training cost, where n denotes the number of training instances. In this paper, we propose an algorithm that selects boundary instances as training data so as to substantially reduce n. The algorithm is motivated by the result of Burges (1999) that removing non-support vectors from the training set does not change the SVM training result; accordingly, our algorithm eliminates instances that are likely to be non-support vectors. In the concept-independent preprocessing step, we build nearest-neighbor lists for the training instances. In the concept-specific sampling step, these lists are then used to select useful training data for each target concept. Empirical studies show that our algorithm is effective in reducing n and outperforms competing downsampling algorithms without significantly compromising testing accuracy.
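The sketch below illustrates the two-step idea in Python using scikit-learn. It is a minimal sketch under assumed settings, not the paper's exact procedure: the neighborhood size k = 10, the rule that an instance is kept when its neighborhood contains a mixed set of labels, the RBF SVM, and the synthetic data from make_classification are all illustrative assumptions.

    from sklearn.datasets import make_classification
    from sklearn.neighbors import NearestNeighbors
    from sklearn.svm import SVC

    # Toy data standing in for a large training set.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

    # Concept-independent preprocessing: build nearest-neighbor lists once;
    # they can be reused for every target concept (labeling) afterwards.
    k = 10  # assumed neighborhood size, not from the paper
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)  # each row: the point itself plus its k neighbors
    idx = idx[:, 1:]           # drop the self-match in column 0

    # Concept-specific sampling: keep instances whose neighborhood mixes labels,
    # i.e. those near the decision boundary and likely to be support vectors.
    mixed = (y[idx] != y[:, None]).any(axis=1)
    X_sub, y_sub = X[mixed], y[mixed]
    print(f"kept {mixed.sum()} of {len(X)} instances")

    # Train the SVM on the reduced set only.
    clf = SVC(kernel="rbf", gamma="scale").fit(X_sub, y_sub)
    print("accuracy on the full set:", clf.score(X, y))

Because instances deep inside a class region have label-homogeneous neighborhoods, the mixed-label test discards most of them, so SVM training sees far fewer than n examples while the likely support vectors near the boundary are retained.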