Support vector machines training data selection using a genetic algorithm

Authors:
Michal Kawulok;Jakub Nalepa
Affiliations:
Institute of Informatics, Silesian University of Technology, Gliwice, Poland;Institute of Informatics, Silesian University of Technology, Gliwice, Poland
Venue:
SSPR'12/SPR'12 Proceedings of the 2012 Joint IAPR international conference on Structural, Syntactic, and Statistical Pattern Recognition
Year:
2012

Abstract

This paper presents a new method for selecting valuable training data for support vector machines (SVMs) from large, noisy sets using a genetic algorithm (GA). SVM training data selection is a known but not extensively investigated problem. Existing methods rely mainly on analyzing the geometric properties of the data or adopt randomized selection and, to the best of our knowledge, GA-based approaches have not yet been applied for this purpose. Our work was inspired by problems encountered when using SVMs for skin segmentation. Because the set is very large, the existing methods are too time-consuming, and random selection is ineffective because the set is noisy. In the work reported here we demonstrate how a GA can be used to optimize the training set, and we present extensive experimental results confirming that the new method is highly effective for real-world data.
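
To make the idea of GA-driven training set selection concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not specify the chromosome encoding, the genetic operators, or the fitness measure, so everything here is an assumption: each individual is a fixed-size list of training-sample indices, fitness is the accuracy on a held-out validation set of an SVM trained on that subset, crossover samples from the union of two parents, and mutation swaps indices for unused ones. A small linear SVM trained by subgradient descent on the hinge loss stands in for a full SVM library, and all function names (`select_training_subset`, `make_noisy_data`, and so on) are hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=30):
    """Subgradient descent on the L2-regularized hinge loss; labels are -1/+1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            w = [wj * (1 - lr * lam) for wj in w]        # regularization shrink
            if yi * score < 1:                           # margin violated
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def accuracy(model, X, y):
    w, b = model
    hits = 0
    for xi, yi in zip(X, y):
        pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1
        hits += pred == yi
    return hits / len(y)

def fitness(subset, X, y, X_val, y_val):
    """Assumed fitness: validation accuracy of an SVM trained on the subset."""
    model = train_linear_svm([X[i] for i in subset], [y[i] for i in subset])
    return accuracy(model, X_val, y_val)

def crossover(a, b, k, rng):
    """Assumed operator: draw k distinct indices from the union of both parents."""
    return rng.sample(sorted(set(a) | set(b)), k)

def mutate(ind, n, rate, rng):
    """Assumed operator: swap each index, with probability `rate`, for an unused one."""
    chosen, out = set(ind), []
    for i in ind:
        if rng.random() < rate:
            unused = [j for j in range(n) if j not in chosen]
            if unused:
                j = rng.choice(unused)
                chosen.discard(i)
                chosen.add(j)
                i = j
        out.append(i)
    return out

def select_training_subset(X, y, X_val, y_val, k=10, pop_size=12,
                           generations=15, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    n = len(X)
    score = lambda ind: fitness(ind, X, y, X_val, y_val)
    population = [rng.sample(range(n), k) for _ in range(pop_size)]
    history = []                                         # best fitness per generation
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        history.append(score(ranked[0]))
        parents = ranked[:pop_size // 2]                 # truncation selection
        children = [ranked[0]]                           # elitism: keep the best as-is
        while len(children) < pop_size:
            pa, pb = rng.sample(parents, 2)
            children.append(mutate(crossover(pa, pb, k, rng), n, mutation_rate, rng))
        population = children
    best = max(population, key=score)
    return sorted(best), score(best), history

def make_noisy_data(n, rng, noise=0.1):
    """Synthetic two-class 2-D data with a fraction of flipped labels."""
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        X.append([rng.gauss(label, 1.0), rng.gauss(label, 1.0)])
        y.append(-label if rng.random() < noise else label)
    return X, y

if __name__ == "__main__":
    rng = random.Random(42)
    X, y = make_noisy_data(200, rng)
    X_val, y_val = make_noisy_data(100, rng)
    subset, fit, history = select_training_subset(X, y, X_val, y_val)
    print("selected indices:", subset)
    print("validation accuracy:", round(fit, 3))
```

In this sketch the subset size `k` is fixed, so each fitness evaluation trains on only `k` samples no matter how large the full set is, and elitism keeps the best fitness from decreasing between generations. A kernel SVM from an established library could replace `train_linear_svm` without changing the GA loop.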