On the optimal working set size in serial and parallel support vector machine learning with the decomposition algorithm

  • Authors:
  • Tatjana Eitrich; Bruno Lang

  • Affiliations:
  • Central Institute for Applied Mathematics, Research Centre Juelich, Germany; University of Wuppertal, Germany

  • Venue:
  • AusDM '06 Proceedings of the fifth Australasian conference on Data mining and analytics - Volume 61
  • Year:
  • 2006


Abstract

The support vector machine (SVM) is a well-established and accurate supervised learning method for the classification of data in various application fields. The statistical learning task, the so-called training, can be formulated as a quadratic optimization problem. In recent years the decomposition algorithm for solving this optimization problem has become the most frequently used method for support vector machine learning and forms the basis of many SVM implementations today. It is characterized by an internal parameter called the working set size. Traditionally, small working sets have been used. The increasing amount of data used for classification has led to new parallel implementations of the decomposition method with efficient inner solvers, which make larger working sets feasible. It has been shown that large working sets achieve good speedup values for parallel training with the decomposition algorithm. However, how to choose the optimal working set size for parallel training is not clear. In this paper, we show how the working set size influences the number of decomposition steps, the number of kernel function evaluations, and the overall training time in serial and parallel computation.
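To make the role of the working set size concrete, the sketch below shows a serial decomposition loop in Python. It is an illustrative assumption, not the authors' implementation: for simplicity it omits the bias term, so the dual problem reduces to min_a 0.5 aᵀQa - eᵀa subject to 0 ≤ a ≤ C with Q_ij = y_i y_j K(x_i, x_j); the working set is chosen as the q most KKT-violating variables; and the inner solver is a generic box-constrained optimizer (SciPy's L-BFGS-B) rather than the efficient parallel inner solvers the paper refers to.

```python
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(X, gamma=0.5):
    # Full kernel matrix for clarity; production solvers compute
    # and cache kernel rows on demand instead.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def decomposition_train(X, y, C=1.0, q=4, tol=1e-3, max_steps=500):
    # Bias-free SVM dual (hypothetical simplification):
    #   min_a 0.5 a^T Q a - e^T a,  0 <= a_i <= C,
    # solved by repeatedly optimizing over a working set B of size q.
    n = y.size
    Q = (y[:, None] * y[None, :]) * rbf_kernel(X)
    alpha = np.zeros(n)
    grad = -np.ones(n)                        # gradient Q a - e at a = 0
    steps = 0
    while steps < max_steps:
        # KKT violations: variables that could still lower the objective.
        viol = np.where((alpha < C) & (grad < 0), -grad, 0.0)
        viol = np.maximum(viol, np.where((alpha > 0) & (grad > 0), grad, 0.0))
        if viol.max() < tol:
            break                             # all KKT conditions satisfied
        B = np.argsort(viol)[-q:]             # working set: q worst violators
        QBB = Q[np.ix_(B, B)]
        p = grad[B] - QBB @ alpha[B]          # linear term, alpha_N held fixed
        # Inner solver: box-constrained QP on the working set only.
        res = minimize(lambda a: 0.5 * a @ QBB @ a + p @ a,
                       alpha[B],
                       jac=lambda a: QBB @ a + p,
                       method="L-BFGS-B",
                       bounds=[(0.0, C)] * len(B))
        grad += Q[:, B] @ (res.x - alpha[B])  # incremental gradient update
        alpha[B] = res.x
        steps += 1
    return alpha, steps
```

A small usage example on synthetic data:

```python
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] * X[:, 1] > 0, 1.0, -1.0)
alpha, steps = decomposition_train(X, y, q=8)
print(steps, "decomposition steps,", np.count_nonzero(alpha > 1e-8), "support vectors")
```

In this toy loop the parameter q mirrors the trade-off studied in the paper: a larger working set typically reduces the number of decomposition steps and of gradient (kernel) updates, but makes each inner subproblem more expensive to solve.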