Support vector machine classification for large data sets via minimum enclosing ball clustering

  • Authors:
  • Jair Cervantes;Xiaoou Li;Wen Yu;Kang Li

  • Affiliations:
  • Departamento de Computación, CINVESTAV-IPN, A.P. 14-740, Av.IPN 2508, México, D.F. 07360, México;Departamento de Computación, CINVESTAV-IPN, A.P. 14-740, Av.IPN 2508, México, D.F. 07360, México;Departamento de Control Automático, CINVESTAV-IPN, A.P. 14-740, Av.IPN 2508, México, D.F. 07360, México;School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, Ashby Building, Stranmillis Road, Belfast BT9 5AH, UK

  • Venue:
  • Neurocomputing
  • Year:
  • 2008

Abstract

Support vector machine (SVM) is a powerful technique for data classification. Despite its sound theoretical foundations and high classification accuracy, standard SVM is not suitable for classifying large data sets, because its training complexity depends heavily on the size of the data set. This paper presents a novel SVM classification approach for large data sets that uses minimum enclosing ball clustering. After the training data are partitioned by the proposed clustering method, the cluster centers are used for a first-stage SVM classification. The second-stage SVM classification then uses only the clusters whose centers are support vectors and the clusters that contain data from different classes; most of the data are removed at this stage. Experimental results show that the proposed approach achieves good classification accuracy compared with classic SVM, while training is significantly faster than with several other SVM classifiers.
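
The data-reduction rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_second_stage_points` and its arguments are hypothetical, and the clustering and the first-stage SVM are assumed to have already been run, so their results (cluster assignments and the ids of clusters whose centers turned out to be support vectors) are passed in as plain inputs.

```python
from collections import defaultdict


def select_second_stage_points(cluster_ids, labels, sv_clusters):
    """Return indices of the training points kept for the second SVM stage.

    cluster_ids[i] : id of the cluster that point i belongs to
    labels[i]      : class label of point i
    sv_clusters    : ids of clusters whose centers were support vectors in
                     the first-stage SVM (assumed to be computed elsewhere)
    """
    if len(cluster_ids) != len(labels):
        raise ValueError("cluster_ids and labels must have the same length")

    # Collect the set of class labels that occur inside each cluster.
    classes_in_cluster = defaultdict(set)
    for cid, y in zip(cluster_ids, labels):
        classes_in_cluster[cid].add(y)

    sv_clusters = set(sv_clusters)

    # Keep a cluster if its center is a support vector,
    # or if it contains points from more than one class.
    keep = {
        cid
        for cid, classes in classes_in_cluster.items()
        if cid in sv_clusters or len(classes) > 1
    }

    # All points of the kept clusters go to the second stage; the rest are dropped.
    return [i for i, cid in enumerate(cluster_ids) if cid in keep]


# Toy example: four clusters of two points each.
cluster_ids = [0, 0, 1, 1, 2, 2, 3, 3]
labels = [1, 1, 1, -1, -1, -1, 1, 1]

# Suppose only the center of cluster 2 was a support vector in the first stage.
kept = select_second_stage_points(cluster_ids, labels, sv_clusters={2})
print(kept)  # [2, 3, 4, 5]
```

In this toy example, clusters 0 and 3 contain a single class and their centers are not support vectors, so their points are discarded; cluster 1 is kept because it mixes classes, and cluster 2 is kept because its center is a support vector.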