K-farthest-neighbors-based concept boundary determination for support vector data description

Authors:
Yanshan Xiao;Bo Liu;Longbing Cao
Affiliations:
University of Technology, Sydney, Australia;University of Technology, Sydney, Australia;University of Technology, Sydney, Australia
Venue:
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Year:
2010

Citing 8
Cited 0

Bagging predictors

Machine Learning
Support vector domain description

Pattern Recognition Letters - Special issue on pattern recognition in practice VI
M-tree: An Efficient Access Method for Similarity Search in Metric Spaces

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Boosting support vector machines for text classification through parameter-free threshold relaxation

CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Support Vector Data Description

Machine Learning
Rapid and Brief communication: Low resolution face recognition based on support vector data description

Pattern Recognition
Fast Support Vector Data Description Using K-Means Clustering

ISNN '07 Proceedings of the 4th international symposium on Neural Networks: Advances in Neural Networks, Part III
Incremental query evaluation for support vector machines

Proceedings of the 18th ACM conference on Information and knowledge management

Abstract

Support vector data description (SVDD) is very useful for one-class classification. However, it incurs high time complexity when handling large-scale data. In this paper, we propose a novel and efficient method, named K-Farthest-Neighbors-based Concept Boundary Detection (KFN-CBD for short), to improve SVDD learning efficiency on large datasets. This work is motivated by the observation that the SVDD classifier is determined by the support vectors (SVs), so removing the non-support vectors (non-SVs) does not change the classifier but reduces computational cost. Our approach consists of two steps. In the first step, we propose the K-farthest-neighbors method to identify the samples around the hyper-sphere surface, which are more likely to be SVs. At the same time, a new tree search strategy for the M-tree is presented to speed up the K-farthest-neighbor query. In the second step, the non-SVs are eliminated from the training set, and only the identified boundary samples are used to train the SVDD classifier. By removing the non-SVs, the training time of SVDD can be substantially reduced. Extensive experiments show that KFN-CBD achieves a speedup of around 6 times over standard SVDD while obtaining classification quality comparable to that of training on the entire dataset.
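
The first step described above can be illustrated with a minimal sketch. The abstract does not give the scoring rule, so this sketch assumes one plausible choice: a sample's score is its mean Euclidean distance to its k farthest neighbors, and samples with the largest scores are treated as lying near the hyper-sphere surface. It also uses a brute-force search in input space rather than the M-tree strategy or a kernel-induced distance. The names `kfn_scores`, `select_boundary`, and `keep_fraction` are hypothetical and do not come from the paper.

```python
import math


def kfn_scores(points, k):
    """Mean Euclidean distance from each point to its k farthest neighbors.

    Brute-force O(n^2 log n) search; the paper's M-tree query is not modeled.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances to every other point, largest first.
        dists = sorted(
            (math.dist(p, q) for j, q in enumerate(points) if j != i),
            reverse=True,
        )
        scores.append(sum(dists[:k]) / k)
    return scores


def select_boundary(points, k, keep_fraction):
    """Indices of the samples with the largest scores (assumed boundary samples)."""
    scores = kfn_scores(points, k)
    n_keep = max(1, round(keep_fraction * len(points)))
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


# Toy data: one interior point at the origin and four points on the unit circle.
points = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
boundary = select_boundary(points, k=2, keep_fraction=0.8)
print(boundary)  # the interior point (index 0) is dropped
```

In the second step, only the samples at the returned indices would be passed to an SVDD trainer. The fraction of samples to keep is a free parameter in this sketch, and the abstract does not say how the method sets it.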