Support vector data description (SVDD) is widely used for one-class classification, but its training time grows quickly on large-scale data. In this paper, we propose a novel and efficient method, K-Farthest-Neighbors-based Concept Boundary Detection (KFN-CBD for short), to improve SVDD learning efficiency on large datasets. The work is motivated by the observation that an SVDD classifier is determined entirely by its support vectors (SVs), so removing the non-support vectors (non-SVs) does not change the classifier but does reduce the computational cost. Our approach consists of two steps. In the first step, we propose the K-farthest-neighbors method to identify the samples near the hyper-sphere surface, which are more likely to be SVs. To speed up the K-farthest-neighbor query, we also present a new tree search strategy for the M-tree. In the second step, the non-SVs are eliminated from the training set, and only the identified boundary samples are used to train the SVDD classifier. By removing the non-SVs, the training time of SVDD can be substantially reduced. Extensive experiments show that KFN-CBD achieves a speedup of around 6 times over standard SVDD while obtaining classification quality comparable to that of training on the entire dataset.
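The first step can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state how samples are scored, so the sketch assumes that a sample near the hyper-sphere surface has a larger mean distance to its K farthest neighbors than a sample near the center. It also replaces the M-tree search with a brute-force distance matrix. The names `kfn_scores`, `select_boundary`, and `keep_fraction` are hypothetical.

```python
import numpy as np

def kfn_scores(X, k):
    """Mean Euclidean distance from each sample to its k farthest samples.

    Brute force, O(n^2) memory; stands in for the M-tree query (assumption).
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    d = np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from rounding
    # np.partition leaves the k largest entries of each row in the last k columns.
    farthest = np.partition(d, -k, axis=1)[:, -k:]
    return farthest.mean(axis=1)

def select_boundary(X, k, keep_fraction):
    """Indices of the samples with the largest K-farthest-neighbor scores.

    Assumed scoring rule: a larger score means closer to the boundary.
    """
    scores = kfn_scores(X, k)
    n_keep = max(1, int(round(keep_fraction * len(X))))
    return np.argsort(scores)[-n_keep:]

# Toy data: a 2-D Gaussian blob standing in for the target class.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
idx = select_boundary(X, k=10, keep_fraction=0.2)

# Distance of every sample from the data mean, used only to inspect the selection.
radius_all = np.linalg.norm(X - X.mean(axis=0), axis=1)
print(len(idx), radius_all[idx].mean() > radius_all.mean())
```

In the second step, only `X[idx]` would be passed to an SVDD trainer. The values of `k` and `keep_fraction` here are arbitrary illustrations, not settings from the paper.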