A simple yet effective unsupervised classification rule for discriminating between normal and abnormal data is to accept a test object if its nearest-neighbor distance in a reference data set, assumed to model normal behavior, lies within a given threshold. This work investigates the effect of using a subset of the original data set as the classifier's reference set. To this end, the concept of a reference consistent subset is introduced, and finding the minimum-cardinality reference consistent subset is shown to be intractable. The CNNDD algorithm is then described, which computes a reference consistent subset in only two passes over the reference set. Experimental results reveal the advantages of condensing the data set and confirm the effectiveness of the proposed approach. A thorough comparison with related methods highlights the strengths and weaknesses of one-class nearest-neighbor-based training-set-consistent condensation.
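The acceptance rule and the notion of a consistent subset can be sketched as follows. This is a minimal illustration only: it assumes Euclidean distance and uses a naive single-pass greedy condensation heuristic, not the two-pass CNNDD algorithm itself; all function names are hypothetical.

```python
import math

def nn_distance(x, reference):
    # Distance from x to its nearest neighbor in the reference set.
    return min(math.dist(x, r) for r in reference)

def is_normal(x, reference, theta):
    # One-class rule: accept x as normal when its nearest-neighbor
    # distance in the reference set lies within the threshold theta.
    return nn_distance(x, reference) <= theta

def consistent_subset(reference, theta):
    # Greedy condensation sketch (not CNNDD): keep a reference point
    # only if the subset built so far does not already accept it.
    # The result accepts every point of the original reference set,
    # i.e. it is consistent, though generally not minimum-cardinality
    # (finding that minimum is intractable).
    subset = []
    for x in reference:
        if not subset or not is_normal(x, subset, theta):
            subset.append(x)
    return subset

reference = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.05, 1.0)]
subset = consistent_subset(reference, theta=0.2)
# Two tight clusters condense to one representative each,
# and every original point is still accepted as normal.
```

The condensed subset classifies test objects exactly as required on the original reference points while storing far fewer of them, which is the trade-off the paper studies.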