A popular method for discriminating between normal and abnormal data accepts a test object when its nearest-neighbor distances in a reference data set lie within a certain threshold. In this work we investigate the possibility of using a subset of the original data set as the reference set. We discuss the relationship between reference set size and generalization, and show that finding the minimum-cardinality reference consistent subset is intractable. Then, we describe an algorithm that computes a reference consistent subset with only two reference set passes. Experimental results confirm the effectiveness of the approach.
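The acceptance rule described above can be illustrated with a minimal Python sketch. It is not the algorithm from this work: it assumes Euclidean distance and assumes that the test compares the distance to the k-th nearest reference object against the threshold. The names `knn_distance`, `is_normal`, `k`, and `threshold`, as well as the toy data, are hypothetical.

```python
import math

def knn_distance(x, reference, k=1):
    """Distance from x to its k-th nearest neighbor in the reference set.

    Assumption: Euclidean distance; the source does not fix a metric.
    """
    dists = sorted(math.dist(x, r) for r in reference)
    return dists[k - 1]

def is_normal(x, reference, k=1, threshold=1.0):
    """Accept x as normal if its k-th nearest-neighbor distance is within the threshold."""
    return knn_distance(x, reference, k) <= threshold

# Toy reference set (hypothetical data): the corners of the unit square.
reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# A smaller reference set drawn from the original one, chosen by hand here.
subset = [(0.0, 0.0), (1.0, 1.0)]

inlier = (0.5, 0.5)   # close to the reference objects
outlier = (5.0, 5.0)  # far from every reference object

for name, point in [("inlier", inlier), ("outlier", outlier)]:
    print(name, is_normal(point, reference), is_normal(point, subset))
```

On these two test points the hand-picked subset yields the same decisions as the full reference set, which is the kind of reduction the abstract is concerned with. How a consistent subset is actually defined and computed is left to the paper itself.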