Research in anomaly detection suffers from a lack of realistic, publicly available problem sets. This paper discusses what properties such problem sets should possess. It then introduces a methodology for transforming existing classification data sets into ground-truthed benchmark data sets for anomaly detection. The methodology produces data sets that vary along three important dimensions: (a) point difficulty, (b) relative frequency of anomalies, and (c) clusteredness. We use the generated data sets to benchmark several popular anomaly detection algorithms under a range of conditions.
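
As a rough illustration of the general idea of turning a classification data set into a ground-truthed anomaly detection benchmark, the sketch below relabels chosen classes as anomalies and subsamples them to reach a target relative frequency. It is not the paper's procedure: the function `make_benchmark`, its parameters, and the subsampling rule are assumptions made for this example, and it covers only dimension (b); point difficulty and clusteredness are not modeled.

```python
import random


def make_benchmark(points, labels, anomaly_classes, anomaly_rate, seed=0):
    """Hypothetical helper: relabel a classification data set as normal/anomaly.

    Points whose class is in `anomaly_classes` become anomaly candidates; all
    other points are kept as normal. Candidates are subsampled so that
    anomalies make up roughly `anomaly_rate` of the returned data set.
    """
    if not 0.0 < anomaly_rate < 1.0:
        raise ValueError("anomaly_rate must lie strictly between 0 and 1")

    rng = random.Random(seed)
    normals = [p for p, y in zip(points, labels) if y not in anomaly_classes]
    candidates = [p for p, y in zip(points, labels) if y in anomaly_classes]

    # Solve a / (a + n) = rate for a, given n normal points.
    wanted = round(anomaly_rate * len(normals) / (1.0 - anomaly_rate))
    wanted = min(wanted, len(candidates))  # cannot draw more than we have

    anomalies = rng.sample(candidates, wanted)

    # Each item is (point, is_anomaly); the flag is the ground truth.
    data = [(p, False) for p in normals] + [(p, True) for p in anomalies]
    rng.shuffle(data)
    return data
```

Varying `anomaly_rate` while holding the source data fixed yields a family of benchmarks that differ only in how rare the anomalies are, which is one of the conditions under which detectors could then be compared.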