Signal detection theory: valuable tools for evaluating inductive learning. Proceedings of the Sixth International Workshop on Machine Learning.
C4.5: Programs for Machine Learning.
LOF: identifying density-based local outliers. SIGMOD '00: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data.
Machine Learning.
Enhancing Effectiveness of Outlier Detections for Low Density Patterns. PAKDD '02: Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining.
Cross-Feature Analysis for Detecting Ad-Hoc Routing Anomalies. ICDCS '03: Proceedings of the 23rd International Conference on Distributed Computing Systems.
Feature bagging for outlier detection. Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining.
Visualizing Similarity between Program Executions. ISSRE '05: Proceedings of the 16th IEEE International Symposium on Software Reliability Engineering.
Neural Computation.
LIBLINEAR: A Library for Large Linear Classification. The Journal of Machine Learning Research.
ACM Computing Surveys (CSUR).
The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter.
Anomaly Detection Using an Ensemble of Feature Models. ICDM '10: Proceedings of the 2010 IEEE International Conference on Data Mining.
LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST).
Estimating continuous distributions in Bayesian classifiers. UAI '95: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence.
Anomaly detection in large-scale data stream networks. Data Mining and Knowledge Discovery.
Diversity measures for one-class classifier ensembles. Neurocomputing.
Anomaly detection involves identifying rare data instances (anomalies) that come from a different class or distribution than the majority (which are simply called "normal" instances). Given a training set of only normal data, the semi-supervised anomaly detection task is to identify anomalies in future data. Good solutions to this task have applications in fraud and intrusion detection. The unsupervised anomaly detection task is different: given unlabeled, mostly normal data, identify the anomalies among them. Many real-world machine learning tasks, including many fraud and intrusion detection tasks, are unsupervised because it is impractical (or impossible) to verify all of the training data. We recently presented FRaC, a new approach to semi-supervised anomaly detection. FRaC uses normal instances to build an ensemble of feature models, and then identifies instances that disagree with those models as anomalous. In this paper, we investigate the behavior of FRaC experimentally and explain why FRaC is so successful. We also show that FRaC is a superior approach for the unsupervised as well as the semi-supervised anomaly detection task, compared to two well-known state-of-the-art anomaly detection methods (LOF and one-class support vector machines) and to an existing feature-modeling approach.
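The core idea described above — learn to predict each feature from the remaining features using only normal data, then flag instances whose observed values disagree with those predictions — can be sketched in a few lines. This is an illustrative simplification, not the authors' implementation: each per-feature model here is a 1-nearest-neighbour regressor and the score is a plain sum of absolute errors, whereas the paper's FRaC uses supervised learners and a normalized surprisal score.

```python
# Minimal sketch of the FRaC idea: cross-feature prediction on normal data,
# with disagreement between predictions and observations as the anomaly score.

def nn_predict(train, target_idx, instance):
    """Predict feature `target_idx` of `instance` from its other features,
    using the nearest normal training instance (squared Euclidean distance
    over the remaining features)."""
    def dist(row):
        return sum((row[i] - instance[i]) ** 2
                   for i in range(len(row)) if i != target_idx)
    nearest = min(train, key=dist)
    return nearest[target_idx]

def frac_score(train, instance):
    """Anomaly score: total absolute disagreement between observed feature
    values and the cross-feature predictions. Higher = more anomalous."""
    return sum(abs(instance[j] - nn_predict(train, j, instance))
               for j in range(len(instance)))

# Toy normal data in which the two features are correlated (y = 2x).
normal = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
print(frac_score(normal, (2.5, 5.0)))   # consistent with the pattern -> low score
print(frac_score(normal, (2.5, 0.0)))   # breaks the correlation -> high score
```

Note that the second test instance has individually unremarkable feature values; it scores high only because the values disagree with each other, which is exactly the kind of anomaly that per-feature density models miss and cross-feature models catch.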