FRaC: a feature-modeling approach for semi-supervised and unsupervised anomaly detection

Authors:
Keith Noto;Carla Brodley;Donna Slonim
Affiliations:
Department of Computer Science, Tufts University, Medford, USA 02155;Department of Computer Science, Tufts University, Medford, USA 02155;Department of Computer Science, Tufts University, Medford, USA 02155
Venue:
Data Mining and Knowledge Discovery
Year:
2012

Abstract

Anomaly detection involves identifying rare data instances (anomalies) that come from a different class or distribution than the majority (which are simply called "normal" instances). Given a training set of only normal data, the semi-supervised anomaly detection task is to identify anomalies among instances seen in the future. Good solutions to this task have applications in fraud and intrusion detection. The unsupervised anomaly detection task is different: given unlabeled, mostly normal data, identify the anomalies among them. Many real-world machine learning tasks, including many fraud and intrusion detection tasks, are unsupervised because it is impractical (or impossible) to verify all of the training data. We recently presented FRaC, a new approach for semi-supervised anomaly detection. FRaC is based on using normal instances to build an ensemble of feature models, and then identifying instances that disagree with those models as anomalous. In this paper, we investigate the behavior of FRaC experimentally and explain why FRaC is so successful. We also show that FRaC is a superior approach for the unsupervised as well as the semi-supervised anomaly detection task, compared to well-known state-of-the-art anomaly detection methods, LOF and one-class support vector machines, and to an existing feature-modeling approach.
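
The feature-modeling idea described in the abstract can be illustrated with a small sketch. This is not the authors' FRaC implementation; it is a simplified illustration under stated assumptions: every feature is real-valued, each feature model is an ordinary least-squares regressor that predicts one feature from all the others, and disagreement is measured as Gaussian surprisal (negative log-likelihood) of the prediction error, with the error spread estimated on the training data itself. The function names `fit_feature_models` and `anomaly_scores` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_feature_models(X_normal):
    """For each feature j, fit a least-squares linear model that predicts
    column j from the remaining columns, and record the residual spread.
    (Assumption: linear models and a Gaussian error model, for illustration.)"""
    n, d = X_normal.shape
    models = []
    for j in range(d):
        others = np.delete(X_normal, j, axis=1)
        A = np.hstack([others, np.ones((n, 1))])  # append intercept column
        w, *_ = np.linalg.lstsq(A, X_normal[:, j], rcond=None)
        resid = X_normal[:, j] - A @ w
        sigma = max(resid.std(), 1e-6)  # guard against zero variance
        models.append((w, sigma))
    return models

def anomaly_scores(models, X):
    """Sum, over features, the Gaussian surprisal of each observed value
    given the value predicted from the other features. Higher scores mean
    the instance disagrees more with the feature models."""
    n, d = X.shape
    scores = np.zeros(n)
    for j, (w, sigma) in enumerate(models):
        others = np.delete(X, j, axis=1)
        A = np.hstack([others, np.ones((n, 1))])
        resid = X[:, j] - A @ w
        scores += 0.5 * (resid / sigma) ** 2 + np.log(sigma * np.sqrt(2 * np.pi))
    return scores

# Synthetic "normal" data: three features driven by one hidden factor z.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 1))
X_train = np.hstack([z, 2 * z, -z]) + 0.1 * rng.normal(size=(200, 3))

models = fit_feature_models(X_train)
normal_point = np.array([[1.0, 2.0, -1.0]])  # respects the feature relationships
odd_point = np.array([[1.0, -2.0, 1.0]])     # each value is plausible, the combination is not
s_normal = anomaly_scores(models, normal_point)[0]
s_odd = anomaly_scores(models, odd_point)[0]
```

In this sketch the second test point receives a much higher score than the first, even though each of its individual values lies within the range seen in training, because its features disagree with what the other features predict. A more faithful treatment would estimate the error distributions on held-out data and could use a different learner per feature; those choices are left out here for brevity.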