Isolation-Based Anomaly Detection

Authors:
Fei Tony Liu; Kai Ming Ting; Zhi-Hua Zhou
Affiliations:
Monash University; Monash University; Nanjing University
Venue:
ACM Transactions on Knowledge Discovery from Data (TKDD)
Year:
2012

Abstract

Anomalies are data points that are few and different. We show that, because of these two properties, anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely on the basis of isolation, without employing any distance or density measure, and is therefore fundamentally different from all existing methods. As a result, iForest is able to exploit subsampling (i) to achieve a low linear time complexity and a small memory requirement and (ii) to deal effectively with the effects of swamping and masking. Our empirical evaluation shows that iForest outperforms ORCA, one-class SVM, LOF, and Random Forests in terms of AUC and processing time, and that it is robust against masking and swamping effects. iForest also works well in high-dimensional problems containing a large number of irrelevant attributes, and when no anomalies are available in the training sample.
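
The abstract does not spell out the algorithm, so what follows is only a minimal sketch of the isolation idea, assuming the commonly described path-length formulation: each tree partitions a small random subsample with random axis-parallel splits, and a point that is separated after few splits receives a score closer to 1. The function names (`c`, `build_tree`, `path_length`, `fit_forest`, `anomaly_score`), the defaults of 100 trees and a subsample of 256 points, and the synthetic data are all assumptions made for illustration; none of them come from the text above.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def c(n):
    """Approximate average path length of an unsuccessful search in a
    binary search tree of n points; used to normalise path lengths."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with random axis-parallel splits."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    # Simplification (assumption): only pick attributes that still vary.
    dims = [d for d in range(len(points[0]))
            if min(p[d] for p in points) < max(p[d] for p in points)]
    if not dims:
        return ("leaf", len(points))
    dim = rng.choice(dims)
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(x, tree, depth=0):
    """Number of splits needed to reach a leaf, plus an adjustment for
    the points left unseparated in that leaf."""
    if tree[0] == "leaf":
        return depth + c(tree[1])
    _, dim, split, left, right = tree
    child = left if x[dim] < split else right
    return path_length(x, child, depth + 1)


def fit_forest(data, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on its own random subsample."""
    if len(data) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    psi = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, psi


def anomaly_score(x, forest):
    """Score in (0, 1]; values nearer 1 mean x is isolated sooner."""
    trees, psi = forest
    mean_depth = sum(path_length(x, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c(psi))


# Hypothetical data: a Gaussian cluster plus one far-away point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(500)]
data.append((8.0, 8.0))

forest = fit_forest(data)
print("outlier score:", anomaly_score((8.0, 8.0), forest))
print("inlier score: ", anomaly_score((0.0, 0.0), forest))
```

In this sketch each tree only ever sees at most `sample_size` points and is cut off at a depth of about log2 of that size, which is how subsampling keeps both training time and memory small. The far-away point should score noticeably higher than a point in the middle of the cluster.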