Adaptive ROC-based ensembles of HMMs applied to anomaly detection

Authors:
Wael Khreich;Eric Granger;Ali Miri;Robert Sabourin
Affiliations:
Laboratoire d'imagerie, de vision et d'intelligence artificielle (LIVIA), École de technologie supérieure, Université du Québec, 1100 Notre-Dame Ouest, Montreal, QC, Canada;Laboratoire d'imagerie, de vision et d'intelligence artificielle (LIVIA), École de technologie supérieure, Université du Québec, 1100 Notre-Dame Ouest, Montreal, QC, Canada;School of Computer Science, Ryerson University, Toronto, Canada;Laboratoire d'imagerie, de vision et d'intelligence artificielle (LIVIA), École de technologie supérieure, Université du Québec, 1100 Notre-Dame Ouest, Montreal, QC, Canada
Venue:
Pattern Recognition
Year:
2012

Abstract

Hidden Markov models (HMMs) have been successfully applied in many intrusion detection applications, including anomaly detection from sequences of operating system calls. In practice, anomaly detection systems (ADSs) based on HMMs typically generate false alarms because they are designed using a limited amount of representative training data. Since new data may become available over time, an important feature of an ADS is the ability to accommodate newly acquired data incrementally, after it has originally been trained and deployed for operations. In this paper, a system based on the receiver operating characteristic (ROC) is proposed to efficiently adapt ensembles of HMMs (EoHMMs) in response to new data, according to a learn-and-combine approach. When a new block of training data becomes available, a pool of base HMMs is generated from the data using different numbers of HMM states and random initializations. The responses from the newly trained HMMs are then combined with those of the previously trained HMMs in ROC space using a novel incremental Boolean combination (incrBC) technique. Finally, specialized algorithms for model management make it possible to select a diversified EoHMM from the pool and to adapt Boolean fusion functions and thresholds for improved performance, while pruning redundant base HMMs. The proposed system is capable of changing the desired operating point during operations, and this point can be adjusted to changes in prior probabilities and costs of errors. Computer simulations conducted on synthetic and real-world host-based intrusion detection data indicate that the proposed system can achieve a significantly higher level of performance than when parameters of a single best HMM are estimated, at each learning stage, using reference batch and incremental learning techniques. It also outperforms the learn-and-combine approaches using static fusion functions (e.g., majority voting).
Over time, the proposed ensemble selection algorithms form compact EoHMMs, while maintaining or improving system accuracy. Pruning prevents the pool size from growing indefinitely, thereby reducing the storage space required for HMM parameters without negatively affecting the overall EoHMM performance. Although applied to HMM-based ADSs, the proposed approach is general and can be employed for a wide range of classifiers and detection applications.
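
To make the idea of combining detector responses in ROC space more concrete, the following is a minimal Python sketch of the general principle, not the incrBC algorithm from the paper. It assumes two detectors that output anomaly scores where a higher score means more anomalous (for an HMM this could be a negative log-likelihood), it tries only the AND and OR functions over all threshold pairs, and it keeps the combinations that lie on the upper ROC convex hull of a labeled validation set. An operating point is then chosen by minimizing expected cost for a given prior and error costs. All function names and parameters here are hypothetical and introduced only for illustration.

```python
from itertools import product

def roc_point(decisions, labels):
    """Return (fpr, tpr) for binary decisions against binary labels (1 = anomaly)."""
    pos = sum(labels)
    neg = len(labels) - pos
    tp = sum(1 for d, y in zip(decisions, labels) if d and y)
    fp = sum(1 for d, y in zip(decisions, labels) if d and not y)
    return fp / neg, tp / pos

def upper_hull(points):
    """Upper convex hull of ROC points, always including (0, 0) and (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last vertex while it lies on or below the segment hull[-2] -> p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def boolean_combine(scores_a, scores_b, labels):
    """Try AND/OR over all threshold pairs; return hull vertices with their rule.

    Each returned item is ((fpr, tpr), (function_name, thr_a, thr_b)); the rule
    is None for a trivial vertex that no combination produced.
    """
    funcs = {"AND": lambda a, b: a and b, "OR": lambda a, b: a or b}
    candidates = {}
    for ta, tb in product(sorted(set(scores_a)), sorted(set(scores_b))):
        da = [s >= ta for s in scores_a]
        db = [s >= tb for s in scores_b]
        for name, f in funcs.items():
            decisions = [f(x, y) for x, y in zip(da, db)]
            # Keep the first rule found for each distinct ROC point.
            candidates.setdefault(roc_point(decisions, labels), (name, ta, tb))
    return [(p, candidates.get(p)) for p in upper_hull(list(candidates))]

def min_cost_point(hull, p_pos, cost_fn, cost_fp):
    """Pick the hull vertex with the lowest expected cost for the given
    anomaly prior p_pos and costs of false negatives and false positives."""
    def expected_cost(item):
        fpr, tpr = item[0]
        return p_pos * cost_fn * (1 - tpr) + (1 - p_pos) * cost_fp * fpr
    return min(hull, key=expected_cost)

# Toy validation set: each detector catches a different half of the anomalies.
labels   = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.1, 0.2, 0.3, 0.1, 0.2, 0.1]
scores_b = [0.1, 0.2, 0.9, 0.8, 0.1, 0.3, 0.2, 0.1]

hull = boolean_combine(scores_a, scores_b, labels)
best = min_cost_point(hull, p_pos=0.1, cost_fn=1.0, cost_fp=1.0)
print(hull)
print(best)
```

In this toy data neither detector alone can detect all anomalies without false alarms, whereas a Boolean combination of the two can, which is the intuition behind fusing responses in ROC space. Because the hull is stored rather than a single decision rule, the operating point can be re-selected with `min_cost_point` whenever the assumed prior or error costs change, without recomputing the combinations.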