A Comparative Study of Unsupervised Machine Learning and Data Mining Techniques for Intrusion Detection

Authors:
Reza Sadoddin; Ali A. Ghorbani
Affiliations:
Network Security Laboratory, University of New Brunswick, Fredericton, New Brunswick, Canada; Network Security Laboratory, University of New Brunswick, Fredericton, New Brunswick, Canada
Venue:
MLDM '07 Proceedings of the 5th international conference on Machine Learning and Data Mining in Pattern Recognition
Year:
2007

Abstract

In recent years, machine learning and data mining techniques have received considerable attention from intrusion detection researchers as a way to address the weaknesses of knowledge-based detection techniques. This has led to the application of various supervised and unsupervised techniques to intrusion detection. In this paper, we conduct a set of experiments to analyze the performance of unsupervised techniques with respect to their main design choices: the heuristics proposed for distinguishing abnormal data from normal data, and the distribution of the dataset used for training. We evaluate the techniques with various distributions of training and test datasets, constructed from the KDD99 dataset, a widely accepted resource for IDS evaluations. This comparative study is not merely a blind comparison of unsupervised techniques; it also offers guidelines to researchers and practitioners on applying these techniques to intrusion detection.
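The two design choices named in the abstract can be illustrated with a small sketch. It is not the authors' experimental setup. It assumes a plain k-means clusterer, a cluster-size labeling heuristic (any cluster holding less than a hypothetical `max_share` of the training data is treated as abnormal), and synthetic two-dimensional data in place of KDD99. All function names, the `NORMAL` and `ATTACK` centers, and the parameter values are illustrative assumptions.

```python
import random

# Assumed synthetic stand-ins for "normal" and "attack" traffic (not KDD99).
NORMAL = (0.0, 0.0)
ATTACK = (20.0, 20.0)

def blob(rng, center, n):
    """Draw n 2-D points from a unit-variance Gaussian around center."""
    return [(rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)) for _ in range(n)]

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centroids):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))

def kmeans(points, k, iters=10):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def evaluate(attack_fraction, n_train=1000, n_test=200, max_share=0.2, seed=0):
    """Train on unlabeled data with the given attack share; return (detection rate, false positive rate)."""
    rng = random.Random(seed)
    n_attack = int(round(n_train * attack_fraction))
    train = blob(rng, NORMAL, n_train - n_attack) + blob(rng, ATTACK, n_attack)
    centroids = kmeans(train, k=2)
    # Labeling heuristic (an assumption): small clusters are considered abnormal.
    counts = [0] * len(centroids)
    for p in train:
        counts[nearest(p, centroids)] += 1
    anomalous = {i for i, c in enumerate(counts) if c / len(train) < max_share}
    test_normal = blob(rng, NORMAL, n_test)
    test_attack = blob(rng, ATTACK, n_test)
    detection = sum(nearest(p, centroids) in anomalous for p in test_attack) / n_test
    false_pos = sum(nearest(p, centroids) in anomalous for p in test_normal) / n_test
    return detection, false_pos

if __name__ == "__main__":
    for fraction in (0.02, 0.5, 0.9):
        print(fraction, evaluate(fraction))
```

On this toy data the heuristic detects the attacks only while they are rare in the training set. With an even split, neither cluster is small enough to be flagged. When attacks dominate, the normal cluster becomes the small one and is flagged instead. This shows, in miniature, how a labeling heuristic interacts with the training distribution.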