A new intrusion detection system using support vector machines and hierarchical clustering

Authors:
Latifur Khan; Mamoun Awad; Bhavani Thuraisingham
Affiliations:
University of Texas at Dallas, Dallas, USA;University of Texas at Dallas, Dallas, USA;University of Texas at Dallas, Dallas, USA
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
2007



Abstract

Whenever an intrusion occurs, the security and value of a computer system are compromised. Network-based attacks make it difficult for legitimate users to access network services by deliberately occupying or sabotaging network resources and services. Attackers do this by sending large amounts of network traffic, exploiting well-known faults in networking services, and overloading network hosts. Intrusion detection attempts to detect computer attacks by examining data records observed in processes on the network. It is split into two groups: anomaly detection systems and misuse detection systems. Anomaly detection searches for malicious behavior that deviates from established normal patterns. Misuse detection identifies intrusions that match known attack scenarios. Our interest here is in anomaly detection, and our proposed method is a scalable solution for detecting network-based anomalies. We use Support Vector Machines (SVM) for classification. The SVM is one of the most successful classification algorithms in the data mining area, but its long training time limits its use. This paper presents a study of reducing the training time of SVM, specifically when dealing with large data sets, using hierarchical clustering analysis. We use the Dynamically Growing Self-Organizing Tree (DGSOT) algorithm for clustering because it has been shown to overcome the drawbacks of traditional hierarchical clustering algorithms (e.g., hierarchical agglomerative clustering). Clustering analysis helps find the boundary points between two classes, which are the most qualified data points to train SVM. We present a new approach that combines SVM and DGSOT, starting with an initial training set and expanding it gradually using the clustering structure produced by the DGSOT algorithm.
We compare our approach with the Rocchio Bundling technique and random selection in terms of accuracy loss and training time gain using a single real benchmark data set. We show that our proposed variations contribute significantly to improving the training process of SVM with high generalization accuracy and outperform the Rocchio Bundling technique.
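
The idea of starting from a small training set and growing it along a cluster hierarchy can be illustrated with a minimal sketch. This is not the paper's implementation, and several pieces are assumptions made for illustration: a plain two-means split stands in for DGSOT node growth, the SVM is a linear one trained by subgradient descent on the hinge loss, the data are synthetic Gaussian blobs rather than intrusion records, and all function names (`split`, `train_linear_svm`, `cluster_svm`) and parameter values are hypothetical. Each round trains on cluster centers only, then refines just those clusters whose centers fall on or inside the margin.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]

def dist2(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def split(points, rng, iters=10):
    """Two-means split of one cluster: a crude stand-in for growing a DGSOT node."""
    if len(points) < 2:
        return []
    a, b = rng.sample(points, 2)
    for _ in range(iters):
        left = [p for p in points if dist2(p, a) <= dist2(p, b)]
        right = [p for p in points if dist2(p, a) > dist2(p, b)]
        if not left or not right:
            return []  # cannot be split further, so the node stays a leaf
        a, b = centroid(left), centroid(right)
    return [left, right]

def train_linear_svm(X, y, lam=0.1, steps=2000):
    """Linear soft-margin SVM via full-batch subgradient descent on the hinge loss."""
    Xa = [x + [1.0] for x in X]  # constant feature acts as the bias term
    w = [0.0] * len(Xa[0])
    for t in range(1, steps + 1):
        eta = 1.0 / (lam * t)  # decaying step size
        grad = [0.0] * len(w)
        for xi, yi in zip(Xa, y):
            if yi * dot(w, xi) < 1:  # point violates the margin
                for d, v in enumerate(xi):
                    grad[d] += yi * v / len(Xa)
        w = [(1 - eta * lam) * wd + eta * gd for wd, gd in zip(w, grad)]
    return w

def score(w, x):
    return dot(w, x + [1.0])

def cluster_svm(data, rng, rounds=4, margin=1.2):
    """data maps a label (+1 or -1) to its points; returns (weights, training-set size)."""
    nodes = [(pts, label) for label, pts in data.items()]  # one root node per class
    for _ in range(rounds):
        centers = [centroid(pts) for pts, _ in nodes]
        w = train_linear_svm(centers, [label for _, label in nodes])
        expanded, grew = [], False
        for (pts, label), c in zip(nodes, centers):
            # Nodes on or inside the margin play the role of support vectors.
            kids = split(pts, rng) if abs(score(w, c)) <= margin else []
            if kids:
                expanded.extend((k, label) for k in kids)
                grew = True
            else:
                expanded.append((pts, label))
        nodes = expanded
        if not grew:
            break
    centers = [centroid(pts) for pts, _ in nodes]
    w = train_linear_svm(centers, [label for _, label in nodes])  # final fit
    return w, len(nodes)

# Synthetic two-class data (an assumption; not the benchmark used in the paper).
rng = random.Random(0)
data = {
    +1: [[rng.gauss(2.5, 1), rng.gauss(2.5, 1)] for _ in range(200)],
    -1: [[rng.gauss(-2.5, 1), rng.gauss(-2.5, 1)] for _ in range(200)],
}
w, n_train = cluster_svm(data, rng)
total = sum(len(pts) for pts in data.values())
correct = sum((score(w, p) > 0) == (label > 0)
              for label, pts in data.items() for p in pts)
accuracy = correct / total
print(f"trained on {n_train} cluster centers instead of {total} points; "
      f"accuracy {accuracy:.3f}")
```

The design choice to note is that clusters far from the decision boundary are never expanded, so the training set grows only where it can change the separating hyperplane. The `margin` tolerance of 1.2 is an arbitrary illustrative value that absorbs the numerical slack of the simple optimizer used here.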