Many researchers have reported that a wide range of machine learning and pattern classification algorithms, trained and tested on the KDD intrusion detection data set, fail to identify most user-to-root and remote-to-local attacks. In light of this observation, this paper exposes the deficiencies and limitations of the KDD data set and argues that it should not be used to train pattern recognition or machine learning algorithms for misuse detection in these two attack categories. Multiple analysis techniques are employed to demonstrate, both objectively and subjectively, that the KDD training and testing data subsets represent dissimilar target hypotheses for the user-to-root and remote-to-local attack categories. These techniques consist of switching the roles of the original training and testing data subsets when developing a decision tree classifier, cross-validation on the merged training and testing data subsets, and qualitative and comparative analysis of the rules generated independently on the training and testing data subsets by the C4.5 decision tree algorithm. The analysis results clearly suggest that no pattern classification or machine learning algorithm can be trained successfully with the KDD data set to perform misuse detection for the user-to-root or remote-to-local attack categories. The analysis techniques used to assess the similarity between the two target hypotheses represented by the training and testing data subsets can also readily be generalized to data set pairs in other problem domains.
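The role-switching and merged cross-validation checks described above can be illustrated with a minimal sketch. It is not the paper's experiment: the data are synthetic rather than KDD records, a one-feature threshold rule stands in for the C4.5 decision tree, and every name here (`fit_stump`, `make_subset`, `cross_val_accuracy`, `report`) is hypothetical. The two subsets are deliberately built so that their labels depend on different features, which is one simple way for training and testing data to represent dissimilar target hypotheses.

```python
import random

def fit_stump(rows):
    """Fit a one-feature threshold rule by exhaustive search.
    This is a simple stand-in for a decision tree learner, not C4.5.
    rows: list of (feature_tuple, label) pairs with labels 0 or 1."""
    best = None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]  # candidate thresholds are the observed values
            for polarity in (0, 1):
                correct = sum(
                    1 for xi, yi in rows
                    if (polarity if xi[f] > t else 1 - polarity) == yi
                )
                if best is None or correct > best[0]:
                    best = (correct, f, t, polarity)
    _, f, t, polarity = best
    return f, t, polarity

def predict(model, x):
    f, t, polarity = model
    return polarity if x[f] > t else 1 - polarity

def accuracy(model, rows):
    return sum(predict(model, x) == y for x, y in rows) / len(rows)

def cross_val_accuracy(rows, k=5, seed=0):
    """Mean accuracy over k folds of a shuffled copy of rows."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train_rows = [r for j, fold in enumerate(folds) if j != i for r in fold]
        scores.append(accuracy(fit_stump(train_rows), folds[i]))
    return sum(scores) / k

def make_subset(n, informative_feature, seed):
    """Synthetic subset (assumption): the label depends on one feature only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        rows.append((x, int(x[informative_feature] > 0.5)))
    return rows

# Two subsets whose labels follow different rules.
train = make_subset(200, informative_feature=0, seed=1)
test = make_subset(200, informative_feature=1, seed=2)

report = {
    "train -> train": accuracy(fit_stump(train), train),
    "train -> test": accuracy(fit_stump(train), test),
    "test -> train (roles switched)": accuracy(fit_stump(test), train),
    "merged 5-fold CV": cross_val_accuracy(train + test),
}
for name, score in report.items():
    print(f"{name}: {score:.2f}")
```

In this toy setting, a model that fits its own subset well but transfers poorly in both directions, together with a merged cross-validation score that stays well below the within-subset score, is the kind of signal the abstract attributes to dissimilar target hypotheses. Comparing the rules returned by `fit_stump(train)` and `fit_stump(test)` is a crude analogue of the qualitative rule comparison.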