A learning system for discriminating variants of malicious network traffic
Proceedings of the Eighth Annual Cyber Security and Information Intelligence Research Workshop
A barrier to the widespread adoption of learning-based network intrusion detection tools is the in-situ training required for effective discrimination of malicious traffic. Supervised learning techniques require a quantity of labeled examples that is at best cost-prohibitive and often intractable to obtain. Recent advances in semi-supervised techniques have demonstrated the ability to generalize well from a significantly smaller set of labeled samples. In network intrusion detection, placing reasonable requirements on the number of training examples makes it realistic to expect that a learning-based system can be trained in the environment where it will be deployed. This in-situ training is necessary to ensure that the assumptions associated with the learning process hold, and thereby support a reasonable belief in the generalization ability of the resulting model. In this paper, we describe the application of a carefully selected nonparametric, semi-supervised learning algorithm to the network intrusion problem, and compare its performance to that of other model types using feature-based data derived from an operational network. We demonstrate dramatic performance improvements over supervised learning and anomaly detection in discriminating real, previously unseen, malicious network traffic while generating an order of magnitude fewer false alerts than any alternative, including a signature IDS tool deployed on the same network.
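To illustrate how a semi-supervised method can generalize from very few labels, the sketch below implements a generic graph-based label spreading scheme in NumPy. It is not the algorithm used in the paper, which the abstract does not name. The function name `spread_labels`, the parameters `sigma`, `alpha`, and `n_iter`, and the synthetic two-cluster "benign"/"malicious" feature data are all assumptions made for illustration.

```python
import numpy as np


def spread_labels(X, y, sigma=1.0, alpha=0.9, n_iter=200):
    """Generic graph-based label spreading (illustrative, not the paper's method).

    X: (n, d) feature matrix. y: (n,) integer labels, with -1 meaning unlabeled.
    Returns a predicted class for every row of X.
    """
    n = X.shape[0]
    classes = np.unique(y[y >= 0])

    # RBF affinity between every pair of samples, with no self-loops.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    degree = W.sum(axis=1)
    S = W / np.sqrt(np.outer(degree, degree))

    # One-hot rows for labeled samples, all-zero rows for unlabeled ones.
    Y0 = np.zeros((n, classes.size))
    for j, c in enumerate(classes):
        Y0[y == c, j] = 1.0

    # Repeatedly diffuse label mass along the graph while pulling
    # back toward the known labels.
    F = Y0.copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1.0 - alpha) * Y0
    return classes[np.argmax(F, axis=1)]


# Synthetic stand-in for flow features: two well-separated clusters.
rng = np.random.default_rng(0)
benign = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2))
malicious = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(20, 2))
X = np.vstack([benign, malicious])

# Only one labeled example per class; everything else is unlabeled (-1).
y = np.full(40, -1)
y[0] = 0   # one known benign sample
y[20] = 1  # one known malicious sample

pred = spread_labels(X, y)
print("samples labeled malicious:", int((pred == 1).sum()))
```

With one labeled sample per cluster, the labels diffuse through the similarity graph to the remaining 38 unlabeled points. Real network traffic is far less cleanly separated than this toy data, so the sketch conveys only the mechanism, not the performance reported above.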