On Dataset Biases in a Learning System with Minimum A Priori Information for Intrusion Detection

  • Venue: CNSR '04 Proceedings of the Second Annual Conference on Communication Networks and Services Research
  • Year: 2004

Abstract

A critical design decision in the construction of intrusion detection systems is often the selection of features describing the characteristics of the data being learnt. Selecting features often requires a priori or expert knowledge and may lead to the introduction of specific attack biases, intended or otherwise. To this end, summarized network connections from the DARPA 98 Lincoln Labs dataset are employed for training and testing a data-driven learning architecture. The learning architecture is composed from a hierarchy of self-organizing feature maps. Such a scheme is entirely unsupervised, thus the quality of the intrusion detection system is directly influenced by the quality of the dataset. Dataset biases are investigated through three different dataset partitions: 10% KDD (default training dataset); normal connections alone; 50/50 mix of attack and normal. The three resulting intrusion detection systems appear to be competitive with the alternative cluster-based data mining approaches.
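
The three dataset partitions named in the abstract could be derived from a single labelled connection list along the following lines. This is a minimal sketch, not the authors' procedure. The `Record` type, the `build_partitions` function, the partition names, and the use of the literal label `"normal"` are all hypothetical. It is also an assumption that the 50/50 mix is formed by downsampling the larger class. The abstract does not say how the mix was constructed.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical record type: (feature vector, label). The label is assumed to be
# the string "normal" for benign connections and an attack name otherwise.
Record = Tuple[List[float], str]


def build_partitions(records: List[Record], seed: int = 0) -> Dict[str, List[Record]]:
    """Build three training partitions from one labelled connection list."""
    rng = random.Random(seed)  # fixed seed so the sampled mix is reproducible

    normal = [r for r in records if r[1] == "normal"]
    attack = [r for r in records if r[1] != "normal"]

    # Assumption: balance the classes by sampling the same number of
    # connections from each, limited by the smaller class.
    n = min(len(normal), len(attack))
    balanced = rng.sample(normal, n) + rng.sample(attack, n)
    rng.shuffle(balanced)

    return {
        "default": list(records),   # the training set as given (e.g. 10% KDD)
        "normal_only": normal,      # normal connections alone
        "balanced": balanced,       # 50/50 mix of attack and normal
    }
```

Because the learning scheme is unsupervised, the labels in this sketch are used only to form the partitions. Only the feature vectors would be passed on to training.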