Non-stationary Data Mining: The Network Security Issue

  • Authors:
  • Sergio Decherchi;Paolo Gastaldo;Judith Redi;Rodolfo Zunino

  • Affiliations:
  • Dept. of Biophysical and Electronic Engineering (DIBE), Genoa University, Genoa, Italy 16145;Dept. of Biophysical and Electronic Engineering (DIBE), Genoa University, Genoa, Italy 16145;Dept. of Biophysical and Electronic Engineering (DIBE), Genoa University, Genoa, Italy 16145;Dept. of Biophysical and Electronic Engineering (DIBE), Genoa University, Genoa, Italy 16145

  • Venue:
  • ICANN '08 Proceedings of the 18th international conference on Artificial Neural Networks, Part II
  • Year:
  • 2008

Abstract

Data mining applications explore large amounts of heterogeneous data in search of consistent information. In such a challenging context, empirical learning methods aim to optimize prediction on unseen data, and an accurate estimate of the generalization error is of paramount importance. The paper shows that the theoretical formulation based on the Vapnik-Chervonenkis dimension (d_vc) can be of practical interest when applied to clustering methods for data-mining applications. The presented research adopts the K-Winner Machine (KWM) as a clustering-based, semi-supervised classifier; in addition to fruitful theoretical properties, the model provides a general criterion for evaluating the applicability of Vapnik's generalization predictions in data mining. The general approach is verified experimentally in the practical problem of detecting intrusions in computer networks. Empirical results prove that the KWM model can effectively support such a difficult classification task and combine unsupervised and supervised learning.
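
To make the role of the Vapnik-Chervonenkis dimension in a generalization estimate concrete, the following is a minimal sketch of the classic textbook form of Vapnik's bound: with probability at least 1 - eta, the true error is at most the empirical error plus a confidence term that grows with d_vc and shrinks with the sample size n. This is an illustrative assumption, not necessarily the exact formulation used in the paper for the KWM model; the function names `vc_confidence_term` and `vc_bound` and the default `eta` are hypothetical.

```python
import math


def vc_confidence_term(n, d_vc, eta=0.05):
    """Confidence term of the classic Vapnik bound (textbook form, assumed here).

    n     -- number of training samples
    d_vc  -- Vapnik-Chervonenkis dimension of the classifier family
    eta   -- the bound holds with probability at least 1 - eta
    """
    if not 0 < d_vc <= n:
        raise ValueError("require 0 < d_vc <= n")
    if not 0 < eta < 1:
        raise ValueError("require 0 < eta < 1")
    numerator = d_vc * (math.log(2 * n / d_vc) + 1) - math.log(eta / 4)
    return math.sqrt(numerator / n)


def vc_bound(emp_error, n, d_vc, eta=0.05):
    """Upper bound on the generalization error: empirical error plus confidence term."""
    return emp_error + vc_confidence_term(n, d_vc, eta)


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show how the bound behaves.
    for n in (1_000, 100_000):
        print(n, round(vc_bound(0.02, n, d_vc=10), 4))
```

The sketch illustrates why large data-mining corpora matter for this kind of analysis: for a fixed d_vc the confidence term tightens as n grows, so the bound only becomes informative when the sample is large relative to the complexity of the model.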