Using sub-sequence information with kNN for classification of sequential data

  • Authors:
  • N. Pradeep Kumar;M. Venkateswara Rao;P. Radha Krishna;Raju S. Bapi

  • Affiliations:
  • Institute for Development and Research in Banking Technology IDRBT, Hyderabad, India;Institute for Development and Research in Banking Technology IDRBT, Hyderabad, India;Institute for Development and Research in Banking Technology IDRBT, Hyderabad, India;University of Hyderabad, Gachibowli, Hyderabad, India

  • Venue:
  • ICDCIT'05: Proceedings of the Second International Conference on Distributed Computing and Internet Technology
  • Year:
  • 2005

Abstract

With the enormous growth of data that exhibit sequentiality, it has become important to investigate the impact of the sequential information embedded within the data and to classify such data efficiently. k-Nearest Neighbor (kNN) has been used, and has proved to be an efficient classification technique, for two-class problems. This paper uses a sliding-window approach to extract sub-sequences of various lengths and classifies the sequences using kNN. We conducted experiments on the DARPA 98 IDS dataset using several distance/similarity measures: Jaccard similarity, Cosine similarity, Euclidean distance and the Binary Weighted Cosine (BWC) measure. Our results demonstrate that sub-sequence information enhances kNN classification accuracy for sequential data, irrespective of the distance/similarity metric used.
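The sliding-window idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how sub-sequences are encoded or how the BWC measure is computed, so the sketch assumes each sequence is represented as the set of its fixed-length windows and compares sequences with Jaccard similarity only. The function names, the toy event names, and the labels are all hypothetical.

```python
from collections import Counter


def subsequences(seq, window):
    """Return the set of contiguous sub-sequences of length `window`."""
    return {tuple(seq[i:i + window]) for i in range(len(seq) - window + 1)}


def jaccard(a, b):
    """Jaccard similarity between two sets (defined as 1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def knn_classify(query, training, k=3, window=2):
    """Label `query` by majority vote among its k most similar training sequences."""
    q = subsequences(query, window)
    ranked = sorted(
        training,
        key=lambda item: jaccard(q, subsequences(item[0], window)),
        reverse=True,  # most similar first
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up event sequences (assumption: not taken from the DARPA 98 data).
training = [
    (["open", "read", "write", "close"], "normal"),
    (["open", "read", "read", "close"], "normal"),
    (["open", "write", "close"], "normal"),
    (["open", "exec", "chmod", "close"], "attack"),
    (["exec", "chmod", "exec", "close"], "attack"),
]
query = ["open", "read", "write", "write", "close"]

predicted = knn_classify(query, training, k=3, window=2)
print(predicted)
```

With a window of length 1 the representation reduces to the set of distinct events, so order is ignored; larger windows retain local ordering, which is the sub-sequence information the abstract refers to. Swapping `jaccard` for another measure would require a vector encoding of the windows, which this sketch does not attempt.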