Spatial Representation for Efficient Sequence Classification

  • Authors:
  • Pavel P. Kuksa; Vladimir Pavlovic

  • Venue:
  • ICPR '10 Proceedings of the 2010 20th International Conference on Pattern Recognition
  • Year:
  • 2010

Abstract

We present a simple, general feature representation of sequences that supports efficient inexact matching, comparison, and classification of sequential data. The approach, recently introduced for biological sequence classification, exploits a novel multi-scale representation of strings. This representation leads to very efficient algorithms for string comparison that are independent of the alphabet size. We show that these algorithms generalize to a wide range of sequence classification problems in diverse domains, such as music and text classification. The resulting algorithms have low computational cost and scale well across application domains. The new method achieves order-of-magnitude running time improvements over existing state-of-the-art approaches while matching or exceeding their predictive accuracy.
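
The abstract does not spell out the representation, so the following is only an illustrative sketch of what a spatial, multi-scale string comparison might look like, not the authors' algorithm. It assumes, purely for illustration, that a feature is a pair of symbols together with the distance between them, for every distance up to a hypothetical `max_dist` parameter. The function names `spatial_features`, `spatial_kernel`, and `normalized_kernel` are likewise invented here. Under these assumptions the features are stored in a hash map, so the cost of a comparison depends on sequence length and `max_dist` rather than on the alphabet size.

```python
from collections import Counter
from math import sqrt


def spatial_features(seq, max_dist=3):
    """Count (symbol, distance, symbol) triples for pairs up to max_dist apart.

    Assumed feature definition for illustration only. `seq` may be a string
    or any sequence of hashable tokens (characters, words, notes).
    """
    counts = Counter()
    n = len(seq)
    for i in range(n):
        for d in range(1, max_dist + 1):
            j = i + d
            if j >= n:
                break
            counts[(seq[i], d, seq[j])] += 1
    return counts


def spatial_kernel(x, y, max_dist=3):
    """Inner product of the two feature count vectors."""
    fx = spatial_features(x, max_dist)
    fy = spatial_features(y, max_dist)
    # Iterate over the smaller map; a missing key in a Counter reads as 0.
    if len(fx) > len(fy):
        fx, fy = fy, fx
    return sum(count * fy[key] for key, count in fx.items())


def normalized_kernel(x, y, max_dist=3):
    """Cosine-normalized kernel value in [0, 1]; 0.0 if either side is empty."""
    kxx = spatial_kernel(x, x, max_dist)
    kyy = spatial_kernel(y, y, max_dist)
    if kxx == 0 or kyy == 0:
        return 0.0
    return spatial_kernel(x, y, max_dist) / sqrt(kxx * kyy)
```

Because the sketch only requires hashable symbols, the same functions accept a protein string, a list of words, or a list of note names, which is one plausible reading of how a single representation could serve several domains. The resulting kernel values could then be passed to any kernel-based classifier.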