A New Pattern Representation Scheme Using Data Compression

  • Authors: Toshinori Watanabe; Ken Sugawara; Hiroshi Sugihara
  • Affiliations: Univ. of Electro-Communications, Tokyo, Japan (all authors)
  • Venue: IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Year: 2002

Abstract

We propose a new pattern representation scheme based on data compression (PRDC) for media data analysis. PRDC has two parts: an encoder that translates input data into text, and a set of text compressors that generates a compression ratio vector (CV). The CV serves as a feature of the input data. Preparing a set of media-specific encoders makes PRDC widely applicable. Both analysis tasks, categorization (class formation) and recognition (classification), can be carried out using CVs. After a mathematical discussion of the realizability of PRDC, we demonstrate its wide applicability through automatic categorization and/or recognition of music, voice, genome, handwritten sketches, and color images.
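
The two-part pipeline described in the abstract (encoder, then a set of compressors yielding a CV) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the compressors or the encoders, so the sketch assumes zlib's DEFLATE with preset dictionaries as a stand-in for the set of text compressors, and a toy up/down/same encoder for numeric sequences. The names `encode_sequence`, `compression_ratio`, and `compression_vector` are hypothetical.

```python
import zlib


def encode_sequence(values):
    """Toy encoder (assumption): map a numeric sequence to text.

    Each step becomes 'U' (up), 'D' (down), or 'S' (same).
    """
    symbols = []
    for prev, cur in zip(values, values[1:]):
        symbols.append("U" if cur > prev else "D" if cur < prev else "S")
    return "".join(symbols).encode("ascii")


def compression_ratio(text, dictionary):
    """Compressed size / original size, with `dictionary` as a zlib preset dictionary."""
    if not text:
        raise ValueError("text must be non-empty")
    comp = zlib.compressobj(level=9, zdict=dictionary)
    compressed = comp.compress(text) + comp.flush()
    return len(compressed) / len(text)


def compression_vector(text, dictionaries):
    """One compression ratio per dictionary: the CV used as the feature vector."""
    return [compression_ratio(text, d) for d in dictionaries]


# Two stand-in "compressors", each primed with text from a different class.
dict_english = b"the quick brown fox jumps over the lazy dog " * 4
dict_genome = b"ACGTTGCAAGCTTAGGCTA" * 8

sample = b"the lazy dog jumps over the quick brown fox"
cv = compression_vector(sample, [dict_english, dict_genome])
print(cv)  # the first component should be smaller: the sample resembles dict_english
```

Under these assumptions, a text compresses better against a dictionary built from similar data, so the position of the smallest CV component hints at the class, and whole CVs can be fed to any standard clustering or classification method.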