Using Wavelets to Classify Documents

  • Authors:
  • Geraldo Xexéo;Jano de Souza;Patrícia F. Castro;Wallace A. Pinheiro

  • Venue:
  • WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01
  • Year:
  • 2008

Abstract

Currently, discrete Fourier and cosine transforms are used to classify documents. This article proposes a new strategy that uses wavelets to represent and reduce text data. Wavelets have been used extensively for dimensionality reduction in signal processing. In this work, we show that a text document, once its terms have been reorganized by a simple procedure, can be treated as a signal and analyzed with signal processing tools. We demonstrate that this new representation captures the most relevant features of a document in a concise form, and that it improves the performance of the classification algorithm.
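
The idea of treating a document as a signal and reducing it with wavelets can be illustrated with a minimal sketch. The abstract does not specify the term reorganization procedure or the wavelet family, so the sketch below makes two assumptions: terms are laid out in a fixed vocabulary order (the hypothetical `vocabulary` argument), and the orthonormal Haar wavelet is used. The function names `term_signal`, `haar_step`, and `haar_reduce` are illustrative and do not come from the paper.

```python
import math
from collections import Counter


def term_signal(text, vocabulary):
    """Map a document to a signal: one term count per vocabulary position.

    Assumption: the fixed vocabulary order stands in for the paper's
    term reorganization step, which the abstract does not describe.
    """
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocabulary]


def haar_step(signal):
    """Apply one level of the orthonormal Haar transform.

    Returns (approximation, detail), each half the length of the input.
    The input length must be even.
    """
    root2 = math.sqrt(2.0)
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in pairs]
    return approx, detail


def haar_reduce(signal, levels):
    """Reduce dimensionality by keeping only the approximation
    coefficients after `levels` Haar steps (detail is discarded)."""
    block = 2 ** levels
    # Zero-pad so the length is divisible by 2**levels.
    current = list(signal) + [0.0] * (-len(signal) % block)
    for _ in range(levels):
        current, _detail = haar_step(current)
    return current
```

With an eight-term vocabulary and `levels=2`, `haar_reduce` turns an eight-dimensional count vector into a two-dimensional one that could then be passed to any classifier. Discarding the detail coefficients is only one possible reduction rule; keeping the largest-magnitude coefficients is a common alternative.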