A Novel Web Text Mining Method Using the Discrete Cosine Transform

  • Authors:
  • Laurence A. F. Park;Marimuthu Palaniswami;Kotagiri Ramamohanarao

  • Venue:
  • PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
  • Year:
  • 2002

Abstract

Fourier Domain Scoring (FDS) has been shown to give a 60% improvement in precision over the existing vector space methods, but its index requires a large amount of storage space. We propose a new Web text mining method that uses the discrete cosine transform (DCT) to extract useful information from text documents and to provide improved document ranking without having to store excessive data. While the new method preserves the performance of the FDS method, it gives a 40% improvement in precision over the established text mining methods when using only 20% of the storage space required by FDS.
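
To make the storage idea concrete, the following is a minimal sketch of how a DCT can compact positional term information. It is not the authors' indexing or scoring procedure, which the abstract does not describe. It assumes a document is split into a fixed number of equal segments, that a term's occurrences per segment form a short signal, and that only the first few DCT coefficients of that signal are stored. The names `term_signal`, `dct2_ortho`, `compress`, `n_bins`, and `keep` are hypothetical and chosen for illustration only.

```python
import numpy as np


def term_signal(tokens, term, n_bins=8):
    """Count occurrences of `term` in each of n_bins equal-width segments.

    Assumption: a per-segment count is used as the positional signal.
    """
    signal = np.zeros(n_bins)
    n = len(tokens)
    for pos, tok in enumerate(tokens):
        if tok == term:
            signal[pos * n_bins // n] += 1
    return signal


def dct2_ortho(x):
    """Orthonormal DCT-II of a 1-D array, computed from the definition."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return scale * (basis @ x)


def compress(signal, keep):
    """Keep only the first `keep` (lowest-frequency) DCT coefficients.

    Assumption: truncation is the storage-saving step in this sketch.
    """
    return dct2_ortho(signal)[:keep]


tokens = "a b a c a b d a".split()
signal = term_signal(tokens, "a", n_bins=4)
stored = compress(signal, keep=2)
```

Because the DCT of a real signal is real-valued and tends to concentrate energy in the low-frequency coefficients, truncating the coefficient vector is one plausible way to trade a small loss of positional detail for a smaller index. How the paper actually selects and quantizes coefficients is not stated in the abstract.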