A Novel Web Text Mining Method Using the Discrete Cosine Transform

Authors:
Laurence A. F. Park;Marimuthu Palaniswami;Kotagiri Ramamohanarao
Venue:
PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
Year:
2002

Citing 7
Cited 7

Numerical recipes in C (2nd ed.): the art of scientific computing

Numerical recipes in C (2nd ed.): the art of scientific computing
Fourier-related transforms, fast algorithms and applications

Fourier-related transforms, fast algorithms and applications
Exploring the similarity space

ACM SIGIR Forum
Managing gigabytes (2nd ed.): compressing and indexing documents and images

Managing gigabytes (2nd ed.): compressing and indexing documents and images
Internet Document Filtering Using Fourier Domain Scoring

PKDD '01 Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery
Fourier Domain Scoring: A Novel Document Ranking Method

IEEE Transactions on Knowledge and Data Engineering
Discrete Cosine Transform

IEEE Transactions on Computers

A novel document retrieval method using the discrete wavelet transform

ACM Transactions on Information Systems (TOIS)
Fourier Domain Scoring with Document Structure Consideration

Proceedings of the 2006 conference on Advances in Intelligent IT: Active Media Technology 2006
Supervised web document classification using discrete transforms, active hypercontours and expert knowledge

WImBI'06 Proceedings of the 1st WICI international conference on Web intelligence meets brain informatics
Broadening vector space schemes for improving the quality of information retrieval

APWeb'05 Proceedings of the 7th Asia-Pacific web conference on Web Technologies Research and Development
Spectral-based document retrieval

ASIAN'04 Proceedings of the 9th Asian Computing Science conference on Advances in Computer Science: dedicated to Jean-Louis Lassez on the Occasion of His 5th Cycle Birthday
Web textual documents scoring based on discrete transforms with fuzzy weighting

AWIC'05 Proceedings of the Third international conference on Advances in Web Intelligence
Structure-based document model with discrete wavelet transforms and its application to document classification

AusDM '08 Proceedings of the 7th Australasian Data Mining Conference - Volume 87

Quantified Score

Hi-index	0.02

Abstract

Fourier Domain Scoring (FDS) has been shown to give a 60% improvement in precision over the existing vector space methods, but its index requires a large storage space. We propose a new Web text mining method using the discrete cosine transform (DCT) to extract useful information from text documents and to provide improved document ranking, without having to store excessive data. While the new method preserves the performance of the FDS method, it gives a 40% improvement in precision over the established text mining methods when using only 20% of the storage space required by FDS.