Three methods of representing hypertext, based on links, terms, and text compressibility, are compared for their usefulness in document classification. The documents were selected from Wikipedia articles in five distinct categories. For each representation, dimensionality was reduced by Principal Component Analysis (PCA), which also provides a rough visual presentation of the data. The compression-based feature space needed about five times fewer principal components than the term-based or link-based representations to reach 90% cumulative variance, while giving comparable classification results with Support Vector Machines.
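The abstract does not say which compressor or feature construction was used, so the following is only a minimal sketch under stated assumptions. It assumes zlib as the compressor and the normalized compression distance (NCD) to a set of reference documents as the compression-based features, and it shows how the number of components needed to reach a cumulative variance threshold could be counted. All function names are hypothetical.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two strings.

    Assumption: NCD stands in for the paper's compressibility measure,
    which the abstract does not specify.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def compression_features(doc: str, references: list[str]) -> list[float]:
    """Represent a document by its NCD to each reference document."""
    return [ncd(doc, ref) for ref in references]


def components_for_variance(variances: list[float], threshold: float = 0.90) -> int:
    """Smallest number of leading components whose cumulative share
    of the total variance reaches the threshold."""
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total variance must be positive")
    running = 0.0
    for k, v in enumerate(ordered, start=1):
        running += v
        if running / total >= threshold:
            return k
    return len(ordered)
```

With one reference document per category, each article becomes a short vector of distances. That gives a much lower-dimensional starting point than a term or link vocabulary, and a compact feature space of this kind would be expected to need fewer principal components to reach a given cumulative variance.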