This paper introduces a rule-based, context-dependent word clustering method, with the rules derived from various domain databases and from the orthographic properties of the word text. Besides significantly reducing dimensionality, our experiments show that such rule-based word clustering improves the overall accuracy of extracting bibliographic fields from references by 8%, and improves the class-specific performance on the line classification of document headers by 18.32% on average.
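To make the idea concrete, the following is a minimal sketch of rule-based word clustering, not the paper's actual rule set. The cluster labels, the contents of `DOMAIN_DATABASES`, the regular expressions in `ORTHOGRAPHIC_RULES`, and the function names are all hypothetical, chosen only for illustration. The sketch also ignores context: each token is clustered on its own, whereas the method described above is context-dependent.

```python
import re

# Hypothetical domain databases: each maps a cluster label to a small word list.
# The databases actually used in the paper are not named in the abstract.
DOMAIN_DATABASES = {
    ":month:": {"january", "jan", "february", "feb", "march"},
    ":country:": {"usa", "germany", "japan"},
}

# Hypothetical orthographic rules, tried in order; the first match wins.
ORTHOGRAPHIC_RULES = [
    (":email:", re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")),
    (":year:", re.compile(r"^(19|20)\d{2}$")),
    (":digits:", re.compile(r"^\d+$")),
    (":initial:", re.compile(r"^[A-Z]\.$")),
    (":capitalized:", re.compile(r"^[A-Z][a-z]+$")),
]


def cluster_word(word):
    """Map one token to a cluster label, or to its lowercase form if no rule fires."""
    token = word.strip(",;:()")          # drop surrounding punctuation
    lowered = token.lower().rstrip(".")  # normalized form for database lookup
    # Domain-database lookups take priority over orthographic patterns.
    for label, vocabulary in DOMAIN_DATABASES.items():
        if lowered in vocabulary:
            return label
    for label, pattern in ORTHOGRAPHIC_RULES:
        if pattern.match(token):
            return label
    return token.lower()


def cluster_line(line):
    """Replace every whitespace-separated token in a line by its cluster label."""
    return [cluster_word(token) for token in line.split()]
```

Under these assumed rules, every four-digit year, every capitalized name, and every listed month collapses into a single feature, which is where the dimensionality reduction comes from: a classifier sees a handful of cluster labels instead of thousands of distinct surface forms.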