This paper describes how natural language processing and ontologies are exploited for automatic text categorisation. The approach is part of the MANENT system, an infrastructure for integrating, structuring and searching Digital Libraries. The paper discusses the extraction of structural information and the automatic classification of records based on natural language understanding and the WordNet Domains taxonomy. Two versions of the classification algorithm are compared, and the improvements of the new approach are set out. In particular, using semantic connections between words refines the classification results and turns misclassifications into non-classifications: records that would otherwise receive a wrong label are left unlabelled.
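
As a rough illustration of domain-based categorisation with abstention, the following minimal sketch votes over the domain labels of the words in a record and declines to classify when the evidence is weak or tied. It is not the MANENT algorithm: the `TOY_DOMAINS` lookup, the `classify` function and the `min_votes` threshold are all hypothetical stand-ins, and a real system would draw its labels from the WordNet Domains resource rather than a hand-written dictionary.

```python
from collections import Counter

# Hypothetical word-to-domain lookup standing in for WordNet Domains labels.
# The entries are invented for illustration only.
TOY_DOMAINS = {
    "bank": ["economy", "geography"],
    "loan": ["economy"],
    "interest": ["economy", "psychology"],
    "river": ["geography"],
    "symphony": ["music"],
}


def classify(text, min_votes=2):
    """Return the most-voted domain, or None when the evidence is weak or tied."""
    votes = Counter()
    for token in text.lower().split():
        word = token.strip(".,;:!?")
        # Each known word casts one vote for every domain it is labelled with.
        for domain in TOY_DOMAINS.get(word, []):
            votes[domain] += 1

    if not votes:
        return None  # no known words: leave the record unclassified

    ranked = votes.most_common(2)
    top_domain, top_count = ranked[0]

    # Abstain rather than guess when support is below the threshold (assumed value).
    if top_count < min_votes:
        return None
    # Abstain when the two best domains are tied.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None
    return top_domain
```

Here an ambiguous word such as "bank" on its own yields no label, while the surrounding words "loan" and "interest" push the record towards a single domain. Returning `None` instead of the best guess is the design choice that trades a wrong label for no label.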