To make Internet search more efficient, classification techniques backed by semantic resources can restrict the search to the user's domain of interest. In this work, we assess the impact of integrating semantic knowledge into text classification. This integration can be realized in different ways; the one we choose in this paper is conceptualization, that is, mapping terms in the text to their corresponding concepts in a semantic resource. We examine the impact of different conceptualization strategies on three traditional text classification methods: Rocchio, Support Vector Machines (SVMs), and Naïve Bayes (NB). We restrict our experiments to the biomedical domain: conceptualization is applied to the OHSUMED corpus, mapping terms to their corresponding concepts in the UMLS Metathesaurus so that their meaning is taken into account during classification. Preliminary results show promising improvements.
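
As a rough illustration of what conceptualization can look like in practice, the following minimal Python sketch maps terms to concept identifiers before classification. It is not the authors' implementation. The three strategy names ("add", "replace", "only") are assumptions chosen for illustration, since the abstract does not enumerate the strategies it compares. The lookup table `TOY_CONCEPTS` and its identifiers are hypothetical placeholders, not real UMLS entries; an actual pipeline would resolve terms against the UMLS Metathesaurus.

```python
from typing import Dict, List

# Hypothetical term-to-concept table. The identifiers are placeholders,
# not real UMLS concept identifiers.
TOY_CONCEPTS: Dict[str, str] = {
    "heart attack": "CONCEPT_MI",
    "myocardial infarction": "CONCEPT_MI",
    "aspirin": "CONCEPT_ASPIRIN",
}


def conceptualize(
    tokens: List[str],
    concepts: Dict[str, str],
    strategy: str = "add",
    max_len: int = 3,
) -> List[str]:
    """Map terms to concepts using greedy longest-match lookup.

    Assumed strategies (illustrative names only):
      "add"     keep the original terms and append the matched concept
      "replace" substitute matched terms with the concept, keep the rest
      "only"    keep concepts and drop every unmatched term
    """
    if strategy not in {"add", "replace", "only"}:
        raise ValueError(f"unknown strategy: {strategy}")

    out: List[str] = []
    i = 0
    while i < len(tokens):
        matched_len = 0
        concept_id = ""
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in concepts:
                matched_len, concept_id = n, concepts[phrase]
                break

        if matched_len == 0:
            # No concept found for this token.
            if strategy != "only":
                out.append(tokens[i])
            i += 1
        else:
            if strategy == "add":
                out.extend(tokens[i:i + matched_len])
            out.append(concept_id)
            i += matched_len
    return out


tokens = "patient with heart attack given aspirin".split()
added = conceptualize(tokens, TOY_CONCEPTS, "add")
replaced = conceptualize(tokens, TOY_CONCEPTS, "replace")
concepts_only = conceptualize(tokens, TOY_CONCEPTS, "only")
```

The resulting token lists could then be vectorized and passed to any of the three classifiers named above. Under this sketch, "heart attack" and "myocardial infarction" yield the same feature, which is the kind of meaning-level normalization that conceptualization aims for.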