The main objective of conceptual retrieval models is to go beyond simple term matching by relaxing the term independence assumption through concept recognition. In this paper, we present an approach to semantic indexing and retrieval of biomedical documents that identifies domain concepts drawn from the Medical Subject Headings (MeSH) thesaurus. Our indexing approach relies on a purely statistical vector space model, which represents both medical documents and MeSH concepts as term vectors. By combining the bag-of-words representation of concepts with the positions of words in the text, we demonstrate that our mapping method is able to extract valuable concepts from documents. The output of this semantic mapping serves as the input to our document relevance scoring in response to a query. Experiments on the OHSUMED collection show that our semantic indexing method significantly outperforms state-of-the-art baselines that rely on word or term statistics.
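As an illustration of the bag-of-words part of such a mapping, the following minimal sketch ranks concept entries by the cosine similarity between their term vectors and a document's term vector. It is not the method evaluated in the paper: the raw term-frequency weighting, the whitespace tokenizer, the function names (`term_vector`, `cosine`, `rank_concepts`) and the two concept entries are all assumptions made for the example, and the word-position component is left out.

```python
import math
from collections import Counter


def term_vector(text):
    """Lowercase, split on whitespace, and count terms (raw term frequency)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_concepts(document, concepts):
    """Return (concept name, score) pairs sorted by decreasing similarity."""
    doc_vec = term_vector(document)
    scored = [
        (name, cosine(doc_vec, term_vector(entry)))
        for name, entry in concepts.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical concept entries for illustration, not actual MeSH records.
concepts = {
    "Breast Neoplasms": "breast neoplasms tumor cancer",
    "Myocardial Infarction": "myocardial infarction heart attack",
}
document = "early detection of breast cancer improves survival"
ranking = rank_concepts(document, concepts)
```

Here `ranking` places "Breast Neoplasms" first because its entry shares the terms "breast" and "cancer" with the document, while the second entry shares none. A fuller version would presumably replace raw counts with a weighting such as tf-idf and add a word-order term to the score.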