Medical entity recognition is a crucial step towards efficient analysis of medical texts. In this paper we present and compare three methods based on domain knowledge and machine learning techniques. Through these approaches we study two research directions: (i) noun phrases are first extracted with a chunker and then classified in a final step, and (ii) machine learning techniques identify entity boundaries and categories simultaneously. Each approach is tested on a standard corpus of clinical texts. The results show that the hybrid approach, which combines machine learning and domain knowledge, performs best.
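The difference between the two research directions can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper: the toy lexicon, the category names (`problem`, `treatment`, `test`), the function names, and the hand-written chunk boundaries and BIO labels are all assumptions made for illustration. In direction (i) the boundaries come from an upstream chunker and only the category is decided afterwards; in direction (ii) a sequence labeler would emit tags that encode boundary and category together.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start token, end token exclusive, category)

# Hypothetical toy lexicon standing in for domain knowledge (assumption).
TOY_LEXICON = {
    "chest pain": "problem",
    "aspirin": "treatment",
    "ecg": "test",
}


def classify_chunks(tokens: List[str], chunks: List[Tuple[int, int]]) -> List[Span]:
    """Direction (i): chunk boundaries are given; each chunk is classified afterwards."""
    spans = []
    for start, end in chunks:
        phrase = " ".join(tokens[start:end]).lower()
        category = TOY_LEXICON.get(phrase)
        if category is not None:  # chunks with no known category are dropped
            spans.append((start, end, category))
    return spans


def decode_bio(labels: List[str]) -> List[Span]:
    """Direction (ii): BIO tags carry boundaries and categories jointly."""
    spans = []
    start, category = None, None
    for i, label in enumerate(labels + ["O"]):  # trailing "O" flushes the last span
        if start is not None and label == "I-" + category:
            continue  # still inside the current entity
        if start is not None:
            spans.append((start, i, category))
            start, category = None, None
        if label.startswith(("B-", "I-")):  # a stray "I-" leniently opens a span
            start, category = i, label[2:]
    return spans


tokens = ["Patient", "reports", "chest", "pain", "and", "took", "aspirin"]
chunks = [(0, 1), (2, 4), (6, 7)]  # as a chunker might return (assumed)
labels = ["O", "O", "B-problem", "I-problem", "O", "O", "B-treatment"]  # assumed tagger output

spans_from_chunks = classify_chunks(tokens, chunks)
spans_from_tags = decode_bio(labels)
```

On this toy sentence both routes yield the same two entities. In practice they fail differently: the chunk-then-classify route inherits any boundary errors made by the chunker, while the joint route has to learn boundaries and categories from annotated data.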