DTMBIO 2012: international workshop on data and text mining in biomedical informatics
Proceedings of the 21st ACM international conference on Information and knowledge management
The paper addresses the extraction of medicine names from free-text documents written in Polish. Lexicon-based approaches cannot identify unknown or misspelled medicine names. We present the results of experiments with two methods: a Hidden Markov Model (HMM) and an approach based on Pointwise Mutual Information (PMI). The goal of the experiments was to identify medicine names without using a lexicon or contextual information. The results show that the HMM can serve as one of several steps in drug name identification (with an F-score slightly below 70% on the test set), while PMI can increase the precision of the results obtained with the HMM, but at the cost of a significant loss in recall.
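For readers unfamiliar with PMI, the sketch below computes the standard quantity log2(p(x, y) / (p(x) p(y))) for adjacent token pairs. It is a generic illustration only: the abstract does not say which units (characters, words, or something else) the authors score or how the scores are combined with the HMM output, so the function name `pmi_scores`, the choice of adjacent word pairs, and the sample tokens are all assumptions made for this example.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Return PMI (base 2) for every adjacent token pair in `tokens`.

    Probabilities are plain maximum-likelihood estimates from the
    sequence itself; this is the textbook definition, not necessarily
    the variant used in the paper.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = len(tokens) - 1  # number of adjacent pairs

    scores = {}
    for (x, y), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


if __name__ == "__main__":
    # Hypothetical token sequence, invented for illustration.
    sample = ["take", "aspirin", "daily", "take", "aspirin", "now"]
    for pair, score in sorted(pmi_scores(sample).items()):
        print(pair, round(score, 3))
```

A high PMI means the two tokens co-occur more often than their individual frequencies would predict. Keeping only candidates above a PMI threshold is one plausible way such a score could trade recall for precision, in line with the effect the abstract reports.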