Extracting and identifying gene and protein names from literature is a critical step in mining functional information about genes and proteins. While extensive effort has been devoted to this important task, most approaches aim at extracting gene/protein names per se, without paying much attention to associating the extracted names with existing gene and protein database entries. We developed a simple and efficient method to identify gene and protein names in literature using a combination of heuristic and statistical strategies. Our approach maps the extracted names to individual LocusLink entries, thus enabling the seamless integration of literature information with existing gene/protein databases. Evaluation on a test corpus shows that our method can achieve both high recall and high precision. Our method exhibits good performance and can be used as a building block for large biomedical literature mining systems.