Identifying Gene and Protein Names from Biological Texts

  • Authors:
  • Weijian Xuan;Stanley J. Watson;Huda Akil;Fan Meng

  • Venue:
  • CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics
  • Year:
  • 2003

Abstract

Extracting and identifying gene and protein names from literature is a critical step for mining functional information about genes and proteins. While extensive effort has been devoted to this important task, most prior work has aimed at extracting gene/protein names per se, without paying much attention to associating the extracted names with existing gene and protein database entries. We developed a simple and efficient method to identify gene and protein names in literature using a combination of heuristic and statistical strategies. Our approach maps the extracted names to individual LocusLink entries, thus enabling the seamless integration of literature information with existing gene/protein databases. Evaluation on a test corpus shows that our method can achieve both high recall and high precision. Our method exhibits good performance and can be used as a building block for large biomedical literature mining systems.
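
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of mapping name mentions to database entries, not the authors' method. It assumes a tiny hypothetical lexicon (`LEXICON`) whose entry IDs are made-up placeholders rather than real LocusLink identifiers, a simple heuristic normalization step, and a longest-match lookup; the statistical component mentioned in the abstract is not modeled. The helper names `normalize`, `find_mentions`, and `precision_recall` are likewise assumptions introduced for illustration.

```python
import re

# Hypothetical toy lexicon: normalized name -> placeholder entry ID.
# These IDs are invented for illustration and are NOT real LocusLink IDs.
LEXICON = {
    "brca1": "ID_0001",
    "breastcancer1": "ID_0001",
    "p53": "ID_0002",
    "tp53": "ID_0002",
}


def normalize(name):
    """Heuristic normalization: lowercase, then drop spaces, hyphens, slashes."""
    return re.sub(r"[\s\-/]+", "", name.lower())


def find_mentions(text, max_words=3):
    """Scan word n-grams (longest first) and map matches to lexicon entry IDs."""
    tokens = re.findall(r"[A-Za-z0-9\-]+", text)
    mentions = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            entry = LEXICON.get(normalize(candidate))
            if entry is not None:
                mentions.append((candidate, entry))
                i += n  # skip past the matched span
                break
        else:
            i += 1  # no match starting at this token
    return mentions


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted items against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    sentence = "Mutations in BRCA-1 and breast cancer 1 affect p53 signaling."
    print(find_mentions(sentence))
```

In this sketch, spelling variants such as "BRCA-1" and "breast cancer 1" resolve to the same placeholder entry because they normalize to keys that share an ID, which is the kind of name-to-entry association the abstract describes. A realistic system would need a full synonym table built from the database and some way to resolve ambiguous names.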