An entity tagger for recognizing acquired genomic variations in cancer literature

  • Authors:
  • Ryan T. McDonald;R. Scott Winters;Mark Mandel;Yang Jin;Peter S. White;Fernando Pereira

  • Affiliations:
  • Department of Computer and Information Science, University of Pennsylvania, 3330 Walnut Street, Philadelphia, PA 19104, USA;The Children's Hospital of Philadelphia, 34th and Civic Center Blvd, Philadelphia, PA 19104, USA;Linguistic Data Consortium, University of Pennsylvania, 3401 Walnut St Suite 400A, Philadelphia, PA 19104, USA;Department of Pediatrics, University of Pennsylvania, Philadelphia, PA 19104, USA;Department of Pediatrics, University of Pennsylvania, Philadelphia, PA 19104, USA;Department of Computer and Information Science, University of Pennsylvania, 3330 Walnut Street, Philadelphia, PA 19104, USA

  • Venue:
  • Bioinformatics
  • Year:
  • 2004

Abstract

Summary: VTag is an application for identifying the type, genomic location and genomic state-change of acquired genomic aberrations described in text. The application uses a machine learning technique called conditional random fields. VTag was tested with 345 training and 200 evaluation documents pertaining to cancer genetics. Our experiments resulted in 0.8541 precision, 0.7870 recall and 0.8192 F-measure on the evaluation set.

Availability: The software is available at http://www.cis.upenn.edu/group/datamining/software_dist/biosfier/.
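
The reported F-measure can be checked against the reported precision and recall. The sketch below assumes the abstract means the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not say so explicitly, and the function name `f_measure` and its `beta` parameter are illustrative, not part of VTag.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 is the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both scores are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract.
precision, recall = 0.8541, 0.7870

# Assumption: the reported F-measure is the balanced F1 score.
f1 = f_measure(precision, recall)
print(f"{f1:.4f}")  # prints 0.8192
```

Under that assumption the result agrees with the reported 0.8192 to four decimal places.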