Investigation of unsupervised pattern learning techniques for bootstrap construction of a medical treatment lexicon

  • Authors:
  • Rong Xu;Alex Morgan;Amar K. Das;Alan Garber

  • Affiliations:
  • Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford

  • Venue:
  • BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Dictionaries of biomedical concepts (e.g. diseases, medical treatments) are critical source of background knowledge for systems doing biomedical information retrieval, extraction, and automated discovery. However, the rapid pace of biomedical research and the lack of constraints on usage ensure that such dictionaries are incomplete. Focusing on medical treatment concepts (e.g. drugs, medical procedures and medical devices), we have developed an unsupervised, iterative pattern learning approach for constructing a comprehensive dictionary of medical treatment terms from randomized clinical trial (RCT) abstracts. We have investigated different methods of seeding, either with a seed pattern or seed instances (terms), and have compared different ranking methods for ranking extracted context patterns and instances. When used to identify treatment concepts from 100 randomly chosen, manually annotated RCT abstracts, our medical treatment dictionary shows better performance (precision:0.40, recall: 0.92 and F-measure: 0.54) over the most widely used manually created medical treatment terminology (precision: 0.41, recall: 0.52 and F-measure: 0.42).