High throughput modularized NLP system for clinical text

  • Authors:
  • Serguei Pakhomov; James Buntrock; Patrick Duffy

  • Affiliations:
  • Mayo College of Medicine, Rochester, MN; Mayo Clinic, Rochester, MN; Mayo Clinic, Rochester, MN

  • Venue:
  • ACLdemo '05 Proceedings of the ACL 2005 on Interactive poster and demonstration sessions
  • Year:
  • 2005

Abstract

This paper presents the results of developing a high-throughput, real-time, modularized text analysis and information retrieval system that identifies clinically relevant entities in clinical notes, maps them to several standardized nomenclatures, and makes them available for subsequent information retrieval and data mining. System performance was validated on a small collection of 351 documents partitioned into 4 query topics and manually examined by 3 physicians and 3 nurse abstractors for relevance to the query topics. We find that simple key phrase searching yields 73% recall and 77% precision. A combination of NLP approaches to indexing improves recall to 92% while lowering precision to 67%.
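
For readers less familiar with the two metrics reported above, the following is a minimal sketch of how precision and recall are conventionally computed from a set of retrieved documents and a set of documents judged relevant. It is not the authors' evaluation code: the function names and document IDs are hypothetical. The F1 helper is an added convenience for comparing the two configurations; the abstract itself reports no F-measure, so any value it produces is derived arithmetic rather than a result of the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query topic.

    retrieved: IDs of documents the system returned (hypothetical input).
    relevant:  IDs of documents the human reviewers judged relevant.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: share of retrieved documents that are relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant documents that were retrieved.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


def f1(precision, recall):
    """Harmonic mean of precision and recall (not reported in the abstract)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy example, unrelated to the 351-document collection.
retrieved = {"doc1", "doc2", "doc3", "doc4"}
relevant = {"doc2", "doc3", "doc5"}
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")

# Derived from the figures in the abstract, for comparison only.
print(f"key phrase search F1:  {f1(0.77, 0.73):.3f}")
print(f"combined NLP index F1: {f1(0.67, 0.92):.3f}")
```

Under this reading, the trade-off described in the abstract is a gain in recall at the cost of precision; whether that is preferable depends on how costly missed documents are relative to false hits in the intended retrieval task.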