Automatic information extraction from patient records in Bulgarian language

  • Authors: Galia Angelova
  • Affiliations: Bulgarian Academy of Sciences, Sofia
  • Venue: Proceedings of the 14th International Conference on Computer Systems and Technologies
  • Year: 2013

Abstract

Natural Language Processing (NLP) has been viewed as a promising technology in medical informatics for decades. Despite the gradually improving quality of automatic text analysis, however, clinical NLP systems are still rarely used outside research labs, for the following reasons: (i) their development is very expensive, so most of them are prototypes or proof-of-concept demonstrators; (ii) real-world use of NLP modules would require constant maintenance of the underlying linguistic resources and tuning of the systems to new text types; (iii) the technology is potentially highly accurate, but some results might be erroneous and misleading [1]. On the other hand, the rapid adoption of Electronic Health Records worldwide implies constant growth in electronic narratives discussing patient-related information. In established medical practice, the most important findings about patients are still kept as free text in various documents and languages. As a result, so-called Information Extraction (IE) has become the dominant language technology currently applied to biomedical texts. The main idea is to automatically extract important entities, with accuracy as high as possible, and to operate on these entities while skipping the remaining text fragments. IE is based on shallow analysis only, but even progress in partial text understanding is expected to enable radical improvements in clinical decision support, biomedical research and healthcare in general.