Automatic identification of biomedical concepts in Spanish-language unstructured clinical texts

  • Authors:
  • Elena Castro;Ana Iglesias;Paloma Martínez;Leonardo Castaño

  • Affiliations:
  • Universidad Carlos III de Madrid, Madrid, Spain;Universidad Carlos III de Madrid, Madrid, Spain;Universidad Carlos III de Madrid, Madrid, Spain;Universidad Carlos III de Madrid, Madrid, Spain

  • Venue:
  • Proceedings of the 1st ACM International Health Informatics Symposium
  • Year:
  • 2010

Abstract

Processing health information from medical records and, especially, clinical notes is a complex task because of the nature of the texts themselves (i.e., hand-written and containing semi-structured or unstructured data) and the diversity of the terminology used. While some technologies exist to process such texts and data in English, only a few comparable initiatives exist for Spanish. This paper presents a new proposal for the semantic annotation of Spanish-language clinical notes: an automated tool, similar to the UMLS MetaMap Transfer (MMTx), that identifies biomedical concepts from the Spanish-language SNOMED CT ontology. The tool is assessed on 100 Spanish-language clinical notes. Using the notes manually annotated by specialists at a Spanish hospital as the gold standard, the precision scores are found to be sufficiently good for the several types of matching that the tool performs. The research offers a starting point for establishing semantic relationships between concepts and for applying mining techniques to Spanish-language clinical notes.
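As an illustration of the evaluation described in the abstract, the following minimal sketch computes precision for automatic annotations against a manually annotated gold standard. It is not the authors' implementation: the tuple layout, the note identifiers, and the concept identifiers (`C001`, etc.) are hypothetical placeholders, not real SNOMED CT codes, and exact matching on span and concept is only one of several possible matching criteria.

```python
def precision(predicted, gold):
    """Fraction of predicted annotations that also appear in the gold standard."""
    predicted = set(predicted)
    if not predicted:
        return 0.0  # convention chosen here: no predictions means precision 0
    return len(predicted & set(gold)) / len(predicted)


# Hypothetical annotations: (note_id, start_offset, end_offset, concept_id).
gold = {
    ("note1", 0, 8, "C001"),
    ("note1", 20, 31, "C002"),
    ("note2", 5, 12, "C003"),
}
predicted = {
    ("note1", 0, 8, "C001"),    # correct
    ("note1", 20, 31, "C999"),  # right span, wrong concept
    ("note2", 5, 12, "C003"),   # correct
    ("note2", 30, 36, "C004"),  # not in the gold standard
}

score = precision(predicted, gold)
print(f"precision = {score:.2f}")  # prints "precision = 0.50"
```

Relaxing the comparison (for example, accepting overlapping spans rather than identical ones) would give a different precision figure for the same output, which is one reason to report scores per matching type.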