Building a Spanish MMTx by Using Automatic Translation and Biomedical Ontologies

  • Authors:
  • Francisco Carrero; José Carlos Cortizo; José María Gómez

  • Affiliations:
  • Universidad Europea de Madrid, Madrid, Spain 28670; Universidad Europea de Madrid, Madrid, Spain 28670 and Artificial Intelligence & Network Solutions S.L.; Departamento de I+D, Optenet, Parque Empresarial Alvia, Las Rozas, Madrid, Spain 28230

  • Venue:
  • IDEAL '08 Proceedings of the 9th International Conference on Intelligent Data Engineering and Automated Learning
  • Year:
  • 2008

Abstract

The use of domain ontologies is becoming increasingly popular in medical natural language processing systems. A wide variety of knowledge bases in multiple languages has been integrated into the Unified Medical Language System (UMLS) to create a huge knowledge source that can be accessed with diverse lexical tools. MetaMap (and its Java version, MMTx) is a tool for extracting medical concepts from free text, but no Spanish version currently exists. Our ongoing research centers on applying biomedical concepts to cross-lingual text classification, which makes a Spanish MMTx necessary. We have combined automatic translation techniques with biomedical ontologies and the existing English MMTx to produce a Spanish version of MMTx. We have evaluated different approaches and applied several types of evaluation according to different concept representations for text classification. Our results show that existing translation tools such as Google Translate produce translations that are highly similar to the original texts in terms of extracted concepts.
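
One way to picture "similarity in terms of extracted concepts" is as overlap between two sets of concept identifiers: those extracted from an original English text and those extracted from the machine translation of its Spanish counterpart. The sketch below uses Jaccard overlap purely as an illustration. The abstract does not state which similarity measure the authors used, the function name is hypothetical, and the concept identifiers are made-up placeholders rather than real MMTx output.

```python
def concept_overlap(original, translated):
    """Jaccard overlap between two collections of concept identifiers.

    Assumption: concepts are compared as unordered sets of identifiers.
    The abstract does not specify the measure actually used.
    """
    a, b = set(original), set(translated)
    if not a and not b:
        # Two empty concept sets are treated as identical.
        return 1.0
    return len(a & b) / len(a | b)


# Placeholder identifiers (not real UMLS concepts), standing in for the
# concepts extracted from an English text and from a translated text.
original_concepts = {"C0000001", "C0000002", "C0000003", "C0000004"}
translated_concepts = {"C0000001", "C0000002", "C0000003", "C0000005"}

score = concept_overlap(original_concepts, translated_concepts)
print(f"concept overlap: {score:.2f}")
```

With these placeholder sets, three identifiers are shared out of five distinct ones, so the overlap is 0.60. A weighted representation (for example, concept frequencies instead of sets) would call for a different measure.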