An environment for data analysis in biomedical domain: information extraction for decision support systems

  • Authors:
  • Pablo F. Matos;Leonardo O. Lombardi;Thiago A. S. Pardo;Cristina D. A. Ciferri;Marina T. P. Vieira;Ricardo R. Ciferri

  • Affiliations:
  • Department of Computer Science, Federal University of São Carlos, São Carlos, SP, Brazil;Faculty of Mathematical and Nature Sciences, Methodist University of Piracicaba, Piracicaba, SP, Brazil;Department of Computer Science, University of São Paulo, São Carlos, SP, Brazil;Department of Computer Science, University of São Paulo, São Carlos, SP, Brazil;Faculty of Mathematical and Nature Sciences, Methodist University of Piracicaba, Piracicaba, SP, Brazil;Department of Computer Science, Federal University of São Carlos, São Carlos, SP, Brazil

  • Venue:
  • IEA/AIE'10 Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part I
  • Year:
  • 2010

Abstract

This paper addresses the problem of extracting and processing relevant information from unstructured electronic documents in the biomedical domain, specifically full-text scientific papers. The problem poses several challenges: identifying text passages that contain relevant information, collecting the relevant pieces of information, populating a database and a data warehouse, and mining the resulting data. To this end, the paper proposes IEDSS-Bio, an environment for Information Extraction and Decision Support System in Biomedical domain. In a case study on Sickle Cell Anemia papers, machine learning experiments for identifying relevant text passages (disease effects, treatment effects, and the number of patients) showed that the best results (95.9% accuracy) were obtained with a statistical method combined with preprocessing techniques that resample the examples and eliminate noise.
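The passage-identification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the statistical method or the resampling technique, so a small multinomial Naive Bayes classifier and random oversampling of the minority class are used here purely as assumptions. The example sentences, the labels `relevant` and `other`, and the names `oversample` and `NaiveBayes` are all hypothetical, and noise elimination is not shown.

```python
import math
import random
from collections import Counter, defaultdict


def tokenize(text):
    # Simplistic whitespace tokenizer; a real system would do much more.
    return text.lower().split()


def oversample(examples, seed=0):
    """Duplicate randomly chosen minority-class examples until all classes are equal in size."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        balanced.extend(rng.choice(items) for _ in range(target - len(items)))
    return balanced


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed stand-in method)."""

    def fit(self, examples):
        self.label_counts = Counter(label for _, label in examples)
        self.word_counts = defaultdict(Counter)
        for text, label in examples:
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-written sentences for illustration only (not from the paper).
train = [
    ("the treatment reduced pain episodes in patients", "relevant"),
    ("therapy reduced the frequency of pain crises", "relevant"),
    ("the study was approved by the ethics committee", "other"),
    ("this paper is organized as follows", "other"),
    ("samples were stored at low temperature", "other"),
    ("we thank the staff for their support", "other"),
]

balanced = oversample(train)
model = NaiveBayes().fit(balanced)
print(model.predict("treatment reduced pain crises"))
```

Oversampling is applied before training so that the rarer class (here, the relevant sentences) is not outweighed by the class prior; other resampling strategies would fit the same structure.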