A Software Framework for Building Biomedical Machine Learning Classifiers through Grid Computing Resources

  • Authors:
  • Raúl Ramos-Pollán;Miguel Ángel Guevara-López;Eugénio Oliveira

  • Affiliations:
  • CETA-CIEMAT Centro Extremeño de Tecnologías Avanzadas, Trujillo, Spain 10200;INEGI-Faculdade de Engenharia, Universidade do Porto, Porto, Portugal 4200-465;LIACC-DEI-Faculdade de Engenharia, Universidade do Porto, Porto, Portugal 4200-465

  • Venue:
  • Journal of Medical Systems
  • Year:
  • 2012

Abstract

This paper describes the BiomedTK software framework, created to perform massive explorations of machine learning classifier configurations for biomedical data analysis over distributed Grid computing resources. BiomedTK integrates ROC analysis throughout the complete classifier construction process and enables large parameter sweeps for training third-party classifiers such as artificial neural networks and support vector machines, harnessing the computing power provided by Grid infrastructures. In addition, it includes classifiers modified by the authors for ROC optimization, as well as functionality to build ensemble classifiers and manipulate datasets (import/export, data extraction and transformation, etc.). BiomedTK was experimentally validated by training thousands of classifier configurations on representative biomedical UCI datasets, reaching in little time classification levels comparable to those reported in the existing literature. The comprehensive method presented here improves biomedical data analysis in both methodology and the potential reach of machine-learning-based experimentation.
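
As an illustration of the idea the abstract describes (sweeping many classifier configurations and ranking them by ROC performance), the following is a minimal sketch in plain Python. It is not BiomedTK code and does not use its API: the function names (`roc_auc`, `knn_score`, `sweep`), the toy one-dimensional dataset, and the nearest-neighbour scorer standing in for a neural network or support vector machine are all assumptions made for illustration. The loop runs sequentially here; in a Grid setting each configuration would presumably be dispatched as an independent job.

```python
from itertools import product


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def knn_score(train, x, k, power):
    """Hypothetical stand-in classifier: distance-weighted share of
    positive labels among the k nearest training points (1-D features)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    weights = [1.0 / (abs(xi - x) + 1e-9) ** power for xi, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def sweep(train, test, grid):
    """Evaluate every configuration in the grid and rank by test AUC."""
    test_labels = [y for _, y in test]
    results = []
    for k, power in product(grid["k"], grid["power"]):
        scores = [knn_score(train, x, k, power) for x, _ in test]
        results.append(((k, power), roc_auc(test_labels, scores)))
    # Highest AUC first.
    return sorted(results, key=lambda item: item[1], reverse=True)


# Toy data (assumed for illustration): (feature, label) pairs.
train = [(0.1, 0), (0.2, 0), (0.35, 0), (0.4, 1),
         (0.5, 0), (0.6, 1), (0.7, 1), (0.9, 1)]
test = [(0.15, 0), (0.3, 0), (0.45, 1), (0.55, 0), (0.65, 1), (0.8, 1)]
grid = {"k": [1, 3, 5], "power": [0, 1, 2]}

ranking = sweep(train, test, grid)
for (k, power), auc in ranking:
    print(f"k={k} power={power} AUC={auc:.3f}")
```

Because each configuration is evaluated independently of the others, the body of the loop in `sweep` is the natural unit of work to distribute; only the final ranking step needs all results gathered in one place.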