A parallel random forest classifier for R

  • Authors:
  • Lawrence Mitchell;Terence M. Sloan;Muriel Mewissen;Peter Ghazal;Thorsten Forster;Michal Piotrowski;Arthur S. Trew

  • Affiliations:
  • University of Edinburgh, Edinburgh, United Kingdom;University of Edinburgh, Edinburgh, United Kingdom;University of Edinburgh, Edinburgh, United Kingdom;University of Edinburgh, Edinburgh, United Kingdom;University of Edinburgh, Edinburgh, United Kingdom;University of Edinburgh, Edinburgh, United Kingdom;University of Edinburgh, Edinburgh, United Kingdom

  • Venue:
  • Proceedings of the Second International Workshop on Emerging Computational Methods for the Life Sciences
  • Year:
  • 2011

Abstract

The statistical language R is favoured by many biostatisticians for processing microarray data. In recent times, the quantity of data that can be obtained in experiments has risen significantly, making previously fast analyses time-consuming or even impossible with the existing software infrastructure. High Performance Computing (HPC) systems offer a solution to these problems, but at the expense of increased complexity for the end user. The Simple Parallel R Interface (SPRINT) is a library for R that aims to reduce the complexity of using HPC systems by providing biostatisticians with drop-in parallelized replacements for existing R functions. In this paper we describe the implementation of a parallel version of the Random Forest classifier in the SPRINT library.