Performance Model Selection for Learning-based Biological Image Analysis on a Cluster

Authors:
Jie Zhou; Anthony Brunson; John Winans; Kirk Duffin; Nicholas Karonis
Affiliations:
Department of Computer Science, Northern Illinois University, DeKalb, IL 60115, USA (all authors)
Venue:
Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics
Year:
2013

Abstract

Microscopic images of increasing scale and content call for high-performance computing when automatic tools are applied to biological image analysis. Analysis can be accelerated at several stages. In learning-based models, selecting a suitable algorithm for a given problem can be a lengthy process, given the large pool of algorithms and the variety of biological problems. In this paper, we describe a portable method for efficiently and adaptively selecting an effective model for biological image classification, as a step toward high-throughput biological image analysis. We implemented a high-performance tool that extends BIOCAT, a bioimage classification and annotation platform, by deploying the model selection process on a cluster using a distributed design based on remote method invocation. Tested and compared on ten benchmark data sets, the high-performance model selection is shown not only to dramatically increase the speed of the learning process but also to improve accuracy on several state-of-the-art data sets for bioimage classification. We attribute these gains to the combination of BIOCAT's adaptive model selection and distributed model evaluation. The tool can be deployed in various types of distributed environments.
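
The idea of evaluating candidate models in parallel and keeping the most accurate one can be illustrated with a minimal sketch. This is not BIOCAT's implementation: the abstract describes a cluster deployment based on remote method invocation, whereas the sketch below uses a local thread pool as a stand-in for remote workers. The three toy classifiers, the function names (`cross_val_accuracy`, `select_model`), and the toy data are all hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


# Hypothetical candidate models. Each "fit" function takes training data
# and returns a predict(x) function.
def majority_class(train_x, train_y):
    label = max(set(train_y), key=train_y.count)
    return lambda x: label


def nearest_centroid(train_x, train_y):
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return lambda x: min(centroids, key=lambda c: sq_dist(x, centroids[c]))


def one_nearest_neighbor(train_x, train_y):
    def predict(x):
        i = min(range(len(train_x)), key=lambda j: sq_dist(x, train_x[j]))
        return train_y[i]
    return predict


def cross_val_accuracy(fit, xs, ys, folds=3):
    """k-fold cross-validation accuracy with interleaved folds."""
    correct = 0
    for k in range(folds):
        test_idx = set(range(k, len(xs), folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        predict = fit(train_x, train_y)
        correct += sum(predict(xs[i]) == ys[i] for i in test_idx)
    return correct / len(xs)


def select_model(candidates, xs, ys, workers=3):
    """Score every candidate concurrently and return (best_name, scores).

    The thread pool is an assumption standing in for cluster nodes;
    each submitted task corresponds to one model evaluation.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(cross_val_accuracy, fit, xs, ys)
                   for name, fit in candidates.items()}
        scores = {name: f.result() for name, f in futures.items()}
    return max(scores, key=scores.get), scores


# Toy two-class data (made up for this sketch, not from the paper).
XS = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.2, 0.2], [0.0, 0.3],
      [1.0, 0.9], [0.8, 1.1], [0.9, 0.8], [1.1, 1.0], [1.0, 1.2], [0.7, 0.9]]
YS = ["a"] * 6 + ["b"] * 6

CANDIDATES = {
    "majority_class": majority_class,
    "nearest_centroid": nearest_centroid,
    "one_nearest_neighbor": one_nearest_neighbor,
}

if __name__ == "__main__":
    best, scores = select_model(CANDIDATES, XS, YS)
    print(best, scores)
```

Because each candidate is scored independently, the evaluations can run on separate workers without coordination; only the final comparison of scores is centralized. In a real cluster setting the pool would be replaced by remote calls to worker nodes.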