Inference from Low Precision Transcriptome Data Representation

Authors:
Salih Tuna;Mahesan Niranjan
Affiliations:
University of Southampton, Southampton, UK;University of Southampton, Southampton, UK
Venue:
Journal of Signal Processing Systems
Year:
2010

Abstract

Microarray measurements are widely used to infer gene functions, identify regulatory mechanisms and predict phenotypes. These measurements are usually made and recorded to high numerical precision (e.g. 0.24601). However, aspects of the underlying biology, including the instability of mRNA molecules, their availability only in very small copy numbers and the fact that measurements are usually made over a heterogeneous population of cells, ought to make us sceptical about the reproducibility of these measurements and thus about the numerical precision reported. In this paper, we show that over a range of different procedures (classification, cluster analysis, detection of periodically expressed genes and the analysis of developmental time course data), the quality of inference from microarray data does not significantly degrade when the numerical precision is lowered by quantization. A surprising finding, with respect to classification problems, is that much of the discrimination is retained with numerical precision as low as binary (i.e. whether the gene is expressed or not). From this premise we show preliminary results that similarity metrics suitable for binary spaces, namely the Tanimoto metric used in chemoinformatics, can be successfully deployed to improve classification accuracies of binarized transcriptome data.
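
As a rough illustration of the two ideas named in the abstract, the sketch below binarizes expression profiles and compares them with the Tanimoto similarity (shared "on" genes divided by genes "on" in either profile). It is not the authors' implementation: the helper names `binarize` and `tanimoto`, the fixed threshold of 0.5, the median fallback, the toy values, and the convention of returning 1.0 when both profiles are all zeros are assumptions made here for illustration only.

```python
from statistics import median


def binarize(profile, threshold=None):
    """Quantize expression values to 0/1 (expressed or not).

    Assumption: when no threshold is given, the profile median is used.
    The abstract does not state how the threshold is chosen.
    """
    if threshold is None:
        threshold = median(profile)
    return [1 if value > threshold else 0 for value in profile]


def tanimoto(a, b):
    """Tanimoto similarity between two equal-length binary vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    both = sum(1 for x, y in zip(a, b) if x and y)    # genes on in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # genes on in at least one
    # Assumed convention: two all-zero profiles count as identical.
    return both / either if either else 1.0


# Toy expression profiles for two hypothetical samples (made-up values).
sample_a = [0.24601, 1.8, 0.05, 2.3, 0.9, 0.1]
sample_b = [0.31, 1.2, 0.9, 2.9, 0.02, 0.2]

bits_a = binarize(sample_a, threshold=0.5)
bits_b = binarize(sample_b, threshold=0.5)
print(bits_a, bits_b, tanimoto(bits_a, bits_b))
```

A similarity of this kind could, for example, be plugged into a nearest-neighbour or kernel-based classifier operating on the binarized profiles; which classifier the paper pairs it with is not stated in the abstract.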