Inference from Low Precision Transcriptome Data Representation

  • Authors:
  • Salih Tuna;Mahesan Niranjan

  • Affiliations:
  • University of Southampton, Southampton, UK;University of Southampton, Southampton, UK

  • Venue:
  • Journal of Signal Processing Systems
  • Year:
  • 2010

Abstract

Microarray measurements are widely used to infer gene functions, identify regulatory mechanisms and predict phenotypes. These measurements are usually made and recorded to high numerical precision (e.g. 0.24601). However, aspects of the underlying biology, including mRNA molecules being highly unstable, being available only in very small copy numbers, and the measurements usually being made over a heterogeneous population of cells, ought to make us sceptical about the reproducibility of these measurements and thus about the numerical precision reported. In this paper, we show that over a range of different procedures (classification, cluster analysis, detection of periodically expressed genes and the analysis of developmental time course data), the quality of inference from microarray data does not significantly degrade when the numerical precision is lowered by quantization. A surprising finding, with respect to classification problems, is that much of the discrimination is retained with numerical precision as low as binary (i.e. whether the gene is expressed or not). From this premise we show preliminary results that similarity metrics suitable for binary spaces, namely the Tanimoto metric used in chemoinformatics, can be successfully deployed to improve classification accuracies of binarized transcriptome data.
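
To make the binary representation and the Tanimoto metric mentioned in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say how expression values are thresholded, so the per-gene median used here is a placeholder choice, and the function names `binarize` and `tanimoto` are hypothetical. The Tanimoto similarity of two binary vectors is taken in its usual form, the number of positions set in both divided by the number of positions set in at least one.

```python
from statistics import median


def binarize(expr, thresholds=None):
    """Binarize a samples-by-genes table of expression values.

    expr is a list of rows (one per sample), each a list of floats
    (one per gene). A gene is marked 1 ("expressed") in a sample when
    its value exceeds that gene's threshold.
    ASSUMPTION: with no thresholds given, the per-gene median across
    samples is used; the abstract does not specify a thresholding rule.
    """
    n_genes = len(expr[0])
    if thresholds is None:
        thresholds = [median(row[g] for row in expr) for g in range(n_genes)]
    return [
        [1 if row[g] > thresholds[g] else 0 for g in range(n_genes)]
        for row in expr
    ]


def tanimoto(a, b):
    """Tanimoto similarity between two equal-length binary vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    if either == 0:
        # Convention chosen for this sketch: two all-zero profiles
        # are treated as identical.
        return 1.0
    return both / either
```

A similarity of this kind could, for instance, serve as the basis for a nearest-neighbour or kernel classifier over binarized profiles; which classifier the paper actually uses is not stated in the abstract.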