Making large-scale support vector machine learning practical
Advances in kernel methods
Analysis of Gene Expression Microarrays for Phenotype Classification
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology
Cross-Platform Analysis with Binarized Gene Expression Data
PRIB '09 Proceedings of the 4th IAPR International Conference on Pattern Recognition in Bioinformatics
Microarray measurements are widely used to infer gene functions, identify regulatory mechanisms and predict phenotypes. These measurements are usually made and recorded to high numerical precision (e.g. 0.24601). However, aspects of the underlying biology ought to make us sceptical about the reproducibility of these measurements, and thus about the numerical precision reported: mRNA molecules are highly unstable, they are available only in very small copy numbers, and the measurements are usually made over a heterogeneous population of cells. In this paper, we show that over a range of different procedures (classification, cluster analysis, detection of periodically expressed genes and the analysis of developmental time course data), the quality of inference from microarray data does not significantly degrade when the numerical precision is lowered by quantization. A surprising finding, with respect to classification problems, is that much of the discrimination is retained with numerical precision as low as binary (i.e. whether the gene is expressed or not). From this premise we show preliminary results that similarity metrics suitable for binary spaces, namely the Tanimoto metric used in chemoinformatics, can be successfully deployed to improve classification accuracies of binarized transcriptome data.
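As a rough illustration of the two ideas above, the following minimal sketch binarizes a small expression matrix and compares two binarized profiles with the Tanimoto coefficient, i.e. the number of genes expressed in both samples divided by the number expressed in at least one. The abstract does not say how the "expressed or not" threshold is chosen, so the per-gene median used here is purely an assumption, and the function names `binarize` and `tanimoto` are hypothetical.

```python
from statistics import median


def binarize(expr):
    """Quantize expression values to one bit per gene.

    expr is a list of samples, each a list of expression values (one per gene).
    ASSUMPTION: a gene counts as "expressed" in a sample when its value is
    strictly above that gene's median across all samples. The source does not
    specify the threshold; this choice is for illustration only.
    """
    n_genes = len(expr[0])
    thresholds = [median(sample[g] for sample in expr) for g in range(n_genes)]
    return [
        [int(sample[g] > thresholds[g]) for g in range(n_genes)]
        for sample in expr
    ]


def tanimoto(a, b):
    """Tanimoto coefficient of two equal-length binary vectors: |a AND b| / |a OR b|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Convention chosen here: two all-zero profiles are treated as identical.
    return both / either if either else 1.0


# Toy data: 3 samples x 3 genes (made-up values, not from the paper).
profiles = binarize([
    [0.1, 2.0, 5.0],
    [0.9, 1.0, 7.0],
    [0.5, 3.0, 6.0],
])
similarity = tanimoto(profiles[1], profiles[2])
```

A similarity of this kind could then be supplied to a kernel or nearest-neighbour classifier in place of a Euclidean or correlation-based measure. How the paper integrates the metric into its classifiers is not stated in the abstract.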