Selecting differentially expressed genes using minimum probability of classification error

Authors:
Pritha Mahata; Kaushik Mahata
Affiliations:
School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, NSW 2308, Australia; School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, NSW 2308, Australia
Venue:
Journal of Biomedical Informatics
Year:
2007

Abstract

Discovering genes that are differentially expressed between normal and diseased patients is a central research problem in bioinformatics. It is especially important to find a small number of genetic markers that can be explored for diagnostic purposes. The performance of a set of markers is often measured by the associated classification accuracy. This motivates ranking genes by the minimum probability of classification error (MPE) of each gene. In this work, we use a Bayesian decision-making algorithm to compute the MPE. A quantile-based probability density estimation technique is used to generate the probability density function of each gene's expression. The method is tested on three datasets: colon cancer, leukaemia, and hereditary breast cancer. The quality of the selected markers is evaluated by the classification accuracy obtained using a support vector machine and a modified naive Bayes classifier. We obtain 96.77% accuracy on colon cancer and 97.06% accuracy on leukaemia, using only five genes in each case. Finally, using just three genes, we obtain 100% accuracy on hereditary breast cancer. We also compare our results with those obtained using genes ranked by p-value and show that the genes ranked by MPE perform as well as or better than those ranked by p-value.
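
The ranking idea in the abstract can be illustrated with a minimal sketch. For a two-class problem, the minimum probability of classification error for a single gene is the integral of min(P(A) p(x|A), P(B) p(x|B)) over the expression value x, which is the standard Bayes error. The abstract does not specify the quantile-based density estimator, so the sketch below substitutes a plain Gaussian kernel density estimate with a rule-of-thumb bandwidth; this is a stand-in, not the authors' technique. The function names (gaussian_kde, rule_of_thumb_bandwidth, min_prob_error, rank_genes), the use of class frequencies as priors, and the toy data layout are all assumptions made for illustration.

```python
import math


def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth 1.06 * std * n^(-1/5) (an assumed choice)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((s - mean) ** 2 for s in sample) / (n - 1)
    std = math.sqrt(var)
    return 1.06 * std * n ** (-0.2) if std > 0 else 1.0


def gaussian_kde(sample, bandwidth):
    """Return a density function built from Gaussian kernels.

    Stand-in for the quantile-based estimator named in the abstract.
    """
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


def min_prob_error(class_a, class_b, grid_points=2000):
    """Approximate the Bayes error for one gene by numerical integration."""
    n_a, n_b = len(class_a), len(class_b)
    prior_a = n_a / (n_a + n_b)  # priors taken from class frequencies (assumption)
    prior_b = 1.0 - prior_a
    h_a = rule_of_thumb_bandwidth(class_a)
    h_b = rule_of_thumb_bandwidth(class_b)
    p_a = gaussian_kde(class_a, h_a)
    p_b = gaussian_kde(class_b, h_b)

    # Integrate over the data range, padded by a few bandwidths on each side.
    pad = 4.0 * max(h_a, h_b)
    lo = min(min(class_a), min(class_b)) - pad
    hi = max(max(class_a), max(class_b)) + pad
    step = (hi - lo) / grid_points

    total = 0.0
    for i in range(grid_points):
        x = lo + (i + 0.5) * step  # midpoint rule
        total += min(prior_a * p_a(x), prior_b * p_b(x))
    return total * step


def rank_genes(expression, labels):
    """Rank genes by ascending MPE.

    expression: dict mapping gene name -> list of values, one per sample.
    labels: list of 0/1 class labels aligned with the samples.
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, y in zip(values, labels) if y == 0]
        b = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = min_prob_error(a, b)
    return sorted(scores, key=scores.get), scores
```

Under these assumptions, a gene whose two class distributions barely overlap receives an MPE close to 0, while a gene whose distributions nearly coincide receives an MPE close to the smaller class prior. Sorting in ascending order therefore places the most discriminative genes first, and the top few can then be passed to a classifier of choice.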