Robust approach for estimating probabilities in Naïve-Bayes Classifier for gene expression data

Authors:
B. Chandra;Manish Gupta
Affiliations:
Department of Mathematics, Indian Institute of Technology, Delhi, Hauz Khas, New Delhi 110 016, India;Institute for Systems Studies and Analyses, Defence R&D Organisation, Metcalfe House, Delhi 110 054, India
Venue:
Expert Systems with Applications: An International Journal
Year:
2011

Abstract

The Naive-Bayes Classifier (NBC) is widely used for classification in machine learning. It is considered a first choice for many classification problems because of its simplicity and its classification accuracy relative to other supervised learning methods. However, for high-dimensional data such as gene expression data, it does not perform well because of two major limitations: underflow and overfitting. The existing approach to underflow is to add the logarithms of probabilities rather than multiply the probabilities, and the m-estimate approach is used to address overfitting. In practice, however, these approaches do not perform well for gene expression data. This paper proposes a novel approach that overcomes these limitations by using a robust function for estimating probabilities in the Naive-Bayes Classifier. The proposed method not only resolves these limitations of NBC but also improves classification accuracy for gene expression data. The method has been tested on several high-dimensional benchmark gene expression datasets. Comparative results for the proposed Robust Naive-Bayes Classifier (R-NBC) and the existing NBC are presented to highlight the effectiveness of R-NBC. A simulation study has also been performed to demonstrate the robustness of R-NBC over the existing approaches.
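
The two existing remedies mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the R-NBC method, whose robust function the abstract does not specify. It only shows why multiplying many small probabilities underflows, how summing logarithms avoids that, and how a smoothed frequency estimate avoids zero probabilities. The smoothing is assumed here to be the standard m-estimate, (n_c + m * prior) / (n + m). The function names, the feature count of 5000, and the probability value 0.01 are all hypothetical choices made for illustration.

```python
import math


def product_of_probs(probs):
    """Multiply probabilities directly; many small factors underflow to 0.0."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def sum_of_log_probs(probs):
    """Add logarithms instead; the score stays finite as long as every p > 0."""
    return sum(math.log(p) for p in probs)


def m_estimate(n_c, n, prior, m):
    """Standard m-estimate (assumed form): (n_c + m * prior) / (n + m).

    n_c   : number of class samples in which the feature value was observed
    n     : number of samples in the class
    prior : prior estimate of the probability
    m     : equivalent sample size (weight given to the prior)
    """
    return (n_c + m * prior) / (n + m)


# Hypothetical setting: 5000 genes, each with conditional probability 0.01.
probs = [0.01] * 5000

direct = product_of_probs(probs)      # true value 1e-10000 is below float range
log_score = sum_of_log_probs(probs)   # equals 5000 * log(0.01), a finite number

# A feature value never seen in a class of 20 samples still gets a
# non-zero probability under the m-estimate.
smoothed = m_estimate(n_c=0, n=20, prior=0.5, m=2)
```

Here `direct` collapses to zero, so every class would receive the same score, while `log_score` remains usable for comparing classes. The abstract's claim is that even these remedies do not perform well on gene expression data, which is the motivation for R-NBC.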