Gene Selection for Microarray Expression Data with Imbalanced Sample Distributions

Authors:
Abu H. M. Kamal;Xingquan Zhu;Ramaswamy Narayanan
Venue:
IJCBS '09 Proceedings of the 2009 International Joint Conference on Bioinformatics, Systems Biology and Intelligent Computing
Year:
2009

Citing 0
Cited 3

Classification of high dimensional and imbalanced hyperspectral imagery data
IbPRIA'11 Proceedings of the 5th Iberian conference on Pattern recognition and image analysis

Exploring synergetic effects of dimensionality reduction and resampling tools on hyperspectral imagery data classification
MLDM'11 Proceedings of the 7th international conference on Machine learning and data mining in pattern recognition

ACOSampling: An ant colony optimization-based undersampling method for classifying imbalanced DNA microarray data
Neurocomputing

Abstract

Microarray expression data, which contain the expression levels of a large number of simultaneously observed genes, have been used in many scientific and clinical studies. Because the data are high dimensional, selecting a small number of genes has been shown to be beneficial for tasks such as building prediction models for the molecular classification of cancers. Traditional gene selection methods, however, fail to take the sample distribution into consideration. Because samples are scarce, biomedical research commonly yields severely biased data distributions in which one class of examples (e.g., diseased samples) has far fewer members than the other classes (e.g., normal samples). Sample sets with biased distributions require special attention when identifying the genes responsible for a particular disease. In this paper, we propose three filtering techniques, Higher Weight (HW), Differential Minority Repeat (DMR) and Balanced Minority Repeat (BMR), to identify genes relevant to fatal diseases from biased microarray expression data. Experimental comparisons with the traditional ReliefF method on five microarray datasets demonstrate the effectiveness of the proposed methods in selecting informative genes from microarray expression data with biased sample distributions.
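
For orientation, the kind of filter-style gene scoring that the ReliefF baseline builds on can be sketched in a few lines. The block below is a minimal sketch of basic two-class Relief (one nearest hit and one nearest miss per sample), not ReliefF proper and not the HW, DMR, or BMR methods, whose details the abstract does not give. The function name `relief_scores` and the toy expression values are assumptions made purely for illustration.

```python
def relief_scores(X, y):
    """Basic two-class Relief (illustrative sketch, not the paper's method).

    A gene gains weight when it differs from the nearest sample of the
    other class (miss) and loses weight when it differs from the nearest
    sample of the same class (hit).
    """
    n, d = len(X), len(X[0])

    # Range of each gene, used to normalise differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # a class with a single sample has no hit
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights


# Hypothetical toy data: 6 normal samples, 2 diseased samples (imbalanced).
# Gene 0 separates the classes; gene 1 is noise.
X = [
    [1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [1.1, 6.0], [1.0, 2.0], [1.3, 5.5],
    [3.0, 4.5], [3.2, 2.5],
]
y = [0, 0, 0, 0, 0, 0, 1, 1]

scores = relief_scores(X, y)
# Gene indices ordered from most to least relevant.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(ranking, scores)
```

Every sample contributes equally to the weights in this sketch, so the six normal samples dominate the update; that is the setting in which the abstract argues the sample distribution deserves special attention.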