Using a Genetic Algorithm and a Perceptron for Feature Selection and Supervised Class Learning in DNA Microarray Data

Authors:
Michal Karzynski;Álvaro Mateos;Javier Herrero;Joaquín Dopazo
Affiliations:
Bioinformatics Unit, Centro Nacional de Investigaciones Oncológicas (CNIO), c/ Melchor Fernández Almagro 3, 28029, Madrid, Spain;Bioinformatics Unit, Centro Nacional de Investigaciones;Bioinformatics Unit, Centro Nacional de Investigaciones Oncológicas (CNIO), c/ Melchor Fernández Almagro 3, 28029, Madrid, Spain;Bioinformatics Unit, Centro Nacional de Investigaciones Oncológicas (CNIO), c/ Melchor Fernández Almagro 3, 28029, Madrid, Spain (author for correspondence, e-mail: jdopazo@cnio. ...
Venue:
Artificial Intelligence Review
Year:
2003

Abstract

Class prediction and feature selection are key in the context of diagnostic applications of DNA microarrays. Microarray data are noisy and typically composed of a low number of samples and a large number of genes. Perceptrons can constitute an efficient tool for accurate classification of microarray data. Nevertheless, the large input layers necessary for the direct application of perceptrons and the small number of samples available for the training process hamper their use. Two strategies can be taken for an optimal use of a perceptron with a favourable balance between samples for training and the size of the input layer: (a) reducing the dimensionality of the data set from thousands to no more than one hundred highly informative average values, and using the weights of the perceptron for feature selection, or (b) using a selection of only a few genes that produce an optimal classification with the perceptron. In the latter case, feature selection is carried out first. Obviously, a combined approach is also possible. In this manuscript we explore and compare both alternatives. We study the informative contents of the data at different levels of compression with a very efficient clustering algorithm (Self-Organizing Tree Algorithm). We show how a simple genetic algorithm selects a subset of gene expression values with 100% accuracy in the classification of samples with maximum efficiency. Finally, the importance of dimensionality reduction is discussed in light of its capacity for reducing noise and redundancies in microarray data.
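
As a rough illustration of strategy (b), the sketch below pairs a simple genetic algorithm with a single-layer perceptron on synthetic data. It is not the authors' implementation: the function names, the leave-one-out fitness, the subset size, the population settings, the crossover and mutation scheme, and the toy data generator are all assumptions made for the example.

```python
import random

def train_perceptron(X, y, epochs=25, lr=0.1):
    """Classic perceptron learning rule; labels are 0 or 1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            delta = yi - predict(w, b, xi)
            if delta:
                errors += 1
                w = [wj + lr * delta * xj for wj, xj in zip(w, xi)]
                b += lr * delta
        if errors == 0:  # training set is perfectly separated
            break
    return w, b

def predict(w, b, xi):
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0

def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a perceptron that sees only `genes`.
    (Assumed fitness function, chosen because samples are scarce.)"""
    Xs = [[row[g] for g in genes] for row in X]
    hits = 0
    for i in range(len(Xs)):
        w, b = train_perceptron(Xs[:i] + Xs[i + 1:], y[:i] + y[i + 1:])
        hits += int(predict(w, b, Xs[i]) == y[i])
    return hits / len(Xs)

def select_genes(X, y, subset_size=3, pop_size=20, generations=30, seed=0):
    """Toy GA: each individual is a sorted tuple of gene indices."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    pop = [tuple(sorted(rng.sample(range(n_genes), subset_size)))
           for _ in range(pop_size)]
    cache = {}

    def fitness(ind):
        if ind not in cache:
            cache[ind] = loo_accuracy(X, y, ind)
        return cache[ind]

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == 1.0:  # stop once a perfect subset is found
            break
        parents = pop[:pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            pa, pb = rng.sample(parents, 2)
            # crossover: draw the child from the union of both parents
            child = rng.sample(sorted(set(pa) | set(pb)), subset_size)
            if rng.random() < 0.3:
                # mutation: swap one gene for a random gene not yet present
                new = rng.choice([g for g in range(n_genes) if g not in child])
                child[rng.randrange(subset_size)] = new
            children.append(tuple(sorted(child)))
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)

def make_toy_data(n_samples=20, n_genes=50, informative=(7,), seed=1):
    """Synthetic stand-in for microarray data: few samples, many genes,
    with only the `informative` genes carrying class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_genes)]
        for g in informative:
            row[g] = (3.0 if label else -3.0) + rng.gauss(0, 0.3)
        X.append(row)
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    genes, acc = select_genes(X, y)
    print("selected genes:", genes, "leave-one-out accuracy:", acc)
```

Scoring each candidate subset by leave-one-out accuracy keeps the perceptron's input layer small relative to the number of training samples, which is the balance the abstract describes. A real analysis would substitute measured expression values for the toy generator.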