Cancer gene search with data-mining and genetic algorithms

Authors:
Shital Shah; Andrew Kusiak
Affiliations:
Intelligent Systems Laboratory, MIE, 2139 Seamans Center, The University of Iowa, Iowa City, IA 52242-1527, USA (both authors)
Venue:
Computers in Biology and Medicine
Year:
2007

Abstract

Cancer accounts for approximately 25% of all deaths, making it the second leading cause of death in the United States. Early and accurate detection of cancer is critical to the well-being of patients. Analysis of gene expression data supports cancer identification and classification, which in turn facilitates proper treatment selection and drug development. Gene expression data sets for ovarian, prostate, and lung cancer were analyzed in this research. An integrated gene-search algorithm for gene expression data analysis was proposed. The algorithm combines a genetic algorithm and correlation-based heuristics for data preprocessing (on partitioned data sets) with data mining (decision tree and support vector machine algorithms) for making predictions. Knowledge derived by the proposed algorithm has high classification accuracy and identifies the most significant genes. Bagging and stacking algorithms were applied to further enhance the classification accuracy. The results were compared with those reported in the literature. Mapping genotype information to phenotype parameters will ultimately reduce the cost and complexity of cancer detection and classification.
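
As a rough illustration of the preprocessing step described in the abstract, the sketch below runs a small genetic algorithm over gene subsets and scores each subset with a correlation-based merit. It is not the authors' implementation. The merit formula follows the general correlation-based feature selection idea, Pearson correlation is an assumed stand-in for whatever correlation measure the paper uses, the data are synthetic, and all names (`pearson`, `cfs_merit`, `ga_select`) and parameter values are hypothetical. The data partitioning, decision tree, support vector machine, bagging, and stacking stages are omitted.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def cfs_merit(mask, r_cf, r_ff):
    """Correlation-based merit: k*mean(r_cf) / sqrt(k + k*(k-1)*mean(r_ff)).

    r_cf[i] is |corr(gene i, class)|; r_ff[i][j] is |corr(gene i, gene j)|.
    Subsets score higher when genes track the class but not each other.
    """
    idx = [i for i, keep in enumerate(mask) if keep]
    k = len(idx)
    if k == 0:
        return 0.0
    mean_cf = sum(r_cf[i] for i in idx) / k
    pairs = [(i, j) for pos, i in enumerate(idx) for j in idx[pos + 1:]]
    mean_ff = sum(r_ff[i][j] for i, j in pairs) / len(pairs) if pairs else 0.0
    return k * mean_cf / math.sqrt(k + k * (k - 1) * mean_ff)


def ga_select(columns, labels, pop_size=30, generations=40, p_mut=0.05, seed=0):
    """Evolve boolean gene masks; needs at least 2 genes and pop_size >= 3.

    Returns the best mask and the best merit seen at each generation.
    """
    rng = random.Random(seed)
    n = len(columns)
    # Correlations are computed once so each fitness evaluation is cheap.
    r_cf = [abs(pearson(c, labels)) for c in columns]
    r_ff = [[abs(pearson(a, b)) for b in columns] for a in columns]

    def fitness(mask):
        return cfs_merit(mask, r_cf, r_ff)

    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        nxt = [list(m) for m in ranked[:2]]  # elitism: keep the two best masks
        while len(nxt) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            a = max(rng.sample(pop, 3), key=fitness)
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [(not g) if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Synthetic data (assumption): 40 samples, 8 genes, genes 0 and 1 informative.
data_rng = random.Random(1)
labels = [i % 2 for i in range(40)]
columns = []
for g in range(8):
    if g < 2:
        columns.append([y + data_rng.gauss(0, 0.3) for y in labels])
    else:
        columns.append([data_rng.gauss(0, 1) for _ in labels])

best, history = ga_select(columns, labels)
selected = [g for g, keep in enumerate(best) if keep]
print("selected genes:", selected, "merit: %.3f" % history[-1])
```

Because the two best masks are carried over unchanged each generation, the best merit recorded in `history` never decreases. In a fuller pipeline, the selected genes would then be passed to the classifiers named in the abstract.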