Generation of comprehensible hypotheses from gene expression data

Authors:
Yuan Jiang; Ming Li; Zhi-Hua Zhou
Affiliations:
National Laboratory for Novel Software Technology, Nanjing University, Nanjing, China (all authors)
Venue:
BioDM'06: Proceedings of the 2006 International Conference on Data Mining for Biomedical Applications
Year:
2006

Abstract

Machine learning techniques have been recognized as powerful tools for the analysis of gene expression data. However, most learning techniques used for class prediction in gene expression analysis in recent years generate black-box models. Although the prediction accuracy of these models can be very high, they provide little insight into the underlying biology. This paper takes the view that a more reasonable role for machine learning techniques is to generate hypotheses that human experts can verify or refine, rather than to make decisions on their behalf. Based on this view, a general approach to generating comprehensible hypotheses from gene expression data is described and applied to human acute leukemias as a test case. The results demonstrate the feasibility of using machine learning techniques to help form hypotheses about the relationship between genes and certain diseases.