Generation of comprehensible hypotheses from gene expression data

  • Authors:
  • Yuan Jiang;Ming Li;Zhi-Hua Zhou

  • Affiliations:
  • National Laboratory for Novel Software Technology, Nanjing University, Nanjing, China;National Laboratory for Novel Software Technology, Nanjing University, Nanjing, China;National Laboratory for Novel Software Technology, Nanjing University, Nanjing, China

  • Venue:
  • BioDM'06 Proceedings of the 2006 international conference on Data Mining for Biomedical Applications
  • Year:
  • 2006

Abstract

Machine learning techniques are recognized as powerful tools for analyzing gene expression data. However, most learning techniques used for class prediction in gene expression analysis in past years generate black-box models. Although the prediction accuracy of these models can be very high, they provide little insight into the underlying biology. This paper takes the view that a more reasonable role for machine learning techniques is to generate hypotheses that human experts can verify or refine, rather than to make decisions on their behalf. Based on this view, a general approach for generating comprehensible hypotheses from gene expression data is described and applied to human acute leukemias as a test case. The results demonstrate the feasibility of using machine learning techniques to help form hypotheses about the relationship between genes and certain diseases.