Analyzing microarray data using quantitative association rules

  • Authors:
  • Elisabeth Georgii;Lothar Richter;Ulrich Rückert;Stefan Kramer

  • Affiliations:
  • Technische Universität München, Institut für Informatik/I12, Boltzmannstr. 3, 85748 Garching bei München, Germany (all authors)

  • Venue:
  • Bioinformatics
  • Year:
  • 2005

Quantified Score

Hi-index 3.84

Abstract

Motivation: We tackle the problem of finding regularities in microarray data. Various data mining tools, such as clustering, classification, Bayesian networks and association rules, have been applied so far to gain insight into gene-expression data. Association rule mining techniques used so far work on discretizations of the data and cannot account for cumulative effects. In this paper, we investigate the use of quantitative association rules that can operate directly on numeric data and represent cumulative effects of variables. Technically speaking, quantitative association rules of this type, which are based on half-spaces, can find non-axis-parallel regularities.

Results: We performed a variety of experiments testing the utility of quantitative association rules for microarray data. First of all, the results should be statistically significant and robust against fluctuations in the data. Next, the approach should be scalable in the number of variables, which is important for such high-dimensional data. Finally, the rules should make sense biologically and be sufficiently different from rules found in regular association rule mining working with discretizations. In all of these dimensions, the proposed approach performed satisfactorily. Therefore, quantitative association rules based on half-spaces should be considered as a tool for the analysis of microarray gene-expression data.

Availability: The code is available from the authors on request.

Contact: kramer@in.tum.de
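
Illustration (not from the paper): the abstract does not spell out the rule format or the mining algorithm, so the following is only a minimal sketch of what a half-space rule could look like. It assumes a rule of the form "if a weighted sum of some expression values reaches a threshold, then another weighted sum reaches a second threshold". The helper names `halfspace` and `rule_stats`, the weights, the thresholds and the toy expression values are all hypothetical. The sketch only evaluates a given rule; it does not search for rules.

```python
def halfspace(weights, threshold):
    """Return a predicate testing whether sum(w_i * x_i) >= threshold."""
    def test(row):
        return sum(w * x for w, x in zip(weights, row)) >= threshold
    return test


def rule_stats(rows, left, right):
    """Support and confidence of the rule 'left(row) -> right(row)'."""
    n_left = sum(1 for r in rows if left(r))
    n_both = sum(1 for r in rows if left(r) and right(r))
    support = n_both / len(rows)
    confidence = n_both / n_left if n_left else 0.0
    return support, confidence


# Made-up expression matrix: rows are samples, columns are genes g0, g1, g2.
rows = [
    [0.9, 0.8, 1.5],
    [0.7, 0.9, 1.2],
    [0.1, 0.3, 0.2],
    [1.1, 0.1, 0.4],
    [0.1, 1.2, 0.3],
]

# Hypothetical cumulative rule: if g0 + g1 >= 1.5 then g2 >= 1.0.
# The left side is a half-space that is not parallel to either axis.
left = halfspace([1.0, 1.0, 0.0], 1.5)
right = halfspace([0.0, 0.0, 1.0], 1.0)
support, confidence = rule_stats(rows, left, right)

# For comparison: axis-parallel conditions on a single gene,
# as a rule over discretized data would use.
_, conf_g0_only = rule_stats(rows, halfspace([1.0, 0.0, 0.0], 0.7), right)
_, conf_g1_only = rule_stats(rows, halfspace([0.0, 1.0, 0.0], 0.8), right)

print(support, confidence, conf_g0_only, conf_g1_only)
```

In this toy data the combined condition on g0 + g1 selects exactly the samples with high g2, while either single-gene threshold also selects a sample with low g2. This is the kind of cumulative, non-axis-parallel regularity the abstract refers to.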