Analyzing microarray data using quantitative association rules

Authors:
Elisabeth Georgii; Lothar Richter; Ulrich Rückert; Stefan Kramer
Affiliations:
Technische Universität München, Institut für Informatik/I12, Boltzmannstr. 3, 85748 Garching bei München, Germany (all authors)
Venue:
Bioinformatics
Year:
2005

Abstract

Motivation: We tackle the problem of finding regularities in microarray data. Various data mining tools, such as clustering, classification, Bayesian networks and association rules, have been applied so far to gain insight into gene-expression data. Association rule mining techniques used so far work on discretizations of the data and cannot account for cumulative effects. In this paper, we investigate the use of quantitative association rules that can operate directly on numeric data and represent cumulative effects of variables. Technically speaking, this type of quantitative association rules based on half-spaces can find non-axis-parallel regularities.

Results: We performed a variety of experiments testing the utility of quantitative association rules for microarray data. First of all, the results should be statistically significant and robust against fluctuations in the data. Next, the approach should be scalable in the number of variables, which is important for such high-dimensional data. Finally, the rules should make sense biologically and be sufficiently different from rules found in regular association rule mining working with discretizations. In all of these dimensions, the proposed approach performed satisfactorily. Therefore, quantitative association rules based on half-spaces should be considered as a tool for the analysis of microarray gene-expression data.

Availability: The code is available from the authors on request.

Contact: kramer@in.tum.de
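
As an illustration of the half-space idea in the abstract, the sketch below evaluates a rule whose body is a weighted sum of expression values compared against a threshold. It is not the authors' mining algorithm, which is available only on request; it only scores a given rule. The helper names (`in_half_space`, `rule_stats`), the three-gene synthetic data, and all thresholds are assumptions made up for this example.

```python
import random

def in_half_space(x, weights, threshold):
    """True if the weighted sum of the values in x reaches the threshold."""
    return sum(w * v for w, v in zip(weights, x)) >= threshold

def rule_stats(samples, body, head):
    """Support and confidence of the rule 'body -> head'.

    body and head are (weights, threshold) pairs, each describing a half-space.
    """
    n_body = n_both = 0
    for x in samples:
        if in_half_space(x, *body):
            n_body += 1
            if in_half_space(x, *head):
                n_both += 1
    support = n_both / len(samples)
    confidence = n_both / n_body if n_body else 0.0
    return support, confidence

# Hypothetical synthetic data: gene C responds to the SUM of genes A and B.
rng = random.Random(0)
samples = []
for _ in range(500):
    a = rng.uniform(0, 1)
    b = rng.uniform(0, 1)
    c = a + b + rng.gauss(0, 0.05)
    samples.append((a, b, c))

# Non-axis-parallel body: a + b >= 1.2  ->  c >= 1.0
cumulative = rule_stats(samples, ((1, 1, 0), 1.2), ((0, 0, 1), 1.0))

# Axis-parallel body, as after per-gene discretization: a >= 0.6  ->  c >= 1.0
single_gene = rule_stats(samples, ((1, 0, 0), 0.6), ((0, 0, 1), 1.0))

print("cumulative rule  (support, confidence):", cumulative)
print("single-gene rule (support, confidence):", single_gene)
```

On this toy data the rule over the sum of A and B should reach a clearly higher confidence than the single-gene rule, because no threshold on A alone can express the combined effect. That gap is the kind of cumulative, non-axis-parallel regularity the abstract refers to.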