An association analysis approach to biclustering

Authors:
Gaurav Pandey; Gowtham Atluri; Michael Steinbach; Chad L. Myers; Vipin Kumar
Affiliations:
University of Minnesota, Minneapolis, MN, USA (all authors)
Venue:
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2009

Abstract

Biclusters are groups of items that show coherent values across a subset of the transactions in a data set, and their discovery is an important type of analysis for real-valued data sets in domains such as biology. Several algorithms have been proposed to find different types of biclusters in such data sets, but these algorithms are unable to search the space of all possible biclusters exhaustively. Pattern mining algorithms in association analysis also essentially produce biclusters, since each pattern consists of items that are supported by a subset of the transactions. However, a major limitation of the numerous techniques developed in association analysis is that they can only analyze data sets with binary and/or categorical variables, so applying them to real-valued data sets often involves a lossy transformation such as discretization or binarization of the attributes. In this paper, we propose a novel association analysis framework for exhaustively and efficiently mining "range support" patterns (RAP) from real-valued data sets. On the one hand, this framework reduces the loss of information incurred by binarization- and discretization-based approaches; on the other, it enables the exhaustive discovery of coherent biclusters. We compared our framework with two standard biclustering algorithms by evaluating the similarity of the cellular functions of the genes that constitute the patterns/biclusters derived by each method from microarray data. These experiments show that the real-valued patterns discovered by our framework are more strongly enriched for small, biologically interesting functional classes. In addition, through specific examples, we demonstrate the ability of the RAP framework to discover functionally enriched patterns that are not found by the commonly used biclustering algorithm ISA.
The source code and data sets used in this paper, as well as the supplementary material, are available at http://www.cs.umn.edu/vk/gaurav/rap.
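
To make the contrast between binarization and range-based coherence concrete, here is a minimal sketch. It is not the authors' implementation, and the abstract does not give the exact definition of range support. The coherence criterion used here (all values share a sign, and the relative spread of their magnitudes is at most a tolerance `alpha`), the function names, and the toy matrix are all assumptions made for illustration.

```python
# Rows are transactions (e.g. experimental conditions); columns are items (e.g. genes).
# Toy data, invented for illustration only.
DATA = [
    [2.0, 2.1, 0.3],
    [1.9, 2.0, 2.0],
    [0.6, 2.5, 0.5],
    [-1.0, -1.1, 0.4],
]


def binarize(data, threshold):
    """Lossy transformation: 1 where a value is >= threshold, else 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in data]


def binary_support_rows(binary, items):
    """Rows in which every selected item is 1 (classical itemset support)."""
    return [i for i, row in enumerate(binary) if all(row[j] == 1 for j in items)]


def coherent_rows(data, items, alpha):
    """Rows in which the selected items take coherent real values.

    Hypothetical criterion (an assumption, not taken from the abstract):
    all values have the same sign and (max - min) / min of their
    absolute values is at most alpha.
    """
    rows = []
    for i, row in enumerate(data):
        vals = [row[j] for j in items]
        same_sign = all(v > 0 for v in vals) or all(v < 0 for v in vals)
        if not same_sign:
            continue  # mixed signs or zeros are treated as incoherent
        mags = [abs(v) for v in vals]
        if (max(mags) - min(mags)) / min(mags) <= alpha:
            rows.append(i)
    return rows


binary_rows = binary_support_rows(binarize(DATA, 0.5), [0, 1])
range_rows = coherent_rows(DATA, [0, 1], alpha=0.2)
print(binary_rows, range_rows)  # [0, 1, 2] [0, 1, 3]
```

On this toy matrix, binarization counts row 2 as supporting items 0 and 1 even though their values (0.6 and 2.5) differ widely, and it misses row 3, where both items are coherently negative. The range-based check does the opposite on both rows, which is the kind of information loss the abstract attributes to binarization.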