The discovery of biclusters, which denote groups of items that show coherent values across a subset of all the transactions in a data set, is an important type of analysis performed on real-valued data sets in various domains, such as biology. Several algorithms have been proposed to find different types of biclusters in such data sets. However, these algorithms are unable to search the space of all possible biclusters exhaustively. Pattern mining algorithms in association analysis also essentially produce biclusters as their result, since the patterns consist of items that are supported by a subset of all the transactions. However, a major limitation of the numerous techniques developed in association analysis is that they can only analyze data sets with binary and/or categorical variables, and their application to real-valued data sets often involves some lossy transformation such as discretization or binarization of the attributes. In this paper, we propose a novel association analysis framework for exhaustively and efficiently mining "range support" patterns from real-valued data sets. On the one hand, this framework reduces the loss of information incurred by the binarization- and discretization-based approaches; on the other, it enables the exhaustive discovery of coherent biclusters. We compared the performance of our framework with two standard biclustering algorithms by evaluating the similarity of the cellular functions of the genes constituting the patterns/biclusters derived by these algorithms from microarray data. These experiments show that the real-valued patterns discovered by our framework are better enriched by small biologically interesting functional classes. Also, through specific examples, we demonstrate the ability of our framework (RAP) to discover functionally enriched patterns that are not found by the commonly used biclustering algorithm ISA.
The source code and data sets used in this paper, as well as the supplementary material, are available at http://www.cs.umn.edu/vk/gaurav/rap.