In many real-world applications that analyze correlations between two groups of diverse entities, each group of entities can be characterized by multiple attributes. As such, there is a need to co-cluster the values of multiple attributes into pairs of highly correlated clusters. We call this the multi-attribute co-clustering problem. In this paper, we generalize the mutual information between two attributes to the mutual information between two attribute sets. The generalized formula enables us to use correlation information to discover multi-attribute co-clusters (MACs). We develop a novel algorithm, MACminer, to mine MACs with high correlation information from datasets. We demonstrate the mining efficiency of MACminer on datasets with multiple attributes, and show that MACs with high correlation information have higher classification and predictive power than MACs generated by alternative high-dimensional data clustering and pattern mining techniques.
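
To illustrate measuring correlation between two attribute sets, here is a minimal sketch. It is not the paper's generalized formula, which the abstract does not state. It assumes the simplest reading, in which each attribute set is treated as a single joint variable and the standard empirical mutual information is computed in bits. The function name `set_mutual_information` and the row/column layout are hypothetical, chosen only for this example.

```python
from collections import Counter
from math import log2


def set_mutual_information(rows, x_cols, y_cols):
    """Empirical mutual information (bits) between two attribute sets.

    Assumption: each attribute set is collapsed into one joint variable
    (a tuple of its values); this is standard MI, not necessarily the
    generalization used by MACminer.
    """
    n = len(rows)
    # Project every row onto each attribute set.
    xs = [tuple(row[c] for c in x_cols) for row in rows]
    ys = [tuple(row[c] for c in y_cols) for row in rows]

    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))

    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts.
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Columns 0-1 form one attribute set, columns 2-3 the other.
dependent = [(0, 0, 0, 0), (0, 0, 0, 0), (1, 1, 1, 1), (1, 1, 1, 1)]
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]

mi_dependent = set_mutual_information(dependent, [0, 1], [2, 3])
mi_independent = set_mutual_information(independent, [0], [1])
```

In this toy data the first table has one attribute set fully determining the other, so the score is 1 bit, while the second table has independent attributes and scores 0.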