Co-clustering documents and words using bipartite spectral graph partitioning
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Inducing Probabilistic Grammars by Bayesian Model Merging
ICGI '94 Proceedings of the Second International Colloquium on Grammatical Inference and Applications
Hidden Markov Model} Induction by Bayesian Model Merging
Advances in Neural Information Processing Systems 5, [NIPS Conference]
Information-theoretic co-clustering
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
The link prediction problem for social networks
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Fully automatic cross-associations
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Co-clustering by block value decomposition
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
A Scalable Collaborative Filtering Framework Based on Co-Clustering
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Unsupervised learning on k-partite graphs
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Graph summarization with bounded error
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Efficient aggregation for graph summarization
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Information and Complexity in Statistical Modeling
Information and Complexity in Statistical Modeling
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Information theoretic measures for clusterings comparison: is a correction for chance necessary?
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
New perspectives and methods in link prediction
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
PSG: a two-layer graph model for document summarization
Frontiers of Computer Science: Selected Publications from Chinese Universities
Hi-index | 0.00 |
How to extract the truly relevant information from a large relational data set? The answer of this paper is a technique integrating graph summarization, graph clustering, link prediction and the discovery of the hidden structure on the basis of data compression. Our novel algorithm SCMiner (for Summarization-Compression Miner) reduces a large bipartite input graph to a highly compact representation which is very useful for different data mining tasks: 1) Clustering: The compact summary graph contains the truly relevant clusters of both types of nodes of a bipartite graph. 2) Link prediction: The compression scheme of SCMiner reveals suspicious edges which are probably erroneous as well as missing edges, i.e. pairs of nodes which should be connected by an edge. 3) Discovery of the hidden structure: Unlike traditional co-clustering methods, the result of SCMiner is not limited to row- and column-clusters. Besides the clusters, the summary graph also contains the essential relationships between both types of clusters and thus reveals the hidden structure of the data. Extensive experiments on synthetic and real data demonstrate that SCMiner outperforms state-of-the-art techniques for clustering and link prediction. Moreover, SCMiner discovers the hidden structure and reports it in an interpretable way to the user. Based on data compression, our technique does not rely on any input parameters which are difficult to estimate.