An MDL approach to efficiently discover communities in bipartite network

Authors:
Kaikuo Xu;Changjie Tang;Chuan Li;Yexi Jiang;Rong Tang
Affiliations:
School of Computer Science, Sichuan University, China;School of Computer Science, Sichuan University, China;School of Computer Science, Sichuan University, China;School of Computer Science, Sichuan University, China;School of Computer Science, Sichuan University, China
Venue:
DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part I
Year:
2010

Abstract

Bipartite networks are a class of complex networks. They are widely used in applications such as social network analysis, collaborative filtering, and information retrieval. Partitioning a bipartite network into smaller modules helps reveal its structure. The main contributions of this paper are: (1) proposing an MDL criterion for identifying a good partition of a bipartite network; (2) presenting a greedy algorithm based on combination theory, named MDL-greedy, to approximate the optimal partition of a bipartite network, where the algorithm automatically searches for the number of partitions and requires no user intervention; (3) conducting experiments on synthetic datasets and the southern women dataset. The results show that our method generates higher-quality results than the state-of-the-art methods Cross-Association and Information-theoretic co-clustering. The experimental results also show that the proposed algorithm scales well. The improvements reach up to about 14% in precision, 40% in ratio, and 70% in running time.
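The abstract does not give the MDL criterion itself, so the following is only a minimal sketch of how a generic two-part description length for a partitioned bipartite network might be computed. The function names (`block_bits`, `description_length`), the particular model cost, and the toy matrix are all assumptions made for illustration; they are not the paper's formulation or its MDL-greedy algorithm.

```python
import math


def block_bits(ones, size):
    """Bits to encode `size` binary cells of which `ones` are 1 (entropy code)."""
    if size == 0 or ones in (0, size):
        return 0.0
    p = ones / size
    return -size * (p * math.log2(p) + (1 - p) * math.log2(1 - p))


def description_length(matrix, row_labels, col_labels):
    """Illustrative two-part code length in bits (assumed form, not the paper's).

    matrix      binary biadjacency matrix as a list of rows
    row_labels  group index of each row vertex
    col_labels  group index of each column vertex
    """
    k = max(row_labels) + 1
    l = max(col_labels) + 1
    ones = [[0] * l for _ in range(k)]
    size = [[0] * l for _ in range(k)]
    for i, row in enumerate(matrix):
        for j, cell in enumerate(row):
            r, c = row_labels[i], col_labels[j]
            ones[r][c] += cell
            size[r][c] += 1
    # Model cost: one group label per vertex, plus the count of ones per block.
    model_bits = len(row_labels) * math.log2(k) + len(col_labels) * math.log2(l)
    model_bits += sum(math.log2(size[r][c] + 1) for r in range(k) for c in range(l))
    # Data cost: cells of each block encoded with that block's density.
    data_bits = sum(block_bits(ones[r][c], size[r][c])
                    for r in range(k) for c in range(l))
    return model_bits + data_bits


# Toy network with two clean communities (hypothetical data).
matrix = [[1, 1, 1, 0, 0, 0]] * 3 + [[0, 0, 0, 1, 1, 1]] * 3
one_group = [0] * 6
two_groups = [0, 0, 0, 1, 1, 1]

print(description_length(matrix, one_group, one_group))
print(description_length(matrix, two_groups, two_groups))
```

On this toy input the two-group partition yields the shorter description, which is the sense in which an MDL-style criterion can prefer one partition over another without the number of groups being fixed in advance.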