Bipartite isoperimetric graph partitioning for data co-clustering

Authors:
Manjeet Rege;Ming Dong;Farshad Fotouhi
Affiliations:
Department of Computer Science, Wayne State University, Detroit, USA 48202;Department of Computer Science, Wayne State University, Detroit, USA 48202;Department of Computer Science, Wayne State University, Detroit, USA 48202
Venue:
Data Mining and Knowledge Discovery
Year:
2008

Abstract

Data co-clustering is the simultaneous clustering of two data types. Typically, the data are stored in a contingency or co-occurrence matrix C whose rows and columns represent the two data types to be co-clustered. An entry C_ij of the matrix quantifies the relation between the item represented by row i and the item represented by column j. Co-clustering derives sub-matrices from the larger data matrix by clustering its rows and columns simultaneously. In this paper, we present a novel graph-theoretic approach to data co-clustering. The two data types are modeled as the two vertex sets of a weighted bipartite graph. We then propose the Isoperimetric Co-clustering Algorithm (ICA), a new method for partitioning the bipartite graph. ICA requires only the solution of a sparse system of linear equations, instead of the eigenvalue or SVD problem solved in the popular spectral co-clustering approach. Our theoretical analysis and extensive experiments on publicly available datasets demonstrate the advantages of ICA over other approaches in terms of the quality, efficiency, and stability of the bipartite graph partitioning.
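
The abstract does not give the steps of ICA, so the following is only a minimal sketch of how a bipartite isoperimetric bipartition could look. It assumes the general isoperimetric partitioning recipe: build the graph Laplacian, ground one vertex, solve the reduced sparse linear system, and threshold the solution at the cut with the lowest isoperimetric ratio. The function name isoperimetric_cocluster, the choice of the highest-degree vertex as ground, and the exhaustive threshold sweep are all assumptions made for illustration, not details taken from the paper. The sketch also assumes a connected bipartite graph in which every row and column has at least one nonzero entry.

```python
import numpy as np
from scipy.sparse import bmat, csr_matrix, diags
from scipy.sparse.linalg import spsolve


def isoperimetric_cocluster(C):
    """Split rows and columns of a nonnegative matrix C into two co-clusters.

    Hypothetical helper: returns (row_labels, col_labels, ratio), where the
    labels are boolean arrays and ratio is the isoperimetric ratio of the cut.
    """
    C = csr_matrix(C, dtype=float)
    m, n = C.shape

    # Bipartite adjacency: rows and columns of C are the two vertex sets.
    W = bmat([[None, C], [C.T, None]], format="csr")
    d = np.asarray(W.sum(axis=1)).ravel()   # weighted vertex degrees
    L = (diags(d) - W).tocsr()              # graph Laplacian L = D - W

    # Ground vertex (assumption: the vertex with the largest degree).
    g = int(np.argmax(d))
    keep = np.delete(np.arange(m + n), g)

    # Solve the reduced sparse system L0 x0 = d0; the ground keeps x = 0.
    x = np.zeros(m + n)
    x[keep] = spsolve(L[keep][:, keep].tocsc(), d[keep])

    # Sweep thresholds over the sorted solution and keep the best ratio
    # cut(S) / min(vol(S), vol(complement of S)).
    order = np.argsort(x)
    total = d.sum()
    best_ratio, best_s = np.inf, None
    for k in range(1, m + n):
        s = np.zeros(m + n)
        s[order[:k]] = 1.0
        cut = s @ (L @ s)
        vol = d @ s
        ratio = cut / min(vol, total - vol)
        if ratio < best_ratio:
            best_ratio, best_s = ratio, s.astype(bool)

    return best_s[:m], best_s[m:], best_ratio


# Toy co-occurrence matrix: two dense blocks joined by one weak entry.
C = [[5, 4, 0, 0],
     [4, 5, 1, 0],
     [0, 0, 5, 4],
     [0, 0, 4, 5]]
rows, cols, ratio = isoperimetric_cocluster(C)
print(rows, cols, ratio)
```

A single sparse solve replaces the eigenvector or SVD computation of the spectral approach, which is the contrast the abstract draws. The sweep here recomputes each cut from scratch for readability; an incremental update would be cheaper on large graphs.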