Co-clustering Documents and Words Using Bipartite Spectral Graph Partitioning

  • Authors:
  • Inderjit S. Dhillon

  • Affiliations:
  • -

  • Venue:
  • Co-clustering Documents and Words Using Bipartite Spectral Graph Partitioning
  • Year:
  • 2001

Abstract

Both document clustering and word clustering are important and well-studied problems. By using the vector space model, a document collection may be represented as a word-document matrix. In this paper, we present the novel idea of modeling the document collection as a bipartite graph between documents and words. Using this model, we pose the clustering problem as a graph partitioning problem and give a new spectral algorithm that simultaneously yields a clustering of documents and words. This co-clustering algorithm uses the second left and right singular vectors of an appropriately scaled word-document matrix to yield good bipartitionings. In fact, it can be shown that these singular vectors give a real relaxation to the optimal solution of the graph bipartitioning problem. We present several experimental results to verify that the resulting co-clustering algorithm works well in practice and is robust in the presence of noise.
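
The bipartitioning step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the word-document matrix is scaled, so the code below assumes a degree normalization, D1^{-1/2} A D2^{-1/2}, where D1 and D2 are diagonal matrices of the row (word) and column (document) sums. The function name `cocluster_bipartition`, the `n_iter` parameter, and the simple 1-D 2-means split are illustrative choices, not details taken from the abstract.

```python
import numpy as np


def cocluster_bipartition(A, n_iter=20):
    """Split words (rows) and documents (columns) of A into two co-clusters.

    Hypothetical helper: the scaling and the 2-means split are assumptions.
    """
    A = np.asarray(A, dtype=float)

    # Degrees in the bipartite graph: total edge weight per word and per document.
    d1 = A.sum(axis=1)
    d2 = A.sum(axis=0)
    if np.any(d1 <= 0) or np.any(d2 <= 0):
        raise ValueError("every word and document needs a positive total weight")

    # Assumed scaling: A_scaled = D1^{-1/2} A D2^{-1/2}.
    r = 1.0 / np.sqrt(d1)
    c = 1.0 / np.sqrt(d2)
    A_scaled = r[:, None] * A * c[None, :]

    # Singular values come back in descending order, so index 1 is the
    # second left/right singular vector pair.
    U, _, Vt = np.linalg.svd(A_scaled, full_matrices=False)

    # Stack the rescaled second singular vectors into one 1-D embedding
    # that covers words first, then documents.
    z = np.concatenate([r * U[:, 1], c * Vt[1, :]])

    # Plain 1-D 2-means, started from the two extreme values of z.
    lo, hi = z.min(), z.max()
    for _ in range(n_iter):
        labels = (np.abs(z - hi) < np.abs(z - lo)).astype(int)
        lo, hi = z[labels == 0].mean(), z[labels == 1].mean()

    n_words = A.shape[0]
    return labels[:n_words], labels[n_words:]
```

Calling `cocluster_bipartition(A)` on a word-document count matrix returns one 0/1 label array for words and one for documents; words and documents that share a label form a co-cluster. The sign of a singular vector is arbitrary, so only the grouping is meaningful, not which group is labeled 0 or 1.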