A matrix-based approach for semi-supervised document co-clustering

  • Authors: Yanhua Chen; Lijun Wang; Ming Dong
  • Affiliations: Wayne State University, Detroit, MI, USA (all authors)
  • Venue: Proceedings of the 17th ACM conference on Information and knowledge management
  • Year: 2008

Abstract

To derive high-quality information from text, the field of text mining has advanced swiftly from simple document clustering to co-clustering documents and words. However, document co-clustering without any prior knowledge or background information is a challenging problem. In this paper, we propose a Semi-Supervised Non-negative Matrix Factorization (SS-NMF) based framework for document co-clustering. Our method computes a new word-document matrix by incorporating user-provided constraints through distance metric learning. Using an iterative algorithm, we perform tri-factorization of the new matrix to infer the document and word clusters. Through extensive experiments conducted on publicly available data sets, we demonstrate the superior performance of SS-NMF for document co-clustering.
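
The tri-factorization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' SS-NMF algorithm: it assumes a plain non-negative word-document matrix X (rows are words, columns are documents), approximates it as X ≈ F S Gᵀ, and uses the standard unconstrained multiplicative updates for the squared Frobenius error. The constraint-driven distance metric learning that produces the paper's "new" matrix is omitted, and the function name `nmf_tri_factorize`, its parameters, and the toy matrix are all hypothetical.

```python
import numpy as np


def nmf_tri_factorize(X, k_words, k_docs, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix X (words x documents) as F @ S @ G.T.

    F (words x k_words) holds word-cluster weights, G (documents x k_docs)
    holds document-cluster weights, and S links the two sets of clusters.
    Assumption: generic multiplicative updates, not the paper's exact rules.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n_words, n_docs = X.shape

    # Random strictly positive initialization keeps every factor non-negative.
    F = rng.random((n_words, k_words)) + eps
    S = rng.random((k_words, k_docs)) + eps
    G = rng.random((n_docs, k_docs)) + eps

    # Track the reconstruction error, starting with the initial guess.
    errors = [np.linalg.norm(X - F @ S @ G.T)]
    for _ in range(n_iter):
        # Each update rescales one factor while the other two stay fixed.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        errors.append(np.linalg.norm(X - F @ S @ G.T))
    return F, S, G, errors


# Toy word-document matrix with two word groups and two document groups.
X = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [3, 3, 0, 0],
    [0, 0, 2, 3],
    [0, 0, 3, 2],
    [0, 0, 3, 3],
], dtype=float)

F, S, G, errors = nmf_tri_factorize(X, k_words=2, k_docs=2)

# A simple heuristic reads cluster labels from the largest entry in each row.
word_labels = F.argmax(axis=1)
doc_labels = G.argmax(axis=1)
```

Reading labels with `argmax` is only a heuristic here, since the factors are not normalized. In a semi-supervised setting such as the one the abstract describes, the input matrix would first be transformed using the user-provided constraints before a factorization of this kind is applied.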