Non-negative matrix factorization for semi-supervised data clustering

  • Authors:
  • Yanhua Chen;Manjeet Rege;Ming Dong;Jing Hua

  • Affiliations:
  • Wayne State University, Department of Computer Science, Detroit, MI, USA (all authors)

  • Venue:
  • Knowledge and Information Systems
  • Year:
  • 2008

Abstract

Traditional clustering algorithms are inapplicable to many real-world problems where limited knowledge from domain experts is available. Incorporating this domain knowledge can guide a clustering algorithm and consequently improve the quality of clustering. In this paper, we propose SS-NMF: a semi-supervised non-negative matrix factorization framework for data clustering. In SS-NMF, users are able to provide supervision for clustering in terms of pairwise constraints on a few data objects, specifying whether they “must” or “cannot” be clustered together. Through an iterative algorithm, we perform symmetric tri-factorization of the data similarity matrix to infer the clusters. Theoretically, we show the correctness and convergence of SS-NMF. Moreover, we show that SS-NMF provides a general framework for semi-supervised clustering, of which existing approaches can be considered special cases. Through extensive experiments conducted on publicly available datasets, we demonstrate the superior performance of SS-NMF for clustering.
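
The abstract does not state the objective or the update rules, so the following is only a minimal sketch of the general idea, written under several assumptions. It assumes that must-link and cannot-link pairs are folded into the similarity matrix A by adding or subtracting a fixed weight, and that the cluster indicator G and the cluster-centroid matrix S are found by generic multiplicative updates that aim to reduce ||A − G S Gᵀ||² with G, S ≥ 0. The function name `ss_nmf_sketch`, the parameter `weight`, the clipping of negative similarities, and the square-root damping on the G update are all illustrative choices, not the authors' published algorithm.

```python
import numpy as np

def ss_nmf_sketch(A, k, must_link=(), cannot_link=(), weight=1.0,
                  n_iter=200, eps=1e-9, seed=0):
    """Hypothetical sketch: constrained symmetric tri-factorization A ~ G S G^T."""
    A_tilde = np.asarray(A, dtype=float).copy()
    n = A_tilde.shape[0]

    # Assumption: reward must-link pairs and penalize cannot-link pairs
    # directly in the similarity matrix, keeping it symmetric.
    for i, j in must_link:
        A_tilde[i, j] += weight
        A_tilde[j, i] += weight
    for i, j in cannot_link:
        A_tilde[i, j] -= weight
        A_tilde[j, i] -= weight
    # Assumption: clip so the multiplicative updates stay non-negative.
    A_tilde = np.maximum(A_tilde, 0.0)

    rng = np.random.default_rng(seed)
    G = rng.random((n, k)) + eps          # n x k cluster indicator
    S = rng.random((k, k)) + eps          # k x k cluster-level matrix
    S = (S + S.T) / 2.0                   # start from a symmetric S

    for _ in range(n_iter):
        GtG = G.T @ G
        # S update: ratio of the negative and positive gradient parts.
        S *= (G.T @ A_tilde @ G) / (GtG @ S @ GtG + eps)
        GS = G @ S
        # G update, damped with a square root (an illustrative choice).
        G *= np.sqrt((A_tilde @ GS) / (GS @ (G.T @ GS) + eps))

    labels = G.argmax(axis=1)             # hard assignment per data object
    return G, S, labels


# Usage on toy data: two Gaussian blobs with an RBF similarity matrix.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (5, 2)), rng.normal(5.0, 0.3, (5, 2))])
sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
A = np.exp(-sq_dist)
G, S, labels = ss_nmf_sketch(A, k=2, must_link=[(0, 1)], cannot_link=[(0, 9)])
```

Reading a hard cluster label off the largest entry in each row of G is one common convention. The sketch makes no claim about the correctness or convergence results that the paper establishes for its own algorithm.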