Nonnegative factor analysis for text document clustering

  • Authors:
  • Lenka Skovajsová;Igor Mokriš

  • Affiliations:
  • Institute of Informatics, Slovak Academy of Sciences Bratislava, Slovakia;Institute of Informatics, Slovak Academy of Sciences Bratislava, Slovakia

  • Venue:
  • SMO'09 Proceedings of the 9th WSEAS international conference on Simulation, modelling and optimization
  • Year:
  • 2009

Abstract

This paper deals with text document clustering. A neural network is used for preprocessing, and nonnegative factor analysis is then applied to create a given number of clusters. Results on part of the Reuters-21578 collection show that the given number of clusters is created; the difference between clusters is measured as the cosine similarity between the centroids of the respective clusters. The results also show that when the data are preprocessed by PCA, nonnegative factor analysis divides the documents into the given number of clusters quite successfully.
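
The cluster-difference measure mentioned in the abstract, cosine similarity between cluster centroids, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy document vectors, and the cluster labels below are all hypothetical, and the preprocessing and factorization steps are assumed to have already produced one label per document.

```python
import math
from collections import defaultdict


def centroids(vectors, labels):
    """Return the mean vector of each cluster, keyed by cluster label."""
    groups = defaultdict(list)
    for vec, label in zip(vectors, labels):
        groups[label].append(vec)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in groups.items()
    }


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid_similarities(vectors, labels):
    """Cosine similarity for every unordered pair of cluster centroids."""
    cents = centroids(vectors, labels)
    keys = sorted(cents)
    return {
        (a, b): cosine_similarity(cents[a], cents[b])
        for i, a in enumerate(keys)
        for b in keys[i + 1:]
    }


# Hypothetical toy data: four documents over three terms, two clusters.
docs = [
    [2.0, 0.0, 1.0],
    [4.0, 0.0, 1.0],
    [0.0, 3.0, 0.0],
    [0.0, 5.0, 0.0],
]
labels = [0, 0, 1, 1]
sims = centroid_similarities(docs, labels)
```

A similarity close to 0 between two centroids suggests the clusters share few terms, while a value close to 1 suggests they are hard to tell apart. In this toy case the two centroids have no terms in common.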