Knowledge extraction with non-negative matrix factorization for text classification

  • Authors:
  • Catarina Silva; Bernardete Ribeiro

  • Affiliations:
  • School of Technology and Management of the Polytechnic Institute of Leiria, Leiria, Portugal and Department of Informatics Engineering, Center for Informatics and Systems, University of Coimbra, C ...;Department of Informatics Engineering, Center for Informatics and Systems, University of Coimbra, Coimbra, Portugal

  • Venue:
  • IDEAL'09 Proceedings of the 10th international conference on Intelligent data engineering and automated learning
  • Year:
  • 2009

Abstract

Text classification has received increasing interest over the past decades because of its wide range of applications, driven by the ubiquity of textual information. The high dimensionality of those applications has led to pervasive use of dimensionality reduction methods, often black-box, non-linear feature extraction techniques. We show how Non-Negative Matrix Factorization (NMF), an algorithm able to learn a parts-based representation of data by imposing non-negativity constraints, can be used to represent and extract knowledge from a text classification problem. The resulting reduced set of features is tested with kernel-based machines on the Reuters-21578 benchmark, showing that the method performs competitively.
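As an illustration of the factorization the abstract refers to, the following is a minimal sketch of NMF on a toy term-document matrix. It is not the authors' implementation: the abstract does not say which NMF variant was used, so the sketch assumes the widely used multiplicative update rules for the squared-error objective. The matrix `V`, the vocabulary `TERMS`, and the function names are hypothetical and exist only for this example. Given a non-negative matrix V (terms by documents), the code looks for non-negative factors W (terms by rank) and H (rank by documents) such that V is approximately WH; the columns of H can then serve as a reduced feature vector for each document.

```python
import random


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a dense matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH, the objective being reduced."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2
        for i in range(len(v))
        for j in range(len(v[0]))
    )


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor non-negative V into non-negative W and H.

    Assumption: multiplicative updates for the squared-error objective.
    With iterations=0 the random initial factors are returned unchanged.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [
            [h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
            for k in range(rank)
        ]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [
            [w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
            for i in range(n_rows)
        ]
    return w, h


# Hypothetical toy data: rows are terms, columns are four short documents.
TERMS = ["oil", "barrel", "crude", "wheat", "grain", "harvest"]
V = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 2, 2],
    [0, 0, 1, 3],
]

W, H = nmf(V, rank=2)
# Column j of H is a 2-dimensional representation of document j;
# column k of W weights each term in TERMS for latent part k.
```

Because every entry of W and H stays non-negative, each latent part can be read as an additive group of terms, which is the sense in which the representation is parts-based. In a pipeline like the one the abstract describes, the columns of H would be passed to a kernel-based classifier; that step is omitted here.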