Structured data classification by means of matrix factorization

  • Authors: Paolo Garza
  • Affiliations: Politecnico di Milano, Milano, Italy
  • Venue: Proceedings of the 20th ACM International Conference on Information and Knowledge Management
  • Year: 2011

Abstract

Singular Value Decomposition (SVD) has been extensively used in the classification context as a preprocessing step that reduces the number of features of the input space. Traditional classification algorithms are then applied to the new space to generate accurate models. In this paper, we propose a different use of SVD. In our approach, SVD is the building block of a new classification algorithm, called CMF, rather than of a feature reduction algorithm. In particular, the classification model corresponds to the right singular vectors associated with the k largest singular values of the factorization obtained by applying SVD to the training dataset. The selected singular vectors represent the main "characteristics" of the training data and can be used to provide accurate predictions. Experiments performed on 15 structured UCI datasets show that CMF is efficient and that, despite its simplicity, it is more accurate than many state-of-the-art classification algorithms.
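
The abstract does not spell out how predictions are derived from the selected singular vectors, so the following is only a minimal sketch of one plausible reading, not the CMF procedure itself. It assumes that class labels are appended to the features as one-hot columns before factorization, and that a new record is classified by estimating its latent coordinates from the feature part of the singular vectors and then reconstructing the class columns. The function names `fit_right_singular_vectors` and `predict_classes`, the one-hot encoding, and the least-squares prediction step are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_right_singular_vectors(X, y, n_classes, k):
    """Keep the k right singular vectors with the largest singular values.

    Assumption (not from the abstract): the training matrix is the feature
    matrix with one-hot class indicator columns appended.
    """
    Y = np.eye(n_classes)[y]                 # one-hot class columns
    A = np.hstack([X, Y])                    # one row per training record
    # np.linalg.svd returns singular values in descending order,
    # so the first k rows of Vt are the vectors we want.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k]                            # shape: (k, n_features + n_classes)

def predict_classes(Vk, X_new, n_classes):
    """Hypothetical prediction step based on reconstructing the class columns."""
    n_features = Vk.shape[1] - n_classes
    V_feat = Vk[:, :n_features]              # feature part of each singular vector
    V_cls = Vk[:, n_features:]               # class part of each singular vector
    # Least-squares latent coordinates Z such that Z @ V_feat approximates X_new.
    Z = np.linalg.lstsq(V_feat.T, X_new.T, rcond=None)[0].T
    scores = Z @ V_cls                       # reconstructed class indicators
    return scores.argmax(axis=1)             # highest-scoring class per record

# Toy data, invented purely for illustration.
X_train = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
y_train = np.array([0, 0, 1, 1])
model = fit_right_singular_vectors(X_train, y_train, n_classes=2, k=2)
predictions = predict_classes(model, np.array([[0.9, 0.1], [0.2, 0.8]]), n_classes=2)
```

In this reading the "model" is just a small matrix of k vectors, which is consistent with the abstract's emphasis on simplicity; the choice of k would presumably be tuned per dataset.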