Fast nonnegative matrix factorization and its application for protein fold recognition

  • Authors:
  • Oleg Okun;Helen Priisalu

  • Affiliations:
  • Machine Vision Group, Infotech Oulu and Department of Electrical and Information Engineering, University of Oulu, Finland;Machine Vision Group, Infotech Oulu and Department of Electrical and Information Engineering, University of Oulu, Finland

  • Venue:
  • EURASIP Journal on Applied Signal Processing
  • Year:
  • 2006

Quantified Score

Hi-index 0.01

Visualization

Abstract

Linear and unsupervised dimensionality reduction via matrix factorization with nonnegativity constraints is studied. Because of these constraints, it stands apart from other linear dimensionality reduction methods. Here we explore nonnegative matrix factorization in combination with three nearest-neighbor classifiers for protein fold recognition. Since typically matrix factorization is iteratively done, convergence, can be slow. To speed up convergence, we perform feature scaling (normalization) prior to the beginning of iterations. This results in a significantly (more than 11 times) faster algorithm. Justification of why it happens is provided. Another modification of the standard nonnegative matrix factorization algorithm is concerned with combining two known techniques for mapping unseen data. This operation is typically necessary before classifying the data in low-dimensional space. Combining two mapping techniques can yield better accuracy than using either technique alone. The gains, however, depend on the state of the random number generator used for initialization of iterations, a classifier, and its parameters. In particular, when employing the best out of three classifiers and reducing the original dimensionality by around 30%, these gains can reach more than 4%, compared to the classification in the original, high-dimensional space.