The discrete basis problem

  • Authors:
  • Pauli Miettinen;Taneli Mielikäinen;Aristides Gionis;Gautam Das;Heikki Mannila

  • Affiliations:
  • HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland;HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland;HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland;Computer Science and Engineering Department, University of Texas at Arlington, Arlington, TX;HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland

  • Venue:
  • PKDD'06: Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases
  • Year:
  • 2006

Abstract

Matrix decomposition methods represent a data matrix as a product of two smaller matrices: one containing basis vectors that represent meaningful concepts in the data, and another describing how the observed data can be expressed as combinations of the basis vectors. Decomposition methods have been studied extensively, but many of them return real-valued matrices. If the original data is binary, such real-valued basis vectors are hard to interpret. We describe a matrix decomposition formulation, the Discrete Basis Problem, which seeks a Boolean decomposition of a binary matrix, thus allowing the user to interpret the basis vectors easily. We show that the problem is computationally difficult and give a simple greedy algorithm for solving it. We present experimental results for the algorithm. The method gives intuitively appealing basis vectors; on the other hand, continuous decomposition methods often give better reconstruction accuracy. We discuss the reasons for this behavior.
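
To make the idea of a Boolean decomposition concrete, the following is a minimal sketch and not the greedy algorithm from the paper. It assumes that the Boolean product of a usage matrix S and a basis matrix B is an OR of ANDs, and that reconstruction error is the number of entries where the product disagrees with the data matrix C. The names `boolean_product`, `reconstruction_error`, `S`, `B`, and `C` are illustrative choices, not taken from the abstract.

```python
def boolean_product(S, B):
    """Boolean matrix product: entry (i, j) is OR over l of (S[i][l] AND B[l][j]).

    S is an n x k binary usage matrix and B is a k x m binary basis matrix,
    both given as lists of lists of 0/1 (assumed representation).
    """
    k = len(B)
    m = len(B[0])
    return [
        [int(any(row[l] and B[l][j] for l in range(k))) for j in range(m)]
        for row in S
    ]


def reconstruction_error(C, S, B):
    """Count the entries where the Boolean product of S and B differs from C."""
    P = boolean_product(S, B)
    return sum(
        c != p
        for c_row, p_row in zip(C, P)
        for c, p in zip(c_row, p_row)
    )


# Two overlapping binary basis vectors (rows of B).
B = [[1, 1, 0],
     [0, 1, 1]]

# Each row of S says which basis vectors a data row uses.
S = [[1, 0],
     [0, 1],
     [1, 1]]

# Data matrix whose third row is the OR of both basis vectors.
C = [[1, 1, 0],
     [0, 1, 1],
     [1, 1, 1]]

error = reconstruction_error(C, S, B)
```

In this toy case the third row of C uses both basis vectors. Under ordinary arithmetic the product would give 2 in the middle position of that row, whereas the Boolean product keeps every entry in {0, 1}, which is what lets each basis vector be read directly as a set of attributes.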