The discrete basis problem

Authors:
Pauli Miettinen;Taneli Mielikäinen;Aristides Gionis;Gautam Das;Heikki Mannila
Affiliations:
HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland;HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland;HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland;Computer Science and Engineering Department, University of Texas at Arlington, Arlington, TX;HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland
Venue:
PKDD'06 Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases
Year:
2006

Abstract

Matrix decomposition methods represent a data matrix as a product of two smaller matrices: one containing basis vectors that represent meaningful concepts in the data, and another describing how the observed data can be expressed as combinations of the basis vectors. Decomposition methods have been studied extensively, but many of them return real-valued matrices. If the original data is binary, such real-valued basis vectors are hard to interpret. We describe a matrix decomposition formulation, the Discrete Basis Problem, which seeks a Boolean decomposition of a binary matrix and thus lets the user interpret the basis vectors easily. We show that the problem is computationally difficult and give a simple greedy algorithm for solving it. We present experimental results for the algorithm. The method gives intuitively appealing basis vectors. On the other hand, continuous decomposition methods often give better reconstruction accuracy. We discuss the reasons for this behavior.
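
To make the notion of a Boolean decomposition concrete, the following minimal sketch shows a Boolean matrix product and one way to score a reconstruction. It is an illustration only, not the greedy algorithm from the paper. The names `boolean_product`, `reconstruction_error`, `C`, `S`, and `B` are hypothetical, and counting mismatched entries as the error is an assumption, since the abstract does not state the error measure.

```python
def boolean_product(S, B):
    """Boolean matrix product: entry (i, j) is the OR over l of S[i][l] AND B[l][j]."""
    k = len(B)       # number of basis vectors
    m = len(B[0])    # number of columns (attributes)
    return [
        [int(any(row[l] and B[l][j] for l in range(k))) for j in range(m)]
        for row in S
    ]


def reconstruction_error(C, S, B):
    """Count the entries where C differs from the Boolean product of S and B.

    Assumption: mismatched entries are used as the error measure for illustration.
    """
    P = boolean_product(S, B)
    return sum(c != p for c_row, p_row in zip(C, P) for c, p in zip(c_row, p_row))


# Toy binary data matrix (rows are observations, columns are attributes).
C = [
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
]

# Two binary basis vectors that overlap in the middle columns.
B = [
    [1, 1, 1, 0],
    [0, 1, 1, 1],
]

# Usage matrix: which basis vectors each row of C combines.
S = [
    [1, 0],
    [0, 1],
    [1, 1],
]

error = reconstruction_error(C, S, B)
```

In the third row both basis vectors are used. Under Boolean arithmetic the overlapping columns stay at 1 (1 OR 1 = 1), whereas the ordinary matrix product would yield 2 there, which is why a Boolean decomposition keeps both factors and the reconstruction binary.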