Most, if not all, databases are mixtures of samples from different distributions, and transactional data is no exception. In the prototypical example, supermarket basket analysis, one expects a mixture of different buying patterns: households of retired people buy different collections of items than households with young children. Models that take such underlying distributions into account are in general superior to those that do not. In this paper we introduce two algorithms based on the minimum description length (MDL) principle that follow orthogonal approaches to identifying the components in a transaction database. The first follows a model-based approach, while the second is data-driven. Both are parameter-free: the number of components and the components themselves are chosen such that the combined complexity of data and models is minimised. Further, neither prior knowledge of the distributions nor a distance metric on the data is required. Experiments with both methods show that highly characteristic components are identified.
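The selection criterion described above, choosing the components so that the combined encoded size of models and data is minimal, can be illustrated with a small sketch. It is not either of the two algorithms introduced in the paper: the per-component model here is a simple independent-item code chosen for brevity, and the names `component_bits`, `total_bits`, and the toy baskets are all hypothetical. It only shows how a two-part description length can favour splitting a mixture into its components without any distance metric.

```python
import math
from typing import FrozenSet, List, Sequence

Transaction = FrozenSet[str]


def component_bits(transactions: Sequence[Transaction], items: Sequence[str]) -> float:
    """Two-part code length (in bits) of one component.

    Assumption: items are encoded independently, which is a stand-in
    model and not the encoding used in the paper.
    """
    n = len(transactions)
    if n == 0:
        return 0.0
    # Model cost: one occurrence count in 0..n per item.
    model = len(items) * math.log2(n + 1)
    # Data cost: n times the binary entropy of each item's frequency.
    data = 0.0
    for item in items:
        k = sum(1 for t in transactions if item in t)
        for count, prob in ((k, k / n), (n - k, (n - k) / n)):
            if count > 0:
                data -= count * math.log2(prob)
    return model + data


def total_bits(partition: Sequence[Sequence[Transaction]], items: Sequence[str]) -> float:
    """Combined complexity of a partition: models, data, and component labels."""
    n_total = sum(len(comp) for comp in partition)
    total = 0.0
    for comp in partition:
        if comp:
            total += component_bits(comp, items)
            # Cost of stating which component each transaction belongs to.
            total -= len(comp) * math.log2(len(comp) / n_total)
    return total


# Hypothetical toy data: two buying patterns mixed in one database.
ITEMS = ["tea", "biscuits", "diapers", "milk"]
retired: List[Transaction] = [frozenset({"tea", "biscuits"})] * 8
young: List[Transaction] = [frozenset({"diapers", "milk"})] * 8

one_component = total_bits([retired + young], ITEMS)
two_components = total_bits([retired, young], ITEMS)
print(f"one component:  {one_component:.2f} bits")
print(f"two components: {two_components:.2f} bits")
```

On this toy data the split into two components yields the shorter total description, so an MDL criterion of this kind would prefer it. The number of components is not supplied as a parameter; it falls out of comparing description lengths across candidate partitions.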