IEEE Transactions on Knowledge and Data Engineering
In many application fields, huge binary datasets modeling real-life phenomena are produced daily. These datasets record observations of events, and people are often interested in mining them to recognize recurrent patterns. However, discovering the most important patterns is very challenging: patterns may overlap, or they may relate only to a particular subset of the observations. The mining can also be hindered by the presence of noise. In this paper, we introduce a generative pattern model and an associated cost model for evaluating the goodness of a set of patterns extracted from a binary dataset. We propose an efficient algorithm, named GPM, for discovering the most relevant patterns according to the model. We show that the proposed model generalizes other approaches and supports the discovery of high-quality patterns.