Formalizing complex prior information to quantify subjective interestingness of frequent pattern sets

Authors:
Kleanthis-Nikolaos Kontonasios; Tijl De Bie
Affiliations:
Intelligent Systems Laboratory, University of Bristol, Bristol, UK; Intelligent Systems Laboratory, University of Bristol, Bristol, UK
Venue:
IDA'12 Proceedings of the 11th international conference on Advances in Intelligent Data Analysis
Year:
2012

References:
The budgeted maximum coverage problem. Information Processing Letters.
What Makes Patterns Interesting in Knowledge Discovery Systems. IEEE Transactions on Knowledge and Data Engineering.
Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing).
Assessing data mining results via swap randomization. ACM Transactions on Knowledge Discovery from Data (TKDD).
Tell me something I don't know: randomization strategies for iterative data mining. Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining.
Probabilistic Graphical Models: Principles and Techniques - Adaptive Computation and Machine Learning.
Mining advisor-advisee relationships from research publication networks. Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining.
Using background knowledge to rank itemsets. Data Mining and Knowledge Discovery.


Abstract

In this paper, we are concerned with the problem of modelling prior information of a data miner about the data, with the purpose of quantifying subjective interestingness of patterns. Recent results have achieved this for the specific case of prior expectations on the row and column marginals, based on the Maximum Entropy principle [2,9]. In the current paper, we extend these ideas to make them applicable to more general prior information, such as knowledge of frequencies of itemsets, a cluster structure in the data, or the presence of dense areas in the database. As in [2,9], we show how information theory can be used to quantify subjective interestingness against this model, in particular the subjective interestingness of tile patterns [3]. Our method presents an efficient, flexible, and rigorous alternative to the randomization approach presented in [5]. We demonstrate our method by searching for interesting patterns in real-life data with respect to various realistic types of prior information.