Knowledge discovery interestingness measures based on unexpectedness

Authors:
Kleanthis-Nikolaos Kontonasios; Eirini Spyropoulou; Tijl De Bie
Affiliations:
Intelligent Systems Laboratory, University of Bristol, Bristol, UK (all three authors)
Venue:
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Year:
2012

Abstract

Knowledge discovery methods often discover a large number of patterns. Although such a large pattern set can itself be of interest, it also presents considerable challenges. Indeed, the set often contains many uninteresting patterns that risk overwhelming the data miner. In addition, a single interesting pattern can be discovered in a multitude of tiny variations that are redundant for all practical purposes. These issues are referred to as the pattern explosion problem. They underlie much recent research attempting to quantify interestingness and redundancy between patterns, with the purpose of filtering a large pattern set down to an interesting and compact subset. Many diverse approaches to interestingness and corresponding interestingness measures (IMs) have been proposed in the literature. Some of them, named objective IMs, define interestingness based only on objective criteria of the pattern and data at hand. Subjective IMs additionally depend on the user's prior knowledge about the dataset. Formalizing unexpectedness is probably the most common approach for defining subjective IMs, where a pattern is deemed unexpected if it contradicts the user's expectations about the dataset. Such subjective IMs based on unexpectedness form the focus of this paper. We categorize measures based on unexpectedness into two major subgroups, namely, syntactical and probabilistic approaches. Based on this distinction, we survey different methods for assessing the unexpectedness of patterns with a special focus on frequent itemsets, tiles, association rules, and classification rules. © 2012 Wiley Periodicals, Inc.