Maximal exceptions with minimal descriptions

  • Authors:
  • Matthijs Leeuwen

  • Affiliations:
  • Department of Information and Computing Sciences, Universiteit Utrecht, Utrecht, The Netherlands

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

We introduce a new approach to Exceptional Model Mining. Our algorithm, called EMDM, is an iterative method that alternates between Exception Maximisation and Description Minimisation. As a result, it finds maximally exceptional models with minimal descriptions. Exceptional Model Mining was recently introduced by Leman et al. (Exceptional model mining 1---16, 2008) as a generalisation of Subgroup Discovery. Instead of considering a single target attribute, it allows for multiple `model' attributes on which models are fitted. If the model for a subgroup is substantially different from the model for the complete database, it is regarded as an exceptional model. To measure exceptionality, we propose two information-theoretic measures. One is based on the Kullback---Leibler divergence, the other on Krimp. We show how compression can be used for exception maximisation with these measures, and how classification can be used for description minimisation. Experiments show that our approach efficiently identifies subgroups that are both exceptional and interesting.