Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Cross-Mining Binary and Numerical Attributes
ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining
AIME '09 Proceedings of the 12th Conference on Artificial Intelligence in Medicine: Artificial Intelligence in Medicine
Compressing tags to find interesting media groups
Proceedings of the 18th ACM conference on Information and knowledge management
Compression picks item sets that matter
PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases
Non-redundant subgroup discovery in large and complex data
ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III
From black and white to full color: extending redescription mining outside the Boolean world
Statistical Analysis and Data Mining
Generic pattern trees for exhaustive exceptional model mining
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Hi-index | 0.00 |
We introduce a new approach to Exceptional Model Mining. Our algorithm, called EMDM, is an iterative method that alternates between Exception Maximisation and Description Minimisation. As a result, it finds maximally exceptional models with minimal descriptions. Exceptional Model Mining was recently introduced by Leman et al. (Exceptional model mining 1---16, 2008) as a generalisation of Subgroup Discovery. Instead of considering a single target attribute, it allows for multiple `model' attributes on which models are fitted. If the model for a subgroup is substantially different from the model for the complete database, it is regarded as an exceptional model. To measure exceptionality, we propose two information-theoretic measures. One is based on the Kullback---Leibler divergence, the other on Krimp. We show how compression can be used for exception maximisation with these measures, and how classification can be used for description minimisation. Experiments show that our approach efficiently identifies subgroups that are both exceptional and interesting.