Explora: a multipattern and multistrategy discovery assistant
Advances in knowledge discovery and data mining
Discovering Frequent Closed Itemsets for Association Rules
ICDT '99 Proceedings of the 7th International Conference on Database Theory
An Algorithm for Multi-relational Discovery of Subgroups
PKDD '97 Proceedings of the First European Symposium on Principles of Data Mining and Knowledge Discovery
Subgroup Discovery with CN2-SD
The Journal of Machine Learning Research
IEEE Transactions on Pattern Analysis and Machine Intelligence
The Minimum Description Length Principle (Adaptive Computation and Machine Learning)
The Minimum Description Length Principle (Adaptive Computation and Machine Learning)
The Journal of Machine Learning Research
Tight Optimistic Estimates for Fast Subgroup Discovery
ECML PKDD '08 Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
The Chosen Few: On Identifying Valuable Patterns
ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining
Beam search induction and similarity constraints for predictive clustering trees
KDID'06 Proceedings of the 5th international conference on Knowledge discovery in inductive databases
Maximal exceptions with minimal descriptions
Data Mining and Knowledge Discovery
Krimp: mining itemsets that compress
Data Mining and Knowledge Discovery
PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases
An enhanced relevance criterion for more concise supervised pattern discovery
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Generic pattern trees for exhaustive exceptional model mining
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Multi-label lego -- enhancing multi-label classifiers with local patterns
IDA'12 Proceedings of the 11th international conference on Advances in Intelligent Data Analysis
Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics
Hi-index | 0.00 |
Large and complex data is challenging for most existing discovery algorithms, for several reasons. First of all, such data leads to enormous hypothesis spaces, making exhaustive search infeasible. Second, many variants of essentially the same pattern exist, due to (numeric) attributes of high cardinality, correlated attributes, and so on. This causes top-k mining algorithms to return highly redundant result sets, while ignoring many potentially interesting results. These problems are particularly apparent with Subgroup Discovery and its generalisation, Exceptional Model Mining. To address this, we introduce subgroup set mining: one should not consider individual subgroups, but sets of subgroups. We consider three degrees of redundancy, and propose corresponding heuristic selection strategies in order to eliminate redundancy. By incorporating these strategies in a beam search, the balance between exploration and exploitation is improved. Experiments clearly show that the proposed methods result in much more diverse subgroup sets than traditional Subgroup Discovery methods.