Subgroup discovery is a learning task that aims at finding interesting rules from classified examples. The search is guided by a utility function that trades off the coverage of rules against their statistical unusualness. One shortcoming of existing approaches is that they do not incorporate prior knowledge. To address this, a novel generic sampling strategy is proposed that turns pattern mining into an iterative process. In each iteration, subgroup discovery focuses on those patterns that are unexpected with respect to prior knowledge and previously discovered patterns. The result is a small, diverse set of understandable rules that characterise a specified property of interest. As a further contribution, this article derives a simple connection between subgroup discovery and classifier induction. For a popular utility function, this connection allows any standard rule induction algorithm to be applied to the task of subgroup discovery after a step of stratified resampling. The proposed techniques are compared empirically to state-of-the-art subgroup discovery algorithms.
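
The link between subgroup discovery and classifier induction can be illustrated with a small sketch. The abstract does not name the utility function, so this example assumes it is weighted relative accuracy, WRAcc(A -> C) = P(A) * (P(C|A) - P(C)); that choice, the function names, and the toy data are illustrative assumptions rather than the article's own material. Stratification is modelled here by reweighting each class to total weight 1/2 instead of drawing a random resample, which has the same effect in expectation. Under these assumptions, WRAcc on the original data equals P(C) * (1 - P(C)) * (2 * Acc' - 1), where Acc' is the accuracy of the rule on the stratified data, so ranking rules by stratified accuracy ranks them by WRAcc.

```python
from fractions import Fraction


def wracc(covered, positive):
    """Weighted relative accuracy P(A) * (P(C|A) - P(C)) of a rule A -> C.

    Computed in the equivalent form P(A, C) - P(A) * P(C).
    """
    n = len(covered)
    p_a = Fraction(sum(covered), n)
    p_c = Fraction(sum(positive), n)
    p_ac = Fraction(sum(1 for a, c in zip(covered, positive) if a and c), n)
    return p_ac - p_a * p_c


def stratified_accuracy(covered, positive):
    """Accuracy of 'predict C iff A' when each class carries total weight 1/2.

    Reweighting stands in for stratified resampling (an assumption of this sketch).
    """
    n_pos = sum(positive)
    n_neg = len(positive) - n_pos
    acc = Fraction(0)
    for a, c in zip(covered, positive):
        weight = Fraction(1, 2 * n_pos) if c else Fraction(1, 2 * n_neg)
        if bool(a) == bool(c):  # the rule's prediction matches the label
            acc += weight
    return acc


# Hypothetical toy data: covered[i] = 1 if the rule body A holds for example i,
# positive[i] = 1 if example i has the target property C.
covered = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
positive = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

p_c = Fraction(sum(positive), len(positive))
lhs = wracc(covered, positive)
rhs = p_c * (1 - p_c) * (2 * stratified_accuracy(covered, positive) - 1)
print(lhs, rhs, lhs == rhs)
```

Exact rational arithmetic is used so that the identity can be checked with equality rather than a floating-point tolerance.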