Generic pattern trees for exhaustive exceptional model mining
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Most current subgroup discovery approaches aim to find subsets of attribute-value data in which a single output variable has an unusual distribution. Real-life problems, however, often call for richer, multi-dimensional descriptions of the outcome. The discovery task in such domains is to find subsets of data instances that have similar outcome descriptions and are separable from the rest of the instances in the input space. We have developed a technique that addresses this problem directly. It combines agglomerative clustering, which finds subgroup candidates in the space of output attributes, with predictive modeling, which scores and describes these candidates in the input attribute space. Experiments with the proposed method on synthetic data sets and on a real social survey data set demonstrate its ability to discover relevant and interesting subgroups in data with multi-dimensional responses.
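The abstract gives no implementation details, so the following is only a minimal sketch of the two-stage idea under several assumptions that are not taken from the paper: average-linkage clustering with Euclidean distance in the output space, a single-attribute threshold rule (a decision stump) standing in for the predictive model, and training accuracy as the candidate score. The function names (`agglomerative`, `best_stump`, `discover_subgroups`) and the toy data are hypothetical.

```python
from itertools import combinations


def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering (assumed linkage choice).

    Returns a list of clusters, each a list of row indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        # Euclidean distance between two output vectors.
        return sum((x - y) ** 2 for x, y in zip(points[a], points[b])) ** 0.5

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the two closest clusters; combinations guarantees i < j.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def best_stump(inputs, members):
    """Find the one-attribute threshold rule that best separates
    `members` from the remaining rows in the input space.

    Returns (accuracy, rule_description).
    """
    n = len(inputs)
    labels = [i in members for i in range(n)]
    best = (0.0, None)
    for f in range(len(inputs[0])):
        for t in sorted({row[f] for row in inputs}):
            for op in ("<=", ">"):
                pred = [(row[f] <= t) if op == "<=" else (row[f] > t)
                        for row in inputs]
                acc = sum(p == l for p, l in zip(pred, labels)) / n
                if acc > best[0]:
                    best = (acc, f"x{f} {op} {t}")
    return best


def discover_subgroups(inputs, outputs, n_clusters):
    """Cluster in output space, then score and describe each candidate
    in input space. Returns (members, accuracy, rule) tuples."""
    results = []
    for cluster in agglomerative(outputs, n_clusters):
        acc, rule = best_stump(inputs, set(cluster))
        results.append((sorted(cluster), acc, rule))
    return sorted(results)


# Toy data (invented for illustration): two input attributes and a
# two-dimensional response per instance.
inputs = [(1, 5), (2, 3), (3, 8), (7, 4), (8, 6), (9, 2)]
outputs = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0),
           (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]

subgroups = discover_subgroups(inputs, outputs, n_clusters=2)
for members, acc, rule in subgroups:
    print(members, round(acc, 2), rule)
```

In this sketch a candidate counts as a useful subgroup only when its rule scores well, meaning the instances that share a similar multi-dimensional outcome can also be described compactly in the input space. A fuller treatment would presumably examine several levels of the cluster hierarchy and use a stronger model with held-out evaluation instead of training accuracy.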