A New Probabilistic Approach in Rank Regression with Optimal Bayesian Partitioning
The Journal of Machine Learning Research
A Parameter-Free Classification Method for Large Scale Learning
The Journal of Machine Learning Research
The Orange customer analysis platform
ICDM'10 Proceedings of the 10th industrial conference on Advances in data mining: applications and theoretical aspects
Modelling complex data by learning which variable to construct
DaWaK'10 Proceedings of the 12th international conference on Data warehousing and knowledge discovery
A Bayesian approach for classification rule mining in quantitative databases
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
In supervised machine learning, partitioning the values of a categorical attribute (also called grouping) aims to construct a new synthetic attribute that preserves the information of the initial attribute while reducing its number of values. In this paper, we propose a new grouping method, MODL, founded on a Bayesian approach. The method relies on a model space of grouping models and on a prior distribution defined on this model space. This results in an evaluation criterion for groupings that is minimal for the most probable grouping given the data, i.e. the Bayes optimal grouping. We propose new super-linear optimization heuristics that yield near-optimal groupings. Extensive comparative experiments demonstrate that the MODL grouping method builds high-quality groupings in terms of predictive quality, robustness and a small number of groups.
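
The abstract does not state the evaluation criterion itself, so the following is only a minimal sketch of the general idea: score a grouping by a negative log posterior (prior cost of the grouping model plus likelihood cost of the class labels) and greedily merge groups while the score decreases. The specific cost terms are an assumption modelled on the form commonly given for MODL-style criteria, and the function names (`grouping_cost`, `greedy_grouping`, `stirling2_sum`) are hypothetical. The exhaustive pairwise merge search is a simple stand-in, not the super-linear heuristics proposed in the paper.

```python
from itertools import combinations
from math import lgamma, log


def log_factorial(n):
    """log(n!) computed via the log-gamma function."""
    return lgamma(n + 1)


def log_binomial(n, k):
    """log of the binomial coefficient C(n, k)."""
    return log_factorial(n) - log_factorial(k) - log_factorial(n - k)


def stirling2_sum(n_values, n_groups):
    """Number of ways to partition n_values items into at most n_groups non-empty groups."""
    s = [[0] * (n_groups + 1) for _ in range(n_values + 1)]
    s[0][0] = 1
    for n in range(1, n_values + 1):
        for m in range(1, min(n, n_groups) + 1):
            # Stirling numbers of the second kind recurrence.
            s[n][m] = m * s[n - 1][m] + s[n - 1][m - 1]
    return sum(s[n_values][1:n_groups + 1])


def grouping_cost(groups, counts, n_classes):
    """Assumed Bayesian cost of a grouping (lower is more probable).

    groups: list of lists of attribute values.
    counts: dict mapping each attribute value to its list of per-class counts.
    """
    n_values = len(counts)
    # Prior: choice of the number of groups, then of the partition of values.
    cost = log(n_values) + log(stirling2_sum(n_values, len(groups)))
    for group in groups:
        class_counts = [sum(counts[v][j] for v in group) for j in range(n_classes)]
        n_group = sum(class_counts)
        # Prior: choice of the class distribution inside the group.
        cost += log_binomial(n_group + n_classes - 1, n_classes - 1)
        # Likelihood: multinomial term for the observed class labels.
        cost += log_factorial(n_group) - sum(log_factorial(c) for c in class_counts)
    return cost


def greedy_grouping(counts, n_classes):
    """Bottom-up merging: apply the best pairwise merge while the cost decreases."""
    groups = [[v] for v in counts]
    best = grouping_cost(groups, counts, n_classes)
    while len(groups) > 1:
        candidates = []
        for a, b in combinations(range(len(groups)), 2):
            rest = [g for i, g in enumerate(groups) if i not in (a, b)]
            merged = rest + [groups[a] + groups[b]]
            candidates.append((grouping_cost(merged, counts, n_classes), merged))
        cost, merged = min(candidates, key=lambda c: c[0])
        if cost >= best:
            break
        best, groups = cost, merged
    return groups, best
```

For example, with made-up counts `{"a": [30, 0], "b": [28, 2], "c": [1, 29], "d": [0, 30]}` and two classes, values with similar class distributions end up in the same group, because merging them lowers the prior cost more than it raises the likelihood cost.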