In this paper, we consider the supervised learning task of predicting the normalized rank of a numerical variable. We introduce a novel probabilistic approach to estimate the posterior distribution of the target rank conditional on the predictors. We turn this learning task into a model selection problem. To this end, we define a 2D partitioning family obtained by discretizing numerical variables and grouping the values of categorical ones, and we derive an analytical criterion to select the partition with the highest posterior probability. We show how these partitions can be used to build univariate predictors, as well as multivariate ones under a naive Bayes assumption. We also propose a new evaluation criterion for probabilistic rank estimators. Based on the logarithmic score, this criterion has the advantage of being bounded below, which is not the case for the logarithmic score of probabilistic value estimators. A first set of experiments on synthetic data demonstrates the good properties of the proposed criterion and of our partitioning approach. A second set of experiments on real data shows that the univariate and selective naive Bayes rank estimators, projected onto the value range, perform competitively with the methods submitted to a recent challenge on probabilistic metric regression tasks. Our approach is applicable to any regression problem with categorical or numerical predictors. It is particularly interesting for problems with many predictors, as it automatically detects the variables that carry predictive information, and it builds pertinent predictors of the normalized rank of the numerical target from one or several predictors. As the selection criterion is regularized by a prior term and a posterior term, it does not suffer from overfitting.
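To make the setting concrete, the sketch below illustrates three ingredients mentioned in the abstract on synthetic data: mapping a numerical target to normalized ranks, estimating a piecewise-constant conditional rank density from a 2D partition of one predictor, and evaluating it with the logarithmic score. This is a hypothetical illustration with a hand-chosen equal-width grid, not the paper's method, which selects the partition by maximizing an analytical posterior-probability criterion; the midpoint rank convention and add-one smoothing are also assumptions made here for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalized_ranks(y):
    # Midpoint normalized rank in (0, 1); the exact convention is an assumption.
    return (np.argsort(np.argsort(y)) + 0.5) / len(y)

# Synthetic data: the target depends on a single numerical predictor.
n = 2000
x = rng.uniform(0.0, 1.0, n)
y = x + 0.1 * rng.normal(size=n)
r = normalized_ranks(y)

# Hand-chosen equal-width 2D grid, purely for illustration
# (the paper selects the partition with a MAP criterion instead).
x_edges = np.linspace(0.0, 1.0, 11)   # 10 predictor intervals
r_edges = np.linspace(0.0, 1.0, 11)   # 10 rank intervals

ix = np.clip(np.digitize(x, x_edges) - 1, 0, 9)
ir = np.clip(np.digitize(r, r_edges) - 1, 0, 9)

counts = np.ones((10, 10))            # add-one smoothing to avoid log(0)
np.add.at(counts, (ix, ir), 1.0)

cell_prob = counts / counts.sum(axis=1, keepdims=True)  # P(rank cell | x cell)
density = cell_prob / np.diff(r_edges)                  # piecewise-constant density

# Mean negative log density of the true rank (logarithmic score).
log_score = -np.mean(np.log(density[ix, ir]))
```

Because the normalized rank lives in [0, 1] and the density is piecewise constant on a finite grid, the per-instance density cannot exceed 1/width of the narrowest rank cell, so the score is bounded below (here by -log 10); a value-density estimator on an unbounded range has no such bound, which is the advantage the abstract points to.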