While real data often comes in mixed format, discrete and continuous, many supervised induction algorithms require discrete data. Efficient discretization of continuous attributes is therefore an important problem that affects the speed, accuracy and understandability of the induced models. In this paper, we propose a new discretization method, MODL, founded on a Bayesian approach. We introduce a space of discretization models and a prior distribution defined on this model space. This leads to a Bayes optimal criterion for evaluating discretizations. We then propose a new super-linear optimization algorithm that finds near-optimal discretizations. Extensive comparative experiments on both real and synthetic data demonstrate the high inductive performance of the new discretization method.
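To make the idea of a Bayesian evaluation criterion concrete, the following Python sketch scores a candidate discretization by the negative log of prior times likelihood, so that a lower cost means a more probable model. The exact criterion is not stated in this abstract; the sketch assumes a three-stage prior (number of intervals, then interval bounds, then the class distribution inside each interval) combined with a multinomial likelihood. The function names `log_binomial` and `discretization_cost` are hypothetical, and the formula should be checked against the full paper before use.

```python
from math import lgamma, log


def log_binomial(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def discretization_cost(interval_counts):
    """Negative log of (prior x likelihood) for one candidate discretization.

    interval_counts holds one list of class counts per interval,
    e.g. [[5, 0], [0, 5]] for two intervals and two classes.
    ASSUMPTION: the prior and likelihood below are an illustrative
    reconstruction, not a formula quoted from the abstract.
    """
    n = sum(sum(counts) for counts in interval_counts)
    num_intervals = len(interval_counts)
    num_classes = len(interval_counts[0])

    # Prior: choice of the number of intervals, then of the interval bounds.
    cost = log(n) + log_binomial(n + num_intervals - 1, num_intervals - 1)

    for counts in interval_counts:
        n_i = sum(counts)
        # Prior: choice of the class distribution inside this interval.
        cost += log_binomial(n_i + num_classes - 1, num_classes - 1)
        # Likelihood: multinomial term for the observed class counts.
        cost += lgamma(n_i + 1) - sum(lgamma(c + 1) for c in counts)

    return cost


# Ten instances, two classes, three candidate models.
one_interval = discretization_cost([[5, 5]])
clean_split = discretization_cost([[5, 0], [0, 5]])
noisy_split = discretization_cost([[3, 2], [2, 3]])
```

Under these assumptions the cleanly separating split costs less than a single interval, while a split that barely changes the class proportions costs more, because the extra interval is charged by the prior without a matching gain in likelihood. A search procedure would compare such costs across candidate boundaries; the optimization algorithm itself is not described in enough detail here to sketch.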