Discrete values play an important role in data mining and knowledge discovery. They correspond to intervals of numbers, which are more concise to represent and specify and easier to use and comprehend, because they are closer to a knowledge-level representation than continuous values. Many studies show that induction tasks can benefit from discretization: rules with discrete values are normally shorter and more understandable, and discretization can lead to improved predictive accuracy. Furthermore, many induction algorithms found in the literature require discrete features. All of this prompts researchers and practitioners to discretize continuous features before or during a machine learning or data mining task. Numerous discretization methods are available in the literature. It is time to examine these seemingly different methods and find out how different they really are, what the key components of a discretization process are, and how we can improve the current level of research, both for new development and for the use of existing methods. This paper aims at a systematic study of discretization methods, covering their history of development, their effect on classification, and the trade-off between speed and accuracy. The contributions of this paper are an abstract description summarizing existing discretization methods, a hierarchical framework that categorizes the existing methods and paves the way for further development, concise discussions of representative discretization methods, extensive experiments and their analysis, and some guidelines on how to choose a discretization method under various circumstances. We also identify some issues yet to be solved and directions for future research on discretization.
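To make the idea of replacing continuous values with intervals concrete, the following is a minimal sketch of two simple unsupervised schemes, equal-width and equal-frequency binning. They are chosen here purely for illustration and are not named in the abstract. The function names and the sample `ages` data are hypothetical.

```python
from bisect import bisect_right


def equal_width_cut_points(values, k):
    """Return k-1 cut points that split [min, max] into k equally wide intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cut_points(values, k):
    """Return k-1 cut points so that each interval holds roughly len(values)/k items."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def discretize(values, cut_points):
    """Map each continuous value to the index of the interval it falls into.

    A value equal to a cut point is assigned to the interval above it.
    """
    return [bisect_right(cut_points, v) for v in values]


# Hypothetical continuous attribute (illustrative data only).
ages = [18, 22, 25, 31, 38, 45, 52, 67, 90]

width_cuts = equal_width_cut_points(ages, 3)
freq_cuts = equal_frequency_cut_points(ages, 3)

width_bins = discretize(ages, width_cuts)
freq_bins = discretize(ages, freq_cuts)

print("equal-width cuts:", width_cuts, "->", width_bins)
print("equal-frequency cuts:", freq_cuts, "->", freq_bins)
```

On this sample the equal-width intervals are unevenly populated because the single large value stretches the range, while the equal-frequency intervals each receive three values. Neither scheme uses class labels, so this sketch says nothing about the supervised methods a study of this kind would also cover.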